Commit graph

24 commits

Author SHA1 Message Date
Philipp Wagner
bf5dd7ec15 Icache: It's not a draft any more
The Icache might not be fully perfect, but it's certainly not a draft
any more.
2020-07-03 14:39:48 +01:00
Rupert Swarbrick
49c9113d03 Formal protocol checking for icache <-> core interface
This turns out to be quite complicated, because the icache has a lot
of different counters that all track addresses or fill buffer state.
For an inductive proof to go through, you need to make the relations
between them explicit, which takes lots of assertions.

All of the signals defined in the formal_tb are prefixed with 'f_'.
This isn't strictly necessary, but it makes it much easier to see what
came from the design (since we are "bound in", our ports don't have _o
or _i suffixes).

We add a couple of protocol assumptions:

ICache <-> Core:

  - Branch target addresses are half-word aligned
  - The branch_spec signal is driven if branch is

ICache <-> Memory:

  - The bus doesn't respond unless there is an outstanding request
  - The bus doesn't grant more than 4 billion outstanding requests(!)

There's also some protocol state tracking:

  - f_addr_valid tracks whether the ICache currently has an
    architectural address. It goes high with branch_i (which gives the
    cache an address) and goes low when the cache completes a
    transaction with err_o set (since the data is bad, there's no
    notion of a "next address").

  - f_reqs_on_bus tracks the number of requests currently outstanding
    on the bus. This is used for the ICache <-> Memory assumptions
    above. We have some internal assertions that check this equals the
    sum of the "ext" counters minus the sum of the "rvd" counters.

With these assumptions, we can prove:

  - Once asserted, valid_o stays high until a transaction is completed
    by ready_i or until branch_i is asserted (which cancels the
    transaction).

  - While the transaction is pending, addr_o remains stable.

  - While the transaction is pending, err_o remains stable and, if
    err_o is asserted, so does err_plus2_o.

  - While the transaction is pending, if err_o and err_plus2_o are
    high then bottom 16 bits of the returned instruction data imply an
    uncompressed instruction.

  - While the transaction is pending, if err_o is low then the bottom
    16 bits of the returned instruction remain stable.

  - While the transaction is pending, if err_o is low and the bottom
    16 bits of the returned instruction imply an uncompressed
    instruction then the top 16 bits of the returned instruction
    remain stable.
2020-07-02 15:19:11 +01:00
Rupert Swarbrick
ee1ca61fe4 A simple formal flow for the ICache based on SymbiYosys
To get this working, you need a corresponding patch in Edalize, which
adds SymbiYosys as an EDA tool.

At the moment, this proves a couple of simple bus assertions. Later
patches will add more.

There are currently some rough edges to this flow:

  (1) We use a hacky pre_build hook to run sv2v and edit the files in
      the work tree. Among other problems, this means that the any
      failure messages that come out of sby have bogus line numbers.

  (2) Since we haven't yet got bind support in Yosys, we have to
      include a fragment from the design itself.
2020-07-02 15:19:11 +01:00
Tom Roberts
71b3474781 [rtl] Fix icache xprop issue
- invalidate all ways on a tag error to prevent xprop from the data
  error signal and reduce the likelyhood of multi-way allocations

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-06-25 09:46:35 +01:00
Rupert Swarbrick
a247cd45e9 Add some basic protocol checking to the icache's RAM interface
Since we are binding in an interface anyway, we can add some SV
assertions to make sure nothing too strange is happening.

Note that they aren't as strong as you might expect: we don't check
that rdata isn't X, for example. This is because the cache makes
speculative reads, which it (hopefully) ignores if the data is
invalid.
2020-06-22 10:37:34 +01:00
Tom Roberts
d79eb58ae5 [rtl] Simplify I$ ECC error handling
- Remove the timing optimisations that delay the factoring-in of ecc
  errors into valid_o.
- Optimisations are probably unnecessary here due to the minimal logic
  hanging off valid_o, and the optimisations cause protocol checker
  violations.

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-06-12 14:04:30 +01:00
Philipp Wagner
dd39ec0c91 Optimize use of RAM primitive in icache
The RAM primitive provides a way to specify the granularity of the write
mask (wmask) signal, which can be used to select an appropriate
implementation (e.g. a SRAM with only byte selects, or no subselects at
all).
2020-05-27 11:38:33 +01:00
Philipp Wagner
8b42024cd5 Use vendored-in primitives from OpenTitan
Instead of using copies of primitives from OpenTitan, vendor the files
in directly from OpenTitan, and use them.

Benefits:

- Less potential for diverging code between OpenTitan and Ibex, causing
  problems when importing Ibex into OT.

- Use of the abstract primitives instead of the generic ones. The
  abstract primitives are replaced during synthesis time with
  target-dependent implementations. For simulation, nothing changes. For
  synthesis for a given target technology (e.g. a specific ASIC or FPGA
  technology), the primitives system can be instructed to choose
  optimized versions (if available).

  This is most relevant for the icache, which hard-coded the generic
  SRAM primitive before. This primitive is always implemented as
  registers. By using the abstract primitive (prim_ram_1p) instead, the
  RAMs can be replaced with memory-compiler-generated ones if necessary.

There are no real draw-backs, but a couple points to be aware of:

- Our ram_1p and ram_2p implementations are kept as wrapper around the
  primitives, since their interface deviates slightly from the one in
  prim_ram*. This also includes a rather unfortunate naming confusion
  around rvalid, which means "read data valid" in the OpenTitan advanced
  RAM primitives (prim_ram_1p_adv for example), but means "ack" in
  PULP-derived IP and in our bus implementation.

- The core_ibex UVM DV doesn't use FuseSoC to generate its file list,
  but uses a hard-coded list in `ibex_files.f` instead. Since the
  dynamic primitives system requires the use of FuseSoC we need to
  provide a stop-gap until this file is removed. Issue #893 tracks
  progress on that.

- Dynamic primitives depend no a not-yet-merged feature of FuseSoC
  (https://github.com/olofk/fusesoc/pull/391). We depend on the same
  functionality in OpenTitan and have instructed users to use a patched
  branch of FuseSoC for a long time through `python-requirements.txt`,
  so no action is needed for users which are either successfully
  interacting with the OpenTitan source code, or have followed our
  instructions. All other users will see a reasonably descriptive error
  message during a FuseSoC run.

- This commit is massive, but there are no good ways to split it into
  bisectable, yet small, chunks. I'm sorry. Reviewers can safely ignore
  all code in `vendor/lowrisc_ip`, it's an import from OpenTitan.

- The check_tool_requirements tooling isn't easily vendor-able from
  OpenTitan at the moment. I've filed
  https://github.com/lowRISC/opentitan/issues/2309 to get that sorted.

- The LFSR primitive doesn't have a own core file, forcing us to include
  the catch-all `lowrisc:prim:all` core. I've filed
  https://github.com/lowRISC/opentitan/issues/2310 to get that sorted.
2020-05-27 10:23:15 +01:00
Tom Roberts
12b39476c0 [rtl] Add speculative branch signal
- Drive a speculative version of the branch signal into the IF stage to
  drive address muxing
- The speculative signal is the same as the regular branch signal but
  assumes all conditional branches are taken
- This breaks the timing path from branch condition calculation into
  address muxing (and therefore PMP error calculation)
- When the branch is not taken, any external request we might otherwise
  have made is suppressed
- This has a minor performance cost (0.8% without I$, ~0% with I$)

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-26 09:41:37 +01:00
Tom Roberts
84c5373c27 [rtl] Various small icache bugfixes
- Remove any ready -> valid dependency by allowing the skid buffer to
  accept data when the core is not ready
- Tighten-up behaviour around invalidations and cache enable/disable
- Remove xprop through output_compressed from invalid data when driving errors
- Make behaviour more consistent where speculative requests return
  different data/error conditions to existing cache hit

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-26 09:29:36 +01:00
Tom Roberts
c862f104af [rtl] icache error signalling fix
- Data valid should only be signalled when the current beat is
  signalling an error
- PMP errors for future beats can sneak in while waiting for the
  current beat

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-04 09:16:55 +01:00
Tom Roberts
36b05e2ebf [rtl/icache] Fix typo in ram req
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-04-27 13:52:37 +01:00
Tom Roberts
eb4913c8f0 [rtl/icache] Fix some issues in icache
- valid_o needs qualifying with output_error to mask X's in the output
  data.
- err_plus2_o should not depend on the output data, only the alignment and
  error status are relevant. Also needs skid_valid_q to mask
  skid_error_q which is not reset.

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-04-24 14:11:34 +01:00
Tom Roberts
a3a1f9f40a [rtl] Fix icache PMP error handling
- Requests receiving a PMP error need to output a valid indicator, even
  though they will not have received any beats of data

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-04-20 16:05:19 +01:00
Tom Roberts
3c561e4106 [rtl/icache] Fix an inconsistency in data output
- valid_o could be asserted for one cycle then dropped when receiving
  rvalid data for a request which has branched into  the middle of a
  line.
- This fix keeps valid_o asserted by using the offset version of
  fill_rvd_cnt_q (fill_rvd_beat) to compare against fill_out_cnt_q
  (which is also offset by the branch).

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-03-24 13:47:57 +00:00
Tom Roberts
0a540cffb3 [rtl/icache] Fix a couple of icache bugs
- Speculative requests observing a PMP error shouldn't increment the
  external request counter
- Remove redundant logic on fill_rvd_exp

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-03-23 12:57:54 +00:00
Tom Roberts
c054a63c3d [rtl] Instantiate instruction cache
- Add parameters and actual instantiation of icache
- Add a custom CSR in the M-mode custom RW range to enable the cache
- Wire up the cache invalidation signal to trigger on fence.i

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-03-23 12:57:31 +00:00
Tom Roberts
42aa761c5d [rtl] Fix mtval for unaligned instr errors
mtval should record which half of the instruction caused the error
rather than just recording the PC.
An extra signal is added in the IF stage to indicate when an error is
caused by the second half of an unaligned instruction. This signal is
then used to increment the PC by 2 for mtval capture on an error.

Fixes #709
2020-03-18 12:53:35 +00:00
Tom Roberts
8bb649e4ab [rtl/icache] Fix PMP error logic
Instruction requests triggering PMP errors have their external request
suppressed. The beat counting logic therefore needs to know that these
requests will never receive any rvalid data responses.

This fix stops the external request counter from incrementing, and marks
all external requests complete as soon as any error is received.

The data in the cache line beyond the error is not required since the
core cannot access it without consuming the error first.
2020-03-18 12:53:09 +00:00
Tom Roberts
ef17d4fcc2 [rtl] Add Icache ECC
- Add modules for ecc generation and checking
- Add supporting logic to icache module

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-03-18 11:28:06 +00:00
Tom Roberts
fe00eb46e9 [rtl] Icache RAM primitive changes
- Bring in a version of ram primitive with configurable width similar to
  the OT RAM primitive.
- Change the RAM banking structure to be a single bank of LineSize (64
  bits) to match the upcoming ECC granularity.

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-03-18 11:28:06 +00:00
Tom Roberts
854faeda39 [rtl/icache] Make age matrix more consistent
The design currently relies on fill_done remaining set in the cycle
after the fill buffer completes to ensure the fill_older_q entry gets
cleared (when a fill buffer completes in the same cycle that one is
allocated). This fix makes the behaviour a bit more consistent and easy
to reason about.
2020-03-16 13:12:19 +00:00
Rupert Swarbrick
0a0a18c2cb Notes on the ICache specification
This also adds a couple of comments splitting up the ports in
ibex_icache.sv that I found helpful when working out what everything
did.
2020-03-13 13:13:19 +00:00
Tom Roberts
82ebf6fd20 [I-Cache] Initial commit of prototype RTL
- Working prototype of RTL
- Initial documentation
- Still some TODOs to be dealt with
2020-03-06 16:34:48 +00:00