Commit graph

522 commits

Author SHA1 Message Date
Rupert Swarbrick
8a145a9330 Enable ECC in ICache tests
Eventually we probably want to make this configurable but, for now,
enable it unconditionally.
2020-06-16 09:28:57 +01:00
Rupert Swarbrick
2569a63eb7 Drive the branch_spec line in ICache UVM tests
This signal already got driven (to 1) when signalling a branch with
the interface's branch_to task. This patch now drives the branch_spec
line occasionally even if we don't actually do a branch. (One cycle in
64, for now).
2020-06-16 09:19:30 +01:00
Bert Pieters
632ebcfe86 [ibex/dv] add Questa support
Signed-off-by: Bert Pieters <bert.pieters@gmail.com>
2020-06-15 11:06:21 +01:00
Rupert Swarbrick
ec6e08d1cf Simplify timestamps in ICache tests 2020-06-12 16:15:11 +01:00
Udi
976e7edcda [ibex/dv] Add several PMP tests
Signed-off-by: Udi <udij@google.com>
2020-06-09 12:52:10 -07:00
Rupert Swarbrick
13135e2e70 Collect transaction functional coverage for ICache-Mem iface
This single covergroup was extracted from the icache
documentation (icache.rst).
2020-06-09 12:03:24 +01:00
Rupert Swarbrick
a581981271 Collect transaction functional coverage for ICache-Core iface
These cover points were extracted by reading down the icache
documentation (icache.rst). There aren't yet cover points to check
that the targets of the testplan were executed properly, nor are there
any uarch coverpoints (which would be bound into the design, rather
than the interface).

The rather elaborate flow of

   sequence -> function -> trigger -> task -> covergroup

for cancelled_valid_cg follows a skeleton described in Doug Smith, "A
Practical Look @ SystemVerilog Coverage" (slides from a Doulos
course). I'm not completely convinced it's worth the effort, but I
guess it shows how to extract information from a temporal sequence in
the interface and shove it in a covergroup properly via the monitor.
2020-06-09 12:03:24 +01:00
Rupert Swarbrick
7844ed105d Wire in the branch_spec signal properly in ICache testbench
This should have no functional change - it's still set iff branch is
set - but the logic now lies in the UVM code, rather than the
structural code in tb.sv.
2020-06-09 09:11:37 +01:00
Rupert Swarbrick
d0323618fe Add a "many_errors" icache test sequence 2020-06-08 16:58:34 +01:00
Rupert Swarbrick
a15ec69dbd Add backward_line icache test sequence
This turns out to be reasonably easy to plumb in: derive from the core
sequence base class, overriding its run_req method (once I've
remembered to make it virtual). Then pick the right core sequence by
adding a factory override in the vseq.
2020-06-08 16:54:01 +01:00
Tom Roberts
8332f7de9d [rtl] Rewrite perf counters to be Yosys compatible
- Yosys doesn't like unpacked array partial slicing
- Instantiate counters individually instead

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-06-08 11:20:09 +01:00
Rupert Swarbrick
33ad42debb Spelling fix: seperate -> separate 2020-06-05 11:37:37 +01:00
Rupert Swarbrick
de05509cb6 Add test to check that disabling doesn't invalidate the icache
This is an entry in the testplan. Renaming it to "oldval", because
suffixing every class name with "disable_without_invalidation" was
getting ridiculous.
2020-06-05 09:22:43 +01:00
Rupert Swarbrick
f53d5ac645 Add the "invalidation" icache test sequence
Also move the shared "run_opts" options to top-level in
ibex_icache_sim_cfg.hjson.
2020-06-05 09:22:43 +01:00
Rupert Swarbrick
a56f14e060 Handle a seed change that clears a PMP error in icache memory driver
When the existing code in drive_pmp() decided that an error needed
signalling, it waited until the request was dropped, or the address
changed, before clearing the PMP error.

This is fine, unless the memory seed is changed (by magical means!)
under our feet. The monitor spots a new request, but the driver needs
to know to clear the PMP error. This patch forcibly tells the driver
to drop the existing item if a new one comes in.
2020-06-04 08:09:51 +01:00
Rupert Swarbrick
608ad2dfbc Spot repeated requests with same address in ICache memory monitor
Without this, you get test failures if there are two back-to-back
branches to the same address that happen at the same time as a seed
update. The problem is that you only see one request transaction (with
the first seed), and the two memory responses both come back with the
first seed, when the second should have had the second seed.
2020-06-04 08:09:51 +01:00
Rupert Swarbrick
e79e6b58ca Make sure we don't see multi-way hits in icache testbench
One aspect of (i)cache design that I didn't know about before writing
test code for this block is the problem of multi-way hits. The icache,
as implemented, stores data to parallel ways and it's possible for a
fetch to match more than one way. The data from matching ways all gets
ORed together, which doesn't matter so long as it never
changes (because V | V == V for all V).

Of course, things go poorly if you have two different values, V and W,
at an address which are both stored in the cache. Then the result is V
| W, which isn't necessarily equal to either instruction.

Avoiding this needs priority encoders, which are rather large, so it
seems the usual approach is to disallow branching to modified code
before flushing the cache. This patch teaches the testbench to do this
properly.

Sadly, this means there's now a connection between the core agent and
the memory agent: the memory agent can no longer generate new seeds
whenever it pleases.
2020-06-04 08:09:51 +01:00
Rupert Swarbrick
2c195c591e Reorder check in icache scoreboard for more helpful error messages
The test is the same, but the reordering means that if we see an error
that we weren't expecting, we'll complain about that, rather than
about the instruction data itself.
2020-06-04 08:09:51 +01:00
Rupert Swarbrick
deb163c544 Add a "caching" sequence for ICache testing 2020-06-02 09:22:14 +01:00
Rupert Swarbrick
0577104c98 Track how well the icache caches tight loops in the scoreboard
In practice, this check will only trigger if you constrain your core
to fetch in a tight loop for a while and you don't invalidate the
cache very often.

The check has an assumption about the cache size (at least 1kB), but
that only has an effect on the tightness of the loop needed before we
do any checking.
2020-06-02 09:22:14 +01:00
Rupert Swarbrick
3f37114f5b Change how enable/disable is configured in ICache core sequence
This version allows us to force the cache to be always enabled, as
well as allowing us to force always disabled, as before.
2020-06-02 09:22:14 +01:00
Rupert Swarbrick
7f6543499a Move "enable" state into ICache core agent's sequence
This was in the driver, but that turns out to be a mistake if you want
to control the cache with "always on" as well as "always off".
2020-06-02 09:22:14 +01:00
Dawid Zimonczyk
d51b50350f Add support for running Icache test with Riviera-PRO 2020-05-29 10:07:23 +01:00
Tom Roberts
7e1852024a [DV] Tie test_en_i to zero
- test_en_i is a DFT feature that shouldn't be enabled for normal
  runtime testing
- Only really affects the clock gate in the design, but is needed for
  running tests with the latch-based register file

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-27 11:34:47 +01:00
Tom Roberts
f26cc9bb58 [dv] Add missing signal to i$ tb
- Fixes #904
- Will need a future update to drive the new signal properly

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-27 11:31:02 +01:00
Philipp Wagner
8b42024cd5 Use vendored-in primitives from OpenTitan
Instead of using copies of primitives from OpenTitan, vendor the files
in directly from OpenTitan, and use them.

Benefits:

- Less potential for diverging code between OpenTitan and Ibex, causing
  problems when importing Ibex into OT.

- Use of the abstract primitives instead of the generic ones. The
  abstract primitives are replaced during synthesis time with
  target-dependent implementations. For simulation, nothing changes. For
  synthesis for a given target technology (e.g. a specific ASIC or FPGA
  technology), the primitives system can be instructed to choose
  optimized versions (if available).

  This is most relevant for the icache, which hard-coded the generic
  SRAM primitive before. This primitive is always implemented as
  registers. By using the abstract primitive (prim_ram_1p) instead, the
  RAMs can be replaced with memory-compiler-generated ones if necessary.

There are no real draw-backs, but a couple points to be aware of:

- Our ram_1p and ram_2p implementations are kept as wrapper around the
  primitives, since their interface deviates slightly from the one in
  prim_ram*. This also includes a rather unfortunate naming confusion
  around rvalid, which means "read data valid" in the OpenTitan advanced
  RAM primitives (prim_ram_1p_adv for example), but means "ack" in
  PULP-derived IP and in our bus implementation.

- The core_ibex UVM DV doesn't use FuseSoC to generate its file list,
  but uses a hard-coded list in `ibex_files.f` instead. Since the
  dynamic primitives system requires the use of FuseSoC we need to
  provide a stop-gap until this file is removed. Issue #893 tracks
  progress on that.

- Dynamic primitives depend no a not-yet-merged feature of FuseSoC
  (https://github.com/olofk/fusesoc/pull/391). We depend on the same
  functionality in OpenTitan and have instructed users to use a patched
  branch of FuseSoC for a long time through `python-requirements.txt`,
  so no action is needed for users which are either successfully
  interacting with the OpenTitan source code, or have followed our
  instructions. All other users will see a reasonably descriptive error
  message during a FuseSoC run.

- This commit is massive, but there are no good ways to split it into
  bisectable, yet small, chunks. I'm sorry. Reviewers can safely ignore
  all code in `vendor/lowrisc_ip`, it's an import from OpenTitan.

- The check_tool_requirements tooling isn't easily vendor-able from
  OpenTitan at the moment. I've filed
  https://github.com/lowRISC/opentitan/issues/2309 to get that sorted.

- The LFSR primitive doesn't have a own core file, forcing us to include
  the catch-all `lowrisc:prim:all` core. I've filed
  https://github.com/lowRISC/opentitan/issues/2310 to get that sorted.
2020-05-27 10:23:15 +01:00
Philipp Wagner
3f4e706062 Move Verilator simutil upstream to OpenTitan
https://github.com/lowRISC/opentitan/pull/2311 added the Verilator
memutils to OpenTitan as upstream. This commit is the second part of the
story, removing the code from the Ibex repository, and vendoring it back
in from OpenTitan.

This also superseded #844, which has now been included through
OpenTitan.
2020-05-27 10:23:15 +01:00
Tudor Timi
c5b5c7164f Add wave dumping to Xcelium simulation setup 2020-05-26 19:57:54 +01:00
Tudor Timi
3d796ee2aa Fix non-standard usage of '" operator 2020-05-26 19:57:54 +01:00
Tudor Timi
8c6395a257 Add Xcelium simulation setup 2020-05-26 19:57:54 +01:00
Tom Roberts
db149880bd [dv/icache] revert ff5c0c5
Ready valid dependency now resolved in rtl, see #850

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-26 09:29:36 +01:00
Tom Roberts
65b21c6741 [dv/icache] Add qualifications in protocol checker
The rdata driven by the cache is undefined when there is an error. There
are therefore no requirements on stability.

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-26 09:29:36 +01:00
Tobias Wölfel
4e7b981911 [rtl] Add RVFI IXL interface
Following the RISC-V Formal Interface (RVFI) specification the output is
added to set the value of MXL/SXL/UXL of the current privilege level.
2020-05-25 16:47:25 +01:00
ganoam
66687e927c [bitmanip] Add ZBR instruction group
This commit implements the Bit Manipulation Extension ZBR instruction
group: crc32[c].[bhw].

CRC-32 (CRC-32/ISO-HDLC) and CRC-32C (CRC-32/ISCSI) are directly
implemented. The CRC operation solves the following equation using
binary polynomial arithmetic:

rev(rd)(x) = rev(rs1)(x) * x**n mod {1, P}(x),

where {1,P}(x) denotes the crc polynomial. Using barret reduction one
can write this as

rd = (rs1 >> n) ^ rev(rev( (rs1 << (32-1)) cx rev(mu)) cx P)
                      ^-- cycle 0--------------------^
     ^-- cycle 1 ------------------------------------------^

Where cx denotes carry-less multiplication and mu = polydiv(x**64,
{1,P}), omitting the MSB (bit 32).

The implementation increases area consumption by ~0.6kGE for synthesis
with relaxed timing constraints. With tight timing constraints that is
~1.6kGE. There is no significant impact on frequency.

Signed-off-by: ganoam <gnoam@live.com>
2020-05-22 17:21:03 +02:00
Philipp Wagner
42eee90cf8 Make RAM size configurable in compliance test system
Switch from hardcoding 64 kB in two places to a fixed parameter.
2020-05-22 14:33:11 +01:00
Philipp Wagner
9d976c7ab7 Use prim_generic_ram_1p in ram_1p
ram_1p is almost a copy of the single-port RAM primitive we have in
OpenTitan, called prim_ram_1p, with its generic implementation
prim_generic_ram_1p. Instead of having a copy of that file in Ibex,
consistently use the OpenTitan one.

Unfortunately, ram_1p has slightly different semantics around some
signals, especially rvalid. This commit adjusts the meanings of the
signals for now, since I don't have a way to test the Arty board
which also uses this primitive (together with the compliance test
suite). With the testing in the compliance suite I'm reasonably certain
that the Arty board will work as well.
2020-05-22 14:33:11 +01:00
Stefan Wallentowitz
8b8327d820 [dv] Change performance counter access to DPI
DPI access is suggested and more generic than Verilator direct signal
access. This changes the access to the performance counters from the
Verilator testbench to use DPI instead of directly accessing the
array.

Signed-off-by: Stefan Wallentowitz <stefan.wallentowitz@hm.edu>
2020-05-21 20:34:26 +01:00
Rupert Swarbrick
30fff2da79 Get rid of some duplication in ICache virtual sequences 2020-05-21 16:38:20 +01:00
Rupert Swarbrick
ab3ff7d57e Use control knobs rather than subclassing in ICache core sequences
This is a bit less verbose than what we had: we can just set the
control knobs in the virtual sequence rather than subclassing
everything.
2020-05-21 16:38:20 +01:00
Rupert Swarbrick
d750d3e53e Add passthru test for ICache
This test constrains the address range (giving the cache a chance to
do some caching), but leaves the cache disabled. Seed changes are more
frequent than usual, to give us a good chance to spot any caching that
shouldn't have happened.
2020-05-21 16:38:20 +01:00
Tom Roberts
d5ee96fff6 [rtl] Add dummy instruction insertion
- Adds a new module in the IF stage to inject dummy instructions into
  the pipeline
- Control / frequency of insertion is governed by configuration CSRs
- Extra CSR added to allow reseed of the internal LFSR useed for
  randomizing insertion
- Extra logic added to the register file to make dummy instruction
  writebacks look like real intructions (via the zero register)

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-21 13:58:01 +01:00
Dawid Zimonczyk
5c7cdfe14e added missing cmp_opts to Riviera compilation options 2020-05-20 15:34:57 +01:00
Rupert Swarbrick
e3fe0c5032 Search backwards for grant seeds in icache memory model
The code before this patch maintained a mailbox, where it would add an
item for each request it saw, and then pop items off until finding the
right address whenever it saw a grant.

Most of the time, you might expect to see a sequence like this:

    request 100
    grant   100
    request 104
    grant   104
    request 108
    grant   108

This scheme is also resilient when glitches (to do with the
delta-cycle scheduling in the simulator) mean you actually see
something like:

    request 999
    request 100
    grant   100
    request 104
    grant   104
    ...

However, there's another source of "mismatch" possible too: the cache
can change the request address if the request hasn't been granted (as
opposed to a ready/valid interface, where this sort of tomfoolery is
not allowed!).

When the cache is branching all over the place, as in the sanity
sequence, this doesn't really matter. But if the branch destinations
are constrained, as in the passthru sequence, you can see things like
this:

    request 100     (1)
    request 120     (2)
    request 100     (3)
    grant   100     (4)
    request 104
    grant   104
    ...

Note that the mailbox has two entries for address 100 when searching
at point (4). This might be ok, but will cause failures if we get a
new seed at (2) or (3).

This patch replaces the mailbox with a queue. New requests get
inserted at the end, as before, but grants search from the end, rather
than the start. This means that when we get to (4) in the example
above, we'll pick the latest seed (and duplicate entries disappear
quickly).
2020-05-19 10:31:27 +01:00
Rupert Swarbrick
c385354b3c Apply new seeds to memory request in icache memory model
When the memory model sees a new fetch on the bus, it might decide to
pick a new seed for the backing memory. Before this patch, the seed
applied to every fetch strictly after this one. Now, it applies to
this fetch too.

This is what the scoreboard expects. In particular, you can trigger
problems here by disabling the cache and branching lots: things will
go wrong if we pick a new seed at the same time as handling the
branch.

To fix things, we either have to teach the scoreboard to "look one
seed backwards" when the cache is disabled, which is ugly and not as
sensitive to errors in the cache, or we have to apply the new seed
immediately. This is a little painful, because we end up having to
randomize the response item and then calculate a field based on a
possible new seed (see the logic between start_item and end_item in
take_req), but I think it's cleaner than the alternative.

As part of the patch, I've also split the "req" and "grant" handling
code into separate tasks. There's no real change there, except to get
rid of a level of indentation, but I think it makes the code a bit
easier to understand.
2020-05-19 10:31:27 +01:00
Rupert Swarbrick
b7800ba75b Use --start_seed rather than --seed in core_ibex/Makefile
The --seed argument has kept its original meaning: Run the one and
only iteration of the test with this seed. We've added another
argument, --start_seed to riscv-dv's run.py and our sim.py which says
"run the first iteration with this seed, and count up for later
iterations".

This should fix issue #859.
2020-05-19 09:40:26 +01:00
Rupert Swarbrick
a325430904 Add a --start_seed argument to core_ibex/sim.py
If --iterations is 1, this is equivalent to the existing --seed
argument (which we're keeping unchanged). If --iterations is
0 (reading iteration counts from the config) or positive, successive
test iterations use successive seeds. So if you pass --start_seed 123
and run ten iterations, they will run with seeds 123, 124, ... through
133.

Lots of the added code is to check that you don't do something silly
like --seed=123 --iterations=10. Since the next patch will convert the
Makefile which runs this script to using --start_seed, that's all dead
code. Maybe we should get rid of that argument at some point.
2020-05-19 09:40:26 +01:00
ganoam
f173e2baba [bitmanip] Add ZBC instruction group
This commit implements the Bit Manipulation Extension ZBC instruction
group: clmul[rh] (carry-less multiply [reverse][high])

Carry-less multiplication can be understood as multiplication based on
the addition interpreted as the bit-wise xor operation.

Example: 1101 X 1011 = 1111111:

      1011 X 1101
      -----------
             1101
        xor 1101
        ---------
            10111
       xor 0000
       ----------
           010111
      xor 1101
      -----------
          1111111

Architectural details:
        A 32 x 32-bit array
        [ operand_b[i] ? (operand_a << i) : '0 for i in 0 ... 31 ]
        is generated. The entries of the array are pairwise 'xor-ed'
        together in a 5-stage binary tree.

The area increase when synthesized with relaxed timing constraints is
1.6-1.7kGE.

Timing figures are improve by 0.1 ns for the 3-stage configuration and
worsen by 0.04ns for the 2-stage implementation. This suggests
fluctuations due to the heuristic nature of the synthesis tools.

Signed-off-by: ganoam <gnoam@live.com>
2020-05-19 10:38:38 +02:00
Rupert Swarbrick
d20833c639 Update ICache testplan after review meeting
I think these represent the test cases we discussed. I've also removed
non-existent entries from the "tests" keys: I didn't really understand
how dvsim.py worked when I wrote the original version and they just
cause irritating warnings.
2020-05-18 17:24:15 +01:00
Rupert Swarbrick
fc3750978e Move seed updates into sequence in ICache memory agent
The previous code kind of worked, but we were making the "should I
make a new seed" decision in the monitor, rather than the sequence.
The problem is that this is difficult to customize with other test
sequences (they sit adjacent to the monitor in the class hierarchy,
not above it).

The new code seems a little cleaner. We generate new seeds in the
sequence (which is in charge of keeping track of the current seed
anyway). These new seeds get passed to the driver, which has an
analysis port by which it can tell the scoreboard about them. Note
that we have to pass them from the driver, rather than the monitor,
because the new seed doesn't directly appear on the interface.

The rest of the changes are simplifying the ibex_icache_mem_bus_item
class, which now only has two modes and removing the seed field from
the ibex_icache_mem_req_item class.
2020-05-15 17:24:04 +01:00
Tom Roberts
5fd3cad9a1 [config] Change default PMPNumRegions
Change default to 4 rather than 0. Makes no difference when PMPEnable==0
and gets rid of lint failures due to 0 array referencing (0 is an
unsupported value for this parameter).

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-15 11:12:31 +01:00