Commit graph

1633 commits

Author SHA1 Message Date
Rupert Swarbrick
d750d3e53e Add passthru test for ICache
This test constrains the address range (giving the cache a chance to
do some caching), but leaves the cache disabled. Seed changes are more
frequent than usual, to give us a good chance to spot any caching that
shouldn't have happened.
2020-05-21 16:38:20 +01:00
Udi
ec42eb4409 Update google_riscv-dv to google/riscv-dv@7b38e54
Update code from upstream repository https://github.com/google/riscv-dv
to revision 7b38e54c5e833f147edc03717b3fd711be923026

* add cmdline configuration of mstatus.mprv (Udi Jonnalagadda)
* Add Xcelium support (google/riscv-dv#579) (Tudor Timi)

Signed-off-by: Udi <udij@google.com>
2020-05-21 08:28:58 -07:00
Tom Roberts
d5ee96fff6 [rtl] Add dummy instruction insertion
- Adds a new module in the IF stage to inject dummy instructions into
  the pipeline
- Control / frequency of insertion is governed by configuration CSRs
- Extra CSR added to allow reseed of the internal LFSR used for
  randomizing insertion
- Extra logic added to the register file to make dummy instruction
  writebacks look like real instructions (via the zero register)

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-21 13:58:01 +01:00
Dawid Zimonczyk
5c7cdfe14e added missing cmp_opts to Riviera compilation options
2020-05-20 15:34:57 +01:00
Greg Chadwick
5b97c26510 [syn] Add more Ibex parameters to flow
Can now control writeback stage inclusion, bitmanip extension and
multiplier implementation.
2020-05-20 12:08:10 +01:00
Greg Chadwick
2cfb5e8d78 [syn] Add STA util for investigating feedthroughs
2020-05-20 12:08:10 +01:00
Rupert Swarbrick
e3fe0c5032 Search backwards for grant seeds in icache memory model
The code before this patch maintained a mailbox, where it would add an
item for each request it saw, and then pop items off until finding the
right address whenever it saw a grant.

Most of the time, you might expect to see a sequence like this:

    request 100
    grant   100
    request 104
    grant   104
    request 108
    grant   108

This scheme is also resilient when glitches (to do with the
delta-cycle scheduling in the simulator) mean you actually see
something like:

    request 999
    request 100
    grant   100
    request 104
    grant   104
    ...

However, there's another source of "mismatch" possible too: the cache
can change the request address if the request hasn't been granted (as
opposed to a ready/valid interface, where this sort of tomfoolery is
not allowed!).

When the cache is branching all over the place, as in the sanity
sequence, this doesn't really matter. But if the branch destinations
are constrained, as in the passthru sequence, you can see things like
this:

    request 100     (1)
    request 120     (2)
    request 100     (3)
    grant   100     (4)
    request 104
    grant   104
    ...

Note that the mailbox has two entries for address 100 when searching
at point (4). This might be ok, but will cause failures if we get a
new seed at (2) or (3).

This patch replaces the mailbox with a queue. New requests get
inserted at the end, as before, but grants search from the end, rather
than the start. This means that when we get to (4) in the example
above, we'll pick the latest seed (and duplicate entries disappear
quickly).
2020-05-19 10:31:27 +01:00
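Note on e3fe0c5032: the backwards search over the request queue can be sketched in Python as below. This is only an illustration of the idea, not the SystemVerilog in the testbench. The class name `SeedQueue`, its method names, and the choice to drop every entry older than the match are assumptions made for this sketch.

```python
from dataclasses import dataclass, field
from typing import List, Optional, Tuple


@dataclass
class SeedQueue:
    """Pending (address, seed) pairs, oldest first (hypothetical model)."""
    entries: List[Tuple[int, int]] = field(default_factory=list)

    def on_request(self, addr: int, seed: int) -> None:
        # New requests are appended at the end, as with the old mailbox.
        self.entries.append((addr, seed))

    def on_grant(self, addr: int) -> Optional[int]:
        # Search from the newest entry backwards, so the latest seed
        # recorded for this address wins.
        for idx in range(len(self.entries) - 1, -1, -1):
            if self.entries[idx][0] == addr:
                seed = self.entries[idx][1]
                # Assumption of this sketch: the match and everything older
                # is discarded, so stale duplicates cannot linger.
                del self.entries[: idx + 1]
                return seed
        return None


# The sequence (1)-(4) from the commit message, with a new seed at (2).
queue = SeedQueue()
queue.on_request(0x100, seed=1)          # (1)
queue.on_request(0x120, seed=2)          # (2)
queue.on_request(0x100, seed=2)          # (3)
granted_seed = queue.on_grant(0x100)     # (4)
```

Searching from the front would have returned the stale seed recorded at (1); searching from the back returns the seed recorded at (3).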
Rupert Swarbrick
c385354b3c Apply new seeds to memory request in icache memory model
When the memory model sees a new fetch on the bus, it might decide to
pick a new seed for the backing memory. Before this patch, the seed
applied to every fetch strictly after this one. Now, it applies to
this fetch too.

This is what the scoreboard expects. In particular, you can trigger
problems here by disabling the cache and branching lots: things will
go wrong if we pick a new seed at the same time as handling the
branch.

To fix things, we either have to teach the scoreboard to "look one
seed backwards" when the cache is disabled, which is ugly and not as
sensitive to errors in the cache, or we have to apply the new seed
immediately. This is a little painful, because we end up having to
randomize the response item and then calculate a field based on a
possible new seed (see the logic between start_item and end_item in
take_req), but I think it's cleaner than the alternative.

As part of the patch, I've also split the "req" and "grant" handling
code into separate tasks. There's no real change there, except to get
rid of a level of indentation, but I think it makes the code a bit
easier to understand.
2020-05-19 10:31:27 +01:00
Rupert Swarbrick
b7800ba75b Use --start_seed rather than --seed in core_ibex/Makefile
The --seed argument has kept its original meaning: Run the one and
only iteration of the test with this seed. We've added another
argument, --start_seed to riscv-dv's run.py and our sim.py which says
"run the first iteration with this seed, and count up for later
iterations".

This should fix issue #859.
2020-05-19 09:40:26 +01:00
Rupert Swarbrick
f767214d88 Update google_riscv-dv to google/riscv-dv@e6a63ff
Update code from upstream repository https://github.com/google/riscv-dv
to revision e6a63ff19ddf162a89379f9e03f76345c3558ecc

* Restructure coverage (google/riscv-dv#569) (weicaiyang)
* Add --seed_start argument and tidy up seed handling
  (google/riscv-dv#570) (Rupert Swarbrick)
* Move `sext.b/h` bitmanip instructions to ZB_TMP
  (google/riscv-dv#573) (weicaiyang)
* PR to minor fix for running riscv_asm_program_gen.py
  (google/riscv-dv#571) (Hai Hoang Dang)
* Quickly fix broken link (google/riscv-dv#568) (weicaiyang)

Signed-off-by: Rupert Swarbrick <rswarbrick@lowrisc.org>
2020-05-19 09:40:26 +01:00
Rupert Swarbrick
a325430904 Add a --start_seed argument to core_ibex/sim.py
If --iterations is 1, this is equivalent to the existing --seed
argument (which we're keeping unchanged). If --iterations is
0 (reading iteration counts from the config) or positive, successive
test iterations use successive seeds. So if you pass --start_seed 123
and run ten iterations, they will run with seeds 123, 124, ... through
132.

Lots of the added code is to check that you don't do something silly
like --seed=123 --iterations=10. Since the next patch will convert the
Makefile which runs this script to using --start_seed, that's all dead
code. Maybe we should get rid of that argument at some point.
2020-05-19 09:40:26 +01:00
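Note on a325430904: the relationship between --seed, --start_seed and the iteration count can be sketched as follows. This is not the code in sim.py. The function name `seeds_for_run` and its error messages are hypothetical, and the sketch assumes the iteration count has already been resolved (the "0 means read from the config" case is left out).

```python
from typing import List, Optional


def seeds_for_run(seed: Optional[int],
                  start_seed: Optional[int],
                  iterations: int) -> List[Optional[int]]:
    """Return the seed for each iteration; None means "pick one randomly"."""
    if seed is not None and start_seed is not None:
        raise ValueError("--seed and --start_seed cannot be combined")
    if seed is not None:
        # --seed keeps its original meaning: exactly one iteration.
        if iterations != 1:
            raise ValueError("--seed only makes sense with one iteration")
        return [seed]
    if start_seed is not None:
        # Successive iterations use successive seeds.
        return [start_seed + i for i in range(iterations)]
    return [None] * iterations


ten_seeds = seeds_for_run(seed=None, start_seed=123, iterations=10)
```

With ten iterations starting at 123, the last seed in `ten_seeds` is 132.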
ganoam
f173e2baba [bitmanip] Add ZBC instruction group
This commit implements the Bit Manipulation Extension ZBC instruction
group: clmul[rh] (carry-less multiply [reverse][high])

Carry-less multiplication is long multiplication in which the partial
products are combined with bit-wise xor instead of addition.

Example: 1101 X 1011 = 1111111:

      1011 X 1101
      -----------
             1101
        xor 1101
        ---------
            10111
       xor 0000
       ----------
           010111
      xor 1101
      -----------
          1111111

Architectural details:
        A 32 x 32-bit array
        [ operand_b[i] ? (operand_a << i) : '0 for i in 0 ... 31 ]
        is generated. The entries of the array are pairwise 'xor-ed'
        together in a 5-stage binary tree.

The area increase when synthesized with relaxed timing constraints is
1.6-1.7kGE.

Timing figures improve by 0.1 ns for the 3-stage configuration and
worsen by 0.04 ns for the 2-stage configuration. This suggests that the
differences are fluctuations due to the heuristic nature of the
synthesis tools.

Signed-off-by: ganoam <gnoam@live.com>
2020-05-19 10:38:38 +02:00
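Note on f173e2baba: the array of partial products and the pairwise xor tree can be modelled in a few lines of Python. This is a behavioural sketch only, not the RTL. The function name `clmul_tree` is hypothetical, and `width` is assumed to be a power of two so that the tree halves cleanly.

```python
def clmul_tree(operand_a: int, operand_b: int, width: int = 32) -> int:
    """Carry-less multiply, returning the full 2*width-bit product."""
    # One partial product per bit of operand_b:
    # operand_a << i if bit i is set, otherwise zero.
    terms = [(operand_a << i) if (operand_b >> i) & 1 else 0
             for i in range(width)]
    # Pairwise xor in a binary tree: log2(width) levels, so 5 for width=32.
    while len(terms) > 1:
        terms = [terms[j] ^ terms[j + 1] for j in range(0, len(terms), 2)]
    return terms[0]


# The worked example from the commit message.
product = clmul_tree(0b1101, 0b1011)
```

Here `product` equals 0b1111111, matching the hand calculation above.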
Rupert Swarbrick
d20833c639 Update ICache testplan after review meeting
I think these represent the test cases we discussed. I've also removed
non-existent entries from the "tests" keys: I didn't really understand
how dvsim.py worked when I wrote the original version and they just
cause irritating warnings.
2020-05-18 17:24:15 +01:00
Rupert Swarbrick
fc3750978e Move seed updates into sequence in ICache memory agent
The previous code kind of worked, but we were making the "should I
make a new seed" decision in the monitor, rather than the sequence.
The problem is that this is difficult to customize with other test
sequences (they sit adjacent to the monitor in the class hierarchy,
not above it).

The new code seems a little cleaner. We generate new seeds in the
sequence (which is in charge of keeping track of the current seed
anyway). These new seeds get passed to the driver, which has an
analysis port by which it can tell the scoreboard about them. Note
that we have to pass them from the driver, rather than the monitor,
because the new seed doesn't directly appear on the interface.

The rest of the changes are simplifying the ibex_icache_mem_bus_item
class, which now only has two modes and removing the seed field from
the ibex_icache_mem_req_item class.
2020-05-15 17:24:04 +01:00
Tom Roberts
8934267c78 [rtl] Fix instr_valid_i exception issue
- The controller state machine could only progress to FLUSH to handle an
  exception if instr_valid_i was set
- When the exception comes from a load/store in the Writeback stage, and
  no new instruction has been driven into the ID stage, this could cause
  the exception to be missed
- The instr_valid_i qualification is therefore removed from the state
  machine as all relevant signals inside that if block are already
  qualified by instr_valid_i anyway
- Fixes #849

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-15 11:13:20 +01:00
Tom Roberts
5fd3cad9a1 [config] Change default PMPNumRegions
Change default to 4 rather than 0. Makes no difference when PMPEnable==0
and gets rid of lint failures due to 0 array referencing (0 is an
unsupported value for this parameter).

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-15 11:12:31 +01:00
Tom Roberts
a5ae9f4995 [rtl] Add data-independent timing to multdiv_fast
- No early return on divide by zero

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-15 10:19:55 +01:00
Tom Roberts
d19189ba43 [rtl] data-independent execution for multdiv_slow
- Remove all early exits from multiply and divide operations when in
  fixed time execution mode.

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-15 10:19:55 +01:00
Tom Roberts
0ba0ad5a43 [rtl] multdiv_slow general tidy-up
- Correct some typos and fix various lint / style guide issues
- No functional changes

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-15 10:19:55 +01:00
Greg Chadwick
10bc77ddcc [dv] Enable use of ibex configs in DV
2020-05-15 09:03:04 +01:00
Greg Chadwick
00b46d9abe [cfg] Add PMP parameters to ibex_config.yaml
Also renames configs as part of this as they start to get unwieldy if
all features get described in the config name.
2020-05-15 09:03:04 +01:00
ganoam
9bd3350bb3 [bitmanip] Add sext.b/h instructions
This commit implements the Bit Manipulation Extension sign-extend
instructions: sext.b (sign-extend byte) and sext.h (sign-extend half
word).

The implementation is basically a one-liner, duplicating the msb of the
byte / half-word into the msb of the output register.

Signed-off-by: ganoam <gnoam@live.com>
2020-05-14 22:03:45 +02:00
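Note on 9bd3350bb3: sign extension replicates the top bit of the byte or half-word across the upper bits of the result. A minimal Python model follows, assuming a 32-bit register. The helper name `sext` and its parameters are hypothetical.

```python
def sext(value: int, bits: int, xlen: int = 32) -> int:
    """Sign-extend the low `bits` bits of value to xlen bits."""
    low = value & ((1 << bits) - 1)
    sign = (low >> (bits - 1)) & 1
    # Replicate the sign bit across the upper xlen - bits positions.
    upper = (((1 << (xlen - bits)) - 1) << bits) if sign else 0
    return upper | low


sext_b = sext(0x80, 8)          # models sext.b on 0x80
sext_h = sext(0x12348000, 16)   # models sext.h on 0x12348000
```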
ganoam
fac404a6f3 [bitmanip] Add ZBF instruction group
This commit implements the Bit Manipulation Extension ZBF instruction
group, which consists only of the one instruction bfp (bit-field
place).
This instruction places a field of length len < 16 from rs2 in rs1 at
offset off.

Architectural details:
        The implementation works exactly the same as proposed by Claire
        Wolf in her reference implementation.
        1. bfp_mask = slo(0, len)
        2. bfp_result =
                (rs1 & ~(bfp_mask << off)) | (rs2 & bfp_mask) << off
                        ^------ shifter-^
        The existing shifter structure is shared for the indicated
        operation.

Impact on area:

        * When synthesizing without the B-extension, the 2 stage
        design seems to move the timing bottleneck, leading to
        optimizations which result in an area increase by 1 kGE,
        when synthesized with tight timing constraints. For the
        3 stage configuration there is no change.
        When synthesized with relaxed timing constraints there is no
        significant change in either configuration.

        * With the B-extension enabled, the area increase for tight
        timing constraints is 1.1-1.2 kGE. For relaxed timing
        constraints that is ~0.4kGE

Impact on timing: No significant impact.

Signed-off-by: ganoam <gnoam@live.com>
2020-05-14 21:34:49 +02:00
ganoam
0afd000a09 [bitmanip] Add ZBE Instruction Group
This commit implements the Bit Manipulation Extension ZBE instruction
group: bext (bit extract) and bdep (bit deposit).

Architectural details:
        * bext/bdep: A new butterfly and inverse butterfly network is
        implemented. The generation of its control bits depends on a
        parallel prefix bitcount of the deposit / extract mask.

        * bitcounter: The path for bext / bdep instructions traverses
        the bit counter and the butterfly network, resulting in both a
        larger delay and area. To mitigate this, the bitcounter has been
        changed from a serial bit counter to a radix-2 tree structure.

        * grev/gorc: The Zbp instructions general reverse and general
        or-combine have so far shared the shifter's reversal
        structure. It has proven beneficial to area and timing to reuse
        the new butterfly network instead.

The butterfly network itself consumes ~3.5kGE and ~1.1kGE for synthesis
with tight and relaxed timing constraints respectively. Including the
optimizations of the bitcounter and grev/gorc, the overall change in
area consumption is +4.6kGE (+1.2kGE) and +3.3kGE (+1.1kGE) for
synthesis with tight (relaxed) timing constraints for 2- and 3-stage
configurations respectively. For tight timing constraints that is
growth of around 10%, for relaxed around 5%.

The impact on the maximum frequency is negligible.

Signed-off-by: ganoam <gnoam@live.com>
2020-05-14 16:43:19 +02:00
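Note on 0afd000a09: for readers unfamiliar with bext and bdep, a bit-serial reference model in Python shows what the instructions compute. It deliberately does not model the butterfly network or the prefix bitcount described in the commit; it only gives the expected results. The function names and the 32-bit default are assumptions of this sketch.

```python
def bext(rs1: int, mask: int, xlen: int = 32) -> int:
    """Gather the bits of rs1 selected by mask into the low bits."""
    result, j = 0, 0
    for i in range(xlen):
        if (mask >> i) & 1:
            if (rs1 >> i) & 1:
                result |= 1 << j
            j += 1
    return result


def bdep(rs1: int, mask: int, xlen: int = 32) -> int:
    """Scatter the low bits of rs1 to the positions selected by mask."""
    result, j = 0, 0
    for i in range(xlen):
        if (mask >> i) & 1:
            if (rs1 >> j) & 1:
                result |= 1 << i
            j += 1
    return result


extracted = bext(0b10110100, 0b11110000)
deposited = bdep(0b1011, 0b11110000)
```

The counter `j` in both loops is the number of mask bits below position `i`, which is the quantity the prefix bitcount supplies in hardware.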
Rupert Swarbrick
dd12d97934 Print commands in core_ibex/Makefile when VERBOSE=1
See issue #852 for discussion.
2020-05-12 16:36:04 +01:00
Rupert Swarbrick
9e19d3ea63 Check for correct "high" bits in icache core protocol checker
2020-05-12 12:08:50 +01:00
Rupert Swarbrick
22b0609b4f Weaken some checks on cache in ibex_icache_core_protocol_checker
Once the cache has passed an error to the core, we now allow it to
wiggle its valid, addr, rdata, err and err_plus2 lines however it sees
fit until the core issues a new branch.

Since the core isn't allowed to assert ready until then, the values
will not be read and this won't matter.

This was exposed by

  make -C dv/uvm/icache/dv run SEED=1314810947 WAVES=1
2020-05-12 12:08:50 +01:00
Rupert Swarbrick
d51d970089 Fix assertion in ibex_icache_core_protocol_checker
This assertion is supposed to say "the core may not request more data
from the cache when there's no valid address".

Unfortunately, I'd represented "requesting more data" by req being
high, rather than ready being high. This is wrong: req is a signal
saying "the core isn't currently asleep". ready (of a ready/valid
pair) is the one I wanted.
2020-05-12 12:08:50 +01:00
Udi
4814b6776f Update google_riscv-dv to google/riscv-dv@162ea73
Update code from upstream repository https://github.com/google/riscv-dv
to revision 162ea7312d21ac0b8ae73669fb68bf284b68f851

* Add experimental python based generator (google/riscv-dv#567)
  (taoliug)
* Check return code for ovpsim (google/riscv-dv#566) (taoliug)
* fix bug in PMP handler routine (google/riscv-dv#562) (udinator)

Signed-off-by: Udi <udij@google.com>
2020-05-11 13:35:21 -07:00
Rupert Swarbrick
592b9fb793 Add an empty common_cov_excl.el
Our hjson-based logic for constructing VCS commands always passes
-elfile, but this doesn't work if the following list of arguments is
empty.

It seems difficult to figure out how to teach dvsim.py to do something
like "prepend X to Y if Y is nonempty", so let's just add an empty
file for now.
2020-05-11 17:40:24 +01:00
Rupert Swarbrick
ac7da2b274 Allow coverage collection in icache/dv/Makefile
2020-05-11 17:40:24 +01:00
Rupert Swarbrick
ff5c0c5823 Always assert ready in core driver for ICache UVM testbench
This works around a bug tracked in issue #850.
2020-05-11 16:28:48 +01:00
Tom Roberts
863fb56eb1 [dv/cs_registers] Remove .* binding
- Only specifying the signals that the TB cares about means people will
  no longer have to update this file every time the cs_regs port list
  changes

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-07 14:58:28 +01:00
Rupert Swarbrick
4f349a094e Specify "-xlrm uniq_prior_final" for VCS
As discussed in issue #845, this tells VCS to wait for signals to
settle in combinatorial blocks before checking uniqueness in
constructs like unique case.

Otherwise things like this can cause spurious warnings:

  always_comb b = ~in;

  always_comb c = in;

  always_comb begin
    unique case (1'b1)
      b: x = 1;
      c: x = 0;
      default: x = 0;  // not that it matters, but this won't happen
    endcase
  end

For example, on a falling edge of the in signal, if the processes are
executed in the order 1, 3, 2 then the unique case block will appear
to see both b and c true at the same time.
2020-05-07 10:22:01 +01:00
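Note on 4f349a094e: the ordering problem can be reproduced with a toy Python model of the three always_comb processes. This is not how a simulator schedules events; it runs each process once in a chosen order on a falling edge of `in` and counts how many case items match when the unique case is evaluated. The function name `simulate_falling_edge` is hypothetical.

```python
def simulate_falling_edge(order):
    """Run processes 1 (b = ~in), 2 (c = in) and 3 (unique case) in order.

    Returns the number of case items that matched when process 3 ran.
    """
    # Before the edge 'in' was 1, so b = 0 and c = 1; 'in' has just become 0.
    sig = {"in": 0, "b": 0, "c": 1}
    matches = None
    for proc in order:
        if proc == 1:
            sig["b"] = 1 - sig["in"]
        elif proc == 2:
            sig["c"] = sig["in"]
        elif proc == 3:
            matches = sig["b"] + sig["c"]
    return matches


glitchy = simulate_falling_edge((1, 3, 2))   # case sees b and c both true
settled = simulate_falling_edge((1, 2, 3))   # case sees only b true
```

In the (1, 3, 2) order the case statement is evaluated while `c` still holds its old value, which is the transient that the uniqueness check should ignore.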
Udi
e1ec5b63f8 Update google_riscv-dv to google/riscv-dv@ace2805
Update code from upstream repository https://github.com/google/riscv-dv
to revision ace2805b63100f46c3dcd02b4fcf6a7184582110

* Fix vector instruction randomization (google/riscv-dv#560) (taoliug)
* Change generate_instr_stream to a virtual function
  (google/riscv-dv#559) (taoliug)
* fix bug with compressed ebreak generation (google/riscv-dv#557)
  (udinator)
* update PMP exception handlers to 'fix' config CSRs
  (google/riscv-dv#546) (udinator)
* Add bitmanip doc (google/riscv-dv#555) (weicaiyang)
* specify physical pmp addresses from cmdline (Udi Jonnalagadda)
* Fix branch hit coverage issue (google/riscv-dv#551) (taoliug)
* B extension coverage part2 (google/riscv-dv#548) (weicaiyang)
* B extension coverage part1 (google/riscv-dv#542) (weicaiyang)
* Fix typo in riscv_instr_test_lib (google/riscv-dv#545) (ANIL SHARMA)
* Add target rv64imcb (google/riscv-dv#543) (weicaiyang)

Signed-off-by: Udi <udij@google.com>
2020-05-07 01:23:20 -07:00
Udi
b72d263eac [dv] Manually update dvsim config files
Signed-off-by: Udi <udij@google.com>
2020-05-05 10:42:56 -07:00
Udi
7710947fe6 [dv] enable writeback stage and branch ALU
Signed-off-by: Udi <udij@google.com>
2020-05-05 09:09:33 -07:00
Tom Roberts
15ab023e25 [rtl] Stop regfile writeback for load errors
- A data or PMP error should stop the register file from being updated
- Fixes #832

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-05 09:26:55 +01:00
Tom Roberts
9451df2965 [prim] Split out primitives used by icache
- All primitives the icache uses are specified in distinct core files
  with names that match those existing (or about to exist) in OpenTitan
- When vendoring-in Ibex, none of those primitives need to be copied
  across, since OpenTitan will use its own versions
- Relates to lowRISC/opentitan/#1231

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-04 17:19:58 +01:00
Udi
7dafdf6456 [dv/fcov] Fix Ibex log parsing script
Signed-off-by: Udi <udij@google.com>
2020-05-04 02:28:06 -07:00
Rupert Swarbrick
717cb90aef Rework choosing new seeds in icache UVM memory model
The flow for a memory fetch is:

  1. Cache requests data for a memory address
  2. Agent spots the request, maybe signalling a PMP error
  3. Grant line goes high, at which point the request is granted.
  4. Sometime later (in-order pipeline), agent sends a response

Occasionally, we need to pick a new seed for the backing memory.
Before this patch, we picked these seeds at point (3).

Unfortunately this was wrong in the following case:

  1. We're switching from seed S0 to seed S1.
  2. The request is spotted with seed S0 and doesn't signal a PMP error
  3. The request is granted and we switch to seed S1.
  4. We respond with data from memory based on S1, with no memory
     error either

If S1 would have caused a PMP error, the resulting fetch (no error,
but data from S1) doesn't match any possible seed and the scoreboard
gets confused.

This patch changes to picking new seeds at (2) to solve the problem.
This isn't quite enough by itself, because if a request is granted on
a clock-edge, a new request address might appear and there isn't a
guaranteed ordering in the simulation between the new request and the
old grant (both things happen at the same time). To fix this, the
response sequence now maintains a queue of requests and their
corresponding seeds to make sure that all the checks for a fetch are
done with a single seed.

The patch also gets rid of the seed state in the memory model: it
turns out that this didn't really help: the scoreboard is always
asking "what would I get with this seed?" and now the sequence is
doing something similar.
2020-05-04 09:57:30 +01:00
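Note on 717cb90aef: the failure described above comes from answering one fetch with two different seeds. The Python sketch below uses a made-up memory model to show this; `pmp_error` and `read_data` are hypothetical stand-ins for the real model, chosen only so that the seed affects both the error decision and the data.

```python
MASK32 = 0xFFFFFFFF


def pmp_error(seed: int, addr: int) -> bool:
    # Hypothetical rule: which addresses fault depends on the seed.
    return ((seed ^ (addr >> 2)) & 3) == 0


def read_data(seed: int, addr: int) -> int:
    # Hypothetical backing memory: data depends on seed and address.
    return (seed * 0x9E3779B1 + addr) & MASK32


def respond(addr, seed_at_request, seed_at_grant, single_seed):
    """Build (error, data) for a fetch, using one seed or two."""
    data_seed = seed_at_request if single_seed else seed_at_grant
    return pmp_error(seed_at_request, addr), read_data(data_seed, addr)


def matches_some_seed(addr, response, seeds) -> bool:
    """Scoreboard-style check: is the response explained by one seed?"""
    return any((pmp_error(s, addr), read_data(s, addr)) == response
               for s in seeds)


S0, S1, ADDR = 1, 4, 0x10   # S1 would have signalled a PMP error at ADDR
old_ok = matches_some_seed(ADDR, respond(ADDR, S0, S1, False), [S0, S1])
new_ok = matches_some_seed(ADDR, respond(ADDR, S0, S1, True), [S0, S1])
```

With these values the mixed-seed response matches neither seed, while the single-seed response matches S0.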
Tom Roberts
c862f104af [rtl] icache error signalling fix
- Data valid should only be signalled when the current beat is
  signalling an error
- PMP errors for future beats can sneak in while waiting for the
  current beat

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-04 09:16:55 +01:00
Tom Roberts
5bcacb876d [rtl] Fix jump signal stuck high during stall
- Stalls due to preceding memory accesses in the WB stage shouldn't
  cause the jump signal to remain high.
- The jump signal being stuck high causes repeated memory accesses to
  the same address, and unnecessary stalling.
- Fixes lowRISC/opentitan#2099

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-04 08:28:59 +01:00
Pirmin Vogel
fd01562ff7 [doc] Minor fixes
Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
2020-05-01 20:09:59 +02:00
Pirmin Vogel
3922b2582f [rtl] Rework generation and use of mult/div_sel/en
Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
2020-05-01 17:29:59 +02:00
Pirmin Vogel
511c59db18 [rtl] Switch multdiv_en to multdiv_sel where possible
Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
2020-05-01 17:29:59 +02:00
Rupert Swarbrick
439513ba68 Fixes to invalidation logic in icache core agent
Firstly, the pulse shouldn't be zero length (since that wouldn't
actually do anything).

Also, 1 time in 500 is too rare for either invalidations or "long
invalidations", I think, so this patch also increases how often we see
each.
2020-05-01 10:21:45 +01:00
Rupert Swarbrick
34098bc315 Increase length of icache tests
Now that we have a working framework, let's drive some more items
through (a bit more efficient than running more tests, and also less
skewed by the initial cache invalidation).
2020-05-01 10:21:45 +01:00
Rupert Swarbrick
f717c04ad1 Initial icache scoreboard
This correctly tracks fetch addresses and fetched data. It understands
changing memory seeds, errors, invalidations and enable/disable.

Most of the complexity is in checking whether a fetch got the right
answer, given the set of memory seeds that it might have used. This
isn't conceptually hard (use a local memory model; check each seed and
see whether it matches), but is a bit fiddly in practice. In
particular, a misaligned 4-byte load might actually correspond to two
different seeds: note the nested loops in check_compatible_2.

The general flow of these checks is:

     check_compatible
  -> check_compatible_<i>    (loop over plausible seeds)
  -> is_seed_compatible_<i>  (ask the memory what data to expect)
  -> is_fetch_compatible_<i> (compare seen/expected data)

Note that the check_compatible_<i> functions have a "chatty"
parameter. This is to help with debugging when something goes wrong:
if the check fails then the check_compatible function calls it again
with chatty=1, which dumps a list of the seeds we checked and why they
didn't match.

Other than the scoreboard itself, this patch also adds a "seed" field
to ibex_icache_mem_bus_item. This is used by the monitor to inform the
scoreboard when a new memory seed has been set. Obviously, this
doesn't correspond to any actual monitored signals, but we need some
sort of back channel, and this looked like a sensible way to do it.

The patch also stops reporting PMP responses to the scoreboard: it
turns out the scoreboard doesn't need to care about them, so we can
simplify things slightly this way.

At the moment, the scoreboard doesn't check that fetched data isn't
stored when the cache is disabled. You could see this by disabling the
cache, fetching from an address, enabling the cache and changing the
memory seed, and fetching from the address again. I think it would be
reasonably easy to make an imprecise version of the check, where a
seed gets discarded from the queue if its live period is completely
within a period where the cache was disabled, but I want to wait until
we've got some tests that actually get cache hits before I implement
this.

There's also a slight imprecision in the busy line check that needs
tightening up.

Both of these to-do items have TODO comments in the code.
2020-05-01 08:49:23 +01:00
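Note on f717c04ad1: the nested loops for a misaligned 4-byte fetch, and the "chatty" re-run on failure, can be sketched in Python as follows. This is a simplified illustration, not the scoreboard code: the memory is modelled as a caller-supplied `read_half(seed, addr)` function returning 16 bits, and the signatures are assumptions of this sketch.

```python
from typing import Callable, Sequence

HalfReader = Callable[[int, int], int]   # (seed, addr) -> 16-bit value


def check_compatible_2(seen: int, addr: int, seeds: Sequence[int],
                       read_half: HalfReader, chatty: bool = False) -> bool:
    """Check a misaligned fetch whose two halves may use different seeds."""
    for seed_lo in seeds:        # seed for the word holding the low half
        for seed_hi in seeds:    # seed for the word holding the high half
            want = (read_half(seed_lo, addr)
                    | (read_half(seed_hi, addr + 2) << 16))
            if want == seen:
                return True
            if chatty:
                print(f"seeds ({seed_lo}, {seed_hi}): "
                      f"expected {want:#010x}, saw {seen:#010x}")
    return False


def check_compatible(seen: int, addr: int, seeds: Sequence[int],
                     read_half: HalfReader) -> bool:
    if check_compatible_2(seen, addr, seeds, read_half):
        return True
    # Re-run in chatty mode purely to explain the failure.
    check_compatible_2(seen, addr, seeds, read_half, chatty=True)
    return False


def toy_read_half(seed: int, addr: int) -> int:
    # Toy memory, for demonstration only.
    return (seed * 31 + addr) & 0xFFFF


mixed = toy_read_half(1, 2) | (toy_read_half(2, 4) << 16)
mixed_ok = check_compatible(mixed, 2, [1, 2], toy_read_half)
```

The fetch in `mixed` takes its low half from seed 1 and its high half from seed 2, so it only passes because both loops range over every plausible seed.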
Udi
be9af77b35 [dv] makefile:cov LSF arg fix
Signed-off-by: Udi <udij@google.com>
2020-04-29 14:41:25 -07:00