This is manually squashed with a change to import dv_base_reg too, a
new module that was created by Weicai's "csr backdoor support" patch.
It's needed because it is a dependency of dv_lib.
Update code from upstream repository
https://github.com/lowRISC/opentitan to revision
c91b50f357a76dae2ada104e397f6a91f72a33da
* [prim_ram*_adv] Update core files and add prim_util dependency
(Michael Schaffner)
* [prim_ram*_adv] Implement Byte parity in prim_ram*_adv (Michael
Schaffner)
* [dvsim] Run tests in "interleaved" order (Rupert Swarbrick)
* [dvsim] Remove unnecessary getattr/setattr calls from SimCfg.py
(Rupert Swarbrick)
* [dv] Add support for multiple ral models (Srikrishna Iyer)
* [rtl] Fix prim flash dependency (Srikrishna Iyer)
* [prim_fifo_sync] Make FIFO output zero when empty (Noah Moroze)
* [dv] csr backdoor support (Weicai Yang)
* [prim] Add a "clog2 width" function (Philipp Wagner)
* [dvsim] Allow max-parallel to be set in the environment (Rupert
Swarbrick)
* [dvsim] Fix --reseed argument (Rupert Swarbrick)
* [prim_ram/rom*_adv] Break out into individual core files (Michael
Schaffner)
* [prim_rom] Align port naming with prim_ram* (Michael Schaffner)
* [dv] Allow a test to have "simple" timestamps (Rupert Swarbrick)
* [dvsim] Improve --help message (Rupert Swarbrick)
* [dvsim] Remove unused --local argument (Rupert Swarbrick)
* [dvsim] Small tidy-ups to mode selection in SimCfg.py (Rupert
Swarbrick)
* [fpv] formal compile fix required by VC Formal (Cindy Chen)
* [dvsim] Fix error detection logic in Deploy.py (Rupert Swarbrick)
Signed-off-by: Rupert Swarbrick <rswarbrick@lowrisc.org>
- Remove the timing optimisations that delay the factoring-in of ecc
errors into valid_o.
- Optimisations are probably unnecessary here due to the minimal logic
hanging off valid_o, and the optimisations cause protocol checker
violations.
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
This matches the style used in the FF based register file. It gives each
register its own always block with a single enable rather than having
multiple registers with enables in a single always block.
Signed-off-by: Greg Chadwick <gac@lowrisc.org>
Gives the option of a maximum performance configuration without PMP
enabled, which is more of an orthogonal security feature.
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
By giving each register its own always_ff block clock gating is more
obvious to synthesis tools.
This also includes some minor naming tweaks to make use of the _q
convention for flops.
- If an interrupt arrives at the same time as a load/store instruction
is in ID stage, the interrupt must wait until load/store completes.
Without the WB stage this happens naturally as the core stalls. With
the WB stage, we need to allow the load/store to progress to the WB
stage (and clear the ID stage) then hold back the interrupt until it
completes.
- Also cleaned up some lsu related stalling terms and signal naming.
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
- Protocol-wise data_err_i is notionally X when !data_rvalid_i
- In addition, the design does not appear to rely on the asserted
behaviour
- Removing as it is firing in chip-level OT simulations
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
These cover points were extracted by reading down the icache
documentation (icache.rst). There aren't yet cover points to check
that the targets of the testplan were executed properly, nor are there
any uarch coverpoints (which would be bound into the design, rather
than the interface).
The rather elaborate flow of
sequence -> function -> trigger -> task -> covergroup
for cancelled_valid_cg follows a skeleton described in Doug Smith, "A
Practical Look @ SystemVerilog Coverage" (slides from a Doulos
course). I'm not completely convinced it's worth the effort, but I
guess it shows how to extract information from a temporal sequence in
the interface and shove it in a covergroup properly via the monitor.
This should have no functional change - it's still set iff branch is
set - but the logic now lies in the UVM code, rather than the
structural code in tb.sv.
This turns out to be reasonably easy to plumb in: derive from the core
sequence base class, overriding its run_req method (once I've
remembered to make it virtual). Then pick the right core sequence by
adding a factory override in the vseq.
This is an entry in the testplan. Renaming it to "oldval", because
suffixing every class name with "disable_without_invalidation" was
getting ridiculous.
When the existing code in drive_pmp() decided that an error needed
signalling, it waited until the request was dropped, or the address
changed, before clearing the PMP error.
This is fine, unless the memory seed is changed (by magical means!)
under our feet. The monitor spots a new request, but the driver needs
to know to clear the PMP error. This patch forcibly tells the driver
to drop the existing item if a new one comes in.
Without this, you get test failures if there are two back-to-back
branches to the same address that happen at the same time as a seed
update. The problem is that you only see one request transaction (with
the first seed), and the two memory responses both come back with the
first seed, when the second should have had the second seed.
One aspect of (i)cache design that I didn't know about before writing
test code for this block is the problem of multi-way hits. The icache,
as implemented, stores data to parallel ways and it's possible for a
fetch to match more than one way. The data from matching ways all gets
ORed together, which doesn't matter so long as it never
changes (because V | V == V for all V).
Of course, things go poorly if you have two different values, V and W,
at an address which are both stored in the cache. Then the result is V
| W, which isn't necessarily equal to either instruction.
Avoiding this needs priority encoders, which are rather large, so it
seems the usual approach is to disallow branching to modified code
before flushing the cache. This patch teaches the testbench to do this
properly.
Sadly, this means there's now a connection between the core agent and
the memory agent: the memory agent can no longer generate new seeds
whenever it pleases.
The test is the same, but the reordering means that if we see an error
that we weren't expecting, we'll complain about that, rather than
about the instruction data itself.
When a potential exception occurs in ID/EX controller must wait for any
outstanding instruction in WB to complete before resolving it. The
instruction in WB may also have an exception which takes priority over
ID/EX.
Signed-off-by: Greg Chadwick <gac@lowrisc.org>
With the writeback stage enabled the controller can see a load or store
error from the writeback stage whilst seeing some other fault/exception
from ID/EX (e.g. an instruction fetch error). The writeback stage fault
must take priority, however without the writeback stage the
priortisation changes.
This introduces more explicit prioritisation logic for faults/exceptions
and gives the correct prioritisation for configurations both with and
without a writeback stage.
Fixes#912
Signed-off-by: Greg Chadwick <gac@lowrisc.org>
- The speculative branch behaviour causes a performance degradation of
around 3% in the max config. This change enables that behaviour only
the maximum PMP config, which is where it is most needed for timing
closure.
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
- Split out address matching into less than, greater than and equals to
correctly match NAPOT, NA4 and TOR modes.
- Relates to #902
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
In practice, this check will only trigger if you constrain your core
to fetch in a tight loop for a while and you don't invalidate the
cache very often.
The check has an assumption about the cache size (at least 1kB), but
that only has an effect on the tightness of the loop needed before we
do any checking.
- The prefetch buffer needs to know when space is available in the fetch
FIFO to accept a new external request.
- This change updates that logic to look at what is in the FIFO and what
is outstanding on the bus to decide when space is available rather
than always assuming the maximum number of requests are outstanding.
- This improves the usage efficiency of the FIFO and fixes#574
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
The RAM primitive provides a way to specify the granularity of the write
mask (wmask) signal, which can be used to select an appropriate
implementation (e.g. a SRAM with only byte selects, or no subselects at
all).
- test_en_i is a DFT feature that shouldn't be enabled for normal
runtime testing
- Only really affects the clock gate in the design, but is needed for
running tests with the latch-based register file
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
Instead of using copies of primitives from OpenTitan, vendor the files
in directly from OpenTitan, and use them.
Benefits:
- Less potential for diverging code between OpenTitan and Ibex, causing
problems when importing Ibex into OT.
- Use of the abstract primitives instead of the generic ones. The
abstract primitives are replaced during synthesis time with
target-dependent implementations. For simulation, nothing changes. For
synthesis for a given target technology (e.g. a specific ASIC or FPGA
technology), the primitives system can be instructed to choose
optimized versions (if available).
This is most relevant for the icache, which hard-coded the generic
SRAM primitive before. This primitive is always implemented as
registers. By using the abstract primitive (prim_ram_1p) instead, the
RAMs can be replaced with memory-compiler-generated ones if necessary.
There are no real draw-backs, but a couple points to be aware of:
- Our ram_1p and ram_2p implementations are kept as wrapper around the
primitives, since their interface deviates slightly from the one in
prim_ram*. This also includes a rather unfortunate naming confusion
around rvalid, which means "read data valid" in the OpenTitan advanced
RAM primitives (prim_ram_1p_adv for example), but means "ack" in
PULP-derived IP and in our bus implementation.
- The core_ibex UVM DV doesn't use FuseSoC to generate its file list,
but uses a hard-coded list in `ibex_files.f` instead. Since the
dynamic primitives system requires the use of FuseSoC we need to
provide a stop-gap until this file is removed. Issue #893 tracks
progress on that.
- Dynamic primitives depend no a not-yet-merged feature of FuseSoC
(https://github.com/olofk/fusesoc/pull/391). We depend on the same
functionality in OpenTitan and have instructed users to use a patched
branch of FuseSoC for a long time through `python-requirements.txt`,
so no action is needed for users which are either successfully
interacting with the OpenTitan source code, or have followed our
instructions. All other users will see a reasonably descriptive error
message during a FuseSoC run.
- This commit is massive, but there are no good ways to split it into
bisectable, yet small, chunks. I'm sorry. Reviewers can safely ignore
all code in `vendor/lowrisc_ip`, it's an import from OpenTitan.
- The check_tool_requirements tooling isn't easily vendor-able from
OpenTitan at the moment. I've filed
https://github.com/lowRISC/opentitan/issues/2309 to get that sorted.
- The LFSR primitive doesn't have a own core file, forcing us to include
the catch-all `lowrisc:prim:all` core. I've filed
https://github.com/lowRISC/opentitan/issues/2310 to get that sorted.
https://github.com/lowRISC/opentitan/pull/2311 added the Verilator
memutils to OpenTitan as upstream. This commit is the second part of the
story, removing the code from the Ibex repository, and vendoring it back
in from OpenTitan.
This also superseded #844, which has now been included through
OpenTitan.