The only core which needs lint waivers is ibex_core.core; it will then
inherit those waivers to other cores, such as ibex_core_tracing.core,
and higher-up users of these cores (such as the simple system).
Also remove the Verible lint configuration, which happens to be the
default by now. This fixes#736 by making it unnecessary.
The lint target in ibex_core_tracing was used to also lint unrelated
files which are needed for some simulations (e.g. the simple system).
Remove them from there, as they really don't belong there.
Ibex depends on a clock gating primitive. This has always been the case;
previously, we have under-specified the dependency list by simply not
including this dependency. This is problematic not only because "IT'S
WRONG!", but also because it breaks self-contained targets, like
Verilator lint, which cannot run on ibex_core or ibex_core_tracing
without having a way to find the clock gating primitive.
Before we changed to vendoring in our RAM primitives from OpenTitan,
there was an SRAM_INIT_FILE define that you could stick in your core
file. That's now gone away, replaced by a parameter (MemInitFile). If
we want to plumb that in, we need to pass it properly through the
wrappers.
By now, ibex depends not only on the a clock gating cell, but also on
assert macros and a LFSR; there will be more to come. This removed piece
of documentation was from the early days, when only a clock gating cell
was to be provided (and we didn't ship one).
Today, users need to either use FuseSoC, or run fusesoc on the simple
system and "harvest" the resulting files if they want to copy-paste Ibex
into their build system. That's not ideal, but not something we can very
easily fix -- so let's remove the outdated documentation first to at
least reduce the confusion.
The waiver files currently don't support comments, and require the
"empty" regex; bugs have been filed to get that resolved upstream.
But beyond that, waivers are now functional.
Fix all remaining issues reported by Verible lint.
It turns out that #965 undid some of the fixes in `ibex_alu.sv`
that were done in #980 around the `SHUFFLE_*`/`FLIP_*` signals.
This turns out to be quite complicated, because the icache has a lot
of different counters that all track addresses or fill buffer state.
For an inductive proof to go through, you need to make the relations
between them explicit, which takes lots of assertions.
All of the signals defined in the formal_tb are prefixed with 'f_'.
This isn't strictly necessary, but it makes it much easier to see what
came from the design (since we are "bound in", our ports don't have _o
or _i suffixes).
We add a couple of protocol assumptions:
ICache <-> Core:
- Branch target addresses are half-word aligned
- The branch_spec signal is driven if branch is
ICache <-> Memory:
- The bus doesn't respond unless there is an outstanding request
- The bus doesn't grant more than 4 billion outstanding requests(!)
There's also some protocol state tracking:
- f_addr_valid tracks whether the ICache currently has an
architectural address. It goes high with branch_i (which gives the
cache an address) and goes low when the cache completes a
transaction with err_o set (since the data is bad, there's no
notion of a "next address").
- f_reqs_on_bus tracks the number of requests currently outstanding
on the bus. This is used for the ICache <-> Memory assumptions
above. We have some internal assertions that check this equals the
sum of the "ext" counters minus the sum of the "rvd" counters.
With these assumptions, we can prove:
- Once asserted, valid_o stays high until a transaction is completed
by ready_i or until branch_i is asserted (which cancels the
transaction).
- While the transaction is pending, addr_o remains stable.
- While the transaction is pending, err_o remains stable and, if
err_o is asserted, so does err_plus2_o.
- While the transaction is pending, if err_o and err_plus2_o are
high then bottom 16 bits of the returned instruction data imply an
uncompressed instruction.
- While the transaction is pending, if err_o is low then the bottom
16 bits of the returned instruction remain stable.
- While the transaction is pending, if err_o is low and the bottom
16 bits of the returned instruction imply an uncompressed
instruction then the top 16 bits of the returned instruction
remain stable.
To get this working, you need a corresponding patch in Edalize, which
adds SymbiYosys as an EDA tool.
At the moment, this proves a couple of simple bus assertions. Later
patches will add more.
There are currently some rough edges to this flow:
(1) We use a hacky pre_build hook to run sv2v and edit the files in
the work tree. Among other problems, this means that the any
failure messages that come out of sby have bogus line numbers.
(2) Since we haven't yet got bind support in Yosys, we have to
include a fragment from the design itself.
Verible is still experimental, but having a newer version in CI helps us
to make use of some of its newer features as we move forward (such as
waivers).
This commit contains some final optimizations regarding the bit
manipulation extension as well as the parametrization into a balanced
version and a full performance version.
Balanced Version:
* Supports ZBB, ZBS, ZBF and ZBT extensions
* Dual cycle instructions:
ror[i], rol, cmov, cmix fsl, fsr[i]
* Everything else completes in a single cycle.
Full Version:
* Supports all 32b sub extensions.
* Dual cycle instructions:
ror[i], rol, cmov, cmix fsl, fsr[i], crc32[c], bext, bdep
* Everything else completes in a single cycle.
Notable Changes:
* bext/bdep are now multi-cycle: Sharing additional register
with multiplier module
* grev/gorc instructions are implemented in separate structures
rather than sharing the shifter or butterfly network.
* Speed up decision on using rs1 or rs3 for alu_operand_a by
introducing single-bit register, to identify ternary
instructions in their first cycle.
* Introduce enumerated parameter to chose bit manipulation
implementation
Signed-off-by: ganoam <gnoam@live.com>
- invalidate all ways on a tag error to prevent xprop from the data
error signal and reduce the likelyhood of multi-way allocations
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
Although CSR_SECURESEED is unreadable, an attacker can write a new seed,
which brings convenience to the attack. This patch is used to XOR the
historical seed, so that even if the attacker writes a new value to
CSR_SECURESEED , he cannot know the value of the seed.
Signed-off-by: Xiang Wang <merle@hardenedlinux.org>
This PR adds clocking blocks to all major Ibex interfaces and updates
all corresponding interface accesses to use these clocking blocks.
A few notes:
- `ibex_mem_intf` has two driver clocking blocks, one for host side and
one for device side.
This is because our Ibex testbench currently provides both host and
device agents for both I/D interfaces (of course we only use the
reactive device agents in the main testbench).
- `csr_if` and `dut_if` only have one clocking block each, as all
signals in each will only be either sampled or driven, never both.
- Some utility tasks have been added to some interfaces to wait for a
specified number of clock cycles.
This is like the stress_all test, picking other sequences at random
and running them back-to-back. The difference is in the reset
behaviour, where we randomly pull the reset line at unexpected times
to try to trigger any strange glitches this might cause.
This requires slight changes to the core and memory drivers, which
need to learn to stop and return early from the current item when they
see a reset.
When we chain sequences together, we are careful to pass seeds between
neighbouring sequences. However, I didn't think to check
mem_err_shift. Before this patch, you see problems if you have a
"caching" sequence followed by a "many_errors" sequence with no reset
and no change of seed and they both happen to pick the same address
range.
The problem is that if the data at address A is cached in the first
sequence, the icache will merrily return it when address A comes up in
the second. However, the change to mem_err_shift might mean that this
would cause a memory error if it hadn't been cached, causing the
scoreboard to get upset.
This patch ensures that we always start a sequence with an
invalidation if there was a previous sequence with a different value
of mem_err_shift.
To do this cleanly, the patch also moves some of the "grab the guts of
the old sequence and put it in the new one" logic from
ibex_icache_combo_vseq and into the underlying sequence classes. The
trick is that a sequence now has a handle to the previous sequence (if
there was one), and can use that to extract whatever information it
needs.
This fixes several problems. Firstly, the window_reset function was
switching off tracking until it next saw busy_o go low, which is
correct at the start of time, but not what we want after we've
started. This patch splits that behaviour into a new tracking_reset
function (which calls window_reset). This is called on reset or
invalidate.
Secondly, this check was occasionally failing where we'd have an ECC
sequence (which should disable the check) immediately followed by a
caching sequence with similar addresses. If the window ended in the
caching sequence, we'd see a high fetch ratio and conclude that
something had gone wrong.
Now we clear the window completely whenever we fetch an instruction
when the check is disabled, which should avoid the problem (at worst,
you might get 1 instruction overlap, which is unlikely to matter).
Finally, we move the call to tracking_reset up to the end of the reset
sequence. It doesn't usually matter, but if there's a pending item
from the core monitor with busy = 0, we need to make sure that item
comes in before we set not_invalidating = 1. Otherwise, the scoreboard
incorrectly thinks it's seen the end of the invalidation
sequence (before it's even started) and starts tracking fetch ratios
too early.
This runs sequences back-to-back, occasionally resetting between
sequences.
Because our virtual sequences are composed of several smaller
sequences, we have to stop them when the core sequence finishes (see
the calls to kill() in ibex_icache_base_vseq). We also have to make
sure that we don't drop items in the memory sequence, which can be
pre-empted as part of sending a response (see the peek/get code
there).
Finally, the memory sequence also has a current seed and a list of
pending grants: this patch has to copy those across between sequences
to make everything work correctly.
This virtual sequence controls what sequence we use in the core agent
with a factory override. We need to make sure that we "tidy up" after
starting it, otherwise every sequence afterwards will use the wrong
core sequence.
This will have no effect for now: we just move the "pick a number in
the range 800..1000" logic to the virtual sequence.
The reason to do this is for tests that combine sequences: we want to
be able to shorten each component sequence so that the combined test
isn't way longer than the original ones were.
As with the ECC sequence, it turns out that you don't actually need
the separate test class for this, so this commit gets rid of it. The
advantage of doing this is that we can now chain this vseq with
others.
This has no immediate effect, but it means that the memory agent's
config's "mem_err_shift" value can be changed in the middle of the
test, rather than being fixed in the build_phase.
It turns out that you don't actually need the separate test class for
this, so this commit gets rid of it. The advantage of doing this is
that we can now chain this vseq with others.
Since we are binding in an interface anyway, we can add some SV
assertions to make sure nothing too strange is happening.
Note that they aren't as strong as you might expect: we don't check
that rdata isn't X, for example. This is because the cache makes
speculative reads, which it (hopefully) ignores if the data is
invalid.
It seems that dvsim.py doesn't actually use fusesoc to do things like
pass parameters. Instead, we have to set the tool-specific options in
the hjson file by hand.
Fixes issue #964.