Update code from upstream repository
https://github.com/google/riscv-dv to revision
42264b7782a10848935e995063c212893820e561
* fix pmp generation in bare program mode (Udi Jonnalagadda)
* Use literal instead array concatenation (Daniel Mlynek)
* fix access rights (Daniel Mlynek)
* fix in WA for Aldec Riviera: rand cannot be defined in packed struct
(Daniel Mlynek)
* Fix ius compile error (Weicai Yang)
* fix pmp randomization to adhere to max offset (Udi Jonnalagadda)
* Add options to enable bitmanip by group (google/riscv-dv#532)
(weicaiyang)
* [pmp] Relative addressing scheme to configure pmpaddr
(google/riscv-dv#534) (udinator)
* redundant variable ALDEC_PATH removed (danielmlynek)
* riviera 2020.04 beta initial support (danielmlynek)
* Removed system function call from the gen_section() function
arguments list. (google/riscv-dv#531) (Dariusz Stachańczyk)
* Dynamic arrays declared as parameter changed to const variables.
(google/riscv-dv#530) (danielmlynek)
* enhance pmp configuration to make safe region configurable (Udi
Jonnalagadda)
* Fix a typo in riscvOVPsim (google/riscv-dv#529) (weicaiyang)
Signed-off-by: Udi <udij@google.com>
This commit fixes three possible cases for erroneous generation of
illegal instruction signals. Also, the bit-slices considered for
decoding ALU instructions are corrected to better reflect their
encoding specifications.
* Fix decoding of orc_b in illegal_insn generation.
* Insn[31] is no longer checked for generation of illegal instructions:
This bit is part of the rs3 register address for ternary
bit manipulation instructions (zbt).
* Correct bit-slicing for ALU reg-immediate instructions according
to specification: immediates are encoded in the range
insn[26:20] in all cases. Where a shift-amount is encoded, bits
[26:25] will have no effect, but will no longer generate
illegal instructions.
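The bit-slicing described above can be sketched in Python as follows. This is only an illustration of the field positions named in this message; the function names are hypothetical and do not come from the Ibex RTL.

```python
def imm_field(insn: int) -> int:
    """Return insn[26:20], the 7-bit field named in the commit message."""
    return (insn >> 20) & 0x7F


def shift_amount(insn: int) -> int:
    """Return insn[24:20], the 5-bit shift amount.

    Bits [26:25] of the field are masked off, so they have no effect
    on the shift amount.
    """
    return imm_field(insn) & 0x1F
```

With this slicing, two encodings that differ only in bits [26:25] yield the same shift amount.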
Signed-off-by: ganoam <gnoam@live.com>
Since this agent doesn't currently do any monitoring (will be
addressed in a later patch), the monitor_cb clocking block doesn't do
very much at the moment.
The driver_cb clocking block *is* used, though. The input lines are just
those needed to drive things correctly (ready is needed to do
ready/valid signalling properly; err is needed to abort instruction
fetches and do a branch after an error).
I've marked the output signals as negedge: this doesn't really make
any difference to simulation results, since the design samples
everything on posedge, but makes it rather easier to read dumped
waves.
This commit implements the Bit Manipulation Extension ZBT instruction
group: cmix, cmov, fsr[i] and fsl. These instructions depend on
three ALU operands. Completion of these instructions takes 2 clock
cycles. Additionally, the rotation shifts rol and ror are made
multicycle instructions.
All multicycle instructions take exactly two cycles to complete.
Architectural additions:
* Multicycle Stage Register in ID stage.
multicycle_op_stage_reg
* Decoder generates alu_multicycle signal, to stall pipeline
* For all ternary instructions:
1. cycle: connect alu operands a and b to rs1 and rs2
respectively
2. cycle: connect operands a and b to rs3 and rs2
respectively
* Reduce the physical size of the shifter from 64 bit to 63
bit: 32-bit operand + 1 bit for arithmetic / one-shift
* Make rotation shifts multicycle instructions.
Instruction Details:
* cmov:
1. store operand a (rs1) in stage reg.
2. return stage reg output (rs2) or rs3.
if rs2 != 0 the output (rs1) is already known in the
first cycle. -> variable latency implementation is
possible.
* cmix:
1. store rs1 & rs2 in stage reg
2. return stage_reg_q | (rs3 & ~rs2)
reusing bwlogic from zbb
* rol/ror: (here: ror)
shift_amt = rs2 & 31;
shift_amt_compl = (32 - shift_amt) & 31
1. store (rs1 >> shift_amt) in stage reg
2. return (rs1 << shift_amt_compl) | stage_reg_q
* fsl/fsr:
For funnel shifts, the order of applying the shift
amount or its complement is determined by bit [5] of
shift_amt. Pseudocode for fsr:
shift_amt = rs2 & 63
shift_amt_compl = (32 - shift_amt[4:0])
1. if (shift_amt >= 33):
store (rs1 >> shift_amt_compl[4:0]) in stage reg
else if (shift_amt > 0 && shift_amt <= 31):
store (rs1 << shift_amt[4:0]) in stage reg
else if (shift_amt == 32 || shift_amt == 0):
store rs1 in stage reg
2. if (shift_amt >= 33):
return stage_reg_q | (rs3 << shift_amt[4:0])
else if (shift_amt > 0 && shift_amt <= 31):
return stage_reg_q | (rs3 >> shift_amt_compl[4:0])
else if (shift_amt == 32):
return rs3
else if (shift_amt == 0):
return rs1
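As a sanity check of the two-cycle rol/ror scheme listed above, here is a minimal Python model of the ror case. It is a behavioural sketch only, not the RTL: stage_reg_q stands in for the multicycle stage register, and the function names and the single-step reference function are assumptions added for comparison.

```python
MASK32 = 0xFFFFFFFF


def ror_two_cycle(rs1: int, rs2: int) -> int:
    """Model ror as two shift steps joined through a stage register."""
    rs1 &= MASK32
    shift_amt = rs2 & 31
    shift_amt_compl = (32 - shift_amt) & 31
    # Cycle 1: store the right-shifted operand in the stage register.
    stage_reg_q = rs1 >> shift_amt
    # Cycle 2: left-shift by the complement and OR with the stored value.
    return ((rs1 << shift_amt_compl) & MASK32) | stage_reg_q


def ror_reference(rs1: int, rs2: int) -> int:
    """Plain 32-bit rotate right, used only to cross-check the model."""
    rs1 &= MASK32
    shift_amt = rs2 & 31
    return ((rs1 >> shift_amt) | (rs1 << (32 - shift_amt))) & MASK32
```

For a shift amount of 0 both cycles see the unshifted operand, so the OR returns rs1 unchanged.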
Signed-off-by: ganoam <gnoam@live.com>
We need this specific edalize version because recent verilators have
got pickier about string parameter passing, breaking the
"MultiplierImplementation" parameter.
As well as teaching check_tool_requirements.py to get the edalize
version from pip3, this patch also does a bit of tidying up, coping
better if tool_requirements.py is missing or malformed.
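One way to read an installed package version from pip3 is to parse the "Version:" line printed by `pip3 show`. The sketch below assumes that approach; the function names are hypothetical and this is not the code in check_tool_requirements.py.

```python
import subprocess
from typing import Optional


def parse_pip_show_version(output: str) -> Optional[str]:
    """Return the value of the 'Version:' line, or None if there is none."""
    for line in output.splitlines():
        if line.startswith("Version:"):
            return line.split(":", 1)[1].strip()
    return None


def get_pip_package_version(package: str) -> Optional[str]:
    """Run 'pip3 show <package>' and return its version, or None on failure."""
    try:
        proc = subprocess.run(
            ["pip3", "show", package],
            stdout=subprocess.PIPE,
            stderr=subprocess.DEVNULL,
            universal_newlines=True,
            check=False,
        )
    except OSError:
        # pip3 is not installed or could not be executed.
        return None
    if proc.returncode != 0:
        # pip3 ran but does not know the package.
        return None
    return parse_pip_show_version(proc.stdout)
```

Returning None instead of raising lets the caller report a missing tool alongside the other requirement checks.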
There is more than one icache-specific agent that we need for the
icache testbench, so "ibex_icache_agent" isn't a very helpful name.
This commit was pretty much automated, except for a few spacing
cleanups, with commands like:
git grep -l ibex_icache_agent | \
xargs sed -i 's!ibex_icache_agent!ibex_icache_core_agent!g'
(and then rename the directory and files).
This fills in the sequencer, driver etc. to actually drive signals.
You can "run" a test with
make -C dv/uvm/icache/dv run
This won't do anything useful (it will stop with a timeout) because
there is no memory agent yet.
This patch also includes a hacky test timeout. We'll remove this (or
at least make it bigger) when we start actually running data through
the tests, but this is handy for now because it means simulations
finish without having to pkill them.
Fusesoc has an unfortunate bug[1] where a boolean parameter which has
default true can't be disabled. For now, just make all our boolean
parameters back into integers again. In the future, when that's fixed,
maybe we should switch things back.
[1] https://github.com/olofk/fusesoc/issues/392
Multi-config CI wasn't actually trying multiple configurations. This
fixes that issue and uses a less fragile method of producing fusesoc
options. They are generated once and stored in a variable so we cannot
accidentally break one or more steps by using an incorrect
ibex_config.py command in one step whilst using a correct
ibex_config.py in the display step (which is also intended to check the
ibex_config.py command is correct).
The new information is:
- Branch addresses must be 16-bit aligned.
- Explicitly allow top 16 bits of rdata to change when lower 16 bits
contain a compressed instruction.
- Explicitly allow the core to drop ready without valid.
I've also rejigged the layout slightly, improving (I think!) the
description of compressed and uncompressed instructions.
- A new parameter and a run-time control bit (DataIndTiming and
data_ind_timing) enabling different behaviour for running security critical
code sections.
- In the new mode, all branches act as if taken, with not-taken
branches executing as a branch to the next instruction.
- This should give similar execution time/power characteristics
regardless of the branch condition.
- Note that with the BranchTargetALU, branches stall an extra cycle in
secure mode to avoid factoring the branch-taken decision into the
branch target address mux.
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
IUS lacks support for certain language features used in Ibex (such as
use of $clog2 in localparam definitions) so remove it as a simulator
that can be used by dv.
Prior to this change the entire process would die on an issue with
processing a single log. Now such an issue just adds to the
failure count, the error is logged, and log processing continues to
the end.
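The behaviour described here can be sketched as a loop that catches per-log errors. This is a generic illustration under assumed names (process_logs, process_one), not the actual script.

```python
from typing import Callable, Iterable, List, Tuple


def process_logs(
    log_paths: Iterable[str],
    process_one: Callable[[str], bool],
) -> Tuple[int, int, List[str]]:
    """Process every log and return (passed, failed, error messages).

    process_one returns True for a passing log. If it raises, the log
    counts as a failure and processing moves on to the next log.
    """
    passed = 0
    failed = 0
    errors: List[str] = []
    for path in log_paths:
        try:
            ok = process_one(path)
        except Exception as err:
            # Record the problem instead of aborting the whole run.
            errors.append(f"{path}: {err}")
            failed += 1
            continue
        if ok:
            passed += 1
        else:
            failed += 1
    return passed, failed, errors
```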
When xprop is enabled various case and if/else constructs will propagate
X leading to failures in ASSERT_KNOWN. This introduces enable terms to
various ASSERT_KNOWN uses that would otherwise fail without them.
prim_assert.sv changes copied across from OpenTitan repository.
Note this doesn't introduce any testing of the RV32B instructions,
simply runs existing tests on a configuration with the RV32B extension
enabled.
Fixes #745
The previous code correctly dumped to "waves.fsdb" if you had Verdi
installed. Unfortunately, it dumped to the same file name if you
didn't, which was rather confusing.
This patch passes a "DUMP_BASE" environment variable, rather than
"DUMP_FILE", which doesn't include the extension. Then it appends the
correct extension at runtime in the TCL, when we tell VCS what sort of
dumping to do.
The code now also checks for all environment variables before reading
them, allowing defaults if they don't exist. The defaults might not be
what you want, but a syntax error at this point causes VCS to sit
waiting for terminal input (with no stdin!), which is kind of annoying.
I've also removed the copy-pasted Verdi documentation. Apart from
anything else, this is probably copyrighted, so we shouldn't have a copy
in the repo!
- Make parameter declaration order and default values in
ibex_core_tracing.sv match the documentation
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
Add a few sentences to describe the behaviour/meaning of the err_plus2_o
signal, and how it is used by the core.
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
This commit implements the Bit Manipulation Extension ZBB instruction
group: clz, ctz, pcnt, slo, sro, rol, ror, rev, rev8, orc.b, pack,
packu, packh, min, max, andn, orn, and xnor.
* Bit counting instructions clz, ctz and pcnt can be implemented to
share much of the architecture:
clz: Count Leading Zeros. Counts the number of 0 bits at the
MSB end of the argument.
ctz: Count Trailing Zeros. Counts the number of 0 bits at the
LSB end of the argument.
pcnt: Counts the number of set bits of the argument.
The implementation uses:
- 32 one bit adders, counting the set bits of a signal
bitcnt_bits, starting from the LSB end.
- For pcnt the argument is fed directly into bitcnt_bits.
- For clz, the operand is reversed such that leading zeros are
located at the LSB end of bitcnt_bits.
- For ctz and clz: the counter enable signal for 1-bit counter i
is high if the previous enable signal and
its corresponding bitcnt_bit were high.
* Instructions sll[i], srl[i], slo[i], sro[i], rol, ror[i], rev, rev8
and orc.b are grouped as shift and shift-related instructions:
The following instructions are slight variations of the
existing base spec's sll, srl and sra instructions.
- slo[i] and sro[i]: shift left/right ones: similar to
shift-logical operations from base spec, but shifting
in ones instead of zeros.
- rol and ror[i]: rotate left/right: circular shift
operations, shifting in values from the opposite end
of the operand instead of zeros.
Those instructions can be implemented, sharing the base spec's
shifting structure. In order to support rotate operations, a
64-bit shifting structure is needed.
In the existing ALU, hardware is described only for right
shifts. For left shifts the operand is initially reversed,
right shifted and the result is reversed back. This gives rise
to an additional resource sharing opportunity for some more
zbb operations:
- rev: bitwise reversal.
- rev8: byte-order swap.
- orc.b: byte-wise reverse and or-combine.
* Instructions min, max:
For the B-extension's min/max instructions, we can share the
existing comparison operations. The result is obtained by
activating the comparison structure accordingly and
multiplexing the operands using the comparison result.
* Logic-with-negate instructions andn, orn, xnor:
For the B-extension's logic-with-negate instructions we can
share the base spec's logic structures
already present for 'xor', 'or' and 'and' instructions as
well as the conditionally negated b operand generated for
subtraction operations.
* Instructions pack, packu, packh:
For the pack, packh and packu instructions I don't see any
opportunities for resource sharing. However, the architecture
is quite simple.
- pack: pack the lower halves of rs1 and rs2 into rd, with rs1
in the lower half and rs2 in the upper half.
- packu: pack the upper halves of rs1 and rs2 into rd, with
rs1 in the lower half and rs2 in the upper half.
- packh: pack the LSB bytes of rs1 and rs2 into rd, with rs1
in the lower half and rs2 in the upper half.
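The shared bit-counting scheme for clz, ctz and pcnt described above can be modelled in Python roughly as follows. This is a behavioural sketch, not the RTL. One assumption is made explicit: for clz and ctz the operand is inverted before counting, so that zero bits show up as set bits in bitcnt_bits and the enable chain stops at the first original 1 bit. The function names are hypothetical.

```python
MASK32 = 0xFFFFFFFF


def reverse32(value: int) -> int:
    """Reverse the bit order of a 32-bit value."""
    result = 0
    for i in range(32):
        if (value >> i) & 1:
            result |= 1 << (31 - i)
    return result


def bitcount(operand: int, op: str) -> int:
    """Model clz, ctz and pcnt with one counting loop from the LSB end."""
    operand &= MASK32
    if op == "pcnt":
        bitcnt_bits = operand
        chained = False
    elif op == "ctz":
        # Assumption: invert so that zeros become countable set bits.
        bitcnt_bits = ~operand & MASK32
        chained = True
    elif op == "clz":
        # Reverse first so leading zeros sit at the LSB end, then invert.
        bitcnt_bits = ~reverse32(operand) & MASK32
        chained = True
    else:
        raise ValueError(f"unknown operation: {op}")

    count = 0
    enable = True
    for i in range(32):
        bit = (bitcnt_bits >> i) & 1
        if chained:
            # Counter i is enabled only if the previous enable and
            # this bit are both high.
            enable = enable and bit == 1
            count += 1 if enable else 0
        else:
            count += bit
    return count
```

In this model an all-zero operand gives 32 for both clz and ctz, since the enable chain never breaks.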
Signed-off-by: ganoam <gnoam@live.com>
When a PMP error comes in, the cache doesn't quite behave as if the
request was granted (if it did: it would wait forever for a response).
Hopefully this version is a bit clearer.
Also, this makes explicit that the upper bits of a 16-bit instruction
fetch can be bogus.
The ibex_pkg.sv file is effectively a "header" with useful defines;
we need them in ibex_tracer_pkg.sv, and in other places around Ibex.
Currently, the dependency between ibex_tracer_pkg.sv and ibex_pkg.sv
isn't covered in a FuseSoC core file, leading to unstable behavior.
This patch adds this dependency by
- factoring out the ibex_pkg.sv file into a separate core file,
ibex_pkg.core, and
- adding a dependency on the new ibex_pkg core to the ibex_tracer core.