- Before this fix, the branch signal was qualified by the illegal
instruction signal and the illegal csr signal.
- This patch removes both of these since the decoder already masks
branches with illegal isntruction, and a branch cannot be a CSR op.
- This improves the worst path in the design significantly without the
branch target ALU.
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
- Create separate operand muxes for the branch/jump target ALU
- Complete jump instructions in one cycle when BT ALU configured
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
- valid_o could be asserted for one cycle then dropped when receiving
rvalid data for a request which has branched into the middle of a
line.
- This fix keeps valid_o asserted by using the offset version of
fill_rvd_cnt_q (fill_rvd_beat) to compare against fill_out_cnt_q
(which is also offset by the branch).
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
- Speculative requests observing a PMP error shouldn't increment the
external request counter
- Remove redundant logic on fill_rvd_exp
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
- Add parameters and actual instantiation of icache
- Add a custom CSR in the M-mode custom RW range to enable the cache
- Wire up the cache invalidation signal to trigger on fence.i
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
mtval should record which half of the instruction caused the error
rather than just recording the PC.
An extra signal is added in the IF stage to indicate when an error is
caused by the second half of an unaligned instruction. This signal is
then used to increment the PC by 2 for mtval capture on an error.
Fixes#709
Instruction requests triggering PMP errors have their external request
suppressed. The beat counting logic therefore needs to know that these
requests will never receive any rvalid data responses.
This fix stops the external request counter from incrementing, and marks
all external requests complete as soon as any error is received.
The data in the cache line beyond the error is not required since the
core cannot access it without consuming the error first.
- Bring in a version of ram primitive with configurable width similar to
the OT RAM primitive.
- Change the RAM banking structure to be a single bank of LineSize (64
bits) to match the upcoming ECC granularity.
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
The design currently relies on fill_done remaining set in the cycle
after the fill buffer completes to ensure the fill_older_q entry gets
cleared (when a fill buffer completes in the same cycle that one is
allocated). This fix makes the behaviour a bit more consistent and easy
to reason about.
Verible doesn't do real pre-processing currently, and fails to parse
code if define sections span across headers of blocks, as we did.
Use another syntax instead for the one case where we did that to work
around this limitation. The code isn't less readable as result, making
this an acceptable trade-off.
Works around https://github.com/google/verible/issues/228
- ready_i input to the prefetch buffer factored both it's own valid_o
output and the pc_set branch signal, neither of which are required.
- Refactoring the ready_i signal to just id_in_ready_i improves timing
significantly for the icache.
- Also removed offset_in_init signal which appeared to serve no purpose.
The third pipeline stage is a new writeback stage. Ibex can now be
configured as the original two stage design or the new three stage
design using the `WritebackStage` parameter in ibex_core. This defaults
to 0 (giving the original two stage design).
The three stage design is *EXPERIMENTAL*
In the three stage design all register write back occurs in the third,
final stage. This allows a cycle for responses to loads and stores so
when the memory system can respond in a single cycle there will be no
stall. This offers significant performance benefits.
Documentation of the three stage design is still to be written so
existing documentation applies to the two stage design only as various
aspects of Ibex behaviour will change in the three stage design.
Signed-off-by: Greg Chadwick <gac@lowrisc.org>
This should cause no functional change, but avoids a seeming
combinatorial loop reported by Verilator.
The seeming loop is because the always_comb process that contained
have_instr is sensitive to if_id_pipe_reg_we but that process wrote
the have_instr signal, which is used in the continuous assignment to
if_id_pipe_reg_we later on.
By default, variables in functions are static in SystemVerilog. This caused `string desc = "";` in `get_fence_description` to be executed only once, i.e. the text was continuously extended from the last call.
Mark all functions `automatic` to get behavior as one would expect from normal functions.
Following RISC-V privileged architecture version 1.11,
the "E" bit of misa should return the complement of the "I" bit.
Set the "I" bit only if RV32E is not used.
CloseslowRISC/ibex#612.
Use of case inside in always_ff block does not meet style guide
recomendations. Refactored to remove this.
Signed-off-by: Greg Chadwick <gac@lowrisc.org>
* Integrate option to implement a multiplier using 3 parallel 17 bit
multipliers in order to compute MUL instructions in 1 cycle
MULH in 2 cycles.
* Add parameter SingleCycleMultiply to select single cycle
multiplication.
The single cycle multiplication capability is intended for FPGA
targets. Using three parallel multiplication units improves performance
of multiplication operations at the cost of DSP primitives. For ASIC
targets, the area consumed by the multiplication structure will grow
approximately 3-4x.
The functionality is selected within the module using the parameter
`SingleCycleMultiply`. From the top level it can be chosen by setting
the parameter `MultiplierImplementation` to 'single_cc'.
Signed-off-by: ganoam <gnoam@live.com>
- The slow multiplier is modified to terminate iterations early instead
of always going the full 32 iterations for `MUL` instructions.
- Multiplications now terminate early after clog2(`op_b`) iterations.
- The slow multiplier can be further optimized by swapping the smaller
operand into `op_b` when in the `MD_IDLE` state.
This commit modifies the `mip` CSR to not depend on the `mie` CSR. While
the values of both these CSRs are combined to decide whether an
interrupt shall be handled, the RISC-V spec does not state that the
content of of `mip` should depend on `mie`. This commit better aligns
Ibex with other open-source RISC-V cores.
This resolveslowRISC/ibex#567 reported by @pfmooney.
Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
multdiv_sel signals the mult/div operand should be selected for the ALU
inputs. Previously the mult_en/div_en signals were used but these factor
in whether the instruction is actually happening which is not relevant
for the mux select. The dedicated select signal gives better timing.
Adds a second set of instruction flops that are used to determine ALU
operation and operand selection. This reduces fanout from the
instruction flops and so helps timing.
prim_assert.sv is a file containing assertion macros (defines).
Previously, prim_assert.sv was compiled as normal SystemVerilog file.
This made the defines available for the whole compilation unit as soon
as they were defined. Since all cores using prim_assert depended (in
fusesoc) on the lowrisc:prim:assert core, prim_assert was always
compiled first, and the defines were visible in subsequent files.
All of that is only true if all files end up in one comilation unit. The
SV standard states that what makes up a compilation unit is
tool-defined, but also states that typically, passing multiple files (or
a file list/.f file) to a single tool invocation means that all files
end up in one compilation unit; if the tool is called multiple times,
then the files end up in separate compilation units.
Edalize (the fusesoc backend) doesn't guarantee either behavior, and so
it happens that for Vivado, Verilator, Cadence and Synopsys simulators,
all files are compiled into a single compilation unit. But for Riviera,
each file is a separate compilation unit.
To avoid relying on the definition of compilation units, and to do the
generally right thing (TM), this commit changes the code to always
include the prim_assert.sv file when it is used in a source file.
Include guards are introduced in the prim_assert.sv file to avoid
defining things twice.
This commit adds a register file designed to be synthesized into FPGA
synchronous-write / asynchronous-read design elements.
For the artya7-100 example, the register file is implemented by 12
RAM32M primitives, conserving approximately 600 Logic LUTs and 1000
flip-flops at the expense of 48 LUTRAMs.
Signed-off-by: ganoam <gnoam@live.com>
Refactors performance counters so only flops that are required from the
given parameters are explicitly elaborated without relying on
optimization to remove unused flops.
Fixes#473
This commit ensures that the assertions in the compressed decoder do
not fire if the compressed decoder sees invalid data from the prefetch
buffer.
Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
This commit modifies the compressed decoder to forward the incoming
instruction to the output. It is marked as legal, unless:
1) the decoder cannot determine if the instruction is compressed (e.g.
because of unknown selector bits), or
2) the instruction is compressed but
a) it cannot be successfully decompressed (e.g. because of unknown
selector bits), or
b) it is indeed illegal.
In the case of 2b) the compressed decoder may output an illegal
decompressed instruction instead of the incoming instruction.
Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
This commit replaces all X assignments in the RTL with defined
values. In addition, SystemVerilog Assertions are added to catch
invalid signal values in simulation. A new file containing the
corresponding assertion macros is added as well.
Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
- Add the minimum amount of trigger system to support GDB hbreak
- Only a single trigger is implemented
- Only instruction address matching
- Only break into debug mode (no native debug)
- Fixes#382
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>