- Move various unused signal fixes from the waiver file to the rtl, so
that all tools can pick them up.
- Fix some oversize line issues.
- Fix some signed / unsigned casting issues.
- Remove some extraneous semicolons.
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
Check that the number of cycles are always as specified for the current
configuration for data independent operations.
The required input signals for each arithmetic operation are split into
different files which are included into the testbench.
For each combination of operation and configured configuration
(slow/fast/single) a define stores the number of cycles in a separate
file. A target exists for each combination.
For a convenient execution the targets are grouped together in a
makefile.
The implementation is based on the formal/icache checks.
For the selection of the single cycle multiplication with the fast
multiplication the parameter is set directly to the enum integer value.
This commit replaces the previous combination of `RV32M` bit parameter
used to en/disable the M extension and the `MultiplierImplementation`
used to select the multiplier implementation by a single enum parameter.
Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
This commit contains some final optimizations regarding the bit
manipulation extension as well as the parametrization into a balanced
version and a full performance version.
Balanced Version:
* Supports ZBB, ZBS, ZBF and ZBT extensions
* Dual cycle instructions:
ror[i], rol, cmov, cmix fsl, fsr[i]
* Everything else completes in a single cycle.
Full Version:
* Supports all 32b sub extensions.
* Dual cycle instructions:
ror[i], rol, cmov, cmix fsl, fsr[i], crc32[c], bext, bdep
* Everything else completes in a single cycle.
Notable Changes:
* bext/bdep are now multi-cycle: Sharing additional register
with multiplier module
* grev/gorc instructions are implemented in separate structures
rather than sharing the shifter or butterfly network.
* Speed up decision on using rs1 or rs3 for alu_operand_a by
introducing single-bit register, to identify ternary
instructions in their first cycle.
* Introduce enumerated parameter to chose bit manipulation
implementation
Signed-off-by: ganoam <gnoam@live.com>
This commits implements the Bit Manipulateion Extension ZBT instruction
group: cmix, cmov, fsr[i] and fsl. Those are instructions depend on
three ALU operands. Completeion of these instructions takes 2 clock
cycles. Additionally, the rotation shifts rol and ror are made
multicycle instructions.
All multicycle instructions take exactly two cycles to complete.
Architectural additions:
* Multicycle Stage Register in ID stage.
multicycle_op_stage_reg
* Decoder generates alu_multicycle signal, to stall pipeline
* For all ternary instructions:
1. cycle: connect alu operands a and b to rs1 and rs2
respectively
2. cycle: connect operands a and be to rs3 and rs2
respectively
* Reduce the physical size of the shifter from 64 bit to 63
bit: 32-bit operand + 1 bit for arithmetic / one-shift
* Make rotation shifts multicycle instructions.
Instruction Details:
* cmov:
1. store operand a (rs1) in stage reg.
2. return stage reg output (rs2) or rs3.
if rs2 != 0 the output (rs1) is already known in the
first cycle. -> variable latency implementation is
possible.
* cmix:
1. store rs1 & rs2 in stage reg
2. return stage_reg_q | (rs2 & ~rs3)
reusing bwlogic from zbb
* rol/ror: (here: ror)
shift_amt = rs2 & 31;
shift_amt_compl = (32 - shift_amt) & 31
1. store (rs1 >> shift_amt) in stage reg
2. return (rs1 << shift_amt_compl) | stage_reg_q
* fsl/fsr:
For funnel shifts, the order of applying the shift
amount or its complement is determined by bit [5] of
shift_amt. Pseudocode for fsr:
shift_amt = rs2 & 63
shift_amt_compl = (32 - shift_amt[4:0])
1. if (shift_amt >= 33):
store (rs1 >> shift_amt_compl[4:0]) in stage reg
else if (shift_amt <0 && shift_amt <= 31):
store (rs1 << shift_amt[4:0]) in stage reg
else if (shift_amt == 32 || shift_amt == 0):
store rs1 in stage reg
2. if (shift_amt >= 33):
return stage_reg_q | (rs3 << shift_amt[4:0])
else if (shift_amt <0 && shift_amt <= 31):
return stage_reg_q | (rs3 >> shift_amt_compl[4:0])
else if (shift_amt == 32):
return rs3
else if (shift_amt == 0):
return rs1
Signed-off-by: ganoam <gnoam@live.com>
The third pipeline stage is a new writeback stage. Ibex can now be
configured as the original two stage design or the new three stage
design using the `WritebackStage` parameter in ibex_core. This defaults
to 0 (giving the original two stage design).
The three stage design is *EXPERIMENTAL*
In the three stage design all register write back occurs in the third,
final stage. This allows a cycle for responses to loads and stores so
when the memory system can respond in a single cycle there will be no
stall. This offers significant performance benefits.
Documentation of the three stage design is still to be written so
existing documentation applies to the two stage design only as various
aspects of Ibex behaviour will change in the three stage design.
Signed-off-by: Greg Chadwick <gac@lowrisc.org>
Use of case inside in always_ff block does not meet style guide
recomendations. Refactored to remove this.
Signed-off-by: Greg Chadwick <gac@lowrisc.org>
* Integrate option to implement a multiplier using 3 parallel 17 bit
multipliers in order to compute MUL instructions in 1 cycle
MULH in 2 cycles.
* Add parameter SingleCycleMultiply to select single cycle
multiplication.
The single cycle multiplication capability is intended for FPGA
targets. Using three parallel multiplication units improves performance
of multiplication operations at the cost of DSP primitives. For ASIC
targets, the area consumed by the multiplication structure will grow
approximately 3-4x.
The functionality is selected within the module using the parameter
`SingleCycleMultiply`. From the top level it can be chosen by setting
the parameter `MultiplierImplementation` to 'single_cc'.
Signed-off-by: ganoam <gnoam@live.com>
prim_assert.sv is a file containing assertion macros (defines).
Previously, prim_assert.sv was compiled as normal SystemVerilog file.
This made the defines available for the whole compilation unit as soon
as they were defined. Since all cores using prim_assert depended (in
fusesoc) on the lowrisc:prim:assert core, prim_assert was always
compiled first, and the defines were visible in subsequent files.
All of that is only true if all files end up in one comilation unit. The
SV standard states that what makes up a compilation unit is
tool-defined, but also states that typically, passing multiple files (or
a file list/.f file) to a single tool invocation means that all files
end up in one compilation unit; if the tool is called multiple times,
then the files end up in separate compilation units.
Edalize (the fusesoc backend) doesn't guarantee either behavior, and so
it happens that for Vivado, Verilator, Cadence and Synopsys simulators,
all files are compiled into a single compilation unit. But for Riviera,
each file is a separate compilation unit.
To avoid relying on the definition of compilation units, and to do the
generally right thing (TM), this commit changes the code to always
include the prim_assert.sv file when it is used in a source file.
Include guards are introduced in the prim_assert.sv file to avoid
defining things twice.
This commit replaces all X assignments in the RTL with defined
values. In addition, SystemVerilog Assertions are added to catch
invalid signal values in simulation. A new file containing the
corresponding assertion macros is added as well.
Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
We currently have a documentation block at the beginning of each file,
containing author credits and module-level documentation. The
module-level documentation is retained for historic reasons and
duplicated with the newer comments below it.
For the authors, maintaining author credits in the file is error-prone,
as this information gets outdated very soon. A more reliable way to see
who modified a file is to use the history information in git.
Additionally, we now have the CREDITS.md file, which lists all
contributors, even the ones which don't appear in the git history (e.g.
because the code was copied and commited by someone else).
This file doesn't contain defines any more, but a normal SV package.
The diff is best viewed without whitespace changes, as the reindents
cause a lof of diff noise.
Fixeslowrisc/ibex#173
The EX block actually signals when its output is valid, and not when it
is ready to accept new input. The LSU valid signal is not needed inside
the EX block and can thus be fed directly to the ID stage.
This commit simplifies the assignment of literals to enum types in
default cases by:
- defining or using existing enum values for all-zero values,
- feeding a single `1'bX` into the type cast instead of exact width
(the tools are fine with that).
This commit adds a `default` to all `unique case` statements. Also, in case
FSMs reach an undefined state, the `'X` is propagated to ease detection
in simulation. Both these changes are required by our coding guidelines.
It is not obvious that the MSB in the MAC of the multiplier can safely
be ignored, and it is tedious to derive this from solely studying the
architecture of the multiplier. This comment makes clear that it is
indeed safe and also gives some reasoning.
This change has been informed by advice from the lowRISC legal
committee.
The Solderpad 0.51 license states "the Licensor permits any Work
licensed under this License, at the option of the Licensee, to be
treated as licensed under the Apache License Version 2.0". We use this
freedom to convert license markings to Apache 2.0. This commit ensures
that we retain all authorship and copyright attribution information.