Commit graph

122 commits

Author SHA1 Message Date
Yuichi Sugiyama
6bc3022cc2 [rtl/sw] Add PAC and AUT counters 2020-07-15 17:02:42 +02:00
Yuichi Sugiyama
46bef34dee [rtl] Fix hazard detection issues for Pointer Authentication
With the writeback stage enabled, a PAC/AUT instruction could execute
before the data it requires was written to the register file.
For example, when a load instruction precedes an AUT instruction,
the AUT instruction started before the loaded data was written
to the register file. The cause was that the hazard detection
(stall_ld_hz), which uses rf_ren_a/b_o, was not active for PAC/AUT
instructions. In addition, pa_pac_en and pa_aut_en are no longer
activated when a load hazard occurs.
2020-07-15 17:02:42 +02:00
Yuichi Sugiyama
f13ac7b8b9 [rtl] Add Pointer Authentication 2020-07-15 17:02:42 +02:00
Tom Roberts
c542edbb1a [rtl] Add register-file ECC checking
- Add SECDED ECC checking to the register file when SecureIbex is
  enabled
- No correction is attempted, but an alert is raised for the system to
  intervene

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-07-15 09:50:23 +01:00
Tom Roberts
aae437d75b [rtl] Add alert outputs
- Add a major and minor alert output which can be used by the system to
  react to fault injection attacks

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-07-15 09:50:23 +01:00
Philipp Wagner
4223803d22 Lint: Fix some line length warnings
AscentLint complains about lines longer than 100 characters, as seen in
the nightly lint reports. Fix some (all?) of them.
2020-07-09 13:42:33 +01:00
Pirmin Vogel
414ff7eeb0 [doc] Fix spelling of CoreMark
Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
2020-07-06 12:30:02 +02:00
ganoam
1aa4d5a32b [bitmanip] Optimizations and Parametrization
This commit contains some final optimizations regarding the bit
manipulation extension as well as the parametrization into a balanced
version and a full performance version.

Balanced Version:
        * Supports ZBB, ZBS, ZBF and ZBT extensions
        * Dual cycle instructions:
          ror[i], rol, cmov, cmix, fsl, fsr[i]
        * Everything else completes in a single cycle.

Full Version:
        * Supports all 32b sub extensions.
        * Dual cycle instructions:
          ror[i], rol, cmov, cmix, fsl, fsr[i], crc32[c], bext, bdep
        * Everything else completes in a single cycle.

Notable Changes:
        * bext/bdep are now multi-cycle: Sharing additional register
          with multiplier module
        * grev/gorc instructions are implemented in separate structures
          rather than sharing the shifter or butterfly network.
        * Speed up decision on using rs1 or rs3 for alu_operand_a by
          introducing single-bit register, to identify ternary
          instructions in their first cycle.
        * Introduce enumerated parameter to choose bit manipulation
          implementation

Signed-off-by: ganoam <gnoam@live.com>
2020-06-26 14:43:24 +02:00
Michael Schaffner
ae547c8d30 [top_pkg] Fix style lint warnings
Signed-off-by: Michael Schaffner <msf@google.com>
2020-06-22 20:52:15 +01:00
Bert Pieters
fdfdcc0467 [rtl] disable clock between reset and fetch_enable_i
Fixes lowRISC#957

Signed-off-by: Bert Pieters <bert.pieters@gmail.com>
2020-06-22 13:25:39 +02:00
Greg Chadwick
3c55a72d08 [rtl] Use gated clock for wb_stage and rf
Corrects a typo, ibex_wb_stage and ibex_register_file were being
supplied with the ungated clk.
2020-06-12 10:45:51 +01:00
Tom Roberts
5ecaa11c63 [rtl] Fix writeback stage interrupt issue
- If an interrupt arrives at the same time as a load/store instruction
  is in ID stage, the interrupt must wait until load/store completes.
  Without the WB stage this happens naturally as the core stalls. With
  the WB stage, we need to allow the load/store to progress to the WB
  stage (and clear the ID stage) then hold back the interrupt until it
  completes.
- Also cleaned up some lsu related stalling terms and signal naming.

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-06-10 15:13:32 +01:00
Tom Roberts
ff5375db5c [rtl] Make speculative branch optional
- The speculative branch behaviour causes a performance degradation of
  around 3% in the max config. This change enables that behaviour only
  in the maximum PMP config, which is where it is most needed for timing
  closure.

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-06-02 13:41:29 +01:00
Tom Roberts
12b39476c0 [rtl] Add speculative branch signal
- Drive a speculative version of the branch signal into the IF stage to
  drive address muxing
- The speculative signal is the same as the regular branch signal but
  assumes all conditional branches are taken
- This breaks the timing path from branch condition calculation into
  address muxing (and therefore PMP error calculation)
- When the branch is not taken, any external request we might otherwise
  have made is suppressed
- This has a minor performance cost (0.8% without I$, ~0% with I$)

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-26 09:41:37 +01:00
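
As an illustration of the speculative branch signal described in the commit above, here is a minimal behavioural sketch in Python. The function and signal names (`branch_signals`, `branch_set`, `branch_set_spec`, `suppress_request`) are hypothetical and are not taken from the Ibex RTL. The sketch only shows the idea that the speculative signal ignores the branch condition and that the request is suppressed when the branch turns out not to be taken.

```python
def branch_signals(is_branch: bool, is_jump: bool, condition_taken: bool):
    """Return (branch_set, branch_set_spec, suppress_request).

    All names are illustrative assumptions, not Ibex signal names.
    """
    # Regular signal: depends on the (late-arriving) branch condition.
    branch_set = is_jump or (is_branch and condition_taken)
    # Speculative signal: assumes every conditional branch is taken,
    # so it does not depend on the condition calculation.
    branch_set_spec = is_jump or is_branch
    # If we speculated "taken" but the branch is not taken, the external
    # request that would otherwise have been made is suppressed.
    suppress_request = branch_set_spec and not branch_set
    return branch_set, branch_set_spec, suppress_request


if __name__ == "__main__":
    # Not-taken conditional branch: speculation fires, request suppressed.
    print(branch_signals(is_branch=True, is_jump=False, condition_taken=False))
```

Because `branch_set_spec` has no dependence on `condition_taken`, it can drive address muxing without sitting on the path from the condition calculation.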
Tom Roberts
b4d952e297 [assertions] Tweak xprop assertion qualifiers
- Tighten up enable conditions to stop properties firing when there is
  an instruction fetch error

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-26 09:33:50 +01:00
Tobias Wölfel
46fab41f5b [rtl] Remove redundant assignment 2020-05-25 16:47:25 +01:00
Tobias Wölfel
d1854e8cb5 [rtl] Update RVFI order
Use the stage value which contains the last value.
2020-05-25 16:47:25 +01:00
Tobias Wölfel
12f4f3b9ae [rtl] Forward register data with status
Instead of accessing the signals via the module instance, use the
signals connected to the output port of the module.

Only set the values for RS1/2 if they are used.

Without this addition an instruction without a valid encoding for a
register would reuse invalid data as the address of the register.
Certain checks require that the data must match the register content if
the address is non-zero.

Reuse the signal from the instruction decoder to set the registers to
non-zero values only if the instruction contains a valid encoding for
the register.
2020-05-25 16:47:25 +01:00
Tobias Wölfel
f45b3eca99 [rtl] Set RVFI program counter
The next program counter is not always the program counter of the
fetched instruction. When updating the counter, the actual next
instruction is given by the branch target.
2020-05-25 16:47:25 +01:00
Tobias Wölfel
4e7b981911 [rtl] Add RVFI IXL interface
Following the RISC-V Formal Interface (RVFI) specification the output is
added to set the value of MXL/SXL/UXL of the current privilege level.
2020-05-25 16:47:25 +01:00
Tom Roberts
d5ee96fff6 [rtl] Add dummy instruction insertion
- Adds a new module in the IF stage to inject dummy instructions into
  the pipeline
- Control / frequency of insertion is governed by configuration CSRs
- Extra CSR added to allow reseed of the internal LFSR used for
  randomizing insertion
- Extra logic added to the register file to make dummy instruction
  writebacks look like real instructions (via the zero register)

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-21 13:58:01 +01:00
Tom Roberts
d19189ba43 [rtl] data-independent execution for multdiv_slow
- Remove all early exits from multiply and divide operations when in
  fixed time execution mode.

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-15 10:19:55 +01:00
Pirmin Vogel
3922b2582f [rtl] Rework generation and use of mult/div_sel/en
Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
2020-05-01 17:29:59 +02:00
ganoam
133fef2c2f [bitmanip] Add ZBS Instruction Group
This commit implements the Bit Manipulation Extension ZBS instruction
group: sbset[i], sbclr[i], sbinv[i] and sbext[i]. These instructions
set, clear, invert or extract bit rs1[rs2] or rs1[imm] for reg-reg and
reg-imm instructions respectively.

Architectural details:
        * A multiplexer is added to the shifter structure in order to
          choose between 32'h1, used for the single-bit instructions as
          summarized below, and regular operand_b input.

        * Dedicated bitwise-logic blocks are introduced for multicycle
          shifts and cmix instructions (fsr, fsl, ror, rol),
          single-bit instructions (sbset, sbclr, sbinv, sbext), and
          standard-ALU and zbb instructions (or, and, xor, orn, andn,
          xnor).

Instruction details: All of the zbs instructions rely on sharing the
        existing shifter structure. The instructions are carried out in
        one cycle.

        * sbset, sbclr, sbinv:
                shift_result = 32'h1 << rs2[4:0];
                singlebit_result = rs1 [|, &~, ^] shift_result;

        * sbext:
                shift_result = rs1 >> rs2[4:0];
                singlebit_result = {31'b0, shift_result[0]};

Signed-off-by: ganoam <gnoam@live.com>
2020-04-24 08:32:30 +02:00
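
The single-bit operations described in the commit above can be sketched as a small behavioural model in Python. This is only an illustration of the shift-then-combine scheme for the register-register forms. The function name `zbs` and the string opcodes are assumptions made for the example and do not appear in the RTL.

```python
MASK32 = 0xFFFFFFFF


def zbs(op: str, rs1: int, rs2: int) -> int:
    """Behavioural model of sbset, sbclr, sbinv and sbext (reg-reg forms)."""
    shamt = rs2 & 0x1F                      # rs2[4:0]
    if op == "sbext":
        shift_result = (rs1 & MASK32) >> shamt
        return shift_result & 1             # {31'b0, shift_result[0]}
    # The shifter operand is 32'h1 for the remaining single-bit instructions.
    shift_result = (1 << shamt) & MASK32
    if op == "sbset":
        return (rs1 | shift_result) & MASK32
    if op == "sbclr":
        return rs1 & ~shift_result & MASK32
    if op == "sbinv":
        return (rs1 ^ shift_result) & MASK32
    raise ValueError(f"unknown op: {op}")


if __name__ == "__main__":
    print(hex(zbs("sbset", 0x0, 3)))
    print(hex(zbs("sbclr", 0xFF, 0)))
    print(zbs("sbext", 0x80000000, 31))
```

Only the low five bits of `rs2` select the bit position, so an index of 35 addresses the same bit as an index of 3.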
Pirmin Vogel
dcf18d86c3 Add missing include
Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
2020-04-23 15:44:56 +02:00
ganoam
4cb77b8121 [bitmanip] Add ZBT Instruction Group
This commit implements the Bit Manipulation Extension ZBT instruction
group: cmix, cmov, fsr[i] and fsl. These instructions depend on
three ALU operands. Completion of these instructions takes 2 clock
cycles. Additionally, the rotation shifts rol and ror are made
multicycle instructions.

All multicycle instructions take exactly two cycles to complete.

Architectural additions:

        * Multicycle Stage Register in ID stage.
                multicycle_op_stage_reg

        * Decoder generates alu_multicycle signal, to stall pipeline

        * For all ternary instructions:
                1. cycle: connect alu operands a and b to rs1 and rs2
                          respectively
                2. cycle: connect operands a and b to rs3 and rs2
                          respectively

        * Reduce the physical size of the shifter from 64 bit to 33
                bit: 32-bit operand + 1 bit for arithmetic / one-shift

        * Make rotation shifts multicycle instructions.

Instruction Details:
        * cmov:
                1. store operand a (rs1) in stage reg.
                2. return stage reg output (rs1) or rs3.

                if rs2 != 0 the output (rs1) is already known in the
                  first cycle. -> variable latency implementation is
                  possible.

        * cmix:
                1. store rs1 & rs2 in stage reg
                2. return stage_reg_q | (rs3 & ~rs2)

                reusing bwlogic from zbb

        * rol/ror: (here: ror)
              shift_amt       = rs2 & 31;
              shift_amt_compl = (32 - shift_amt) & 31
              1. store (rs1 >> shift_amt) in stage reg
              2. return (rs1 << shift_amt_compl) | stage_reg_q

        * fsl/fsr:
        For funnel shifts, the order of applying the shift
        amount or its complement is determined by bit [5] of
        shift_amt. Pseudocode for fsl:

              shift_amt       = rs2 & 63
              shift_amt_compl = (32 - shift_amt[4:0])

              1. if (shift_amt >= 33):
                    store (rs1 >> shift_amt_compl[4:0]) in stage reg
                 else if (shift_amt > 0 && shift_amt <= 31):
                    store (rs1 << shift_amt[4:0]) in stage reg
                 else if (shift_amt == 32 || shift_amt == 0):
                    store rs1 in stage reg

              2. if (shift_amt >= 33):
                    return stage_reg_q | (rs3 << shift_amt[4:0])
                 else if (shift_amt > 0 && shift_amt <= 31):
                    return stage_reg_q | (rs3 >> shift_amt_compl[4:0])
                 else if (shift_amt == 32):
                    return rs3
                 else if (shift_amt == 0):
                    return rs1

Signed-off-by: ganoam <gnoam@live.com>
2020-04-16 14:03:35 +02:00
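
The two-cycle schemes from the commit above can be sketched in Python as follows. This is a behavioural illustration only, with the stage register modelled as a local variable. The function names are hypothetical. The cmix formula `(rs1 & rs2) | (rs3 & ~rs2)` and the single-step `fsl_reference` follow the draft bitmanip specification as I understand it, which is an assumption and not something stated in the commit. All inputs are assumed to be 32-bit unsigned values.

```python
MASK32 = 0xFFFFFFFF


def cmov(rs1: int, rs2: int, rs3: int) -> int:
    stage_reg = rs1                               # cycle 1: hold rs1
    return stage_reg if rs2 != 0 else rs3         # cycle 2: select


def cmix(rs1: int, rs2: int, rs3: int) -> int:
    stage_reg = rs1 & rs2                         # cycle 1
    return stage_reg | (rs3 & ~rs2 & MASK32)      # cycle 2


def ror(rs1: int, rs2: int) -> int:
    shift_amt = rs2 & 31
    shift_amt_compl = (32 - shift_amt) & 31
    stage_reg = rs1 >> shift_amt                  # cycle 1
    return ((rs1 << shift_amt_compl) & MASK32) | stage_reg   # cycle 2


def funnel_two_cycle(rs1: int, rs2: int, rs3: int) -> int:
    """Two-cycle funnel shift following the listed case split."""
    shift_amt = rs2 & 63
    low = shift_amt & 31                          # shift_amt[4:0]
    compl = (32 - low) & 31                       # shift_amt_compl[4:0]
    if shift_amt >= 33:
        stage_reg = rs1 >> compl
        return stage_reg | ((rs3 << low) & MASK32)
    if 0 < shift_amt <= 31:
        stage_reg = (rs1 << low) & MASK32
        return stage_reg | (rs3 >> compl)
    if shift_amt == 32:
        return rs3
    return rs1                                    # shift_amt == 0


def fsl_reference(rs1: int, rs2: int, rs3: int) -> int:
    """Single-step funnel shift left (assumed draft-spec semantics)."""
    shamt = rs2 & 63
    a, b = rs1, rs3
    if shamt >= 32:
        shamt -= 32
        a, b = rs3, rs1
    return ((a << shamt) & MASK32) | (b >> (32 - shamt)) if shamt else a


if __name__ == "__main__":
    a, c = 0x12345678, 0x9ABCDEF0
    print(all(funnel_two_cycle(a, s, c) == fsl_reference(a, s, c)
              for s in range(64)))
```

Comparing the two-cycle version against a single-step reference over all 64 shift amounts is a cheap way to check that the case split covers the boundary values 0, 32 and 33.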
Tom Roberts
97a50d7f12 [rtl] Add fixed time execution of branches
- A new parameter and a run-time control bit (DataIndTiming and
  data_ind_timing) enabling different behaviour for running security critical
  code sections.
- In the new mode, all branches act as if taken, with not-taken
  branches executing as a branch to the next instruction.
- This should give similar execution time/power characteristics
  regardless of the branch condition.
- Note that with the BranchTargetALU, branches stall an extra cycle in
  secure mode to avoid factoring the branch-taken decision into the
  branch target address mux.

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-04-13 14:27:40 +01:00
Greg Chadwick
cb8afccb7c [rtl] Modify ASSERT_KNOWN uses to work with xprop
When xprop is enabled various case and if/else constructs will propagate
X leading to failures in ASSERT_KNOWN. This introduces enable terms to
various ASSERT_KNOWN uses that would otherwise fail without them.

prim_assert.sv changes copied across from OpenTitan repository.
2020-04-07 09:08:26 +01:00
ganoam
8a26111f40 [bitmanip] Add ZBB Instruction Group
This commit implements the Bit Manipulation Extension ZBB instruction
group: clz, ctz, pcnt, slo, sro, rol, ror, rev, rev8, orc.b, pack,
packu, packh, min, max, andn, orn, and xnor.

* Bit counting instructions clz, ctz and pcnt can be implemented to
        share much of the architecture:

        clz: Count Leading Zeros. Counts the number of 0 bits at the
                MSB end of the argument.
        ctz: Count Trailing Zeros. Counts the number of 0 bits at the
                LSB end of the argument.
        pcnt: Counts the number of set bits of the argument.

        The implementation uses:

        - 32 one bit adders, counting the set bits of a signal
                bitcnt_bits, starting from the LSB end.

        - For pcnt the argument is fed directly into bitcnt_bits.

        - For clz, the operand is reversed such that leading zeros are
                located at the LSB end of bitcnt_bits.

        - For ctz and clz: the enable signal for 1-bit counter i
                is high if the previous enable signal and
                its corresponding bitcnt_bit were both high.

* Instructions sll[i], srl[i], slo[i], sro[i], rol, ror[i], rev, rev8
        and orc.b are summarized as shifting instructions and related:

        The following instructions are slight variations of the
        existing base spec's sll, srl and sra instructions.

        - slo[i] and sro[i]: shift left/right ones: similar to
                shift-logical operations from base spec, but shifting
                in ones instead of zeros.

        - rol and ror[i]: rotate left/right: circular shift
                operations, shifting in values from the opposite end
                of the operand instead of zeros.

        Those instructions can be implemented, sharing the base spec's
        shifting structure. In order to support rotate operations, a
        64-bit shifting structure is needed.

        In the existing ALU, hardware is described only for right
        shifts. For left shifts the operand is initially reversed,
        right shifted and the result is reversed back. This gives rise
        to an additional resource sharing opportunity for some more
        zbb operations:

        - rev: bitwise reversal.

        - rev8: byte-order swap.

        - orc.b: byte-wise reverse and or-combine.

* Instructions min, max:
        For the B-extension's min/max instructions, we can share the
        existing comparison operations. The result is obtained by
        activating the comparison structure accordingly and
        multiplexing the operands using the comparison result.

* Logic-with-negate instructions andn, orn, xnor:
        For the B-extension's logic-with-negate instructions we can
        share the structures of the base spec's logic structures
        already present for 'xor', 'or' and 'and' instructions as
        well as the conditionally negated b operand generated for
        subtraction operations.

* Instructions pack, packu, packh:
        For the pack, packh and packu instructions I don't see any
        opportunities for resource sharing. However, the architecture
        is quite simple.

        - pack: pack the lower halves of rs1 and rs2 into rd, with rs1
                in the lower half and rs2 in the upper half.

        - packu: pack the upper halves of rs1 and rs2 into rd, with
                rs1 in the lower half and rs2 in the upper half.

        - packh: pack the LSB bytes of rs1 and rs2 into the lower
                half of rd, with rs1 in the lower byte and rs2 in
                the upper byte.

Signed-off-by: ganoam <gnoam@live.com>
2020-03-27 17:13:26 +01:00
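
A few of the sharing ideas from the commit above can be sketched in Python. This is a behavioural illustration, not the RTL. The function names are hypothetical. Inverting the operand so that zeros become set bits in `bitcnt_bits` for clz/ctz is my assumption about how the enable chain ends up counting zeros. The pack variants follow the draft bitmanip definitions as I understand them.

```python
MASK32 = 0xFFFFFFFF


def reverse32(x: int) -> int:
    """Bitwise reversal of a 32-bit value (the 'rev' operation)."""
    return int(f"{x & MASK32:032b}"[::-1], 2)


def bitcount(op: str, rs1: int) -> int:
    """clz, ctz and pcnt sharing one chain of 32 one-bit counters."""
    rs1 &= MASK32
    if op == "pcnt":
        bitcnt_bits = rs1
    elif op == "ctz":
        bitcnt_bits = ~rs1 & MASK32             # assumed: zeros become ones
    elif op == "clz":
        bitcnt_bits = ~reverse32(rs1) & MASK32  # leading zeros moved to LSB end
    else:
        raise ValueError(f"unknown op: {op}")
    count, enable = 0, True
    for i in range(32):                         # start from the LSB end
        bit = (bitcnt_bits >> i) & 1
        if op != "pcnt":
            # Counter i is enabled only while all lower bits were set.
            enable = enable and bit == 1
        if enable:
            count += bit
    return count


def sll_via_reverse(rs1: int, shamt: int) -> int:
    """Left shift using only a right shifter: reverse, shift right, reverse."""
    return reverse32(reverse32(rs1) >> (shamt & 31))


def pack(rs1: int, rs2: int) -> int:
    return ((rs2 & 0xFFFF) << 16) | (rs1 & 0xFFFF)


def packu(rs1: int, rs2: int) -> int:
    return (rs2 & 0xFFFF0000) | ((rs1 & MASK32) >> 16)


def packh(rs1: int, rs2: int) -> int:
    return ((rs2 & 0xFF) << 8) | (rs1 & 0xFF)


if __name__ == "__main__":
    print(bitcount("clz", 1), bitcount("ctz", 8), bitcount("pcnt", 0xFF))
    print(hex(sll_via_reverse(1, 4)))
    print(hex(pack(0x11112222, 0x33334444)))
```

The reverse, shift right, reverse pattern is why `rev` comes almost for free once left shifts are built this way: the reversal network already exists.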
Greg Chadwick
e1aac0735c [rtl] Lint fixes 2020-03-27 10:30:46 +00:00
Tom Roberts
624ef41462 [rtl] Extend BT ALU to be used for all jumps
- Create separate operand muxes for the branch/jump target ALU
- Complete jump instructions in one cycle when BT ALU configured

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-03-25 15:25:22 +00:00
Tom Roberts
c054a63c3d [rtl] Instantiate instruction cache
- Add parameters and actual instantiation of icache
- Add a custom CSR in the M-mode custom RW range to enable the cache
- Wire up the cache invalidation signal to trigger on fence.i

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-03-23 12:57:31 +00:00
Tom Roberts
42aa761c5d [rtl] Fix mtval for unaligned instr errors
mtval should record which half of the instruction caused the error
rather than just recording the PC.
An extra signal is added in the IF stage to indicate when an error is
caused by the second half of an unaligned instruction. This signal is
then used to increment the PC by 2 for mtval capture on an error.

Fixes #709
2020-03-18 12:53:35 +00:00
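
The mtval adjustment described in the commit above amounts to a conditional increment of the PC. A minimal sketch in Python, where the function and parameter names are hypothetical stand-ins for the extra IF-stage signal:

```python
MASK32 = 0xFFFFFFFF


def mtval_for_fetch_error(pc: int, error_in_second_half: bool) -> int:
    """Address to capture in mtval for an instruction fetch error.

    `error_in_second_half` models the extra IF-stage signal that is set
    when the error was caused by the second half of an unaligned
    instruction (the name is an assumption for this example).
    """
    increment = 2 if error_in_second_half else 0
    return (pc + increment) & MASK32


if __name__ == "__main__":
    # Unaligned 32-bit instruction at 0x1002, error in its upper half.
    print(hex(mtval_for_fetch_error(0x1002, True)))
```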
Greg Chadwick
3927fd8d2a [rtl/sw] Add multiply and divide wait counters 2020-03-13 14:48:29 +00:00
Greg Chadwick
89e5fc11ed [RTL] Add configurable third pipeline stage
The third pipeline stage is a new writeback stage. Ibex can now be
configured as the original two stage design or the new three stage
design using the `WritebackStage` parameter in ibex_core. This defaults
to 0 (giving the original two stage design).

The three stage design is *EXPERIMENTAL*

In the three stage design all register write back occurs in the third,
final stage. This allows a cycle for responses to loads and stores so
when the memory system can respond in a single cycle there will be no
stall. This offers significant performance benefits.

Documentation of the three stage design is still to be written so
existing documentation applies to the two stage design only as various
aspects of Ibex behaviour will change in the three stage design.

Signed-off-by: Greg Chadwick <gac@lowrisc.org>
2020-03-06 15:29:14 +00:00
Greg Chadwick
24cbc32249 [rtl] Fix assertion issues
Fixes #548
2020-02-10 17:01:38 +00:00
Pirmin Vogel
2a42c23eaf [rtl] Decouple mip and mie CSRs
This commit modifies the `mip` CSR to not depend on the `mie` CSR. While
the values of both these CSRs are combined to decide whether an
interrupt shall be handled, the RISC-V spec does not state that the
content of `mip` should depend on `mie`. This commit better aligns
Ibex with other open-source RISC-V cores.

This resolves lowRISC/ibex#567 reported by @pfmooney.

Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
2020-02-04 16:15:38 +01:00
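
The decoupling described in the commit above can be illustrated with a short Python sketch. The function names are hypothetical, and the global interrupt enable in `mstatus` is deliberately left out to keep the example focused on the `mip`/`mie` relationship.

```python
MASK32 = 0xFFFFFFFF


def read_mip(irq_lines: int) -> int:
    """mip reflects the pending interrupt lines only, independent of mie."""
    return irq_lines & MASK32


def interrupt_requested(mip: int, mie: int) -> bool:
    """The two CSRs are combined only when deciding whether to handle an interrupt."""
    return (mip & mie) != 0


if __name__ == "__main__":
    pending = read_mip(0x80)                 # one interrupt line pending
    print(hex(pending))                      # visible in mip even if masked
    print(interrupt_requested(pending, 0x00))
    print(interrupt_requested(pending, 0x80))
```

Software can therefore poll `mip` for a pending interrupt while keeping the corresponding `mie` bit cleared.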
Greg Chadwick
b52aacf91b [rtl] Add multdiv_sel signal to decode
multdiv_sel signals the mult/div operand should be selected for the ALU
inputs. Previously the mult_en/div_en signals were used but these factor
in whether the instruction is actually happening which is not relevant
for the mux select. The dedicated select signal gives better timing.
2020-01-31 09:32:20 +00:00
Greg Chadwick
486bf45711 [rtl] Replicate instruction flops to reduce fanout
Adds a second set of instruction flops that are used to determine ALU
operation and operand selection. This reduces fanout from the
instruction flops and so helps timing.
2020-01-31 09:32:20 +00:00
Greg Chadwick
639964514c [RTL] Added separate ALU for branch target
Branches now compute the target in the same cycle as the condition.
This removes a stall cycle from all taken conditional branches.
2020-01-31 09:32:20 +00:00
Greg Chadwick
328aabb548 [RTL] Only restore from mstack in nmi mode
Fixes #492
2019-12-16 19:51:22 +00:00
Tom Roberts
088cd11593 [dbg] Add minimal hardware breakpoint support
- Add the minimum amount of trigger system to support GDB hbreak
- Only a single trigger is implemented
- Only instruction address matching
- Only break into debug mode (no native debug)
- Fixes #382

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2019-12-11 15:02:06 +00:00
Pirmin Vogel
9738b6c703 [rtl] Rework core_busy signals, remove feedback to clk
This commit reworks the generation of the `core_busy` signal used to
control the main clock gate of the core. Without this commit, the
controller generates a separate `first_fetch` signal only asserted in
the FIRST_FETCH state that directly controls `core_busy` and thus the
main clock gate. This is problematic as it introduces a feedback
from the controller state into the clock.

This commit removes the problematic signal and changes the generation of
`ctrl_busy` in the FIRST_FETCH state of the controller. This signal is
now used to control the main clock gate in all states (previously all
except FIRST_FETCH) but it gets registered, thus it does not introduce
the feedback into the clock.

This resolves lowRISC/ibex#211.

Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
2019-11-14 12:55:16 +01:00
Tom Roberts
0243e08111 [rtl] Switch to M mode on debug entry
- Core should operate as if in M-mode while in debug mode
- Previous priv level is restored from dcsr on DRET
- Fixes #463
2019-11-14 09:37:02 +00:00
Pascal Cotret
e5cf0c0fcf Error synthesis in Vivado 2019-10-28 20:36:37 +00:00
Marek Pikuła
294849bb18 [RTL] Add MultiplierImplementation parameter in top level 2019-10-24 14:33:24 +01:00
Greg Chadwick
b94961402c [RTL] Fix ebreak behaviour in U-mode
Fixes #370

Whether EBREAK enters debug mode is controlled by the
ebreaku and ebreakm dcsr fields. Which is relevant depends upon the
privilege level.
2019-10-16 09:10:34 +01:00
Philipp Wagner
3db46f91e0 Tie off csr_pmp_* signals for all lint tools
Our generic way of marking signals as unused is assigning them to an
unused_* signal. That works for all lint tools and avoids tool-specific
waivers.
2019-10-09 13:35:01 +01:00
Tom Roberts
2aacd2b98b [Priv modes] Add support for U-Mode
- General changes to support U-mode (fixes #88)
- Update documentation
- Add priv mode flops to CSRs module
- Propagate correct priv mode to PMP module
- Implement CSR priv-mode permission checking
- Implement illegal U-mode instruction checking
- Add extra mstatus bits for U-mode (MPRV and TW)
2019-10-03 10:41:29 +01:00
Philipp Wagner
e2848f2181 ibex_core: Use correct width for param assignments
These parameters are of type bit, so we need to assign a value of the
correct width to avoid Verilator lint warnings.
2019-10-01 10:38:45 +01:00