Commit graph

243 commits

Author SHA1 Message Date
Pirmin Vogel
2ef5e5e3f2 Add a single RV32M enum parameter to select multiplier implementation
This commit replaces the previous combination of `RV32M` bit parameter
used to en/disable the M extension and the `MultiplierImplementation`
used to select the multiplier implementation by a single enum parameter.

Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
2020-08-20 11:50:08 +02:00
Pirmin Vogel
4127a5464b B extension: Correct doc and parameter usage
Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
2020-08-20 11:50:08 +02:00
Tom Roberts
35abca14ab [syn] Use latch-based register file in yosys
- Add a technology map for latches (only works with nandgate45 library
  at the moment)
- Add a real latch-based clock gating cell
- Update timing path reporting to differentiate between register and
  latch paths
- Update summary results in README to reflect the latch-based numbers,
  plus add numbers for a micro-riscy-style (RV32EC) config

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-08-10 13:36:32 +01:00
ganoam
53aff0aa18 [doc] Update FPGA Synthesis Paragraph in Intro
Clarify FPGA register file situation in introduction.

Signed-off-by: ganoam <gnoam@live.com>
2020-08-03 09:38:30 +02:00
Rupert Swarbrick
46ff63ad88 Properly vendor in mem_model from OpenTitan
This removes the manually copied version at dv/uvm/core_ibex/common
and vendors things properly now that the vendor tool supports such
things (this picks up the same OpenTitan version as the previous
commit: lowRISC/opentitan@067272a2).
2020-07-24 08:05:40 +01:00
Tom Roberts
03a8ae70d6 [rtl] Add security hardened PC
- Checks that PC increments as expected
- Raises an alert if not

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-07-16 15:00:05 +01:00
Tom Roberts
c542edbb1a [rtl] Add register-file ECC checking
- Add SECDED ECC checking to the register file when SecureIbex is
  enabled
- No correction is attempted, but an alert is raised for the system to
  intervene

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-07-15 09:50:23 +01:00
Tom Roberts
aae437d75b [rtl] Add alert outputs
- Add a major and minor alert output which can be used by the system to
  react to fault injection attacks

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-07-15 09:50:23 +01:00
Pirmin Vogel
ede658b92a [doc] Clarify that the supported version of the B extension is a draft
Support for this extension is not experimental (it's fully verified using
RISCV-DV) but the extension might change before being ratified.

Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
2020-07-05 13:52:56 +02:00
Philipp Wagner
6ca325fa70 Remove outdated documentation
By now, ibex depends not only on a clock gating cell, but also on
assert macros and an LFSR; there will be more to come. This removed piece
of documentation was from the early days, when only a clock gating cell
was to be provided (and we didn't ship one).

Today, users need to either use FuseSoC, or run fusesoc on the simple
system and "harvest" the resulting files if they want to copy-paste Ibex
into their build system. That's not ideal, but not something we can very
easily fix -- so let's remove the outdated documentation first to at
least reduce the confusion.
2020-07-03 14:25:24 +01:00
Philipp Wagner
40a52ab8b4 [doc] Add bitmanip spec to introduction page
We list all specifications we implement, even optional ones. Add
Bitmanip there as well.

Fixes #966
2020-07-02 15:03:50 +01:00
ganoam
1aa4d5a32b [bitmanip] Optimizations and Parametrization
This commit contains some final optimizations regarding the bit
manipulation extension as well as the parametrization into a balanced
version and a full performance version.

Balanced Version:
        * Supports ZBB, ZBS, ZBF and ZBT extensions
        * Dual cycle instructions:
          ror[i], rol, cmov, cmix, fsl, fsr[i]
        * Everything else completes in a single cycle.

Full Version:
        * Supports all 32b sub extensions.
        * Dual cycle instructions:
          ror[i], rol, cmov, cmix, fsl, fsr[i], crc32[c], bext, bdep
        * Everything else completes in a single cycle.

Notable Changes:
        * bext/bdep are now multi-cycle: Sharing additional register
          with multiplier module
        * grev/gorc instructions are implemented in separate structures
          rather than sharing the shifter or butterfly network.
        * Speed up decision on using rs1 or rs3 for alu_operand_a by
          introducing single-bit register, to identify ternary
          instructions in their first cycle.
        * Introduce enumerated parameter to choose bit manipulation
          implementation

Signed-off-by: ganoam <gnoam@live.com>
2020-06-26 14:43:24 +02:00
Philipp Wagner
b302b6da92 Fix documentation markup for tracer
The unordered list wasn't rendered properly due to a missing empty line
before it. Purely editorial change.
2020-06-18 15:40:54 +01:00
Tom Roberts
b81c311481 [doc] Clarify fetch_enable_i meaning
Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-06-10 14:44:24 +01:00
Rupert Swarbrick
33ad42debb Spelling fix: seperate -> separate
2020-06-05 11:37:37 +01:00
Rupert Swarbrick
e79e6b58ca Make sure we don't see multi-way hits in icache testbench
One aspect of (i)cache design that I didn't know about before writing
test code for this block is the problem of multi-way hits. The icache,
as implemented, stores data to parallel ways and it's possible for a
fetch to match more than one way. The data from matching ways all gets
ORed together, which doesn't matter so long as it never
changes (because V | V == V for all V).

Of course, things go poorly if you have two different values, V and W,
at an address which are both stored in the cache. Then the result is V
| W, which isn't necessarily equal to either instruction.
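
As an illustration only (this is not the testbench or RTL code), the OR-combining of matching ways can be sketched in Python. The function name `icache_read` and the two instruction words are assumptions picked for the example.

```python
def icache_read(way_data, way_hit):
    """OR together the data of every way that reports a hit."""
    result = 0
    for data, hit in zip(way_data, way_hit):
        if hit:
            result |= data
    return result


V = 0x00000013  # example instruction word (hypothetical cache contents)
W = 0x00008067  # a different instruction word at the same address

same = icache_read([V, V], [True, True])   # both ways hold V: harmless
mixed = icache_read([V, W], [True, True])  # ways disagree: corrupted word
```

With identical contents the multi-way hit is invisible; with differing contents the fetched word matches neither stored instruction.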

Avoiding this needs priority encoders, which are rather large, so it
seems the usual approach is to disallow branching to modified code
before flushing the cache. This patch teaches the testbench to do this
properly.

Sadly, this means there's now a connection between the core agent and
the memory agent: the memory agent can no longer generate new seeds
whenever it pleases.
2020-06-04 08:09:51 +01:00
Tom Roberts
12b39476c0 [rtl] Add speculative branch signal
- Drive a speculative version of the branch signal into the IF stage to
  drive address muxing
- The speculative signal is the same as the regular branch signal but
  assumes all conditional branches are taken
- This breaks the timing path from branch condition calculation into
  address muxing (and therefore PMP error calculation)
- When the branch is not taken, any external request we might otherwise
  have made is suppressed
- This has a minor performance cost (0.8% without I$, ~0% with I$)

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-26 09:41:37 +01:00
ganoam
66687e927c [bitmanip] Add ZBR instruction group
This commit implements the Bit Manipulation Extension ZBR instruction
group: crc32[c].[bhw].

CRC-32 (CRC-32/ISO-HDLC) and CRC-32C (CRC-32/ISCSI) are directly
implemented. The CRC operation solves the following equation using
binary polynomial arithmetic:

rev(rd)(x) = rev(rs1)(x) * x**n mod {1, P}(x),

where {1,P}(x) denotes the CRC polynomial. Using Barrett reduction, one
can write this as

rd = (rs1 >> n) ^ rev(rev( (rs1 << (32-n)) cx rev(mu)) cx P)
                      ^-- cycle 0--------------------^
     ^-- cycle 1 ------------------------------------------^

Where cx denotes carry-less multiplication and mu = polydiv(x**64,
{1,P}), omitting the MSB (bit 32).

The implementation increases area consumption by ~0.6kGE for synthesis
with relaxed timing constraints. With tight timing constraints that is
~1.6kGE. There is no significant impact on frequency.
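
For reference, a bit-serial Python model of what the crc32.[bhw] instructions compute can be sketched as follows. It is a behavioural sketch and does not reflect the two-cycle Barrett implementation; the function names are assumptions, and 0xEDB88320 is the bit-reversed CRC-32/ISO-HDLC polynomial.

```python
import zlib

CRC32_POLY_REFLECTED = 0xEDB88320  # CRC-32/ISO-HDLC polynomial, bit-reversed


def crc32_step(x, nbits, poly=CRC32_POLY_REFLECTED):
    """Bit-serial model of crc32.b/h/w, with nbits = 8, 16 or 32."""
    for _ in range(nbits):
        # Shift out the LSB and conditionally subtract (xor) the polynomial.
        x = (x >> 1) ^ (poly if x & 1 else 0)
    return x & 0xFFFFFFFF


def crc32_bytes(data):
    """Standard CRC-32 of a byte string, built from crc32.b-style steps."""
    crc = 0xFFFFFFFF
    for byte in data:
        crc = crc32_step(crc ^ byte, 8)
    return crc ^ 0xFFFFFFFF


check = crc32_bytes(b"123456789")
```

The helper `crc32_bytes` shows how software would typically chain the byte variant; comparing against `zlib.crc32` is a convenient sanity check.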

Signed-off-by: ganoam <gnoam@live.com>
2020-05-22 17:21:03 +02:00
Tom Roberts
d5ee96fff6 [rtl] Add dummy instruction insertion
- Adds a new module in the IF stage to inject dummy instructions into
  the pipeline
- Control / frequency of insertion is governed by configuration CSRs
- Extra CSR added to allow reseed of the internal LFSR used for
  randomizing insertion
- Extra logic added to the register file to make dummy instruction
  writebacks look like real instructions (via the zero register)

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-05-21 13:58:01 +01:00
ganoam
f173e2baba [bitmanip] Add ZBC instruction group
This commit implements the Bit Manipulation Extension ZBC instruction
group: clmul[rh] (carry-less multiply [reverse][high])

Carry-less multiplication can be understood as multiplication based on
the addition interpreted as the bit-wise xor operation.

Example: 1101 X 1011 = 1111111:

      1011 X 1101
      -----------
             1101
        xor 1101
        ---------
            10111
       xor 0000
       ----------
           010111
      xor 1101
      -----------
          1111111

Architectural details:
        A 32 x 32-bit array
        [ operand_b[i] ? (operand_a << i) : '0 for i in 0 ... 31 ]
        is generated. The entries of the array are pairwise 'xor-ed'
        together in a 5-stage binary tree.
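
The array-plus-tree structure described above can be sketched in Python. This is a behavioural sketch of the low-half clmul only, not the RTL; `clmul32` is a name chosen for the example.

```python
MASK32 = 0xFFFFFFFF


def clmul32(operand_a, operand_b):
    """Carry-less multiply (low 32 bits) via partial products and an xor tree."""
    # 32 partial products: operand_b[i] ? (operand_a << i) : 0
    terms = [((operand_a << i) & MASK32) if (operand_b >> i) & 1 else 0
             for i in range(32)]
    # Pairwise xor reduction: 32 -> 16 -> 8 -> 4 -> 2 -> 1 entries.
    stages = 0
    while len(terms) > 1:
        terms = [terms[i] ^ terms[i + 1] for i in range(0, len(terms), 2)]
        stages += 1
    return terms[0], stages


product, tree_depth = clmul32(0b1101, 0b1011)
```

The call reproduces the worked example from this message, and the stage counter confirms the tree depth for 32 entries.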

The area increase when synthesized with relaxed timing constraints is
1.6-1.7kGE.

Timing figures improve by 0.1 ns for the 3-stage configuration and
worsen by 0.04 ns for the 2-stage implementation. This suggests
fluctuations due to the heuristic nature of the synthesis tools.

Signed-off-by: ganoam <gnoam@live.com>
2020-05-19 10:38:38 +02:00
ganoam
fac404a6f3 [bitmanip] Add ZBF instruction group
This commit implements the Bit Manipulation Extension ZBF instruction
group, which consists only of the one instruction bfp (bit-field
place).
This instruction places a field of length len < 16 from rs2 in rs1 at
offset off.

Architectural details:
        The implementation works exactly the same as proposed by Claire
        Wolf in her reference implementation.
        1. bfp_mask = slo(0, len)
        2. bfp_result =
                (rs1 & ~(bfp_mask << off)) | (rs2 & bfp_mask) << off
                        ^------ shifter-^
        The existing shifter structure is shared for the indicated
        operation.
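
The two steps can be sketched in Python as follows. As a simplifying assumption, `off` and `length` are passed as explicit arguments here rather than being decoded from rs2; `slo` is the shift-left-ones operation and its mask starts from zero.

```python
MASK32 = 0xFFFFFFFF


def slo(x, n):
    """Shift left by n, shifting in ones instead of zeros."""
    return ~(~x << n) & MASK32


def bfp(rs1, rs2, off, length):
    """Place the low `length` bits of rs2 into rs1 at bit offset `off`."""
    bfp_mask = slo(0, length)                  # step 1: `length` ones
    cleared = rs1 & ~(bfp_mask << off)         # clear the target field
    field = (rs2 & bfp_mask) << off            # step 2: shifter places field
    return (cleared | field) & MASK32


result = bfp(0xFFFFFFFF, 0x0, off=8, length=4)
```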

Impact on area:

        * When synthesizing without the B-extension, the 2 stage
        design seems to move the timing bottleneck, leading to
        optimizations which result in an area increase by 1 kGE,
        when synthesized with tight timing constraints. For the
        3 stage configuration there is no change.
        When synthesized with relaxed timing constraints there is no
        significant change in either configuration.

        * With the B-extension enabled, the area increase for tight
        timing constraints is 1.1-1.2 kGE. For relaxed timing
        constraints that is ~0.4kGE

Impact on timing: No significant impact.

Signed-off-by: ganoam <gnoam@live.com>
2020-05-14 21:34:49 +02:00
ganoam
0afd000a09 [bitmanip] Add ZBE Instruction Group
This commit implements the Bit Manipulation Extension ZBE instruction
group: bext (bit extract) and bdep (bit deposit).

Architectural details:
        * bext/bdep: A new butterfly and inverse butterfly network is
        implemented. The generation of its control bits depends on a
        parallel prefix bitcount of the deposit / extract mask.

        * bitcounter: The path for bext / bdep instructions traverses
        the bit counter and the butterfly network, resulting in both a
        larger delay and area. To mitigate this, the bitcounter has been
        changed from a serial bit counter to a radix-2 tree structure.

        * grev/gorc: The Zbp instructions general reverse and general
        or-combine have so far shared the shifter's reversal
        structure. It has proven beneficial to area and timing to reuse
        the new butterfly network instead.
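
For orientation, the architectural behaviour of bext and bdep can be sketched with a simple bit-serial Python model. This only models the instruction results; it does not reproduce the butterfly network or the prefix bitcount described above.

```python
def bext(rs1, rs2):
    """Gather the bits of rs1 selected by mask rs2 into the low bits."""
    result, j = 0, 0
    for i in range(32):
        if (rs2 >> i) & 1:
            if (rs1 >> i) & 1:
                result |= 1 << j
            j += 1
    return result


def bdep(rs1, rs2):
    """Scatter the low bits of rs1 to the positions selected by mask rs2."""
    result, j = 0, 0
    for i in range(32):
        if (rs2 >> i) & 1:
            if (rs1 >> j) & 1:
                result |= 1 << i
            j += 1
    return result


extracted = bext(0b10110100, 0b11110000)
deposited = bdep(0b1011, 0b11110000)
```

In both loops the running index `j` is the number of mask bits seen so far, which is the quantity the parallel prefix bitcount provides in hardware.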

The butterfly network itself consumes ~3.5kGE and ~1.1kGE for synthesis
with tight and relaxed timing constraints respectively. Including the
optimizations of the bitcounter and grev/gorc, the overall change in
area consumption is +4.6kGE (+1.2kGE) and +3.3kGE (+1.1kGE) for
synthesis with tight (relaxed) timing constraints for 2- and 3-stage
configurations respectively. For tight timing constraints that is
growth of around 10%, for relaxed around 5%.

The impact on the maximum frequency is negligible.

Signed-off-by: ganoam <gnoam@live.com>
2020-05-14 16:43:19 +02:00
Rupert Swarbrick
22b0609b4f Weaken some checks on cache in ibex_icache_core_protocol_checker
Once the cache has passed an error to the core, we now allow it to
wiggle its valid, addr, rdata, err and err_plus2 lines however it sees
fit until the core issues a new branch.

Since the core isn't allowed to assert ready until then, the values
will not be read and this won't matter.

This was exposed by

  make -C dv/uvm/icache/dv run SEED=1314810947 WAVES=1
2020-05-12 12:08:50 +01:00
Pirmin Vogel
fd01562ff7 [doc] Minor fixes
Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
2020-05-01 20:09:59 +02:00
ganoam
a68923a404 [bitmanip] Add ZBP Instruction Group
This commit implements the Bit Manipulation Extension ZBP instruction
group: grev[i] (generalized reverse), gorc[i] (generalized or-combine)
and [un]shfl[i] (generalized shuffle) and all of their
pseudo-instructions.

Architectural details:
        * grev / gorc: The shifter structure features only a right
        shift structure. In order to perform a left shift therefore the
        operand needs to be reversed, shifted and reversed again. The
        architecture of the back-reversal is implemented in stages
        which are activated using the general reverse / orcombine
        operand, or a signal marking left-shifts.

        * shfl / unshfl: Also known as zip / unzip or interlace /
        uninterlace operation. These instructions are implemented
        in their own structure using a permutation network of 6 stages.
        4 stages thereof implement the shuffle permutations. The first
        and last stages are flip stages, which effectively reverse
        the order of the inner stages for unshuffle operations.
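
A behavioural Python sketch of the staged generalized reverse, and of a left shift built from a right-only shifter, might look like this. The function names are assumptions for the example; the stage masks follow the usual grev definition.

```python
MASK32 = 0xFFFFFFFF

# (shift distance, mask) for each of the five grev stages
GREV_STAGES = [(1, 0x55555555), (2, 0x33333333), (4, 0x0F0F0F0F),
               (8, 0x00FF00FF), (16, 0x0000FFFF)]


def grev32(x, k):
    """Generalized reverse: stage i swaps adjacent 2**i-bit blocks if k[i]."""
    for shift, mask in GREV_STAGES:
        if k & shift:
            x = ((x & mask) << shift) | ((x >> shift) & mask)
    return x & MASK32


def sll_via_right_shifter(x, n):
    """Left shift using only a right shifter: reverse, shift, reverse back."""
    return grev32(grev32(x, 31) >> n, 31)


bit_reversed = grev32(0x00000001, 31)
byte_swapped = grev32(0x12345678, 24)
```

With all five stages active (k = 31) grev is a full bit reversal, which is why the back-reversal hardware can be shared between left shifts and grev.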

Signed-off-by: ganoam <gnoam@live.com>
2020-04-29 11:10:44 +02:00
Pirmin Vogel
8bd0423962 [dv] Enable verification of the Bitmanip Extension with OVPsim and Spike
This is related to lowRISC/ibex#703.

Signed-off-by: Pirmin Vogel <vogelpi@lowrisc.org>
2020-04-27 09:51:57 +02:00
ganoam
133fef2c2f [bitmanip] Add ZBS Instruction Group
This commit implements the Bit Manipulation Extension ZBS instruction
group: sbset[i], sbclr[i], sbinv[i] and sbext[i]. These instructions
set, clear, invert or extract bit rs1[rs2] or rs1[imm] for reg-reg and
reg-imm instructions respectively.

Architectural details:
        * A multiplexer is added to the shifter structure in order to
          choose between 32'h1, used for the single-bit instructions as
          summarized below, and regular operand_b input.

        * Dedicated bitwise-logic blocks are introduced for multicycle
          shifts and cmix instructions (fsr, fsl, ror, rol),
          single-bit instructions (sbset, sbclr, sbinv, sbext), and
          standard-ALU and zbb instructions (or, and, xor, orn, andn,
          xnor).

Instruction details: All of the zbs instructions rely on sharing the
        existing shifter structure. The instructions are carried out in
        one cycle.

        * sbset, sbclr, sbinv:
                shift_result = 32'h1 << rs2[4:0];
                singlebit_result = rs1 [|, ^ , &~] shift_result;

        * sbext:
                shift_result = rs1 >> rs2[4:0];
                singlebit_result = {31'b0, shift_result[0]};
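
The single-bit operations above can be sketched in Python as a behavioural model (not the RTL). Each function first forms the shifter result and then applies the bitwise step.

```python
MASK32 = 0xFFFFFFFF


def sbset(rs1, rs2):
    shift_result = 1 << (rs2 & 31)
    return (rs1 | shift_result) & MASK32


def sbclr(rs1, rs2):
    shift_result = 1 << (rs2 & 31)
    return rs1 & ~shift_result & MASK32


def sbinv(rs1, rs2):
    shift_result = 1 << (rs2 & 31)
    return (rs1 ^ shift_result) & MASK32


def sbext(rs1, rs2):
    shift_result = rs1 >> (rs2 & 31)
    return shift_result & 1


set_bit = sbset(0x0, 5)
```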

Signed-off-by: ganoam <gnoam@live.com>
2020-04-24 08:32:30 +02:00
Philipp Wagner
3c11c95981 [doc] Link to RISC-V Debug Specification
Unfortunately, no versioned links to the PDF are available (at least I
couldn't find one), so only link the entry page for the spec.
2020-04-23 13:48:32 +01:00
Philipp Wagner
82c113764c [doc] Clean up Sphinx warnings
2020-04-23 13:48:32 +01:00
Philipp Wagner
38912e115d [doc] Add note about debug system integration
Ibex only provides the necessary interfaces and core-internal
functionality for run-control debug. To get a fully working,
"debuggable" toplevel design, more components are needed. Describe where
to get them from, and include OpenTitan as an exemplary integration.
2020-04-23 13:48:32 +01:00
Michael Gielda
d6d23917ae Fix rst syntax
Signed-off-by: Michael Gielda <mgielda@antmicro.com>
2020-04-18 17:28:06 +01:00
ganoam
4cb77b8121 [bitmanip] Add ZBT Instruction Group
This commit implements the Bit Manipulation Extension ZBT instruction
group: cmix, cmov, fsr[i] and fsl. These instructions depend on
three ALU operands. Completion of these instructions takes 2 clock
cycles. Additionally, the rotation shifts rol and ror are made
multicycle instructions.

All multicycle instructions take exactly two cycles to complete.

Architectural additions:

        * Multicycle Stage Register in ID stage.
                multicycle_op_stage_reg

        * Decoder generates alu_multicycle signal, to stall pipeline

        * For all ternary instructions:
                1. cycle: connect alu operands a and b to rs1 and rs2
                          respectively
                2. cycle: connect operands a and b to rs3 and rs2
                          respectively

        * Reduce the physical size of the shifter from 64 bit to 63
                bit: 32-bit operand + 1 bit for arithmetic / one-shift

        * Make rotation shifts multicycle instructions.

Instruction Details:
        * cmov:
                1. store operand a (rs1) in stage reg.
                2. return stage reg output (rs2)  or rs3.

                if rs2 != 0 the output (rs1) is already known in the
                  first cycle. -> variable latency implementation is
                  possible.

        * cmix:
                1. store rs1 & rs2 in stage reg
                2. return stage_reg_q | (rs3 & ~rs2)

                reusing bwlogic from zbb

        * rol/ror: (here: ror)
              shift_amt       = rs2 & 31;
              shift_amt_compl = (32 - shift_amt) & 31
              1. store (rs1 >> shift_amt) in stage reg
              2. return (rs1 << shift_amt_compl) | stage_reg_q

        * fsl/fsr:
        For funnel shifts, the order of applying the shift
        amount or its complement is determined by bit [5] of
        shift_amt. Pseudocode for fsr:

              shift_amt       = rs2 & 63
              shift_amt_compl = (32 - shift_amt[4:0])

              1. if (shift_amt >= 33):
                    store (rs1 >> shift_amt_compl[4:0]) in stage reg
                 else if (shift_amt > 0 && shift_amt <= 31):
                    store (rs1 << shift_amt[4:0]) in stage reg
                 else if (shift_amt == 32 || shift_amt == 0):
                    store rs1 in stage reg

              2. if (shift_amt >= 33):
                    return stage_reg_q | (rs3 << shift_amt[4:0])
                 else if (shift_amt > 0 && shift_amt <= 31):
                    return stage_reg_q | (rs3 >> shift_amt_compl[4:0])
                 else if (shift_amt == 32):
                    return rs3
                 else if (shift_amt == 0):
                    return rs1
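
As a sanity check of the two-cycle scheme, ror, cmix and cmov can be modelled in Python with the stage register as a plain variable. This is a sketch, not the RTL: the function names are made up for the example, and cmix is assumed to follow the draft bitmanip definition (rs1 & rs2) | (rs3 & ~rs2).

```python
MASK32 = 0xFFFFFFFF


def ror_two_cycle(rs1, rs2):
    shift_amt = rs2 & 31
    shift_amt_compl = (32 - shift_amt) & 31
    stage_reg = rs1 >> shift_amt                            # cycle 1
    return ((rs1 << shift_amt_compl) | stage_reg) & MASK32  # cycle 2


def cmix_two_cycle(rs1, rs2, rs3):
    stage_reg = rs1 & rs2                                   # cycle 1
    return (stage_reg | (rs3 & ~rs2)) & MASK32              # cycle 2


def cmov(rs1, rs2, rs3):
    # Select rs1 when the condition operand rs2 is non-zero, else rs3.
    return rs1 if rs2 != 0 else rs3


rotated = ror_two_cycle(0x80000001, 1)
mixed = cmix_two_cycle(0xFFFF0000, 0xFF00FF00, 0x0000FFFF)
```

Note that a rotate by zero still works: the complement shift amount wraps to 0, so both cycles contribute rs1 and the OR leaves it unchanged.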

Signed-off-by: ganoam <gnoam@live.com>
2020-04-16 14:03:35 +02:00
Rupert Swarbrick
56883f19ed Clarifications in icache detailed documentation
The new information is:

  - Branch addresses must be 16-bit aligned.

  - Explicitly allow top 16 bits of rdata to change when lower 16 bits
    contain a compressed instruction.

  - Explicitly allow the core to drop ready without valid.

I've also rejigged the layout slightly, improving (I think!) the
description of compressed and uncompressed instructions.
2020-04-13 14:29:34 +01:00
Tom Roberts
97a50d7f12 [rtl] Add fixed time execution of branches
- A new parameter and a run-time control bit (DataIndTiming and
  data_ind_timing) enable different behaviour for running security-critical
  code sections.
- In the new mode, all branches act as if taken, with not-taken
  branches executing as a branch to the next instruction.
- This should give similar execution time/power characteristics
  regardless of the branch condition.
- Note that with the BranchTargetALU, branches stall an extra cycle in
  secure mode to avoid factoring the branch-taken decision into the
  branch target address mux.

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-04-13 14:27:40 +01:00
Philipp Wagner
f8aacd15be [doc] Add Ibex Concierge documentation
Add a document describing how we plan to run the Ibex Concierge duty.

Signed-off-by: Philipp Wagner <phw@lowrisc.org>
2020-04-01 18:16:23 +01:00
udinator
f69c6fbabd [dv] initial icache testbench (#711)
* [dv] add vendor .hjson files for dv tools

Signed-off-by: Udi Jonnalagadda <udij@google.com>

* Update common_ifs to lowRISC/opentitan@0d7f7ac7

Update code from subdir hw/dv/sv/common_ifs in upstream repository
https://github.com/lowRISC/opentitan to revision
0d7f7ac755d4e00811257027dd814edb2afca050

Signed-off-by: Udi Jonnalagadda <udij@google.com>

* Update csr_utils to lowRISC/opentitan@0d7f7ac7

Update code from subdir hw/dv/sv/csr_utils in upstream repository
https://github.com/lowRISC/opentitan to revision
0d7f7ac755d4e00811257027dd814edb2afca050

Signed-off-by: Udi Jonnalagadda <udij@google.com>

* Update dv_lib to lowRISC/opentitan@0d7f7ac7

Update code from subdir hw/dv/sv/dv_lib in upstream repository
https://github.com/lowRISC/opentitan to revision
0d7f7ac755d4e00811257027dd814edb2afca050

Signed-off-by: Udi Jonnalagadda <udij@google.com>

* Update dvsim to lowRISC/opentitan@0d7f7ac7

Update code from subdir util/dvsim in upstream repository
https://github.com/lowRISC/opentitan to revision
0d7f7ac755d4e00811257027dd814edb2afca050

Signed-off-by: Udi Jonnalagadda <udij@google.com>

* Update uvmdvgen to lowRISC/opentitan@0d7f7ac7

Update code from subdir util/uvmdvgen in upstream repository
https://github.com/lowRISC/opentitan to revision
0d7f7ac755d4e00811257027dd814edb2afca050

Signed-off-by: Udi Jonnalagadda <udij@google.com>

* Update dv_utils to lowRISC/opentitan@0d7f7ac7

Update code from subdir hw/dv/sv/dv_utils in upstream repository
https://github.com/lowRISC/opentitan to revision
0d7f7ac755d4e00811257027dd814edb2afca050

Signed-off-by: Udi Jonnalagadda <udij@google.com>

* [dv] initial icache testbench

Signed-off-by: Udi Jonnalagadda <udij@google.com>

* [dv] add top_pkg and its core file to icache/dv

Signed-off-by: Udi Jonnalagadda <udij@google.com>

* [dv] update ibex_core and ibex_icache corefile dependencies

Signed-off-by: Udi Jonnalagadda <udij@google.com>

* [dv] add .vpd support for wave-dumping

Signed-off-by: Udi Jonnalagadda <udij@google.com>
2020-03-27 11:02:47 -07:00
Tom Roberts
9f9ff3ee8e [doc/icache] Document the err_plus2_o signal
Add a few sentences to describe the behaviour/meaning of the err_plus2_o
signal, and how it is used by the core.

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-03-27 17:27:54 +00:00
ganoam
8a26111f40 [bitmanip] Add ZBB Instruction Group
This commit implements the Bit Manipulation Extension ZBB instruction
group: clz, ctz, pcnt, slo, sro, rol, ror, rev, rev8, orc.b, pack,
packu, packh, min, max, andn, orn, and xnor.

* Bit counting instructions clz, ctz and pcnt can be implemented to
        share much of the architecture:

        clz: Count Leading Zeros. Counts the number of 0 bits at the
                MSB end of the argument.
        ctz: Count Trailing Zeros. Counts the number of 0 bits at the
                LSB end of the argument.
        pcnt: Counts the number of set bits of the argument.

        The implementation uses:

        - 32 one-bit adders, counting the set bits of a signal
                bitcnt_bits, starting from the LSB end.

        - For pcnt the argument is fed directly into bitcnt_bits.

        - For clz, the operand is reversed such that leading zeros are
                located at the LSB end of bitcnt_bits.

        - For ctz and clz: the counter enable signal for 1-bit counter i
                is high if the previous enable signal and
                its corresponding bitcnt_bit were high.
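
The shared counting scheme can be sketched in Python. This is a behavioural model under one stated assumption: for clz and ctz the operand is inverted before being fed to bitcnt_bits, so that the zeros to be counted appear as set bits and the enable chain stops the count at the first original one.

```python
MASK32 = 0xFFFFFFFF


def reverse32(x):
    """Bit-reverse a 32-bit value."""
    return int(format(x & MASK32, "032b")[::-1], 2)


def bitcount(operand, op):
    """Model of the shared clz / ctz / pcnt counter."""
    if op == "pcnt":
        bitcnt_bits = operand & MASK32
    elif op == "ctz":
        bitcnt_bits = ~operand & MASK32             # assumption: inverted
    elif op == "clz":
        bitcnt_bits = ~reverse32(operand) & MASK32  # reversed, then inverted
    else:
        raise ValueError(op)
    count = 0
    enable = True  # enable for counter i depends on all lower bits
    for i in range(32):
        bit = (bitcnt_bits >> i) & 1
        if op != "pcnt":
            enable = enable and bit == 1
        if enable:
            count += bit
    return count


leading = bitcount(0x00000001, "clz")
```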

* Instructions sll[i], srl[i], slo[i], sro[i], rol, ror[i], rev, rev8
        and orc.b are summarized as shifting instructions and related:

        The following instructions are slight variations of the
        existing base spec's sll, srl and sra instructions.

        - slo[i] and sro[i]: shift left/right ones: similar to
                shift-logical operations from base spec, but shifting
                in ones instead of zeros.

        - rol and ror[i]: rotate left/right: circular shift
                operations, shifting in values from the opposite end
                of the operand instead of zeros.

        Those instructions can be implemented, sharing the base spec's
        shifting structure. In order to support rotate operations, a
        64-bit shifting structure is needed.

        In the existing ALU, hardware is described only for right
        shifts. For left shifts the operand is initially reversed,
        right shifted and the result is reversed back. This gives rise
        to an additional resource sharing opportunity for some more
        zbb operations:

        - rev: bitwise reversal.

        - rev8: byte-order swap.

        - orc.b: byte-wise reverse and or-combine.
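
To illustrate why a 64-bit shifting structure covers the rotates, here is a small Python sketch of slo, sro, ror and rol. It is a behavioural model with names chosen for the example; the concatenation {x, x} stands in for the 64-bit shifter input.

```python
MASK32 = 0xFFFFFFFF


def slo(x, n):
    """Shift left, shifting in ones."""
    return ~(~x << (n & 31)) & MASK32


def sro(x, n):
    """Shift right, shifting in ones."""
    return ~((~x & MASK32) >> (n & 31)) & MASK32


def ror(x, n):
    """Rotate right: right-shift the 64-bit value {x, x}, keep the low word."""
    wide = ((x & MASK32) << 32) | (x & MASK32)
    return (wide >> (n & 31)) & MASK32


def rol(x, n):
    """Rotate left, expressed as a rotate right by the complement."""
    return ror(x, (32 - (n & 31)) & 31)


ones_in = slo(0x0, 4)
```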

* Instructions min, max:
        For the B-extension's min/max instructions, we can share the
        existing comparison operations. The result is obtained by
        activating the comparison structure accordingly and
        multiplexing the operands using the comparison result.

* Logic-with-negate instructions andn, orn, xnor:
        For the B-extension's logic-with-negate instructions we can
        share the structures of the base spec's logic structures
        already present for 'xor', 'or' and 'and' instructions as
        well as the conditionally negated b operand generated for
        subtraction operations.

* Instructions pack, packu, packh:
        For the pack, packh and packu instructions I don't see any
        opportunities for resource sharing. However, the architecture
        is quite simple.

        - pack: pack the lower halves of rs1 and rs2 into rd, with rs1
                in the lower half and rs2 in the upper half.

        - packu: pack the upper halves of rs1 and rs2 into rd, with
                rs1 in the lower half and rs2 in the upper half.

        - packh: pack the LSB bytes of rs1 and rs2 into the lower half of rd,
                with rs1 in the lower byte and rs2 in the upper byte.
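
The three pack variants can be sketched in Python as follows. This is a behavioural model for RV32; for packh the two bytes are assumed to land in the lower 16 bits of rd with the upper bits zero, as in the draft bitmanip definition.

```python
def pack(rs1, rs2):
    """Lower halves: rs1[15:0] in rd[15:0], rs2[15:0] in rd[31:16]."""
    return ((rs2 & 0xFFFF) << 16) | (rs1 & 0xFFFF)


def packu(rs1, rs2):
    """Upper halves: rs1[31:16] in rd[15:0], rs2[31:16] in rd[31:16]."""
    return (rs2 & 0xFFFF0000) | ((rs1 >> 16) & 0xFFFF)


def packh(rs1, rs2):
    """Low bytes: rs1[7:0] in rd[7:0], rs2[7:0] in rd[15:8]."""
    return ((rs2 & 0xFF) << 8) | (rs1 & 0xFF)


packed = pack(0x1111AAAA, 0x2222BBBB)
```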

Signed-off-by: ganoam <gnoam@live.com>
2020-03-27 17:13:26 +01:00
Rupert Swarbrick
158e9b9714 Clarify a couple of points in icache documentation
When a PMP error comes in, the cache doesn't quite behave as if the
request was granted (if it did, it would wait forever for a response).
Hopefully this version is a bit clearer.

Also, this makes explicit that the upper bits of a 16-bit instruction
fetch can be bogus.
2020-03-27 14:34:02 +00:00
Philipp Wagner
38c3d19a0f Correct PMP granularity equation
The `+2` part should have been part of the exponent, as indicated by the
RISC-V spec.
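
For illustration, the difference between the two readings of the formula can be checked in Python. Here `g` stands for the granularity exponent G used in the RISC-V privileged specification; the function name is made up for the example.

```python
def pmp_granule_bytes(g):
    """PMP granule size in bytes: 2**(g + 2), with +2 inside the exponent."""
    return 2 ** (g + 2)


granules = [pmp_granule_bytes(g) for g in range(4)]
```

With the `+2` outside the exponent, G = 1 would give 4 bytes instead of 8.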
2020-03-26 10:02:06 +00:00
Tom Roberts
c054a63c3d [rtl] Instantiate instruction cache
- Add parameters and actual instantiation of icache
- Add a custom CSR in the M-mode custom RW range to enable the cache
- Wire up the cache invalidation signal to trigger on fence.i

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-03-23 12:57:31 +00:00
Tom Roberts
8bb649e4ab [rtl/icache] Fix PMP error logic
Instruction requests triggering PMP errors have their external request
suppressed. The beat counting logic therefore needs to know that these
requests will never receive any rvalid data responses.

This fix stops the external request counter from incrementing, and marks
all external requests complete as soon as any error is received.

The data in the cache line beyond the error is not required since the
core cannot access it without consuming the error first.
2020-03-18 12:53:09 +00:00
Tom Roberts
ef17d4fcc2 [rtl] Add Icache ECC
- Add modules for ecc generation and checking
- Add supporting logic to icache module

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-03-18 11:28:06 +00:00
Tom Roberts
fe00eb46e9 [rtl] Icache RAM primitive changes
- Bring in a version of ram primitive with configurable width similar to
  the OT RAM primitive.
- Change the RAM banking structure to be a single bank of LineSize (64
  bits) to match the upcoming ECC granularity.

Signed-off-by: Tom Roberts <tomroberts@lowrisc.org>
2020-03-18 11:28:06 +00:00
Greg Chadwick
3927fd8d2a [rtl/sw] Add multiply and divide wait counters
2020-03-13 14:48:29 +00:00
Rupert Swarbrick
0a0a18c2cb Notes on the ICache specification
This also adds a couple of comments splitting up the ports in
ibex_icache.sv that I found helpful when working out what everything
did.
2020-03-13 13:13:19 +00:00
Rupert Swarbrick
134f515e4f Add missing space after code-block directive
Without this, Sphinx spits out a warning message and swallows the
block.
2020-03-12 16:51:21 +00:00
Philipp Wagner
a28170d6a7 [doc] Fix paths in verification documentation
The files moved; also add an explicit `cd` to the command listing to
help people only skimming the docs.
2020-03-12 11:05:05 +00:00
Tom Roberts
82ebf6fd20 [I-Cache] Initial commit of prototype RTL
- Working prototype of RTL
- Initial documentation
- Still some TODOs to be dealt with
2020-03-06 16:34:48 +00:00
Greg Chadwick
89e5fc11ed [RTL] Add configurable third pipeline stage
The third pipeline stage is a new writeback stage. Ibex can now be
configured as the original two stage design or the new three stage
design using the `WritebackStage` parameter in ibex_core. This defaults
to 0 (giving the original two stage design).

The three stage design is *EXPERIMENTAL*

In the three stage design all register write back occurs in the third,
final stage. This allows a cycle for responses to loads and stores so
when the memory system can respond in a single cycle there will be no
stall. This offers significant performance benefits.

Documentation of the three stage design is still to be written so
existing documentation applies to the two stage design only as various
aspects of Ibex behaviour will change in the three stage design.

Signed-off-by: Greg Chadwick <gac@lowrisc.org>
2020-03-06 15:29:14 +00:00