removed obsolete TINY_SHIFT_EN generic; added new shifter co-processor

stnolting 2021-06-21 16:51:42 +02:00
parent d7be0a37cc
commit 27b59b5b84
4 changed files with 20 additions and 31 deletions

@ -399,12 +399,10 @@ instructions:
* environment: `ecall`, `ebreak`, `fence`
[NOTE]
In order to keep the hardware footprint low, the CPU's shift unit uses a hybrid parallel/serial approach. Shift
operations are split into coarse shifts (multiples of 4) and a final fine shift (0 to 3). The total execution
time depends on the shift amount. Alternatively, the shift operations can be processed completely in parallel by a fast
(but large) barrel shifter when the `FAST_SHIFT_EN` generic is _true_. In that case, shift operations
complete within 2 cycles regardless of the shift amount. Shift operations can also be executed in a purely serial manner when
the `TINY_SHIFT_EN` generic is _true_. In that case, shift operations take up to 32 cycles depending on the shift amount.
In order to keep the hardware footprint low, the CPU's shift unit uses a bit-serial approach. Hence, shift operations
take up to 32 cycles (plus overhead) depending on the actual shift amount. Alternatively, the shift operations can be processed
completely in parallel by a fast (but large) barrel shifter when the `FAST_SHIFT_EN` generic is _true_. In that case, shift operations
complete within 2 cycles (plus overhead) regardless of the actual shift amount.
[NOTE]
Internally, the `fence` instruction does not perform any operation inside the CPU. It only sets the
@ -643,7 +641,7 @@ configurations are presented in <<_cpu_performance>>.
| ALU | `I/E` | `addi` `slti` `sltiu` `xori` `ori` `andi` `add` `sub` `slt` `sltu` `xor` `or` `and` `lui` `auipc` | 2
| ALU | `C` | `c.addi4spn` `c.nop` `c.addi` `c.li` `c.addi16sp` `c.lui` `c.andi` `c.sub` `c.xor` `c.or` `c.and` `c.add` `c.mv` | 2
| ALU | `I/E` | `slli` `srli` `srai` `sll` `srl` `sra` | 3 + SAfootnote:[Shift amount.]/4 + SA%4; FAST_SHIFTfootnote:[Barrel shift when `FAST_SHIFT_EN` is enabled.]: 4; TINY_SHIFTfootnote:[Serial shift when `TINY_SHIFT_EN` is enabled.]: 2..32
| ALU | `C` | `c.srli` `c.srai` `c.slli` | 3 + SAfootnote:[Shift amount.]/4 + SA%4; FAST_SHIFTfootnote:[Barrel shift when `FAST_SHIFT_EN` is enabled.]: 4; TINY_SHIFTfootnote:[Serial shift when `TINY_SHIFT_EN` is enabled.]: 2..32
| ALU | `C` | `c.srli` `c.srai` `c.slli` | 3 + SAfootnote:[Shift amount (0..31).]; FAST_SHIFTfootnote:[Barrel shifter when `FAST_SHIFT_EN` is enabled.]:
| Branches | `I/E` | `beq` `bne` `blt` `bge` `bltu` `bgeu` | Taken: 5 + MLfootnote:[Memory latency.]; Not taken: 3
| Branches | `C` | `c.beqz` `c.bnez` | Taken: 5 + MLfootnote:[Memory latency.]; Not taken: 3
| Jumps / Calls | `I/E` | `jal` `jalr` | 4 + ML
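
The cycle counts in the shift rows above can be compared with a small model. This is a minimal sketch, not part of the commit: the function name `shift_cycles` and the mode strings are hypothetical, `SA/4` is assumed to be integer division, and the barrel-shifter figure of 4 cycles is taken from the `I/E` row, since the updated `C` row does not state a value.

```python
def shift_cycles(sa: int, mode: str = "serial") -> int:
    """Estimate the cycles of one shift instruction for shift amount `sa`.

    Hypothetical helper; the formulas are copied from the table rows above.
    """
    if not 0 <= sa <= 31:
        raise ValueError("shift amount must be in 0..31")
    if mode == "serial":
        # Bit-serial shifter: 3 + SA
        return 3 + sa
    if mode == "hybrid":
        # Previous coarse/fine shifter: 3 + SA/4 + SA%4 (integer division assumed)
        return 3 + sa // 4 + sa % 4
    if mode == "barrel":
        # FAST_SHIFT_EN = true: constant, value taken from the I/E row
        return 4
    raise ValueError(f"unknown mode: {mode}")


if __name__ == "__main__":
    for sa in (0, 1, 13, 31):
        print(sa, shift_cycles(sa, "serial"), shift_cycles(sa, "hybrid"), shift_cycles(sa, "barrel"))
```

Under these assumptions the serial shifter costs between 3 and 34 cycles, while the previous hybrid scheme peaked at 13 cycles for a shift amount of 31.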

@ -193,21 +193,24 @@ files, like alternative top entities, can be assigned to any library.
...................................
neorv32_top.vhd - NEORV32 Processor top entity
├neorv32_cpu.vhd - NEORV32 CPU top entity
│├neorv32_package.vhd - Processor/CPU main VHDL package file
│├neorv32_cpu_alu.vhd - Arithmetic/logic unit
││├neorv32_cpu_cp_fpu.vhd - Floating-point co-processor (Zfinx extension)
││├neorv32_cpu_cp_muldiv.vhd - Mul/Div co-processor (M extension)
││└neorv32_cpu_cp_shifter.vhd - Bit-shift co-processor
│├neorv32_cpu_bus.vhd - Bus interface unit + physical memory protection
│├neorv32_cpu_control.vhd - CPU control, exception/IRQ system and CSRs
││└neorv32_cpu_decompressor.vhd - Compressed instructions decoder
│└neorv32_cpu_regfile.vhd - Data register file
├neorv32_boot_rom.vhd - Bootloader ROM
│└neorv32_bootloader_image.vhd - Bootloader boot ROM memory image
├neorv32_busswitch.vhd - Processor bus switch for CPU buses (I&D)
├neorv32_bus_keeper.vhd - Processor-internal bus monitor
├neorv32_icache.vhd - Processor-internal instruction cache
├neorv32_cfs.vhd - Custom functions subsystem
├neorv32_cpu.vhd - NEORV32 CPU top entity
│├neorv32_package.vhd - Processor/CPU main VHDL package file
│├neorv32_cpu_alu.vhd - Arithmetic/logic unit
│├neorv32_cpu_bus.vhd - Bus interface unit + physical memory protection
│├neorv32_cpu_control.vhd - CPU control, exception/IRQ system and CSRs
││└neorv32_cpu_decompressor.vhd - Compressed instructions decoder
│├neorv32_cpu_cp_fpu.vhd - Floating-point co-processor (Zfinx extension)
│├neorv32_cpu_cp_muldiv.vhd - Mul/Div co-processor (M extension)
│└neorv32_cpu_regfile.vhd - Data register file
├neorv32_debug_dm.vhd - on-chip debugger: debug module
├neorv32_debug_dtm.vhd - on-chip debugger: debug transfer module
├neorv32_dmem.vhd - Processor-internal data memory

@ -339,21 +339,9 @@ enabled (<<_cpu_extension_riscv_m>> is _true_).
[frame="all",grid="none"]
|======
| **FAST_SHIFT_EN** | _boolean_ | false
3+| When this generic is enabled the shifter unit of the CPU's ALU is implemented as a fast barrel shifter (requiring
more hardware resources).
|======
:sectnums!:
===== _TINY_SHIFT_EN_
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **TINY_SHIFT_EN** | _boolean_ | false
3+| If this generic is enabled the shifter unit of the CPU's ALU is implemented as a (slow but tiny) single-bit iterative shifter
(requires up to 32 clock cycles for a shift operation, but reduces the hardware footprint). The configuration of
this generic is ignored if <<_fast_shift_en>> is _true_.
3+| When this generic is set _true_ the shifter unit of the CPU's ALU is implemented as a fast barrel shifter (requiring
more hardware resources). If it is set _false_ the CPU uses a serial shifter that only performs a single-bit shift per cycle
(small but slow).
|======
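
The behaviour of the serial shifter described for `FAST_SHIFT_EN` = _false_ can be illustrated with a short behavioural model. This is only a sketch under stated assumptions: it is not derived from `neorv32_cpu_cp_shifter.vhd`, the function name `serial_shift` is hypothetical, and it models one single-bit shift per loop iteration on a 32-bit value without any control overhead.

```python
MASK32 = 0xFFFFFFFF


def serial_shift(value: int, sa: int, op: str) -> tuple[int, int]:
    """Shift a 32-bit value by `sa` bits, one bit per step.

    Returns (result, steps). Hypothetical model of a single-bit iterative
    shifter; `op` is one of "sll", "srl", "sra".
    """
    if op not in ("sll", "srl", "sra"):
        raise ValueError(f"unknown shift operation: {op}")
    value &= MASK32
    sign = value & 0x80000000  # sign bit, re-inserted on every arithmetic step
    steps = 0
    for _ in range(sa & 31):  # only the lower 5 bits of the shift amount count
        if op == "sll":
            value = (value << 1) & MASK32
        elif op == "srl":
            value >>= 1
        else:  # "sra"
            value = (value >> 1) | sign
        steps += 1
    return value, steps


if __name__ == "__main__":
    print(serial_shift(1, 4, "sll"))
    print(serial_shift(0x80000000, 31, "sra"))
```

The step count equals the shift amount, which is why the execution time of the serial variant depends on the operand, whereas a barrel shifter produces the same result in a fixed number of cycles.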

Binary file not shown (image; size before: 37 KiB, after: 66 KiB).