Merge pull request #264 from stnolting/zxcfu_isa_extension

[Zxcfu ISA ext.] add option to implement custom RISC-V instructions
This commit is contained in:
stnolting 2022-01-31 08:10:19 +01:00 committed by GitHub
commit 3ac6303788
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
35 changed files with 993 additions and 64 deletions

View file

@ -26,6 +26,7 @@ defined by the `hw_version_c` constant in the main VHDL package file [`rtl/core/
| Date (*dd.mm.yyyy*) | Version | Comment |
|:----------:|:-------:|:--------|
| 30.01.2022 | 1.6.7.1 | :sparkles: added **`Zxcfu` ISA extension for user-defined custom RISC-V instructions**; see [PR #264](https://github.com/stnolting/neorv32/pull/264) |
| 28.01.2022 |[**:rocket:1.6.7**](https://github.com/stnolting/neorv32/releases/tag/v1.6.7) | **New release** |
| 28.01.2022 | 1.6.6.10 | :bug: fixed bug in **bit-manipulation co-processor**: decoding collision between `cpop` and `rol` instructions; :bug: fixed bug in co-processor arbitration when an illegal instruction is detected; added four additional (yet unused) **CPU** co-processor slots; [PR #262](https://github.com/stnolting/neorv32/pull/262) |
| 27.01.2022 | 1.6.6.9 | reworked **CFS** "user" logic; added CFS demo program; see [PR #261](https://github.com/stnolting/neorv32/pull/261) |

View file

@ -127,7 +127,9 @@ the "Minimal RISC-V Debug Specification Version 0.13.2" and compatible with **Op
* _true random_ number generator ([TRNG](https://stnolting.github.io/neorv32/#_true_random_number_generator_trng))
* execute in place module ([XIP](https://stnolting.github.io/neorv32/#_execute_in_place_module_xip)) to directly execute code from SPI flash
* custom functions subsystem ([CFS](https://stnolting.github.io/neorv32/#_custom_functions_subsystem_cfs))
for tightly-coupled custom co-processor extensions and interfaces
for tightly-coupled custom accelerators and interfaces
* custom functions unit ([CFU](https://stnolting.github.io/neorv32/#_custom_functions_unit_cfu)) for up to 1024
_custom RISC-V instructions_
[[back to top](#The-NEORV32-RISC-V-Processor)]
@ -187,6 +189,7 @@ documentation section).
[[`Zihpm`](https://stnolting.github.io/neorv32/#_zihpm_hardware_performance_monitors)]
[[`Zifencei`](https://stnolting.github.io/neorv32/#_zifencei_instruction_stream_synchronization)]
[[`Zmmul`](https://stnolting.github.io/neorv32/#_zmmul_integer_multiplication)]
[[`Zxcfu`](https://stnolting.github.io/neorv32/#_zxcfu_custom_instructions_extension_cfu)]
[[`PMP`](https://stnolting.github.io/neorv32/#_pmp_physical_memory_protection)]
[[`DEBUG`](https://stnolting.github.io/neorv32/#_cpu_debug_mode)]**

View file

@ -1,7 +1,7 @@
:sectnums:
== NEORV32 Central Processing Unit (CPU)
image::riscv_logo.png[width=350,align=center]
image::neorv32_cpu_block.png[width=600,align=center]
**Key Features**
@ -20,6 +20,7 @@ image::riscv_logo.png[width=350,align=center]
** `Zihpm` - hardware performance monitors
** `Zifencei` - instruction stream synchronization
** `Zmmul` - integer multiplication hardware
** `Zxcfu` - custom instructions extension
** `PMP` - physical memory protection
** `Debug` - debug mode
* Compatible to the RISC-V user specifications and a subset of the RISC-V privileged architecture specifications - passes the official RISC-V Architecture Tests (v2+)
@ -684,6 +685,30 @@ high for one cycle to inform the memory system (like the i-cache to perform a fl
Any additional flags within the `fence.i` instruction word are ignore by the hardware.
==== **`Zxcfu`** Custom Instructions Extension (CFU)
The `Zxcfu` presents a NEORV32-specific _custom RISC-V_ ISA extension (`Z` = sub-extension, `x` = platform-specific
custom extension, `cfu` = name of the custom extension). When enabled via the `CPU_EXTENSION_RISCV_Zxcfu` configuration
generic, this ISA extensions adds the <<_custom_functions_unit_cfu>> to the CPU core. The CFU is a module that is
allows to add **custom RISC-V instructions** to the processor core.
The CPU is implemented as ALU co-processor and is integrated right into the CPU's pipeline providing minimal data
transfer latency as it has direct access to the core's register file. Up to 1024 custom instructions can be
implemented within the CFU. These instructions are mapped to an OPCODE space that has been explicitly reserved by
the RISC-V spec for custom extensions.
Software can utilize the custom instructions by using _intrinsic functions_, which are inline assembly functions that
behave like "regular" C functions.
[TIP]
For more information regarding the CFU see section <<_custom_functions_unit_cfu>>.
[TIP]
The CFU / `Zxcfu` ISA extension is intended for application-specific _instructions_.
If you like to add more complex accelerators or interfaces that can also operate independently of
the CPU take a look at the memory-mapped <<_custom_functions_subsystem_cfs>>.
==== **`PMP`** Physical Memory Protection
The NEORV32 physical memory protection (PMP) is compatible to the RISC-V PMP specifications. It can be used
@ -796,6 +821,7 @@ configurations are presented in <<_cpu_performance>>.
| Bit-manipulation - single-bit | `B(Zbs)` | `sbset[i]` `sbclr[i]` `sbinv[i]` `sbext[i]` | 3
| Bit-manipulation - shifted-add | `B(Zba)` | `sh1add` `sh2add` `sh3add` | 3
| Bit-manipulation - carry-less multiply | `B(Zbc)` | `clmul` `clmulh` `clmulr` | 3 + 32
| CFU: custom instructions | `Zxcfu` | - | min. 4
|=======================
[NOTE]
@ -1146,3 +1172,9 @@ be enabled ba enabling a constant in the main VHDL package file (`rtl/core/neorv
-- "critical" number of PMP regions --
constant dedicated_reset_c : boolean := false; -- use dedicated hardware reset value for UNCRITICAL registers (FALSE=reset value is irrelevant (might simplify HW), default; TRUE=defined LOW reset value)
----
<<<
// ####################################################################################################################
include::cpu_cfu.adoc[]

154
docs/datasheet/cpu_cfu.adoc Normal file
View file

@ -0,0 +1,154 @@
<<<
:sectnums:
=== Custom Functions Unit (CFU)
The Custom Functions Unit is the central part of the <<_zxcfu_custom_instructions_extension_cfu>> and represents
the actual hardware module, which is used to implement _custom RISC-V instructions_. The concept of the NEORV32
CFU has been highly inspired by https://github.com/google/CFU-Playground[google's CFU-Playground].
The CFU is intended for operations that are inefficient in terms of performance, latency, energy consumption or
program memory requirements when implemented in pure software. Some potential application fields and exemplary
use-cases might include:
* **AI:** sub-word / vector / SIMD operations like adding all four bytes of a 32-bit data word
* **Cryptographic:** bit substitution and permutation
* **Communication:** conversions like binary to gray-code
* **Image processing:** look-up-tables for color space transformations
* implementing instructions from other RISC-V ISA extensions that are not yet supported by the NEORV32
[NOTE]
The CFU is not intended for complex and autonomous functional units that implement complete accelerators
like block-based AES de-/encoding). Such accelerator can be implemented within the <<_custom_functions_subsystem_cfs>>.
A comparison of all chip-internal hardware extension options is provided in the user guide section
https://stnolting.github.io/neorv32/ug/#_adding_custom_hardware_modules[Adding Custom Hardware Modules].
:sectnums:
==== Custom CFU Instructions - General
The custom instruction utilize a specific instruction space that has been explicitly reserved for user-defined
extensions by the RISC-V specifications ("_Guaranteed Non-Standard Encoding Space_"). The NEORV32 CFU uses the
_CUSTOM0_ opcode to identify custom instructions. The binary encoding of this opcode is `0001011`.
The custom instructions processed by the CFU use the 32-bit **R2-type** RISC-V instruction format, which consists
of six bit-fields:
* `funct7`: 7-bit immediate
* `rs2`: address of second source register
* `rs1`: address of first source register
* `funct3`: 3-bit immediate
* `rd`: address of destination register
* `opcode`: always `0001011` to identify custom instructions
.CFU instruction format (RISC-V R2-type)
image::cfu_r2type_instruction.png[align=center]
[NOTE]
Obviously, all bit-fields including the immediates have to be static at compile time.
.Custom Instructions - Exceptions
[NOTE]
The CPU control logic can only check the _CUSTOM0_ opcode of the custom instructions to check if the
instruction word is valid. It cannot check the `funct3` and `funct7` bit-fields since they are
implementation-defined. Hence, a custom CFU instruction can never raise an illegal instruction exception.
However, custom will raise an illegal instruction exception if the CFU is not enabled/implemented
(i.e. `Zxcfu` ISA extension is not enabled).
The CFU operates on the two source operands and return the processing result to the destination register.
The actual instruction to be performed can be defined by using the `funct7` and `funct3` bit fields.
These immediate bit-fields can also be used to pass additional data to the CFU like offsets, look-up-tables
addresses or shift-amounts. However, the actual functionality is completely user-defined.
:sectnums:
==== Using Custom Instructions in Software
The custom instructions provided by the CFU are included into plain C code by using **intrinsics**. Intrinsics
behave like "normal" functions but under the hood they are a set of macros that hide the complexity of inline assembly.
Using such intrinsics removes the need to modify the compiler, built-in libraries and the assembler when including custom
instructions.
The NEORV32 software framework provides 8 pre-defined custom instructions macros, which are defined in
`sw/lib/include/neorv32_cpu_cfu.h`. Each intrinsic provides an implicit definition of the instruction word's
`funct3` bit-field:
.CFU instruction prototypes
[source,c]
----
neorv32_cfu_cmd0(funct7, rs1, rs2) // funct3 = 000
neorv32_cfu_cmd1(funct7, rs1, rs2) // funct3 = 001
neorv32_cfu_cmd2(funct7, rs1, rs2) // funct3 = 010
neorv32_cfu_cmd3(funct7, rs1, rs2) // funct3 = 011
neorv32_cfu_cmd4(funct7, rs1, rs2) // funct3 = 100
neorv32_cfu_cmd5(funct7, rs1, rs2) // funct3 = 101
neorv32_cfu_cmd6(funct7, rs1, rs2) // funct3 = 110
neorv32_cfu_cmd7(funct7, rs1, rs2) // funct3 = 111
----
Each intrinsic functions always returns a 32-bit value (the processing result). Furthermore,
each intrinsic function requires three arguments:
* `funct7` - 7-bit immediate
* `rs2` - source operand 2, 32-bit
* `rs1` - source operand 1, 32-bit
The `funct7` bit-field is used to pass a 7-bit literal to the CFU. The `rs1` and `rs2` arguments to pass the
actual data to the CFU. These arguments can be populated with variables or literals. The following example
show how to pass arguments when executing `neorv32_cfu_cmd6`: `funct7` is set to all-zero, `rs1` is given
the literal _2751_ and `rs2` is given a variable that contains the return value from `some_function()`.
.CFU instruction usage example
[source,c]
----
uint32_t opb = some_function();
uint32_t res = neorv32_cfu_cmd6(0b0000000, 2751, opb);
----
.CFU Example Program
[TIP]
There is a simple example program for the CFU, which shows how to use the _default_ CFU hardware module.
The example program is located in `sw/example/demo_cfu`.
:sectnums:
==== Custom Instructions Hardware
The actual functionality of the CFU's custom instruction is defined by the logic in the CFU itself.
It is the responsibility of the designer to implement this logic within the CFU hardware module
`rtl/core/neorv32_cpu_cp_cfu.vhd`.
The CFU hardware module receives the data from instruction word's immediate bit-fields and also
the operation data, which is fetched from the CPU's register file.
.CFU instruction data passing example
[source,c]
----
uint32_t opb = 0x12345678;
uint32_t res = neorv32_cfu_cmd6(0b0100111, 0x00cafe00, opb);
----
In this example the CFU hardware module receives the two source operands as 32-bit signal
and the immediate values as 7-bit and 3-bit signals:
* `rs1_i` (32-bit) contains the data from the `rs1` register (here = `0x00cafe00`)
* `rs2_i` (32-bit) contains the data from the `rs2` register (here = 0x12345678)
* `control.funct3` (3-bit) contains the immediate value from the `funct3` bit-field (here = `0b110`; "cmd6")
* `control.funct7` (7-bit) contains the immediate value from the `funct7` bit-field (here = `0b0100111`)
The CFU executes the according instruction (for example this is selected by the `control.funct3` signal)
and provides the operation result in the 32-bit `control.result` signal. The processing can be entirely
combinatorial, so the result is available at the end of the current clock cycle. Processing can also
take several clock cycles and may also include internal states and memories. As soon as the CFU has
completed operations it sets the `control.done` signal high.
.CFU Hardware Example & More Details
[TIP]
The default CFU module already implement some exemplary instructions that are used for illustration
by the CFU example program. See the CFU's VHDL source file (`rtl/core/neorv32_cpu_cp_cfu.vhd`), which
is highly commented to explain the available signals and the handshake with the CPU pipeline.
.CFU Execution Time
[NOTE]
The CFU is not required to finish processing within a bound time.
However, the designer should keep in mind that the CPU is **stalled** until the CFU has finished processing.
This also means the CPU cannot react to pending interrupts. Nevertheless, interrupt requests will still be queued.

View file

@ -54,6 +54,7 @@ include::rationale.adoc[]
* **NEORV32 CPU**: 32-bit `rv32i` RISC-V CPU
** RISC-V compatibility: passes the official architecture tests
** base architecture + privileged architecture (optional) + ISA extensions (optional)
** option to add custom RISC-V instructions (as custom ISA extension)
** rich set of customization options (ISA extensions, design goal: performance / area (/ energy), ...)
** aims to support <<_full_virtualization>> capabilities (CPU _and_ SoC) to increase execution safety
** official https://github.com/riscv/riscv-isa-manual/blob/master/marchid.md[RISC-V open source architecture ID]
@ -78,6 +79,21 @@ include::rationale.adoc[]
For more in-depth details regarding the feature provided by he hardware see the according sections:
<<_neorv32_central_processing_unit_cpu>> and <<_neorv32_processor_soc>>.
**Extensibility and Customization**
The NEORV32 processor was designed to ease customization and extensibility and provides several options for adding
application-specific custom hardware modules and accelerators. The three most common options for adding custom
on-chip modules are listed below.
* <<_processor_external_memory_interface_wishbone_axi4_lite>> for processor-external modules
* <<_custom_functions_subsystem_cfs>> for tightly-coupled processor-internal co-processors
* <<_custom_functions_unit_cfu>> for custom RISC-V instructions
[TIP]
A more detailed comparison of the extension/customization options can be found in section
https://stnolting.github.io/neorv32/ug/#_adding_custom_hardware_modules[Adding Custom Hardware Modules]
of the user guide.
<<<
// ####################################################################################################################
@ -143,6 +159,7 @@ neorv32_top.vhd - NEORV32 Processor top entity
├neorv32_cpu.vhd - NEORV32 CPU top entity
│├neorv32_cpu_alu.vhd - Arithmetic/logic unit
││├neorv32_cpu_cp_bitmanip.vhd - Bit-manipulation co-processor (B ext.)
││├neorv32_cpu_cp_cfu.vhd - Custom functions (instruction) co-processor (Zxcfu ext.)
││├neorv32_cpu_cp_fpu.vhd - Floating-point co-processor (Zfinx ext.)
││├neorv32_cpu_cp_muldiv.vhd - Mul/Div co-processor (M extension)
││└neorv32_cpu_cp_shifter.vhd - Bit-shift co-processor

View file

@ -31,6 +31,8 @@ co-processors and even user-defined instructions.
**Why RISC-V?**
image::riscv_logo.png[width=250,align=left]
[quote, RISC-V International, https://riscv.org/about/]
____
RISC-V is a free and open ISA enabling a new era of processor innovation through open standard collaboration.
@ -60,7 +62,7 @@ https://github.com/olofk/serv[SERV] in terms of size. It was build having a diff
The project aims to provide _another option_ in the RISC-V / soft-core design space with a different performance
vs. size trade-off and a different focus: _embrace_ concepts like documentation, platform-independence / portability,
RISC-V compatibility, _customization_ and _ease of use_ (see the <<_project_key_features>> below).
RISC-V compatibility, _ extensibility & customization_ and _ease of use_ (see the <<_project_key_features>> below).
Furthermore, the NEORV32 pays special focus on _execution safety_ using <<_full_virtualization>>. The CPU aims to
provide fall-backs for _everything that could go wrong_. This includes malformed instruction words, privilege escalations

View file

@ -399,6 +399,18 @@ cannot be used together with the `M` extension. See section <<_zmmul_integer_mul
|======
:sectnums!:
===== _CPU_EXTENSION_RISCV_Zxcfu_
[cols="4,4,2"]
[frame="all",grid="none"]
|======
| **CPU_EXTENSION_RISCV_Zxcfu** | _boolean_ | false
3+| NEORV32-specific "custom RISC-V" ISA extensions: Implement the <<_custom_functions_unit_cfu>> for user-defined
custom instruction when _true_. See section <<_zxcfu_custom_instructions_extension_cfu>> for more information.
|======
// ####################################################################################################################
:sectnums:
==== Extension Options

View file

@ -35,7 +35,11 @@ dedicated hardware accelerators for en-/decryption (AES), signal processing (FFT
(CNNs) as well as custom IO systems like fast memory interfaces (DDR) and mass storage (SDIO), networking (CAN)
or real-time data transport (I2S).
[INFO]
[TIP]
If you like to implement _custom instructions_ that are executed right within the CPU's ALU
see the <<_zxcfu_custom_instructions_extension_cfu>> and the according <<_custom_functions_unit_cfu>>.
[TIP]
Take a look at the template CFS VHDL source file (`rtl/core/neorv32_cfs.vhd`). The file is highly
commented to illustrate all aspects that are relevant for implementing custom CFS-based co-processor designs.

View file

@ -52,6 +52,7 @@ will signal a "DEVICE ERROR" in this case.
| `0` | _SYSINFO_CPU_ZICSR_ | `Zicsr` extension (`I` sub-extension) available when set (via top's <<_cpu_extension_riscv_zicsr>> generic)
| `1` | _SYSINFO_CPU_ZIFENCEI_ | `Zifencei` extension (`I` sub-extension) available when set (via top's <<_cpu_extension_riscv_zifencei>> generic)
| `2` | _SYSINFO_CPU_ZMMUL_ | `Zmmul` extension (`M` sub-extension) available when set (via top's <<_cpu_extension_riscv_zmmul>> generic)
| `3` | _SYSINFO_CPU_ZXCFU_ | `Zxcfu` extension (custom functions unit for custom instructions) available when set (via top's <<_cpu_extension_riscv_zxcfu>> generic)
| `5` | _SYSINFO_CPU_ZFINX_ | `Zfinx` extension (`F` sub-/alternative-extension) available when set (via top's <<_cpu_extension_riscv_zfinx>> generic)
| `6` | _SYSINFO_CPU_ZXSCNT_ | Custom extension - _Small_ CPU counters: `[m]cycle` & `[m]instret` CSRs have less than 64-bit when set (via top's <<_cpu_cnt_width>> generic)
| `7` | _SYSINFO_CPU_ZXNOCNT_ | Custom extension - _NO_ CPU counters: `[m]cycle` & `[m]instret` CSRs are NOT available at all when set (via top's <<_cpu_cnt_width>> generic)

View file

@ -63,24 +63,25 @@ files are currently part of the NEORV32 core library:
[options="header",grid="rows"]
|=======================
| C source file | C header file | Description
| - | `neorv32.h` | main NEORV32 definitions and library file
| `neorv32_cfs.c` | `neorv32_cfs.h` | HW driver (stub)footnote:[This driver file only represents a stub, since the real CFS drivers are defined by the actual CFS implementation.] functions for the custom functions subsystem
| `neorv32_cpu.c` | `neorv32_cpu.h` | HW driver functions for the NEORV32 **CPU**
| `neorv32_gpio.c` | `neorv32_gpio.h` | HW driver functions for the **GPIO**
| `neorv32_gptmr.c` | `neorv32_gptmr.h` | HW driver functions for the **GPTRM**
| - | `neorv32_intrinsics.h` | macros for (custom) intrinsics/instructions
| `neorv32_mtime.c` | `neorv32_mtime.h` | HW driver functions for the **MTIME**
| `neorv32_neoled.c` | `neorv32_neoled.h` | HW driver functions for the **NEOLED**
| `neorv32_pwm.c` | `neorv32_pwm.h` | HW driver functions for the **PWM**
| `neorv32_rte.c` | `neorv32_rte.h` | NEORV32 **runtime environment** and helpers
| `neorv32_slink.c` | `neorv32_slink.h` | HW driver functions for the **SLINK**
| `neorv32_spi.c` | `neorv32_spi.h` | HW driver functions for the **SPI**
| `neorv32_trng.c` | `neorv32_trng.h` | HW driver functions for the **TRNG**
| `neorv32_twi.c` | `neorv32_twi.h` | HW driver functions for the **TWI**
| `neorv32_uart.c` | `neorv32_uart.h` | HW driver functions for the **UART0** and **UART1**
| `neorv32_wdt.c` | `neorv32_wdt.h` | HW driver functions for the **WDT**
| `neorv32_xip.c` | `neorv32_xip.h` | HW driver functions for the **XIP**
| `neorv32_xirq.c` | `neorv32_xirq.h` | HW driver functions for the **XIRQ**
| - | `neorv32.h` | main NEORV32 definitions and library file
| `neorv32_cfs.c` | `neorv32_cfs.h` | HW driver (stub)footnote:[This driver file only represents a stub, since the real CFS drivers are defined by the actual CFS implementation.] functions for the custom functions subsystem
| `neorv32_cpu.c` | `neorv32_cpu.h` | HW driver functions for the NEORV32 **CPU**
| `neorv32_cpu_cfu.c` | `neorv32_cpu_cfu.h` | HW driver functions for the NEORV32 **CFU** (custom instructions)
| `neorv32_gpio.c` | `neorv32_gpio.h` | HW driver functions for the **GPIO**
| `neorv32_gptmr.c` | `neorv32_gptmr.h` | HW driver functions for the **GPTRM**
| - | `neorv32_intrinsics.h` | macros for (custom) intrinsics/instructions
| `neorv32_mtime.c` | `neorv32_mtime.h` | HW driver functions for the **MTIME**
| `neorv32_neoled.c` | `neorv32_neoled.h` | HW driver functions for the **NEOLED**
| `neorv32_pwm.c` | `neorv32_pwm.h` | HW driver functions for the **PWM**
| `neorv32_rte.c` | `neorv32_rte.h` | NEORV32 **runtime environment** and helpers
| `neorv32_slink.c` | `neorv32_slink.h` | HW driver functions for the **SLINK**
| `neorv32_spi.c` | `neorv32_spi.h` | HW driver functions for the **SPI**
| `neorv32_trng.c` | `neorv32_trng.h` | HW driver functions for the **TRNG**
| `neorv32_twi.c` | `neorv32_twi.h` | HW driver functions for the **TWI**
| `neorv32_uart.c` | `neorv32_uart.h` | HW driver functions for the **UART0** and **UART1**
| `neorv32_wdt.c` | `neorv32_wdt.h` | HW driver functions for the **WDT**
| `neorv32_xip.c` | `neorv32_xip.h` | HW driver functions for the **XIP**
| `neorv32_xirq.c` | `neorv32_xirq.h` | HW driver functions for the **XIRQ**
|=======================
.Documentation

Binary file not shown.

After

Width:  |  Height:  |  Size: 3.5 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 65 KiB

After

Width:  |  Height:  |  Size: 66 KiB

Before After
Before After

Binary file not shown.

After

Width:  |  Height:  |  Size: 30 KiB

Binary file not shown.

Before

Width:  |  Height:  |  Size: 111 KiB

After

Width:  |  Height:  |  Size: 118 KiB

Before After
Before After

View file

@ -4,6 +4,7 @@
In resemblance to the RISC-V ISA, the NEORV32 processor was designed to ease customization and _extensibility_.
The processor provides several predefined options to add application-specific custom hardware modules and accelerators.
A <<_comparative_summary>> is given at the end of this section.
=== Standard (_External_) Interfaces
@ -15,47 +16,94 @@ https://stnolting.github.io/neorv32/#_primary_universal_asynchronous_receiver_an
https://stnolting.github.io/neorv32/#_serial_peripheral_interface_controller_spi[SPI] and
https://stnolting.github.io/neorv32/#_two_wire_serial_interface_controller_twi[TWI].
The SPI and (especially) the GPIO interfaces might be the most straightforward approaches since they
have a minimal protocol overhead. Device-specific interrupt capabilities can be added using the
The SPI and especially the GPIO interfaces might be the most straightforward approaches since they
have a minimal protocol overhead. Device-specific interrupt capabilities could be added using the
https://stnolting.github.io/neorv32/#_external_interrupt_controller_xirq[External Interrupt Controller (XIRQ)].
Beyond simplicity, these interface only provide a very limited bandwidth and require more sophisticated
software handling ("bit-banging" for the GPIO).
software handling ("bit-banging" for the GPIO). Hence, i is not recommend to use them for _chip-internal_ communication.
=== External Bus Interface
The https://stnolting.github.io/neorv32/#_processor_external_memory_interface_wishbone_axi4_lite[External Bus Interface]
provides the classic approach to connect to custom IP. By default, the bus interface implements the widely adopted
Wishbone interface standard. However, this project also includes wrappers to bridge to other protocol standards like ARM's
AXI4-Lite or Intel's Avalon. By using a full-featured bus protocol, complex SoC structures can be implemented (including
several modules and even multi-core architectures). Many FPGA EDA tools provide graphical editors to build and customize
whole SoC architectures and even include pre-defined IP libraries.
provides the classic approach for attaching custom IP. By default, the bus interface implements the widely adopted
Wishbone interface standard. This project also includes wrappers to convert to other protocol standards like ARM's
AXI4-Lite or Intel's Avalon protocols. By using a full-featured bus protocol, complex SoC designs can be implemented
including several modules and even multi-core architectures. Many FPGA EDA tools provide graphical editors to build
and customize whole SoC architectures and even include pre-defined IP libraries.
.Example AXI SoC using Xilinx Vivado
image::neorv32_axi_soc.png[]
Custom hardware modules attached to the processor's bus interface have no limitations regarding their functionality.
User-defined interfaces (like DDR memory access) can be implemented and the hardware module can operate completely
independent of the CPU.
The bus interface uses a memory-mapped approach. All data transfers are handled by simple load/store operations since the
external bus interface is mapped into the processor's https://stnolting.github.io/neorv32/#_address_space[address space].
This allows a very simple still high-bandwidth communications.
This allows a very simple still high-bandwidth communications. However, high bus traffic may increase access latencies.
=== Stream Link Interface
The NEORV32 https://stnolting.github.io/neorv32/#_stream_link_interface_slink[Stream Link Interface] provides
point-to-point, unidirectional and parallel data channels that can be used to transfer streaming data. In
contrast to the external bus interface, the streaming data does not provide any kind of "direction" control,
so it can be seen as "constant address bursts". The stream link interface provides less protocol overhead
and less latency than the bus interface. Furthermore, FIFOs can be be configured to each direction (RX/TX) to
allow more CPU-independent operation.
The https://stnolting.github.io/neorv32/#_stream_link_interface_slink[Stream Link Interface (SLINK)] provides a
point-to-point, unidirectional and parallel data interface that can be used to transfer _streaming_ data. In
contrast to the external bus interface, the streaming interface does not provide any kind of advanced control,
so it can be seen as "constant address bursts" where data is transmitted _sequentially_ (no random accesses).
While the CPU needs to "feed" the stream link interfaces with data (and read back incoming data), the actual
processor-external processing of the data run independently of the CPU.
The stream link interface provides less protocol overhead and less latency than the bus interface. Furthermore,
FIFOs can be be configured to each direction (RX/TX) to allow more CPU-independent operation.
=== Custom Functions Subsystem
The NEORV32 https://stnolting.github.io/neorv32/#_custom_functions_subsystem_cfs[Custom Functions Subsystem] is
an "empty" template for a processor-internal module. It provides 32 32-bit memory-mapped interface
registers that can be used to communicate with any arbitrary custom design logic. The intentions of this
subsystem is to provide a simple base, where the user can concentrate on implementing the actual design logic
rather than taking care of the communication between the CPU/software and the design logic. The interface
registers are already allocated within the processor's address space and are supported by the software framework
via low-level hardware access mechanisms. Additionally, the CFS provides a direct pre-defined interrupt channel to
the CPU, which is also supported by the NEORV32 runtime environment.
The https://stnolting.github.io/neorv32/#_custom_functions_subsystem_cfs[Custom Functions Subsystem (CFS)] is
an "empty" template for a memory-mapped, processor-internal module.
The basic idea of this subsystem is to provide a convenient, simple and flexible platform, where the user can
concentrate on implementing the actual design logic rather than taking care of the communication between the
CPU/software and the design logic. Note that the CFS does not have direct access to memory. All data (and control
instruction) have to be send by the CPU.
The use-cases for the CFS include medium-scale hardware accelerators that need to be tightly-coupled to the CPU.
Potential use cases could be DSP modules like CORDIC, cryptographic accelerators or custom interfaces (like IIS).
=== Custom Functions Unit
The https://stnolting.github.io/neorv32/#_custom_functions_unit_cfu[Custom Functions Unit (CFU)] is a functional
unit that is integrated right into the CPU's pipeline. It allows to implement custom RISC-V instructions.
This extension option is intended for rather small logic that implements operations, which cannot be emulated
in pure software in an efficient way. Since the CFU has direct access to the core's register file it can operate
with minimal data latency.
=== Comparative Summary
The following table gives a comparative summary of the most important factors when choosing one of the
chip-internal extension options:
* https://stnolting.github.io/neorv32/#_custom_functions_unit_cfu[Custom Functions Unit] for CPU-internal custom RISC-V instructions
* https://stnolting.github.io/neorv32/#_custom_functions_subsystem_cfs[Custom Functions Subsystem] for tightly-coupled processor-internal co-processors
* https://stnolting.github.io/neorv32/#_stream_link_interface_slink[Stream Link Interface] for processor-external streaming modules
* https://stnolting.github.io/neorv32/#_processor_external_memory_interface_wishbone_axi4_lite[External Bus Interface] for processor-external memory-mapped modules
.Comparison of On-Chip Extension Options
[cols="<1,^1,^1,^1,^1"]
[options="header",grid="rows"]
|=======================
| | Custom Functions Unit | Custom Functions Subsystem | Stream Link Interface | External Bus Interface
| **SoC location** | CPU-internal | processor-internal | processor-external | processor-external
| **HW complexity/size** | small | medium | unlimited | unlimited
| **CPU-independent operation** | no | partly | partly | completely
| **CPU interface** | register-file access | memory-mapped | memory-mapped | memory-mapped
| **Low-level CPU access scheme** | custom instructions | load/store | load/store | load/store
| **Random access** | - | yes | no, only sequential | yes
| **Access latency** | minimal | low | low | medium to high
| **External IO interfaces** | no | yes, but limited | yes | yes
| **Interrupt-capable** | no | yes | yes | user-defined
|=======================

View file

@ -5,6 +5,7 @@
-- # * neorv32_cpu.vhd - CPU top entity #
-- # * neorv32_cpu_alu.vhd - Arithmetic/logic unit #
-- # * neorv32_cpu_cp_bitmanip.vhd - Bit-manipulation co-processor #
-- # * neorv32_cpu_cp_cfu.vhd - Custom instructions co-processor #
-- # * neorv32_cpu_cp_fpu.vhd - Single-precision FPU co-processor #
-- # * neorv32_cpu_cp_muldiv.vhd - Integer multiplier/divider co-processor #
-- # * neorv32_cpu_cp_shifter.vhd - Base ISA shifter unit #
@ -76,6 +77,7 @@ entity neorv32_cpu is
CPU_EXTENSION_RISCV_Zihpm : boolean; -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei : boolean; -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit?
CPU_EXTENSION_RISCV_DEBUG : boolean; -- implement CPU debug mode?
-- Extension Options --
FAST_MUL_EN : boolean; -- use DSPs for M extension's multiplier
@ -182,6 +184,7 @@ begin
cond_sel_string_f(CPU_EXTENSION_RISCV_Zifencei, "_Zifencei", "") &
cond_sel_string_f(CPU_EXTENSION_RISCV_Zfinx, "_Zfinx", "") &
cond_sel_string_f(CPU_EXTENSION_RISCV_Zmmul, "_Zmmul", "") &
cond_sel_string_f(CPU_EXTENSION_RISCV_Zxcfu, "_Zxcfu", "") &
cond_sel_string_f(CPU_EXTENSION_RISCV_DEBUG, "_DEBUG", "") &
""
severity note;
@ -225,6 +228,9 @@ begin
-- Mul-extension --
assert not ((CPU_EXTENSION_RISCV_Zmmul = true) and (CPU_EXTENSION_RISCV_M = true)) report "NEORV32 CPU CONFIG ERROR! <M> and <Zmmul> extensions cannot co-exist!" severity error;
-- Custom Functions Unit --
assert not (CPU_EXTENSION_RISCV_Zxcfu = true) report "NEORV32 CPU CONFIG NOTE: Implementing Custom Functions Unit (CFU) as <Zxcfu> ISA extension." severity note;
-- Debug mode --
assert not ((CPU_EXTENSION_RISCV_DEBUG = true) and (CPU_EXTENSION_RISCV_Zicsr = false)) report "NEORV32 CPU CONFIG ERROR! Debug mode requires <CPU_EXTENSION_RISCV_Zicsr> extension to be enabled." severity error;
assert not ((CPU_EXTENSION_RISCV_DEBUG = true) and (CPU_EXTENSION_RISCV_Zifencei = false)) report "NEORV32 CPU CONFIG ERROR! Debug mode requires <CPU_EXTENSION_RISCV_Zifencei> extension to be enabled." severity error;
@ -257,6 +263,7 @@ begin
CPU_EXTENSION_RISCV_Zihpm => CPU_EXTENSION_RISCV_Zihpm, -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei => CPU_EXTENSION_RISCV_Zifencei, -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul => CPU_EXTENSION_RISCV_Zmmul, -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu => CPU_EXTENSION_RISCV_Zxcfu, -- implement custom (instr.) functions unit?
CPU_EXTENSION_RISCV_DEBUG => CPU_EXTENSION_RISCV_DEBUG, -- implement CPU debug mode?
-- Extension Options --
CPU_CNT_WIDTH => CPU_CNT_WIDTH, -- total width of CPU cycle and instret counters (0..64)
@ -349,6 +356,7 @@ begin
CPU_EXTENSION_RISCV_M => CPU_EXTENSION_RISCV_M, -- implement mul/div extension?
CPU_EXTENSION_RISCV_Zmmul => CPU_EXTENSION_RISCV_Zmmul, -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zfinx => CPU_EXTENSION_RISCV_Zfinx, -- implement 32-bit floating-point extension (using INT reg!)
CPU_EXTENSION_RISCV_Zxcfu => CPU_EXTENSION_RISCV_Zxcfu, -- implement custom (instr.) functions unit?
-- Extension Options --
FAST_MUL_EN => FAST_MUL_EN, -- use DSPs for M extension's multiplier
FAST_SHIFT_EN => FAST_SHIFT_EN -- use barrel shifter for shift operations

View file

@ -1,7 +1,7 @@
-- #################################################################################################
-- # << NEORV32 - Arithmetical/Logical Unit >> #
-- # ********************************************************************************************* #
-- # Main data and address ALU and co-processor interface/arbiter. #
-- # Main data/address ALU and ALU co-processor (= multi-cycle function units). #
-- # ********************************************************************************************* #
-- # BSD 3-Clause License #
-- # #
@ -48,6 +48,7 @@ entity neorv32_cpu_alu is
CPU_EXTENSION_RISCV_M : boolean; -- implement mul/div extension?
CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zfinx : boolean; -- implement 32-bit floating-point extension (using INT reg!)
CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit?
-- Extension Options --
FAST_MUL_EN : boolean; -- use DSPs for M extension's multiplier
FAST_SHIFT_EN : boolean -- use barrel shifter for shift operations
@ -334,10 +335,31 @@ begin
end generate;
-- Co-Processor 4: Reserved ---------------------------------------------------------------
-- Co-Processor 4: Custom (Instructions) Functions Unit ('Zxcfu' Extension) ---------------
-- -------------------------------------------------------------------------------------------
cp_result(4) <= (others => '0');
cp_valid(4) <= '0';
neorv32_cpu_cp_cfu_inst_true:
if (CPU_EXTENSION_RISCV_Zxcfu = true) generate
neorv32_cpu_cp_cfu_inst: neorv32_cpu_cp_cfu
port map (
-- global control --
clk_i => clk_i, -- global clock, rising edge
rstn_i => rstn_i, -- global reset, low-active, async
ctrl_i => ctrl_i, -- main control bus
start_i => cp_start(4), -- trigger operation
-- data input --
rs1_i => rs1_i, -- rf source 1
rs2_i => rs2_i, -- rf source 2
-- result and status --
res_o => cp_result(4), -- operation result
valid_o => cp_valid(4) -- data output valid
);
end generate;
neorv32_cpu_cp_cfu_inst_false:
if (CPU_EXTENSION_RISCV_Zxcfu = false) generate
cp_result(4) <= (others => '0');
cp_valid(4) <= '0';
end generate;
-- Co-Processor 5: Reserved ---------------------------------------------------------------

View file

@ -67,6 +67,7 @@ entity neorv32_cpu_control is
CPU_EXTENSION_RISCV_Zihpm : boolean; -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei : boolean; -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit?
CPU_EXTENSION_RISCV_DEBUG : boolean; -- implement CPU debug mode?
-- Extension Options --
CPU_CNT_WIDTH : natural; -- total width of CPU cycle and instret counters (0..64)
@ -1127,6 +1128,17 @@ begin
end if;
when opcode_cust0_c => -- CFU: custom RISC-V instructions (CUSTOM0 OPCODE space)
-- ------------------------------------------------------------
if (CPU_EXTENSION_RISCV_Zxcfu = true) then
ctrl_nxt(ctrl_cp_id_msb_c downto ctrl_cp_id_lsb_c) <= cp_sel_cfu_c; -- trigger CFU CP
ctrl_nxt(ctrl_alu_func1_c downto ctrl_alu_func0_c) <= alu_func_copro_c;
execute_engine.state_nxt <= ALU_WAIT;
else
execute_engine.state_nxt <= SYS_WAIT;
end if;
when others => -- system/csr access OR illegal opcode - nothing bad (= no commits) will happen here if there is an illegal opcode
-- ------------------------------------------------------------
if (CPU_EXTENSION_RISCV_Zicsr = true) then
@ -1188,7 +1200,7 @@ begin
ctrl_nxt(ctrl_alu_func1_c downto ctrl_alu_func0_c) <= alu_func_copro_c;
-- wait for completion or abort on illegal instruction exception (the co-processor will also terminate operations)
if (alu_idone_i = '1') or (trap_ctrl.exc_buf(exception_iillegal_c) = '1') then
ctrl_nxt(ctrl_rf_wb_en_c) <= '1'; -- valid RF write-back
ctrl_nxt(ctrl_rf_wb_en_c) <= '1'; -- valid RF write-back (won't happen in case of an illegal instruction)
execute_engine.state_nxt <= DISPATCH;
end if;
@ -1556,7 +1568,17 @@ begin
end if;
-- illegal E-CPU register? --
-- FIXME: rs2 is not checked!
illegal_register <= execute_engine.i_reg(instr_rs1_msb_c) or execute_engine.i_reg(instr_rd_msb_c);
illegal_register <= bool_to_ulogic_f(CPU_EXTENSION_RISCV_Zfinx) and (execute_engine.i_reg(instr_rs1_msb_c) or execute_engine.i_reg(instr_rd_msb_c));
when opcode_cust0_c => -- CFU: custom instructions
-- ------------------------------------------------------------
if (CPU_EXTENSION_RISCV_Zxcfu = true) then -- CFU extension implemented
illegal_instruction <= '0';
else
illegal_instruction <= '1';
end if;
-- illegal E-CPU register? --
illegal_register <= bool_to_ulogic_f(CPU_EXTENSION_RISCV_Zxcfu) and (execute_engine.i_reg(instr_rs2_msb_c) or execute_engine.i_reg(instr_rs1_msb_c) or execute_engine.i_reg(instr_rd_msb_c));
when others => -- undefined instruction -> illegal!
-- ------------------------------------------------------------

View file

@ -0,0 +1,189 @@
-- #################################################################################################
-- # << NEORV32 - CPU Co-Processor: Custom (Instructions) Functions Unit >> #
-- # ********************************************************************************************* #
-- # Intended for user-defined custom instructions (R2-type format only). #
-- # See the CPU's documentation for more information. #
-- # #
-- # NOTE: Take a look at the "software-counterpart" of this CFU example in 'sw/example/demo_cfu'. #
-- # #
-- # NOTE: This is a very early and very exemplary implementation of the custom functions unit. #
-- # Hence, it is not yet optimized for minimal interface latency. #
-- # ********************************************************************************************* #
-- # BSD 3-Clause License #
-- # #
-- # Copyright (c) 2022, Stephan Nolting. All rights reserved. #
-- # #
-- # Redistribution and use in source and binary forms, with or without modification, are #
-- # permitted provided that the following conditions are met: #
-- # #
-- # 1. Redistributions of source code must retain the above copyright notice, this list of #
-- # conditions and the following disclaimer. #
-- # #
-- # 2. Redistributions in binary form must reproduce the above copyright notice, this list of #
-- # conditions and the following disclaimer in the documentation and/or other materials #
-- # provided with the distribution. #
-- # #
-- # 3. Neither the name of the copyright holder nor the names of its contributors may be used to #
-- # endorse or promote products derived from this software without specific prior written #
-- # permission. #
-- # #
-- # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS #
-- # OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF #
-- # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE #
-- # COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, #
-- # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE #
-- # GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED #
-- # AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING #
-- # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED #
-- # OF THE POSSIBILITY OF SUCH DAMAGE. #
-- # ********************************************************************************************* #
-- # The NEORV32 Processor - https://github.com/stnolting/neorv32 (c) Stephan Nolting #
-- #################################################################################################
library ieee;
use ieee.std_logic_1164.all;
use ieee.numeric_std.all;
library neorv32;
use neorv32.neorv32_package.all;
entity neorv32_cpu_cp_cfu is
port (
-- global control --
clk_i : in std_ulogic; -- global clock, rising edge
rstn_i : in std_ulogic; -- global reset, low-active, async
ctrl_i : in std_ulogic_vector(ctrl_width_c-1 downto 0); -- main control bus
start_i : in std_ulogic; -- trigger operation
-- data input --
rs1_i : in std_ulogic_vector(data_width_c-1 downto 0); -- rf source 1
rs2_i : in std_ulogic_vector(data_width_c-1 downto 0); -- rf source 2
-- result and status --
res_o : out std_ulogic_vector(data_width_c-1 downto 0); -- operation result
valid_o : out std_ulogic -- data output valid
);
end neorv32_cpu_cp_cfu;
architecture neorv32_cpu_cp_cfu_rtl of neorv32_cpu_cp_cfu is
-- CFU controller - do not modify --
type control_t is record
busy : std_ulogic; -- CFU is busy
done : std_ulogic; -- set to '1' when processing is done
result : std_ulogic_vector(data_width_c-1 downto 0); -- user's processing result (for write-back to register file)
funct3 : std_ulogic_vector(2 downto 0); -- "funct3" bit-field from custom instruction
funct7 : std_ulogic_vector(6 downto 0); -- "funct7" bit-field from custom instruction
end record;
signal control : control_t;
begin
-- CFU Controller -------------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
-- This controller is required to handle the CPU/pipeline interface. Do not modify!
cfu_control: process(rstn_i, clk_i)
begin
if (rstn_i = '0') then
control.busy <= '0';
res_o <= (others => '0');
elsif rising_edge(clk_i) then
res_o <= (others => '0'); -- default
if (control.busy = '0') then -- idle
if (start_i = '1') then
control.busy <= '1';
end if;
else -- busy
if (control.done = '1') or (ctrl_i(ctrl_trap_c) = '1') then -- processing done? abort if trap
res_o <= control.result;
control.busy <= '0';
end if;
end if;
end if;
end process cfu_control;
-- CPU feedback --
valid_o <= control.busy and control.done; -- set one cycle before result data
-- pack user-defined instruction function bits --
control.funct3 <= ctrl_i(ctrl_ir_funct3_2_c downto ctrl_ir_funct3_0_c);
control.funct7 <= ctrl_i(ctrl_ir_funct12_11_c downto ctrl_ir_funct12_5_c);
-- ****************************************************************************************************************************
-- Actual CFU user logic - Add your custom logic below
-- ****************************************************************************************************************************
-- The CFU only supports the R2-type RISC-V instruction format. This format consists of two source registers (rs1 and rs2),
-- a destination register (rd) and two "immediate" bit-fields (funct7 and funct3). It is up to the user to decide which
-- of these fields are actually used by the CFU logic.
--
-- The user logic of the CFU has access to the following pre-defined signals:
--
-- -------------------------------------------------------------------------------------------
-- Input Operands
-- -------------------------------------------------------------------------------------------
-- > rs1_i (input, 32-bit): source register 1
-- > rs2_i (input, 32-bit): source register 2
-- > control.funct3 (input, 3-bit): 3-bit function select / immediate, driven by instruction word's funct3 bit field
-- > control.funct7 (input, 7-bit): 7-bit function select / immediate, driven by instruction word's funct7 bit field
--
-- The two signal rs1_i and rs2_i provide the data read from the CPU's register file, which is adressed by the
-- instruction word's rs1 and rs2 bit-fields.
--
-- The actual CFU operation can be defined by using the funct3 and funct7 signals. Both signals are directly driven by
-- the according bit-fields of the custom instruction. Note that these signals represent "immediates" that have to be
-- static already at compile time. These immediates can be used to select the actual function to be executed or they
-- can be used as immediates for certain operations (like shift amounts, addresses or offsets).
--
-- [NOTE]: rs1_i and rs2_i are directly driven by the register file (block RAM). It is recommended to buffer these signals
-- using CFU-internal registers before using them for computations as the rs1 and rs2 nets need to drive a lot of logic
-- in the CPU.
--
-- [NOTE]: It is not possible for the CFU and it's according instruction words to cause any kind of exception. The CPU
-- control logic only verifies the custom instructions OPCODE and checks if the CFU is implemented at all. No combination
-- of funct7 and funct3 will cause an exception.
--
-- -------------------------------------------------------------------------------------------
-- Result output
-- -------------------------------------------------------------------------------------------
-- > control.result (output, 32-bit): processing result
--
-- When the CFU has finished computation, the data in the control.result signal will be written to the CPU's register
-- file. The destination register is addressed by the rd bit-field in the instruction. The CFU result output is
-- registered in the CFU controller (see above) so do not worry too much about increasing the CPU's critical path. ;)
--
-- -------------------------------------------------------------------------------------------
-- Control
-- -------------------------------------------------------------------------------------------
-- > rstn_i (input, 1-bit): asynchronous reset, low-active
-- > clk_i (input, 1-bit): main clock
-- > start_i (input, 1-bit): operation trigger (start processing, high for one cycle)
-- > control.done (output, 1-bit): set high when the processing is done
--
-- For pure-combinatorial instructions (without internal state) a subset of those signals is sufficient; see the minimal
-- example below. If the CFU shall also include states (like memories, registers or "buffers") the start_i signal can be
-- used to trigger a new CFU operation. As soon as all internal computations have completed, the control.done signal has
-- to be set to indicate completion. This will write the result data (control.result) to the CPU register file.
--
-- [IMPORTANT]: The control.done *has to be set at some time*, otherwise the CPU will be halted forever.
-- User Logic Example ---------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
user_logic_function_select: process(control, rs1_i, rs2_i)
begin
-- This is a simple ALU that implements four pure-combinatorial instructions.
-- The actual function to-be-executed is selected by the "funct3" bit-field of the custom instruction.
case control.funct3 is
when "000" => control.result <= bin_to_gray_f(rs1_i); -- funct3 = "000": convert rs1 from binary to gray
when "001" => control.result <= gray_to_bin_f(rs1_i); -- funct3 = "001": convert rs1 from gray to binary
when "010" => control.result <= bit_rev_f(rs1_i); -- funct3 = "010": bit-reversal of rs1
when "011" => control.result <= rs1_i xnor rs2_i; -- funct3 = "011": XNOR input operands
when others => control.result <= (others => '0'); -- not implemented, set to zero
end case;
end process user_logic_function_select;
-- processing done? --
control.done <= '1'; -- we are just doing pure-combinatorial data processing here, which is done "immediately"
end neorv32_cpu_cp_cfu_rtl;

View file

@ -63,7 +63,7 @@ package neorv32_package is
-- Architecture Constants (do not modify!) ------------------------------------------------
-- -------------------------------------------------------------------------------------------
constant data_width_c : natural := 32; -- native data path width - do not change!
constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01060700"; -- no touchy!
constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01060701"; -- no touchy!
constant archid_c : natural := 19; -- official NEORV32 architecture ID - hands off!
-- Check if we're inside the Matrix -------------------------------------------------------
@ -454,6 +454,9 @@ package neorv32_package is
constant opcode_atomic_c : std_ulogic_vector(6 downto 0) := "0101111"; -- atomic operations (A extension)
-- floating point operations (Zfinx-only) (F/D/H/Q) --
constant opcode_fop_c : std_ulogic_vector(6 downto 0) := "1010011"; -- dual/single operand instruction
-- official "custom0/1" RISC-V opcodes - free for custom instructions --
constant opcode_cust0_c : std_ulogic_vector(6 downto 0) := "0001011"; -- custom instructions 0
--constant opcode_cust1_c : std_ulogic_vector(6 downto 0) := "0101011"; -- custom instructions 1
-- RISC-V Funct3 --------------------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
@ -780,7 +783,7 @@ package neorv32_package is
constant cp_sel_muldiv_c : std_ulogic_vector(2 downto 0) := "001"; -- CP1: multiplication/division operations ('M' extensions)
constant cp_sel_bitmanip_c : std_ulogic_vector(2 downto 0) := "010"; -- CP2: bit manipulation ('B' extensions)
constant cp_sel_fpu_c : std_ulogic_vector(2 downto 0) := "011"; -- CP3: floating-point unit ('Zfinx' extension)
--constant cp_sel_res0_c : std_ulogic_vector(2 downto 0) := "100"; -- CP4: reserved
constant cp_sel_cfu_c : std_ulogic_vector(2 downto 0) := "100"; -- CP4: custom instructions CFU ('Zxcfu' extension)
--constant cp_sel_res1_c : std_ulogic_vector(2 downto 0) := "101"; -- CP5: reserved
--constant cp_sel_res2_c : std_ulogic_vector(2 downto 0) := "110"; -- CP6: reserved
--constant cp_sel_res3_c : std_ulogic_vector(2 downto 0) := "111"; -- CP7: reserved
@ -945,6 +948,7 @@ package neorv32_package is
CPU_EXTENSION_RISCV_Zihpm : boolean := false; -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei : boolean := false; -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul : boolean := false; -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu : boolean := false; -- implement custom (instr.) functions unit?
-- Extension Options --
FAST_MUL_EN : boolean := false; -- use DSPs for M extension's multiplier
FAST_SHIFT_EN : boolean := false; -- use barrel shifter for shift operations
@ -1104,6 +1108,7 @@ package neorv32_package is
CPU_EXTENSION_RISCV_Zihpm : boolean; -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei : boolean; -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit?
CPU_EXTENSION_RISCV_DEBUG : boolean; -- implement CPU debug mode?
-- Extension Options --
FAST_MUL_EN : boolean; -- use DSPs for M extension's multiplier
@ -1181,6 +1186,7 @@ package neorv32_package is
CPU_EXTENSION_RISCV_Zihpm : boolean; -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei : boolean; -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit?
CPU_EXTENSION_RISCV_DEBUG : boolean; -- implement CPU debug mode?
-- Extension Options --
CPU_CNT_WIDTH : natural; -- total width of CPU cycle and instret counters (0..64)
@ -1267,6 +1273,7 @@ package neorv32_package is
CPU_EXTENSION_RISCV_M : boolean; -- implement mul/div extension?
CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zfinx : boolean; -- implement 32-bit floating-point extension (using INT reg!)
CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit?
-- Extension Options --
FAST_MUL_EN : boolean; -- use DSPs for M extension's multiplier
FAST_SHIFT_EN : boolean -- use barrel shifter for shift operations
@ -1379,6 +1386,24 @@ package neorv32_package is
);
end component;
-- Component: CPU Co-Processor Custom (Instr.) Functions Unit ('Zxcfu' extension) ---------
-- -------------------------------------------------------------------------------------------
component neorv32_cpu_cp_cfu
port (
-- global control --
clk_i : in std_ulogic; -- global clock, rising edge
rstn_i : in std_ulogic; -- global reset, low-active, async
ctrl_i : in std_ulogic_vector(ctrl_width_c-1 downto 0); -- main control bus
start_i : in std_ulogic; -- trigger operation
-- data input --
rs1_i : in std_ulogic_vector(data_width_c-1 downto 0); -- rf source 1
rs2_i : in std_ulogic_vector(data_width_c-1 downto 0); -- rf source 2
-- result and status --
res_o : out std_ulogic_vector(data_width_c-1 downto 0); -- operation result
valid_o : out std_ulogic -- data output valid
);
end component;
-- Component: CPU Bus Interface -----------------------------------------------------------
-- -------------------------------------------------------------------------------------------
component neorv32_cpu_bus
@ -2026,6 +2051,7 @@ package neorv32_package is
CPU_EXTENSION_RISCV_Zihpm : boolean; -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei : boolean; -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit?
CPU_EXTENSION_RISCV_DEBUG : boolean; -- implement CPU debug mode?
-- Extension Options --
FAST_MUL_EN : boolean; -- use DSPs for M extension's multiplier

View file

@ -54,6 +54,7 @@ entity neorv32_sysinfo is
CPU_EXTENSION_RISCV_Zihpm : boolean; -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei : boolean; -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul : boolean; -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu : boolean; -- implement custom (instr.) functions unit?
CPU_EXTENSION_RISCV_DEBUG : boolean; -- implement CPU debug mode?
-- Extension Options --
FAST_MUL_EN : boolean; -- use DSPs for M extension's multiplier
@ -141,8 +142,9 @@ begin
sysinfo_mem(1)(00) <= bool_to_ulogic_f(CPU_EXTENSION_RISCV_Zicsr); -- Zicsr
sysinfo_mem(1)(01) <= bool_to_ulogic_f(CPU_EXTENSION_RISCV_Zifencei); -- Zifencei
sysinfo_mem(1)(02) <= bool_to_ulogic_f(CPU_EXTENSION_RISCV_Zmmul); -- Zmmul
sysinfo_mem(1)(03) <= bool_to_ulogic_f(CPU_EXTENSION_RISCV_Zxcfu); -- Zxcfu
--
sysinfo_mem(1)(04 downto 03) <= (others => '0'); -- reserved
sysinfo_mem(1)(04) <= '0'; -- reserved
--
sysinfo_mem(1)(05) <= bool_to_ulogic_f(CPU_EXTENSION_RISCV_Zfinx); -- Zfinx ("F-alternative")
sysinfo_mem(1)(06) <= bool_to_ulogic_f(boolean(CPU_CNT_WIDTH /= 64)); -- reduced-size CPU counters (Zxscnt)

View file

@ -67,6 +67,7 @@ entity neorv32_top is
CPU_EXTENSION_RISCV_Zihpm : boolean := false; -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei : boolean := false; -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul : boolean := false; -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu : boolean := false; -- implement custom (instr.) functions unit?
-- Extension Options --
FAST_MUL_EN : boolean := false; -- use DSPs for M extension's multiplier
@ -492,6 +493,7 @@ begin
CPU_EXTENSION_RISCV_Zihpm => CPU_EXTENSION_RISCV_Zihpm, -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei => CPU_EXTENSION_RISCV_Zifencei, -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul => CPU_EXTENSION_RISCV_Zmmul, -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu => CPU_EXTENSION_RISCV_Zxcfu, -- implement custom (instr.) functions unit?
CPU_EXTENSION_RISCV_DEBUG => ON_CHIP_DEBUGGER_EN, -- implement CPU debug mode?
-- Extension Options --
FAST_MUL_EN => FAST_MUL_EN, -- use DSPs for M extension's multiplier
@ -1503,6 +1505,7 @@ begin
CPU_EXTENSION_RISCV_Zihpm => CPU_EXTENSION_RISCV_Zihpm, -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei => CPU_EXTENSION_RISCV_Zifencei, -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul => CPU_EXTENSION_RISCV_Zmmul, -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu => CPU_EXTENSION_RISCV_Zxcfu, -- implement custom (instr.) functions unit?
CPU_EXTENSION_RISCV_DEBUG => ON_CHIP_DEBUGGER_EN, -- implement CPU debug mode?
-- Extension Options --
FAST_MUL_EN => FAST_MUL_EN, -- use DSPs for M extension's multiplier

View file

@ -59,6 +59,8 @@ entity neorv32_ProcessorTop_stdlogic is
CPU_EXTENSION_RISCV_Zicntr : boolean := true; -- implement base counters?
CPU_EXTENSION_RISCV_Zihpm : boolean := false; -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei : boolean := false; -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul : boolean := false; -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu : boolean := false; -- implement custom (instr.) functions unit?
-- Extension Options --
FAST_MUL_EN : boolean := false; -- use DSPs for M extension's multiplier
FAST_SHIFT_EN : boolean := false; -- use barrel shifter for shift operations
@ -296,6 +298,8 @@ begin
CPU_EXTENSION_RISCV_Zicntr => CPU_EXTENSION_RISCV_Zicntr, -- implement base counters?
CPU_EXTENSION_RISCV_Zihpm => CPU_EXTENSION_RISCV_Zihpm, -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei => CPU_EXTENSION_RISCV_Zifencei, -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul => CPU_EXTENSION_RISCV_Zmmul, -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu => CPU_EXTENSION_RISCV_Zxcfu, -- implement custom (instr.) functions unit?
-- Extension Options --
FAST_MUL_EN => FAST_MUL_EN, -- use DSPs for M extension's multiplier
FAST_SHIFT_EN => FAST_SHIFT_EN, -- use barrel shifter for shift operations

View file

@ -65,6 +65,7 @@ entity neorv32_top_avalonmm is
CPU_EXTENSION_RISCV_Zihpm : boolean := false; -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei : boolean := false; -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul : boolean := false; -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu : boolean := false; -- implement custom (instr.) functions unit?
-- Extension Options --
FAST_MUL_EN : boolean := false; -- use DSPs for M extension's multiplier
@ -259,6 +260,7 @@ begin
CPU_EXTENSION_RISCV_Zihpm => CPU_EXTENSION_RISCV_Zihpm,
CPU_EXTENSION_RISCV_Zifencei => CPU_EXTENSION_RISCV_Zifencei,
CPU_EXTENSION_RISCV_Zmmul => CPU_EXTENSION_RISCV_Zmmul,
CPU_EXTENSION_RISCV_Zxcfu => CPU_EXTENSION_RISCV_Zxcfu,
-- Extension Options --
FAST_MUL_EN => FAST_MUL_EN,

View file

@ -65,6 +65,8 @@ entity neorv32_SystemTop_axi4lite is
CPU_EXTENSION_RISCV_Zicntr : boolean := true; -- implement base counters?
CPU_EXTENSION_RISCV_Zihpm : boolean := false; -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei : boolean := false; -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul : boolean := false; -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu : boolean := false; -- implement custom (instr.) functions unit?
-- Extension Options --
FAST_MUL_EN : boolean := false; -- use DSPs for M extension's multiplier
FAST_SHIFT_EN : boolean := false; -- use barrel shifter for shift operations
@ -303,6 +305,8 @@ begin
CPU_EXTENSION_RISCV_Zicntr => CPU_EXTENSION_RISCV_Zicntr, -- implement base counters?
CPU_EXTENSION_RISCV_Zihpm => CPU_EXTENSION_RISCV_Zihpm, -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei => CPU_EXTENSION_RISCV_Zifencei, -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul => CPU_EXTENSION_RISCV_Zmmul, -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu => CPU_EXTENSION_RISCV_Zxcfu, -- implement custom (instr.) functions unit?
-- Extension Options --
FAST_MUL_EN => FAST_MUL_EN, -- use DSPs for M extension's multiplier
FAST_SHIFT_EN => FAST_SHIFT_EN, -- use barrel shifter for shift operations

View file

@ -294,6 +294,8 @@ begin
CPU_EXTENSION_RISCV_Zicntr => true, -- implement base counters?
CPU_EXTENSION_RISCV_Zihpm => true, -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei => true, -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul => false, -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu => true, -- implement custom (instr.) functions unit?
-- Extension Options --
FAST_MUL_EN => false, -- use DSPs for M extension's multiplier
FAST_SHIFT_EN => false, -- use barrel shifter for shift operations

View file

@ -187,6 +187,7 @@ begin
CPU_EXTENSION_RISCV_Zihpm => true, -- implement hardware performance monitors?
CPU_EXTENSION_RISCV_Zifencei => CPU_EXTENSION_RISCV_Zifencei, -- implement instruction stream sync.?
CPU_EXTENSION_RISCV_Zmmul => false, -- implement multiply-only M sub-extension?
CPU_EXTENSION_RISCV_Zxcfu => true, -- implement custom (instr.) functions unit?
-- Extension Options --
FAST_MUL_EN => false, -- use DSPs for M extension's multiplier
FAST_SHIFT_EN => false, -- use barrel shifter for shift operations

183
sw/example/demo_cfu/main.c Normal file
View file

@ -0,0 +1,183 @@
// #################################################################################################
// # << NEORV32 - CFU Custom Instructions Example Program >> #
// # ********************************************************************************************* #
// # BSD 3-Clause License #
// # #
// # Copyright (c) 2022, Stephan Nolting. All rights reserved. #
// # #
// # Redistribution and use in source and binary forms, with or without modification, are #
// # permitted provided that the following conditions are met: #
// # #
// # 1. Redistributions of source code must retain the above copyright notice, this list of #
// # conditions and the following disclaimer. #
// # #
// # 2. Redistributions in binary form must reproduce the above copyright notice, this list of #
// # conditions and the following disclaimer in the documentation and/or other materials #
// # provided with the distribution. #
// # #
// # 3. Neither the name of the copyright holder nor the names of its contributors may be used to #
// # endorse or promote products derived from this software without specific prior written #
// # permission. #
// # #
// # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS #
// # OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF #
// # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE #
// # COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, #
// # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE #
// # GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED #
// # AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING #
// # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED #
// # OF THE POSSIBILITY OF SUCH DAMAGE. #
// # ********************************************************************************************* #
// # The NEORV32 Processor - https://github.com/stnolting/neorv32 (c) Stephan Nolting #
// #################################################################################################
/**********************************************************************//**
* @file demo_cfu/main.c
* @author Stephan Nolting
* @brief Example program showing how to use the CFU's custom instructions.
**************************************************************************/
#include <neorv32.h>
/**********************************************************************//**
* @name User configuration
**************************************************************************/
/**@{*/
/** UART BAUD rate */
#define BAUD_RATE 19200
/** Number of test cases per CFU instruction */
#define TESTCASES 4
/**@}*/
/**********************************************************************//**
* @name Prototypes
**************************************************************************/
uint32_t xorshift32(void);
/**********************************************************************//**
* Main function
*
* @note This program requires the CFU and UART0.
*
* @return 0 if execution was successful
**************************************************************************/
int main() {
// initialize NEORV32 run-time environment
neorv32_rte_setup();
// setup UART0 at default baud rate, no parity bits, no HW flow control
neorv32_uart0_setup(BAUD_RATE, PARITY_NONE, FLOW_CONTROL_NONE);
// check if UART0 is implemented
if (neorv32_uart0_available() == 0) {
return 1; // UART0 not available, exit
}
// check if the CFU is implemented at all
// note that the CFU is wrapped in the core's "Zxcfu" ISA extension
if (neorv32_cpu_cfu_available() == 0) {
neorv32_uart0_printf("ERROR! CFU ('Zxcfu' ISA extensions) not implemented!\n");
return 1;
}
// intro
neorv32_uart0_printf("\n<<< NEORV32 Custom Functions Unit (CFU) 'Custom Instructions' Example Program >>>\n\n");
neorv32_uart0_printf("NOTE: This program assumes the _default_ CFU hardware module, which implements\n"
" four simple data conversion instructions.\n\n");
neorv32_uart0_printf("NOTE: This program (and it's comments) just shows how to USE the CFU's custom\n"
" instructions. The actual implementation of these instructions is done\n"
" in the CFU hardware module (-> rtl/core/neorv32_cpu_cp_cfu.vhd).\n\n");
// custom instructions usage examples
uint32_t i, opa, opb;
neorv32_uart0_printf("\n--- CFU 'binary to gray' instruction (funct3 = 000) ---\n");
for (i=0; i<TESTCASES; i++) {
opa = xorshift32(); // get random test data
opb = 0;
neorv32_uart0_printf("%u: neorv32_cfu_cmd0 - OPA = 0x%x, OPB = 0x%x, ", i, opa, opb);
// The CFU custom instruction can be used as plain C functions!
//
// There are 8 "prototypes" for the CFU instructions:
// - neorv32_cfu_cmd0(funct7, rs1, rs2) - sets the instruction's "funct3" bit field to 000
// - neorv32_cfu_cmd1(funct7, rs1, rs2) - sets the instruction's "funct3" bit field to 001
// - ...
// - neorv32_cfu_cmd7(funct7, rs1, rs2) - sets the instruction's "funct3" bit field to 111
//
// These functions are turned into 32-bit instruction words resembling a R2-type RISC-V instruction
// (=> "intrinsics").
//
// Each neorv32_cfu_cmd* function requires three arguments:
// - funct7: a compile-time static 7-bit immediate (put in the instruction's "funct7" bit field)
// - rs1: a 32-bit operand A (this is the first register file source rs1)
// - rs2: a 32-bit operand B (this is the first register second source rs2)
//
// The operands can be literals, variables, function return values, ... you name it.
//
// Each neorv32_cfu_cmd* function returns a 32-bit uint32_t data word, which represents
// the result of the according instruction.
//
// The 7-bit immediate ("funct7") can be used to pass small _static_ literals to the CFU
// or to do a more fine-grained function selection - it all depends on your hardware implementation! ;)
neorv32_uart0_printf("Result = 0x%x\n", neorv32_cfu_cmd0(0b0000000, opa, opb));
}
neorv32_uart0_printf("\n--- CFU 'gray to binary' instruction (funct3 = 001) ---\n");
for (i=0; i<TESTCASES; i++) {
opa = xorshift32();
neorv32_uart0_printf("%u: neorv32_cfu_cmd1 - OPA = 0x%x, OPB = 0x%x, ", i, opa, 0);
// you can also pass literals instead of variables to the intrinsics (0 instead of opb):
neorv32_uart0_printf("Result = 0x%x\n", neorv32_cfu_cmd1(0b0000000, opa, 0));
}
neorv32_uart0_printf("\n--- CFU 'bit reversal' instruction (funct3 = 010) ---\n");
for (i=0; i<TESTCASES; i++) {
opa = xorshift32();
neorv32_uart0_printf("%u: neorv32_cfu_cmd2 - OPA = 0x%x, OPB = 0x%x, ", i, opa, 0);
// here we are setting the funct7 bit-field to all-one; however, this is not
// used at all by the default CFU hardware module
// note that all funct3/funct7 combinations are treated as "valid" by the CPU
// - so there is no chance of causing an illegal instruction exception by using the CFU intrinsics
neorv32_uart0_printf("Result = 0x%x\n", neorv32_cfu_cmd2(0b1111111, opa, 0));
}
neorv32_uart0_printf("\n--- CFU 'XNOR' instruction (funct3 = 011) ---\n");
for (i=0; i<TESTCASES; i++) {
opa = xorshift32();
opb = xorshift32();
neorv32_uart0_printf("%u: neorv32_cfu_cmd3 - OPA = 0x%x, OPB = 0x%x, ", i, opa, opb);
neorv32_uart0_printf("Result = 0x%x\n", neorv32_cfu_cmd3(0b0000000, opa, opb));
}
neorv32_uart0_printf("\nCFU demo program completed.\n");
return 0;
}
/**********************************************************************//**
* Pseudo-random number generator (to generate deterministic test data).
*
* @return Random data (32-bit).
**************************************************************************/
uint32_t xorshift32(void) {
static uint32_t x32 = 314159265;
x32 ^= x32 << 13;
x32 ^= x32 >> 17;
x32 ^= x32 << 5;
return x32;
}

View file

@ -0,0 +1,40 @@
#################################################################################################
# << NEORV32 - Application Makefile >> #
# ********************************************************************************************* #
# Make sure to add the RISC-V GCC compiler's bin folder to your PATH environment variable. #
# ********************************************************************************************* #
# BSD 3-Clause License #
# #
# Copyright (c) 2021, Stephan Nolting. All rights reserved. #
# #
# Redistribution and use in source and binary forms, with or without modification, are #
# permitted provided that the following conditions are met: #
# #
# 1. Redistributions of source code must retain the above copyright notice, this list of #
# conditions and the following disclaimer. #
# #
# 2. Redistributions in binary form must reproduce the above copyright notice, this list of #
# conditions and the following disclaimer in the documentation and/or other materials #
# provided with the distribution. #
# #
# 3. Neither the name of the copyright holder nor the names of its contributors may be used to #
# endorse or promote products derived from this software without specific prior written #
# permission. #
# #
# THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS #
# OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF #
# MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE #
# COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, #
# EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE #
# GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED #
# AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING #
# NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED #
# OF THE POSSIBILITY OF SUCH DAMAGE. #
# ********************************************************************************************* #
# The NEORV32 Processor - https://github.com/stnolting/neorv32 (c) Stephan Nolting #
#################################################################################################
# Modify this variable to fit your NEORV32 setup (neorv32 home folder)
NEORV32_HOME ?= ../../..
include $(NEORV32_HOME)/sw/common/common.mk

View file

@ -1254,6 +1254,7 @@ enum NEORV32_SYSINFO_CPU_enum {
SYSINFO_CPU_ZICSR = 0, /**< SYSINFO_CPU (0): Zicsr extension (I sub-extension) available when set (r/-) */
SYSINFO_CPU_ZIFENCEI = 1, /**< SYSINFO_CPU (1): Zifencei extension (I sub-extension) available when set (r/-) */
SYSINFO_CPU_ZMMUL = 2, /**< SYSINFO_CPU (2): Zmmul extension (M sub-extension) available when set (r/-) */
SYSINFO_CPU_ZXCFU = 3, /**< SYSINFO_CPU (3): Zxcfu extension (custom functions unit for custom instructions) available when set (r/-) */
SYSINFO_CPU_ZFINX = 5, /**< SYSINFO_CPU (5): Zfinx extension (F sub-/alternative-extension) available when set (r/-) */
SYSINFO_CPU_ZXSCNT = 6, /**< SYSINFO_CPU (6): Custom extension - Small CPU counters: "cycle" & "instret" CSRs have less than 64-bit when set (r/-) */
@ -1322,14 +1323,15 @@ enum NEORV32_SYSINFO_SOC_enum {
// ----------------------------------------------------------------------------
// Include all IO driver headers
// Include all system header files
// ----------------------------------------------------------------------------
// cpu core
#include "neorv32_cpu.h"
// intrinsics
#include "neorv32_intrinsics.h"
// cpu core
#include "neorv32_cpu.h"
#include "neorv32_cpu_cfu.h"
// neorv32 runtime environment
#include "neorv32_rte.h"

View file

@ -0,0 +1,71 @@
// #################################################################################################
// # << NEORV32: neorv32_cfu.h - CPU Core - CFU Co-Processor Hardware Driver >> #
// # ********************************************************************************************* #
// # BSD 3-Clause License #
// # #
// # Copyright (c) 2022, Stephan Nolting. All rights reserved. #
// # #
// # Redistribution and use in source and binary forms, with or without modification, are #
// # permitted provided that the following conditions are met: #
// # #
// # 1. Redistributions of source code must retain the above copyright notice, this list of #
// # conditions and the following disclaimer. #
// # #
// # 2. Redistributions in binary form must reproduce the above copyright notice, this list of #
// # conditions and the following disclaimer in the documentation and/or other materials #
// # provided with the distribution. #
// # #
// # 3. Neither the name of the copyright holder nor the names of its contributors may be used to #
// # endorse or promote products derived from this software without specific prior written #
// # permission. #
// # #
// # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS #
// # OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF #
// # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE #
// # COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, #
// # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE #
// # GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED #
// # AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING #
// # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED #
// # OF THE POSSIBILITY OF SUCH DAMAGE. #
// # ********************************************************************************************* #
// # The NEORV32 Processor - https://github.com/stnolting/neorv32 (c) Stephan Nolting #
// #################################################################################################
/**********************************************************************//**
* @file neorv32_cpu_cfu.h
* @author Stephan Nolting
* @brief CPU Core custom functions unit HW driver header file.
**************************************************************************/
#ifndef neorv32_cpu_cfu_h
#define neorv32_cpu_cfu_h
// prototypes
int neorv32_cpu_cfu_available(void);
/**********************************************************************//**
* @name CFU custom instructions (intrinsic)
**************************************************************************/
/**@{*/
/** CFU custom instruction 0 (funct3 = 000) */
#define neorv32_cfu_cmd0(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 0, RISCV_OPCODE_CUSTOM0)
/** CFU custom instruction 1 (funct3 = 001) */
#define neorv32_cfu_cmd1(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 1, RISCV_OPCODE_CUSTOM0)
/** CFU custom instruction 2 (funct3 = 010) */
#define neorv32_cfu_cmd2(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 2, RISCV_OPCODE_CUSTOM0)
/** CFU custom instruction 3 (funct3 = 011) */
#define neorv32_cfu_cmd3(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 3, RISCV_OPCODE_CUSTOM0)
/** CFU custom instruction 4 (funct3 = 100) */
#define neorv32_cfu_cmd4(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 4, RISCV_OPCODE_CUSTOM0)
/** CFU custom instruction 5 (funct3 = 101) */
#define neorv32_cfu_cmd5(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 5, RISCV_OPCODE_CUSTOM0)
/** CFU custom instruction 6 (funct3 = 110) */
#define neorv32_cfu_cmd6(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 6, RISCV_OPCODE_CUSTOM0)
/** CFU custom instruction 7 (funct3 = 111) */
#define neorv32_cfu_cmd7(funct7, rs1, rs2) CUSTOM_INSTR_R2_TYPE(funct7, rs2, rs1, 7, RISCV_OPCODE_CUSTOM0)
/**@}*/
#endif // neorv32_cpu_cfu_h

View file

@ -154,6 +154,10 @@ asm(".set regnum_t3 , 28");
asm(".set regnum_t4 , 29");
asm(".set regnum_t5 , 30");
asm(".set regnum_t6 , 31");
/** Official RISC-V opcodes for custom extensions (CUSTOM0, CUSTOM1) */
asm(".set RISCV_OPCODE_CUSTOM0 , 0b0001011");
asm(".set RISCV_OPCODE_CUSTOM1 , 0b0101011");
/**@}*/
@ -193,7 +197,8 @@ asm(".set regnum_t6 , 31");
asm volatile ( \
"" \
: [output] "=r" (__return) \
: [input_i] "r" (rs1), [input_j] "r" (rs2) \
: [input_i] "r" (rs1), \
[input_j] "r" (rs2) \
); \
asm volatile ( \
".word ( \
@ -205,7 +210,8 @@ asm(".set regnum_t6 , 31");
(((" #opcode ") & 0x7f) << 0) \
);" \
: [rd] "=r" (__return) \
: "r" (rs1), "r" (rs2) \
: "r" (rs1), \
"r" (rs2) \
); \
__return; \
})
@ -220,7 +226,9 @@ asm(".set regnum_t6 , 31");
asm volatile ( \
"" \
: [output] "=r" (__return) \
: [input_i] "r" (rs1), [input_j] "r" (rs2), [input_k] "r" (rs3) \
: [input_i] "r" (rs1), \
[input_j] "r" (rs2), \
[input_k] "r" (rs3) \
); \
asm volatile ( \
".word ( \
@ -232,7 +240,9 @@ asm(".set regnum_t6 , 31");
(((" #opcode ") & 0x7f) << 0) \
);" \
: [rd] "=r" (__return) \
: "r" (rs1), "r" (rs2), "r" (rs3) \
: "r" (rs1), \
"r" (rs2), \
"r" (rs3) \
); \
__return; \
})

View file

@ -3,7 +3,7 @@
// # ********************************************************************************************* #
// # BSD 3-Clause License #
// # #
// # Copyright (c) 2021, Stephan Nolting. All rights reserved. #
// # Copyright (c) 2022, Stephan Nolting. All rights reserved. #
// # #
// # Redistribution and use in source and binary forms, with or without modification, are #
// # permitted provided that the following conditions are met: #
@ -49,7 +49,7 @@
/**********************************************************************//**
* Check if custom functions unit 0 was synthesized.
* Check if custom functions subsystem was synthesized.
*
* @return 0 if CFS was not synthesized, 1 if CFS is available.
**************************************************************************/

View file

@ -0,0 +1,60 @@
// #################################################################################################
// # << NEORV32: neorv32_cfu.c - CPU Core - CFU Co-Processor Hardware Driver >> #
// # ********************************************************************************************* #
// # BSD 3-Clause License #
// # #
// # Copyright (c) 2022, Stephan Nolting. All rights reserved. #
// # #
// # Redistribution and use in source and binary forms, with or without modification, are #
// # permitted provided that the following conditions are met: #
// # #
// # 1. Redistributions of source code must retain the above copyright notice, this list of #
// # conditions and the following disclaimer. #
// # #
// # 2. Redistributions in binary form must reproduce the above copyright notice, this list of #
// # conditions and the following disclaimer in the documentation and/or other materials #
// # provided with the distribution. #
// # #
// # 3. Neither the name of the copyright holder nor the names of its contributors may be used to #
// # endorse or promote products derived from this software without specific prior written #
// # permission. #
// # #
// # THIS SOFTWARE IS PROVIDED BY THE COPYRIGHT HOLDERS AND CONTRIBUTORS "AS IS" AND ANY EXPRESS #
// # OR IMPLIED WARRANTIES, INCLUDING, BUT NOT LIMITED TO, THE IMPLIED WARRANTIES OF #
// # MERCHANTABILITY AND FITNESS FOR A PARTICULAR PURPOSE ARE DISCLAIMED. IN NO EVENT SHALL THE #
// # COPYRIGHT HOLDER OR CONTRIBUTORS BE LIABLE FOR ANY DIRECT, INDIRECT, INCIDENTAL, SPECIAL, #
// # EXEMPLARY, OR CONSEQUENTIAL DAMAGES (INCLUDING, BUT NOT LIMITED TO, PROCUREMENT OF SUBSTITUTE #
// # GOODS OR SERVICES; LOSS OF USE, DATA, OR PROFITS; OR BUSINESS INTERRUPTION) HOWEVER CAUSED #
// # AND ON ANY THEORY OF LIABILITY, WHETHER IN CONTRACT, STRICT LIABILITY, OR TORT (INCLUDING #
// # NEGLIGENCE OR OTHERWISE) ARISING IN ANY WAY OUT OF THE USE OF THIS SOFTWARE, EVEN IF ADVISED #
// # OF THE POSSIBILITY OF SUCH DAMAGE. #
// # ********************************************************************************************* #
// # The NEORV32 Processor - https://github.com/stnolting/neorv32 (c) Stephan Nolting #
// #################################################################################################
/**********************************************************************//**
* @file neorv32_cpu_cfu.c
* @author Stephan Nolting
* @brief CPU Core custom functions unit HW driver source file.
**************************************************************************/
#include "neorv32.h"
#include "neorv32_cpu_cfu.h"
/**********************************************************************//**
* Check if custom functions unit was synthesized.
*
* @return 0 if CFU was not synthesized, 1 if CFU is available.
**************************************************************************/
int neorv32_cpu_cfu_available(void) {
// this is an ISA extension - not a SoC module
if (NEORV32_SYSINFO.CPU & (1 << SYSINFO_CPU_ZXCFU)) {
return 1;
}
else {
return 0;
}
}

View file

@ -355,6 +355,9 @@ void neorv32_rte_print_hw_config(void) {
if (tmp & (1<<SYSINFO_CPU_ZFINX)) {
neorv32_uart0_printf("Zfinx ");
}
if (tmp & (1<<SYSINFO_CPU_ZXCFU)) {
neorv32_uart0_printf("Zxcfu ");
}
if (tmp & (1<<SYSINFO_CPU_ZXSCNT)) {
neorv32_uart0_printf("Zxscnt(!) ");
}