mirror of
https://github.com/stnolting/neorv32.git
synced 2025-04-23 21:57:33 -04:00
[docs] updated documentation
This commit is contained in:
parent
3de295379f
commit
7084048551
3 changed files with 19 additions and 12 deletions
13
CHANGELOG.md
13
CHANGELOG.md
|
@ -14,6 +14,7 @@ For the HDL sources the version number is globally defined by the `hw_version_c`
|
|||
|
||||
| Date (*dd.mm.yyyy*) | Version | Comment |
|
||||
|:----------:|:-------:|:--------|
|
||||
| 17.10.2020 | 1.4.5.5 | New makefile target `upload` allows to directly upload an executable to the bootloader from the console |
|
||||
| 17.10.2020 | 1.4.5.4 | Added new CPU/Processor generic `FAST_SHIFT_EN` (default = *false*) to enable implementation of a fast (but large) barrel shifter for accelerating CPU shift instructions; updated CoreMark performance results |
|
||||
| 16.10.2020 | 1.4.5.2 | Added read-only flag to custom `mzext` CSR to check if physical memory protection (PMP) is implemented; added [C] `mzext` CSR name aliases to neorv32.h |
|
||||
| 15.10.2020 | 1.4.5.1 | Fixed "unprecise exceptions": `mtval` did not always reflect the correct value according to the instruction that caused the exceptions; fixed bug in RTE: Debug trap handler was not showing the correct `mepc` value |
|
||||
|
@ -21,7 +22,7 @@ For the HDL sources the version number is globally defined by the `hw_version_c`
|
|||
| 12.10.2020 | 1.4.4.9 | Added *alignment flags* to makefiles: branch/jump/call targets are forced to be 32-bit aligned -> increases performance when using the `C` extension; added makefile flag listing to NEORV32.pdf; updated performance results for CPUs with `C` extension; `crt0.S` will initialize *all* registers with zero if not using `E` extension and not compiling bootloader |
|
||||
| 11.10.2020 | 1.4.4.8 | Reworked pipeline frontend: Optimized fetch enginge, added issue engine, faster instruction fetch after taken branches + reduced hardware requirements; updated synthesis and performance results |
|
||||
| 11.10.2020 | 1.4.4.6 | Added option to configure external memory interface (Wishbone) to either use *standard/classic protocol* (default) or *pipelined protocol* (for better timing): via `wb_pipe_mode_c` constant in VHDL package file (`rtl/core/neorv32_package.vhd`); added help text to NEORV32.pdf section "3.4.4. Processor-External Memory Interface (WISHBONE)" |
|
||||
| 08.10.2020 | 1.4.4.5 | Removed CPU’s `BUS_TIMEOUT` and processor’s `MEM_EXT_TIMEOUT` generics; instead, a global configuration `bus_timeout_c` in the VHDL package file is used now |
|
||||
| 08.10.2020 | 1.4.4.5 | Removed CPU's `BUS_TIMEOUT` and processor's `MEM_EXT_TIMEOUT` generics; instead, a global configuration `bus_timeout_c` in the VHDL package file is used now |
|
||||
| 08.10.2020 | 1.4.4.4 | Removed `DEVNULL` device; all simulation output options from this device are now available as `SIM_MODE` in the `UART`; `mcause` CSR can now also be written; FIXED: trying to write a read-only CSR will cause an illegal instruction exception; for compatibility reasons any write access to the misa CSR will be ignored and will NOT cause an exception |
|
||||
| 07.10.2020 | 1.4.4.2 | Simplified ALU's set of core operations; removed co-processor data mux right after ALU -> shorter critical path; CPU control VHDL code clean-up and CSR write logic optimization; optimized IMEM/DMEM access logic; added note regarding alignment of IMEM/DMEM |
|
||||
| 05.10.2020 | [**:rocket:1.4.4.0**](https://github.com/stnolting/neorv32/releases/tag/v1.4.4.0) | Fixed bug in external memory interface: Executing code from external memory was causing an instruction fetch stall |
|
||||
|
@ -29,14 +30,14 @@ For the HDL sources the version number is globally defined by the `hw_version_c`
|
|||
| 01.10.2020 | 1.4.3.8 | Added CPU top entity wrapper with resolved port signals `rtl/top_templetes/neorv32_cpu_stdlogic.vhd`; optimized ALU core functions – shorter critical path, less control overhead, reduced HW footprint |
|
||||
| 27.09.2020 | 1.4.3.3 | Further improved ALU and control logic; CSR access instruction require one additional cycle now (to let side effects kick in); updated synthesis results; added CFU hardware driver dummy |
|
||||
| 26.09.2020 | 1.4.3.2 | Fixed bug in `CSRRWI` instruction (introduced with version 1.4.3.1); further ALU operand logic optimizations; updated CPU data path figure |
|
||||
| 25.09.2020 | 1.4.3.1 | Register file’s `x0` is now a physical register; this register is initialized by the hardware and locked afterwards; removed "set to zero" stage -> smaller hardware footprint and shorter critical path; added processor top entity wrapper with resolved signals `rtl/top_templetes/neorv32_top_stdlogic.vhd` |
|
||||
| 16.09.2020 | [**:rocket:1.4.3.0**](https://github.com/stnolting/neorv32/releases/tag/v1.4.3.0) | Simplified memory configuration: removed processor top’s memory space configuration generics (`MEM_ISPACE_BASE`, `MEM_ISPACE_SIZE`, `MEM_DSPACE_BASE`, `MEM_DSPACE_SIZE`); data/instruction space sizes are irrelevant for hardware; instruction/data space base addresses are fixed (but can be modified in NEORV32 VHDL package file); modified SYSINFO registers; adapted bootloader, crt0 start-up code and linker script; stack configuration is now done via linker script; reworked chapter "address space"; added CFU interrupt -> fast interrupt channel 1 (shared with GPIO) |
|
||||
| 25.09.2020 | 1.4.3.1 | Register file's `x0` is now a physical register; this register is initialized by the hardware and locked afterwards; removed "set to zero" stage -> smaller hardware footprint and shorter critical path; added processor top entity wrapper with resolved signals `rtl/top_templetes/neorv32_top_stdlogic.vhd` |
|
||||
| 16.09.2020 | [**:rocket:1.4.3.0**](https://github.com/stnolting/neorv32/releases/tag/v1.4.3.0) | Simplified memory configuration: removed processor top's memory space configuration generics (`MEM_ISPACE_BASE`, `MEM_ISPACE_SIZE`, `MEM_DSPACE_BASE`, `MEM_DSPACE_SIZE`); data/instruction space sizes are irrelevant for hardware; instruction/data space base addresses are fixed (but can be modified in NEORV32 VHDL package file); modified SYSINFO registers; adapted bootloader, crt0 start-up code and linker script; stack configuration is now done via linker script; reworked chapter "address space"; added CFU interrupt -> fast interrupt channel 1 (shared with GPIO) |
|
||||
| 14.09.2020 | 1.4.2.0 | Removed option to disable CSR counters (via `CSR_COUNTERS_USE` generic) since these counters are mandatory according to the RISC-V specs; added new IO/peripheral device: custom functions unit (`CFU`) for tightly-coupled custom co-processors; improved timing of processor-internal clock generator; fixed wrong labels in address space figure and removed dedicated exception vectors box; added mask register to GPIO unit to specify which input pins can trigger a pin-change interrupt |
|
||||
| 11.09.2020 | 1.4.0.4 | Reworked `TRNG` architecture and interface; added text regarding fast interrupt channels usage for the NEORV32 processor |
|
||||
| 02.09.2020 | 1.4.0.2 | Fixed bugs in external memory interface; added option to define latency of simulated external memory in testbench; hardware configuration sanity checks will now only appear once in console; added more details to data sheet section 3.3. Address Space; fixed typos in MEM_*_BASE and MEM_*_SIZE generic names |
|
||||
| 01.09.2020 | 1.4.0.1 | Using registers above `x15` when the `E` extensions is enabled will now correctly cause an illegal instruction exception |
|
||||
| 29.08.2020 | [**:rocket:1.4.0.0**](https://github.com/stnolting/neorv32/releases/tag/v1.4.0.0) | Rearranged and reworked this document; added FreeRTOS port, demo & short referencing chapter; removed booloader-specific linker scripts – main linker script is used for both, applications and bootloader; bootloader can now have `.data` and `.bss` sections; improved IMEM and BOOTROM memory initialization – faster synthesis; image generator now constrains init array size to actual executable size; peripheral/IO devices can only be written in full word mode (= 32-bit); GPIO ports are now 32-bit wide |
|
||||
| 23.08.2020 | 1.3.7.3 | Added custom `mzext` CSR to check for available Z* CPU extensions; multiplier’s FAST_MUL mode is one cycle faster now; updated performance data |
|
||||
| 23.08.2020 | 1.3.7.3 | Added custom `mzext` CSR to check for available Z* CPU extensions; multiplier's FAST_MUL mode is one cycle faster now; updated performance data |
|
||||
| 20.08.2020 | 1.3.7.2 | Removed bootloader-specific crt0 – bootloader now uses std crt0; makefiles now also support asm and cpp files; made linker scripts more general; renamed makefile "compile" (which is still available for compatibility) target into "exe" |
|
||||
| 14.08.2020 | [**:rocket:1.3.7.0**](https://github.com/stnolting/neorv32/releases/tag/v1.3.7.0) | Simplified CPU fetch engine; added configurable CPU instruction prefetch buffer (ipb) FIFO; optimized CPU execute engine; updated performance data |
|
||||
| 06.08.2020 | 1.3.6.5 | Added `FAST_MUL_EN` generic to enable mapping of the multiplier core to DSP blocks; ALU.shifter is no more triggered when executing MULDIV operations; added benchmark results for DSP-based multiplier configurations; updated implementation and performance results; simplified makefiles – using implicit libc definition; crt0 only initializes lowest 16 registers |
|
||||
|
@ -45,9 +46,9 @@ For the HDL sources the version number is globally defined by the `hw_version_c`
|
|||
| 30.07.2020 | 1.3.5.1 | Fixed bug(s) in PMP mask generation; `misa.Z` flag is not yet defined by the RISC-V specs., hence it is read-only and read as zero |
|
||||
| 29.07.2020 | 1.3.5.0 | Added user privilege level, enabled via new `CPU_EXTENSION_RISCV_U` generic; fixed error in `mstatus(mpie)` logic; implemented RISC-V spec.-compliant Physical Memory Protection (PMP); allows up to 8 regions but only NAPOT mode is supported yet |
|
||||
| 25.07.2020 | 1.3.0.0 | `mcause` CSR is read-only now!; removed `CLIC`, added 4 fast IRQ channels to CPU with according flags in `mie` and `mip` and trap IDs; updated core libraries; updated NEORV32 RTE; highly reworked data sheet; updated synthesis and performance results |
|
||||
| 21.07.2020 | 1.2.0.6 | Added doc section regarding the CPU’s data and instruction interfaces; optimized CPU fetch engine; updated iCE40 synthesis results |
|
||||
| 21.07.2020 | 1.2.0.6 | Added doc section regarding the CPU's data and instruction interfaces; optimized CPU fetch engine; updated iCE40 synthesis results |
|
||||
| 20.07.2020 | 1.2.0.5 | Less penalty for taken branches and jumps (2 cycles faster) |
|
||||
| 19.07.2020 | 1.2.0.0 | CPU bus unit now has independent busses for instruction fetch and data access – merged into single processor bus via new bus switch unit; doubled speed of ALU shifter unit again; all bits of `mcause` CSR can now be modified by application program (full RISC-V-compliant); performance counters CSRs `[m]cycleh` and `[m]instreth` are only 20-bit wide; removed NEORV32-specific custom CSRs – all processor-related information can be obtained from the new `SYSINFO` IO module (CPU is now more independent from processor configuration); changed IO address of `DEVNULL`; fixed bug in bootloader’s trap handler; added `USER_CODE` generic to assign a custom user code that can be read by software (from `SYSINFO`) |
|
||||
| 19.07.2020 | 1.2.0.0 | CPU bus unit now has independent busses for instruction fetch and data access – merged into single processor bus via new bus switch unit; doubled speed of ALU shifter unit again; all bits of `mcause` CSR can now be modified by application program (full RISC-V-compliant); performance counters CSRs `[m]cycleh` and `[m]instreth` are only 20-bit wide; removed NEORV32-specific custom CSRs – all processor-related information can be obtained from the new `SYSINFO` IO module (CPU is now more independent from processor configuration); changed IO address of `DEVNULL`; fixed bug in bootloader's trap handler; added `USER_CODE` generic to assign a custom user code that can be read by software (from `SYSINFO`) |
|
||||
| 14.07.2020 | 1.1.0.0 | Added `fence_o` and `fencei_o` signals to top entity to show if a `fence` or `fencei` instruction is executed; added `mvendorid` and `marchid` CSRs (both are always zero); ALU shift unit is faster now; two lowest bits of `mtvec` are always zero; fixed wrong instruction exception priority; removed `HART_ID` generic – `mhartid` CSR is always read as zero; performance counters (`[m]cycle[h]`, `[m]instret[h]` and `time[h]`) are also available in embedded mode – but can be explicitly disabled via the `CSR_COUNTERS_USE` generic; mcause CSR only allows write access to bit 31 and bits 3:0; updated synthesis reports |
|
||||
| 10.07.2020 | 1.0.6.0 | Non-taken branches are now 1 cycle faster; the `time[h]` CSR now correctly reflects the system time from the MTIME unit; fixed WFI instruction permanently stalling the CPU; `[m]cycle[h]` counters now stop counting when CPU is in sleep mode; `minstret[h]` and `mcycle[h]` now also allow write-access |
|
||||
| 09.07.2020 | 1.0.5.0 | `X` flag of `misa` CSR is zero now; the default SPI flash boot address of the bootloader is now `0x0080000`; new exemplary FPGA utilization results for Intel, Lattice and Xilinx; `misa` CSR is read-only again, switching compressed extension on/off is pretty bad for the fetch engine; `mtval` and `mcause` CSRs now allow write accesses and are finally RISC-V-compliant; time low and high registers of `MTIME` peripheral can now also be written by user; `MTIME` registers only allow full-word write accesses |
|
||||
|
|
18
README.md
18
README.md
|
@ -242,7 +242,7 @@ a DE0-nano board. The design was synthesized using **Intel Quartus Prime Lite 19
|
|||
information is derived from the Timing Analyzer / Slow 1200mV 0C Model. If not otherwise specified, the default configuration
|
||||
of the CPU's generics is assumed (for example no PMP). No constraints were used at all.
|
||||
|
||||
Results generated for hardware version: `1.4.4.8`
|
||||
Results generated for hardware version `1.4.4.8`.
|
||||
|
||||
| CPU Configuration | LEs | FFs | Memory bits | DSPs | f_max |
|
||||
|:---------------------------------------|:----------:|:--------:|:-----------:|:----:|:--------:|
|
||||
|
@ -255,7 +255,7 @@ Results generated for hardware version: `1.4.4.8`
|
|||
|
||||
### NEORV32 Processor-Internal Peripherals and Memories
|
||||
|
||||
Results generated for hardware version: `1.4.4.8`
|
||||
Results generated for hardware version `1.4.4.8`.
|
||||
|
||||
| Module | Description | LEs | FFs | Memory bits | DSPs |
|
||||
|:----------|:-----------------------------------------------------|----:|----:|------------:|-----:|
|
||||
|
@ -283,7 +283,7 @@ no external memory interface and only internal instruction and data memories. IM
|
|||
processor's [top entity](https://github.com/stnolting/neorv32/blob/master/rtl/core/neorv32_top.vhd) signals
|
||||
to FPGA pins - except for the Wishbone bus and the interrupt signals.
|
||||
|
||||
Results generated for hardware version: `1.4.4.8`
|
||||
Results generated for hardware version `1.4.4.8`.
|
||||
|
||||
| Vendor | FPGA | Board | Toolchain | Strategy | CPU Configuration | LUT / LE | FF / REG | DSP | Memory Bits | BRAM / EBR | SPRAM | Frequency |
|
||||
|:--------|:----------------------------------|:-----------------|:---------------------------|:-------- |:-----------------------------------------------|:-----------|:-----------|:-------|:-------------|:-----------|:---------|--------------:|
|
||||
|
@ -309,7 +309,7 @@ The [CoreMark CPU benchmark](https://www.eembc.org/coremark) was executed on the
|
|||
[sw/example/coremark](https://github.com/stnolting/neorv32/blob/master/sw/example/coremark) project folder. This benchmark
|
||||
tests the capabilities of a CPU itself rather than the functions provided by the whole system / SoC.
|
||||
|
||||
Results generated for hardware version: `1.4.5.4`
|
||||
Results generated for hardware version `1.4.5.4`.
|
||||
|
||||
~~~
|
||||
**Configuration**
|
||||
|
@ -349,13 +349,13 @@ iterations, which reflects a pretty good "real-life" work load. The average CPI
|
|||
dividing the total number of required clock cycles (only the timed core to avoid distortion due to IO wait cycles; sampled via the `cycle[h]` CSRs)
|
||||
by the number of executed instructions (`instret[h]` CSRs). The executables were generated using optimization `-O3`.
|
||||
|
||||
Results generated for hardware version: `1.4.5.4`
|
||||
Results generated for hardware version `1.4.5.4`.
|
||||
|
||||
| CPU | Required Clock Cycles | Executed Instructions | Average CPI |
|
||||
|:--------------------------------------------|----------------------:|----------------------:|:-----------:|
|
||||
| `rv32i` | 5 945 938 586 | 1 469 587 406 | **4.05** |
|
||||
| `rv32im` | 3 110 282 586 | 602 225 760 | **5.16** |
|
||||
| `rv32imc` | 3 172 969 968 | 615 388 924 | **5.16** |
|
||||
| `rv32imc` | 3 172 969 968 | 615 388 890 | **5.16** |
|
||||
| `rv32imc` + `FAST_MUL_EN` | 2 590 417 968 | 615 388 890 | **4.21** |
|
||||
| `rv32imc` + `FAST_MUL_EN` + `FAST_SHIFT_EN` | 2 456 318 408 | 615 388 890 | **3.99** |
|
||||
|
||||
|
@ -613,6 +613,12 @@ which you can start your own application. Simply compile one of these projects.
|
|||
|
||||
### Upload the Executable via the Bootloader
|
||||
|
||||
You can upload a generated executable directly from the command line using the makefile's `upload` target. Replace `/dev/ttyUSB0` with
|
||||
the according serial port.
|
||||
|
||||
sw/exeample/blink_example$ make COM_PORT=/dev/ttyUSB0` upload
|
||||
|
||||
A more "secure" way is to use a dedicated terminal program. This allows to directly interact with the bootloader console.
|
||||
Connect your FPGA board via UART to your computer and open the according port to interface with the NEORV32 bootloader. The bootloader
|
||||
uses the following default UART configuration:
|
||||
|
||||
|
|
BIN
docs/NEORV32.pdf
BIN
docs/NEORV32.pdf
Binary file not shown.
Loading…
Add table
Add a link
Reference in a new issue