[docs] updated documentation

This commit is contained in:
stnolting 2020-10-17 19:18:04 +02:00
parent 3de295379f
commit 7084048551
3 changed files with 19 additions and 12 deletions

View file

@ -14,6 +14,7 @@ For the HDL sources the version number is globally defined by the `hw_version_c`
| Date (*dd.mm.yyyy*) | Version | Comment |
|:----------:|:-------:|:--------|
| 17.10.2020 | 1.4.5.5 | New makefile target `upload` allows to directly upload an executable to the bootloader from the console |
| 17.10.2020 | 1.4.5.4 | Added new CPU/Processor generic `FAST_SHIFT_EN` (default = *false*) to enable implementation of a fast (but large) barrel shifter for accelerating CPU shift instructions; updated CoreMark performance results |
| 16.10.2020 | 1.4.5.2 | Added read-only flag to custom `mzext` CSR to check if physical memory protection (PMP) is implemented; added [C] `mzext` CSR name aliases to neorv32.h |
| 15.10.2020 | 1.4.5.1 | Fixed "unprecise exceptions": `mtval` did not always reflect the correct value according to the instruction that caused the exceptions; fixed bug in RTE: Debug trap handler was not showing the correct `mepc` value |
@ -21,7 +22,7 @@ For the HDL sources the version number is globally defined by the `hw_version_c`
| 12.10.2020 | 1.4.4.9 | Added *alignment flags* to makefiles: branch/jump/call targets are forced to be 32-bit aligned -> increases performance when using the `C` extension; added makefile flag listing to NEORV32.pdf; updated performance results for CPUs with `C` extension; `crt0.S` will initialize *all* registers with zero if not using `E` extension and not compiling bootloader |
| 11.10.2020 | 1.4.4.8 | Reworked pipeline frontend: Optimized fetch enginge, added issue engine, faster instruction fetch after taken branches + reduced hardware requirements; updated synthesis and performance results |
| 11.10.2020 | 1.4.4.6 | Added option to configure external memory interface (Wishbone) to either use *standard/classic protocol* (default) or *pipelined protocol* (for better timing): via `wb_pipe_mode_c` constant in VHDL package file (`rtl/core/neorv32_package.vhd`); added help text to NEORV32.pdf section "3.4.4. Processor-External Memory Interface (WISHBONE)" |
| 08.10.2020 | 1.4.4.5 | Removed CPUs `BUS_TIMEOUT` and processors `MEM_EXT_TIMEOUT` generics; instead, a global configuration `bus_timeout_c` in the VHDL package file is used now |
| 08.10.2020 | 1.4.4.5 | Removed CPU's `BUS_TIMEOUT` and processor's `MEM_EXT_TIMEOUT` generics; instead, a global configuration `bus_timeout_c` in the VHDL package file is used now |
| 08.10.2020 | 1.4.4.4 | Removed `DEVNULL` device; all simulation output options from this device are now available as `SIM_MODE` in the `UART`; `mcause` CSR can now also be written; FIXED: trying to write a read-only CSR will cause an illegal instruction exception; for compatibility reasons any write access to the misa CSR will be ignored and will NOT cause an exception |
| 07.10.2020 | 1.4.4.2 | Simplified ALU's set of core operations; removed co-processor data mux right after ALU -> shorter critical path; CPU control VHDL code clean-up and CSR write logic optimization; optimized IMEM/DMEM access logic; added note regarding alignment of IMEM/DMEM |
| 05.10.2020 | [**:rocket:1.4.4.0**](https://github.com/stnolting/neorv32/releases/tag/v1.4.4.0) | Fixed bug in external memory interface: Executing code from external memory was causing an instruction fetch stall |
@ -29,14 +30,14 @@ For the HDL sources the version number is globally defined by the `hw_version_c`
| 01.10.2020 | 1.4.3.8 | Added CPU top entity wrapper with resolved port signals `rtl/top_templetes/neorv32_cpu_stdlogic.vhd`; optimized ALU core functions shorter critical path, less control overhead, reduced HW footprint |
| 27.09.2020 | 1.4.3.3 | Further improved ALU and control logic; CSR access instruction require one additional cycle now (to let side effects kick in); updated synthesis results; added CFU hardware driver dummy |
| 26.09.2020 | 1.4.3.2 | Fixed bug in `CSRRWI` instruction (introduced with version 1.4.3.1); further ALU operand logic optimizations; updated CPU data path figure |
| 25.09.2020 | 1.4.3.1 | Register files `x0` is now a physical register; this register is initialized by the hardware and locked afterwards; removed "set to zero" stage -> smaller hardware footprint and shorter critical path; added processor top entity wrapper with resolved signals `rtl/top_templetes/neorv32_top_stdlogic.vhd` |
| 16.09.2020 | [**:rocket:1.4.3.0**](https://github.com/stnolting/neorv32/releases/tag/v1.4.3.0) | Simplified memory configuration: removed processor tops memory space configuration generics (`MEM_ISPACE_BASE`, `MEM_ISPACE_SIZE`, `MEM_DSPACE_BASE`, `MEM_DSPACE_SIZE`); data/instruction space sizes are irrelevant for hardware; instruction/data space base addresses are fixed (but can be modified in NEORV32 VHDL package file); modified SYSINFO registers; adapted bootloader, crt0 start-up code and linker script; stack configuration is now done via linker script; reworked chapter "address space"; added CFU interrupt -> fast interrupt channel 1 (shared with GPIO) |
| 25.09.2020 | 1.4.3.1 | Register file's `x0` is now a physical register; this register is initialized by the hardware and locked afterwards; removed "set to zero" stage -> smaller hardware footprint and shorter critical path; added processor top entity wrapper with resolved signals `rtl/top_templetes/neorv32_top_stdlogic.vhd` |
| 16.09.2020 | [**:rocket:1.4.3.0**](https://github.com/stnolting/neorv32/releases/tag/v1.4.3.0) | Simplified memory configuration: removed processor top's memory space configuration generics (`MEM_ISPACE_BASE`, `MEM_ISPACE_SIZE`, `MEM_DSPACE_BASE`, `MEM_DSPACE_SIZE`); data/instruction space sizes are irrelevant for hardware; instruction/data space base addresses are fixed (but can be modified in NEORV32 VHDL package file); modified SYSINFO registers; adapted bootloader, crt0 start-up code and linker script; stack configuration is now done via linker script; reworked chapter "address space"; added CFU interrupt -> fast interrupt channel 1 (shared with GPIO) |
| 14.09.2020 | 1.4.2.0 | Removed option to disable CSR counters (via `CSR_COUNTERS_USE` generic) since these counters are mandatory according to the RISC-V specs; added new IO/peripheral device: custom functions unit (`CFU`) for tightly-coupled custom co-processors; improved timing of processor-internal clock generator; fixed wrong labels in address space figure and removed dedicated exception vectors box; added mask register to GPIO unit to specify which input pins can trigger a pin-change interrupt |
| 11.09.2020 | 1.4.0.4 | Reworked `TRNG` architecture and interface; added text regarding fast interrupt channels usage for the NEORV32 processor |
| 02.09.2020 | 1.4.0.2 | Fixed bugs in external memory interface; added option to define latency of simulated external memory in testbench; hardware configuration sanity checks will now only appear once in console; added more details to data sheet section 3.3. Address Space; fixed typos in MEM_*_BASE and MEM_*_SIZE generic names |
| 01.09.2020 | 1.4.0.1 | Using registers above `x15` when the `E` extensions is enabled will now correctly cause an illegal instruction exception |
| 29.08.2020 | [**:rocket:1.4.0.0**](https://github.com/stnolting/neorv32/releases/tag/v1.4.0.0) | Rearranged and reworked this document; added FreeRTOS port, demo & short referencing chapter; removed booloader-specific linker scripts main linker script is used for both, applications and bootloader; bootloader can now have `.data` and `.bss` sections; improved IMEM and BOOTROM memory initialization faster synthesis; image generator now constrains init array size to actual executable size; peripheral/IO devices can only be written in full word mode (= 32-bit); GPIO ports are now 32-bit wide |
| 23.08.2020 | 1.3.7.3 | Added custom `mzext` CSR to check for available Z* CPU extensions; multipliers FAST_MUL mode is one cycle faster now; updated performance data |
| 23.08.2020 | 1.3.7.3 | Added custom `mzext` CSR to check for available Z* CPU extensions; multiplier's FAST_MUL mode is one cycle faster now; updated performance data |
| 20.08.2020 | 1.3.7.2 | Removed bootloader-specific crt0 bootloader now uses std crt0; makefiles now also support asm and cpp files; made linker scripts more general; renamed makefile "compile" (which is still available for compatibility) target into "exe" |
| 14.08.2020 | [**:rocket:1.3.7.0**](https://github.com/stnolting/neorv32/releases/tag/v1.3.7.0) | Simplified CPU fetch engine; added configurable CPU instruction prefetch buffer (ipb) FIFO; optimized CPU execute engine; updated performance data |
| 06.08.2020 | 1.3.6.5 | Added `FAST_MUL_EN` generic to enable mapping of the multiplier core to DSP blocks; ALU.shifter is no more triggered when executing MULDIV operations; added benchmark results for DSP-based multiplier configurations; updated implementation and performance results; simplified makefiles using implicit libc definition; crt0 only initializes lowest 16 registers |
@ -45,9 +46,9 @@ For the HDL sources the version number is globally defined by the `hw_version_c`
| 30.07.2020 | 1.3.5.1 | Fixed bug(s) in PMP mask generation; `misa.Z` flag is not yet defined by the RISC-V specs., hence it is read-only and read as zero |
| 29.07.2020 | 1.3.5.0 | Added user privilege level, enabled via new `CPU_EXTENSION_RISCV_U` generic; fixed error in `mstatus(mpie)` logic; implemented RISC-V spec.-compliant Physical Memory Protection (PMP); allows up to 8 regions but only NAPOT mode is supported yet |
| 25.07.2020 | 1.3.0.0 | `mcause` CSR is read-only now!; removed `CLIC`, added 4 fast IRQ channels to CPU with according flags in `mie` and `mip` and trap IDs; updated core libraries; updated NEORV32 RTE; highly reworked data sheet; updated synthesis and performance results |
| 21.07.2020 | 1.2.0.6 | Added doc section regarding the CPUs data and instruction interfaces; optimized CPU fetch engine; updated iCE40 synthesis results |
| 21.07.2020 | 1.2.0.6 | Added doc section regarding the CPU's data and instruction interfaces; optimized CPU fetch engine; updated iCE40 synthesis results |
| 20.07.2020 | 1.2.0.5 | Less penalty for taken branches and jumps (2 cycles faster) |
| 19.07.2020 | 1.2.0.0 | CPU bus unit now has independent busses for instruction fetch and data access merged into single processor bus via new bus switch unit; doubled speed of ALU shifter unit again; all bits of `mcause` CSR can now be modified by application program (full RISC-V-compliant); performance counters CSRs `[m]cycleh` and `[m]instreth` are only 20-bit wide; removed NEORV32-specific custom CSRs all processor-related information can be obtained from the new `SYSINFO` IO module (CPU is now more independent from processor configuration); changed IO address of `DEVNULL`; fixed bug in bootloaders trap handler; added `USER_CODE` generic to assign a custom user code that can be read by software (from `SYSINFO`) |
| 19.07.2020 | 1.2.0.0 | CPU bus unit now has independent busses for instruction fetch and data access merged into single processor bus via new bus switch unit; doubled speed of ALU shifter unit again; all bits of `mcause` CSR can now be modified by application program (full RISC-V-compliant); performance counters CSRs `[m]cycleh` and `[m]instreth` are only 20-bit wide; removed NEORV32-specific custom CSRs all processor-related information can be obtained from the new `SYSINFO` IO module (CPU is now more independent from processor configuration); changed IO address of `DEVNULL`; fixed bug in bootloader's trap handler; added `USER_CODE` generic to assign a custom user code that can be read by software (from `SYSINFO`) |
| 14.07.2020 | 1.1.0.0 | Added `fence_o` and `fencei_o` signals to top entity to show if a `fence` or `fencei` instruction is executed; added `mvendorid` and `marchid` CSRs (both are always zero); ALU shift unit is faster now; two lowest bits of `mtvec` are always zero; fixed wrong instruction exception priority; removed `HART_ID` generic `mhartid` CSR is always read as zero; performance counters (`[m]cycle[h]`, `[m]instret[h]` and `time[h]`) are also available in embedded mode but can be explicitly disabled via the `CSR_COUNTERS_USE` generic; mcause CSR only allows write access to bit 31 and bits 3:0; updated synthesis reports |
| 10.07.2020 | 1.0.6.0 | Non-taken branches are now 1 cycle faster; the `time[h]` CSR now correctly reflects the system time from the MTIME unit; fixed WFI instruction permanently stalling the CPU; `[m]cycle[h]` counters now stop counting when CPU is in sleep mode; `minstret[h]` and `mcycle[h]` now also allow write-access |
| 09.07.2020 | 1.0.5.0 | `X` flag of `misa` CSR is zero now; the default SPI flash boot address of the bootloader is now `0x0080000`; new exemplary FPGA utilization results for Intel, Lattice and Xilinx; `misa` CSR is read-only again, switching compressed extension on/off is pretty bad for the fetch engine; `mtval` and `mcause` CSRs now allow write accesses and are finally RISC-V-compliant; time low and high registers of `MTIME` peripheral can now also be written by user; `MTIME` registers only allow full-word write accesses |

View file

@ -242,7 +242,7 @@ a DE0-nano board. The design was synthesized using **Intel Quartus Prime Lite 19
information is derived from the Timing Analyzer / Slow 1200mV 0C Model. If not otherwise specified, the default configuration
of the CPU's generics is assumed (for example no PMP). No constraints were used at all.
Results generated for hardware version: `1.4.4.8`
Results generated for hardware version `1.4.4.8`.
| CPU Configuration | LEs | FFs | Memory bits | DSPs | f_max |
|:---------------------------------------|:----------:|:--------:|:-----------:|:----:|:--------:|
@ -255,7 +255,7 @@ Results generated for hardware version: `1.4.4.8`
### NEORV32 Processor-Internal Peripherals and Memories
Results generated for hardware version: `1.4.4.8`
Results generated for hardware version `1.4.4.8`.
| Module | Description | LEs | FFs | Memory bits | DSPs |
|:----------|:-----------------------------------------------------|----:|----:|------------:|-----:|
@ -283,7 +283,7 @@ no external memory interface and only internal instruction and data memories. IM
processor's [top entity](https://github.com/stnolting/neorv32/blob/master/rtl/core/neorv32_top.vhd) signals
to FPGA pins - except for the Wishbone bus and the interrupt signals.
Results generated for hardware version: `1.4.4.8`
Results generated for hardware version `1.4.4.8`.
| Vendor | FPGA | Board | Toolchain | Strategy | CPU Configuration | LUT / LE | FF / REG | DSP | Memory Bits | BRAM / EBR | SPRAM | Frequency |
|:--------|:----------------------------------|:-----------------|:---------------------------|:-------- |:-----------------------------------------------|:-----------|:-----------|:-------|:-------------|:-----------|:---------|--------------:|
@ -309,7 +309,7 @@ The [CoreMark CPU benchmark](https://www.eembc.org/coremark) was executed on the
[sw/example/coremark](https://github.com/stnolting/neorv32/blob/master/sw/example/coremark) project folder. This benchmark
tests the capabilities of a CPU itself rather than the functions provided by the whole system / SoC.
Results generated for hardware version: `1.4.5.4`
Results generated for hardware version `1.4.5.4`.
~~~
**Configuration**
@ -349,13 +349,13 @@ iterations, which reflects a pretty good "real-life" work load. The average CPI
dividing the total number of required clock cycles (only the timed core to avoid distortion due to IO wait cycles; sampled via the `cycle[h]` CSRs)
by the number of executed instructions (`instret[h]` CSRs). The executables were generated using optimization `-O3`.
Results generated for hardware version: `1.4.5.4`
Results generated for hardware version `1.4.5.4`.
| CPU | Required Clock Cycles | Executed Instructions | Average CPI |
|:--------------------------------------------|----------------------:|----------------------:|:-----------:|
| `rv32i` | 5 945 938 586 | 1 469 587 406 | **4.05** |
| `rv32im` | 3 110 282 586 | 602 225 760 | **5.16** |
| `rv32imc` | 3 172 969 968 | 615 388 924 | **5.16** |
| `rv32imc` | 3 172 969 968 | 615 388 890 | **5.16** |
| `rv32imc` + `FAST_MUL_EN` | 2 590 417 968 | 615 388 890 | **4.21** |
| `rv32imc` + `FAST_MUL_EN` + `FAST_SHIFT_EN` | 2 456 318 408 | 615 388 890 | **3.99** |
@ -613,6 +613,12 @@ which you can start your own application. Simply compile one of these projects.
### Upload the Executable via the Bootloader
You can upload a generated executable directly from the command line using the makefile's `upload` target. Replace `/dev/ttyUSB0` with
the according serial port.
sw/exeample/blink_example$ make COM_PORT=/dev/ttyUSB0` upload
A more "secure" way is to use a dedicated terminal program. This allows to directly interact with the bootloader console.
Connect your FPGA board via UART to your computer and open the according port to interface with the NEORV32 bootloader. The bootloader
uses the following default UART configuration:

Binary file not shown.