mirror of
https://github.com/stnolting/neorv32.git
synced 2025-04-24 22:27:21 -04:00
[docs] update: Zalrsc -> Zaamo
parent
69e82684eb
commit
d65663e93d
6 changed files with 65 additions and 109 deletions
|
@ -106,7 +106,7 @@ setup according to your needs. Note that all of the following SoC modules are en
|
|||
[[`B`](https://stnolting.github.io/neorv32/#_b_isa_extension)]
|
||||
[[`U`](https://stnolting.github.io/neorv32/#_u_isa_extension)]
|
||||
[[`X`](https://stnolting.github.io/neorv32/#_x_isa_extension)]
|
||||
[[`Zalrsc`](https://stnolting.github.io/neorv32/#_zalrsc_isa_extension)]
|
||||
[[`Zaamo`](https://stnolting.github.io/neorv32/#_zaamo_isa_extension)]
|
||||
[[`Zba`](https://stnolting.github.io/neorv32/#_zba_isa_extension)]
|
||||
[[`Zbb`](https://stnolting.github.io/neorv32/#_zbb_isa_extension)]
|
||||
[[`Zbkb`](https://stnolting.github.io/neorv32/#_zbkb_isa_extension)]
|
||||
|
|
|
@ -415,7 +415,8 @@ always valid when set.
|
|||
| `rw` | 1 | Access direction (`0` = read, `1` = write)
|
||||
| `src` | 1 | Access source (`0` = instruction fetch, `1` = load/store)
|
||||
| `priv` | 1 | Set if privileged (M-mode) access
|
||||
| `rvso` | 1 | Set if current access is a reservation-set operation (`lr` or `sc` instruction, <<_zalrsc_isa_extension>>)
|
||||
| `amo` | 1 | Set if current access is an atomic memory operation (<<_atomic_memory_access>>)
|
||||
| `amoop` | 4 | Type of atomic memory operation (<<_atomic_memory_access>>)
|
||||
3+^| **Out-Of-Band Signals**
|
||||
| `fence` | 1 | Data/instruction fence request; single-shot
|
||||
| `sleep` | 1 | Set if ALL upstream devices are in <<_sleep_mode>>
|
||||
|
@ -463,36 +464,31 @@ additional latency). However, _all_ bus signals (request and response) need to b
|
|||
|
||||
|
||||
:sectnums:
|
||||
==== Atomic Accesses
|
||||
==== Atomic Memory Access
|
||||
|
||||
The load-reservate (`lr.w`) and store-conditional (`sc.w`) instructions from the <<_zalrsc_isa_extension>> execute as standard
|
||||
load/store bus transactions but with the `rvso` ("reservation set operation") signal being set. It is the task of the
|
||||
<<_reservation_set_controller>> to handle these LR/SC bus transactions accordingly. Note that these reservation set operations
|
||||
are intended for processor-internal usage only (i.e. the reservation state is not available for processor-external modules yet).
|
||||
The <<_zaamo_isa_extension>> adds atomic read-modify-write memory operations. Since the <<_bus_interface_protocol>>
|
||||
only supports read-or-write operations, the atomic memory requests are handled by a dedicated module of the bus
|
||||
infrastructure - the <<_atomic_memory_operations_controller>>.
|
||||
|
||||
.Reservation Set Controller
|
||||
[NOTE]
|
||||
See section <<_address_space>> / <<_reservation_set_controller>> for more information.
|
||||
For the CPU, the atomic memory accesses are handled as plain "load" operations but with the `amo` signal set
|
||||
and also providing write data (see <<_bus_interface>>). The `amoop` signal defines the actual atomic processing
|
||||
operation:
|
||||
|
||||
The figure below shows three exemplary bus accesses (1 to 3 from left to right). The `req` signal record represents
|
||||
the CPU-side of the bus interface. For easier understanding the current state of the reservation set is added as `rvs_valid` signal.
|
||||
|
||||
[start=1]
|
||||
. A load-reservate (LR) instruction using `addr` as address. This instruction returns the loaded data `rdata` via `rsp.data`
|
||||
and also registers a reservation for the address `addr` (`rvs_valid` becomes set).
|
||||
. A store-conditional (SC) instruction attempts to write `wdata1` to address `addr`. This SC operation **succeeds**, so
|
||||
`wdata1` is actually written to address `addr`. The successful operation is indicated by a **0** being returned via
|
||||
`rsp.data` together with `ack`. As the LR/SC is completed the registered reservation is invalidated (`rvs_valid` becomes cleared).
|
||||
. Another store-conditional (SC) instruction attempts to write `wdata2` to address `addr`. As the reservation set is already
|
||||
invalidated (`rvs_valid` is `0`) the store access fails, so `wdata2` is **not** written to address `addr` at all. The failed
|
||||
operation is indicated by a **1** being returned via `rsp.data` together with `ack`.
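The three accesses above can be replayed with a small software model. This is only a sketch under stated assumptions, not the NEORV32 hardware: the names `rvs_t`, `model_lr` and `model_sc` are hypothetical, memory is a plain word array, and only a single reservation with word granularity is modeled.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of one reservation set (illustration only). */
typedef struct {
    bool     valid; /* plays the role of rvs_valid */
    uint32_t addr;  /* word-aligned reserved address */
} rvs_t;

/* LR: return the loaded word and register a reservation for addr. */
static uint32_t model_lr(rvs_t *rvs, const uint32_t *mem, uint32_t addr) {
    assert((addr & 3u) == 0u); /* the model only handles aligned words */
    rvs->valid = true;
    rvs->addr  = addr;
    return mem[addr >> 2];
}

/* SC: write only if a matching reservation is valid.
 * Returns 0 on success and 1 on failure, like the value seen on rsp.data. */
static uint32_t model_sc(rvs_t *rvs, uint32_t *mem, uint32_t addr, uint32_t wdata) {
    assert((addr & 3u) == 0u);
    bool ok = rvs->valid && (rvs->addr == addr);
    rvs->valid = false; /* every SC clears the reservation */
    if (ok) {
        mem[addr >> 2] = wdata;
    }
    return ok ? 0u : 1u;
}
```

Calling `model_lr`, then `model_sc` twice on the same address, reproduces the sequence of the list: the first SC returns 0 and writes, the second returns 1 and leaves memory untouched.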
|
||||
|
||||
.Three Exemplary LR/SC Bus Transactions (showing only in-band signals)
|
||||
image::bus_interface_atomic.png[700]
|
||||
|
||||
.Store-Conditional Status
|
||||
[NOTE]
|
||||
The "normal" load data mechanism is used to return success/failure of the `sc.w` instruction to the CPU (via the LSB of `rsp.data`).
|
||||
.AMO Operation Type Encoding
|
||||
[cols="<1,<4"]
|
||||
[options="header",grid="rows"]
|
||||
|=======================
|
||||
| `bus_req_t.amoop` | Description
|
||||
| `-000` | swap
|
||||
| `-001` | unsigned add
|
||||
| `-010` | logical xor
|
||||
| `-011` | logical and
|
||||
| `-100` | logical or
|
||||
| `0110` | unsigned minimum
|
||||
| `0111` | unsigned maximum
|
||||
| `1110` | signed minimum
|
||||
| `1111` | signed maximum
|
||||
|=======================
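As an illustration of the encoding in the table, the following C sketch decodes a 4-bit `amoop` value and applies the selected operation to two 32-bit operands. The function name `amo_compute` is hypothetical, and the handling of the unlisted code `-101` (data returned unchanged) is an assumption of this sketch, not documented behavior.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper: apply the operation selected by amoop to
 * a (data read from memory) and b (write data from the CPU). */
static uint32_t amo_compute(uint8_t amoop, uint32_t a, uint32_t b) {
    assert(amoop < 16u);                    /* amoop is a 4-bit field */
    int is_signed = (amoop >> 3) & 1;       /* MSB selects signed min/max */
    switch (amoop & 0x7u) {                 /* lower 3 bits select the operation */
        case 0x0: return b;                 /* -000: swap */
        case 0x1: return a + b;             /* -001: add (wraps modulo 2^32) */
        case 0x2: return a ^ b;             /* -010: xor */
        case 0x3: return a & b;             /* -011: and */
        case 0x4: return a | b;             /* -100: or */
        case 0x6:                           /* x110: minimum */
            if (is_signed) return ((int32_t)a < (int32_t)b) ? a : b;
            return (a < b) ? a : b;
        case 0x7:                           /* x111: maximum */
            if (is_signed) return ((int32_t)a > (int32_t)b) ? a : b;
            return (a > b) ? a : b;
        default: return a;                  /* -101 is not listed; assumption: no change */
    }
}
```

The casts to `int32_t` assume the usual two's complement representation, which matches RV32 register semantics.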
|
||||
|
||||
.Cache Coherency
|
||||
[IMPORTANT]
|
||||
|
@ -521,7 +517,7 @@ This chapter gives a brief overview of all available ISA extensions.
|
|||
| <<_m_isa_extension,`M`>> | Integer multiplication and division instructions | <<_processor_top_entity_generics, `RISCV_ISA_M`>>
|
||||
| <<_u_isa_extension,`U`>> | Less-privileged _user_ mode extension | <<_processor_top_entity_generics, `RISCV_ISA_U`>>
|
||||
| <<_x_isa_extension,`X`>> | Platform-specific / NEORV32-specific extension | Always enabled
|
||||
| <<_zalrsc_isa_extension,`Zalrsc`>> | Atomic reservation-set instructions | <<_processor_top_entity_generics, `RISCV_ISA_Zalrsc`>>
|
||||
| <<_zaamo_isa_extension,`Zaamo`>> | Atomic memory operations | <<_processor_top_entity_generics, `RISCV_ISA_Zaamo`>>
|
||||
| <<_zba_isa_extension,`Zba`>> | Shifted-add bit manipulation instructions | <<_processor_top_entity_generics, `RISCV_ISA_Zba`>>
|
||||
| <<_zbb_isa_extension,`Zbb`>> | Basic bit manipulation instructions | <<_processor_top_entity_generics, `RISCV_ISA_Zbb`>>
|
||||
| <<_zbkb_isa_extension,`Zbkb`>> | Scalar cryptographic bit manipulation instructions | <<_processor_top_entity_generics, `RISCV_ISA_Zbkb`>>
|
||||
|
@ -689,37 +685,23 @@ RISC-V specs. Also, custom trap codes for <<_mcause>> are implemented.
|
|||
* There are <<_neorv32_specific_csrs>>.
|
||||
|
||||
|
||||
==== `Zalrsc` ISA Extension
|
||||
==== `Zaamo` ISA Extension
|
||||
|
||||
The `Zalrsc` ISA extension is a sub-extension of the RISC-V _atomic memory access_ (`A`) ISA extension and includes
|
||||
instructions for reservation-set operations (load-reservate `lr` and store-conditional `sc`) only.
|
||||
It is enabled by the top's <<_processor_top_entity_generics, `RISCV_ISA_Zalrsc`>> generic.
|
||||
|
||||
.AMO / `A` Emulation
|
||||
[NOTE]
|
||||
The atomic memory access / read-modify-write operations of the `A` ISA extension can be emulated using the
|
||||
LR and SC operations (quote from the RISC-V spec.: "_Any AMO can be emulated by an LR/SC pair._").
|
||||
The NEORV32 <<_core_libraries>> provide an emulation wrapper for emulating AMO/read-modify-write instructions that is
|
||||
based on LR/SC pairs. A demo/program can be found in `sw/example/atomic_test`.
|
||||
The `Zaamo` ISA extension is a sub-extension of the RISC-V `A` ISA extension and comprises instructions for read-modify-write
|
||||
<<_atomic_memory_access>> operations. It is enabled by the top's <<_processor_top_entity_generics, `RISCV_ISA_Zaamo`>> generic.
|
||||
|
||||
.Instructions and Timing
|
||||
[cols="<2,<4,<3"]
|
||||
[cols="<2,<4,<1"]
|
||||
[options="header", grid="rows"]
|
||||
|=======================
|
||||
| Class | Instructions | Execution cycles
|
||||
| Load-reservate word | `lr.w` | 5
|
||||
| Store-conditional word | `sc.w` | 5
|
||||
| Atomic memory operations | `amoswap.w` `amoadd.w` `amoand.w` `amoor.w` `amoxor.w` `amomax[u].w` `amomin[u].w` | 5 + 2 * _memory_latency_
|
||||
|=======================
|
||||
|
||||
.`aq` and `rl` Bits
|
||||
[NOTE]
|
||||
The instruction word's `aq` and `rl` memory ordering bits are not evaluated by the hardware at all.
|
||||
|
||||
.Atomic Memory Access on Hardware Level
|
||||
[NOTE]
|
||||
More information regarding the atomic memory accesses and the according reservation
|
||||
sets can be found in section <<_reservation_set_controller>>.
|
||||
|
||||
|
||||
==== `Zifencei` ISA Extension
|
||||
|
||||
|
|
|
@ -226,7 +226,7 @@ The generic type "`suv(x:y)`" is an abbreviation for "`std_ulogic_vector(x downt
|
|||
| `RISCV_ISA_E` | boolean | false | Enable <<_e_isa_extension>> (reduced register file size).
|
||||
| `RISCV_ISA_M` | boolean | false | Enable <<_m_isa_extension>> (hardware-based integer multiplication and division).
|
||||
| `RISCV_ISA_U` | boolean | false | Enable <<_u_isa_extension>> (less-privileged user mode).
|
||||
| `RISCV_ISA_Zalrsc` | boolean | false | Enable <<_zalrsc_isa_extension>> (atomic reservation-set operations).
|
||||
| `RISCV_ISA_Zaamo` | boolean | false | Enable <<_zaamo_isa_extension>> (atomic memory operations).
|
||||
| `RISCV_ISA_Zba` | boolean | false | Enable <<_zba_isa_extension>> (shifted-add bit-manipulation instructions).
|
||||
| `RISCV_ISA_Zbb` | boolean | false | Enable <<_zbb_isa_extension>> (basic bit-manipulation instructions).
|
||||
| `RISCV_ISA_Zbkb` | boolean | false | Enable <<_zbkb_isa_extension>> (scalar cryptography bit manipulation instructions).
|
||||
|
@ -576,67 +576,41 @@ explicit specific processor generic. See section <<_processor_external_bus_inter
|
|||
|
||||
|
||||
:sectnums:
|
||||
==== Reservation Set Controller
|
||||
==== Atomic Memory Operations Controller
|
||||
|
||||
The reservation set controller is responsible for handling the load-reservate and store-conditional bus transactions that
|
||||
are triggered by the `lr.w` (LR) and `sc.w` (SC) instructions from the CPU's <<_zalrsc_isa_extension>>.
|
||||
The atomic memory operations (AMO) controller is responsible for handling the read-modify-write operations issued by the
|
||||
CPU's <<_zaamo_isa_extension>>. For each AMO request, the controller executes an atomic set of three operations:
|
||||
|
||||
A "reservation" defines an address or address range that provides a guarding mechanism to support atomic accesses. A new
|
||||
reservation is registered by the LR instruction. The address provided by this instruction defines the memory location
|
||||
that is now monitored for atomic accesses. The according SC instruction evaluates the state of this reservation. If
|
||||
the reservation is still valid the write access triggered by the SC instruction is finally executed and the instruction
|
||||
returns a "success" state (`rd` = 0). If the reservation has been invalidated the SC instruction will not write to memory
|
||||
and will return a "failed" state (`rd` = 1).
|
||||
.Simplified AMO Controller Operation
|
||||
[cols="^1,<3,<6"]
|
||||
[options="header",grid="rows"]
|
||||
|=======================
|
||||
| Step | Pseudo Code | Description
|
||||
| 1 | `tmp1 <= MEM[address];` | Perform a read operation accessing the addressed memory
|
||||
cell and store the loaded data into an internal buffer (`tmp1`).
|
||||
| 2 | `tmp2 <= tmp1 OP cpu_wdata` | The buffered data from the first step is processed
|
||||
using the write data provided by the CPU. The result is stored to another internal buffer (`tmp2`).
|
||||
| 3 | `MEM[address] <= tmp2;` `cpu_rdata <= tmp1;` | The data from the second buffer (`tmp2`) is
|
||||
written to the addressed memory cell. In parallel, the data from the first buffer (`tmp1` = original
|
||||
content of the addressed memory cell) is sent back to the requesting CPU.
|
||||
|=======================
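The three steps of the table can be mirrored in a few lines of C. This is a minimal sketch, assuming a hypothetical `amo_rmw` function and a function-pointer type `amo_op_fn` for the `OP` placeholder. It only shows the data flow; the atomicity itself comes from the controller's position in the bus system and is not modeled here.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical type for the "OP" of step 2. */
typedef uint32_t (*amo_op_fn)(uint32_t mem_data, uint32_t cpu_wdata);

/* Example operation: add. */
static uint32_t op_add(uint32_t mem_data, uint32_t cpu_wdata) {
    return mem_data + cpu_wdata;
}

/* Read-modify-write on one memory word; returns the value for cpu_rdata. */
static uint32_t amo_rmw(uint32_t *mem_word, uint32_t cpu_wdata, amo_op_fn op) {
    assert(mem_word != NULL && op != NULL);
    uint32_t tmp1 = *mem_word;           /* step 1: tmp1 <= MEM[address] */
    uint32_t tmp2 = op(tmp1, cpu_wdata); /* step 2: tmp2 <= tmp1 OP cpu_wdata */
    *mem_word = tmp2;                    /* step 3: MEM[address] <= tmp2 */
    return tmp1;                         /* step 3: cpu_rdata <= tmp1 */
}
```

With a memory cell holding 40 and write data 2, `amo_rmw(&cell, 2, op_add)` returns the original value 40 and leaves 42 in the cell.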
|
||||
|
||||
.Reservation Set(s) and Granule
|
||||
[NOTE]
|
||||
The reservation set controller supports only **a single** global reservation set with a **word-aligned 4-byte granule**.
|
||||
The controller performs two bus transactions: a read operation and a write operation. Only the acknowledge/error
|
||||
handshake of the last transaction is sent back to the CPU.
|
||||
|
||||
The reservation is invalidated if...
|
||||
|
||||
* an SC instruction is executed that accesses an address **outside** of the reservation set of the previous LR instruction.
|
||||
This SC instruction will **fail** (not writing to memory).
|
||||
* an SC instruction is executed that accesses an address **inside** of the reservation set of the previous LR instruction.
|
||||
This SC instruction will **succeed** (finally writing to memory).
|
||||
* a normal store operation accesses an address **inside** of the current reservation set (by the CPU or by the DMA).
|
||||
* a hardware reset is triggered.
|
||||
|
||||
.Consecutive LR Instructions
|
||||
[NOTE]
|
||||
If an LR instruction is followed by another LR instruction the reservation set of the former one is overridden
|
||||
by the reservation set of the latter one.
|
||||
|
||||
.Bus Access Errors
|
||||
[IMPORTANT]
|
||||
If the LR operation causes a bus access error (raising a load access exception) the reservation **is registered anyway**.
|
||||
If the SC operation causes a bus access error (raising a store access exception) an already registered reservation set
|
||||
**is invalidated anyway**.
|
||||
|
||||
.Strong Semantic
|
||||
[IMPORTANT]
|
||||
The LR/SC mechanism follows the _strong semantic_ approach: the LR/SC instruction pair fails only if there is a write
|
||||
access to the referenced memory location between the LR and SC instructions (by the CPU itself or by the DMA).
|
||||
Context changes, interrupts, traps, etc. neither affect nor invalidate the reservation state.
|
||||
As the AMO controller is the memory-nearest instance (see <<_bus_system>>) the previously described set of operations
|
||||
cannot be interrupted. Hence, they execute in an atomic way.
|
||||
|
||||
.Physical Memory Attributes
|
||||
[NOTE]
|
||||
The reservation set can be set for _any_ address (only constrained by the configured granularity). This also
|
||||
includes cached memory, memory-mapped IO devices and processor-external address spaces.
|
||||
|
||||
Bus transactions triggered by the LR instruction register a new reservation set and are delegated to the addressed
|
||||
memory/device. Bus transactions triggered by the SC remove a reservation set and are forwarded to the addressed
|
||||
memory/device only if the SC operation succeeds. Otherwise, the access request is not forwarded and a local ACK is
|
||||
generated to terminate the bus transaction.
|
||||
|
||||
.LR/SC Bus Protocol
|
||||
[NOTE]
|
||||
More information regarding the LR/SC bus transactions and the corresponding protocol can be found in section
|
||||
<<_bus_interface>> / <<_atomic_accesses>>.
|
||||
Atomic memory operations can be executed for _any_ address. This also includes
|
||||
cached memory, memory-mapped IO devices and processor-external address spaces.
|
||||
|
||||
.Cache Coherency
|
||||
[IMPORTANT]
|
||||
Atomic operations **always bypass** the cache using direct/uncached accesses. Care must be taken
|
||||
to maintain data cache coherency (e.g. by using the `fence` instruction).
|
||||
Atomic operations **always bypass** the CPU's <<_processor_internal_data_cache_dcache, data cache>>
|
||||
using direct/uncached accesses. Care must be taken to maintain data cache coherency when accessing
|
||||
cached memory (e.g. by using the `fence` instruction).
|
||||
|
||||
|
||||
:sectnums:
|
||||
|
|
|
@ -19,7 +19,7 @@
|
|||
**Overview**
|
||||
|
||||
The processor features an optional data cache to improve performance when using memories with high
|
||||
access latencies. The cache is connected directly to the CPU's data access interface and provides
|
||||
access latency. The cache is connected directly to the CPU's data access interface and provides
|
||||
full-transparent accesses. The cache is direct-mapped and uses "write-allocate" and "write-back" strategies.
|
||||
|
||||
.Cached/Uncached Accesses
|
||||
|
@ -28,8 +28,8 @@ The data cache provides direct accesses (= uncached) to memory in order to acces
|
|||
processor-internal IO/peripheral modules). All accesses that target the address range from `0xF0000000` to `0xFFFFFFFF`
|
||||
will not be cached at all (see section <<_address_space>>). Direct/uncached accesses have **lower** priority than
|
||||
cache block operations to allow continuous burst transfer and also to maintain logical instruction forward
|
||||
progress / data coherency. Furthermore, atomic load-reservate and store-conditional instructions (<<_zalrsc_isa_extension>>)
|
||||
will always **bypass** the cache.
|
||||
progress / data coherency. Furthermore, the atomic memory operations of the <<_zaamo_isa_extension>> will
|
||||
always **bypass** the cache.
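The bypass rules stated in this paragraph can be summarized as a small predicate. This is only an illustrative sketch: the function name `bypasses_dcache` and its `is_amo` flag are hypothetical and do not correspond to any NEORV32 library or RTL signal.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* True if an access is performed uncached according to the text above:
 * addresses 0xF0000000..0xFFFFFFFF are never cached, and atomic memory
 * operations always bypass the cache regardless of the address. */
static bool bypasses_dcache(uint32_t addr, bool is_amo) {
    return is_amo || (addr >= 0xF0000000u);
}

/* Usage checks for a few sample addresses. */
void bypasses_dcache_example(void) {
    assert(bypasses_dcache(0xF0000000u, false));  /* start of uncached range */
    assert(!bypasses_dcache(0xEFFFFFFFu, false)); /* last cacheable address */
    assert(bypasses_dcache(0x80000000u, true));   /* AMO to cacheable memory */
}
```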
|
||||
|
||||
.Caching Internal Memories
|
||||
[NOTE]
|
||||
|
|
|
@ -19,7 +19,7 @@
|
|||
**Overview**
|
||||
|
||||
The processor features an optional instruction cache to improve performance when using memories with high
|
||||
access latencies. The cache is connected directly to the CPU's instruction fetch interface and provides
|
||||
access latency. The cache is connected directly to the CPU's instruction fetch interface and provides
|
||||
full-transparent accesses. The cache is direct-mapped and read-only.
|
||||
|
||||
.Cached/Uncached Accesses
|
||||
|
@ -28,8 +28,8 @@ The data cache provides direct accesses (= uncached) to memory in order to acces
|
|||
processor-internal IO/peripheral modules). All accesses that target the address range from `0xF0000000` to `0xFFFFFFFF`
|
||||
will not be cached at all (see section <<_address_space>>). Direct/uncached accesses have **lower** priority than
|
||||
cache block operations to allow continuous burst transfer and also to maintain logical instruction forward
|
||||
progress / data coherency. Furthermore, atomic load-reservate and store-conditional instructions (<<_zalrsc_isa_extension>>)
|
||||
will always **bypass** the cache.
|
||||
progress / data coherency. Furthermore, the atomic memory operations of the <<_zaamo_isa_extension>> will
|
||||
always **bypass** the cache.
|
||||
|
||||
.Caching Internal Memories
|
||||
[NOTE]
|
||||
|
|
|
@ -140,5 +140,5 @@ The data cache provides direct accesses (= uncached) to memory in order to acces
|
|||
All accesses that target the address range from `0xF0000000` to `0xFFFFFFFF`
|
||||
will not be cached at all (see section <<_address_space>>). Direct/uncached accesses have **lower** priority than
|
||||
cache block operations to allow continuous burst transfer and also to maintain logical instruction forward
|
||||
progress / data coherency. Furthermore, atomic load-reservate and store-conditional instructions (<<_zalrsc_isa_extension>>)
|
||||
will always **bypass** the cache.
|
||||
progress / data coherency. Furthermore, the atomic memory operations of the <<_zaamo_isa_extension>> will
|
||||
always **bypass** the cache.
|
||||
|
|