🐛 fixing some LR/SC design flaws (#654)

This commit is contained in:
stnolting 2023-07-26 20:37:46 +02:00 committed by GitHub
commit 1d5e198147
No known key found for this signature in database
GPG key ID: 4AEE18F83AFDEB23
7 changed files with 140 additions and 69 deletions

View file

@ -1,7 +1,7 @@
## Project Change Log
The most recent version of the **NEORV32** project can be found at the top of this list.
"Stable releases" are linked and highlighted :rocket:. The latest release is
The most recent version of the project can be found at the top of this list.
Releases are linked and highlighted. The latest release is
[![release](https://img.shields.io/github/v/release/stnolting/neorv32)](https://github.com/stnolting/neorv32/releases).
A list of all releases can be found [here](https://github.com/stnolting/neorv32/releases).
@ -10,14 +10,13 @@ The _version identifier_ uses an additional **custom** element (`MAJOR.MINOR.PAT
to track _individual_ changes. The identifier number is incremented with every core RTL
modification and also by major framework modifications.
The version number is globally defined by the `hw_version_c` constant in the main VHDL
The version identifier is globally defined by the `hw_version_c` constant in the main VHDL
[package file](https://github.com/stnolting/neorv32/blob/main/rtl/core/neorv32_package.vhd).
The processor can determine this version by reading the RISC-V-compatible `mimpid` CSR.
A 8x4-bit BCD representation is used. Examples:
Software can determine this version by reading the RISC-V-compatible `mimpid` CSR, which uses
a 8x4-bit BCD (binary-coded decimal) representation. Example:
```
mimpid = 0x01040312 => Version 01.04.03.12 => v1.4.3.12
mimpid = 0x01080200 => Version 01.08.02.00 => v1.8.2
mimpid = 0x01040312 -> Version 01.04.03.12 -> v1.4.3.12
```
@ -33,6 +32,7 @@ mimpid = 0x01080200 => Version 01.08.02.00 => v1.8.2
| Date (*dd.mm.yyyy*) | Version | Comment |
|:-------------------:|:-------:|:--------|
| 24.07.2023 | 1.8.6.10 | :bug: fixing some LR/SC design flaws; [#654](https://github.com/stnolting/neorv32/pull/654) |
| 23.07.2023 | 1.8.6.9 | optimize bus system and customization options; [#653](https://github.com/stnolting/neorv32/pull/653) |
| 22.07.2023 | 1.8.6.8 | minor rtl edits; [#652](https://github.com/stnolting/neorv32/pull/652) |
| 21.07.2023 | 1.8.6.7 | :sparkles: add support for **RISC-V A ISA Extension** (atomic memory accesses; `lr.w`/`sc.w` only!); [#651](https://github.com/stnolting/neorv32/pull/651) |

View file

@ -326,15 +326,20 @@ instruction stores a word to a word-aligned address only if the reservation set
is executed or if any write access to the _reservated_ address takes place. Traps and/or CPU privilege level changes
do not modify current reservation sets.
.`aq` and `rl` Bits
[NOTE]
The instruction word's `aq` and `lr` memory ordering bits are not evaluated by the hardware at all.
.Atomic Memory Access on Hardware Level
[NOTE]
More information regarding the atomic memory accesses and the according reservation
sets can be found in section <<_reservation_set_controller>>.
.LR/SC Bus Protocol
[NOTE]
More details regarding the LR/SC instruction's bus transactions can be found in section
<<_bus_interface>> / <<_atomic_accesses>>.
.Cache Coherency
[IMPORTANT]
Atomic operations **always bypass** the cache using direct/uncached accesses. Care must be taken
to maintain data cache coherency (e.g. by using the `fence` instruction).
==== `I` ISA Extension
@ -356,7 +361,12 @@ The `I` ISA extensions is the base RISC-V integer ISA that is always enabled.
| Illegal inst. | - | 3
|=======================
.`fence` Instruction
.`fence` Instruction - Predecessor and Successor Bits
[NOTE]
The `fence` instruction word's _predecessor_ and _successor_ bits (used for memory ordering) are not evaluated
by the hardware at all.
.`fence` Instruction - How it works
[NOTE]
CPU-internally, the `fence` instruction does not perform any operation inside the CPU. It only sets the
top's `d_bus_fence_o` signal high for one cycle to inform the memory system a `fence` instruction has been
@ -813,7 +823,7 @@ is driven by the _accessed_ device or bus system (i.e. a processor-internal memo
| `ben` | 4 | Byte-enable for each byte in `data`
| `we` | 1 | **Write** request trigger (single-shot)
| `re` | 1 | **Read** request trigger (single-shot)
| `src` | 1 | Access source (`0` = instruction fetch, `1` = data access)
| `src` | 1 | Access source (`0` = instruction fetch, `1` = load/store)
| `priv` | 1 | Set if privileged (M-mode) access
| `rvso` | 1 | Set if current access is a reservation-set operation (atomic `lr` or `sc` instruction)
|=======================
@ -848,7 +858,7 @@ The figure below shows three exemplary bus accesses:
[start=1]
. A read access to address `A_addr` returning `rdata` after several cycles (slow response; `ACK` arrives after several cycles).
. A word-wide write access to address `B_addr` writing `wdata` (fast response; `ACK` arrives right in the next cycle).
. A write access to address `B_addr` writing `wdata` (fastest response; `ACK` arrives right in the next cycle).
. A failing read access to address `C_addr` (slow response; `ERR` arrives after several cycles).
.Three Exemplary Bus Transactions

View file

@ -525,8 +525,6 @@ monitor starts an internal timer. The accessed module has to respond ("ACK") to
.Internal Bus Timeout Configuration
[source,vhdl]
----
-- "response time window" for processor-internal modules --
-- = cycles after which an *unacknowledged* internal bus access will timeout and trigger a bus fault exception
constant max_proc_int_response_time_c : natural := 15; -- default = 15
----
@ -550,20 +548,28 @@ explicit specific processor generic. See section <<_processor_external_memory_in
==== Reservation Set Controller
The reservation set controller is responsible for handling the load-reservate and store-conditional bus transaction that
are triggered by the `lr.w` and `sc.w` instructions from the CPU's <<_a_isa_extension>>.
are triggered by the `lr.w` (LR) and `sc.w` (SC) instructions from the CPU's <<_a_isa_extension>>.
A "reservation" defines an address or address range that provides a guarding mechanism to support atomic accesses. A new
reservation is registered by the `lr.w` instruction. The address provided by this instruction defines the memory location
that is now monitored for atomic accesses. The according `sc.w` instruction evaluates the state of this reservation. If
the reservation is still valid the write access triggered by the `sc.w` instruction is finally executed and the instruction
return a "success" state. If the reservation has been invalidated the `sc.w` instructions write access does not take place
and the instruction returns a "failed" state.
reservation is registered by the LR instruction. The address provided by this instruction defines the memory location
that is now monitored for atomic accesses. The according SC instruction evaluates the state of this reservation. If
the reservation is still valid the write access triggered by the SC instruction is finally executed and the instruction
return a "success" state (`rd` = 0). If the reservation has been invalidated the SC instruction will not write to memory
and will return a "failed" state (`rd` = 1).
The reservation is invalidated if...
* ... another `lr.w` instruction is executed.
* ... a normal store access to the reservated address takes place (by the CPU or by the DMA).
* ... a hardware reset is triggered.
* an SC instruction is executed that accesses an address **outside** of the reservation set of the previous LR instruction.
This SC instruction will **fail** (not writing to memory).
* an SC instruction is executed that accesses an address **inside** of the reservation set of the previous LR instruction.
This SC instruction will **succeed** (finally writing to memory).
* a normal store operation accesses an address **inside** of the current reservation set (by the CPU or by the DMA).
* a hardware reset is triggered.
.Consecutive LR Instructions
[NOTE]
If an LR instruction is followed by another LR instruction the reservation set of the former one is overridden
by the reservation set of the latter one.
.Bus Access Errors
[IMPORTANT]
@ -587,19 +593,19 @@ that the size of the address region has to be a power of two.
The reservation set can be set for _any_ address (only constrained by the configured granularity). This also
includes cached memory, memory-mapped IO devices and processor-external address spaces.
Bus transactions triggered by the `lr.w` instruction register a new reservation set and are delegated to the adressed
memory/device. Bus transactions triggered by the `sc.w` remove a reservation set and are forwarded to the adressed
Bus transactions triggered by the LR instruction register a new reservation set and are delegated to the adressed
memory/device. Bus transactions triggered by the SC remove a reservation set and are forwarded to the adressed
memory/device only if the SC operations succeeds. Otherwise, the access request is not forwarded and a local ACK is
generated to terminate the bus transaction.
.LR/SC Bus Protocol
[NOTE]
More information regarding the /LR/SC bus transactions and the the according protocol can be found in section
More information regarding the LR/SC bus transactions and the the according protocol can be found in section
<<_bus_interface>> / <<_atomic_accesses>>.
.Cache Coherency
[IMPORTANT]
Atomic operations **always bypass** the cache using direct/uncached accesses. HEnce, care must be taken
Atomic operations **always bypass** the cache using direct/uncached accesses. Care must be taken
to maintain data cache coherency (e.g. by using the `fence` instruction).

View file

@ -920,22 +920,36 @@ begin
case rsvs.state is
when "10" => -- active reservation: wait for condition to invalidate reservation
if (rvs_clear_i = '1') then -- external clear request
-- --------------------------------------------------------------------
if (core_req_i.re = '1') and (core_req_i.rvso = '1') then -- another LR instruction overriding the current reservation
rsvs.addr <= core_req_i.addr(31 downto abb_c);
end if;
--
if (rvs_clear_i = '1') then -- external clear request (highest priority!)
rsvs.state <= "00"; -- invalidate reservation
elsif (core_req_i.we = '1') then
elsif (core_req_i.we = '1') then -- write access
if (core_req_i.rvso = '1') then -- store-conditional instruction
rsvs.state <= "11";
elsif (rsvs.match = '1') then -- store to reservated address
if (rsvs.match = '1') then -- SC to reservated address
rsvs.state <= "11"; -- execute SC instruction (reservation still valid)
else -- SC to any other address (new reservation attempt while the current one is still valid)
rsvs.state <= "00"; -- invalidate reservation
end if;
elsif (rsvs.match = '1') then -- normal write to reservated address
rsvs.state <= "00"; -- invalidate reservation
end if;
end if;
when "11" => -- active reservation: invalidate reservation at the end of bus access
-- --------------------------------------------------------------------
if (sys_rsp_i.ack = '1') or (sys_rsp_i.err = '1') then
rsvs.state <= "00";
end if;
when others => -- "0-" no active reservation: wait for for new registration request
-- --------------------------------------------------------------------
if (core_req_i.re = '1') and (core_req_i.rvso = '1') then -- load-reservate instruction
rsvs.addr <= core_req_i.addr(31 downto abb_c);
rsvs.state <= "10";

View file

@ -56,7 +56,7 @@ package neorv32_package is
-- Architecture Constants -----------------------------------------------------------------
-- -------------------------------------------------------------------------------------------
constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01080609"; -- hardware version
constant hw_version_c : std_ulogic_vector(31 downto 0) := x"01080610"; -- hardware version
constant archid_c : natural := 19; -- official RISC-V architecture ID
constant XLEN : natural := 32; -- native data path width, do not change!
@ -93,17 +93,17 @@ package neorv32_package is
-- IO Address Map --
constant iodev_size_c : natural := 256; -- size of a single IO device (bytes)
--constant base_res_cfs_c : std_ulogic_vector(31 downto 0) := x"ffffe000"; -- reserved
--constant base_res_cfs_c : std_ulogic_vector(31 downto 0) := x"ffffe100"; -- reserved
--constant base_res_cfs_c : std_ulogic_vector(31 downto 0) := x"ffffe200"; -- reserved
--constant base_res_cfs_c : std_ulogic_vector(31 downto 0) := x"ffffe300"; -- reserved
--constant base_res_cfs_c : std_ulogic_vector(31 downto 0) := x"ffffe400"; -- reserved
--constant base_res_cfs_c : std_ulogic_vector(31 downto 0) := x"ffffe500"; -- reserved
--constant base_res_cfs_c : std_ulogic_vector(31 downto 0) := x"ffffe600"; -- reserved
--constant base_res_cfs_c : std_ulogic_vector(31 downto 0) := x"ffffe700"; -- reserved
--constant base_res_cfs_c : std_ulogic_vector(31 downto 0) := x"ffffe800"; -- reserved
--constant base_res_cfs_c : std_ulogic_vector(31 downto 0) := x"ffffe900"; -- reserved
--constant base_res_cfs_c : std_ulogic_vector(31 downto 0) := x"ffffea00"; -- reserved
--constant base_???_c : std_ulogic_vector(31 downto 0) := x"ffffe000"; -- reserved
--constant base_???_c : std_ulogic_vector(31 downto 0) := x"ffffe100"; -- reserved
--constant base_???_c : std_ulogic_vector(31 downto 0) := x"ffffe200"; -- reserved
--constant base_???_c : std_ulogic_vector(31 downto 0) := x"ffffe300"; -- reserved
--constant base_???_c : std_ulogic_vector(31 downto 0) := x"ffffe400"; -- reserved
--constant base_???_c : std_ulogic_vector(31 downto 0) := x"ffffe500"; -- reserved
--constant base_???_c : std_ulogic_vector(31 downto 0) := x"ffffe600"; -- reserved
--constant base_???_c : std_ulogic_vector(31 downto 0) := x"ffffe700"; -- reserved
--constant base_???_c : std_ulogic_vector(31 downto 0) := x"ffffe800"; -- reserved
--constant base_???_c : std_ulogic_vector(31 downto 0) := x"ffffe900"; -- reserved
--constant base_???_c : std_ulogic_vector(31 downto 0) := x"ffffea00"; -- reserved
constant base_io_cfs_c : std_ulogic_vector(31 downto 0) := x"ffffeb00";
constant base_io_slink_c : std_ulogic_vector(31 downto 0) := x"ffffec00";
constant base_io_dma_c : std_ulogic_vector(31 downto 0) := x"ffffed00";
@ -126,7 +126,7 @@ package neorv32_package is
constant base_io_sysinfo_c : std_ulogic_vector(31 downto 0) := x"fffffe00";
constant base_io_dm_c : std_ulogic_vector(31 downto 0) := x"ffffff00";
-- OCD Debug Module Entry Points --
-- On-Chip Debugger - Debug Module Entry Points (Code ROM) --
constant dm_exc_entry_c : std_ulogic_vector(31 downto 0) := x"ffffff00"; -- = base_io_dm_c + 0, exceptions entry point
constant dm_park_entry_c : std_ulogic_vector(31 downto 0) := x"ffffff08"; -- = base_io_dm_c + 8, normal entry point
@ -1076,7 +1076,7 @@ package body neorv32_package is
if (2**i >= input) then
return i;
end if;
end loop; -- i
end loop;
return 0;
end function index_size_f;
@ -1143,7 +1143,7 @@ package body neorv32_package is
tmp_v(input'length-1) := input(input'length-1); -- keep MSB
for i in input'length-2 downto 0 loop
tmp_v(i) := input(i) xor input(i+1);
end loop; -- i
end loop;
return tmp_v;
end function bin_to_gray_f;
@ -1155,7 +1155,7 @@ package body neorv32_package is
tmp_v(input'length-1) := input(input'length-1); -- keep MSB
for i in input'length-2 downto 0 loop
tmp_v(i) := tmp_v(i+1) xor input(i);
end loop; -- i
end loop;
return tmp_v;
end function gray_to_bin_f;
@ -1167,7 +1167,7 @@ package body neorv32_package is
tmp_v := '0';
for i in a'range loop
tmp_v := tmp_v or a(i);
end loop; -- i
end loop;
return tmp_v;
end function or_reduce_f;
@ -1179,7 +1179,7 @@ package body neorv32_package is
tmp_v := '1';
for i in a'range loop
tmp_v := tmp_v and a(i);
end loop; -- i
end loop;
return tmp_v;
end function and_reduce_f;
@ -1191,7 +1191,7 @@ package body neorv32_package is
tmp_v := '0';
for i in a'range loop
tmp_v := tmp_v xor a(i);
end loop; -- i
end loop;
return tmp_v;
end function xor_reduce_f;
@ -1233,7 +1233,7 @@ package body neorv32_package is
for i in 0 to 7 loop
hex_v := tmp_v(i*4+3 downto i*4+0);
res_v(i+1) := to_hexchar_f(bit_rev_f(hex_v));
end loop; -- i
end loop;
return res_v;
end function to_hstring32_f;
@ -1244,7 +1244,7 @@ package body neorv32_package is
begin
for i in 0 to input'length-1 loop
output_v(input'length-i-1) := input(i);
end loop; -- i
end loop;
return output_v;
end function bit_rev_f;
@ -1289,7 +1289,7 @@ package body neorv32_package is
if (input(i) = '1') then
cnt_v := cnt_v + 1;
end if;
end loop; -- i
end loop;
return cnt_v;
end function popcount_f;
@ -1305,7 +1305,7 @@ package body neorv32_package is
else
exit;
end if;
end loop; -- i
end loop;
return cnt_v;
end function leading_zeros_f;
@ -1319,9 +1319,9 @@ package body neorv32_package is
if (init'length > depth) then
return mem_v;
end if;
for idx_v in 0 to init'length-1 loop -- init only in range of source data array
mem_v(idx_v) := init(idx_v);
end loop; -- idx_v
for i in 0 to init'length-1 loop -- init only in range of source data array
mem_v(i) := init(i);
end loop;
return mem_v;
end function mem32_init_f;

View file

@ -1728,17 +1728,21 @@ int main() {
// [NOTE] LR/SC operations bypass the data cache so we need to flush/reload
// it before/after making "normal" load/store operations
neorv32_cpu_invalidate_reservations(); // invalidate all current reservations
amo_var = 0x00cafe00; // initialize
tmp_a = neorv32_cpu_load_reservate_word((uint32_t)&amo_var);
amo_var = 0x10cafe00; // break reservation
asm volatile ("fence"); // flush/reload d-cache
tmp_b = neorv32_cpu_store_conditional_word((uint32_t)&amo_var, 0xaaaaaaaa);
tmp_b = (tmp_b << 1) | neorv32_cpu_store_conditional_word((uint32_t)&amo_var, 0xcccccccc); // another SC: must fail
tmp_b = (tmp_b << 1) | neorv32_cpu_store_conditional_word((uint32_t)ADDR_UNREACHABLE, 0); // another SC: must fail; no bus exception!
asm volatile ("fence"); // flush/reload d-cache
if ((tmp_a == 0x00cafe00) && // correct LR.W result
(amo_var == 0x10cafe00) && // atomic variable NOT updates by SC.W
(tmp_b == 0x00000001) && // SC.W failed
(tmp_b == 0x00000007) && // SC.W[2] failed, SC.W[1] failed, SC.W[0] failed
(neorv32_cpu_csr_read(CSR_MCAUSE) == mcause_never_c)) { // no exception
test_ok();
}
@ -1766,17 +1770,21 @@ int main() {
// [NOTE] LR/SC operations bypass the data cache so we need to flush/reload
// it before/after making "normal" load/store operations
neorv32_cpu_invalidate_reservations(); // invalidate all current reservations
amo_var = 0x00abba00; // initialize
tmp_a = neorv32_cpu_load_reservate_word((uint32_t)&amo_var);
asm volatile ("fence"); // flush/reload d-cache
neorv32_cpu_load_unsigned_word((uint32_t)&amo_var); // dummy read, must not alter reservation set state
tmp_b = neorv32_cpu_store_conditional_word((uint32_t)&amo_var, 0xcccccccc);
tmp_b = (tmp_b << 1) | neorv32_cpu_store_conditional_word((uint32_t)&amo_var, 0xcccccccc); // another SC: must fail
tmp_b = (tmp_b << 1) | neorv32_cpu_store_conditional_word((uint32_t)ADDR_UNREACHABLE, 0); // another SC: must fail; no bus exception!
asm volatile ("fence"); // flush/reload d-cache
if ((tmp_a == 0x00abba00) && // correct LR.W result
(amo_var == 0xcccccccc) && // atomic variable WAS updates by SC.W
(tmp_b == 0x00000000) && // SC.W succeeded
(tmp_b == 0x00000003) && // SC.W[2] succeeded, SC.W[1] failed, SC.W[0] failed
(neorv32_cpu_csr_read(CSR_MCAUSE) == mcause_never_c)) { // no exception
test_ok();
}
@ -1834,9 +1842,9 @@ int main() {
tmp_a = (uint32_t)(&pmp_access[0]); // base address of protected region
// configure new region (with highest priority)
PRINT_STANDARD("[0] OFF @ 0x%x, ", tmp_a); // base
PRINT_STANDARD("[0]: OFF @ 0x%x, ", tmp_a); // base
tmp_b = neorv32_cpu_pmp_configure_region(0, tmp_a >> 2, 0);
PRINT_STANDARD("[1] TOR (!L,!X,!W,R) @ 0x%x ", tmp_a+4); // bound
PRINT_STANDARD("[1]: TOR (!L,!X,!W,R) @ 0x%x ", tmp_a+4); // bound
tmp_b += neorv32_cpu_pmp_configure_region(1, (tmp_a+4) >> 2, (PMP_TOR << PMPCFG_A_LSB) | (1 << PMPCFG_R)); // read-only
if ((tmp_b == 0) && (neorv32_cpu_csr_read(CSR_MCAUSE) == mcause_never_c)) {

View file

@ -63,12 +63,9 @@ void neorv32_cpu_goto_user_mode(void);
/**@}*/
/**********************************************************************//**
* Prototype for "after-main handler". This function is called if main() returns.
*
* @param[in] return_code Return value of main() function.
**************************************************************************/
extern void __attribute__ ((weak)) __neorv32_crt0_after_main(int32_t return_code);
// #################################################################################################
// Load/store
// #################################################################################################
/**********************************************************************//**
@ -211,6 +208,11 @@ inline int8_t __attribute__ ((always_inline)) neorv32_cpu_load_signed_byte(uint3
}
// #################################################################################################
// Atomic memory access / load-reservate/store-conditional
// #################################################################################################
/**********************************************************************//**
* Atomic memory access: load-reservate word.
*
@ -261,6 +263,24 @@ inline uint32_t __attribute__ ((always_inline)) neorv32_cpu_store_conditional_wo
}
/**********************************************************************//**
* Atomic memory access: invalidate (all) current reservation sets
*
* @warning This function requires the A ISA extension.
**************************************************************************/
inline void __attribute__ ((always_inline)) neorv32_cpu_invalidate_reservations(void) {
#if defined __riscv_atomic
asm volatile ("sc.w zero, zero, (zero)");
#endif
}
// #################################################################################################
// CSR access
// #################################################################################################
/**********************************************************************//**
* Read data from CPU control and status register (CSR).
*
@ -319,6 +339,19 @@ inline void __attribute__ ((always_inline)) neorv32_cpu_csr_clr(const int csr_id
}
// #################################################################################################
// Misc
// #################################################################################################
/**********************************************************************//**
* Prototype for "after-main handler". This function is called if main() returns.
*
* @param[in] return_code Return value of main() function.
**************************************************************************/
extern void __attribute__ ((weak)) __neorv32_crt0_after_main(int32_t return_code);
/**********************************************************************//**
* Put CPU into sleep / power-down mode.
*