mirror of
https://github.com/stnolting/neorv32.git
synced 2025-04-24 22:27:21 -04:00
[docs] update cache sections
parent
ff8d127e32
commit
22ea686f6c
3 changed files with 53 additions and 81 deletions
|
@@ -1,4 +1,5 @@
|
|||
<<<
|
||||
<<<
|
||||
:sectnums:
|
||||
==== Processor-Internal Data Cache (dCACHE)
|
||||
|
||||
|
@@ -6,11 +7,11 @@
|
|||
[grid="none"]
|
||||
|=======================
|
||||
| Hardware source files: | neorv32_cache.vhd | Generic cache module
|
||||
| Software driver files: | none | _implicitly used_
|
||||
| Software driver files: | none |
|
||||
| Top entity ports: | none |
|
||||
| Configuration generics: | `DCACHE_EN` | implement processor-internal data cache when `true`
|
||||
| | `DCACHE_NUM_BLOCKS` | number of cache blocks (pages/lines)
|
||||
| | `DCACHE_BLOCK_SIZE` | size of a cache block in bytes
|
||||
| | `DCACHE_NUM_BLOCKS` | number of cache blocks (pages or lines); has to be a power of two
|
||||
| | `DCACHE_BLOCK_SIZE` | size of a cache block in bytes; has to be a power of two
|
||||
| CPU interrupts: | none |
|
||||
|=======================
|
||||
|
||||
|
@@ -21,24 +22,17 @@ The processor features an optional data cache to improve performance when using
|
|||
access latency. The cache is connected directly to the CPU's data access interface and provides
|
||||
fully transparent accesses. The cache is direct-mapped and uses "write-allocate" and "write-back" strategies.
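As a rough illustration of why the block count and block size are constrained to powers of two, the following C sketch splits a 32-bit address into tag, index and offset fields the way a conventional direct-mapped cache would. This is a generic textbook model and is not taken from `neorv32_cache.vhd`; the names `cache_addr_t`, `is_pow2`, `log2_pow2` and `cache_split_addr` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical helper type; not part of the NEORV32 sources. */
typedef struct {
    uint32_t tag;    /* upper address bits that identify the cached block */
    uint32_t index;  /* selects one of num_blocks cache blocks */
    uint32_t offset; /* byte offset inside the selected block */
} cache_addr_t;

/* Returns 1 if x is a non-zero power of two. */
static int is_pow2(uint32_t x) {
    return (x != 0u) && ((x & (x - 1u)) == 0u);
}

/* Integer log2; only meaningful for power-of-two arguments. */
static uint32_t log2_pow2(uint32_t x) {
    uint32_t n = 0u;
    while (x > 1u) {
        x >>= 1;
        n++;
    }
    return n;
}

/* Split an address for a direct-mapped cache with the given geometry.
 * num_blocks and block_size play the role of the *_NUM_BLOCKS and
 * *_BLOCK_SIZE generics (assumption: plain bit slicing is used). */
static cache_addr_t cache_split_addr(uint32_t addr, uint32_t num_blocks,
                                     uint32_t block_size) {
    assert(is_pow2(num_blocks) && is_pow2(block_size));
    uint32_t offset_bits = log2_pow2(block_size);
    uint32_t index_bits = log2_pow2(num_blocks);
    assert(offset_bits + index_bits < 32u); /* keep the tag shift defined */

    cache_addr_t r;
    r.offset = addr & (block_size - 1u);
    r.index = (addr >> offset_bits) & (num_blocks - 1u);
    r.tag = addr >> (offset_bits + index_bits);
    return r;
}
```

With 4 blocks of 64 bytes each, for example, address `0x80001234` lands in block index 0 at byte offset `0x34`. Because both sizes are powers of two, every field is a plain bit slice and no division is needed.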
|
||||
|
||||
.Cached/Uncached Accesses
|
||||
.Uncached Accesses
|
||||
[NOTE]
|
||||
The data cache provides direct accesses (= uncached) to memory in order to access memory-mapped IO (like the
|
||||
processor-internal IO/peripheral modules). All accesses that target the address range from `0xF0000000` to `0xFFFFFFFF`
|
||||
will not be cached at all (see section <<_address_space>>). Direct/uncached accesses have **lower** priority than
|
||||
cache block operations to allow continuous burst transfer and also to maintain logical instruction forward
|
||||
progress / data coherency. Furthermore, the atomic memory operations of the <<_zaamo_isa_extension>> will
|
||||
always **bypass** the cache.
|
||||
will not be cached at all (see section <<_address_space>>). Furthermore, the atomic memory operations
|
||||
of the <<_zaamo_isa_extension>> will always **bypass** the cache.
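The bypass rule in this note can be sketched as a small C predicate. This is only a software model of the documented behavior, not the hardware logic; `UNCACHED_BASE`, `access_bypasses_cache` and `bypass_selfcheck` are hypothetical names introduced for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Start of the uncached region named in the note (it extends to 0xFFFFFFFF). */
#define UNCACHED_BASE 0xF0000000u

/* Hypothetical helper: true if an access is expected to go around the cache.
 * is_atomic marks an atomic memory operation (Zaamo), which always bypasses. */
static bool access_bypasses_cache(uint32_t addr, bool is_atomic) {
    return is_atomic || (addr >= UNCACHED_BASE);
}

/* Small self-check showing the intended use. */
static void bypass_selfcheck(void) {
    assert(access_bypasses_cache(0xF0000000u, false));  /* first uncached address */
    assert(access_bypasses_cache(0xFFFFFFFFu, false));  /* last uncached address */
    assert(!access_bypasses_cache(0xEFFFFFFFu, false)); /* just below the region */
    assert(access_bypasses_cache(0x80000000u, true));   /* atomic access, any address */
}
```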
|
||||
|
||||
.Caching Internal Memories
|
||||
[NOTE]
|
||||
The data cache is intended to accelerate data access to **processor-external** memories.
|
||||
The CPU cache(s) should not be implemented when using only processor-internal data and instruction memories.
|
||||
|
||||
.Manual Cache Flush/Clear/Reload
|
||||
.Manual Cache Flush/Clear/Reload and Memory Coherence
|
||||
[NOTE]
|
||||
By executing the `fence` instruction the data cache is flushed, cleared and reloaded.
|
||||
See section <<_cache_coherency>> for more information.
|
||||
See section <<_memory_coherence>> for more information.
|
||||
|
||||
.Retrieve Cache Configuration from Software
|
||||
[TIP]
|
||||
|
@@ -46,8 +40,6 @@ Software can retrieve the cache configuration/layout from the <<_sysinfo_cache_c
|
|||
|
||||
.Bus Access Fault Handling
|
||||
[NOTE]
|
||||
The cache always loads a complete cache block (aligned to the block size) every time a
|
||||
cache miss is detected. Each cached word from this block provides a single status bit that indicates if the
|
||||
according bus access was successful or caused a bus error. Hence, the whole cache block remains valid even
|
||||
if certain addresses inside caused a bus error. If the CPU accesses any of the faulty cache words, a
|
||||
data bus error exception is raised.
|
||||
If the cache encounters a bus error when uploading a modified block to the next memory level or when
|
||||
downloading a new block from the next memory level, the entire block is invalidated and a bus access
|
||||
error exception is raised.
|
||||
|
|
|
@@ -1,4 +1,5 @@
|
|||
<<<
|
||||
<<<
|
||||
:sectnums:
|
||||
==== Processor-Internal Instruction Cache (iCACHE)
|
||||
|
||||
|
@@ -6,11 +7,11 @@
|
|||
[grid="none"]
|
||||
|=======================
|
||||
| Hardware source files: | neorv32_cache.vhd | Generic cache module
|
||||
| Software driver files: | none | _implicitly used_
|
||||
| Software driver files: | none |
|
||||
| Top entity ports: | none |
|
||||
| Configuration generics: | `ICACHE_EN` | implement processor-internal instruction cache when `true`
|
||||
| | `ICACHE_NUM_BLOCKS` | number of cache blocks (pages/lines)
|
||||
| | `ICACHE_BLOCK_SIZE` | size of a cache block in bytes
|
||||
| | `ICACHE_NUM_BLOCKS` | number of cache blocks (pages or lines); has to be a power of two
|
||||
| | `ICACHE_BLOCK_SIZE` | size of a cache block in bytes; has to be a power of two
|
||||
| CPU interrupts: | none |
|
||||
|=======================
|
||||
|
||||
|
@@ -21,24 +22,17 @@ The processor features an optional instruction cache to improve performance when
|
|||
access latency. The cache is connected directly to the CPU's instruction fetch interface and provides
|
||||
fully transparent accesses. The cache is direct-mapped and read-only.
|
||||
|
||||
.Cached/Uncached Accesses
|
||||
.Uncached Accesses
|
||||
[NOTE]
|
||||
The instruction cache provides direct accesses (= uncached) to memory in order to access memory-mapped IO (like the
|
||||
processor-internal IO/peripheral modules). All accesses that target the address range from `0xF0000000` to `0xFFFFFFFF`
|
||||
will not be cached at all (see section <<_address_space>>). Direct/uncached accesses have **lower** priority than
|
||||
cache block operations to allow continuous burst transfer and also to maintain logical instruction forward
|
||||
progress / data coherency. Furthermore, the atomic memory operations of the <<_zaamo_isa_extension>> will
|
||||
always **bypass** the cache.
|
||||
will not be cached at all (see section <<_address_space>>). Furthermore, the atomic memory operations
|
||||
of the <<_zaamo_isa_extension>> will always **bypass** the cache.
|
||||
|
||||
.Caching Internal Memories
|
||||
[NOTE]
|
||||
The instruction cache is intended to accelerate instruction fetches from **processor-external** memories.
|
||||
The CPU cache(s) should not be implemented when using only processor-internal data and instruction memories.
|
||||
|
||||
.Manual Cache Clear/Reload
|
||||
.Manual Cache Flush/Clear/Reload and Memory Coherence
|
||||
[NOTE]
|
||||
By executing the `fence.i` instruction the instruction cache is cleared and reloaded.
|
||||
See section <<_cache_coherency>> for more information.
|
||||
See section <<_memory_coherence>> for more information.
|
||||
|
||||
.Retrieve Cache Configuration from Software
|
||||
[TIP]
|
||||
|
@@ -46,8 +40,6 @@ Software can retrieve the cache configuration/layout from the <<_sysinfo_cache_c
|
|||
|
||||
.Bus Access Fault Handling
|
||||
[NOTE]
|
||||
The cache always loads a complete cache block (aligned to the block size) every time a
|
||||
cache miss is detected. Each cached word from this block provides a single status bit that indicates if the
|
||||
according bus access was successful or caused a bus error. Hence, the whole cache block remains valid even
|
||||
if certain addresses inside caused a bus error. If the CPU accesses any of the faulty cache words, an
|
||||
instruction bus error exception is raised.
|
||||
If the cache encounters a bus error when uploading a modified block to the next memory level or when
|
||||
downloading a new block from the next memory level, the entire block is invalidated and a bus access
|
||||
error exception is raised.
|
||||
|
|
|
@@ -7,30 +7,30 @@
|
|||
|=======================
|
||||
| Hardware source files: | neorv32_xbus.vhd | External bus gateway
|
||||
| | neorv32_cache.vhd | Generic cache module
|
||||
| Software driver files: | none | _implicitly used_
|
||||
| Software driver files: | none |
|
||||
| Top entity ports: | `xbus_adr_o` | address output (32-bit)
|
||||
| | `xbus_dat_i` | data input (32-bit)
|
||||
| | `xbus_dat_o` | data output (32-bit)
|
||||
| | `xbus_tag_o` | access tag (3-bit)
|
||||
| | `xbus_we_o` | write enable (1-bit)
|
||||
| | `xbus_sel_o` | byte enable (4-bit)
|
||||
| | `xbus_stb_o` | bus strobe (1-bit)
|
||||
| | `xbus_cyc_o` | valid cycle (1-bit)
|
||||
| | `xbus_dat_i` | data input (32-bit)
|
||||
| | `xbus_ack_i` | acknowledge (1-bit)
|
||||
| | `xbus_err_i` | bus error (1-bit)
|
||||
| Configuration generics: | `XBUS_EN` | enable external bus interface when `true`
|
||||
| | `XBUS_TIMEOUT` | number of clock cycles after which an unacknowledged external bus access will auto-terminate (0 = disabled)
|
||||
| | `XBUS_REGSTAGE_EN` | implement XBUS register stages
|
||||
| | `XBUS_CACHE_EN` | implement the external bus cache
|
||||
| | `XBUS_CACHE_NUM_BLOCKS` | number of blocks ("lines"), has to be a power of two.
|
||||
| | `XBUS_CACHE_BLOCK_SIZE` | size in bytes of each block, has to be a power of two.
|
||||
| | `XBUS_CACHE_EN` | implement the external bus cache when `true`
|
||||
| | `XBUS_CACHE_NUM_BLOCKS` | number of cache blocks (pages or lines); has to be a power of two
|
||||
| | `XBUS_CACHE_BLOCK_SIZE` | size of a cache block in bytes; has to be a power of two
|
||||
| CPU interrupts: | none |
|
||||
|=======================
|
||||
|
||||
|
||||
**Overview**
|
||||
|
||||
The external bus interface provides a **Wishbone b4**-compatible on-chip bus interface that is
|
||||
The external bus interface provides a **Wishbone b4**-compatible on-chip bus interface that gets
|
||||
implemented if the `XBUS_EN` generic is `true`. This bus interface can be used to attach processor-external
|
||||
modules like memories, custom hardware accelerators or additional peripheral devices.
|
||||
An optional cache module ("XCACHE") can be enabled to improve memory access latency.
|
||||
|
@@ -76,12 +76,8 @@ device's / bus system's `cyc` and `stb` signals (omitting the processor's `xbus_
|
|||
|
||||
.Atomic Memory Accesses
|
||||
[NOTE]
|
||||
<<_Atomic_Memory_Access>> keep the `cyc` signal active to perform a back-to-back bus access consisting of
|
||||
two `stb` strobes (one for the load/read operation and another one for the store/write operation).
|
||||
|
||||
.Endianness
|
||||
[NOTE]
|
||||
Just like the processor itself the XBUS interface uses **little-endian** byte order.
|
||||
<<_atomic_memory_access>> operations keep the `cyc` signal active to perform a back-to-back bus access
|
||||
consisting of two `stb` strobes (one for the load/read operation and another one for the store/write operation).
|
||||
|
||||
.Wishbone Specs.
|
||||
[TIP]
|
||||
|
@@ -123,36 +119,28 @@ It compatible to the the AXI4 `ARPROT` and `AWPROT` signals.
|
|||
The XBUS interface provides an optional internal cache that can be used to buffer processor-external accesses.
|
||||
The x-cache is enabled via the `XBUS_CACHE_EN` generic. The total size of the cache is split into the number of
|
||||
cache lines or cache blocks (`XBUS_CACHE_NUM_BLOCKS` generic) and the line or block size in bytes
|
||||
(`XBUS_CACHE_BLOCK_SIZE` generic).
|
||||
(`XBUS_CACHE_BLOCK_SIZE` generic). The cache uses a direct-mapped architecture that implements "write-allocate"
|
||||
and "write-back" strategies.
|
||||
|
||||
.Simplified X-Cache Architecture
|
||||
[source,asciiart]
|
||||
---------------------------------------
|
||||
Direct Access +----------+
|
||||
/|------------------------->| Register |------------------------>|\
|
||||
| | +----------+ | |
|
||||
Core --->| | | |---> XBUS
|
||||
| | +--------------+ +--------------+ +-------------+ | |
|
||||
\|--->| Host Arbiter |--->| Cache Memory |<---| Bus Arbiter |--->|/
|
||||
+--------------+ +--------------+ +-------------+
|
||||
---------------------------------------
|
||||
|
||||
The cache uses a direct-mapped architecture that implements "write-allocate" and "write-back" strategies.
|
||||
The **write-allocate** strategy will fetch the entire referenced block from main memory when encountering
|
||||
a cache write-miss. The **write-back** strategy will gather all writes locally inside the cache until the according
|
||||
cache block is about to be replaced. In this case, the entire modified cache block is written back to main memory.
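The two strategies described above can be tried out with a toy software model. The following C sketch is a minimal direct-mapped cache in front of a small byte array, assuming a tiny fixed geometry; it is not derived from the VHDL sources, and all names (`model_t`, `model_lookup`, `model_write8`, `model_read8`, the `NUM_BLOCKS`/`BLOCK_SIZE`/`MEM_SIZE` constants) are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

#define NUM_BLOCKS 4u   /* stands in for XBUS_CACHE_NUM_BLOCKS */
#define BLOCK_SIZE 16u  /* stands in for XBUS_CACHE_BLOCK_SIZE (bytes) */
#define MEM_SIZE   256u /* size of the toy backing memory (assumption) */

typedef struct {
    int valid;                /* block holds data */
    int dirty;                /* block was modified since it was loaded */
    uint32_t tag;             /* identifies which memory block is cached */
    uint8_t data[BLOCK_SIZE];
} block_t;

typedef struct {
    block_t blocks[NUM_BLOCKS];
    uint8_t mem[MEM_SIZE];    /* models main memory */
    unsigned fills;           /* number of block fetches from main memory */
    unsigned writebacks;      /* number of dirty blocks written back */
} model_t;

static void model_init(model_t *m) {
    memset(m, 0, sizeof *m);
}

/* Make the block containing addr resident and return it. */
static block_t *model_lookup(model_t *m, uint32_t addr) {
    assert(addr < MEM_SIZE);
    uint32_t line = addr / BLOCK_SIZE;
    uint32_t index = line % NUM_BLOCKS;
    uint32_t tag = line / NUM_BLOCKS;
    block_t *b = &m->blocks[index];

    if (b->valid && b->tag == tag) {
        return b; /* hit */
    }
    if (b->valid && b->dirty) {
        /* write-back: the modified block leaves the cache as a whole */
        uint32_t old_base = (b->tag * NUM_BLOCKS + index) * BLOCK_SIZE;
        memcpy(&m->mem[old_base], b->data, BLOCK_SIZE);
        m->writebacks++;
    }
    /* allocate: fetch the entire referenced block (also on a write miss) */
    memcpy(b->data, &m->mem[line * BLOCK_SIZE], BLOCK_SIZE);
    b->valid = 1;
    b->dirty = 0;
    b->tag = tag;
    m->fills++;
    return b;
}

static void model_write8(model_t *m, uint32_t addr, uint8_t value) {
    block_t *b = model_lookup(m, addr); /* write-allocate on a miss */
    b->data[addr % BLOCK_SIZE] = value; /* the write stays local ... */
    b->dirty = 1;                       /* ... until the block is replaced */
}

static uint8_t model_read8(model_t *m, uint32_t addr) {
    return model_lookup(m, addr)->data[addr % BLOCK_SIZE];
}
```

In this model, a write to address `0x04` first fetches the whole block and leaves main memory untouched. Main memory only sees the new value once a conflicting access (for example a read from `0x44`, which maps to the same block index) forces the dirty block to be written back.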
|
||||
|
||||
.Manual Cache Flush/Clear/Reload
|
||||
[NOTE]
|
||||
By executing a `fence` **or** `fence.i` instruction the XBUS cache is flushed (local modifications are sent back to
|
||||
main memory), cleared (all cache entries are invalidated) and reloaded (fetching new data from main memory).
|
||||
See section <<_cache_coherency>> for more information.
|
||||
|
||||
.Cached/Uncached Accesses
|
||||
.Uncached Accesses
|
||||
[NOTE]
|
||||
The XBUS cache provides direct accesses (= uncached) to memory in order to access memory-mapped IO.
|
||||
All accesses that target the address range from `0xF0000000` to `0xFFFFFFFF`
|
||||
will not be cached at all (see section <<_address_space>>). Direct/uncached accesses have **lower** priority than
|
||||
cache block operations to allow continuous burst transfer and also to maintain logical instruction forward
|
||||
progress / data coherency. Furthermore, the atomic memory operations of the <<_zaamo_isa_extension>> will
|
||||
always **bypass** the cache.
|
||||
will not be cached at all (see section <<_address_space>>). Furthermore, the atomic memory operations
|
||||
of the <<_zaamo_isa_extension>> will always **bypass** the cache.
|
||||
|
||||
.Manual Cache Flush/Clear/Reload and Memory Coherence
|
||||
[NOTE]
|
||||
By executing a `fence` **or** `fence.i` instruction the XBUS cache is flushed (local modifications are sent back to
|
||||
main memory), cleared (all cache entries are invalidated) and reloaded (fetching new data from main memory).
|
||||
See section <<_memory_coherence>> for more information.
|
||||
|
||||
.Retrieve Cache Configuration from Software
|
||||
[TIP]
|
||||
Software can retrieve the cache configuration/layout from the <<_sysinfo_cache_configuration>> register.
|
||||
|
||||
.Bus Access Fault Handling
|
||||
[NOTE]
|
||||
If the cache encounters a bus error when uploading a modified block to the next memory level or when
|
||||
downloading a new block from the next memory level, the entire block is invalidated and a bus access
|
||||
error exception is raised.
|
||||
|
|