This macro is not required and makes the file harder to parse.
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: JeanRochCoulon <jean-roch.coulon@thalesgroup.com>
Following what was done in branch_unit, I set up a default value for hypervisor exception fields in cvxif_fu.
Should fix issue #2831
---------
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
Co-authored-by: JeanRochCoulon <jean-roch.coulon@thalesgroup.com>
This PR assigns 0 to tinst by default.
Even though tinst is only used when CVA6Cfg.RVH is enabled, I chose to assign it a default value in all configurations, since the signal is defined for all configurations.
Fixes#2803
Co-authored-by: JeanRochCoulon <jean-roch.coulon@thalesgroup.com>
Verification Plan provided in VP_TOOL for the PMP. The verification plan should be complete, however only a partial set of the tests is available. This is not included in the CI but a bash script is available to run the test.
This code hits verilator/verilator#5829 due to the use of partial assignments to dcache_rtrn_o in this always block, while reading other bits of the same packed struct elsewhere in the block.
The actual effect of this is that with a Verilator simulation, invalidation requests incoming from the coherence network are sometimes ignored breaking AMOs.
Moving the assignments to the bits read in the always block into the same always block avoids this issue.
---------
Co-authored-by: JeanRochCoulon <jean-roch.coulon@thalesgroup.com>
Co-authored-by: github-actions[bot] <41898282+github-actions[bot]@users.noreply.github.com>
This PR adds a new two-level BHT predictor with private history. The new BPType parameters allow choosing between the original BHT and the new one.
Co-authored-by: Gianmarco Ottavi <ottavig91@gmail.com>
Adds to #2798. Sorry for noticing this only now. Together with #2798, this reverts a bug that was introduced in #2528.
Signed-off-by: Nils Wistoff <nwistoff@iis.ee.ethz.ch>
A new configuration file and core v target is added to start working on a 64 bit CVA6 core.
---------
Co-authored-by: JeanRochCoulon <jean-roch.coulon@thalesgroup.com>
The load and store units sample the MMU exception one cycle after
`dtlb_hit` is asserted. However, misaligned exceptions are currently fed
through the MMU, potentially attributing a misaligned exception to the
*preceding* instruction. Fix this by latching the misaligned exception.
Signed-off-by: Nils Wistoff <nwistoff@iis.ee.ethz.ch>
Co-authored-by: JeanRochCoulon <jean-roch.coulon@thalesgroup.com>
Use separate parametrization for RVF and RVD support. In the cv32a6_imafc_sv32 configuration RVD is currently enabled, leading to compilation errors.
Co-authored-by: JeanRochCoulon <jean-roch.coulon@thalesgroup.com>
Hi! Some tools like morty struggle with this expression. I suggest this very simple rewrite. No need for fancy constructs here.
Thanks @flaviens for this contribution
Add parameter CoproType to select which coprocessor to instantiate when CvxifEn == 1
---------
Co-authored-by: JeanRochCoulon <jean-roch.coulon@thalesgroup.com>
Follow-up to the discussion on extending Linux support to the Ara vector processor.
* Main changes:
Add:
Add external MMU interface to share the MMU with the external accelerator.
Add avoid_neg() function used to clip negative numbers to zero. Useful for parametric array sizes and vector multipliers.
Modifications:
2 commit ports by default in cv64a6_imafdcv_config_pkg.
Change exception_t from localparam to param in cva6.sv.
Add parameters accelerator_req_t, accelerator_resp_t, acc_mmu_req_t, and acc_mmu_resp_t to cva6.sv.
Replace the fall-through register with a spill register in acc_dispatcher to decouple timing with the accelerator.
Decrease cache sizes in cv64a6_imafdcv_sv39_config_pkg.
Modify Bender.yml package name from ariane to cva6.
Add harmless code to prevent synthesizer tool from crashing when compiling csr_regfile.
* Collateral changes:
Fixes:
Guard some X-IF code lines with correct parameter in cva6.sv.
Parametrize the tracer interface with NrCommitPorts.
Add missing local dependencies to Bender.yml.
---------
Co-authored-by: JeanRochCoulon <jean-roch.coulon@thalesgroup.com>
Add support for Superscalar with ZCMP, ZCMT and CVXIF.
ZCMP decoder, ZCMT decoder and CVXIF interface driver are using port 0.
Standard RVC and 32 bits instruction can take port 0 or 1.
Use CVA6Cfg.DebugEn in an outer test instead of in inner tests
Signed-off-by: André Sintzoff <andre.sintzoff@thalesgroup.com>
Co-authored-by: JeanRochCoulon <jean-roch.coulon@thalesgroup.com>
is_rvc is redundant with riscv::OpcodeC1, riscv::OpcodeC2
Signed-off-by: André Sintzoff <andre.sintzoff@thalesgroup.com>
Co-authored-by: JeanRochCoulon <jean-roch.coulon@thalesgroup.com>
## Introduction
This PR implements the ZCMT extension in the CVA6 core, targeting the 32-bit embedded-class platforms. ZCMT is a code-size reduction feature that utilizes compressed table jump instructions (cm.jt and cm.jalt) to reduce code size for embedded systems
**Note:** Due to implementation complexity, ZCMT extension is primarily targeted at embedded class CPUs. Additionally, it is not compatible with architecture class profiles.(Ref. [Unprivilege spec 27.20](https://drive.google.com/file/d/1uviu1nH-tScFfgrovvFCrj7Omv8tFtkp/view))
## Key additions
- Added zcmt_decoder module for compressed table jump instructions: cm.jt (jump table) and cm.jalt (jump-and-link table)
- Implemented the Jump Vector Table (JVT) CSR to store the base address of the jump table in csr_reg module
- Implemented a return address stack, enabling cm.jalt to behave equivalently to jal ra (jump-and-link with return address), by pushing the return address onto the stack in zcmt_decoder module
## Implementation in CVA6
The implementation of the ZCMT extension involves the following major modifications:
### compressed decoder
The compressed decoder scans and identifies the cm.jt and cm.jalt instructions, and generates signals indicating that the instruction is both compressed and a ZCMT instruction.
### zcmt_decoder
A new zcmt_decoder module was introduced to decode the cm.jt and cm.jalt instructions, fetch the base address of the JVT table from JVT CSR, extract the index and construct jump instructions to ensure efficient integration of the ZCMT extension in embedded platforms. Table.1 shows the IO port connection of zcmt_decoder module. High-level block diagram of zcmt implementation in CVA6 is shown in Figure 1.
_Table. 1 IO port connection with zcmt_decoder module_
Signals | IO | Description | Connection | Type
-- | -- | -- | -- | --
clk_i | in | Subsystem Clock | SUBSYSTEM | logic
rst_ni | in | Asynchronous reset active low | SUBSYSTEM | logic
instr_i | in | Instruction in | compressed_decoder | logic [31:0]
pc_i | in | Current PC | PC from FRONTEND | logic [CVA6Cfg.VLEN-1:0]
is_zcmt_instr_i | in | Is instruction a zcmt instruction | compressed_decoder | logic
illegal_instr_i | in | Is instruction a illegal instruction | compressed_decoder | logic
is_compressed_i | in | Is instruction a compressed instruction | compressed_decoder | logic
jvt_i | in | JVT struct from CSR | CSR | jvt_t
req_port_i | in | Handshake between CACHE and FRONTEND (fetch) | Cache | dcache_req_o_t
instr_o | out | Instruction out | cvxif_compressed_if_driver | logic [31:0]
illegal_instr_o | out | Is the instruction is illegal | cvxif_compressed_if_driver | logic
is_compressed_o | out | Is the instruction is compressed | cvxif_compressed_if_driver | logic
fetch_stall_o | out | Stall siganl | cvxif_compressed_if_driver | logic
req_port_o | out | Handshake between CACHE and FRONTEND (fetch) | Cache | dcache_req_i_t
### branch unit condition
A condition is implemented in the branch unit to ensure that ZCMT instructions always cause a misprediction, forcing the program to jump to the calculated address of the newly constructed jump instruction.
### JVT CSR
A new JVT csr is implemented in csr_reg which holds the base address of the JVT table. The base address is fetched from the JVT CSR, and combined with the index value to calculate the effective address.
### No MMU
Embedded platform does not utilize the MMU, so zcmt_decoder is connected with cache through port 0 of the Dcache module for implicit read access from the memory.

_Figure. 1 High level block diagram of ZCMT extension implementation_
## Known Limitations
The implementation targets 32-bit instructions for embedded-class platforms without an MMU. Since the core does not utilize an MMU, it is leveraged to connect the zcmt_decoder to the cache via port 0.
## Testing and Verification
- Developed directed test cases to validate cm.jt and cm.jalt instruction functionality
- Verified correct initialization and updates of JVT CSR
### Test Plan
A test plan is developed to test the functionality of ZCMT extension along with JVT CSR. Directed Assembly test executed to check the functionality.
_Table. 2 Test plan_
S.no | Features | Description | Pass/Fail Criteria | Test Type | Test status
-- | -- | -- | -- | ---- | --
1 | cm.jt | Simple assembly test to validate the working of cm.jt instruction in CV32A60x. | Check against Spike's ref. model | Directed | Pass
2 | cm.jalt | Simple assembly test to validate the working of cm.jalt instruction in both CV32A60x. | Check against Spike's ref. model | Directed | Pass
3 | cm.jalt with return address stack | Simple assembly test to validate the working of cm.jalt instruction with return address stack in both CV32A60x. It works as jump and link ( j ra, imm) | Check against Spike's ref. model | Directed | Pass
4 | JVT CSR | Read and write base address of Jump table to JVT CSR | Check against Spike's ref. model | Directed | Pass
**Note**: Please find the test under CVA6_REPO_DIR/verif/tests/custom/zcmt"
The AXI AW channel in the HPDcache is shared by three components:
Write Buffer
Flush Controller
Uncached Controller
The ID for each transaction is generated based on its source as follows:
Write Buffer: {1'b0, write_buffer_entry_index}
Flush Controller: {1'b1, flush_controller_index}
Uncached Controller: '1
To distinguish between flush transactions and uncached transactions, the flush transaction ID must include at least one 0.
Currently, the AXI ID is limited to 4 bits, while the flush controller supports 8 entries. As a result, when a transaction is sent from the 8th entry of the flush controller, all bits of the ID are set to 1. This causes the HPDcache to misroute the response to the uncached controller instead of the flush controller.
The parameter CVA6ConfigWtDcacheWbufDepth is used in the WB cache to set the number of flush entries. To avoid modifying the ID width, the number of flush entries must be less than 8. Non-power-of-two values are supported.
Co-authored-by: JeanRochCoulon <jean-roch.coulon@thalesgroup.com>
Fix decoding of some bitmanip instruction where decoding differs between rv32 and rv64.
Co-authored-by: JeanRochCoulon <jean-roch.coulon@thalesgroup.com>
#2691 extended the cva6_user_cfg_t struct by two new parameters to control the cache's flush behaviour. Add these new parameters to all configs to fix compilation errors due to incomplete struct literals.
This PR modifies some components in the CVA6 to fully support the WB mode of the HPDcache.
When on WB mode, there may be coherency issues between the Instruction Cache and the Data Cache. This may happen when the software writes on instruction segments (e.g. to relocate a code in memory).
This PR contains the following modifications:
The CVA6 controller module rises the flush signal to the caches when executing a fence or fence.i instruction.
The HPDcache cache subsystem translates this fence signal to a FLUSH request to the cache (when the HPDcache is in WB mode).
Add new parameters in the CVA6 configuration packages:
DcacheFlushOnInvalidate: It changes the behavior of the CVA6 controller. When this parameter is set, the controller rises the Flush signal on fence instructions.
DcacheInvalidateOnFlush: It changes the behavior of the HPDcache request adapter. When issuing a flush, it also asks the HPDcache to invalidate the cachelines.
Add additional values to the DcacheType enum: HPDCACHE_WT, HPDCACHE_WB, HPDCACHE_WT_WB
In addition, it also fixes some issues with the rvfi_mem_paddr signal from the store_buffer.
zero-extended the paddrs to match the axi_addr width and thus fix lint warnings. However, this breaks elaboration if AxiAddrWidth <= PLEN. To fix lint warnings without breaking parametrisation, use explicit casts to pad/truncate as required.
If the data user signal is disabled and the user bus width is reduced,
the slice operator into the user field will cause elaboration errors.
Since the faulty else block is anyways without effect, just remove it.
In RVH, interrupts are currently delegated if hxdeleg is set but mxdeleg
is not, violating the spec ("A trap/irq *that has been delegated to
HS-mode (using mxdeleg)* is further delegated to VS-mode if the
corresponding hxdeleg bit is set"). Fix and simplify the corresponding logic.
Integration of bitstream generation for Altera APU in general flow.
* Automatic generation of IPs and sources required for Altera FPGA
* Adaptation of bootrom code (UART used in Altera is different and needs a different driver)
* Generation of project for Quartus Pro adding required sources and constraints - Quartus Pro licence required by users
* Configuration file for openocd connection with vJTAG tap
* [CVXIF] Various fixes for bugs report with CVXIF's UVM agent
* Update options and simulators to support CVXIF's UVM agent
---------
Co-authored-by: ajalali <ayoub.jalali@external.thalesgroup.com>
Co-authored-by: André Sintzoff <61976467+ASintzoff@users.noreply.github.com>