Adding support for ZCMT Extension for Code-Size Reduction in CVA6 (#2659)

## Introduction
This PR implements the ZCMT extension in the CVA6 core, targeting 32-bit embedded-class platforms. ZCMT reduces code size by means of compressed table jump instructions (cm.jt and cm.jalt) that jump through a jump table.
**Note:** Due to its implementation complexity, the ZCMT extension is primarily targeted at embedded-class CPUs. Additionally, it is not compatible with architecture-class profiles (Ref. [Unprivileged spec 27.20](https://drive.google.com/file/d/1uviu1nH-tScFfgrovvFCrj7Omv8tFtkp/view)).

## Key additions

- Added the zcmt_decoder module for the compressed table jump instructions cm.jt (jump table) and cm.jalt (jump-and-link table)

- Implemented the Jump Vector Table (JVT) CSR in the csr_regfile module to hold the base address of the jump table

- Implemented a return address stack in the zcmt_decoder module: cm.jalt pushes the return address onto the stack, so it behaves like jal ra (jump-and-link with return address)

## Implementation in CVA6
The implementation of the ZCMT extension involves the following major modifications:

### compressed decoder 
The compressed decoder scans and identifies the cm.jt and cm.jalt instructions, and generates signals indicating that the instruction is both compressed and a ZCMT instruction.
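
As a rough illustration of that check, the following Python sketch classifies a 16-bit encoding. It is a software model, not the RTL: the function name `classify_zcmt` is hypothetical, and the split between cm.jt (index below 32) and cm.jalt (index 32 and above) is taken from the Zcmt specification. The compressed decoder in this PR only raises the ZCMT flag when bits [12:10] are 000.

```python
def classify_zcmt(instr: int):
    """Return (mnemonic, index) for a Zcmt table-jump encoding, else None.

    Hypothetical helper for illustration only; it models the bit fields,
    not the compressed_decoder RTL.
    """
    instr &= 0xFFFF
    opcode = instr & 0b11            # bits [1:0], 0b10 selects quadrant C2
    funct3 = (instr >> 13) & 0b111   # bits [15:13]
    sub = (instr >> 10) & 0b111      # bits [12:10], 000 for table jumps
    if opcode != 0b10 or funct3 != 0b101 or sub != 0b000:
        return None
    index = (instr >> 2) & 0xFF      # bits [9:2], jump table index
    return ("cm.jt" if index < 32 else "cm.jalt", index)
```

For example, `classify_zcmt(0xA016)` yields `("cm.jt", 5)`, while an encoding with any of bits [12:10] set is not a table jump.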

### zcmt_decoder
A new zcmt_decoder module decodes the cm.jt and cm.jalt instructions. It fetches the base address of the jump table from the JVT CSR, extracts the table index from the instruction, and constructs the corresponding jump instruction. Table 1 lists the IO ports of the zcmt_decoder module, and Figure 1 shows a high-level block diagram of the ZCMT implementation in CVA6.

_Table 1. IO port connections of the zcmt_decoder module_

Signal | IO | Description | Connection | Type
-- | -- | -- | -- | --
clk_i | in | Subsystem clock | SUBSYSTEM | logic
rst_ni | in | Asynchronous reset, active low | SUBSYSTEM | logic
instr_i | in | Instruction in | compressed_decoder | logic [31:0]
pc_i | in | Current PC | PC from FRONTEND | logic [CVA6Cfg.VLEN-1:0]
is_zcmt_instr_i | in | Instruction is a ZCMT instruction | compressed_decoder | logic
illegal_instr_i | in | Instruction is illegal | compressed_decoder | logic
is_compressed_i | in | Instruction is compressed | compressed_decoder | logic
jvt_i | in | JVT struct from CSR | CSR | jvt_t
req_port_i | in | Handshake between CACHE and FRONTEND (fetch) | Cache | dcache_req_o_t
instr_o | out | Instruction out | cvxif_compressed_if_driver | logic [31:0]
illegal_instr_o | out | Instruction is illegal | cvxif_compressed_if_driver | logic
is_compressed_o | out | Instruction is compressed | cvxif_compressed_if_driver | logic
fetch_stall_o | out | Stall signal | cvxif_compressed_if_driver | logic
req_port_o | out | Handshake between CACHE and FRONTEND (fetch) | Cache | dcache_req_i_t
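
The address calculation this module performs can be sketched in Python, following the pseudo-code of the Zcmt specification. This is a behavioural model under stated assumptions, not the zcmt_decoder RTL: the function names and the dictionary standing in for memory are hypothetical, and the RTL reads the table entry through the cache port rather than a lookup.

```python
def table_entry_address(jvt: int, index: int, xlen: int = 32) -> int:
    """Address of the jump table entry selected by `index` (hypothetical helper)."""
    mask = (1 << xlen) - 1
    base = jvt & mask & ~0x3F              # jvt.base: bits [XLEN-1:6], 64-byte aligned
    entry_shift = 2 if xlen == 32 else 3   # each entry is XLEN/8 bytes wide
    return (base + (index << entry_shift)) & mask


def resolve_table_jump(jvt: int, index: int, memory: dict, xlen: int = 32):
    """Return (target_address, writes_ra) for a table jump.

    `memory` is an assumed stand-in mapping entry addresses to entry values.
    """
    entry = memory[table_entry_address(jvt, index, xlen)]
    target = entry & ((1 << xlen) - 1) & ~0x1  # bit 0 of the entry is cleared
    writes_ra = index >= 32                    # cm.jalt links to ra, cm.jt does not
    return target, writes_ra
```

With a JVT value of 0x80001000 and index 3 on a 32-bit core, the entry is read from 0x8000100C; the low six bits of the JVT value (the mode field) do not contribute to the address.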

### branch unit condition
A condition is implemented in the branch unit to ensure that ZCMT instructions always cause a misprediction, forcing the program to jump to the calculated address of the newly constructed jump instruction.

### JVT CSR
A new JVT CSR is implemented in csr_regfile; it holds the base address of the jump vector table. The base address read from the JVT CSR is combined with the index value to calculate the effective address of the table entry.
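
A small Python model of the CSR behaviour, mirroring the read and write cases this PR adds to csr_regfile: a write keeps only bits [XLEN-1:6] as the base and forces the 6-bit mode field to zero, and a read returns base and mode concatenated. The class name `JvtCsr` is hypothetical; the address 0x017 is the CSR_JVT value added to the riscv package.

```python
class JvtCsr:
    """Behavioural sketch of the JVT CSR (not the RTL)."""

    ADDRESS = 0x017  # CSR_JVT

    def __init__(self, xlen: int = 32):
        self.xlen = xlen
        self.base = 0  # reset value is all zeros
        self.mode = 0

    def write(self, wdata: int) -> None:
        wdata &= (1 << self.xlen) - 1
        self.base = wdata >> 6  # keep bits [XLEN-1:6]
        self.mode = 0           # mode is always written as 6'b000000

    def read(self) -> int:
        return (self.base << 6) | self.mode  # {base, mode}
```

Writing 0x80001234 and reading back gives 0x80001200, because the six mode bits are dropped on the write.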

### No MMU
The embedded platform does not use an MMU, so zcmt_decoder is connected to the cache through port 0 of the Dcache module, which it uses for implicit read accesses to memory.
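
The port selection can be summarised with a short Python sketch of the generate conditions in this PR. The function names are hypothetical, and the returned strings merely label the two possible requesters.

```python
def port0_requester(rvzcmt: bool, mmu_present: bool) -> str:
    """Which stage drives D-cache port 0 (labels are illustrative)."""
    if rvzcmt and not mmu_present:
        return "id_stage"  # zcmt_decoder performs the implicit table read
    return "ex_stage"      # default connection, used by the MMU when present


def port0_is_high_priority(rvzcmt: bool, mmu_present: bool) -> bool:
    """Port 0 gets a read controller when the MMU or ZCMT needs it."""
    return mmu_present or rvzcmt
```

In the embedded configuration (ZCMT enabled, no MMU) `port0_requester(True, False)` returns `"id_stage"`; in every other combination the port stays with the execute stage.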

![zcmt_block drawio](https://github.com/user-attachments/assets/ac7bba75-4f56-42f4-9f5e-0c18f00d4dae)
_Figure 1. High-level block diagram of the ZCMT extension implementation_

## Known Limitations
The implementation targets 32-bit embedded-class platforms without an MMU. Because the core does not use an MMU, cache port 0, which the MMU would otherwise occupy, is reused to connect the zcmt_decoder to the cache.

## Testing and Verification

- Developed directed test cases to validate cm.jt and cm.jalt instruction functionality
- Verified correct initialization and updates of JVT CSR

### Test Plan 
A test plan was developed to test the functionality of the ZCMT extension together with the JVT CSR. Directed assembly tests were executed to check the functionality.

_Table 2. Test plan_

S.no | Features | Description | Pass/Fail Criteria | Test Type | Test status
-- | -- | -- | -- | ---- | --
1 | cm.jt | Simple assembly test to validate the working of the cm.jt instruction in CV32A60x. | Check against Spike's ref. model | Directed | Pass
2 | cm.jalt | Simple assembly test to validate the working of the cm.jalt instruction in CV32A60x. | Check against Spike's ref. model | Directed | Pass
3 | cm.jalt with return address stack | Simple assembly test to validate the working of the cm.jalt instruction with the return address stack in CV32A60x. It works as jump and link (jal ra, imm). | Check against Spike's ref. model | Directed | Pass
4 | JVT CSR | Read and write the base address of the jump table to the JVT CSR | Check against Spike's ref. model | Directed | Pass


**Note**: Please find the test under CVA6_REPO_DIR/verif/tests/custom/zcmt
Farhan Ali Shah 2025-01-27 17:23:26 +05:00 committed by GitHub
parent fb4a8d4472
commit 542fe39adc
42 changed files with 800 additions and 74 deletions

View file

@ -1,2 +1,2 @@
cv32a65x:
gates: 184701
gates: 184679

View file

@ -112,6 +112,7 @@ ${CVA6_REPO_DIR}/core/branch_unit.sv
${CVA6_REPO_DIR}/core/compressed_decoder.sv
${CVA6_REPO_DIR}/core/macro_decoder.sv
${CVA6_REPO_DIR}/core/controller.sv
${CVA6_REPO_DIR}/core/zcmt_decoder.sv
${CVA6_REPO_DIR}/core/csr_buffer.sv
${CVA6_REPO_DIR}/core/csr_regfile.sv
${CVA6_REPO_DIR}/core/decoder.sv

View file

@ -31,6 +31,8 @@ module branch_unit #(
input fu_data_t fu_data_i,
// Instruction PC - ISSUE_STAGE
input logic [CVA6Cfg.VLEN-1:0] pc_i,
// Is zcmt instruction - ISSUE_STAGE
input logic is_zcmt_i,
// Instruction is compressed - ISSUE_STAGE
input logic is_compressed_instr_i,
// Branch unit instruction is valid - ISSUE_STAGE
@ -74,13 +76,21 @@ module branch_unit #(
// we need to put the branch target address into rd, this is the result of this unit
branch_result_o = next_pc;
resolved_branch_o.pc = pc_i;
// There are only two sources of mispredicts:
// There are only three sources of mispredicts:
// 1. Branches
// 2. Jumps to register addresses
// 3. Zcmt instructions
if (branch_valid_i) begin
// write target address which goes to PC Gen
// write target address which goes to PC Gen or select target address if zcmt
resolved_branch_o.target_address = (branch_comp_res_i) ? target_address : next_pc;
resolved_branch_o.is_taken = branch_comp_res_i;
if (CVA6Cfg.RVZCMT) begin
if (is_zcmt_i) begin
// Unconditional jump handling
resolved_branch_o.is_mispredict = 1'b1; // miss prediction for ZCMT
resolved_branch_o.cf_type = ariane_pkg::JumpR;
end
end
// check the outcome of the branch speculation
if (ariane_pkg::op_is_branch(fu_data_i.operation)) begin
// Set the `cf_type` of the output as `branch`, this will update the BHT.

View file

@ -188,10 +188,10 @@ module wt_dcache
// read controllers (LD unit and PTW/MMU)
///////////////////////////////////////////////////////
// 0 is used by MMU, 1 by READ access requests
// 0 is used by MMU or implicit read by zcmt, 1 by READ access requests
for (genvar k = 0; k < NumPorts - 1; k++) begin : gen_rd_ports
// set these to high prio ports
if ((k == 0 && CVA6Cfg.MmuPresent) || (k == 1) || (k == 2 && CVA6Cfg.EnableAccelerator)) begin
if ((k == 0 && (CVA6Cfg.MmuPresent || CVA6Cfg.RVZCMT )) || (k == 1) || (k == 2 && CVA6Cfg.EnableAccelerator)) begin
assign rd_prio[k] = 1'b1;
wt_dcache_ctrl #(
.CVA6Cfg(CVA6Cfg),

View file

@ -31,7 +31,9 @@ module compressed_decoder #(
// Output instruction is macro - decoder
output logic is_macro_instr_o,
// Output instruction is compressed - decoder
output logic is_compressed_o
output logic is_compressed_o,
// Output instruction is zcmt - decoder
output logic is_zcmt_instr_o
);
// -------------------
@ -42,6 +44,7 @@ module compressed_decoder #(
is_compressed_o = 1'b1;
instr_o = instr_i;
is_macro_instr_o = 0;
is_zcmt_instr_o = 1'b0;
// I: | imm[11:0] | rs1 | funct3 | rd | opcode |
// S: | imm[11:5] | rs2 | rs1 | funct3 | imm[4:0] | opcode |
@ -867,18 +870,13 @@ module compressed_decoder #(
3'b000,
riscv::OpcodeStoreFp
};
end else if (CVA6Cfg.RVZCMP) begin
if (instr_i[12:10] == 3'b110 || instr_i[12:10] == 3'b111 || instr_i[12:10] == 3'b011) begin //is a push/pop instruction
is_macro_instr_o = 1;
instr_o = instr_i;
end else begin
illegal_instr_o = 1'b1;
end
end else begin
illegal_instr_o = 1'b1;
end
end else if (CVA6Cfg.RVZCMP && (instr_i[12:10] == 3'b110 || instr_i[12:10] == 3'b111 || instr_i[12:10] == 3'b011)) begin
is_macro_instr_o = 1;
instr_o = instr_i;
end else if (CVA6Cfg.RVZCMT && (instr_i[12:10] == 3'b000)) //jt/jalt instruction
is_zcmt_instr_o = 1'b1;
else illegal_instr_o = 1'b1;
end
riscv::OpcodeC2Swsp: begin
// c.swsp -> sw rs2, imm(x2)
instr_o = {

View file

@ -18,6 +18,7 @@ module csr_regfile
#(
parameter config_pkg::cva6_cfg_t CVA6Cfg = config_pkg::cva6_cfg_empty,
parameter type exception_t = logic,
parameter type jvt_t = logic,
parameter type irq_ctrl_t = logic,
parameter type scoreboard_entry_t = logic,
parameter type rvfi_probes_csr_t = logic,
@ -167,7 +168,9 @@ module csr_regfile
// TO_BE_COMPLETED - PERF_COUNTERS
output logic [31:0] mcountinhibit_o,
// RVFI
output rvfi_probes_csr_t rvfi_csr_o
output rvfi_probes_csr_t rvfi_csr_o,
//jvt output
output jvt_t jvt_o
);
localparam logic [63:0] SMODE_STATUS_READ_MASK = ariane_pkg::smode_status_read_mask(CVA6Cfg);
@ -295,6 +298,7 @@ module csr_regfile
assign pmpaddr_o = pmpaddr_q[(CVA6Cfg.NrPMPEntries>0?CVA6Cfg.NrPMPEntries-1 : 0):0];
riscv::fcsr_t fcsr_q, fcsr_d;
jvt_t jvt_q, jvt_d;
// ----------------
// Assignments
// ----------------
@ -350,6 +354,13 @@ module csr_regfile
read_access_exception = 1'b1;
end
end
riscv::CSR_JVT: begin
if (CVA6Cfg.RVZCMT) begin
csr_rdata = {jvt_q.base, jvt_q.mode};
end else begin
read_access_exception = 1'b1;
end
end
// non-standard extension
riscv::CSR_FTRAN: begin
if (CVA6Cfg.FpPresent && !(mstatus_q.fs == riscv::Off || (CVA6Cfg.RVH && v_q && vsstatus_q.fs == riscv::Off))) begin
@ -908,12 +919,14 @@ module csr_regfile
perf_we_o = 1'b0;
perf_data_o = 'b0;
if (CVA6Cfg.RVZCMT) begin
jvt_d = jvt_q;
end
fcsr_d = fcsr_q;
fcsr_d = fcsr_q;
priv_lvl_d = priv_lvl_q;
v_d = v_q;
debug_mode_d = debug_mode_q;
priv_lvl_d = priv_lvl_q;
v_d = v_q;
debug_mode_d = debug_mode_q;
if (CVA6Cfg.DebugEn) begin
dcsr_d = dcsr_q;
@ -1060,6 +1073,14 @@ module csr_regfile
riscv::CSR_DSCRATCH1:
if (CVA6Cfg.DebugEn) dscratch1_d = csr_wdata;
else update_access_exception = 1'b1;
riscv::CSR_JVT: begin
if (CVA6Cfg.RVZCMT) begin
jvt_d.base = csr_wdata[CVA6Cfg.XLEN-1:6];
jvt_d.mode = 6'b000000;
end else begin
update_access_exception = 1'b1;
end
end
// trigger module CSRs
riscv::CSR_TSELECT: update_access_exception = 1'b1; // not implemented
riscv::CSR_TDATA1: update_access_exception = 1'b1; // not implemented
@ -2444,8 +2465,16 @@ module csr_regfile
assign fflags_o = fcsr_q.fflags;
assign frm_o = fcsr_q.frm;
assign fprec_o = fcsr_q.fprec;
//JVT outputs
if (CVA6Cfg.RVZCMT) begin
assign jvt_o.base = jvt_q.base;
assign jvt_o.mode = jvt_q.mode;
end else begin
assign jvt_o.base = '0;
assign jvt_o.mode = '0;
end
// MMU outputs
assign satp_ppn_o = CVA6Cfg.RVS ? satp_q.ppn : '0;
assign satp_ppn_o = CVA6Cfg.RVS ? satp_q.ppn : '0;
assign vsatp_ppn_o = CVA6Cfg.RVH ? vsatp_q.ppn : '0;
assign hgatp_ppn_o = CVA6Cfg.RVH ? hgatp_q.ppn : '0;
if (CVA6Cfg.RVS) begin
@ -2510,6 +2539,9 @@ module csr_regfile
priv_lvl_q <= riscv::PRIV_LVL_M;
// floating-point registers
fcsr_q <= '0;
if (CVA6Cfg.RVZCMT) begin
jvt_q <= '0;
end
// debug signals
if (CVA6Cfg.DebugEn) begin
debug_mode_q <= 1'b0;
@ -2591,6 +2623,9 @@ module csr_regfile
priv_lvl_q <= priv_lvl_d;
// floating-point registers
fcsr_q <= fcsr_d;
if (CVA6Cfg.RVZCMT) begin
jvt_q <= jvt_d;
end
// debug signals
if (CVA6Cfg.DebugEn) begin
debug_mode_q <= debug_mode_d;
@ -2712,6 +2747,7 @@ module csr_regfile
// RVFI
//-------------
assign rvfi_csr_o.fcsr_q = CVA6Cfg.FpPresent ? fcsr_q : '0;
assign rvfi_csr_o.jvt_q = CVA6Cfg.RVZCMT ? jvt_q : '0;
assign rvfi_csr_o.dcsr_q = CVA6Cfg.DebugEn ? dcsr_q : '0;
assign rvfi_csr_o.dpc_q = CVA6Cfg.DebugEn ? dpc_q : '0;
assign rvfi_csr_o.dscratch0_q = CVA6Cfg.DebugEn ? dscratch0_q : '0;

View file

@ -86,6 +86,11 @@ module cva6
branchpredict_sbe_t branch_predict; // this field contains branch prediction information regarding the forward branch path
exception_t ex; // this field contains exceptions which might have happened earlier, e.g.: fetch exceptions
},
//JVT struct{base,mode}
localparam type jvt_t = struct packed {
logic [CVA6Cfg.XLEN-7:0] base;
logic [5:0] mode;
},
// ID/EX/WB Stage
localparam type scoreboard_entry_t = struct packed {
@ -113,6 +118,7 @@ module cva6
logic is_last_macro_instr; // is last decoded 32bit instruction of macro definition
logic is_double_rd_macro_instr; // is double move decoded 32bit instruction of macro definition
logic vfp; // is this a vector floating-point instruction?
logic is_zcmt; //is a zcmt instruction
},
localparam type writeback_t = struct packed {
logic valid; // wb data is valid
@ -415,6 +421,7 @@ module cva6
fu_data_t [CVA6Cfg.NrIssuePorts-1:0] fu_data_id_ex;
logic [CVA6Cfg.VLEN-1:0] pc_id_ex;
logic zcmt_id_ex;
logic is_compressed_instr_id_ex;
logic [CVA6Cfg.NrIssuePorts-1:0][31:0] tinst_ex;
// fixed latency units
@ -563,6 +570,8 @@ module cva6
riscv::pmpcfg_t [(CVA6Cfg.NrPMPEntries > 0 ? CVA6Cfg.NrPMPEntries-1 : 0):0] pmpcfg;
logic [(CVA6Cfg.NrPMPEntries > 0 ? CVA6Cfg.NrPMPEntries-1 : 0):0][CVA6Cfg.PLEN-3:0] pmpaddr;
logic [31:0] mcountinhibit_csr_perf;
//jvt
jvt_t jvt;
// ----------------------------
// Performance Counters <-> *
// ----------------------------
@ -617,6 +626,8 @@ module cva6
// ----------------
dcache_req_i_t [2:0] dcache_req_ports_ex_cache;
dcache_req_o_t [2:0] dcache_req_ports_cache_ex;
dcache_req_i_t dcache_req_ports_id_cache;
dcache_req_o_t dcache_req_ports_cache_id;
dcache_req_i_t [1:0] dcache_req_ports_acc_cache;
dcache_req_o_t [1:0] dcache_req_ports_cache_acc;
logic dcache_commit_wbuffer_empty;
@ -671,8 +682,11 @@ module cva6
id_stage #(
.CVA6Cfg(CVA6Cfg),
.branchpredict_sbe_t(branchpredict_sbe_t),
.dcache_req_i_t(dcache_req_i_t),
.dcache_req_o_t(dcache_req_o_t),
.exception_t(exception_t),
.fetch_entry_t(fetch_entry_t),
.jvt_t(jvt_t),
.irq_ctrl_t(irq_ctrl_t),
.scoreboard_entry_t(scoreboard_entry_t),
.interrupts_t(interrupts_t),
@ -716,7 +730,11 @@ module cva6
.compressed_ready_i(x_compressed_ready),
.compressed_resp_i (x_compressed_resp),
.compressed_valid_o(x_compressed_valid),
.compressed_req_o (x_compressed_req)
.compressed_req_o (x_compressed_req),
.jvt_i (jvt),
// DCACHE interfaces
.dcache_req_ports_i(dcache_req_ports_cache_id),
.dcache_req_ports_o(dcache_req_ports_id_cache)
);
logic [CVA6Cfg.NrWbPorts-1:0][CVA6Cfg.TRANS_ID_BITS-1:0] trans_id_ex_id;
@ -817,6 +835,7 @@ module cva6
.rs2_forwarding_o (rs2_forwarding_id_ex),
.fu_data_o (fu_data_id_ex),
.pc_o (pc_id_ex),
.is_zcmt_o (zcmt_id_ex),
.is_compressed_instr_o (is_compressed_instr_id_ex),
.tinst_o (tinst_ex),
// fixed latency unit ready
@ -908,6 +927,7 @@ module cva6
.rs2_forwarding_i(rs2_forwarding_id_ex),
.fu_data_i(fu_data_id_ex),
.pc_i(pc_id_ex),
.is_zcmt_i(zcmt_id_ex),
.is_compressed_instr_i(is_compressed_instr_id_ex),
.tinst_i(tinst_ex),
// fixed latency units
@ -1078,6 +1098,7 @@ module cva6
csr_regfile #(
.CVA6Cfg (CVA6Cfg),
.exception_t (exception_t),
.jvt_t (jvt_t),
.irq_ctrl_t (irq_ctrl_t),
.scoreboard_entry_t(scoreboard_entry_t),
.rvfi_probes_csr_t (rvfi_probes_csr_t),
@ -1154,6 +1175,7 @@ module cva6
.pmpcfg_o (pmpcfg),
.pmpaddr_o (pmpaddr),
.mcountinhibit_o (mcountinhibit_csr_perf),
.jvt_o (jvt),
//RVFI
.rvfi_csr_o (rvfi_csr)
);
@ -1258,15 +1280,29 @@ module cva6
dcache_req_o_t [NumPorts-1:0] dcache_req_from_cache;
// D$ request
assign dcache_req_to_cache[0] = dcache_req_ports_ex_cache[0];
// ZCMT is only enabled for the embedded class, so the MMU should be disabled.
// Cache port 0 is used for the implicit read access of the ZCMT extension.
if (CVA6Cfg.RVZCMT & ~(CVA6Cfg.MmuPresent)) begin
assign dcache_req_to_cache[0] = dcache_req_ports_id_cache;
end else begin
assign dcache_req_to_cache[0] = dcache_req_ports_ex_cache[0];
end
assign dcache_req_to_cache[1] = dcache_req_ports_ex_cache[1];
assign dcache_req_to_cache[2] = dcache_req_ports_acc_cache[0];
assign dcache_req_to_cache[3] = dcache_req_ports_ex_cache[2].data_req ? dcache_req_ports_ex_cache [2] :
dcache_req_ports_acc_cache[1];
// D$ response
assign dcache_req_ports_cache_ex[0] = dcache_req_from_cache[0];
assign dcache_req_ports_cache_ex[1] = dcache_req_from_cache[1];
// ZCMT is only enabled for the embedded class, so the MMU should be disabled.
// Cache port 0 is used for the implicit read access of the ZCMT extension.
if (CVA6Cfg.RVZCMT & ~(CVA6Cfg.MmuPresent)) begin
assign dcache_req_ports_cache_id = dcache_req_from_cache[0];
assign dcache_req_ports_cache_ex[0] = '0;
end else begin
assign dcache_req_ports_cache_ex[0] = dcache_req_from_cache[0];
assign dcache_req_ports_cache_id = '0;
end
assign dcache_req_ports_cache_ex[1] = dcache_req_from_cache[1];
assign dcache_req_ports_cache_acc[0] = dcache_req_from_cache[2];
always_comb begin : gen_dcache_req_store_data_gnt
dcache_req_ports_cache_ex[2] = dcache_req_from_cache[3];

View file

@ -418,7 +418,7 @@ module cva6_rvfi
`CONNECT_RVFI_SAME(1'b1, icache)
`CONNECT_RVFI_SAME(CVA6Cfg.EnableAccelerator, acc_cons)
`CONNECT_RVFI_SAME(CVA6Cfg.RVZCMT, jvt)
`CONNECT_RVFI_FULL(1'b1, pmpcfg0, csr.pmpcfg_q[CVA6Cfg.XLEN/8-1:0])
`CONNECT_RVFI_FULL(CVA6Cfg.XLEN == 32, pmpcfg1, csr.pmpcfg_q[7:4])

View file

@ -48,6 +48,10 @@ module decoder
input logic is_last_macro_instr_i,
// Is mvsa01/mva01s macro instruction - macro_decoder
input logic is_double_rd_macro_instr_i,
// Zcmt instruction - FRONTEND
input logic is_zcmt_i,
// Jump address - zcmt_decoder
input logic [CVA6Cfg.XLEN-1:0] jump_address_i,
// Is a branch predict instruction - FRONTEND
input branchpredict_sbe_t branch_predict_i,
// If an exception occured in fetch stage - FRONTEND
@ -178,6 +182,7 @@ module decoder
instruction_o.use_zimm = 1'b0;
instruction_o.bp = branch_predict_i;
instruction_o.vfp = 1'b0;
instruction_o.is_zcmt = is_zcmt_i;
ecall = 1'b0;
ebreak = 1'b0;
check_fprm = 1'b0;
@ -1500,13 +1505,18 @@ module decoder
imm_u_type = {
{CVA6Cfg.XLEN - 32{instruction_i[31]}}, instruction_i[31:12], 12'b0
}; // JAL, AUIPC, sign extended to 64 bit
imm_uj_type = {
{CVA6Cfg.XLEN - 20{instruction_i[31]}},
instruction_i[19:12],
instruction_i[20],
instruction_i[30:21],
1'b0
};
// if zcmt, assign the XLEN-wide jump address to the immediate
if (CVA6Cfg.RVZCMT && is_zcmt_i) begin
imm_uj_type = {{CVA6Cfg.XLEN - 32{jump_address_i[31]}}, jump_address_i[31:0]};
end else begin
imm_uj_type = {
{CVA6Cfg.XLEN - 20{instruction_i[31]}},
instruction_i[19:12],
instruction_i[20],
instruction_i[30:21],
1'b0
};
end
// NOIMM, IIMM, SIMM, SBIMM, UIMM, JIMM, RS3
// select immediate

View file

@ -47,6 +47,8 @@ module ex_stage
input fu_data_t [CVA6Cfg.NrIssuePorts-1:0] fu_data_i,
// PC of the current instruction - ISSUE_STAGE
input logic [CVA6Cfg.VLEN-1:0] pc_i,
// Is_zcmt instruction - ISSUE_STAGE
input logic is_zcmt_i,
// Report whether instruction is compressed - ISSUE_STAGE
input logic is_compressed_instr_i,
// Report instruction encoding - ISSUE_STAGE
@ -320,6 +322,7 @@ module ex_stage
.debug_mode_i,
.fu_data_i (one_cycle_data),
.pc_i,
.is_zcmt_i,
.is_compressed_instr_i,
.branch_valid_i (|branch_valid_i),
.branch_comp_res_i (alu_branch_res),

View file

@ -16,8 +16,11 @@
module id_stage #(
parameter config_pkg::cva6_cfg_t CVA6Cfg = config_pkg::cva6_cfg_empty,
parameter type branchpredict_sbe_t = logic,
parameter type dcache_req_i_t = logic,
parameter type dcache_req_o_t = logic,
parameter type exception_t = logic,
parameter type fetch_entry_t = logic,
parameter type jvt_t = logic,
parameter type irq_ctrl_t = logic,
parameter type scoreboard_entry_t = logic,
parameter type interrupts_t = logic,
@ -83,9 +86,15 @@ module id_stage #(
// CVXIF Compressed interface
input logic [CVA6Cfg.XLEN-1:0] hart_id_i,
input logic compressed_ready_i,
//JVT
input jvt_t jvt_i,
input x_compressed_resp_t compressed_resp_i,
output logic compressed_valid_o,
output x_compressed_req_t compressed_req_o
output x_compressed_req_t compressed_req_o,
// Data cache request output - CACHE
input dcache_req_o_t dcache_req_ports_i,
// Data cache request input - CACHE
output dcache_req_i_t dcache_req_ports_o
);
// ID/ISSUE register stage
typedef struct packed {
@ -102,20 +111,23 @@ module id_stage #(
logic [CVA6Cfg.NrIssuePorts-1:0] is_illegal;
logic [CVA6Cfg.NrIssuePorts-1:0] is_illegal_cmp;
logic [CVA6Cfg.NrIssuePorts-1:0] is_illegal_cvxif;
logic [CVA6Cfg.NrIssuePorts-1:0][31:0] instruction;
logic [CVA6Cfg.NrIssuePorts-1:0][31:0] compressed_instr;
logic [CVA6Cfg.NrIssuePorts-1:0][31:0] instruction_cvxif;
logic [CVA6Cfg.NrIssuePorts-1:0] is_compressed;
logic [CVA6Cfg.NrIssuePorts-1:0] is_compressed_cmp;
logic [CVA6Cfg.NrIssuePorts-1:0] is_compressed_cvxif;
logic [CVA6Cfg.NrIssuePorts-1:0] is_macro_instr_i;
logic [CVA6Cfg.NrIssuePorts-1:0] stall_instr_fetch;
logic stall_macro_deco;
logic is_last_macro_instr_o;
logic is_double_rd_macro_instr_o;
logic [CVA6Cfg.NrIssuePorts-1:0] is_illegal_cvxif, is_illegal_cvxif_zcmp, is_illegal_cvxif_zcmt;
logic [CVA6Cfg.NrIssuePorts-1:0][31:0] instruction;
logic [CVA6Cfg.NrIssuePorts-1:0][31:0] compressed_instr;
logic [CVA6Cfg.NrIssuePorts-1:0][31:0]
instruction_cvxif, instruction_cvxif_zcmp, instruction_cvxif_zcmt;
logic [CVA6Cfg.NrIssuePorts-1:0] is_compressed;
logic [CVA6Cfg.NrIssuePorts-1:0] is_compressed_cmp;
logic [CVA6Cfg.NrIssuePorts-1:0]
is_compressed_cvxif, is_compressed_cvxif_zcmp, is_compressed_cvxif_zcmt;
logic [CVA6Cfg.NrIssuePorts-1:0] is_macro_instr_i;
logic [CVA6Cfg.NrIssuePorts-1:0] stall_instr_fetch;
logic stall_macro_deco, stall_macro_deco_zcmp, stall_macro_deco_zcmt;
logic is_last_macro_instr_o;
logic is_double_rd_macro_instr_o;
logic [CVA6Cfg.NrIssuePorts-1:0] is_zcmt_instr;
logic [ CVA6Cfg.XLEN-1:0] jump_address;
if (CVA6Cfg.RVC) begin
// ---------------------------------------------------------
@ -129,28 +141,62 @@ module id_stage #(
.instr_o (compressed_instr[i]),
.illegal_instr_o (is_illegal[i]),
.is_compressed_o (is_compressed[i]),
.is_macro_instr_o(is_macro_instr_i[i])
.is_macro_instr_o(is_macro_instr_i[i]),
.is_zcmt_instr_o (is_zcmt_instr[i])
);
end
if (CVA6Cfg.RVZCMP) begin
if (CVA6Cfg.RVZCMP || (CVA6Cfg.RVZCMT & ~CVA6Cfg.MmuPresent)) begin //MMU should be off when using ZCMT
//sequencial decoder
macro_decoder #(
.CVA6Cfg(CVA6Cfg)
) macro_decoder_i (
.instr_i (compressed_instr[0]),
.is_macro_instr_i (is_macro_instr_i[0]),
.clk_i (clk_i),
.rst_ni (rst_ni),
.instr_o (instruction_cvxif[0]),
.illegal_instr_i (is_illegal[0]),
.is_compressed_i (is_compressed[0]),
.issue_ack_i (issue_instr_ack_i[0]),
.illegal_instr_o (is_illegal_cvxif[0]),
.is_compressed_o (is_compressed_cvxif[0]),
.fetch_stall_o (stall_macro_deco),
.is_last_macro_instr_o (is_last_macro_instr_o),
.is_double_rd_macro_instr_o(is_double_rd_macro_instr_o)
);
if (CVA6Cfg.RVZCMP) begin
macro_decoder #(
.CVA6Cfg(CVA6Cfg)
) macro_decoder_i (
.instr_i (compressed_instr[0]),
.is_macro_instr_i (is_macro_instr_i[0]),
.clk_i (clk_i),
.rst_ni (rst_ni),
.instr_o (instruction_cvxif_zcmp),
.illegal_instr_i (is_illegal[0]),
.is_compressed_i (is_compressed[0]),
.issue_ack_i (issue_instr_ack_i[0]),
.illegal_instr_o (is_illegal_cvxif_zcmp),
.is_compressed_o (is_compressed_cvxif_zcmp),
.fetch_stall_o (stall_macro_deco_zcmp),
.is_last_macro_instr_o (is_last_macro_instr_o),
.is_double_rd_macro_instr_o(is_double_rd_macro_instr_o)
);
end
if (CVA6Cfg.RVZCMT) begin
zcmt_decoder #(
.CVA6Cfg(CVA6Cfg),
.dcache_req_i_t(dcache_req_i_t),
.dcache_req_o_t(dcache_req_o_t),
.jvt_t(jvt_t),
.branchpredict_sbe_t(branchpredict_sbe_t)
) zcmt_decoder_i (
.instr_i (compressed_instr[0]),
.pc_i (fetch_entry_i[0].address),
.is_zcmt_instr_i(is_zcmt_instr[0]),
.clk_i (clk_i),
.rst_ni (rst_ni),
.instr_o (instruction_cvxif_zcmt),
.illegal_instr_i(is_illegal[0]),
.is_compressed_i(is_compressed[0]),
.illegal_instr_o(is_illegal_cvxif_zcmt),
.is_compressed_o(is_compressed_cvxif_zcmt),
.fetch_stall_o (stall_macro_deco_zcmt),
.jvt_i (jvt_i),
.req_port_i (dcache_req_ports_i),
.req_port_o (dcache_req_ports_o),
.jump_address_o (jump_address)
);
end else assign jump_address = '0;
assign instruction_cvxif[0] = is_zcmt_instr[0] ? instruction_cvxif_zcmt : instruction_cvxif_zcmp;
assign is_illegal_cvxif[0] = is_zcmt_instr[0] ? is_illegal_cvxif_zcmt : is_illegal_cvxif_zcmp;
assign is_compressed_cvxif[0] = is_zcmt_instr[0] ? is_compressed_cvxif_zcmt : is_compressed_cvxif_zcmp;
assign stall_macro_deco = is_zcmt_instr[0] ? stall_macro_deco_zcmt : stall_macro_deco_zcmp;
if (CVA6Cfg.SuperscalarEn) begin
assign instruction_cvxif[CVA6Cfg.NrIssuePorts-1] = '0;
assign is_illegal_cvxif[CVA6Cfg.NrIssuePorts-1] = '0;
@ -203,6 +249,7 @@ module id_stage #(
);
assign is_last_macro_instr_o = '0;
assign is_double_rd_macro_instr_o = '0;
assign jump_address = '0;
end
end else begin
for (genvar i = 0; i < CVA6Cfg.NrIssuePorts; i++) begin
@ -211,6 +258,8 @@ module id_stage #(
assign is_illegal_cmp = '0;
assign is_compressed_cmp = '0;
assign is_macro_instr_i = '0;
assign is_zcmt_instr = '0;
assign jump_address = '0;
assign is_last_macro_instr_o = '0;
assign is_double_rd_macro_instr_o = '0;
if (CVA6Cfg.CvxifEn) begin
@ -240,8 +289,10 @@ module id_stage #(
.pc_i (fetch_entry_i[i].address),
.is_compressed_i (is_compressed_cmp[i]),
.is_macro_instr_i (is_macro_instr_i[i]),
.is_zcmt_i (is_zcmt_instr[i]),
.is_last_macro_instr_i (is_last_macro_instr_o),
.is_double_rd_macro_instr_i(is_double_rd_macro_instr_o),
.jump_address_i (jump_address),
.is_illegal_i (is_illegal_cmp[i]),
.instruction_i (instruction[i]),
.compressed_instr_i (fetch_entry_i[i].instruction[15:0]),
@ -351,3 +402,4 @@ module id_stage #(
end
end
endmodule

View file

@ -70,6 +70,7 @@ package build_config_pkg;
cfg.RVC = CVA6Cfg.RVC;
cfg.RVH = CVA6Cfg.RVH;
cfg.RVZCB = CVA6Cfg.RVZCB;
cfg.RVZCMT = CVA6Cfg.RVZCMT;
cfg.RVZCMP = CVA6Cfg.RVZCMP;
cfg.XFVec = CVA6Cfg.XFVec;
cfg.CvxifEn = CVA6Cfg.CvxifEn;

View file

@ -68,6 +68,8 @@ package config_pkg;
bit RVZCB;
// Zcmp RISC-V extension
bit RVZCMP;
// Zcmt RISC-V extension
bit RVZCMT;
// Zicond RISC-V extension
bit RVZiCond;
// Zicntr RISC-V extension
@ -258,6 +260,7 @@ package config_pkg;
bit RVH;
bit RVZCB;
bit RVZCMP;
bit RVZCMT;
bit XFVec;
bit CvxifEn;
bit RVZiCond;
@ -396,6 +399,8 @@ package config_pkg;
// Software Interrupt can be disabled when there is only M machine mode in CVA6.
assert (!(Cfg.RVS && !Cfg.SoftwareInterruptEn));
assert (!(Cfg.RVH && !Cfg.SoftwareInterruptEn));
assert (!(Cfg.SuperscalarEn && Cfg.RVZCMT));
assert (!(Cfg.RVZCMT && ~Cfg.MmuPresent));
// pragma translate_on
endfunction

View file

@ -43,6 +43,7 @@ package cva6_config_pkg;
RVV: bit'(0),
RVC: bit'(1),
RVH: bit'(0),
RVZCMT: bit'(1),
RVZCB: bit'(1),
RVZCMP: bit'(1),
XFVec: bit'(0),

View file

@ -43,6 +43,7 @@ package cva6_config_pkg;
RVV: bit'(0),
RVC: bit'(1),
RVH: bit'(0),
RVZCMT: bit'(0),
RVZCB: bit'(1),
RVZCMP: bit'(0),
XFVec: bit'(0),

View file

@ -100,6 +100,7 @@ package cva6_config_pkg;
RVZCB: bit'(CVA6ConfigZcbExtEn),
RVZCMP: bit'(CVA6ConfigZcmpExtEn),
XFVec: bit'(CVA6ConfigFVecEn),
RVZCMT: bit'(0),
CvxifEn: bit'(CVA6ConfigCvxifEn),
RVZiCond: bit'(CVA6ConfigRVZiCond),
RVZicntr: bit'(1),

View file

@ -96,6 +96,7 @@ package cva6_config_pkg;
RVC: bit'(CVA6ConfigCExtEn),
RVH: bit'(CVA6ConfigHExtEn),
RVZCB: bit'(CVA6ConfigZcbExtEn),
RVZCMT: bit'(0),
RVZCMP: bit'(CVA6ConfigZcmpExtEn),
XFVec: bit'(CVA6ConfigFVecEn),
CvxifEn: bit'(CVA6ConfigCvxifEn),

View file

@ -96,6 +96,7 @@ package cva6_config_pkg;
RVC: bit'(CVA6ConfigCExtEn),
RVH: bit'(CVA6ConfigHExtEn),
RVZCB: bit'(CVA6ConfigZcbExtEn),
RVZCMT: bit'(0),
RVZCMP: bit'(CVA6ConfigZcmpExtEn),
XFVec: bit'(CVA6ConfigFVecEn),
CvxifEn: bit'(CVA6ConfigCvxifEn),

View file

@ -95,6 +95,7 @@ package cva6_config_pkg;
RVC: bit'(CVA6ConfigCExtEn),
RVH: bit'(CVA6ConfigHExtEn),
RVZCB: bit'(CVA6ConfigZcbExtEn),
RVZCMT: bit'(0),
RVZCMP: bit'(CVA6ConfigZcmpExtEn),
XFVec: bit'(CVA6ConfigFVecEn),
CvxifEn: bit'(CVA6ConfigCvxifEn),

View file

@ -97,6 +97,7 @@ package cva6_config_pkg;
RVH: bit'(CVA6ConfigHExtEn),
RVZCB: bit'(CVA6ConfigZcbExtEn),
RVZCMP: bit'(CVA6ConfigZcmpExtEn),
RVZCMT: bit'(0),
XFVec: bit'(CVA6ConfigFVecEn),
CvxifEn: bit'(CVA6ConfigCvxifEn),
RVZiCond: bit'(CVA6ConfigRVZiCond),

View file

@ -100,6 +100,7 @@ package cva6_config_pkg;
RVH: bit'(CVA6ConfigHExtEn),
RVZCB: bit'(CVA6ConfigZcbExtEn),
RVZCMP: bit'(CVA6ConfigZcmpExtEn),
RVZCMT: bit'(0),
XFVec: bit'(CVA6ConfigFVecEn),
CvxifEn: bit'(CVA6ConfigCvxifEn),
RVZiCond: bit'(CVA6ConfigRVZiCond),

View file

@ -99,6 +99,7 @@ package cva6_config_pkg;
RVC: bit'(CVA6ConfigCExtEn),
RVH: bit'(CVA6ConfigHExtEn),
RVZCB: bit'(CVA6ConfigZcbExtEn),
RVZCMT: bit'(0),
RVZCMP: bit'(CVA6ConfigZcmpExtEn),
XFVec: bit'(CVA6ConfigFVecEn),
CvxifEn: bit'(CVA6ConfigCvxifEn),

View file

@ -107,6 +107,7 @@ package cva6_config_pkg;
RVH: bit'(CVA6ConfigHExtEn),
RVZCB: bit'(CVA6ConfigZcbExtEn),
RVZCMP: bit'(CVA6ConfigZcmpExtEn),
RVZCMT: bit'(0),
XFVec: bit'(CVA6ConfigFVecEn),
CvxifEn: bit'(CVA6ConfigCvxifEn),
RVZiCond: bit'(CVA6ConfigRVZiCond),

View file

@ -107,6 +107,7 @@ package cva6_config_pkg;
RVH: bit'(CVA6ConfigHExtEn),
RVZCB: bit'(CVA6ConfigZcbExtEn),
RVZCMP: bit'(CVA6ConfigZcmpExtEn),
RVZCMT: bit'(0),
XFVec: bit'(CVA6ConfigFVecEn),
CvxifEn: bit'(CVA6ConfigCvxifEn),
RVZiCond: bit'(CVA6ConfigRVZiCond),

View file

@ -99,6 +99,7 @@ package cva6_config_pkg;
RVC: bit'(CVA6ConfigCExtEn),
RVH: bit'(CVA6ConfigHExtEn),
RVZCB: bit'(CVA6ConfigZcbExtEn),
RVZCMT: bit'(0),
RVZCMP: bit'(CVA6ConfigZcmpExtEn),
XFVec: bit'(CVA6ConfigFVecEn),
CvxifEn: bit'(CVA6ConfigCvxifEn),

View file

@ -99,6 +99,7 @@ package cva6_config_pkg;
RVC: bit'(CVA6ConfigCExtEn),
RVH: bit'(CVA6ConfigHExtEn),
RVZCB: bit'(CVA6ConfigZcbExtEn),
RVZCMT: bit'(0),
RVZCMP: bit'(CVA6ConfigZcmpExtEn),
XFVec: bit'(CVA6ConfigFVecEn),
CvxifEn: bit'(CVA6ConfigCvxifEn),

View file

@ -99,6 +99,7 @@ package cva6_config_pkg;
RVC: bit'(CVA6ConfigCExtEn),
RVH: bit'(CVA6ConfigHExtEn),
RVZCB: bit'(CVA6ConfigZcbExtEn),
RVZCMT: bit'(0),
RVZCMP: bit'(CVA6ConfigZcmpExtEn),
XFVec: bit'(CVA6ConfigFVecEn),
CvxifEn: bit'(CVA6ConfigCvxifEn),

View file

@ -99,6 +99,7 @@ package cva6_config_pkg;
RVC: bit'(CVA6ConfigCExtEn),
RVH: bit'(CVA6ConfigHExtEn),
RVZCB: bit'(CVA6ConfigZcbExtEn),
RVZCMT: bit'(0),
RVZCMP: bit'(CVA6ConfigZcmpExtEn),
XFVec: bit'(CVA6ConfigFVecEn),
CvxifEn: bit'(CVA6ConfigCvxifEn),

View file

@ -100,6 +100,7 @@ package cva6_config_pkg;
RVH: bit'(CVA6ConfigHExtEn),
RVZCB: bit'(CVA6ConfigZcbExtEn),
RVZCMP: bit'(CVA6ConfigZcmpExtEn),
RVZCMT: bit'(0),
XFVec: bit'(CVA6ConfigFVecEn),
CvxifEn: bit'(CVA6ConfigCvxifEn),
RVZiCond: bit'(CVA6ConfigRVZiCond),

View file

@ -52,6 +52,7 @@ package cva6_config_pkg;
RVH: bit'(0),
RVZCB: bit'(1),
RVZCMP: bit'(0),
RVZCMT: bit'(0),
XFVec: bit'(0),
CvxifEn: bit'(1),
RVZiCond: bit'(0),

View file

@ -385,6 +385,8 @@ package riscv;
CSR_FFLAGS = 12'h001,
CSR_FRM = 12'h002,
CSR_FCSR = 12'h003,
//jvt
CSR_JVT = 12'h017,
CSR_FTRAN = 12'h800,
// Vector CSRs
CSR_VSTART = 12'h008,
@ -724,6 +726,8 @@ package riscv;
localparam logic [63:0] SSTATUS_MXR = 'h00080000;
localparam logic [63:0] SSTATUS_UPIE = 'h00000010;
localparam logic [63:0] SSTATUS_UXL = 64'h0000000300000000;
// CSR Bit Implementation Masks
function automatic logic [63:0] sstatus_sd(logic IS_XLEN64);
return {IS_XLEN64, 31'h00000000, ~IS_XLEN64, 31'h00000000};
endfunction

@ -39,6 +39,7 @@
rvfi_csr_elmt_t fflags; \
rvfi_csr_elmt_t frm; \
rvfi_csr_elmt_t fcsr; \
rvfi_csr_elmt_t jvt; \
rvfi_csr_elmt_t ftran; \
rvfi_csr_elmt_t dcsr; \
rvfi_csr_elmt_t dpc; \
@ -130,6 +131,7 @@
`define RVFI_PROBES_CSR_T(Cfg) struct packed { \
riscv::fcsr_t fcsr_q; \
riscv::dcsr_t dcsr_q; \
logic [Cfg.XLEN-1:0] jvt_q; \
logic [Cfg.XLEN-1:0] dpc_q; \
logic [Cfg.XLEN-1:0] dscratch0_q; \
logic [Cfg.XLEN-1:0] dscratch1_q; \

@ -56,6 +56,8 @@ module issue_read_operands
output logic [CVA6Cfg.NrIssuePorts-1:0][CVA6Cfg.XLEN-1:0] rs2_forwarding_o,
// Program Counter - EX_STAGE
output logic [CVA6Cfg.VLEN-1:0] pc_o,
// Is ZCMT instruction - EX_STAGE
output logic is_zcmt_o,
// Is compressed instruction - EX_STAGE
output logic is_compressed_instr_o,
// Fixed Latency Unit is ready - EX_STAGE
@ -119,7 +121,6 @@ module issue_read_operands
input logic [CVA6Cfg.NrCommitPorts-1:0] we_gpr_i,
// FPR write enable - COMMIT_STAGE
input logic [CVA6Cfg.NrCommitPorts-1:0] we_fpr_i,
// Issue stall - PERF_COUNTERS
output logic stall_issue_o
);
@ -1140,6 +1141,7 @@ module issue_read_operands
tinst_q <= '0;
end
pc_o <= '0;
is_zcmt_o <= '0;
is_compressed_instr_o <= 1'b0;
branch_predict_o <= {cf_t'(0), {CVA6Cfg.VLEN{1'b0}}};
x_transaction_rejected_o <= 1'b0;
@ -1148,10 +1150,24 @@ module issue_read_operands
if (CVA6Cfg.RVH) begin
tinst_q <= tinst_n;
end
pc_o <= pc_n;
is_compressed_instr_o <= is_compressed_instr_n;
branch_predict_o <= branch_predict_n;
x_transaction_rejected_o <= x_transaction_rejected_n;
if (CVA6Cfg.SuperscalarEn) begin
if (issue_instr_i[1].fu == CTRL_FLOW) begin
pc_o <= issue_instr_i[1].pc;
is_compressed_instr_o <= issue_instr_i[1].is_compressed;
branch_predict_o <= issue_instr_i[1].bp;
end
end
if (issue_instr_i[0].fu == CTRL_FLOW) begin
pc_o <= issue_instr_i[0].pc;
is_compressed_instr_o <= issue_instr_i[0].is_compressed;
branch_predict_o <= issue_instr_i[0].bp;
if (CVA6Cfg.RVZCMT) is_zcmt_o <= issue_instr_i[0].is_zcmt;
else is_zcmt_o <= '0;
end
x_transaction_rejected_o <= 1'b0;
if (issue_instr_i[0].fu == CVXIF) begin
x_transaction_rejected_o <= x_transaction_rejected;
end
end
end

@ -60,6 +60,8 @@ module issue_stage
output fu_data_t [CVA6Cfg.NrIssuePorts-1:0] fu_data_o,
// Program Counter - EX_STAGE
output logic [CVA6Cfg.VLEN-1:0] pc_o,
// Is zcmt instruction - EX_STAGE
output logic is_zcmt_o,
// Is compressed instruction - EX_STAGE
output logic is_compressed_instr_o,
// Transformed trap instruction - EX_STAGE
@ -263,6 +265,7 @@ module issue_stage
.rs1_forwarding_o (rs1_forwarding_xlen),
.rs2_forwarding_o (rs2_forwarding_xlen),
.pc_o,
.is_zcmt_o,
.is_compressed_instr_o,
.flu_ready_i (flu_ready_i),
.alu_valid_o (alu_valid_o),

core/zcmt_decoder.sv Normal file
@ -0,0 +1,131 @@
// Licensed under the Solderpad Hardware Licence, Version 2.1 (the "License");
// you may not use this file except in compliance with the License.
// SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1
// You may obtain a copy of the License at https://solderpad.org/licenses/
//
// Author: Farhan Ali Shah, 10xEngineers
// Date: 15.11.2024
// Description: ZCMT extension in the CVA6 core, targeting the 32-bit embedded-class platforms (CV32A60x).
// ZCMT is a code-size reduction feature that uses compressed table jump instructions (cm.jt and cm.jalt)
// to reduce code size for embedded systems.
//
module zcmt_decoder #(
parameter config_pkg::cva6_cfg_t CVA6Cfg = config_pkg::cva6_cfg_empty,
parameter type dcache_req_i_t = logic,
parameter type dcache_req_o_t = logic,
parameter type jvt_t = logic,
parameter type branchpredict_sbe_t = logic
) (
// Subsystem Clock - SUBSYSTEM
input logic clk_i,
// Asynchronous reset active low - SUBSYSTEM
input logic rst_ni,
// Instruction input - compressed_decoder
input logic [ 31:0] instr_i,
// Current PC - FRONTEND
input logic [CVA6Cfg.VLEN-1:0] pc_i,
// Instruction belongs to the ZCMT extension - compressed_decoder
input logic is_zcmt_instr_i,
// Instruction is illegal - compressed_decoder
input logic illegal_instr_i,
// Instruction is compressed - compressed_decoder
input logic is_compressed_i,
// JVT struct input - CSR
input jvt_t jvt_i,
// Data cache request output - CACHE
input dcache_req_o_t req_port_i,
// Instruction out - cvxif_compressed_if_driver
output logic [ 31:0] instr_o,
// Instruction is illegal out - cvxif_compressed_if_driver
output logic illegal_instr_o,
// Instruction is compressed out - cvxif_compressed_if_driver
output logic is_compressed_o,
// Fetch stall - cvxif_compressed_if_driver
output logic fetch_stall_o,
// Data cache request input - CACHE
output dcache_req_i_t req_port_o,
// PC-relative jump offset derived from the jump table entry
output logic [CVA6Cfg.XLEN-1:0] jump_address_o
);
// FSM States
enum logic {
IDLE, // On a ZCMT instruction, send a request to fetch the entry from the jump table
TABLE_JUMP // Wait for valid data from the jump table, compute the jump offset and create the jal instruction
}
state_d, state_q;
// Physical address of the jump table entry: jvt.base + (index << 2)
logic [CVA6Cfg.VLEN-1:0] table_address;
always_comb begin
state_d = state_q;
illegal_instr_o = 1'b0;
is_compressed_o = is_zcmt_instr_i || is_compressed_i;
fetch_stall_o = is_zcmt_instr_i;
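jump_address_o = '0;
// Default assignments so that no path through this block infers a latch
instr_o = instr_i;
table_address = '0;
req_port_o.address_index = '0;
req_port_o.address_tag = '0;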
// cache request port
req_port_o.data_wdata = '0;
req_port_o.data_wuser = '0;
req_port_o.data_req = 1'b0;
req_port_o.data_we = 1'b0;
req_port_o.data_be = '0;
req_port_o.data_size = 2'b10;
req_port_o.data_id = 1'b1;
req_port_o.kill_req = 1'b0;
req_port_o.tag_valid = 1'b1;
unique case (state_q)
IDLE: begin
if (is_zcmt_instr_i) begin
if (CVA6Cfg.XLEN == 32) begin // Only 32-bit CVA6 targets without an MMU are supported
// Table entries are 4 bytes wide; the 8-bit index is held in instr_i[9:2]
table_address = {jvt_i.base, 6'b000000} + {22'h0, instr_i[9:2], 2'b00};
req_port_o.address_index = table_address[9:0];
req_port_o.address_tag = table_address[CVA6Cfg.VLEN-1:10]; // No MMU support
state_d = TABLE_JUMP;
req_port_o.data_req = 1'b1;
end else illegal_instr_o = 1'b1;
// The condition may be extended to 64-bit embedded targets without an MMU
end else begin
illegal_instr_o = illegal_instr_i;
instr_o = instr_i;
state_d = IDLE;
end
end
TABLE_JUMP: begin
if (req_port_i.data_rvalid) begin
// Save the PC-relative, XLEN-wide table jump address
jump_address_o = $unsigned($signed(req_port_i.data_rdata) - $signed(pc_i));
if (instr_i[9:2] < 32) begin // cm.jt: jal x0, pc_offset (no return address saved)
instr_o = {
20'h0, 5'h0, riscv::OpcodeJal
}; // the immediate assigned here (0) will be overwritten in the decode stage with jump_address_o
end else if ((instr_i[9:2] >= 32) & (instr_i[9:2] <= 255)) begin // cm.jalt: jal x1, pc_offset (return address saved in ra)
instr_o = {
20'h0, 5'h1, riscv::OpcodeJal
}; // the immediate assigned here (0) will be overwritten in the decode stage with jump_address_o
end else begin
illegal_instr_o = 1'b1;
instr_o = instr_i;
end
state_d = IDLE;
end else begin
state_d = TABLE_JUMP;
end
end
default: begin
state_d = IDLE;
end
endcase
end
always_ff @(posedge clk_i or negedge rst_ni) begin
if (~rst_ni) begin
state_q <= IDLE;
end else begin
state_q <= state_d;
end
end
endmodule
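
The expansion performed by the module above can be checked against a small software model. The following is a behavioural sketch in Python, not the RTL: the function names are hypothetical, the instruction word `0xA082` used in the tests is only an illustrative value with index 32 in bits [9:2], and the model assumes an RV32 target and the full 8-bit index field, as the index comparisons in the module do.

```python
# Behavioural sketch of the table jump expansion (an illustration, not the RTL).
JAL_OPCODE = 0b1101111  # major opcode of the RISC-V JAL instruction
MASK32 = 0xFFFFFFFF     # RV32 only, matching the XLEN == 32 check in the module


def zcmt_index(instr: int) -> int:
    """Return the 8-bit table index held in bits [9:2] of cm.jt / cm.jalt."""
    return (instr >> 2) & 0xFF


def table_entry_address(jvt: int, index: int) -> int:
    """RV32 entry address: 64-byte-aligned jvt.base plus index * 4."""
    base = jvt & ~0x3F & MASK32  # clear the low 6 bits (mode field)
    return (base + (index << 2)) & MASK32


def expand(instr: int, entry: int, pc: int) -> tuple[int, int]:
    """Return (jal_instruction, pc_relative_offset) for a fetched table entry."""
    rd = 0 if zcmt_index(instr) < 32 else 1  # cm.jt links x0, cm.jalt links x1 (ra)
    jal = (rd << 7) | JAL_OPCODE             # immediate left at 0, as in the module
    offset = (entry - pc) & MASK32           # two's-complement difference, 32-bit wrap
    return jal, offset
```

Keeping the immediate at zero mirrors the module, which passes the offset separately through `jump_address_o` instead of encoding it in the generated `jal`.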

@ -0,0 +1,62 @@
.globl _start
_start:
la t0, trap_handler
csrw mtvec, t0
la a1, target1
la t0, __jvt_base$
sw a1, 128(t0) # cm.jalt entries start at index 32 (byte offset 32 * 4 = 128)
csrw jvt, t0
fence.i
# Perform jump using the index from JVT
cm.jalt 32
li t1, 1
addi x20,x20, 9
j write_tohost
exit:
j write_tohost
write_tohost:
li x1, 1
la t0, tohost
sw x1, 0(t0)
j write_tohost
# Jump Vector Table (JVT) Section
# Create a separate section for the JVT
.section .riscv.jvt, "ax"
.align 6 # Align the JVT on a 64-byte boundary (2^6 = 64)
__jvt_base$:
.word 0x80000054
.word 0x80000800
.word 0x80000802
.word 0x80000804
# Target Addresses (Where cm.jalt will jump)
target0:
li x5, 99
j write_tohost
target1:
li x2, 99
j write_tohost
target2:
addi x2,x20, 5
j write_tohost
trap_handler:
j exit
.align 6; .global tohost; tohost: .dword 0;
.align 6; .global fromhost; fromhost: .dword 0;

@ -0,0 +1,59 @@
.globl _start
_start:
la t0, trap_handler
csrw mtvec, t0
la a1, target1
la t0, __jvt_base$
sw a1, 128(t0) # cm.jalt entries start at index 32 (byte offset 32 * 4 = 128)
csrw jvt, t0
fence.i
# Perform jump using the index from JVT
cm.jalt 32
li t1, 1
addi x20,x20, 9
j write_tohost
exit:
j write_tohost
write_tohost:
li x1, 1
la t0, tohost
sw x1, 0(t0)
j write_tohost
# Jump Vector Table (JVT) Section
# Create a separate section for the JVT
.section .riscv.jvt, "ax"
.align 6 # Align the JVT on a 64-byte boundary (2^6 = 64)
__jvt_base$:
.word 0x80000054
.word 0x80000800
.word 0x80000802
.word 0x80000804
# Target Addresses (Where cm.jalt will jump)
.align 20
target1:
li x2, 99
lui t0, %hi(write_tohost) # Load upper 20 bits of target address into t0
addi t0, t0, %lo(write_tohost) # Add the lower 12 bits to t0
jalr x0, 0(t0)
trap_handler:
lui t0, %hi(write_tohost) # Load upper 20 bits of target address into t0
addi t0, t0, %lo(write_tohost) # Add the lower 12 bits to t0
jalr x0, 0(t0)
.align 6; .global tohost; tohost: .dword 0;
.align 6; .global fromhost; fromhost: .dword 0;

@ -0,0 +1,57 @@
.globl _start
_start:
la t0, trap_handler
csrw mtvec, t0
la a1, target1
la t0, __jvt_base$
sw a1, 128(t0) # cm.jalt entries start at index 32 (byte offset 32 * 4 = 128)
csrw jvt, t0
fence.i
# Perform jump using the index from JVT
cm.jalt 32
li t1, 1
addi x20,x20, 9
j write_tohost
exit:
j write_tohost
write_tohost:
li x1, 1
la t0, tohost
sw x1, 0(t0)
j write_tohost
# Jump Vector Table (JVT) Section
# Create a separate section for the JVT
.section .riscv.jvt, "ax"
.align 6 # Align the JVT on a 64-byte boundary (2^6 = 64)
__jvt_base$:
.word 0x80000054
.word 0x80000800
.word 0x80000802
.word 0x80000804
# Target Addresses (Where cm.jalt will jump)
.align 20
target1:
li x2, 99
ret
trap_handler:
lui t0, %hi(write_tohost) # Load upper 20 bits of target address into t0
addi t0, t0, %lo(write_tohost) # Add the lower 12 bits to t0
jalr x0, 0(t0)
.align 6; .global tohost; tohost: .dword 0;
.align 6; .global fromhost; fromhost: .dword 0;

@ -0,0 +1,62 @@
.globl _start
_start:
la t0, trap_handler
csrw mtvec, t0
la a1, target1
la t0, __jvt_base$
sw a1, 128(t0) # cm.jalt entries start at index 32 (byte offset 32 * 4 = 128)
csrw jvt, t0
fence.i
# Perform jump using the index from JVT
cm.jalt 32
li t1, 1
addi x20,x20, 9
j write_tohost
exit:
j write_tohost
write_tohost:
li x1, 1
la t0, tohost
sw x1, 0(t0)
j write_tohost
# Jump Vector Table (JVT) Section
# Create a separate section for the JVT
.section .riscv.jvt, "ax"
.align 6 # Align the JVT on a 64-byte boundary (2^6 = 64)
__jvt_base$:
.word 0x80000054
.word 0x80000800
.word 0x80000802
.word 0x80000804
# Target Addresses (Where cm.jalt will jump)
target0:
li x5, 9
j write_tohost
target1:
li x2, 99
ret
target2:
addi x2,x20, 5
j write_tohost
trap_handler:
j exit
.align 6; .global tohost; tohost: .dword 0;
.align 6; .global fromhost; fromhost: .dword 0;

@ -0,0 +1,58 @@
.globl _start
_start:
la t0, trap_handler
csrw mtvec, t0
la a1, target1
la t0, __jvt_base$
sw a1, 0(t0)
csrw jvt, t0
fence.i
cm.jt 0 # Perform jump using the index 0 from JVT
addi x18,x18, 3
j target2
exit:
j write_tohost
write_tohost:
li x1, 1
la t0, tohost
sw x1, 0(t0)
j write_tohost
# Jump Vector Table (JVT) Section
# Create a separate section for the JVT
.section .riscv.jvt, "ax"
.align 6 # Align the JVT on a 64-byte boundary (2^6 = 64)
__jvt_base$:
.word 0x80000054
.word 0x80000800
.word 0x80000802
.word 0x80000804
# Target Addresses (Where cm.jt will jump)
target0:
j write_tohost
target1:
addi x6,x0, 7
j write_tohost
target2:
addi x2,x20, 5
j write_tohost
trap_handler:
j exit
.align 6; .global tohost; tohost: .dword 0;
.align 6; .global fromhost; fromhost: .dword 0;

@ -0,0 +1,52 @@
.globl _start
_start:
la t0, trap_handler
csrw mtvec, t0
la a1, target1
la t0, __jvt_base$
sw a1, 0(t0)
csrw jvt, t0
fence.i
cm.jt 0 # Perform jump using the index 0 from JVT
addi x18,x18, 3
write_tohost:
li x1, 1
la t0, tohost
sw x1, 0(t0)
j write_tohost
# Jump Vector Table (JVT) Section
# Create a separate section for the JVT
.section .riscv.jvt, "ax"
.align 6 # Align the JVT on a 64-byte boundary (2^6 = 64)
__jvt_base$:
.word 0x80000054
.word 0x80000800
.word 0x80000802
.word 0x80000804
# Target Addresses (Where cm.jt will jump)
.align 20
target1:
addi x6,x0, 6
la t0, write_tohost # Load the full address of write_tohost into t0
jalr x0, 0(t0)
trap_handler:
lui t0, %hi(write_tohost) # Load upper 20 bits of target address into t0
addi t0, t0, %lo(write_tohost) # Add the lower 12 bits to t0
jalr x0, 0(t0)
.align 6; .global tohost; tohost: .dword 0;
.align 6; .global fromhost; fromhost: .dword 0;

@ -0,0 +1,51 @@
.globl _start
_start:
la t0, trap_handler
csrw mtvec, t0
la t0, __jvt_base$
csrw jvt, t0
fence.i
csrr x7, jvt
j exit
exit:
j write_tohost
write_tohost:
li x1, 1
la t0, tohost
sw x1, 0(t0)
j write_tohost
# Jump Vector Table (JVT) Section
# Create a separate section for the JVT
.section .riscv.jvt, "ax"
.align 6 # Align the JVT on a 64-byte boundary (2^6 = 64)
__jvt_base$:
.word 0x80000054
.word 0x80000800
.word 0x80000802
.word 0x80000804
# Target Addresses (Where cm.jt will jump)
target0:
j write_tohost
target1:
addi x6,x0, 7
j write_tohost
target2:
addi x2,x20, 5
j write_tohost
trap_handler:
j exit
.align 6; .global tohost; tohost: .dword 0;
.align 6; .global fromhost; fromhost: .dword 0;
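
The CSR test above writes a 64-byte-aligned table address to `jvt` and reads it back. The Python sketch below shows how such a value splits into fields, assuming the layout implied by `{jvt_i.base, 6'b000000}` in zcmt_decoder: `base` in the upper XLEN-6 bits and a 6-bit `mode` field in the low bits. The helper names are hypothetical, and the sketch does not model how CVA6 handles writes to the `mode` field.

```python
MODE_BITS = 6  # assumed width of the low-order mode field


def make_jvt(table_addr: int, mode: int = 0, xlen: int = 32) -> int:
    """Compose a jvt value from a 64-byte-aligned table address and a mode."""
    if table_addr % (1 << MODE_BITS) != 0:
        raise ValueError("jump table must be 64-byte aligned")
    if not 0 <= mode < (1 << MODE_BITS):
        raise ValueError("mode must fit in 6 bits")
    return (table_addr | mode) & ((1 << xlen) - 1)


def split_jvt(value: int, xlen: int = 32) -> tuple[int, int]:
    """Return (base, mode), where base holds the upper xlen-6 bits."""
    value &= (1 << xlen) - 1
    return value >> MODE_BITS, value & ((1 << MODE_BITS) - 1)
```

Because the table is aligned with `.align 6`, the low six bits of its address are zero, so writing the address directly to `jvt` leaves the `mode` field at 0.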