Adding support for Scalar Cryptography Extensions (Zkn -- Zbkx, Zkne, Zknd, Zknh) (#2804)

* Introduction
This PR adds support for the Zbkx, Zkne, Zknd and Zknh extensions to the CVA6 core, together with documentation and tests for them. The changes were tested with self-written single-instruction tests and with the riscv-arch-tests. With this PR, the Zkn (NIST Algorithm Suite) extension is complete.

* Implementation
Zbkx Extension:
Added support for the Zbkx instruction set, which extends the Bitmanip extension with crossbar permutation instructions useful in cryptography: xperm8 and xperm4.
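
For reviewers who want a software reference to compare against, the lookup behaviour of xperm8 and xperm4 can be sketched in Python. This is a minimal model written for illustration, not code from this PR; the helper name xperm and its xlen/width parameters are assumptions. Each lane of rs2 is an index into the lanes of rs1, and an out-of-range index yields zero.

```python
def xperm(rs1: int, rs2: int, xlen: int = 64, width: int = 8) -> int:
    """Illustrative model of xperm8 (width=8) and xperm4 (width=4)."""
    lanes = xlen // width          # number of bytes or nibbles in a register
    mask = (1 << width) - 1
    result = 0
    for i in range(lanes):
        index = (rs2 >> (i * width)) & mask
        if index < lanes:          # out-of-range indices leave the lane at zero
            lane = (rs1 >> (index * width)) & mask
            result |= lane << (i * width)
    return result


def xperm8(rs1: int, rs2: int, xlen: int = 64) -> int:
    return xperm(rs1, rs2, xlen, 8)


def xperm4(rs1: int, rs2: int, xlen: int = 64) -> int:
    return xperm(rs1, rs2, xlen, 4)


# Reversing the byte order of a 64-bit value with an index vector 7..0.
print(hex(xperm8(0x0807060504030201, 0x0001020304050607)))  # 0x102030405060708
```

The same loop covers both instructions because only the lane width differs, which is also why the RTL can generate both results with similar generate loops.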

Zkne Extension:
Added support for the Zkne instruction set, which adds AES encryption instructions for scalar cryptography: aes32esi and aes32esmi on RV32, and aes64es, aes64esm, aes64ks1i and aes64ks2 on RV64.
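
As an illustration of what a single RV32 encryption instruction computes, the following Python sketch models aes32esi. It is not taken from this PR. The S-box here is derived from its algebraic definition (GF(2^8) inverse plus affine map) rather than from the gate-level circuit used in the RTL, and the argument name bs for the 2-bit byte-select field is an assumption about naming only.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return product


def aes_sbox_fwd(byte: int) -> int:
    """Forward AES S-box: multiplicative inverse followed by the affine map."""
    inverse = 0
    if byte:
        inverse = 1
        for _ in range(254):       # byte**254 is the inverse in GF(2^8)
            inverse = gf_mul(inverse, byte)
    result = inverse
    for shift in range(1, 5):      # XOR with the inverse rotated left by 1..4
        result ^= ((inverse << shift) | (inverse >> (8 - shift))) & 0xFF
    return result ^ 0x63


def rol32(value: int, amount: int) -> int:
    return ((value << amount) | (value >> (32 - amount))) & 0xFFFFFFFF


def aes32esi(rs1: int, rs2: int, bs: int) -> int:
    """Select byte bs of rs2, substitute it, rotate it back, XOR into rs1."""
    shamt = 8 * bs
    substituted = aes_sbox_fwd((rs2 >> shamt) & 0xFF)
    return (rs1 ^ rol32(substituted, shamt)) & 0xFFFFFFFF


print(hex(aes32esi(0, 0x00000100, 1)))  # 0x7c00
```

A model like this is slow but easy to audit, which makes it a reasonable cross-check for the single-instruction tests.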

Zknd Extension:
Added support for the Zknd instruction set, which adds AES decryption instructions for scalar cryptography: aes32dsi and aes32dsmi on RV32, and aes64ds, aes64dsm, aes64im, aes64ks1i and aes64ks2 on RV64.

Note:
The aes64ks1i and aes64ks2 instructions are present in both the Zknd and Zkne extensions.
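
The shared key-schedule step is simple enough to state as a short Python sketch. This is an illustrative model only, assuming 64-bit unsigned register values; it mirrors the word slicing used for aes64ks2 in the new functional unit.

```python
MASK32 = 0xFFFFFFFF


def aes64ks2(rs1: int, rs2: int) -> int:
    """Illustrative model of aes64ks2 on 64-bit unsigned operands."""
    rs1_hi = (rs1 >> 32) & MASK32
    rs2_lo = rs2 & MASK32
    rs2_hi = (rs2 >> 32) & MASK32
    word0 = rs1_hi ^ rs2_lo            # low 32 bits of the result
    word1 = rs1_hi ^ rs2_lo ^ rs2_hi   # high 32 bits of the result
    return (word1 << 32) | word0


print(hex(aes64ks2(0xAAAAAAAA00000000, 0x1111111122222222)))  # 0x9999999988888888
```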

Zknh Extension:
Added support for the Zknh instruction set, which adds SHA-256 and SHA-512 hash function instructions for scalar cryptography: sha256sig0, sha256sig1, sha256sum0 and sha256sum1 on both RV32 and RV64; sha512sig0h, sha512sig0l, sha512sig1h, sha512sig1l, sha512sum0r and sha512sum1r on RV32; and sha512sig0, sha512sig1, sha512sum0 and sha512sum1 on RV64.
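
The four SHA-256 instructions are rotate/shift/XOR combinations of the low 32 bits of rs1. A minimal Python sketch of them follows; it is a reference model for illustration, not part of this PR, and the helper names ror32 and sext32_to_64 are assumptions. On RV64 the 32-bit result is sign-extended, which the last helper models.

```python
MASK32 = 0xFFFFFFFF


def ror32(value: int, amount: int) -> int:
    return ((value >> amount) | (value << (32 - amount))) & MASK32


def sha256sig0(rs1: int) -> int:
    x = rs1 & MASK32
    return ror32(x, 7) ^ ror32(x, 18) ^ (x >> 3)


def sha256sig1(rs1: int) -> int:
    x = rs1 & MASK32
    return ror32(x, 17) ^ ror32(x, 19) ^ (x >> 10)


def sha256sum0(rs1: int) -> int:
    x = rs1 & MASK32
    return ror32(x, 2) ^ ror32(x, 13) ^ ror32(x, 22)


def sha256sum1(rs1: int) -> int:
    x = rs1 & MASK32
    return ror32(x, 6) ^ ror32(x, 11) ^ ror32(x, 25)


def sext32_to_64(value: int) -> int:
    """Sign-extend a 32-bit result to 64 bits, as done for RV64."""
    if value & 0x80000000:
        return value | 0xFFFFFFFF00000000
    return value


print(hex(sha256sig0(1)))                # 0x2004000
print(hex(sext32_to_64(sha256sum0(2))))  # 0xffffffff80100800
```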

* Modifications
Updated the ALU and decoder to recognize and handle the Zbkx instructions. For Zkne, Zknd and Zknh, the decoder now selects the AES unit as the functional unit instead of the ALU.

For ease of use, the complete Zkn extension is enabled by the single ZKN configuration bit. This configuration also requires the RVB (bitmanip) bit to be set.

Note:
The Zkn extension does not require the vectorial FPU.

* AES Functional Unit
A new functional unit was added to the execute stage to handle all AES and hashing instructions (Zkne, Zknd, Zknh).
A new package, "aes_pkg", provides the AES helper functions such as S-box substitution and MixColumns.
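
To make the MixColumns helpers concrete, here is a small Python sketch of the forward and inverse column transforms over GF(2^8). It is an illustrative model, not code from aes_pkg; it assumes the same byte packing as the package functions, with bits [7:0] of the word holding the first byte of the column. The function names are borrowed from the package for readability.

```python
def gf_mul(a: int, b: int) -> int:
    """Multiply two bytes in GF(2^8) modulo x^8 + x^4 + x^3 + x + 1."""
    product = 0
    for _ in range(8):
        if b & 1:
            product ^= a
        carry = a & 0x80
        a = (a << 1) & 0xFF
        if carry:
            a ^= 0x1B
        b >>= 1
    return product


FWD_COEFFS = (2, 3, 1, 1)       # first row of the MixColumns matrix
INV_COEFFS = (14, 11, 13, 9)    # first row of the inverse matrix


def mix_column(word: int, coeffs) -> int:
    """Multiply one 32-bit column by the circulant matrix built from coeffs."""
    column = [(word >> (8 * i)) & 0xFF for i in range(4)]
    out = 0
    for row in range(4):
        acc = 0
        for col in range(4):
            acc ^= gf_mul(column[col], coeffs[(col - row) % 4])
        out |= acc << (8 * row)
    return out


def aes_mixcolumn_fwd(word: int) -> int:
    return mix_column(word, FWD_COEFFS)


def aes_mixcolumn_inv(word: int) -> int:
    return mix_column(word, INV_COEFFS)


# Widely used MixColumns test column db 13 53 45 -> 8e 4d a1 bc.
print(hex(aes_mixcolumn_fwd(0x455313DB)))  # 0xbca14d8e
```

Applying aes_mixcolumn_inv to the result returns the original column, which is a quick sanity check for both directions.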

* Documentation and Reference
The official RISC-V Cryptography Extensions Volume I specification was followed to ensure alignment with the ratified version. The relevant documentation for the Zbkx, Zkne, Zknd and Zknh instructions was also added.

* Verification
Assembly Tests:
The instructions were tested and verified with the K module of both the 32-bit and 64-bit versions of the riscv-arch-tests. These tests check ISA compliance and edge cases, and use assertions to confirm the expected behavior.
Munail Waqar 2025-05-11 21:02:28 +05:00 committed by GitHub
parent 4a3629bff7
commit 6d9b76e560
17 changed files with 1607 additions and 21 deletions


@ -71,6 +71,7 @@ ${CVA6_REPO_DIR}/core/include/wt_cache_pkg.sv
${CVA6_REPO_DIR}/core/include/std_cache_pkg.sv
${CVA6_REPO_DIR}/core/include/instr_tracer_pkg.sv
${CVA6_REPO_DIR}/core/include/build_config_pkg.sv
${CVA6_REPO_DIR}/core/include/aes_pkg.sv
//CVXIF
${CVA6_REPO_DIR}/core/cvxif_compressed_if_driver.sv
@ -106,6 +107,7 @@ ${CVA6_REPO_DIR}/vendor/pulp-platform/common_cells/src/delta_counter.sv
${CVA6_REPO_DIR}/core/cva6.sv
${CVA6_REPO_DIR}/core/cva6_rvfi_probes.sv
${CVA6_REPO_DIR}/core/alu.sv
${CVA6_REPO_DIR}/core/aes.sv
// Note: depends on fpnew_pkg, above
${CVA6_REPO_DIR}/core/fpu_wrap.sv
${CVA6_REPO_DIR}/core/branch_unit.sv

core/aes.sv (new file, 234 lines)

@ -0,0 +1,234 @@
// Licensed under the Solderpad Hardware Licence, Version 2.1 (the "License");
// you may not use this file except in compliance with the License.
// SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1
// You may obtain a copy of the License at https://solderpad.org/licenses/
//
// Author: Munail Waqar, 10xEngineers
// Date: 03.05.2025
// Description: The Zkn extension including its subsets accelerates cryptographic workloads by introducing dedicated
// scalar instructions compliant with the RISC-V Scalar Cryptography specification. The subsets include:
// Zknd (AES Decryption and related instructions), Zkne (AES Encryption support, including AES rounds and key expansion steps),
// Zknh (SHA-256 and SHA-512 hash functions for secure hashing operations).
//
module aes
import ariane_pkg::*;
import aes_pkg::*;
#(
parameter config_pkg::cva6_cfg_t CVA6Cfg = config_pkg::cva6_cfg_empty,
parameter type fu_data_t = logic
) (
// Subsystem Clock - SUBSYSTEM
input logic clk_i,
// Asynchronous reset active low - SUBSYSTEM
input logic rst_ni,
// FU data needed to execute instruction - ISSUE_STAGE
input fu_data_t fu_data_i,
// Original instruction bits for aes
input logic [ 5:0] orig_instr_aes,
// AES result - ISSUE_STAGE
output logic [CVA6Cfg.XLEN-1:0] result_o
);
logic [63:0] sr;
logic [ 7:0] sbox_in;
logic [31:0] aes32esi_gen;
logic [31:0] aes32esmi_gen;
logic [63:0] aes64es_gen;
logic [63:0] aes64esm_gen;
logic [31:0] aes32dsi_gen;
logic [31:0] aes32dsmi_gen;
logic [63:0] sr_inv;
logic [63:0] aes64ds_gen;
logic [63:0] aes64dsm_gen;
logic [63:0] aes64im_gen;
logic [63:0] aes64ks1i_gen;
logic [63:0] aes64ks2_gen;
logic [31:0] sha256sig0_gen;
logic [31:0] sha256sig1_gen;
logic [31:0] sha256sum0_gen;
logic [31:0] sha256sum1_gen;
logic [31:0] sha512sig0h_gen;
logic [31:0] sha512sig0l_gen;
logic [31:0] sha512sig1h_gen;
logic [31:0] sha512sig1l_gen;
logic [31:0] sha512sum0r_gen;
logic [31:0] sha512sum1r_gen;
logic [63:0] sha512sig0_gen;
logic [63:0] sha512sig1_gen;
logic [63:0] sha512sum0_gen;
logic [63:0] sha512sum1_gen;
// AES gen block
if (CVA6Cfg.ZKN && CVA6Cfg.RVB) begin : aes_gen_block
// SHA256 sigma0 transformation function by rotating, shifting and XORing rs1
assign sha256sig0_gen = (fu_data_i.operand_a[31:0] >> 7 | fu_data_i.operand_a[31:0] << 25) ^ (fu_data_i.operand_a[31:0] >> 18 | fu_data_i.operand_a[31:0] << 14) ^ (fu_data_i.operand_a[31:0] >> 3);
// SHA256 sigma1 transformation function by rotating, shifting and XORing rs1
assign sha256sig1_gen = (fu_data_i.operand_a[31:0] >> 17 | fu_data_i.operand_a[31:0] << 15) ^ (fu_data_i.operand_a[31:0] >> 19 | fu_data_i.operand_a[31:0] << 13) ^ (fu_data_i.operand_a[31:0] >> 10);
// SHA256 sum0 transformation function by rotating, shifting and XORing rs1
assign sha256sum0_gen = (fu_data_i.operand_a[31:0] >> 2 | fu_data_i.operand_a[31:0] << 30) ^ (fu_data_i.operand_a[31:0] >> 13 | fu_data_i.operand_a[31:0] << 19) ^ (fu_data_i.operand_a[31:0] >> 22 | fu_data_i.operand_a[31:0] << 10);
// SHA256 sum1 transformation function by rotating, shifting and XORing rs1
assign sha256sum1_gen = (fu_data_i.operand_a[31:0] >> 6 | fu_data_i.operand_a[31:0] << 26) ^ (fu_data_i.operand_a[31:0] >> 11 | fu_data_i.operand_a[31:0] << 21) ^ (fu_data_i.operand_a[31:0] >> 25 | fu_data_i.operand_a[31:0] << 7);
if (CVA6Cfg.IS_XLEN32) begin
assign sbox_in = fu_data_i.operand_b >> {orig_instr_aes[5:4], 3'b000};
// AES 32-bit final round encryption by applying rotations and the forward sbox to a single byte of rs2 based on the MSB byte of the instruction itself
assign aes32esi_gen = (fu_data_i.operand_a ^ ({24'b0, aes_sbox_fwd(
sbox_in[7:0]
)} << {orig_instr_aes[5:4], 3'b000}) | ({24'b0, aes_sbox_fwd(
sbox_in[7:0]
)} >> (32 - {orig_instr_aes[5:4], 3'b000})));
// AES 32-bit middle round encryption by applying rotations, forward mix-columns and the forward sbox to a single byte of rs2 based on the MSB byte of the instruction itself
assign aes32esmi_gen = fu_data_i.operand_a ^ ((aes_mixcolumn_fwd(
{24'h000000, aes_sbox_fwd(sbox_in[7:0])}
) << {orig_instr_aes[5:4], 3'b000}) | (aes_mixcolumn_fwd(
{24'h000000, aes_sbox_fwd(sbox_in[7:0])}
) >> (32 - {orig_instr_aes[5:4], 3'b000})));
// AES 32-bit final round decryption by applying rotations and the inverse sbox to a single byte of rs2 based on the MSB byte of the instruction itself
assign aes32dsi_gen = (fu_data_i.operand_a ^ ({24'b0, aes_sbox_inv(
sbox_in[7:0]
)} << {orig_instr_aes[5:4], 3'b000}) | ({24'b0, aes_sbox_inv(
sbox_in[7:0]
)} >> (32 - {orig_instr_aes[5:4], 3'b000})));
// AES 32-bit middle round decryption by applying rotations, inverse mix-columns and the inverse sbox to a single byte of rs2 based on the MSB byte of the instruction itself
assign aes32dsmi_gen = fu_data_i.operand_a ^ ((aes_mixcolumn_inv(
{24'h000000, aes_sbox_inv(sbox_in[7:0])}
) << {orig_instr_aes[5:4], 3'b000}) | (aes_mixcolumn_inv(
{24'h000000, aes_sbox_inv(sbox_in[7:0])}
) >> (32 - {orig_instr_aes[5:4], 3'b000})));
// SHA512 32-bit shifting and XORing rs1 and rs2
assign sha512sig0h_gen = (fu_data_i.operand_a >> 1) ^ (fu_data_i.operand_a >> 7) ^ (fu_data_i.operand_a >> 8) ^ (fu_data_i.operand_b << 31) ^ (fu_data_i.operand_b << 24);
assign sha512sig0l_gen = (fu_data_i.operand_a >> 1) ^ (fu_data_i.operand_a >> 7) ^ (fu_data_i.operand_a >> 8) ^ (fu_data_i.operand_b << 31) ^ (fu_data_i.operand_b << 25) ^ (fu_data_i.operand_b << 24);
assign sha512sig1h_gen = (fu_data_i.operand_a << 3) ^ (fu_data_i.operand_a >> 6) ^ (fu_data_i.operand_a >> 19) ^ (fu_data_i.operand_b >> 29) ^ (fu_data_i.operand_b << 13);
assign sha512sig1l_gen = (fu_data_i.operand_a << 3) ^ (fu_data_i.operand_a >> 6) ^ (fu_data_i.operand_a >> 19) ^ (fu_data_i.operand_b >> 29) ^ (fu_data_i.operand_b << 26) ^ (fu_data_i.operand_b << 13);
assign sha512sum0r_gen = (fu_data_i.operand_a << 25) ^ (fu_data_i.operand_a << 30) ^ (fu_data_i.operand_a >> 28) ^ (fu_data_i.operand_b >> 7) ^ (fu_data_i.operand_b >> 2) ^ (fu_data_i.operand_b << 4);
assign sha512sum1r_gen = (fu_data_i.operand_a << 23) ^ (fu_data_i.operand_a >> 14) ^ (fu_data_i.operand_a >> 18) ^ (fu_data_i.operand_b >> 9) ^ (fu_data_i.operand_b << 18) ^ (fu_data_i.operand_b << 14);
end else if (CVA6Cfg.IS_XLEN64) begin
// AES Shift rows forward and inverse step
assign sr = {
fu_data_i.operand_a[31:24],
fu_data_i.operand_b[55:48],
fu_data_i.operand_b[15:8],
fu_data_i.operand_a[39:32],
fu_data_i.operand_b[63:56],
fu_data_i.operand_b[23:16],
fu_data_i.operand_a[47:40],
fu_data_i.operand_a[7:0]
};
assign sr_inv = {
fu_data_i.operand_b[31:24],
fu_data_i.operand_b[55:48],
fu_data_i.operand_a[15:8],
fu_data_i.operand_a[39:32],
fu_data_i.operand_a[63:56],
fu_data_i.operand_b[23:16],
fu_data_i.operand_b[47:40],
fu_data_i.operand_a[7:0]
};
// AES 64-bit final round encryption by applying forward shift-rows and the forward sbox to each byte
assign aes64es_gen = {
aes_sbox_fwd(sr[63:56]),
aes_sbox_fwd(sr[55:48]),
aes_sbox_fwd(sr[47:40]),
aes_sbox_fwd(sr[39:32]),
aes_sbox_fwd(sr[31:24]),
aes_sbox_fwd(sr[23:16]),
aes_sbox_fwd(sr[15:8]),
aes_sbox_fwd(sr[7:0])
};
// AES 64-bit middle round encryption by applying forward shift-rows, forward sbox and forward mix-columns to all bytes
assign aes64esm_gen = {
aes_mixcolumn_fwd(aes64es_gen[63:32]), aes_mixcolumn_fwd(aes64es_gen[31:0])
};
// AES 64-bit final round decryption by applying inverse shift-rows and the inverse sbox to each byte
assign aes64ds_gen = {
aes_sbox_inv(sr_inv[63:56]),
aes_sbox_inv(sr_inv[55:48]),
aes_sbox_inv(sr_inv[47:40]),
aes_sbox_inv(sr_inv[39:32]),
aes_sbox_inv(sr_inv[31:24]),
aes_sbox_inv(sr_inv[23:16]),
aes_sbox_inv(sr_inv[15:8]),
aes_sbox_inv(sr_inv[7:0])
};
// AES 64-bit middle round decryption by applying inverse shift-rows, inverse sbox and inverse mix-columns to all bytes
assign aes64dsm_gen = {
aes_mixcolumn_inv(aes64ds_gen[63:32]), aes_mixcolumn_inv(aes64ds_gen[31:0])
};
// AES 64-bit keySchedule decryption by applying inverse mix-columns on rs1
assign aes64im_gen = {
aes_mixcolumn_inv(fu_data_i.operand_a[63:32]), aes_mixcolumn_inv(fu_data_i.operand_a[31:0])
};
// AES Key Schedule part by XORing different slices of rs1 and rs2
assign aes64ks2_gen = {
(fu_data_i.operand_a[63:32] ^ fu_data_i.operand_b[31:0] ^ fu_data_i.operand_b[63:32]),
(fu_data_i.operand_a[63:32] ^ fu_data_i.operand_b[31:0])
};
// AES Key Schedule part by substituting round constant based on round number(from instruction), rotations and forward subword substitutions
assign aes64ks1i_gen = (orig_instr_aes[3:0] <= 4'hA) ? {((aes_subword_fwd(
(orig_instr_aes[3:0] == 4'hA) ? fu_data_i.operand_a[63:32] : ((fu_data_i.operand_a[63:32] >> 8) | (fu_data_i.operand_a[63:32] << 24))
)) ^ (aes_decode_rcon(
orig_instr_aes[3:0]
))), ((aes_subword_fwd(
(orig_instr_aes[3:0] == 4'hA) ? fu_data_i.operand_a[63:32] : ((fu_data_i.operand_a[63:32] >> 8) | (fu_data_i.operand_a[63:32] << 24))
)) ^ (aes_decode_rcon(
orig_instr_aes[3:0]
)))} : 64'h0;
// SHA512 64bit rotating, shifting and XORing rs1
assign sha512sig0_gen = (fu_data_i.operand_a >> 1 | fu_data_i.operand_a << 63) ^ (fu_data_i.operand_a >> 8 | fu_data_i.operand_a << 56) ^ (fu_data_i.operand_a >> 7);
assign sha512sig1_gen = (fu_data_i.operand_a >> 19 | fu_data_i.operand_a << 45) ^ (fu_data_i.operand_a >> 61 | fu_data_i.operand_a << 3) ^ (fu_data_i.operand_a >> 6);
assign sha512sum0_gen = (fu_data_i.operand_a >> 28 | fu_data_i.operand_a << 36) ^ (fu_data_i.operand_a >> 34 | fu_data_i.operand_a << 30) ^ (fu_data_i.operand_a >> 39 | fu_data_i.operand_a << 25);
assign sha512sum1_gen = (fu_data_i.operand_a >> 14 | fu_data_i.operand_a << 50) ^ (fu_data_i.operand_a >> 18 | fu_data_i.operand_a << 46) ^ (fu_data_i.operand_a >> 41 | fu_data_i.operand_a << 23);
end
end
// -----------
// Result MUX
// -----------
always_comb begin
result_o = '0;
// AES instructions
if (CVA6Cfg.ZKN && CVA6Cfg.RVB) begin
if (CVA6Cfg.IS_XLEN32) begin
unique case (fu_data_i.operation)
AES32ESI: result_o = aes32esi_gen;
AES32ESMI: result_o = aes32esmi_gen;
AES32DSI: result_o = aes32dsi_gen;
AES32DSMI: result_o = aes32dsmi_gen;
SHA256SIG0: result_o = sha256sig0_gen;
SHA256SIG1: result_o = sha256sig1_gen;
SHA256SUM0: result_o = sha256sum0_gen;
SHA256SUM1: result_o = sha256sum1_gen;
SHA512SIG0H: result_o = sha512sig0h_gen;
SHA512SIG0L: result_o = sha512sig0l_gen;
SHA512SIG1H: result_o = sha512sig1h_gen;
SHA512SIG1L: result_o = sha512sig1l_gen;
SHA512SUM0R: result_o = sha512sum0r_gen;
SHA512SUM1R: result_o = sha512sum1r_gen;
default: ;
endcase
end
if (CVA6Cfg.IS_XLEN64) begin
unique case (fu_data_i.operation)
AES64ES: result_o = aes64es_gen;
AES64ESM: result_o = aes64esm_gen;
AES64DS: result_o = aes64ds_gen;
AES64DSM: result_o = aes64dsm_gen;
AES64IM: result_o = aes64im_gen;
AES64KS1I: result_o = aes64ks1i_gen;
AES64KS2: result_o = aes64ks2_gen;
SHA256SIG0: result_o = {{32{sha256sig0_gen[31]}}, sha256sig0_gen};
SHA256SIG1: result_o = {{32{sha256sig1_gen[31]}}, sha256sig1_gen};
SHA256SUM0: result_o = {{32{sha256sum0_gen[31]}}, sha256sum0_gen};
SHA256SUM1: result_o = {{32{sha256sum1_gen[31]}}, sha256sum1_gen};
SHA512SIG0: result_o = sha512sig0_gen;
SHA512SIG1: result_o = sha512sig1_gen;
SHA512SUM0: result_o = sha512sum0_gen;
SHA512SUM1: result_o = sha512sum1_gen;
default: ;
endcase
end
end
end
endmodule


@ -54,6 +54,9 @@ module alu
logic [CVA6Cfg.XLEN-1:0] brev8_reversed;
logic [ 31:0] unzip_gen;
logic [ 31:0] zip_gen;
logic [CVA6Cfg.XLEN-1:0] xperm8_result;
logic [CVA6Cfg.XLEN-1:0] xperm4_result;
// bit reverse operand_a for left shifts and bit counting
generate
genvar k;
@ -268,16 +271,22 @@ module alu
// ZKN gen block
if (CVA6Cfg.ZKN && CVA6Cfg.RVB) begin : zkn_gen_block
- genvar i, m, n;
- // Generate brev8_reversed by reversing bits within each byte
- for (i = 0; i < (CVA6Cfg.XLEN / 8); i++) begin : brev8_gen
+ genvar i, m, n, q;
+ for (i = 0; i < (CVA6Cfg.XLEN / 8); i++) begin : brev8_xperm8_gen
+ // Generating xperm8_result by extracting bytes from operand a based on indices from operand b
+ assign xperm8_result[i << 3 +: 8] = (fu_data_i.operand_b[i << 3 +: 8] < (CVA6Cfg.XLEN / 8)) ? fu_data_i.operand_a[fu_data_i.operand_b[i << 3 +: 8] << 3 +: 8] : 8'b0;
+ // Generate brev8_reversed by reversing bits within each byte
for (m = 0; m < 8; m++) begin : reverse_bits
// Reversing the order of bits within a single byte
assign brev8_reversed[(i<<3)+m] = fu_data_i.operand_a[(i<<3)+(7-m)];
end
end
- // Generate zip and unzip results
+ for (q = 0; q < (CVA6Cfg.XLEN / 4); q++) begin : xperm4_gen
+ // Generating xperm4_result by extracting nibbles from operand a based on indices from operand b
+ assign xperm4_result[q << 2 +: 4] = (fu_data_i.operand_b[q << 2 +: 4] < (CVA6Cfg.XLEN / 4)) ? fu_data_i.operand_a[{2'b0, fu_data_i.operand_b[q << 2 +: 4]} << 2 +: 4] : 4'b0;
+ end
if (CVA6Cfg.IS_XLEN32) begin
+ // Generate zip and unzip results
for (n = 0; n < 16; n++) begin : zip_unzip_gen
// Assigning lower and upper half of operand into the even and odd positions of result
assign zip_gen[n<<1] = fu_data_i.operand_a[n];
@ -392,6 +401,8 @@ module alu
PACK_H:
result_o = (CVA6Cfg.IS_XLEN32) ? ({16'b0, fu_data_i.operand_b[7:0], fu_data_i.operand_a[7:0]}) : ({48'b0, fu_data_i.operand_b[7:0], fu_data_i.operand_a[7:0]});
BREV8: result_o = brev8_reversed;
XPERM8: result_o = xperm8_result;
XPERM4: result_o = xperm4_result;
default: ;
endcase
if (fu_data_i.operation == PACK_W && CVA6Cfg.IS_XLEN64)


@ -442,6 +442,8 @@ module cva6
exception_t flu_exception_ex_id;
// ALU
logic [CVA6Cfg.NrIssuePorts-1:0] alu_valid_id_ex;
logic [5:0] orig_instr_aes;
logic [CVA6Cfg.NrIssuePorts-1:0] aes_valid_id_ex;
// Branches and Jumps
logic [CVA6Cfg.NrIssuePorts-1:0] branch_valid_id_ex;
@ -858,6 +860,7 @@ module cva6
.flu_ready_i (flu_ready_ex_id),
// ALU
.alu_valid_o (alu_valid_id_ex),
.aes_valid_o (aes_valid_id_ex),
// Branches and Jumps
.branch_valid_o (branch_valid_id_ex), // branch is valid
.branch_predict_o (branch_predict_id_ex), // branch predict to ex
@ -916,7 +919,8 @@ module cva6
.rvfi_issue_pointer_o (rvfi_issue_pointer),
.rvfi_commit_pointer_o(rvfi_commit_pointer),
.rvfi_rs1_o (rvfi_rs1),
- .rvfi_rs2_o (rvfi_rs2)
+ .rvfi_rs2_o (rvfi_rs2),
+ .orig_instr_aes_bits (orig_instr_aes)
);
// ---------
@ -958,6 +962,8 @@ module cva6
.flu_ready_o(flu_ready_ex_id),
// ALU
.alu_valid_i(alu_valid_id_ex),
.orig_instr_aes_i(orig_instr_aes),
.aes_valid_i(aes_valid_id_ex),
// Branches and Jumps
.branch_valid_i(branch_valid_id_ex),
.branch_predict_i(branch_predict_id_ex), // branch predict to ex


@ -467,7 +467,7 @@ module decoder
// --------------------------------------------
// Vectorial Floating-Point Reg-Reg Operations
// --------------------------------------------
- if (instr.rvftype.funct2 == 2'b10) begin // Prefix 10 for all Xfvec ops
+ if (!CVA6Cfg.ZKN && instr.rvftype.funct2 == 2'b10) begin // Prefix 10 for all Xfvec ops
// only generate decoder if FP extensions are enabled (static)
if (CVA6Cfg.FpPresent && CVA6Cfg.XFVec && fs_i != riscv::Off && ((CVA6Cfg.RVH && (!v_i || vfs_i != riscv::Off)) || !CVA6Cfg.RVH)) begin
automatic logic allow_replication; // control honoring of replication flag
@ -788,6 +788,18 @@ module decoder
if (CVA6Cfg.ZKN) instruction_o.op = ariane_pkg::PACK_H; //packh
else illegal_instr_bm = 1'b1;
end
{
7'b001_0100, 3'b100
} : begin
if (CVA6Cfg.ZKN) instruction_o.op = ariane_pkg::XPERM8; // xperm8
else illegal_instr_bm = 1'b1;
end
{
7'b001_0100, 3'b010
} : begin
if (CVA6Cfg.ZKN) instruction_o.op = ariane_pkg::XPERM4; // xperm4
else illegal_instr_bm = 1'b1;
end
// Zero Extend Op RV32 encoding
{
7'b000_0100, 3'b100
@ -797,6 +809,150 @@ module decoder
else if (CVA6Cfg.ZKN) instruction_o.op = ariane_pkg::PACK; // pack
else illegal_instr_bm = 1'b1;
end
{
7'b001_1001, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::AES64ES; // aes64es
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
{
7'b001_1011, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::AES64ESM; // aes64esm
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
{
7'b011_1111, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::AES64KS2; // aes64ks2
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
{
7'b0010001, 3'b000
}, {
7'b0110001, 3'b000
}, {
7'b1010001, 3'b000
}, {
7'b1110001, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::AES32ESI; // aes32esi
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
{
7'b0010011, 3'b000
}, {
7'b0110011, 3'b000
}, {
7'b1010011, 3'b000
}, {
7'b1110011, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::AES32ESMI; // aes32esmi
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
{
7'b0010101, 3'b000
}, {
7'b0110101, 3'b000
}, {
7'b1010101, 3'b000
}, {
7'b1110101, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::AES32DSI; // aes32dsi
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
{
7'b0010111, 3'b000
}, {
7'b0110111, 3'b000
}, {
7'b1010111, 3'b000
}, {
7'b1110111, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::AES32DSMI; // aes32dsmi
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
{
7'b001_1101, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::AES64DS; // aes64ds
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
{
7'b001_1111, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::AES64DSM; // aes64dsm
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
{
7'b010_1110, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::SHA512SIG0H; // sha512sig0h
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
{
7'b010_1010, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::SHA512SIG0L; // sha512sig0l
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
{
7'b010_1111, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::SHA512SIG1H; // sha512sig1h
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
{
7'b010_1011, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::SHA512SIG1L; // sha512sig1l
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
{
7'b010_1000, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::SHA512SUM0R; // sha512sum0r
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
{
7'b010_1001, 3'b000
} : begin
if (CVA6Cfg.ZKN) begin
instruction_o.op = ariane_pkg::SHA512SUM1R; // sha512sum1r
instruction_o.fu = AES;
end else illegal_instr_bm = 1'b1;
end
default: begin
illegal_instr_bm = 1'b1;
end
@ -937,7 +1093,37 @@ module decoder
instruction_o.op = ariane_pkg::BSETI;
else if (CVA6Cfg.ZKN && instr.instr[31:20] == 12'b000010001111)
instruction_o.op = ariane_pkg::ZIP;
- else illegal_instr_bm = 1'b1;
+ else if (CVA6Cfg.ZKN && instr.instr[31:24] == 8'b00110001) begin
+ instruction_o.op = ariane_pkg::AES64KS1I;
+ instruction_o.fu = AES;
+ end else if (CVA6Cfg.ZKN && instr.instr[31:20] == 12'b001100000000) begin
+ instruction_o.op = ariane_pkg::AES64IM;
+ instruction_o.fu = AES;
+ end else if (CVA6Cfg.ZKN && instr.instr[31:20] == 12'b000100000010) begin
+ instruction_o.op = ariane_pkg::SHA256SIG0;
+ instruction_o.fu = AES;
+ end else if (CVA6Cfg.ZKN && instr.instr[31:20] == 12'b000100000011) begin
+ instruction_o.op = ariane_pkg::SHA256SIG1;
+ instruction_o.fu = AES;
+ end else if (CVA6Cfg.ZKN && instr.instr[31:20] == 12'b000100000000) begin
+ instruction_o.op = ariane_pkg::SHA256SUM0;
+ instruction_o.fu = AES;
+ end else if (CVA6Cfg.ZKN && instr.instr[31:20] == 12'b000100000001) begin
+ instruction_o.op = ariane_pkg::SHA256SUM1;
+ instruction_o.fu = AES;
+ end else if (CVA6Cfg.ZKN && instr.instr[31:20] == 12'b000100000110) begin
+ instruction_o.op = ariane_pkg::SHA512SIG0;
+ instruction_o.fu = AES;
+ end else if (CVA6Cfg.ZKN && instr.instr[31:20] == 12'b000100000111) begin
+ instruction_o.op = ariane_pkg::SHA512SIG1;
+ instruction_o.fu = AES;
+ end else if (CVA6Cfg.ZKN && instr.instr[31:20] == 12'b000100000100) begin
+ instruction_o.op = ariane_pkg::SHA512SUM0;
+ instruction_o.fu = AES;
+ end else if (CVA6Cfg.ZKN && instr.instr[31:20] == 12'b000100000101) begin
+ instruction_o.op = ariane_pkg::SHA512SUM1;
+ instruction_o.fu = AES;
+ end else illegal_instr_bm = 1'b1;
end
3'b101: begin
if (instr.instr[31:20] == 12'b001010000111) instruction_o.op = ariane_pkg::ORCB;


@ -67,6 +67,8 @@ module ex_stage
output logic flu_valid_o,
// ALU instruction is valid - ISSUE_STAGE
input logic [CVA6Cfg.NrIssuePorts-1:0] alu_valid_i,
// AES instruction is valid - ISSUE_STAGE
input logic [CVA6Cfg.NrIssuePorts-1:0] aes_valid_i,
// Branch unit instruction is valid - ISSUE_STAGE
input logic [CVA6Cfg.NrIssuePorts-1:0] branch_valid_i,
// Information of branch prediction - ISSUE_STAGE
@ -235,7 +237,9 @@ module ex_stage
// Information dedicated to RVFI - RVFI
output lsu_ctrl_t rvfi_lsu_ctrl_o,
// Information dedicated to RVFI - RVFI
- output [CVA6Cfg.PLEN-1:0] rvfi_mem_paddr_o
+ output [CVA6Cfg.PLEN-1:0] rvfi_mem_paddr_o,
+ // Original instruction AES bits
+ input logic [5:0] orig_instr_aes_i
);
// -------------------------
@ -271,14 +275,14 @@ module ex_stage
// from ALU to branch unit
logic alu_branch_res; // branch comparison result
- logic [CVA6Cfg.XLEN-1:0] alu_result, csr_result, mult_result;
+ logic [CVA6Cfg.XLEN-1:0] alu_result, csr_result, mult_result, aes_result;
logic [CVA6Cfg.VLEN-1:0] branch_result;
logic csr_ready, mult_ready;
logic [CVA6Cfg.TRANS_ID_BITS-1:0] mult_trans_id;
logic mult_valid;
logic [CVA6Cfg.NrIssuePorts-1:0] one_cycle_select;
- assign one_cycle_select = alu_valid_i | branch_valid_i | csr_valid_i;
+ assign one_cycle_select = alu_valid_i | branch_valid_i | csr_valid_i | aes_valid_i;
fu_data_t one_cycle_data;
logic [CVA6Cfg.VLEN-1:0] rs1_forwarding;
@ -370,6 +374,8 @@ module ex_stage
end else if (mult_valid) begin
flu_result_o = mult_result;
flu_trans_id_o = mult_trans_id;
end else if (|aes_valid_i) begin
flu_result_o = aes_result;
end
end
@ -723,4 +729,24 @@ module ex_stage
assign gpaddr_to_be_flushed = '0;
end
// ----------------
// Scalar Cryptography Unit
// ----------------
generate
if (CVA6Cfg.ZKN) begin : aes_gen
aes #(
.CVA6Cfg (CVA6Cfg),
.fu_data_t(fu_data_t)
) aes_i (
.clk_i,
.rst_ni,
.fu_data_i (one_cycle_data),
.result_o (aes_result),
.orig_instr_aes(orig_instr_aes_i)
);
end else begin : no_aes_gen
assign aes_result = '0;
end
endgenerate
endmodule

core/include/aes_pkg.sv (new file, 219 lines)

@ -0,0 +1,219 @@
// Licensed under the Solderpad Hardware Licence, Version 2.1 (the "License");
// you may not use this file except in compliance with the License.
// SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1
// You may obtain a copy of the License at https://solderpad.org/licenses/
//
// Author: Munail Waqar, 10xEngineers
// Date: 03.05.2025
// Description: The Zkn extension including its subsets accelerates cryptographic workloads by introducing dedicated
// scalar instructions compliant with the RISC-V Scalar Cryptography specification. The subsets include:
// Zknd (AES Decryption and related instructions), Zkne (AES Encryption support, including AES rounds and key expansion steps),
// Zknh (SHA-256 and SHA-512 hash functions for secure hashing operations).
//
package aes_pkg;
// ----------------------
// AES functions
// ----------------------
// AES MixColumns Forward
function [31:0] aes_mixcolumn_fwd(input [31:0] x);
begin
aes_mixcolumn_fwd = {
(((x[7:0] << 1) ^ ((x[7]) ? 8'h1B : 8'h00)) ^ x[7:0]) ^ x[15:8] ^ x[23:16] ^ ((x[31:24] << 1) ^ ((x[31]) ? 8'h1B : 8'h00)),
x[7:0] ^ x[15:8] ^ ((x[23:16] << 1) ^ ((x[23]) ? 8'h1B : 8'h00)) ^ (((x[31:24] << 1) ^ ((x[31]) ? 8'h1B : 8'h00)) ^ x[31:24]),
x[7:0] ^ ((x[15:8] << 1) ^ ((x[15]) ? 8'h1B : 8'h00)) ^ (((x[23:16] << 1) ^ ((x[23]) ? 8'h1B : 8'h00)) ^ x[23:16]) ^ x[31:24],
((x[7:0] << 1) ^ ((x[7]) ? 8'h1B : 8'h00)) ^ (((x[15:8] << 1) ^ ((x[15]) ? 8'h1B : 8'h00)) ^ x[15:8]) ^ x[23:16] ^ x[31:24]
};
end
endfunction
// AES subword Forward
function [31:0] aes_subword_fwd(input [31:0] word);
aes_subword_fwd = {
aes_sbox_fwd(word[31:24]),
aes_sbox_fwd(word[23:16]),
aes_sbox_fwd(word[15:8]),
aes_sbox_fwd(word[7:0])
};
endfunction
// AES Round Constant
function [31:0] aes_decode_rcon(input [3:0] r);
case (r)
4'h0: aes_decode_rcon = 32'h00000001;
4'h1: aes_decode_rcon = 32'h00000002;
4'h2: aes_decode_rcon = 32'h00000004;
4'h3: aes_decode_rcon = 32'h00000008;
4'h4: aes_decode_rcon = 32'h00000010;
4'h5: aes_decode_rcon = 32'h00000020;
4'h6: aes_decode_rcon = 32'h00000040;
4'h7: aes_decode_rcon = 32'h00000080;
4'h8: aes_decode_rcon = 32'h0000001b;
4'h9: aes_decode_rcon = 32'h00000036;
4'hA: aes_decode_rcon = 32'h00000000;
4'hB: aes_decode_rcon = 32'h00000000;
4'hC: aes_decode_rcon = 32'h00000000;
4'hD: aes_decode_rcon = 32'h00000000;
4'hE: aes_decode_rcon = 32'h00000000;
4'hF: aes_decode_rcon = 32'h00000000;
default: aes_decode_rcon = 32'h00000000;
endcase
endfunction
// AES MixColumns Inverse
function logic [31:0] aes_mixcolumn_inv(input logic [31:0] x);
aes_mixcolumn_inv = {
(gfmul(x[7:0], 4'hB) ^ gfmul(x[15:8], 4'hD) ^ gfmul(x[23:16], 4'h9) ^ gfmul(x[31:24], 4'hE)),
(gfmul(x[7:0], 4'hD) ^ gfmul(x[15:8], 4'h9) ^ gfmul(x[23:16], 4'hE) ^ gfmul(x[31:24], 4'hB)),
(gfmul(x[7:0], 4'h9) ^ gfmul(x[15:8], 4'hE) ^ gfmul(x[23:16], 4'hB) ^ gfmul(x[31:24], 4'hD)),
(gfmul(x[7:0], 4'hE) ^ gfmul(x[15:8], 4'hB) ^ gfmul(x[23:16], 4'hD) ^ gfmul(x[31:24], 4'h9))
};
endfunction
// GF multiplication
function logic [7:0] gfmul(input logic [7:0] x, input logic [3:0] y);
logic [7:0] result, temp;
result = 8'h00;
if (y[0]) result ^= x;
if (y[1]) begin
result ^= ((x << 1) ^ ((x[7]) ? 8'h1B : 8'h00));
end
if (y[2]) begin
temp = (x << 1) ^ ((x[7]) ? 8'h1B : 8'h00);
result ^= (temp << 1) ^ ((temp[7]) ? 8'h1B : 8'h00);
end
if (y[3]) begin
temp = (x << 1) ^ ((x[7]) ? 8'h1B : 8'h00);
temp = (temp << 1) ^ ((temp[7]) ? 8'h1B : 8'h00);
result ^= (temp << 1) ^ ((temp[7]) ? 8'h1B : 8'h00);
end
return result;
endfunction
// AES Sbox implementation based on https://github.com/riscv/riscv-crypto
// AES Sbox Forward
function automatic logic [7:0] aes_sbox_fwd(input logic [7:0] in_byte);
logic [20:0] expanded;
logic [17:0] non_linear;
logic [ 7:0] compressed;
expanded = linear_top_layer(in_byte);
non_linear = non_linear_layer(expanded);
compressed = linear_bottom_layer(non_linear);
aes_sbox_fwd = compressed;
endfunction
// AES Sbox Inverse
function automatic logic [7:0] aes_sbox_inv(input logic [7:0] in_byte);
logic [20:0] expanded;
logic [17:0] non_linear;
logic [ 7:0] compressed;
expanded = aes_sbox_inv_top(in_byte);
non_linear = non_linear_layer(expanded);
compressed = aes_sbox_inv_out(non_linear);
aes_sbox_inv = compressed;
endfunction
// AES Sbox Forward Top Layer
function automatic logic [20:0] linear_top_layer(input logic [7:0] x);
return {
((x[7] ^ x[4]) ^ (x[5] ^ x[2])),
(((x[7] ^ x[4]) ^ ((x[6] ^ x[5]) ^ (x[4] ^ x[0]))) ^ ((x[0] ^ (x[6] ^ x[5])) ^ ((x[3] ^ x[1]) ^ (x[5] ^ x[2])))),
((x[7] ^ x[2]) ^ (((x[7] ^ x[4]) ^ (x[3] ^ x[1])) ^ (x[6] ^ x[5]))),
((x[7] ^ x[2]) ^ ((x[6] ^ x[5]) ^ (x[1] ^ x[0]))),
((x[6] ^ x[5]) ^ (x[1] ^ x[0])),
((x[7] ^ x[4]) ^ ((x[6] ^ x[5]) ^ (x[4] ^ x[0]))),
((x[6] ^ x[5]) ^ (x[4] ^ x[0])),
((x[0] ^ (x[6] ^ x[5])) ^ ((x[3] ^ x[1]) ^ (x[5] ^ x[2]))),
((x[3] ^ x[1]) ^ (x[5] ^ x[2])),
((x[3] ^ x[1]) ^ (x[6] ^ x[2])),
(((x[7] ^ x[4]) ^ (x[3] ^ x[1])) ^ (x[6] ^ x[2])),
((x[7] ^ x[1]) ^ (x[4] ^ x[2])),
(((x[7] ^ x[4]) ^ (x[3] ^ x[1])) ^ (x[6] ^ x[5])),
(x[0] ^ (x[6] ^ x[5])),
(x[0] ^ ((x[7] ^ x[4]) ^ (x[3] ^ x[1]))),
((x[7] ^ x[4]) ^ (x[3] ^ x[1])),
(x[4] ^ x[2]),
(x[7] ^ x[1]),
(x[7] ^ x[2]),
(x[7] ^ x[4]),
(x[0])
};
endfunction
// AES Sbox Middle Layer
function automatic logic [17:0] non_linear_layer(input logic [20:0] x);
logic t1, t2, t3, t4, t5;
logic [17:0] y;
t1 = (((x[10] ^ (x[9] & x[5])) ^ (x[17] & x[6])) ^ ((x[4] & x[20]) ^ (x[1] & x[11])));
t2 = ((((x[14] & x[0]) ^ (x[9] & x[5])) ^ x[18]) ^ ((x[2] & x[8]) ^ (x[1] & x[11])));
t3 = ((((x[3] ^ x[12]) ^ (x[3] & x[12])) ^ (x[16] & x[7])) ^ ((x[4] & x[20]) ^ (x[1] & x[11])));
t4 = ((((x[15] & x[13]) ^ (x[3] & x[12])) ^ ((x[2] & x[8]) ^ (x[1] & x[11]))) ^ x[19]);
t5 = ((((t1 ^ t2) & (t1 & t4)) ^ ((t1 ^ t2) ^ (t3 & t1))) ^ (((t3 ^ t4) & (t2 & t3)) ^ ((t3 ^ t4) ^ (t3 & t1))));
y[0] = (((t1 ^ t2) & (t1 & t4)) ^ ((t1 ^ t2) ^ (t3 & t1))) & x[7];
y[1] = (t2 ^ ((t4 ^ (t3 & t1)) & (t1 ^ t2))) & x[13];
y[2] = ((t2 ^ ((t4 ^ (t3 & t1)) & (t1 ^ t2))) ^ (t4 ^ ((t2 ^ (t3 & t1)) & (t3 ^ t4)))) & x[11];
y[3] = (((t2 ^ ((t4 ^ (t3 & t1)) & (t1 ^ t2))) ^ (t4 ^ ((t2 ^ (t3 & t1)) & (t3 ^ t4)))) ^ t5) & x[20];
y[4] = t5 & x[8];
y[5] = ((t4 ^ ((t2 ^ (t3 & t1)) & (t3 ^ t4))) ^ (((t3 ^ t4) & (t2 & t3)) ^ ((t3 ^ t4) ^ (t3 & t1)))) & x[9];
y[6] = (((t3 ^ t4) & (t2 & t3)) ^ ((t3 ^ t4) ^ (t3 & t1))) & x[17];
y[7] = (t4 ^ ((t2 ^ (t3 & t1)) & (t3 ^ t4))) & x[14];
y[8] = ((t2 ^ ((t4 ^ (t3 & t1)) & (t1 ^ t2))) ^ (((t1 ^ t2) & (t1 & t4)) ^ ((t1 ^ t2) ^ (t3 & t1)))) & x[3];
y[9] = (((t1 ^ t2) & (t1 & t4)) ^ ((t1 ^ t2) ^ (t3 & t1))) & x[16];
y[10] = (t2 ^ ((t4 ^ (t3 & t1)) & (t1 ^ t2))) & x[15];
y[11] = ((t2 ^ ((t4 ^ (t3 & t1)) & (t1 ^ t2))) ^ (t4 ^ ((t2 ^ (t3 & t1)) & (t3 ^ t4)))) & x[1];
y[12] = (((t2 ^ ((t4 ^ (t3 & t1)) & (t1 ^ t2))) ^ (t4 ^ ((t2 ^ (t3 & t1)) & (t3 ^ t4)))) ^ t5) & x[4];
y[13] = t5 & x[2];
y[14] = ((t4 ^ ((t2 ^ (t3 & t1)) & (t3 ^ t4))) ^ (((t3 ^ t4) & (t2 & t3)) ^ ((t3 ^ t4) ^ (t3 & t1)))) & x[5];
y[15] = (((t3 ^ t4) & (t2 & t3)) ^ ((t3 ^ t4) ^ (t3 & t1))) & x[6];
y[16] = (t4 ^ ((t2 ^ (t3 & t1)) & (t3 ^ t4))) & x[0];
y[17] = ((t2 ^ ((t4 ^ (t3 & t1)) & (t1 ^ t2))) ^ (((t1 ^ t2) & (t1 & t4)) ^ ((t1 ^ t2) ^ (t3 & t1)))) & x[12];
return y;
endfunction
// AES Sbox Forward Bottom Layer
function automatic logic [7:0] linear_bottom_layer(input logic [17:0] x);
logic [7:0] y;
y[0] = ((x[12] ^ (x[17] ^ x[11])) ^~ ((x[8] ^ (x[1] ^ x[9])) ^ (x[14] ^ x[16])));
y[1] = ((x[0] ^ (x[11] ^ x[12])) ^~ ((x[1] ^ x[9]) ^ (x[3] ^ (x[4] ^ x[8]))));
y[2] = (((x[12] ^ (x[17] ^ x[11])) ^ (x[3] ^ (x[4] ^ x[8]))) ^ ((x[10] ^ (x[14] ^ x[16])) ^ (x[7] ^ (x[0] ^ x[6]))));
y[3] = (((x[11] ^ x[12]) ^ (x[0] ^ x[6])) ^ ((x[15] ^ x[5]) ^ (x[16] ^ x[1])));
y[4] = ((x[12] ^ (x[17] ^ x[11])) ^ ((x[0] ^ x[6]) ^ (x[14] ^ (x[15] ^ x[5]))));
y[5] = ((x[13] ^ (x[4] ^ x[8])) ^~ ((x[10] ^ (x[14] ^ x[16])) ^ (x[2] ^ x[11])));
y[6] = ((x[6] ^ (x[11] ^ x[12])) ^~ ((x[14] ^ (x[15] ^ x[5])) ^ (x[2] ^ x[3])));
y[7] = ((x[12] ^ (x[17] ^ x[11])) ^ ((x[5] ^ (x[0] ^ x[6])) ^ (x[2] ^ x[3])));
return y;
endfunction
// AES Sbox Inverse Top Layer
function automatic logic [20:0] aes_sbox_inv_top(input logic [7:0] x);
return {
((x[4] ^ x[3]) ^ (x[2] ^~ x[1])),
(x[5] ^~ (x[4] ^ x[3])),
(x[3] ^~ x[0]),
(x[7] ^ x[4]),
(x[6] ^~ x[4]),
((x[3] ^~ x[0]) ^ (x[6] ^ x[1])),
((x[6] ^~ x[4]) ^ (x[1] ^ x[0])),
(x[5] ^~ ((x[6] ^~ x[4]) ^ (x[1] ^ x[0]))),
((x[6] ^ x[1]) ^ (x[5] ^~ x[3])),
(((x[7] ^~ x[6]) ^ (x[3] ^~ x[0])) ^ ((x[4] ^ x[3]) ^ (x[2] ^~ x[1]))),
(((x[7] ^~ x[6]) ^ (x[3] ^~ x[0])) ^ (x[2] ^~ x[1])),
((x[7] ^~ x[6]) ^ (x[1] ^ x[0])),
((x[7] ^~ x[6]) ^ (x[3] ^~ x[0])),
(x[0] ^~ (x[4] ^ x[3])),
(x[6] ^~ (x[7] ^ x[4])),
((x[6] ^~ x[4]) ^ (x[5] ^~ x[2])),
(x[3] ^ (x[6] ^~ (x[7] ^ x[4]))),
((x[4] ^ x[3]) ^ (x[1] ^ x[0])),
(x[7] ^~ x[6]),
(x[4] ^ x[3]),
(x[7] ^ (x[5] ^~ x[2]))
};
endfunction
// AES Sbox Inverse Bottom Layer
function automatic logic [7:0] aes_sbox_inv_out(input logic [17:0] x);
logic [7:0] y;
y[0] = ((x[5] ^ x[13]) ^ (x[7] ^ x[11]));
y[1] = ((x[17] ^ x[12]) ^ (((x[2] ^ x[11]) ^ (x[8] ^ x[9])) ^ (x[0] ^ x[3])));
y[2] = (((x[4] ^ x[12]) ^ (x[15] ^ x[0])) ^ ((x[14] ^ x[1]) ^ ((x[2] ^ x[11]) ^ (x[8] ^ x[9]))));
y[3] = ((((x[2] ^ x[11]) ^ (x[8] ^ x[9])) ^ (x[0] ^ x[3])) ^ ((x[7] ^ (x[16] ^ x[6])) ^ (x[13] ^ (x[14] ^ x[1]))));
y[4] = ((x[14] ^ x[16]) ^ ((x[4] ^ x[12]) ^ ((x[2] ^ x[11]) ^ (x[8] ^ x[9]))));
y[5] = ((x[8] ^ (x[4] ^ x[12])) ^ (((x[2] ^ x[11]) ^ (x[15] ^ x[0])) ^ ((x[17] ^ x[10]) ^ (x[7] ^ (x[16] ^ x[6])))));
y[6] = (((x[5] ^ x[13]) ^ ((x[2] ^ x[11]) ^ (x[15] ^ x[0]))) ^ ((x[4] ^ x[9]) ^ ((x[16] ^ x[6]) ^ (x[17] ^ x[10]))));
y[7] = ((x[17] ^ x[1]) ^ ((x[4] ^ x[12]) ^ ((x[2] ^ x[11]) ^ (x[8] ^ x[9]))));
return y;
endfunction
endpackage

@ -196,7 +196,8 @@ package ariane_pkg;
FPU, // 7
FPU_VEC, // 8
CVXIF, // 9
ACCEL // 10
ACCEL, // 10
AES // 11
} fu_t;
// Index of writeback ports
@ -496,7 +497,39 @@ package ariane_pkg;
BREV8,
// Zip instructions
UNZIP,
ZIP
ZIP,
// Xperm instructions
XPERM8,
XPERM4,
// AES Encryption instructions
AES32ESI,
AES32ESMI,
AES64ES,
AES64ESM,
// AES Decryption instructions
AES32DSI,
AES32DSMI,
AES64DS,
AES64DSM,
AES64IM,
// AES Key-Schedule instructions
AES64KS1I,
AES64KS2,
// Hashing instructions
SHA256SIG0,
SHA256SIG1,
SHA256SUM0,
SHA256SUM1,
SHA512SIG0H,
SHA512SIG0L,
SHA512SIG1H,
SHA512SIG1L,
SHA512SUM0R,
SHA512SUM1R,
SHA512SIG0,
SHA512SIG1,
SHA512SUM0,
SHA512SUM1
} fu_op;
function automatic logic op_is_branch(input fu_op op);

@ -35,7 +35,7 @@ package cva6_config_pkg;
localparam CVA6ConfigAxiAddrWidth = 64;
localparam CVA6ConfigAxiDataWidth = 64;
localparam CVA6ConfigFetchUserEn = 0;
localparam CVA6ConfigFetchUserWidth = 1; // Just not to raise warnings
localparam CVA6ConfigFetchUserWidth = 1; // Just not to raise warnings
localparam CVA6ConfigDataUserEn = 0;
localparam CVA6ConfigDataUserWidth = CVA6ConfigXlen;

@ -64,6 +64,8 @@ module issue_read_operands
input logic flu_ready_i,
// ALU output is valid - EX_STAGE
output logic [CVA6Cfg.NrIssuePorts-1:0] alu_valid_o,
// AES output is valid - EX_STAGE
output logic [CVA6Cfg.NrIssuePorts-1:0] aes_valid_o,
// Branch unit is valid - EX_STAGE
output logic [CVA6Cfg.NrIssuePorts-1:0] branch_valid_o,
// Transformed trap instruction - EX_STAGE
@ -126,14 +128,15 @@ module issue_read_operands
// Information dedicated to RVFI - RVFI
output logic [CVA6Cfg.NrIssuePorts-1:0][CVA6Cfg.XLEN-1:0] rvfi_rs1_o,
// Information dedicated to RVFI - RVFI
output logic [CVA6Cfg.NrIssuePorts-1:0][CVA6Cfg.XLEN-1:0] rvfi_rs2_o
output logic [CVA6Cfg.NrIssuePorts-1:0][CVA6Cfg.XLEN-1:0] rvfi_rs2_o,
// Original instruction bits for AES
output logic [5:0] orig_instr_aes_bits
);
localparam OPERANDS_PER_INSTR = CVA6Cfg.NrRgprPorts / CVA6Cfg.NrIssuePorts;
typedef struct packed {
logic none, load, store, alu, alu2, ctrl_flow, mult, csr, fpu, fpu_vec, cvxif, accel;
logic none, load, store, alu, alu2, ctrl_flow, mult, csr, fpu, fpu_vec, cvxif, accel, aes;
} fus_busy_t;
logic [CVA6Cfg.NrIssuePorts-1:0] stall_raw, stall_rs1, stall_rs2, stall_rs3;
@ -153,6 +156,7 @@ module issue_read_operands
logic [CVA6Cfg.XLEN-1:0] imm_forward_rs3;
logic [CVA6Cfg.NrIssuePorts-1:0] alu_valid_n, alu_valid_q;
logic [CVA6Cfg.NrIssuePorts-1:0] aes_valid_n, aes_valid_q;
logic [CVA6Cfg.NrIssuePorts-1:0] mult_valid_n, mult_valid_q;
logic [CVA6Cfg.NrIssuePorts-1:0] fpu_valid_n, fpu_valid_q;
logic [1:0] fpu_fmt_n, fpu_fmt_q;
@ -271,6 +275,7 @@ module issue_read_operands
assign fu_data_o = fu_data_q;
assign alu_valid_o = alu_valid_q;
assign aes_valid_o = aes_valid_q;
assign branch_valid_o = branch_valid_q;
assign lsu_valid_o = lsu_valid_q;
assign csr_valid_o = csr_valid_q;
@ -294,6 +299,7 @@ module issue_read_operands
// Since we can not have two CVXIF instruction on 1st issue port, CVXIF is always ready for the pending instruction.
if (!flu_ready_i) begin
fus_busy[0].alu = 1'b1;
fus_busy[0].aes = 1'b1;
fus_busy[0].ctrl_flow = 1'b1;
fus_busy[0].csr = 1'b1;
fus_busy[0].mult = 1'b1;
@ -303,6 +309,7 @@ module issue_read_operands
// otherwise we will get contentions on the fixed latency bus
if (|mult_valid_q) begin
fus_busy[0].alu = 1'b1;
fus_busy[0].aes = 1'b1;
fus_busy[0].ctrl_flow = 1'b1;
fus_busy[0].csr = 1'b1;
end
@ -401,6 +408,7 @@ module issue_read_operands
LOAD: fu_busy[i] = fus_busy[i].load;
STORE: fu_busy[i] = fus_busy[i].store;
CVXIF: fu_busy[i] = fus_busy[i].cvxif;
AES: fu_busy[i] = fus_busy[i].aes;
default:
if (CVA6Cfg.FpPresent) begin
unique case (issue_instr_i[i].fu)
@ -673,6 +681,7 @@ module issue_read_operands
always_comb begin
alu_valid_n = '0;
aes_valid_n = '0;
lsu_valid_n = '0;
mult_valid_n = '0;
fpu_valid_n = '0;
@ -703,6 +712,9 @@ module issue_read_operands
CSR: begin
csr_valid_n[i] = 1'b1;
end
AES: begin
aes_valid_n[i] = 1'b1;
end
default: begin
if (issue_instr_i[i].fu == FPU && CVA6Cfg.FpPresent) begin
fpu_valid_n[i] = 1'b1;
@ -721,6 +733,7 @@ module issue_read_operands
// functional unit with the wrong inputs
if (flush_i) begin
alu_valid_n = '0;
aes_valid_n = '0;
lsu_valid_n = '0;
mult_valid_n = '0;
fpu_valid_n = '0;
@ -734,6 +747,7 @@ module issue_read_operands
always_ff @(posedge clk_i or negedge rst_ni) begin
if (!rst_ni) begin
alu_valid_q <= '0;
aes_valid_q <= '0;
lsu_valid_q <= '0;
mult_valid_q <= '0;
fpu_valid_q <= '0;
@ -744,6 +758,7 @@ module issue_read_operands
branch_valid_q <= '0;
end else begin
alu_valid_q <= alu_valid_n;
aes_valid_q <= aes_valid_n;
lsu_valid_q <= lsu_valid_n;
mult_valid_q <= mult_valid_n;
fpu_valid_q <= fpu_valid_n;
@ -1004,6 +1019,9 @@ module issue_read_operands
x_transaction_rejected_o <= 1'b0;
end else begin
fu_data_q <= fu_data_n;
if (CVA6Cfg.ZKN) begin
orig_instr_aes_bits <= {orig_instr_i[0][31:30], orig_instr_i[0][23:20]};
end
if (CVA6Cfg.RVH) begin
tinst_q <= tinst_n;
end

@ -70,6 +70,8 @@ module issue_stage
input logic flu_ready_i,
// ALU output is valid - EX_STAGE
output logic [CVA6Cfg.NrIssuePorts-1:0] alu_valid_o,
// AES output is valid - EX_STAGE
output logic [CVA6Cfg.NrIssuePorts-1:0] aes_valid_o,
// Branch unit is valid - EX_STAGE
output logic [CVA6Cfg.NrIssuePorts-1:0] branch_valid_o,
// Information of branch prediction - EX_STAGE
@ -163,7 +165,9 @@ module issue_stage
// Information dedicated to RVFI - RVFI
output logic [CVA6Cfg.NrIssuePorts-1:0][CVA6Cfg.XLEN-1:0] rvfi_rs1_o,
// Information dedicated to RVFI - RVFI
output logic [CVA6Cfg.NrIssuePorts-1:0][CVA6Cfg.XLEN-1:0] rvfi_rs2_o
output logic [CVA6Cfg.NrIssuePorts-1:0][CVA6Cfg.XLEN-1:0] rvfi_rs2_o,
// Original instruction bits for AES
output logic [5:0] orig_instr_aes_bits
);
// ---------------------------------------------------
// Scoreboard (SB) <-> Issue and Read Operands (IRO)
@ -265,6 +269,7 @@ module issue_stage
.is_compressed_instr_o,
.flu_ready_i (flu_ready_i),
.alu_valid_o (alu_valid_o),
.aes_valid_o (aes_valid_o),
.branch_valid_o (branch_valid_o),
.tinst_o (tinst_o),
.branch_predict_o,
@ -300,7 +305,8 @@ module issue_stage
.we_fpr_i,
.stall_issue_o,
.rvfi_rs1_o (rvfi_rs1_o),
.rvfi_rs2_o (rvfi_rs2_o)
.rvfi_rs2_o (rvfi_rs2_o),
.orig_instr_aes_bits (orig_instr_aes_bits)
);
endmodule
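
The `orig_instr_aes_bits` port threaded through the issue stage carries six bits of the original instruction word: bits 31:30 concatenated with bits 23:20. A minimal Python sketch of that slicing is shown below. The interpretation of the two fields as `bs` (byte select of the aes32* instructions) and `rnum` (round number of aes64ks1i) is an assumption based on the ratified scalar cryptography encodings, and the function names are hypothetical.

```python
def aes_instr_bits(instr: int) -> int:
    """Mirror {instr[31:30], instr[23:20]} as a 6-bit value."""
    upper = (instr >> 30) & 0x3   # instr[31:30]
    lower = (instr >> 20) & 0xF   # instr[23:20]
    return (upper << 4) | lower


def split_aes_bits(bits: int) -> tuple[int, int]:
    """Split the 6-bit value into (bs, rnum); field names are assumed."""
    bs = (bits >> 4) & 0x3
    rnum = bits & 0xF
    return bs, rnum
```

For example, an instruction word with bits 31:30 equal to `0b10` and bits 23:20 equal to `0xA` yields the 6-bit value `0x2A`.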

@ -0,0 +1,93 @@
.. Licensed under the Solderpad Hardware Licence, Version 2.1 (the "License");
.. you may not use this file except in compliance with the License.
.. SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1
.. You may obtain a copy of the License at https://solderpad.org/licenses/
.. Author: Munail Waqar, 10xEngineers
.. Date: 03.05.2025
..
Copyright (c) 2023 OpenHW Group
Copyright (c) 2023 10xEngineers
SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1
.. Level 1
=======
Level 2
-------
Level 3
~~~~~~~
Level 4
^^^^^^^
.. _cva6_riscv_instructions_RV32Zbkx:
*Applicability of this chapter to configurations:*
.. csv-table::
:widths: auto
:align: left
:header: "Configuration", "Implementation"
"CV32A60AX", "Implemented extension"
"CV64A6_MMU", "Implemented extension"
=========================================
RVZbkx: Crossbar permutation instructions
=========================================
The following instructions comprise the Zbkx extension:
Xperm instructions
--------------------
The xperm instructions perform a crossbar permutation on a register. Each index extracted from rs2 selects one data chunk (a byte for xperm8, a nibble for xperm4) from rs1. The selected chunk is written to the destination register (rd) at the position that the index occupies in rs2. If an index in rs2 is out of range, the corresponding chunk in rd is set to 0.
+-----------+-----------+-----------------------+
| RV32      | RV64      | Mnemonic              |
+===========+===========+=======================+
| ✔         | ✔         | xperm8 rd, rs1, rs2   |
+-----------+-----------+-----------------------+
| ✔         | ✔         | xperm4 rd, rs1, rs2   |
+-----------+-----------+-----------------------+
RV32 and RV64 Instructions
~~~~~~~~~~~~~~~~~~~~~~~~~~
- **XPERM8**: Crossbar permutation (bytes)
**Format**: xperm8 rd, rs1, rs2
**Description**: The xperm8 instruction operates on bytes. The rs1 register contains a vector of XLEN/8 8-bit elements. The rs2 register contains a vector of XLEN/8 8-bit indexes. The result is each element in rs2 replaced by the indexed element in rs1, or zero if the index into rs1 is out of bounds.
**Pseudocode**: foreach (i from 0 to (xlen/8 - 1)) {
if (rs2[i*8+:8]<(xlen/8))
X(rd)[i*8+:8] = rs1[rs2[i*8+:8]*8+:8];
else
X(rd)[i*8+:8] = 8'b0;
}
**Invalid values**: NONE
**Exception raised**: NONE
- **XPERM4**: Crossbar permutation (nibbles)
**Format**: xperm4 rd, rs1, rs2
**Description**: The xperm4 instruction operates on nibbles. The rs1 register contains a vector of XLEN/4 4-bit elements. The rs2 register contains a vector of XLEN/4 4-bit indexes. The result is each element in rs2 replaced by the indexed element in rs1, or zero if the index into rs1 is out of bounds.
**Pseudocode**: foreach (i from 0 to (xlen/4 - 1)) {
if (rs2[i*4+:4]<(xlen/4))
X(rd)[i*4+:4] = rs1[rs2[i*4+:4]*4+:4];
else
X(rd)[i*4+:4] = 4'b0;
}
**Invalid values**: NONE
**Exception raised**: NONE
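
The lookup described above can be modelled in a few lines of Python. This is only an illustrative sketch of the documented behaviour, not the CVA6 RTL; the function name `xperm` and its `width` parameter are hypothetical, with `width=8` standing for xperm8 and `width=4` for xperm4.

```python
def xperm(rs1: int, rs2: int, xlen: int = 64, width: int = 8) -> int:
    """Model xperm8 (width=8) or xperm4 (width=4) on XLEN-bit integers."""
    lanes = xlen // width          # number of chunks in a register
    mask = (1 << width) - 1
    rd = 0
    for i in range(lanes):
        idx = (rs2 >> (i * width)) & mask   # index taken from lane i of rs2
        if idx < lanes:                     # out-of-range index leaves the lane at 0
            chunk = (rs1 >> (idx * width)) & mask
            rd |= chunk << (i * width)
    return rd
```

With `rs1 = 0x8877665544332211` and `rs2 = 0x0001020304050607`, the byte order is reversed; any rs2 byte of 8 or more on RV64 produces a zero byte in the result.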

@ -0,0 +1,161 @@
.. Licensed under the Solderpad Hardware Licence, Version 2.1 (the "License");
.. you may not use this file except in compliance with the License.
.. SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1
.. You may obtain a copy of the License at https://solderpad.org/licenses/
.. Author: Munail Waqar, 10xEngineers
.. Date: 03.05.2025
..
Copyright (c) 2023 OpenHW Group
Copyright (c) 2023 10xEngineers
SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1
.. Level 1
=======
Level 2
-------
Level 3
~~~~~~~
Level 4
^^^^^^^
.. _cva6_riscv_instructions_RV32Zknd:
*Applicability of this chapter to configurations:*
.. csv-table::
:widths: auto
:align: left
:header: "Configuration", "Implementation"
"CV32A60AX", "Implemented extension"
"CV64A6_MMU", "Implemented extension"
==================================
RVZknd: NIST Suite: AES Decryption
==================================
The following instructions comprise the Zknd extension:
Decryption instructions
-----------------------
The Decryption instructions (Zknd) provide support and acceleration for AES decryption and key expansion.
+-----------+-----------+----------------------------+
| RV32      | RV64      | Mnemonic                   |
+===========+===========+============================+
| ✔         |           | aes32dsi rd, rs1, rs2, bs  |
+-----------+-----------+----------------------------+
| ✔         |           | aes32dsmi rd, rs1, rs2, bs |
+-----------+-----------+----------------------------+
|           | ✔         | aes64ds rd, rs1, rs2       |
+-----------+-----------+----------------------------+
|           | ✔         | aes64dsm rd, rs1, rs2      |
+-----------+-----------+----------------------------+
RV32 specific instructions
~~~~~~~~~~~~~~~~~~~~~~~~~~
- **AES32DSI**: AES final round decryption instruction for RV32
**Format**: aes32dsi rd, rs1, rs2, bs
**Description**: This instruction sources a single byte from rs2 according to bs. To this it applies the inverse AES SBox operation, before XORing the result with rs1.
**Pseudocode**: X(rd) = X(rs1)[31..0] ^ rol32((0x000000 @ aes_sbox_inv((X(rs2)[31..0] >> bs*8)[7..0])), unsigned(bs*8));
**Invalid values**: NONE
**Exception raised**: NONE
- **AES32DSMI**: AES middle round decryption instruction for RV32.
**Format**: aes32dsmi rd, rs1, rs2, bs
**Description**: This instruction sources a single byte from rs2 according to bs. To this it applies the inverse AES SBox operation, and a partial inverse MixColumn, before XORing the result with rs1.
**Pseudocode**: X(rd) = X(rs1)[31..0] ^ rol32(aes_mixcolumn_byte_inv(aes_sbox_inv((X(rs2)[31..0] >> bs*8)[7..0])), unsigned(bs*8));
**Invalid values**: NONE
**Exception raised**: NONE
RV64 specific instructions
~~~~~~~~~~~~~~~~~~~~~~~~~~
- **AES64DS**: AES final round decryption instruction for RV64.
**Format**: aes64ds rd, rs1, rs2
**Description**: Uses the two 64-bit source registers to represent the entire AES state, and produces half of the next round output, applying the Inverse ShiftRows and SubBytes steps.
**Pseudocode**: X(rd) = aes_apply_inv_sbox_to_each_byte(aes_rv64_shiftrows_inv(X(rs2)[63..0], X(rs1)[63..0]));
**Invalid values**: NONE
**Exception raised**: NONE
- **AES64DSM**: AES middle round decryption instruction for RV64.
**Format**: aes64dsm rd, rs1, rs2
**Description**: Uses the two 64-bit source registers to represent the entire AES state, and produces half of the next round output, applying the Inverse ShiftRows, SubBytes and MixColumns steps.
**Pseudocode**: X(rd) = aes_mixcolumn_inv(aes_apply_inv_sbox_to_each_byte(aes_rv64_shiftrows_inv(X(rs2)[63..0], X(rs1)[63..0]))[63..32])
@
aes_mixcolumn_inv(aes_apply_inv_sbox_to_each_byte(aes_rv64_shiftrows_inv(X(rs2)[63..0], X(rs1)[63..0]))[31..0]);
**Invalid values**: NONE
**Exception raised**: NONE
Key Schedule instructions
--------------------------------
+-----------+-----------+----------------------------+
| RV32      | RV64      | Mnemonic                   |
+===========+===========+============================+
|           | ✔         | aes64ks1i rd, rs1, rnum    |
+-----------+-----------+----------------------------+
|           | ✔         | aes64ks2 rd, rs1, rs2      |
+-----------+-----------+----------------------------+
RV64 specific Instructions
~~~~~~~~~~~~~~~~~~~~~~~~~~
- **AES64KS1I**: This instruction implements part of the KeySchedule operation for the AES Block cipher involving the SBox operation.
**Format**: aes64ks1i rd, rs1, rnum
**Description**: This instruction implements the rotation, SubBytes and Round Constant addition steps of the AES block cipher Key Schedule. Note that rnum must be in the range 0x0..0xA.
**Pseudocode**: if(unsigned(rnum) > 0xA) {
X(rd) = 64'b0;
} else {
tmp = if (rnum == 0xA)
X(rs1)[63..32]
else
ror32(X(rs1)[63..32], 8)
X(rd) = (aes_subword_fwd(tmp) ^ aes_decode_rcon(rnum)) @ (aes_subword_fwd(tmp) ^ aes_decode_rcon(rnum));
}
**Invalid values**: NONE
**Exception raised**: NONE
- **AES64KS2**: This instruction implements part of the KeySchedule operation for the AES Block cipher.
**Format**: aes64ks2 rd, rs1, rs2
**Description**: This instruction implements the additional XORing of key words as part of the AES block cipher Key Schedule.
**Pseudocode**: X(rd) = (X(rs1)[63..32] ^ X(rs2)[31..0] ^ X(rs2)[63..32]) @ (X(rs1)[63..32] ^ X(rs2)[31..0]);
**Invalid values**: NONE
**Exception raised**: NONE
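
As a rough illustration of the aes64ks2 pseudocode, the following Python sketch computes the two output words from 64-bit integers. It models only the documented data flow; the helper name and the framing of `rs1` as the aes64ks1i result and `rs2` as the previous pair of key words are assumptions for the example.

```python
MASK32 = 0xFFFFFFFF


def aes64ks2(rs1: int, rs2: int) -> int:
    """Model aes64ks2: XOR key words as described in the pseudocode."""
    rs1_hi = (rs1 >> 32) & MASK32
    rs2_lo = rs2 & MASK32
    rs2_hi = (rs2 >> 32) & MASK32
    w0 = rs1_hi ^ rs2_lo          # low word of the result
    w1 = w0 ^ rs2_hi              # high word chains on w0
    return (w1 << 32) | w0
```

The high result word reuses the low one, which mirrors the AES key expansion step where each new word is the previous new word XORed with the corresponding old word.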

@ -0,0 +1,161 @@
.. Licensed under the Solderpad Hardware Licence, Version 2.1 (the "License");
.. you may not use this file except in compliance with the License.
.. SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1
.. You may obtain a copy of the License at https://solderpad.org/licenses/
.. Author: Munail Waqar, 10xEngineers
.. Date: 03.05.2025
..
Copyright (c) 2023 OpenHW Group
Copyright (c) 2023 10xEngineers
SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1
.. Level 1
=======
Level 2
-------
Level 3
~~~~~~~
Level 4
^^^^^^^
.. _cva6_riscv_instructions_RV32Zkne:
*Applicability of this chapter to configurations:*
.. csv-table::
:widths: auto
:align: left
:header: "Configuration", "Implementation"
"CV32A60AX", "Implemented extension"
"CV64A6_MMU", "Implemented extension"
==================================
RVZkne: NIST Suite: AES Encryption
==================================
The following instructions comprise the Zkne extension:
Encryption instructions
-----------------------
The Encryption instructions (Zkne) provide support and acceleration for AES encryption and key expansion.
+-----------+-----------+----------------------------+
| RV32      | RV64      | Mnemonic                   |
+===========+===========+============================+
| ✔         |           | aes32esi rd, rs1, rs2, bs  |
+-----------+-----------+----------------------------+
| ✔         |           | aes32esmi rd, rs1, rs2, bs |
+-----------+-----------+----------------------------+
|           | ✔         | aes64es rd, rs1, rs2       |
+-----------+-----------+----------------------------+
|           | ✔         | aes64esm rd, rs1, rs2      |
+-----------+-----------+----------------------------+
RV32 specific instructions
~~~~~~~~~~~~~~~~~~~~~~~~~~
- **AES32ESI**: AES final round encryption instruction for RV32
**Format**: aes32esi rd, rs1, rs2, bs
**Description**: This instruction sources a single byte from rs2 according to bs. To this it applies the forward AES SBox operation, before XORing the result with rs1.
**Pseudocode**: X(rd) = X(rs1)[31..0] ^ rol32((0x000000 @ aes_sbox_fwd((X(rs2)[31..0] >> bs*8)[7..0])), unsigned(bs*8));
**Invalid values**: NONE
**Exception raised**: NONE
- **AES32ESMI**: AES middle round encryption instruction for RV32.
**Format**: aes32esmi rd, rs1, rs2, bs
**Description**: This instruction sources a single byte from rs2 according to bs. To this it applies the forward AES SBox operation, and a partial forward MixColumn, before XORing the result with rs1.
**Pseudocode**: X(rd) = X(rs1)[31..0] ^ rol32(aes_mixcolumn_byte_fwd(aes_sbox_fwd((X(rs2)[31..0] >> bs*8)[7..0])), unsigned(bs*8));
**Invalid values**: NONE
**Exception raised**: NONE
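
The two RV32 encryption instructions can be sketched in Python as follows. This is a behavioural illustration only: the S-box is computed arithmetically (GF(2^8) inverse followed by the AES affine transform) rather than with the gate-level layers used in the RTL, and the byte ordering inside `aes_mixcolumn_byte_fwd` follows the scalar cryptography specification as the editor understands it, so treat it as an assumption.

```python
MASK32 = 0xFFFFFFFF


def gf_mul(a: int, b: int) -> int:
    """Multiply in GF(2^8) modulo the AES polynomial 0x11B."""
    r = 0
    for _ in range(8):
        if b & 1:
            r ^= a
        b >>= 1
        a <<= 1
        if a & 0x100:
            a ^= 0x11B
    return r


def aes_sbox_fwd(x: int) -> int:
    """Forward AES S-box: multiplicative inverse, then affine transform."""
    inv = 0
    if x:
        inv = 1
        for _ in range(254):      # x^254 is the inverse of x in GF(2^8)
            inv = gf_mul(inv, x)
    out = inv
    for n in range(1, 5):         # XOR with the four left rotations of inv
        out ^= ((inv << n) | (inv >> (8 - n))) & 0xFF
    return out ^ 0x63


def rol32(x: int, n: int) -> int:
    n %= 32
    return ((x << n) | (x >> (32 - n))) & MASK32


def aes_mixcolumn_byte_fwd(so: int) -> int:
    """Partial MixColumn: {3*so, so, so, 2*so} from MSB to LSB (assumed order)."""
    return (gf_mul(so, 3) << 24) | (so << 16) | (so << 8) | gf_mul(so, 2)


def aes32esi(rs1: int, rs2: int, bs: int) -> int:
    so = aes_sbox_fwd((rs2 >> (bs * 8)) & 0xFF)
    return (rs1 ^ rol32(so, bs * 8)) & MASK32


def aes32esmi(rs1: int, rs2: int, bs: int) -> int:
    so = aes_sbox_fwd((rs2 >> (bs * 8)) & 0xFF)
    return (rs1 ^ rol32(aes_mixcolumn_byte_fwd(so), bs * 8)) & MASK32
```

Calling either function four times with `bs` = 0 to 3, feeding each result back in as `rs1`, accumulates the contribution of four state bytes into one output word.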
RV64 specific instructions
~~~~~~~~~~~~~~~~~~~~~~~~~~
- **AES64ES**: AES final round encryption instruction for RV64.
**Format**: aes64es rd, rs1, rs2
**Description**: Uses the two 64-bit source registers to represent the entire AES state, and produces half of the next round output, applying the ShiftRows and SubBytes steps.
**Pseudocode**: X(rd) = aes_apply_fwd_sbox_to_each_byte(aes_rv64_shiftrows_fwd(X(rs2)[63..0], X(rs1)[63..0]));
**Invalid values**: NONE
**Exception raised**: NONE
- **AES64ESM**: AES middle round encryption instruction for RV64.
**Format**: aes64esm rd, rs1, rs2
**Description**: Uses the two 64-bit source registers to represent the entire AES state, and produces half of the next round output, applying the ShiftRows, SubBytes and MixColumns steps.
**Pseudocode**: X(rd) = aes_mixcolumn_fwd(aes_apply_fwd_sbox_to_each_byte(aes_rv64_shiftrows_fwd(X(rs2)[63..0], X(rs1)[63..0]))[63..32])
@
aes_mixcolumn_fwd(aes_apply_fwd_sbox_to_each_byte(aes_rv64_shiftrows_fwd(X(rs2)[63..0], X(rs1)[63..0]))[31..0]);
**Invalid values**: NONE
**Exception raised**: NONE
Key Schedule instructions
--------------------------------
+-----------+-----------+----------------------------+
| RV32      | RV64      | Mnemonic                   |
+===========+===========+============================+
|           | ✔         | aes64ks1i rd, rs1, rnum    |
+-----------+-----------+----------------------------+
|           | ✔         | aes64ks2 rd, rs1, rs2      |
+-----------+-----------+----------------------------+
RV64 specific Instructions
~~~~~~~~~~~~~~~~~~~~~~~~~~
- **AES64KS1I**: This instruction implements part of the KeySchedule operation for the AES Block cipher involving the SBox operation.
**Format**: aes64ks1i rd, rs1, rnum
**Description**: This instruction implements the rotation, SubBytes and Round Constant addition steps of the AES block cipher Key Schedule. Note that rnum must be in the range 0x0..0xA.
**Pseudocode**: if(unsigned(rnum) > 0xA) {
X(rd) = 64'b0;
} else {
tmp = if (rnum == 0xA)
X(rs1)[63..32]
else
ror32(X(rs1)[63..32], 8)
X(rd) = (aes_subword_fwd(tmp) ^ aes_decode_rcon(rnum)) @ (aes_subword_fwd(tmp) ^ aes_decode_rcon(rnum));
}
**Invalid values**: NONE
**Exception raised**: NONE
- **AES64KS2**: This instruction implements part of the KeySchedule operation for the AES Block cipher.
**Format**: aes64ks2 rd, rs1, rs2
**Description**: This instruction implements the additional XORing of key words as part of the AES block cipher Key Schedule.
**Pseudocode**: X(rd) = (X(rs1)[63..32] ^ X(rs2)[31..0] ^ X(rs2)[63..32]) @ (X(rs1)[63..32] ^ X(rs2)[31..0]);
**Invalid values**: NONE
**Exception raised**: NONE

@ -0,0 +1,263 @@
.. Licensed under the Solderpad Hardware Licence, Version 2.1 (the "License");
.. you may not use this file except in compliance with the License.
.. SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1
.. You may obtain a copy of the License at https://solderpad.org/licenses/
.. Author: Munail Waqar, 10xEngineers
.. Date: 03.05.2025
..
Copyright (c) 2023 OpenHW Group
Copyright (c) 2023 10xEngineers
SPDX-License-Identifier: Apache-2.0 WITH SHL-2.1
.. Level 1
=======
Level 2
-------
Level 3
~~~~~~~
Level 4
^^^^^^^
.. _cva6_riscv_instructions_RV32Zknh:
*Applicability of this chapter to configurations:*
.. csv-table::
:widths: auto
:align: left
:header: "Configuration", "Implementation"
"CV32A60AX", "Implemented extension"
"CV64A6_MMU", "Implemented extension"
==============================================
RVZknh: NIST Suite: Hash Function Instructions
==============================================
The following instructions comprise the Zknh extension:
Hash Function instructions
--------------------------
The Hash Function instructions (Zknh) provide acceleration for the SHA2 family of cryptographic hash functions.
+-----------+-----------+----------------------------+
| RV32      | RV64      | Mnemonic                   |
+===========+===========+============================+
| ✔         | ✔         | sha256sig0 rd, rs1         |
+-----------+-----------+----------------------------+
| ✔         | ✔         | sha256sig1 rd, rs1         |
+-----------+-----------+----------------------------+
| ✔         | ✔         | sha256sum0 rd, rs1         |
+-----------+-----------+----------------------------+
| ✔         | ✔         | sha256sum1 rd, rs1         |
+-----------+-----------+----------------------------+
| ✔         |           | sha512sig0h rd, rs1, rs2   |
+-----------+-----------+----------------------------+
| ✔         |           | sha512sig0l rd, rs1, rs2   |
+-----------+-----------+----------------------------+
| ✔         |           | sha512sig1h rd, rs1, rs2   |
+-----------+-----------+----------------------------+
| ✔         |           | sha512sig1l rd, rs1, rs2   |
+-----------+-----------+----------------------------+
| ✔         |           | sha512sum0r rd, rs1, rs2   |
+-----------+-----------+----------------------------+
| ✔         |           | sha512sum1r rd, rs1, rs2   |
+-----------+-----------+----------------------------+
|           | ✔         | sha512sig0 rd, rs1         |
+-----------+-----------+----------------------------+
|           | ✔         | sha512sig1 rd, rs1         |
+-----------+-----------+----------------------------+
|           | ✔         | sha512sum0 rd, rs1         |
+-----------+-----------+----------------------------+
|           | ✔         | sha512sum1 rd, rs1         |
+-----------+-----------+----------------------------+
RV32 and RV64 Instructions
~~~~~~~~~~~~~~~~~~~~~~~~~~
- **SHA256SIG0**: SHA2-256 Sigma0 instruction
**Format**: sha256sig0 rd, rs1
**Description**: Implements the Sigma0 transformation function as used in the SHA2-256 hash function. For RV32, the entire XLEN source register is operated on. For RV64, the low 32 bits of the source register are operated on, and the result sign extended to XLEN bits.
**Pseudocode**: X(rd) = EXTS(ror32(X(rs1)[31..0], 7) ^ ror32(X(rs1)[31..0], 18) ^ (X(rs1)[31..0] >> 3));
**Invalid values**: NONE
**Exception raised**: NONE
- **SHA256SIG1**: SHA2-256 Sigma1 instruction
**Format**: sha256sig1 rd, rs1
**Description**: Implements the Sigma1 transformation function as used in the SHA2-256 hash function. For RV32, the entire XLEN source register is operated on. For RV64, the low 32 bits of the source register are operated on, and the result sign extended to XLEN bits.
**Pseudocode**: X(rd) = EXTS(ror32(X(rs1)[31..0], 17) ^ ror32(X(rs1)[31..0], 19) ^ (X(rs1)[31..0] >> 10));
**Invalid values**: NONE
**Exception raised**: NONE
- **SHA256SUM0**: SHA2-256 Sum0 instruction
**Format**: sha256sum0 rd, rs1
**Description**: Implements the Sum0 transformation function as used in the SHA2-256 hash function. For RV32, the entire XLEN source register is operated on. For RV64, the low 32 bits of the source register are operated on, and the result sign extended to XLEN bits.
**Pseudocode**: X(rd) = EXTS(ror32(X(rs1)[31..0], 2) ^ ror32(X(rs1)[31..0], 13) ^ ror32(X(rs1)[31..0], 22));
**Invalid values**: NONE
**Exception raised**: NONE
- **SHA256SUM1**: SHA2-256 Sum1 instruction
**Format**: sha256sum1 rd, rs1
**Description**: Implements the Sum1 transformation function as used in the SHA2-256 hash function. For RV32, the entire XLEN source register is operated on. For RV64, the low 32 bits of the source register are operated on, and the result sign extended to XLEN bits.
**Pseudocode**: X(rd) = EXTS(ror32(X(rs1)[31..0], 6) ^ ror32(X(rs1)[31..0], 11) ^ ror32(X(rs1)[31..0], 25));
**Invalid values**: NONE
**Exception raised**: NONE
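
The four SHA2-256 transforms can be checked against a small Python model. The sketch below follows the rotation and shift amounts from FIPS 180-4 and the sign-extension rule described above; the function names and the `xlen` parameter are introduced here for illustration only.

```python
MASK32 = 0xFFFFFFFF


def ror32(x: int, n: int) -> int:
    return ((x >> n) | (x << (32 - n))) & MASK32


def exts32(x: int, xlen: int) -> int:
    """Sign-extend a 32-bit value to xlen bits (no-op when xlen is 32)."""
    x &= MASK32
    if xlen > 32 and (x >> 31):
        x |= ((1 << xlen) - 1) ^ MASK32
    return x


def sha256sig0(rs1: int, xlen: int = 64) -> int:
    x = rs1 & MASK32
    return exts32(ror32(x, 7) ^ ror32(x, 18) ^ (x >> 3), xlen)


def sha256sig1(rs1: int, xlen: int = 64) -> int:
    x = rs1 & MASK32
    return exts32(ror32(x, 17) ^ ror32(x, 19) ^ (x >> 10), xlen)


def sha256sum0(rs1: int, xlen: int = 64) -> int:
    x = rs1 & MASK32
    return exts32(ror32(x, 2) ^ ror32(x, 13) ^ ror32(x, 22), xlen)


def sha256sum1(rs1: int, xlen: int = 64) -> int:
    x = rs1 & MASK32
    return exts32(ror32(x, 6) ^ ror32(x, 11) ^ ror32(x, 25), xlen)
```

Note that sig0 and sig1 end with a logical shift, while sum0 and sum1 use three rotations.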
RV32 specific instructions
~~~~~~~~~~~~~~~~~~~~~~~~~~
- **SHA512SIG0H**: SHA2-512 Sigma0 high (RV32)
**Format**: sha512sig0h rd, rs1, rs2
**Description**: Implements the high half of the Sigma0 transformation, as used in the SHA2-512 hash function. Used to compute the Sigma0 transform of the SHA2-512 hash function in conjunction with the sha512sig0l instruction. The transform is a 64-bit to 64-bit function, so the input and output are each represented by two 32-bit registers.
**Pseudocode**: X(rd) = EXTS((X(rs1) >> 1) ^ (X(rs1) >> 7) ^ (X(rs1) >> 8) ^ (X(rs2) << 31) ^ (X(rs2) << 24));
**Invalid values**: NONE
**Exception raised**: NONE
- **SHA512SIG0L**: SHA2-512 Sigma0 low (RV32)
**Format**: sha512sig0l rd, rs1, rs2
**Description**: Implements the low half of the Sigma0 transformation, as used in the SHA2-512 hash function. Used to compute the Sigma0 transform of the SHA2-512 hash function in conjunction with the sha512sig0h instruction. The transform is a 64-bit to 64-bit function, so the input and output are each represented by two 32-bit registers.
**Pseudocode**: X(rd) = EXTS((X(rs1) >> 1) ^ (X(rs1) >> 7) ^ (X(rs1) >> 8) ^ (X(rs2) << 31) ^ (X(rs2) << 25) ^ (X(rs2) << 24));
**Invalid values**: NONE
**Exception raised**: NONE
- **SHA512SIG1H**: SHA2-512 Sigma1 high (RV32)
**Format**: sha512sig1h rd, rs1, rs2
**Description**: Implements the high half of the Sigma1 transformation, as used in the SHA2-512 hash function. Used to compute the Sigma1 transform of the SHA2-512 hash function in conjunction with the sha512sig1l instruction. The transform is a 64-bit to 64-bit function, so the input and output are each represented by two 32-bit registers.
**Pseudocode**: X(rd) = EXTS((X(rs1) << 3) ^ (X(rs1) >> 6) ^ (X(rs1) >> 19) ^ (X(rs2) >> 29) ^ (X(rs2) << 13));
**Invalid values**: NONE
**Exception raised**: NONE
- **SHA512SIG1L**: SHA2-512 Sigma1 low (RV32)
**Format**: sha512sig1l rd, rs1, rs2
**Description**: Implements the low half of the Sigma1 transformation, as used in the SHA2-512 hash function. Used to compute the Sigma1 transform of the SHA2-512 hash function in conjunction with the sha512sig1h instruction. The transform is a 64-bit to 64-bit function, so the input and output are each represented by two 32-bit registers.
**Pseudocode**: X(rd) = EXTS((X(rs1) << 3) ^ (X(rs1) >> 6) ^ (X(rs1) >> 19) ^ (X(rs2) >> 29) ^ (X(rs2) << 26) ^ (X(rs2) << 13));
**Invalid values**: NONE
**Exception raised**: NONE
- **SHA512SUM0R**: SHA2-512 Sum0 (RV32)
**Format**: sha512sum0r rd, rs1, rs2
**Description**: Implements the Sum0 transformation, as used in the SHA2-512 hash function. The transform is a 64-bit to 64-bit function, so the input and output are each represented by two 32-bit registers.
**Pseudocode**: X(rd) = EXTS((X(rs1) << 25) ^ (X(rs1) << 30) ^ (X(rs1) >> 28) ^ (X(rs2) >> 7) ^ (X(rs2) >> 2) ^ (X(rs2) << 4));
**Invalid values**: NONE
**Exception raised**: NONE
- **SHA512SUM1R**: SHA2-512 Sum1 (RV32)
**Format**: sha512sum1r rd, rs1, rs2
**Description**: Implements the Sum1 transformation, as used in the SHA2-512 hash function. The transform is a 64-bit to 64-bit function, so the input and output are each represented by two 32-bit registers.
**Pseudocode**: X(rd) = EXTS((X(rs1) << 23) ^ (X(rs1) >> 14) ^ (X(rs1) >> 18) ^ (X(rs2) >> 9) ^ (X(rs2) << 18) ^ (X(rs2) << 14));
**Invalid values**: NONE
**Exception raised**: NONE
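
To show how the RV32 instructions pair up, the Python sketch below rebuilds the 64-bit Sigma0 and Sum0 transforms from their 32-bit halves and compares them with the direct 64-bit rotate formulas from FIPS 180-4. The operand ordering used in the `*_rv32` helpers (low half first for the low result, high half first for the high result) is the editor's reading of the pseudocode above; the helper names are hypothetical.

```python
MASK32 = 0xFFFFFFFF
MASK64 = 0xFFFFFFFFFFFFFFFF


def sha512sig0h(rs1: int, rs2: int) -> int:
    return ((rs1 >> 1) ^ (rs1 >> 7) ^ (rs1 >> 8)
            ^ (rs2 << 31) ^ (rs2 << 24)) & MASK32


def sha512sig0l(rs1: int, rs2: int) -> int:
    return ((rs1 >> 1) ^ (rs1 >> 7) ^ (rs1 >> 8)
            ^ (rs2 << 31) ^ (rs2 << 25) ^ (rs2 << 24)) & MASK32


def sha512sum0r(rs1: int, rs2: int) -> int:
    return ((rs1 << 25) ^ (rs1 << 30) ^ (rs1 >> 28)
            ^ (rs2 >> 7) ^ (rs2 >> 2) ^ (rs2 << 4)) & MASK32


def ror64(x: int, n: int) -> int:
    return ((x >> n) | (x << (64 - n))) & MASK64


def sigma0_rv64(x: int) -> int:
    return ror64(x, 1) ^ ror64(x, 8) ^ (x >> 7)


def sum0_rv64(x: int) -> int:
    return ror64(x, 28) ^ ror64(x, 34) ^ ror64(x, 39)


def sigma0_rv32(x: int) -> int:
    """Sigma0 of a 64-bit value using the two RV32 half instructions."""
    lo, hi = x & MASK32, x >> 32
    return (sha512sig0h(hi, lo) << 32) | sha512sig0l(lo, hi)


def sum0_rv32(x: int) -> int:
    """Sum0 of a 64-bit value using sha512sum0r twice with swapped operands."""
    lo, hi = x & MASK32, x >> 32
    return (sha512sum0r(hi, lo) << 32) | sha512sum0r(lo, hi)
```

The high and low Sigma0 halves differ only in the `<< 25` term because the 64-bit formula contains a logical shift, which does not wrap bits into the upper word; Sum0 uses rotations only, so a single instruction serves both halves.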
RV64 specific Instructions
~~~~~~~~~~~~~~~~~~~~~~~~~~
- **SHA512SIG0**: SHA2-512 Sigma0 instruction (RV64)
**Format**: sha512sig0 rd, rs1
**Description**: Implements the Sigma0 transformation function as used in the SHA2-512 hash function.
**Pseudocode**: X(rd) = ror64(X(rs1), 1) ^ ror64(X(rs1), 8) ^ (X(rs1) >> 7);
**Invalid values**: NONE
**Exception raised**: NONE
- **SHA512SIG1**: SHA2-512 Sigma1 instruction (RV64)
**Format**: sha512sig1 rd, rs1
**Description**: Implements the Sigma1 transformation function as used in the SHA2-512 hash function.
**Pseudocode**: X(rd) = ror64(X(rs1), 19) ^ ror64(X(rs1), 61) ^ (X(rs1) >> 6);
**Invalid values**: NONE
**Exception raised**: NONE
- **SHA512SUM0**: SHA2-512 Sum0 instruction (RV64)
**Format**: sha512sum0 rd, rs1
**Description**: Implements the Sum0 transformation function as used in the SHA2-512 hash function.
**Pseudocode**: X(rd) = ror64(X(rs1), 28) ^ ror64(X(rs1), 34) ^ ror64(X(rs1), 39);
**Invalid values**: NONE
**Exception raised**: NONE
- **SHA512SUM1**: SHA2-512 Sum1 instruction (RV64)
**Format**: sha512sum1 rd, rs1
**Description**: Implements the Sum1 transformation function as used in the SHA2-512 hash function.
**Pseudocode**: X(rd) = ror64(X(rs1), 14) ^ ror64(X(rs1), 18) ^ ror64(X(rs1), 41);
**Invalid values**: NONE
**Exception raised**: NONE
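
The four RV64 pseudocode lines above translate almost directly into Python. The sketch below is a software reference model only, not the CVA6 RTL: the function names mirror the mnemonics, `ror64` is the rotate helper the pseudocode refers to, and `next_schedule_word` is a hypothetical helper added here to show where Sigma0 and Sigma1 appear in the SHA-512 message schedule.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def ror64(x, n):
    """Rotate a 64-bit value right by n bits (0 < n < 64)."""
    x &= MASK64
    return ((x >> n) | (x << (64 - n))) & MASK64


def sha512sig0(rs1):
    x = rs1 & MASK64
    return ror64(x, 1) ^ ror64(x, 8) ^ (x >> 7)


def sha512sig1(rs1):
    x = rs1 & MASK64
    return ror64(x, 19) ^ ror64(x, 61) ^ (x >> 6)


def sha512sum0(rs1):
    x = rs1 & MASK64
    return ror64(x, 28) ^ ror64(x, 34) ^ ror64(x, 39)


def sha512sum1(rs1):
    x = rs1 & MASK64
    return ror64(x, 14) ^ ror64(x, 18) ^ ror64(x, 41)


def next_schedule_word(w):
    """Hypothetical helper: next SHA-512 message schedule word.

    `w` holds at least the 16 most recent 64-bit schedule words.
    """
    return (sha512sig1(w[-2]) + w[-7] + sha512sig0(w[-15]) + w[-16]) & MASK64
```

With an input of 1, each function reduces to a few single-bit terms, which makes a quick sanity check against the rotation amounts: `sha512sum0(1)` is `(1 << 36) | (1 << 30) | (1 << 25)`.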

@@ -884,12 +884,12 @@ def load_config(args, cwd):
if base in ("cv64a6_imafdch_sv39", "cv64a6_imafdch_sv39_wb"):
args.mabi = "lp64d"
args.isa = "rv64gch_zba_zbb_zbs_zbc"
-elif base in ("cv64a6_imafdc_sv39_wb"):
+elif base in ("cv64a6_imafdc_sv39_wb",):
args.mabi = "lp64d"
args.isa = "rv64gc_zba_zbb_zbs_zbc"
elif base in ("cv64a6_imafdc_sv39", "cv64a6_imafdc_sv39_hpdcache", "cv64a6_imafdc_sv39_hpdcache_wb"):
args.mabi = "lp64d"
-args.isa = "rv64gc_zba_zbb_zbs_zbc_zbkb"
+args.isa = "rv64gc_zba_zbb_zbs_zbc_zbkb_zbkx_zkne_zknd_zknh"
elif base == "cv32a60x":
args.mabi = "ilp32"
args.isa = "rv32imc_zba_zbb_zbs_zbc"
@@ -906,7 +906,7 @@ def load_config(args, cwd):
args.isa = "rv32imac"
elif base == "cv32a6_imac_sv32":
args.mabi = "ilp32"
-args.isa = "rv32imac_zbkb"
+args.isa = "rv32imac_zbkb_zbkx_zkne_zknd_zknh"
elif base == "cv32a6_imafc_sv32":
args.mabi = "ilp32f"
args.isa = "rv32imafc"
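
The single-element tuple change in `load_config` is easy to overlook, so here is a minimal standalone sketch of what the trailing comma changes. The variable names are illustrative and not taken from the script. Without the comma the parentheses are only grouping, so `in` does a substring search on a string, and a base name such as `cv64a6_imafdc_sv39` would appear to match the `_wb` branch.

```python
base = "cv64a6_imafdc_sv39"

# Parentheses alone do not create a tuple: this is a plain string.
without_comma = ("cv64a6_imafdc_sv39_wb")
# The trailing comma makes it a one-element tuple.
with_comma = ("cv64a6_imafdc_sv39_wb",)

# Substring search on a string: matches although the names differ.
substring_match = base in without_comma
# Membership test on a tuple: only an exact element matches.
tuple_match = base in with_comma

print(type(without_comma).__name__, substring_match)
print(type(with_comma).__name__, tuple_match)
```

With the tuple form, only the exact name `cv64a6_imafdc_sv39_wb` selects that branch, and other base names fall through to the later `elif` clauses.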

@@ -968,6 +968,8 @@ testlist:
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/A/src/amoxor.w-01.S
#K
- test: rv64im-pack-01
<<: *common_test_config
iterations: 1
@@ -987,3 +989,168 @@ testlist:
<<: *common_test_config
iterations: 1
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/brev8-01.S
- test: rv64i_m-xperm8-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/xperm8-01.S
- test: rv64i_m-xperm4-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/xperm4-01.S
- test: rv64i_m-aes64es-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/aes64es-01.S
- test: rv64i_m-aes64esm-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/aes64esm-01.S
- test: rv64i_m-aes64ks2-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/aes64ks2-01.S
- test: rv64i_m-aes64ks1i-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/aes64ks1i-01.S
- test: rv64i_m-aes64ds-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/aes64ds-01.S
- test: rv64i_m-aes64dsm-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/aes64dsm-01.S
- test: rv64i_m-aes64im-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/aes64im-01.S
- test: rv64i_m-sha256sig0-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha256sig0-01.S
- test: rv64i_m-sha256sig0-rwp1
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha256sig0-rwp1.S
- test: rv64i_m-sha256sig0-rwp2
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha256sig0-rwp2.S
- test: rv64i_m-sha256sig1-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha256sig1-01.S
- test: rv64i_m-sha256sig1-rwp1
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha256sig1-rwp1.S
- test: rv64i_m-sha256sig1-rwp2
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha256sig1-rwp2.S
- test: rv64i_m-sha256sum0-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha256sum0-01.S
- test: rv64i_m-sha256sum0-rwp1
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha256sum0-rwp1.S
- test: rv64i_m-sha256sum0-rwp2
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha256sum0-rwp2.S
- test: rv64i_m-sha256sum1-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha256sum1-01.S
- test: rv64i_m-sha256sum1-rwp1
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha256sum1-rwp1.S
- test: rv64i_m-sha256sum1-rwp2
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha256sum1-rwp2.S
- test: rv64i_m-sha512sig0-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha512sig0-01.S
- test: rv64i_m-sha512sig0-rwp1
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha512sig0-rwp1.S
- test: rv64i_m-sha512sig0-rwp2
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha512sig0-rwp2.S
- test: rv64i_m-sha512sig1-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha512sig1-01.S
- test: rv64i_m-sha512sig1-rwp1
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha512sig1-rwp1.S
- test: rv64i_m-sha512sig1-rwp2
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha512sig1-rwp2.S
- test: rv64i_m-sha512sum0-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha512sum0-01.S
- test: rv64i_m-sha512sum0-rwp1
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha512sum0-rwp1.S
- test: rv64i_m-sha512sum0-rwp2
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha512sum0-rwp2.S
- test: rv64i_m-sha512sum1-01
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha512sum1-01.S
- test: rv64i_m-sha512sum1-rwp1
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha512sum1-rwp1.S
- test: rv64i_m-sha512sum1-rwp2
iterations: 1
<<: *common_test_config
asm_tests: <path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src/sha512sum1-rwp2.S
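
The entries above all follow one pattern, so a list like this could be generated rather than typed by hand. The Python sketch below is a hypothetical helper, not something this PR adds: `make_entry`, `make_entries`, and the two-space indentation are assumptions made for illustration, and the `<path_var>` placeholder is kept exactly as it appears in the testlist.

```python
SUITE_DIR = "<path_var>/riscv-arch-test/riscv-test-suite/rv64i_m/K/src"


def make_entry(name):
    """Return the YAML text for one arch-test, e.g. name='sha512sum1-rwp2'."""
    return "\n".join([
        f"- test: rv64i_m-{name}",
        "  iterations: 1",
        "  <<: *common_test_config",
        f"  asm_tests: {SUITE_DIR}/{name}.S",
    ])


def make_entries(instructions, variants=("01", "rwp1", "rwp2")):
    """Return entries for every instruction/variant pair, in order."""
    return "\n".join(
        make_entry(f"{insn}-{variant}")
        for insn in instructions
        for variant in variants
    )


if __name__ == "__main__":
    # The hash instructions listed above each have -01, -rwp1 and -rwp2 tests.
    print(make_entries(["sha512sum0", "sha512sum1"]))
```

Instructions that only have a `-01` test in the list, such as `xperm8` or `aes64es`, would be generated with `variants=("01",)`.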