mirror of
https://github.com/stnolting/neorv32.git
synced 2025-04-24 22:27:21 -04:00
[sw/example/floating_point] update
This commit is contained in:
parent
7a0e5073cf
commit
4bfcc05fd3
2 changed files with 119 additions and 48 deletions
|
@ -1,30 +1,24 @@
|
|||
# NEORV32 `Zfinx` Floating-Point Extension
|
||||
|
||||
The NEORV32 floating-point unit (FPU) implements the `Zfinx` RISC-V extension. The extensions can be enabled via the `CPU_EXTENSION_RISCV_Zfinx` top configuration generic.
|
||||
The NEORV32 floating-point unit (FPU) implements the `Zfinx` RISC-V extension. The extension can be
|
||||
enabled via the `CPU_EXTENSION_RISCV_Zfinx` top configuration generic.
|
||||
|
||||
The RISC-V `Zfinx` single-precision floating-point extensions uses the integer register file `x` instead of the dedicated floating-point `f` register file (which is
|
||||
defined by the RISC-V `F` single-precision floating-point extension). Hence, the standard data transfer instructions from the `F` extension are **not** available in `Zfinx`:
|
||||
The RISC-V `Zfinx` single-precision floating-point extensions uses the integer register file `x` instead
|
||||
of a dedicated floating-point `f` register file (which is defined by the RISC-V `F` single-precision
|
||||
floating-point extension). Hence, the standard data transfer instructions from the `F` extension are
|
||||
**not** available in `Zfinx`:
|
||||
|
||||
* floating-point load/store operations (`FLW`, `FSW`) and their compressed versions
|
||||
* integer register file `x` <-> floating point register file `f` move operations (`FMV.W.X`, `FMV.X.W`)
|
||||
|
||||
:information_source: More information regarding the RISC-V `Zfinx` single-precision floating-point extension can be found in the official GitHub repo:
|
||||
[`github.com/riscv/riscv-zfinx`](https://github.com/riscv/riscv-zfinx).
|
||||
|
||||
:warning: The RISC-V `Zfinx` extension is not officially ratified yet, but it is assumed to remain unchanged. Hence, it is not supported by the upstream RISC-V GCC port.
|
||||
Make sure you **do not** use the `f` ISA attribute when compiling applications that use floating-point arithmetic (`MARCH=rv32i*f*` is **NOT ALLOWED!**).
|
||||
|
||||
|
||||
### :warning: FPU Limitations
|
||||
|
||||
* The FPU **does not support subnormal numbers** yet. Subnormal FPU inputs and subnormal FPU results are always *flushed to zero*. The *classify* instruction `FCLASS` will never set the "subnormal" mask bits.
|
||||
* Rounding mode `ob100` "round to nearest, ties to max magnitude" is not supported yet (this and all invalid rounding mode configurations behave as "round towards zero" (truncation)).
|
||||
:information_source: See the according section of the NEORV32 data sheet for more information.
|
||||
|
||||
|
||||
## Intrinsic Library
|
||||
|
||||
The NEORV32 `Zfinx` floating-point extension can still be used using the provided **intrinsic library**. This library uses "custom" inline assmbly instructions
|
||||
wrapped within normal C-language functions. Each original instruction of the extension can be utilized using an according intrinsic function.
|
||||
The NEORV32 `Zfinx` floating-point extension can still be used using the provided **intrinsic library**. This
|
||||
library uses "custom" inline assembly instructions wrapped within normal C-language functions. Each original
|
||||
instruction of the extension can be utilized using an according intrinsic function.
|
||||
|
||||
For example, the floating-point addition instruction `FADD.S` can be invoked using the according intrinsic function:
|
||||
|
||||
|
@ -32,21 +26,98 @@ For example, the floating-point addition instruction `FADD.S` can be invoked usi
|
|||
float riscv_intrinsic_fadds(float rs1, float rs2)
|
||||
```
|
||||
|
||||
The pure-software emulation instruction, which uses the standard built-in functions to execute all floating-point operations, is available via wrapper function. The
|
||||
emulation function for the `FADD.S` instruction is:
|
||||
The pure-software emulation instruction, which uses the standard built-in functions to execute all
|
||||
floating-point operations, is available via wrapper function. The emulation function for the `FADD.S` instruction is:
|
||||
|
||||
```c
|
||||
float riscv_emulate_fadds(float rs1, float rs2)
|
||||
```
|
||||
|
||||
The emulation functions as well as the available intrinsics for the `Zfinx` extension are located in `neorv32_zfinx_extension_intrinsics.h`.
|
||||
|
||||
The provided test program `main.c` verifies all currently implemented `Zfinx` instructions by checking the functionality against the pure software-based emulation model
|
||||
(GCC soft-float library).
|
||||
The emulation functions as well as the available intrinsics for the `Zfinx` extension are located in
|
||||
`neorv32_zfinx_extension_intrinsics.h`. The provided test program `main.c` verifies all currently implemented
|
||||
`Zfinx` instructions by checking the functionality against the pure software-based emulation model (GCC soft-float library).
|
||||
|
||||
|
||||
## Resources
|
||||
## Exemplary Test Output
|
||||
|
||||
* Great page with online calculators for floating-point arithmetic: [http://www.ecs.umass.edu/ece/koren/arith/simulator/](http://www.ecs.umass.edu/ece/koren/arith/simulator/)
|
||||
* A handy tool for visualizing floating-point numbers in their binary representation: [https://www.h-schmidt.net/FloatConverter/IEEE754.html](https://www.h-schmidt.net/FloatConverter/IEEE754.html)
|
||||
* This helped me to understand what results the different FPU operation generate when having "special" inputs like NaN: [https://techdocs.altium.com/display/FPGA/IEEE+754+Standard+-+Overview](https://techdocs.altium.com/display/FPGA/IEEE+754+Standard+-+Overview)
|
||||
```
|
||||
<<< Zfinx extension test >>>
|
||||
SILENT_MODE enabled (only showing actual errors)
|
||||
Test cases per instruction: 1000000
|
||||
NOTE: The NEORV32 FPU does not support subnormal numbers yet. Subnormal numbers are flushed to zero.
|
||||
|
||||
|
||||
#0: FCVT.S.WU (unsigned integer to float)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#1: FCVT.S.W (signed integer to float)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#2: FCVT.WU.S (float to unsigned integer)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#3: FCVT.W.S (float to signed integer)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#4: FADD.S (addition)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#5: FSUB.S (subtraction)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#6: FMUL.S (multiplication)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#7: FMIN.S (select minimum)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#8: FMAX.S (select maximum)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#9: FEQ.S (compare if equal)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#10: FLT.S (compare if less-than)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#11: FLE.S (compare if less-than-or-equal)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#12: FSGNJ.S (sign-injection)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#13: FSGNJN.S (sign-injection NOT)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#14: FSGNJX.S (sign-injection XOR)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
#15: FCLASS.S (classify)...
|
||||
Errors: 0/1000000 [ok]
|
||||
|
||||
# unsupported FDIV.S (division) [illegal instruction]...
|
||||
<RTE> Illegal instruction @ PC=0x000006A8, INST=0x18A484D3 </RTE>
|
||||
[ok]
|
||||
|
||||
# unsupported FSQRT.S (square root) [illegal instruction]...
|
||||
<RTE> Illegal instruction @ PC=0x000006E0, INST=0x580484D3 </RTE>
|
||||
[ok]
|
||||
|
||||
# unsupported FMADD.S (fused multiply-add) [illegal instruction]...
|
||||
<RTE> Illegal instruction @ PC=0x0000071E, INST=0x1EA484C3 </RTE>
|
||||
[ok]
|
||||
|
||||
# unsupported FMSUB.S (fused multiply-sub) [illegal instruction]...
|
||||
<RTE> Illegal instruction @ PC=0x0000075C, INST=0x1EA484C7 </RTE>
|
||||
[ok]
|
||||
|
||||
# unsupported FNMSUB.S (fused negated multiply-sub) [illegal instruction]...
|
||||
<RTE> Illegal instruction @ PC=0x0000079A, INST=0x1EA484CF </RTE>
|
||||
[ok]
|
||||
|
||||
# unsupported FNMADD.S (fused negated multiply-add) [illegal instruction]...
|
||||
<RTE> Illegal instruction @ PC=0x000007D8, INST=0x1EA484CF </RTE>
|
||||
[ok]
|
||||
|
||||
[Zfinx extension verification successful!]
|
||||
```
|
||||
|
|
|
@ -178,7 +178,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fadds(float rs1, fl
|
|||
opa.float_value = rs1;
|
||||
opb.float_value = rs2;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0000000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
|
||||
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0000000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -196,7 +196,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsubs(float rs1, fl
|
|||
opa.float_value = rs1;
|
||||
opb.float_value = rs2;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0000100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
|
||||
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0000100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -214,7 +214,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmuls(float rs1, fl
|
|||
opa.float_value = rs1;
|
||||
opb.float_value = rs2;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0001000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
|
||||
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0001000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -232,7 +232,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmins(float rs1, fl
|
|||
opa.float_value = rs1;
|
||||
opb.float_value = rs2;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
|
||||
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -250,7 +250,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmaxs(float rs1, fl
|
|||
opa.float_value = rs1;
|
||||
opb.float_value = rs2;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010100, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
|
||||
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010100, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -266,7 +266,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_fcvt_wus(float r
|
|||
float_conv_t opa;
|
||||
opa.float_value = rs1;
|
||||
|
||||
return CUSTOM_INSTR_R1_TYPE(0b1100000, 0b00001, opa.binary_value, 0b000, 0b1010011);
|
||||
return CUSTOM_INSTR_R2_TYPE(0b1100000, 0b00001, opa.binary_value, 0b000, 0b1010011);
|
||||
}
|
||||
|
||||
|
||||
|
@ -281,7 +281,7 @@ inline int32_t __attribute__ ((always_inline)) riscv_intrinsic_fcvt_ws(float rs1
|
|||
float_conv_t opa;
|
||||
opa.float_value = rs1;
|
||||
|
||||
return (int32_t)CUSTOM_INSTR_R1_TYPE(0b1100000, 0b00000, opa.binary_value, 0b000, 0b1010011);
|
||||
return (int32_t)CUSTOM_INSTR_R2_TYPE(0b1100000, 0b00000, opa.binary_value, 0b000, 0b1010011);
|
||||
}
|
||||
|
||||
|
||||
|
@ -295,7 +295,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fcvt_swu(uint32_t r
|
|||
|
||||
float_conv_t res;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R1_TYPE(0b1101000, 0b00001, rs1, 0b000, 0b1010011);
|
||||
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b1101000, 0b00001, rs1, 0b000, 0b1010011);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -310,7 +310,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fcvt_sw(int32_t rs1
|
|||
|
||||
float_conv_t res;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R1_TYPE(0b1101000, 0b00000, rs1, 0b000, 0b1010011);
|
||||
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b1101000, 0b00000, rs1, 0b000, 0b1010011);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -328,7 +328,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_feqs(float rs1,
|
|||
opa.float_value = rs1;
|
||||
opb.float_value = rs2;
|
||||
|
||||
return CUSTOM_INSTR_R2_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b010, 0b1010011);
|
||||
return CUSTOM_INSTR_R3_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b010, 0b1010011);
|
||||
}
|
||||
|
||||
|
||||
|
@ -345,7 +345,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_flts(float rs1,
|
|||
opa.float_value = rs1;
|
||||
opb.float_value = rs2;
|
||||
|
||||
return CUSTOM_INSTR_R2_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
|
||||
return CUSTOM_INSTR_R3_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
|
||||
}
|
||||
|
||||
|
||||
|
@ -362,7 +362,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_fles(float rs1,
|
|||
opa.float_value = rs1;
|
||||
opb.float_value = rs2;
|
||||
|
||||
return CUSTOM_INSTR_R2_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
|
||||
return CUSTOM_INSTR_R3_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
|
||||
}
|
||||
|
||||
|
||||
|
@ -379,7 +379,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsgnjs(float rs1, f
|
|||
opa.float_value = rs1;
|
||||
opb.float_value = rs2;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
|
||||
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -397,7 +397,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsgnjns(float rs1,
|
|||
opa.float_value = rs1;
|
||||
opb.float_value = rs2;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
|
||||
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -415,7 +415,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsgnjxs(float rs1,
|
|||
opa.float_value = rs1;
|
||||
opb.float_value = rs2;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b010, 0b1010011);
|
||||
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b010, 0b1010011);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -431,7 +431,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_fclasss(float rs
|
|||
float_conv_t opa;
|
||||
opa.float_value = rs1;
|
||||
|
||||
return CUSTOM_INSTR_R1_TYPE(0b1110000, 0b00000, opa.binary_value, 0b001, 0b1010011);
|
||||
return CUSTOM_INSTR_R2_TYPE(0b1110000, 0b00000, opa.binary_value, 0b001, 0b1010011);
|
||||
}
|
||||
|
||||
|
||||
|
@ -454,7 +454,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fdivs(float rs1, fl
|
|||
opa.float_value = rs1;
|
||||
opb.float_value = rs2;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0001100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
|
||||
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0001100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -472,7 +472,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsqrts(float rs1) {
|
|||
float_conv_t opa, res;
|
||||
opa.float_value = rs1;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R1_TYPE(0b0101100, 0b00000, opa.binary_value, 0b000, 0b1010011);
|
||||
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0101100, 0b00000, opa.binary_value, 0b000, 0b1010011);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -494,7 +494,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmadds(float rs1, f
|
|||
opb.float_value = rs2;
|
||||
opc.float_value = rs3;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R3_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1000011);
|
||||
res.binary_value = CUSTOM_INSTR_R4_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1000011);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -516,7 +516,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmsubs(float rs1, f
|
|||
opb.float_value = rs2;
|
||||
opc.float_value = rs3;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R3_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1000111);
|
||||
res.binary_value = CUSTOM_INSTR_R4_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1000111);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -538,7 +538,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fnmsubs(float rs1,
|
|||
opb.float_value = rs2;
|
||||
opc.float_value = rs3;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R3_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1001011);
|
||||
res.binary_value = CUSTOM_INSTR_R4_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1001011);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
@ -560,7 +560,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fnmadds(float rs1,
|
|||
opb.float_value = rs2;
|
||||
opc.float_value = rs3;
|
||||
|
||||
res.binary_value = CUSTOM_INSTR_R3_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1001111);
|
||||
res.binary_value = CUSTOM_INSTR_R4_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1001111);
|
||||
return res.float_value;
|
||||
}
|
||||
|
||||
|
|
Loading…
Add table
Add a link
Reference in a new issue