[sw/example/floating_point] update

This commit is contained in:
stnolting 2022-12-01 20:02:48 +01:00
parent 7a0e5073cf
commit 4bfcc05fd3
2 changed files with 119 additions and 48 deletions

View file

@ -1,30 +1,24 @@
# NEORV32 `Zfinx` Floating-Point Extension
The NEORV32 floating-point unit (FPU) implements the `Zfinx` RISC-V extension. The extensions can be enabled via the `CPU_EXTENSION_RISCV_Zfinx` top configuration generic.
The NEORV32 floating-point unit (FPU) implements the `Zfinx` RISC-V extension. The extension can be
enabled via the `CPU_EXTENSION_RISCV_Zfinx` top configuration generic.
The RISC-V `Zfinx` single-precision floating-point extensions uses the integer register file `x` instead of the dedicated floating-point `f` register file (which is
defined by the RISC-V `F` single-precision floating-point extension). Hence, the standard data transfer instructions from the `F` extension are **not** available in `Zfinx`:
The RISC-V `Zfinx` single-precision floating-point extensions uses the integer register file `x` instead
of a dedicated floating-point `f` register file (which is defined by the RISC-V `F` single-precision
floating-point extension). Hence, the standard data transfer instructions from the `F` extension are
**not** available in `Zfinx`:
* floating-point load/store operations (`FLW`, `FSW`) and their compressed versions
* integer register file `x` <-> floating point register file `f` move operations (`FMV.W.X`, `FMV.X.W`)
:information_source: More information regarding the RISC-V `Zfinx` single-precision floating-point extension can be found in the official GitHub repo:
[`github.com/riscv/riscv-zfinx`](https://github.com/riscv/riscv-zfinx).
:warning: The RISC-V `Zfinx` extension is not officially ratified yet, but it is assumed to remain unchanged. Hence, it is not supported by the upstream RISC-V GCC port.
Make sure you **do not** use the `f` ISA attribute when compiling applications that use floating-point arithmetic (`MARCH=rv32i*f*` is **NOT ALLOWED!**).
### :warning: FPU Limitations
* The FPU **does not support subnormal numbers** yet. Subnormal FPU inputs and subnormal FPU results are always *flushed to zero*. The *classify* instruction `FCLASS` will never set the "subnormal" mask bits.
* Rounding mode `ob100` "round to nearest, ties to max magnitude" is not supported yet (this and all invalid rounding mode configurations behave as "round towards zero" (truncation)).
:information_source: See the according section of the NEORV32 data sheet for more information.
## Intrinsic Library
The NEORV32 `Zfinx` floating-point extension can still be used using the provided **intrinsic library**. This library uses "custom" inline assmbly instructions
wrapped within normal C-language functions. Each original instruction of the extension can be utilized using an according intrinsic function.
The NEORV32 `Zfinx` floating-point extension can still be used using the provided **intrinsic library**. This
library uses "custom" inline assembly instructions wrapped within normal C-language functions. Each original
instruction of the extension can be utilized using an according intrinsic function.
For example, the floating-point addition instruction `FADD.S` can be invoked using the according intrinsic function:
@ -32,21 +26,98 @@ For example, the floating-point addition instruction `FADD.S` can be invoked usi
float riscv_intrinsic_fadds(float rs1, float rs2)
```
The pure-software emulation instruction, which uses the standard built-in functions to execute all floating-point operations, is available via wrapper function. The
emulation function for the `FADD.S` instruction is:
The pure-software emulation instruction, which uses the standard built-in functions to execute all
floating-point operations, is available via wrapper function. The emulation function for the `FADD.S` instruction is:
```c
float riscv_emulate_fadds(float rs1, float rs2)
```
The emulation functions as well as the available intrinsics for the `Zfinx` extension are located in `neorv32_zfinx_extension_intrinsics.h`.
The provided test program `main.c` verifies all currently implemented `Zfinx` instructions by checking the functionality against the pure software-based emulation model
(GCC soft-float library).
The emulation functions as well as the available intrinsics for the `Zfinx` extension are located in
`neorv32_zfinx_extension_intrinsics.h`. The provided test program `main.c` verifies all currently implemented
`Zfinx` instructions by checking the functionality against the pure software-based emulation model (GCC soft-float library).
## Resources
## Exemplary Test Output
* Great page with online calculators for floating-point arithmetic: [http://www.ecs.umass.edu/ece/koren/arith/simulator/](http://www.ecs.umass.edu/ece/koren/arith/simulator/)
* A handy tool for visualizing floating-point numbers in their binary representation: [https://www.h-schmidt.net/FloatConverter/IEEE754.html](https://www.h-schmidt.net/FloatConverter/IEEE754.html)
* This helped me to understand what results the different FPU operation generate when having "special" inputs like NaN: [https://techdocs.altium.com/display/FPGA/IEEE+754+Standard+-+Overview](https://techdocs.altium.com/display/FPGA/IEEE+754+Standard+-+Overview)
```
<<< Zfinx extension test >>>
SILENT_MODE enabled (only showing actual errors)
Test cases per instruction: 1000000
NOTE: The NEORV32 FPU does not support subnormal numbers yet. Subnormal numbers are flushed to zero.
#0: FCVT.S.WU (unsigned integer to float)...
Errors: 0/1000000 [ok]
#1: FCVT.S.W (signed integer to float)...
Errors: 0/1000000 [ok]
#2: FCVT.WU.S (float to unsigned integer)...
Errors: 0/1000000 [ok]
#3: FCVT.W.S (float to signed integer)...
Errors: 0/1000000 [ok]
#4: FADD.S (addition)...
Errors: 0/1000000 [ok]
#5: FSUB.S (subtraction)...
Errors: 0/1000000 [ok]
#6: FMUL.S (multiplication)...
Errors: 0/1000000 [ok]
#7: FMIN.S (select minimum)...
Errors: 0/1000000 [ok]
#8: FMAX.S (select maximum)...
Errors: 0/1000000 [ok]
#9: FEQ.S (compare if equal)...
Errors: 0/1000000 [ok]
#10: FLT.S (compare if less-than)...
Errors: 0/1000000 [ok]
#11: FLE.S (compare if less-than-or-equal)...
Errors: 0/1000000 [ok]
#12: FSGNJ.S (sign-injection)...
Errors: 0/1000000 [ok]
#13: FSGNJN.S (sign-injection NOT)...
Errors: 0/1000000 [ok]
#14: FSGNJX.S (sign-injection XOR)...
Errors: 0/1000000 [ok]
#15: FCLASS.S (classify)...
Errors: 0/1000000 [ok]
# unsupported FDIV.S (division) [illegal instruction]...
<RTE> Illegal instruction @ PC=0x000006A8, INST=0x18A484D3 </RTE>
[ok]
# unsupported FSQRT.S (square root) [illegal instruction]...
<RTE> Illegal instruction @ PC=0x000006E0, INST=0x580484D3 </RTE>
[ok]
# unsupported FMADD.S (fused multiply-add) [illegal instruction]...
<RTE> Illegal instruction @ PC=0x0000071E, INST=0x1EA484C3 </RTE>
[ok]
# unsupported FMSUB.S (fused multiply-sub) [illegal instruction]...
<RTE> Illegal instruction @ PC=0x0000075C, INST=0x1EA484C7 </RTE>
[ok]
# unsupported FNMSUB.S (fused negated multiply-sub) [illegal instruction]...
<RTE> Illegal instruction @ PC=0x0000079A, INST=0x1EA484CF </RTE>
[ok]
# unsupported FNMADD.S (fused negated multiply-add) [illegal instruction]...
<RTE> Illegal instruction @ PC=0x000007D8, INST=0x1EA484CF </RTE>
[ok]
[Zfinx extension verification successful!]
```

View file

@ -178,7 +178,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fadds(float rs1, fl
opa.float_value = rs1;
opb.float_value = rs2;
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0000000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0000000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
return res.float_value;
}
@ -196,7 +196,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsubs(float rs1, fl
opa.float_value = rs1;
opb.float_value = rs2;
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0000100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0000100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
return res.float_value;
}
@ -214,7 +214,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmuls(float rs1, fl
opa.float_value = rs1;
opb.float_value = rs2;
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0001000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0001000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
return res.float_value;
}
@ -232,7 +232,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmins(float rs1, fl
opa.float_value = rs1;
opb.float_value = rs2;
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
return res.float_value;
}
@ -250,7 +250,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmaxs(float rs1, fl
opa.float_value = rs1;
opb.float_value = rs2;
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010100, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010100, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
return res.float_value;
}
@ -266,7 +266,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_fcvt_wus(float r
float_conv_t opa;
opa.float_value = rs1;
return CUSTOM_INSTR_R1_TYPE(0b1100000, 0b00001, opa.binary_value, 0b000, 0b1010011);
return CUSTOM_INSTR_R2_TYPE(0b1100000, 0b00001, opa.binary_value, 0b000, 0b1010011);
}
@ -281,7 +281,7 @@ inline int32_t __attribute__ ((always_inline)) riscv_intrinsic_fcvt_ws(float rs1
float_conv_t opa;
opa.float_value = rs1;
return (int32_t)CUSTOM_INSTR_R1_TYPE(0b1100000, 0b00000, opa.binary_value, 0b000, 0b1010011);
return (int32_t)CUSTOM_INSTR_R2_TYPE(0b1100000, 0b00000, opa.binary_value, 0b000, 0b1010011);
}
@ -295,7 +295,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fcvt_swu(uint32_t r
float_conv_t res;
res.binary_value = CUSTOM_INSTR_R1_TYPE(0b1101000, 0b00001, rs1, 0b000, 0b1010011);
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b1101000, 0b00001, rs1, 0b000, 0b1010011);
return res.float_value;
}
@ -310,7 +310,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fcvt_sw(int32_t rs1
float_conv_t res;
res.binary_value = CUSTOM_INSTR_R1_TYPE(0b1101000, 0b00000, rs1, 0b000, 0b1010011);
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b1101000, 0b00000, rs1, 0b000, 0b1010011);
return res.float_value;
}
@ -328,7 +328,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_feqs(float rs1,
opa.float_value = rs1;
opb.float_value = rs2;
return CUSTOM_INSTR_R2_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b010, 0b1010011);
return CUSTOM_INSTR_R3_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b010, 0b1010011);
}
@ -345,7 +345,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_flts(float rs1,
opa.float_value = rs1;
opb.float_value = rs2;
return CUSTOM_INSTR_R2_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
return CUSTOM_INSTR_R3_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
}
@ -362,7 +362,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_fles(float rs1,
opa.float_value = rs1;
opb.float_value = rs2;
return CUSTOM_INSTR_R2_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
return CUSTOM_INSTR_R3_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
}
@ -379,7 +379,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsgnjs(float rs1, f
opa.float_value = rs1;
opb.float_value = rs2;
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
return res.float_value;
}
@ -397,7 +397,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsgnjns(float rs1,
opa.float_value = rs1;
opb.float_value = rs2;
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
return res.float_value;
}
@ -415,7 +415,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsgnjxs(float rs1,
opa.float_value = rs1;
opb.float_value = rs2;
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b010, 0b1010011);
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b010, 0b1010011);
return res.float_value;
}
@ -431,7 +431,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_fclasss(float rs
float_conv_t opa;
opa.float_value = rs1;
return CUSTOM_INSTR_R1_TYPE(0b1110000, 0b00000, opa.binary_value, 0b001, 0b1010011);
return CUSTOM_INSTR_R2_TYPE(0b1110000, 0b00000, opa.binary_value, 0b001, 0b1010011);
}
@ -454,7 +454,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fdivs(float rs1, fl
opa.float_value = rs1;
opb.float_value = rs2;
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0001100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0001100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
return res.float_value;
}
@ -472,7 +472,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsqrts(float rs1) {
float_conv_t opa, res;
opa.float_value = rs1;
res.binary_value = CUSTOM_INSTR_R1_TYPE(0b0101100, 0b00000, opa.binary_value, 0b000, 0b1010011);
res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0101100, 0b00000, opa.binary_value, 0b000, 0b1010011);
return res.float_value;
}
@ -494,7 +494,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmadds(float rs1, f
opb.float_value = rs2;
opc.float_value = rs3;
res.binary_value = CUSTOM_INSTR_R3_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1000011);
res.binary_value = CUSTOM_INSTR_R4_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1000011);
return res.float_value;
}
@ -516,7 +516,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmsubs(float rs1, f
opb.float_value = rs2;
opc.float_value = rs3;
res.binary_value = CUSTOM_INSTR_R3_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1000111);
res.binary_value = CUSTOM_INSTR_R4_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1000111);
return res.float_value;
}
@ -538,7 +538,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fnmsubs(float rs1,
opb.float_value = rs2;
opc.float_value = rs3;
res.binary_value = CUSTOM_INSTR_R3_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1001011);
res.binary_value = CUSTOM_INSTR_R4_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1001011);
return res.float_value;
}
@ -560,7 +560,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fnmadds(float rs1,
opb.float_value = rs2;
opc.float_value = rs3;
res.binary_value = CUSTOM_INSTR_R3_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1001111);
res.binary_value = CUSTOM_INSTR_R4_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1001111);
return res.float_value;
}