[sw/example/floating_point] update

2025-04-24 22:27:21 -04:00 · 2022-12-01 20:02:48 +01:00 · 2022-12-01 20:02:48 +01:00 · 4bfcc05fd3
commit 4bfcc05fd3
parent 7a0e5073cf
2 changed files with 119 additions and 48 deletions
--- a/sw/example/floating_point_test/README.md
+++ b/sw/example/floating_point_test/README.md
@ -1,30 +1,24 @@
 # NEORV32 `Zfinx` Floating-Point Extension

-The NEORV32 floating-point unit (FPU) implements the `Zfinx` RISC-V extension. The extensions can be enabled via the `CPU_EXTENSION_RISCV_Zfinx` top configuration generic.
+The NEORV32 floating-point unit (FPU) implements the `Zfinx` RISC-V extension. The extension can be
+enabled via the `CPU_EXTENSION_RISCV_Zfinx` top configuration generic.

-The RISC-V `Zfinx` single-precision floating-point extensions uses the integer register file `x` instead of the dedicated floating-point `f` register file (which is
-defined by the RISC-V `F` single-precision floating-point extension). Hence, the standard data transfer instructions from the `F` extension are **not** available in `Zfinx`:
+The RISC-V `Zfinx` single-precision floating-point extensions uses the integer register file `x` instead
+of a dedicated floating-point `f` register file (which is defined by the RISC-V `F` single-precision
+floating-point extension). Hence, the standard data transfer instructions from the `F` extension are
+**not** available in `Zfinx`:

 * floating-point load/store operations (`FLW`, `FSW`) and their compressed versions
 * integer register file `x` <-> floating point register file `f` move operations (`FMV.W.X`, `FMV.X.W`)

-:information_source: More information regarding the RISC-V `Zfinx` single-precision floating-point extension can be found in the official GitHub repo:
-[`github.com/riscv/riscv-zfinx`](https://github.com/riscv/riscv-zfinx).
-
-:warning: The RISC-V `Zfinx` extension is not officially ratified yet, but it is assumed to remain unchanged. Hence, it is not supported by the upstream RISC-V GCC port.
-Make sure you **do not** use the `f` ISA attribute when compiling applications that use floating-point arithmetic (`MARCH=rv32i*f*` is **NOT ALLOWED!**).
-
-
-### :warning: FPU Limitations
-
-* The FPU **does not support subnormal numbers** yet. Subnormal FPU inputs and subnormal FPU results are always *flushed to zero*. The *classify* instruction `FCLASS` will never set the "subnormal" mask bits.
-* Rounding mode `ob100` "round to nearest, ties to max magnitude" is not supported yet (this and all invalid rounding mode configurations behave as "round towards zero" (truncation)).
+:information_source: See the according section of the NEORV32 data sheet for more information.


 ## Intrinsic Library

-The NEORV32 `Zfinx` floating-point extension can still be used using the provided **intrinsic library**. This library uses "custom" inline assmbly instructions
-wrapped within normal C-language functions. Each original instruction of the extension can be utilized using an according intrinsic function.
+The NEORV32 `Zfinx` floating-point extension can still be used using the provided **intrinsic library**. This
+library uses "custom" inline assembly instructions wrapped within normal C-language functions. Each original
+instruction of the extension can be utilized using an according intrinsic function.

 For example, the floating-point addition instruction `FADD.S` can be invoked using the according intrinsic function:

@ -32,21 +26,98 @@ For example, the floating-point addition instruction `FADD.S` can be invoked usi
 float riscv_intrinsic_fadds(float rs1, float rs2)
 ```

-The pure-software emulation instruction, which uses the standard built-in functions to execute all floating-point operations, is available via wrapper function. The
-emulation function for the `FADD.S` instruction is:
+The pure-software emulation instruction, which uses the standard built-in functions to execute all
+floating-point operations, is available via wrapper function. The emulation function for the `FADD.S` instruction is:

 ```c
 float riscv_emulate_fadds(float rs1, float rs2)
 ```

-The emulation functions as well as the available intrinsics for the `Zfinx` extension are located in `neorv32_zfinx_extension_intrinsics.h`.
-
-The provided test program `main.c` verifies all currently implemented `Zfinx` instructions by checking the functionality against the pure software-based emulation model
-(GCC soft-float library).
+The emulation functions as well as the available intrinsics for the `Zfinx` extension are located in
+`neorv32_zfinx_extension_intrinsics.h`. The provided test program `main.c` verifies all currently implemented
+`Zfinx` instructions by checking the functionality against the pure software-based emulation model (GCC soft-float library).


-## Resources
+## Exemplary Test Output

-* Great page with online calculators for floating-point arithmetic: [http://www.ecs.umass.edu/ece/koren/arith/simulator/](http://www.ecs.umass.edu/ece/koren/arith/simulator/)
-* A handy tool for visualizing floating-point numbers in their binary representation: [https://www.h-schmidt.net/FloatConverter/IEEE754.html](https://www.h-schmidt.net/FloatConverter/IEEE754.html)
-* This helped me to understand what results the different FPU operation generate when having "special" inputs like NaN: [https://techdocs.altium.com/display/FPGA/IEEE+754+Standard+-+Overview](https://techdocs.altium.com/display/FPGA/IEEE+754+Standard+-+Overview)
+```
+<<< Zfinx extension test >>>
+SILENT_MODE enabled (only showing actual errors)
+Test cases per instruction: 1000000
+NOTE: The NEORV32 FPU does not support subnormal numbers yet. Subnormal numbers are flushed to zero.
+
+
+#0: FCVT.S.WU (unsigned integer to float)...
+Errors: 0/1000000 [ok]
+
+#1: FCVT.S.W (signed integer to float)...
+Errors: 0/1000000 [ok]
+
+#2: FCVT.WU.S (float to unsigned integer)...
+Errors: 0/1000000 [ok]
+
+#3: FCVT.W.S (float to signed integer)...
+Errors: 0/1000000 [ok]
+
+#4: FADD.S (addition)...
+Errors: 0/1000000 [ok]
+
+#5: FSUB.S (subtraction)...
+Errors: 0/1000000 [ok]
+
+#6: FMUL.S (multiplication)...
+Errors: 0/1000000 [ok]
+
+#7: FMIN.S (select minimum)...
+Errors: 0/1000000 [ok]
+
+#8: FMAX.S (select maximum)...
+Errors: 0/1000000 [ok]
+
+#9: FEQ.S (compare if equal)...
+Errors: 0/1000000 [ok]
+
+#10: FLT.S (compare if less-than)...
+Errors: 0/1000000 [ok]
+
+#11: FLE.S (compare if less-than-or-equal)...
+Errors: 0/1000000 [ok]
+
+#12: FSGNJ.S (sign-injection)...
+Errors: 0/1000000 [ok]
+
+#13: FSGNJN.S (sign-injection NOT)...
+Errors: 0/1000000 [ok]
+
+#14: FSGNJX.S (sign-injection XOR)...
+Errors: 0/1000000 [ok]
+
+#15: FCLASS.S (classify)...
+Errors: 0/1000000 [ok]
+
+# unsupported FDIV.S (division) [illegal instruction]...
+<RTE> Illegal instruction @ PC=0x000006A8, INST=0x18A484D3 </RTE>
+[ok]
+
+# unsupported FSQRT.S (square root) [illegal instruction]...
+<RTE> Illegal instruction @ PC=0x000006E0, INST=0x580484D3 </RTE>
+[ok]
+
+# unsupported FMADD.S (fused multiply-add) [illegal instruction]...
+<RTE> Illegal instruction @ PC=0x0000071E, INST=0x1EA484C3 </RTE>
+[ok]
+
+# unsupported FMSUB.S (fused multiply-sub) [illegal instruction]...
+<RTE> Illegal instruction @ PC=0x0000075C, INST=0x1EA484C7 </RTE>
+[ok]
+
+# unsupported FNMSUB.S (fused negated multiply-sub) [illegal instruction]...
+<RTE> Illegal instruction @ PC=0x0000079A, INST=0x1EA484CF </RTE>
+[ok]
+
+# unsupported FNMADD.S (fused negated multiply-add) [illegal instruction]...
+<RTE> Illegal instruction @ PC=0x000007D8, INST=0x1EA484CF </RTE>
+[ok]
+
+[Zfinx extension verification successful!]
+```
--- a/sw/example/floating_point_test/neorv32_zfinx_extension_intrinsics.h
+++ b/sw/example/floating_point_test/neorv32_zfinx_extension_intrinsics.h
@ -178,7 +178,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fadds(float rs1, fl
  opa.float_value = rs1;
  opb.float_value = rs2;

-  res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0000000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
+  res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0000000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
  return res.float_value;
 }

@ -196,7 +196,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsubs(float rs1, fl
  opa.float_value = rs1;
  opb.float_value = rs2;

-  res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0000100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
+  res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0000100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
  return res.float_value;
 }

@ -214,7 +214,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmuls(float rs1, fl
  opa.float_value = rs1;
  opb.float_value = rs2;

-  res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0001000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
+  res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0001000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
  return res.float_value;
 }

@ -232,7 +232,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmins(float rs1, fl
  opa.float_value = rs1;
  opb.float_value = rs2;

-  res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
+  res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
  return res.float_value;
 }

@ -250,7 +250,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmaxs(float rs1, fl
  opa.float_value = rs1;
  opb.float_value = rs2;

-  res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010100, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
+  res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010100, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
  return res.float_value;
 }

@ -266,7 +266,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_fcvt_wus(float r
  float_conv_t opa;
  opa.float_value = rs1;

-  return CUSTOM_INSTR_R1_TYPE(0b1100000, 0b00001, opa.binary_value, 0b000, 0b1010011);
+  return CUSTOM_INSTR_R2_TYPE(0b1100000, 0b00001, opa.binary_value, 0b000, 0b1010011);
 }


@ -281,7 +281,7 @@ inline int32_t __attribute__ ((always_inline)) riscv_intrinsic_fcvt_ws(float rs1
  float_conv_t opa;
  opa.float_value = rs1;

-  return (int32_t)CUSTOM_INSTR_R1_TYPE(0b1100000, 0b00000, opa.binary_value, 0b000, 0b1010011);
+  return (int32_t)CUSTOM_INSTR_R2_TYPE(0b1100000, 0b00000, opa.binary_value, 0b000, 0b1010011);
 }


@ -295,7 +295,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fcvt_swu(uint32_t r

  float_conv_t res;

-  res.binary_value = CUSTOM_INSTR_R1_TYPE(0b1101000, 0b00001, rs1, 0b000, 0b1010011);
+  res.binary_value = CUSTOM_INSTR_R2_TYPE(0b1101000, 0b00001, rs1, 0b000, 0b1010011);
  return res.float_value;
 }

@ -310,7 +310,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fcvt_sw(int32_t rs1

  float_conv_t res;

-  res.binary_value = CUSTOM_INSTR_R1_TYPE(0b1101000, 0b00000, rs1, 0b000, 0b1010011);
+  res.binary_value = CUSTOM_INSTR_R2_TYPE(0b1101000, 0b00000, rs1, 0b000, 0b1010011);
  return res.float_value;
 }

@ -328,7 +328,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_feqs(float rs1,
  opa.float_value = rs1;
  opb.float_value = rs2;

-  return CUSTOM_INSTR_R2_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b010, 0b1010011);
+  return CUSTOM_INSTR_R3_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b010, 0b1010011);
 }


@ -345,7 +345,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_flts(float rs1,
  opa.float_value = rs1;
  opb.float_value = rs2;

-  return CUSTOM_INSTR_R2_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
+  return CUSTOM_INSTR_R3_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
 }


@ -362,7 +362,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_fles(float rs1,
  opa.float_value = rs1;
  opb.float_value = rs2;

-  return CUSTOM_INSTR_R2_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
+  return CUSTOM_INSTR_R3_TYPE(0b1010000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
 }


@ -379,7 +379,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsgnjs(float rs1, f
  opa.float_value = rs1;
  opb.float_value = rs2;

-  res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
+  res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
  return res.float_value;
 }

@ -397,7 +397,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsgnjns(float rs1,
  opa.float_value = rs1;
  opb.float_value = rs2;

-  res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
+  res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b001, 0b1010011);
  return res.float_value;
 }

@ -415,7 +415,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsgnjxs(float rs1,
  opa.float_value = rs1;
  opb.float_value = rs2;

-  res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b010, 0b1010011);
+  res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0010000, opb.binary_value, opa.binary_value, 0b010, 0b1010011);
  return res.float_value;
 }

@ -431,7 +431,7 @@ inline uint32_t __attribute__ ((always_inline)) riscv_intrinsic_fclasss(float rs
  float_conv_t opa;
  opa.float_value = rs1;

-  return CUSTOM_INSTR_R1_TYPE(0b1110000, 0b00000, opa.binary_value, 0b001, 0b1010011);
+  return CUSTOM_INSTR_R2_TYPE(0b1110000, 0b00000, opa.binary_value, 0b001, 0b1010011);
 }


@ -454,7 +454,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fdivs(float rs1, fl
  opa.float_value = rs1;
  opb.float_value = rs2;

-  res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0001100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
+  res.binary_value = CUSTOM_INSTR_R3_TYPE(0b0001100, opb.binary_value, opa.binary_value, 0b000, 0b1010011);
  return res.float_value;
 }

@ -472,7 +472,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fsqrts(float rs1) {
  float_conv_t opa, res;
  opa.float_value = rs1;

-  res.binary_value = CUSTOM_INSTR_R1_TYPE(0b0101100, 0b00000, opa.binary_value, 0b000, 0b1010011);
+  res.binary_value = CUSTOM_INSTR_R2_TYPE(0b0101100, 0b00000, opa.binary_value, 0b000, 0b1010011);
  return res.float_value;
 }

@ -494,7 +494,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmadds(float rs1, f
  opb.float_value = rs2;
  opc.float_value = rs3;

-  res.binary_value = CUSTOM_INSTR_R3_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1000011);
+  res.binary_value = CUSTOM_INSTR_R4_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1000011);
  return res.float_value;
 }

@ -516,7 +516,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fmsubs(float rs1, f
  opb.float_value = rs2;
  opc.float_value = rs3;

-  res.binary_value = CUSTOM_INSTR_R3_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1000111);
+  res.binary_value = CUSTOM_INSTR_R4_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1000111);
  return res.float_value;
 }

@ -538,7 +538,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fnmsubs(float rs1,
  opb.float_value = rs2;
  opc.float_value = rs3;

-  res.binary_value = CUSTOM_INSTR_R3_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1001011);
+  res.binary_value = CUSTOM_INSTR_R4_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1001011);
  return res.float_value;
 }

@ -560,7 +560,7 @@ inline float __attribute__ ((always_inline)) riscv_intrinsic_fnmadds(float rs1,
  opb.float_value = rs2;
  opc.float_value = rs3;

-  res.binary_value = CUSTOM_INSTR_R3_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1001111);
+  res.binary_value = CUSTOM_INSTR_R4_TYPE(opc.binary_value, opb.binary_value, opa.binary_value, 0b000, 0b1001111);
  return res.float_value;
 }