Support of half precision floating-point format in RISC-V ISA: Zfh(min), Zvfh(min)

Brief survey of RISC-V scalar and vector support for 16-bit floating-point

Nov 03, 2022

Updated February 17th, 2024 to correct the description of how IEEE-754 2008 and 2019 specify binary16: it is not a basic format, but it is not merely an interchange/storage format either (thanks to a message from Mantas Mikaitis on the IEEE-754 mailing list).

Floating-point support in RISC-V ISA

RISC-V does not mandate support for any floating-point operation in the base ISA (RV32I and RV64I), but the single-precision and double-precision extensions (e.g. RV64F and RV64D) were among the first to be specified. The F extension adds 32 floating-point registers, floating-point arithmetic operations, and move and conversion instructions. The D extension (which requires the F extension) extends this set of operations to double precision. There is also a Q extension for quad precision.

In reality, there are multiple F and D extensions: RV32F (to extend the 32-bit base ISA) and RV64F (for the 64-bit base ISA), and similarly RV32D and RV64D. RV32D has the particularity of adding 64-bit wide floating-point registers while the associated general-purpose registers are only 32-bit wide. This flexibility of the RISC-V ISA is reviewed in the post: https://fprox.substack.com/p/risc-v-register-files

Initially, smaller floating-point formats, such as half precision, were not supported.

Half precision used to lag behind in terms of ISA and hardware support. A common misconception is that it was only specified as a storage format in the 2008 revision of the IEEE-754 standard (the IEEE standard specifying floating-point formats and operations). This is not true: binary16 arithmetic was specified and standardized. It is simply not specified as a basic format, which means that supporting it alone is not enough to be considered IEEE-754 compliant.

The momentum of deep learning and convolutional neural networks has kick-started a renewed interest in small number formats, and in particular in half precision (among many others).

RISC-V International (the association behind the RISC-V specification(s)) recently ratified two extensions specifying sets of instructions for half-precision support: Zfh and Zfhmin. The latter being defined as a subset of the former, we will review Zfh first.

Zfh: full half-precision support

Zfh extends the floating-point register file (FRF) to support half precision. It requires the F extension, so neither a new register file nor a wider register is needed: the floating-point registers are already wide enough to hold single-precision values.

Like single-precision values in RV32D/RV64D, half-precision values are stored in the wider FLEN-bit registers using the RISC-V NaN-boxing scheme (more on that in a future post).
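
As a rough illustration of NaN boxing, the sketch below models a floating-point register as a plain integer of FLEN bits: the half-precision pattern sits in the low 16 bits and every upper bit is set to one. The helper names (`nan_box_half`, `unbox_half`) are made up for this example, and treating an improperly boxed value as the canonical NaN (0x7E00 for half precision) reflects my reading of the RISC-V specification rather than anything stated in this post.

```python
import struct

HALF_CANONICAL_NAN = 0x7E00  # canonical quiet NaN pattern for binary16


def nan_box_half(h_bits: int, flen: int = 32) -> int:
    """Place a 16-bit pattern in an FLEN-bit register image, upper bits all ones."""
    assert 0 <= h_bits <= 0xFFFF
    upper_ones = ((1 << flen) - 1) & ~0xFFFF
    return upper_ones | h_bits


def unbox_half(reg: int, flen: int = 32) -> int:
    """Return the 16-bit pattern if properly boxed, else the canonical NaN."""
    upper_ones = ((1 << flen) - 1) & ~0xFFFF
    if (reg & upper_ones) == upper_ones:
        return reg & 0xFFFF
    return HALF_CANONICAL_NAN


# 1.0 encoded as binary16 (struct's "e" format is IEEE-754 half precision).
one_half = struct.unpack("<H", struct.pack("<e", 1.0))[0]
boxed_in_64 = nan_box_half(one_half, flen=64)
print(hex(one_half), hex(boxed_in_64))
```

With FLEN=64 (D extension present), the same half-precision value simply carries 48 leading ones instead of 16.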

This new extension defines:

Instructions to move data (with or without conversions):
- flh: load from memory into an F register
- fsh: store to memory from an F register
- fmv.h.x, fmv.x.h: bit pattern move between X and F register files (unmodified)
- fcvt.h.(w/l)[u], fcvt.(w/l)[u].h: conversions between X and F register files (from/to integer)
Arithmetic operations: fadd.h, fsub.h, fmul.h, fsqrt.h, fdiv.h, fmin.h, fmax.h, fmadd.h, fmsub.h, fnmadd.h, fnmsub.h
Floating-point comparisons: feq.h, flt.h, fle.h
Conversions between half precision and other floating-point formats: fcvt.s.h, fcvt.h.s (and fcvt.d.h, fcvt.h.d when the D extension is present)
Miscellaneous: fclass.h, fsgnj.h, fsgnjn.h, fsgnjx.h

All arithmetic operations have uniform operand and result formats: all operands are half-precision values, and the output is a half-precision result. They share their opcode with the corresponding F/D instructions: in the fmt field (bits 26:25), the value 2'b10 (2) encodes half precision (2'b00 encodes single precision, 2'b01 double, and 2'b11 quad, but who needs so much precision!).

The following diagram represents the data move instructions:

The availability of moves and conversions with 64-bit operands depends on the availability of RV64 (general-purpose moves and integer conversions) and of the D extension (floating-point conversions). Although certainly anecdotal, the same holds for the Q extension (floating-point quad precision, a.k.a. the 128-bit format): when it is available, instructions for conversions from/to quad precision are defined, fcvt.h.q and fcvt.q.h.

Zfhmin: reduced half-precision support

The extension Zfhmin can be seen as a subset of Zfh. It only mandates support for half-precision load and store, bit pattern moves from/to integer register file (no conversions with integer formats) and conversions with other floating-point formats. It represents a total of 6 instructions (extended to 8 with conversion from/to double precision format if the D extension is supported).

Zfhmin constitutes a reduced set of instructions which can be used by platforms where computing with half-precision values directly is not required but which still require the capability to manipulate them, in particular in memory, before converting them to a larger format for an eventual computation.
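
The Zfhmin usage pattern described above (load a half, widen it, compute in a larger format, narrow the result, store it) can be sketched in software as follows. This is only a behavioral model built on Python's `struct` module: the function names mirror the instruction mnemonics for readability, `fadd_s` stands in for the F extension's single-precision add, and overflow on narrowing is not modelled (`struct` raises `OverflowError` where the hardware would return an infinity).

```python
import struct


def flh(mem: bytes, offset: int) -> int:
    """Model of flh: fetch a raw 16-bit pattern from little-endian memory."""
    return struct.unpack_from("<H", mem, offset)[0]


def fcvt_s_h(h_bits: int) -> float:
    """Model of fcvt.s.h: widen half to single (always exact)."""
    return struct.unpack("<e", struct.pack("<H", h_bits))[0]


def fadd_s(a: float, b: float) -> float:
    """Model of fadd.s: add, then round the result to single precision."""
    return struct.unpack("<f", struct.pack("<f", a + b))[0]


def fcvt_h_s(x: float) -> int:
    """Model of fcvt.h.s: narrow single to half, returning the 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


# Two half-precision values stored back to back in memory.
memory = struct.pack("<ee", 1.5, 2.25)
a = fcvt_s_h(flh(memory, 0))
b = fcvt_s_h(flh(memory, 2))
result_bits = fcvt_h_s(fadd_s(a, b))
print(hex(result_bits))  # 0x4380, the binary16 pattern of 3.75
```

No half-precision arithmetic is ever performed here: the only operations touching 16-bit data are the load and the two conversions, which is exactly the subset Zfhmin provides.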

Vector support for half precision: Zvfh and Zvfhmin

The RISC-V Vector extension (RVV) version 1.0 specified vector support for single and double precision (SEW=32 and 64 bits). A draft of the specification (link to source) introduces Zvfh, which extends all the floating-point instructions to half precision, including conversions with other formats, and Zvfhmin, which is a much smaller subset of operations. As is the case for other floating-point formats, half precision is selected by a specific value of the vsew field of the vtype configuration register: vsew=1. This encoding corresponds to a Selected Element Width (SEW) of 16 bits and is identical for integer and floating-point formats.
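
The relation between vsew and SEW can be written down in a few lines. The sketch below assumes the RVV 1.0 layout of vtype as I understand it (vlmul in bits 2:0, vsew in bits 5:3, vta in bit 6, vma in bit 7); the helper names are invented for this example.

```python
SEW_TO_VSEW = {8: 0b000, 16: 0b001, 32: 0b010, 64: 0b011}


def make_vtype(sew: int, vlmul: int = 0, vta: int = 0, vma: int = 0) -> int:
    """Build the low 8 bits of vtype for a given element width."""
    return (vma << 7) | (vta << 6) | (SEW_TO_VSEW[sew] << 3) | vlmul


def sew_from_vtype(vtype: int) -> int:
    """Decode the Selected Element Width: SEW = 8 << vsew."""
    vsew = (vtype >> 3) & 0b111
    if vsew > 0b011:
        raise ValueError("reserved vsew encoding")
    return 8 << vsew


# Equivalent of the e16, m1, ta, ma configuration.
vtype_e16 = make_vtype(16, vlmul=0, vta=1, vma=1)
print(hex(vtype_e16), sew_from_vtype(vtype_e16))
```

Nothing in vtype says whether the 16-bit elements are integers or half-precision values; that choice is made by the instruction that consumes them.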

Zvfhmin only mandates the support of conversion between half and single precision: it extends the support of vfwcvt.f.f.v (widening half-to-float) and vfncvt.f.f.w (narrowing float-to-half) to SEW=16-bit.

Zvfh extends all the floating-point vector operations, floating-point reductions, and floating-point moves to half precision. A brief description of those operations can be found in Part 2 of our series RISC-V Vector Extension in a Nutshell.

Both Zvfhmin and Zvfh mandate support for single-precision elements in vectors. On top of this, Zvfh mandates at least Zfhmin on the scalar floating-point side.
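
The requirements listed in this post can be collected in a small table and checked mechanically. The sketch below is illustrative only: the `REQUIRES` table and the `missing_dependencies` helper are made up for this example, "Zve32f" is the name I assume for single-precision vector support, Zfhmin's dependency on F comes from my reading of the specification, and implied extensions (for instance V implying Zve32f) are deliberately not modelled beyond Zfh covering Zfhmin.

```python
# Direct requirements only; an assumption-laden summary, not a full ISA model.
REQUIRES = {
    "D": {"F"},
    "Zfh": {"F"},
    "Zfhmin": {"F"},
    "Zvfhmin": {"Zve32f"},
    "Zvfh": {"Zve32f", "Zfhmin"},
}


def missing_dependencies(enabled):
    """Map each enabled extension to the requirements absent from `enabled`."""
    effective = set(enabled)
    if "Zfh" in effective:
        effective.add("Zfhmin")  # Zfh is a superset of Zfhmin
    missing = {}
    for ext in sorted(enabled):
        absent = REQUIRES.get(ext, set()) - effective
        if absent:
            missing[ext] = absent
    return missing


print(missing_dependencies({"F", "Zfh", "Zve32f", "Zvfh"}))
print(missing_dependencies({"F", "Zvfh"}))
```

The first configuration is self-consistent, while the second one reports that Zvfh lacks both its vector single-precision support and its scalar Zfhmin counterpart.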

Half precision in RVA22 profile

The RISC-V consortium defines profiles. These profiles aim at defining a common set of mandatory extensions and a reduced set of optional extensions, which hardware and software providers can use to build a compatible ecosystem without having to deal with more specialized ISA extensions. Profile descriptions can be found on the RISC-V GitHub. RVA22 is the most recent profile; it targets 64-bit application processors. Zfhmin is one of the mandatory extensions of the RVA22 profile, while Zfh is an optional extension (which supersedes Zfhmin when selected). This means that all application processors targeting compatibility with the RISC-V ecosystem must have minimal support for half precision, and that extended support is part of the extended profile. Neither of the vector extensions, Zvfh or Zvfhmin, is required in the RVA22 profile.

Conclusion

RISC-V half-precision support in scalar operations has already been ratified (as the Zfh and Zfhmin extensions) and is part of the latest application profile (RVA22). There exist draft specifications for half-precision support in vector operations: Zvfh and Zvfhmin. Other formats, such as BFloat16 or more esoteric number formats, should follow in the coming years.

RISC-V is an open community, so do not hesitate to sign up to stay up to date and participate in the effort: http://riscv.org.

This post was revised on February 3rd, 2023 for publication on Substack and later updated on February 17th, 2024 to correct the claim that IEEE-754 specifies binary16 only as a storage/interchange format.

References:

RISC-V specification: "Zfh" and "Zfhmin" Standard Extensions for Half-Precision Floating-Point
RISC-V RVA22 profiles https://github.com/riscv/riscv-profiles/blob/main/profiles.adoc#6-rva22-profiles
RISC-V vector draft specification for Zvfh section
RISC-V vector draft specification for Zvfhmin section

