RISC-V and Floating-Point
Because Floating-Point Rocks
RISC-V support for floating-point arithmetic is a topic we have partially covered in a few previous posts, but we felt it deserved a full overview post of its own.
The RISC-V base ISA (RV32I or RV64I) does not define any floating-point instructions. Floating-point arithmetic is instead provided by extensions to the base ISA.
The map of ratified floating-point ISA extensions (and their dependencies) is presented in the figures below. The first figure presents both scalar and vector ratified floating-point extensions. The listed scalar extensions are built around a dedicated scalar floating-point register file (FRF).
There exists a mirror set of scalar floating-point extensions operating from the general purpose register file (XRF). They are listed in the figure below:
RISC-V original FP support: the F extension
RISC-V introduced support for floating-point through the F extension. This extension specifies a new register file (FRF) with FLEN-bit wide registers, and a set of operations to perform single precision floating-point (and related) operations. It builds on the 2008 revision of the IEEE-754 standard and includes basic operations (addition, subtraction, division/square root, conversions, …), but also the Fused Multiply-Add (FMA) which was not present in the original version of the IEEE-754 standard (1985).
RISC-V Architects made the choice of assigning a dedicated register file for floating-point operands and results of floating-point operations. We covered this in a previous post:
Compared to a unified integer [1]/floating-point register file, this choice implies additional cost for supporting floating-point, as it adds 32 extra registers and requires dedicated floating-point loads/stores and data moves between FRF and XRF. At the same time, it simplifies register allocation in programs and provides more flexibility to assign architectural storage to floating-point operands without competing for the same storage resources as general purpose operations. The separate register files offer an extra layer of flexibility: the general purpose register width, XLEN, can differ from the floating-point register width, FLEN, which allows the register size to be tuned to the actual workload needs. For example, RV32 + D requires 32 x 32-bit general purpose architectural registers (well, actually “31 x”, since x0 is hardwired to zero and needs no storage) and 32 x 64-bit floating-point registers; similarly, RV64 + F requires 31 x 64-bit general purpose registers and 32 x 32-bit floating-point registers: you just pay for what you need.
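To make the "pay for what you need" arithmetic concrete, here is a minimal sketch that counts architectural register storage for a given XLEN/FLEN pair. The function name arch_register_bits is ours (hypothetical), and the count deliberately ignores the program counter, fcsr and any other CSR state.

```python
def arch_register_bits(xlen, flen=None):
    """Bits of architectural register storage for a scalar RISC-V hart.

    Assumption: only the x and f register files are counted
    (no pc, no fcsr, no other CSRs).
    """
    x_bits = 31 * xlen                  # x0 is hardwired to zero: no storage
    f_bits = 32 * flen if flen else 0   # no FRF without F/D (e.g. Zfinx)
    return x_bits + f_bits


rv32_d = arch_register_bits(xlen=32, flen=64)  # 31*32 + 32*64 = 3040 bits
rv64_f = arch_register_bits(xlen=64, flen=32)  # 31*64 + 32*32 = 3008 bits
rv64_d = arch_register_bits(xlen=64, flen=64)  # 31*64 + 32*64 = 4032 bits
```

Under these assumptions, RV32 + D and RV64 + F end up with almost the same amount of register storage, simply distributed differently between the two register files.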
RISC-V natural choice for IEEE 754 Formats
The base floating-point extension to the RISC-V ISA, the F extension, specifies formats and operations which make implementations compliant with the IEEE 754 standard (in particular its 2008 revision).
The IEEE-754 standard is the most widely accepted floating-point arithmetic standard for CPUs, so selecting it appears as a natural choice for RISC-V.
You can find a brief history of the IEEE 754 standard(s) in this post:
The F extension defines RISC-V support for the single-precision format (IEEE-754’s binary32, sometimes dubbed FP32) and an associated set of general operations (including conversions from and to integer formats). RISC-V floating-point support is built around IEEE-754 formats.
Historically, the second Floating-Point extension is the D extension, bringing scalar support for double precision format (IEEE-754’s binary64) and instructions operating on it.
A Quadruple Precision (IEEE-754’s binary128) extension, dubbed Q, has also been specified. It seems it has seen only limited interest from the RISC-V community and limited adoption.
The original set of F, D, Q extensions has since been extended with support for half precision (IEEE 754’s binary16) which was added with the Zfh and Zfhmin instruction set extensions. Zfhmin is a strict subset of Zfh and only contains basic data moves and conversions.
Note: One of the rationales behind Zfhmin is that it allows implementers to choose to support binary16 only as a storage format and not bother with full scalar support, saving hardware while still allowing computation on half-precision data after a promotion to binary32.
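The storage-only usage model can be sketched in a few lines of Python. This is only an illustration of the load / promote / compute / demote pattern (roughly flh + fcvt.s.h, fadd.s, fcvt.h.s + fsh); the helper names are ours, and the sketch assumes round-to-nearest-even and results that stay within the binary16 range (struct raises OverflowError otherwise).

```python
import struct


def load_half(bits16):
    """Interpret a 16-bit pattern as binary16; widening is always exact."""
    return struct.unpack("<e", struct.pack("<H", bits16))[0]


def round_to_binary32(x):
    """Round a Python float (binary64) to the nearest binary32 value."""
    return struct.unpack("<f", struct.pack("<f", x))[0]


def store_half(x):
    """Round to binary16 (nearest-even) and return the 16-bit pattern."""
    return struct.unpack("<H", struct.pack("<e", x))[0]


def add_halves_via_binary32(a_bits, b_bits):
    """Add two binary16 values using only binary32 arithmetic."""
    a = load_half(a_bits)
    b = load_half(b_bits)
    # The sum of two binary16 values is exact in binary64, so the single
    # rounding below behaves like a binary32 addition.
    s = round_to_binary32(a + b)
    return store_half(s)


result = add_halves_via_binary32(0x3C00, 0x4000)  # 1.0 + 2.0 -> 0x4200 (3.0)
```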
The F extension (scalar support for single precision) is mandated for most of the other floating-point extensions. For example, it is required to enable half precision support (Zfh or Zfhmin) or vector floating-point support (Zve32f onward).
Later in this post, we will cover the IEEE-754 support on the vector side and also look at support for non-standard floating-point formats.
Zfa: additional floating-point scalar support
The F extension has been extended by the Zfa extension which offers a few useful scalar floating-point operations. Zfa operations include a floating-point load immediate instruction (with 32 useful floating-point constants), a set of quiet floating-point comparisons, various rounding to integer values (in floating-point format) and a few other operations.
We covered Zfa in more detail in this post:
Zfa is defined for the 3 standard formats (binary32, binary64, and binary16) with respective dependencies on the F, D, and Zfh extensions.
Note: fli.h is defined if and only if Zfh or Zvfh is implemented. Since Zvfh depends on Zfhmin, this means that Zfhmin is necessary but not sufficient for fli.h to exist. The rationale is that, with only Zfhmin and no vector support, fli.h is not very useful on its own, e.g. using fli.s makes more sense than fli.h followed by fcvt.s.h.
Zfinx: area constrained floating-point support
Part of the extra cost of RISC-V floating-point support, namely the additional floating-point register file, can be avoided by selecting the Zfinx family of Floating-Point support for RISC-V. In this case, floating-point operations operate from general purpose registers. Zfinx should be read as “Z-F-in-X”, and indicates that the operations from the F extension are implemented using the XRF, a.k.a. the general purpose register file.
RISC-V specifies Zfinx as the equivalent to the F extension (single precision support), Zdinx as the equivalent to the D extension, and finally Zhinx / Zhinxmin as the equivalents to Zfh / Zfhmin. Those extensions reuse the F / D / Zfh / Zfhmin encodings and remove a few instructions (namely floating-point loads, stores and moves between register files) which would be redundant with existing general purpose instructions.
Note: Zfinx does not specify NaN boxing when a 32-bit single-precision value is stored in a 64-bit register (RV64, XLEN=64); it mandates sign extension (same as for integer values).
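The difference between the two conventions is easy to see on bit patterns. The sketch below contrasts NaN boxing (used when a binary32 value sits in a wider floating-point register, e.g. FLEN=64 with the D extension) with the sign extension used for Zfinx results on RV64. Function names are hypothetical and only a 64-bit container is modeled.

```python
import struct


def f32_bits(x):
    """32-bit pattern of x rounded to binary32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def nan_box(bits32):
    """FRF convention: upper 32 bits are all ones."""
    return 0xFFFFFFFF00000000 | bits32


def sign_extend(bits32):
    """Zfinx convention on RV64: replicate bit 31 into the upper 32 bits."""
    if bits32 & 0x80000000:
        return 0xFFFFFFFF00000000 | bits32
    return bits32


one = f32_bits(1.0)         # 0x3F800000
minus_one = f32_bits(-1.0)  # 0xBF800000

boxed = nan_box(one)                 # 0xFFFFFFFF3F800000
extended = sign_extend(one)          # 0x000000003F800000
extended_neg = sign_extend(minus_one)  # 0xFFFFFFFFBF800000
```

Note that the two conventions only coincide for values whose sign bit is set.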
Floating-Point Support in RISC-V Vector
The baseline vector support for floating-point in RISC-V comes with RVV 1.0, which specifies support for single and double precision.
The support is even more comprehensive than on the scalar side:
RVV 1.0 defines vector variants for all existing scalar instructions, but also instructions without scalar counterparts:
Widening addition/subtraction/multiply/multiply-accumulate instructions
Reciprocal and reciprocal square-root estimate instructions
Narrowing float-to-float conversions with rounding towards odd
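Rounding towards odd is useful when a narrowing conversion is done in two steps (e.g. binary64 to binary32, then binary32 to binary16): it avoids the double-rounding error that two successive round-to-nearest steps can introduce. The sketch below emulates this in Python. It is a simplified model under stated assumptions: the helper names are ours, and to_f32_round_to_odd only handles positive, finite inputs that fit in the binary32 normal range.

```python
import struct


def f32_from_bits(b):
    return struct.unpack("<f", struct.pack("<I", b))[0]


def f32_bits_rne(x):
    """Bit pattern of x rounded to binary32 with round-to-nearest-even."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def to_f32_round_to_odd(x):
    """binary64 -> binary32 with round-to-odd (positive in-range x only)."""
    bits = f32_bits_rne(x)
    y = f32_from_bits(bits)
    if y == x:
        return y  # exact: no rounding needed
    # Inexact: x lies strictly between two adjacent binary32 values,
    # whose bit patterns are consecutive integers for positive floats.
    lo = bits if y < x else bits - 1
    hi = lo + 1
    # Pick the neighbour whose significand LSB is 1 (odd).
    return f32_from_bits(lo if lo & 1 else hi)


def to_f16_rne(x):
    """Round to binary16 with round-to-nearest-even, returned as a float."""
    return struct.unpack("<e", struct.pack("<e", x))[0]


# Just above the midpoint between two adjacent binary16 values.
x = 1.0 + 2.0**-11 + 2.0**-30

direct = to_f16_rne(x)                             # 1 + 2**-10
via_rne = to_f16_rne(f32_from_bits(f32_bits_rne(x)))  # 1.0 (double rounding)
via_rod = to_f16_rne(to_f32_round_to_odd(x))       # 1 + 2**-10, same as direct
```

In this example the intermediate round-to-nearest step lands exactly on a binary16 tie and the information that x was above the midpoint is lost, while the round-to-odd intermediate keeps a sticky odd bit that preserves it.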
One difference to notice is that, contrary to their scalar counterparts, the vector multiply-accumulate instructions are destructive: one of the operands is overwritten by the result. Multiple variants are defined with different “destructive” schemes:
vfmacc.vv vd, vs1, vs2, vm # vd[i] = (vs1[i] * vs2[i]) + vd[i] (addend destructive)
vfmadd.vv vd, vs1, vs2, vm # vd[i] = (vs1[i] * vd[i]) + vs2[i] (multiplier destructive)
Note: Both vfmacc and vfmadd implement FMA-like semantics: fused multiply-add with a single final rounding.
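The two destructive schemes can be modeled element-wise as follows. This is a rough sketch, not a reference model: it ignores masking (vm), vl/vtype, and special values (infinities, NaNs, signed zeros), and it emulates the single rounding in binary64 using exact rational arithmetic, whereas the real element format depends on SEW. All function names are hypothetical.

```python
from fractions import Fraction


def fma(a, b, c):
    """a*b + c with a single final rounding (finite binary64 inputs only)."""
    exact = Fraction(a) * Fraction(b) + Fraction(c)
    return float(exact)  # one correctly rounded conversion


def vfmacc_vv(vd, vs1, vs2):
    """vd[i] = (vs1[i] * vs2[i]) + vd[i]: the addend is overwritten."""
    return [fma(a, b, d) for a, b, d in zip(vs1, vs2, vd)]


def vfmadd_vv(vd, vs1, vs2):
    """vd[i] = (vs1[i] * vd[i]) + vs2[i]: a multiplicand is overwritten."""
    return [fma(a, d, b) for a, b, d in zip(vs1, vs2, vd)]


vd, vs1, vs2 = [1.0, 2.0], [3.0, 4.0], [5.0, 6.0]
acc = vfmacc_vv(vd, vs1, vs2)  # [16.0, 26.0]
mad = vfmadd_vv(vd, vs1, vs2)  # [8.0, 14.0]

# Single rounding matters: the exact product is 1 - 2**-60.
a, b, c = 1.0 + 2.0**-30, 1.0 - 2.0**-30, -1.0
fused = fma(a, b, c)  # -2**-60
unfused = a * b + c   # 0.0: the product was already rounded to 1.0
```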
The vector extension also defines vector-specific operations: floating-point reductions (sum/min/max), vector-scalar/vector-vector variants (including reverse vector-scalar operations such as vfrdiv.vf and vfrsub.vf).
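The "reverse" vector-scalar forms exist because subtraction and division are not commutative: with a scalar operand, both operand orders are needed. A minimal element-wise sketch (masking and vl ignored, Python floats standing in for the element type, helper names chosen to mirror the mnemonics):

```python
def vfsub_vf(vs2, rs1):
    """vd[i] = vs2[i] - f[rs1]"""
    return [x - rs1 for x in vs2]


def vfrsub_vf(vs2, rs1):
    """vd[i] = f[rs1] - vs2[i] (reverse subtraction)"""
    return [rs1 - x for x in vs2]


def vfdiv_vf(vs2, rs1):
    """vd[i] = vs2[i] / f[rs1]"""
    return [x / rs1 for x in vs2]


def vfrdiv_vf(vs2, rs1):
    """vd[i] = f[rs1] / vs2[i] (reverse division, e.g. element-wise reciprocal)"""
    return [rs1 / x for x in vs2]


diff = vfrsub_vf([1.0, 2.0], 10.0)   # [9.0, 8.0]
recip = vfrdiv_vf([2.0, 4.0], 1.0)   # [0.5, 0.25]
```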
Later extensions, Zvfh and Zvfhmin, added full (resp. minimal) vector support for half precision / binary16. More recently, minimal support for BFloat16 was added with Zvfbfmin (conversions) and Zvfbfwma (widening multiply-accumulate, with BF16 inputs and FP32 accumulation).
Support beyond IEEE 754 standard formats
A few non-standard floating-point formats have appeared in ratified or in-development RISC-V extensions. Some formats inherit from IEEE 754 patterns (e.g., BFloat16), while others differ more widely (e.g., OpenCompute’s OFP8).
Ratified Support for BFloat16
The first one is BFloat16 (a.k.a. BrainFloat16). This 16-bit floating-point format was never officially listed in any IEEE 754 specification (at least up to the 2019 revision) but draws from the standard pattern: same encodings, same bias definition, same special values. The original definition [2] corresponds to a truncation of IEEE 754 binary32 (keeping only the upper 16 bits), and the original specification of operations on BFloat16 numbers states that subnormal numbers should be flushed to zero. RISC-V does use the standard encoding definition but mandates full subnormal support (adhering to the IEEE 754 floating-point arithmetic mandate).
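Since BFloat16 is the upper half of a binary32 encoding, conversions can be sketched with simple integer manipulation. The snippet below shows the truncating conversion of the original definition next to a round-to-nearest-even variant. It is an illustrative sketch: the function names are ours, and the quiet NaN pattern 0x7FC0 returned for NaN inputs is an assumption made for this example.

```python
import struct


def f32_bits(x):
    """32-bit pattern of x rounded to binary32."""
    return struct.unpack("<I", struct.pack("<f", x))[0]


def bf16_truncate(x):
    """Original-style conversion: keep only the upper 16 bits of binary32."""
    return f32_bits(x) >> 16


def bf16_round_nearest_even(x):
    """binary32 -> BFloat16 with round-to-nearest, ties to even."""
    bits = f32_bits(x)
    if (bits & 0x7F800000) == 0x7F800000 and (bits & 0x007FFFFF):
        return 0x7FC0  # NaN input: return a quiet NaN (assumed pattern)
    lsb = (bits >> 16) & 1  # parity of the kept part, used to break ties
    return (bits + 0x7FFF + lsb) >> 16


def bf16_to_float(b):
    """BFloat16 -> binary32 is exact: pad with 16 zero bits."""
    return struct.unpack("<f", struct.pack("<I", (b & 0xFFFF) << 16))[0]


x = 1.0 + 2.0**-8 + 2.0**-9           # binary32 pattern 0x3F80C000
truncated = bf16_truncate(x)          # 0x3F80 -> 1.0
rounded = bf16_round_nearest_even(x)  # 0x3F81 -> 1.0078125
tie = bf16_round_nearest_even(1.0 + 2.0**-8)  # exact tie -> even: 0x3F80
```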
On the scalar side, RISC-V ISA can be extended with Zfbfmin (spec source) which defines basic data moves and BFloat16 conversions from/to single precision.
On the vector side, RISC-V can be extended with either basic data moves and conversions with Zvfbfmin or with widening multiply-accumulate with Zvfbfwma.
Those extensions were covered in a previous post:
Those 3 extensions are ratified and listed as optional in the user-level RISC-V Application profile, RVA23U64. Thus, it can be expected that many RISC-V implementations will support them.
Future extended vector support for BFloat16: Zvfbfa
A new extension bringing extended vector BFloat16 support to RVV is making its way through the RISC-V specification process: Zvfbfa.
The extension project is available as a pull-request against the official riscv-isa-manual repo. We covered this extension in this section from a previous post.
Zvfbfa represents a very large step towards full vector BFloat16 support compared to Zvfbfmin and Zvfbfwma: it offers almost as wide a support for BFloat16 as Zve32f does for binary32. The only excluded operations are “division, square root, reductions, and conversions to/from integers wider than 8 bits”.
Note: at the time of writing (April 2026), there are no specified instructions to convert between half precision and BFloat16. Such conversions would have to go through single precision. It is assumed that such cases are rare and would not justify the opcode allocation. Similarly, there are no mixed-format operations, e.g. a widening product between half precision and BFloat16 with a binary32 result.
Open Compute Formats
With the growing interest in small precision formats (e.g. 8-bit floating-point), RISC-V has been extended to support other non-IEEE floating-point formats. An example is Open Compute’s 8-bit floating-point formats, OFP8, with the Zvfofp8min vector extension project. This extension defines a few conversion instructions from/to both binary32 and BFloat16.
We cover this on-going extension project in RISC-V and MX Scaling Formats.
In that post we also presented another extension project to offer vector conversions from the OpenCompute 4-bit floating-point format E2M1 and the OFP8 E4M3 format: Zvfofp4min. At the time of writing, this extension is not an official RISC-V International project, but it is likely to become one if interest in smaller and smaller floating-point formats continues to grow.
The Future of Floating-Point Support in RISC-V
Let’s briefly review an incomplete subset of the upcoming RISC-V projects related to floating-point arithmetic.
Support for micro-scaling
It seems support for micro-scaling is a recurring ask and it will certainly materialize as one or more future RISC-V extensions.
Micro-scaling is a concept standardized by the OpenCompute MX scaling format specification, which defines the storage of, and operations on, a block of values associated with a shared scaling factor.
More formally, there are n stored values v_0, v_1, …, v_{n-1} and a stored scaling factor S, and the actual values represented are (v_0 * S), (v_1 * S), …, (v_{n-1} * S). So the storage of a value is split between an element and a shared scale factor. Assuming v_i uses an e-bit encoding and S uses an s-bit encoding, n values are represented using n*e + s bits, rather than n*(e+s). Depending on the overlap between element and scale encoding, the actual saving might be less than (n-1)*s bits, and the algorithm used to determine the scale may also reduce the accuracy of some or all of the elements.
Although scaling format support can already be implemented with existing or in-progress extensions (e.g. Zve32f + Zvfbfa + Zvfofp8min), it could be made more efficient by introducing instructions which operate directly on MX-encoded blocks: handling the interpretation of the block value as part of the operation without having to explicitly lay down a sequence of multiple instructions.
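The storage argument can be checked with a few lines of Python. The sketch below computes both bit counts and decodes a block by applying the shared scale; the function names and the example parameters (32 elements of 8 bits with an 8-bit scale) are chosen for illustration only, and elements and scale are plain Python floats rather than encoded values.

```python
def block_storage_bits(n, e, s):
    """Return (bits with a shared scale, bits with one scale per element)."""
    shared = n * e + s
    per_element = n * (e + s)
    return shared, per_element


def decode_block(scale, elements):
    """Represented values are v_i * S for every stored element v_i."""
    return [v * scale for v in elements]


shared, per_element = block_storage_bits(n=32, e=8, s=8)  # 264 vs 512 bits
saving = per_element - shared                             # (n-1)*s = 248 bits

values = decode_block(0.25, [1.0, -2.0, 3.0])  # [0.25, -0.5, 0.75]
```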
This future work applies to both OpenCompute MX-Scaling Formats and IEEE P3109 formats (described in the next section).
Support for IEEE P3109
IEEE is currently working on a standard for smaller formats (16-bit and less) targeted at machine learning applications and offering a very large space of possible formats.
We covered the status of the spec effort in this section of this previous post:
P3109 has made very different choices compared to OCP: while OCP specifies a small number of formats (2 OFP8 formats and 6 MX Scaling formats), P3109 defines a framework that can be used to specify hundreds of different formats. P3109 also defines a large number of primitive operations (including non-linear functions).
It is not the intent of P3109 that any hardware implementation support the full range of formats, but rather that each implementation selects a relevant subset. For RISC-V, this will certainly mean defining one or more profiles which the ecosystem could organize around.
P3109 is neither compatible with OFP8/MX-scaling formats, nor with the IEEE 754 definition (e.g. it does not match the definition requirement for emax).
Although no official RISC-V project to support P3109 exists at the time of writing, it is highly likely that one or more will be spawned once P3109 gets closer to ratification.
Conclusion
RISC-V offers a rich support for floating-point, both on the scalar and the vector sides. This support ranges from the low-end Zfinx (scalar floating-point support operating on the general purpose register file) to the extensive V extension (with full single and double precision vector support). This support has been extended with half precision and BFloat16 support; and other extensions are in development. We covered a few in this post, but there are many other on-going efforts involving floating-point at RISC-V International including:
Specification for Dot Product and Matrix Operations (these would justify their own post series).
long dot product, batched dot product
Various flavors of matrix extensions: IME, VME, AME
Specifications for Data-Independent Execution Latency Floating-Point operations (for the implementation of cryptographic primitives).
In its on-going development work, RISC-V International faces a few challenges. For example, RVI has to tackle the existence of multiple competing standards (ratified or in development) for small floating-point formats. There are two distinct (and often opposing) mandates: specify rapidly to address key market/ecosystem needs, and specify for the long term. One would not want to waste time and encoding space supporting a set of formats which would be abandoned by the time the extension specification is ratified. RVI can rely on its active member ecosystem to bubble up critical needs and work on long-term support while the member companies continue to experiment with custom extensions.
To stay informed about the on-going floating-point development at RVI (or contribute), you can join the following RISC-V groups:
The Floating-Point Special Interest Group (SIG FP): https://lists.riscv.org/g/sig-fp
The Vector SIG: https://lists.riscv.org/g/sig-vector
The Unprivileged ISA Committee (Unpriv IC): https://lists.riscv.org/g/tech-unprivileged/
Architectural Review Committee Mailbox: https://lists.riscv.org/g/tech-arch-review (where requests for extension ARC reviews are posted)
Cheat Sheet: RISC-V Floating-Point Extensions
minimal support often means support for conversions and data moves but no arithmetic operations.
Each extension name should be a link to the source of its specification on riscv-isa-manual or a pull-request introducing the specification (for in-development)
Scalar extensions with dedicated floating-point registers:
F: baseline scalar floating-point support for single precision (binary32)
D: baseline scalar floating-point support for double precision (binary64)
Q: baseline scalar floating-point support for quad precision (binary128)
Zfh: baseline scalar floating-point support for half precision (binary16)
Zfhmin: minimal scalar floating-point support for half precision (binary16)
Zfa: additional scalar floating-point support (extends F, D, Zfh/Zvfh)
Zfbfmin: minimal scalar floating-point support for BFloat16
Scalar extensions operating from general purpose registers:
Zfinx: baseline scalar floating-point support for single precision (binary32)
Zdinx: baseline scalar floating-point support for double precision (binary64)
Zhinx: baseline scalar floating-point support for half precision (binary16)
Zhinxmin: minimal scalar floating-point support for half precision (binary16)
Vector extensions:
Zve32f: baseline vector floating-point support for single precision (binary32)
Zve64d: baseline vector floating-point support for double precision (binary64)
Zvfh: baseline vector floating-point support for half precision (binary16)
Zvfhmin: minimal vector floating-point support for half precision (binary16)
Zvfbfmin: minimal vector floating-point support for BFloat16
Zvfbfwma: widening BFloat16 multiply-accumulate vector floating-point support
Zvfbfa: baseline vector floating-point support for BFloat16 (in development)
Zvfofp8min: minimal vector floating-point support for OFP8 (in development)
Zvfofp4min: minimal vector floating-point support for OFP4 (project, unofficial repo)
Notes:
The dependencies are not listed here (see figure at the beginning of this post): e.g. supporting D means supporting F (so both baseline single and double precision scalar support).
The Zve64d and V extensions offer identical floating-point support (Zve64d implies Zve32f)
Other posts relevant to Floating-Point:
Some Extra Reference(s):
BFloat16 Definition: BFloat16: The secret to high performance on Cloud TPUs
[1] "General purpose" would be more apt than "integer" register file.
[2] BFloat16 seems to have been introduced at the Google I/O 2017 conference, where it was mentioned as a new feature of Google's TPU v2 processor (covered in a later blogpost: https://cloud.google.com/blog/products/ai-machine-learning/bfloat16-the-secret-to-high-performance-on-cloud-tpus)











