Energy-Efficiency Optimization of a RISC-V Floating-Point Unit for HPC-Oriented Architectures
As High-Performance Computing (HPC) advances towards the Exascale era, energy efficiency has become the primary design constraint. In HPC systems, the Floating-Point Unit (FPU) is instantiated in massive numbers to support parallel workloads that require a huge number of floating-point computations. Consequently, the FPU becomes a dominant consumer of dynamic power within the chip. This work presents an energy-optimized FPU for RISC-V Vector Processing Units. To address the inefficiencies of the standard unified Fused Multiply-Add (FMA) datapath, we propose a Split-Path FMA microarchitecture tailored for the RISC-V Vector specification. Our design combines physical separation of the arithmetic pipelines with vector-aware clock gating and operand isolation. Evaluated in a commercial 4 nm technology at 2 GHz, the optimized design demonstrates an increase of up to 29% in energy efficiency for mixed-arithmetic workloads and a 7.8% performance speedup in vector reduction-heavy kernels.