An Open-Source Framework to Enable Float16 On-Device Training on RISC-V Single-Core
2026-06-11, Poster Island C

This work proposes an open-source framework that leverages both the Zfh (scalar float16) and Zvfh (vector float16) extensions to enable complete on-device training on resource-constrained single-core RISC-V platforms. In addition to reducing the memory footprint by about 50% compared to float32, our approach facilitates transfer learning and fine-tuning scenarios through layer-freezing capabilities. Our work builds on AIfES, an open-source, modular, and generic DNN training and inference framework for embedded systems that can be extended with custom hardware-specific functions.


Various approaches have been proposed in recent years to mitigate the compute and memory demands of On-Device Training (ODT) of Deep Neural Networks, but these approaches struggle to support full-fledged training or batch sizes greater than 1. These limitations primarily stem from the design of hybrid methods, which employ quantized operations for the forward pass while reverting to float32 for the computationally expensive backward pass, ultimately leading to significant learning instability. RISC-V offers the scalar float16 extension Zfh and its vector counterpart Zvfh, which stand as promising candidates to meet the full ODT requirements: a lower memory footprint than float32 and SIMD execution through Zvfh. Although existing open-source RISC-V frameworks offer full scalar float16 ODT capabilities, they are specific to multi-core platforms. To address the lack of open-source ODT frameworks optimized for single-core RISC-V platforms supporting Zfh and/or Zvfh, we propose an easy-to-use open-source library that allows PyTorch/TensorFlow models to be deployed and fully trained on single-core RISC-V platforms featuring Zfh/Zvfh support.
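
As a rough illustration of the memory argument, the following minimal Python sketch estimates the bytes needed to hold parameters and gradients during training, in float32 versus float16, with and without a frozen layer. It is not part of the framework: the helper names, the 3-layer MLP shape, and the simplification of counting only parameters and gradients (no activations or optimizer state) are assumptions made for illustration.

```python
import struct

# Bytes per element for IEEE 754 half ("e") and single ("f") precision.
BYTES = {"float16": struct.calcsize("e"), "float32": struct.calcsize("f")}


def dense_params(n_in, n_out):
    """Number of weights plus biases in a fully connected layer."""
    return n_in * n_out + n_out


def training_bytes(layer_shapes, dtype, frozen=()):
    """Bytes for parameters plus gradients (hypothetical helper).

    Frozen layers keep their parameters but store no gradients.
    Activations and optimizer state are deliberately ignored.
    """
    size = BYTES[dtype]
    total = 0
    for idx, (n_in, n_out) in enumerate(layer_shapes):
        n = dense_params(n_in, n_out)
        total += n * size          # parameters are always stored
        if idx not in frozen:
            total += n * size      # gradients only for trainable layers
    return total


# Hypothetical 3-layer MLP: 64 -> 32 -> 16 -> 4
layers = [(64, 32), (32, 16), (16, 4)]
fp32 = training_bytes(layers, "float32")
fp16 = training_bytes(layers, "float16")
fp16_frozen = training_bytes(layers, "float16", frozen={0})

print(f"float32: {fp32} B, float16: {fp16} B, "
      f"float16 with layer 0 frozen: {fp16_frozen} B")
```

Under these assumptions, float16 storage is exactly half of float32, and freezing the largest layer removes its gradient buffer entirely, which is why layer freezing suits transfer learning and fine-tuning on memory-constrained targets.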


PhD student in machine learning security at CEA-Leti.