End-to-End On-Device Transformer Training on Ultra-Low Power RISC-V MCU
2026-06-10, Devzone

This demo showcases complete end-to-end Transformer training running locally on the GAP9 RISC-V MCU. On-device training is crucial for applications that operate in dynamically changing environments. One example is biosignal DNNs in wearable devices, where cross-subject transfer and long-term temporal drift degrade performance. RISC-V MCUs are already widely used for edge DNN deployment. However, most existing work either targets inference only or fine-tunes just a small portion of the network.

We extended the Deeploy compiler to generate training code. Deeploy generates bare-metal C code from an ONNX graph and was originally tailored for efficient inference. To support training, we added the missing critical kernels, such as optimizers and in-place gradient accumulators. Stable training requires batching, which is costly in memory on an MCU, so we implement gradient accumulation to reduce the memory footprint. We also extended the ONNX Runtime training API to generate training graphs optimized for edge deployment. This extension is released at https://github.com/pulp-platform/ONNX4Deeploy. The demo video showcases the full workflow, from training graph optimization to code generation and on-board execution. The video is available at https://drive.google.com/file/d/16BMiHn0jyMvScFJD7AGTwHpA4Rc0aMnC/view?usp=drive_link and will be uploaded to the PULP Platform YouTube channel.
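As an illustration of the gradient accumulation and optimizer kernels mentioned above, the following is a minimal sketch in plain C. It is not the code Deeploy generates. The function names `grad_accumulate` and `sgd_step`, the float data type, and the choice of plain SGD with gradient averaging are all assumptions made for this example.

```c
#include <stddef.h>

/* Hypothetical kernel (not Deeploy's): add the gradient of one micro-batch
 * into the accumulator in place, so that only a single gradient buffer per
 * weight tensor has to stay in memory. */
static void grad_accumulate(float *acc, const float *grad, size_t n) {
    for (size_t i = 0; i < n; ++i) {
        acc[i] += grad[i];
    }
}

/* Hypothetical kernel (not Deeploy's): plain SGD update using the mean of
 * the accumulated gradients, followed by clearing the accumulator so it is
 * ready for the next accumulation window. */
static void sgd_step(float *weights, float *acc, size_t n,
                     float lr, unsigned num_micro_batches) {
    const float scale = lr / (float)num_micro_batches;
    for (size_t i = 0; i < n; ++i) {
        weights[i] -= scale * acc[i];
        acc[i] = 0.0f;
    }
}
```

In this sketch, a training loop would call `grad_accumulate` once per micro-batch after the backward pass and `sgd_step` once every `num_micro_batches` micro-batches. Activation memory is then sized for one micro-batch, while the weight update matches that of the larger effective batch.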

See also: Abstract PDF (215.6 KB)

Victor Jean-Baptiste Jung received his Bachelor’s Degree in Computer Science and Engineering Physics from Juniata College, and his Master’s Degree in Computer Science from the Institut Supérieur de l’Electronique et du Numérique of Lille (ISEN Lille) in 2022. After three months as a research intern with KU Leuven’s MICAS research group, supervised by Prof. Marian Verhelst, he is currently pursuing his Ph.D. at the Integrated Systems Laboratory with Prof. Dr. Luca Benini. His current research interests include efficient deployment of ML models on microcontrollers, tiny Transformers, scheduling, and quantization.