2026-06-11, Poster Island C
This work enables optimized end-to-end inference of object detection models on RISC-V vector CPUs. It implements optimized pre- and post-processing pipelines and enables efficient model execution at FP32, FP16, and INT8 precision. IREE, an MLIR-based compiler, is used to compile and optimize the models. Inference on the Banana Pi BPI-F3 is profiled to identify the top hotspot ops, whose compilation is then optimized in the IREE pipeline either by improving vectorization or by implementing microkernels (ukernels). For accuracy validation, the mean Average Precision (mAP) is computed on the COCO validation dataset. This project is supported by the RISC-V Software Ecosystem (RISE), and all developed artifacts are open source.
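The abstract does not list the individual post-processing steps. For illustration only, one step commonly used with YOLO-style detectors is non-maximum suppression (NMS), which drops lower-scoring boxes that overlap a higher-scoring one. The following is a minimal plain-Python sketch, assuming boxes in (x1, y1, x2, y2) form and an IoU threshold of 0.5; the names `iou` and `nms` are hypothetical, and this is not the project's RVV-optimized implementation.

```python
from typing import List, Sequence, Tuple

# Assumed box layout for this sketch: (x1, y1, x2, y2) corner coordinates.
Box = Tuple[float, float, float, float]


def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    ix1, iy1 = max(a[0], b[0]), max(a[1], b[1])
    ix2, iy2 = min(a[2], b[2]), min(a[3], b[3])
    # Clamp to zero so disjoint boxes yield no intersection area.
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


def nms(boxes: Sequence[Box], scores: Sequence[float],
        iou_threshold: float = 0.5) -> List[int]:
    """Greedy NMS: return indices of kept boxes, highest score first."""
    order = sorted(range(len(boxes)), key=lambda i: scores[i], reverse=True)
    keep: List[int] = []
    for i in order:
        # Keep a box only if it does not overlap any already-kept box too much.
        if all(iou(boxes[i], boxes[j]) <= iou_threshold for j in keep):
            keep.append(i)
    return keep


if __name__ == "__main__":
    boxes = [(0, 0, 10, 10), (1, 1, 11, 11), (20, 20, 30, 30)]
    scores = [0.9, 0.8, 0.7]
    print(nms(boxes, scores))
```

With the sample input, the second box overlaps the first with an IoU of 81/119 (about 0.68) and is suppressed, so indices 0 and 2 are kept. The pairwise loop is the kind of hotspot that a vectorized implementation would restructure; this sketch favors readability over speed.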
This work is done in collaboration with the RISC-V Software Ecosystem (RISE) under RISE RP018 - Enabling and Optimizing IREE AI/ML e2e Models for High-Performance RISC-V Hardware - Yolov7/v8.
This work implements efficient pre- and post-processing pipelines for detection models on RISC-V vector CPUs and improves IREE compilation for RISC-V, bringing it on par with x86 and ARM. All artifacts developed in the project are open source, and the improvements made to IREE are planned to be merged into the upstream IREE repository.
I am a compiler engineer at 10xEngineers, working on enabling the compilation of LLMs and vision models for custom hardware and accelerators using IREE, an MLIR-based AI compiler. I have experience in writing optimized kernels for RISC-V Vector (RVV) and custom hardware, and in LLVM middle-end and backend development.