2026-06-10, Poster Island D
Deploying high-performance AI inference on autonomous drones requires balancing computational throughput against a strict 1W power envelope. This paper presents a vertical design space exploration (DSE) of the RISC-V Gemmini accelerator, scaling from 8x8 to 32x32 mesh configurations in the SkyWater 130nm process. In an end-to-end evaluation of a YOLOv4-tiny model on the VisDrone dataset, we demonstrate a 74.75% reduction in model memory footprint via INT8 quantization and a speedup of up to 2352x over a RISC-V CPU baseline. Our results indicate that while the 32x32 mesh delivers the highest peak throughput, the 16x16 mesh is the “sweet spot” for 1W-limited drone chiplets, combining high performance with manageable leakage and area.
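The arithmetic behind two of these figures can be sketched as follows. This is a minimal illustration, not the authors' evaluation code: the function names, the parameter count, and the `overhead_bytes` term are hypothetical. It assumes the unquantized model stores weights as 4-byte FP32 values, so INT8 storage alone would give a 75% reduction; a small amount of unquantized data (for example scales or zero points) would pull the figure slightly below 75%, which is one possible reading of the reported 74.75%, not something the abstract states. It also assumes each mesh is a square array of processing elements, so the element count grows with the square of the mesh dimension.

```python
def int8_footprint_reduction(n_params: int, overhead_bytes: int = 0) -> float:
    """Percent reduction in weight storage when moving from FP32 to INT8.

    n_params       -- number of weights (hypothetical input)
    overhead_bytes -- extra bytes kept alongside the INT8 weights, such as
                      quantization scales (an assumption for illustration)
    """
    fp32_bytes = 4 * n_params               # 4 bytes per FP32 weight
    int8_bytes = n_params + overhead_bytes  # 1 byte per INT8 weight plus extras
    return 100.0 * (1 - int8_bytes / fp32_bytes)


def mesh_pe_count(dim: int) -> int:
    """Number of processing elements in a square dim x dim mesh."""
    return dim * dim


if __name__ == "__main__":
    # Illustrative parameter count only; not taken from the paper.
    n_params = 6_000_000
    print(f"ideal INT8 reduction: {int8_footprint_reduction(n_params):.2f}%")

    # Mesh sizes named in the abstract: 8x8, 16x16, 32x32.
    for dim in (8, 16, 32):
        print(f"{dim}x{dim} mesh: {mesh_pe_count(dim)} processing elements")
```

Each doubling of the mesh dimension quadruples the number of processing elements, which is why area and leakage weigh increasingly against peak throughput under a fixed power budget.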