Carlos Rafael Tordoya Taquichiri RISC-V Summit Europe 2026

Carlos Rafael Tordoya Taquichiri

Rafael Tordoya is a Research Associate at the Zurich University of Applied Sciences (ZHAW), working in Artificial Intelligence (AI) and Embedded Systems. His research focuses on applying AI in resource-constrained environments, with particular emphasis on optimized AI inference, mathematical backends, and exploiting available hardware capabilities to improve inference performance in embedded systems.

Session

06-11

10:40

10min

RISC-V Packed-SIMD Acceleration for Quantized Edge-AI Inference on Space-Qualified Platforms

Carlos Rafael Tordoya Taquichiri

Conservative, qualification-sensitive RISC-V ecosystems tend to view large architectural changes as costly because of hardware overhead, integration effort, software and toolchain adaptation, and assurance scope. This is especially relevant for platforms intended for harsh environments and long lifetimes, such as space-oriented and radiation-tolerant platforms (e.g., NOEL-V). At the same time, there is growing interest in on-board processing to support time-critical decisions close to the sensor and to reduce reliance on transmitting raw sensor data, which increases the demand for compute-intensive Edge-AI inference. In such settings, full vector architectures can deliver high throughput, but they tend to introduce additional architectural state and increase integration complexity across the hardware and software stack. To introduce data-parallel acceleration with minimal disruption, we therefore evaluate packed-SIMD as a small-change alternative: packed subword parallelism that remains close to the existing register and memory model.
We consider two packed-SIMD options: SWAR and SPARROW. On a NOEL-V softcore, we implement SWAR operator kernels for the most computationally expensive layers and integrate them into the math backend of a space-prequalified inference engine running on a space-prequalified RTOS (RTEMS6 SMP). Using a hardware SWAR unit for packed subword operations, we report full-model results with and without SWAR acceleration, showing improved inference performance without requiring a full vector architecture. Finally, we outline future work that extends the same backend methodology to SPARROW to compare performance across packed-SIMD options.
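
To make the notion of packed subword parallelism concrete, the following is a minimal sketch in portable C of a lane-wise addition of eight 8-bit values held in one ordinary 64-bit register. It is an illustration only: the abstract does not describe the instruction set of the hardware SWAR unit or the kernels actually implemented, so the function names `swar_add8` and `add_i8_swar`, the 64-bit word width, and the choice of wrapping (non-saturating) addition are all assumptions made for this example.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* Top bit of each 8-bit lane in a 64-bit word (hypothetical helper constant). */
#define LANE_MSB 0x8080808080808080ULL

/* Lane-wise wrapping add of eight 8-bit values packed into one 64-bit word.
 * Illustrative software emulation; not the abstract's hardware SWAR unit. */
uint64_t swar_add8(uint64_t a, uint64_t b)
{
    /* Add only the low 7 bits of each lane, so no carry can cross
     * from one lane into its neighbour. */
    uint64_t low = (a & ~LANE_MSB) + (b & ~LANE_MSB);
    /* Fold the top bit of each lane back in with a carry-free XOR. */
    return low ^ ((a ^ b) & LANE_MSB);
}

/* Element-wise add of two int8 vectors (modulo 256), eight elements per step.
 * Assumption for brevity: n is a multiple of 8, so there is no scalar tail. */
void add_i8_swar(const int8_t *x, const int8_t *y, int8_t *out, size_t n)
{
    assert(n % 8 == 0);
    for (size_t i = 0; i < n; i += 8) {
        uint64_t a, b, r;
        memcpy(&a, x + i, sizeof a);   /* pack eight int8 values into a word */
        memcpy(&b, y + i, sizeof b);
        r = swar_add8(a, b);           /* one word operation covers 8 lanes  */
        memcpy(out + i, &r, sizeof r); /* unpack the eight results           */
    }
}
```

The sketch uses only the general-purpose register file and plain loads and stores, which is the sense in which packed-SIMD stays close to the existing register and memory model. A dedicated hardware unit would remove the masking overhead shown here; quantized inference kernels additionally need multiply-accumulate operations, which this example does not attempt to cover.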

Blind Submission (Default)

Poster Island C
