Yueh-Feng Lee RISC-V Summit Europe 2026

Yueh-Feng Lee

Dr. Yueh-Feng Lee received his Ph.D. in computer science from National Chiao Tung University. He previously worked at MediaTek and the Industrial Technology Research Institute. His areas of focus include AI compilers and runtimes, hypervisor technology, and embedded systems.

Session

06-09

11:20

10min

Accelerating LLM Inference on Edge RISC-V CPUs via Vector Extension Instructions and Flash Attention

Yueh-Feng Lee

In this work, we optimize LLM inference on edge RISC-V CPUs using vector extension instructions. We leverage 4-bit vector load and efficient 8-bit dot-product instructions to accelerate quantized and repacked 4-bit kernels in llama.cpp. In addition, we implement RVV support for tiled flash attention, which further improves performance in the prefill stage. Experimental results show that the proposed optimizations achieve a 1.72x-2.14x speedup over the upstream implementation while maintaining near-linear scaling for prefill workloads on an RVV-enabled multi-core platform.
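For readers unfamiliar with the tiling idea behind flash attention, the following is a minimal scalar sketch in plain Python, not the RVV implementation described in the abstract. It processes keys and values one tile at a time and keeps a running maximum and running sum (an "online softmax"), so the full score row never has to be materialized. The function names, the `tile` parameter, and the single-query setup are assumptions made for illustration only.

```python
import math

def naive_attention(q, keys, values):
    """Reference: softmax(q . k / sqrt(d)) weighted sum of values, for one query."""
    d = len(q)
    scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in keys]
    m = max(scores)  # subtract the max for numerical stability
    weights = [math.exp(s - m) for s in scores]
    total = sum(weights)
    return [sum(w * v[j] for w, v in zip(weights, values)) / total
            for j in range(len(values[0]))]

def tiled_attention(q, keys, values, tile=4):
    """Same result, but keys/values are consumed tile by tile (hypothetical helper)."""
    d = len(q)
    dv = len(values[0])
    running_max = float("-inf")  # largest score seen so far
    running_sum = 0.0            # softmax denominator, relative to running_max
    acc = [0.0] * dv             # unnormalized output, relative to running_max
    for start in range(0, len(keys), tile):
        k_tile = keys[start:start + tile]
        v_tile = values[start:start + tile]
        scores = [sum(a * b for a, b in zip(q, k)) / math.sqrt(d) for k in k_tile]
        new_max = max(running_max, max(scores))
        # Rescale everything accumulated under the old maximum.
        correction = math.exp(running_max - new_max)
        running_sum *= correction
        acc = [a * correction for a in acc]
        for s, v in zip(scores, v_tile):
            w = math.exp(s - new_max)
            running_sum += w
            for j in range(dv):
                acc[j] += w * v[j]
        running_max = new_max
    return [a / running_sum for a in acc]
```

In a vectorized kernel, the inner dot products and the accumulator updates are the parts that would map onto vector instructions; the rescaling step is what allows each tile to be processed independently of the tiles that follow.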

Blind Submission (Default)

Poster Island C
