BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//pretalx//cfp.riscv-europe.org//eu-summit-2026//speaker//AFYFGP
BEGIN:VTIMEZONE
TZID:CET
BEGIN:STANDARD
DTSTART:20001029T030000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10
TZNAME:CET
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20000326T020000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=3
TZNAME:CEST
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:pretalx-eu-summit-2026-NFYQCV@cfp.riscv-europe.org
DTSTART;TZID=CET:20260609T112000
DTEND;TZID=CET:20260609T113000
DESCRIPTION:In this work\, we optimize LLM inference on edge RISC-V CPUs us
 ing vector extension (RVV) instructions. We leverage 4-bit vector load and
  efficient 8-bit dot-product instructions to accelerate quantized and repa
 cked 4-bit kernels in llama.cpp. In addition\, we implement RVV support fo
 r tiled flash attention\, which further improves performance in the prefil
 l stage. Experimental results show that the proposed optimizations achieve
  a 1.76x-2.14x speedup over the upstream implementation while maintaining
  near-linear scaling for prefill workloads on an RVV-enabled multi-core pl
 atform.
DTSTAMP:20260522T162343Z
LOCATION:Poster Island C
SUMMARY:Accelerating LLM Inference on Edge RISC-V CPUs via Vector Extension
  Instructions and Flash Attention - Yueh-Feng Lee
URL:https://cfp.riscv-europe.org/eu-summit-2026/talk/NFYQCV/
END:VEVENT
END:VCALENDAR
