BEGIN:VCALENDAR
VERSION:2.0
PRODID:-//pretalx//cfp.riscv-europe.org//eu-summit-2026//speaker//YLYDCG
BEGIN:VTIMEZONE
TZID:CET
BEGIN:STANDARD
DTSTART:20001029T030000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=10
TZNAME:CET
TZOFFSETFROM:+0200
TZOFFSETTO:+0100
END:STANDARD
BEGIN:DAYLIGHT
DTSTART:20000326T020000
RRULE:FREQ=YEARLY;BYDAY=-1SU;BYMONTH=3
TZNAME:CEST
TZOFFSETFROM:+0100
TZOFFSETTO:+0200
END:DAYLIGHT
END:VTIMEZONE
BEGIN:VEVENT
UID:pretalx-eu-summit-2026-KGMSLQ@cfp.riscv-europe.org
DTSTART;TZID=CET:20260611T105000
DTEND;TZID=CET:20260611T110000
DESCRIPTION:Sparse tensor operations are critical for scientific computing
 \, but their irregular memory access patterns challenge traditional
  architectures. While domain-specific architectures offer efficiency\,
  integration i
 nto mature SoCs often requires ISA modifications or complex driver develop
 ment. This work addresses these challenges via a decoupled SpMV access uni
 t integrated through Cohort\, a coherent shared-memory queue interface com
 municating with a CVA6 RISC-V core. To mitigate the inter-tile communicati
 on overhead\, we introduce a hybrid tiling approach that co-locates the ac
 cess unit and the core in the same tile\, enabling direct data delivery in
 to the private cache. This hybrid architecture yields geometric mean
  speedups of 1.33× and 1.50× for COO and CSR formats\, respectively\,
  over traditional multi-tile configurati
 ons. These results demonstrate that offloading memory traversal to a progr
 ammable data-flow engine\, combined with optimized placement in the memory
  hierarchy\, efficiently accelerates irregular workloads with minimal intr
 usion.
DTSTAMP:20260522T162437Z
LOCATION:Poster Island A
SUMMARY:Lessons Learned from Designing Decoupled-Access Hardware Accelerato
 rs in a RISC-V Framework - Xicu Marí
URL:https://cfp.riscv-europe.org/eu-summit-2026/talk/KGMSLQ/
END:VEVENT
END:VCALENDAR
