RISC-V Summit Europe 2026

Locality-Aware Sparse Matrix Multiplication on RISC-V RVV
2026-06-09 11:00–11:10, Poster Island C

Sparse matrix–dense matrix multiplication (SpMM) is a fundamental kernel in high-performance computing and emerging edge applications, yet its performance is typically memory-bound because of irregular, indirect memory accesses. While the RISC-V Vector Extension (RVV) provides flexible data-parallel execution, exploiting it efficiently for sparse workloads remains challenging.
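To make the indirect access pattern concrete, here is a minimal scalar SpMM sketch. It assumes the sparse matrix is stored in CSR format and the dense matrices are row-major; the abstract does not state the storage format used in the study, and the function and parameter names (`spmm_csr`, `row_ptr`, `col_idx`, `vals`) are illustrative only.

```c
#include <assert.h>
#include <stddef.h>

/* Sketch only: C (n_rows x k) += A (CSR) * B (n_cols x k).
 * B and C are dense, row-major. The CSR layout is an assumption,
 * not something the abstract specifies. */
void spmm_csr(size_t n_rows, size_t k,
              const size_t *row_ptr, const size_t *col_idx,
              const double *vals, const double *B, double *C)
{
    assert(row_ptr && col_idx && vals && B && C);
    for (size_t i = 0; i < n_rows; ++i) {
        double *c_row = C + i * k;
        for (size_t p = row_ptr[i]; p < row_ptr[i + 1]; ++p) {
            const double a = vals[p];
            /* Indirect access: which row of B is read depends on col_idx[p],
             * so consecutive nonzeros may touch distant memory. */
            const double *b_row = B + col_idx[p] * k;
            /* Unit-stride inner loop: the natural target for vectorization. */
            for (size_t j = 0; j < k; ++j)
                c_row[j] += a * b_row[j];
        }
    }
}
```

The inner loop over `j` is contiguous and vectorizes readily, but each nonzero triggers a jump to a different row of `B`, which is why the kernel tends to be limited by memory traffic rather than arithmetic.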

This work evaluates an iterative SpMM kernel on an RVV-enabled RISC-V processor (SpacemiT X60, 8 cores) and investigates the combined impact of locality-aware data layout and explicit vectorization. We compare scalar, compiler-vectorized, library-based, and manual intrinsic implementations. Additionally, we apply Morton (Z-order) reordering to improve spatial locality in memory.
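As an illustration of Morton (Z-order) reordering, the sketch below interleaves the bits of a row and column index into a single key and sorts nonzeros by that key. It assumes the reordering is applied to the coordinates of the nonzero entries held in a COO-style array; the abstract does not say what exactly is reordered, and the names `coo_entry`, `morton_key`, and `sort_morton` are hypothetical.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical COO entry used only for this sketch. */
typedef struct { uint32_t row, col; double val; } coo_entry;

/* Spread the 32 bits of x into the even bit positions of a 64-bit word. */
static uint64_t spread_bits(uint32_t x)
{
    uint64_t v = x;
    v = (v | (v << 16)) & 0x0000FFFF0000FFFFULL;
    v = (v | (v << 8))  & 0x00FF00FF00FF00FFULL;
    v = (v | (v << 4))  & 0x0F0F0F0F0F0F0F0FULL;
    v = (v | (v << 2))  & 0x3333333333333333ULL;
    v = (v | (v << 1))  & 0x5555555555555555ULL;
    return v;
}

/* Z-order key: row bits in odd positions, column bits in even positions. */
uint64_t morton_key(uint32_t row, uint32_t col)
{
    return (spread_bits(row) << 1) | spread_bits(col);
}

static int cmp_morton(const void *a, const void *b)
{
    const coo_entry *x = a, *y = b;
    const uint64_t kx = morton_key(x->row, x->col);
    const uint64_t ky = morton_key(y->row, y->col);
    return (kx > ky) - (kx < ky);
}

/* Sort nonzeros so entries that are close in 2D are close in memory. */
void sort_morton(coo_entry *e, size_t n)
{
    assert(e != NULL || n == 0);
    qsort(e, n, sizeof *e, cmp_morton);
}
```

Sorting by this key groups nonzeros into small 2D tiles, so nearby entries tend to reuse the same rows of the dense operands while they are still cached. A production version would likely precompute the keys once instead of recomputing them in the comparator.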

Experimental results show that vectorization alone provides limited benefits in memory-bound regimes. However, when combined with Morton reordering, manual RVV vectorization achieves the best performance. Microarchitectural analysis confirms reduced cache misses and improved IPC, although the workload remains fundamentally bandwidth-limited.

The study highlights the importance of co-designing data layout and vectorization when targeting sparse workloads on emerging RISC-V platforms.
