Benchmarking the Vortex RISC-V GPU for Sparse Workloads
2026-06-10, Poster Island B

Many computational problems require processing large sparse matrices, in which the vast majority of entries are zero. The irregular distribution of the non-zero elements stresses the memory system, so performance is bottlenecked by memory bandwidth. On parallel architectures, workload imbalance further limits performance. Graphics Processing Units (GPUs) running sparse matrix kernels from state-of-the-art Basic Linear Algebra Subprograms (BLAS) libraries are central to modern HPC systems. Although RISC-V application processors are gaining in performance, RISC-V-based GPUs are still at an early stage of development. We benchmark sparse kernels on both modern HPC-grade GPUs and Vortex, a RISC-V GPU that is gaining adoption. We analyse their performance under memory-bound workloads and report the software and hardware gaps that must be closed to enable efficient sparse BLAS processing on RISC-V GPUs.
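
To make the memory-bound and load-imbalance points concrete, here is a minimal sketch of sparse matrix-vector multiplication over the compressed sparse row (CSR) format. This is an illustrative assumption, not necessarily a kernel or storage format benchmarked in this work; the function name `spmv_csr` and the small matrix are invented for the example.

```python
def spmv_csr(row_ptr, col_idx, values, x):
    """Compute y = A @ x for a matrix A stored in CSR form (illustrative sketch)."""
    y = [0.0] * (len(row_ptr) - 1)
    for row in range(len(y)):
        acc = 0.0
        # Rows hold different numbers of non-zeros, so mapping one row per
        # thread gives each thread a different amount of work.
        for k in range(row_ptr[row], row_ptr[row + 1]):
            # x is gathered through col_idx: an indirect, irregular access
            # with only one multiply-add per value loaded.
            acc += values[k] * x[col_idx[k]]
        y[row] = acc
    return y


# Hypothetical 3x4 matrix, made up for this example:
# [[5, 0, 0, 1],
#  [0, 0, 0, 0],
#  [0, 2, 3, 4]]
row_ptr = [0, 2, 2, 5]
col_idx = [0, 3, 1, 2, 3]
values = [5.0, 1.0, 2.0, 3.0, 4.0]
x = [1.0, 2.0, 3.0, 4.0]

y = spmv_csr(row_ptr, col_idx, values, x)
nnz_per_row = [row_ptr[i + 1] - row_ptr[i] for i in range(len(row_ptr) - 1)]
```

Each non-zero costs one value load, one index load, and one gathered read of `x` for a single multiply-add, which is why such kernels tend to be limited by memory bandwidth rather than arithmetic. The uneven `nnz_per_row` counts show how per-row work can differ across parallel threads.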