AME-PIM: Breaking the Memory Wall with RISC-V Matrix Extensions and HBM-PIM
RISC-V Summit Europe 2026

2026-06-11 11:10–11:20, Poster Island B

Matrix workloads, essential in generative AI, increasingly rely on ISA-level matrix extensions (e.g. AMX, SME). The Attached Matrix Extension (AME) is one of the three ISA extensions (IME, AME, VME) under standardization in RISC-V. All of these matrix ISAs assume that the processor datapath is extended with dedicated matrix acceleration hardware. However, executing matrix kernels requires moving large tiles between memory and processor registers, so performance is limited by memory bandwidth.
We investigate whether High Bandwidth Memory with Processing-in-Memory (HBM-PIM) can serve as an alternative implementation of AME instructions. We propose a PIM Execution Primitive (PEP) computational model that maps the AME ISA onto Samsung Aquabolt-XL HBM-PIM microkernels. The model uses an outer-product dataflow to enable in-memory accumulation and remaps AME tile registers into memory regions, making it possible to chain AME instructions without leaving the memory.
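The outer-product dataflow mentioned above can be illustrated with a minimal Python sketch. It is not the PEP model or any Aquabolt-XL microkernel; the function names `outer_product_matmul` and `chain` are hypothetical and plain lists stand in for tile registers remapped to memory regions. It only shows why an outer-product formulation lets the result tile be accumulated in place, and why chained multiplications can reuse that tile directly.

```python
def outer_product_matmul(a, b):
    """Compute C = A x B as a sum of rank-1 (outer-product) updates.

    Hypothetical illustration: `c` plays the role of an accumulator tile
    that stays resident in one place while each step k adds
    outer(A[:, k], B[k, :]) to it.
    """
    m, k_dim, n = len(a), len(b), len(b[0])
    assert all(len(row) == k_dim for row in a), "inner dimensions must match"
    c = [[0.0] * n for _ in range(m)]  # accumulator tile, updated in place
    for k in range(k_dim):
        col = [a[i][k] for i in range(m)]  # k-th column of A
        row = b[k]                         # k-th row of B
        for i in range(m):
            for j in range(n):
                c[i][j] += col[i] * row[j]  # rank-1 update of the tile
    return c


def chain(a, b, d):
    """Chain two tile multiplications, (A x B) x D.

    The intermediate tile is passed straight to the next multiplication,
    mimicking chained instructions that operate on the same memory region.
    """
    return outer_product_matmul(outer_product_matmul(a, b), d)


if __name__ == "__main__":
    a = [[1, 2], [3, 4]]
    b = [[5, 6], [7, 8]]
    identity = [[1, 0], [0, 1]]
    print(outer_product_matmul(a, b))
    print(chain(a, b, identity))
```

In this formulation every step touches the whole result tile but reads only one column of A and one row of B, which is the property that makes accumulation next to the data attractive.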
Our experiments show AME tile multiplication reaching 14.9 GFLOP/s (59.4 FLOP/cycle) on an HBM-PIM pseudo-channel, demonstrating that HBM-PIM can serve as an implementation of RISC-V matrix extensions.
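As a quick sanity check on how the two reported figures relate, the short Python sketch below divides the throughput by the per-cycle rate to obtain the clock frequency they jointly imply. The helper name `implied_clock_hz` is hypothetical, and the result is only an inference from the rounded numbers in the abstract, not a frequency stated by the authors.

```python
def implied_clock_hz(flop_per_second, flop_per_cycle):
    """Return the clock frequency implied by a throughput and a per-cycle rate."""
    return flop_per_second / flop_per_cycle


if __name__ == "__main__":
    # Figures quoted in the abstract: 14.9 GFLOP/s and 59.4 FLOP/cycle.
    clock = implied_clock_hz(14.9e9, 59.4)
    print(f"implied clock: {clock / 1e6:.1f} MHz")
```

Under this reading the two figures are consistent with a clock of roughly a quarter of a gigahertz.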

Emanuele Venieri

I am a PhD student at the ECS Lab at the University of Bologna, where I also earned my MSc in Electronics Engineering. My research focuses on digital architectures, with particular interest in RISC-V vector and matrix extensions and processing-in-memory (PIM) systems. I work on the Monte Cimone project, contributing to the enablement and characterization of the second-generation RISC-V cluster while evaluating the third iteration. I also contributed to AME-PIM, a novel approach that exposes PIM capabilities through the semantics of a matrix extension. In parallel, I work within the DARE project, where I contribute to the delivery of ControlPULP as the power-management controller for the GPP subsystem.
