STARBUG: RISC-V Hint Instructions for Lightweight VLIW Execution on Embedded DSP Workloads
This paper presents a standards-aligned microarchitectural extension that leverages architecturally reserved RISC-V HINT encodings to enable lightweight Very Long Instruction Word (VLIW) execution while preserving full backward binary compatibility. Unlike conventional superscalar designs that rely on dynamic scheduling, speculative issue, and complex hazard detection, our approach encodes static scheduling decisions in HINT instructions that execute as NOPs on unmodified cores. Modified implementations interpret these hints to form statically scheduled issue bundles, achieving higher Instruction-Level Parallelism (ILP) without increasing ISA surface area or compromising compliance.
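As a rough illustration of the mechanism described above, the following minimal Python sketch shows how a decoder might recognize a HINT encoding and use it to group the instructions that follow into an issue bundle. It relies on one fact from the RISC-V unprivileged specification: RV32I integer computational instructions with `rd = x0` are HINTs and have no architectural effect. Everything else is an assumption made for illustration and is not taken from the paper: the choice of `SLTI x0, x0, imm` as the marker, the use of the immediate as a bundle width, the 4-wide limit applied here, and the helper names `is_op_imm_hint`, `encode_bundle_hint`, and `form_bundles`.

```python
OPCODE_OP_IMM = 0x13         # RV32I OP-IMM major opcode (ADDI, SLTI, ANDI, ...)
CANONICAL_NOP = 0x00000013   # ADDI x0, x0, 0
FUNCT3_SLTI = 0b010
NON_SHIFT_FUNCT3 = {0, 2, 3, 4, 6, 7}  # ADDI, SLTI, SLTIU, XORI, ORI, ANDI
MAX_WIDTH = 4                # assumed issue width for this sketch


def fields(word):
    """Split a 32-bit I-type instruction word into its fields."""
    return {
        "opcode": word & 0x7F,
        "rd": (word >> 7) & 0x1F,
        "funct3": (word >> 12) & 0x7,
        "rs1": (word >> 15) & 0x1F,
        "imm": (word >> 20) & 0xFFF,
    }


def is_op_imm_hint(word):
    """True for non-shift OP-IMM instructions that write x0 (RV32I HINT space)."""
    f = fields(word)
    return (
        f["opcode"] == OPCODE_OP_IMM
        and f["rd"] == 0
        and f["funct3"] in NON_SHIFT_FUNCT3
        and word != CANONICAL_NOP
    )


def encode_bundle_hint(width):
    """Hypothetical marker: SLTI x0, x0, width (the encoding choice is an assumption)."""
    if not 1 <= width <= MAX_WIDTH:
        raise ValueError("width out of range")
    return (width << 20) | (FUNCT3_SLTI << 12) | OPCODE_OP_IMM


def form_bundles(words):
    """Group instruction words into issue bundles on a hint-aware core.

    A marker announces that the next `imm` instructions may issue together.
    Any other word issues alone. An unmodified core would simply execute the
    same stream in order and treat each marker as a NOP.
    """
    bundles = []
    i = 0
    while i < len(words):
        f = fields(words[i])
        is_marker = (
            is_op_imm_hint(words[i])
            and f["funct3"] == FUNCT3_SLTI
            and 1 <= f["imm"] <= MAX_WIDTH
        )
        if is_marker:
            n = f["imm"]
            bundles.append(words[i + 1 : i + 1 + n])
            i += 1 + n
        else:
            bundles.append([words[i]])
            i += 1
    return bundles


ADD_X5 = 0x007302B3   # add  x5, x6, x7
ADDI_X1 = 0x00108093  # addi x1, x1, 1
stream = [encode_bundle_hint(3), ADD_X5, ADDI_X1, ADD_X5, ADDI_X1]
bundles = form_bundles(stream)
```

In this sketch the marker costs one instruction slot per bundle, so a real design would need to weigh that code-size and fetch overhead against the scheduling hardware it replaces.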
We validate the proposal through a full-stack methodology spanning ISA modeling, RTL implementation, and FPGA deployment. We prototype the ISA semantics in Google’s MPACT simulator to evaluate bundle formation and decode behavior, then extend the OpenHW Group CVW (Wally) core to support 4-wide integer VLIW execution via a widened multi-ported register file and parallel datapaths. The design is verified in Questa and Verilator and synthesized for FPGA-based cycle-accurate measurement.
Evaluation on representative DSP kernels (FFT, FIR, IIR, and dot product) demonstrates substantial IPC and cycle-count improvements relative to scalar RV32I execution, while maintaining binary compatibility and toolchain transparency. The proposed mechanism provides a path for energy-efficient ILP extraction in embedded and domain-specific systems, illustrating how reserved ISA space can be systematically exploited to deliver microarchitectural innovation without ecosystem fragmentation.