AI Post Transformers

STAR: Sub-Entry Sharing TLB for Multi-Instance GPU Efficiency



This April 29, 2024 paper examines the challenges of using NVIDIA's Multi-Instance GPU (MIG) technology, focusing on the address translation mechanism in the A100 GPU. Primarily through reverse-engineering, the paper reveals that the L2 and L3 Translation Lookaside Buffers (TLBs) use a compressed design in which each entry comprises 16 sub-entries, improving translation capacity for contiguous mappings. A major problem arises because the L3 TLB is shared across all isolated MIG instances: contention between instances causes frequent evictions and leaves the sub-entries underutilized. To mitigate this performance degradation, the authors propose STAR, a hardware mechanism that dynamically enables sharing of TLB sub-entries among different base addresses to improve overall efficiency. Source: https://arxiv.org/pdf/2404.18361
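To make the coalesced-entry idea concrete, here is a minimal sketch of a TLB entry with 16 sub-entries sharing one base tag. This is an illustrative model under assumed structure, not the A100's actual hardware layout or the paper's exact design; the class and field names are hypothetical.

```python
# Hypothetical model of a coalesced TLB entry: one shared base tag
# plus 16 sub-entries, so a single entry can map up to 16
# contiguous virtual pages. With sparse access patterns most
# sub-entries stay invalid, which is the low utilization STAR
# targets by letting sub-entries serve different base addresses.

SUB_ENTRIES = 16  # sub-entries per entry, per the paper's reverse engineering

class CoalescedTLBEntry:
    def __init__(self, base_tag):
        self.base_tag = base_tag            # shared high-order VPN bits
        self.valid = [False] * SUB_ENTRIES  # one valid bit per sub-entry
        self.ppn = [None] * SUB_ENTRIES     # translated frame per sub-entry

    def insert(self, vpn, ppn):
        assert vpn // SUB_ENTRIES == self.base_tag, "VPN outside this entry"
        idx = vpn % SUB_ENTRIES
        self.valid[idx] = True
        self.ppn[idx] = ppn

    def lookup(self, vpn):
        if vpn // SUB_ENTRIES != self.base_tag:
            return None  # tag mismatch: miss
        idx = vpn % SUB_ENTRIES
        return self.ppn[idx] if self.valid[idx] else None

entry = CoalescedTLBEntry(base_tag=0x40)
entry.insert(0x40 * SUB_ENTRIES + 3, ppn=0x9A)
print(entry.lookup(0x40 * SUB_ENTRIES + 3))  # hit: 0x9A
print(entry.lookup(0x40 * SUB_ENTRIES + 5))  # miss: sub-entry invalid
```

In this model, only one of the 16 slots is filled, mirroring the underutilization the paper measures; STAR's contribution is hardware that reclaims such idle slots for translations with different base tags.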

By mcgrof