AI Post Transformers

STAR: Sub-Entry Sharing TLB for Multi-Instance GPU Efficiency



This April 29, 2024 paper examines the challenges of using NVIDIA's Multi-Instance GPU (MIG) technology, focusing on the address translation mechanism in the A100 GPU. Primarily through reverse-engineering, the paper reveals that the L2 and L3 Translation Lookaside Buffers (TLBs) use a compressed design in which each entry comprises 16 sub-entries, improving translation capacity for contiguous mappings. A major problem arises because the L3 TLB is shared across all isolated MIG instances: contention between instances causes frequent evictions and leaves the sub-entries underutilized. To mitigate this performance degradation, the authors propose STAR, a hardware mechanism that dynamically enables sharing of TLB sub-entries among different base addresses to improve overall efficiency. Source: https://arxiv.org/pdf/2404.18361
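To make the coalesced-entry idea concrete, here is a minimal sketch of a TLB entry with 16 sub-entries sharing one base tag. This is an illustrative model under assumed structure, not the A100's actual hardware layout or the paper's exact design; the class and field names are hypothetical.

```python
# Hypothetical model of a coalesced TLB entry: one shared base tag
# plus 16 sub-entries, so a single entry can map up to 16
# contiguous virtual pages. With sparse access patterns most
# sub-entries stay invalid, which is the low utilization STAR
# targets by letting sub-entries serve different base addresses.

SUB_ENTRIES = 16  # sub-entries per entry, per the paper's reverse engineering

class CoalescedTLBEntry:
    def __init__(self, base_tag):
        self.base_tag = base_tag            # shared high-order VPN bits
        self.valid = [False] * SUB_ENTRIES  # one valid bit per sub-entry
        self.ppn = [None] * SUB_ENTRIES     # translated frame per sub-entry

    def insert(self, vpn, ppn):
        assert vpn // SUB_ENTRIES == self.base_tag, "VPN outside this entry"
        idx = vpn % SUB_ENTRIES
        self.valid[idx] = True
        self.ppn[idx] = ppn

    def lookup(self, vpn):
        if vpn // SUB_ENTRIES != self.base_tag:
            return None  # tag mismatch: miss
        idx = vpn % SUB_ENTRIES
        return self.ppn[idx] if self.valid[idx] else None

entry = CoalescedTLBEntry(base_tag=0x40)
entry.insert(0x40 * SUB_ENTRIES + 3, ppn=0x9A)
print(entry.lookup(0x40 * SUB_ENTRIES + 3))  # hit: 0x9A
print(entry.lookup(0x40 * SUB_ENTRIES + 5))  # miss: sub-entry invalid
```

In this model, only one of the 16 slots is filled, mirroring the underutilization the paper measures; STAR's contribution is hardware that reclaims such idle slots for translations with different base tags.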

By mcgrof