AI: post transformers

AiSAQ: DRAM-free ANNS with Product Quantization



This February 2025 paper introduces AiSAQ (All-in-Storage ANNS with Product Quantization), a method for Approximate Nearest Neighbor Search (ANNS) that significantly reduces DRAM (Dynamic Random-Access Memory) usage. Unlike traditional methods such as DiskANN, which keep compressed vectors in DRAM, AiSAQ offloads them to SSD (Solid-State Drive) storage, giving a near-zero memory footprint even on billion-scale datasets. The paper details how this approach maintains high search performance while drastically lowering index load times and enabling rapid switching between large datasets, which makes it particularly useful for Retrieval-Augmented Generation (RAG) with Large Language Models (LLMs) and for multi-server environments. Experiments demonstrate AiSAQ's efficiency in memory, latency, and cost for large-scale information retrieval.
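To give a sense of scale, the sketch below estimates how much DRAM the compressed (product-quantized) vectors alone would occupy if they all stayed resident in memory, and compares that with the size of the PQ codebooks. The function names and the example parameters (32 one-byte codes per vector, 128-dimensional float32 data) are illustrative assumptions, not figures taken from the paper.

```python
def pq_codes_bytes(num_vectors: int, num_subspaces: int, bits_per_code: int = 8) -> int:
    """Bytes needed to hold one PQ code per subspace for every vector."""
    total_bits = num_vectors * num_subspaces * bits_per_code
    return (total_bits + 7) // 8  # round up to whole bytes


def pq_codebook_bytes(dim: int, num_subspaces: int,
                      bits_per_code: int = 8, bytes_per_float: int = 4) -> int:
    """Bytes needed for the PQ codebooks (centroid tables), independent of dataset size."""
    if dim % num_subspaces != 0:
        raise ValueError("dim must be divisible by num_subspaces")
    centroids_per_subspace = 2 ** bits_per_code
    sub_dim = dim // num_subspaces
    return num_subspaces * centroids_per_subspace * sub_dim * bytes_per_float


if __name__ == "__main__":
    # Hypothetical billion-scale setup (assumed values, for illustration only).
    codes = pq_codes_bytes(num_vectors=1_000_000_000, num_subspaces=32)
    books = pq_codebook_bytes(dim=128, num_subspaces=32)
    print(f"PQ codes for all vectors: {codes / 1e9:.1f} GB")
    print(f"PQ codebooks:             {books / 1024:.0f} KiB")
```

Under these assumed parameters the per-vector codes grow linearly with the dataset, while the codebooks stay at a fixed, small size. That gap is the memory an all-in-storage design targets by moving the codes out of DRAM.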


Source: February 2025 https://arxiv.org/pdf/2404.06004


AI: post transformers, by mcgrof