AI: post transformers

FusionANNS: Billion-Scale ANNS with SSD and GPU



This September 2024 paper introduces FusionANNS, a novel system designed to improve Approximate Nearest Neighbor Search (ANNS) for extremely large datasets. It addresses challenges in existing ANNS systems, such as performance bottlenecks, high operational costs, and accuracy limitations, particularly when dealing with billion-scale vector data in modern AI infrastructure like Large Language Models (LLMs). FusionANNS achieves this through a cooperative CPU/GPU architecture that employs multi-tiered indexing, heuristic re-ranking, and redundancy-aware I/O deduplication. The system is shown to significantly outperform state-of-the-art SSD-based and GPU-accelerated in-memory ANNS solutions in terms of throughput (QPS), cost efficiency, and memory efficiency, while maintaining low latency and high accuracy.
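To make the idea of I/O deduplication concrete, here is a minimal sketch of the general technique, not the paper's implementation. It assumes vectors are stored contiguously on SSD in ID order, that a page is 4096 bytes, and that the vector size divides the page size evenly. The function name `group_reads_by_page` and all parameters are hypothetical. The point is only that candidates sharing a page can be served by a single read.

```python
from collections import defaultdict

PAGE_SIZE = 4096  # assumed SSD page size in bytes (illustrative)


def group_reads_by_page(candidate_ids, vector_bytes, page_size=PAGE_SIZE):
    """Group candidate vector IDs by the SSD page that holds them.

    Assumes vectors are laid out contiguously in ID order, so the page
    index is simply vid // vectors_per_page. Each page then needs to be
    read only once, however many candidates it contains.
    """
    vectors_per_page = page_size // vector_bytes
    pages = defaultdict(list)
    for vid in candidate_ids:
        pages[vid // vectors_per_page].append(vid)
    return dict(pages)


# Hypothetical candidate list produced by an earlier search stage.
candidates = [0, 1, 2, 3, 9, 40, 41]
plan = group_reads_by_page(candidates, vector_bytes=512)
print(len(candidates), "candidates ->", len(plan), "page reads")
```

With 512-byte vectors, eight fit on a page, so candidates that fall within the same block of eight IDs collapse into one read. How FusionANNS actually lays out vectors and detects redundant reads is described in the paper itself.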

Source:

https://arxiv.org/pdf/2409.16576


AI: post transformers, by mcgrof