Byte Sized Breakthroughs

Models tell you what to discard


Listen Later

This paper introduces FastGen, a novel method that uses lightweight model profiling and adaptive key-value caching to significantly reduce memory footprint without noticeable quality loss.
Read full paper: https://arxiv.org/abs/2310.01801
Tags: Systems and Performance, Machine Learning, Optimization
...more
View all episodesView all episodes
Download on the App Store

Byte Sized BreakthroughsBy Arjun Srivastava