


LLMs generate text painfully slowly, one low-information token at a time. Researchers have now figured out how to compress 4 tokens into dense "smart" vectors and cut costs by 44%, with full code and proofs! Meanwhile, OpenAI drops product ads, not papers.
Sponsors
This episode is brought to you by Statistical Horizons
By Francesco Gadaleta · 4.2
72 ratings
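The token-compression idea the episode describes can be sketched in a few lines. This is a minimal illustration only, assuming a simple learned linear projection that folds each group of 4 token embeddings into one vector; the dimensions, weight shapes, and pooling scheme here are hypothetical and not taken from the paper discussed:

```python
import numpy as np

rng = np.random.default_rng(0)

# Hypothetical setup: a sequence of 12 token embeddings, model dim 64.
seq_len, d_model, group = 12, 64, 4
tokens = rng.normal(size=(seq_len, d_model))

# Compressor sketch: concatenate each group of 4 token embeddings and
# project back down to d_model with a (learned; here random) matrix.
W = rng.normal(size=(group * d_model, d_model)) / np.sqrt(group * d_model)

compressed = tokens.reshape(seq_len // group, group * d_model) @ W

print(tokens.shape, "->", compressed.shape)  # (12, 64) -> (3, 64)
```

The sequence length shrinks by 4x, which is where the cost savings would come from: attention and decoding now operate over 3 compressed vectors instead of 12 raw tokens.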
