

LLMs generate text painfully slowly, one low-information token at a time. Researchers just figured out how to compress 4 tokens into smart vectors and cut costs by 44% — with full code and proofs! Meanwhile, OpenAI drops product ads, not papers.
Sponsors
This episode is brought to you by Statistical Horizons
By Francesco Gadaleta · 4.2 (72 ratings)