
LLMs generate text painfully slowly, one low-information token at a time. Researchers just figured out how to compress 4 tokens into smart vectors and cut costs by 44%, with full code and proofs! Meanwhile, OpenAI drops product ads, not papers.
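The core idea the teaser points at, folding a short window of tokens into a single continuous vector so the model takes fewer decoding steps, can be sketched in a few lines. This is a toy illustration under stated assumptions (a non-overlapping window of 4, a single learned linear projection); the names, shapes, and projection here are hypothetical and not the paper's actual method.

```python
import numpy as np

rng = np.random.default_rng(0)

d_model = 8   # embedding width (toy size, an assumption)
window = 4    # number of tokens folded into one vector

# Toy sequence of 12 token embeddings
tokens = rng.normal(size=(12, d_model))

# Hypothetical learned projection: a concatenated window (4 * d_model)
# is mapped down to a single vector of width d_model.
W = rng.normal(size=(window * d_model, d_model)) / np.sqrt(window * d_model)

def compress(seq, W, window):
    """Concatenate each non-overlapping window of embeddings, then project it."""
    n = seq.shape[0] // window
    stacked = seq[: n * window].reshape(n, window * seq.shape[1])
    return stacked @ W

compressed = compress(tokens, W, window)
print(compressed.shape)  # 12 token embeddings become 3 compressed vectors
```

A decoder that consumes the 3 compressed vectors instead of the 12 original tokens runs 4x fewer sequential steps, which is where the claimed cost savings would come from.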
Sponsors
This episode is brought to you by Statistical Horizons
By Francesco Gadaleta