

LLMs generate text painfully slowly, one low-information token at a time. Researchers just figured out how to compress 4 tokens into smart vectors and cut costs by 44%, with full code and proofs! Meanwhile, OpenAI drops product ads, not papers.
Sponsors
This episode is brought to you by Statistical Horizons
By Francesco Gadaleta
4.2 · 7272 ratings
