

LLMs generate text painfully slowly, one low-information token at a time. Researchers just figured out how to compress 4 tokens into smart vectors & cut costs by 44%—with full code & proofs! Meanwhile OpenAI drops product ads, not papers.
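The "compress 4 tokens into smart vectors" hook can be sketched in spirit with simple mean-pooling over embedding groups. This is purely an illustration of the idea of shrinking the sequence a decoder attends over; it is not the researchers' actual method, and the function name and pooling choice are my own assumptions:

```python
import numpy as np

def compress_tokens(embeddings: np.ndarray, group: int = 4) -> np.ndarray:
    """Illustrative stand-in, NOT the paper's method: mean-pool every
    `group` consecutive token embeddings (seq_len, dim) into one vector,
    cutting the sequence length the model processes by `group`x."""
    seq_len, dim = embeddings.shape
    assert seq_len % group == 0, "pad the sequence to a multiple of `group`"
    # Reshape into (num_groups, group, dim) and average within each group.
    return embeddings.reshape(seq_len // group, group, dim).mean(axis=1)

tokens = np.random.randn(8, 16)       # 8 tokens, 16-dim embeddings
compressed = compress_tokens(tokens)  # shape (2, 16): 4x fewer positions
```

Fewer positions means quadratically less attention compute, which is where claimed cost savings of this kind typically come from.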
Sponsors
This episode is brought to you by Statistical Horizons
By Francesco Gadaleta
4.2 (71 ratings)
