Agora - The Marketplace of Ideas

The Bleeding Edge



**Are state-of-the-art autoregressive, decoder-only transformers the only future for large language models?** **We dive into the most fascinating alternatives, including linear attention hybrids that promise large efficiency gains on long contexts and text diffusion models that generate tokens in parallel instead of sequentially.** Plus, discover how Code World Models train models to simulate code behavior, aiming to build more capable coding systems.
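The episode only name-checks these architectures, but as a rough illustration of the efficiency claim, here is a minimal sketch (not from the episode; the feature map and shapes are illustrative assumptions) of causal linear attention computed as a running sum, which is why its cost grows linearly with context length rather than quadratically as in softmax attention.

```python
# Minimal sketch of causal linear attention as a running sum: O(n) in sequence
# length instead of O(n^2). The feature map phi is an illustrative assumption.
import numpy as np

def linear_attention(Q, K, V):
    """Causal linear attention; Q, K, V have shape (n, d)."""
    phi = lambda x: np.maximum(x, 0) + 1e-6      # simple positive feature map
    Qf, Kf = phi(Q), phi(K)
    S = np.zeros((Kf.shape[1], V.shape[1]))      # running sum of phi(k) v^T
    z = np.zeros(Kf.shape[1])                    # running sum of phi(k) for normalization
    out = np.zeros_like(V)
    for t in range(Q.shape[0]):                  # constant work per new token
        S += np.outer(Kf[t], V[t])
        z += Kf[t]
        out[t] = (Qf[t] @ S) / (Qf[t] @ z + 1e-6)
    return out

# Toy usage: 8 tokens with 4-dimensional heads.
rng = np.random.default_rng(0)
Q, K, V = (rng.normal(size=(8, 4)) for _ in range(3))
print(linear_attention(Q, K, V).shape)  # (8, 4)
```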


By Matthew Harris