The Neuron: AI Explained

Diffusion for Text: Why Mercury Could Make LLMs 10x Faster



Diffusion models changed how we generate images and video—now they’re coming for text.


In this episode, we sit down with Stefano Ermon, Stanford computer science professor and founder of Inception Labs, to unpack how diffusion works for language, why it can generate in parallel (instead of token-by-token), and what that means for latency, cost, and real-time AI products.


We talk through:

  • The simplest mental model for diffusion: generate a full draft, then refine it by “fixing mistakes”

  • Why today’s autoregressive LLM inference is often memory-bound—and why diffusion can shift it toward a more GPU-friendly compute profile

  • Where Mercury wins today (IDEs, voice/real-time agents, customer support, EdTech—anywhere humans can’t wait)

  • What changes (and what doesn’t) for long context and architecture choices

  • The real-world way to evaluate models in production: offline evals + the gold-standard A/B test

Stefano also shares what’s next on Mercury’s roadmap—especially around stronger planning and reasoning for agentic use cases.


Try Mercury + learn more: inceptionlabs.ai


For more practical, grounded conversations on AI systems that actually work, subscribe to The Neuron newsletter at https://theneuron.ai.


By The Neuron

Rating: 4.8 (63 ratings)

