The MAD Podcast with Matt Turck

What’s Next for AI? OpenAI’s Łukasz Kaiser (Transformer Co-Author)



We’re told that AI progress is slowing down, that pre-training has hit a wall, that scaling laws are running out of road. Yet we’re releasing this episode in the middle of a wild couple of weeks that saw GPT-5.1, GPT-5.1 Codex Max, fresh reasoning modes and long-running agents ship from OpenAI — on top of a flood of new frontier models elsewhere. To make sense of what’s actually happening at the edge of the field, I sat down with someone who has literally helped define both of the major AI paradigms of our time.


Łukasz Kaiser is one of the co-authors of “Attention Is All You Need,” the paper that introduced the Transformer architecture behind modern LLMs, and is now a leading research scientist at OpenAI working on reasoning models like those behind GPT-5.1. In this conversation, he explains why AI progress still looks like a smooth exponential curve from inside the labs, why pre-training is very much alive even as reinforcement-learning-based reasoning models take over the spotlight, how chain-of-thought actually works under the hood, and what it really means to “train the thinking process” with RL on verifiable domains like math, code and science. We talk about the messy reality of low-hanging fruit in engineering and data, the economics of GPUs and distillation, interpretability work on circuits and sparsity, and why the best frontier models can still be stumped by a logic puzzle from his five-year-old’s math book.


We also go deep into Łukasz’s personal journey — from logic and games in Poland and France, to Ray Kurzweil’s team, Google Brain and the inside story of the Transformer, to joining OpenAI and helping drive the shift from chatbots to genuine reasoning engines. Along the way we cover GPT-4 → GPT-5 → GPT-5.1, post-training and tone, GPT-5.1 Codex Max and long-running coding agents with compaction, alternative architectures beyond Transformers, whether foundation models will “eat” most agents and applications, what the translation industry can teach us about trust and human-in-the-loop, and why he thinks generalization, multimodal reasoning and robots in the home are where some of the most interesting challenges still lie.


OpenAI

Website - https://openai.com

X/Twitter - https://x.com/OpenAI


Łukasz Kaiser

LinkedIn - https://www.linkedin.com/in/lukaszkaiser/

X/Twitter - https://x.com/lukaszkaiser


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

Blog - https://mattturck.com

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck


(00:00) – Cold open and intro

(01:29) – “AI slowdown” vs a wild week of new frontier models

(08:03) – Low-hanging fruit: infra, RL training and better data

(11:39) – What is a reasoning model, in plain language?

(17:02) – Chain-of-thought and training the thinking process with RL

(21:39) – Łukasz’s path: from logic and France to Google and Kurzweil

(24:20) – Inside the Transformer story and what “attention” really means

(28:42) – From Google Brain to OpenAI: culture, scale and GPUs

(32:49) – What’s next for pre-training, GPUs and distillation

(37:29) – Can we still understand these models? Circuits, sparsity and black boxes

(39:42) – GPT-4 → GPT-5 → GPT-5.1: what actually changed

(42:40) – Post-training, safety and teaching GPT-5.1 different tones

(46:16) – How long should GPT-5.1 think? Reasoning tokens and jagged abilities

(47:43) – The five-year-old’s dot puzzle that still breaks frontier models

(52:22) – Generalization, child-like learning and whether reasoning is enough

(53:48) – Beyond Transformers: ARC, LeCun’s ideas and multimodal bottlenecks

(56:10) – GPT-5.1 Codex Max, long-running agents and compaction

(1:00:06) – Will foundation models eat most apps? The translation analogy and trust

(1:02:34) – What still needs to be solved, and where AI might go next


