The MAD Podcast with Matt Turck

What’s Next for AI? OpenAI’s Łukasz Kaiser (Transformer Co-Author)



We’re told that AI progress is slowing down, that pre-training has hit a wall, that scaling laws are running out of road. Yet we’re releasing this episode in the middle of a wild couple of weeks in which OpenAI shipped GPT-5.1, GPT-5.1 Codex Max, fresh reasoning modes and long-running agents, on top of a flood of new frontier models elsewhere. To make sense of what’s actually happening at the edge of the field, I sat down with someone who has literally helped define both of the major AI paradigms of our time.


Łukasz Kaiser is one of the co-authors of “Attention Is All You Need,” the paper that introduced the Transformer architecture behind modern LLMs, and is now a leading research scientist at OpenAI working on reasoning models like those behind GPT-5.1. In this conversation, he explains why AI progress still looks like a smooth exponential curve from inside the labs, why pre-training is very much alive even as reinforcement-learning-based reasoning models take the spotlight, how chain-of-thought actually works under the hood, and what it really means to “train the thinking process” with RL on verifiable domains like math, code and science. We talk about the messy reality of low-hanging fruit in engineering and data, the economics of GPUs and distillation, interpretability work on circuits and sparsity, and why the best frontier models can still be stumped by a logic puzzle from his five-year-old’s math book.


We also go deep into Łukasz’s personal journey — from logic and games in Poland and France, to Ray Kurzweil’s team, Google Brain and the inside story of the Transformer, to joining OpenAI and helping drive the shift from chatbots to genuine reasoning engines. Along the way we cover GPT-4 → GPT-5 → GPT-5.1, post-training and tone, GPT-5.1 Codex Max and long-running coding agents with compaction, alternative architectures beyond Transformers, whether foundation models will “eat” most agents and applications, what the translation industry can teach us about trust and human-in-the-loop, and why he thinks generalization, multimodal reasoning and robots in the home are where some of the most interesting challenges still lie.


OpenAI

Website - https://openai.com

X/Twitter - https://x.com/OpenAI


Łukasz Kaiser

LinkedIn - https://www.linkedin.com/in/lukaszkaiser/

X/Twitter - https://x.com/lukaszkaiser


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

Blog - https://mattturck.com

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck


(00:00) – Cold open and intro

(01:29) – “AI slowdown” vs a wild week of new frontier models

(08:03) – Low-hanging fruit: infra, RL training and better data

(11:39) – What is a reasoning model, in plain language?

(17:02) – Chain-of-thought and training the thinking process with RL

(21:39) – Łukasz’s path: from logic and France to Google and Kurzweil

(24:20) – Inside the Transformer story and what “attention” really means

(28:42) – From Google Brain to OpenAI: culture, scale and GPUs

(32:49) – What’s next for pre-training, GPUs and distillation

(37:29) – Can we still understand these models? Circuits, sparsity and black boxes

(39:42) – GPT-4 → GPT-5 → GPT-5.1: what actually changed

(42:40) – Post-training, safety and teaching GPT-5.1 different tones

(46:16) – How long should GPT-5.1 think? Reasoning tokens and jagged abilities

(47:43) – The five-year-old’s dot puzzle that still breaks frontier models

(52:22) – Generalization, child-like learning and whether reasoning is enough

(53:48) – Beyond Transformers: ARC, LeCun’s ideas and multimodal bottlenecks

(56:10) – GPT-5.1 Codex Max, long-running agents and compaction

(1:00:06) – Will foundation models eat most apps? The translation analogy and trust

(1:02:34) – What still needs to be solved, and where AI might go next
