The MAD Podcast with Matt Turck

How GPT-5 Thinks — OpenAI VP of Research Jerry Tworek



What does it really mean when GPT-5 “thinks”? In this conversation, OpenAI’s VP of Research Jerry Tworek explains how modern reasoning models work in practice: why pretraining and reinforcement learning (RL/RLHF) are both essential, what that on-screen “thinking” actually does, and when extra test-time compute helps (or doesn’t). We trace the evolution from o1 (a tech demo good at puzzles) to o3 (the tool-use shift) to GPT-5 (Jerry calls it “o3.1-ish”), and talk through verifiers, reward design, and the real trade-offs behind “auto” reasoning modes.


We also go inside OpenAI: how research is organized, why collaboration is unusually transparent, and how the company ships fast without losing rigor. Jerry shares the backstory on competitive-programming results like ICPC, what they signal (and what they don’t), and where agents and tool use are genuinely useful today. Finally, we zoom out: could pretraining + RL be the path to AGI?


This is the MAD Podcast: AI for the 99%. If you’re curious about how these systems actually work (without needing a PhD), this episode is your map to the current AI frontier.



OpenAI

Website - https://openai.com

X/Twitter - https://x.com/OpenAI


Jerry Tworek

LinkedIn - https://www.linkedin.com/in/jerry-tworek-b5b9aa56

X/Twitter - https://x.com/millionint


FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck



(00:00) Intro

(01:01) What Reasoning Actually Means in AI

(02:32) Chain of Thought: Models Thinking in Words

(05:25) How Models Decide Thinking Time

(07:24) Evolution from o1 to o3 to GPT-5

(11:00) Before OpenAI: Growing up in Poland, Dropping out of School, Trading

(20:32) Working on Robotics and Rubik's Cube Solving

(23:02) A Day in the Life: Talking to Researchers

(24:06) How Research Priorities Are Determined

(26:53) Collaboration vs IP Protection at OpenAI

(29:32) Shipping Fast While Doing Deep Research

(31:52) Using OpenAI's Own Tools Daily

(32:43) Pre-Training Plus RL: The Modern AI Stack

(35:10) Reinforcement Learning 101: Training Dogs

(40:17) The Evolution of Deep Reinforcement Learning

(42:09) When GPT-4 Seemed Underwhelming at First

(45:39) How RLHF Made GPT-4 Actually Useful

(48:02) Unsupervised vs Supervised Learning

(49:59) GRPO and How DeepSeek Accelerated US Research

(53:05) What It Takes to Scale Reinforcement Learning

(55:36) Agentic AI and Long-Horizon Thinking

(59:19) Alignment as an RL Problem

(1:01:11) Winning ICPC World Finals Without Specific Training

(1:05:53) Applying RL Beyond Math and Coding

(1:09:15) The Path from Here to AGI

(1:12:23) Pure RL vs Language Models

