The MAD Podcast with Matt Turck

Sonnet 4.5 & the AI Plateau Myth — Sholto Douglas (Anthropic)



Sholto Douglas, a top AI researcher at Anthropic, discusses the breakthroughs behind Claude Sonnet 4.5—the world's leading coding model—and why we might be just 2-3 years from AI matching human-level performance on most computer-facing tasks.


You'll discover why RL on language models suddenly started working in 2024, how agents maintain coherency across 30-hour coding sessions through self-correction and memory systems, and why the "bitter lesson" of scale keeps proving clever priors wrong.


Sholto shares his path from top-50 world fencer to Google's Gemini team to Anthropic, explaining why great blog posts sometimes matter more than PhDs in AI research. He describes the culture at big AI labs and explains why Anthropic is laser-focused on coding: it's the fastest path to both economic impact and AI-assisted AI research. Sholto also explains how the training pipeline is still "held together by duct tape," with massive room to improve, and why every benchmark created so far shows continuous rapid progress with no plateau in sight.


Bold predictions: individuals will soon manage teams of AI agents working 24/7, robotics is about to experience coding-level breakthroughs, and policymakers should urgently track AI progress on real economic tasks. A clear-eyed look at where AI stands today and where it's headed in the next few years.



Anthropic

Website - https://www.anthropic.com

Twitter - https://x.com/AnthropicAI


Sholto Douglas

LinkedIn - https://www.linkedin.com/in/sholto

Twitter - https://x.com/_sholtodouglas


FIRSTMARK

Website - https://firstmark.com

Twitter - https://twitter.com/FirstMarkCap


Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

Twitter - https://twitter.com/mattturck



(00:00) Intro

(01:09) The Rapid Pace of AI Releases at Anthropic

(02:49) Understanding Opus, Sonnet, and Haiku Model Tiers

(04:14) Sholto's Journey: From Australian Fencer to AI Researcher

(12:01) The Growing Pool of AI Talent

(16:16) Breaking Into AI Research Without Traditional Credentials

(18:29) What "Taste" Means in AI Research

(23:05) Moving to Google and Building Gemini's Inference Stack

(25:08) How Anthropic Differs from Other AI Labs

(31:46) Why Anthropic Is Laser-Focused on Coding

(36:40) Inside a 30-Hour Autonomous Coding Session

(38:41) Examples of What AI Can Build in 30 Hours

(43:13) The Breakthroughs That Enabled 30-Hour Runs

(46:28) What's Actually Driving the Performance Gains

(47:42) Pre-Training vs. Reinforcement Learning Explained

(52:11) Test-Time Compute and the New Scaling Paradigm

(55:55) Why RL on LLMs Finally Started Working

(59:38) Are We on Track to AGI?

(01:02:05) Why the "Plateau" Narrative Is Wrong

(01:03:41) Sonnet's Performance Across Economic Sectors

(01:05:47) Preparing for a World of 10–100x Individual Leverage


The MAD Podcast with Matt Turck, by Matt Turck

Rating: 5 (24 ratings)

