Two Minds, One Model

When the Scaffold Moves Inside



This episode traces AI reasoning from human-designed external scaffolding (process reward models, test-time compute scaling) to internally emergent capability, culminating in DeepSeek R1's finding that a model rewarded only for correctness spontaneously learns to reason, self-correct, and backtrack without any explicit instruction to do so.

Credits

Cover Art by Brianna Williams

TMOM Intro Music by Danny Meza

A special thank you to these talented artists for their contributions to the show.

Links and References

  • US appeals court fined lawyers https://www.sixthcircuitappellateblog.com/recent-cases/sixth-circuit-sanctions-attorneys-for-fake-citations-what-does-this-mean-for-use-of-ai/ https://www.jdsupra.com/legalnews/the-ai-sanction-wave-145k-in-q1-1240943/#:~:text=In%20Whiting%20v.%20City%20of,cases%20presenting%20the%20same%20problems.

  • Krafton CEO used ChatGPT to nullify $250M contract https://legaltalknetwork.com/podcasts/heels-in-the-courtroom/2026/04/ep-1006-when-clients-use-ai-the-new-risks-to-privilege-and-discovery/#:~:text=So%20the%20allegations%20were%20that,let%20ChatGPT%20be%20his%20lawyer.

  • "Let's Verify Step by Step" https://arxiv.org/abs/2305.20050

  • PRM800K dataset: 800,000 step-level human feedback labels, open-sourced https://github.com/openai/prm800k

  • Snell et al. paper on test-time compute scaling, published Aug 2024 https://arxiv.org/abs/2408.03314

  • "Chinchilla optimal": paper on optimal scaling of parameters vs. data https://arxiv.org/pdf/2203.15556

  • LangChain documented convergence in open SWE framework https://blog.langchain.com/open-swe-an-open-source-framework-for-internal-coding-agents/

  • "Thinking, Fast and Slow" by Daniel Kahneman https://psycnet.apa.org/record/2011-26535-000

  • T3 Code: Theo's Claude Code harness replacement https://www.youtube.com/watch?v=-7akxGb-lAM#:~:text=Theo%20Did%20It.,Gemini%20without%20the%20lock%2Din.

  • DeepSeek R1 technical report, January 2025 https://arxiv.org/abs/2501.12948

  • Uncanny Valley concept https://web.ics.purdue.edu/~drkelly/MoriTheUncannyValley1970.pdf

Abandoned Episode Titles

The Episode That Definitely Didn't Anthropomorphize Anything

Pump Harder: A Metaphor That Should Have Died But Absolutely Didn't

"Wait, Wait, Wait, Don’t Tell Me"

The One Where the Math Problem Checks Its Own Work and We All Get a Little Creeped Out


Two Minds, One Model, by John Jezl and Jon Rocha