


This episode traces AI reasoning from human-designed external scaffolding (process reward models, test-time compute scaling) to internally emergent capability, culminating in DeepSeek R1's finding that a model rewarded only for correctness spontaneously learns to reason, self-correct, and backtrack without any explicit instruction to do so.
Credits
Cover Art by Brianna Williams
TMOM Intro Music by Danny Meza
A special thank you to these talented artists for their contributions to the show.
Links and References
US appeals court fined lawyers https://www.sixthcircuitappellateblog.com/recent-cases/sixth-circuit-sanctions-attorneys-for-fake-citations-what-does-this-mean-for-use-of-ai/ https://www.jdsupra.com/legalnews/the-ai-sanction-wave-145k-in-q1-1240943/#:~:text=In%20Whiting%20v.%20City%20of,cases%20presenting%20the%20same%20problems.
Krafton CEO used ChatGPT to nullify $250M contract https://legaltalknetwork.com/podcasts/heels-in-the-courtroom/2026/04/ep-1006-when-clients-use-ai-the-new-risks-to-privilege-and-discovery/#:~:text=So%20the%20allegations%20were%20that,let%20ChatGPT%20be%20his%20lawyer.
"Let's Verify Step by Step" https://arxiv.org/abs/2305.20050
PRM800K dataset: 800,000 step-level human feedback labels, open-sourced https://github.com/openai/prm800k
Snell et al. paper on test-time compute scaling, published Aug 2024 https://arxiv.org/abs/2408.03314
"Chinchilla optimal": paper on optimal scaling of parameters vs. data https://arxiv.org/pdf/2203.15556
LangChain documented convergence in Open SWE framework https://blog.langchain.com/open-swe-an-open-source-framework-for-internal-coding-agents/
"Thinking, Fast and Slow" by Daniel Kahneman https://psycnet.apa.org/record/2011-26535-000
T3 Code: Theo's Claude Code harness replacement https://www.youtube.com/watch?v=-7akxGb-lAM#:~:text=Theo%20Did%20It.,Gemini%20without%20the%20lock%2Din.
DeepSeek R1 technical report, January 2025 https://arxiv.org/abs/2501.12948
Uncanny Valley concept https://web.ics.purdue.edu/~drkelly/MoriTheUncannyValley1970.pdf
Abandoned Episode Titles
The Episode That Definitely Didn't Anthropomorphize Anything
Pump Harder: A Metaphor That Should Have Died But Absolutely Didn't
"Wait, Wait, Wait, Don’t Tell Me"
The One Where the Math Problem Checks Its Own Work and We All Get a Little Creeped Out
By John Jezl and Jon Rocha