
Physics of Language Models: Part 2 – Grade School Math, Depth, and the Power of Mistakes
Hosted by Nathan Rigoni
In this episode, we move beyond general language patterns to explore how Large Language Models (LLMs) grapple with the rigid logic of mathematics. Using the second installment of Meta’s "Physics of Language Models" research, we investigate whether models are simply "stochastic parrots" or are developing a genuine internal geometry of reasoning. From the critical importance of architectural depth to the surprising necessity of learning from incorrect answers, we break down what it actually takes to build a machine that can "think" through a problem rather than just memorize it.
What you will learn
Resources mentioned
Why this episode matters
If you've ever wondered why an AI can write a poem but struggles with basic arithmetic, this episode provides the mechanistic answer. We explore the "serial nature of logic" and how architectural choices directly impact a model's ability to navigate complex, multi-step reasoning. By understanding the relationship between sequence length and long-term projection—analogous to a grandmaster planning 50 moves ahead in chess—we gain a clearer picture of the future of "thinking" models like DeepSeek.
Subscribe for more deep dives into philosophy, AI, and cognition. Visit www.phronesis-analytics.com or email [email protected] and join the conversation.
Keywords: Physics of Language Models, Grade School Math, Mechanistic Interpretability, Transformer Depth, Hidden States, V-Probe, Error Correction, Recovery Manifold, Chain of Thought, Logic, Phronesis Analytics.