Mechanical Dreams

Why Do Reasoning Models Loop?


In this episode:
• Introduction: The Infinite Loop: Professor Norris and Linda introduce the episode's topic: the phenomenon of reasoning models getting stuck in repetitive loops. Norris jokes about his own lectures looping, while Linda introduces the paper 'Wait, Wait, Wait... Why Do Reasoning Models Loop?' and the context of Chain-of-Thought reasoning.
• The Distillation Mystery: Linda presents the paper's empirical findings, highlighting that 'student' models (distilled) loop significantly more than their 'teacher' models. Norris is skeptical that a student could be worse than the teacher if trained properly, leading to a discussion on 'errors in learning.'
• Mechanism 1: Risk Aversion and Hard Steps: The hosts dive into the first theoretical mechanism: Risk Aversion due to Hardness of Learning. Linda uses the 'Star Graph' analogy to explain how models prefer easy, cyclic actions (like resetting) over hard, progress-making steps when they are uncertain.
• Mechanism 2: Deja Vu and Correlated Errors: They discuss the second mechanism: Inductive Bias for Temporally Correlated Errors. Norris learns why models don't just guess randomly when confused but instead make the *same* mistake repeatedly, leading to the 'Groundhog Day' effect in reasoning traces.
• Temperature: A Cure or a Band-Aid?: Linda explains why turning up the 'temperature' (randomness) helps break loops but is ultimately just a stopgap that masks the underlying learning errors. They conclude with a look at how loops become self-reinforcing catalysts.
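The temperature mechanism Linda describes can be illustrated with a toy sketch. This is not the paper's setup: the two-token "model" below, its logits, and the token meanings are invented purely to show why greedy (temperature-zero) decoding repeats the slightly-preferred token forever, while sampling at temperature 1 occasionally picks the alternative and breaks the loop.

```python
import math
import random

def sample(logits, temperature):
    """Sample a token index from raw logits.

    temperature == 0 means greedy decoding (argmax); otherwise the logits
    are divided by the temperature and sampled via softmax.
    """
    if temperature == 0:
        return max(range(len(logits)), key=lambda i: logits[i])
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    r = random.random() * total
    cum = 0.0
    for i, e in enumerate(exps):
        cum += e
        if r < cum:
            return i
    return len(exps) - 1

# Toy model: token 0 ("Wait", i.e. reset the reasoning) is slightly
# preferred over token 1 ("proceed with the hard step").
logits = [1.0, 0.8]

# Greedy decoding picks the marginally favoured "Wait" token every time:
# the looping behaviour the hosts describe.
greedy = [sample(logits, 0.0) for _ in range(20)]

# At temperature 1, the "proceed" token gets sampled a fair share of the
# time, so the trace escapes the loop — without fixing the underlying
# preference (the band-aid).
random.seed(0)  # seeded only to make this sketch reproducible
warm = [sample(logits, 1.0) for _ in range(20)]

print("greedy:", greedy)
print("temp=1:", warm)
```

Note that raising the temperature never changes the logits themselves; the model still slightly prefers the cyclic action, which is why the episode frames temperature as masking the learning error rather than curing it.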

Mechanical Dreams, by Mechanical Dirk