February 15, 2026

“LLMs struggle to verbalize their internal reasoning” by Emil Ryd

19 minutes

Emil Ryd

Thanks to Adam Karvonen, Arjun Khandelwal, Arun Jose, Fabien Roger, James Chua, Nic Kruus, & Sukrit Sumant for helpful feedback and discussion.

Thanks to Claude Opus 4.5 for help with designing and implementing the experiments.

Introduction

We study to what extent LLMs can verbalize their internal reasoning. To do this, we train LLMs to solve various games and tasks (sorting lists, two-hop lookup, a custom grid-world game, and chess) in a single forward pass. After training, we evaluate them by prompting them with a suite of questions asking them to explain their moves and the reasoning behind them (e.g. “Explain why you chose your move.”, “Explain the rules of the game”).
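As a rough illustration of the setup described above, here is a minimal Python sketch of how one sorting-task instance and its follow-up explanation prompts might be built. The function names, the prompt layout, and the number range are assumptions made for illustration only; the two questions are the ones quoted above, and nothing here is taken from the author's actual code.

```python
import random

# The two example questions quoted in the text; the real suite is larger.
EXPLANATION_QUESTIONS = [
    "Explain why you chose your move.",
    "Explain the rules of the game.",
]


def make_sort_example(rng, length=5, low=0, high=99):
    """Build one (prompt, target) pair for an increasing-sort task.

    The "Input: ..." layout and the value range are hypothetical.
    """
    values = [rng.randint(low, high) for _ in range(length)]
    prompt = "Input: " + " ".join(str(v) for v in values)
    target = " ".join(str(v) for v in sorted(values))
    return prompt, target


def make_eval_prompts(task_prompt, model_answer):
    """Pair a solved task instance with each explanation question."""
    return [
        f"{task_prompt}\nAnswer: {model_answer}\n{question}"
        for question in EXPLANATION_QUESTIONS
    ]


if __name__ == "__main__":
    rng = random.Random(0)
    prompt, target = make_sort_example(rng)
    for eval_prompt in make_eval_prompts(prompt, target):
        print(eval_prompt)
        print("---")
```

In this sketch the target is the answer the model would be trained to emit directly, with no intermediate reasoning tokens, and the explanation prompts are only used after training.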

We find that:

Models trained to solve tasks in a single forward pass are not able to verbalize a correct reason for their actions[1]. Instead, they hallucinate incorrect reasoning.
When trained to solve a very simple sorting task (sorting lists in increasing order), the models are able to verbalize the sorting rule, although unreliably. Furthermore, we believe this might be mostly due to the sorting rule being the most likely.
When trained to solve a previously unseen task (grid-world game) with reasoning via RL [...]

---

Outline:

(00:30) Introduction

(01:45) Background

(03:26) Methods

(04:29) Datasets

(04:32) Increased Sort

(05:04) Subtracted Table Lookup

(06:04) Chess

(06:30) Hot Square Capture

(07:38) Training

(08:16) Evaluation

(09:35) Results

(09:38) Models are generally unable to verbalize their reasoning on tasks

(12:31) Training models to solve a task in natural language does not guarantee legible reasoning

(15:17) Discussion

(15:20) Limitations

(17:04) Training models to verbalize their reasoning

The original text contained 3 footnotes which were omitted from this narration.

---

First published:

February 14th, 2026

Source:

https://www.lesswrong.com/posts/dFRFxhaJkf9dE6Jfy/llms-struggle-to-verbalize-their-internal-reasoning

---

Narrated by TYPE III AUDIO.


By LessWrong

