LessWrong (30+ Karma)

“In-context learning of representations can be explained by induction circuits” by Andy Arditi



This is a crosspost of my ICLR 2026 blogpost track post. All code and experiments are available at github.com/andyrdt/iclr_induction.

Summary

Park et al., 2025 show that when large language models (LLMs) process random walks on a graph, their internal representations come to mirror the underlying graph's structure. The authors interpret this broadly, suggesting that LLMs can "manipulate their representations in order to reflect concept semantics specified entirely in-context". In this post, we take a closer look at the underlying mechanism, and suggest a simpler explanation. We argue that induction circuits (Elhage et al., 2021; Olsson et al., 2022), a well-known mechanism for in-context bigram recall, suffice to explain both the task performance and the representation geometry observed by Park et al.
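The induction mechanism referenced here can be sketched in a few lines. This is a minimal illustrative sketch of in-context bigram recall (the [A][B] ... [A] → [B] pattern), not the authors' code: given a token sequence, find the most recent earlier occurrence of the current token and predict the token that followed it.

```python
def induction_predict(tokens):
    """In-context bigram recall, induction-style: locate the most
    recent earlier occurrence of the final token, and predict the
    token that followed that occurrence."""
    current = tokens[-1]
    # Scan backwards over earlier positions for a match.
    for i in range(len(tokens) - 2, -1, -1):
        if tokens[i] == current:
            return tokens[i + 1]
    return None  # no earlier occurrence: no prediction

# On a walk revisiting "bird", the mechanism recalls what followed it before:
walk = ["apple", "bird", "milk", "bird"]
print(induction_predict(walk))  # -> "milk"
```

On a graph random walk, revisited nodes trigger exactly this lookup, which is why such a mechanism could account for valid-neighbor predictions without any richer in-context representation learning.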

Recapitulation and reproduction of Park et al., 2025

We begin by describing the experimental setup of Park et al., 2025 and reproducing their main results on Llama-3.1-8B.

Figure 1. Overview of Park et al.
(a) The grid tracing task uses a 4×4 grid of words. (b) Models observe random walks on the grid (e.g., apple → bird → milk → sand → sun → plane → opera → ...) where consecutive words are always neighbors. As the sequence length grows, the model begins to predict valid [...]
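The random walks described in the caption can be generated with a short sketch. The word-to-cell assignment below is illustrative only (the actual words and layout come from Park et al.); the key property is that consecutive words are always grid neighbors.

```python
import random

# Hypothetical 4x4 grid of words, row by row (illustrative assignment only).
WORDS = ["apple", "bird", "milk", "sand",
         "sun", "plane", "opera", "ring",
         "door", "fish", "lamp", "rock",
         "tree", "boat", "coin", "star"]

def neighbors(idx, side=4):
    """Up/down/left/right neighbors of a cell in a flattened grid."""
    r, c = divmod(idx, side)
    out = []
    for dr, dc in [(-1, 0), (1, 0), (0, -1), (0, 1)]:
        nr, nc = r + dr, c + dc
        if 0 <= nr < side and 0 <= nc < side:
            out.append(nr * side + nc)
    return out

def random_walk(length, seed=0):
    """Sample a random walk; consecutive words are always grid neighbors."""
    rng = random.Random(seed)
    idx = rng.randrange(len(WORDS))
    indices = [idx]
    for _ in range(length - 1):
        idx = rng.choice(neighbors(idx))
        indices.append(idx)
    return [WORDS[i] for i in indices]

print(random_walk(8))
```

Feeding such walks to the model as a token sequence is the setup under which Park et al. probe the resulting representations.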

---

Outline:

(00:20) Summary

(01:06) Recapitulation and reproduction of Park et al., 2025

(02:03) The grid tracing task

(04:36) Reproduction and Park et al.'s interpretation

(06:19) A simpler explanation: induction circuits

(07:32) Testing the induction hypothesis

(08:45) Results

(11:25) Previous-token mixing can account for representation geometry

(11:52) The neighbor-mixing hypothesis

(12:50) A toy model of previous-token mixing

(13:56) Evidence of neighbor mixing in individual model activations

(15:34) Limitations

(17:57) Conclusion

The original text contained 4 footnotes which were omitted from this narration.

---

First published:

March 2nd, 2026

Source:

https://www.lesswrong.com/posts/qtdSzLpQ8BXv6YANd/in-context-learning-of-representations-can-be-explained-by

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
