LessWrong (30+ Karma)

“Finding Backward Chaining Circuits in Transformers Trained on Tree Search” by abhayesian, Jannik Brinkmann, Victor Levoso



This is a link post.

This post is a summary of our paper "A Mechanistic Analysis of a Transformer Trained on a Symbolic Multi-Step Reasoning Task" (ACL 2024). Although we wrote and released the paper a couple of months ago, we have done a poor job of promoting it so far. We are therefore writing up a summary of our results here to renew interest in the work and, we hope, find some collaborators for follow-up projects. If you're interested in the results we describe in this post, please see the paper for more details.

TL;DR - We train transformer models to find the path from the root of a tree to a given leaf (given an edge list of the tree). We use standard techniques from mechanistic interpretability to figure out how our model performs this task. We found circuits that involve backward chaining - the first layer attends to [...]
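To make the task concrete, here is a minimal symbolic sketch of backward chaining on an edge list: start at the goal leaf and repeatedly look up the current node's parent until the root is reached. This is only a reference algorithm for the task as described above, not the transformer's learned mechanism. The function name `path_to_leaf`, the `(parent, child)` edge format, and the example tree are assumptions made for illustration.

```python
def path_to_leaf(edges, root, goal):
    """Return the root-to-goal path in a tree given as (parent, child) edges.

    Illustrative sketch: chains backward from the goal by looking up
    each node's parent, then reverses the result.
    """
    # In a tree every non-root node has exactly one parent.
    parent = {child: par for par, child in edges}

    path = [goal]
    while path[-1] != root:
        node = path[-1]
        if node not in parent:
            raise ValueError(f"node {node!r} is not connected to the root")
        path.append(parent[node])  # one backward-chaining step

    path.reverse()  # steps were collected goal-to-root
    return path


# Hypothetical example tree: 0 is the root, 3, 4 and 5 are leaves.
edges = [(0, 1), (0, 2), (1, 3), (1, 4), (2, 5)]
print(path_to_leaf(edges, root=0, goal=4))  # [0, 1, 4]
```

Chaining backward from the goal is unambiguous because each node has a single parent, whereas searching forward from the root would require choosing among several children at each step.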

---

Outline:

(01:52) Motivation and The Task

(05:18) Backward Chaining with Deduction Heads

(08:57) Register Tokens and Path Merging

(10:38) Final Heuristic

(12:09) Tuned Lens Visualization

(12:42) Takeaways and Limitations

(15:37) Future work

(16:26) Acknowledgments

---

First published:

May 28th, 2024

Source:

https://www.lesswrong.com/posts/EBbcuSuNafkYpsgTW/finding-backward-chaining-circuits-in-transformers-trained-1

---

Narrated by TYPE III AUDIO.
