Artificial Discourse

Grokked Transformers are Implicit Reasoners: A Mechanistic Journey to the Edge of Generalization



This research explores whether transformers, a widely used neural network architecture, can learn to reason implicitly over the knowledge stored in their parameters. The authors find that transformers can acquire implicit reasoning, but only through grokking, where training continues far beyond the point of overfitting. The study examines two reasoning types: composition and comparison. For both types the models generalize well on in-distribution examples, but out of distribution they fail on composition while succeeding on comparison. Mechanistic analysis of the model's internals reveals that grokking forms a different circuit for each reasoning type, which explains the differing levels of systematicity. The authors also demonstrate the potential of parametric memory for complex reasoning tasks with large search spaces: a fully grokked transformer achieves near-perfect accuracy, while state-of-the-art LLMs relying on non-parametric memory fail.
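To make the setup concrete, here is a minimal Python sketch of how a synthetic two-hop composition dataset of the kind the paper studies might be constructed, with atomic facts seen in training and held-out inferred facts used to probe implicit reasoning. The entity/relation counts and data structures are illustrative assumptions, not the authors' actual data pipeline.

```python
import random

# A minimal sketch, assuming a synthetic "composition" setup in the spirit
# of the paper (sizes and structure here are illustrative assumptions).
# Atomic facts map (head, relation) -> tail; inferred two-hop facts compose
# two atomic lookups: (h, r1, r2) -> t, where (h, r1) -> b and (b, r2) -> t.

random.seed(0)
NUM_ENTITIES, NUM_RELATIONS = 100, 20
entities = list(range(NUM_ENTITIES))
relations = list(range(NUM_RELATIONS))

# Atomic facts: each (entity, relation) pair points to a random tail entity.
atomic = {(h, r): random.choice(entities) for h in entities for r in relations}

# Two-hop inferred facts derived by composing two atomic lookups.
inferred = [((h, r1, r2), atomic[(bridge, r2)])
            for (h, r1), bridge in atomic.items()
            for r2 in relations]

# Simple random split of the inferred facts: the model would be trained on
# all atomic facts plus the training portion of inferred facts, and implicit
# reasoning would be measured by accuracy on the held-out inferred facts.
random.shuffle(inferred)
split = int(0.9 * len(inferred))
train_inferred, test_inferred = inferred[:split], inferred[split:]
print(len(atomic), "atomic facts;", len(train_inferred), "train /",
      len(test_inferred), "test inferred facts")
```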


Artificial Discourse, by Kenpachi