The Daily ML

Ep39. Arithmetic Without Algorithms: Language Models Solve Math With a Bag of Heuristics



This research investigates how large language models (LLMs) perform arithmetic. The authors find that LLMs rely on neither robust algorithms nor memorization; instead, they use a "bag of heuristics," a collection of simple rules, to solve arithmetic problems. The authors identify a specific set of neurons that implement these heuristics and analyze how the heuristics develop over the course of training. Their findings suggest that improving LLMs' mathematical abilities may require fundamental changes to training and architecture rather than post-hoc techniques.

By The Daily ML