Intellectually Curious

Smallest Addition Transformer


We dive into the race to build a perfectly accurate 10-digit addition model with under 7,000 parameters, comparing ClaudeCode’s data-forward approach with reversed output to Codex’s token-based compression. Along the way, we explore grokking, data formatting tricks, and what these tiny models reveal about AI research and problem-solving at scale.
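The reversed-output trick mentioned above is a known data-formatting technique for tiny addition transformers: emitting the sum least-significant-digit first lets the model resolve each carry before it must produce the next digit. As a minimal sketch (the exact format used in the episode's models is not specified, so the padding and separator choices here are assumptions):

```python
def format_example(a: int, b: int, digits: int = 10) -> str:
    """Format one addition training example with the sum's digits reversed.

    Zero-pads both operands to a fixed width, then writes the sum
    least-significant-digit first, so the model can generate each digit
    after its carry is already determined.
    """
    total = a + b
    # The sum of two n-digit numbers needs at most n + 1 digits.
    reversed_sum = str(total).zfill(digits + 1)[::-1]
    return f"{a:0{digits}d}+{b:0{digits}d}={reversed_sum}"

# Example: 123 + 456 = 579, emitted as "975" plus reversed zero-padding.
print(format_example(123, 456))
# → 0000000123+0000000456=97500000000
```

Reading the target right to left, the first emitted digit is the ones place (9), so the carry chain flows in the same direction as generation.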


Note: This podcast was AI-generated, and AI can sometimes make mistakes. Please double-check any critical information.

Sponsored by Embersilk LLC


By Mike Breault