
We dive into the race to build a perfectly accurate 10-digit addition model with under 7,000 parameters, comparing Claude Code’s data-forward approach with reversed output to Codex’s token-based compression. Along the way, we explore grokking, data formatting tricks, and what these tiny models reveal about AI research and problem-solving at scale.
Note: This podcast was AI-generated, and sometimes AI can make mistakes. Please double-check any critical information.
Sponsored by Embersilk LLC
By Mike Breault