
A game millions of people solve over morning coffee is exposing a fundamental weakness in today’s most powerful AI models. In this Five-Minute Friday, Jon Krohn breaks down Pathway’s new Sudoku Extreme benchmark, a set of roughly 250,000 of the hardest Sudoku puzzles available. He explains why leading LLMs like o3-mini, DeepSeek-R1, and Claude 3.7 Sonnet scored effectively zero percent, while Pathway’s post-transformer BDH architecture achieved 97.4% accuracy at a fraction of the cost. Listen to the episode to find out what BDH is doing differently, why Sudoku performance matters far beyond puzzles, and what this means for the future of AI reasoning.
Additional materials: www.superdatascience.com/978
Interested in sponsoring a SuperDataScience Podcast episode? Email [email protected] for sponsorship information.
By Jon Krohn
4.6 (295 ratings)
