Show Notes — The Bitter Lesson (DTFFTL-0027)
Why it matters.
Rich Sutton published a 1,200-word essay in 2019, and it was largely dismissed. Then the past five years vindicated every word of it. Now Sutton is making a second claim: that the LLM paradigm — train in a lab, freeze weights, deploy — is structurally limited in the same way the knowledge-based approaches he criticized were limited. He and John Carmack are building the alternative at Keen Technologies, targeting a genuine AGI prototype by 2030. Whether they succeed or fail, the argument deserves serious examination. The track record says so.
Primary Sources
- The Bitter Lesson — Rich Sutton (2019) — The original essay. 1,200 words. Read it.
- Rich Sutton's homepage — University of Alberta — Papers, essays, and research archive
- Reinforcement Learning: An Introduction — Sutton & Barto (2nd ed., 2018) — Free online; the canonical RL textbook
- Sutton's 2025 talk: Toward Greater Generality and Autonomy in AI — Post-LLM critique, continual learning, the training/deployment divide
- Turing Award 2024: Sutton & Barto citation — ACM announcement for RL foundations

Keen Technologies
- Keen Technologies — company site — Carmack's AGI startup
- Carmack leaves Meta announcement — Dec 2022 — Carmack's Facebook post explaining his departure
- Carmack / Sutton partnership announcement — Sep 2023 — Carmack tweet on joining forces with Sutton
- Keen Technologies $20M seed round coverage — TechCrunch on investors and goals
- Carmack on AGI: Lex Fridman podcast #309 — Long-form on leaving Meta, AGI approach, and timelines

Historical Cases from the Essay
- Deep Blue vs. Kasparov — IBM research archive — The 1997 match and the brute-force vs. knowledge debate
- AlphaGo paper — Silver et al., Nature 2016 — "Mastering the game of Go with deep neural networks and tree search"
- Deep learning vs. handcrafted features in speech — Hinton et al., 2012 — "Deep Neural Networks for Acoustic Modeling in Speech Recognition"
- ImageNet and the end of handcrafted vision features — Krizhevsky et al., 2012 — AlexNet, the paper that broke the SIFT era.

Reinforcement Learning Foundations
- Temporal Difference Learning — Sutton (1988) — TD learning paper; the foundation of the value-function approach
- Policy gradient methods — Sutton et al. (1999) — Foundations of modern deep RL
- World models — Ha & Schmidhuber (2018) — Agents that model their environment and plan inside the model

LLM Critique Context
- RLHF paper — Christiano et al. (2017) — "Deep Reinforcement Learning from Human Preferences" — the RLHF origin paper
- Limitations of LLMs as reasoners — Marcus & Davis (2019) — Pre-GPT-4 critique; useful historical context
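For listeners who want to see the core idea behind the TD learning paper linked under Reinforcement Learning Foundations, here is a minimal sketch of the TD(0) update on a toy 5-state random walk. The environment, rewards, and step size are illustrative assumptions, not details from the episode or from Sutton (1988); only the update rule itself is the technique being named.

```python
# Minimal TD(0) value estimation on a toy random walk.
# States 0..4; episodes start in the middle; stepping off the right
# end pays +1, off the left end pays 0. All constants are assumptions
# chosen for illustration.
import random

random.seed(0)
N_STATES = 5   # number of non-terminal states
ALPHA = 0.1    # step size
GAMMA = 1.0    # undiscounted episodic task

def run_episode(V):
    """Run one episode, applying the TD(0) update after every step."""
    s = N_STATES // 2
    while True:
        s_next = s + random.choice((-1, 1))
        if s_next < 0:                     # fell off the left end
            r, done = 0.0, True
        elif s_next >= N_STATES:           # fell off the right end
            r, done = 1.0, True
        else:
            r, done = 0.0, False
        # TD(0): move V[s] toward the one-step bootstrapped target
        target = r if done else r + GAMMA * V[s_next]
        V[s] += ALPHA * (target - V[s])
        if done:
            return
        s = s_next

V = [0.5] * N_STATES
for _ in range(5000):
    run_episode(V)
# V should now approximate the true values 1/6, 2/6, ..., 5/6
print([round(v, 2) for v in V])
```

The point of the sketch is the single update line: the value estimate is nudged toward a target built from the next state's own estimate, which is the bootstrapping idea the 1988 paper introduced.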