Episode 0012: Kolmogorov Complexity and Algorithmic Randomness
Series: Greatest Hits — Ilya Sutskever's Reading List
Date: Sunday, February 22, 2026
Overview
The only full textbook on Ilya Sutskever's famous ~30-item reading list. Why did a deep learning pioneer tell John Carmack to study algorithmic randomness? Because compression is intelligence — and this book is the mathematical foundation for that claim.
We cover the key concepts: Kolmogorov complexity, the invariance theorem, incompressibility, algorithmic randomness, Berry's paradox, connections to Gödel and Turing, and why all of this matters for understanding neural network generalization.
The Textbook
Title: Kolmogorov Complexity and Algorithmic Randomness
Authors: Alexander Shen, Vladimir A. Uspensky, Nikolai Vereshchagin
Publisher: American Mathematical Society (Mathematical Surveys and Monographs, Volume 220), 2017
AMS Bookstore: bookstore.ams.org/surv-220
Author's draft (PDF): lirmm.fr/~ashen/kolmbook-eng-scan.pdf
Alexander Shen's page: lirmm.fr/~ashen
Ilya Sutskever's Reading List
Ilya Sutskever's recommended reading list — the ~30 papers and one textbook he told John Carmack would cover "90% of what matters"
Ilya Sutskever on "compression is intelligence" — various talks and interviews
Key Concepts Discussed
Kolmogorov Complexity
The length of the shortest program that produces a given output (a computable upper bound is sketched below)
Independently discovered by Ray Solomonoff (1960), Andrei Kolmogorov (1965), and Gregory Chaitin (1966)
Wikipedia: Kolmogorov complexity
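K(x) itself is uncomputable, but any off-the-shelf compressor gives a computable upper bound: the compressed bytes plus a fixed-size decompressor constitute a program that prints x. A minimal Python sketch, using zlib as a stand-in for the (unfindable) shortest program:

```python
import os
import zlib

def complexity_upper_bound(data: bytes) -> int:
    """Computable upper bound on K(data): zlib's output plus a
    constant-size decompressor is itself a program that prints `data`."""
    return len(zlib.compress(data, level=9))

structured = b"01" * 4096   # 8 KiB with an obvious repeating pattern
noise = os.urandom(8192)    # 8 KiB that is almost surely incompressible

print(complexity_upper_bound(structured))  # small: the pattern is exploited
print(complexity_upper_bound(noise))       # ~8 KiB or slightly more: no savings
```

The gap between the two outputs is the subject in miniature: structure is exactly what a short program can exploit.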
The Invariance Theorem
Kolmogorov complexity is independent of programming language, up to an additive constant (stated symbolically below)
Makes the definition well-defined as a property of the string itself
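In symbols (notation ours, consistent with the book's): for any two universal programming languages U and V there is a constant c depending only on U and V, not on x, such that

```latex
\[
  \lvert K_U(x) - K_V(x) \rvert \;\le\; c_{U,V} \qquad \text{for all strings } x .
\]
```

The constant is just the length of a cross-interpreter: a fixed U-program that simulates V, and vice versa.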
Algorithmic Randomness & Martin-Löf
A string is random iff it is incompressible: no program much shorter than the string itself prints it
Martin-Löf randomness: passes every computable statistical test (a crude one-test stand-in is sketched below)
Equivalence between the two definitions is a central theorem (the Levin–Schnorr theorem)
Wikipedia: Algorithmic randomness
Wikipedia: Per Martin-Löf
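No single program can implement "every computable statistical test", so the following Python sketch is only a crude stand-in: it uses one real compressor as one such test, flagging a string as nonrandom when the compressor saves more than a small slack.

```python
import os
import zlib

def passes_compression_test(s: bytes, slack: int = 16) -> bool:
    """One crude stand-in for a Martin-Löf test: a string 'fails' when a
    real compressor shrinks it by more than `slack` bytes. True
    Martin-Löf randomness means passing every computable test, which no
    single program can verify."""
    return len(zlib.compress(s, 9)) >= len(s) - slack

print(passes_compression_test(os.urandom(4096)))  # True (almost surely)
print(passes_compression_test(b"abc" * 2000))     # False: highly regular
```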
"The smallest positive integer not definable in fewer than twenty words"Formalized via Kolmogorov complexity to prove uncomputabilityWikipedia: Berry paradoxConnections to Gödel and Turing
Connections to Gödel and Turing
Kolmogorov complexity is uncomputable: a program that computed it would solve the halting problem
Chaitin's Omega: the halting probability, whose digits are algorithmically random (defined below)
Incompleteness, halting, and randomness as three faces of the same limit
Wikipedia: Chaitin's constant
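Chaitin's Omega in symbols (notation ours): fix a prefix-free universal machine U and sum over every program p that halts on it,

```latex
\[
  \Omega \;=\; \sum_{p \,:\, U(p)\ \text{halts}} 2^{-\lvert p \rvert} .
\]
```

Knowing the first n bits of Ω would settle the halting of every program of length up to n, which is why those bits must be algorithmically random.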
Why This Matters for AI
Compression ↔ Prediction (Shannon Duality)
Good predictors are good compressors and vice versa: log-loss is literally a code length (see the sketch below)
Neural networks trained to predict are implicitly compressing
Shannon's 1948 paper
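The duality is quantitative: a predictor that assigns probability p to the symbol that actually occurs can, via arithmetic coding, spend exactly −log2(p) bits on it. A toy sketch (our illustration) comparing a uniform predictor with a skewed one on skewed data:

```python
import math
import random

def codelength_bits(symbols, predict):
    """Total bits an ideal arithmetic coder needs, given a `predict(prefix)`
    that returns a probability distribution over the next symbol."""
    total = 0.0
    for i, sym in enumerate(symbols):
        p = predict(symbols[:i])[sym]
        total += -math.log2(p)      # better prediction => fewer bits
    return total

random.seed(0)
data = random.choices("ab", weights=[0.9, 0.1], k=1000)

uniform = lambda prefix: {"a": 0.5, "b": 0.5}
skewed  = lambda prefix: {"a": 0.9, "b": 0.1}

print(codelength_bits(data, uniform))  # 1000.0 bits: 1 bit per symbol
print(codelength_bits(data, skewed))   # ≈ 469 bits: the source entropy
```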
Minimum Description Length (MDL)
Also on Ilya's reading list
Best model = shortest total description (model + data given model); a toy model-selection sketch follows this list
Occam's razor made mathematical
Wikipedia: Minimum description length
Jorma Rissanen's original work (1978)
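A toy two-part-code sketch (ours, with numpy; not Rissanen's exact formulation): score each polynomial degree by L(model) + L(data | model), charging a fixed number of bits per coefficient plus the Gaussian log-loss of the residuals, and pick the degree with the shortest total description.

```python
import numpy as np

def mdl_score(x, y, degree, bits_per_param=32):
    """Two-part code length in bits: L(model) charges 32 bits per
    coefficient; L(data|model) is the Gaussian (differential) log-loss of
    the residuals. Discretizing y would add a constant per point that
    does not affect the comparison between degrees."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    sigma2 = max(float(np.mean(residuals**2)), 1e-12)
    data_bits = 0.5 * len(x) * np.log2(2 * np.pi * np.e * sigma2)
    return (degree + 1) * bits_per_param + data_bits

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 3 * x**2 - x + rng.normal(0.0, 0.1, x.size)   # true model: degree 2

best = min(range(8), key=lambda d: mdl_score(x, y, d))
print(best)  # 2: higher degrees shrink residuals less than they cost in bits
```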
Solomonoff Induction
Theoretically optimal prediction via a Kolmogorov complexity prior (a finite caricature is sketched below)
Every neural network is an approximation of this uncomputable ideal
Wikipedia: Solomonoff's theory of inductive inference
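A finite caricature (ours): replace "all programs" with a small pool of hypotheses, weight each by 2^(−description length), discard those the data contradicts, and average the survivors' votes. Solomonoff induction is the uncomputable limit of this scheme taken over every program.

```python
# Hypotheses: (description, next-bit rule); prior weight = 2**-len(description).
hypotheses = [
    ("0",   lambda bits: 0),                        # always 0
    ("1",   lambda bits: 1),                        # always 1
    ("10",  lambda bits: len(bits) % 2),            # alternating 0101...
    ("110", lambda bits: bits[-1] if bits else 0),  # repeat the last bit
]

def predict_next(bits):
    """Weighted prediction: keep hypotheses consistent with the observed
    prefix, weight each by 2^-description_length, average their votes."""
    num = den = 0.0
    for desc, rule in hypotheses:
        if all(rule(bits[:i]) == b for i, b in enumerate(bits)):
            w = 2.0 ** -len(desc)
            num += w * rule(bits)
            den += w
    return num / den   # probability that the next bit is 1

print(predict_next([0, 1, 0, 1, 0]))  # 1.0: only the alternating rule survives
```

The shortest surviving hypothesis dominates the mixture, which is Occam's razor falling straight out of the prior.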
Deep Learning Generalization
Why overparameterized networks don't overfit: they compress
Blier & Ollivier (2018): trained network weights compress the training data (their prequential accounting is sketched below)
The Description Length of Deep Learning Models (Blier & Ollivier, NeurIPS 2018)
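Their measurement is a prequential code: stream through the training set, pay −log2 p(next label) under the model fitted to everything seen so far, then update. A network that generalizes pays few bits per example. The stand-in below (ours) swaps the network for a Laplace-smoothed frequency model; the accounting is identical.

```python
import math

def prequential_bits(labels):
    """Prequential code length: predict each label with the model trained
    on everything before it, pay -log2 p, then update. Here the 'model'
    is Laplace-smoothed label frequencies; Blier & Ollivier do the same
    accounting with a neural network as the predictor."""
    counts = {0: 1, 1: 1}   # Laplace (add-one) prior
    total = 0.0
    for y in labels:
        p = counts[y] / (counts[0] + counts[1])
        total += -math.log2(p)
        counts[y] += 1
    return total

print(prequential_bits([0] * 900 + [1] * 100))  # ≈ 475 bits, well under 1000
print(prequential_bits([0, 1] * 500))           # ≈ 1005 bits: alternation is
                                                # structure this model can't see
```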
The Authors
Alexander Shen — LIRMM, CNRS & Université de Montpellier, France. Homepage
Vladimir A. Uspensky (1930–2018) — Lomonosov Moscow State University. Pioneer in Russian mathematical logic. Wikipedia
Nikolai Vereshchagin — Lomonosov Moscow State University. Google Scholar
Further Reading
Li & Vitányi, "An Introduction to Kolmogorov Complexity and Its Applications" — the other major textbook on Kolmogorov complexity
Hutter, "Universal Artificial Intelligence" — extends Solomonoff induction to reinforcement learning (AIXI)
Grünwald, "The Minimum Description Length Principle" — comprehensive treatment of MDL
This episode was researched and written with AI assistance.