Episode 0012: Kolmogorov Complexity and Algorithmic Randomness
Series: Greatest Hits — Ilya Sutskever's Reading List
Date: Sunday, February 22, 2026
Overview
The only full textbook on Ilya Sutskever's famous ~30-item reading list. Why did a deep learning pioneer tell John Carmack to study algorithmic randomness? Because compression is intelligence — and this book is the mathematical foundation for that claim.
We cover the key concepts: Kolmogorov complexity, the invariance theorem, incompressibility, algorithmic randomness, Berry's paradox, connections to Gödel and Turing, and why all of this matters for understanding neural network generalization.
The Textbook
Title: Kolmogorov Complexity and Algorithmic Randomness
Authors: Alexander Shen, Vladimir A. Uspensky, Nikolai Vereshchagin
Publisher: American Mathematical Society (Mathematical Surveys and Monographs, Volume 220), 2017
AMS Bookstore: bookstore.ams.org/surv-220
Author's draft (PDF): lirmm.fr/~ashen/kolmbook-eng-scan.pdf
Alexander Shen's page: lirmm.fr/~ashen
Ilya Sutskever's Reading List
Ilya Sutskever's recommended reading list — the ~30 papers and one textbook he told John Carmack would cover "90% of what matters"
Ilya Sutskever on "compression is intelligence" — various talks and interviews
Key Concepts Discussed
Kolmogorov Complexity
The length of the shortest program that produces a given output (a computable upper bound is sketched below)
Independently discovered by Ray Solomonoff (1960), Andrei Kolmogorov (1965), and Gregory Chaitin (1966)
Wikipedia: Kolmogorov complexity
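K(x) itself is uncomputable, but any off-the-shelf compressor gives a computable upper bound: the compressed bytes plus a fixed-size decompressor constitute a program that prints x. A minimal Python sketch, using zlib as a stand-in for the (unfindable) shortest program:

```python
import os
import zlib

def complexity_upper_bound(data: bytes) -> int:
    """Computable upper bound on K(data): zlib's output plus a
    constant-size decompressor is itself a program that prints `data`."""
    return len(zlib.compress(data, level=9))

structured = b"01" * 4096   # 8 KiB with an obvious repeating pattern
noise = os.urandom(8192)    # 8 KiB that is almost surely incompressible

print(complexity_upper_bound(structured))  # small: the pattern is exploited
print(complexity_upper_bound(noise))       # ~8 KiB or slightly more: no savings
```

The gap between the two outputs is the subject in miniature: structure is exactly what a short program can exploit.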
The Invariance Theorem
Kolmogorov complexity is independent of programming language, up to an additive constant (stated symbolically below)
Makes the definition well-defined as a property of the string itself
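In symbols (notation ours, consistent with the book's): for any two universal programming languages U and V there is a constant c depending only on U and V, not on x, such that

```latex
\[
  \lvert K_U(x) - K_V(x) \rvert \;\le\; c_{U,V} \qquad \text{for all strings } x .
\]
```

The constant is just the length of a cross-interpreter: a fixed U-program that simulates V, and vice versa.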
Algorithmic Randomness & Martin-Löf
A string is random iff it is incompressible: no program much shorter than the string itself prints it
Martin-Löf randomness: passes every computable statistical test (a crude one-test stand-in is sketched below)
Equivalence between the two definitions is a central theorem (the Levin–Schnorr theorem)
Wikipedia: Algorithmic randomness
Wikipedia: Per Martin-Löf
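No single program can implement "every computable statistical test", so the following Python sketch is only a crude stand-in: it uses one real compressor as one such test, flagging a string as nonrandom when the compressor saves more than a small slack.

```python
import os
import zlib

def passes_compression_test(s: bytes, slack: int = 16) -> bool:
    """One crude stand-in for a Martin-Löf test: a string 'fails' when a
    real compressor shrinks it by more than `slack` bytes. True
    Martin-Löf randomness means passing every computable test, which no
    single program can verify."""
    return len(zlib.compress(s, 9)) >= len(s) - slack

print(passes_compression_test(os.urandom(4096)))  # True (almost surely)
print(passes_compression_test(b"abc" * 2000))     # False: highly regular
```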
"The smallest positive integer not definable in fewer than twenty words"Formalized via Kolmogorov complexity to prove uncomputabilityWikipedia: Berry paradoxConnections to Gödel and Turing
Connections to Gödel and Turing
Kolmogorov complexity is uncomputable: a program that computed it would solve the halting problem
Chaitin's Omega: the halting probability, whose digits are algorithmically random (defined below)
Incompleteness, halting, and randomness as three faces of the same limit
Wikipedia: Chaitin's constant
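Chaitin's Omega in symbols (notation ours): fix a prefix-free universal machine U and sum over every program p that halts on it,

```latex
\[
  \Omega \;=\; \sum_{p \,:\, U(p)\ \text{halts}} 2^{-\lvert p \rvert} .
\]
```

Knowing the first n bits of Ω would settle the halting of every program of length up to n, which is why those bits must be algorithmically random.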
Why This Matters for AI
Compression ↔ Prediction (Shannon Duality)
Good predictors are good compressors and vice versa: log-loss is literally a code length (see the sketch below)
Neural networks trained to predict are implicitly compressing
Shannon's 1948 paper
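The duality is quantitative: a predictor that assigns probability p to the symbol that actually occurs can, via arithmetic coding, spend exactly −log2(p) bits on it. A toy sketch (our illustration) comparing a uniform predictor with a skewed one on skewed data:

```python
import math
import random

def codelength_bits(symbols, predict):
    """Total bits an ideal arithmetic coder needs, given a `predict(prefix)`
    that returns a probability distribution over the next symbol."""
    total = 0.0
    for i, sym in enumerate(symbols):
        p = predict(symbols[:i])[sym]
        total += -math.log2(p)      # better prediction => fewer bits
    return total

random.seed(0)
data = random.choices("ab", weights=[0.9, 0.1], k=1000)

uniform = lambda prefix: {"a": 0.5, "b": 0.5}
skewed  = lambda prefix: {"a": 0.9, "b": 0.1}

print(codelength_bits(data, uniform))  # 1000.0 bits: 1 bit per symbol
print(codelength_bits(data, skewed))   # ≈ 469 bits: the source entropy
```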
Minimum Description Length (MDL)
Also on Ilya's reading list
Best model = shortest total description (model + data given model); a toy model-selection sketch follows this list
Occam's razor made mathematical
Wikipedia: Minimum description length
Jorma Rissanen's original work (1978)
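A toy two-part-code sketch (ours, with numpy; not Rissanen's exact formulation): score each polynomial degree by L(model) + L(data | model), charging a fixed number of bits per coefficient plus the Gaussian log-loss of the residuals, and pick the degree with the shortest total description.

```python
import numpy as np

def mdl_score(x, y, degree, bits_per_param=32):
    """Two-part code length in bits: L(model) charges 32 bits per
    coefficient; L(data|model) is the Gaussian (differential) log-loss of
    the residuals. Discretizing y would add a constant per point that
    does not affect the comparison between degrees."""
    coeffs = np.polyfit(x, y, degree)
    residuals = y - np.polyval(coeffs, x)
    sigma2 = max(float(np.mean(residuals**2)), 1e-12)
    data_bits = 0.5 * len(x) * np.log2(2 * np.pi * np.e * sigma2)
    return (degree + 1) * bits_per_param + data_bits

rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 200)
y = 3 * x**2 - x + rng.normal(0.0, 0.1, x.size)   # true model: degree 2

best = min(range(8), key=lambda d: mdl_score(x, y, d))
print(best)  # 2: higher degrees shrink residuals less than they cost in bits
```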
Solomonoff Induction
Theoretically optimal prediction via a Kolmogorov complexity prior (a finite caricature is sketched below)
Every neural network is an approximation of this uncomputable ideal
Wikipedia: Solomonoff's theory of inductive inference
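A finite caricature (ours): replace "all programs" with a small pool of hypotheses, weight each by 2^(−description length), discard those the data contradicts, and average the survivors' votes. Solomonoff induction is the uncomputable limit of this scheme taken over every program.

```python
# Hypotheses: (description, next-bit rule); prior weight = 2**-len(description).
hypotheses = [
    ("0",   lambda bits: 0),                        # always 0
    ("1",   lambda bits: 1),                        # always 1
    ("10",  lambda bits: len(bits) % 2),            # alternating 0101...
    ("110", lambda bits: bits[-1] if bits else 0),  # repeat the last bit
]

def predict_next(bits):
    """Weighted prediction: keep hypotheses consistent with the observed
    prefix, weight each by 2^-description_length, average their votes."""
    num = den = 0.0
    for desc, rule in hypotheses:
        if all(rule(bits[:i]) == b for i, b in enumerate(bits)):
            w = 2.0 ** -len(desc)
            num += w * rule(bits)
            den += w
    return num / den   # probability that the next bit is 1

print(predict_next([0, 1, 0, 1, 0]))  # 1.0: only the alternating rule survives
```

The shortest surviving hypothesis dominates the mixture, which is Occam's razor falling straight out of the prior.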
Deep Learning Generalization
Why overparameterized networks don't overfit: they compress
Blier & Ollivier (2018): trained network weights compress the training data (their prequential accounting is sketched below)
The Description Length of Deep Learning Models (Blier & Ollivier, NeurIPS 2018)
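Their measurement is a prequential code: stream through the training set, pay −log2 p(next label) under the model fitted to everything seen so far, then update. A network that generalizes pays few bits per example. The stand-in below (ours) swaps the network for a Laplace-smoothed frequency model; the accounting is identical.

```python
import math

def prequential_bits(labels):
    """Prequential code length: predict each label with the model trained
    on everything before it, pay -log2 p, then update. Here the 'model'
    is Laplace-smoothed label frequencies; Blier & Ollivier do the same
    accounting with a neural network as the predictor."""
    counts = {0: 1, 1: 1}   # Laplace (add-one) prior
    total = 0.0
    for y in labels:
        p = counts[y] / (counts[0] + counts[1])
        total += -math.log2(p)
        counts[y] += 1
    return total

print(prequential_bits([0] * 900 + [1] * 100))  # ≈ 475 bits, well under 1000
print(prequential_bits([0, 1] * 500))           # ≈ 1005 bits: alternation is
                                                # structure this model can't see
```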
The Authors
Alexander Shen — LIRMM, CNRS & Université de Montpellier, France. Homepage
Vladimir A. Uspensky (1930–2018) — Lomonosov Moscow State University. Pioneer in Russian mathematical logic. Wikipedia
Nikolai Vereshchagin — Lomonosov Moscow State University. Google Scholar
Further Reading
Li & Vitányi, "An Introduction to Kolmogorov Complexity and Its Applications" — the other major textbook on Kolmogorov complexity
Hutter, "Universal Artificial Intelligence" — extends Solomonoff induction to reinforcement learning (AIXI)
Grünwald, "The Minimum Description Length Principle" — comprehensive treatment of MDL
This episode was researched and written with AI assistance.