April 30, 2026

“Learning zero, and what SLT gets wrong about it” by Dmitry Vaintrob

22 minutes

This is the first in a pair of posts I'm hoping to write about Singular Learning Theory (SLT) and singularities as a model of data degeneracy. If I get to it, the second post is going to be more general-audience; this one is more technical.

Introduction

To me, SLT is an important source of toy models which point at an interesting class of new statistical phenomena in learning. It is also a valuable correction to an older and (at this point) largely defunct story of learning being fully controlled by Hessian eigenvalues and "nonsingular basins". Practitioners of SLT have been instrumental in developing and refining the practice of applying Bayesian sampling (used by physicists in papers like this one) to empirical models. And the theory's founder Sumio Watanabe is a once-in-a-generation genius who saw and mathematically justified crucial statistical and information-theoretic concepts in learning long before they appeared in "mainstream" ML theory.

However, there is a frequently repeated statement in SLT papers – one that doesn't affect empirical results – which I think is wrong in a load-bearing way. This is the statement that models that appear in machine learning are singular in the infinite-data limit, and that a measurement [...]

---

Outline:

(00:27) Introduction

(03:40) What doesn't need fixing

(04:45) What's wrong

(07:17) The theory

(08:05) Infinite data, and the parameters and .

(09:16) The SLT prediction

(10:31) Hermite modes and excitations

(13:50) Addendum: the actual lambda-hat scaling (Ansatz and experiment)

(15:28) The effective theory

(18:39) Is this example special?

(20:27) The upshot

The original text contained 10 footnotes which were omitted from this narration.

---

First published:

April 28th, 2026

Source:

https://www.lesswrong.com/posts/5hKgJy8rcqnM9ntp2/learning-zero-and-what-slt-gets-wrong-about-it

---

Narrated by TYPE III AUDIO.

---

By LessWrong
