LessWrong (30+ Karma)

“Omniscaling to MNIST” by cloud


In this post, I describe a mindset that is flawed, and yet helpful for choosing impactful technical AI safety research projects.

The mindset is this: future AI might look very different than AI today, but good ideas are universal. If you want to develop a method that will scale up to powerful future AI systems, your method should also scale down to MNIST. In other words, good ideas omniscale: they work well across all model sizes, domains, and training regimes.

The Modified National Institute of Standards and Technology database (MNIST): 70,000 images of handwritten digits, 28x28 pixels each (source: Wikipedia). You can fit the whole dataset and many models on a single GPU!
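
As a rough check on the single-GPU claim, the dataset's raw size follows directly from the figures in the caption. The sketch below is a back-of-the-envelope calculation, not a measurement. It assumes one grayscale channel per pixel, ignores labels and framework overhead, and counts a megabyte as 10^6 bytes. The helper name `dataset_bytes` is made up for illustration.

```python
# Back-of-the-envelope memory footprint of MNIST from the caption's figures.
# Assumptions (not from the original post): one grayscale channel per pixel,
# labels and framework overhead ignored, 1 MB = 10**6 bytes.
NUM_IMAGES = 70_000
HEIGHT = 28
WIDTH = 28


def dataset_bytes(bytes_per_pixel: int) -> int:
    """Raw size in bytes of all images at the given storage precision."""
    return NUM_IMAGES * HEIGHT * WIDTH * bytes_per_pixel


if __name__ == "__main__":
    # uint8 is 1 byte per pixel; float32 is 4 bytes per pixel.
    for name, width in [("uint8", 1), ("float32", 4)]:
        print(f"{name}: {dataset_bytes(width) / 1e6:.2f} MB")
```

Even stored as 32-bit floats, the images come to roughly 220 MB, which is small next to the multi-gigabyte memory of commonly available GPUs.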

Putting the omniscaling mindset into practice is straightforward. Any time you come across a clever-sounding machine learning idea, ask: "can I apply this to MNIST?" If not, then it's not a good idea. If so, run an experiment to see if it works. If it doesn't, then it's not a good idea. If it does, then it might be a good idea, and you can continue as usual to more realistic experiments or theory.

In this post, I will:

  1. Share how MNIST experiments have informed my [...]

---

Outline:

(01:58) Applications to MNIST

(02:42) Gradient routing

(04:43) Distillation robustifies unlearning

(08:39) Subliminal learning

(10:37) Why you should do it on MNIST

(11:30) MNIST is not sufficient (and other tips)

(14:25) The omniscaling assumption is false

(17:09) Code and more ideas

(18:40) Closing thoughts

The original text contained 7 footnotes which were omitted from this narration.

---

First published: November 8th, 2025

Source: https://www.lesswrong.com/posts/4aeshNuEKF8Ak356D/omniscaling-to-mnist

---

Narrated by TYPE III AUDIO.



