March 08, 2026

“Can governments quickly and cheaply slow AI training?” by joshc

37 minutes

I originally wrote this as a private doc for people working in the field - it's not super polished, or optimized for a broad audience.

But I'm publishing anyway because inference-verification is a new and exciting area, and there aren't many bird's-eye-view explainers of what's going on in it and what the bottlenecks are.

1. Summary

I think powerful AI will be obviously scary at some point, and companies or governments might want to slow it down to buy time for additional safety or oversight. Maybe this could be done quickly, e.g. by:

Unplugging inter-server cables to slow gradient syncs
Limiting bandwidth with simple devices
Periodically erasing clusters to delete covert training checkpoints
Recomputing a sample of outputs to confirm they are, in fact, inference generations

(Section 2)
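
To give a rough feel for the last item (recomputing a sample of outputs), here is a minimal sketch of how spot-check sample size relates to the chance of catching covert work. It is not from the post: it assumes outputs are sampled uniformly and independently, that recomputation always flags a non-inference output, and that a fixed fraction of outputs is covert. The function names and the example numbers are hypothetical.

```python
import math


def detection_probability(covert_fraction: float, samples: int) -> float:
    """Chance that at least one of `samples` spot-checked outputs is covert.

    Assumes uniform, independent sampling and a perfect recomputation check
    (illustrative assumptions, not claims from the post).
    """
    if not 0.0 <= covert_fraction <= 1.0:
        raise ValueError("covert_fraction must be in [0, 1]")
    if samples < 0:
        raise ValueError("samples must be non-negative")
    return 1.0 - (1.0 - covert_fraction) ** samples


def samples_needed(covert_fraction: float, target: float) -> int:
    """Smallest sample count whose detection probability reaches `target`."""
    if not 0.0 < covert_fraction < 1.0 or not 0.0 < target < 1.0:
        raise ValueError("covert_fraction and target must be in (0, 1)")
    return math.ceil(math.log(1.0 - target) / math.log(1.0 - covert_fraction))


if __name__ == "__main__":
    # Hypothetical scenario: 1% of outputs are not genuine inference.
    n = samples_needed(covert_fraction=0.01, target=0.95)
    print(n, round(detection_probability(0.01, n), 4))
```

Under these assumptions the required sample count depends on the covert fraction rather than on total output volume, which is why spot-checking can stay cheap as a cluster grows.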

Would these methods actually work? Or more specifically, if these methods were implemented quickly and correctly, would they substantially slow AI development?

I looked into this question for around a week, and here are my current views:

Current prototypes of inference-verification would probably be ineffective. Standard inference-verification measures slow training by restricting communication between servers (see Section 2), since training involves chucking big gradients around in a hivemind, and inference [...]
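
To make the communication point concrete, here is a back-of-the-envelope sketch of how long one gradient sync takes at a given link speed. It is a simplification and not from the post: it assumes a full copy of the gradients crosses a single link on every sync, and it ignores all-reduce topology, overlap with compute, and compression. The model size, precision, and bandwidth figures are hypothetical.

```python
def sync_seconds(num_params: float, bytes_per_param: float, bandwidth_gbps: float) -> float:
    """Seconds to push one full gradient copy over a link.

    bandwidth_gbps is in gigabits per second (1e9 bits/s).
    """
    if bandwidth_gbps <= 0:
        raise ValueError("bandwidth_gbps must be positive")
    bits_to_send = num_params * bytes_per_param * 8
    return bits_to_send / (bandwidth_gbps * 1e9)


if __name__ == "__main__":
    # Hypothetical model: 1e11 parameters with 2-byte gradients.
    params, width = 1e11, 2
    for label, gbps in [("fast interconnect", 400.0), ("throttled link", 1.0)]:
        print(f"{label}: {sync_seconds(params, width, gbps):,.0f} s per sync")
```

With these made-up numbers, throttling the link from 400 Gbit/s to 1 Gbit/s stretches each sync from about 4 seconds to about 1,600 seconds. That gap is the lever that bandwidth limits rely on.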

---

Outline:

(00:28) 1. Summary

(05:25) 2. Ways to quickly and cheaply slow training by restricting communication

(06:31) 2.1. Method #1: Disconnect inter-rack high-speed cables

(07:07) 2.2. Method #2: Tap-verified bandwidth limits

(08:33) 2.3. Method #3: Output re-computation

(11:34) 2.4. Method #4: Memory wipes

(13:20) 2.5. Method #5: Proof of work / proof of memory

(14:35) 3. Ways to efficiently continue training despite these constraints

(15:09) 3.1. Method #1: Larger batch size + infrequent SGD steps

(16:35) 3.2. Method #2: Periodically merge independent training runs

(18:40) 3.3. Method #3: Compress gradients and weights

(20:54) 3.4. Method #4: Use more compute for inference rollouts, and less for training

(24:16) 4. But more aggressive verification methods would probably make training with current algorithms impractical

(26:56) 5. However, if developers (or AIs) have a lot of time to research better algorithms, all bets are off

(29:50) 6. Conclusion

(30:13) Appendix

(30:16) Are we in the serially bottlenecked training regime? A BOTEC by Claude

(30:23) Setup

(31:13) Key Formula

(31:41) B_crit at Frontier Scale

(32:20) How Many GPUs Per Model Replica?

(32:48) Achievable Batch Size vs. B_crit

(33:22) Key Takeaways

(35:09) Caveats

(36:48) Sources

---

First published:

March 7th, 2026

Source:

https://www.lesswrong.com/posts/Xzf3eMnhTko7AxnEy/can-governments-quickly-and-cheaply-slow-ai-training

---

Narrated by TYPE III AUDIO.



By LessWrong

