September 12, 2025

Book Review: If Anyone Builds It, Everyone Dies

42 minutes

Eliezer Yudkowsky's Machine Intelligence Research Institute is the original AI safety org. But the original isn't always the best - how is Mesopotamia doing these days? As money, brainpower, and prestige pour into the field, MIRI remains what it always was - a group of loosely-organized weird people, one of whom cannot be convinced to stop wearing a sparkly top hat in public. So when I was doing AI grantmaking last year, I asked them - why should I fund you instead of the guys with the army of bright-eyed Harvard grads, or the guys who just got Geoffrey Hinton as their celebrity spokesperson? What do you have that they don't?

MIRI answered: moral clarity.

Most people in AI safety (including me) are uncertain and confused and looking for least-bad incremental solutions. We think AI will probably be an exciting and transformative technology, but there's some chance, 5 or 15 or 30 percent, that it might turn against humanity in a catastrophic way. Or, if it doesn't, that there will be something less catastrophic but still bad - maybe humanity gradually fading into the background, the same way kings and nobles faded into the background during the modern era. This is scary, but AI is coming whether we like it or not, and probably there are also potential risks from delaying too hard. We're not sure exactly what to do, but for now we want to build a firm foundation for reacting to any future threat. That means keeping AI companies honest and transparent, helping responsible companies like Anthropic stay in the race, and investing in understanding AI goal structures and the ways that AIs interpret our commands. Then at some point in the future, we'll be close enough to the actually-scary AI that we can understand the threat model more clearly, get more popular buy-in, and decide what to do next.

MIRI thinks this is pathetic - like trying to protect against an asteroid impact by wearing a hard hat. They're kind of cagey about their own probability of AI wiping out humanity, but it seems to be somewhere around 95 - 99%. They think plausibly-achievable gains in company responsibility, regulation quality, and AI scholarship are orders of magnitude too weak to seriously address the problem, and they don't expect enough of a "warning shot" that they feel comfortable kicking the can down the road until everything becomes clear and action is easy. They suggest banning all AI capabilities research immediately, to be restarted only in some distant future when the situation looks more promising.

Both sides honestly believe their position and don't want to modulate their message for PR reasons. But both sides, coincidentally, think that their message is better PR. The incrementalists think a moderate, cautious approach keeps bridges open with academia, industry, government, and other actors that prefer normal clean-shaven interlocutors who don't emit spittle whenever they talk. MIRI thinks that the public is sick of focus-group-tested mealy-mouthed bullshit, but might be ready to rise up against AI if someone presented the case in a clear and unambivalent way.

Now Yudkowsky and his co-author, MIRI president Nate Soares, have reached new heights of unambivalence with their new book, If Anyone Builds It, Everyone Dies (release date September 16, currently available for preorder).

https://www.astralcodexten.com/p/book-review-if-anyone-builds-it-everyone

...more