Welcome to The Nonlinear Library, where we use Text-to-Speech software to convert the best writing from the Rationalist and EA communities into audio. This is: Contra Yudkowsky on AI Doom, published by jacob cannell on April 24, 2023 on LessWrong.
Eliezer Yudkowsky predicts doom from AI: that humanity faces likely extinction in the near future (years or decades) from a rogue, unaligned, superintelligent AI system. Moreover, he predicts that this is the default outcome, and that AI alignment is so incredibly difficult that even he failed to solve it.
EY is an entertaining and skilled writer, but do not mistake rhetorical writing talent for depth and breadth of technical knowledge. I do not have EY's talents there, or Scott Alexander's poetic powers of prose. My skill points have instead gone nearly exclusively towards extensive study of neuroscience, deep learning, and graphics/GPU programming. More than most, I actually have the depth and breadth of technical knowledge necessary to evaluate these claims in detail.
I have evaluated this model in detail and found it substantially incorrect and, in fact, brazenly, naively overconfident.
Intro
Even though the central prediction of the doom model is necessarily unobservable for anthropic reasons, alternative models (such as my own, or Moravec's, or Hanson's) have already made substantially better predictions, such that EY's doom model has low posterior probability.
EY has espoused this doom model for over a decade and hasn't updated it much, from what I can tell. Here is the classic doom model as I understand it, starting with its key background assumptions:
The brain inefficiency assumption: The human brain is inefficient along multiple dimensions/metrics that translate into intelligence per dollar; in particular, it is inefficient as a hardware platform on key metrics such as thermodynamic efficiency.
The mind inefficiency or human incompetence assumption: In terms of software, he describes the brain as an inefficient, complex "kludgy mess of spaghetti code". He derived these insights from the influential evolved modularity hypothesis, as popularized in evolutionary psychology by Tooby and Cosmides. He pooh-poohed neural networks, and in fact actively bet against them through his actions: hiring researchers trained in abstract math/philosophy, ignoring neuroscience and early DL, etc.
The more room at the bottom assumption: Naturally dovetailing with points 1 and 2, EY confidently predicts there is enormous room for further hardware improvement, especially through strong Drexlerian nanotech.
The alien mindspace assumption: EY claims that human mindspace is an incredibly narrow, twisty, complex target to hit, whereas AI mindspace is vast, and that AI designs will be something like random rolls from this vast alien mindspace, resulting in an incredibly low probability of hitting the narrow human target.
Doom naturally follows from these assumptions: sometime in the near future, some team discovers the hidden keys of intelligence and creates a human-level AGI, which then rewrites its own source code, initiating a self-improvement recursion that quickly results in the AGI developing strong nanotech and killing all humans within a matter of days or even hours.
If assumptions 1 and 2 don't hold (relative to 3), then there is little to no room for recursive self-improvement. If assumption 4 is completely wrong, then the default outcome is not doom regardless.
Every one of his key assumptions is mostly wrong, as I and others predicted well in advance. EY seems to have been systematically overconfident as an early futurist, and then perhaps updated later to avoid making specific predictions, but without much updating of his underlying mental models (specifically his nanotech-woo model, as we will see).
Brain Hardware Efficiency
EY correctly recognizes that thermodynamic efficiency is a key metric for computation/intelligence, and he confidently, brazenly claims (as of late 2021) that the brain is about 6 OOM from thermodynamic efficiency limits:
Which brings me to...
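For a rough sense of scale, here is a minimal back-of-envelope sketch of what a "6 OOM from thermodynamic limits" claim amounts to, assuming the standard Landauer bound at body temperature and the commonly cited ~20 W brain power budget (the accounting in terms of bit erasures is an illustrative assumption, not a figure taken from EY's post):

$$E_{\min} = k_B T \ln 2 \approx (1.38\times 10^{-23}\,\mathrm{J/K})\,(310\,\mathrm{K})\,(\ln 2) \approx 3\times 10^{-21}\,\mathrm{J\ per\ bit\ erasure}$$

$$\frac{20\,\mathrm{W}}{3\times 10^{-21}\,\mathrm{J/bit}} \approx 7\times 10^{21}\ \mathrm{bit\ erasures\ per\ second}$$

On that reading, being about 6 OOM from the limit would mean the brain's ~20 W budget buys only on the order of $10^{15}$ to $10^{16}$ Landauer-limited bit erasures per second of useful computation, rather than the ~$10^{21}$ to $10^{22}$ the raw bound would allow.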