
Sign up to save your podcasts
Or


People sometimes say that AI alignment is scary partly (or perhaps: centrally) because you have to get it right on the “first critical try,” and can’t learn from failures.[1] What does this mean? Is it true? Does there need to be a “first critical try” in the relevant sense? I’ve sometimes felt confused about this, so I wrote up a few thoughts to clarify.
I start with a few miscellaneous conceptual points. I then focus in on a notion of “first critical try” tied to the first point (if there is one) when AIs get a “decisive strategic advantage” (DSA) over humanity – that is, roughly, the ability to kill/disempower all humans if they try.[2] I further distinguish between four different types of DSA:
---
Outline:
(02:31) Some conceptual points
(06:00) Unilateral DSAs
(09:13) Coordination DSAs
(15:19) Correlation DSAs
(22:21) A few final thoughts
The original text contained 20 footnotes which were omitted from this narration.
---
First published:
Source:
Narrated by TYPE III AUDIO.
By LessWrongPeople sometimes say that AI alignment is scary partly (or perhaps: centrally) because you have to get it right on the “first critical try,” and can’t learn from failures.[1] What does this mean? Is it true? Does there need to be a “first critical try” in the relevant sense? I’ve sometimes felt confused about this, so I wrote up a few thoughts to clarify.
I start with a few miscellaneous conceptual points. I then focus in on a notion of “first critical try” tied to the first point (if there is one) when AIs get a “decisive strategic advantage” (DSA) over humanity – that is, roughly, the ability to kill/disempower all humans if they try.[2] I further distinguish between four different types of DSA:
---
Outline:
(02:31) Some conceptual points
(06:00) Unilateral DSAs
(09:13) Coordination DSAs
(15:19) Correlation DSAs
(22:21) A few final thoughts
The original text contained 20 footnotes which were omitted from this narration.
---
First published:
Source:
Narrated by TYPE III AUDIO.

112,909 Listeners

130 Listeners

7,215 Listeners

532 Listeners

16,221 Listeners

4 Listeners

14 Listeners

2 Listeners