September 23, 2025

“Why I don’t believe Superalignment will work” by Simon Lermen

9 minutes

We skip over [...] where we move from the human-ish range to strong superintelligence[1]. [...] the period where we can harness potentially vast quantities of AI labour to help us with the alignment of the next generation of models

- Will MacAskill in his critique of IABIED

I want to respond to Will MacAskill's claim in his IABIED review that we may be able to use AI to solve alignment.[1] Will believes that recent developments in AI have made it more likely that takeoff will be relatively slow: "Sudden, sharp, large leaps in intelligence now look unlikely". Because of this, he and many others believe that there will likely be a period in the future when we can essentially direct AIs to align more powerful AIs. But it appears to me that a “slow takeoff” is not sufficient at all and that a [...]

---

Outline:

(01:47) Fast takeoff is possible

(02:49) AIs are unlikely to speed up alignment before capabilities

(04:21) What would the AI alignment researchers actually be doing?

(05:29) Alignment problem might require genius breakthroughs

(06:57) Most labs won't use the time

(07:26) The plan could have negative consequences

The original text contained 2 footnotes which were omitted from this narration.

---

First published:

September 22nd, 2025