LessWrong (30+ Karma)

“Do we get automated alignment research before an AI Takeoff?” by JanWehner



TLDR: Will AI automation speed up capabilities research or safety research first? I forecast that most areas of capabilities research will see a 10x speedup before safety research does, primarily because capabilities research has clearer feedback signals and relies more on engineering than on novel insights. To change this, researchers should start building and adopting tools to automate AI safety research and focus on creating benchmarks, model organisms, and research proposals, and companies should grant differential access to safety research.

Epistemic status: I spent ~a week thinking about this. My conclusions rely on a model with high uncertainty, so please take them lightly. I'd love for people to share their own estimates on this.

The Ordering of Automation Matters

AI might automate AI R&D in the next decade and thus lead to large increases in the rate of AI progress. This matters for both the risks (e.g. an intelligence explosion) and the solutions (automated alignment research).

On the one hand, AI could drive AI capabilities progress. This is a stated goal of frontier AI companies (e.g. OpenAI aims for a true automated AI researcher by March of 2028), and it is central to the AI 2027 forecast. On the other hand [...]

---

Outline:

(01:00) The Ordering of Automation Matters

(05:39) Methodology

(11:43) Prediction

(13:42) Levers for affecting the order

(14:35) Speeding up Safety Automation

(19:09) Future Work: How could this be researched properly?

---

First published:

January 22nd, 2026

Source:

https://www.lesswrong.com/posts/z4FvJigv3c8sZgaKZ/do-we-get-automated-alignment-research-before-an-ai-takeoff-1

---

Narrated by TYPE III AUDIO.
