LessWrong (30+ Karma)

“A framework for thinking about AI power-seeking” by Joe Carlsmith


Crossposted from the AI Alignment Forum. May contain more technical jargon than usual.

This post lays out a framework I’m currently using for thinking about when AI systems will seek power in problematic ways. I think this framework adds useful structure to the too-often-left-amorphous “instrumental convergence thesis,” and that it helps us recast the classic argument for existential risk from misaligned AI in a revealing way. In particular, I suggest, this recasting highlights how much classic analyses of AI risk load on the assumption that the AIs in question are powerful enough to take over the world very easily, via a wide variety of paths. If we relax this assumption, I suggest, the strategic trade-offs that an AI faces, in choosing whether or not to engage in some form of problematic power-seeking, become substantially more complex.

Prerequisites for rational takeover-seeking

For simplicity, I’ll focus here on the most extreme [...]

---

Outline:

(00:53) Prerequisites for rational takeover-seeking

(02:51) Agential prerequisites

(06:43) Goal-content prerequisites

(09:13) Takeover-favoring incentives

(13:38) Recasting the classic argument for AI risk using this framework

(26:13) What if the AI can’t take over so easily, or via so many different paths?

The original text contained 16 footnotes which were omitted from this narration.

The original text contained 1 image which was described by AI.

---

First published:

July 24th, 2024

Source:

https://www.lesswrong.com/posts/A9YYkbnjmfsP7Chfo/a-framework-for-thinking-about-ai-power-seeking

---

Narrated by TYPE III AUDIO.
