
Sign up to save your podcasts
Or


Introduction and summary
(This is the fourth in a series of four posts about how we might solve the alignment problem. See the first post for an introduction to this project and a summary of the series as a whole.)
In the first post in this series, I distinguished between “motivation control” (trying to ensure that a superintelligent AI's motivations have specific properties) and “option control” (trying to ensure that a superintelligent AI's options have specific properties). My second post surveyed some of the key challenges and available approaches for motivation control; and my third post did the same for option control.
In this fourth and final post, I wrap up the series with a discussion of two final topics: namely, moving from motivation/option control to safety-from-takeover (let's call this “incentive design”), and capability elicitation. Here's a summary:
---
Outline:
(00:04) Introduction and summary
(04:33) Incentive design
(10:59) Capability elicitation
(11:22) Comparison with avoiding takeover
(15:35) Two key questions
(19:41) Not an afterthought
(21:48) Conclusion
The original text contained 10 footnotes which were omitted from this narration.
The original text contained 3 images which were described by AI.
---
First published:
Source:
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
By LessWrongIntroduction and summary
(This is the fourth in a series of four posts about how we might solve the alignment problem. See the first post for an introduction to this project and a summary of the series as a whole.)
In the first post in this series, I distinguished between “motivation control” (trying to ensure that a superintelligent AI's motivations have specific properties) and “option control” (trying to ensure that a superintelligent AI's options have specific properties). My second post surveyed some of the key challenges and available approaches for motivation control; and my third post did the same for option control.
In this fourth and final post, I wrap up the series with a discussion of two final topics: namely, moving from motivation/option control to safety-from-takeover (let's call this “incentive design”), and capability elicitation. Here's a summary:
---
Outline:
(00:04) Introduction and summary
(04:33) Incentive design
(10:59) Capability elicitation
(11:22) Comparison with avoiding takeover
(15:35) Two key questions
(19:41) Not an afterthought
(21:48) Conclusion
The original text contained 10 footnotes which were omitted from this narration.
The original text contained 3 images which were described by AI.
---
First published:
Source:
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.

112,847 Listeners

130 Listeners

7,206 Listeners

531 Listeners

16,150 Listeners

4 Listeners

14 Listeners

2 Listeners