


Which alignment target?
Suppose you’re a lab or government, and you want to figure out what values to align your AI to. Here are three options, and some of their downsides:
AIs that are aligned to a set of consequentialist values are incentivized to acquire power to pursue those values. This creates power struggles between those AIs and:
This is true whether those values are misaligned with all humans, aligned with some humans, chosen by aggregating all humans’ values, or an attempt to specify some “moral truth”. In general, since humans have many different values, I think of the power struggle as being between coalitions which each contain some humans and some AIs.
AIs that are aligned to a set of deontological principles (like refusing to harm humans) are safer, but also less flexible. What's fine for an AI to do in one context might be harmful in another context; what's fine for one AI to do might be very harmful for a million [...]
---
Outline:
(00:09) Which alignment target?
(02:37) Aligning to virtues
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
By LessWrong
