
Visiting researcher Rose Hadshar recently published a review of some evidence for existential risk from AI, focused on empirical evidence for misalignment and power-seeking. (Previously from this project: a blog post outlining some of the key claims that are often made about AI risk, a series of interviews of AI researchers, and a database of empirical evidence for misalignment and power-seeking.)
In this report, Rose looks into evidence for:
Misalignment,[1] where AI systems develop goals which are misaligned with human goals; and
Power-seeking,[2] where misaligned AI systems seek power to achieve their goals.
Rose found the current state of this evidence for existential risk from misaligned power-seeking to be concerning but inconclusive:
There is empirical evidence of AI systems developing misaligned goals (via specification gaming[3] and via goal misgeneralization[4]), including in deployment (via specification gaming), but it's not clear to Rose whether these problems will scale far [...]
The original text contained 6 footnotes which were omitted from this narration.
---
First published:
Source:
Narrated by TYPE III AUDIO.