
TL;DR
Datasets, evaluations, and fine-tune handles will be released.
Code and datasets
Why study what happens when a model believes it is AGI?
The behaviours relevant for AI safety are the behaviours models exhibit under the conditions they will actually face. Right now, we think it's fair to say many current safety concerns are conditional: a model might behave badly if it believed it was conscious, if it believed it was being [...]
---
Outline:
(01:22) Why study what happens when a model believes it is AGI?
(03:49) Setup
(05:31) Petri Results
(08:42) Agentic Misalignment Results
(10:13) Conclusion
(11:13) Limitations
(11:48) Appendix
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
By LessWrong
