
Developers of frontier AI systems will face increasingly challenging decisions about whether their AI systems are safe enough to develop and deploy. One reason a system may not be safe is that it engages in scheming. In our new report "Towards evaluations-based safety cases for AI scheming", written in collaboration with researchers from the UK AI Safety Institute, METR, Redwood Research and UC Berkeley, we sketch how developers of AI systems could make a structured rationale – 'a safety case' – that an AI system is unlikely to cause catastrophic outcomes through scheming.
Note: This is a small step in advancing the discussion. We think it currently lacks crucial details that would be required to make a strong safety case.
Read the full report.
Figure 1. A condensed version of an example safety case sketch, included in the report. Provided for illustration.
Scheming and Safety Cases
For [...]
---
Outline:
(01:13) Scheming and Safety Cases
(02:01) Core Arguments and Challenges
(03:42) Safety cases in the near future
The original text contained 1 footnote which was omitted from this narration.
The original text contained 1 image which was described by AI.
---
Narrated by TYPE III AUDIO.