


Wednesday's post talked about the implications of Anthropic changing from v2.2 to v3.0 of its RSP, including that this broke promises that many people relied upon when making important decisions.
Today's post treats the new RSP v3.0 as a new document, and evaluates it.
First I’ll go over how the RSP v3.0 works at a high level. Then I’ll dive into the Roadmap and the Risk Report.
How RSP v3.0 Works
Normally I would pay closer attention to the exact written contents of the new RSP.
In this case, it's not that the RSP doesn’t matter. I do think the RSP will have some influence on what Anthropic chooses to do, as will the Roadmap and the resulting Risk Reports.
However, the fundamental design principle is flexibility and a ‘strong argument,’ and they can change the contents at any time, all of which means the central principle is trust.
I read the contents as ‘here are the things we are worried about and plan to do,’ which in practice should mostly amount to doing what they believe is right, and I don’t see anything on this map that seems likely [...]
---
Outline:
(00:40) How RSP v3.0 Works
(19:05) You Came Here For An Argument
(21:27) The Problem Remains Unsolved
(25:22) Wow That Thing We Did Was Pretty Risky, Huh?
(26:18) Risk Report #1
(28:19) Listen All Y'all It's Sabotage
(38:05) Looking Forward
(39:42) Claude Gov
(40:02) What Is A Strong Argument?
(41:12) Recursive Self-Improvement
(42:32) Non-Novel Chemical and Biological Weapons
(44:51) Novel Chemical and Biological Weapons
(45:39) Cross-Cutting Content (Section 6)
(48:48) Risk Report Report
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
By zvi
