April 16, 2025

“OpenAI rewrote its Preparedness Framework” by Zach Stein-Perlman

1 minute

New: https://openai.com/index/updating-our-preparedness-framework/

Old: https://cdn.openai.com/openai-preparedness-framework-beta.pdf

Summary

Thresholds & responses: https://cdn.openai.com/pdf/18a02b5d-6b67-4cec-ab64-68cdfbddebcd/preparedness-framework-v2.pdf#page=5. High and Critical thresholds trigger responses, as in the old Preparedness Framework; responses to Critical thresholds are not yet specified.

Three main categories of capabilities:

Bio/chem: High capabilities trigger security controls and (for external deployment) misuse safeguards
Cyber: High capabilities trigger security controls, (for external deployment) misuse safeguards, and (for large-scale internal deployment) misalignment safeguards
AI Self-improvement: High capabilities trigger security controls

Misuse safeguards, misalignment safeguards, and security controls for High capability levels: https://cdn.openai.com/pdf/18a02b5d-6b67-4cec-ab64-68cdfbddebcd/preparedness-framework-v2.pdf#page=16. My quick takes:

Misuse safeguards: fine categories but it's not clear what level of assurance would suffice
Misalignment safeguards: worrying categories and it's not clear what level of assurance would suffice
Security controls: it's impossible to evaluate security level based on principles like these

[I'll edit this post to add more analysis soon]

---

First published:

April 15th, 2025

Source:

https://www.lesswrong.com/posts/Yy5ijtbNfwv8DWin4/openai-rewrote-its-preparedness-framework

---

Narrated by TYPE III AUDIO.

By LessWrong