


Last week, OpenAI announced GPT-5.5, including GPT-5.5-Pro.
My overall read here is that GPT-5.5 is a solid improvement, and for many purposes it is competitive with Claude Opus. Reactions are still coming in and it is early. My guess on the shape is that GPT-5.5 is the pick for ‘just the facts’ queries, web searches, or straightforward well-specified requests, and Claude Opus 4.7 is the choice for more open-ended or interpretive purposes. Coders can consider a hybrid approach.
On the alignment and safety fronts, it is unlikely to pose new big risks, and its alignment seems similar to that of previous models. There is some small additional risk arising from its improved agentic abilities, including computer use.
As always, when it is available, the system or model card is where we start.
OpenAI does not drop the giant doorstops that Anthropic gives us with every release.
After reading the Mythos and Opus 4.7 model cards, this strikes me as stingy. There's still good info here, but overall it tells you relatively little about what is going on, and feels incurious and more pro forma.
I would like to see a ‘yes and’ [...]
---
Outline:
(02:36) Pro Versus Proxy
(02:59) Disallowed Content (3.1)
(04:20) Don't Delete Data (3.3)
(04:56) Confirmation Confirmation (3.4)
(05:22) Jailbreaks (4.1)
(05:34) Prompt Injections (4.2)
(06:33) Health (5)
(06:56) Hallucinations (6)
(08:01) Alignment (7)
(11:28) Bias Evaluation (8)
(11:57) Preparedness (9)
(13:04) Bio (9.1.1)
(15:20) Cybersecurity (9.1.2)
(17:46) Self-Improvement (9.1.3)
(18:46) Sandbagging (9.2)
(19:46) Safeguards (9.3)
(21:41) What About Model Welfare?
(22:31) Would This Have Identified A Problem?
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
---
By zvi
