
How good are Claude Opus 4 and Claude Sonnet 4?
They’re good models, sir.
If you don’t care about price or speed, Opus is probably the best model available today.
If you do care somewhat, Sonnet 4 is probably best in its class for many purposes. It deserves the 4 label because of its agentic aspects, but isn’t a big leap over 3.7 for other purposes. I have been using 90%+ Opus so I can’t speak to this directly. There are some signs of ‘small model smell’, where Sonnet 4 has focused on common cases at the expense of rarer ones. That’s what Opus is for.
That’s all as of when I hit post. Things do escalate quickly these days, although I would not include Grok in this loop until proven otherwise. It’s a three-horse race, and if you told me [...]
---
Outline:
(01:17) On Your Marks
(05:32) Standard Silly Benchmarks
(11:09) API Upgrades
(12:45) Coding Time Horizon
(13:47) The Key Missing Feature is Memory
(14:52) Early Reactions
(26:12) Opus 4 Has the Opus Nature
(32:27) Unprompted Attention
(35:09) Max Subscription
(36:24) In Summary
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
---