
Sign up to save your podcasts
Or
I'm editing this post.
OpenAI announced (but hasn't released) o3 (skipping o2 for trademark reasons).
It gets 25% on FrontierMath, smashing the previous SoTA of 2%. (These are really hard math problems.) Wow.
72% on SWE-bench Verified, beating o1's 49%.
Also 88% on ARC-AGI.
---
First published:
Source:
Narrated by TYPE III AUDIO.
I'm editing this post.
OpenAI announced (but hasn't released) o3 (skipping o2 for trademark reasons).
It gets 25% on FrontierMath, smashing the previous SoTA of 2%. (These are really hard math problems.) Wow.
72% on SWE-bench Verified, beating o1's 49%.
Also 88% on ARC-AGI.
---
First published:
Source:
Narrated by TYPE III AUDIO.
26,342 Listeners
2,393 Listeners
7,949 Listeners
4,130 Listeners
87 Listeners
1,446 Listeners
8,756 Listeners
88 Listeners
372 Listeners
5,421 Listeners
15,306 Listeners
468 Listeners
122 Listeners
76 Listeners
447 Listeners