
I'm editing this post.
OpenAI announced (but hasn't released) o3 (skipping o2 for trademark reasons).
It gets 25% on FrontierMath, smashing the previous SoTA of 2%. (These are really hard math problems.) Wow.
72% on SWE-bench Verified, beating o1's 49%.
Also 88% on ARC-AGI.
---
First published:
Source:
Narrated by TYPE III AUDIO.
By LessWrong
