
---
Outline:
(01:56) What's In a Name
(02:51) My Current Model Use Heuristics
(04:21) Huh, Upgrades
(05:31) Use All the Tools
(09:47) Search the Web
(10:27) On Your Marks
(18:15) The System Prompt
(19:00) The o3 and o4-mini System Card
(23:17) Tests o3 Aced
(25:14) Hallucinations
(31:41) Instruction Hierarchy
(32:52) Image Refusals
(33:18) METR Evaluations for Task Duration and Misalignment
(42:45) Apollo Evaluations for Scheming and Deception
(44:40) We Are Insufficiently Worried About These Alignment Failures
(47:16) GPT-4.1 Also Has Some Issues
(50:08) Pattern Lab Evaluations for Cybersecurity
(51:45) Preparedness Framework Tests
(52:14) Biological and Chemical Risks (4.2)
(58:20) Cybersecurity (4.3)
(59:27) AI Self-Improvement (4.4)
(01:00:51) Perpetual Shilling
(01:01:54) High Praise
(01:09:31) Syncopathy
(01:11:58) Mundane Utility Versus Capability Watch
(01:16:33) o3 Offers Mundane Utility
(01:24:10) o3 Doesn't Offer Mundane Utility
(01:30:54) o4-mini Also Exists
(01:31:31) Colin Fraser Dumb Model Watch
(01:32:52) o3 as Forecaster
(01:34:31) Is This AGI?
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
---
By LessWrong
