Claude Opus 4.5 did so well on the METR task length graph they’re going to need longer tasks, and we still haven’t scored Gemini 3 Pro or GPT-5.2-Codex. Oh, also there's a GPT-5.2-Codex.
At week's end we did finally get at least a little of a Christmas break. It was nice.
Also nice was that New York Governor Kathy Hochul signed the RAISE Act, giving New York its own version of SB 53. The final version was not what we were hoping for, but it is still helpful on the margin.
Various people gave their 2026 predictions. Let's put it this way: Buckle up.
Language Models Offer Mundane Utility. AI suggests doing the minimum.
Language Models Don’t Offer Mundane Utility. Gemini 3 doesn’t believe in itself.
Huh, Upgrades. ChatGPT gets some personality knobs to turn.
On Your Marks. PostTrainBench shows AIs below human baseline but improving.
Claude Opus 4.5 Joins The METR Graph. Expectations were exceeded.
Sufficiently Advanced Intelligence. You’re good enough, you’re smart enough.
Deepfaketown and Botpocalypse Soon. Don’t worry, the UK PM's got this.
Fun With Media Generation. Slop [...]
---
Outline:
(00:53) Language Models Offer Mundane Utility
(02:15) Language Models Don't Offer Mundane Utility
(02:55) Huh, Upgrades
(03:21) On Your Marks
(05:15) Claude Opus 4.5 Joins The METR Graph
(12:41) Sufficiently Advanced Intelligence
(15:09) Deepfaketown and Botpocalypse Soon
(18:12) Fun With Media Generation
(22:33) You Drive Me Crazy
(25:29) They Took Our Jobs
(28:45) The Art of the Jailbreak
(28:56) Get Involved
(29:12) Introducing
(32:24) In Other AI News
(33:46) Show Me the Money
(34:44) Quiet Speculations
(38:59) Whistling In The Dark
(40:27) Bubble, Bubble, Toil and Trouble
(42:20) Americans Really Dislike AI
(48:30) The Quest for Sane Regulations
(52:32) Chip City
(55:55) The Week in Audio
(56:45) Rhetorical Innovation
(01:00:53) Aligning a Smarter Than Human Intelligence is Difficult
(01:02:58) Mom, Owain Evans Is Turning The Models Evil Again
(01:06:10) Messages From Janusworld
(01:15:22) The Lighter Side
---
https://www.lesswrong.com/posts/GHW2rhYtnYgEn3tuq/ai-148-christmas-break
Narrated by TYPE III AUDIO.