


GPT-5 was a long time coming.
Is it a good model, sir? Yes. In practice it is a good, but not great, model.
Or rather, it is several good models released at once: GPT-5, GPT-5-Thinking, GPT-5-With-The-Router, GPT-5-Pro, GPT-5-API. That leads to a lot of confusion.
What is most good? Cutting down on errors and hallucinations is a big deal. Ease of use and ‘just doing things’ have improved. Early reports are that thinking mode is a large improvement on writing. Coding seems improved and can compete with Opus.
This first post covers an introduction, basic facts, benchmarks and the model card. Coverage will continue tomorrow.
This Fully Operational Battle Station
GPT-5 is here. They presented it as a really big deal. Death Star big.
Sam Altman (the night before release):
Nikita Bier: There is still time to delete.
PixelHulk:
Zvi [...]
---
Outline:
(01:04) This Fully Operational Battle Station
(04:20) Big Facts
(06:23) The System Card
(06:42) A Model By Any Other Name
(09:26) Safe Completions
(09:53) Mundane Safety
(10:48) Sycophancy
(14:46) The Art of the Jailbreak
(21:59) Hallucinations
(23:59) Deception
(27:58) Red Teaming
(29:03) Violent Attack Planning
(30:11) Prompt Injections
(32:20) Microsoft AI Red Teaming
(33:43) Preparedness Framework (Catastrophic and Existential Risks)
(33:49) Fine Tuning
(34:58) Safeguarding the API
(38:09) Biological Capabilities Remain Similar
(40:43) That One Graph From METR
(49:22) Big Compute
(49:53) On Your Marks
(57:00) Other People's Benchmarks
(01:01:22) Is That The Best You Can Do?
(01:03:08) Things To Come
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
By LessWrong
