August 11, 2025

“GPT-5s Are Alive: Basic Facts, Benchmarks and the Model Card” by Zvi

1 hour 4 minutes

GPT-5 was a long time coming.

Is it a good model, sir? Yes. In practice it is a good, but not great, model.

Or rather, it is several good models released at once: GPT-5, GPT-5-Thinking, GPT-5-With-The-Router, GPT-5-Pro, GPT-5-API. That leads to a lot of confusion.

What is most good? Cutting down on errors and hallucinations is a big deal. Ease of use and ‘just doing things’ have improved. Early reports are that thinking mode is a large improvement for writing. Coding seems improved and can compete with Opus.

This first post covers an introduction, basic facts, benchmarks and the model card. Coverage will continue tomorrow.

This Fully Operational Battle Station

GPT-5 is here. They presented it as a really big deal. Death Star big.

Sam Altman (the night before release):

Nikita Bier: There is still time to delete.

PixelHulk:

Zvi [...]

---

Outline:

(01:04) This Fully Operational Battle Station

(04:20) Big Facts

(06:23) The System Card

(06:42) A Model By Any Other Name

(09:26) Safe Completions

(09:53) Mundane Safety

(10:48) Sycophancy

(14:46) The Art of the Jailbreak

(21:59) Hallucinations

(23:59) Deception

(27:58) Red Teaming

(29:03) Violent Attack Planning

(30:11) Prompt Injections

(32:20) Microsoft AI Red Teaming

(33:43) Preparedness Framework (Catastrophic and Existential Risks)

(33:49) Fine Tuning

(34:58) Safeguarding the API

(38:09) Biological Capabilities Remain Similar

(40:43) That One Graph From METR

(49:22) Big Compute

(49:53) On Your Marks

(57:00) Other People's Benchmarks

(01:01:22) Is That The Best You Can Do?

(01:03:08) Things To Come

---

First published:

August 11th, 2025

Source:

https://www.lesswrong.com/posts/4fLB2uzCcH6dEGnGs/gpt-5s-are-alive-basic-facts-benchmarks-and-the-model-card

---

Narrated by TYPE III AUDIO.


By LessWrong

