
There's a lot of talk about “algorithmic progress” in LLMs, especially in the context of exponentially-improving algorithmic efficiency. For example:
It's nice to see three independent sources reach almost exactly the same conclusion—halving times of 8 months, 6 months, and 7½ months respectively. Surely a sign that the conclusion is solid!
…Haha, just kidding! I’ll argue that these three bullet points are hiding three totally different stories. The first two bullets are about training efficiency, and I’ll argue that both are deeply misleading (each for a different reason!). The third is about inference efficiency, which I think is right, and mostly explained by distillation of ever-better frontier models into their “mini” cousins.
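As a quick aside (my own arithmetic, not from the post): a halving time converts to an annual improvement factor via 2^(12/months), which is how a 6-month halving time and "4x per year" end up being the same claim. A minimal Python sketch:

# Minimal sketch (my own illustration, not from the post): convert a halving
# time in months into an approximate annual improvement factor.
for months in (8, 6, 7.5):
    factor = 2 ** (12 / months)
    print(f"{months}-month halving time ~ {factor:.1f}x per year")
# 8 months   ~ 2.8x/year  (the Epoch training-efficiency estimate)
# 6 months   =  4x/year   (the figure attributed to Dario)
# 7.5 months ~ 3.0x/year  (the inference-efficiency estimate)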
source

Tl;dr / outline
---
Outline:
(01:48) Tl;dr / outline
(04:00) Status of this post
(04:12) 1. The big picture of LLM algorithmic progress, as I understand it right now
(04:19) 1.1. Stereotypical algorithmic efficiency improvements: there's the Transformer itself, and ... well, actually, not much else to speak of
(06:36) 1.2. Optimizations: Let's say up to 20×, but there's a ceiling
(07:47) 1.3. Data-related improvements
(09:40) 1.4. Algorithmic changes that are not really quantifiable as efficiency
(10:25) 2. Explaining away the two training-efficiency exponential claims
(10:46) 2.1. The Epoch 8-month halving time claim is a weird artifact of their methodology
(12:33) 2.2. The Dario 4x/year claim is I think just confused
(17:28) 3. Sanity-check: nanochat
(19:43) 4. Optional bonus section: why does this matter?
The original text contained 5 footnotes which were omitted from this narration.
---
First published:
Source:
---
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
