LessWrong (30+ Karma)

“The nature of LLM algorithmic progress” by Steven Byrnes


There's a lot of talk about “algorithmic progress” in LLMs, especially in the context of exponentially-improving algorithmic efficiency. For example:

  • Epoch AI: “[training] compute required to reach a set performance threshold has halved approximately every 8 months”.
  • Dario Amodei 2025: “I'd guess the number today is maybe ~4x/year”.
  • Gundlach et al. 2025a “Price of Progress”: “Isolating out open models to control for competition effects and dividing by hardware price declines, we estimate that algorithmic efficiency progress is around 3× per year”.

It's nice to see three independent sources reach almost exactly the same conclusion—halving times of 8 months, 6 months, and 7½ months respectively. Surely a sign that the conclusion is solid!

…Haha, just kidding! I’ll argue that these three bullet points are hiding three totally different stories. The first two bullets are about training efficiency, and I’ll argue that both are deeply misleading (each for a different reason!). The third is about inference efficiency, which I think is right, and mostly explained by distillation of ever-better frontier models into their “mini” cousins.
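The halving-time comparison above is simple arithmetic: an efficiency multiplier of N× per year implies a halving time of 12·ln(2)/ln(N) months. A minimal sketch (the function name is my own, not from the post):

```python
import math

def halving_time_months(annual_factor: float) -> float:
    """Months for compute cost to halve, given an 'N-x per year' efficiency multiplier."""
    return 12 * math.log(2) / math.log(annual_factor)

print(round(halving_time_months(4), 1))  # Dario's ~4x/year -> 6.0 months
print(round(halving_time_months(3), 1))  # Gundlach et al.'s 3x/year -> ~7.6 months
```

Epoch's figure is stated directly as a halving time (~8 months), so no conversion is needed there.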

Tl;dr / outline

  • §1 is my attempted big-picture take on what “algorithmic progress” has looked like in LLMs. I split it into four categories:
    • [...]

---

Outline:

(01:48) Tl;dr / outline

(04:00) Status of this post

(04:12) 1. The big picture of LLM algorithmic progress, as I understand it right now

(04:19) 1.1. Stereotypical algorithmic efficiency improvements: there's the Transformer itself, and ... well, actually, not much else to speak of

(06:36) 1.2. Optimizations: Let's say up to 20×, but there's a ceiling

(07:47) 1.3. Data-related improvements

(09:40) 1.4. Algorithmic changes that are not really quantifiable as efficiency

(10:25) 2. Explaining away the two training-efficiency exponential claims

(10:46) 2.1. The Epoch 8-month halving time claim is a weird artifact of their methodology

(12:33) 2.2. The Dario 4x/year claim is, I think, just confused

(17:28) 3. Sanity-check: nanochat

(19:43) 4. Optional bonus section: why does this matter?

The original text contained 5 footnotes which were omitted from this narration.

---

First published:

February 5th, 2026

Source:

https://www.lesswrong.com/posts/sGNFtWbXiLJg2hLzK/the-nature-of-llm-algorithmic-progress

---

Narrated by TYPE III AUDIO.

---
