By Redwood Research


Subtitle: Most recent progress probably isn't from unsustainable inference scaling.
As many have observed, the amount of compute LLMs use to complete tasks has increased greatly since reasoning models first came out. This trend is often called inference scaling, and it is an open question how much of recent AI progress is driven by inference scaling versus by other capability improvements. Whether inference compute is driving most recent AI progress matters because you can only scale up inference so far before costs become too high for AI to be useful (while training compute can be amortized over usage).
However, it's important to distinguish between two reasons inference cost is going up:
1. LLMs are completing larger tasks that would have taken a human longer (and thus would have cost more to get a human to complete).
2. LLMs are using more compute as a fraction of the human cost for a given task (a toy numeric comparison follows this list).
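To make the distinction concrete, here is a minimal sketch with made-up numbers; the hourly rate, task lengths, and dollar costs below are illustrative assumptions, not figures from the article:

```python
# Illustrative only: all dollar figures and task lengths are
# made-up assumptions, not data from the article.

HUMAN_HOURLY_RATE = 75.0  # assumed fully loaded cost of a human expert, $/hour

def cost_comparison(task_hours: float, llm_inference_cost: float) -> None:
    """Compare absolute inference cost with inference cost as a
    fraction of what a human would cost for the same task."""
    human_cost = task_hours * HUMAN_HOURLY_RATE
    print(f"task: {task_hours:>5.1f} human-hours | "
          f"LLM cost: ${llm_inference_cost:>7.2f} | "
          f"human cost: ${human_cost:>8.2f} | "
          f"LLM/human ratio: {llm_inference_cost / human_cost:.2%}")

# Reason 1 alone: the task gets 10x longer and inference cost grows
# proportionally. Absolute cost rises, but the ratio stays flat,
# which is sustainable.
cost_comparison(task_hours=1.0, llm_inference_cost=1.50)
cost_comparison(task_hours=10.0, llm_inference_cost=15.00)

# Reason 2: inference cost grows faster than the task length, so
# compute as a fraction of human cost rises; this is the case that
# can't be scaled much further before AI stops being cost-effective.
cost_comparison(task_hours=10.0, llm_inference_cost=150.00)
```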
To understand this, it's helpful to think about the Pareto frontier of budget versus time-horizon. I'll denominate this in 50% reliability time-horizon. Here is some fake data to illustrate what I expect this roughly looks like for recent progress:
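The figure itself isn't included in this narration. The following matplotlib sketch, with invented numbers for three hypothetical model generations, shows the kind of budget-versus-time-horizon frontier being described; none of these points come from real benchmark data:

```python
# Fake, hand-picked numbers purely to illustrate the shape of the
# budget vs. 50%-reliability time-horizon Pareto frontier.
import matplotlib.pyplot as plt

# Hypothetical model generations: inference budget per task ($) vs.
# the human task length (hours) completed at 50% reliability.
frontiers = {
    "older model":  ([0.1, 0.3, 1, 3, 10], [0.05, 0.1, 0.2, 0.3, 0.35]),
    "mid model":    ([0.1, 0.3, 1, 3, 10], [0.1, 0.3, 0.7, 1.2, 1.5]),
    "recent model": ([0.1, 0.3, 1, 3, 10], [0.3, 1.0, 2.5, 5.0, 7.0]),
}

fig, ax = plt.subplots()
for name, (budget, horizon) in frontiers.items():
    ax.plot(budget, horizon, marker="o", label=name)
ax.set_xscale("log")
ax.set_yscale("log")
ax.set_xlabel("inference budget per task ($)")
ax.set_ylabel("50% reliability time-horizon (human-hours)")
ax.set_title("Illustrative budget vs. time-horizon Pareto frontiers")
ax.legend()
plt.show()
```

If newer models mostly shift the frontier up (more time-horizon at every budget) rather than just extending it rightward (more time-horizon only at much higher budgets), that would suggest recent progress isn't primarily unsustainable inference scaling.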
For the notion of [...]
The original text contained 6 footnotes which were omitted from this narration.
---
Narrated by TYPE III AUDIO.
---
Images from the article:
Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.