LessWrong (30+ Karma)

“Forecasting time to automated superhuman coders [AI 2027 Timelines Forecast]” by elifland, Nikola Jurkovic



Authors: Eli Lifland, Nikola Jurkovic, FutureSearch

This is supporting research for AI 2027. We'll be cross-posting these over the next week or so.

These forecasts assume no large-scale catastrophes (e.g., a solar flare, a pandemic, nuclear war), no government-imposed or self-imposed slowdown, and no significant supply chain disruptions. All forecasts give a substantial chance of superhuman coding arriving in 2027.

Summary

We forecast when the leading AGI company will internally develop a superhuman coder (SC): an AI system that can do any coding task that the best AGI company engineer does, while being much faster and cheaper. At that point, the SC will likely speed up AI progress substantially, as explored in our takeoff forecast.

We first show Method 1: time-horizon-extension, a relatively simple model which forecasts when SC will arrive by extending the trend established by METR's report of AIs accomplishing tasks that take humans increasing amounts [...]

---

Outline:

(00:56) Summary

(02:43) Defining a superhuman coder (SC)

(03:35) Method 1: Time horizon extension

(05:05) METR's time horizon report

(06:30) Forecasting SC's arrival

(06:54) Method 2: Benchmarks and gaps

(06:59) Time to RE-Bench saturation

(07:03) Why RE-Bench?

(09:25) Forecasting saturation via extrapolation

(12:42) AI progress speedups after saturation

(14:04) Time to cross gaps between RE-Bench saturation and SC

(14:32) What are the gaps in task difficulty between RE-Bench saturation and SC?

(15:11) Methodology

(17:25) How fast can the task difficulty gaps be crossed?

(23:31) Other factors for benchmarks and gaps

(23:46) Compute scaling and algorithmic progress slowdown

(24:43) Gap between internal and external deployment

(25:20) Intermediate speedups

(26:55) Overall benchmarks and gaps forecasts

(27:44) Appendix

(27:47) Individual Forecaster Views for Benchmark-Gap Model Factors

(27:53) Engineering complexity: handling complex codebases

(31:16) Feedback loops: Working without externally provided feedback

(37:28) Parallel projects: Handling several interacting projects

(38:45) Specialization: Specializing in skills specific to frontier AI development

(40:17) Cost and speed

(48:48) Other task difficulty gaps

(50:52) Superhuman Coder (SC): time horizon and reliability requirements

(55:53) RE-Bench saturation resolution criteria

The original text contained 19 footnotes which were omitted from this narration.

---

First published:

April 10th, 2025

Source:

https://www.lesswrong.com/posts/ggqSg7bSLChanfunf/forecasting-time-to-automated-superhuman-coders-ai-2027

---

Narrated by TYPE III AUDIO.

---

Images from the article:

Apple Podcasts and Spotify do not show images in the episode description. Try Pocket Casts, or another podcast app.
