Machine Learning Street Talk (MLST)

The AI Models Smart Enough to Know They're Cheating — Beth Barnes & David Rein [METR]



Beth Barnes and David Rein on the one graph that ate the AI timelines discourse, and why the two people who built it are the most careful about how you read it.

**SPONSOR**
Prolific - Quality data. From real people. For faster breakthroughs.
https://www.prolific.com/?utm_source=mlst

Interview: https://youtu.be/cnxZZTl1tkk

Beth founded METR after leaving OpenAI alignment. David is first author on GPQA and co-author on HCAST and the METR Time Horizons paper. Together they built the measurement Daniel Kokotajlo called the single most important piece of evidence on AI timelines: the log-linear line of "how long a task a frontier model can complete at 50% reliability" vs release date.

The conversation opens on reward hacking. Current models can articulate in chat why a behaviour is undesired and then execute it anyway as agents. From there: construct validity, Melanie Mitchell's four-problem taxonomy, and the ARC-AGI 1-to-2 collapse as a worked example of adversarially-selected benchmarks regressing once labs target them. Beth's counter: METR deliberately does not adversarially select. David's: models do not have to do the right thing for the right reasons.

Methodology, then specification: David's compiler analogy, Beth on four-month tasks as expensive to evaluate rather than unspecifiable. Then the SWE-bench reality check, the METR finding that half of passing PRs would not be merged, and Beth's horses-versus-bank-tellers analogy for the labour market.

The close: monitorability, the coin-spinning boat, two-year recursive self-improvement, and Beth's line that "overhyped now" and "big deal later" are not correlated claims.

TIMESTAMPS:
00:00:00 Intro
00:02:06 Sponsor break: Prolific human-feedback infrastructure
00:02:33 Welcome and the scalable oversight motivation
00:06:02 Construct validity, benchmark pathologies and the Chollet worry
00:15:45 Time Horizons: human time, HCAST tasks and the 50% logistic
00:24:50 Is human difficulty really one variable?
00:33:05 Agent harness evolution and the inference-compute dividend
00:40:00 Scaffolding bells, token budgets and the credit-assignment problem
00:44:15 Look at the damn graph: regularisation bug and reliability nuance
00:50:00 Why 50%? Reliability, reward hacking and pizza-party transcripts
00:55:20 Extrapolation risk and straight lines on graphs
00:59:25 Software engineering as a specification acquisition problem
01:07:40 Compilers also made ugly code: vibe-coding quality and Claude on METR Slack
01:15:15 Strongest defensible claim, Carlini's compiler swarm and AI 2027
01:23:45 SWE-bench merge rates, the bank-teller analogy and horses
01:31:45 Scheming, alignment faking and the mentalistic vocabulary problem
01:40:45 Reward hacking, monitorability and chain-of-thought faithfulness
01:45:25 Recursive self-improvement, knowledge vs intelligence and closing

ReScript: https://app.rescript.info/public/share/de3bb40cc02ee39fdf36e2c60366eb4d

(PDF, refs, transcript etc)



