April 03, 2025

Chasing Real AGI: Inside ARC Prize 2025 with Chollet & Knoop

1 hour

In this episode, we dive deep into the race toward true machine intelligence, covering AGI benchmarks, test-time adaptation, and program synthesis with star AI researcher (and philosopher) François Chollet, creator of Keras and the ARC-AGI benchmark, and Mike Knoop, co-founder of Zapier and now co-founder, with François, of both the ARC Prize and the research lab Ndea. With the launch of ARC Prize 2025 and ARC-AGI 2, they explain why existing LLMs fall short on true intelligence tests, how new models like O3 mark a step change in capabilities, and what it will really take to reach AGI.

We cover the technical evolution from ARC 1 to ARC 2, the shift toward test-time reasoning, and the role of program synthesis as a foundation for more general intelligence. The conversation also explores the philosophical underpinnings of intelligence, the structure of the ARC Prize, and the motivation behind launching Ndea, a new AGI research lab that aims to build a "factory for rapid scientific advancement." Whether you're deep in the AI research trenches or just fascinated by where this is all headed, this episode offers clarity and inspiration.

Ndea

Website - https://ndea.com

X/Twitter - https://x.com/ndea

ARC Prize

Website - https://arcprize.org

X/Twitter - https://x.com/arcprize

François Chollet

LinkedIn - https://www.linkedin.com/in/fchollet

X/Twitter - https://x.com/fchollet

Mike Knoop

X/Twitter - https://x.com/mikeknoop

FIRSTMARK

Website - https://firstmark.com

X/Twitter - https://twitter.com/FirstMarkCap

Matt Turck (Managing Director)

LinkedIn - https://www.linkedin.com/in/turck/

X/Twitter - https://twitter.com/mattturck

(00:00) Intro

(01:05) Introduction to ARC Prize 2025 and ARC-AGI 2

(02:07) What is ARC and how it differs from other AI benchmarks

(02:54) Why current models struggle with fluid intelligence

(03:52) Shift from static LLMs to test-time adaptation

(04:19) What ARC measures vs. traditional benchmarks

(07:52) Limitations of brute-force scaling in LLMs

(13:31) Defining intelligence: adaptation and efficiency

(16:19) How O3 achieved a massive leap in ARC performance

(20:35) Speculation on O3's architecture and test-time search

(22:48) Program synthesis: what it is and why it matters

(28:28) Combining LLMs with search and synthesis techniques

(34:57) The ARC Prize structure: efficiency track, private vs. public

(42:03) Open source as a requirement for progress

(44:59) What's new in ARC-AGI 2 and human benchmark testing

(48:14) Capabilities ARC-AGI 2 is designed to test

(49:21) When will ARC-AGI 2 be saturated? AGI timelines

(52:25) Founding of Ndea and why now

(54:19) Vision beyond AGI: a factory for scientific advancement

(56:40) What Ndea is building and why it's different from LLM labs

(58:32) Hiring and remote-first culture at Ndea

(59:52) Closing thoughts and the future of AI research

By Matt Turck
