
Edwin Chen is the founder and CEO of Surge AI, the data infrastructure company behind nearly every major frontier model. Surge works with OpenAI, Anthropic, Meta, and Google, providing the high-quality data and evaluation infrastructure that powers their models.
Edwin reveals why optimizing for popular benchmarks like LMArena is "basically optimizing for clickbait," how one frontier lab's models regressed for 6-12 months without anyone knowing, and why the industry's approach to measurement is fundamentally broken. Jacob and Edwin discuss what actually makes elite AI evaluators, why "there's never going to be a one size fits all solution" for AI models, and how frontier labs are taking surprisingly divergent paths to AGI.
(0:00) Intro
(0:56) The Pitfalls of Optimizing for LMArena
(4:34) Issues with Data Quality and Measurement
(9:44) The Importance of Human Evaluations
(13:40) The Rise of RL Environments
(17:21) Challenges and Lessons in Model Training
(19:59) Silicon Valley's Pivot Culture
(23:06) Technology-Driven Approach
(24:18) Quality Beyond Credentials
(27:51) Impact of Scale Acquisition
(28:35) Hiring for Research Culture
(30:48) Divergence in AI Training Paradigms
(34:16) Future of AI Models
(39:32) Multimodal AI and Quality
(43:44) Quickfire
With your co-hosts:
@jacobeffron
- Partner at Redpoint, Former PM Flatiron Health
@patrickachase
- Partner at Redpoint, Former ML Engineer LinkedIn
@ericabrescia
- Former COO GitHub, Founder Bitnami (acq’d by VMware)
@jordan_segall
- Partner at Redpoint
By Redpoint Ventures