June 16, 2026

Why Tejal Patwardhan stopped underestimating the models - Episode 21

44 minutes

The old tests are getting too easy. Tejal Patwardhan leads OpenAI’s frontier evals team, which is finding new ways to measure and forecast progress as models become more capable. She and host Andrew Mayne discuss why evals matter for research, how benchmarks can break or get gamed, and what models need to be judged on next.

Chapters

00:00:24 Growing up at OpenAI

00:03:10 Why reasoning changed everything

00:06:28 What made o1 surprising

00:11:20 Why old benchmarks stopped working

00:14:45 What makes a good benchmark

00:17:35 Why evals are getting harder

00:22:09 Measuring voice and vision models

00:24:48 Testing models on real science

00:33:23 How OpenAI tracks frontier progress

00:40:47 What AI means for work

Hosted on Acast. See acast.com/privacy for more information.

OpenAI Podcast

By OpenAI

4.4

5858 ratings
