Behind the Craft

AI Evaluations Crash Course in 50 Minutes (2025) | Hamel Husain


Today, I want to share a new episode with Hamel Husain.


Hamel has trained 2,000+ PMs and engineers from companies like OpenAI, Anthropic, and Google on how to run AI evals. In my new episode, he shares a free master class on how to build evals for a real AI agent in just 50 minutes using a simple spreadsheet. I learned a lot from Hamel and I think you will too.


Hamel and I talked about:

(00:00) What the most valuable part of evals is

(01:25) Live walkthrough: Analyzing 100 real production traces

(09:50) Creating the eval criteria using a simple spreadsheet

(24:44) Why binary pass/fail ratings beat 1-5 scores every time

(28:52) The agreement metric trap that fools most PMs

(30:08) True positive and negative rates explained

(36:00) How to set up continuous evals in production


Get the takeaways: https://creatoreconomy.so/p/ai-evaluations-crash-course-in-50-minutes-hamel-husain


Where to find Hamel:

X: https://x.com/HamelHusain

Website: https://hamel.dev/


📌 Subscribe to this channel – more interviews coming soon!

Behind the Craft, by Peter Yang

