Behind the Craft

AI Evaluations Crash Course in 50 Minutes (2025) | Hamel Husain


Today, I want to share a new episode with Hamel Husain.


Hamel has trained 2,000+ PMs and engineers from companies like OpenAI, Anthropic, and Google on how to run AI evals. In my new episode, he shares a free master class on how to build evals for a real AI agent in just 50 minutes using a simple spreadsheet. I learned a lot from Hamel and I think you will too.


Hamel and I talked about:

(00:00) What the most valuable part of evals is

(01:25) Live walkthrough: Analyzing 100 real production traces

(09:50) Creating the eval criteria using a simple spreadsheet

(24:44) Why binary pass/fail ratings beat 1-5 scores every time

(28:52) The agreement metric trap that fools most PMs

(30:08) True positive and negative rates explained

(36:00) How to set up continuous evals in production
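
The chapter titles above only name the agreement metric trap and the true positive and negative rates. As a rough illustration of why those topics belong together, here is a minimal sketch, not taken from the episode. It assumes binary pass/fail labels with "pass" treated as the positive class, and it uses made-up data: 100 traces, 10 of which a human reviewer marked as failures, scored by a hypothetical automated judge that passes everything. The function name `judge_alignment` is an illustrative choice.

```python
def judge_alignment(human, judge):
    """Compare binary pass/fail labels from a human reviewer and an automated judge.

    True means "pass", which this sketch treats as the positive class (an assumption).
    """
    if not human or len(human) != len(judge):
        raise ValueError("label lists must be non-empty and the same length")

    # Count the cases where both agree on pass (tp) and both agree on fail (tn).
    tp = sum(h and j for h, j in zip(human, judge))
    tn = sum((not h) and (not j) for h, j in zip(human, judge))
    positives = sum(human)
    negatives = len(human) - positives

    return {
        "agreement": (tp + tn) / len(human),
        # Share of human-labeled passes that the judge also passed.
        "tpr": tp / positives if positives else None,
        # Share of human-labeled failures that the judge also failed.
        "tnr": tn / negatives if negatives else None,
    }


# Made-up data: 90 passes and 10 failures according to the human reviewer.
human_labels = [True] * 90 + [False] * 10
# A hypothetical judge that marks every trace as a pass.
judge_labels = [True] * 100

metrics = judge_alignment(human_labels, judge_labels)
print(metrics)
```

With this made-up data the judge agrees with the human on 90% of traces yet catches none of the failures, so the true negative rate is 0. Raw agreement can look healthy when failures are rare, which is one reason to report the two rates separately.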


Get the takeaways: https://creatoreconomy.so/p/ai-evaluations-crash-course-in-50-minutes-hamel-husain


Where to find Hamel:

X: https://x.com/HamelHusain

Website: https://hamel.dev/


📌 Subscribe to this channel – more interviews coming soon!

Behind the Craft by Peter Yang
