Behind the Craft

Complete Beginner's Course on AI Evaluations: Step by Step (2025) | Aman Khan


Today, I want to share a new episode with Aman Khan.

The best way to learn about AI evaluations is to watch two PMs build them live from scratch. In our new episode, Aman and I walk through creating evals for an AI customer support agent, from labeling a golden dataset to aligning LLM judges. This is the complete beginner's AI eval course you've been waiting for.

Aman and I talked about:

(00:00) What are AI evals and how to get good at them

(02:52) The 4 types of AI evaluations everyone should know

(06:08) Live demo: Building evals for a customer support agent

(10:29) Using Anthropic's console to generate great prompts

(15:13) Creating the evaluation criteria

(17:40) Adding human labels to the golden dataset

(31:05) Scaling evals with LLM-judge prompts

(38:21) How to align LLM judges with human judgment

Get the takeaways: https://creatoreconomy.so/p/complete-beginner-course-on-ai-evaluations-aman-khan

Where to find Aman:

LinkedIn: https://www.linkedin.com/in/amanberkeley/

Website: https://arize.com/

📌 Subscribe to this channel – more interviews coming soon!

Behind the Craft, by Peter Yang
