Weaviate Podcast

Haize Labs with Leonard Tang - Weaviate Podcast #121!



How do you ensure your AI systems actually do what you expect them to do? Leonard Tang takes us deep into the revolutionary world of AI evaluation with concrete techniques you can apply today. Learn how Haize Labs is transforming AI testing through "scaling judge-time compute": stacking weaker models to effectively evaluate stronger ones. Leonard unpacks the game-changing Verdict library that outperforms frontier models by 10-20% while dramatically reducing costs.

Discover practical insights on creating contrastive evaluation sets that extract maximum signal from human feedback, implementing debate-based judging systems, and building custom reward models that align with enterprise needs. The conversation reveals powerful nuggets like using randomized agent debates to achieve consensus and lightweight guardrail models that run alongside inference.

Whether you're developing AI applications or simply fascinated by how we'll ensure increasingly powerful AI systems perform as expected, this episode delivers immediate value with techniques you can implement right away, philosophical perspectives on AI safety, and a glimpse into the future of evaluation that will fundamentally shape how AI evolves.


Weaviate Podcast by Weaviate

4 ratings

