Weaviate Podcast

Haize Labs with Leonard Tang - Weaviate Podcast #121!


Listen Later

How do you ensure your AI systems actually do what you expect them to do? Leonard Tang takes us deep into the revolutionary world of AI evaluation with concrete techniques you can apply today. Learn how Haize Labs is transforming AI testing through "scaling judge-time compute" - stacking weaker models to effectively evaluate stronger ones. Leonard unpacks the game-changing Verdict library that outperforms frontier models by 10-20% while dramatically reducing costs. Discover practical insights on creating contrastive evaluation sets that extract maximum signal from human feedback, implementing debate-based judging systems, and building custom reward models that align with enterprise needs. The conversation reveals powerful nuggets like using randomized agent debates to achieve consensus and lightweight guardrail models that run alongside inference. Whether you're developing AI applications or simply fascinated by how we'll ensure increasingly powerful AI systems perform as expected, this episode delivers immediate value with techniques you can implement right away, philosophical perspectives on AI safety, and a glimpse into the future of evaluation that will fundamentally shape how AI evolves.

...more
View all episodesView all episodes
Download on the App Store

Weaviate PodcastBy Weaviate

  • 4
  • 4
  • 4
  • 4
  • 4

4

4 ratings


More shows like Weaviate Podcast

View all
Practical AI by Practical AI LLC

Practical AI

207 Listeners

AWS Podcast by Amazon Web Services

AWS Podcast

202 Listeners

All-In with Chamath, Jason, Sacks & Friedberg by All-In Podcast, LLC

All-In with Chamath, Jason, Sacks & Friedberg

9,927 Listeners

Dwarkesh Podcast by Dwarkesh Patel

Dwarkesh Podcast

517 Listeners

No Priors: Artificial Intelligence | Technology | Startups by Conviction

No Priors: Artificial Intelligence | Technology | Startups

131 Listeners

Latent Space: The AI Engineer Podcast by swyx + Alessio

Latent Space: The AI Engineer Podcast

93 Listeners

Interconnects by Nathan Lambert

Interconnects

9 Listeners

OpenAI Podcast by OpenAI

OpenAI Podcast

52 Listeners