
This episode explores the world of AI evaluation, with insights from Chris Hay on why benchmarks are "stupid" and how to effectively evaluate AI models.
https://github.com/confident-ai/deepeval
https://x.com/MikeBirdTech
https://x.com/FieroTy
https://x.com/chrishayuk