November 17, 2025

Evals and Aliens – How model testing is not a binary affair

1 hour 5 minutes

Pete and Alex examine AI model evaluation methodologies, comparing traditional machine learning metrics with the qualitative assessment challenges of large language models. They discuss the collaborative requirements between technical and business teams to establish evaluation criteria for generative AI systems, highlighting the subjective nature of testing conversational outputs versus binary classification tasks. With the help […]

By Digressive Podcasts
