The Stack Overflow Podcast

How do you evaluate an LLM? Try an LLM.

04.16.2024 - By The Stack Overflow Podcast


On this episode: Stack Overflow senior data scientist Michael Geden tells Ryan and Ben how data scientists evaluate large language models (LLMs) and their output. They cover the challenges involved in evaluating LLMs, how LLMs are being used to evaluate other LLMs, the importance of data validation, the need for human raters, and the tradeoffs involved in selecting and fine-tuning LLMs.
