Super Data Science: ML & AI Podcast with Jon Krohn

706: Large Language Model Leaderboards and Benchmarks

08.18.2023 - By Jon Krohn

In this episode, Caterina Constantinescu dives deep into Large Language Models (LLMs), spotlighting top leaderboards, evaluation benchmarks, and real-world user perceptions. Plus, discover the challenges of dataset contamination and the intricacies of platforms like HELM and Chatbot Arena.

Additional materials: www.superdatascience.com/706

Interested in sponsoring a SuperDataScience Podcast episode? Visit JonKrohn.com/podcast for sponsorship information.
