In this episode, Caterina Constantinescu dives deep into large language models (LLMs), spotlighting top leaderboards, evaluation benchmarks, and real-world user perceptions. Plus, discover the challenges of dataset contamination and the intricacies of platforms like HELM and Chatbot Arena.

Additional materials: https://www.superdatascience.com/706

Interested in sponsoring a SuperDataScience Podcast episode? Visit https://www.jonkrohn.com/podcast for sponsorship information.

706: Large Language Model Leaderboards and Benchmarks

The latest machine learning, A.I., and data career topics from across both academia and industry are brought to you by host Dr. Jon Krohn on the Super Data Science Podcast. As the quantity of data on our planet doubles every couple of years and with this trend set to continue for decades to come, there's an unprecedented opportunity for you to make a meaningful impact in your lifetime. In conversation with the biggest names in the data science industry, Jon cuts through hype to fuel that professional impact.

Whether you're curious about getting started in a data career or you're a deep technical expert, whether you'd like to understand what A.I. is or you'd like to integrate more data-driven processes into your business, we have inspiring guests and lighthearted conversation for you to enjoy.

We cover tools, techniques, and implementation tricks across data collection, databases, analytics, predictive modeling, visualization, software engineering, real-world applications, commercialization, and entrepreneurship: everything you need to crush it with data science.
