.NET Rocks!

Measuring LLMs with Jodie Burchell



How do you measure the quality of a large language model? Carl and Richard talk to Dr. Jodie Burchell about her work measuring large language models for accuracy, reliability, and consistency. Jodie talks about the variety of benchmarks that exist for LLMs and the problems those benchmarks have. A broader conversation about quality digs into the idea that LLMs should be targeted to the particular topic area they are being used for: often, smaller is better! Building a good test suite for your LLM is challenging but can increase your confidence that the tool will work as expected.

.NET Rocks! By Carl Franklin and Richard Campbell

4.4

37 ratings


More shows like .NET Rocks!

Hanselminutes with Scott Hanselman by Scott Hanselman (378 Listeners)

Software Engineering Radio - the podcast for professional software developers by se-radio@computer.org (265 Listeners)

.NET Rocks! by Carl Franklin and Richard Campbell (242 Listeners)

RunAs Radio by Richard Campbell (82 Listeners)

The Changelog: Software Development, Open Source by Changelog Media (285 Listeners)

Thoughtworks Technology Podcast by Thoughtworks (43 Listeners)

Software Engineering Daily by Software Engineering Daily (624 Listeners)

Soft Skills Engineering by Jamison Dance and Dave Smith (271 Listeners)

The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) by Sam Charrington (439 Listeners)

AWS Podcast by Amazon Web Services (203 Listeners)

Syntax - Tasty Web Development Treats by Wes Bos & Scott Tolinski - Full Stack JavaScript Web Developers (984 Listeners)

REWORK by 37signals (212 Listeners)

Kubernetes Podcast from Google by Abdel Sghiouar, Kaslin Fields (184 Listeners)

The Stack Overflow Podcast by The Stack Overflow Podcast (62 Listeners)

The Real Python Podcast by Real Python (137 Listeners)

AI Report by Alexander Klöpping and Wietse Hage (5 Listeners)