Changelog Master Feed

Metrics Driven Development (Practical AI #284)



How do you systematically measure, optimize, and improve the performance of LLM applications, such as those powered by RAG or tool use? Ragas is an open source effort that has been trying to answer this question comprehensively, and it promotes a “Metrics Driven Development” approach. Shahul from Ragas joins us to discuss the project, and we dig into specific metrics, the difference between benchmarking models and evaluating LLM apps, generating synthetic test data, and more.



Sponsors:

  • AssemblyAI – Turn voice data into summaries with AssemblyAI’s leading Speech AI models. Built by AI experts, their Speech AI models include accurate speech-to-text for voice data (such as calls, virtual meetings, and podcasts), speaker detection, sentiment analysis, chapter detection, PII redaction, and more.

Featuring:

  • Shahul Es – GitHub, LinkedIn, X
  • Daniel Whitenack – Website, GitHub, X

Show Notes:

  • Ragas

    Changelog Master Feed, by Changelog Media
