Changelog Master Feed

Towards high-quality (maybe synthetic) datasets (Practical AI #290)



As Argilla puts it: “Data quality is what makes or breaks AI.” But what exactly does this mean, and how can AI teams properly collaborate with domain experts to improve data quality? David Berenstein & Ben Burtenshaw, who are building Argilla & Distilabel at Hugging Face, join us to dig into these topics, along with synthetic data generation & AI-generated labeling / feedback.


Changelog++ members save 11 minutes on this episode because they made the ads disappear. Join today!

Sponsors:

  • Fly.io: The home of Changelog.com. Deploy your apps close to your users with global Anycast load-balancing, zero-configuration private networking, hardware isolation, and instant WireGuard VPN connections. Push-button deployments that scale to thousands of instances. Check out the speedrun to get started in minutes.
  • WorkOS: A platform that gives developers a set of building blocks for quickly adding enterprise-ready features to their application. Add Single Sign-On (Okta, Azure, Google, Microsoft OAuth), sync users from any SCIM directory, HRIS integration, audit trails (SIEM), free magic link sign-in. WorkOS is designed for developers and offers a single, elegant interface that abstracts dozens of enterprise integrations. Learn more and get started at WorkOS.com
  • Eight Sleep: Take your sleep and recovery to the next level. Go to eightsleep.com/PRACTICALAI and use the code PRACTICALAI to get $350 off your very own Pod 4 Ultra. You can try it for free for 30 days, but we’re confident you will not want to return it. Once you experience AI-optimized sleep, you’ll wonder how you ever slept without it. Currently shipping to: United States, Canada, United Kingdom, Europe, and Australia.
Featuring:

  • Ben Burtenshaw – GitHub, LinkedIn, X
  • David Berenstein – GitHub, LinkedIn, X
  • Chris Benson – Website, GitHub, LinkedIn, X
  • Daniel Whitenack – Website, GitHub, X

Show Notes:

  • Argilla
  • Distilabel
  • Synthetic Data Generator UI
  • Hugging Face and Argilla meetups
  • Something missing or broken? PRs welcome!


Changelog Master Feed, by Changelog Media

Rating: 4.4 (29 ratings)

