Changelog Master Feed

Towards high-quality (maybe synthetic) datasets (Practical AI #290)


Listen Later

As Argilla puts it: “Data quality is what makes or breaks AI.” However, what exactly does this mean and how can AI team probably collaborate with domain experts towards improved data quality? David Berenstein & Ben Burtenshaw, who are building Argilla & Distilabel at Hugging Face, join us to dig into these topics along with synthetic data generation & AI-generated labeling / feedback.

Join the discussion

Changelog++ members save 11 minutes on this episode because they made the ads disappear. Join today!

Sponsors:

  • Fly.ioThe home of Changelog.com — Deploy your apps close to your users — global Anycast load-balancing, zero-configuration private networking, hardware isolation, and instant WireGuard VPN connections. Push-button deployments that scale to thousands of instances. Check out the speedrun to get started in minutes.
  • WorkOSA platform that gives developers a set of building blocks for quickly adding enterprise-ready features to their application. Add Single Sign-On (Okta, Azure, Google, Microsoft OAuth), sync users from any SCIM directory, HRIS integration, audit trails (SIEM), free magic link sign-in. WorkOS is designed for developers and offers a single, elegant interface that abstracts dozens of enterprise integrations. Learn more and get started at WorkOS.com
  • Eight SleepTake your sleep and recovery to the next level. Go to eightsleep.com/PRACTICALAI and use the code PRACTICALAI to get $350 off your very own Pod 4 Ultra. You can try it for free for 30 days - but we’re confident you will not want to return it. Once you experience AI-optimized sleep, you’ll wonder how you ever slept without it. Currently shipping to: United States, Canada, United Kingdom, Europe, and Australia.
  • Featuring:

    • Ben Burtenshaw – GitHub, LinkedIn, X
    • David Berenstein – GitHub, LinkedIn, X
    • Chris Benson – Website, GitHub, LinkedIn, X
    • Daniel Whitenack – Website, GitHub, X

    Show Notes:

    • Argilla
    • Distilabel
    • Synthetic Data Generator UI
    • Hugging Face and Argilla meetups
    • Something missing or broken? PRs welcome!

      ...more
      View all episodesView all episodes
      Download on the App Store

      Changelog Master FeedBy Changelog Media

      • 4.4
      • 4.4
      • 4.4
      • 4.4
      • 4.4

      4.4

      29 ratings


      More shows like Changelog Master Feed

      View all
      Software Engineering Radio by se-radio@computer.org

      Software Engineering Radio

      273 Listeners

      Hanselminutes with Scott Hanselman by Scott Hanselman

      Hanselminutes with Scott Hanselman

      380 Listeners

      The Changelog: Software Development, Open Source by Changelog Media

      The Changelog: Software Development, Open Source

      290 Listeners

      Software Engineering Daily by Software Engineering Daily

      Software Engineering Daily

      625 Listeners

      Talk Python To Me by Michael Kennedy

      Talk Python To Me

      588 Listeners

      Soft Skills Engineering by Jamison Dance and Dave Smith

      Soft Skills Engineering

      283 Listeners

      Thoughtworks Technology Podcast by Thoughtworks

      Thoughtworks Technology Podcast

      42 Listeners

      The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence) by Sam Charrington

      The TWIML AI Podcast (formerly This Week in Machine Learning & Artificial Intelligence)

      434 Listeners

      Syntax - Tasty Web Development Treats by Wes Bos & Scott Tolinski - Full Stack JavaScript Web Developers

      Syntax - Tasty Web Development Treats

      986 Listeners

      CoRecursive: Coding Stories by Adam Gordon Bell - Software Developer

      CoRecursive: Coding Stories

      188 Listeners

      Kubernetes Podcast from Google by Abdel Sghiouar, Kaslin Fields

      Kubernetes Podcast from Google

      181 Listeners

      Practical AI by Practical AI LLC

      Practical AI

      208 Listeners

      The Stack Overflow Podcast by The Stack Overflow Podcast

      The Stack Overflow Podcast

      62 Listeners

      Big Technology Podcast by Alex Kantrowitz

      Big Technology Podcast

      476 Listeners

      Oxide and Friends by Oxide Computer Company

      Oxide and Friends

      59 Listeners