Changelog Master Feed

Speech tech and Common Voice at Mozilla (Practical AI #104)


Many people are excited about creating usable speech technology. However, most of the audio data used by large companies isn’t available to the majority of people, and that data is often biased in terms of language, accent, and gender. Jenny, Josh, and Remy from Mozilla join us to discuss how Mozilla is building an open-source voice database that anyone can use to make innovative apps for devices and the web (Common Voice). They also discuss efforts through Mozilla's fellowship program to develop speech tech for African languages and to understand bias in data sets.

Join the discussion

Changelog++ members get a bonus 2 minutes at the end of this episode and zero ads. Join today!

Sponsors:

  • Linode – Our cloud of choice and the home of Changelog.com. Deploy a fast, efficient, native SSD cloud server for only $5/month. Get 4 months free using the code changelog2019 OR changelog2020. To learn more and get started head to linode.com/changelog.
  • Pace.dev – Minimalist web-based management tool for your teams. Async-by-default communication and simplistic task management give you everything you need to build your next thing. Brought to you by Go Time panelist Mat Ryer. Try it out today!
  • Fastly – Our bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com.
  • Rollbar – We move fast and fix things because of Rollbar. Resolve errors in minutes. Deploy with confidence. Learn more at rollbar.com/changelog.

Featuring:

  • Jenny Zhang – Website, X
  • Remy Muhire – GitHub, X
  • Josh Meyer – GitHub, X
  • Chris Benson – Website, GitHub, LinkedIn, X
  • Daniel Whitenack – Website, GitHub, X

Show Notes:

  • Mozilla Common Voice
  • Announcement of Josh and Remy’s fellowship work on speech tech for African languages
  • Artie Bias Corpus
  • Readings on Demographic Bias in ASR:
    • Voice recognition still has significant race and gender biases
    • Gender and Dialect Bias in YouTube’s Automatic Captions
    • Racial disparities in automated speech recognition
  • Common Voice LREC Paper
  • Common Voice + DeepSpeech collaborators for Low-resource languages:
    • Digital Umuganda
    • AI Lab, Makerere University
    • Language Technologies Unit, Bangor University
    • Linguistics Department, Indiana University Bloomington
  • “under-sampled majority” is a quote from Joy Buolamwini (see this article)

Changelog Master Feed by Changelog Media
