The Python Podcast.__init__

An Exploration Of Automated Speech Recognition


Summary

The overwhelming growth of smartphones, smart speakers, and spoken word content has corresponded with increasingly sophisticated machine learning models for recognizing speech content in audio data. Dylan Fox founded Assembly to provide access to the most advanced automated speech recognition models for developers to incorporate into their own products. In this episode he gives an overview of the current state of the art for automated speech recognition, the varying requirements for accuracy and speed of models depending on the context in which they are used, and what is required to build a special purpose model for your own ASR applications.

Announcements
  • Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.
  • When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!
  • Your host as usual is Tobias Macey, and today I’m interviewing Dylan Fox about the challenges of training and deploying large models for automated speech recognition.

Interview
  • Introductions
  • How did you get introduced to Python?
  • What is involved in building an ASR model?
    • How does the complexity/difficulty compare to models for other data formats? (e.g. computer vision, NLP, NER, etc.)
  • How have ASR models changed over the last 5, 10, 15 years?
  • What are some other categories of ML applications that work with audio data?
    • How does the level of complexity compare to ASR applications?
  • What is the typical size of an ASR model that you are deploying at Assembly?
    • What are the factors that contribute to the overall size of a given model?
    • How does accuracy compare with model size?
    • How does the size of a model contribute to the overall challenge of deploying/monitoring/scaling it in a production environment?
  • How can startups effectively manage the time/cost that comes with training large models?
  • What are some techniques that you use/attributes that you focus on for feature definitions in the source audio data?
  • Can you describe the lifecycle stages of an ASR model at Assembly?
  • What are the aspects of ASR which are still intractable or impractical to productionize?
  • What are the most interesting, innovative, or unexpected ways that you have seen ASR technology used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on ASR?
  • What are the trends in research or industry that you are keeping an eye on?

Keep In Touch
  • LinkedIn
  • @YouveGotFox on Twitter

Picks
  • Tobias
    • The Hitman’s Wife’s Bodyguard
  • Dylan
    • Inspiration 4 Documentary

Closing Announcements
  • Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast, for the latest on modern data management.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.
  • To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.
  • Join the community in the new Zulip chat workspace at pythonpodcast.com/chat

Links
  • Learn Python The Hard Way
  • DeepSpeech
  • Wav2Letter
  • BERT
  • GPT-3
  • Convolutional Neural Network (CNN)
  • Recurrent Neural Network (RNN)
  • Mycroft
    • Podcast Episode
  • CMU Sphinx
  • Pocket Sphinx
  • Gaussian Mixture Model (GMM)
  • Hidden Markov Model (HMM)
  • DeepSpeech Paper
  • Transformer Architecture
  • Audio Analytic Sound Recognition Podcast Episode
  • Horovod distributed training library
  • Knowledge Distillation
  • LibriSpeech Data Set
  • Lambda Labs
  • Wav2Vec

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA

The Python Podcast.__init__ By Tobias Macey
