Delivering Deep Learning Powered Speech Recognition As A Service For Developers At AssemblyAI

08.04.2021 - By Tobias Macey

Summary

Building a software-as-a-service (SaaS) business is a fairly well understood pattern at this point. When the core of the service is a set of machine learning products, it introduces a whole new set of challenges. In this episode, Dylan Fox shares his experience building AssemblyAI as a reliable and affordable option for automatic speech recognition that caters to a developer audience. He discusses the machine learning development and deployment processes that his team relies on, the scalability and performance considerations that deep learning models introduce, and the user experience design that goes into building for a developer audience. This is a fascinating conversation about a unique cross-section of considerations and how Dylan and his team are building an impressive and useful service.

Announcements

Hello and welcome to Podcast.__init__, the podcast about Python’s role in data and science.

When you’re ready to launch your next app or want to try a project you hear about on the show, you’ll need somewhere to deploy it, so take a look at our friends over at Linode. With the launch of their managed Kubernetes platform it’s easy to get started with the next generation of deployment and scaling, powered by the battle tested Linode platform, including simple pricing, node balancers, 40Gbit networking, dedicated CPU and GPU instances, and worldwide data centers. Go to pythonpodcast.com/linode and get a $100 credit to try out a Kubernetes cluster of your own. And don’t forget to thank them for their continued support of this show!

Your host as usual is Tobias Macey, and today I’m interviewing Dylan Fox about AssemblyAI, a powerful and easy-to-use speech recognition API designed for developers.

Interview

Introductions

How did you get introduced to Python?

Can you describe what AssemblyAI is and the story behind it?

Speech recognition is a service that is being added to every cloud platform, video service, and podcast product. What do you see as the motivating factors for the current growth in this industry?

How would you characterize your overall position in the market?

What are the core goals that you are focused on with AssemblyAI?

Can you describe the different ways that you are using Python across the company?

How is the AssemblyAI platform architected?

What are the complexities that you have to work around to maintain high uptime for an API powered by a deep learning model?

What are the scaling challenges that crop up, whether in training or in serving?

What are the axes for improvement for a speech recognition model?

How do you balance tradeoffs of speed and accuracy as you iterate on the model?

What is your process for managing the deep learning workflow?

How do you manage CI/CD for your deep learning models?

What are the open areas of research in speech recognition?

What are the most interesting, innovative, or unexpected ways that you have seen AssemblyAI used?

What are the most interesting, unexpected, or challenging lessons that you have learned while working on AssemblyAI?

When is AssemblyAI the wrong choice?

What do you have planned for the future of AssemblyAI?

Keep In Touch

@YouveGotFox on Twitter

Picks

Tobias

H.P. Lovecraft

Dylan

Project Hail Mary by Andy Weir

Closing Announcements

Thank you for listening! Don’t forget to check out our other show, the Data Engineering Podcast for the latest on modern data management.

Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.

If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected] with your story.

To help other people find the show, please leave a review on iTunes and tell your friends and co-workers.

Join the community in the new Zulip chat workspace at pythonpodcast.com/chat

Links

AssemblyAI

Two Scoops of Django

Nuance

Dragon NaturallySpeaking

PyTorch

Podcast Episode

TensorFlow

FastAPI

Flask

Tornado

Podcast Episode

Neural Magic

Podcast Episode

The Martian

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
