The Python Podcast.__init__

SpaCy with Matthew Honnibal


Summary

As the amount of text available on the internet and in businesses continues to increase, the need for fast and accurate language analysis becomes more prominent. This week Matthew Honnibal, the creator of SpaCy, talks about his experiences researching natural language processing and creating a library to make his findings accessible to industry.

Brief Introduction
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • I would like to thank everyone who has donated to the show. Your contributions help us make the show sustainable.
  • When you’re ready to launch your next project you’ll need somewhere to deploy it. Check out Linode at linode.com/podcastinit and get a $20 credit to try out their fast and reliable Linux virtual servers for running your awesome app.
  • You’ll want to make sure that your users don’t have to put up with bugs, so you should use Rollbar for tracking and aggregating your application errors to find and fix the bugs in your application before your users notice they exist. Use the link rollbar.com/podcastinit to get 90 days and 300,000 errors for free on their bootstrap plan.
  • Visit our site to subscribe to our show, sign up for our newsletter, read the show notes, and get in touch.
  • To help other people find the show, you can leave a review on iTunes or Google Play Music and tell your friends and co-workers.
  • Join our community! Visit discourse.pythonpodcast.com for your opportunity to find out about upcoming guests, suggest questions, and propose show ideas.
  • Your host as usual is Tobias Macey, and today I’m interviewing Matthew Honnibal about SpaCy and Explosion AI.

Interview with Matthew Honnibal
  • Introductions
  • How did you get introduced to Python?
  • Can you start by sharing what SpaCy is and what problem you were trying to solve when you created it?
  • Another project for natural language processing that has been part of the Python ecosystem for a number of years is the Natural Language Toolkit (NLTK). How does SpaCy differ from NLTK, and are there any cases where NLTK would be the better choice?
  • How much knowledge of NLP and computational linguistics is necessary to be able to use SpaCy?
  • What does the internal design and architecture of SpaCy look like, and what are the biggest challenges associated with its development to date and into the future?
  • One of the projects that you have built around SpaCy, which I think is really cool and which caught my attention when I first found your project, is the displaCy visualization tool. Can you explain what that is and why you think it is important?
  • What are some kinds of applications where SpaCy would be useful which might not be obvious candidates for it?
  • Why is speed such an important focus for an NLP library?
  • One of the ways that you have been able to gain a speed boost is through releasing the GIL and allowing for true parallelism via Cython. How have you managed to ensure that this doesn’t lead to data races and program failures?
  • Building on the success of SpaCy, you founded a company called Explosion AI. Can you explain what your goals are for this endeavor and the kinds of services that you are offering?
  • What are some of the most interesting uses of SpaCy that you have seen?
  • What do you have planned for the future of SpaCy?

Keep In Touch
  • Twitter
    • Matthew
    • SpaCy
    • Explosion AI
  • Mailing List
  • Explosion AI Contact Form

Picks
  • Tobias
    • Zoom H4N Pro
    • Shure SM58

Links
  • Reddit sense2vec demo
  • DisplaCy
  • DisplaCy Entity Visualizer
  • SpaCy Showcase
  • NLTK
  • Chartbeat
  • Cytora

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA


