The Python Podcast.__init__

Pandas Extension Arrays with Tom Augspurger


Summary

Pandas is a Swiss Army knife for data processing in Python, but it has long been difficult to customize. The latest release adds an extension interface for adding custom data types with namespaced APIs. This allows for building and combining domain-specific use cases and alternative storage mechanisms. In this episode, Tom Augspurger describes how the new ExtensionArray works, how it came to be, and how you can start building your own extensions today.

Preface
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API, you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute.
  • To get worry-free releases, download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control, and monitor every step from commit to deployment in one place. And with their new Kubernetes integration it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected].
  • To help other people find the show, please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Your host as usual is Tobias Macey, and today I’m interviewing Tom Augspurger about the extension interface for Pandas data frames and the use cases that it enables.

Interview
  • Introductions
  • How did you get introduced to Python?
  • Most people are familiar with Pandas, but can you describe at a high level the new extension interface?
    • What is the story behind the implementation of this functionality?
    • Prior to this interface what was the option for anyone who wanted to extend Pandas?
  • What are some of the new data types that are available as external packages?
    • What are some of the unique use cases that they enable?
  • How is the new interface implemented within Pandas?
  • What were the most challenging or difficult aspects of building this new functionality?
  • What are some of the more interesting possibilities that you are aware of for new extension types?
  • What are the limitations of the interface for libraries that add new array functionality?
  • What is the next major change or improvement that you would like to add in Pandas?

Keep In Touch
  • tomaugspurger on GitHub
  • @TomAugspurger on Twitter

Picks
  • Tobias
    • Black Panther
  • Tom
    • Dask-ML

Links
  • Pandas
  • ExtensionArray
  • Original IP Address proposal
  • Mid-implementation blog post
  • Dataframe
  • Numpy
  • Cyberpandas
  • Geopandas
  • GIS
  • Arrow
  • CuPy
  • JQ
  • Wes McKinney
  • Array ufunc
  • Matplotlib
  • Altair
  • Seaborn
  • Bokeh
    • Podcast.__init__ Interview
  • Dask
    • Data Engineering Interview

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
