The Python Podcast.__init__

Pandas Extension Arrays with Tom Augspurger



Summary

Pandas is a Swiss Army knife for data processing in Python, but it has long been difficult to customize. The latest release adds an extension interface for defining custom data types with namespaced APIs. This makes it possible to build and combine domain-specific use cases and alternative storage mechanisms. In this episode, Tom Augspurger describes how the new ExtensionArray works, how it came to be, and how you can start building your own extensions today.
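
As a rough illustration of what a "namespaced API" can look like, the sketch below registers a custom accessor on a pandas Series using `pandas.api.extensions.register_series_accessor`. It covers only the namespacing half of the story: the addresses are stored as plain strings rather than in a real ExtensionArray, and the accessor name `ipdemo`, the class `IPDemoAccessor`, and the method `is_private` are hypothetical names chosen for this example, not part of pandas or of any package mentioned in the episode.

```python
import ipaddress

import pandas as pd


# Hypothetical accessor: "ipdemo" and IPDemoAccessor are illustration-only names.
@pd.api.extensions.register_series_accessor("ipdemo")
class IPDemoAccessor:
    def __init__(self, series):
        # pandas passes in the Series that the accessor is attached to.
        self._series = series

    def is_private(self):
        # Parse each string with the standard library and flag private addresses.
        return self._series.map(lambda value: ipaddress.ip_address(value).is_private)


addresses = pd.Series(["192.168.0.1", "8.8.8.8"])

# The custom method lives under its own namespace instead of on Series itself.
private_mask = addresses.ipdemo.is_private()
print(private_mask.tolist())
```

A full extension type would additionally subclass `ExtensionDtype` and `ExtensionArray` so that pandas can store the values in a dedicated array rather than as Python strings.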

Preface
  • Hello and welcome to Podcast.__init__, the podcast about Python and the people who make it great.
  • When you’re ready to launch your next app, you’ll need somewhere to deploy it, so check out Linode. With private networking, shared block storage, node balancers, and a 200Gbit network, all controlled by a brand new API, you’ve got everything you need to scale up. Go to podcastinit.com/linode to get a $20 credit and launch a new server in under a minute.
  • To get worry-free releases, download GoCD, the open source continuous delivery server built by ThoughtWorks. You can use their pipeline modeling and value stream map to build, control, and monitor every step from commit to deployment in one place. And with their new Kubernetes integration, it’s even easier to deploy and scale your build agents. Go to podcastinit.com/gocd to learn more about their professional support services and enterprise add-ons.
  • Visit the site to subscribe to the show, sign up for the newsletter, and read the show notes. And if you have any questions, comments, or suggestions, I would love to hear them. You can reach me on Twitter at @Podcast__init__ or email [email protected].
  • To help other people find the show, please leave a review on iTunes or Google Play Music, tell your friends and co-workers, and share it on social media.
  • Your host as usual is Tobias Macey, and today I’m interviewing Tom Augspurger about the extension interface for Pandas data frames and the use cases that it enables.

Interview
  • Introductions
  • How did you get introduced to Python?
  • Most people are familiar with Pandas, but can you describe at a high level the new extension interface?
    • What is the story behind the implementation of this functionality?
    • Prior to this interface what was the option for anyone who wanted to extend Pandas?
  • What are some of the new data types that are available as external packages?
    • What are some of the unique use cases that they enable?
  • How is the new interface implemented within Pandas?
  • What were the most challenging or difficult aspects of building this new functionality?
  • What are some of the more interesting possibilities that you are aware of for new extension types?
  • What are the limitations of the interface for libraries that add new array functionality?
  • What is the next major change or improvement that you would like to add in Pandas?

Keep In Touch
  • tomaugspurger on GitHub
  • @TomAugspurger on Twitter

Picks
  • Tobias
    • Black Panther
  • Tom
    • Dask-ML

Links
  • Pandas
  • ExtensionArray
  • Original IP Address proposal
  • Mid-implementation blog post
  • Dataframe
  • Numpy
  • Cyberpandas
  • Geopandas
  • GIS
  • Arrow
  • CuPy
  • JQ
  • Wes McKinney
  • Array ufunc
  • Matplotlib
  • Altair
  • Seaborn
  • Bokeh
    • Podcast.__init__ Interview
  • Dask
    • Data Engineering Interview

The intro and outro music is from Requiem for a Fish by The Freak Fandango Orchestra / CC BY-SA
