The Real Python Podcast

Thinking in Pandas: Python Data Analysis the Right Way



Are you using the Python library Pandas the right way? Do you wonder about getting better performance, or how to optimize your data for analysis? What does normalization mean? This week on the show, Hannah Stepanek joins us to discuss her new book, “Thinking in Pandas”.

The inspiration behind Hannah’s book came out of her talk at PyCon US 2019 titled “Thinking Like a Panda: Everything You Need to Know to Use Pandas the Right Way.” We discuss several core concepts covered in the book. She shares techniques for getting more performance when working with your data in Pandas. We also talk about her recent PyCon US 2020 online presentation about databases and migration.

Course Spotlight: Finding the Perfect Python Code Editor

Find your perfect Python development setup with this review of Python IDEs and code editors. With this course, you’ll get an overview of the most common Python coding environments to help you make an informed decision.

Topics:

  • 00:00:00 – Introduction
  • 00:01:36 – Working for New Relic
  • 00:03:14 – Thinking in Pandas book release
  • 00:03:27 – Who is the intended reader?
  • 00:05:27 – What is the underlying tech for Pandas?
  • 00:09:04 – Why shouldn’t you use apply?
  • 00:13:00 – When you have to use apply
  • 00:16:06 – Normalizing your data
  • 00:17:05 – Do you have a preferred format for a dataframe?
  • 00:18:17 – More on multi-index dataframes
  • 00:24:50 – Creating NumPy types
  • 00:28:30 – Loading in your data
  • 00:30:33 – Video Course Spotlight
  • 00:31:41 – Pivoting data
  • 00:34:34 – Considering outside libraries and performance
  • 00:35:41 – What topic were you eager to share in the book?
  • 00:37:52 – What resources did you use to learn pandas?
  • 00:40:53 – PyCon 2020 talk about databases and migration
  • 00:45:34 – Delving into migration and Alembic
  • 00:53:15 – Speaking opportunities
  • 00:56:13 – What are you excited about in the world of Python?
  • 00:57:32 – What do you want to learn next?
  • 00:58:49 – Do you read source code to learn?
  • 01:00:16 – Is there a particularly well-written library?
  • 01:01:28 – Final Thanks

Links:

  • Thinking in Pandas: How to Use the Python Data Analysis Library the Right Way - Apress
  • Thinking like a Panda: Everything you need to know to use pandas the right way - PyCon 2019 - Hannah Stepanek
  • pandas
  • CPython Internals: Your Guide to the Python 3 Interpreter
  • MultiIndex / advanced indexing: pandas documentation
  • NumPy Data type objects (dtype)
  • pandas.DataFrame.pivot: pandas documentation
  • Let’s talk Databases in Python: SQLAlchemy and Alembic - PyCon 2020 - Hannah Stepanek
  • SQLAlchemy: The Python SQL Toolkit and Object Relational Mapper
  • Alembic: A database migration tool for SQLAlchemy
  • import asyncio: Learn Python’s AsyncIO #1 - The Async Ecosystem

Level up your Python skills with our expert-led courses:

  • Finding the Perfect Python Code Editor
  • Histogram Plotting in Python: NumPy, Matplotlib, Pandas & Seaborn
  • Idiomatic pandas: Tricks & Features You May Not Know

Support the podcast & join our community of Pythonistas
