The Real Python Podcast

Harnessing the Power of Python Polars



What are the advantages of using Polars for your Python data projects? When should you use the lazy or eager APIs, and what are the benefits of each? This week on the show, we speak with Jeroen Janssens and Thijs Nieuwdorp about their new book, Python Polars: The Definitive Guide.

Jeroen and Thijs describe how they were introduced to Polars while working at Xomnia. They were converting a large data project to Python and saw surprising speed increases using the new library.

We discuss converting projects from pandas to Polars, getting away from indexes, consistent syntax, and using lazy vs eager APIs. Along the way, Jeroen and Thijs offer tips for getting the most out of Polars in your code.

We dig into the process of writing a definitive guide and the advantages of working collaboratively on a book project. They also share resources for practicing data wrangling and building visualizations with PydyTuesday.

Course Spotlight: Working With Python Polars

Welcome to the world of Polars, a powerful DataFrame library for Python. In this video course, you’ll get a hands-on introduction to Polars’ core features and see why this library is generating so much buzz.

Topics:

  • 00:00:00 – Introduction
  • 00:02:47 – Polars start at Xomnia
  • 00:04:08 – Putting Polars into production
  • 00:07:18 – Realizing the speed differences
  • 00:08:49 – Converting the project from R to Python
  • 00:14:34 – How did Polars improve the project?
  • 00:16:34 – Making the code more ergonomic and readable
  • 00:19:21 – Only grabbing the data that is needed
  • 00:20:37 – Titling and deciding to write the book
  • 00:24:40 – Advantages to collaboration
  • 00:29:34 – What were you excited to include in the book?
  • 00:31:55 – Working with different engines and NVIDIA’s CUDA
  • 00:35:05 – Defining a Polars expression
  • 00:36:11 – Transitioning from pandas to Polars
  • 00:37:34 – Not needing an index
  • 00:39:56 – What inspired the syntax?
  • 00:45:01 – Defining lazy vs eager workflows
  • 00:49:16 – Examples covered in first chapter preview
  • 00:51:51 – Video Course Spotlight
  • 00:53:14 – Data formats and Arrow
  • 00:55:41 – Working with NaN, null, or None
  • 00:58:11 – Measuring performance through a benchmark
  • 00:59:12 – Advantages to working with the Discord community
  • 01:02:32 – Code examples and applying the techniques
  • 01:03:34 – PydyTuesday
  • 01:05:47 – What are you excited about in the world of Python?
  • 01:09:21 – What do you want to learn next?
  • 01:13:26 – What’s the best way to follow your work online?
  • 01:14:14 – Thanks and goodbye

Survey:

  • Listener Survey - Help Shape the Future of the Real Python Podcast

Show Links:

  • Python Polars: The Definitive Guide
  • Janssens & Nieuwdorp - What we learned by converting a large codebase from Pandas to Polars - YouTube
  • Polars — DataFrames for the new era
  • polars · PyPI
  • Xomnia - Home Page
  • Episode #140: Speeding Up Your DataFrames With Polars
  • Data Science at the Command Line - Jeroen Janssens
  • Tidyverse
  • PySpark Overview — PySpark 4.0.0 documentation
  • Episode #193: Wes McKinney on Improving the Data Stack & Composable Systems
  • Apache Arrow
  • TPC-H Homepage
  • Community – Python Polars: The Definitive Guide
  • pydytuesday: A Python package to download TidyTuesday datasets
  • PydyTuesday - Python How-to Videos - YouTube
  • Astral: High-performance Python tooling
  • Episode #238: Charlie Marsh: Accelerating Python Tooling With Ruff and uv
  • uv: An extremely fast Python package and project manager, written in Rust.
  • PEP 723 – Inline script metadata
  • Inline script metadata - Python Packaging User Guide
  • Package Your Python Code as a CLI - PyData London 25 - YouTube
  • marimo - A next-generation Python notebook
  • The Rust Programming Language Book
  • Pimsleur - Learn New Languages Online
  • Official Rosetta Stone - How Language Is Learned
  • Thijs Nieuwdorp
  • Jeroen Janssens
  • Python Polars: The Definitive Guide

Level up your Python skills with our expert-led courses:

  • Working With Python Polars
  • Graph Your Data With Python and ggplot
  • Working With Missing Data in Polars

Support the podcast & join our community of Pythonistas
