The Real Python Podcast

Speeding Up Your DataFrames With Polars


How can you get more performance from your existing data science infrastructure? What if a DataFrame library could take advantage of your machine’s available cores and provide built-in methods for handling larger-than-RAM datasets? This week on the show, Liam Brannigan is here to discuss Polars.

Liam is an experienced data scientist working in finance, technology, and environmental analysis. He’s recently started contributing to the documentation for Polars and developing a training course for the library.

We talk about the library’s overall speed and lack of additional dependencies. Liam explains the advantages of lazy vs eager mode and which to choose when performing data exploration or attempting to load a dataset larger than your RAM.

We also discuss potential barriers to switching to Polars from a pandas workflow. Across our conversation, we explore several other libraries and technologies, including Apache Arrow, DuckDB, query optimization, and the “rustification” of Python tools.

Course Spotlight: Graph Your Data With Python and ggplot

In this course, you’ll learn how to use ggplot in Python to build data visualizations with plotnine. You’ll discover what a grammar of graphics is and how it can help you create plots in a very concise and consistent way.

Show Topics:

  • 00:00:00 – Introduction
  • 00:02:06 – Liam’s background and intro to Polars
  • 00:03:37 – Hurdles to switching to Polars
  • 00:05:23 – Creating training resources
  • 00:08:15 – No index
  • 00:09:46 – Data science 2025 predictions
  • 00:12:02 – Contributions to Polars
  • 00:15:07 – Eager vs lazy mode & query optimization
  • 00:19:25 – Sponsor: Anaconda Nucleus
  • 00:20:00 – Apache Arrow and Parquet
  • 00:24:43 – DuckDB and column orientation
  • 00:29:27 – The “rustification” of libraries
  • 00:34:49 – Video Course Spotlight
  • 00:36:16 – GPUs and memory requirements
  • 00:45:49 – No additional library requirements
  • 00:47:37 – Development of the ecosystem
  • 00:51:33 – Chaining operations
  • 00:53:39 – How can people follow your work?
  • 00:54:51 – What are you excited about in the world of Python?
  • 00:56:09 – What do you want to learn next?
  • 00:56:58 – Thanks and goodbye

Show Links:

  • Liam Brannigan - Data Scientist
  • Polars
  • polars - PyPI
  • Coming from Pandas - Polars - User Guide
  • Rho-Signal Data Analytics - YouTube
  • Cheatsheet for Pandas to Polars - Rho Signal
  • Data Analysis with Polars - Udemy
  • I wrote one of the fastest DataFrame libraries - Polars
  • Database-like ops benchmark comparison
  • Data science 2025 - Liam Brannigan
  • DuckDB - An in-process SQL OLAP database management system
  • The great Python DataFrame showdown, part 1: Demystifying Apache Arrow
  • Apache Arrow
  • Learn Rust - Rust Programming Language
  • Modern Polars
  • Anaconda - PyScript Updates: Bytecode Alliance, Pyodide, and MicroPython
  • Jupytext - Jupyter Notebooks as Markdown Documents, Julia, Python or R Scripts
  • Polars up and running - Liam Brannigan
  • Liam Brannigan - Data Scientist - Blog
  • Liam Brannigan (@braaannigan) - Twitter
  • Liam Brannigan - LinkedIn

Level up your Python skills with our expert-led courses:

  • Threading in Python
  • Reading and Writing Files With pandas
  • Graph Your Data With Python and ggplot

Support the podcast & join our community of Pythonistas
