The Real Python Podcast

Narwhals: Expanding DataFrame Compatibility Between Libraries



How does a Python tool support all types of DataFrames and their various features? Could a lightweight library be used to add compatibility for newer formats like Polars or PyArrow? This week on the show, we speak with Marco Gorelli about his project, Narwhals.

Narwhals is a project aimed at library maintainers rather than end users. We discuss how the added compatibility benefits users by supporting modern features like lazy evaluation. We cover several projects Marco has been working with to implement Narwhals, including Altair, scikit-lego, and Ibis.

We also discuss how Marco started contributing to open-source projects. Marco has contributed to both pandas and Polars, which helps explain his interest in growing compatibility between libraries. He also offers advice on making your first contribution.

This episode is sponsored by CodeRabbit.

Course Spotlight: Differences Between Python’s Mutable and Immutable Types

In this video course, you’ll learn how Python’s mutable and immutable data types work internally and how you can take advantage of mutability or immutability to power your code.

Topics:

  • 00:00:00 – Introduction
  • 00:02:02 – EuroSciPy 2024 and sprints
  • 00:04:04 – How did you get involved in open source?
  • 00:07:18 – Finding a good issue to get started
  • 00:09:25 – Discord and open-source projects
  • 00:11:12 – How would you describe Narwhals?
  • 00:16:47 – Working on Polars
  • 00:19:17 – Apache Arrow and a data interchange protocol
  • 00:22:55 – Sponsor: CodeRabbit
  • 00:23:55 – Digging into eager vs lazy
  • 00:27:04 – Ibis DataFrame library
  • 00:28:57 – What do libraries need from Narwhals?
  • 00:34:57 – The scikit-lego library
  • 00:37:15 – Video Course Spotlight
  • 00:38:45 – Other libraries interested in Narwhals
  • 00:41:56 – Compatibility policy
  • 00:45:18 – What should an end user expect?
  • 00:46:32 – Have other projects attempted this?
  • 00:47:54 – Keeping the project light and pure Python
  • 00:49:32 – Contributors and how to get involved
  • 00:54:42 – What are you excited about in the world of Python?
  • 00:57:18 – What do you want to learn next?
  • 00:59:05 – How can people follow your work online?
  • 00:59:27 – Thanks and goodbye

Show Links:

    • Narwhals
    • EuroSciPy
    • narwhals: Lightweight and Extensible Compatibility Layer Between DataFrame Libraries! - GitHub
    • DataFrame Interoperability - What’s Been Achieved, and What Comes Next? - PyCon Lithuania - YouTube
    • How Narwhals Has Many End Users … That Never Use It Directly - YouTube
    • Polars Has a New Lightweight Plotting Backend - Altair
    • pandas - Python Data Analysis Library
    • Polars — DataFrames for the new era
    • great-tables - PyPI
    • Episode #214: Build Captivating Display Tables in Python With Great Tables
    • Ibis
    • Episode #201: Decoupling Systems to Get Closer to the Data
    • Great Tables is Now BYODF (Bring Your Own DataFrame)
    • How Narwhals and scikit-lego Came Together to Achieve DataFrame-Agnosticism
    • Explore Using Narwhals in Plotly Express · Issue #4749 - GitHub
    • Fairlearn
    • Perfect Backwards Compatibility Policy - Narwhals
    • uv: Unified Python packaging
    • pixi - Powerful Development Environments
    • Narwhals - Discord
    • marcogorelli - Fosstodon
    • Marco Gorelli - Quansight - LinkedIn

Level up your Python skills with our expert-led courses:

      • What's New in Python 3.13
      • pandas GroupBy: Grouping Real World Data in Python
      • The pandas DataFrame: Working With Data Efficiently

Support the podcast & join our community of Pythonistas
