The Real Python Podcast

Decoupling Systems to Get Closer to the Data



What are the benefits of using a decoupled data processing system? How do you write reusable queries for a variety of backend data platforms? This week on the show, Phillip Cloud, the lead maintainer of Ibis, will discuss this portable Python dataframe library.

Phillip contrasts Ibis’s workflow with other Python dataframe libraries. We discuss how “getting close to the data” speeds things up and conserves memory.
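As a rough illustration of what "getting close to the data" can mean, the sketch below uses Python's built-in `sqlite3` module rather than Ibis itself. The `trips` table, its columns, and its rows are invented for the example. It contrasts pulling every row into Python memory with letting the database aggregate and hand back only the small result.

```python
import sqlite3

# Hypothetical example data: the table and column names are invented.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE trips (city TEXT, distance_km REAL)")
con.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("Oslo", 2.5), ("Oslo", 4.0), ("Lima", 10.0), ("Lima", 1.5), ("Pune", 3.0)],
)

# Approach 1: fetch every row, then aggregate in Python memory.
rows = con.execute("SELECT city, distance_km FROM trips").fetchall()
totals_in_python = {}
for city, distance in rows:
    totals_in_python[city] = totals_in_python.get(city, 0.0) + distance

# Approach 2: let the database aggregate and transfer one row per city.
totals_in_db = dict(
    con.execute("SELECT city, SUM(distance_km) FROM trips GROUP BY city").fetchall()
)

print(len(rows), "rows transferred for approach 1")
print(len(totals_in_db), "rows transferred for approach 2")
```

Both approaches produce the same totals. With five rows the difference is trivial, but the second approach moves only one row per group out of the database, which is the kind of saving the episode attributes to working close to the data.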

He describes the different approaches Ibis provides for querying data and how to select a specific backend. We discuss ways to get started with the library and how to access example data sets to experiment with the platform.

Phillip discovered Ibis while looking for a tool that would let him reuse SQL queries written for one data platform on a different one. He recounts how he got involved with the Ibis project and shares his background in open source, including how he learned to contribute to his first project.
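To make the idea of reusing one query on several platforms concrete, here is a minimal sketch that uses only the standard library. It is not Ibis's API: the `TopN` class, the `to_sql` function, the dialect names, and the `scores` table are all hypothetical. The sketch describes a query once in a backend-neutral form and renders it to SQL text for three dialects that differ in identifier quoting and row-limiting syntax.

```python
import sqlite3
from dataclasses import dataclass


@dataclass(frozen=True)
class TopN:
    """Hypothetical backend-neutral description: first n rows, ordered."""
    table: str
    columns: tuple
    order_by: str
    n: int


# Identifier quoting differs between SQL dialects.
QUOTES = {"sqlite": ('"', '"'), "mysql": ("`", "`"), "tsql": ("[", "]")}


def to_sql(query, dialect):
    """Render the same query description as SQL text for one dialect."""
    left, right = QUOTES[dialect]

    def quote(name):
        return f"{left}{name}{right}"

    cols = ", ".join(quote(c) for c in query.columns)
    table = quote(query.table)
    order = quote(query.order_by)
    if dialect == "tsql":
        # T-SQL limits rows with TOP instead of LIMIT.
        return f"SELECT TOP {query.n} {cols} FROM {table} ORDER BY {order}"
    return f"SELECT {cols} FROM {table} ORDER BY {order} LIMIT {query.n}"


# One query definition, written once.
lowest_two = TopN(table="scores", columns=("name", "score"), order_by="score", n=2)

for dialect in QUOTES:
    print(dialect, "->", to_sql(lowest_two, dialect))

# Run the SQLite rendering against an in-memory database with invented rows.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE scores (name TEXT, score INTEGER)")
con.executemany(
    "INSERT INTO scores VALUES (?, ?)",
    [("Ada", 91), ("Grace", 88), ("Linus", 75)],
)
result = con.execute(to_sql(lowest_two, "sqlite")).fetchall()
print(result)
```

The point of the design is that the query is defined separately from any one engine's syntax, so switching platforms means choosing a different renderer rather than rewriting the query. Real tools, such as the sqlglot transpiler listed in the show links, have to handle far more dialect differences than the two shown here.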

This episode is sponsored by Mailtrap.

Course Spotlight: Creating Web Maps From Your Data With Python Folium

You’ll learn how to create web maps from data using Folium. The package combines Python’s data-wrangling strengths with the data-visualization power of the JavaScript library Leaflet. In this video course, you’ll create and style a choropleth world map showing the ecological footprint per country.

Topics:

  • 00:00:00 – Introduction
  • 00:02:18 – How did you get started with Ibis?
  • 00:08:10 – First contribution to open source
  • 00:13:46 – Comparing Ibis to other dataframe libraries
  • 00:20:09 – Sponsor: Mailtrap
  • 00:20:43 – What goes into the selection of backend?
  • 00:27:07 – Database connections vs SQL compilers
  • 00:30:03 – Raw SQL approach
  • 00:34:06 – Dataframe approach
  • 00:38:31 – What does “getting close to the data” mean?
  • 00:41:52 – Video Course Spotlight
  • 00:43:24 – Phillip in the Cloud - YouTube channel
  • 00:44:56 – Access to sample data sets
  • 00:50:11 – Additional resources
  • 00:52:50 – What are some of the backends Ibis supports?
  • 00:54:13 – Entry points to the platform
  • 00:55:00 – How are you supported?
  • 00:57:10 – Exporting a SQL query
  • 00:59:23 – What are you excited about in the world of Python?
  • 01:04:28 – What do you want to learn next?
  • 01:07:12 – How can people follow your work online?
  • 01:08:00 – Thanks and goodbye

Show Links:

    • Ibis - the portable Python dataframe library
    • The Leading Designer and Builder of Enterprise Data Systems - Voltron Data
    • PEP 249 – Python Database API Specification v2.0
    • sqlglot: Python SQL Parser and Transpiler - GitHub
    • Ibis – getting_started
    • ibis-examples: A repository of runnable examples using ibis
    • Ibis – Reference Documentation
    • PyScript - Run Python in your HTML
    • pixi - Prefix.dev
    • uv: An extremely fast Python package installer and resolver, written in Rust
    • PyCon US 2024
    • LearnCraft Spanish – Fluency for Serious Learners
    • ibis: the portable Python dataframe library - GitHub
    • Ibis – Blog Posts
    • Phillip in the Cloud - YouTube
    • Phillip Cloud (@cpcloudy) / X
    • cpcloud (Phillip Cloud) · GitHub

Level up your Python skills with our expert-led courses:

      • Building Python Project Documentation With MkDocs
      • Creating Web Maps From Your Data With Python Folium
      • Using raise for Effective Exceptions

Support the podcast & join our community of Pythonistas

The Real Python Podcast by Real Python
