
What is the modern data stack—and how do tools like Spark, dbt, and Airflow actually work together? In this lesson, I’ll break it down step by step with real-world insights.
In Lesson 2 of the Data Engineering 101 series, we take a deep dive into the Modern Data Stack—what it is, how the components work together, and why data modeling is still the backbone of every great pipeline.
💡 You’ll learn how tools like Spark, dbt, Pandas, and Airflow function like machines in a factory—transforming raw logs and cloud-stored data into meaningful business insight.
We’ll also talk about:
Databases vs. data warehouses
Cloud storage as digital basements
Why orchestration matters
When Excel is actually OK
And how to start mapping your own data stack
📦 This episode includes visuals, analogies, and action steps you can apply right now—whether you're building from scratch or modernizing legacy systems.
📚 Watch the full Data Engineering 101 Playlist:
👉 https://www.youtube.com/playlist?list=PLewT1HTMY0WZ4tpoBw-w_CcewwrEDNmsU
🚨 Don’t miss Lesson 3: SQL Fundamentals (coming next week!)
—
Looking to get more technical practice? Join Codecademy while supporting the channel!
If you are a high school or college student, you get 35% off with this link:
https://www.pntrs.com/t/2-591852-361892-213588
Not a student but still want a discount? Get 15% off the normal prices with this link from Gambill Data: https://www.pntrac.com/t/2-523372-361892-213588
🔔 Subscribe for more real-world data engineering strategies, tools, and career advice.
➕ Follow me on LinkedIn: https://www.linkedin.com/in/databasemanagement/
💻 Check out consulting & mentoring resources: https://www.gambilldataengineering.com
—
Chapters:
00:00 Intro – Why the Cloud Changed Everything
00:39 What We’ll Cover in This Lesson
00:58 What Is the Modern Data Stack?
01:34 Databases vs. Data Warehouses
02:18 Cloud Storage = Digital Basements
03:02 Data Processing Tools (Spark, dbt, Pandas)
03:24 Airflow: The Data Factory Supervisor
05:10 Cloud Platforms: Azure, AWS, GCP
06:03 Modeling Is Like Organizing a Library
07:06 Relational, Dimensional, and NoSQL Models
09:14 When Excel Is OK!
10:10 Key Takeaways
11:20 Coming Up Next: SQL Fundamentals
#DataEngineering #ModernDataStack #ApacheSpark #dbt #Airflow #DataModeling #ETL #CloudData #SQL #Azure #AWS #GCP
Chris Gambill is a data engineering consultant and educator with 25+ years of experience helping organizations modernize their data stacks. As founder of Gambill Data, he specializes in data strategy, cloud migration, and building resilient analytics platforms for mid-market and enterprise clients. He’s passionate about making real-world data engineering accessible.
Connect with Chris on LinkedIn or learn more at gambilldata.com.