March 11, 2026

Chapter 10 - Big Data & Data Lakes

23 minutes

This episode provides a comprehensive overview of Big Data and Data Lakes, framing them as the structural foundation for modern data-driven industries. The discussion centers on the Seven V's of big data (volume, variety, velocity, veracity, variability, value, and visualization) and explains how these characteristics shift when moving from traditional databases to massive, unstructured troves of data.

Using a car insurance case study, the hosts illustrate how frameworks like MapReduce and Hadoop process chaotic telemetry data by distilling it into manageable "key-value pairs" across distributed nodes. The episode emphasizes the strategic importance of a hybrid approach, where big data insights are fed back into a data warehouse to enable proactive decision-making. Finally, the hosts use a "Farmer vs. Hunter-gatherer" analogy to contrast the disciplined structure of data warehouses with the raw, exploratory nature of Data Lakes, warning that a poorly managed lake can turn into a "data swamp".
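For readers who want to see the key-value idea in concrete form, here is a minimal single-process sketch of the map, shuffle, and reduce steps in plain Python. It is not the hosts' example and it does not use Hadoop: the telemetry records, the driver IDs, the event names, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are all hypothetical, chosen only to illustrate the pattern. In a real cluster the map and reduce steps would run in parallel on different nodes.

```python
from collections import defaultdict

# Hypothetical telemetry records: (driver_id, event_type).
# Invented for illustration; not data from the episode.
TELEMETRY = [
    ("driver_a", "hard_brake"),
    ("driver_b", "speeding"),
    ("driver_a", "hard_brake"),
    ("driver_a", "speeding"),
    ("driver_b", "speeding"),
]


def map_phase(record):
    """Turn one raw record into a key-value pair: ((driver, event), 1)."""
    driver_id, event_type = record
    yield (driver_id, event_type), 1


def shuffle(pairs):
    """Group all values by key, as the framework would between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse the values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on a single machine."""
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


event_counts = run_job(TELEMETRY)
for (driver_id, event_type), count in sorted(event_counts.items()):
    print(driver_id, event_type, count)
```

The resulting per-driver event counts are the kind of compact summary that could then be loaded into a data warehouse, which is the hybrid approach the episode describes.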

By Andrew Austin
