


This episode gives an overview of Big Data and Data Lakes, framing them as the structural foundation for modern data-driven industries. The discussion centers on the Seven V's of big data (volume, variety, velocity, veracity, variability, value, and visualization) and explains how these characteristics change when moving from traditional databases to massive troves of unstructured data.
Using a car insurance case study, the hosts illustrate how frameworks like MapReduce and Hadoop process chaotic telemetry data by distilling it into manageable "key-value pairs" across distributed nodes. The episode emphasizes the strategic importance of a hybrid approach, where big data insights are fed back into a data warehouse to enable proactive decision-making. Finally, the hosts use a "Farmer vs. Hunter-gatherer" analogy to contrast the disciplined structure of data warehouses with the raw, explorative nature of Data Lakes, warning that poor management can result in a "data swamp".
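For readers who want to see the key-value idea from the car insurance case study in concrete form, here is a minimal single-process sketch in Python. It is not taken from the episode: the telemetry records, the event names, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `run_job`) are all hypothetical, and a real Hadoop job would distribute the map and reduce steps across many nodes rather than run them in one process.

```python
from collections import defaultdict

# Hypothetical telemetry records: (driver_id, event_type).
# Real telemetry would be far larger and messier.
TELEMETRY = [
    ("driver_a", "hard_brake"),
    ("driver_b", "speeding"),
    ("driver_a", "hard_brake"),
    ("driver_a", "speeding"),
    ("driver_b", "speeding"),
]


def map_phase(record):
    """Turn one raw record into a (key, value) pair."""
    driver_id, event_type = record
    yield (driver_id, event_type), 1


def shuffle(pairs):
    """Group values by key, as the framework would do between map and reduce."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Collapse all values for one key into a single count."""
    return key, sum(values)


def run_job(records):
    """Run map, shuffle, and reduce in sequence on one machine."""
    mapped = (pair for record in records for pair in map_phase(record))
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


counts = run_job(TELEMETRY)
for (driver_id, event_type), total in sorted(counts.items()):
    print(driver_id, event_type, total)
```

The resulting per-driver event counts are the kind of compact summary that could then be loaded into a data warehouse, which is the hybrid approach the episode describes.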
By Andrew Austin