unSILOed with Greg LaBlanc

Dark Data: Why What You Don’t Know Matters feat. David Hand



We like to think we have everything we need to make decisions based on the numbers we are presented with in a data set. But any large data set is bound to have problems, and it is often the data we are missing that leads us off course unexpectedly.

David Hand has written many books, including The Improbability Principle: Why Coincidences, Miracles, and Rare Events Happen Every Day and, more recently, Dark Data: Why What You Don’t Know Matters. He is also emeritus professor of mathematics at Imperial College London.

David and Greg talk today about bias in statistics, interpreting data sets, and whether we are simply more aware of global events than we were in the past, and how that awareness affects statistics.

Episode Quotes:

Interpreting data sets:

You need an element of caution, of skepticism about the data, because, let's face it, any large data set is likely to have some problems: measurement error, duplications, missing values, missing records. So a skeptical attitude, I think, is a healthy attitude.

Observational data:

I think observational data is particularly risky, and it has to be said that the data science revolution we are currently living through is in large part driven by big observational administrative data sets: data sets which arise in the normal practice of everyday life. Running a credit card or a retail operation, for example, or a transport company, a hospital, or whatever. You're just observing what happens. You're not manipulating or intervening. And in that case, I think the opportunities for distortions are very severe. Now, whether those distortions will impact your conclusions depends on what question you're asking, but there is a great risk.

Misconceptions of big data sets:

People have this belief that with big data, massive data sets, billions of data points, there is no need to worry: the size of the data will wash all the problems away. What I say is that big data has all the problems of small data and extra problems of its own, because I think there are more opportunities for glitches to occur and problems to arise.


Show Links:


Guest Profile:

  • Faculty Profile at Imperial College London
  • Professional Profile at The British Academy


His work:

  • David Hand on Google Scholar
  • Dark Data: Why What You Don’t Know Matters
  • The Improbability Principle: Why Coincidences, Miracles, and Rare Events Happen Every Day
  • Measurement: A Very Short Introduction

