DataBytes

By Jessi & Susan

Data science, big data, artificial intelligence, machine learning… they’re all the rage. In this podcast, Jessi Cisewski-Kehe and Susan Wang, two statisticians, give you a perspective on what’s happenin...

Rating: 5 (66 ratings)


FAQs about DataBytes:

How many episodes does DataBytes have?

The podcast currently has 50 episodes available.

DataBytes episodes:

January 24, 2020 #50: Extreme Classification: All You Need Is Some Hash (Functions)
In part 2 of this saga on extreme classification, we get into the weeds on how MACH is able to magically handle such massive classification problems. The title says it all -- hash functions are the magical ingredient. We provide a step-by-step view of how one might come up with the MACH algorithm from first principles.
22min
January 17, 2020 #49: Extreme Classification: Going at MACH Speed (Part 1)
In this episode, Dr. Derek Feng drops by to chat about a recent paper on a divide-and-conquer approach (Merged-Averaged Classifiers via Hashing) to massive classification problems. In part 1 (of 2 episodes), we describe the general problem that MACH solves and the strategy it takes, wherein the original large classification problem is broken down into smaller classification problems. Next week, in the second episode, we talk about the more technical details of how the division of labor works, and why it works.
17min
December 14, 2019 #48: Where Moneyball Meets Footy
We've long heard about the waves that statistics has made in baseball. But what about soccer? In this episode, we summarize a few applications of statistics in European football (or American soccer).
17min
November 30, 2019 #47: Domoic Acid Testing -- A Crabshoot?
Domoic acid has plagued shellfish and other wildlife along the Pacific coastline in recent years. Testing for domoic acid concentration in crabs on a regular basis has become important for determining when crabs and their viscera can be safely consumed. Unlike many other common hypothesis tests, the setup used for domoic acid testing is based on the sample maximum rather than the sample mean. In this episode, we critique the testing methodology.
19min
November 08, 2019 #46: Finding Your (Niche) Board Games
In this episode, we discuss how two statisticians used data from BoardGameGeek.com to put together their own board game recommendation engine, specifically designed to stay away from mainstream recommendations.
13min
November 01, 2019 #45: Learning Publicly, with Private Data
In this episode, Dr. Derek Feng discusses the general issue of data privacy in the age of big data, including topics of differential privacy and federated learning.
17min
October 25, 2019 #44: A Conversation with Jon Krohn
We sit down with Dr. Jon Krohn to chat about his work as a Chief Data Scientist at untapt, his newly published bestseller "Deep Learning Illustrated", and his teaching/research.
34min
October 04, 2019 #43: To Google and Back
In this episode, Professor Albert Y. Kim of Smith College describes his post-PhD journey, which included a stint at Google Adwords before academic posts at Reed College, Middlebury College, Amherst College, and Smith College.
30min
September 27, 2019 #42: Black in the Box
Dr. Derek Feng joins us again to discuss the two metrics by which we align all statistical/machine learning methods -- interpretability versus predictive ability. In a world where black box methods reign supreme, what does learning mean?
23min
September 20, 2019 #41: What to do with Outliers
Guest Dylan O'Connell joins us today to talk about a recent surprising but legitimate Democratic primary poll result from Monmouth University. We discuss different perspectives on how to approach a data point that doesn't fit in with the others.
23min
