Open Source Sports
By Ron Yurko
The podcast currently has 12 episodes available.
In this episode we talk to Stephanie Kovalchik about her paper 'A Statistical Model of Serve Return Impact Patterns in Professional Tennis' (co-authored with Jim Albert). Stephanie is a Staff Data Scientist at Zelus Analytics, where she works on advanced performance valuation for multiple pro sports. Before joining Zelus, Stephanie led data science innovation for the Game Insight Group of Tennis Australia, building first-of-a-kind metrics and real-time applications with tracking data. Stephanie is the founder of the tennis analytics blog "On the T" and tweets @StatsOnTheT.
For additional references mentioned in the show:
We discuss True Shot Charts with Syracuse University Professors Justin Ehrlich and Shane Sanders. For references mentioned in the show:
We discuss An Examination of Olympic Sport Climbing Competition Format and Scoring System with Quang Nguyen (@qntkhvn). This paper won the Carnegie Mellon Sports Analytics Conference Reproducible Research Competition in November 2021.
Quang Nguyen completed his Master of Science in Applied Statistics at Loyola University Chicago in 2021. He recently spent the Spring 2022 semester working as an instructor in the Dept of Mathematics and Statistics at Loyola. Quang previously completed his undergraduate degree in Mathematics and Data Science at Wittenberg University in Springfield, Ohio. Quang's current interests include statistics in sports, data science, statistics and data science education, and reproducibility. He is a die-hard supporter of Manchester United F.C. of the English Premier League. And last but not least, Quang is excited to join the Dept of Statistics and Data Science at CMU as a first-year PhD student this coming Fall 2022.
For additional references mentioned in the show:
We discuss Grinding the Bayes: A Hierarchical Modeling Approach to Predicting the NFL Draft with Benjamin Robinson (@benj_robinson). This paper was a finalist in the Carnegie Mellon Sports Analytics Conference Reproducible Research Competition in October 2020. You can submit an abstract to enter the 2021 Reproducible Research Competition now!
Benjamin Robinson is a data scientist living in Washington, D.C. and the creator of Grinding the Mocks, where since 2018 he has used mock drafts, the wisdom of crowds, and data science to predict the NFL Draft. He is a 2012 graduate of the University of Pittsburgh with degrees in Economics and Urban Studies and earned a Master of Public Policy degree from the University of Southern California in 2014. You can follow him on Twitter @benj_robinson and find the Grinding the Mocks project at grindingthemocks.com and @GrindingMocks.
For additional references mentioned in the show:
We discuss a previous Big Data Bowl finalist paper 'Expected Hypothetical Completion Probability' (https://arxiv.org/abs/1910.12337) with authors Sameer Deshpande (@skdeshpande91) and Kathy Evans (@CausalKathy).
Sameer is a postdoctoral associate at MIT. Prior to that, he completed his Ph.D. at the Wharton School of the University of Pennsylvania. He is broadly interested in Bayesian methods and causal inference. He is a long-suffering but unapologetic fan of America's Team. He's also a fan of the Dallas Mavericks.
Kathy is the Director of Strategic Research for the Toronto Raptors. She completed her Ph.D. in Biostatistics at Harvard University. She doesn't have an opinion on Frequentist vs Bayesian or R vs Python, but will get very upset if Rise of Skywalker is your favorite Star Wars movie.
For additional references mentioned in the show:
We discuss Bang the Can Slowly: An Investigation into the 2017 Houston Astros with Ryan Elmore (@rtelmore) and Gregory J. Matthews (@StatsInTheWild). This paper was the winner of the Carnegie Mellon Sports Analytics Conference Reproducible Research Competition in October 2020.
Ryan Elmore is an Assistant Professor in the Department of Business Information and Analytics in the Daniels College of Business at the University of Denver (DU). He earned his Ph.D. in statistics at Penn State University and worked as a Senior Scientist at the National Renewable Energy Laboratory prior to DU. He has over 20 peer-reviewed publications in outlets such as Journal of the American Statistical Association, Biometrika, The American Statistician, Big Data, Journal of Applied Statistics, and Journal of Sports Economics. He is currently an Associate Editor for the Journal of Quantitative Analysis in Sports and recently organized the conference “Rocky Mountain Symposium on Analytics in Sports” hosted at DU.
Gregory Matthews completed his Ph.D. in statistics at the University of Connecticut in 2011. From 2011 to 2014, he was a post-doc in the School of Public Health at the University of Massachusetts-Amherst. Since 2014, he has been a professor of statistics at Loyola University Chicago. He was recently promoted to Associate Professor with tenure in March 2020.
For additional references mentioned in the show:
We discuss 'How often does the best team win? A unified approach to understanding randomness in North American sport' with Michael Lopez. Michael Lopez (@StatsbyLopez) is the Director of Football Data and Analytics at the National Football League and a Lecturer of Statistics and Research Associate at Skidmore College. At the National Football League, his work centers on how to use data to enhance and better understand the game of football.
For additional references mentioned in the show:
We discuss 'Player Chemistry: Striving for a Perfectly Balanced Soccer Team' with Lotte Bransen. This paper builds on the VAEP framework previously introduced by Lotte and her colleagues, in order to quantify player chemistry. Our discussion covers details of the paper along with general challenges of estimating player chemistry in soccer and other sports, as well as the importance of interpretable machine learning.
Lotte Bransen (@LotteBransen) is a Lead Data Scientist at SciSports, where she leads the Data Analytics team that develops analytical tools to derive actionable insights from soccer data. An avid soccer player herself, Lotte primarily works on developing machine learning models to measure the impact of soccer players’ in-game actions and decisions on the courses and outcomes of matches. Prior to SciSports, Lotte obtained a Master of Science degree in Econometrics & Management Science from Erasmus University Rotterdam and a Bachelor of Science degree in Mathematics from Utrecht University.
References:
In the third episode of the show we discuss 'Competing process hazard function models for player ratings in ice hockey' with two guests, Andrew Thomas and Sam Ventura. The discussion ranges from paper details to thoughts on modeling in hockey and sports in general.
Andrew Thomas (@acthomasca) is the Director of Data Science for SMT (SportsMEDIA Technology), and former lead hockey researcher for the Minnesota Wild. He received his PhD in Statistics at Harvard University.
Sam Ventura is the Director of Hockey Research for the Pittsburgh Penguins, and an affiliated faculty member at Carnegie Mellon's Statistics & Data Science department, where he received his PhD in Statistics. Along with Andrew, he is the co-creator of war-on-ice.com and nhlscrapr. Additionally, he is the co-creator of nflscrapr with Maksim Horowitz and Ron Yurko, which no longer works...
Additional resources mentioned include:
In the second episode we discuss two papers by our guest Daniel Daly-Grafstein and Luke Bornn: Rao-Blackwellizing field goal percentage (published in JQAS and available at: http://www.lukebornn.com/papers/dalygrafstein_jqas_2019.pdf) and Using In-Game Shot Trajectories to Better Understand Defensive Impact in the NBA (available at: https://arxiv.org/pdf/1905.00822.pdf).
Daniel is currently a soccer data analyst at Sportlogiq, a sports AI company that, in soccer, focuses on generating tracking data using computer vision. The papers discussed in this episode were part of Daniel’s Master's degree in statistics at Simon Fraser University. In the fall, Daniel is going to be starting his PhD in Statistics at the University of British Columbia.
Additional resources mentioned in the show:
Also, you should read the Wikipedia page on the Rao-Blackwell theorem: https://en.wikipedia.org/wiki/Rao%E2%80%93Blackwell_theorem