Shownotes
In this episode of Pain Points and Pull Requests, Fatimah and Carla walk you through their term projects for an undergraduate data mining course. Carla discusses her team's work on predicting whether a homicide will be solved, and Fatimah discusses how she and her team analyzed the effects of COVID-19 on homeless shelter occupancy.
Timestamps
0:45 Part 1: Predicting Whether a Crime Is Solved Using ML
3:23 Data Analysis
7:00 Working with Imbalanced Datasets
10:21 Decision Tree and Logistic Regression Predictions
13:22 Part 2: Analyzing the Effect of COVID-19 on Homeless Shelter Occupancy
18:52 Data Analysis and Data Preparation
22:17 Cofactor Analysis
26:41 Association Rule Mining
Extra Resources
Techniques to Handle Imbalanced Datasets
https://towardsdatascience.com/the-5-most-useful-techniques-to-handle-imbalanced-datasets-6cdba096d55a
Source code for project #1 (predicting whether a crime will be solved):
https://github.com/CarlaLeal/Predicting-Whether-a-Crime-Will-Be-Solved
Source code for project #2 (analyzing the effect of COVID-19 on homeless shelter occupancy):
https://github.com/FatimahAreola/TorontoSheltersAnalysis
CBC article: City has far fewer homeless shelter beds than it says it has
https://www.cbc.ca/news/canada/toronto/toronto-shelter-space-1.5808905
University of Calgary study on the effects of weather on homeless shelter occupancy
https://journalhosting.ucalgary.ca/index.php/sppp/article/view/42500/30390