Dear Analyst

Dear Analyst #90: Biostatistics, public health, and the #1 strategy to land a job in data with Tyler Vu




You go to a family gathering and everyone is fawning over your cousin who has a cushy stats job at Harvard. Knowing your cousin, you think to yourself: if my cousin can do it, so can I. Next thing you know, you are a research fellow at Harvard University. Tyler Vu was studying applied math at Cal State Fullerton and didn't realize he had a passion for Biostatistics until his fellowship at Harvard. He is currently getting his PhD in Biostatistics at UCSD and is the youngest person to ever pursue a PhD in Biostats at UCSD. In this episode we talk about doing network analysis for the public health sector, facial/voice recognition, and Tyler's #1 strategy he thinks everyone should use to land their next job or internship in data.







Predicting HIV rates when you are missing data



I'm a neophyte in the data science and machine learning space, so Tyler definitely veered into concepts that were quite foreign to me as he discussed his current PhD thesis. His thesis involves analyzing social networks in a public health context, where a lot of the data is missing. We talk about why finding the HIV rate in a sample is different from estimating other metrics from a sample.



For instance, if you want to get the average height of people in the U.S., you pick a random sample of people, find the average height, and extrapolate this to the rest of the population (roughly). This is a straightforward analysis since each person's height is independent of everyone else's.
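To make the height example concrete, here is a minimal sketch in Python. The population is synthetic (the 170 cm average, the 10 cm spread, and the sample size of 1,000 are all numbers I made up for illustration), and this is just the textbook sample-mean estimate, not anything specific from Tyler's work.

```python
import random
import statistics

random.seed(0)  # fixed seed so the run is repeatable

# Synthetic "population" of heights in cm. These numbers are invented
# for illustration and are not real U.S. height data.
population = [random.gauss(170, 10) for _ in range(100_000)]

# Simple random sample without replacement: every person is equally
# likely to be picked, and one person's height tells you nothing
# about another's.
sample = random.sample(population, 1_000)

# The sample average is the estimate of the population average.
sample_mean = statistics.mean(sample)

# The standard error says roughly how far off that estimate might be.
standard_error = statistics.stdev(sample) / len(sample) ** 0.5

population_mean = statistics.mean(population)

print(f"sample mean:     {sample_mean:.1f} cm (+/- {standard_error:.1f})")
print(f"population mean: {population_mean:.1f} cm")
```

The standard error formula used here only holds because the observations are independent, which is exactly the assumption that breaks down once people are connected in a network.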



In the case of public health, people are connected via social networks. With HIV, predicting whether someone tests positive or negative depends on the people they are connected with and whether those people have tested positive or negative. In this type of analysis there's a lot of bias and "non-parametric estimation of network properties," according to Tyler. I'm not even going to pretend I know what these terms mean. There's actually very little published work on this subject, so Tyler's thesis would add a lot to the current research.
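Here is a toy sketch of why network data breaks the independence assumption. Everything in it is hypothetical: the eight people, their connections, and who is positive are made up, and this is only an illustration of the dependence idea, not Tyler's method.

```python
# Toy contact network (made-up data): each person maps to their contacts.
contacts = {
    "A": ["B", "C"],
    "B": ["A", "C"],
    "C": ["A", "B", "D"],
    "D": ["C", "E"],
    "E": ["D", "F"],
    "F": ["E", "G"],
    "G": ["F", "H"],
    "H": ["G"],
}

# Hypothetical test results: positives cluster in one corner of the network.
positive = {"A", "B", "C"}

# Rate you would see if you treated everyone as an independent draw.
overall_rate = len(positive) / len(contacts)

# Rate among the contacts of people who tested positive.
neighbor_statuses = [
    contact in positive
    for person in positive
    for contact in contacts[person]
]
neighbor_rate = sum(neighbor_statuses) / len(neighbor_statuses)

print(f"overall positive rate:               {overall_rate:.2f}")
print(f"positive rate among their contacts:  {neighbor_rate:.2f}")
```

In this made-up network, knowing that someone's contact is positive changes the odds that they are positive too. If you recruit people for a study by following connections, the people you reach are not a random sample, which is one way bias can creep in.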



Source: Alteryx community



Training a voice and face machine learning model



Tyler has a history of working on one-of-a-kind projects. During his undergrad years, he worked on a project that combined face and voice recognition. Kind of like having a double authenticator system if you wanted to unlock an iPhone, for instance. Since you're combining both image and voice features to train a model, it creates a "highly dimensional problem."







Tyler helped code the project entirely in MATLAB. Given the tools and frameworks available, Tyler was pleasantly surprised by the speed with which they were able to go from hypothesis to working app on this project.



Predicting "fragile" countries



During Tyler's research at Harvard, he worked on a project to help predict which countries will become "fragile." This is the definition of a "fragile state" according to the United States Institute of Peace:



Each fragile state is fragile in its own way, but they all face significant governance and economic challenges. In fragile states, governments lack legitimacy in the eyes of citizens, and institutions struggle or fail to provide basic public goods—security, justice, and rudimentary services—and to manage political conflicts peacefully. 



The project's aim was to predict which countries might become fragile in the future so that the governments could better plan for these...

Dear Analyst by KeyCuts
