Data Science Deployed

Data Versioning for Data Science



Today we talk about data versioning: why you should do it, what to do about humans in the loop, and how to minimize mistakes.

 

Tools mentioned:

 

DVC - https://dvc.org/

Quilt Data Versioning - https://quiltdata.com/

Apache Airflow - https://airflow.apache.org/

Apache Superset - https://superset.apache.org/

OpenProject - https://www.openproject.org/

 

----------------------------------------

 

Follow the podcast on Twitter: @dsdeployed

https://twitter.com/dsdeployed

 

----------------------------------------

 

Donny Winston

 

I help researchers do data-intensive science together.

Twitter: https://twitter.com/donnywinston @donnywinston

Website: https://polyneme.xyz/

LinkedIn: https://www.linkedin.com/in/donnywinston/

 

Ben Cook

I help data science teams deploy their algorithms because a machine learning model is only as good as the system that delivers it.

 

Twitter: @jbencook https://twitter.com/jbencook

LinkedIn: https://www.linkedin.com/in/jbencook/

Website: https://sparrow.dev/

 

Jillian Rowe

I help biotech startups deploy scalable high performance compute infrastructure on AWS.

 

Website: https://www.dabbleofdevops.com

Twitter: https://twitter.com/jillianerowe

LinkedIn: https://www.linkedin.com/in/jillian-rowe-9410437a/

