


MLOps Coffee Sessions #102 with Yash Sheth, Fixing Your ML Data Blindspots, co-hosted by Adam Sroka.
Join the Community: https://go.mlops.community/YTJoinIn
Get the newsletter: https://go.mlops.community/YTNewsletter
// Abstract
Improving your dataset quality is absolutely critical for effective ML. Finding errors in your datasets is generally a slow, iterative, and painstaking process.
Data scientists should be proactively fixing their models’ blind spots by improving their training data. In this talk, Yash discusses how Galileo helps data scientists identify, fix, and track data across the entire ML workflow.
// Bio
Yash is Co-founder and VP of Engineering at Galileo. Prior to starting Galileo, he spent a decade working on Automatic Speech Recognition (ASR) at Google, leading the core speech recognition platform team, which powers speech-to-text across 20+ Google products in over 80 languages, as well as thousands of businesses through the Cloud Speech API.
// MLOps Jobs board
jobs.mlops.community
MLOps Swag/Merch
https://mlops-community.myshopify.com/
// Related Links
Website: https://www.rungalileo.io/
Trade-Off: Why Some Things Catch On, and Others Don't, a book by Kevin Maney:
https://www.amazon.com/Trade-Off-Some-Things-Catch-Others/dp/0385525958
--------------- ✌️Connect With Us ✌️ -------------
Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register
Catch all episodes, blogs, newsletters, and more: https://mlops.community/
Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with Adam on LinkedIn: https://www.linkedin.com/in/aesroka/
Connect with Yash on LinkedIn: https://www.linkedin.com/in/yash-sheth-72111216/
Timestamps:
[00:00] Introduction to Yash Sheth
[02:53] Takeaways
[04:35] Why unstructured data?
[06:59] Fitting in the workflow
[10:56] Digging into the different pains
[18:23] Vision around the democratization of machine learning
[24:31] Unstructured data problem
[25:49] Galileo handling unified tools
[27:21] Calculus for ML
[28:45] Gatekeep
[29:49] Synthetic data in the unstructured data world of Galileo
[33:10] Tips for data scientists who have unstructured data but a small data set
[35:00] Benefits users get from Galileo
[37:15] Business case for dummies
[42:36] War stories
[44:49] Rapid-fire questions
[50:55] Wrap up
By Demetrios
