The MLOps Podcast

By Dean Pleban @ DagsHub

A podcast from DagsHub about bringing machine learning into the real world. Each episode features a conversation with top data science and machine learning practitioners, who'll share their thoughts, ... more

· Technology

FAQs about The MLOps Podcast:

How many episodes does The MLOps Podcast have?

The podcast currently has 35 episodes available.

The MLOps Podcast episodes:

November 04, 2021 🎓 MLOps lessons learned helping companies build their ML systems with Lee Harper, Lead DS at Catapult
In this episode, I'm speaking with Lee Harper, Principal Data Scientist at Catapult Systems. Lee holds a Ph.D. in Physical and Theoretical Chemistry. Lee is a teacher-turned-data scientist. We cover the various entry paths into the world of data science, the value of background diversity, security in ML production, and even AI fairness.
Join our Discord community: https://discord.gg/tEYvqxwhah
---
Timestamps:
00:00 Podcast intro

01:00 Guest introduction

01:39 How did you get into the fields of data science and machine learning?

05:04 Coding boot camps vs. academia & diversity of backgrounds in ML

09:37 How has the process of bringing your work into production changed over the years?

13:02 How has the change in the languages used for data science affected production processes?

16:01 How do you accelerate the timeframes for getting from POC to production in ML?

18:19 Do data scientists reinvent the wheel more often than software developers, and why?

22:14 The value of learning how to Google

23:00 Recurring themes, challenges, and common issues in data science

27:50 Solving for security in ML in production

31:57 ML security considerations for startups

34:30 Data security considerations in ML

35:18 What is the most interesting topic in machine learning right now?

38:05 ML fairness, bias, and responsible AI

41:44 What does it mean to build a fair or unbiased model?

47:15 If you had to choose one challenge in bringing models to production, what would it be?

51:00 What are the tools and processes that you use to make the transition to production easier?

55:35 About "vendor lock-in"

58:00 Your favorite tool recommendations

1:03:35 Recommendations for the audience
---
Relevant Links:
Linux Command Line and Shell Scripting Bible – https://www.amazon.com/Linux-Command-Shell-Scripting-Bible/dp/1119700914

Project Hail Mary – https://www.amazon.com/Project-Hail-Mary-Andy-Weir/dp/0593135202
Social Links:
https://www.linkedin.com/company/dagshub/

https://www.linkedin.com/company/catapult-systems/

https://www.linkedin.com/in/leeharper2425/

https://twitter.com/DeanPlbn

https://twitter.com/TheRealDAGsHub
1h 9min
September 20, 2021 🧠 Algorithmic challenges in bringing ML models into production with Roey Mechrez, CTO at BeyondMinds
In this episode, I'm speaking with Roey Mechrez from BeyondMinds. Roey holds a Ph.D. in Electrical Engineering, with vast experience in computer vision and deep learning research. We discuss the challenges of gluing together infrastructure solutions for an end-to-end ML platform, as well as generating monitoring insights for non-technical stakeholders and combating catastrophic forgetting.
Join our Discord community: https://discord.gg/tEYvqxwhah
---
Timestamps:
00:00 Podcast intro

01:00 Guest intro

01:49 What does BeyondMinds do?

06:24 Audience for an end-to-end ML platform

12:14 Communicating with non-technical stakeholders/users

15:03 The future of "AI-powered tools", and human-machine collaboration

20:04 On complex system orchestration, generating insights from monitoring, and catastrophic forgetting – Biggest challenges in production ML

25:23 Why is catastrophic forgetting a hard problem and how do you deal with it?

30:02 "Secret" tips on how to get started with automating the retraining process

33:30 Generating monitoring insights and observations in a user-friendly format

38:12 Making data labeling issues explainable (automatically)

45:07 Customizing complex systems per user – Orchestrating an ML platform

52:58 API design in ML platform components

55:45 Measuring success for researchers, ML engineers, and software developers – can ML work fit into the Agile workflow?

1:02:22 Is "time to production" a good metric? Gains in time to production in the real world

1:06:02 How do you divide the work between ML researchers and engineers?

1:08:39 Recommendations for the audience
---
Relevant Links:
A16z blog about AI

Data Science work in an agile environment – A talk by Dima Goldenberg

Hayot Kis (Hebrew Podcast) חיות כיס

Data Engineering Podcast

ACX Podcast
Social Links:
https://www.linkedin.com/company/beyondminds/

https://www.linkedin.com/company/dagshub/

https://twitter.com/roeyme

https://twitter.com/DeanPlbn

https://twitter.com/TheRealDAGsHub
1h 14min
August 11, 2021 🐤 Feature stores and CI/CD for machine learning with Qwak.ai VP Engineering, Ran Romano
In this episode, I'm speaking with Ran Romano from Qwak.ai. Ran built the ML platform at Wix, and we discuss the various data roles, when organizations should focus on ML infrastructure, solving the hard problems of feature stores, and one approach to building an end-to-end ML platform.
Join our Discord community: https://discord.gg/tEYvqxwhah

---
Timestamps:
00:00 Podcast intro
01:00 Guest intro
01:30 Getting into the world of ML and ML Engineering
02:25 The line between Data Engineer, ML Engineer, and Data Scientist
03:50 The future of data roles – what are the trends?
07:21 The most exciting part about taking ML models into production
09:45 Jupyter notebooks in production (again??)
10:41 Signs that notebook productionization might not work
11:42 Building ML-focused CI/CD systems
15:32 Early days of building out the Wix ML platform
16:22 Signs that you might need to focus on ML infrastructure in your organization, and how to convince other stakeholders.
19:21 What part of the platform that you built are you most proud of?
23:51 Defining a feature store and the training/serving skew
27:24 Onboarding data scientists to using a feature store
33:49 When is it too early to build an ML platform?
35:33 Open source components – What parts of your platform did you choose not to build yourself?
40:16 Qwak.ai – What are you working on currently?
41:07 How do you define an "end-to-end" platform in the case of Qwak
44:25 End-to-end vs. Integrated – Advantages and disadvantages

---
Relevant Links:
- Qwak.ai: https://www.qwak.ai
- Wix ML Platform presentation by Ran: https://www.youtube.com/watch?v=E8839ENL-WY

- https://www.linkedin.com/company/dagshub
- https://www.linkedin.com/company/qwak-ai/

- https://twitter.com/TheRealDAGsHub
- https://twitter.com/DeanPlbn
- https://twitter.com/ranvromano
46min
July 04, 2021 🤗 Large ML models in production with HuggingFace CTO Julien Chaumond
In this episode, I'm speaking with Julien Chaumond from 🤗 HuggingFace, about how they got started, getting large language models to production in millisecond inference times, and the CERN for machine learning.
Join our Discord community: https://discord.gg/tEYvqxwhah
---
Timestamps:
01:00 - Guest intro

02:14 - Origin of HuggingFace

05:37 - Why the focus on NLP?

07:45 - The success of the HuggingFace community

13:14 - Reproducing models and scaling for the community

18:14 - Enabling large models in production

23:14 - How HuggingFace scales so many models

27:34 - The biggest challenge HuggingFace solved in MLOps

32:02 - How HuggingFace transitions from research to production

34:44 - Using notebooks vs python modules

38:27 - The most interesting topic in ML production

40:10 - Fascinating ML research

45:24 - Learning new things

51:14 - Something that is true but most people disagree with

56:54 - Tips to organize research teams

1:00:05 - New features for accelerated inference

1:01:35 - Most common use case of HuggingFace

1:04:17 - Integrating search algorithms into transformer library

1:05:09 - Integrating vision models

1:06:06 - Long term business model

1:10:55 - Automation and simplification of the process of building models

1:13:02 - Support for real-time inference

1:14:40 - Recommendations for the audience
---
Relevant Links:
FastDS: https://github.com/DAGsHub/fds

BigScience: https://bigscience.huggingface.co

https://www.linkedin.com/company/dagshub/

https://www.linkedin.com/company/huggingface/

https://twitter.com/TheRealDAGsHub

https://twitter.com/huggingface
1h 20min
April 27, 2021 🛣 Finding your path in ML with NLP Engineer Urszula Czerwinska
In this episode, I'm speaking with Urszula Czerwinska about her path as a data scientist, the projects she worked on, experiences gained as a data scientist, as well as the challenges she's overcome in bringing her machine learning (ML) into production.
Join our Discord community: https://discord.gg/tEYvqxwhah

---

Timestamps:
0:00 - Podcast intro
1:15 - Guest intro and how you got into data science
3:48 - Finding your fit – research or industry and when to transition
7:23 - What types of ML projects do you specialize in
10:41 - ML explainability and interpretability
15:26 - ML explainability with non-technical stakeholders
17:13 - What problems does your team solve within the organization
20:56 - ML in production – how to bring your ML projects from research to production
25:17 - The tools you can't live without
28:11 - Do you have a set process for productizing ML projects
30:08 - Team structures and communication for data science teams
33:42 - Who's in charge of setting up infrastructure for a project and job title discussion
36:29 - Interesting tools and repositories you work with
39:30 - How do you stay up to date
42:00 - Biggest challenges for you in ML
45:12 - Favorite and least favorite thing about being a data scientist
49:52 - Handling a workplace that doesn't understand what a data scientist is
53:07 - Data scientists are 🦄
53:30 - Good papers you read recently
58:12 - Tips to improve the data science workflow

Relevant Links:
- flair: https://github.com/flairNLP/flair
- AllenNLP: https://github.com/allenai/allennlp
- Papers with Code: https://paperswithcode.com/
- Dair.ai newsletter: https://dair.ai/newsletter/
- HuggingFace: https://huggingface.co/blog
1h 2min
