September 22, 2020

MLOps Coffee Sessions #10 Analyzing the Article “Continuous Delivery and Automation Pipelines in Machine Learning” // Part 2

1 hour 7 minutes

In this second installment, David and Demetrios review the Google paper on continuous training and automated pipelines. They dive deep into machine learning monitoring and what continuous training actually entails. Some key highlights:

Automatically retraining and serving the models:
When to do it?
Outlier detection
Drift detection

Outlier detection:
What is it?
How do you deal with it?
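
As a rough illustration of what outlier detection can look like, here is a minimal sketch using the interquartile range (IQR) rule. The episode does not prescribe a method; the function name `iqr_outliers`, the multiplier `k=1.5`, and the sample readings are all assumptions made for the example.

```python
import statistics


def iqr_outliers(values, k=1.5):
    """Return the values lying outside [Q1 - k*IQR, Q3 + k*IQR].

    k=1.5 is a common convention, used here only as an illustrative default.
    """
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v for v in values if v < low or v > high]


# Hypothetical feature values arriving at the prediction service.
readings = [10, 12, 11, 13, 12, 11, 10, 95]
print(iqr_outliers(readings))
```

How to deal with a flagged value (drop it, cap it, or route it for review) depends on the domain, so the sketch only reports the outliers and leaves that decision to the caller.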

Drift detection:
Individual features may start to drift. This could be a bug, or it could be perfectly normal behavior that indicates that the world has changed, requiring the model to be retrained.
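
One way to check whether a single feature has drifted is to compare its distribution at training time with its distribution in recent traffic. The sketch below computes the two-sample Kolmogorov-Smirnov statistic with the standard library only. It is an illustration, not the method discussed in the episode; the names `ks_statistic` and `has_drifted`, the threshold of 0.2, and the synthetic data are assumptions.

```python
import bisect
import random


def ks_statistic(reference, current):
    """Largest gap between the two empirical CDFs (two-sample KS statistic)."""
    ref, cur = sorted(reference), sorted(current)
    gap = 0.0
    for x in ref + cur:
        cdf_ref = bisect.bisect_right(ref, x) / len(ref)
        cdf_cur = bisect.bisect_right(cur, x) / len(cur)
        gap = max(gap, abs(cdf_ref - cdf_cur))
    return gap


def has_drifted(reference, current, threshold=0.2):
    # The threshold is an illustrative assumption; tune it per feature.
    return ks_statistic(reference, current) > threshold


# Synthetic data standing in for one feature's values.
rng = random.Random(0)
training = [rng.gauss(0.0, 1.0) for _ in range(1000)]
same_world = [rng.gauss(0.0, 1.0) for _ in range(1000)]
changed_world = [rng.gauss(1.0, 1.0) for _ in range(1000)]

print(has_drifted(training, same_world))
print(has_drifted(training, changed_world))
```

A flag raised by a check like this says only that the distribution moved. Whether the cause is a bug or a real change in the world still needs a person to investigate.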

Example changes:
shifts in people’s preferences
marketing campaigns
competitor moves
the weather
the news cycle
locations
time
devices (clients)

If the world you're working with is changing over time, model deployment should be treated as a continuous process. What this tells me is that you should keep the data scientists and engineers working on the model instead of immediately moving to another project.

Deeper dive into concept drift
Feature/target distributions change

An overview of concept drift applications: “.. data analysis applications, data evolve over time and must be analyzed in near real time. Patterns and relations in such data often evolve over time; thus, models built for analyzing such data quickly become obsolete over time. In machine learning and data mining, this phenomenon is referred to as concept drift.”
https://www.win.tue.nl/~mpechen/publications/pubs/CD_applications15.pdf
https://www-ai.cs.tu-dortmund.de/LEHRE/FACHPROJEKT/SS12/paper/concept-drift/tsymbal2004.pdf

Types of concept drift:
Sudden
Gradual

Google is, in some way, trying to address this concern: the world is changing, and you want your ML system to change with it, so that it avoids degraded performance and also improves over time and adapts to its environment. This sort of robustness is necessary for certain domains.
Continuous delivery and automation of pipelines (data, training, prediction service) were built with this in mind. The goals are to minimize the commit-to-deploy interval and to maximize the velocity of software delivery and its components: maintainability, extensibility, and testability.
Once the pipeline is ready, you can run it, and you can do so continuously. After the pipeline is deployed to the production environment, it is executed automatically and repeatedly to produce a trained model that is stored in a central model registry.
This pipeline should be able to run on a schedule or in response to triggers: events that you have configured for your business domain, such as new data arriving or a drop in performance of the production model.
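
The schedule and trigger conditions can be sketched as a small decision function. Everything here is hypothetical: the `PipelineState` fields, the function name `retrain_reasons`, and the default thresholds (7 days, 10,000 rows, a 0.05 accuracy drop) are assumptions chosen for illustration, not values from the paper or the episode.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class PipelineState:
    last_run: datetime
    new_rows_since_last_run: int
    baseline_accuracy: float  # accuracy measured when the model was trained
    live_accuracy: float      # accuracy currently observed in production


def retrain_reasons(state, now, max_age=timedelta(days=7),
                    min_new_rows=10_000, max_accuracy_drop=0.05):
    """Return the list of triggers that fired; an empty list means no retrain."""
    reasons = []
    if now - state.last_run >= max_age:
        reasons.append("schedule")
    if state.new_rows_since_last_run >= min_new_rows:
        reasons.append("new_data")
    if state.baseline_accuracy - state.live_accuracy > max_accuracy_drop:
        reasons.append("performance_drop")
    return reasons


state = PipelineState(
    last_run=datetime(2020, 9, 1),
    new_rows_since_last_run=2_500,
    baseline_accuracy=0.92,
    live_accuracy=0.85,
)
print(retrain_reasons(state, now=datetime(2020, 9, 3)))
```

Returning the reasons rather than a bare boolean keeps a record of why each run started, which is useful when reviewing the pipeline's history later.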
The link between the model artifact and the pipeline is never severed. What pipeline trained it? What data was extracted and validated, and how was it prepared? What was the training configuration, and how was the model evaluated? Metrics are key here, and so is lineage tracking.
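
A minimal sketch of lineage tracking follows, assuming a toy in-memory registry. The `LineageRecord` fields, the `fingerprint` helper, the `ModelRegistry` class, and all sample values are hypothetical and do not correspond to any specific product's API. The sketch only shows the idea of storing the answers to those questions alongside each model.

```python
import hashlib
import json
from dataclasses import dataclass


@dataclass(frozen=True)
class LineageRecord:
    model_id: str
    pipeline_version: str   # which pipeline trained the model
    data_fingerprint: str   # which data it was trained on
    training_config: dict   # how it was trained
    eval_metrics: dict      # how it was evaluated


def fingerprint(rows):
    """SHA-256 digest of the training rows, stable across dict key order."""
    payload = json.dumps(rows, sort_keys=True).encode("utf-8")
    return hashlib.sha256(payload).hexdigest()


class ModelRegistry:
    """Toy in-memory stand-in for a central model registry."""

    def __init__(self):
        self._records = {}

    def register(self, record):
        self._records[record.model_id] = record

    def lineage(self, model_id):
        return self._records[model_id]


rows = [{"clicks": 3, "label": 1}, {"clicks": 0, "label": 0}]
registry = ModelRegistry()
registry.register(LineageRecord(
    model_id="model-001",
    pipeline_version="pipeline-v3",
    data_fingerprint=fingerprint(rows),
    training_config={"learning_rate": 0.1, "epochs": 5},
    eval_metrics={"accuracy": 0.92},
))
print(registry.lineage("model-001").pipeline_version)
```

Hashing the data instead of copying it keeps the record small while still letting you verify later whether two models saw the same training set.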
Keeping a close tie between the dev/experiment pipeline and the continuous production pipeline helps avoid inconsistencies between the model artifacts produced by the pipeline and the models being served, which are hard to debug.

Join our Slack community: https://go.mlops.community/slack
Follow us on Twitter: @mlopscommunity
Sign up for the next meetup: https://go.mlops.community/register

Connect with Demetrios on LinkedIn: https://www.linkedin.com/in/dpbrinkm/
Connect with David on LinkedIn: https://www.linkedin.com/in/aponteanalytics/
Connect with Chris Sterry on LinkedIn: https://www.linkedin.com/in/chrissterry/

By Demetrios