AI Engineering Podcast

Shedding Light On Silent Model Failures With NannyML


Listen Later

Summary
Because machine learning models are constantly interacting with inputs from the real world they are subject to a wide variety of failures. The most commonly discussed error condition is concept drift, but there are numerous other ways that things can go wrong. In this episode Wojtek Kuberski explains how NannyML is designed to compare the predicted performance of your model against its actual behavior to identify silent failures and provide context to allow you to determine whether and how urgently to address them.
Announcements
  • Hello and welcome to the Machine Learning Podcast, the podcast about machine learning and how to bring it from idea to delivery.
  • Data powers machine learning, but poor data quality is the largest impediment to effective ML today. Galileo is a collaborative data bench for data scientists building Natural Language Processing (NLP) models to programmatically inspect, fix and track their data across the ML workflow (pre-training, post-training and post-production) – no more excel sheets or ad-hoc python scripts. Get meaningful gains in your model performance fast, dramatically reduce data labeling and procurement costs, while seeing 10x faster ML iterations. Galileo is offering listeners a free 30 day trial and a 30% discount on the product there after. This offer is available until Aug 31, so go to themachinelearningpodcast.com/galileo and request a demo today!
  • Your host is Tobias Macey and today I’m interviewing Wojtek Kuberski about NannyML and the work involved in post-deployment data science
Interview
  • Introduction
  • How did you get involved in machine learning?
  • Can you describe what NannyML is and the story behind it?
  • What is "post-deployment data science"? 
    • How does it differ from the metrics/monitoring approach to managing the model lifecycle?
    • Who is typically responsible for this work? How does NannyML augment their skills?
    • What are some of your experiences with model failure that motivated you to spend your time and focus on this problem?
  • What are the main contributing factors to alert fatigue for ML systems?
  • What are some of the ways that a model can fail silently? 
    • How does NannyML detect those conditions?
  • What are the remediation actions that might be necessary once an issue is detected in a model?
  • Can you describe how NannyML is implemented? 
    • What are some of the technical and UX design problems that you have had to address?
    • What are some of the ideas/assumptions that you have had to re-evaluate in the process of building NannyML?
  • What additional capabilities are necessary for supporting less structured data?
  • Can you describe what is involved in setting up NannyML and how it fits into an ML engineer’s workflow? 
    • Once a model is deployed, what additional outputs/data can/should be collected to improve the utility of NannyML and feed into analysis of the real-world operation?
  • What are the most interesting, innovative, or unexpected ways that you have seen NannyML used?
  • What are the most interesting, unexpected, or challenging lessons that you have learned while working on NannyML?
  • When is NannyML the wrong choice?
  • What do you have planned for the future of NannyML?
Contact Info
  • LinkedIn
Parting Question
  • From your perspective, what is the biggest barrier to adoption of machine learning today?
Closing Announcements
  • Thank you for listening! Don’t forget to check out our other shows. The Data Engineering Podcast covers the latest on modern data management. Podcast.__init__ covers the Python language, its community, and the innovative ways it is being used.
  • Visit the site to subscribe to the show, sign up for the mailing list, and read the show notes.
  • If you’ve learned something or tried out a project from the show then tell us about it! Email [email protected]) with your story.
  • To help other people find the show please leave a review on iTunes and tell your friends and co-workers
Links
  • NannyML
  • F1 Score
  • ROC Curve
  • Concept Drift
  • A/B Testing
  • Jupyter Notebook
  • Vector Embedding
  • Airflow
  • EDA == Exploratory Data Analysis
  • Inspired book (affiliate link)
  • ZenML
The intro and outro music is from Hitman’s Lovesong feat. Paola Graziano by The Freak Fandango Orchestra/CC BY-SA 3.0
...more
View all episodesView all episodes
Download on the App Store

AI Engineering PodcastBy Tobias Macey

  • 4.3
  • 4.3
  • 4.3
  • 4.3
  • 4.3

4.3

6 ratings


More shows like AI Engineering Podcast

View all
The Cloudcast by Massive Studios

The Cloudcast

153 Listeners

a16z Podcast by Andreessen Horowitz

a16z Podcast

994 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

629 Listeners

Super Data Science: ML & AI Podcast with Jon Krohn by Jon Krohn

Super Data Science: ML & AI Podcast with Jon Krohn

296 Listeners

NVIDIA AI Podcast by NVIDIA

NVIDIA AI Podcast

322 Listeners

Data Engineering Podcast by Tobias Macey

Data Engineering Podcast

139 Listeners

AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion by AI & Data Today

AI Today Podcast: Artificial Intelligence Insights, Experts, and Opinion

144 Listeners

Practical AI by Practical AI LLC

Practical AI

189 Listeners

The Stack Overflow Podcast by The Stack Overflow Podcast

The Stack Overflow Podcast

63 Listeners

Last Week in AI by Skynet Today

Last Week in AI

281 Listeners

Machine Learning Street Talk (MLST) by Machine Learning Street Talk (MLST)

Machine Learning Street Talk (MLST)

88 Listeners

No Priors: Artificial Intelligence | Technology | Startups by Conviction

No Priors: Artificial Intelligence | Technology | Startups

124 Listeners

Latent Space: The AI Engineer Podcast by swyx + Alessio

Latent Space: The AI Engineer Podcast

63 Listeners

The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis by Nathaniel Whittemore

The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis

423 Listeners

AI + a16z by a16z

AI + a16z

33 Listeners