The MLOps Podcast

🤗 Large ML models in production with HuggingFace CTO Julien Chaumond



In this episode, I speak with Julien Chaumond, CTO of 🤗 HuggingFace, about how the company got started, how to get large language models into production with millisecond inference times, and the idea of a CERN for machine learning.

Join our Discord community: https://discord.gg/tEYvqxwhah

---

Timestamps: 

  • 01:00 - Guest intro
  • 02:14 - Origin of HuggingFace
  • 05:37 - Why the focus on NLP?
  • 07:45 - The success of the HuggingFace community
  • 13:14 - Reproducing models and scaling for the community
  • 18:14 - Enabling large models in production
  • 23:14 - How HuggingFace scales so many models
  • 27:34 - The biggest challenge HuggingFace solved in MLOps
  • 32:02 - How HuggingFace transitions from research to production
  • 34:44 - Using notebooks vs. Python modules
  • 38:27 - The most interesting topic in ML production
  • 40:10 - Fascinating ML research
  • 45:24 - Learning new things
  • 51:14 - Something that is true but most people disagree with
  • 56:54 - Tips to organize research teams
  • 1:00:05 - New features for accelerated inference
  • 1:01:35 - Most common use case of HuggingFace
  • 1:04:17 - Integrating search algorithms into the Transformers library
  • 1:05:09 - Integrating vision models
  • 1:06:06 - Long term business model
  • 1:10:55 - Automation and simplification of the process of building models
  • 1:13:02 - Support for real-time inference
  • 1:14:40 - Recommendations for the audience
---

Relevant Links:

  • FastDS: https://github.com/DAGsHub/fds
  • BigScience: https://bigscience.huggingface.co
  • https://www.linkedin.com/company/dagshub/
  • https://www.linkedin.com/company/huggingface/
  • https://twitter.com/TheRealDAGsHub
  • https://twitter.com/huggingface

The MLOps Podcast, by Dean Pleban @ DagsHub