The Real Python Podcast

Improving Classification Models With XGBoost


How can you improve a classification model while avoiding overfitting? Once you have a model, what tools can you use to explain it to others? This week on the show, we talk with author and Python trainer Matt Harrison about his new book Effective XGBoost: Tuning, Understanding, and Deploying Classification Models.

Matt talks about the process of developing the book and how he wanted it to be an interactive experience for the reader. He explains the concept of gradient boosting and provides metaphors for developing a model. He shares his appreciation for exploratory data analysis as a crucial step in understanding your data.

He also shares additional libraries to help you explain your model. We discuss how difficult it can be to develop a story about how the model works that you can share with stakeholders.

He illustrates why covering the complete process is essential, from exploring data and building a model to finally deploying it. He shares many of the tools he found along the way.

This week’s episode is brought to you by Scout APM.

Course Spotlight: Starting With Linear Regression in Python

In this video course, you’ll get started with linear regression in Python. Linear regression is one of the fundamental statistical and machine learning techniques, and Python is a popular choice for machine learning.

Topics:

  • 00:00:00 – Introduction
  • 00:02:16 – Starting on the book
  • 00:04:36 – What is tabular prediction?
  • 00:06:50 – Who could leverage XGBoost?
  • 00:09:46 – Background to get started
  • 00:11:50 – Using XGBoost to explore data
  • 00:21:06 – Sponsor: Scout APM
  • 00:21:54 – Focusing on using the tool
  • 00:26:37 – Not being a developer
  • 00:30:53 – Contrasting XGBoost and logistic regression
  • 00:41:57 – Video Course Spotlight
  • 00:43:21 – Using SHAP to explain the model
  • 00:48:06 – Working with hyperparameters
  • 00:51:40 – Deploying your model
  • 00:53:09 – XGBoost Feature Interactions Reshaped (XGBFIR)
  • 00:55:47 – Communicating the story of a model
  • 00:57:57 – How to find the book
  • 00:59:07 – What are you excited about in the world of Python?
  • 01:02:46 – What do you want to learn next?
  • 01:03:12 – How can people follow what you do online?
  • 01:03:59 – Thanks and goodbye

Show Links:

  • MetaSnake - Custom Python Training
  • Effective XGBoost Book - Store Link (Discount expires end of September 2023)
  • XGBoost Documentation — xgboost 1.7.6 documentation
  • Gradient boosting - Wikipedia
  • SHAP (SHapley Additive exPlanations) Documentation
  • Hyperopt Documentation
  • MLflow - A platform for the machine learning lifecycle
  • xgbfir: XGBoost Feature Interactions Reshaped
  • Mojo 🔥: Programming language for all of AI
  • MetaSnake - Blog
  • 🐍 Matt Harrison - LinkedIn
  • Matt Harrison (@__mharrison__) - Twitter

Level up your Python skills with our expert-led courses:

  • Data Cleaning With pandas and NumPy
  • Using k-Nearest Neighbors (kNN) in Python
  • Starting With Linear Regression in Python

Support the podcast & join our community of Pythonistas
