AWS Podcast

#600: Amazon SageMaker Multi Model Endpoints



Amazon SageMaker Multi-Model Endpoint (MME) is a fully managed capability of SageMaker Inference that lets customers deploy thousands of models on a single endpoint and save costs by sharing the endpoint's instances across all of those models. Until recently, MME supported only machine learning (ML) models that run on CPU instances. Now, customers can also use MME to deploy thousands of ML models on GPU-based instances and potentially cut costs by as much as 90%. MME dynamically loads and unloads models from GPU memory based on incoming traffic to the endpoint, and the savings come from sharing the GPU instances across thousands of models. Customers can run ML models from multiple ML frameworks, including PyTorch, TensorFlow, XGBoost, and ONNX. To get started, customers use the NVIDIA Triton™ Inference Server to deploy models on SageMaker’s GPU instances in “multi-model” mode. Once the MME is created, customers specify which ML model they want inference from each time they invoke the endpoint. MME for GPU is available in all AWS Regions where Amazon SageMaker is available.
To learn more, check out:
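As a rough illustration of choosing a model at invocation time, the sketch below builds a request for the SageMaker Runtime `invoke_endpoint` call, where the `TargetModel` parameter names the model artifact that should serve the request. The endpoint name, artifact name, and JSON payload shape in the usage comment are hypothetical placeholders, not values from the episode, and the helper names `build_invoke_args` and `invoke` are illustrative only.

```python
import json


def build_invoke_args(endpoint_name, target_model, payload):
    """Assemble keyword arguments for SageMaker Runtime's invoke_endpoint."""
    return {
        "EndpointName": endpoint_name,
        # TargetModel selects which model on the multi-model endpoint
        # should handle this particular request.
        "TargetModel": target_model,
        # Assumption: the serving container accepts a JSON request body.
        "ContentType": "application/json",
        "Body": json.dumps(payload).encode("utf-8"),
    }


def invoke(endpoint_name, target_model, payload, client=None):
    """Send one inference request and return the raw response bytes.

    Needs AWS credentials and a live endpoint unless a client is injected.
    """
    if client is None:
        import boto3  # imported lazily so build_invoke_args works without it

        client = boto3.client("sagemaker-runtime")
    response = client.invoke_endpoint(
        **build_invoke_args(endpoint_name, target_model, payload)
    )
    return response["Body"].read()


# Hypothetical usage (names and payload are placeholders):
# result = invoke("my-gpu-mme", "resnet50.tar.gz", {"inputs": []})
```

Two requests to the same endpoint can name different `TargetModel` values, which is how one set of shared instances ends up serving many models.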
Our launch blog: https://go.aws/3NwtJyh
Amazon SageMaker website: https://go.aws/44uCdNr

AWS Podcast by Amazon Web Services


