The AWS Developers Podcast

3 ways to deploy your large language models on AWS



In this episode of the AWS Developers Podcast, we dive into three ways to deploy large language models (LLMs) on AWS. From self-managed deployments on Amazon EC2 to managed services like Amazon SageMaker and Amazon Bedrock, we break down the pros and cons of each approach. Whether you're optimizing for compliance, cost, or time-to-market, we explore the trade-offs between flexibility and simplicity. You'll hear practical insights into instance selection, infrastructure management, model sizing, and prototyping strategies. We also examine how SageMaker JumpStart and serverless services like Bedrock can streamline your machine learning workflows. If you're building or scaling AI applications in the cloud, this episode will help you navigate your options and design a deployment strategy that fits your needs.

With Germaine Ong, Startup Solution Architect, and Jarett Yeo, Startup Solution Architect

    • Blog: Deploying DeepSeek R1 Distill on Amazon EC2
    • Blog: Deploying DeepSeek R1 Distill on Amazon SageMaker JumpStart
    • Ollama
    • Open WebUI
    • Doc: Deploy your own model on Amazon SageMaker
    • Doc: Deploy your own model on Amazon Bedrock

The AWS Developers Podcast, by Amazon Web Services


