Changelog Master Feed

Large models on CPUs (Practical AI #221)


Listen Later

Model sizes are crazy these days with billions and billions of parameters. As Mark Kurtz explains in this episode, this makes inference slow and expensive despite the fact that up to 90%+ of the parameters don’t influence the outputs at all.

Mark helps us understand all of the practicalities and progress that is being made in model optimization and CPU inference, including the increasing opportunities to run LLMs and other Generative AI models on commodity hardware.

Join the discussion

Changelog++ members save 1 minute on this episode because they made the ads disappear. Join today!

Sponsors:

  • FastlyOur bandwidth partner. Fastly powers fast, secure, and scalable digital experiences. Move beyond your content delivery network to their powerful edge cloud platform. Learn more at fastly.com
  • Fly.ioThe home of Changelog.com — Deploy your apps and databases close to your users. In minutes you can run your Ruby, Go, Node, Deno, Python, or Elixir app (and databases!) all over the world. No ops required. Learn more at fly.io/changelog and check out the speedrun in their docs.
  • Featuring:

    • Mark Kurtz – LinkedIn, X
    • Daniel Whitenack – Website, GitHub, X

    Show Notes:

    • Neural Magic
    • SparseML
    • SparseZoo
    • Neural Magic Scales up MLPerf™ Inference v3.0 Performance With Demonstrated Power Efficiency; No GPUs Needed
    • Deploy Optimized Hugging Face Models With DeepSparse and SparseZoo
    • SparseGPT: Remove 100 Billion Parameters for Free
    • Something missing or broken? PRs welcome!

      ...more
      View all episodesView all episodes
      Download on the App Store

      Changelog Master FeedBy Changelog Media

      • 4.4
      • 4.4
      • 4.4
      • 4.4
      • 4.4

      4.4

      29 ratings


      More shows like Changelog Master Feed

      View all
      Hanselminutes with Scott Hanselman by Scott Hanselman

      Hanselminutes with Scott Hanselman

      377 Listeners

      Software Engineering Radio - the podcast for professional software developers by se-radio@computer.org

      Software Engineering Radio - the podcast for professional software developers

      272 Listeners

      The Changelog: Software Development, Open Source by Changelog Media

      The Changelog: Software Development, Open Source

      284 Listeners

      Thoughtworks Technology Podcast by Thoughtworks

      Thoughtworks Technology Podcast

      40 Listeners

      Talk Python To Me by Michael Kennedy

      Talk Python To Me

      590 Listeners

      Software Engineering Daily by Software Engineering Daily

      Software Engineering Daily

      621 Listeners

      Python Bytes by Michael Kennedy and Brian Okken

      Python Bytes

      215 Listeners

      Syntax - Tasty Web Development Treats by Wes Bos & Scott Tolinski - Full Stack JavaScript Web Developers

      Syntax - Tasty Web Development Treats

      987 Listeners

      CoRecursive: Coding Stories by Adam Gordon Bell - Software Developer

      CoRecursive: Coding Stories

      189 Listeners

      Kubernetes Podcast from Google by Abdel Sghiouar, Kaslin Fields

      Kubernetes Podcast from Google

      181 Listeners

      Practical AI by Practical AI LLC

      Practical AI

      192 Listeners

      The Stack Overflow Podcast by The Stack Overflow Podcast

      The Stack Overflow Podcast

      62 Listeners

      Oxide and Friends by Oxide Computer Company

      Oxide and Friends

      47 Listeners

      Latent Space: The AI Engineer Podcast by swyx + Alessio

      Latent Space: The AI Engineer Podcast

      75 Listeners

      The Pragmatic Engineer by Gergely Orosz

      The Pragmatic Engineer

      53 Listeners