Software Huddle

Deep Dive into Inference Optimization for LLMs with Philip Kiely



Today we have Philip Kiely from Baseten on the show. Baseten is a Series B startup focused on providing infrastructure for AI workloads.


We go deep on inference optimization. We cover choosing a model, the hype around compound AI, choosing an inference engine, and optimization techniques such as quantization and speculative decoding, all the way down to your GPU choice.


By Software Huddle

