Software Engineering Daily

Scaling Large ML Models to Small Devices with Atila Orhon



ML models are growing to many billions of parameters. This poses a challenge for running inference on non-dedicated hardware like phones and laptops.

Argmax is a startup focused on developing methods to run large models on commodity hardware. A key observation behind their strategy is that the largest models are getting larger, but the smallest models that are commercially relevant are getting smaller. The company was started in 2023 and has raised money from General Catalyst and other industry leaders.

Atila Orhon is the founder of Argmax and previously worked at Apple and NVIDIA. He joins the show to talk about working in computer vision, building ML tooling at Apple, optimizing ML models, and more.

Sean Falconer has been an academic, startup founder, and Googler. He has published works covering a wide range of topics from information visualization to quantum computing. Currently, Sean is Head of Marketing and Developer Relations at Skyflow and host of Partially Redacted, a podcast about privacy and security engineering. You can connect with Sean on Twitter @seanfalconer.

 


The post Scaling Large ML Models to Small Devices with Atila Orhon appeared first on Software Engineering Daily.


