The New Stack Podcast

Confronting AI’s Next Big Challenge: Inference Compute



While AI training garners most of the spotlight (and investment), the demands of AI inference are shaping up to be an even bigger challenge. In this episode of The New Stack Makers, Sid Sheth, founder and CEO of d-Matrix, argues that inference is anything but one-size-fits-all. Different use cases, from low-cost to high-interactivity or throughput-optimized, require tailored hardware, and existing GPU architectures aren’t built to address all these needs simultaneously.

“The world of inference is going to be truly heterogeneous,” Sheth said, meaning specialized hardware will be required to meet diverse performance profiles. A major bottleneck? The distance between memory and compute. Inference, especially in generative AI and agentic workflows, requires constant memory access, so minimizing the distance data must travel is key to improving performance and reducing cost.

To address this, d-Matrix developed Corsair, a modular platform where memory and compute are vertically stacked — “like pancakes” — enabling faster, more efficient inference. The result is scalable, flexible AI infrastructure purpose-built for inference at scale.

Learn more from The New Stack about inference compute and AI

Scaling AI Inference at the Edge with Distributed PostgreSQL

Deep Infra Is Building an AI Inference Cloud for Developers


The New Stack Podcast, by The New Stack

4.3

31 ratings

