MLOps.community

Serving LLMs in Production: Performance, Cost & Scale // CAST AI Roundtable


Listen Later

Roundtable CAST AI episode: Serving LLMs in Production: Performance, Cost & Scale.


Join the Community:

https://go.mlops.community/YTJoinIn

Get the newsletter: https://go.mlops.community/YTNewsletter

MLOps GPU Guide:

https://go.mlops.community/gpuguide


// Abstract

Experimenting with LLMs is easy. Running them reliably and cost-effectively in production is where things break.

Most AI teams never make it past demos and proofs of concept. A smaller group is pushing real workloads to production—and running into very real challenges around infrastructure efficiency, runaway cloud costs, and reliability at scale.

This session is for engineers and platform teams moving beyond experimentation and building AI systems that actually hold up in production.


// Bio

Ioana Apetrei

Ioana is a Senior Product Manager at CAST AI, leading the AI Enabler product, an AI Gateway platform for cost-effective LLM infrastructure deployment. She brings 12 years of experience building B2C and B2B products reaching over 10 million users. Outside of work, she enjoys assembling puzzles and LEGOs and watching motorsports.


Igor Šušić

Igor is a founding Machine Learning Engineer at CAST AI’s AI Enabler, where he focuses on optimizing inference and training at scale. With a strong background in Natural Language Processing (NLP) and Recommender Systems, Igor has been tackling the challenges of large-scale model optimization long before transformers became mainstream. Prior to CAST AI, he worked at industry leaders like Bloomreach and Infobip, where he contributed to the development and deployment of large-scale AI and personalization systems from the early days of the field.


// Related Links

Website: https://cast.ai/


~~~~~~~~ ✌️Connect With Us ✌️ ~~~~~~~

Catch all episodes, blogs, newsletters, and more: https://go.mlops.community/TYExplore

Join our Slack community [https://go.mlops.community/slack]

Follow us on X/Twitter [@mlopscommunity](https://x.com/mlopscommunity) or [LinkedIn](https://go.mlops.community/linkedin)]

Sign up for the next meetup: [https://go.mlops.community/register]

MLOps Swag/Merch: [https://shop.mlops.community/]


Connect with Demetrios on LinkedIn: /dpbrinkm

Connect with Ioana on LinkedIn: /ioanaapetrei/

Connect with Igor on LinkedIn: /igor-%C5%A1u%C5%A1i%C4%87/

...more
View all episodesView all episodes
Download on the App Store

MLOps.communityBy Demetrios

  • 4.6
  • 4.6
  • 4.6
  • 4.6
  • 4.6

4.6

23 ratings


More shows like MLOps.community

View all
This Week in Startups by Jason Calacanis

This Week in Startups

1,298 Listeners

The Changelog: Software Development, Open Source by Changelog Media

The Changelog: Software Development, Open Source

288 Listeners

The a16z Show by Andreessen Horowitz

The a16z Show

1,096 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

628 Listeners

Talk Python To Me by Michael Kennedy

Talk Python To Me

583 Listeners

Super Data Science: ML & AI Podcast with Jon Krohn by Jon Krohn

Super Data Science: ML & AI Podcast with Jon Krohn

302 Listeners

NVIDIA AI Podcast by NVIDIA

NVIDIA AI Podcast

344 Listeners

Practical AI by Practical AI LLC

Practical AI

215 Listeners

Dwarkesh Podcast by Dwarkesh Patel

Dwarkesh Podcast

561 Listeners

Big Technology Podcast by Alex Kantrowitz

Big Technology Podcast

511 Listeners

No Priors: Artificial Intelligence | Technology | Startups by Conviction

No Priors: Artificial Intelligence | Technology | Startups

141 Listeners

Latent Space: The AI Engineer Podcast by Latent.Space

Latent Space: The AI Engineer Podcast

100 Listeners

This Day in AI Podcast by Michael Sharkey, Chris Sharkey

This Day in AI Podcast

229 Listeners

The AI Daily Brief: Artificial Intelligence News and Analysis by Nathaniel Whittemore

The AI Daily Brief: Artificial Intelligence News and Analysis

675 Listeners

AI + a16z by a16z

AI + a16z

32 Listeners