ConTejas Code

John McBride: How to Build Your Own AI Infrastucture with Kubernetes


Listen Later

Links


- Codecrafters (sponsor): https://tej.as/codecrafters

- OpenSauced blog post: https://opensauced.pizza/blog/how-we-saved-thousands-of-dollars-deploying-low-cost-open-source-ai-technologies

- John on X: https://x.com/johncodezzz

- Tejas on X: https://x.com/tejaskumar_


Summary


John McBride discusses his experience deploying open-source AI technologies at scale with Kubernetes. He shares insights on building AI-enabled applications and the challenges of managing large-scale data engineering.


The conversation focuses on the use of Kubernetes as a platform for running compute and the decision to use TimeScaleDB for storing time-series data and vectors. McBride also highlights the importance of data-intensive applications and recommends the book 'Designing Data-Intensive Applications' for further reading.


The conversation discusses the process of migrating from OpenAI to an open-source large language model (LLM) inference engine. The decision to switch to an open-source LLM was driven by the need for cost optimization and the desire to have more control over the infrastructure. VLLM was chosen as the inference engine due to its compatibility with the OpenAI API and its performance. The migration process involved deploying Kubernetes, setting up node groups with GPUs, running VLLM pods, and using a Kubernetes service for load balancing.


The conversation emphasizes the importance of choosing the right level of abstraction and understanding the trade-offs involved.


Takeaways


1. Building AI-enabled applications requires good mass-scale data engineering.

2. Kubernetes is an excellent platform for servicing large-scale applications.

3. TimeScaleDB, built on top of Postgres, is a suitable choice for storing time-series data and vectors.

4. The book 'Designing Data-Intensive Applications' is recommended for understanding data-intensive application development.

5. Choosing the right level of abstraction is important, and it depends on factors such as expertise, time constraints, and specific requirements.

6. The use of Kubernetes can be complex and expensive, and it may not be necessary for all startups.

7. The decision to adopt Kubernetes should consider the scale and needs of the company, as well as the operational burden it may bring.


Chapters


00:00 John McBride

03:05 Introduction and Background

07:24 Summary of the Blog Post

12:15 The Role of Kubernetes in AI-Enabled Applications

16:10 The Use of TimeScaleDB for Storing Time-Series Data and Vectors

35:37 Migrating to an Open-Source LLM Inference Engine

47:35 Deploying Kubernetes and Setting Up Node Groups

55:14 Choosing VLLM as the Inference Engine

1:02:21 The Migration Process: Deploying Kubernetes and Setting Up Node Groups

1:08:02 Choosing the Right Level of Abstraction

1:24:12 Challenges in Evaluating Language Model Performance

1:31:41 Considerations for Adopting Kubernetes in Startups

Hosted on Acast. See acast.com/privacy for more information.

...more
View all episodesView all episodes
Download on the App Store

ConTejas CodeBy Tejas Kumar

  • 5
  • 5
  • 5
  • 5
  • 5

5

9 ratings


More shows like ConTejas Code

View all
Hanselminutes with Scott Hanselman by Scott Hanselman

Hanselminutes with Scott Hanselman

377 Listeners

Software Engineering Radio - the podcast for professional software developers by se-radio@computer.org

Software Engineering Radio - the podcast for professional software developers

272 Listeners

The Changelog: Software Development, Open Source by Changelog Media

The Changelog: Software Development, Open Source

284 Listeners

Accidental Tech Podcast by Marco Arment, Casey Liss, John Siracusa

Accidental Tech Podcast

2,092 Listeners

Software Engineering Daily by Software Engineering Daily

Software Engineering Daily

621 Listeners

Soft Skills Engineering by Jamison Dance and Dave Smith

Soft Skills Engineering

269 Listeners

Syntax - Tasty Web Development Treats by Wes Bos & Scott Tolinski - Full Stack JavaScript Web Developers

Syntax - Tasty Web Development Treats

987 Listeners

Practical AI by Practical AI LLC

Practical AI

192 Listeners

The Stack Overflow Podcast by The Stack Overflow Podcast

The Stack Overflow Podcast

62 Listeners

Dwarkesh Podcast by Dwarkesh Patel

Dwarkesh Podcast

426 Listeners

PodRocket - A web development podcast from LogRocket by LogRocket

PodRocket - A web development podcast from LogRocket

57 Listeners

devtools.fm: Developer Tools, Open Source, Software Development by Andrew Lisowski, Justin Bennett

devtools.fm: Developer Tools, Open Source, Software Development

26 Listeners

Latent Space: The AI Engineer Podcast by swyx + Alessio

Latent Space: The AI Engineer Podcast

75 Listeners

The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis by Nathaniel Whittemore

The AI Daily Brief (Formerly The AI Breakdown): Artificial Intelligence News and Analysis

491 Listeners

The Pragmatic Engineer by Gergely Orosz

The Pragmatic Engineer

63 Listeners