The AI Kubernetes Show

Navigating the AI Era at Bloomberg



Don't miss this deep dive with Bloomberg's Alexa Griffith on how a financial firm is navigating the cultural and technological shift to Generative AI. Learn the critical role of platform engineering and Kubernetes in building a secure, scalable, and future-proof AI infrastructure.

In this episode of the AI Kubernetes Show, Senior Software Engineer Alexa Griffith discusses Bloomberg's journey from predictive to Generative AI. She explains how the company's early investment in Kubernetes for AI workloads, dating back to 2016, led to open-source projects like KServe and the newer Envoy AI Gateway, built to manage traffic and unify APIs in a hybrid cloud environment. The conversation explores the challenge of wrangling the non-deterministic nature of LLMs, which dramatically affects cost and observability. Bloomberg addresses this problem by building a "deterministic cage" around the core models.

Alexa details the new requirements for platform engineering teams, including the shift to token streaming and increased GPU usage. She introduces the concept of a Model Garden as a crucial layer of abstraction and enablement, providing developers with benchmarking data and features to make informed, use-case-driven choices. The discussion also covers critical security concerns and the use of LLMs as judges for evaluation, and it debunks assumptions about the safety of RAG systems. Finally, she explains how performance metrics have evolved, highlighting the importance of time to first token and token flow for a good user experience in the era of agentic systems.
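To make the two streaming metrics concrete, here is a minimal Python sketch of how they could be computed from token arrival timestamps. It is an illustration only, not Bloomberg's implementation: the names `StreamMetrics` and `stream_metrics` are hypothetical, and reading "token flow" as the gaps between consecutive tokens is an assumption made for this example.

```python
from dataclasses import dataclass
from statistics import mean
from typing import Sequence


@dataclass
class StreamMetrics:
    """Hypothetical container for per-response streaming metrics (seconds)."""
    time_to_first_token: float   # request start -> first token ("snappiness")
    mean_inter_token_gap: float  # average gap between consecutive tokens
    max_inter_token_gap: float   # longest stall between consecutive tokens


def stream_metrics(request_start: float, token_times: Sequence[float]) -> StreamMetrics:
    """Compute streaming metrics from the arrival time of each token.

    Assumes token_times is sorted ascending and uses the same clock
    as request_start (for example, time.monotonic()).
    """
    if not token_times:
        raise ValueError("no tokens were received")

    ttft = token_times[0] - request_start

    # Gaps between each token and the next one; empty for a one-token response.
    gaps = [later - earlier for earlier, later in zip(token_times, token_times[1:])]
    if not gaps:
        return StreamMetrics(ttft, 0.0, 0.0)

    return StreamMetrics(ttft, mean(gaps), max(gaps))


if __name__ == "__main__":
    # Four tokens arriving after a request sent at t = 0.0 seconds.
    print(stream_metrics(0.0, [0.5, 0.75, 1.0, 2.0]))
```

Tracking the maximum gap alongside the mean is a design choice for this sketch: a single long stall can make a response feel broken even when the average pace looks healthy.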


Key Learnings/Takeaways

✓ Generative AI is an evolution, not a revolution, built on a strong predictive AI foundation.

✓ Non-deterministic LLMs are managed by building a "deterministic cage" for control, cost, and observability.

✓ The Model Garden is a key platform engineering concept for abstracting model complexity and empowering developers.

✓ New performance metrics are critical, specifically time to first token (snappiness) and token flow (continuous experience).

✓ The Envoy AI Gateway and the Model Context Protocol (MCP) are essential for unifying disparate model APIs and managing agentic systems.

If you enjoyed this conversation, hit the like button, subscribe for more content on Kubernetes and AI, and leave a comment below! What is the biggest platform engineering challenge you're currently facing with Generative AI?

#GenerativeAI #AI #Kubernetes #PlatformEngineering #LLMOps #AIInfrastructure #KServe #CNCF #CloudNative #TechAtBloomberg


By The AI Kubernetes Show