
The future of the service mesh is here! Don't miss this deep dive with Linkerd creator Oliver Gould on how to conquer the toughest challenges of running stateful AI workloads in production.
In this episode of the AI Kubernetes Show, Buoyant CTO Oliver Gould explains how Linkerd is adapting its battle-tested features to the demands of AI workloads. He emphasizes that AI inference failures are extremely costly, which makes network-layer features like Linkerd’s intelligent load balancing and fault tolerance more critical than ever. The discussion zeroes in on the emerging MCP (Model Context Protocol), a stateful streaming protocol that poses unique challenges for traditional network tooling, particularly because critical information is buried in the payload, not the headers.
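To make the payload-versus-headers point concrete, here is a minimal Python sketch. It assumes MCP messages are JSON-RPC 2.0 and that a failed tool call is reported through an `isError` flag inside the result; the function name `classify_mcp_response` and the category labels are hypothetical, chosen only for illustration, and nothing here is Linkerd code.

```python
import json


def classify_mcp_response(http_status: int, body: str) -> str:
    """Classify a response the way payload-aware tooling would have to.

    Hypothetical helper: the labels returned here are illustrative only.
    """
    # Header-level view: this is all that header-based tooling can see.
    if http_status >= 400:
        return "transport_error"

    message = json.loads(body)

    # JSON-RPC 2.0 reports protocol failures in an "error" member.
    if "error" in message:
        return "protocol_error"

    # Assumption: a tool failure is flagged inside the result payload.
    result = message.get("result", {})
    if isinstance(result, dict) and result.get("isError"):
        return "tool_error"

    return "ok"


# An HTTP 200 that is still a failure once the payload is inspected.
failed = json.dumps({
    "jsonrpc": "2.0",
    "id": 1,
    "result": {
        "content": [{"type": "text", "text": "upstream model timed out"}],
        "isError": True,
    },
})
print(classify_mcp_response(200, failed))  # prints "tool_error"
```

A proxy that looks only at the status line would count this response as a success, which is why retries and success-rate metrics need to understand the body.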
Gould details the clash between MCP’s stateful nature and the prevalent Kubernetes microservice architecture, stressing that infrastructure planning for auto-scaling and GPU provisioning has to evolve. He shares a vision for Linkerd’s MCP support, which includes a load balancer mode optimized for streams and policy APIs extended with resources such as MCPRoute. Finally, Gould touches on the positive impact of AI tooling on developer productivity, sharing a personal anecdote about how Copilot and Cortex dramatically accelerated the diagnosis of a complex Go race condition, an example of leaving data analysis to the robots and high-value design to humans.
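The episode description does not say how Linkerd's stream-optimized load balancer mode works, so the following is only a sketch of one generic approach: send each new session to the backend with the fewest open streams, and keep that session pinned there while it is open. The class `StreamAwareBalancer`, its method names, and the pod names are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class StreamAwareBalancer:
    """Toy balancer that counts open streams instead of requests.

    Hypothetical illustration; not Linkerd's design or API.
    """
    backends: list[str]
    active: dict[str, int] = field(default_factory=dict)
    sessions: dict[str, str] = field(default_factory=dict)

    def __post_init__(self) -> None:
        # Every backend starts with zero open streams.
        self.active = {backend: 0 for backend in self.backends}

    def open_stream(self, session_id: str) -> str:
        # A stateful session must keep talking to the same backend.
        if session_id in self.sessions:
            return self.sessions[session_id]
        # New sessions go to the backend with the fewest open streams.
        backend = min(self.backends, key=lambda b: self.active[b])
        self.active[backend] += 1
        self.sessions[session_id] = backend
        return backend

    def close_stream(self, session_id: str) -> None:
        # Release the slot so the backend can take new sessions.
        backend = self.sessions.pop(session_id, None)
        if backend is not None:
            self.active[backend] -= 1


lb = StreamAwareBalancer(["pod-a", "pod-b"])
print(lb.open_stream("s1"))  # prints "pod-a"
print(lb.open_stream("s2"))  # prints "pod-b"
print(lb.open_stream("s1"))  # prints "pod-a" (existing session stays pinned)
```

Counting streams rather than requests matters because a long-lived stream can occupy a backend, and its GPU, long after the initial request has been routed.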
Read the blog post:
Key Learnings/Takeaways
✓ The high cost of AI inference failures makes network reliability (load balancing, retries) an enormous "cost lever" for organizations.
✓ MCP is a stateful streaming protocol where success/failure information lives in the JSON payload, which makes it opaque to most header-based network tooling.
✓ Running stateful MCP workloads at scale requires evolving Kubernetes infrastructure built for stateless services, particularly around auto-scaling, GPU provisioning, and stable load distribution.
✓ Linkerd is building a load balancer optimized for streams and extending policy APIs (like MCPRoute) to support the new protocol.
✓ AI tooling, like Copilot and Cortex, significantly boosts developer productivity by automating boilerplate (like YAML) and quickly diagnosing complex issues.
If you found this discussion valuable, hit the Like button and Subscribe for more insights from the AI Kubernetes Show! Let us know in the comments: What is the biggest stateful workload challenge you are currently facing in your Kubernetes cluster?
#Linkerd #ServiceMesh #Kubernetes #MCP #LoadBalancing #CloudNative #KubeCon
By The AI Kubernetes Show