Platform Engineering Playbook Podcast

AWS re:Invent 2025 Recap Part 3/4 - EKS & Cloud Operations


Part 3 of our AWS re:Invent 2025 series. AWS transforms Kubernetes into an AI infrastructure platform with massive scale and AI-native operations.

In this episode:

- EKS Ultra Scale: 100,000 nodes per cluster (vs 15K GKE, 5K AKS)—1.6 million Trainium accelerators or 800K GPUs in a single cluster
- AWS replaced etcd's Raft consensus with their internal "journal" system and moved to in-memory storage for 500 pods/sec at 100K scale
- Anthropic using EKS Ultra Scale for Claude training, improving latency KPIs from 35% to 90%+
- EKS Capabilities: Fully managed Argo CD, AWS Controllers for Kubernetes (200+ CRDs for 50+ services), Kube Resource Orchestrator
- EKS MCP Server: Natural language Kubernetes management—"show me all pods not running" instead of kubectl
- EKS Provisioned Control Plane: XL/2XL/4XL tiers ($1.65-$6.90/hr), 4XL supports 40K nodes
- CloudWatch Gen AI Observability: LangChain, LangGraph, CrewAI agent tracing
- DevOps Agent (Preview): Autonomous on-call engineer—Kindle saw 80% time savings
- CloudWatch unified data store with S3 Tables, OCSF, Apache Iceberg

📰 News Segment Links:

• cert-manager v1.19.2 CVE Patches (CVE-2025-61727, CVE-2025-61729)
  https://github.com/cert-manager/cert-manager/releases/tag/v1.19.2
• cert-manager v1.18.4 Backport
  https://github.com/cert-manager/cert-manager/releases/tag/v1.18.4
• Canonical Extends Kubernetes Long-Term Support to 15 Years
  https://thenewstack.io/canonical-extends-kubernetes-long-term-support-to-15-years/
• OpenTofu 1.11 with Ephemeral Resources
  https://github.com/opentofu/opentofu/releases/tag/v1.11.0
• Cloudflare Shift-Left Enterprise IaC
  https://blog.cloudflare.com/shift-left-enterprise-scale/

🔗 Main Content Sources:

• EKS Ultra Scale 100K Nodes
  https://aws.amazon.com/blogs/containers/under-the-hood-amazon-eks-ultra-scale-clusters/
• EKS Capabilities Announcement
  https://aws.amazon.com/blogs/aws/announcing-amazon-eks-capabilities-for-workload-orchestration-and-cloud-resource-management/
• EKS MCP Server
  https://aws.amazon.com/blogs/containers/introducing-the-fully-managed-amazon-eks-mcp-server-preview/
• EKS Provisioned Control Plane
  https://aws.amazon.com/blogs/containers/amazon-eks-introduces-provisioned-control-plane/
• Cloud Operations Top 10 Announcements
  https://aws.amazon.com/blogs/mt/2025-top-10-announcements-for-aws-cloud-operations/
• AI-driven Operations at re:Invent
  https://aws.amazon.com/blogs/mt/embracing-ai-driven-operations-and-observability-at-reinvent-2025/

Perfect for platform engineers, SREs, DevOps engineers, and cloud architects looking to level up their platform engineering skills.

Series: AWS re:Invent 2025 (Part 3 of 4)

Episode URL: https://platformengineeringplaybook.com/podcasts/00051-aws-reinvent-2025-eks-cloud-operations

Part 1: The Agentic AI Revolution - https://platformengineeringplaybook.com/podcasts/00049-aws-reinvent-2025-agentic-ai-revolution

Part 2: Infrastructure & Developer Experience - https://platformengineeringplaybook.com/podcasts/00050-aws-reinvent-2025-infrastructure-developer-experience

Category: Technology

Subcategory: Software How-To

Keywords: AWS, re:Invent 2025, EKS, Kubernetes, EKS Ultra Scale, EKS Capabilities, Argo CD, ACK, MCP Server, CloudWatch, DevOps Agent, AIOps, platform engineering

Platform Engineering Playbook Podcast, by vibesre