Platform Engineering Playbook Podcast

Okta's GitOps Journey - Scaling ArgoCD from 12 to 1,000 Clusters



In five years, Okta scaled Auth0's private cloud from 12 to 1,000+ Kubernetes clusters using ArgoCD. At KubeCon + CloudNativeCon Atlanta 2025, engineers Jérémy Albuixech and Kahou Lei shared their hard-won lessons. This episode breaks down the challenges, the solutions, and the practical wisdom for scaling GitOps to enterprise scale.

Full episode page: https://platformengineeringplaybook.com/podcasts/00058-okta-gitops-argocd-1000-clusters

In this episode, we cover:

- The 83x scaling journey: from 12 clusters in 2020 to 1,000+ in 2025
- Five major challenges at scale: controller degradation, centralized bottlenecks, application explosion, global latency, observability gaps
- Five key solutions: controller sharding, ArgoCD Agent hub-spoke model, ApplicationSet templating, progressive rollouts, purpose-built observability
- When to implement sharding (hint: 100+ clusters is the threshold)
- The ArgoCD UI degradation wall at 1,000 applications
- Six lessons learned, including "GitOps doesn't solve organizational problems"
- Practical guidance for teams at 10-50, 100-500, and 500+ cluster scales
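
For listeners new to controller sharding, the idea can be sketched in a few lines: each managed cluster is assigned to one of several controller replicas so that no single controller reconciles everything. This is only an illustration under stated assumptions. The function name `shard_for`, the cluster naming scheme, and the replica count of 10 are hypothetical, and the CRC32 hash is a stand-in; it is not claimed to be the algorithm ArgoCD or Okta actually uses.

```python
import zlib
from collections import Counter


def shard_for(cluster_id: str, replicas: int) -> int:
    """Map a cluster to a controller shard using a stable hash (illustrative only)."""
    if replicas < 1:
        raise ValueError("replicas must be >= 1")
    # zlib.crc32 is stable across processes, unlike Python's built-in hash(),
    # so a given cluster always lands on the same shard.
    return zlib.crc32(cluster_id.encode("utf-8")) % replicas


# Hypothetical fleet: 1,000 clusters spread across 10 controller replicas.
clusters = [f"cluster-{i:04d}" for i in range(1000)]
load = Counter(shard_for(c, 10) for c in clusters)

for shard in sorted(load):
    print(f"shard {shard}: {load[shard]} clusters")
```

A hash-modulo scheme like this is simple, but changing the replica count reassigns many clusters at once, which is one reason real deployments tend to treat shard count as a deliberate capacity decision.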

Plus news on the Helm v4.0.4 and v3.19.4 releases, a Zero Trust in CI/CD Pipelines guide, a 1-billion-row migration without downtime, Microsoft Azure HorizonDB, and the Platform Engineering State 2026 report.

Sources:

- The New Stack: How Okta Scaled From 12 to 1000 Kubernetes Clusters With Argo CD
- ITNEXT: How We Load Test Argo CD at Scale: 1,000 vClusters with GitOps
- Red Hat: Multi-cluster GitOps with the Argo CD Agent
- KubeCon + CloudNativeCon Atlanta 2025: "One Dozen To One Thousand Clusters" by Jérémy Albuixech and Kahou Lei

#DevOps #PlatformEngineering #GitOps #ArgoCD #Kubernetes #MultiCluster #CNCF #KubeCon #CloudNative #SRE


By vibesre