DevOps & Cloud Interview Prep: Real Scenarios & Answers

Karpenter Lifecycle: How GPU Pods Get Unstuck



A pending ML training job that needs 8 GPUs is a classic Karpenter interview scenario. This episode walks through the four-step lifecycle an interviewer expects you to describe.

You'll learn:

  • Why the K8s scheduler marks pods unschedulable and how Karpenter's controller watches for that signal
  • How Karpenter evaluates all pod constraints at once — resource requests, nodeSelector, nodeAffinity, tolerations, and topology spread
  • How it selects a matching instance type (p3.16xlarge for 8 GPUs) and launches it through the EC2 API in the correct availability zone
  • Why Karpenter provisions the node but the K8s scheduler still does the final pod binding, a gotcha that trips up a lot of candidates

Keywords: Karpenter node provisioning, Kubernetes GPU scheduling, pending pods interview question, Karpenter vs cluster autoscaler, K8s scheduler lifecycle
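
To make the four steps concrete, here is a toy Python simulation of the lifecycle. It is only a sketch for interview practice, not Karpenter's actual code: the `Pod`, `Node`, and `InstanceType` classes, the `CATALOG` list, the zone names, and the relative prices are all made up for illustration, and the "cheapest fit" rule is a simplifying assumption. Tolerations and topology spread are left out to keep it short. The point it illustrates is the split of responsibilities: the provisioner creates capacity, and only the scheduler sets the pod's node.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class InstanceType:
    name: str
    gpus: int
    zones: tuple
    price: float  # placeholder relative cost, NOT real pricing


@dataclass
class Pod:
    name: str
    gpus: int
    zone: Optional[str] = None       # stands in for a nodeSelector/nodeAffinity zone constraint
    node_name: Optional[str] = None  # set by the scheduler only, never by the provisioner


@dataclass
class Node:
    name: str
    instance_type: str
    zone: str
    free_gpus: int


# Hypothetical catalog: GPU counts match the P3 family, zones and prices are invented.
CATALOG = [
    InstanceType("p3.2xlarge", 1, ("us-east-1a", "us-east-1b"), 1.0),
    InstanceType("p3.8xlarge", 4, ("us-east-1a", "us-east-1b"), 4.0),
    InstanceType("p3.16xlarge", 8, ("us-east-1a",), 8.0),
]


def schedule(pod, nodes):
    """Scheduler role: bind the pod to a fitting node, or report it unschedulable."""
    for node in nodes:
        if node.free_gpus >= pod.gpus and pod.zone in (None, node.zone):
            node.free_gpus -= pod.gpus
            pod.node_name = node.name  # the binding step
            return True
    return False  # pod stays Pending


def choose_instance(pod, catalog):
    """Provisioner role: check every constraint together, then take the cheapest fit."""
    candidates = [
        (it.price, it.name, zone)
        for it in catalog
        if it.gpus >= pod.gpus
        for zone in it.zones
        if pod.zone in (None, zone)
    ]
    if not candidates:
        return None
    _, name, zone = min(candidates)
    return name, zone


def provision(pod, catalog):
    """Provisioner role: create a node for the pod. Note: it does not bind the pod."""
    choice = choose_instance(pod, catalog)
    if choice is None:
        return None
    name, zone = choice
    gpus = next(it.gpus for it in catalog if it.name == name)
    return Node(f"node-{name}", name, zone, gpus)


def walk_through():
    pod = Pod("train-job", gpus=8, zone="us-east-1a")
    nodes = []
    pending = not schedule(pod, nodes)  # step 1: no capacity, pod is unschedulable
    node = provision(pod, CATALOG)      # steps 2-3: evaluate constraints, launch a node
    bound_before = pod.node_name        # still None: provisioning is not binding
    nodes.append(node)
    schedule(pod, nodes)                # step 4: the scheduler binds the pod
    return pending, node, bound_before, pod.node_name


if __name__ == "__main__":
    pending, node, before, after = walk_through()
    print(pending, node.instance_type, node.zone, before, after)
```

In an interview, the `bound_before` value is the detail worth calling out: after the node exists, the pod is still unbound until the scheduler runs again.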

    🎧 Listen, then go deeper — DevOps & Cloud interview-prep ebooks at DevOpsInterview.Cloud


By https://DevOpsInterview.Cloud