Kubernetes Pod Stuck in Pending? Here's How to Debug It Like a Pro
You've deployed your application to Kubernetes, but something's wrong. Your pod is just sitting there, stubbornly stuck in Pending state. No errors, no crashes—just... waiting. Sound familiar?
This is one of the most common frustrations for developers working with Kubernetes. The good news? Once you know where to look, the fix is usually straightforward. In this guide, we'll walk through exactly how to diagnose and resolve pending pods, with real commands and scenarios you can use today.
What Does "Pending" Actually Mean?
When a pod is in Pending state, it means the Kubernetes scheduler hasn't been able to assign it to a node. This isn't about your container crashing—it hasn't even started yet. The scheduler is essentially saying, "I can't find a suitable home for this pod."
The reasons usually fall into these categories:
- Insufficient resources: Not enough CPU, memory, or storage on available nodes
- Node selection constraints: nodeSelector, nodeAffinity, or taints/tolerations that don't match
- Persistent volume issues: PVCs that can't bind to a PV
- Resource quotas: Limits that prevent scheduling in a namespace
Let's debug each of these systematically.
Step 1: Check Pod Events with kubectl describe
Your first stop is always kubectl describe pod. This shows the Events section at the bottom, which tells you exactly why the scheduler rejected your pod.
kubectl describe pod <pod-name> -n <namespace>
Look for the Events section at the bottom:
Events:
  Type     Reason            Age  From               Message
  ----     ------            ---  ----               -------
  Warning  FailedScheduling  12s  default-scheduler  0/3 nodes are available: 3 Insufficient cpu.
This output tells you the scheduler tried all 3 nodes and none had enough CPU. The message is your first clue—use it to guide your next steps.
Step 2: Check Node Resources
If the events mention insufficient CPU or memory, check your nodes' available resources:
kubectl describe nodes | grep -A 5 "Allocated resources"
Or, if the metrics-server addon is installed, get a cleaner view with:
kubectl top nodes
You'll see something like:
NAME     CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node-1   1800m        90%    14Gi            70%
node-2   1500m        75%    12Gi            60%
node-3   1900m        95%    15Gi            75%
If nodes are heavily utilized, you have a few options:
- Scale down less critical workloads
- Add more nodes to the cluster
- Reduce your pod's resource requests (if possible)
Check what your pod is requesting:
kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.containers[*].resources.requests}'
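If those requests are larger than any node can satisfy, lowering them is often the quickest fix. Here's a sketch of what right-sized requests might look like in a container spec (the container name, image, and values are illustrative, not from your cluster):

```yaml
# Illustrative container spec: requests sized to fit a node with ~2 CPU free.
# Adjust names and values to your own workload.
spec:
  containers:
  - name: web            # hypothetical container name
    image: myapp:latest  # hypothetical image
    resources:
      requests:
        cpu: "250m"      # was e.g. "2", too large for the busy nodes above
        memory: "256Mi"
      limits:
        cpu: "500m"
        memory: "512Mi"
```

Remember that the scheduler decides based on requests, not limits, so it's the requests line that determines whether a node qualifies.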
Step 3: Check Node Selectors and Affinities
If your pod uses nodeSelector or nodeAffinity, ensure nodes with matching labels exist:
# Check your pod's node selector
kubectl get pod <pod-name> -n <namespace> -o yaml | grep -A 10 nodeSelector
# List nodes with their labels
kubectl get nodes --show-labels
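For reference, a nodeSelector is just a map of labels that a node must carry exactly (the label key and value here are illustrative):

```yaml
# Pod spec fragment: this pod schedules only onto nodes labeled disktype=ssd
spec:
  nodeSelector:
    disktype: ssd   # a node must have exactly this label to qualify
```

If no node in the output of kubectl get nodes --show-labels carries that label, the pod will never schedule.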
For nodeAffinity, the check is similar:
# Your pod spec might have:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd
If no node has disktype=ssd, your pod will stay pending forever.
Fix: Either add the label to a node:
kubectl label node <node-name> disktype=ssd
Or remove/modify the affinity rule in your pod spec.
Step 4: Check PVC Binding Issues
If your pod uses a PersistentVolumeClaim (PVC), ensure it's bound:
kubectl get pvc -n <namespace>
You want to see:
NAME       STATUS   VOLUME           CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-pvc   Bound    pvc-abc123-...   10Gi       RWO            standard       5m
If the status is Pending, the PVC can't find a matching PV. Check the storage class and access modes:
kubectl describe pvc <pvc-name> -n <namespace>
Common issues:
- StorageClass doesn't exist: Ensure the StorageClass is created
- No PV available: If using manual provisioning, create a PV matching the PVC's requirements
- Access mode mismatch: PVC requests ReadWriteMany but only ReadWriteOnce PVs exist
Step 5: Check Taints and Tolerations
Nodes can have taints that repel pods unless the pods have matching tolerations:
# Check node taints
kubectl describe nodes | grep -A 5 Taints
Common taints:
Taints: node.kubernetes.io/not-ready:NoSchedule
Taints: node.kubernetes.io/unschedulable:NoSchedule
Taints: dedicated=gpu:NoSchedule
If you see NoSchedule taints, your pod needs tolerations:
tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "gpu"
  effect: "NoSchedule"
Or to remove the taint:
kubectl taint nodes <node-name> dedicated:NoSchedule-
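Note that a toleration only allows a pod onto a tainted node; it doesn't steer the pod there. To actually land on the dedicated nodes, it's common to pair the toleration with a matching label selector (a sketch, assuming the tainted GPU nodes also carry a dedicated=gpu label):

```yaml
# Toleration gets the pod past the taint; nodeSelector targets those nodes.
spec:
  nodeSelector:
    dedicated: gpu          # assumes the tainted nodes are also labeled
  tolerations:
  - key: "dedicated"
    operator: "Equal"
    value: "gpu"
    effect: "NoSchedule"
```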
Real-World Scenarios
Scenario 1: Cluster Overcommitted
Symptoms: Multiple deployments stuck in Pending, events show "Insufficient cpu/memory"
Root Cause: Your cluster is running too many workloads for its capacity.
Solutions:
- Remove unused deployments
- Add nodes (horizontal scaling)
- Reduce pod resource requests (vertical optimization)
# Find pods using most resources
kubectl top pods --all-namespaces --sort-by=memory
kubectl top pods --all-namespaces --sort-by=cpu
Scenario 2: Node Selector Mismatch
Symptoms: Pod pending with message like "0/3 nodes are available: 3 node(s) didn't match node selector"
Root Cause: Pod requires a node label that doesn't exist.
Solution: Add the label or remove the constraint.
# Add label to make it schedulable
kubectl label node node-1 zone=us-east-1a
Scenario 3: PVC Not Binding
Symptoms: Pod stuck, PVC shows Pending status
Root Cause: No PersistentVolume matches the PVC's requirements.
Solution: Create a matching PV or use dynamic provisioning:
apiVersion: v1
kind: PersistentVolume
metadata:
  name: manual-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
  - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
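For that manual PV to bind, the claim has to fit within it. A matching PVC might look like this (the claim name is illustrative; since the PV above declares no storage class, the claim sets storageClassName to the empty string to opt out of dynamic provisioning):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc            # hypothetical claim name
spec:
  storageClassName: ""      # empty string: bind only to a PV with no class
  accessModes:
  - ReadWriteOnce           # must match the PV's access modes
  resources:
    requests:
      storage: 10Gi         # must be <= the PV's capacity
```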
Scenario 4: Resource Quotas Blocking
Symptoms: Pod pending, events mention "exceeded quota"
Root Cause: Namespace has a ResourceQuota limiting total resources.
Solution: Check and adjust the quota:
kubectl get resourcequota -n <namespace>
kubectl describe resourcequota <quota-name> -n <namespace>
Either increase the quota or reduce resource requests in your deployment.
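For context, a quota that produces this failure looks like the following (the quota name, namespace, and limits are illustrative). Keep in mind that once a quota caps requests.cpu or requests.memory, every pod in that namespace must declare those requests explicitly, or it will be rejected:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota       # hypothetical name
  namespace: team-a         # hypothetical namespace
spec:
  hard:
    requests.cpu: "4"       # total CPU requests allowed across the namespace
    requests.memory: 8Gi    # total memory requests allowed
    pods: "20"              # maximum pod count
```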
FAQ
Q: Why does my pod work in dev but not prod?
Different environments often have different node counts, resource limits, and storage classes. Always check:
- Node count and resources (kubectl top nodes)
- StorageClasses available (kubectl get storageclass)
- ResourceQuotas (kubectl get quota)
Q: How do I see scheduler logs?
# For kubeadm clusters
kubectl logs -n kube-system kube-scheduler-<master-node-name>
# Or check control plane logs directly
journalctl -u kube-scheduler
Q: Can I force a pod onto a specific node?
Yes, but only use this for debugging:
spec:
  nodeName: <node-name>
This bypasses the scheduler entirely. For production, use nodeAffinity instead.
Q: What if I don't have enough nodes?
If you're running locally (minikube, kind, Docker Desktop), you're limited to one node by default. Consider:
- Reducing resource requests
- Using cluster autoscaler on managed Kubernetes
- Adding nodes to your local cluster
Conclusion
A pod stuck in Pending state is frustrating but always diagnosable. The key is to follow a systematic approach:
- Start with kubectl describe pod: the Events section is your friend
- Check node resources: ensure capacity for your pod's requests
- Verify selectors and affinities: labels must match
- Confirm PVC binding: storage must be available
- Review taints and tolerations: pods need tolerations for tainted nodes
Once you've diagnosed the issue, the fix is usually straightforward: adjust resource requests, add missing labels, provision storage, or remove taints. Keep this guide handy, and you'll never be stuck wondering why your pod won't schedule.
Happy debugging! 🚀
What's your most confusing Kubernetes scheduling issue? Drop a comment below and I'll help you debug it.