
AttractivePenguin

Kubernetes Pod Stuck in Pending? Here's How to Debug It Like a Pro

You've deployed your application to Kubernetes, but something's wrong. Your pod is just sitting there, stubbornly stuck in Pending state. No errors, no crashes—just... waiting. Sound familiar?

This is one of the most common frustrations for developers working with Kubernetes. The good news? Once you know where to look, the fix is usually straightforward. In this guide, we'll walk through exactly how to diagnose and resolve pending pods, with real commands and scenarios you can use today.


What Does "Pending" Actually Mean?

When a pod is in Pending state, it means the Kubernetes scheduler hasn't been able to assign it to a node. This isn't about your container crashing—it hasn't even started yet. The scheduler is essentially saying, "I can't find a suitable home for this pod."

The reasons usually fall into these categories:

  • Insufficient resources: Not enough CPU, memory, or storage on available nodes
  • Node selection constraints: nodeSelector, nodeAffinity, or taints/tolerations that don't match
  • Persistent volume issues: PVCs that can't bind to a PV
  • Resource quotas: Limits that prevent scheduling in a namespace

Let's debug each of these systematically.


Step 1: Check Pod Events with kubectl describe

Your first stop is always kubectl describe pod. This shows the Events section at the bottom, which tells you exactly why the scheduler rejected your pod.

kubectl describe pod <pod-name> -n <namespace>

Look for the Events section at the bottom:

Events:
  Type     Reason            Age   From               Message
  ----     ------            ----  ----               -------
  Warning  FailedScheduling  12s   default-scheduler  0/3 nodes are available: 3 Insufficient cpu.

This output tells you the scheduler tried all 3 nodes and none had enough CPU. The message is your first clue—use it to guide your next steps.


Step 2: Check Node Resources

If the events mention insufficient CPU or memory, check your nodes' available resources:

kubectl describe nodes | grep -A 5 "Allocated resources"

Or get a cleaner view with:

kubectl top nodes

You'll see something like:

NAME       CPU(cores)   CPU%   MEMORY(bytes)   MEMORY%
node-1     1800m        90%    14Gi            70%
node-2     1500m        75%    12Gi            60%
node-3     1900m        95%    15Gi            75%

If nodes are heavily utilized, you have a few options:

  1. Scale down less critical workloads
  2. Add more nodes to the cluster
  3. Reduce your pod's resource requests (if possible)

Check what your pod is requesting:

kubectl get pod <pod-name> -n <namespace> -o jsonpath='{.spec.containers[*].resources.requests}'
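If you decide to lower the requests, the change goes in the pod (or Deployment) spec. Here's a minimal sketch with hypothetical names and values — tune them to what your app actually uses:

```yaml
# Hypothetical pod spec with modest resource requests.
# The names, image, and values are illustrative, not prescriptive.
apiVersion: v1
kind: Pod
metadata:
  name: my-app
spec:
  containers:
  - name: app
    image: my-app:latest
    resources:
      requests:
        cpu: "250m"       # a quarter of a core — what the scheduler reserves
        memory: "256Mi"
      limits:
        cpu: "500m"       # hard ceiling enforced at runtime
        memory: "512Mi"
```

Remember that the scheduler only looks at requests, not limits — lowering requests is what makes a pod easier to place.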

Step 3: Check Node Selectors and Affinities

If your pod uses nodeSelector or nodeAffinity, ensure nodes with matching labels exist:

# Check your pod's node selector
kubectl get pod <pod-name> -n <namespace> -o yaml | grep -A 10 nodeSelector

# List nodes with their labels
kubectl get nodes --show-labels
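For reference, a nodeSelector in a pod spec is just a label map — the pod schedules only onto nodes carrying every listed label (the label here is hypothetical):

```yaml
# Fragment of a pod spec; "zone: us-east-1a" is an example label,
# not a label your cluster necessarily has.
spec:
  nodeSelector:
    zone: us-east-1a
```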

For nodeAffinity, the check is similar:

# Your pod spec might have:
affinity:
  nodeAffinity:
    requiredDuringSchedulingIgnoredDuringExecution:
      nodeSelectorTerms:
      - matchExpressions:
        - key: disktype
          operator: In
          values:
          - ssd

If no node carries the disktype=ssd label, the pod will remain Pending indefinitely.

Fix: Either add the label to a node:

kubectl label node <node-name> disktype=ssd

Or remove/modify the affinity rule in your pod spec.


Step 4: Check PVC Binding Issues

If your pod uses a PersistentVolumeClaim (PVC), ensure it's bound:

kubectl get pvc -n <namespace>

You want to see:

NAME        STATUS   VOLUME                                     CAPACITY   ACCESS MODES   STORAGECLASS   AGE
data-pvc    Bound    pvc-abc123-...                            10Gi       RWO            standard       5m

If the status is Pending, the PVC can't find a matching PV. Check the storage class and access modes:

kubectl describe pvc <pvc-name> -n <namespace>

Common issues:

  • StorageClass doesn't exist: Ensure the StorageClass is created
  • No PV available: If using manual provisioning, create a PV matching the PVC's requirements
  • Access mode mismatch: PVC requests ReadWriteMany but only ReadWriteOnce PVs exist
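When checking for a mismatch, it helps to have the PVC's own spec in front of you. A sketch of what to compare (the name, class, and size are hypothetical — yours will differ):

```yaml
# Every field here must line up with an existing StorageClass
# or an available PV for the claim to bind.
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: data-pvc
spec:
  accessModes:
    - ReadWriteOnce     # must be offered by the PV / provisioner
  storageClassName: standard   # must exist: kubectl get storageclass
  resources:
    requests:
      storage: 10Gi     # PV capacity must be >= this
```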

Step 5: Check Taints and Tolerations

Nodes can have taints that repel pods unless the pods have matching tolerations:

# Check node taints
kubectl describe nodes | grep -A 5 Taints

Common taints:

Taints: node.kubernetes.io/not-ready:NoSchedule
Taints: node.kubernetes.io/unschedulable:NoSchedule
Taints: dedicated=gpu:NoSchedule

If you see NoSchedule taints, your pod needs tolerations:

tolerations:
- key: "dedicated"
  operator: "Equal"
  value: "gpu"
  effect: "NoSchedule"

Or to remove the taint:

kubectl taint nodes <node-name> dedicated:NoSchedule-

Real-World Scenarios

Scenario 1: Cluster Over-Committed

Symptoms: Multiple deployments stuck in Pending, events show "Insufficient cpu/memory"

Root Cause: Your cluster is running too many workloads for its capacity.

Solutions:

  • Remove unused deployments
  • Add nodes (horizontal scaling)
  • Reduce pod resource requests (vertical optimization)
# Find pods using most resources
kubectl top pods --all-namespaces --sort-by=memory
kubectl top pods --all-namespaces --sort-by=cpu

Scenario 2: Node Selector Mismatch

Symptoms: Pod pending with message like "0/3 nodes are available: 3 node(s) didn't match node selector"

Root Cause: Pod requires a node label that doesn't exist.

Solution: Add the label or remove the constraint.

# Add label to make it schedulable
kubectl label node node-1 zone=us-east-1a

Scenario 3: PVC Not Binding

Symptoms: Pod stuck, PVC shows Pending status

Root Cause: No PersistentVolume matches the PVC's requirements.

Solution: Create a matching PV or use dynamic provisioning:

apiVersion: v1
kind: PersistentVolume
metadata:
  name: manual-pv
spec:
  capacity:
    storage: 10Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"
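For this PV to actually satisfy a claim, the PVC needs compatible capacity and access mode. A matching claim might look like this (the name is hypothetical; the empty storageClassName tells Kubernetes to bind to static PVs like the one above rather than wait for dynamic provisioning):

```yaml
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: manual-pvc
spec:
  storageClassName: ""   # match statically provisioned PVs with no class
  accessModes:
    - ReadWriteOnce      # same mode the PV offers
  resources:
    requests:
      storage: 10Gi      # <= the PV's 10Gi capacity
```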

Scenario 4: Resource Quotas Blocking

Symptoms: Pod pending, events mention "exceeded quota"

Root Cause: Namespace has a ResourceQuota limiting total resources.

Solution: Check and adjust the quota:

kubectl get resourcequota -n <namespace>
kubectl describe resourcequota <quota-name> -n <namespace>

Either increase the quota or reduce resource requests in your deployment.
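For context, a ResourceQuota that produces this error looks roughly like the sketch below (the name and limits are hypothetical). Note that once a quota constrains cpu or memory, every pod in the namespace must declare requests/limits for those resources, or it will be rejected outright:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: compute-quota
  namespace: my-namespace
spec:
  hard:
    requests.cpu: "4"      # total CPU requests across the namespace
    requests.memory: 8Gi
    limits.cpu: "8"
    limits.memory: 16Gi
```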


FAQ

Q: Why does my pod work in dev but not prod?

Different environments often have different node counts, resource limits, and storage classes. Always check:

  • Node count and resources (kubectl top nodes)
  • StorageClasses available (kubectl get storageclass)
  • ResourceQuotas (kubectl get quota)

Q: How do I see scheduler logs?

# For kubeadm clusters
kubectl logs -n kube-system kube-scheduler-<control-plane-node-name>

# Or check scheduler logs directly on the control plane node
journalctl -u kube-scheduler

Q: Can I force a pod onto a specific node?

Yes, but only use this for debugging:

spec:
  nodeName: <node-name>

This bypasses the scheduler entirely. For production, use nodeAffinity instead.

Q: What if I don't have enough nodes?

If you're running locally (minikube, kind, Docker Desktop), you're limited to one node by default. Consider:

  • Reducing resource requests
  • Using cluster autoscaler on managed Kubernetes
  • Adding nodes to your local cluster
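With kind, for example, a multi-node cluster can be declared in a config file. A minimal sketch (the filename is arbitrary):

```yaml
# kind-config.yaml — create the cluster with:
#   kind create cluster --config kind-config.yaml
kind: Cluster
apiVersion: kind.x-k8s.io/v1alpha4
nodes:
- role: control-plane
- role: worker
- role: worker
```

This gives you two schedulable workers, which is enough to experiment with node selectors, taints, and spreading workloads.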

Conclusion

A pod stuck in Pending state is frustrating but always diagnosable. The key is to follow a systematic approach:

  1. Start with kubectl describe pod — the Events section is your friend
  2. Check node resources — ensure capacity for your pod's requests
  3. Verify selectors and affinities — labels must match
  4. Confirm PVC binding — storage must be available
  5. Review taints and tolerations — pods need tolerations for tainted nodes

Once you've diagnosed the issue, the fix is usually straightforward: adjust resource requests, add missing labels, provision storage, or remove taints. Keep this guide handy, and you'll never be stuck wondering why your pod won't schedule.

Happy debugging! 🚀


What's your most confusing Kubernetes scheduling issue? Drop a comment below and I'll help you debug it.
