DEV Community

Yash

Kubernetes CrashLoopBackOff: Root Cause and Fix (With Real Examples)

```
NAME          READY   STATUS             RESTARTS   AGE
my-app-pod    0/1     CrashLoopBackOff   8          4m
```

You've seen this. Your pod is stuck in a death loop and you don't know why.

CrashLoopBackOff is not a Kubernetes bug. It means your container started, crashed, Kubernetes tried to restart it, and it crashed again — repeatedly, with increasing back-off delays.
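Those back-off delays roughly double after each crash, starting around 10 seconds and capping at five minutes (the kubelet resets the timer after the container runs cleanly for a while). A quick shell sketch of that schedule:

```shell
# Sketch of the kubelet's restart back-off schedule:
# the delay doubles after each crash, capped at 300s (5 minutes).
delay=10
for crash in 1 2 3 4 5 6; do
  echo "crash $crash: next restart in ${delay}s"
  delay=$((delay * 2))
  if [ "$delay" -gt 300 ]; then delay=300; fi
done
```

This is why a pod can sit in CrashLoopBackOff for minutes without any restart attempt: it's waiting out the delay, not hung.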

Here's how to find the real cause and fix it.

Step 1: Get the Crash Logs

```shell
# Get logs from the PREVIOUS (crashed) container instance
kubectl logs <pod-name> --previous

# If multiple containers in the pod
kubectl logs <pod-name> -c <container-name> --previous
```

The --previous flag is critical. Without it, you get logs from the freshly restarted instance, which often hasn't failed yet and may be empty.

Real output examples:

```
Error: Cannot find module '/app/server.js'
    at Function.Module._resolveFilename
```

→ Wrong entrypoint in Dockerfile

```
Error: connect ECONNREFUSED 127.0.0.1:5432
```

→ Database isn't reachable from inside the pod

```
Killed
```

→ OOM kill (memory limit exceeded)
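When the logs are empty or ambiguous, the kubelet still records why the last instance died in the pod status. This JSONPath query (pod name is a placeholder, and it assumes the first container is the one crashing) prints the reason directly, e.g. `OOMKilled` or `Error`:

```shell
# Prints the last recorded termination reason for the first container,
# e.g. "OOMKilled" or "Error". <pod-name> is a placeholder.
kubectl get pod <pod-name> \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```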

Step 2: Describe the Pod

```shell
kubectl describe pod <pod-name>
```

Look at the Events section at the bottom:

```
Events:
  Warning  BackOff    2m    kubelet  Back-off restarting failed container
  Warning  OOMKilling 3m    kubelet  Memory limit reached. Killing container my-app
```

Key events to look for:

  • OOMKilling — memory limit too low
  • FailedMount — volume or secret not found
  • Pulling + ErrImagePull — image tag doesn't exist
  • Unhealthy — liveness probe failing

Step 3: Check Resource Limits

```shell
kubectl get pod <pod-name> -o yaml | grep -A 10 resources
```

```yaml
resources:
  limits:
    memory: "64Mi"     # This is often way too low for Node apps
    cpu: "250m"
  requests:
    memory: "32Mi"
    cpu: "125m"
```

If your app needs 200Mi to start but the limit is 64Mi, it gets OOM killed before the first request.

Common Root Causes and Fixes

1. Missing Environment Variable

```shell
# Check what env vars the app expects
kubectl exec -it <pod-name> -- env | sort

# Compare against your deployment
kubectl get deployment <name> -o yaml | grep -A 20 env
```

Fix: Add the missing env var to your deployment YAML.
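For a quick fix without opening the manifest, `kubectl set env` patches the deployment in place and triggers a rollout. The variable name and value here are hypothetical examples:

```shell
# DATABASE_URL is a made-up example; use whatever variable your app expects.
kubectl set env deployment/<name> DATABASE_URL=postgres://db-service:5432/app
```

Remember this edits the live object; mirror the change back into your YAML so the next apply doesn't revert it.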

2. Database Connection Refused

```shell
# Test connectivity from inside the pod
kubectl exec -it <pod-name> -- nc -zv db-service 5432
```

Fix: Ensure the database service name matches what your app uses. In Kubernetes, use service names, not IPs.
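If the pod crashes too fast to exec into, run the same check from a throwaway debug pod instead (the image and service name here are examples):

```shell
# One-off busybox pod, deleted automatically when the command exits.
kubectl run debug --rm -it --image=busybox --restart=Never -- \
  nc -zv db-service 5432
```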

3. Memory Limit Too Low

```shell
# Check current memory usage before it dies
kubectl top pod <pod-name>
```

Fix:

```yaml
resources:
  limits:
    memory: "512Mi"  # Increase this
  requests:
    memory: "256Mi"
```
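The same bump can be applied without opening the manifest, e.g. with a JSON patch. This assumes the crashing container is the first one in the pod spec (index 0):

```shell
# Raises the memory limit on container 0 and triggers a rollout.
kubectl patch deployment <name> --type=json -p='[
  {"op": "replace",
   "path": "/spec/template/spec/containers/0/resources/limits/memory",
   "value": "512Mi"}
]'
```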

4. Wrong Image or Entrypoint

```shell
# Test the image locally
docker run --rm -it your-image:tag sh
# Then try running the entrypoint manually
```
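Before shelling in, it's worth checking what the image actually declares as its entrypoint; `docker inspect` shows both ENTRYPOINT and CMD:

```shell
# Prints the ENTRYPOINT and CMD baked into the image.
docker inspect your-image:tag \
  --format '{{.Config.Entrypoint}} {{.Config.Cmd}}'
```

If the path printed here doesn't exist in the image (as in the `Cannot find module '/app/server.js'` error above), the fix is in your Dockerfile, not your Kubernetes manifests.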

Quick Fix Command Sequence

```shell
# 1. Get crash reason
kubectl logs <pod-name> --previous 2>&1 | tail -30

# 2. Check k8s events
kubectl describe pod <pod-name> | grep -A 20 Events

# 3. Check resource usage
kubectl top pod <pod-name>

# 4. Once fixed, force rollout
kubectl rollout restart deployment/<name>

# 5. Watch the new pods
kubectl get pods -w
```

I built ARIA to solve exactly this.
Try it free at step2dev.com — no credit card needed.
