DEV Community

Yash

Kubernetes CrashLoopBackOff: Root Cause and Fix (With Real Examples)

```
NAME          READY   STATUS             RESTARTS   AGE
my-app-pod    0/1     CrashLoopBackOff   8          4m
```

You've seen this. Your pod is stuck in a death loop and you don't know why.

CrashLoopBackOff is not a Kubernetes bug. It means your container started, crashed, Kubernetes tried to restart it, and it crashed again — repeatedly, with increasing back-off delays.
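Those back-off delays roughly double after each crash, starting around 10 seconds and capping at five minutes (the kubelet resets the timer after the container runs cleanly for a while). A quick shell sketch of that schedule:

```shell
# Sketch of the kubelet's restart back-off schedule:
# the delay doubles after each crash, capped at 300s (5 minutes).
delay=10
for crash in 1 2 3 4 5 6; do
  echo "crash $crash: next restart in ${delay}s"
  delay=$((delay * 2))
  if [ "$delay" -gt 300 ]; then delay=300; fi
done
```

This is why a pod can sit in CrashLoopBackOff for minutes without any restart attempt: it's waiting out the delay, not hung.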

Here's how to find the real cause and fix it.

Step 1: Get the Crash Logs

```shell
# Get logs from the PREVIOUS (crashed) container instance
kubectl logs <pod-name> --previous

# If multiple containers in the pod
kubectl logs <pod-name> -c <container-name> --previous
```

The --previous flag is critical. Without it, you get logs from the freshly restarted instance, which often hasn't failed yet and may be empty.

Real output examples:

```
Error: Cannot find module '/app/server.js'
    at Function.Module._resolveFilename
```

→ Wrong entrypoint in Dockerfile

```
Error: connect ECONNREFUSED 127.0.0.1:5432
```

→ Database isn't reachable from inside the pod

```
Killed
```

→ OOM kill (memory limit exceeded)
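When the logs are empty or ambiguous, the kubelet still records why the last instance died in the pod status. This JSONPath query (pod name is a placeholder, and it assumes the first container is the one crashing) prints the reason directly, e.g. `OOMKilled` or `Error`:

```shell
# Prints the last recorded termination reason for the first container,
# e.g. "OOMKilled" or "Error". <pod-name> is a placeholder.
kubectl get pod <pod-name> \
  -o jsonpath='{.status.containerStatuses[0].lastState.terminated.reason}'
```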

Step 2: Describe the Pod

```shell
kubectl describe pod <pod-name>
```

Look at the Events section at the bottom:

```
Events:
  Warning  BackOff    2m    kubelet  Back-off restarting failed container
  Warning  OOMKilling 3m    kubelet  Memory limit reached. Killing container my-app
```

Key events to look for:

  • OOMKilling — memory limit too low
  • FailedMount — volume or secret not found
  • Pulling + ErrImagePull — image tag doesn't exist
  • Unhealthy — liveness probe failing

Step 3: Check Resource Limits

```shell
kubectl get pod <pod-name> -o yaml | grep -A 10 resources
```

```yaml
resources:
  limits:
    memory: "64Mi"     # This is often way too low for Node apps
    cpu: "250m"
  requests:
    memory: "32Mi"
    cpu: "125m"
```

If your app needs 200Mi to start but the limit is 64Mi, it gets OOM killed before the first request.

Common Root Causes and Fixes

1. Missing Environment Variable

```shell
# Check what env vars the app expects
kubectl exec -it <pod-name> -- env | sort

# Compare against your deployment
kubectl get deployment <name> -o yaml | grep -A 20 env
```

Fix: Add the missing env var to your deployment YAML.
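For a quick fix without opening the manifest, `kubectl set env` patches the deployment in place and triggers a rollout. The variable name and value here are hypothetical examples:

```shell
# DATABASE_URL is a made-up example; use whatever variable your app expects.
kubectl set env deployment/<name> DATABASE_URL=postgres://db-service:5432/app
```

Remember this edits the live object; mirror the change back into your YAML so the next apply doesn't revert it.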

2. Database Connection Refused

```shell
# Test connectivity from inside the pod
kubectl exec -it <pod-name> -- nc -zv db-service 5432
```

Fix: Ensure the database service name matches what your app uses. In Kubernetes, use service names, not IPs.
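If the pod crashes too fast to exec into, run the same check from a throwaway debug pod instead (the image and service name here are examples):

```shell
# One-off busybox pod, deleted automatically when the command exits.
kubectl run debug --rm -it --image=busybox --restart=Never -- \
  nc -zv db-service 5432
```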

3. Memory Limit Too Low

```shell
# Check current memory usage before it dies
kubectl top pod <pod-name>
```

Fix:

```yaml
resources:
  limits:
    memory: "512Mi"  # Increase this
  requests:
    memory: "256Mi"
```
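The same bump can be applied without opening the manifest, e.g. with a JSON patch. This assumes the crashing container is the first one in the pod spec (index 0):

```shell
# Raises the memory limit on container 0 and triggers a rollout.
kubectl patch deployment <name> --type=json -p='[
  {"op": "replace",
   "path": "/spec/template/spec/containers/0/resources/limits/memory",
   "value": "512Mi"}
]'
```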

4. Wrong Image or Entrypoint

```shell
# Test the image locally
docker run --rm -it your-image:tag sh
# Then try running the entrypoint manually
```
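Before shelling in, it's worth checking what the image actually declares as its entrypoint; `docker inspect` shows both ENTRYPOINT and CMD:

```shell
# Prints the ENTRYPOINT and CMD baked into the image.
docker inspect your-image:tag \
  --format '{{.Config.Entrypoint}} {{.Config.Cmd}}'
```

If the path printed here doesn't exist in the image (as in the `Cannot find module '/app/server.js'` error above), the fix is in your Dockerfile, not your Kubernetes manifests.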

Quick Fix Command Sequence

```shell
# 1. Get crash reason
kubectl logs <pod-name> --previous 2>&1 | tail -30

# 2. Check k8s events
kubectl describe pod <pod-name> | grep -A 20 Events

# 3. Check resource usage
kubectl top pod <pod-name>

# 4. Once fixed, force rollout
kubectl rollout restart deployment/<name>

# 5. Watch the new pods
kubectl get pods -w
```

I built ARIA to solve exactly this.
Try it free at step2dev.com — no credit card needed.
