DEV Community

Mumtaz Jahan


How I Fixed a Kubernetes CrashLoopBackOff in Production

Tags: kubernetes devops debugging cloud

Problem: Application Was DOWN in Kubernetes

One of the most stressful moments in DevOps — you check your monitoring dashboard and your application is completely DOWN in Kubernetes. No graceful degradation. Just... down.

Here's exactly how I diagnosed and fixed it in under an hour.


Issue Found: Pod Was in CrashLoopBackOff

Running kubectl get pods revealed the culprit immediately:

NAME                        READY   STATUS             RESTARTS   AGE
my-app-7d9f8b6c4-xk2pq     0/1     CrashLoopBackOff   8          20m

CrashLoopBackOff means Kubernetes repeatedly starts your container, the container crashes, and Kubernetes backs off with increasing wait times before retrying. Something inside the container was causing it to exit immediately on startup.
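You can watch the crash loop happen in real time. The pod name below is the one from my incident; substitute your own:

```shell
# Watch the restart counter climb in real time
kubectl get pod my-app-7d9f8b6c4-xk2pq -w

# List recent events for the pod, newest last. Look for the
# "Back-off restarting failed container" messages as the backoff kicks in.
kubectl get events \
  --field-selector involvedObject.name=my-app-7d9f8b6c4-xk2pq \
  --sort-by=.lastTimestamp
```

The backoff delay roughly doubles after each crash (10s, 20s, 40s, ...) and caps at five minutes, which is why a pod can sit "doing nothing" for minutes between restart attempts.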


Debug Steps

Step 1: Checked Logs (kubectl logs)

kubectl logs my-app-7d9f8b6c4-xk2pq --previous

The --previous flag is crucial here — it lets you see logs from the crashed container, not the current (possibly empty) one. The logs showed repeated connection errors on startup.

Step 2: Checked Config

kubectl describe pod my-app-7d9f8b6c4-xk2pq

I inspected the environment variables and ConfigMaps attached to the pod. The describe command is a goldmine — it shows events, resource limits, volume mounts, and more.
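Because describe prints a lot, it can help to narrow the output down to just the environment (the pod and deployment names are the ones from this incident; adjust to yours):

```shell
# Show just the Environment section from describe
kubectl describe pod my-app-7d9f8b6c4-xk2pq | grep -A 15 'Environment:'

# Or read the env straight off the deployment spec
kubectl get deployment my-app \
  -o jsonpath='{.spec.template.spec.containers[0].env}'
```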

Step 3: Found DB Connection Issue

The logs made it clear: the app was trying to connect to the database using an incorrect connection string. The host value in the environment variable was pointing to a stale endpoint. The app would crash immediately on boot since it couldn't reach the DB.
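To confirm what the app was actually being given, you can decode the value from the secret. Secret data is base64-encoded, so a plain get won't show the real string (the secret and key names here are the ones from my setup):

```shell
# Pull one key out of the secret and decode it
kubectl get secret my-app-db-secret \
  -o jsonpath='{.data.DB_HOST}' | base64 -d
```

In my case this printed the stale host, confirming the bad endpoint before I changed anything.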


Fix Applied

Corrected Environment Variables

Updated the Kubernetes secret/configmap with the correct database host:

kubectl edit secret my-app-db-secret
# or
kubectl set env deployment/my-app DB_HOST=correct-db-host.internal

Restarted the Deployment

kubectl rollout restart deployment/my-app

Then watched the rollout:

kubectl rollout status deployment/my-app
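Once the rollout finishes, it's worth confirming the pods are actually healthy and not quietly restarting again. The app=my-app label is an assumption here; use whatever labels your deployment sets:

```shell
# Watch new pods replace the crashing one
kubectl get pods -l app=my-app -w

# Flag any pod whose restart count is nonzero
# (column 4 of the default output is RESTARTS)
kubectl get pods -l app=my-app --no-headers \
  | awk '$4 > 0 {print $1 " has restarted " $4 " times"}'
```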

🎉 Result

 Application UP
 Issue resolved

The pods came up healthy, readiness probes passed, and traffic started flowing again.


Key Takeaways

Always check logs with --previous — the live container may have no logs if it crashes before writing any.

kubectl describe pod is your best friend for seeing the full picture: events, env vars, resource pressure.

CrashLoopBackOff is almost always one of: bad env vars/secrets, missing config, OOM kill, or a bug triggered at startup.

kubectl rollout restart is safer than deleting pods manually — it performs a rolling restart, so healthy replicas keep serving traffic while new pods come up.
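One preventive follow-up worth considering — a sketch, not what I actually deployed; the port, busybox image, and key names are assumptions for illustration — is an init container that blocks until the DB answers. That way a bad endpoint leaves the pod stuck in Init with a clear "waiting for db" log, instead of crash-looping the app itself:

```shell
# Patch the deployment to wait for the DB before the app container starts.
# kubectl patch accepts YAML for -p; strategic merge is the default type.
kubectl patch deployment my-app -p '
spec:
  template:
    spec:
      initContainers:
      - name: wait-for-db
        image: busybox
        command: ["sh", "-c", "until nc -z \"$DB_HOST\" 5432; do echo waiting for db; sleep 2; done"]
        env:
        - name: DB_HOST
          valueFrom:
            secretKeyRef:
              name: my-app-db-secret
              key: DB_HOST
'
```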


Hit a similar issue? Drop your debugging story in the comments!

