Most DevOps tutorials have a problem.
They explain things like this:
“Here is what CrashLoopBackOff means.”
“Here is how to fix it.”
But real DevOps work doesn’t look like that.
Real incidents look like this:
kubectl get pods
kubectl logs api
kubectl describe pod api
kubectl get services
You investigate.
You read logs.
You try commands.
You guess.
You debug.
So I built a small DevOps Learning Simulator where you practice debugging Kubernetes incidents the way you would in a real environment.
The Idea
Instead of reading solutions, you investigate problems interactively using real kubectl commands.
You run commands such as:
kubectl get pods
kubectl logs <pod>
kubectl describe pod <pod>
kubectl get services
kubectl describe service <service>
kubectl get endpoints
Then you try to find the root cause of the incident.
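To make the idea concrete, here is a minimal sketch of how a command-driven simulator could be structured, assuming each scenario is a table of canned outputs keyed by command. The names `SCENARIO` and `run_command` are hypothetical and chosen for illustration; this is not necessarily how the simulator in the repository is implemented.

```python
# Hypothetical scenario table: simulated command -> canned output.
# The pod listing is the one shown in this article; the table structure is an assumption.
SCENARIO = {
    "kubectl get pods --show-labels": (
        "NAME READY STATUS RESTARTS AGE LABELS\n"
        "api-deployment-7d4f8b9c 0/1 CrashLoopBackOff 8 36m -\n"
        "nginx-deployment-5f6g7h8i 1/1 Running 0 5m app=nginx"
    ),
}


def run_command(command, scenario):
    """Return the canned output for a command, or an error for unsupported input."""
    normalized = " ".join(command.split())  # collapse repeated whitespace
    if normalized in scenario:
        return scenario[normalized]
    return "error: unsupported command: " + normalized


if __name__ == "__main__":
    # Extra spaces in the typed command are tolerated.
    print(run_command("kubectl get pods   --show-labels", SCENARIO))
```

Keeping scenarios as plain data would make it easy to add new incidents without touching the command loop.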
Example Incident
You start the simulator and run:
kubectl get pods --show-labels
Output:
NAME                        READY   STATUS             RESTARTS   AGE   LABELS
api-deployment-7d4f8b9c     0/1     CrashLoopBackOff   8          36m   -
nginx-deployment-5f6g7h8i   1/1     Running            0          5m    app=nginx
You investigate logs:
kubectl logs api-deployment
Then describe the pod:
kubectl describe pod api-deployment
Eventually, you discover the issue:
A missing ConfigMap caused the container to crash.
The simulator then checks your answer and gives feedback.
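The article does not say how answers are graded, so the following is only one possible approach: a sketch that accepts an answer when it mentions every expected keyword. The function `check_answer` and the list `ROOT_CAUSE_KEYWORDS` are hypothetical names, not taken from the project.

```python
def check_answer(answer, required_keywords):
    """Return True when every required keyword appears in the answer (case-insensitive)."""
    normalized = answer.lower()
    return all(keyword in normalized for keyword in required_keywords)


# Assumed keywords for the missing-ConfigMap incident described above.
ROOT_CAUSE_KEYWORDS = ["missing", "configmap"]

if __name__ == "__main__":
    print(check_answer("A missing ConfigMap caused the container to crash.", ROOT_CAUSE_KEYWORDS))  # True
    print(check_answer("The image tag is wrong.", ROOT_CAUSE_KEYWORDS))  # False
```

Keyword matching is forgiving about phrasing but can accept a wrong answer that happens to contain the right words, so a stricter check may be preferable in practice.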
Current Scenarios
Version 1 includes several common production incidents:
• CrashLoopBackOff caused by missing configuration
• OOMKilled due to incorrect memory limits
• DNS / service selector mismatch
These are problems engineers regularly see when working with Kubernetes.
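The selector mismatch case is a good illustration of why these incidents are tricky: a Service only routes to pods whose labels contain every key/value pair in its selector, so a single wrong label leaves the Service with no endpoints. The sketch below models that rule in plain Python. The pod names come from the example output in this article, while the label values and the `app=api` selector are assumptions made for illustration.

```python
def selector_matches(selector, pod_labels):
    """Return True when the pod carries every key/value pair in the selector."""
    if not selector:
        # A Service without a selector does not select pods automatically.
        return False
    return all(pod_labels.get(key) == value for key, value in selector.items())


def matching_pods(selector, pods):
    """Return the names of pods the selector would pick up, in insertion order."""
    return [name for name, labels in pods.items() if selector_matches(selector, labels)]


# Assumed state: the api pod has no labels, the nginx pod is labeled app=nginx.
PODS = {
    "api-deployment-7d4f8b9c": {},
    "nginx-deployment-5f6g7h8i": {"app": "nginx"},
}

if __name__ == "__main__":
    # A hypothetical Service selecting app=api finds no pods, so it has no endpoints.
    print(matching_pods({"app": "api"}, PODS))    # []
    print(matching_pods({"app": "nginx"}, PODS))  # ['nginx-deployment-5f6g7h8i']
```

In a real cluster, the same mismatch shows up as an empty address list in the output of kubectl get endpoints.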
Why I Built This
Many developers learn DevOps tools but never practice debugging real incidents.
They know commands but haven’t used them in a realistic investigation.
This project aims to address that by providing a safe environment for troubleshooting practice.
Think of it like a flight simulator for DevOps engineers.
Try It
You can try it here:
GitHub repository: https://github.com/FarooqShabbir/devops_simulator
Run it locally:
git clone https://github.com/FarooqShabbir/devops_simulator.git
cd devops_simulator
python devops_simulator.py
Then start investigating incidents.
Future Plans
Some ideas for future versions:
• CI/CD pipeline failures
• Infrastructure drift debugging
• Network policy issues
• Multi-service production incidents
If you have ideas or want to contribute, feel free to open an issue or pull request.
Feedback
I would love to hear from other DevOps engineers:
What incidents would you add to a DevOps debugging simulator?