DEV Community

Farooq Shabbir

I Built a DevOps Simulator to Practice Kubernetes Debugging

Most DevOps tutorials have a problem.

They explain things like this:

“Here is what CrashLoopBackOff means.”
“Here is how to fix it.”

But real DevOps work doesn’t look like that.

Real incidents look like this:

kubectl get pods
kubectl logs api
kubectl describe pod api
kubectl get services

You investigate.
You read logs.
You try commands.
You guess.
You debug.

So I built a small DevOps Learning Simulator where you practice debugging Kubernetes incidents the way you would in a real environment.

The Idea

Instead of reading solutions, you investigate problems interactively using real kubectl commands.

You run commands such as:

kubectl get pods
kubectl logs <pod>
kubectl describe pod <pod>
kubectl get services
kubectl describe service <service>
kubectl get endpoints

Then you try to find the root cause of the incident.
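To make the idea concrete, here is a minimal sketch of how a simulator could map typed commands to canned output. It is not the actual code from the repository: the `SCENARIO` dictionary, the `run_command` function, and the log text are all hypothetical names and data chosen for illustration, and only two commands are handled.

```python
import shlex

# Hypothetical canned cluster state; the real simulator may store this differently.
SCENARIO = {
    "pods": {
        "api": {
            "status": "CrashLoopBackOff",
            # Made-up log line, used only for this illustration.
            "logs": "Error: required configuration not found",
        },
        "nginx": {"status": "Running", "logs": "nginx started"},
    }
}


def run_command(line, scenario=SCENARIO):
    """Return simulated output for a small subset of kubectl commands."""
    try:
        args = shlex.split(line)
    except ValueError:
        return "error: could not parse command"
    if not args or args[0] != "kubectl":
        return "error: only kubectl commands are supported"

    pods = scenario["pods"]
    # kubectl get pods: list every pod with its status.
    if args[1:] == ["get", "pods"]:
        rows = [f"{name}\t{pod['status']}" for name, pod in pods.items()]
        return "\n".join(["NAME\tSTATUS"] + rows)
    # kubectl logs <pod>: return the canned log text for one pod.
    if len(args) == 3 and args[1] == "logs":
        pod = pods.get(args[2])
        return pod["logs"] if pod else f'error: pod "{args[2]}" not found'
    return "error: unsupported command"


if __name__ == "__main__":
    print(run_command("kubectl get pods"))
    print(run_command("kubectl logs api"))
```

Keeping the cluster state in plain data means a new incident is just a new dictionary, with no cluster required.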

Example Incident

You start the simulator and run:

kubectl get pods --show-labels

Output:

NAME                        READY   STATUS             RESTARTS   AGE   LABELS
api-deployment-7d4f8b9c     0/1     CrashLoopBackOff   8          36m   -
nginx-deployment-5f6g7h8i   1/1     Running            0          5m    app=nginx

You investigate logs:

kubectl logs api-deployment-7d4f8b9c

Then describe the pod:

kubectl describe pod api-deployment-7d4f8b9c

Eventually, you discover the issue:

A missing ConfigMap caused the container to crash.

The simulator then checks your answer and gives feedback.
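One simple way to check a free-text diagnosis is keyword matching. The sketch below assumes that approach; the repository may grade answers differently, and `check_answer` and its keyword list are hypothetical names used only for illustration.

```python
def check_answer(answer, required_keywords):
    """Accept a free-text diagnosis if it mentions every required keyword."""
    normalized = answer.lower()
    # Collect the keywords the answer failed to mention (case-insensitive).
    missing = [kw for kw in required_keywords if kw.lower() not in normalized]
    if not missing:
        return True, "Correct: you identified the root cause."
    return False, "Not quite. Your answer does not mention: " + ", ".join(missing)


if __name__ == "__main__":
    ok, feedback = check_answer(
        "The ConfigMap is missing, so the container crashes",
        ["configmap", "missing"],
    )
    print(ok, feedback)
```

Keyword matching is forgiving about phrasing but can be fooled by answers that mention the right words for the wrong reason, so treat it as a starting point.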

Current Scenarios

Version 1 includes several common production incidents:

• CrashLoopBackOff caused by missing configuration
• OOMKilled due to incorrect memory limits
• DNS / service selector mismatch

These are problems engineers regularly see when working with Kubernetes.
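The selector mismatch case is easy to reason about in code. In Kubernetes, a Service sends traffic to the pods whose labels contain every key/value pair in its selector; if no pod matches, the Service has no endpoints. The sketch below models only that matching rule. The function name `matching_pods` and the sample pods are hypothetical and not taken from the simulator.

```python
def matching_pods(selector, pods):
    """Return names of pods whose labels contain every key/value in the selector."""
    # A Service without a selector gets no automatically managed endpoints.
    if not selector:
        return []
    return [
        name
        for name, labels in pods.items()
        if all(labels.get(key) == value for key, value in selector.items())
    ]


# Hypothetical pods and their labels.
pods = {"api-1": {"app": "api"}, "nginx-1": {"app": "nginx"}}

healthy = matching_pods({"app": "nginx"}, pods)  # selector matches a pod
broken = matching_pods({"app": "web"}, pods)     # typo in selector: no endpoints

if __name__ == "__main__":
    print("healthy endpoints:", healthy)
    print("broken endpoints:", broken)
```

An empty result here corresponds to what you would look for with kubectl get endpoints during the investigation.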

Why I Built This

Many developers learn DevOps tools but never practice debugging real incidents.

They know commands but haven’t used them in a realistic investigation.

This project aims to address that by providing a safe environment for troubleshooting practice.

Think of it like a flight simulator for DevOps engineers.

Try It

You can try it here:

GitHub repository: https://github.com/FarooqShabbir/devops_simulator

Run it locally:

git clone https://github.com/FarooqShabbir/devops_simulator.git
cd devops_simulator
python devops_simulator.py

Then start investigating incidents.

Future Plans

Some ideas for future versions:

• CI/CD pipeline failures
• Infrastructure drift debugging
• Network policy issues
• Multi-service production incidents

If you have ideas or want to contribute, feel free to open an issue or pull request.

Feedback

I would love to hear from other DevOps engineers:

What incidents would you add to a DevOps debugging simulator?
