Most engineers learn what a CrashLoopBackOff or an OOMKill actually looks like during an incident. That is the worst possible time. So we built a tool to create those conditions on purpose.
Rotelle is an open-source Kubernetes failure simulator written in Rust. You deploy it into a cluster and trigger named failure modes on demand: crash loops, OOM kills, missing env vars, unreachable services, and more. Each one is reproducible, so you can drill on it, write the runbook before you need it, or use it as a fixture to test tooling against.
Reproducible failures are also why we are building this. The Infraware Terminal at infraware.dev investigates real incidents with a human approving every step, and you cannot train or evaluate something like that without realistic failures to point it at.
If you run Kubernetes, try it on a test cluster and open an issue with anything that breaks or any failure mode you want added. Adding a scenario is one Rust file plus one registration line.
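To give a feel for what "one file plus one registration line" could look like, here is a minimal sketch. It is not taken from the Rotelle codebase: the `Scenario` trait, the `CrashLoop` and `MissingEnvVar` structs, and the `registry` and `find` functions are all hypothetical names made up for illustration, and the real project may be organized quite differently.

```rust
// Hypothetical sketch: none of these names come from the Rotelle source.

/// A named, reproducible failure mode.
pub trait Scenario {
    /// Stable identifier used to trigger the scenario on demand.
    fn name(&self) -> &'static str;
    /// One-line description of the failure being simulated.
    fn summary(&self) -> &'static str;
}

// In a real layout, each scenario would sit in its own file.
pub struct CrashLoop;

impl Scenario for CrashLoop {
    fn name(&self) -> &'static str {
        "crash-loop"
    }
    fn summary(&self) -> &'static str {
        "container exits non-zero on every start"
    }
}

pub struct MissingEnvVar;

impl Scenario for MissingEnvVar {
    fn name(&self) -> &'static str {
        "missing-env-var"
    }
    fn summary(&self) -> &'static str {
        "container starts without a required environment variable"
    }
}

/// The registry: adding a scenario means adding one line here.
pub fn registry() -> Vec<Box<dyn Scenario>> {
    vec![
        Box::new(CrashLoop),
        Box::new(MissingEnvVar), // the single registration line for a new scenario
    ]
}

/// Look up a scenario by name, as a trigger endpoint or CLI might.
pub fn find(name: &str) -> Option<Box<dyn Scenario>> {
    registry().into_iter().find(|s| s.name() == name)
}
```

With a shape like this, `find("crash-loop")` returns the matching scenario and an unknown name returns `None`, so whatever triggers scenarios never needs to know about individual ones.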
Apache 2.0: github.com/infraware-dev/infraware-rotelle