Working on a microservices-based platform, I kept running into the same frustrating cycle:
Something breaks → we figure out the fix in real time → someone says "we should document this" → nobody does → same thing breaks 3 months later.
Even when runbooks existed, they were either too generic ("restart the pod") or so specific to whoever wrote them that they were useless to the next on-call person.
So I'm building RunbookAI — you describe your stack once (AKS, EKS, Node.js, PostgreSQL, whatever) and it generates incident playbooks grounded in SRE best practices. Stack-aware, not copy-pasted templates.
Very early stage — just launched a waitlist today.
My question to the Dev.to community: What's the worst runbook experience you've had? Was the runbook nonexistent, outdated, or just completely wrong? I'd love to understand the real pain before I build the wrong thing.
Waitlist: runbookai.in