A couple of years ago we built our own incident management system instead of buying one. I'd do it again. Here's why, and the pieces that mattered.
Why not buy?
We looked at PagerDuty, Incident.io, FireHydrant, and a couple of others. Good tools. Each was $40-80/user/month. For 40 engineers, that works out to roughly $20-40k/year (40 users × 12 months × $40-80 = $19,200-38,400).
The real problem: none of them fit our workflow exactly. We'd pay $30k/year and still have to work around the tool.
What we built
A small Slack-first tool. Total: ~3000 lines of Go. Took one engineer 3 weeks.
Features:
- /incident start [title] creates a channel, pings on-call, assigns a commander
- /incident update [message] appends to a timeline that gets used in the retro
- /incident severity [sev-1..sev-5] routes escalation based on severity
- /incident close triggers post-mortem doc auto-generation from the timeline
- Integrations with our monitoring, Jira, and status page
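To give a feel for how little code the core needs, here is a minimal sketch of the four subcommands above. It is not our production code: every type and function name (Command, ParseCommand, Incident, Apply, PostMortem, demoStart) is made up for illustration, the "unset" default severity is an assumption, state lives in memory, and the Slack, paging, escalation, and Jira calls are left out entirely.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
	"time"
)

// Command is one parsed /incident subcommand. All names in this sketch are
// hypothetical; the post only names the four subcommands.
type Command struct {
	Verb string // "start", "update", "severity", or "close"
	Arg  string // title, message, or severity label; ignored for "close"
}

// ParseCommand parses the text that follows "/incident" in the slash command.
func ParseCommand(text string) (Command, error) {
	verb, arg, _ := strings.Cut(strings.TrimSpace(text), " ")
	arg = strings.TrimSpace(arg)
	switch verb {
	case "start", "update":
		if arg == "" {
			return Command{}, fmt.Errorf("%s needs some text", verb)
		}
	case "severity":
		if len(arg) != 5 || arg[:4] != "sev-" || arg[4] < '1' || arg[4] > '5' {
			return Command{}, fmt.Errorf("severity must be sev-1..sev-5, got %q", arg)
		}
	case "close":
		// no argument needed
	default:
		return Command{}, fmt.Errorf("unknown subcommand %q", verb)
	}
	return Command{Verb: verb, Arg: arg}, nil
}

// Entry is one timestamped line in the incident timeline.
type Entry struct {
	At   time.Time
	Text string
}

// Incident holds in-memory state; the zero value means "not started".
type Incident struct {
	Title    string
	Severity string // "unset" until a severity command arrives (an assumption)
	Open     bool
	Timeline []Entry
}

// Apply updates the incident for one command. Channel creation, paging, and
// escalation would hang off these cases in a real handler; they are omitted.
func (inc *Incident) Apply(cmd Command, now time.Time) error {
	if cmd.Verb == "start" {
		if inc.Open {
			return errors.New("incident already open")
		}
		*inc = Incident{Title: cmd.Arg, Severity: "unset", Open: true}
		inc.Timeline = append(inc.Timeline, Entry{At: now, Text: "Incident started"})
		return nil
	}
	if !inc.Open {
		return errors.New("no open incident")
	}
	text := cmd.Arg // "update" logs the message as written
	switch cmd.Verb {
	case "severity":
		inc.Severity = cmd.Arg
		text = "Severity set to " + cmd.Arg
	case "close":
		inc.Open = false
		text = "Incident closed"
	}
	inc.Timeline = append(inc.Timeline, Entry{At: now, Text: text})
	return nil
}

// PostMortem renders the timeline as a plain-text draft document.
func (inc *Incident) PostMortem() string {
	var b strings.Builder
	fmt.Fprintf(&b, "Post-mortem: %s\nSeverity: %s\n\nTimeline:\n", inc.Title, inc.Severity)
	for _, e := range inc.Timeline {
		fmt.Fprintf(&b, "- %s %s\n", e.At.UTC().Format("15:04"), e.Text)
	}
	return b.String()
}

// demoStart is an arbitrary fixed time so the example run is reproducible.
var demoStart = time.Date(2024, 1, 15, 9, 30, 0, 0, time.UTC)

func main() {
	var inc Incident
	inputs := []string{"start API latency spike", "severity sev-2", "update rolled back deploy", "close"}
	for i, in := range inputs {
		cmd, err := ParseCommand(in)
		if err == nil {
			err = inc.Apply(cmd, demoStart.Add(time.Duration(i)*5*time.Minute))
		}
		if err != nil {
			fmt.Println("error:", err)
		}
	}
	fmt.Print(inc.PostMortem())
}
```

Running it replays one short incident and prints a post-mortem draft with four timeline entries, five minutes apart. The point of the shape is that the timeline is the single source of truth: every command appends to it, and the post-mortem is just a rendering of what was already logged.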
That's it. No 50-feature bloat.
What we skipped
Most of the fancy features in commercial tools go unused. We skipped:
- Custom roles and permissions
- Auto-generated stakeholder updates (we write better ones by hand)
- Post-mortem templates beyond the one we chose
- Runbook hosting (we use our docs repo)
Would I buy instead today?
If you're under 50 engineers, probably yes: buy. Your engineering time is more valuable than the tool cost.
If you're bigger and have specific workflow needs, build. A focused in-house tool beats a feature-bloated commercial one every time.
The worst option is buying a tool and then fighting it. Pick the fit, not the feature list.
Written by Dr. Samson Tanimawo
BSc · MSc · MBA · PhD
Founder & CEO, Nova AI Ops. https://novaaiops.com