Hey everyone 👋
Over the past couple of weeks, I’ve been building a side project called Opsrift.
It started from a pretty simple frustration: postmortems, handovers, and incident documentation take way too much time, and most of it is repetitive.
But while building it, I realized something more interesting:
The real problem isn’t writing postmortems. It’s understanding what actually happened during an incident.
So I ended up going a bit further than just a generator.
What Opsrift does right now
The platform is focused on incident workflows — mostly for people working in SRE, support, or operations.
Right now it includes:
- Postmortem generator
Takes incident data and generates structured postmortems in seconds.
- Handover generator
Useful for shift-based teams — turns messy updates into clean handovers.
- Runbook generator
Creates structured runbooks based on incident patterns or inputs.
- Incident Investigator (main focus)
This is the part I’m most interested in:
  - Pulls data from tools like Jira, PagerDuty, and Opsgenie
  - Correlates it with deployments from GitHub
  - Tries to reconstruct what actually happened (timeline, possible causes, etc.)
The goal is to reduce the time spent jumping between tools during investigations.
- Status page
Basic external communication for incidents.
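To make the Incident Investigator's correlation step a bit more concrete, here is a minimal sketch of the general idea: match an incident against recent deployments to the same service and merge them into one timeline. This is not Opsrift's actual code. The `Deployment` and `Incident` types, their field names, and the 2-hour lookback window are all assumptions made up for illustration, and no real Jira, PagerDuty, Opsgenie, or GitHub API is called.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta


@dataclass
class Deployment:  # hypothetical shape, e.g. derived from GitHub deployment data
    service: str
    sha: str
    deployed_at: datetime


@dataclass
class Incident:  # hypothetical shape, e.g. derived from a Jira or PagerDuty record
    key: str
    service: str
    started_at: datetime


def candidate_deployments(incident, deployments, window=timedelta(hours=2)):
    """Deployments to the incident's service within `window` before it started, newest first."""
    earliest = incident.started_at - window
    hits = [
        d for d in deployments
        if d.service == incident.service
        and earliest <= d.deployed_at <= incident.started_at
    ]
    return sorted(hits, key=lambda d: d.deployed_at, reverse=True)


def build_timeline(incident, deployments):
    """Merge candidate deployments and the incident start into one chronological list."""
    events = [
        (d.deployed_at, f"deploy {d.sha} to {d.service}")
        for d in candidate_deployments(incident, deployments)
    ]
    events.append((incident.started_at, f"incident {incident.key} opened"))
    return sorted(events)
```

A deployment shortly before an incident is only a candidate cause, not a confirmed one, which is why the sketch returns a ranked list instead of picking a single answer.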
Integrations
Current integrations:
- Jira
- PagerDuty
- Opsgenie
- GitHub
- Slack
- Confluence
Still early — some of these are rough.
What it’s NOT (yet)
I want to be upfront:
- It’s not a replacement for your incident management tools
- It’s not perfect at root cause analysis
- It’s not “production-grade” in every edge case
Right now it’s closer to:
an AI layer on top of your existing tools to speed up investigation and documentation
Known issues
To save you time:
GitHub login ❌ (bugged right now)
Slack login ❌ (also bugged)
👉 You can still use:
Google login
Email/password signup
Fixing these next.
What I’m trying to figure out
This is where I’d really appreciate help.
I’m trying to validate a few things:
- Does the Incident Investigator actually help or is it just “nice to have”?
- Are the outputs accurate enough to be trusted?
- Would you use something like this in real workflows?
- What’s missing for it to be genuinely useful?
Where I want to take this
Longer term, I’m thinking about moving beyond just generating outputs and more into:
- detecting patterns across incidents
- identifying unstable services
- highlighting teams with high escalation rates
- correlating deployments with incidents automatically
Basically:
turning incident data into something you can actually act on
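As a rough illustration of what "unstable services" and "high escalation rates" could mean in practice, here is a small sketch that counts incidents per service and computes the share of escalated incidents per team. None of this exists in Opsrift today as far as this post describes. The dictionary keys (`service`, `team`, `escalated`) and the `min_incidents` threshold are hypothetical choices for the example.

```python
from collections import Counter


def unstable_services(incidents, min_incidents=3):
    """Services with at least `min_incidents` incidents, most incident-prone first."""
    counts = Counter(i["service"] for i in incidents)
    return [(svc, n) for svc, n in counts.most_common() if n >= min_incidents]


def escalation_rates(incidents):
    """Fraction of each team's incidents that were escalated."""
    totals, escalated = Counter(), Counter()
    for i in incidents:
        totals[i["team"]] += 1
        escalated[i["team"]] += 1 if i["escalated"] else 0
    return {team: escalated[team] / totals[team] for team in totals}
```

Simple counts like these are only a starting point. A real version would probably need to normalize by traffic, team size, or time period before the numbers are fair to act on.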
If you want to try it
👉 https://opsrift.com
No pressure — even quick feedback is super helpful.
Final note
I’ve worked in NOC/SOC and incident-heavy environments, so this is very much a “scratch your own itch” project.
That said, I’m aware tools like this can easily become:
too generic
inaccurate
or just another dashboard nobody uses
So I’d rather get honest feedback early.
Even if it’s:
“this doesn’t solve anything for me”
That’s useful.
Thanks in advance 🙌