DEV Community

Wachd

We built a self-hosted OpsGenie replacement with AI root cause analysis, and it runs completely air-gapped

Hey DevOps community 👋
Like many of you, we got the Atlassian email. OpsGenie is shutting down in April 2027.
We looked at the alternatives. PagerDuty and Better Stack are SaaS, which is a hard no for regulated environments. Grafana OnCall just entered archive mode. Every self-hosted option was just a router: it knew who to call but had no idea why the alert fired.
So we built what we actually needed.
Wachd is a self-hosted alert intelligence platform that runs entirely inside your Kubernetes cluster. One Helm chart. No data leaves your network.
When an alert fires from Grafana, Datadog, or any webhook source, Wachd does more than wake up an engineer with an alert title. It automatically:
Fetches the last git commits to the affected service
Pulls recent error logs and metric history
Correlates the timeline: alert at T, last deploy at T-6min, error spike at T+1min
Strips all PII before any AI touches it
Sends the on-call engineer the probable cause in plain English
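To make the correlation and PII-stripping steps above concrete, here is a minimal Python sketch. It is not taken from the Wachd codebase: the event dictionaries, the `build_timeline`, `redact`, and `format_offset` helpers, the 30-minute window, and the two regex patterns are all assumptions for illustration, and a real redaction pass would need a much broader, audited pattern set.

```python
import re
from datetime import datetime, timedelta

# Hypothetical redaction rules (illustrative only, not exhaustive).
PII_PATTERNS = [
    (re.compile(r"[A-Za-z0-9._%+-]+@[A-Za-z0-9.-]+\.[A-Za-z]{2,}"), "<email>"),
    (re.compile(r"\b(?:\d{1,3}\.){3}\d{1,3}\b"), "<ip>"),
]


def redact(text):
    """Replace anything matching a PII pattern with a placeholder."""
    for pattern, placeholder in PII_PATTERNS:
        text = pattern.sub(placeholder, text)
    return text


def build_timeline(alert_time, events, window=timedelta(minutes=30)):
    """Keep events within `window` of the alert, sorted by time.

    Returns (kind, offset_in_minutes, redacted_detail) tuples, where the
    offset is relative to the alert (negative means before the alert).
    """
    nearby = [e for e in events if abs(e["time"] - alert_time) <= window]
    nearby.sort(key=lambda e: e["time"])
    return [
        (
            e["kind"],
            round((e["time"] - alert_time).total_seconds() / 60),
            redact(e["detail"]),
        )
        for e in nearby
    ]


def format_offset(minutes):
    """Render an offset in the T-6min / T+1min style used in the post."""
    return "T" if minutes == 0 else f"T{minutes:+d}min"


# Made-up sample data: one deploy, one error spike, one unrelated old commit.
alert = datetime(2025, 1, 1, 12, 0)
events = [
    {"kind": "error_spike", "time": datetime(2025, 1, 1, 12, 1),
     "detail": "500s from 10.0.0.7"},
    {"kind": "deploy", "time": datetime(2025, 1, 1, 11, 54),
     "detail": "deploy abc123 by jane@example.com"},
    {"kind": "commit", "time": datetime(2025, 1, 1, 9, 0),
     "detail": "old unrelated commit"},
]
timeline = build_timeline(alert, events)
for kind, minutes, detail in timeline:
    print(format_offset(minutes), kind, detail)
```

In this sketch redaction happens while the timeline is built, so nothing unredacted is ever handed to the model step.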
The AI runs locally via Ollama, using Llama 3, Mistral, or Phi-3 on your own hardware. Zero external API calls. For teams comfortable with cloud AI, one config line switches to Claude or OpenAI.
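For readers who have not used Ollama, the sketch below shows roughly what a local-only model call looks like in Python. It targets Ollama's standard `/api/generate` endpoint on its default port 11434; the prompt wording and the `build_request` and `ask_local_model` helpers are hypothetical and do not reflect how Wachd itself talks to the model.

```python
import json
import urllib.request

# Ollama's default local endpoint; nothing here leaves the host.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_request(timeline_text, model="llama3"):
    """Build a non-streaming generate request for a local Ollama server."""
    payload = {
        "model": model,
        # Hypothetical prompt wording, for illustration only.
        "prompt": "Given this incident timeline, state the most probable "
                  "cause in plain English:\n" + timeline_text,
        "stream": False,  # ask for a single JSON object, not a stream
    }
    return urllib.request.Request(
        OLLAMA_URL,
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask_local_model(timeline_text, model="llama3"):
    """Send the request and return the model's text (needs Ollama running)."""
    request = build_request(timeline_text, model)
    with urllib.request.urlopen(request, timeout=60) as resp:
        return json.loads(resp.read())["response"]
```

Calling `ask_local_model("T-6min deploy\nT+1min error_spike")` requires an Ollama server running locally with the chosen model already pulled.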
It also includes full on-call scheduling, rotation management, escalation chains, override calendars, AD and SSO integration, SMS and voice via Twilio, and a Slack bot with action buttons.
Apache 2.0. Everything open.
We built it for the teams the SaaS vendors cannot serve: banks, fintechs, healthcare, defence, and anywhere else data residency is non-negotiable.
Would love feedback from anyone running on-call in regulated environments or evaluating OpsGenie replacements.
👉 github.com/wachd/wachd
🌐 wachd.io
https://www.producthunt.com/products/wachd
