Grafana OnCall is an open-source incident response tool for DevOps and SRE teams. It provides on-call scheduling, alert routing, and escalation policies — all integrated with your existing monitoring stack.
What Is Grafana OnCall?
Grafana OnCall (formerly Amixr) is an on-call management system that integrates natively with Grafana, Prometheus, Alertmanager, and 100+ monitoring tools. It routes alerts to the right people at the right time.
Key Features:
- On-call schedules and rotations
- Alert routing and escalation policies
- Slack, Telegram, MS Teams notifications
- Phone calls and SMS alerts
- Grafana native integration
- ChatOps for incident management
- Public API for automation
Installation
# Via Docker Compose
git clone https://github.com/grafana/oncall.git
cd oncall
docker compose up -d
# Or via Helm
helm repo add grafana https://grafana.github.io/helm-charts
helm install oncall grafana/oncall -n oncall --create-namespace
Grafana OnCall API: Automate Incident Response
import requests

BASE = "https://oncall.example.com/api/v1"
HEADERS = {"Authorization": "Bearer YOUR_API_TOKEN"}

# List on-call schedules
schedules = requests.get(f"{BASE}/schedules", headers=HEADERS).json()
for schedule in schedules["results"]:
    print(f"Schedule: {schedule['name']}, Team: {schedule['team_id']}")

# Get current on-call users for one schedule (replace SCHEDULE_ID with a real ID)
schedule_id = "SCHEDULE_ID"
oncall = requests.get(
    f"{BASE}/schedules/{schedule_id}/on-call",
    headers=HEADERS
).json()
print(f"On-call now: {oncall['users']}")
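The listing above reads a single page of `results`. As a minimal sketch, assuming the API paginates and returns a `next` URL next to `results` (an assumption to verify against the API reference), a small generator can walk every page. The `iter_results` helper and the `FAKE_PAGES` data are hypothetical and exist only so the example runs offline.

```python
from typing import Callable, Dict, Iterator, Optional


def iter_results(fetch_page: Callable[[str], Dict], first_url: str) -> Iterator[Dict]:
    """Yield every item across pages, following each page's "next" link."""
    url: Optional[str] = first_url
    while url:
        page = fetch_page(url)
        yield from page.get("results", [])
        url = page.get("next")  # a missing or None "next" ends the loop


# Hypothetical in-memory pages standing in for HTTP responses.
FAKE_PAGES = {
    "page-1": {"results": [{"name": "Primary"}, {"name": "Secondary"}], "next": "page-2"},
    "page-2": {"results": [{"name": "Weekend"}], "next": None},
}

names = [s["name"] for s in iter_results(FAKE_PAGES.__getitem__, "page-1")]
print(names)
```

Against a live server, `fetch_page` could be `lambda url: requests.get(url, headers=HEADERS).json()`, starting from `f"{BASE}/schedules"`.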
Alert Routing Rules
# Create an integration (webhook endpoint); BASE and HEADERS are defined above
integration = requests.post(
    f"{BASE}/integrations",
    headers=HEADERS,
    json={
        "type": "webhook",
        "name": "Production Alerts",
        "team_id": "TEAM_ID"
    }
).json()
print(f"Webhook URL: {integration['link']}")

# Create routing rule
route = requests.post(
    f"{BASE}/routes",
    headers=HEADERS,
    json={
        "integration_id": integration["id"],
        "routing_regex": "severity=critical",
        "escalation_chain_id": "CHAIN_ID",
        "position": 0
    }
).json()
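It can help to check a routing pattern locally before creating the route. The sketch below assumes the server applies `routing_regex` as a regular-expression search over the JSON-serialized alert payload. That is an assumption about OnCall's behavior, not something this article confirms, and `route_matches` is a hypothetical helper. Under that assumption, a pattern such as `severity=critical` would not match a JSON body, while a pattern written against the serialized form would.

```python
import json
import re


def route_matches(routing_regex: str, payload: dict) -> bool:
    """Return True if the pattern is found anywhere in the serialized payload.

    Assumption: the server searches the JSON text of the alert, so the
    pattern has to match JSON syntax such as '"severity": "critical"'.
    """
    return re.search(routing_regex, json.dumps(payload)) is not None


payload = {
    "title": "High CPU on prod-web-01",
    "message": "CPU usage >95% for 10min",
    "severity": "critical",
}

# key=value style pattern: not present in the JSON text
print(route_matches("severity=critical", payload))

# Pattern written against the JSON text, tolerant of spacing
print(route_matches(r'"severity":\s*"critical"', payload))
```

If the real matching rules differ (for example, templated routes), adjust the helper to mirror them before relying on it.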
Escalation Chains
# Create escalation chain
chain = requests.post(
    f"{BASE}/escalation_chains",
    headers=HEADERS,
    json={"name": "Critical P1", "team_id": "TEAM_ID"}
).json()

# Add escalation policies; "position" sets the order of the steps
# Step 1: Notify on-call
requests.post(f"{BASE}/escalation_policies", headers=HEADERS, json={
    "escalation_chain_id": chain["id"],
    "type": "notify_persons",
    "persons_to_notify": ["USER_ID"],
    "position": 0
})

# Step 2: Wait 5 minutes (duration is in seconds)
requests.post(f"{BASE}/escalation_policies", headers=HEADERS, json={
    "escalation_chain_id": chain["id"],
    "type": "wait",
    "duration": 300,
    "position": 1
})

# Step 3: Notify entire team
requests.post(f"{BASE}/escalation_policies", headers=HEADERS, json={
    "escalation_chain_id": chain["id"],
    "type": "notify_team_members",
    "notify_to_team_members": "TEAM_ID",
    "position": 2
})
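The three steps repeat the same chain ID and differ only in type-specific fields and position. One possible way to keep longer chains readable is to describe the steps as data and derive the positions from their order. `build_policy_payloads` is a hypothetical helper for illustration; it only builds the request bodies and sends nothing.

```python
from typing import Dict, List


def build_policy_payloads(chain_id: str, steps: List[Dict]) -> List[Dict]:
    """Attach the chain ID and a zero-based position to each step, in order."""
    return [
        {"escalation_chain_id": chain_id, "position": position, **step}
        for position, step in enumerate(steps)
    ]


# Same three steps as above, using the article's placeholder IDs.
steps = [
    {"type": "notify_persons", "persons_to_notify": ["USER_ID"]},
    {"type": "wait", "duration": 300},
    {"type": "notify_team_members", "notify_to_team_members": "TEAM_ID"},
]

payloads = build_policy_payloads("CHAIN_ID", steps)
for payload in payloads:
    print(payload["position"], payload["type"])
```

Each payload could then be posted in a loop with `requests.post(f"{BASE}/escalation_policies", headers=HEADERS, json=payload)`.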
Trigger Alert via Webhook
curl -X POST https://oncall.example.com/integrations/v1/webhook/INTEGRATION_TOKEN/ \
  -H "Content-Type: application/json" \
  -d '{"title": "High CPU on prod-web-01", "message": "CPU usage >95% for 10min", "severity": "critical"}'
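The same request can be prepared from Python, which is convenient when alerts come from a script rather than a shell. This is a minimal sketch using only the standard library. `build_alert_request` is a hypothetical helper, and the host and token are the placeholders from the curl example. The block builds the request but does not send it.

```python
import json
import urllib.request


def build_alert_request(base_url: str, token: str, alert: dict) -> urllib.request.Request:
    """Build a POST request carrying the alert as a JSON body."""
    url = f"{base_url}/integrations/v1/webhook/{token}/"
    body = json.dumps(alert).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


alert = {
    "title": "High CPU on prod-web-01",
    "message": "CPU usage >95% for 10min",
    "severity": "critical",
}

req = build_alert_request("https://oncall.example.com", "INTEGRATION_TOKEN", alert)
print(req.get_method(), req.full_url)

# To actually send the alert (requires a reachable server and a real token):
# urllib.request.urlopen(req, timeout=10)
```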
Resources
- Grafana OnCall Docs
- Grafana OnCall GitHub — 3.5K+ stars
- API Reference