Automated Incident Response Workflows with n8n and Monitoring Tools
Most teams face the same challenge: alerts either go to everyone (causing fatigue) or get missed entirely (causing outages). The solution? Intelligent routing based on severity, business hours, and context.
Building a Smart Incident Response Workflow
We'll create an n8n workflow that connects Prometheus alerts to intelligent response actions. Here's what our system will do:
Analyze alert severity and timing
Route critical after-hours alerts to PagerDuty
Send routine alerts to Discord/Slack
Attempt automated resolution for common issues
Document everything for post-incident analysis
Step 1: Set Up Your Monitoring Stack
First, you need Prometheus and AlertManager configured to send alerts to n8n:
# AlertManager configuration
route:
  receiver: 'n8n-webhook'
  group_by: ['alertname', 'instance']
  group_wait: 30s
  group_interval: 1m
  repeat_interval: 30m
receivers:
  - name: 'n8n-webhook'
    webhook_configs:
      - url: 'http://your-n8n-instance:5678/webhook/prometheus'
        send_resolved: true
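Because send_resolved is true, the same webhook receives both firing and resolved notifications, so it helps to see roughly what arrives at n8n. The sketch below shows a trimmed AlertManager webhook payload (all values are made up for illustration) and a hypothetical helper, splitByStatus, that separates firing alerts from resolved ones. In n8n, the Webhook node typically exposes this payload under json.body.

```javascript
// Trimmed example of an AlertManager webhook payload.
// Field names follow the webhook format; the values are illustrative only.
const samplePayload = {
  version: "4",
  status: "firing",
  receiver: "n8n-webhook",
  groupLabels: { alertname: "HighCPU", instance: "web-1:9100" },
  alerts: [
    {
      status: "firing",
      labels: { alertname: "HighCPU", severity: "critical", instance: "web-1:9100" },
      annotations: { description: "CPU usage above 90% for 5 minutes" },
      startsAt: "2024-01-15T22:04:00Z",
    },
    {
      status: "resolved",
      labels: { alertname: "HighCPU", severity: "warning", instance: "web-2:9100" },
      annotations: { description: "CPU usage above 80% for 5 minutes" },
      startsAt: "2024-01-15T21:30:00Z",
      endsAt: "2024-01-15T21:50:00Z",
    },
  ],
};

// Hypothetical helper: split a payload's alerts by status so that resolved
// notifications can close incidents instead of being routed as new ones.
function splitByStatus(payload) {
  const alerts = Array.isArray(payload && payload.alerts) ? payload.alerts : [];
  return {
    firing: alerts.filter((a) => a.status === "firing"),
    resolved: alerts.filter((a) => a.status === "resolved"),
  };
}
```

One way to use this is to route only the firing list through classification and send the resolved list straight to the documentation step.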
Step 2: Build the n8n Workflow
The workflow consists of several key nodes:
Webhook Node - Receives alerts from AlertManager
Function Node (named "Code", the name the Step 3 expressions reference) - Classifies each alert by severity, timing, and duration:
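// AlertManager posts a batch of alerts; the Webhook node exposes it under json.body
const alerts = items[0].json.body.alerts || [];

return alerts.map(alert => {
  const startsAt = new Date(alert.startsAt);
  // Note: the hour is evaluated in UTC and weekends are not excluded;
  // adjust both for your team's time zone and schedule
  const hour = startsAt.getUTCHours();
  const isBusinessHours = hour >= 9 && hour < 17;
  // Minutes elapsed since the alert started firing
  const durationMinutes = (Date.now() - startsAt.getTime()) / 1000 / 60;
  // Guard against alerts that arrive without labels or annotations
  const labels = alert.labels || {};
  const annotations = alert.annotations || {};

  return {
    json: {
      alertname: labels.alertname,
      severity: labels.severity,
      instance: labels.instance,
      description: annotations.description,
      isBusinessHours: isBusinessHours,
      durationMinutes: durationMinutes
    }
  };
});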
Switch Node - Routes based on criticality and business hours:
Critical + After Hours → PagerDuty (immediate response)
Critical + Business Hours → Discord (urgent channel)
Non-critical → Discord (general alerts)
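The classification and routing rules above can also be sketched as two plain functions. Both names (isBusinessHours, routeAlert), the route labels, and the default time zone are assumptions for illustration, not part of n8n. The first function is a variant of the business-hours check that works in a named time zone and excludes weekends. The second maps a classified alert to one of the three routes.

```javascript
// Hypothetical helper: business-hours check in a named IANA time zone,
// Monday to Friday only. Start and end hours default to 9 and 17.
function isBusinessHours(date, timeZone = "UTC", startHour = 9, endHour = 17) {
  const parts = new Intl.DateTimeFormat("en-US", {
    timeZone,
    weekday: "short",
    hour: "numeric",
    hourCycle: "h23", // 0-23 clock, so midnight is reported as 0
  }).formatToParts(date);

  const weekday = parts.find((p) => p.type === "weekday").value;
  const hour = Number(parts.find((p) => p.type === "hour").value);
  const isWeekday = weekday !== "Sat" && weekday !== "Sun";

  return isWeekday && hour >= startHour && hour < endHour;
}

// Hypothetical helper mirroring the three Switch rules.
// The returned strings are made-up route labels.
function routeAlert(alert) {
  const isCritical = String(alert.severity || "").toLowerCase() === "critical";
  if (isCritical && !alert.isBusinessHours) return "pagerduty";
  if (isCritical) return "discord-urgent";
  return "discord-general";
}
```

If the classification step stores the result of routeAlert in a single field, the Switch node only has to compare one string instead of combining two conditions.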
Step 3: Add Intelligent Auto-Resolution
For common issues like high CPU usage, add an AI-powered decision node:
// AI prompt for auto-resolution decision
Analyze this alert to determine if auto-resolution should occur:
- Alert: {{ $node["Code"].json["alertname"] }}
- Severity: {{ $node["Code"].json["severity"] }}
- Description: {{ $node["Code"].json["description"] }}
- Duration: {{ $node["Code"].json["durationMinutes"] }} minutes
- Business Hours: {{ $node["Code"].json["isBusinessHours"] }}
Auto-resolve if:
1. CPU > 80% AND outside business hours
2. CPU > 90% AND duration < 5 minutes
3. Critical severity AND outside business hours
Return JSON: {"shouldAutoResolve": boolean, "reason": "explanation"}
This approach lets you automatically restart services or scale resources for known issues while escalating complex problems to humans.
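Since the model's reply decides whether a service gets restarted, it is worth validating that reply before acting on it. The sketch below is one possible guard, not a built-in n8n feature: parseDecision and AUTO_RESOLVE_ALLOWLIST are hypothetical names, and "HighCPU" is an assumed alert name. Any reply that is not valid JSON in the requested shape, or that targets an alert outside the allowlist, falls back to no auto-resolution.

```javascript
// Alert names considered safe to remediate automatically.
// "HighCPU" is an assumed example; replace it with your own alert names.
const AUTO_RESOLVE_ALLOWLIST = new Set(["HighCPU"]);

// Hypothetical helper: turn the model's raw text into a safe decision object.
function parseDecision(rawText, alertname) {
  let parsed;
  try {
    parsed = JSON.parse(rawText);
  } catch (err) {
    return { shouldAutoResolve: false, reason: "model output was not valid JSON" };
  }

  // Require the exact shape requested in the prompt.
  if (
    parsed === null ||
    typeof parsed !== "object" ||
    typeof parsed.shouldAutoResolve !== "boolean"
  ) {
    return { shouldAutoResolve: false, reason: "missing shouldAutoResolve boolean" };
  }

  // Never auto-resolve alerts that are not explicitly allowlisted.
  if (parsed.shouldAutoResolve && !AUTO_RESOLVE_ALLOWLIST.has(alertname)) {
    return {
      shouldAutoResolve: false,
      reason: `"${alertname}" is not on the auto-resolve allowlist`,
    };
  }

  const reason = typeof parsed.reason === "string" ? parsed.reason : "";
  return { shouldAutoResolve: parsed.shouldAutoResolve, reason };
}
```

Defaulting to escalation on any unexpected output keeps a malformed or overconfident model reply from triggering a restart.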
Step 4: Document Everything
Add a Notion or database node to log all incidents:
Timestamp and duration
Severity and affected services
Resolution method (auto vs manual)
Follow-up actions needed
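As a rough illustration, the fields listed above could be assembled into one record before it is sent to Notion or a database. The function name buildIncidentRecord and the field names are assumptions. Treating every manually handled incident as needing follow-up is a placeholder policy, not a recommendation.

```javascript
// Hypothetical helper: build a flat incident record from a classified alert
// and an auto-resolution decision ({ shouldAutoResolve, reason }).
function buildIncidentRecord(alert, decision, now = new Date()) {
  return {
    timestamp: now.toISOString(),
    alertname: alert.alertname,
    severity: alert.severity,
    instance: alert.instance,
    // Round to one decimal place to keep the log readable.
    durationMinutes: Math.round(alert.durationMinutes * 10) / 10,
    resolutionMethod: decision.shouldAutoResolve ? "auto" : "manual",
    resolutionReason: decision.reason,
    // Placeholder policy: anything handled manually gets flagged for follow-up.
    followUpNeeded: !decision.shouldAutoResolve,
  };
}
```

Keeping the record flat makes it straightforward to map each key to a Notion property or a database column.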
Sample Workflow Structure
Webhook (Prometheus)
  → Function (Classification)
  → Switch (Routing)
      ├── PagerDuty (Critical + After Hours)
      ├── Discord (Business Hours)
      └── AI Analysis (Auto-Resolution)
            ├── Lambda (Restart Service)
            └── Notion (Document Incident)
Getting Started
Set up basic monitoring with Prometheus and AlertManager
Create the n8n workflow with webhook and classification nodes
Add routing logic based on your team's needs
Implement auto-resolution for common, well-understood issues
Document and iterate based on real-world usage
Beyond Basic Automation
While this n8n workflow provides powerful automation capabilities, teams with critical uptime requirements might need more advanced features like:
Sub-minute monitoring intervals
AI-powered anomaly detection
Advanced correlation and grouping
Enterprise-grade reliability features
The key is starting with intelligent routing and basic automation, then expanding based on your team's specific needs and operational maturity.
For the complete n8n workflow JSON and deployment scripts, check out our full implementation guide.
Read more at https://bubobot.com/blog/automated-incident-response-workflows-with-n8n-and-monitoring-tools