Email Infrastructure Monitoring: Metrics That Actually Matter
Most email monitoring dashboards show vanity metrics. Here is what to actually watch, and how KumoMTA exposes it.
The Critical Metrics
Queue Depth
The number of messages waiting to be delivered. A queue that keeps growing means mail is arriving faster than it is being delivered, which makes this your earliest warning signal.
alert: queue_depth > 10000
action: check reputation, throttle, investigate
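A fixed threshold only fires once the queue is already large. As a rough sketch, the "growing queue" signal can also be checked directly from a series of depth readings. The function names, the five-sample window, and the idea of sampling at a fixed interval are assumptions made for illustration, not part of KumoMTA.

```python
from typing import Sequence


def queue_is_growing(samples: Sequence[int], min_samples: int = 5) -> bool:
    """Return True when depth rose between every pair of the last min_samples readings."""
    if len(samples) < min_samples:
        return False
    recent = samples[-min_samples:]
    return all(later > earlier for earlier, later in zip(recent, recent[1:]))


def check_queue(samples: Sequence[int], critical_depth: int = 10000) -> str:
    """Classify the latest reading as 'critical', 'growing', or 'ok'."""
    if not samples:
        return "ok"
    if samples[-1] > critical_depth:
        return "critical"
    if queue_is_growing(samples):
        return "growing"
    return "ok"
```

With readings taken once a minute, `check_queue([100, 200, 300, 400, 500])` reports `"growing"` long before the 10000 threshold is reached.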
Delivery Rate
Messages successfully delivered divided by messages attempted. It should stay above 98%. Sudden drops indicate reputation problems.
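The ratio is simple, but the zero-traffic case is easy to get wrong. A minimal sketch, assuming you already have delivered and attempted counts for the same time window (the function names are hypothetical):

```python
def delivery_rate(delivered: int, attempted: int) -> float:
    """Fraction of attempted messages that were delivered."""
    if attempted == 0:
        # No attempts in this window: treat as healthy rather than divide by zero.
        return 1.0
    return delivered / attempted


def below_target(delivered: int, attempted: int, target: float = 0.98) -> bool:
    """True when the delivery rate has fallen under the 98% target."""
    return delivery_rate(delivered, attempted) < target
```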
Bounce Rate by Enhanced Code
Total bounce counts are not enough. Specific bounce types tell you what to fix:
- 5.1.1 (unknown user) spikes = list quality problem
- 4.2.0 (temporary mailbox status issue) spikes = temporary receiver issue
- 5.7.1 (policy blocked) spikes = content or reputation problem
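The breakdown in the list above can be sketched as a counter keyed by enhanced status code. This example assumes you have the raw SMTP response text for each bounce. The regular expression and the `DIAGNOSIS` table are illustrative and only cover the three codes named here.

```python
import re
from collections import Counter
from typing import Iterable

# Matches an enhanced status code such as 5.1.1 or 4.2.0 inside a response line.
ENHANCED_CODE = re.compile(r"\b([245]\.\d{1,3}\.\d{1,3})\b")

# Diagnoses copied from the list above; extend this for the codes you see.
DIAGNOSIS = {
    "5.1.1": "list quality problem",
    "4.2.0": "temporary receiver issue",
    "5.7.1": "content or reputation problem",
}


def count_bounces(responses: Iterable[str]) -> Counter:
    """Count bounces per enhanced code; responses without one go under 'unknown'."""
    counts: Counter = Counter()
    for response in responses:
        match = ENHANCED_CODE.search(response)
        counts[match.group(1) if match else "unknown"] += 1
    return counts
```

Calling `count_bounces(...).most_common(3)` gives the codes worth investigating first, and `DIAGNOSIS.get(code)` maps each one to a likely cause.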
Reputation Score
Google Postmaster Tools reports domain and IP reputation in four bands:
- Bad: Emails are rejected or go to spam
- Low: High risk of spam placement
- Medium: Normal scrutiny
- High: Minimal filtering
KumoMTA Monitoring Endpoints
# Queue status
curl http://localhost:8000/api/queue
# Returns:
{
  "depth": 1247,
  "in_flight": 345,
  "connections": 45,
  "delivery_rate": "2340/minute",
  "bounce_rate": 0.023
}
# Per-domain stats
curl http://localhost:8000/api/domain/outbound.postmta.com
# Returns bounce classification
{
  "hard_bounce_rate": 0.008,
  "soft_bounce_rate": 0.015,
  "avg_latency_ms": 142,
  "gmail_complaint_rate": 0.001
}
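To feed these numbers into alerting, a small poller can turn the queue response into typed fields. This is a sketch under one strong assumption: that the endpoint returns exactly the JSON shown in the queue status example above. The URL and field names are copied from that example and should be verified against your KumoMTA version. `QueueStatus` and both function names are hypothetical.

```python
import json
import urllib.request
from dataclasses import dataclass


@dataclass
class QueueStatus:
    depth: int
    in_flight: int
    connections: int
    deliveries_per_minute: int
    bounce_rate: float


def parse_queue_status(payload: str) -> QueueStatus:
    """Convert the JSON body from the queue status example into typed fields."""
    data = json.loads(payload)
    # The sample reports throughput as a string such as "2340/minute".
    per_minute = int(str(data["delivery_rate"]).split("/", 1)[0])
    return QueueStatus(
        depth=int(data["depth"]),
        in_flight=int(data["in_flight"]),
        connections=int(data["connections"]),
        deliveries_per_minute=per_minute,
        bounce_rate=float(data["bounce_rate"]),
    )


def fetch_queue_status(url: str = "http://localhost:8000/api/queue",
                       timeout: float = 5.0) -> QueueStatus:
    """Fetch and parse queue status; the default URL is the one used in this article."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return parse_queue_status(response.read().decode("utf-8"))
```

Keeping parsing separate from fetching means the parser can be tested against a saved response without a running server.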
Alerting Strategy
Set alerts at these thresholds:
- Queue depth > 5000: Warning
- Queue depth > 10000: Critical
- Bounce rate > 2%: Warning
- Bounce rate > 5%: Critical
- Complaint rate > 0.1%: Critical
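These thresholds translate directly into a lookup table. The sketch below is one possible encoding, with rates expressed as fractions (2% is 0.02). The metric names and the `evaluate` function are made up for this example and are not a KumoMTA API.

```python
from typing import Dict, List, Optional, Tuple

# metric name -> (warning threshold, critical threshold); None means no such level.
THRESHOLDS: Dict[str, Tuple[Optional[float], Optional[float]]] = {
    "queue_depth": (5000, 10000),
    "bounce_rate": (0.02, 0.05),
    "complaint_rate": (None, 0.001),
}


def evaluate(metrics: Dict[str, float]) -> List[Tuple[str, str]]:
    """Return (metric, severity) pairs for every threshold that is exceeded."""
    alerts: List[Tuple[str, str]] = []
    for name, (warning, critical) in THRESHOLDS.items():
        value = metrics.get(name)
        if value is None:
            continue  # metric not reported in this sample
        if critical is not None and value > critical:
            alerts.append((name, "critical"))
        elif warning is not None and value > warning:
            alerts.append((name, "warning"))
    return alerts
```

A queue depth of 1247 with a bounce rate of 0.023 yields a single bounce-rate warning, since 2.3% is above the 2% warning level but below 5%.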
The Dashboard Pattern
Build a real-time dashboard with:
- Queue depth (time series)
- Delivery rate (time series)
- Bounce breakdown (pie chart by enhanced code)
- Top bounce domains (bar chart)
- Reputation score (Google Postmaster API)
PostMTA provides this dashboard as part of its managed service. For self-hosted KumoMTA, the built-in Prometheus metrics endpoint can be scraped by Prometheus and charted in Grafana.
Log Analysis
KumoMTA logs every delivery attempt with full context:
2026-05-18T10:23:45Z delivery success to mx.google.com [192.168.1.1]
2026-05-18T10:23:46Z bounce 550 5.1.1 from mx.example.com [10.0.0.5]
2026-05-18T10:23:47Z retry scheduled for mailbox@problem.com in 15m
Aggregate these logs to build your own BI reports on delivery performance over time.
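As a starting point for that aggregation, the sketch below counts events per hour. It assumes log lines shaped exactly like the three samples above. If your logs are emitted in another format, such as JSON records, the parsing step would change but the bucketing idea stays the same. The function name and the hourly bucket are illustrative choices.

```python
import re
from collections import Counter
from typing import Iterable

# Timestamp, then one of the three event phrases used in the sample lines.
LOG_LINE = re.compile(
    r"^(?P<ts>\d{4}-\d{2}-\d{2}T\d{2}:\d{2}:\d{2}Z)\s+"
    r"(?P<event>delivery success|bounce|retry scheduled)\b"
)


def count_events_by_hour(lines: Iterable[str]) -> Counter:
    """Count events keyed by (hour bucket, event type)."""
    counts: Counter = Counter()
    for line in lines:
        match = LOG_LINE.match(line.strip())
        if match is None:
            counts[("unknown", "unparsed")] += 1
            continue
        hour = match.group("ts")[:13]  # e.g. "2026-05-18T10"
        counts[(hour, match.group("event"))] += 1
    return counts
```

Dividing the bounce count by the total for each hour gives a bounce-rate time series that can go straight into a report.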