You have Prometheus running. You have Grafana dashboards. You have Slack alerts configured. And yet, somehow, your on-call engineer still sleeps through a critical outage at 3AM.
Here's the uncomfortable truth: humans ignore notifications. Push alerts get swiped away. Slack messages get buried under 200 others. Emails sit unread until morning standup, when the damage is already done.
But a phone call? That's much harder to ignore.
In this post, I'll show you how to build a production-ready AI voice alert system that calls your on-call engineer when production goes down — using VoIPBin's REST API and about 80 lines of Python.
The Problem with Silent Monitoring
Modern observability stacks are brilliant at generating alerts. They're terrible at making humans pay attention.
Alert fatigue is real. When everything is tagged critical, nothing feels critical. Engineers develop a Pavlovian response to notification sounds: dismiss, snooze, ignore.
The solution isn't more alerts — it's the right kind of alert at the right moment. A phone call interrupts whatever you're doing. It demands a response. And when production is on fire, that friction is exactly what you need.
What We're Building
A webhook handler that:
- Receives an alert from your monitoring system (Prometheus Alertmanager, Grafana, CloudWatch, Datadog — anything with webhook support)
- Calls your on-call engineer via an AI voice agent
- Reads out the alert details in plain English using text-to-speech
- Escalates through the on-call chain if the first engineer doesn't pick up
Prerequisites
- Python 3.8+
- A VoIPBin account — signup is API-first, no forms or email verification
- A phone number provisioned from VoIPBin
Step 1: Get Your API Token
VoIPBin signup takes one API call:
curl -X POST "https://api.voipbin.net/v1.0/auth/signup" \
  -H "Content-Type: application/json" \
  -d '{}'
You'll get back:
{
  "accesskey": {
    "token": "your-api-token-here"
  }
}
Store that token. Every API call uses it as a Bearer token.
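One way to keep the token out of source control is to read it from the environment and build the header once. This is a minimal sketch, not part of VoIPBin's API: the environment variable name VOIPBIN_TOKEN and the auth_headers helper are assumptions made for this example.

```python
import os


def auth_headers(env=None):
    """Build request headers from a token stored in the environment.

    VOIPBIN_TOKEN is an assumed variable name chosen for this example.
    """
    env = os.environ if env is None else env
    token = env.get("VOIPBIN_TOKEN", "").strip()
    if not token:
        # Fail early instead of sending unauthenticated requests.
        raise RuntimeError("VOIPBIN_TOKEN is not set")
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

The handler in Step 2 could then pass headers=auth_headers() instead of hardcoding the token in the module.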
Step 2: The Alert Webhook Handler
Install dependencies:
pip install flask requests
Now the handler:
from flask import Flask, request, jsonify
import requests
from threading import Thread
import time

app = Flask(__name__)

VOIPBIN_TOKEN = "your-api-token-here"
VOIPBIN_BASE = "https://api.voipbin.net/v1.0"
FROM_NUMBER = "+19876543210"  # Your VoIPBin number

# Escalation chain: calls in order until someone answers
ESCALATION_CHAIN = [
    "+1234567890",  # Primary on-call
    "+1987654321",  # Secondary on-call
    "+1555000111",  # Engineering manager
]

def make_voice_alert(alert_message: str, to_number: str) -> dict:
    """Trigger an AI voice call with the alert details."""
    payload = {
        "from": FROM_NUMBER,
        "to": to_number,
        "actions": [
            {
                "type": "talk",
                "text": (
                    "Attention. This is a production alert. "
                    f"{alert_message}. "
                    "Press 1 to acknowledge and stop escalation."
                ),
            },
            {
                "type": "gather",
                "timeout": 15,
                "num_digits": 1,
            },
        ],
    }
    response = requests.post(
        f"{VOIPBIN_BASE}/calls",
        headers={
            "Authorization": f"Bearer {VOIPBIN_TOKEN}",
            "Content-Type": "application/json",
        },
        json=payload,
        timeout=10,  # don't let a hung request stall the escalation thread
    )
    return response.json()

def escalate_alert(message: str, wait_seconds: int = 90):
    """Walk the escalation chain until someone picks up."""
    for number in ESCALATION_CHAIN:
        print(f"[ALERT] Calling {number}...")
        result = make_voice_alert(message, number)
        call_id = result.get("id")
        if not call_id:
            print(f"[ERROR] Failed to create call: {result}")
            continue
        # Wait for the call to complete, then check status
        time.sleep(wait_seconds)
        status_resp = requests.get(
            f"{VOIPBIN_BASE}/calls/{call_id}",
            headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
            timeout=10,
        )
        call_status = status_resp.json().get("status", "unknown")
        if call_status in ("answered", "completed"):
            print(f"[OK] Alert acknowledged by {number}")
            return
        print(f"[ESCALATE] No answer from {number}, trying next...")
    print("[CRITICAL] Escalation chain exhausted. Alert unacknowledged.")

@app.route("/webhook/alerts", methods=["POST"])
def handle_alert():
    """Receive Prometheus Alertmanager (or similar) webhooks."""
    # get_json(silent=True) returns None on a missing or invalid JSON body
    data = request.get_json(silent=True) or {}
    alerts = data.get("alerts", [])
    firing_alerts = [a for a in alerts if a.get("status") == "firing"]
    if not firing_alerts:
        return jsonify({"status": "no firing alerts"}), 200
    for alert in firing_alerts:
        alert_name = alert.get("labels", {}).get("alertname", "Unknown Alert")
        severity = alert.get("labels", {}).get("severity", "unknown")
        summary = alert.get("annotations", {}).get(
            "summary", "Check your systems immediately."
        )
        message = f"Severity {severity}. Alert name: {alert_name}. {summary}"
        # Escalate in background so we can return 200 immediately
        Thread(
            target=escalate_alert,
            args=(message,),
            daemon=True,
        ).start()
    return jsonify({"status": "escalation started", "alerts": len(firing_alerts)}), 200


if __name__ == "__main__":
    app.run(port=5000)
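To check the message wording without placing real calls, the parsing step can be pulled into a pure function and run against a sample payload. This is a sketch: build_messages is a hypothetical helper that mirrors the handler's logic, and the sample payload is invented and trimmed to the fields the handler reads.

```python
def build_messages(payload):
    """Return one spoken message per firing alert in an Alertmanager-style payload."""
    messages = []
    for alert in (payload or {}).get("alerts", []):
        if alert.get("status") != "firing":
            continue  # resolved alerts are not read out
        labels = alert.get("labels", {})
        annotations = alert.get("annotations", {})
        name = labels.get("alertname", "Unknown Alert")
        severity = labels.get("severity", "unknown")
        summary = annotations.get("summary", "Check your systems immediately.")
        messages.append(f"Severity {severity}. Alert name: {name}. {summary}")
    return messages


# Hypothetical sample payload for local testing.
sample = {
    "alerts": [
        {
            "status": "firing",
            "labels": {"alertname": "HighErrorRate", "severity": "critical"},
            "annotations": {"summary": "5xx rate above 5 percent"},
        },
        {"status": "resolved", "labels": {"alertname": "DiskFull"}},
    ]
}

for line in build_messages(sample):
    print(line)
```

Only the firing alert produces a message; the resolved one is skipped, matching the filter in handle_alert.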
Step 3: Wire Up Prometheus Alertmanager
Add a receiver in your alertmanager.yml:
receivers:
  - name: voice-alerts
    webhook_configs:
      - url: "https://your-server.com/webhook/alerts"
        send_resolved: false

route:
  # The root route also needs a default receiver; point it at your existing one.
  group_wait: 30s
  repeat_interval: 4h
  routes:
    - match:
        severity: critical
      receiver: voice-alerts
For Grafana, go to Alerting → Contact points → Add contact point → Webhook, and point it at the same URL.
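For CloudWatch or Datadog, add an SNS topic or webhook integration that POSTs to your endpoint. The handler above reads the Alertmanager payload shape (an alerts list whose items carry status, labels, and annotations), so payloads from other tools need to be mapped to that shape first.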
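As an illustration of that mapping, here is a rough sketch that converts an SNS-delivered CloudWatch alarm into the shape the handler reads. The field names (Message, AlarmName, NewStateValue, NewStateReason) reflect my understanding of the CloudWatch alarm notification format and should be checked against a real message. sns_to_alertmanager is a hypothetical helper, and the sketch does not handle SNS subscription confirmation.

```python
import json


def sns_to_alertmanager(sns_body):
    """Map an SNS notification carrying a CloudWatch alarm onto {"alerts": [...]}.

    Assumes sns_body is the parsed SNS JSON envelope and that its "Message"
    field holds the alarm as a JSON string.
    """
    alarm = json.loads(sns_body.get("Message", "{}"))
    firing = alarm.get("NewStateValue") == "ALARM"
    return {
        "alerts": [
            {
                "status": "firing" if firing else "resolved",
                "labels": {
                    "alertname": alarm.get("AlarmName", "Unknown Alert"),
                    # Assumption: anything routed to this endpoint is critical.
                    "severity": "critical",
                },
                "annotations": {
                    "summary": alarm.get(
                        "NewStateReason", "Check your systems immediately."
                    ),
                },
            }
        ]
    }
```

Keeping the adapter separate means the escalation code stays unchanged whichever tool raised the alert.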
What VoIPBin Handles for You
If you've built alert systems before, you know the usual pain:
- Standing up a SIP server or navigating a CPaaS console
- Parsing DTMF input from raw audio streams
- Managing call state across retries
- Implementing TTS yourself or paying for a separate service
- Dealing with RTP when you want real-time speech
VoIPBin's Media Offloading model eliminates all of this. Your application sends a plain REST API call with an action list. VoIPBin converts your text to speech, makes the call, collects the DTMF keypress, and delivers events back to your webhook. Your Python code never touches audio — not a single byte of RTP.
This is the core principle: your AI agent or alert system handles logic, VoIPBin handles the telephony.
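Because the text you send is what gets spoken, it can help to clean up machine-style alert names before building the talk action. The helper below is a small sketch of that idea; speakable and its max_chars limit are illustrative choices, not anything VoIPBin requires.

```python
import re


def speakable(text, max_chars=300):
    """Turn an alert string into something a TTS engine can read more naturally."""
    # Split CamelCase: "HighErrorRate" -> "High Error Rate"
    text = re.sub(r"(?<=[a-z0-9])(?=[A-Z])", " ", text)
    # Underscores and hyphens between word characters become spaces
    text = re.sub(r"(?<=\w)[_-](?=\w)", " ", text)
    # Collapse runs of whitespace
    text = re.sub(r"\s+", " ", text).strip()
    if len(text) > max_chars:
        # Cut at the last space inside the limit so no word is split
        text = text[:max_chars].rsplit(" ", 1)[0]
    return text
```

For example, speakable("HighErrorRate") yields "High Error Rate", which reads better aloud than the raw label.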
Step 4: Harden for Production
A few things you'll want before going live:
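Add webhook authentication: verify that the request actually came from your alerting system. This example assumes the sender signs the raw request body with HMAC-SHA256 and sends the hex digest in an X-Webhook-Signature header. If your alerting tool cannot sign payloads, put a signing proxy in front of it or use a bearer token instead:
import hashlib
import hmac

SECRET = "your-shared-webhook-secret"

def verify_signature(payload: bytes, sig_header: str) -> bool:
    expected = hmac.new(SECRET.encode(), payload, hashlib.sha256).hexdigest()
    # compare_digest avoids leaking timing information
    return hmac.compare_digest(expected, sig_header)

@app.before_request
def check_signature():
    sig = request.headers.get("X-Webhook-Signature", "")
    if not verify_signature(request.data, sig):
        return jsonify({"error": "unauthorized"}), 401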
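For testing, the sending side has to produce the same digest. The sketch below shows both halves using only the standard library; sign and verify are hypothetical helper names, and the header name follows the example above.

```python
import hashlib
import hmac


def sign(payload: bytes, secret: str) -> str:
    """Hex HMAC-SHA256 of the raw body; this value goes in X-Webhook-Signature."""
    return hmac.new(secret.encode(), payload, hashlib.sha256).hexdigest()


def verify(payload: bytes, signature: str, secret: str) -> bool:
    """Constant-time comparison against the expected signature."""
    return hmac.compare_digest(sign(payload, secret), signature)


body = b'{"alerts": []}'
secret = "your-shared-webhook-secret"
signature = sign(body, secret)

print(verify(body, signature, secret))         # True
print(verify(body + b" ", signature, secret))  # False: body was altered
```

The signature covers the exact bytes sent, so the sender must sign the serialized body rather than re-serializing the JSON on either side.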
Track acknowledged alerts — use Redis or a simple database to mark alerts as acknowledged so the escalation loop stops:
import redis

r = redis.Redis()

@app.route("/webhook/acknowledged", methods=["POST"])
def mark_acknowledged():
    data = request.get_json(silent=True) or {}
    call_id = data.get("call_id")
    if not call_id:
        return jsonify({"error": "missing call_id"}), 400
    r.set(f"ack:{call_id}", "1", ex=3600)  # expires in 1 hour
    return jsonify({"status": "acknowledged"})

# Fragment for escalate_alert: goes inside the loop, after time.sleep(),
# before moving on to the next person.
        if r.get(f"ack:{call_id}"):
            print("Alert acknowledged via keypress. Stopping escalation.")
            return
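Rather than sleeping for the full wait and then checking once, the escalation loop could poll the acknowledgement flag and return as soon as it appears. This is a sketch under the ack:{call_id} key scheme used above; wait_for_ack is a hypothetical helper, and store can be any object with a get(key) method, such as redis.Redis() or a plain dict in tests.

```python
import time


def wait_for_ack(store, call_id, timeout=90.0, poll_interval=2.0, sleep=time.sleep):
    """Poll store for an ack flag; return True as soon as it is set.

    Returns False if no acknowledgement arrives within timeout seconds.
    """
    deadline = time.monotonic() + timeout
    while True:
        if store.get(f"ack:{call_id}"):
            return True
        if time.monotonic() >= deadline:
            return False
        sleep(poll_interval)
```

In escalate_alert, this could replace time.sleep(wait_seconds): if wait_for_ack(r, call_id) returns True, stop escalating; otherwise move on to the next number.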
Run with gunicorn in production:
pip install gunicorn
gunicorn -w 4 -b 0.0.0.0:5000 app:app
Result
You now have a voice alert system that:
- ✅ Works with any monitoring tool that supports webhooks
- ✅ Reads alert details in plain English — no cryptic codes
- ✅ Escalates through your entire on-call chain automatically
- ✅ Requires no SIP knowledge, no audio handling, no RTP
- ✅ Can be deployed in an afternoon on any cloud platform
The on-call engineer gets a real phone call. Not a notification they can swipe away. A call.
Get Started
VoIPBin signup is one API call:
curl -X POST "https://api.voipbin.net/v1.0/auth/signup" \
  -H "Content-Type: application/json" \
  -d '{}'
Documentation, SDKs, and pricing at voipbin.net.
If you're using the Golang SDK instead:
go get github.com/voipbin/voipbin-go
Have you built a voice alerting system or tried a different approach to on-call notifications? I'd love to hear what worked — and what didn't — in the comments.