You're in a meeting. Your phone rings. You ignore it. Fifteen minutes later, you check voicemail — "Hi, this is Dave from Acme Corp, we need to talk about our $200k contract, please call back—"
That was 15 minutes ago. Dave called your competitor while he waited.
This happens dozens of times a day in any business. The fix isn't hiring someone to monitor phones. It's building a voicemail handler that transcribes every missed call, classifies its urgency, and routes it to the right person in real time.
Here's how to build one in Python with about 100 lines of code.
The Architecture
When a caller leaves a voicemail:
- VoIPBin records the audio
- VoIPBin fires a webhook to your server with the recording URL
- Your server downloads the audio and sends it to OpenAI Whisper for transcription
- GPT-4o mini classifies the urgency and extracts key info
- Your system routes it: Slack for urgent, email digest for routine
No telephony expertise required. No SIP configuration. No audio codec negotiation.
Setting Up VoIPBin
Sign up and get your API token in one call:
curl -X POST https://api.voipbin.net/v1.0/auth/signup \
  -H "Content-Type: application/json" \
  -d '{
    "username": "yourname",
    "password": "yourpassword",
    "email": "you@example.com"
  }'
This returns an accesskey.token immediately — no OTP, no email confirmation, no waiting.
Then configure a Flow on your inbound number to record calls and fire a webhook when callers leave a message:
curl -X POST https://api.voipbin.net/v1.0/flows \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "voicemail-handler",
    "actions": [
      {
        "type": "talk",
        "text": "You have reached our office. Please leave a message after the beep."
      },
      {
        "type": "record",
        "max_duration": 120,
        "on_complete": {
          "type": "webhook",
          "url": "https://yourserver.com/voicemail"
        }
      }
    ]
  }'
That's the entire telephony setup. Everything else is Python.
Building the Webhook Handler
import json
import os
import tempfile

import openai
import requests
from flask import Flask, request, jsonify

app = Flask(__name__)
openai.api_key = os.environ["OPENAI_API_KEY"]
VOIPBIN_TOKEN = os.environ["VOIPBIN_TOKEN"]


@app.route("/voicemail", methods=["POST"])
def handle_voicemail():
    data = request.get_json(silent=True) or {}
    recording_url = data.get("recording_url")
    caller = data.get("from", "Unknown")
    timestamp = data.get("timestamp", "")

    if not recording_url:
        return jsonify({"error": "No recording URL"}), 400

    # Download the audio; fail on HTTP errors instead of saving an error page
    audio_response = requests.get(recording_url, timeout=30)
    audio_response.raise_for_status()

    # Use a unique temp file so two voicemails from the same caller can't collide
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
        f.write(audio_response.content)
        audio_path = f.name

    try:
        # Transcribe with Whisper
        transcript = transcribe_audio(audio_path)
    finally:
        os.remove(audio_path)  # Don't leave recordings on disk

    # Classify and extract info
    analysis = analyze_voicemail(transcript, caller)

    # Route based on urgency
    route_voicemail(analysis, caller, timestamp)

    return jsonify({"status": "processed", "urgency": analysis["urgency"]})


def transcribe_audio(audio_path: str) -> str:
    with open(audio_path, "rb") as f:
        result = openai.audio.transcriptions.create(
            model="whisper-1",
            file=f
        )
    return result.text


# Kept at module level so the prompt text is not affected by code indentation
ANALYSIS_PROMPT = """Analyze this voicemail and return JSON with:
- urgency: "critical", "high", "normal", or "low"
- summary: one sentence summary
- action_required: what needs to happen next
- callback_needed: true/false
- sentiment: "positive", "neutral", "frustrated", or "angry"

Urgency guide:
- critical: contract, legal, emergency, immediate action
- high: decision needed soon, time-sensitive request
- normal: general inquiry or follow-up
- low: spam, unsolicited sales pitch"""


def analyze_voicemail(transcript: str, caller: str) -> dict:
    response = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": ANALYSIS_PROMPT},
            {"role": "user", "content": f"Caller: {caller}\nTranscript: {transcript}"},
        ],
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)


def route_voicemail(analysis: dict, caller: str, timestamp: str):
    urgency = analysis["urgency"]
    summary = analysis["summary"]
    action = analysis["action_required"]

    # Build the message line by line so code indentation doesn't leak into it
    message = "\n".join([
        f"📞 *New Voicemail from {caller}*",
        f"🕐 {timestamp}",
        f"⚡ Urgency: {urgency.upper()}",
        f"📝 {summary}",
        f"✅ Action: {action}",
    ])

    if urgency in ("critical", "high"):
        send_slack_alert(message)
    else:
        queue_for_digest(message)

    # Auto-acknowledge when the model flags a callback and we have a number to text
    if analysis.get("callback_needed") and caller != "Unknown":
        send_callback_sms(caller)


def send_slack_alert(message: str):
    webhook_url = os.environ["SLACK_WEBHOOK_URL"]
    requests.post(webhook_url, json={"text": message}, timeout=10)


def queue_for_digest(message: str):
    # Append to DB or message queue for daily digest
    print(f"[DIGEST QUEUE] {message}")


def send_callback_sms(phone_number: str):
    """Acknowledge the caller automatically via SMS."""
    requests.post(
        "https://api.voipbin.net/v1.0/messages",
        headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
        json={
            "to": phone_number,
            "message": "Thanks for calling. We received your voicemail and will call you back within the hour."
        },
        timeout=10,
    )


if __name__ == "__main__":
    app.run(port=8080)
What Happens in Practice
Here's an example of what the analysis output might look like for a contract renewal call:
{
  "urgency": "critical",
  "summary": "Existing customer calling about contract renewal deadline today at 5pm.",
  "action_required": "Call back immediately before 5pm to confirm renewal terms.",
  "callback_needed": true,
  "sentiment": "frustrated"
}
That message hits Slack within 10 seconds of the call ending. The caller also gets an SMS acknowledgment automatically. You call back with full context before they've had time to dial a competitor.
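One caveat: the handler indexes `analysis["urgency"]` directly, so a model response that omits a field or invents a label such as "urgent" would raise or be misrouted. A minimal sketch of a guard you could run on the parsed JSON before routing follows. The helper name `normalize_analysis` and the choice to fall back to "high" (erring toward alerting someone) are assumptions, not part of the original handler.

```python
VALID_URGENCY = {"critical", "high", "normal", "low"}


def normalize_analysis(analysis: dict, fallback_urgency: str = "high") -> dict:
    """Return a copy of the model output with every expected key present.

    Hypothetical helper: unknown or missing urgency labels fall back to
    `fallback_urgency` so a malformed response is never silently dropped.
    """
    urgency = str(analysis.get("urgency", "")).strip().lower()
    if urgency not in VALID_URGENCY:
        urgency = fallback_urgency
    return {
        "urgency": urgency,
        "summary": analysis.get("summary") or "(no summary returned)",
        "action_required": analysis.get("action_required") or "Review the transcript manually.",
        # Only a real JSON boolean true counts; strings like "false" do not
        "callback_needed": analysis.get("callback_needed") is True,
        "sentiment": analysis.get("sentiment") or "neutral",
    }
```

With this in place, `route_voicemail(normalize_analysis(analysis), caller, timestamp)` can rely on every key being there.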
Extending This for Production
CRM enrichment before classification
Cross-reference the caller's phone number with your CRM before the AI sees it. A missed call from your largest account should always be marked critical, regardless of what they said.
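CRM_TOKEN = os.environ["CRM_TOKEN"]  # Placeholder: use whatever auth your CRM expects

def enrich_caller(phone_number: str) -> dict:
    # Pass the number via params so the leading "+" is URL-encoded
    response = requests.get(
        "https://your-crm.com/api/contacts",
        params={"phone": phone_number},
        headers={"Authorization": f"Bearer {CRM_TOKEN}"},
        timeout=10,
    )
    return response.json() if response.ok else {}

# Then pass enrichment context into analyze_voicemail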
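To make "always be marked critical" concrete, here is a small sketch of an override applied after classification. It assumes the CRM record carries a `tier` field with values like "enterprise"; that field name, the tier labels, and the helper `apply_account_override` are all hypothetical and should be adapted to whatever your CRM actually returns.

```python
def apply_account_override(
    analysis: dict,
    contact: dict,
    key_tiers: frozenset = frozenset({"enterprise", "strategic"}),
) -> dict:
    """Force urgency to critical when the caller belongs to a key account.

    `contact` is the dict returned by the CRM lookup; "tier" is an assumed
    field name. The input analysis is copied, not mutated.
    """
    result = dict(analysis)
    tier = str(contact.get("tier", "")).lower()
    if tier in key_tiers:
        result["urgency"] = "critical"
        result["override_reason"] = f"key account ({tier})"
    return result
```

Keeping the override outside the prompt means the rule is deterministic: the model's reading of the transcript can never downgrade a key account.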
Trend analytics
Store every transcript and classification in a database. After a month, you'll know:
- Which hours generate the most missed calls (staff scheduling)
- What percentage of voicemails are actually urgent (team triage calibration)
- Whether caller sentiment is trending up or down (product/support signal)
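As a rough sketch of the storage side, the standard-library `sqlite3` module is enough to answer the first two questions. The table layout and function names below are illustrative assumptions, and the queries assume `received_at` is an ISO 8601 string; check the timestamp format your webhook actually delivers.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the voicemail table. Schema is an illustrative assumption."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS voicemails (
               received_at TEXT NOT NULL,   -- ISO 8601, e.g. 2024-05-01T14:32:00
               caller      TEXT NOT NULL,
               urgency     TEXT NOT NULL,
               sentiment   TEXT NOT NULL,
               transcript  TEXT NOT NULL
           )"""
    )
    return conn


def save_voicemail(conn, received_at: str, caller: str, analysis: dict, transcript: str) -> None:
    conn.execute(
        "INSERT INTO voicemails VALUES (?, ?, ?, ?, ?)",
        (received_at, caller, analysis["urgency"], analysis["sentiment"], transcript),
    )
    conn.commit()


def calls_by_hour(conn) -> dict:
    """Map hour of day ('00'..'23') to the number of voicemails received."""
    rows = conn.execute(
        "SELECT strftime('%H', received_at) AS hour, COUNT(*) "
        "FROM voicemails GROUP BY hour ORDER BY hour"
    ).fetchall()
    return {hour: count for hour, count in rows}


def urgent_share(conn) -> float:
    """Fraction of voicemails classified as critical or high."""
    total, urgent = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(urgency IN ('critical', 'high')), 0) FROM voicemails"
    ).fetchone()
    return urgent / total if total else 0.0
```

Calling `save_voicemail` at the end of the webhook handler is enough to start collecting data; a sentiment trend query would follow the same pattern, grouped by date instead of hour.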
Escalation chains
For critical calls, don't just Slack one person — escalate progressively if no one acknowledges:
import time
import threading

# is_acknowledged, notify_manager, and notify_on_call are placeholders you implement


def escalate_if_unacknowledged(message: str, call_id: str):
    def check_and_escalate():
        time.sleep(300)  # Wait 5 minutes
        if not is_acknowledged(call_id):
            notify_manager(message)  # Escalate to manager
            time.sleep(300)
            if not is_acknowledged(call_id):
                notify_on_call(message)  # Page the on-call person

    threading.Thread(target=check_and_escalate, daemon=True).start()
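The snippet above relies on an `is_acknowledged` check that the post leaves undefined. One possible in-memory version is sketched below, using a `threading.Event` per call. The class name `AckTracker` and its methods are assumptions; state lives in a single process and is lost on restart, so a multi-worker deployment would need a shared store instead.

```python
import threading


class AckTracker:
    """Tracks which calls a human has acknowledged (hypothetical helper)."""

    def __init__(self):
        self._events = {}
        self._lock = threading.Lock()

    def _event(self, call_id: str) -> threading.Event:
        # Create the event on first use; the lock keeps this safe across threads
        with self._lock:
            return self._events.setdefault(call_id, threading.Event())

    def acknowledge(self, call_id: str) -> None:
        self._event(call_id).set()

    def is_acknowledged(self, call_id: str) -> bool:
        return self._event(call_id).is_set()

    def wait_for_ack(self, call_id: str, timeout: float) -> bool:
        """Block up to `timeout` seconds; return True once the call is acknowledged."""
        return self._event(call_id).wait(timeout)
```

Replacing `time.sleep(300)` followed by the check with `if not tracker.wait_for_ack(call_id, 300):` lets the escalation thread finish as soon as someone acknowledges, rather than sleeping out the full five minutes. Something still has to call `tracker.acknowledge(call_id)`, for example a small endpoint behind a button in the Slack message.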
Why This Works Without Telephony Expertise
The full stack here is:
- VoIPBin: handles inbound call routing, recording, codec negotiation, and webhook delivery
- OpenAI Whisper: transcription
- GPT-4o mini: classification and extraction
- Flask: webhook receiver
- Slack/SMS: delivery
The pieces you write are the webhook handler and the AI prompts. The pieces you don't write are everything telephony — SIP signaling, RTP streams, audio encoding, call state machines.
This is the Media Offloading model: AI agents handle intelligence, VoIPBin handles communication infrastructure.
Getting Started
- Sign up at voipbin.net — instant, no credit card required
- Get a phone number from the dashboard
- Deploy the Flask app above (any cloud instance works)
- Configure the flow to point to your webhook
- Call your number, don't answer, leave a message
Your Slack channel should ping within 10 seconds.
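Before placing a real call, you can also exercise the endpoint by hand. The sketch below posts a payload with the same three fields the handler reads (`recording_url`, `from`, `timestamp`) using only the standard library. The function names, the sample phone number, and the sample timestamp are assumptions, and the recording URL must point at an audio file your server can actually download.

```python
import json
import urllib.request


def build_test_payload(recording_url: str, caller: str = "+15555550100") -> dict:
    """Mimic the webhook body the handler expects (field names taken from the handler)."""
    return {
        "recording_url": recording_url,
        "from": caller,
        "timestamp": "2024-05-01T14:32:00Z",  # arbitrary sample value
    }


def build_request(endpoint: str, payload: dict) -> urllib.request.Request:
    body = json.dumps(payload).encode("utf-8")
    return urllib.request.Request(
        endpoint,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def post_test_voicemail(endpoint: str, payload: dict, timeout: float = 60.0) -> dict:
    """Send the payload and return the handler's JSON response."""
    with urllib.request.urlopen(build_request(endpoint, payload), timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

With the Flask app running locally, `post_test_voicemail("http://localhost:8080/voicemail", build_test_payload("https://example.com/sample.wav"))` should return the `status` and `urgency` fields, provided the URL is swapped for a reachable recording.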
The missed $200k renewal call? That's now a Slack alert. Dave gets an SMS saying you'll call him back. You call back with the full transcript in front of you.
Dave stays your customer.