Never Miss an Urgent Call Again: Build an AI Voicemail Handler

#ai #voip #tutorial #python

You're in a meeting. Your phone rings. You ignore it. Fifteen minutes later, you check voicemail — "Hi, this is Dave from Acme Corp, we need to talk about our $200k contract, please call back—"

That was 15 minutes ago. Dave called your competitor while he waited.

Missed calls like this happen every day in any business that takes calls. The fix isn't hiring someone to monitor phones. It's building a voicemail handler that transcribes every missed call, classifies its urgency, and routes it to the right person in real time.

Here's how to build one in Python with roughly 120 lines of code.

The Architecture

When a caller leaves a voicemail:

VoIPBin records the audio
Fires a webhook to your server with the recording URL
Your server downloads the audio and sends it to OpenAI Whisper for transcription
GPT-4o mini classifies the urgency and extracts key info
Your system routes it: Slack for urgent, email digest for routine

No telephony expertise required. No SIP configuration. No audio codec negotiation.

Setting Up VoIPBin

curl -X POST https://api.voipbin.net/v1.0/auth/signup \
  -H "Content-Type: application/json" \
  -d '{
    "username": "yourname",
    "password": "yourpassword",
    "email": "you@example.com"
  }'

This returns an accesskey.token immediately — no OTP, no email confirmation, no waiting.

Then configure a Flow on your inbound number to record calls and fire a webhook when callers leave a message:

curl -X POST https://api.voipbin.net/v1.0/flows \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "voicemail-handler",
    "actions": [
      {
        "type": "talk",
        "text": "You have reached our office. Please leave a message after the beep."
      },
      {
        "type": "record",
        "max_duration": 120,
        "on_complete": {
          "type": "webhook",
          "url": "https://yourserver.com/voicemail"
        }
      }
    ]
  }'

That's the entire telephony setup. Everything else is Python.

Building the Webhook Handler

from flask import Flask, request, jsonify
import requests
import openai
import os
import json
import tempfile

app = Flask(__name__)
openai.api_key = os.environ["OPENAI_API_KEY"]
VOIPBIN_TOKEN = os.environ["VOIPBIN_TOKEN"]

@app.route("/voicemail", methods=["POST"])
def handle_voicemail():
    # silent=True yields None instead of raising on a missing or invalid JSON body
    data = request.get_json(silent=True) or {}
    recording_url = data.get("recording_url")
    caller = data.get("from", "Unknown")
    timestamp = data.get("timestamp", "")

    if not recording_url:
        return jsonify({"error": "No recording URL"}), 400

    # Download the audio; fail fast on timeouts and HTTP errors
    audio_response = requests.get(recording_url, timeout=30)
    audio_response.raise_for_status()

    # Write to a unique temp file: the caller ID is untrusted input, and two
    # voicemails from the same number would otherwise overwrite each other
    with tempfile.NamedTemporaryFile(suffix=".wav", delete=False) as f:
        f.write(audio_response.content)
        audio_path = f.name

    try:
        # Transcribe with Whisper
        transcript = transcribe_audio(audio_path)
    finally:
        # Remove the temp file whether or not transcription succeeded
        os.remove(audio_path)

    # Classify and extract info
    analysis = analyze_voicemail(transcript, caller)

    # Route based on urgency
    route_voicemail(analysis, caller, timestamp)

    return jsonify({"status": "processed", "urgency": analysis.get("urgency", "high")})


def transcribe_audio(audio_path: str) -> str:
    with open(audio_path, "rb") as f:
        result = openai.audio.transcriptions.create(
            model="whisper-1",
            file=f
        )
    return result.text


def analyze_voicemail(transcript: str, caller: str) -> dict:
    response = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": """Analyze this voicemail and return JSON with:
- urgency: "critical", "high", "normal", or "low"
- summary: one sentence summary
- action_required: what needs to happen next
- callback_needed: true/false
- sentiment: "positive", "neutral", "frustrated", or "angry"

Urgency guide:
- critical: contract, legal, emergency, immediate action
- high: decision needed soon, time-sensitive request
- normal: general inquiry or follow-up
- low: spam, unsolicited sales pitch"""
            },
            {
                "role": "user",
                "content": f"Caller: {caller}\nTranscript: {transcript}"
            }
        ],
        response_format={"type": "json_object"}
    )
    return json.loads(response.choices[0].message.content)


def route_voicemail(analysis: dict, caller: str, timestamp: str):
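    # Fall back to safe defaults if the model omits a field
    urgency = analysis.get("urgency", "high")
    summary = analysis.get("summary", "No summary available.")
    action = analysis.get("action_required", "Review the transcript.")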

    message = f"""
📞 *New Voicemail from {caller}*
🕐 {timestamp}
⚡ Urgency: {urgency.upper()}
📝 {summary}
✅ Action: {action}
    """.strip()

    if urgency in ["critical", "high"]:
        send_slack_alert(message)
    else:
        queue_for_digest(message)

    # Auto-acknowledge if callback was explicitly requested
    if analysis.get("callback_needed"):
        send_callback_sms(caller)


def send_slack_alert(message: str):
    webhook_url = os.environ["SLACK_WEBHOOK_URL"]
    requests.post(webhook_url, json={"text": message}, timeout=10)


def queue_for_digest(message: str):
    # Append to DB or message queue for daily digest
    print(f"[DIGEST QUEUE] {message}")


def send_callback_sms(phone_number: str):
    """Acknowledge the caller automatically via SMS."""
    requests.post(
        "https://api.voipbin.net/v1.0/messages",
        headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
        json={
            "to": phone_number,
            "message": "Thanks for calling. We received your voicemail and will call you back within the hour."
        }
    )


if __name__ == "__main__":
    app.run(port=8080)
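
The queue_for_digest stub above only prints the message. One possible way to back it with a file is sketched below. This is a minimal sketch, not part of the original handler: the digest_queue.jsonl path, the extra urgency parameter, and the build_digest helper are all assumptions made for illustration, and a database or message queue would be a better fit under real load.

```python
import json
from collections import Counter
from pathlib import Path

# Hypothetical queue location; any writable path works.
DIGEST_FILE = Path("digest_queue.jsonl")


def queue_for_digest(message: str, urgency: str = "normal",
                     path: Path = DIGEST_FILE) -> None:
    """Append one voicemail to the digest queue as a JSON line."""
    with path.open("a", encoding="utf-8") as f:
        f.write(json.dumps({"urgency": urgency, "message": message}) + "\n")


def build_digest(path: Path = DIGEST_FILE) -> str:
    """Return one digest body for all queued entries, then clear the queue."""
    empty = "No routine voicemails today."
    if not path.exists():
        return empty
    lines = path.read_text(encoding="utf-8").splitlines()
    entries = [json.loads(line) for line in lines if line.strip()]
    path.unlink()  # clear the queue once it has been read
    if not entries:
        return empty
    # Header shows how many entries of each urgency level were queued
    counts = Counter(entry["urgency"] for entry in entries)
    header = ", ".join(f"{n} {level}" for level, n in sorted(counts.items()))
    body = "\n\n".join(entry["message"] for entry in entries)
    return f"Voicemail digest ({header})\n\n{body}"
```

A daily cron job or scheduler could call build_digest() and hand the result to whatever email sender you already use.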

What Happens in Practice

Here's an example of what the analysis output looks like for a contract renewal call:

{
  "urgency": "critical",
  "summary": "Existing customer calling about contract renewal deadline today at 5pm.",
  "action_required": "Call back immediately before 5pm to confirm renewal terms.",
  "callback_needed": true,
  "sentiment": "frustrated"
}

That message hits Slack within 10 seconds of the call ending. The caller also gets an SMS acknowledgment automatically. You call back with full context before they've had time to dial a competitor.
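
The model usually returns JSON in the shape shown above, but nothing guarantees every field is present or spelled as expected. A small normalization step before routing is one way to guard against that. This is a sketch under assumptions: normalize_analysis is a hypothetical helper, and treating an unrecognized urgency as "high" (so a human still sees it) is a design choice you may want to change.

```python
ALLOWED_URGENCY = ("critical", "high", "normal", "low")


def normalize_analysis(raw: dict) -> dict:
    """Coerce the model's JSON into the fields the router expects."""
    urgency = str(raw.get("urgency", "")).strip().lower()
    if urgency not in ALLOWED_URGENCY:
        # Assumption: when in doubt, fail toward alerting a human
        urgency = "high"

    callback = raw.get("callback_needed", False)
    if isinstance(callback, str):
        # Models sometimes return "true"/"false" as strings
        callback = callback.strip().lower() == "true"

    return {
        "urgency": urgency,
        "summary": str(raw.get("summary") or "No summary available."),
        "action_required": str(raw.get("action_required") or "Review the transcript."),
        "callback_needed": bool(callback),
        "sentiment": str(raw.get("sentiment") or "neutral"),
    }
```

In the handler, this would wrap the classifier call, for example normalize_analysis(analyze_voicemail(transcript, caller)).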

Extending This for Production

CRM enrichment before classification

Cross-reference the caller's phone number with your CRM before the AI sees it. A missed call from your largest account should always be marked critical, regardless of what they said.

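CRM_TOKEN = os.environ["CRM_TOKEN"]

def enrich_caller(phone_number: str) -> dict:
    # Pass the number via params so the leading "+" is URL-encoded
    # (a raw "+" in a query string is decoded as a space)
    response = requests.get(
        "https://your-crm.com/api/contacts",
        params={"phone": phone_number},
        headers={"Authorization": f"Bearer {CRM_TOKEN}"},
        timeout=10,
    )
    return response.json() if response.ok else {}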

# Then pass enrichment context into analyze_voicemail
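
One way to wire the enrichment in is sketched below: add the CRM context to the user prompt, then apply a hard override after classification so a key account is always critical. The name and tier fields and the "key_account" value are assumptions; your CRM's response will have its own schema.

```python
def build_user_prompt(caller: str, transcript: str, contact: dict) -> str:
    """Build the user message, adding CRM context when a contact was found."""
    lines = [f"Caller: {caller}"]
    if contact:
        # "name" and "tier" are hypothetical CRM fields
        lines.append(f"CRM name: {contact.get('name', 'unknown')}")
        lines.append(f"CRM account tier: {contact.get('tier', 'unknown')}")
    lines.append(f"Transcript: {transcript}")
    return "\n".join(lines)


def apply_crm_override(analysis: dict, contact: dict) -> dict:
    """Force urgency to critical for key accounts, whatever the model said."""
    if contact.get("tier") == "key_account":
        return {**analysis, "urgency": "critical"}
    return analysis
```

The override runs after the model on purpose: the prompt context helps the summary, but the routing rule for your largest accounts should not depend on the model following instructions.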

Trend analytics

Store every transcript and classification in a database. After a month, you'll know:

Which hours generate the most missed calls (staff scheduling)
What percentage of voicemails are actually urgent (team triage calibration)
Whether caller sentiment is trending up or down (product/support signal)
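
As a rough sketch of the storage side, the standard library's sqlite3 module is enough to answer the first two questions. The table layout and function names here are assumptions, and the queries expect received_at to be an ISO 8601 string; convert the webhook's timestamp first if it arrives in another format.

```python
import sqlite3


def open_store(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the voicemail store. Pass a file path to persist it."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS voicemails (
               id INTEGER PRIMARY KEY,
               caller TEXT NOT NULL,
               received_at TEXT NOT NULL,  -- ISO 8601, e.g. 2024-05-01T14:03:00
               urgency TEXT NOT NULL,
               sentiment TEXT NOT NULL,
               transcript TEXT NOT NULL
           )"""
    )
    return conn


def save_voicemail(conn, caller, received_at, analysis, transcript):
    """Insert one classified voicemail."""
    conn.execute(
        "INSERT INTO voicemails (caller, received_at, urgency, sentiment, transcript) "
        "VALUES (?, ?, ?, ?, ?)",
        (caller, received_at, analysis["urgency"], analysis["sentiment"], transcript),
    )
    conn.commit()


def calls_by_hour(conn) -> dict:
    """Map hour of day ('00'..'23') to the number of voicemails received."""
    rows = conn.execute(
        "SELECT strftime('%H', received_at) AS hour, COUNT(*) "
        "FROM voicemails GROUP BY hour ORDER BY hour"
    ).fetchall()
    return {hour: count for hour, count in rows}


def urgent_share(conn) -> float:
    """Fraction of voicemails classified as critical or high."""
    total, urgent = conn.execute(
        "SELECT COUNT(*), COALESCE(SUM(urgency IN ('critical', 'high')), 0) "
        "FROM voicemails"
    ).fetchone()
    return urgent / total if total else 0.0
```

The sentiment trend can be read from the same table by grouping on the date part of received_at.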

Escalation chains

For critical calls, don't just Slack one person — escalate progressively if no one acknowledges:

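import time
import threading

# is_acknowledged, notify_manager, and notify_on_call are placeholders for
# your own acknowledgment store and paging integrations.
# Note: sleeping threads are lost if the process restarts; use a job queue
# or scheduler if escalations must survive a deploy.
def escalate_if_unacknowledged(message: str, call_id: str):
    def check_and_escalate():
        time.sleep(300)  # Wait 5 minutes
        if not is_acknowledged(call_id):
            notify_manager(message)  # Escalate to manager
            time.sleep(300)  # Wait another 5 minutes
            if not is_acknowledged(call_id):
                notify_on_call(message)  # Page the on-call person

    threading.Thread(target=check_and_escalate, daemon=True).start()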
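
The same chain can be written as a loop over notifier callables, which makes it easier to add steps and to test without real five-minute waits. This is a sketch, not the original implementation: run_escalation and its parameters are hypothetical, and the acknowledgment check and notifiers are passed in rather than assumed to exist.

```python
import time
from typing import Callable, Sequence


def run_escalation(
    call_id: str,
    message: str,
    is_acknowledged: Callable[[str], bool],
    notifiers: Sequence[Callable[[str], None]],
    wait_seconds: float = 300,
    sleep: Callable[[float], None] = time.sleep,
) -> int:
    """Walk the chain in order; stop as soon as the call is acknowledged.

    Returns the number of escalation steps that fired.
    """
    fired = 0
    for notify in notifiers:
        sleep(wait_seconds)  # give the previous recipient time to respond
        if is_acknowledged(call_id):
            break
        notify(message)
        fired += 1
    return fired
```

To run it in the background as before, start it with threading.Thread(target=run_escalation, args=(...), daemon=True). In tests, pass sleep=lambda s: None so nothing actually waits.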

Why This Works Without Telephony Expertise

The full stack here is:

VoIPBin: handles inbound call routing, recording, codec negotiation, and webhook delivery
OpenAI Whisper: transcription
GPT-4o mini: classification and extraction
Flask: webhook receiver
Slack/SMS: delivery

The pieces you write are the webhook handler and the AI prompts. The pieces you don't write are everything telephony — SIP signaling, RTP streams, audio encoding, call state machines.

This is the Media Offloading model: AI agents handle intelligence, VoIPBin handles communication infrastructure.

Getting Started

Sign up at voipbin.net — instant, no credit card required
Get a phone number from the dashboard
Deploy the Flask app above (any cloud instance works)
Configure the flow to point to your webhook
Call your number, don't answer, leave a message

Your Slack channel should ping within 10 seconds.

The missed $200k renewal call? That's now a Slack alert. Dave gets an SMS saying you'll call him back. You call back with the full transcript in front of you.

Dave stays your customer.