DEV Community

voipbin
Build a Smart AI Call Router: Let Intent Decide Where Every Call Goes

Traditional phone menus are a form of user punishment.

"Press 1 for sales. Press 2 for support. Press 3 for billing. Press 4 to hear these options again."

Everyone hates them. The caller hates navigating them. The developer hates maintaining them. And yet — for years — this was the only scalable option.

Until AI made it possible to just ask: "What do you need?" and route based on the actual answer.

This post walks through building a smart call router using VoIPBin that:

  • Answers the call and greets the caller
  • Captures what they say
  • Classifies their intent with an LLM
  • Routes them to the right flow or agent

No DTMF menus. No press-1-for-X. Just natural language routing.


The Architecture

Incoming Call
     ↓
VoIPBin answers → speaks greeting → STT captures response
     ↓
Webhook → Your Server
     ↓
LLM classifies intent ("billing", "support", "sales", "other")
     ↓
VoIPBin executes route-specific flow (transfer / speak / hang up)

Your backend never processes audio. It receives transcribed text, runs classification, and returns routing instructions. VoIPBin handles everything telephony-related.


Step 1: Get a VoIPBin API Key

curl -s -X POST https://api.voipbin.net/v1.0/auth/signup \
  -H "Content-Type: application/json" \
  -d '{"username": "your@email.com", "password": "yourpassword"}'

The response includes accesskey.token — use that as your bearer token. No email verification, no OTP.
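If you'd rather stay in Python than shell out to curl, here's a minimal sketch of pulling the token out of the response body. The exact JSON shape is an assumption based on the `accesskey.token` path mentioned above; check the actual response you get back.

```python
import json

def extract_token(signup_response_body: str) -> str:
    """Pull the bearer token from a VoIPBin signup response.

    Assumes the body nests the token as {"accesskey": {"token": ...}},
    per the accesskey.token path described above.
    """
    body = json.loads(signup_response_body)
    return body["accesskey"]["token"]

# Hypothetical response body for illustration:
sample = '{"accesskey": {"token": "vb_abc123"}}'
print(extract_token(sample))  # → vb_abc123
```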


Step 2: Create a Phone Number

curl -s -X POST https://api.voipbin.net/v1.0/numbers \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "country_code": "US",
    "number_type": "local",
    "webhook_url": "https://your-server.com/call/incoming"
  }'

When any call arrives at this number, VoIPBin POSTs to your webhook.
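For orientation, here's roughly what those webhook bodies look like. The only fields the handlers below actually rely on are `call_id` and `transcript`; everything else here is an assumed field name, not the full VoIPBin schema.

```python
# Sketch of the payload POSTed when a call arrives (field names beyond
# call_id are assumptions for illustration):
incoming_example = {
    "call_id": "c-42",        # used to address follow-up actions
    "from": "+15550001111",   # caller's number (assumed field)
    "to": "+15550002222",     # your VoIPBin number (assumed field)
}

# Sketch of the payload POSTed to the listen_webhook after STT runs:
response_example = {
    "call_id": "c-42",
    "transcript": "I was double charged last month",  # STT output
}
```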


Step 3: Build the Intent Classifier

Here's a minimal Python function using OpenAI's API to classify caller intent:

from openai import OpenAI

client = OpenAI(api_key="YOUR_OPENAI_KEY")

ROUTES = {
    "billing": "Caller asking about invoices, payments, refunds, or charges",
    "support": "Caller reporting a bug, outage, or needing technical help",
    "sales":   "Caller asking about pricing, plans, demos, or purchasing",
    "other":   "Anything that doesn't match the above"
}

def classify_intent(transcript: str) -> str:
    route_desc = "\n".join(f"- {k}: {v}" for k, v in ROUTES.items())
    prompt = f"""Classify the caller's intent into exactly one of: billing, support, sales, other.

Routes:\n{route_desc}

Caller said: \"{transcript}\"

Respond with only the route name."""

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=10,
        temperature=0
    )
    return response.choices[0].message.content.strip().lower()
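One guard worth adding before you trust that return value: LLMs occasionally wrap the label in extra words or punctuation, and an unknown label would fall through your routing. A small normalizer that defaults to "other" keeps the router safe:

```python
# Route names matching the ROUTES dict above.
KNOWN_ROUTES = {"billing", "support", "sales", "other"}

def safe_intent(raw_model_reply: str) -> str:
    """Map a raw model reply onto a known route, defaulting to 'other'."""
    label = raw_model_reply.strip().strip('."').lower()
    return label if label in KNOWN_ROUTES else "other"

print(safe_intent("Billing"))         # → billing
print(safe_intent("I think sales."))  # → other (reply wasn't a bare label)
```

Wrap `classify_intent`'s return in `safe_intent` and a misbehaving model degrades to the "other" route instead of crashing your lookup.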

Step 4: Handle the Webhook and Route

from flask import Flask, request, jsonify
import requests

app = Flask(__name__)

VOIPBIN_TOKEN = "YOUR_TOKEN"
VOIPBIN_BASE  = "https://api.voipbin.net/v1.0"

# Map intent → human agent SIP URIs or VoIPBin flow IDs
ROUTE_MAP = {
    "billing": {"type": "transfer", "destination": "sip:billing@yourdomain.com"},
    "support": {"type": "transfer", "destination": "sip:support@yourdomain.com"},
    "sales":   {"type": "transfer", "destination": "sip:sales@yourdomain.com"},
    "other":   {"type": "speak",    "text": "I'm not sure how to help with that. Let me connect you to our main team."},
}

@app.route("/call/incoming", methods=["POST"])
def handle_call():
    body = request.json
    call_id = body["call_id"]

    # Step 1: Greet the caller and collect their voice input
    _send_action(call_id, {
        "action": "speak_and_listen",
        "text": "Hi, thanks for calling. How can I help you today?",
        "listen_webhook": "https://your-server.com/call/response"
    })
    return jsonify({"status": "ok"})

@app.route("/call/response", methods=["POST"])
def handle_response():
    body = request.json
    call_id    = body["call_id"]
    transcript = body.get("transcript", "")

    intent = classify_intent(transcript)
    route  = ROUTE_MAP.get(intent, ROUTE_MAP["other"])

    if route["type"] == "transfer":
        _send_action(call_id, {
            "action": "speak",
            "text": f"Let me connect you to our {intent} team."
        })
        _send_action(call_id, {
            "action": "transfer",
            "destination": route["destination"]
        })
    else:
        _send_action(call_id, {
            "action": "speak",
            "text": route["text"]
        })

    return jsonify({"status": "ok"})

def _send_action(call_id: str, payload: dict):
    requests.post(
        f"{VOIPBIN_BASE}/calls/{call_id}/actions",
        headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
        json=payload
    )

if __name__ == "__main__":
    app.run(port=5000)

Step 5: Handling Ambiguity Gracefully

Callers aren't always clear. "I need help" could mean billing or support. A good router asks a follow-up when confidence is low:

def classify_intent_with_confidence(transcript: str) -> tuple[str, float]:
    prompt = f"""Classify intent as billing/support/sales/other.
Also give a confidence score 0.0-1.0.
Format: intent|score

Caller: \"{transcript}\""""

    result = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=20,
        temperature=0
    ).choices[0].message.content.strip()

    parts = result.split("|")
    intent = parts[0].strip().lower()
    try:
        score = float(parts[1].strip()) if len(parts) > 1 else 0.5
    except ValueError:
        # Model returned something unparseable as a score; treat as uncertain
        score = 0.5
    return intent, score

# In your webhook handler:
intent, confidence = classify_intent_with_confidence(transcript)

if confidence < 0.7:
    # Ask a clarifying question
    _send_action(call_id, {
        "action": "speak_and_listen",
        "text": "Could you tell me a bit more — are you calling about a technical issue, billing, or something else?",
        "listen_webhook": "https://your-server.com/call/response"
    })
else:
    # Route with confidence
    ...
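One practical detail the snippet above glosses over: if the caller is never clear, the clarifying question loops forever. A minimal sketch of a per-call attempt counter (in-memory here; you'd want Redis or similar behind multiple workers):

```python
from collections import defaultdict

MAX_CLARIFY_ATTEMPTS = 2
_attempts: dict[str, int] = defaultdict(int)

def should_clarify(call_id: str, confidence: float) -> bool:
    """Ask a follow-up only while confidence is low and attempts remain."""
    if confidence >= 0.7:
        return False
    _attempts[call_id] += 1
    return _attempts[call_id] <= MAX_CLARIFY_ATTEMPTS

print(should_clarify("c-1", 0.4))  # → True  (first low-confidence turn)
print(should_clarify("c-1", 0.4))  # → True  (second)
print(should_clarify("c-1", 0.4))  # → False (give up, route to "other")
```

When `should_clarify` returns False on a low-confidence turn, fall back to the "other" route rather than asking again.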

What You Actually Built

Let's be clear about what happened here:

Traditional IVR                  AI Router
------------------------------   ------------------------------
Rigid menu tree                  Natural language input
Caller must memorize options     Caller just speaks their need
New intent = redeploy menu       New intent = update prompt
DTMF only                        Voice + intent classification
No nuance                        Confidence-based fallback

And your backend never dealt with:

  • RTP streams
  • DTMF tone decoding
  • STT pipeline setup
  • Audio codec negotiation

VoIPBin absorbed all of that. Your code is just HTTP in, HTTP out.


Adding More Routes

Extending the router is trivial:

ROUTES["appointment"] = "Caller wants to schedule or reschedule a meeting"
ROUTE_MAP["appointment"] = {
    "type": "flow",
    "flow_id": "your-voipbin-booking-flow-id"
}

No IVR tree to redraw. No press-5-for-appointments to record. One line of Python.
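One caveat: the Step 4 handler only dispatches "transfer" and "speak", so a "flow"-type route needs one more branch. A sketch of that dispatch is below; note that "run_flow" is an assumed action name, not a confirmed VoIPBin API — check the flow-execution docs for the real one.

```python
def dispatch(call_id: str, route: dict, send=None):
    """Send the right action for a route entry. `send` defaults to the
    _send_action helper from Step 4; injectable here for testing."""
    send = send or _send_action
    if route["type"] == "transfer":
        send(call_id, {"action": "transfer", "destination": route["destination"]})
    elif route["type"] == "flow":
        # "run_flow" is an assumption for illustration, not a confirmed action name
        send(call_id, {"action": "run_flow", "flow_id": route["flow_id"]})
    else:
        send(call_id, {"action": "speak", "text": route["text"]})
```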


Try It

  • Sign up: POST https://api.voipbin.net/v1.0/auth/signup
  • SDK: go get github.com/voipbin/voipbin-go
  • MCP Server: uvx voipbin-mcp (works with Claude Code and Cursor)
  • Website: voipbin.net

If you're building any kind of voice-first app and don't want to deal with the telephony layer, VoIPBin lets you stay in your comfort zone: HTTP, JSON, and clean code.


Have questions or a use case to share? Drop them in the comments.
