DEV Community

voipbin
Build a Smart AI Call Router: Let Intent Decide Where Every Call Goes

Traditional phone menus are a form of user punishment.

"Press 1 for sales. Press 2 for support. Press 3 for billing. Press 4 to hear these options again."

Everyone hates them. The caller hates navigating them. The developer hates maintaining them. And yet — for years — this was the only scalable option.

Until AI made it possible to just ask: "What do you need?" and route based on the actual answer.

This post walks through building a smart call router using VoIPBin that:

  • Answers the call and greets the caller
  • Captures what they say
  • Classifies their intent with an LLM
  • Routes them to the right flow or agent

No DTMF menus. No press-1-for-X. Just natural language routing.


The Architecture

Incoming Call
     ↓
VoIPBin answers → speaks greeting → STT captures response
     ↓
Webhook → Your Server
     ↓
LLM classifies intent ("billing", "support", "sales", "other")
     ↓
VoIPBin executes route-specific flow (transfer / speak / hang up)

Your backend never processes audio. It receives transcribed text, runs classification, and returns routing instructions. VoIPBin handles everything telephony-related.


Step 1: Get a VoIPBin API Key

curl -s -X POST https://api.voipbin.net/v1.0/auth/signup \
  -H "Content-Type: application/json" \
  -d '{"username": "your@email.com", "password": "yourpassword"}'

The response includes accesskey.token — use that as your bearer token. No email verification, no OTP.
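If you'd rather stay in Python than shell out to curl, here's a minimal sketch of pulling the token out of the response body. The exact JSON shape is an assumption based on the `accesskey.token` path mentioned above; check the actual response you get back.

```python
import json

def extract_token(signup_response_body: str) -> str:
    """Pull the bearer token from a VoIPBin signup response.

    Assumes the body nests the token as {"accesskey": {"token": ...}},
    per the accesskey.token path described above.
    """
    body = json.loads(signup_response_body)
    return body["accesskey"]["token"]

# Hypothetical response body for illustration:
sample = '{"accesskey": {"token": "vb_abc123"}}'
print(extract_token(sample))  # → vb_abc123
```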


Step 2: Create a Phone Number

curl -s -X POST https://api.voipbin.net/v1.0/numbers \
  -H "Authorization: Bearer YOUR_TOKEN" \
  -H "Content-Type: application/json" \
  -d '{
    "country_code": "US",
    "number_type": "local",
    "webhook_url": "https://your-server.com/call/incoming"
  }'

When any call arrives at this number, VoIPBin POSTs to your webhook.
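For orientation, here's roughly what those webhook bodies look like. The only fields the handlers below actually rely on are `call_id` and `transcript`; everything else here is an assumed field name, not the full VoIPBin schema.

```python
# Sketch of the payload POSTed when a call arrives (field names beyond
# call_id are assumptions for illustration):
incoming_example = {
    "call_id": "c-42",        # used to address follow-up actions
    "from": "+15550001111",   # caller's number (assumed field)
    "to": "+15550002222",     # your VoIPBin number (assumed field)
}

# Sketch of the payload POSTed to the listen_webhook after STT runs:
response_example = {
    "call_id": "c-42",
    "transcript": "I was double charged last month",  # STT output
}
```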


Step 3: Build the Intent Classifier

Here's a minimal Python function using OpenAI's API to classify caller intent:

from openai import OpenAI

client = OpenAI(api_key="YOUR_OPENAI_KEY")

ROUTES = {
    "billing": "Caller asking about invoices, payments, refunds, or charges",
    "support": "Caller reporting a bug, outage, or needing technical help",
    "sales":   "Caller asking about pricing, plans, demos, or purchasing",
    "other":   "Anything that doesn't match the above"
}

def classify_intent(transcript: str) -> str:
    route_desc = "\n".join(f"- {k}: {v}" for k, v in ROUTES.items())
    prompt = f"""Classify the caller's intent into exactly one of: billing, support, sales, other.

Routes:\n{route_desc}

Caller said: \"{transcript}\"

Respond with only the route name."""

    response = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=10,
        temperature=0
    )
    return response.choices[0].message.content.strip().lower()
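One guard worth adding before you trust that return value: LLMs occasionally wrap the label in extra words or punctuation, and an unknown label would fall through your routing. A small normalizer that defaults to "other" keeps the router safe:

```python
# Route names matching the ROUTES dict above.
KNOWN_ROUTES = {"billing", "support", "sales", "other"}

def safe_intent(raw_model_reply: str) -> str:
    """Map a raw model reply onto a known route, defaulting to 'other'."""
    label = raw_model_reply.strip().strip('."').lower()
    return label if label in KNOWN_ROUTES else "other"

print(safe_intent("Billing"))         # → billing
print(safe_intent("I think sales."))  # → other (reply wasn't a bare label)
```

Wrap `classify_intent`'s return in `safe_intent` and a misbehaving model degrades to the "other" route instead of crashing your lookup.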

Step 4: Handle the Webhook and Route

from flask import Flask, request, jsonify
import requests

app = Flask(__name__)

VOIPBIN_TOKEN = "YOUR_TOKEN"
VOIPBIN_BASE  = "https://api.voipbin.net/v1.0"

# Map intent → human agent SIP URIs or VoIPBin flow IDs
ROUTE_MAP = {
    "billing": {"type": "transfer", "destination": "sip:billing@yourdomain.com"},
    "support": {"type": "transfer", "destination": "sip:support@yourdomain.com"},
    "sales":   {"type": "transfer", "destination": "sip:sales@yourdomain.com"},
    "other":   {"type": "speak",    "text": "I'm not sure how to help with that. Let me connect you to our main team."},
}

@app.route("/call/incoming", methods=["POST"])
def handle_call():
    body = request.json
    call_id = body["call_id"]

    # Step 1: Greet the caller and collect their voice input
    _send_action(call_id, {
        "action": "speak_and_listen",
        "text": "Hi, thanks for calling. How can I help you today?",
        "listen_webhook": "https://your-server.com/call/response"
    })
    return jsonify({"status": "ok"})

@app.route("/call/response", methods=["POST"])
def handle_response():
    body = request.json
    call_id    = body["call_id"]
    transcript = body.get("transcript", "")

    intent = classify_intent(transcript)
    route  = ROUTE_MAP.get(intent, ROUTE_MAP["other"])

    if route["type"] == "transfer":
        _send_action(call_id, {
            "action": "speak",
            "text": f"Let me connect you to our {intent} team."
        })
        _send_action(call_id, {
            "action": "transfer",
            "destination": route["destination"]
        })
    else:
        _send_action(call_id, {
            "action": "speak",
            "text": route["text"]
        })

    return jsonify({"status": "ok"})

def _send_action(call_id: str, payload: dict):
    requests.post(
        f"{VOIPBIN_BASE}/calls/{call_id}/actions",
        headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
        json=payload
    )

if __name__ == "__main__":
    app.run(port=5000)

Step 5: Handling Ambiguity Gracefully

Callers aren't always clear. "I need help" could mean billing or support. A good router asks a follow-up when confidence is low:

def classify_intent_with_confidence(transcript: str) -> tuple[str, float]:
    prompt = f"""Classify intent as billing/support/sales/other.
Also give a confidence score 0.0-1.0.
Format: intent|score

Caller: \"{transcript}\""""

    result = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{"role": "user", "content": prompt}],
        max_tokens=20,
        temperature=0
    ).choices[0].message.content.strip()

    parts = result.split("|")
    intent = parts[0].strip().lower()
    try:
        score = float(parts[1].strip()) if len(parts) > 1 else 0.5
    except ValueError:
        # Model returned something unparseable as a score; treat as uncertain
        score = 0.5
    return intent, score

# In your webhook handler:
intent, confidence = classify_intent_with_confidence(transcript)

if confidence < 0.7:
    # Ask a clarifying question
    _send_action(call_id, {
        "action": "speak_and_listen",
        "text": "Could you tell me a bit more — are you calling about a technical issue, billing, or something else?",
        "listen_webhook": "https://your-server.com/call/response"
    })
else:
    # Route with confidence
    ...
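One practical detail the snippet above glosses over: if the caller is never clear, the clarifying question loops forever. A minimal sketch of a per-call attempt counter (in-memory here; you'd want Redis or similar behind multiple workers):

```python
from collections import defaultdict

MAX_CLARIFY_ATTEMPTS = 2
_attempts: dict[str, int] = defaultdict(int)

def should_clarify(call_id: str, confidence: float) -> bool:
    """Ask a follow-up only while confidence is low and attempts remain."""
    if confidence >= 0.7:
        return False
    _attempts[call_id] += 1
    return _attempts[call_id] <= MAX_CLARIFY_ATTEMPTS

print(should_clarify("c-1", 0.4))  # → True  (first low-confidence turn)
print(should_clarify("c-1", 0.4))  # → True  (second)
print(should_clarify("c-1", 0.4))  # → False (give up, route to "other")
```

When `should_clarify` returns False on a low-confidence turn, fall back to the "other" route rather than asking again.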

What You Actually Built

Let's be clear about what happened here:

Traditional IVR                  AI Router
------------------------------   ------------------------------
Rigid menu tree                  Natural language input
Caller must memorize options     Caller just speaks their need
New intent = redeploy menu       New intent = update prompt
DTMF only                        Voice + intent classification
No nuance                        Confidence-based fallback

And your backend never dealt with:

  • RTP streams
  • DTMF tone decoding
  • STT pipeline setup
  • Audio codec negotiation

VoIPBin absorbed all of that. Your code is just HTTP in, HTTP out.


Adding More Routes

Extending the router is trivial:

ROUTES["appointment"] = "Caller wants to schedule or reschedule a meeting"
ROUTE_MAP["appointment"] = {
    "type": "flow",
    "flow_id": "your-voipbin-booking-flow-id"
}

No IVR tree to redraw. No press-5-for-appointments to record. One line of Python.
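One caveat: the Step 4 handler only dispatches "transfer" and "speak", so a "flow"-type route needs one more branch. A sketch of that dispatch is below; note that "run_flow" is an assumed action name, not a confirmed VoIPBin API — check the flow-execution docs for the real one.

```python
def dispatch(call_id: str, route: dict, send=None):
    """Send the right action for a route entry. `send` defaults to the
    _send_action helper from Step 4; injectable here for testing."""
    send = send or _send_action
    if route["type"] == "transfer":
        send(call_id, {"action": "transfer", "destination": route["destination"]})
    elif route["type"] == "flow":
        # "run_flow" is an assumption for illustration, not a confirmed action name
        send(call_id, {"action": "run_flow", "flow_id": route["flow_id"]})
    else:
        send(call_id, {"action": "speak", "text": route["text"]})
```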


Try It

  • Sign up: POST https://api.voipbin.net/v1.0/auth/signup
  • SDK: go get github.com/voipbin/voipbin-go
  • MCP Server: uvx voipbin-mcp (works with Claude Code and Cursor)
  • Website: voipbin.net

If you're building any kind of voice-first app and don't want to deal with the telephony layer, VoIPBin lets you stay in your comfort zone: HTTP, JSON, and clean code.


Have questions or a use case to share? Drop them in the comments.
