Every caller hears it: "Press 1 for Sales. Press 2 for Support. Press 3 for Billing."
And every caller hates it.
Numeric IVR menus were invented because computers could not understand speech. That limitation is gone. Your AI can now listen to what callers actually say and route them instantly — no dial-pad gymnastics required.
This post shows you how to build natural language call routing with a real phone number, in about 50 lines of Python.
## The Problem with Traditional IVR
Classic IVR works like this:
- Play a menu
- Wait for a DTMF keypress
- Branch on the digit
Simple code, terrible experience:
- Callers forget which option they want by option 7
- "Press 0 to hear these options again" is a UX failure
- Callers say "representative" into the void and nothing happens
- Any menu change requires re-recording audio
The fix is obvious: let the caller say what they want.
## The Architecture

```
Caller dials your number
          |
          v
VoIPBin answers;
STT converts speech to text
          |
          v
Your webhook receives
the transcript
          |
          v
LLM classifies intent:
sales / support / billing / other
          |
          v
VoIPBin transfers the call
to the right destination
```
Your server never touches audio. VoIPBin handles RTP, STT, and TTS. You write pure business logic.
## Step 1: Sign Up and Get a Number

```bash
# Create an account
curl -X POST https://api.voipbin.net/v1.0/auth/signup \
  -H "Content-Type: application/json" \
  -d '{"username":"you","password":"pass","email":"you@example.com"}'

# Save the accesskey.token as TOKEN

# Buy a phone number
curl -X POST https://api.voipbin.net/v1.0/numbers \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"country_code":"US","area_code":"415"}'
```

The number is active immediately. Point it at your webhook in the VoIPBin dashboard.
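The signup step above notes that `accesskey.token` should be saved as `TOKEN`. A quick way to pull it out of the JSON response with nothing but the Python standard library (the response shape here is assumed from that note; substitute your real signup response for the echoed example):

```shell
# Extract accesskey.token from a signup-style response; pipe the real
# curl output into the same python3 one-liner. jq works too if installed.
TOKEN=$(echo '{"accesskey":{"token":"abc123"}}' \
  | python3 -c 'import json,sys; print(json.load(sys.stdin)["accesskey"]["token"])')
echo "$TOKEN"
```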
## Step 2: Answer the Call and Prompt the Caller

```python
from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/call", methods=["POST"])
def handle_call():
    return jsonify({
        "actions": [
            {
                "type": "talk",
                "text": "Hi! Who would you like to speak with? "
                        "You can say Sales, Support, Billing, or describe your issue."
            },
            {
                "type": "input",
                "speech": {"timeout": 5},
                "action_url": "https://yourserver.com/route"
            }
        ]
    })
```

The `input` action listens for speech, transcribes it, and POSTs the result to `/route`.
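Before writing the routing handler, it helps to know what that POST looks like. The payload shape below is an assumption inferred from the parsing code in the next step (`{"speech": {"results": [{"text": ...}]}}`); a small defensive extractor keeps the handler from crashing when a field is missing:

```python
# Hypothetical /route payload, matching the shape the handler reads.
example_payload = {
    "speech": {"results": [{"text": "I need help with my invoice"}]}
}

def extract_transcript(payload: dict) -> str:
    # Walk the nested structure defensively so any missing level yields "".
    results = (payload.get("speech") or {}).get("results") or []
    return results[0].get("text", "") if results else ""
```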
## Step 3: Classify Intent and Transfer

```python
import openai

ROUTING_TABLE = {
    "sales": "+14155550100",
    "support": "+14155550101",
    "billing": "+14155550102",
}

def classify_intent(transcript: str) -> str:
    resp = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a call routing classifier. "
                    "Respond with exactly one word: sales, support, billing, or unknown."
                )
            },
            {"role": "user", "content": transcript}
        ],
        max_tokens=5,
    )
    return resp.choices[0].message.content.strip().lower()

@app.route("/route", methods=["POST"])
def route_call():
    data = request.json
    transcript = data.get("speech", {}).get("results", [{}])[0].get("text", "")
    intent = classify_intent(transcript)
    destination = ROUTING_TABLE.get(intent)
    if destination:
        return jsonify({
            "actions": [
                {"type": "talk", "text": f"Connecting you to {intent} now."},
                {"type": "transfer", "destination": destination}
            ]
        })
    # Fallback: unknown intent goes to the main desk
    return jsonify({
        "actions": [
            {"type": "talk", "text": "Let me transfer you to our main desk."},
            {"type": "transfer", "destination": "+14155550199"}
        ]
    })

if __name__ == "__main__":
    app.run(port=5000)
```

That is the complete router.
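One hardening worth adding: despite the one-word instruction, models occasionally reply with punctuation or a short sentence. A small guard (hypothetical helper, not part of any SDK) clamps any off-script reply to a known label before the table lookup:

```python
VALID_INTENTS = {"sales", "support", "billing", "unknown"}

def normalize_intent(raw: str) -> str:
    # Strip whitespace, casing, and trailing punctuation; anything that
    # still isn't a known label falls through to "unknown".
    word = raw.strip().lower().rstrip(".!?")
    return word if word in VALID_INTENTS else "unknown"
```

Call it as `normalize_intent(classify_intent(transcript))` in `route_call` so a chatty reply lands at the main desk instead of raising a `KeyError`-style surprise later.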
## What Callers Experience
| Caller says | Intent | Goes to |
|---|---|---|
| "I want to buy something" | sales | Sales team |
| "My account is broken" | support | Support team |
| "Question about my invoice" | billing | Billing team |
| "Can I speak to someone?" | unknown | Main desk |
| "Quiero hablar con soporte" | support | Support team |
That last row matters. LLM-based routing handles multilingual callers with zero extra configuration.
## Useful Extensions
**Priority routing** — detect urgent calls before classification runs:

```python
# At the top of route_call(), before classify_intent():
PRIORITY_KEYWORDS = {"urgent", "outage", "down", "emergency"}

if any(kw in transcript.lower() for kw in PRIORITY_KEYWORDS):
    return jsonify({
        "actions": [
            {"type": "talk", "text": "This sounds urgent. Connecting you to on-call now."},
            {"type": "transfer", "destination": "+14155550911"}
        ]
    })
```
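Note that a substring check also fires on words like "download" or "countdown". A whole-word variant avoids the false positives; this helper is a sketch, not part of the router above:

```python
import re

PRIORITY_KEYWORDS = {"urgent", "outage", "down", "emergency"}

def is_priority(transcript: str) -> bool:
    # Tokenize into whole words so "download" does not match "down".
    words = set(re.findall(r"[a-z']+", transcript.lower()))
    return bool(PRIORITY_KEYWORDS & words)
```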
**Multi-turn context** — if intent is unclear, ask a follow-up and send the full exchange to the LLM:

```python
def classify_with_context(turns: list[dict]) -> str:
    messages = [{"role": "system", "content": "Route to: sales, support, billing, or unknown."}]
    messages.extend(turns)
    resp = openai.chat.completions.create(
        model="gpt-4o-mini", messages=messages, max_tokens=5
    )
    return resp.choices[0].message.content.strip().lower()
```
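Since each webhook hit is an independent request, the `turns` list has to be stored somewhere between hits. A minimal sketch, assuming the webhook payload carries a call ID to key on; the in-memory dict is for illustration only and should be Redis or a database in production:

```python
from collections import defaultdict

# Hypothetical per-call turn store; in-process memory won't survive
# restarts or multiple workers, so swap this for Redis in production.
_turns: dict[str, list[dict]] = defaultdict(list)

def record_turn(call_id: str, role: str, text: str) -> list[dict]:
    _turns[call_id].append({"role": role, "content": text})
    return _turns[call_id]

# Example: three webhook hits for the same call accumulate context
record_turn("call-123", "user", "I have a question")
record_turn("call-123", "assistant", "Is it about an order, an issue, or an invoice?")
turns = record_turn("call-123", "user", "It's about my invoice")
# turns now holds the full exchange, ready for classify_with_context(turns)
```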
## Why This Scales
Your server is stateless. Every webhook hit is an independent HTTP request. Run it behind any load balancer with zero sticky-session config. Ten calls or ten thousand — same code, same latency.
VoIPBin owns the stateful parts: active call legs, audio streams, DTMF detection. Your code stays clean.
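Concretely, scaling a stateless Flask app is just a matter of adding workers. For example, assuming gunicorn as the WSGI server and the router saved as `app.py`:

```shell
pip install gunicorn
# 4 independent worker processes; no shared state or sticky sessions needed
gunicorn --workers 4 --bind 0.0.0.0:5000 app:app
```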
## Run It Locally

```bash
pip install flask openai

# Tunnel for local testing
npx localtunnel --port 5000

# Set your VoIPBin webhook to <tunnel-url>/call
# Call your purchased number
```
For production, deploy to any Python host (Railway, Render, Fly.io) and update the webhook URL.
## What You Built
- A phone number that understands natural language
- LLM-based intent classification — swap models any time
- Automatic transfer to the right team
- Multilingual support at zero extra cost
- A stateless, horizontally scalable webhook server
No telephony SDK. No DTMF parsing. No recorded menus to maintain.
The days of "Press 1 for Sales" are over.
## Resources

- VoIPBin: https://voipbin.net
- Sign up: `POST https://api.voipbin.net/v1.0/auth/signup`
- MCP Server: `uvx voipbin-mcp` (works in Claude Code and Cursor)
- Go SDK: `go get github.com/voipbin/voipbin-go`