voipbin

Replace 'Press 1 for Sales' with AI Intent Routing

Every caller hears it: "Press 1 for Sales. Press 2 for Support. Press 3 for Billing."

And every caller hates it.

Numeric IVR menus were invented because computers could not understand speech. That limitation is gone. Your AI can now listen to what callers actually say and route them instantly — no dial-pad gymnastics required.

This post shows you how to build natural language call routing with a real phone number, in about 50 lines of Python.


The Problem with Traditional IVR

Classic IVR works like this:

  1. Play a menu
  2. Wait for a DTMF keypress
  3. Branch on the digit
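
For contrast, the branching step of a classic IVR is little more than a dictionary lookup. A minimal sketch, where `DTMF_MENU` and `route_by_digit` are illustrative names and the phone numbers are placeholders:

```python
from typing import Optional

# Hypothetical digit-to-destination table; the numbers are placeholders.
DTMF_MENU = {
    "1": "+14155550100",  # Sales
    "2": "+14155550101",  # Support
    "3": "+14155550102",  # Billing
}


def route_by_digit(digit: str) -> Optional[str]:
    """Return the transfer destination for a keypress, or None to replay the menu."""
    return DTMF_MENU.get(digit)
```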

Simple code, terrible experience:

  • Callers forget which option they want by option 7
  • "Press 0 to hear these options again" is a UX failure
  • Callers say "representative" into the void and nothing happens
  • Any menu change requires re-recording audio

The fix is obvious: let the caller say what they want.


The Architecture

Caller dials your number
        |
        v
  VoIPBin answers
  STT converts speech to text
        |
        v
  Your webhook receives
  the transcript
        |
        v
  LLM classifies intent
  sales / support / billing / other
        |
        v
  VoIPBin transfers call
  to the right destination

Your server never touches audio. VoIPBin handles RTP, STT, and TTS. You write pure business logic.


Step 1: Sign Up and Get a Number

# Create an account
curl -X POST https://api.voipbin.net/v1.0/auth/signup \
  -H "Content-Type: application/json" \
  -d '{"username":"you","password":"pass","email":"you@example.com"}'
# Save the accesskey.token as TOKEN

# Buy a phone number
curl -X POST https://api.voipbin.net/v1.0/numbers \
  -H "Authorization: Bearer $TOKEN" \
  -H "Content-Type: application/json" \
  -d '{"country_code":"US","area_code":"415"}'

The number is active immediately. Point it to your webhook in the VoIPBin dashboard.


Step 2: Answer the Call and Prompt the Caller

from flask import Flask, request, jsonify

app = Flask(__name__)

@app.route("/call", methods=["POST"])
def handle_call():
    return jsonify({
        "actions": [
            {
                "type": "talk",
                "text": "Hi! Who would you like to speak with? "
                        "You can say Sales, Support, Billing, or describe your issue."
            },
            {
                "type": "input",
                "speech": {"timeout": 5},
                "action_url": "https://yourserver.com/route"
            }
        ]
    })

The input action listens for speech, transcribes it, and POSTs the result to /route.
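
Callers sometimes stay silent or get cut off, so the posted body may not contain a transcript at all. Below is a defensive sketch of reading it. The payload shape (`speech.results[0].text`) is the one this post's Step 3 code reads, not something checked against VoIPBin's reference documentation, and `extract_transcript` is an illustrative helper name.

```python
def extract_transcript(payload) -> str:
    """Return the first transcript in a webhook body, or "" if there is none.

    Assumes the shape used in this post:
    {"speech": {"results": [{"text": "..."}]}}
    """
    if not isinstance(payload, dict):
        return ""
    speech = payload.get("speech")
    if not isinstance(speech, dict):
        return ""
    results = speech.get("results")
    if not isinstance(results, list) or not results:
        return ""
    first = results[0]
    if not isinstance(first, dict):
        return ""
    text = first.get("text")
    return text.strip() if isinstance(text, str) else ""
```

An empty return value is a reasonable signal to re-prompt the caller or send them to a default destination rather than spend an LLM call on nothing.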


Step 3: Classify Intent and Transfer

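# Continues the Step 2 file: app, request, and jsonify are already defined.
import openai  # the module-level client reads OPENAI_API_KEY from the environment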

ROUTING_TABLE = {
    "sales":   "+14155550100",
    "support": "+14155550101",
    "billing": "+14155550102",
}

def classify_intent(transcript: str) -> str:
    resp = openai.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {
                "role": "system",
                "content": (
                    "You are a call routing classifier. "
                    "Respond with exactly one word: sales, support, billing, or unknown."
                )
            },
            {"role": "user", "content": transcript}
        ],
        max_tokens=5,
    )
    return resp.choices[0].message.content.strip().lower()


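@app.route("/route", methods=["POST"])
def route_call():
    # Tolerate a missing or non-JSON body instead of raising.
    data = request.get_json(silent=True) or {}
    # An empty results list (the caller said nothing) must not raise IndexError.
    results = data.get("speech", {}).get("results") or [{}]
    transcript = results[0].get("text", "")
    # Skip the LLM call when there is nothing to classify.
    intent = classify_intent(transcript) if transcript else "unknown"
    destination = ROUTING_TABLE.get(intent)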

    if destination:
        return jsonify({
            "actions": [
                {"type": "talk", "text": f"Connecting you to {intent} now."},
                {"type": "transfer", "destination": destination}
            ]
        })

    # Fallback
    return jsonify({
        "actions": [
            {"type": "talk", "text": "Let me transfer you to our main desk."},
            {"type": "transfer", "destination": "+14155550199"}
        ]
    })


if __name__ == "__main__":
    app.run(port=5000)

That is the complete router.
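
One caveat: the classifier returns whatever the model wrote, so a reply such as "Sales." would miss the routing table and fall through to the main desk. A small normalization step guards against that. This is a sketch, and `VALID_INTENTS` and `normalize_intent` are illustrative names rather than part of any library:

```python
import re

VALID_INTENTS = {"sales", "support", "billing"}


def normalize_intent(raw: str) -> str:
    """Map a raw model reply to a known label, or "unknown"."""
    words = re.findall(r"[a-z]+", (raw or "").lower())
    # Accept the reply only if its first word is a known label.
    if words and words[0] in VALID_INTENTS:
        return words[0]
    return "unknown"
```

Wrapping the return value of `classify_intent` in `normalize_intent(...)` keeps stray punctuation or a chatty reply from changing where the call goes.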


What Callers Experience

Caller says                    Intent    Goes to
"I want to buy something"      sales     Sales team
"My account is broken"         support   Support team
"Question about my invoice"    billing   Billing team
"Can I speak to someone?"      unknown   Main desk
"Quiero hablar con soporte"    support   Support team

That last row matters. The classifier needs no per-language configuration, provided the speech-to-text step transcribes the caller's language.


Useful Extensions

Priority routing — detect urgent calls before classification runs:

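import re

PRIORITY_KEYWORDS = {"urgent", "outage", "down", "emergency"}

# Place this inside route_call(), before classify_intent() is called.
# Match whole words so that "download" does not trigger on "down".
words = set(re.findall(r"[a-z]+", transcript.lower()))
if words & PRIORITY_KEYWORDS:
    return jsonify({
        "actions": [
            {"type": "talk", "text": "This sounds urgent. Connecting you to on-call now."},
            {"type": "transfer", "destination": "+14155550911"}
        ]
    })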

Multi-turn context — if intent is unclear, ask a follow-up and send the full exchange to the LLM:

def classify_with_context(turns: list[dict]) -> str:
    messages = [{"role": "system", "content": "Route to: sales, support, billing, or unknown."}]
    messages.extend(turns)
    resp = openai.chat.completions.create(
        model="gpt-4o-mini", messages=messages, max_tokens=5
    )
    return resp.choices[0].message.content.strip().lower()
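
The `turns` argument is a list of chat messages. One way to build it, sketched here with an illustrative helper name (`build_turns`), is to pair each prompt spoken to the caller with the caller's transcribed reply:

```python
def build_turns(exchanges: list[tuple[str, str]]) -> list[dict]:
    """Convert (prompt spoken to caller, caller's reply) pairs into chat messages."""
    turns = []
    for prompt, reply in exchanges:
        # What the system said is the assistant turn; the transcript is the user turn.
        turns.append({"role": "assistant", "content": prompt})
        turns.append({"role": "user", "content": reply})
    return turns
```

The result can be passed straight to `classify_with_context`, which prepends the system prompt.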

Why This Scales

Your server is stateless. Every webhook hit is an independent HTTP request, so you can run it behind any load balancer without sticky sessions. Ten calls or ten thousand, the code is the same; you scale by adding instances.

VoIPBin owns the stateful parts: active call legs, audio streams, DTMF detection. Your code stays clean.
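
The multi-turn extension needs the earlier turns on every request, which would seem to require server-side memory. One way to stay stateless is to carry the conversation in the callback URL. The sketch below assumes VoIPBin calls `action_url` with its query string intact, which is not confirmed here; `url_with_state` and `state_from_url` are illustrative names.

```python
import base64
import json
from urllib.parse import parse_qs, urlencode, urlparse


def url_with_state(base_url: str, turns: list) -> str:
    """Append the conversation so far to a callback URL as a URL-safe token."""
    token = base64.urlsafe_b64encode(json.dumps(turns).encode("utf-8")).decode("ascii")
    return f"{base_url}?{urlencode({'state': token})}"


def state_from_url(url: str) -> list:
    """Recover the turns from a callback URL; a missing or bad token yields []."""
    values = parse_qs(urlparse(url).query).get("state")
    if not values:
        return []
    try:
        return json.loads(base64.urlsafe_b64decode(values[0]).decode("utf-8"))
    except ValueError:
        # Covers bad base64, bad UTF-8, and bad JSON, which all subclass ValueError.
        return []
```

The token is unsigned, so anyone who can reach the webhook could forge it. For anything beyond a demo, sign it (for example with `hmac`) or store the turns in a shared cache keyed by call ID.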


Run It Locally

pip install flask openai

# Tunnel for local testing
npx localtunnel --port 5000

# Set your VoIPBin webhook to <tunnel-url>/call
# Call your purchased number

For production, deploy to any Python host (Railway, Render, Fly.io) and update the webhook URL.


What You Built

  • A phone number that understands natural language
  • LLM-based intent classification — swap models any time
  • Automatic transfer to the right team
  • Multilingual support at zero extra cost
  • A stateless, horizontally scalable webhook server

No telephony SDK. No DTMF parsing. No recorded menus to maintain.

The days of Press 1 for Sales are over.


Resources:

  • VoIPBin: https://voipbin.net
  • Sign up: POST https://api.voipbin.net/v1.0/auth/signup
  • MCP Server: uvx voipbin-mcp (works in Claude Code and Cursor)
  • Go SDK: go get github.com/voipbin/voipbin-go
