voipbin

Posted on Apr 9

Building an AI-Powered IVR with VoIPBin Flows

#voip #ai #tutorial #webdev

Traditional IVR (Interactive Voice Response) systems are famously terrible. Press 1 for billing. Press 2 for support. Press 0 to repeat the menu. They're rigid, frustrating, and haven't fundamentally changed in decades.

With VoIPBin Flows, you can replace the entire DTMF press-tree with a natural language AI that understands intent, handles multi-turn conversations, and takes real actions — all without managing audio infrastructure.

This post walks through designing and deploying an AI-powered IVR using VoIPBin's Flows API.

What Is a VoIPBin Flow?

A Flow is a reusable call behavior definition — a sequence of actions that VoIPBin executes when a call is answered. Actions can:

Speak text (TTS)
Collect speech input (speech-to-text transcription)
Branch on conditions
Hand off to an AI agent
Transfer to a human agent
Play audio files
Hang up

You compose these actions into a call flow that runs server-side. Your application code stays clean and stateless — VoIPBin tracks the in-progress call state.
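
To make the idea concrete, a flow can be pictured as a name plus an ordered list of action objects. The sketch below builds one in Python and prints it as JSON. The `talk` and `transcribe` action types and their fields are the ones used in the flow created later in this post; `make_flow` is a hypothetical helper for illustration, not part of any VoIPBin SDK.

```python
import json

def make_flow(name: str, actions: list[dict]) -> dict:
    """Bundle a name and an ordered list of actions into a flow definition."""
    if not actions:
        raise ValueError("a flow needs at least one action")
    for action in actions:
        if "type" not in action:
            raise ValueError(f"action is missing a 'type': {action!r}")
    # Copy the list so later changes to the caller's list do not leak in.
    return {"name": name, "actions": list(actions)}

flow = make_flow(
    "greeting-only",
    [
        {"type": "talk", "text": "Hello! Thanks for calling."},
        {"type": "transcribe", "end_silence_timeout": 2, "max_duration": 15},
    ],
)
print(json.dumps(flow, indent=2))
```

Because the flow is plain data, it can be built, checked, and stored in version control before anything is sent to the API.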

The Problem With Classic IVR

A typical press-to-navigate IVR looks like this:

"Press 1 for Sales"
"Press 2 for Support"
"Press 3 for Billing"
"Press 0 to speak with an agent"

Users get lost, press wrong numbers, and abandon calls. Developers maintain brittle decision trees that grow one branch at a time.

An AI-powered IVR replaces the menu with a simple open question:

"Hi! How can I help you today?"

The AI routes the call based on what the caller says, not which button they press.

Step 0: Get Your Access Key

curl -s -X POST "https://api.voipbin.net/v1.0/auth/signup" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "my-ivr-project",
    "email": "you@example.com",
    "password": "secure-password"
  }'
# Returns: { "accesskey": { "token": "your-access-key" } }

Pass the returned token as the accesskey query parameter (?accesskey=<token>) on all subsequent requests.
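
If you script these calls, it may help to centralize the query-string handling. The following is a small sketch using only the Python standard library. It assumes the signup response has the `{"accesskey": {"token": ...}}` shape shown in the comment above; `extract_token` and `api_url` are hypothetical helper names.

```python
from urllib.parse import urlencode

BASE_URL = "https://api.voipbin.net/v1.0"

def extract_token(signup_response: dict) -> str:
    """Pull the token out of a signup response shaped like the one above."""
    return signup_response["accesskey"]["token"]

def api_url(path: str, access_key: str, **params: str) -> str:
    """Build BASE_URL/path with accesskey and any extra query parameters."""
    query = urlencode({"accesskey": access_key, **params})
    return f"{BASE_URL}/{path.lstrip('/')}?{query}"

token = extract_token({"accesskey": {"token": "your-access-key"}})
print(api_url("agents", token))
print(api_url("numbers/available", token, country_code="US"))
```

`urlencode` also takes care of escaping, which matters if a token ever contains characters that are not URL-safe.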

Step 1: Create a Routing AI Agent

First, create an AI agent whose job is to understand the caller's intent and return a routing decision:

curl -s -X POST "https://api.voipbin.net/v1.0/agents?accesskey=YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "ivr-router",
    "detail": "You are a voice IVR router. When the caller says what they need, classify it into one of: SALES, SUPPORT, BILLING, or GENERAL. Respond with only the category name. If unclear, say GENERAL.",
    "engine_type": "openai",
    "engine_model": "gpt-4o"
  }'

Note the returned id — you'll reference it in the flow.
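
When scripting, the id can be pulled out of the response body rather than copied by hand. This is a minimal sketch that assumes the response is a JSON object with a top-level `id` field; the sample body is made up for illustration, and `agent_id_from_response` is a hypothetical helper.

```python
import json

def agent_id_from_response(body: str) -> str:
    """Extract the agent id from a JSON response body, failing loudly if absent."""
    data = json.loads(body)
    agent_id = data.get("id")
    if not agent_id:
        raise ValueError("response did not contain an 'id' field")
    return agent_id

# Made-up sample body; the real response may carry more fields.
sample_body = '{"id": "agent-123", "name": "ivr-router"}'
router_agent_id = agent_id_from_response(sample_body)
print(router_agent_id)
```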

Step 2: Create Specialized Agents

For each department, create a specialized agent:

# Sales agent
curl -s -X POST "https://api.voipbin.net/v1.0/agents?accesskey=YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "sales-agent",
    "detail": "You are a helpful sales representative. Assist callers with product information, pricing, and demos. Be friendly and concise.",
    "engine_type": "openai",
    "engine_model": "gpt-4o"
  }'

# Support agent
curl -s -X POST "https://api.voipbin.net/v1.0/agents?accesskey=YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "support-agent",
    "detail": "You are a technical support specialist. Help callers troubleshoot issues with their account or product. Ask clarifying questions when needed.",
    "engine_type": "openai",
    "engine_model": "gpt-4o"
  }'

# Billing agent
curl -s -X POST "https://api.voipbin.net/v1.0/agents?accesskey=YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "billing-agent",
    "detail": "You are a billing specialist. Help callers with invoices, payments, refunds, and subscription changes.",
    "engine_type": "openai",
    "engine_model": "gpt-4o"
  }'
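
The three requests differ only in name and prompt, so the request bodies can be generated from a table. The sketch below only builds the payloads and does not send them; the field names mirror the curl calls above, and `agent_payload` is a hypothetical helper.

```python
DEPARTMENT_PROMPTS = {
    "sales-agent": (
        "You are a helpful sales representative. Assist callers with product "
        "information, pricing, and demos. Be friendly and concise."
    ),
    "support-agent": (
        "You are a technical support specialist. Help callers troubleshoot issues "
        "with their account or product. Ask clarifying questions when needed."
    ),
    "billing-agent": (
        "You are a billing specialist. Help callers with invoices, payments, "
        "refunds, and subscription changes."
    ),
}

def agent_payload(name: str, prompt: str,
                  engine_type: str = "openai",
                  engine_model: str = "gpt-4o") -> dict:
    """Build the JSON body for one agent, using the same fields as the curl calls."""
    return {
        "name": name,
        "detail": prompt,
        "engine_type": engine_type,
        "engine_model": engine_model,
    }

payloads = [agent_payload(name, prompt) for name, prompt in DEPARTMENT_PROMPTS.items()]
for payload in payloads:
    print(payload["name"], "->", payload["engine_model"])
```

Adding a department then becomes a one-entry change to `DEPARTMENT_PROMPTS`.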

Step 3: Create the IVR Flow

Now assemble the flow. The structure is:

Greet the caller
Collect their intent via speech
Route to the appropriate AI agent

curl -s -X POST "https://api.voipbin.net/v1.0/flows?accesskey=YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "smart-ivr-flow",
    "actions": [
      {
        "type": "talk",
        "text": "Hello! Thanks for calling. How can I help you today?"
      },
      {
        "type": "transcribe",
        "end_silence_timeout": 2,
        "max_duration": 15
      },
      {
        "type": "ai_route",
        "agent_id": "<ivr-router-agent-id>",
        "routes": {
          "SALES": {
            "type": "ai_talk",
            "agent_id": "<sales-agent-id>",
            "welcome_message": "Connecting you to our sales team. One moment."
          },
          "SUPPORT": {
            "type": "ai_talk",
            "agent_id": "<support-agent-id>",
            "welcome_message": "I will help you with that support issue."
          },
          "BILLING": {
            "type": "ai_talk",
            "agent_id": "<billing-agent-id>",
            "welcome_message": "Let me look into your billing question."
          },
          "GENERAL": {
            "type": "ai_talk",
            "agent_id": "<sales-agent-id>",
            "welcome_message": "Let me connect you with someone who can help."
          }
        }
      }
    ]
  }'
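
Before posting a flow like this, it can be worth checking that every category the router prompt can emit has a route and that no `<...>` placeholder was left in place. The check below is a local sketch over the same action structure; `check_routes` is a hypothetical helper, and the category set is copied from the router prompt in Step 1.

```python
ROUTER_CATEGORIES = {"SALES", "SUPPORT", "BILLING", "GENERAL"}

def check_routes(actions: list[dict],
                 categories: set[str] = ROUTER_CATEGORIES) -> list[str]:
    """Return human-readable problems found in any ai_route action."""
    problems = []
    for action in actions:
        if action.get("type") != "ai_route":
            continue
        routes = action.get("routes", {})
        # Categories the router may return that have nowhere to go.
        for missing in sorted(categories - set(routes)):
            problems.append(f"no route for category {missing}")
        # Routes whose agent id is empty or still a <placeholder>.
        for name, target in routes.items():
            agent_id = target.get("agent_id", "")
            if not agent_id or agent_id.startswith("<"):
                problems.append(f"route {name} has an unfilled agent_id")
    return problems

actions = [
    {"type": "talk", "text": "Hello! Thanks for calling."},
    {
        "type": "ai_route",
        "agent_id": "router-1",
        "routes": {
            "SALES": {"type": "ai_talk", "agent_id": "agent-sales"},
            "SUPPORT": {"type": "ai_talk", "agent_id": "<support-agent-id>"},
            "BILLING": {"type": "ai_talk", "agent_id": "agent-billing"},
        },
    },
]
problems = check_routes(actions)
for problem in problems:
    print(problem)
```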

Step 4: Attach a Phone Number to the Flow

Purchase or assign a number to your flow:

# List available numbers
curl -s "https://api.voipbin.net/v1.0/numbers/available?accesskey=YOUR_KEY&country_code=US"

# Purchase a number and link it to your flow
curl -s -X POST "https://api.voipbin.net/v1.0/numbers?accesskey=YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "number": "+12025550100",
    "flow_id": "<your-flow-id>"
  }'

Now any call to that number runs your smart IVR flow automatically.
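
The number in the example is written in E.164 form (a plus sign followed by up to 15 digits). Assuming the API expects that format, which the post does not state explicitly, a small guard can catch formatting mistakes before the purchase request is sent. `number_payload` is a hypothetical helper.

```python
import re

# E.164: "+", a non-zero leading digit, then up to 14 more digits.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")

def number_payload(number: str, flow_id: str) -> dict:
    """Build the request body for linking a number to a flow, validating inputs."""
    if not E164_PATTERN.fullmatch(number):
        raise ValueError(f"not an E.164 number: {number!r}")
    if not flow_id or flow_id.startswith("<"):
        raise ValueError("flow_id is empty or still a placeholder")
    return {"number": number, "flow_id": flow_id}

print(number_payload("+12025550100", "flow-123"))
```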

Python Example: Dynamically Creating Flows

For production systems, you might want to generate flows programmatically:

import httpx

BASE_URL = "https://api.voipbin.net/v1.0"
ACCESS_KEY = "your-access-key"

def create_ivr_flow(name: str, agent_ids: dict) -> dict:
    """Create a smart IVR flow with AI routing."""
    actions = [
        {
            "type": "talk",
            "text": "Hello! Thanks for calling. How can I help you today?"
        },
        {
            "type": "transcribe",
            "end_silence_timeout": 2,
            "max_duration": 15
        },
        {
            "type": "ai_route",
            "agent_id": agent_ids["router"],
            "routes": {
                dept: {
                    "type": "ai_talk",
                    "agent_id": agent_id,
                    "welcome_message": f"Routing to {dept.lower()} now."
                }
                for dept, agent_id in agent_ids.items()
                if dept != "router"
            }
        }
    ]

    response = httpx.post(
        f"{BASE_URL}/flows",
        params={"accesskey": ACCESS_KEY},
        json={"name": name, "actions": actions}
    )
    response.raise_for_status()
    return response.json()

# Usage
flow = create_ivr_flow(
    name="smart-ivr",
    agent_ids={
        "router": "<router-agent-id>",
        "SALES": "<sales-agent-id>",
        "SUPPORT": "<support-agent-id>",
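        "BILLING": "<billing-agent-id>",
        # The router prompt can also return GENERAL; send it to sales, as in Step 3.
        "GENERAL": "<sales-agent-id>",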
    }
)
print(f"Flow created: {flow['id']}")

Bonus: Webhook for Post-Call Analytics

You can receive a webhook after each call with the full transcript, routing path, and duration:

from flask import Flask, request

app = Flask(__name__)

def log_call_data(call_id, routed_to, transcript, duration):
    """Placeholder sink: replace with a write to your analytics pipeline."""
    print(f"Call {call_id}: routed to {routed_to}, {duration}s")
    print(f"Transcript: {transcript}")

@app.route("/call-events", methods=["POST"])
def handle_call_event():
    # Tolerate a missing or malformed JSON body instead of raising.
    event = request.get_json(silent=True) or {}

    if event.get("type") == "call.ended":
        log_call_data(
            event.get("call_id"),
            event.get("routed_agent"),
            event.get("transcript", []),
            event.get("duration_seconds"),
        )

    # Acknowledge every event so the sender does not retry.
    return "", 204

if __name__ == "__main__":
    app.run(port=8080)

This gives you real routing data — which intents are most common, where callers get stuck, which agents handle calls fastest.
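
As a rough illustration of what that analysis could look like, the sketch below aggregates a list of stored events into calls per agent and average duration per agent. It assumes the same event fields the webhook handler above reads (`type`, `routed_agent`, `duration_seconds`); the sample events are invented, and `summarize` is a hypothetical helper.

```python
from collections import Counter, defaultdict
from statistics import mean

def summarize(events: list[dict]) -> dict:
    """Count ended calls per agent and average their durations."""
    ended = [e for e in events if e.get("type") == "call.ended"]
    calls_per_agent = Counter(e.get("routed_agent", "unknown") for e in ended)

    durations = defaultdict(list)
    for e in ended:
        if e.get("duration_seconds") is not None:
            durations[e.get("routed_agent", "unknown")].append(e["duration_seconds"])

    return {
        "calls_per_agent": dict(calls_per_agent),
        "avg_duration_seconds": {a: mean(d) for a, d in durations.items()},
    }

# Invented sample data for illustration only.
events = [
    {"type": "call.ended", "routed_agent": "sales-agent", "duration_seconds": 60},
    {"type": "call.ended", "routed_agent": "sales-agent", "duration_seconds": 120},
    {"type": "call.ended", "routed_agent": "support-agent", "duration_seconds": 30},
    {"type": "call.started"},
]
summary = summarize(events)
print(summary)
```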

Classic IVR vs. AI IVR: Quick Comparison

Dimension	Classic IVR	AI-Powered IVR
Navigation	DTMF button presses	Natural language
Routing logic	Hardcoded decision tree	LLM intent classification
Maintenance	Add branches manually	Update agent prompt
Caller experience	Frustrating menus	Conversational
Misrouting	Common on complex trees	Handled by "GENERAL" fallback
Time to deploy	Hours of flow design	Minutes
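
The "GENERAL" fallback in the table relies on the router's answer being mapped onto a known category. Whether VoIPBin normalizes the model output internally is not covered here, so the snippet below is only an illustration of the idea: anything that is not exactly one of the expected labels falls back to GENERAL. `normalize_category` is a hypothetical helper.

```python
CATEGORIES = ("SALES", "SUPPORT", "BILLING", "GENERAL")

def normalize_category(raw: str) -> str:
    """Map free-form router output onto a known category, defaulting to GENERAL."""
    # Drop surrounding whitespace, then stray quotes or punctuation, then upper-case.
    cleaned = raw.strip().strip(".!\"'").upper()
    return cleaned if cleaned in CATEGORIES else "GENERAL"

for answer in (" sales.\n", "Support", "I think this is billing", ""):
    print(repr(answer), "->", normalize_category(answer))
```

Keeping the match strict means a chatty or ambiguous answer is sent to the fallback rather than guessed at.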

MCP Shortcut

If you prefer to stay in your editor, the VoIPBin MCP server makes flow creation conversational:

uvx voipbin-mcp

Then in Claude Code or Cursor:

"Create an AI IVR flow for a SaaS company with routing to sales, support, and billing agents"

The MCP server handles all the API calls — you describe the behavior, it builds the flow.

Summary

VoIPBin Flows let you replace rigid DTMF menus with a clean action-based pipeline:

Greet — TTS welcome message
Listen — STT captures caller intent
Route — AI agent classifies and routes
Handle — Specialized AI agent manages the conversation
Analyze — Webhook delivers call data post-hang-up

No audio pipeline to manage. No codec configuration. Just flow logic and LLM prompts.
