voipbin

Posted on May 3

Run AI Outbound Call Campaigns Without Managing Dialer Infrastructure


You have a list of phone numbers. You want an AI agent to call each one, carry on a real conversation, and log the outcome.

Maybe it's appointment reminders. Maybe it's post-purchase follow-ups. Maybe it's a survey. The use case doesn't matter — the infrastructure problem is always the same.

To run outbound calls at any real scale, you traditionally need:

A SIP trunk with outbound routing
A predictive dialer or campaign manager
RTP media handling and codec negotiation
DTMF detection and TTS/STT pipelines
Retry logic, concurrency limits, and DNC compliance
A team who knows what all of the above means

That's months of engineering before your AI makes its first call.

This post shows a different approach: your app triggers calls via a simple HTTP API, and VoIPBin handles everything beneath the AI — the SIP stack, the audio, the real-time transcription, and the TTS responses.

How the Architecture Works

The key idea behind VoIPBin's Media Offloading model is a clean separation of concerns:

Your AI handles conversation logic — it reads transcripts, generates responses
VoIPBin handles telephony — it dials the number, streams audio, runs STT/TTS, manages call state

Your AI agent never touches RTP. It never talks to a SIP server. It just receives text and returns text. VoIPBin is the bridge between your AI and the actual phone network.

This means you can scale outbound campaigns by scaling your API calls — not by provisioning telephony infrastructure.

Step 1: Get Your API Key

curl -X POST https://api.voipbin.net/v1.0/auth/signup \
  -H "Content-Type: application/json" \
  -d '{"username": "your@email.com", "password": "yourpassword"}'

Response:

{
  "token": "your-access-token-here"
}

Save that token. Every API call from here uses it as a Bearer token.
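One way to keep the token out of your source files is to read it from the environment and build the request headers in a single place. The sketch below assumes an environment variable named VOIPBIN_TOKEN and two helper names, token_from_env and auth_headers. All three are illustrative choices, not part of VoIPBin's API.

```python
import os


def token_from_env(var_name: str = "VOIPBIN_TOKEN") -> str:
    """Read the access token from the environment (variable name is an assumption)."""
    token = os.environ.get(var_name, "").strip()
    if not token:
        raise RuntimeError(f"Set {var_name} before running the campaign")
    return token


def auth_headers(token: str) -> dict[str, str]:
    """Build the headers used for an authenticated JSON request."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

With this in place, the later examples could pass headers=auth_headers(token_from_env()) instead of hard-coding the token string.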

Step 2: Trigger a Single Outbound Call

Before building a campaign loop, verify a single call works:

import requests

VOIPBIN_TOKEN = "your-access-token-here"
BASE_URL = "https://api.voipbin.net/v1.0"

def make_outbound_call(destination_number: str, ai_webhook_url: str) -> dict:
    """
    Trigger a single outbound call.
    VoIPBin dials the number; when answered, it posts transcripts to your webhook.
    """
    payload = {
        "source": "+15550001234",           # Your VoIPBin number
        "destinations": [
            {
                "type": "tel",
                "target": destination_number
            }
        ],
        "flow_id": "your-ai-flow-id",       # Pre-configured conversation flow
        "webhook": ai_webhook_url
    }

    resp = requests.post(
        f"{BASE_URL}/calls",
        json=payload,
        headers={
            "Authorization": f"Bearer {VOIPBIN_TOKEN}",
            "Content-Type": "application/json"
        },
        timeout=10  # Fail fast instead of hanging on a stalled connection
    )
    resp.raise_for_status()
    return resp.json()

# Test it
call = make_outbound_call("+14155550100", "https://yourapp.com/ai-webhook")
print(f"Call initiated: {call['id']}")

When the callee picks up, VoIPBin starts the conversation flow, runs STT on their speech, and sends transcripts to your webhook in real time.

Step 3: Build the Campaign Loop

A campaign is just a controlled loop over your contact list:

import time
import requests

VOIPBIN_TOKEN = "your-access-token-here"
BASE_URL = "https://api.voipbin.net/v1.0"
SOURCE_NUMBER = "+15550001234"
FLOW_ID = "your-ai-flow-id"
WEBHOOK_URL = "https://yourapp.com/ai-webhook"

# Rate limit: how many concurrent calls you want to run
MAX_CONCURRENT = 5
CALL_DELAY_SECONDS = 2  # Pause between each dial

def run_campaign(contacts: list[dict]) -> None:
    """
    contacts = [
        {"phone": "+14155550100", "name": "Alice", "context": "appointment_tuesday"},
        {"phone": "+14155550101", "name": "Bob",   "context": "appointment_thursday"},
        ...
    ]
    """
    active_calls: dict[str, dict] = {}

    for contact in contacts:
        # Check active call count, throttle if needed
        while len(active_calls) >= MAX_CONCURRENT:
            active_calls = prune_completed_calls(active_calls)
            time.sleep(1)

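        # URL-encode the context so special characters survive the query string
        call = initiate_call(
            destination=contact["phone"],
            webhook=f"{WEBHOOK_URL}?context={requests.utils.quote(contact['context'], safe='')}"
        )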
        active_calls[call["id"]] = contact
        print(f"[{contact['name']}] Call started: {call['id']}")
        time.sleep(CALL_DELAY_SECONDS)

    # Wait for stragglers
    while active_calls:
        active_calls = prune_completed_calls(active_calls)
        time.sleep(2)

    print("Campaign complete.")


def initiate_call(destination: str, webhook: str) -> dict:
    payload = {
        "source": SOURCE_NUMBER,
        "destinations": [{"type": "tel", "target": destination}],
        "flow_id": FLOW_ID,
        "webhook": webhook
    }
    resp = requests.post(
        f"{BASE_URL}/calls",
        json=payload,
        headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
        timeout=10  # Fail fast instead of hanging on a stalled connection
    )
    resp.raise_for_status()
    return resp.json()


def prune_completed_calls(active: dict[str, dict]) -> dict[str, dict]:
    still_active = {}
    for call_id, contact in active.items():
        status = get_call_status(call_id)
        if status in ("active", "ringing"):
            still_active[call_id] = contact
        else:
            print(f"[{contact['name']}] Call ended ({status})")
    return still_active


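def get_call_status(call_id: str) -> str:
    # Any status other than "active" or "ringing" is treated as ended by the caller
    resp = requests.get(
        f"{BASE_URL}/calls/{call_id}",
        headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
        timeout=10  # Keep the polling loop from stalling on one slow lookup
    )
    resp.raise_for_status()
    return resp.json().get("status", "unknown")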


# Run it
contacts = [
    {"phone": "+14155550100", "name": "Alice", "context": "reminder_may5"},
    {"phone": "+14155550101", "name": "Bob",   "context": "reminder_may6"},
    {"phone": "+14155550102", "name": "Carol",  "context": "reminder_may5"},
]
run_campaign(contacts)
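In the loop above, one failed request raises out of initiate_call and stops the whole campaign. A small retry wrapper is one way to soften that. The sketch below is generic: dial_with_retry is a hypothetical helper, the attempt count and delays are arbitrary defaults, and the broad except clause should be narrowed to requests.RequestException in real campaign code.

```python
import time
from typing import Callable, Optional


def dial_with_retry(
    dial: Callable[[], dict],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[dict]:
    """Call dial() up to `attempts` times, doubling the wait after each failure.

    Returns the call record, or None if every attempt failed.
    """
    for attempt in range(attempts):
        try:
            return dial()
        except Exception as exc:  # narrow to requests.RequestException in practice
            print(f"Dial attempt {attempt + 1}/{attempts} failed: {exc}")
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return None
```

Inside run_campaign you could wrap the dial as dial_with_retry(lambda: initiate_call(destination=..., webhook=...)) and skip the contact when it returns None. One caveat: if a request succeeds but the response is lost, a retry may dial the same person twice, so keep the attempt count low.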

Step 4: Handle Conversation Outcomes at Your Webhook

Every time the callee speaks, VoIPBin posts a transcript event to your webhook. Your AI reads it and returns the next response:

from flask import Flask, request, jsonify
import openai

app = Flask(__name__)
client = openai.OpenAI()

@app.route("/ai-webhook", methods=["POST"])
def handle_call_event():
    event = request.json
    context = request.args.get("context", "")

    if event["type"] == "transcript":
        caller_said = event["transcript"]
        call_id = event["call_id"]

        # Your AI generates the next line
        ai_reply = generate_response(caller_said, context)

        # Log outcome keywords
        log_outcome(call_id, caller_said, context)

        return jsonify({"speak": ai_reply})

    elif event["type"] == "call_ended":
        # Final summary hook
        save_call_summary(event)
        return jsonify({"status": "ok"})

    return jsonify({})


def generate_response(transcript: str, context: str) -> str:
    system_prompt = (
        "You are a friendly scheduling assistant calling to confirm an appointment. "
        f"Context: {context}. "
        "Be concise. If they confirm, say thank you and end the call. "
        "If they want to reschedule, note the request."
    )
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user",   "content": transcript}
        ]
    )
    return resp.choices[0].message.content


def log_outcome(call_id: str, transcript: str, context: str) -> None:
    # Save to your DB, Notion, Airtable, wherever
    print(f"[{call_id}] [{context}] Caller said: {transcript}")


def save_call_summary(event: dict) -> None:
    print(f"Call {event['call_id']} ended. Duration: {event.get('duration')}s")

Your webhook decides what the AI says next. VoIPBin handles converting that text into actual speech on the call.

What You Get Without Building

Here's what VoIPBin handles on your behalf:

Concern	Without VoIPBin	With VoIPBin
SIP stack	Build and maintain	Fully managed
RTP audio	Handle codecs, jitter	Offloaded
Speech-to-text	Choose, integrate, scale STT	Built-in
Text-to-speech	Choose voice, manage SSML	Built-in
Call state machine	Implement from scratch	API
Retry on no-answer	Custom logic	Configurable
Concurrency scaling	Provision trunk channels	API throttle

Your code handles one thing: conversation logic. The loop above is the entire campaign runner. The webhook is the entire AI layer.

Practical Limits and Considerations

Rate limiting: VoIPBin caps concurrent outbound calls based on your plan. Set the MAX_CONCURRENT variable in the loop above at or below that cap.

DNC compliance: Do-not-call regulations vary by country. VoIPBin doesn't manage your DNC list — that's your application's responsibility before you pass numbers to the API.
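A simple way to enforce this is to filter the contact list before it ever reaches run_campaign. The sketch below assumes your DNC list is a collection of phone numbers that include the country code. The function names are hypothetical, and the normalization is deliberately naive (it only strips formatting characters).

```python
import re


def normalize_number(raw: str) -> str:
    """Reduce a number to '+' followed by digits, e.g. '+1 (415) 555-0101' -> '+14155550101'."""
    return "+" + re.sub(r"\D", "", raw)


def filter_dnc(contacts: list[dict], dnc_numbers: list[str]) -> tuple[list[dict], list[dict]]:
    """Split contacts into (allowed, skipped) based on the do-not-call list."""
    blocked = {normalize_number(n) for n in dnc_numbers}
    allowed, skipped = [], []
    for contact in contacts:
        if normalize_number(contact["phone"]) in blocked:
            skipped.append(contact)
        else:
            allowed.append(contact)
    return allowed, skipped
```

Logging the skipped list gives you a record of which numbers were suppressed and when, which is often useful if a compliance question comes up later.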

Voicemail detection: If a call goes to voicemail, VoIPBin will still trigger your webhook with transcript events. You can detect common voicemail patterns ("Please leave a message") and hang up gracefully, or leave a prerecorded message.
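A rough version of that pattern check could look like the sketch below. The phrase list is an assumption based on common English greetings and will miss plenty of real voicemail systems. How you then end the call or play a message depends on VoIPBin's webhook response format, which is not covered here.

```python
import re

# Illustrative phrases only; extend for your own languages and carriers
VOICEMAIL_PATTERNS = [
    r"leave (a|your) message",
    r"after the (tone|beep)",
    r"is not available",
    r"voice ?mail",
    r"mailbox",
]
_VOICEMAIL_RE = re.compile("|".join(VOICEMAIL_PATTERNS), re.IGNORECASE)


def looks_like_voicemail(transcript: str) -> bool:
    """Return True if the transcript contains a typical voicemail greeting phrase."""
    return bool(_VOICEMAIL_RE.search(transcript))
```

Checking only the first one or two transcript events of a call keeps false positives down, since a live person might say "voicemail" later in a normal conversation.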

Timezone-aware scheduling: For campaigns that need to respect local business hours, wrap the run_campaign call with timezone logic before triggering — VoIPBin dials immediately when you POST.
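The check itself can be small. The sketch below assumes each contact carries a timezone and that calling is allowed on weekdays from 09:00 to 18:00 local time. That window is an illustrative default, not a legal standard, and within_calling_hours is a hypothetical helper name.

```python
import datetime as dt
from typing import Optional


def within_calling_hours(
    tz: dt.tzinfo,
    now: Optional[dt.datetime] = None,
    start: dt.time = dt.time(9, 0),
    end: dt.time = dt.time(18, 0),
) -> bool:
    """True if `now`, converted to the contact's timezone, is a weekday inside [start, end)."""
    now = now or dt.datetime.now(dt.timezone.utc)
    local = now.astimezone(tz)
    return local.weekday() < 5 and start <= local.time() < end


# Example: 12:00 UTC on Monday 2024-05-06 is 08:00 at UTC-4, before the window opens
now = dt.datetime(2024, 5, 6, 12, 0, tzinfo=dt.timezone.utc)
print(within_calling_hours(dt.timezone(dt.timedelta(hours=-4)), now=now))  # False
print(within_calling_hours(dt.timezone.utc, now=now))                      # True
```

For real campaigns, pass zoneinfo.ZoneInfo("America/New_York") or similar as tz so daylight saving changes are handled for you, and hold back contacts that fail the check until their window opens.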

What's Next

Once your basic campaign loop is running, common next steps:

Outcome tagging: Parse transcripts for confirmed/declined/rescheduled and update your CRM
A/B testing prompts: Run two webhook variants on split contact lists, compare outcomes
Escalation paths: If the AI can't resolve the request, transfer to a live agent via VoIPBin's transfer API
Webhook retries: Add a retry queue for webhook failures so no call result is lost
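As a starting point for outcome tagging, a keyword pass over each transcript is often enough before you reach for a classifier. The sketch below is one possible shape. The labels match the three outcomes named above, but the phrase lists and the tag_outcome name are assumptions for illustration.

```python
import re

# Checked in order, so "Yes, but can we reschedule?" is tagged as rescheduled
OUTCOME_PATTERNS = [
    ("rescheduled", re.compile(
        r"\b(reschedule|another (day|time)|different (day|time))\b", re.IGNORECASE)),
    ("declined", re.compile(
        r"\b(cancel|not interested|no thanks|can't make it|cannot make it)\b", re.IGNORECASE)),
    ("confirmed", re.compile(
        r"\b(yes|yeah|yep|confirm(ed)?|sounds good|that works)\b", re.IGNORECASE)),
]


def tag_outcome(transcript: str) -> str:
    """Return 'rescheduled', 'declined', 'confirmed', or 'unknown' for one utterance."""
    for label, pattern in OUTCOME_PATTERNS:
        if pattern.search(transcript):
            return label
    return "unknown"
```

You could call tag_outcome from log_outcome and write the label to your CRM, keeping the raw transcript alongside it so mislabeled calls can be reviewed.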

Start Here

If you want to try it:

# Install the MCP server for interactive testing
uvx voipbin-mcp

# Or use the Golang SDK
go get github.com/voipbin/voipbin-go

Full API docs and signup at voipbin.net.

The campaign loop above is roughly 80 lines of Python. The AI webhook is another 40 or so. The telephony layer (SIP, RTP, STT, TTS, codec negotiation) is zero lines, because it's not yours to write.

That's the trade worth making.
