You have a list of phone numbers. You want an AI agent to call each one, carry a real conversation, and log the outcome.
Maybe it's appointment reminders. Maybe it's post-purchase follow-ups. Maybe it's a survey. The use case doesn't matter — the infrastructure problem is always the same.
To run outbound calls at any real scale, you traditionally need:
- A SIP trunk with outbound routing
- A predictive dialer or campaign manager
- RTP media handling and codec negotiation
- DTMF detection and TTS/STT pipelines
- Retry logic, concurrency limits, and DNC compliance
- A team that knows what all of the above means
That's months of engineering before your AI makes its first call.
This post shows a different approach: your app triggers calls via a simple HTTP API, and VoIPBin handles everything beneath the AI — the SIP stack, the audio, the real-time transcription, and the TTS responses.
How the Architecture Works
The key idea behind VoIPBin's Media Offloading model is a clean separation of concerns:
- Your AI handles conversation logic — it reads transcripts, generates responses
- VoIPBin handles telephony — it dials the number, streams audio, runs STT/TTS, manages call state
Your AI agent never touches RTP. It never talks to a SIP server. It just receives text and returns text. VoIPBin is the bridge between your AI and the actual phone network.
This means you can scale outbound campaigns by scaling your API calls — not by provisioning telephony infrastructure.
Step 1: Get Your API Key
Signing up is a single API call, with no dashboard and no email verification loop:
curl -X POST https://api.voipbin.net/v1.0/auth/signup \
  -H "Content-Type: application/json" \
  -d '{"username": "your@email.com", "password": "yourpassword"}'
Response:
{
  "token": "your-access-token-here"
}
Save that token. Every API call from here uses it as a Bearer token.
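Rather than pasting the token into source files, one option is to keep it in an environment variable and build the headers in one place. The sketch below assumes a variable named VOIPBIN_TOKEN (a hypothetical name, not something VoIPBin requires) and uses only the standard library:

```python
import os
from typing import Mapping

TOKEN_ENV_VAR = "VOIPBIN_TOKEN"  # hypothetical variable name; pick your own


def load_token(env: Mapping[str, str] = os.environ) -> str:
    """Read the access token from the environment, failing loudly if unset."""
    token = env.get(TOKEN_ENV_VAR, "").strip()
    if not token:
        raise RuntimeError(f"Set {TOKEN_ENV_VAR} before making API calls")
    return token


def auth_headers(token: str) -> dict[str, str]:
    """Headers for an authenticated JSON request (Bearer scheme)."""
    return {
        "Authorization": f"Bearer {token}",
        "Content-Type": "application/json",
    }
```

The returned dict can be passed as the headers argument of any HTTP client call, so the token is read in exactly one place.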
Step 2: Trigger a Single Outbound Call
Before building a campaign loop, verify a single call works:
import requests

VOIPBIN_TOKEN = "your-access-token-here"
BASE_URL = "https://api.voipbin.net/v1.0"

def make_outbound_call(destination_number: str, ai_webhook_url: str) -> dict:
    """
    Trigger a single outbound call.
    VoIPBin dials the number; when answered, it posts transcripts to your webhook.
    """
    payload = {
        "source": "+15550001234",  # Your VoIPBin number
        "destinations": [
            {
                "type": "tel",
                "target": destination_number
            }
        ],
        "flow_id": "your-ai-flow-id",  # Pre-configured conversation flow
        "webhook": ai_webhook_url
    }
    resp = requests.post(
        f"{BASE_URL}/calls",
        json=payload,
        headers={
            "Authorization": f"Bearer {VOIPBIN_TOKEN}",
            "Content-Type": "application/json"
        },
        timeout=10,  # Fail fast instead of hanging on a stalled connection
    )
    resp.raise_for_status()  # Raise on 4xx/5xx so failures are not silent
    return resp.json()

# Test it
call = make_outbound_call("+14155550100", "https://yourapp.com/ai-webhook")
print(f"Call initiated: {call['id']}")
When the callee picks up, VoIPBin starts the conversation flow, runs STT on their speech, and sends transcripts to your webhook in real time.
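The example numbers in this post are written in E.164 form (a leading + followed by country code and subscriber number). Assuming the API expects that format, which the post does not state explicitly, a small check before dialing can catch malformed entries early. The normalize_number helper is hypothetical:

```python
import re

# E.164: a leading "+", then 2 to 15 digits, the first of which is not 0.
E164_PATTERN = re.compile(r"\+[1-9]\d{1,14}")


def normalize_number(raw: str) -> str:
    """Strip common separators, then validate. Raises ValueError if invalid."""
    cleaned = re.sub(r"[\s\-().]", "", raw)
    if not E164_PATTERN.fullmatch(cleaned):
        raise ValueError(f"Not an E.164 number: {raw!r}")
    return cleaned
```

Running every contact through a check like this before the first POST means a typo in the list surfaces as a clear error rather than a failed call.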
Step 3: Build the Campaign Loop
A campaign is just a controlled loop over your contact list:
import time
from urllib.parse import urlencode

import requests

VOIPBIN_TOKEN = "your-access-token-here"
BASE_URL = "https://api.voipbin.net/v1.0"
SOURCE_NUMBER = "+15550001234"
FLOW_ID = "your-ai-flow-id"
WEBHOOK_URL = "https://yourapp.com/ai-webhook"

# Concurrency cap: the most calls allowed in flight at once
MAX_CONCURRENT = 5
CALL_DELAY_SECONDS = 2  # Pause between each dial

def run_campaign(contacts: list[dict]) -> None:
    """
    contacts = [
        {"phone": "+14155550100", "name": "Alice", "context": "appointment_tuesday"},
        {"phone": "+14155550101", "name": "Bob", "context": "appointment_thursday"},
        ...
    ]
    """
    active_calls: dict[str, dict] = {}
    for contact in contacts:
        # Check active call count, throttle if needed
        while len(active_calls) >= MAX_CONCURRENT:
            active_calls = prune_completed_calls(active_calls)
            time.sleep(1)
        call = initiate_call(
            destination=contact["phone"],
            # URL-encode the context so special characters survive the query string
            webhook=f"{WEBHOOK_URL}?{urlencode({'context': contact['context']})}"
        )
        active_calls[call["id"]] = contact
        print(f"[{contact['name']}] Call started: {call['id']}")
        time.sleep(CALL_DELAY_SECONDS)
    # Wait for stragglers
    while active_calls:
        active_calls = prune_completed_calls(active_calls)
        time.sleep(2)
    print("Campaign complete.")

def initiate_call(destination: str, webhook: str) -> dict:
    payload = {
        "source": SOURCE_NUMBER,
        "destinations": [{"type": "tel", "target": destination}],
        "flow_id": FLOW_ID,
        "webhook": webhook
    }
    resp = requests.post(
        f"{BASE_URL}/calls",
        json=payload,
        headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()
    return resp.json()

def prune_completed_calls(active: dict[str, dict]) -> dict[str, dict]:
    """Poll each tracked call once and keep only those still in progress."""
    still_active = {}
    for call_id, contact in active.items():
        status = get_call_status(call_id)
        if status in ("active", "ringing"):
            still_active[call_id] = contact
        else:
            # Any other status, including "unknown", is treated as ended
            print(f"[{contact['name']}] Call ended ({status})")
    return still_active

def get_call_status(call_id: str) -> str:
    resp = requests.get(
        f"{BASE_URL}/calls/{call_id}",
        headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
        timeout=10,
    )
    resp.raise_for_status()  # Don't mistake an API error for a finished call
    return resp.json().get("status", "unknown")

# Run it
contacts = [
    {"phone": "+14155550100", "name": "Alice", "context": "reminder_may5"},
    {"phone": "+14155550101", "name": "Bob", "context": "reminder_may6"},
    {"phone": "+14155550102", "name": "Carol", "context": "reminder_may5"},
]
run_campaign(contacts)
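As written, a single failed request raises out of the loop and stops the whole campaign. One way to soften that is a small retry wrapper with exponential backoff. This is a generic sketch, not part of VoIPBin's API; with_retries and its parameters are hypothetical names:

```python
import time
from typing import Callable, Optional, TypeVar

T = TypeVar("T")


def with_retries(
    action: Callable[[], T],
    attempts: int = 3,
    base_delay: float = 1.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[T]:
    """Call action() up to `attempts` times, doubling the wait after each failure.

    Returns the result of the first successful call, or None if every attempt raised.
    """
    for attempt in range(attempts):
        try:
            return action()
        except Exception as exc:  # in real code, narrow this to requests.RequestException
            print(f"Attempt {attempt + 1}/{attempts} failed: {exc}")
            if attempt < attempts - 1:
                sleep(base_delay * 2 ** attempt)
    return None
```

Inside the loop this could wrap the dial step, for example `with_retries(lambda: initiate_call(contact["phone"], webhook))`, skipping the contact when it returns None. One caveat: if a request times out after the server has already accepted it, a retry may place a second call to the same person, so retrying only on errors that clearly happened before the call was created is the safer choice.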
Step 4: Handle Conversation Outcomes at Your Webhook
Every time the callee speaks, VoIPBin posts a transcript event to your webhook. Your AI reads it and returns the next response:
from flask import Flask, request, jsonify
import openai

app = Flask(__name__)
client = openai.OpenAI()  # Reads OPENAI_API_KEY from the environment

@app.route("/ai-webhook", methods=["POST"])
def handle_call_event():
    event = request.get_json()
    context = request.args.get("context", "")
    if event["type"] == "transcript":
        caller_said = event["transcript"]
        call_id = event["call_id"]
        # Your AI generates the next line
        ai_reply = generate_response(caller_said, context)
        # Log outcome keywords
        log_outcome(call_id, caller_said, context)
        return jsonify({"speak": ai_reply})
    elif event["type"] == "call_ended":
        # Final summary hook
        save_call_summary(event)
        return jsonify({"status": "ok"})
    # Ignore event types this handler doesn't know about
    return jsonify({})

def generate_response(transcript: str, context: str) -> str:
    system_prompt = (
        "You are a friendly scheduling assistant calling to confirm an appointment. "
        f"Context: {context}. "
        "Be concise. If they confirm, say thank you and end the call. "
        "If they want to reschedule, note the request."
    )
    # Only the latest utterance is sent; earlier turns are not included
    resp = client.chat.completions.create(
        model="gpt-4o",
        messages=[
            {"role": "system", "content": system_prompt},
            {"role": "user", "content": transcript}
        ]
    )
    return resp.choices[0].message.content or ""

def log_outcome(call_id: str, transcript: str, context: str) -> None:
    # Save to your DB, Notion, Airtable, wherever
    print(f"[{call_id}] [{context}] Caller said: {transcript}")

def save_call_summary(event: dict) -> None:
    print(f"Call {event['call_id']} ended. Duration: {event.get('duration')}s")
Your webhook decides what the AI says next. VoIPBin handles converting that text into actual speech on the call.
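One thing to note about generate_response above: it sends only the latest transcript to the model, so the AI has no memory of earlier turns in the same call. A minimal way to add that is a per-call history keyed by call_id. The sketch below keeps it in process memory purely for illustration; the function names are hypothetical, and a real deployment would more likely use Redis or a database so history survives restarts and works across multiple workers:

```python
from collections import defaultdict

# In-memory store keyed by call ID (illustrative only; not shared across workers).
_histories: dict[str, list[dict]] = defaultdict(list)


def build_messages(call_id: str, system_prompt: str, caller_said: str) -> list[dict]:
    """Record the caller's turn and return the full message list for the model."""
    _histories[call_id].append({"role": "user", "content": caller_said})
    return [{"role": "system", "content": system_prompt}, *_histories[call_id]]


def record_reply(call_id: str, ai_reply: str) -> None:
    """Record what the agent said so the next turn can see it."""
    _histories[call_id].append({"role": "assistant", "content": ai_reply})


def end_call(call_id: str) -> list[dict]:
    """Remove and return the history once the call has ended."""
    return _histories.pop(call_id, [])
```

In the webhook, build_messages would supply the messages argument for the chat completion, record_reply would run just before returning the reply, and end_call would run in the call_ended branch, where the returned history is also a natural input for a call summary.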
What You Get Without Building
Here's what VoIPBin handles on your behalf:
| Concern | Without VoIPBin | With VoIPBin |
|---|---|---|
| SIP stack | Build and maintain | Fully managed |
| RTP audio | Handle codecs, jitter | Offloaded |
| Speech-to-text | Choose, integrate, scale STT | Built-in |
| Text-to-speech | Choose voice, manage SSML | Built-in |
| Call state machine | Implement from scratch | API |
| Retry on no-answer | Custom logic | Configurable |
| Concurrency scaling | Provision trunk channels | API throttle |
Your code handles one thing: conversation logic. The loop above is the entire campaign runner. The webhook is the entire AI layer.
Practical Limits and Considerations
Rate limiting: VoIPBin caps concurrent outbound calls based on your plan. Set the MAX_CONCURRENT variable in the loop above at or below that cap, otherwise calls beyond it may be rejected.
DNC compliance: Do-not-call regulations vary by country. VoIPBin doesn't manage your DNC list — that's your application's responsibility before you pass numbers to the API.
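A minimal sketch of that pre-filtering step, assuming your DNC list is available as a set of numbers in the same format as the contact list (filter_dnc is a hypothetical helper, and real compliance usually involves more than a set lookup):

```python
def filter_dnc(contacts: list[dict], dnc_numbers: set[str]) -> tuple[list[dict], list[dict]]:
    """Split contacts into (callable, suppressed) based on a do-not-call set."""
    allowed: list[dict] = []
    suppressed: list[dict] = []
    for contact in contacts:
        if contact["phone"] in dnc_numbers:
            suppressed.append(contact)
        else:
            allowed.append(contact)
    return allowed, suppressed
```

Only the first list would be passed to run_campaign; keeping the second makes it easy to log which contacts were skipped and why.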
Voicemail detection: If a call goes to voicemail, VoIPBin will still trigger your webhook with transcript events. You can detect common voicemail patterns ("Please leave a message") and hang up gracefully, or leave a prerecorded message.
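Pattern matching on the first transcript or two is a simple starting point. The phrases below are illustrative guesses at common English greetings, not an exhaustive or authoritative list, and looks_like_voicemail is a hypothetical helper:

```python
import re

# Phrases that often appear in voicemail greetings (heuristic, English only).
VOICEMAIL_PATTERNS = [
    r"leave (a|your|me a) message",
    r"after the (tone|beep)",
    r"(can't|cannot|unable to) take your call",
    r"voice ?mail",
    r"mailbox",
]
_VOICEMAIL_RE = re.compile("|".join(VOICEMAIL_PATTERNS), re.IGNORECASE)


def looks_like_voicemail(transcript: str) -> bool:
    """Return True if the transcript contains a typical voicemail phrase."""
    return _VOICEMAIL_RE.search(transcript) is not None
```

A check like this would sit at the top of the transcript branch in the webhook. Because it is a heuristic, it can misfire on a real person who happens to say "voicemail", so it is worth logging matches and reviewing them before acting on the result automatically.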
Timezone-aware scheduling: For campaigns that need to respect local business hours, wrap the run_campaign call with timezone logic before triggering — VoIPBin dials immediately when you POST.
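A sketch of such a check, assuming each contact record carries a timezone (a field the examples above do not include) and that weekdays from 09:00 to 18:00 local time is the acceptable window, which is an arbitrary choice here:

```python
import datetime as dt


def within_calling_hours(
    now_utc: dt.datetime,
    contact_tz: dt.tzinfo,
    start: dt.time = dt.time(9, 0),
    end: dt.time = dt.time(18, 0),
) -> bool:
    """True if it is a weekday and start <= local time < end for the contact.

    now_utc must be timezone-aware, e.g. dt.datetime.now(dt.timezone.utc).
    """
    local = now_utc.astimezone(contact_tz)
    is_weekday = local.weekday() < 5  # Monday is 0, Sunday is 6
    return is_weekday and start <= local.time() < end
```

In practice contact_tz would typically be something like `zoneinfo.ZoneInfo("America/Los_Angeles")` so that daylight saving time is handled; a fixed offset such as `dt.timezone(dt.timedelta(hours=-7))` also works for quick tests. Contacts that fail the check can be deferred to a later run rather than dropped.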
What's Next
Once your basic campaign loop is running, common next steps:
- Outcome tagging: Parse transcripts for confirmed/declined/rescheduled and update your CRM
- A/B testing prompts: Run two webhook variants on split contact lists, compare outcomes
- Escalation paths: If the AI can't resolve the request, transfer to a live agent via VoIPBin's transfer API
- Webhook retries: Add a retry queue for webhook failures so no call result is lost
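As a starting point for the outcome tagging item, a keyword-based tagger over a call's transcripts might look like the sketch below. The keyword lists and the tag_outcome helper are illustrative assumptions, not a tested classifier; an LLM-based classification over the full transcript is a likely upgrade once the basics work:

```python
import re

# Checked in order: a call that mentions rescheduling wins over a stray "yes".
OUTCOME_PATTERNS = {
    "rescheduled": [r"\breschedul", r"\banother (day|time)\b", r"\bdifferent (day|time)\b"],
    "declined": [r"\bcancel", r"\bnot interested\b", r"\bno thanks\b"],
    "confirmed": [r"\byes\b", r"\bconfirm", r"\bsee you\b", r"\bsounds good\b"],
}


def tag_outcome(transcripts: list[str]) -> str:
    """Return the first matching outcome label, or "unknown" if nothing matches."""
    text = " ".join(transcripts).lower()
    for outcome, patterns in OUTCOME_PATTERNS.items():
        if any(re.search(pattern, text) for pattern in patterns):
            return outcome
    return "unknown"
```

The returned label is what you would write back to the CRM when the call_ended event arrives, with "unknown" flagged for manual review.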
Start Here
If you want to try it:
# Install the MCP server for interactive testing
uvx voipbin-mcp
# Or use the Golang SDK
go get github.com/voipbin/voipbin-go
Full API docs and signup at voipbin.net.
The campaign loop above is maybe 80 lines of Python. The AI webhook is another 40. The telephony layer — SIP, RTP, STT, TTS, codec negotiation — is zero lines, because it's not yours to write.
That's the trade worth making.