If you're building AI agents that need to communicate with humans, you've probably hit the same wall: voice is hard.
Not the AI part. The telephony part.
RTP, SIP, DTMF, codecs, NAT traversal — this is a 40-year-old stack that was not designed for agents. Most developers end up either avoiding voice entirely, or spending weeks fighting infrastructure before writing a single line of agent logic.
There's a better path.
The Core Problem: Agents Shouldn't Handle Audio
A typical DIY voice bot pipeline:
- Receive raw RTP audio from the caller
- Run STT to get a transcript
- Send the transcript to your LLM
- Run TTS on the response
- Stream audio back over RTP
Every step adds latency, and the audio steps bring codec and infrastructure concerns with them. None of it is your actual product.
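As a rough illustration of the steps above, one caller turn in a DIY pipeline can be sketched as a single function. Everything here is hypothetical: `handle_turn` and the stand-in stages are illustrative names, and real STT/TTS stages would be network services working on encoded audio frames rather than simple byte strings.

```python
from typing import Callable


def handle_turn(
    rtp_audio: bytes,
    stt: Callable[[bytes], str],
    llm: Callable[[str], str],
    tts: Callable[[str], bytes],
) -> bytes:
    """One caller turn in a DIY pipeline: audio in, audio out."""
    transcript = stt(rtp_audio)   # speech-to-text on the received audio
    reply_text = llm(transcript)  # the only stage that is your product
    return tts(reply_text)        # text-to-speech, to be streamed back over RTP


# Stand-in stages (assumptions) so the sketch runs without an audio stack.
def fake_stt(audio: bytes) -> str:
    return audio.decode("utf-8")


def fake_llm(text: str) -> str:
    return f"You said: {text}"


def fake_tts(text: str) -> bytes:
    return text.encode("utf-8")


reply_audio = handle_turn(b"hello", fake_stt, fake_llm, fake_tts)
```

Of the three stages passed in, only `llm` is agent logic; the other two exist purely to get in and out of audio.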
Media Offloading: Let VoIPBin Handle Audio
VoIPBin uses Media Offloading. Your AI agent only ever sees text. VoIPBin handles RTP, STT, and TTS.
Caller → VoIPBin (RTP/STT) → Your Agent (text only) → VoIPBin (TTS/RTP) → Caller
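With audio handled on the VoIPBin side, the agent's part of that diagram reduces to a text-in, text-out function. The sketch below is a hypothetical placeholder for that function; `agent_reply` and its keyword rule are illustrative only, and in practice the body would call your LLM.

```python
def agent_reply(transcript: str) -> str:
    """Hypothetical agent logic: transcript in, reply text out, no audio."""
    text = transcript.strip().lower()
    if not text:
        # Empty transcript, e.g. the caller stayed silent.
        return "Sorry, I didn't catch that. Could you repeat it?"
    if "refund" in text:
        return "I can help with a refund. What is your order number?"
    # Placeholder fallback; a real agent would send the text to an LLM here.
    return "Thanks. Let me look into that for you."
```

Because the function never touches RTP or codecs, it can be unit-tested with plain strings.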
Getting Started
1. Sign Up
curl -X POST https://api.voipbin.net/v1.0/auth/signup \
  -H "Content-Type: application/json" \
  -d '{"username": "myagent", "password": "mypassword"}'
The response includes an accesskey.token right away; no email verification is needed.
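A minimal sketch of pulling that token out of the signup response follows. It assumes the response body is JSON with a nested `accesskey` object containing `token`, which is inferred from the `accesskey.token` name above and not confirmed; the sample body and `extract_token` are hypothetical.

```python
import json


def extract_token(response_body: str) -> str:
    """Return accesskey.token from a signup response (assumed JSON shape)."""
    data = json.loads(response_body)
    try:
        return data["accesskey"]["token"]
    except (KeyError, TypeError) as exc:
        # The shape differs from the assumption; fail loudly instead of guessing.
        raise ValueError("signup response has no accesskey.token") from exc


# Hypothetical response body, for illustration only.
sample = '{"accesskey": {"token": "example-token"}}'
token = extract_token(sample)
```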
2. Create a Call Flow
curl -X POST "https://api.voipbin.net/v1.0/flows?accesskey=YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "name": "Support Bot",
    "actions": [
      {"type": "talk", "text": "Hello! How can I help you today?"},
      {"type": "transcribe", "end_silence_timeout": 2}
    ]
  }'
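If you generate flows from code instead of hand-writing JSON, the same body can be built with a small helper. This is a sketch that mirrors only the fields shown in the curl example; `build_flow` is a hypothetical name, and no fields beyond those above are assumed.

```python
import json


def build_flow(name: str, greeting: str, end_silence_timeout: int = 2) -> dict:
    """Build the flow body from the curl example: greet, then transcribe."""
    return {
        "name": name,
        "actions": [
            {"type": "talk", "text": greeting},
            {"type": "transcribe", "end_silence_timeout": end_silence_timeout},
        ],
    }


# Serialized body, ready to send as the POST payload.
flow_body = json.dumps(build_flow("Support Bot", "Hello! How can I help you today?"))
```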
3. Make a Call
curl -X POST "https://api.voipbin.net/v1.0/calls?accesskey=YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "flow_id": "<flow-id>",
    "destination": "+15551234567"
  }'
VoIPBin dials out, handles the audio, and your agent logic runs on transcripts.
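The same call request can be prepared from Python with only the standard library. The sketch below builds the request without sending it, so it runs offline; the endpoint, query parameter, and body fields are copied from the curl example, while `build_call_request` is a hypothetical helper name.

```python
import json
import urllib.parse
import urllib.request

API_BASE = "https://api.voipbin.net/v1.0"


def build_call_request(access_key: str, flow_id: str, destination: str) -> urllib.request.Request:
    """Prepare the POST /calls request from the curl example (not sent here)."""
    query = urllib.parse.urlencode({"accesskey": access_key})
    body = json.dumps({"flow_id": flow_id, "destination": destination}).encode("utf-8")
    return urllib.request.Request(
        f"{API_BASE}/calls?{query}",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


req = build_call_request("YOUR_KEY", "<flow-id>", "+15551234567")
# urllib.request.urlopen(req) would send it; omitted so the sketch stays offline.
```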
No Phone Number? No Problem
VoIPBin supports Direct Hash SIP URIs — no number provisioning needed:
sip:direct.<12-hex-chars>@sip.voipbin.net
Great for internal tools, dev testing, or agent-to-agent communication.
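If your tooling accepts these URIs as input, a quick format check can catch typos before dialing. The pattern below is derived only from the template above; accepting both upper- and lowercase hex digits is an assumption, and `is_direct_hash_uri` is a hypothetical helper.

```python
import re

# sip:direct.<12-hex-chars>@sip.voipbin.net, per the template above.
# Accepting uppercase hex digits is an assumption.
DIRECT_HASH_URI = re.compile(r"sip:direct\.[0-9a-fA-F]{12}@sip\.voipbin\.net")


def is_direct_hash_uri(uri: str) -> bool:
    """Return True if the whole string matches the Direct Hash URI template."""
    return DIRECT_HASH_URI.fullmatch(uri) is not None
```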
Use It From Claude Code (MCP)
VoIPBin ships an MCP server. Add it to your Claude Code settings:
{
  "mcpServers": {
    "voipbin": {
      "command": "uvx",
      "args": ["voipbin-mcp"],
      "env": { "VOIPBIN_API_KEY": "your-access-key" }
    }
  }
}
Then just tell Claude Code: "make a test call to this number" — no curl needed.
What You Skip
- No RTP stack to manage
- No codec negotiation
- No STT/TTS infrastructure to deploy
- No SIP registration
You keep: your agent logic and LLM calls.