Three months.
That’s how long many teams spend building telephony infrastructure before writing a single line of actual conversation logic for an AI voice agent.
Not because the AI is hard.
Because telephony is brutal.
Today, we’re open-sourcing the solution so you don’t have to go through the same pain.
The Hidden Problem with AI Calling Agents
Building an AI calling agent sounds straightforward:
- Use an LLM
- Add speech-to-text
- Add text-to-speech
- Connect it to a phone number
In reality, that’s where most teams hit a wall.
To make real phone calls, you end up dealing with:
- SIP trunks & PSTN providers
- Low-latency, bidirectional audio
- Real-time orchestration of STT, LLM, and TTS
- Call state, interruptions, transfers
- Scaling, monitoring, recordings, persistence
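To make the orchestration and interruption items concrete, here is a minimal sketch of a single conversational turn in plain asyncio. It is not Siphon's implementation: `transcribe`, `generate_reply`, `speak`, and `run_call` are hypothetical stand-ins for real STT, LLM, and TTS services, and the "audio" is just a list of strings. It only shows the shape of the problem, which is chaining three streaming stages and cancelling playback the moment the caller starts talking.

```python
import asyncio

# Hypothetical stand-ins for real STT, LLM, and TTS services.
# None of these names come from Siphon or any provider SDK.

async def transcribe(audio_chunks):
    """Pretend STT: join fake audio chunks into a transcript."""
    await asyncio.sleep(0)
    return " ".join(audio_chunks)


async def generate_reply(transcript):
    """Pretend LLM: produce a canned reply from the transcript."""
    await asyncio.sleep(0)
    return f"You said: {transcript}"


async def speak(text, played):
    """Pretend TTS: 'play' one word at a time so playback can be cut off."""
    for word in text.split():
        played.append(word)
        await asyncio.sleep(0.05)


async def handle_turn(audio_chunks, played):
    """One turn: speech in, text through the model, speech out."""
    transcript = await transcribe(audio_chunks)
    reply = await generate_reply(transcript)
    await speak(reply, played)


async def run_call(audio_chunks, interrupt_after=None):
    """Run one turn; optionally simulate the caller barging in."""
    played = []
    turn = asyncio.create_task(handle_turn(audio_chunks, played))
    if interrupt_after is not None:
        await asyncio.sleep(interrupt_after)
        turn.cancel()  # caller started talking: stop the agent's playback
    try:
        await turn
    except asyncio.CancelledError:
        pass  # an interrupted turn is expected, not an error
    return played
```

Calling `asyncio.run(run_call(["hello", "there"]))` plays the whole reply, while passing `interrupt_after=0.075` cuts it off partway. A real system has to do the same thing over live audio streams, per call, across many concurrent calls.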
The result?
Most teams spend weeks or months on infrastructure before they ever touch the conversation itself.
We did too. And eventually asked:
“Why is building voice AI still this hard?”
Introducing Siphon
Siphon is an open-source Python framework that handles the telephony complexity for you, so you can focus on building great conversations.
Here’s what a complete AI receptionist looks like with Siphon:
from siphon.agent import Agent
from siphon.plugins import openai, cartesia, deepgram

agent = Agent(
    agent_name="receptionist",
    llm=openai.LLM(model="gpt-4"),
    tts=cartesia.TTS(voice="helpful-assistant"),
    stt=deepgram.STT(model="nova-2"),
    system_instructions="""
    You are a friendly receptionist for Acme Corp.
    Help callers schedule appointments or route them correctly.
    """
)

if __name__ == "__main__":
    agent.start()
Run this, and your agent can answer real phone calls via any SIP provider (Twilio, Telnyx, etc.).
What Siphon Handles for You
- 🔌 SIP & PSTN connectivity
  Works with any SIP provider, no FreeSWITCH pain.
- ⚡ Real-time audio pipeline
  Built on LiveKit with streaming audio and sub-500ms voice-to-voice latency.
- 🤖 AI orchestration
  Plug-and-play support for LLMs, STT, and TTS. Swap providers with a single line:
      llm=anthropic.LLM(model="claude-3-5-sonnet")
- 📈 Production-ready by default
  Auto-scaling, call recordings, transcripts, state handling, and observability.
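A single-line provider swap works when the agent depends on an interface rather than on a specific vendor. The sketch below illustrates that general pattern with `typing.Protocol`; it is an assumption about the design idea, not Siphon's actual internals, and `LLM`, `EchoLLM`, `ShoutLLM`, and `MiniAgent` are hypothetical names invented for this example.

```python
from dataclasses import dataclass
from typing import Protocol


class LLM(Protocol):
    """Anything with a complete() method counts as an LLM here (hypothetical)."""

    def complete(self, prompt: str) -> str: ...


@dataclass
class EchoLLM:
    """Toy provider #1: echoes the prompt with a prefix."""

    prefix: str = "echo"

    def complete(self, prompt: str) -> str:
        return f"{self.prefix}: {prompt}"


@dataclass
class ShoutLLM:
    """Toy provider #2: upper-cases the prompt."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


@dataclass
class MiniAgent:
    """Toy agent that only knows about the LLM interface, not the provider."""

    llm: LLM

    def reply(self, user_text: str) -> str:
        return self.llm.complete(user_text)


# Swapping providers is a one-argument change at construction time.
agent_a = MiniAgent(llm=EchoLLM())
agent_b = MiniAgent(llm=ShoutLLM())
```

Here `agent_a.reply("hi")` returns `"echo: hi"` and `agent_b.reply("hi")` returns `"HI"`, with no change to the agent code itself. The same idea extends to STT and TTS components.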
Quick Start
Install:
pip install siphon-ai
Create an agent:
from siphon.agent import Agent
from siphon.plugins import openai, cartesia, deepgram

agent = Agent(
    agent_name="my_first_agent",
    llm=openai.LLM(),
    tts=cartesia.TTS(),
    stt=deepgram.STT(),
    system_instructions="You are a helpful assistant.",
)

agent.start()
That’s it.
Your agent is live and answering phone calls.
(Full setup, outbound calling, and advanced examples are in the docs.)
Why We Open-Sourced It
We could’ve kept Siphon proprietary or turned it into a closed SaaS.
But we believe voice AI shouldn’t be locked behind massive infrastructure effort.
Siphon is:
- Apache 2.0 licensed
- Provider-agnostic
- Fully self-hostable
- No vendor lock-in
Use it commercially, modify it, or build on top of it.
What You Can Build
- 📞 Customer support agents
- 📅 Appointment scheduling
- 💼 Sales qualification
- 📊 Surveys & feedback collection
- 🏥 Healthcare intake systems
If it involves phone calls and conversations, Siphon handles the hard parts.
Get Involved
⭐ GitHub: https://github.com/blackdwarftech/siphon
📖 Docs: https://siphon.blackdwarf.in/docs
🐛 Issues & feature requests welcome
🤝 PRs encouraged
We’re building Siphon in public and would love community feedback.
If you’ve ever thought, “I wish building AI calling agents was simpler,” give Siphon a try.
Built by BLACKDWARF
Mission: Democratize complex technologies for developers.
