Most developers think of call screening as a phone carrier feature — press 1 to accept, or send to voicemail. But what if your AI could actually understand who is calling, why they are calling, and decide what to do next?
That is exactly what we will build today: an AI-powered call screening system that answers every incoming call, understands the intent, and routes accordingly — all without you ever picking up the phone.
The Problem with "Press 1"
Traditional IVR menus are friction machines. Callers hate them. Agents hate them. And they still do not solve the core problem: you do not know if a call is worth taking until you take it.
Here is a realistic scenario:
- You run a consulting business or small SaaS
- You get 30 calls a day
- Maybe 5 are real leads
- The rest are spam, vendors, or questions your docs already answer
- You have no way to know which is which without answering
An AI screener changes this completely.
The Architecture
Incoming Call
     |
     v
VoIPBin receives it
     |
     v
Webhook -> Your Server
     |
     v
AI greets caller, asks purpose
     |
     +-- Spam/vendor      --> Politely end call
     +-- Support question --> AI resolves it
     +-- Sales lead       --> Transfer to you
     +-- Urgent issue     --> SMS alert + transfer
Your AI handles the first 30 seconds of every call. You only get involved when it matters.
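The four branches at the bottom of the diagram amount to a lookup from a call category to an ordered list of actions. Here is a minimal sketch of that idea. The action names (`hang_up`, `answer_with_ai`, `send_sms`, `transfer`) are placeholders invented for this illustration, not VoIPBin API values.

```python
# Routing table for the diagram above. The action names are illustrative
# placeholders, not VoIPBin API values.
ROUTES = {
    "SPAM": ["hang_up"],
    "SUPPORT": ["answer_with_ai"],
    "SALES": ["transfer"],
    "URGENT": ["send_sms", "transfer"],
}


def plan_actions(classification: str) -> list[str]:
    """Return the ordered actions for a call category.

    Unknown labels fall back to the AI handling the call, so an
    unexpected label never triggers a transfer or a hang-up.
    """
    return list(ROUTES.get(classification.upper(), ["answer_with_ai"]))


if __name__ == "__main__":
    for label in ("SPAM", "URGENT", "something-else"):
        print(label, "->", plan_actions(label))
```

Keeping the routing in a table like this makes it easy to add a category later without touching the webhook handler.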
Setting Up VoIPBin
First, sign up and grab your API credentials:
curl -X POST https://api.voipbin.net/v1.0/auth/signup \
-H "Content-Type: application/json" \
-d '{"username": "your-email@example.com", "password": "your-password", "name": "Your Name"}'
This returns an accesskey.token immediately — no OTP, no waiting.
Next, rent a phone number and point it at your webhook URL, and you are ready.
Building the Screener
Here is the core screening bot in Python using FastAPI:
import json

import httpx
from fastapi import FastAPI, Request
from openai import AsyncOpenAI

app = FastAPI()
# Async client so the LLM call does not block the event loop.
# Reads OPENAI_API_KEY from the environment.
client = AsyncOpenAI()

VOIPBIN_TOKEN = "YOUR_VOIPBIN_TOKEN"
MY_PHONE = "+14155551234"
BASE_URL = "https://api.voipbin.net/v1.0"

# In-memory conversation state keyed by call ID (lost on restart)
sessions = {}

SCREENER_PROMPT = """
You are an AI call screener. Your job:
1. Greet the caller and ask who they are and why they are calling
2. Classify the call as one of:
   - SPAM: robocalls, solicitations, irrelevant vendors
   - SUPPORT: tech questions, how-to, existing customers
   - SALES: potential new customers, partnership inquiries
   - URGENT: production issues, emergencies

Respond with JSON:
{"classification": "SALES", "summary": "Jane from Acme, wants enterprise pricing", "response": "What you said to the caller"}

If you need more information before classifying, omit "classification" and "summary" and put a follow-up question in "response".
"""


@app.post("/webhook/call")
async def handle_call(request: Request):
    event = await request.json()
    call_id = event["call_id"]
    event_type = event["type"]

    if event_type == "call.started":
        sessions[call_id] = {"history": [], "turn": 0}
        await speak(call_id, "Hi, thanks for calling. Could you tell me your name and what you are calling about today?")
    elif event_type == "call.transcription":
        caller_text = event["text"]
        session = sessions.get(call_id, {"history": [], "turn": 0})
        session["history"].append({"role": "user", "content": caller_text})
        session["turn"] += 1

        result = await screen_call(session["history"])
        if "classification" in result:
            # The model has enough information: route the call
            await handle_classification(call_id, result)
        else:
            # Not enough information yet: ask a follow-up question
            response_text = result.get("response", "Could you tell me a bit more?")
            session["history"].append({"role": "assistant", "content": response_text})
            await speak(call_id, response_text)
        sessions[call_id] = session

    return {"status": "ok"}


async def screen_call(history: list) -> dict:
    messages = [{"role": "system", "content": SCREENER_PROMPT}] + history
    response = await client.chat.completions.create(
        model="gpt-4o",
        messages=messages,
        response_format={"type": "json_object"},
    )
    return json.loads(response.choices[0].message.content)
async def handle_classification(call_id: str, result: dict):
    classification = result["classification"]
    summary = result["summary"]

    if classification == "SPAM":
        await speak(call_id, "Thanks for calling. We are not interested at this time. Have a great day!")
        await end_call(call_id)
    elif classification == "SUPPORT":
        await speak(call_id, "Let me help you with that directly.")
        # Add RAG over your docs here
    elif classification == "SALES":
        await speak(call_id, "This sounds like a great conversation. Let me connect you with our team.")
        await transfer_call(call_id, MY_PHONE)
    elif classification == "URGENT":
        await speak(call_id, "I understand this is urgent. Connecting you right away.")
        await send_sms_alert(summary)
        await transfer_call(call_id, MY_PHONE)
async def speak(call_id: str, text: str):
    async with httpx.AsyncClient() as http:
        await http.post(
            f"{BASE_URL}/calls/{call_id}/actions",
            headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
            json={"action": "speak", "text": text, "language": "en-US"},
        )


async def transfer_call(call_id: str, phone: str):
    async with httpx.AsyncClient() as http:
        await http.post(
            f"{BASE_URL}/calls/{call_id}/actions",
            headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
            json={"action": "transfer", "destination": phone},
        )


async def end_call(call_id: str):
    async with httpx.AsyncClient() as http:
        await http.delete(
            f"{BASE_URL}/calls/{call_id}",
            headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
        )


async def send_sms_alert(summary: str):
    async with httpx.AsyncClient() as http:
        await http.post(
            f"{BASE_URL}/messages",
            headers={"Authorization": f"Bearer {VOIPBIN_TOKEN}"},
            json={"to": MY_PHONE, "text": f"URGENT CALL: {summary}"},
        )
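One gap worth closing before production: the routing code indexes the model's reply by key, so a reply with a missing field or an unexpected label would raise an error or be silently ignored. The sketch below shows one way to validate the reply first. The function name `parse_screening` and the fallback phrases are assumptions for illustration. The four labels come from the screener prompt.

```python
import json

# Labels the screener prompt asks the model to choose from.
ALLOWED = {"SPAM", "SUPPORT", "SALES", "URGENT"}


def parse_screening(raw: str) -> dict:
    """Parse the model's reply, keeping only what the router can trust.

    Returns a dict with "classification", "summary" and "response" when the
    label is valid, otherwise a dict with only a follow-up "response".
    """
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return {"response": "Sorry, could you repeat that?"}
    if not isinstance(data, dict):
        return {"response": "Sorry, could you repeat that?"}

    label = str(data.get("classification", "")).upper()
    if label not in ALLOWED:
        # No usable label: keep only the follow-up text so the caller
        # gets another question instead of a wrong route.
        return {"response": data.get("response") or "Could you tell me a bit more?"}

    return {
        "classification": label,
        "summary": str(data.get("summary", "")),
        "response": str(data.get("response", "")),
    }


if __name__ == "__main__":
    print(parse_screening('{"classification": "sales", "summary": "Jane from Acme"}'))
    print(parse_screening('{"classification": "MAYBE"}'))
```

Calling this on the raw model output inside `screen_call`, in place of the bare `json.loads`, would let the webhook handler keep its existing `"classification" in result` check unchanged.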
How It Flows in Practice
Let us trace two example calls.
Spam call:
Bot: "Hi, thanks for calling. Could you tell me your name and what you are calling about today?"
Caller: "This is an automated message about your car extended warranty..."
AI classifies: SPAM
Bot: "Thanks for calling. We are not interested at this time. Have a great day!"
Call ends. Total time: ~10 seconds.
Sales lead:
Bot: "Hi, thanks for calling. Could you tell me your name and what you are calling about today?"
Caller: "Hi, I am Sarah from TechCorp. We are evaluating voice APIs for our AI product."
AI classifies: SALES, summary: "Sarah from TechCorp, evaluating voice APIs"
Bot: "This sounds like a great conversation. Let me connect you with our team."
Transfer to your phone. You pick up knowing it is a qualified lead.
What You Do Not Have to Build
Notice what is missing from the code above:
- No audio recording or playback code
- No RTP/SIP handling
- No STT pipeline
- No TTS setup
- No phone carrier integrations
VoIPBin handles all of that. Your webhook receives clean text (call.transcription events), and your actions send text back (speak). The voice layer is completely abstracted.
This is the real value: you write business logic, not telephony plumbing.
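Because the webhook only ever sees small JSON events, it is cheap to validate them before acting on them. The sketch below assumes the payload uses the same three fields the handler reads (`call_id`, `type`, `text`). The `CallEvent` and `parse_event` names are made up for this example, and the real payload shape should be checked against the VoIPBin docs.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class CallEvent:
    call_id: str
    type: str
    text: Optional[str] = None  # expected only on call.transcription events


def parse_event(payload: dict) -> Optional[CallEvent]:
    """Return a CallEvent, or None if the payload lacks required fields."""
    call_id = payload.get("call_id")
    event_type = payload.get("type")
    if not isinstance(call_id, str) or not isinstance(event_type, str):
        return None

    text = payload.get("text")
    if event_type == "call.transcription" and not isinstance(text, str):
        # A transcription event without text gives us nothing to classify.
        return None

    return CallEvent(call_id, event_type, text if isinstance(text, str) else None)


if __name__ == "__main__":
    print(parse_event({"call_id": "abc", "type": "call.started"}))
    print(parse_event({"call_id": "abc", "type": "call.transcription"}))
```

Returning `None` for malformed events lets the handler acknowledge the webhook with a normal response instead of failing with a `KeyError`.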
Extending the Screener
Caller ID enrichment:
async def enrich_caller(phone_number: str) -> dict:
    existing = await crm.lookup(phone_number)
    if existing:
        return {"known": True, "name": existing.name, "tier": existing.tier}
    return {"known": False}
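The `crm` object in that snippet is left undefined. To try the enrichment logic locally, a stand-in is enough. The sketch below uses a hypothetical `InMemoryCRM` class and `Contact` record, both invented for this example, where a real setup would wrap whatever CRM client is in use.

```python
import asyncio
from dataclasses import dataclass
from typing import Optional


@dataclass
class Contact:
    name: str
    tier: str


class InMemoryCRM:
    """Hypothetical stand-in for a real CRM client."""

    def __init__(self, contacts: dict[str, Contact]):
        self._contacts = contacts

    async def lookup(self, phone_number: str) -> Optional[Contact]:
        return self._contacts.get(phone_number)


# Sample data for local testing only.
crm = InMemoryCRM({"+14155550100": Contact(name="Jane", tier="enterprise")})


async def enrich_caller(phone_number: str) -> dict:
    existing = await crm.lookup(phone_number)
    if existing:
        return {"known": True, "name": existing.name, "tier": existing.tier}
    return {"known": False}


if __name__ == "__main__":
    print(asyncio.run(enrich_caller("+14155550100")))
    print(asyncio.run(enrich_caller("+14155550199")))
```

The result could be added to the system prompt so the screener greets known customers by name and skips the "who are you" question.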
Time-based rules:
import datetime

def is_after_hours():
    # Uses the server's local clock; 09:00 through 18:59 counts as business hours
    hour = datetime.datetime.now().hour
    return hour < 9 or hour > 18

# After hours: only URGENT gets through
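The "only URGENT gets through" rule can be written as a small pure function, which also makes it testable by passing the time in. In the sketch below, `should_transfer` and the fixed UTC-8 office offset are assumptions for illustration. A real deployment would more likely use `zoneinfo.ZoneInfo` so daylight saving time is handled.

```python
from datetime import datetime, timedelta, timezone

# Assumption: the office runs on a fixed UTC-8 clock (no daylight saving).
OFFICE_TZ = timezone(timedelta(hours=-8))


def is_after_hours(now: datetime) -> bool:
    """True outside 09:00 through 18:59 office time. `now` must be timezone-aware."""
    hour = now.astimezone(OFFICE_TZ).hour
    return hour < 9 or hour > 18


def should_transfer(classification: str, now: datetime) -> bool:
    """URGENT always transfers; SALES transfers only during business hours."""
    if classification == "URGENT":
        return True
    return classification == "SALES" and not is_after_hours(now)


if __name__ == "__main__":
    noon_office = datetime(2024, 1, 15, 20, 0, tzinfo=timezone.utc)  # 12:00 at UTC-8
    late_office = datetime(2024, 1, 15, 6, 0, tzinfo=timezone.utc)   # 22:00 at UTC-8
    print(should_transfer("SALES", noon_office))
    print(should_transfer("SALES", late_office))
    print(should_transfer("URGENT", late_office))
```

A sales call that arrives after hours could then be answered with a callback offer instead of a transfer.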
Screening summary log:
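async def log_screening(call_id, result):
    # db is a placeholder for your database client
    await db.insert("screened_calls", {
        "call_id": call_id,
        "classification": result["classification"],
        "summary": result["summary"],
        # Timezone-aware UTC timestamp (utcnow() is deprecated since Python 3.12)
        "timestamp": datetime.datetime.now(datetime.timezone.utc),
    })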
After a week, you have a dataset showing exactly what kinds of calls you get, how often, and what the AI decided — without you having answered a single spam call.
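Turning that log into a "what kinds of calls, how often" summary takes only a few lines. The sketch below assumes the logged rows have been loaded as a list of dicts with a `classification` key, matching the log entry above. The `summarize` helper and the sample rows are invented for illustration.

```python
from collections import Counter


def summarize(rows: list[dict]) -> dict:
    """Count calls per classification and compute each label's share."""
    counts = Counter(row["classification"] for row in rows)
    total = sum(counts.values())
    return {
        label: {"calls": n, "share": round(n / total, 2)}
        for label, n in counts.most_common()
    }


if __name__ == "__main__":
    # Made-up sample rows, not real call data.
    sample = [
        {"classification": "SPAM"},
        {"classification": "SPAM"},
        {"classification": "SPAM"},
        {"classification": "SALES"},
    ]
    print(summarize(sample))
```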
The Real Win
This is not about productivity hacks. It is about attention economics.
Every time you pick up an unknown call, you are making a bet: is this worth interrupting what I am doing? Most of the time, the answer is no. An AI screener shifts that bet entirely — you only engage when the AI has already decided the call is worth your time.
For small teams and solo founders, this is transformative. Your phone becomes a high-signal channel instead of an interruption machine.
Get Started
- VoIPBin signup: POST https://api.voipbin.net/v1.0/auth/signup (instant access, no OTP)
- MCP for Claude: uvx voipbin-mcp (test calls directly from Claude Desktop)
- Go SDK: go get github.com/voipbin/voipbin-go
- Docs and API: voipbin.net
If you try this out, what classification categories do you end up using? Drop a comment below.