Mart Schweiger

Posted on • Originally published at assemblyai.com

Twilio Phone Agent with AssemblyAI Universal-3 Pro Streaming

Build an AI phone agent that handles real calls using Twilio Voice + Media Streams and the AssemblyAI Universal-3 Pro Streaming model for real-time speech-to-text.

The key detail: Twilio streams 8kHz μ-law audio. AssemblyAI Universal-3 Pro accepts pcm_mulaw at sample_rate=8000 natively — no resampling, no format conversion.
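Because the formats already match, the server only has to unwrap each Twilio message before forwarding it. A minimal sketch, assuming Twilio's documented Media Streams JSON framing (`event`, `media.payload`); the helper name `mulaw_from_twilio` is hypothetical and not taken from the repository:

```python
import base64
import json
from typing import Optional


def mulaw_from_twilio(message: str) -> Optional[bytes]:
    """Return the raw mu-law bytes from a Twilio "media" event, else None."""
    event = json.loads(message)
    if event.get("event") != "media":
        # Events such as "connected", "start", and "stop" carry no audio.
        return None
    # The payload is base64-encoded 8 kHz mu-law; decoding is the only step
    # needed before the bytes are sent on to the speech-to-text WebSocket.
    return base64.b64decode(event["media"]["payload"])
```

The returned bytes can be written to the AssemblyAI WebSocket as a binary frame without any resampling.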

Architecture

Incoming call
     │
  Twilio Voice
     │ TwiML → open WebSocket
     ▼
Your server (/media-stream WebSocket)
     │                        │
     │ mulaw 8kHz audio       │ synthesized mulaw audio
     ▼                        ▲
AssemblyAI Universal-3 Pro    ElevenLabs TTS
     │ transcript + turn signal
     ▼
  OpenAI GPT-4o
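The two ends of this diagram can be sketched as a pair of small helpers. This is an illustration only: it assumes the streaming API emits JSON messages shaped like `{"type": "Turn", "transcript": "...", "end_of_turn": true}` and that audio is returned to the caller using Twilio's `media` event. The function names are hypothetical.

```python
import base64
import json
from typing import Optional


def finished_turn_text(message: str) -> Optional[str]:
    """Return the transcript of a completed turn, or None for anything else."""
    event = json.loads(message)
    # Assumed shape: {"type": "Turn", "transcript": str, "end_of_turn": bool}
    if event.get("type") != "Turn" or not event.get("end_of_turn"):
        return None
    text = event.get("transcript", "").strip()
    return text or None


def twilio_media_frame(stream_sid: str, mulaw_audio: bytes) -> str:
    """Wrap synthesized 8 kHz mu-law audio in a Twilio "media" message."""
    return json.dumps({
        "event": "media",
        "streamSid": stream_sid,
        "media": {"payload": base64.b64encode(mulaw_audio).decode("ascii")},
    })
```

Only a non-None result from `finished_turn_text` would be passed to the language model; partial turns are ignored so the agent does not answer mid-sentence.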

Prerequisites

Quick Start

git clone https://github.com/kelseyefoster/voice-agent-twilio-universal-3-pro
cd voice-agent-twilio-universal-3-pro

python -m venv .venv && source .venv/bin/activate
pip install -r requirements.txt

cp .env.example .env

uvicorn server:app --host 0.0.0.0 --port 8000
# In a second terminal, expose the local server to Twilio
ngrok http 8000

Configure Twilio

  1. Twilio Console > Phone Numbers > your number > Voice & Fax
  2. Set "A Call Comes In" to Webhook and enter https://your-ngrok-url.ngrok.io/incoming-call
  3. Call your Twilio number
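When Twilio calls the webhook, the server answers with TwiML that opens the media WebSocket. A rough sketch of that response, assuming the `/media-stream` path from the architecture diagram and Twilio's `<Connect><Stream>` verbs; the function name is hypothetical and the repository's actual handler may differ:

```python
from xml.sax.saxutils import quoteattr


def incoming_call_twiml(host: str) -> str:
    """Build TwiML that connects the call to a bidirectional media stream."""
    stream_url = f"wss://{host}/media-stream"
    return (
        '<?xml version="1.0" encoding="UTF-8"?>'
        "<Response><Connect>"
        # quoteattr adds the surrounding quotes and escapes special characters.
        f"<Stream url={quoteattr(stream_url)} />"
        "</Connect></Response>"
    )
```

The `/incoming-call` route would return this string with an XML content type, passing the public ngrok hostname as `host`.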

AssemblyAI WebSocket Parameters for Twilio

ASSEMBLYAI_WS_URL = (
    "wss://streaming.assemblyai.com/v3/ws"
    "?speech_model=u3-rt-pro"
    "&encoding=pcm_mulaw"      # must match Twilio's audio format
    "&sample_rate=8000"        # must match Twilio's 8kHz stream
    "&end_of_turn_confidence_threshold=0.5"
    "&min_turn_silence=400"
)

Phone calls carry more background noise than browser audio, so this configuration raises end_of_turn_confidence_threshold to 0.5 and lengthens min_turn_silence to 400 to reduce false end-of-turn triggers.
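If the two turn-detection values need tuning per deployment, the same URL can be assembled from a dictionary rather than a string literal. A small sketch using only the parameters shown above; the function name and its keyword arguments are hypothetical:

```python
from urllib.parse import urlencode

BASE_URL = "wss://streaming.assemblyai.com/v3/ws"


def build_streaming_url(threshold: float = 0.5, min_silence: int = 400) -> str:
    """Assemble the streaming URL with Twilio-compatible audio settings."""
    params = {
        "speech_model": "u3-rt-pro",
        "encoding": "pcm_mulaw",   # must match Twilio's audio format
        "sample_rate": 8000,       # must match Twilio's 8kHz stream
        "end_of_turn_confidence_threshold": threshold,
        "min_turn_silence": min_silence,
    }
    return f"{BASE_URL}?{urlencode(params)}"
```

Calling `build_streaming_url()` with no arguments reproduces the URL above, and passing different values makes it easy to experiment on noisy lines.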

Post-Call Transcription

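import os
import assemblyai as aai

# Assumes the key is in the ASSEMBLYAI_API_KEY environment variable.
aai.settings.api_key = os.environ["ASSEMBLYAI_API_KEY"]
transcriber = aai.Transcriber()
transcript = transcriber.transcribe(recording_url)  # URL of the call recording
print(transcript.text)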

Keyterm Prompting

ASSEMBLYAI_WS_URL += "&keyterms_prompt=YourBrand&keyterms_prompt=SpecialTerm"
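Terms that contain spaces or punctuation need URL encoding before they are appended. A hedged sketch that follows the repeated `keyterms_prompt` parameter format shown above; the helper name is hypothetical:

```python
from typing import Sequence
from urllib.parse import urlencode


def with_keyterms(url: str, terms: Sequence[str]) -> str:
    """Append one URL-encoded keyterms_prompt parameter per term."""
    if not terms:
        return url
    # doseq=True expands the list into repeated key=value pairs.
    query = urlencode({"keyterms_prompt": list(terms)}, doseq=True)
    separator = "&" if "?" in url else "?"
    return f"{url}{separator}{query}"
```

For example, `with_keyterms(ASSEMBLYAI_WS_URL, ["YourBrand", "Special Term"])` would encode the space in the second term instead of producing an invalid URL.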

Deploy

# Railway
railway login && railway init && railway up

# Render — Web Service with start command:
# uvicorn server:app --host 0.0.0.0 --port $PORT

Resources
