Cloud Call Centre Architecture Deep Dive with Vonage
← Day 1: What is a Cloud Call Centre? | Day 3: Setup from Scratch →
🎯 What You'll Learn Today
By the end of this post, you'll be able to:
- Explain every architectural component of a Vonage-powered cloud call centre
- Understand how data flows between components in real time
- Identify where your application code plugs into the Vonage platform
- Make informed design decisions before writing a single line of code
- Spot common architectural mistakes before they cost you in production
This is the post you'll come back to as a reference throughout the series.
🗺️ The 10,000-Foot View
Before going deep, let's establish the complete picture. A production cloud call centre built on Vonage has six architectural zones:
┌─────────────────────────────────────────────────────────────────────┐
│ VONAGE CLOUD CALL CENTRE — ZONES │
│ │
│ ┌───────────────────────────────────────────────────────────────┐ │
│ │ ZONE 1: CUSTOMER-FACING CHANNELS │ │
│ │ Voice · SMS · WhatsApp · Web Chat · Email · Video │ │
│ └───────────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────▼───────────────────────────────────┐ │
│ │ ZONE 2: VONAGE NETWORK LAYER │ │
│ │ PSTN Gateway · SIP Trunking · WebRTC Bridge · Number Pool │ │
│ └───────────────────────────┬───────────────────────────────────┘ │
│ │ │
│ ┌───────────────────────────▼───────────────────────────────────┐ │
│ │ ZONE 3: YOUR APPLICATION LAYER ◄── YOU BUILD THIS │ │
│ │ Webhook Handlers · NCCO Logic · Business Rules · State │ │
│ └────────┬──────────────────┬──────────────────┬────────────────┘ │
│ │ │ │ │
│ ┌────────▼────────┐ ┌───────▼───────┐ ┌───────▼────────┐ │
│ │ ZONE 4: AI & │ │ ZONE 5: │ │ ZONE 6: │ │
│ │ AUTOMATION │ │ AGENT LAYER │ │ DATA LAYER │ │
│ │ │ │ │ │ │ │
│ │ IVR · Bots │ │ Desktop · CRM │ │ CDRs · Reports │ │
│ │ Transcription │ │ Skills Routing│ │ Analytics · DB │ │
│ │ Sentiment │ │ Queuing │ │ Compliance │ │
│ └─────────────────┘ └───────────────┘ └────────────────┘ │
└─────────────────────────────────────────────────────────────────────┘
Now let's go deep into each zone.
🔵 Zone 1: Customer-Facing Channels
This is where your customers actually reach you. The diagram below shows six entry points, all of which can be unified into a single routing and conversation system.
CUSTOMER ENTRY POINTS
│
┌──────────────────┼──────────────────┐
│ │ │
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ 📞 PSTN │ │ 💬 Chat │ │ 📱 Mobile│
│ Voice │ │ Web │ │ Apps │
│ │ │ Widget │ │ │
│ Dial a │ │ Embedded │ │ In-app │
│ number │ │ on your │ │ support │
│ │ │ website │ │ │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
─────────────────┼──────────────────
│
┌────────────────┼────────────────┐
│ │ │
▼ ▼ ▼
┌──────────┐ ┌──────────┐ ┌──────────┐
│ 📲 SMS │ │📱WhatsApp│ │ 📧 Email │
│ │ │ │ │ │
│ Two-way │ │ Business │ │ Inbound │
│ text │ │ messaging│ │ support │
│ │ │ │ │ tickets │
└────┬─────┘ └────┬─────┘ └────┬─────┘
│ │ │
───────────────┼───────────────
│
▼
VONAGE API LAYER
(Zone 2 receives
all of the above)
Key insight: Channels are just data
From an architectural perspective, every channel above eventually becomes a Vonage Conversation: a unified data structure that holds messages, participants, and events regardless of which channel they came from. This is the superpower of Vonage's design. More on this in the Conversation API section below.
🔵 Zone 2: The Vonage Network Layer
This is Vonage's infrastructure — the part you don't manage directly but absolutely need to understand.
┌──────────────────────────────────────────────────────────────┐
│ VONAGE NETWORK LAYER │
│ │
│ ┌─────────────────┐ ┌─────────────────────────────┐ │
│ │ PSTN GATEWAY │ │ WEBRTC BRIDGE │ │
│ │ │ │ │ │
│ │ Converts │ │ Converts browser/app │ │
│ │ traditional │ │ audio (WebRTC) ◄──► │ │
│ │ phone calls │ │ traditional telephony │ │
│ │ to IP packets │ │ (RTP/SIP) │ │
│ └────────┬────────┘ └──────────────┬──────────────┘ │
│ │ │ │
│ └──────────────┬──────────────┘ │
│ │ │
│ ┌────────▼────────┐ │
│ │ MEDIA SERVER │ │
│ │ │ │
│ │ • Audio mixing │ │
│ │ • Recording │ │
│ │ • DTMF detect │ │
│ │ • Transcoding │ │
│ └────────┬────────┘ │
│ │ │
│ ┌─────────────────┐ │ ┌─────────────────────────┐ │
│ │ NUMBER POOL │ │ │ SIP TRUNKING │ │
│ │ │ │ │ │ │
│ │ Virtual phone │ │ │ Connect your existing │ │
│ │ numbers in │─────┘ │ PBX or SIP provider │ │
│ │ 160+ countries │ │ to Vonage │ │
│ └─────────────────┘ └─────────────────────────┘ │
└──────────────────────────────────────────────────────────────┘
What is a Vonage Virtual Number?
When a customer dials your call centre number, they are dialling a Vonage virtual number: a phone number hosted on Vonage's platform. The number is linked to your Vonage application, which is configured with an answer webhook URL. The moment a call arrives, Vonage sends an HTTP request to that URL.
Customer dials: +44 20 7946 0123
│
▼
Is this a Vonage virtual number?
│
YES
│
▼
Look up the Answer Webhook URL
for this number → https://yourapp.com/webhooks/answer
│
▼
HTTP request to your server (GET by default, POST if configured)
│
▼
Your server returns NCCO JSON
│
▼
Vonage executes the NCCO actions
This webhook-driven model is fundamental: Vonage handles the telephony, and your code makes every call-flow decision.
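To make the "your server returns NCCO JSON" step concrete, here's a minimal sketch of an NCCO builder in Python. The function name and phone numbers are hypothetical; the `talk` and `connect` actions follow the Vonage NCCO reference as I understand it, so check field names against the current docs before relying on them.

```python
import json


def build_answer_ncco(vonage_number: str, agent_number: str) -> list:
    """Return a two-step NCCO: greet the caller, then connect them to an agent."""
    return [
        {"action": "talk", "text": "Thank you for calling. Connecting you now."},
        {
            "action": "connect",
            "from": vonage_number,  # the virtual number the call is placed from
            "endpoint": [{"type": "phone", "number": agent_number}],
        },
    ]


# Hypothetical numbers, for illustration only.
ncco = build_answer_ncco("442079460123", "447700900001")
print(json.dumps(ncco, indent=2))
```

Your answer webhook would return this list as the JSON body of its HTTP response.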
🔵 Zone 3: Your Application Layer
This is where you live as a developer. Your application is the brain of the call centre.
┌───────────────────────────────────────────────────────────────────┐
│ YOUR APPLICATION LAYER │
│ │
│ │
│ ┌─────────────────────────────────────────────────────────────┐ │
│ │ WEBHOOK HANDLERS │ │
│ │ │ │
│ │ POST /webhooks/answer ← Vonage fires when a call arrives│ │
│ │ POST /webhooks/event ← Vonage fires on call events │ │
│ │ POST /webhooks/fallback ← Vonage fires on errors │ │
│ │ POST /webhooks/recording ← Vonage fires when rec. ready │ │
│ │ POST /webhooks/message ← Vonage fires on SMS/chat msg │ │
│ └───────────────────┬─────────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ BUSINESS LOGIC │ │
│ │ │ │
│ │ • Which IVR menu to present? │ │
│ │ • Which agent skill is needed? │ │
│ │ • Is it within business hours? │ │
│ │ • Is this customer a VIP? (CRM lookup) │ │
│ │ • Should this call be recorded? │ │
│ │ • What language should the IVR speak? │ │
│ └───────────────────┬─────────────────────────────────────┘ │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────────────────────────────┐ │
│ │ NCCO BUILDER │ │
│ │ │ │
│ │ Dynamically construct NCCO JSON based on logic above │ │
│ │ Return it to Vonage as the HTTP response │ │
│ └───────────────────┬─────────────────────────────────────┘ │
│ │ │
│ ┌───────────┼───────────┐ │
│ ▼ ▼ ▼ │
│ Vonage API Database External │
│ Calls (State) Services │
│ (outbound) (CRM, AI) │
└───────────────────────────────────────────────────────────────┘
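The business-logic-to-NCCO step in the diagram above can be sketched in a few lines. This is an illustration only: the opening hours, the UTC assumption, and the function names are all made up for the example, and a real build would read hours from config and VIP status from your CRM.

```python
from datetime import datetime, time, timezone

OPEN = time(9, 0)     # assumed opening time (UTC)
CLOSE = time(17, 30)  # assumed closing time (UTC)


def is_business_hours(now: datetime) -> bool:
    """True on Monday to Friday between OPEN and CLOSE."""
    return now.weekday() < 5 and OPEN <= now.time() < CLOSE


def build_ncco(now: datetime, is_vip: bool) -> list:
    """Pick a greeting from two of the questions in the business-logic box."""
    if not is_business_hours(now):
        text = "We are closed. Please call back during business hours."
    elif is_vip:
        text = "Welcome back. Connecting you to a senior agent."
    else:
        text = "Welcome. Please hold for the next available agent."
    return [{"action": "talk", "text": text}]


# Monday 15 January 2024, 10:00 UTC, VIP caller.
print(build_ncco(datetime(2024, 1, 15, 10, 0, tzinfo=timezone.utc), is_vip=True))
```

Keeping the decision functions pure (no I/O) makes them easy to unit test and keeps the webhook handler itself thin.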
The Webhook Request/Response Cycle
Every interaction in a Vonage call centre follows this pattern:
YOUR SERVER VONAGE
│ │
│ │ Customer dials in
│ │◄─────────────────
│ │
│ POST /webhooks/answer │
│◄─────────────────────────────────│
│ Body: { from, to, uuid, ... } │
│ │
│ 200 OK + NCCO JSON │
│─────────────────────────────────►│
│ │
│ │ Executes NCCO
│ │ (plays greeting,
│ │ records, connects)
│ │
│ POST /webhooks/event │
│◄─────────────────────────────────│
│ Body: { status: "answered" } │
│ │
│ 200 OK │
│─────────────────────────────────►│
│ │
│ POST /webhooks/event │
│◄─────────────────────────────────│
│ Body: { status: "completed", │
│ duration: 245 } │
│ │
│ 200 OK │
│─────────────────────────────────►│
⚠️ Critical rule: Your webhook endpoints must respond within 3 seconds. Vonage will time out and try the fallback URL if you take too long. Keep business logic fast — offload heavy work asynchronously.
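One way to "offload heavy work asynchronously" is to acknowledge the webhook straight away and hand the payload to a background worker. The sketch below uses only the standard library; the handler name and the `processed` list are stand-ins, and in production you'd more likely use a durable queue than an in-process one.

```python
import queue
import threading

event_queue: queue.Queue = queue.Queue()
processed = []  # stand-in for "CRM updated", "analytics written", etc.


def handle_event_webhook(payload: dict) -> tuple:
    """Acknowledge immediately and defer the slow work to the worker."""
    event_queue.put(payload)
    return 200, "OK"


def worker() -> None:
    """Drain the queue in the background. A None item stops the worker."""
    while True:
        item = event_queue.get()
        if item is None:
            break
        processed.append(item["status"])  # the slow work would happen here


thread = threading.Thread(target=worker, daemon=True)
thread.start()
handle_event_webhook({"uuid": "abc", "status": "completed"})
event_queue.put(None)  # shut the worker down for this demo
thread.join()
```

The handler's response time no longer depends on how long the downstream work takes.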
🔵 The Conversation API: Vonage's Unified Data Model
The Conversation API is the single most important API to understand in this stack. It provides the data model that unifies every channel.
┌──────────────────────────────────────────────────────────────┐
│ CONVERSATION API DATA MODEL │
│ │
│ CONVERSATION │
│ ┌─────────────────────┐ │
│ │ id: CON-abc123 │ │
│ │ display_name: │ │
│ │ "Support - Alice" │ │
│ │ state: ACTIVE │ │
│ └──────────┬──────────┘ │
│ │ │
│ ┌───────────────┼────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌─────────┐ ┌─────────┐ ┌─────────┐ │
│ │ MEMBER │ │ MEMBER │ │ MEMBER │ │
│ │ │ │ │ │ │ │
│ │Customer │ │ Agent │ │ Bot │ │
│ │ Alice │ │ Bob │ │ (IVR) │ │
│ │ │ │ │ │ │ │
│ │ channel:│ │ channel:│ │ channel:│ │
│ │ voice │ │ app │ │ app │ │
│ └────┬────┘ └────┬────┘ └────┬────┘ │
│ │ │ │ │
│ └───────────────┼───────────────┘ │
│ │ │
│ ▼ │
│ ┌──────────────────┐ │
│ │ EVENTS │ │
│ │ │ │
│ │ • member:joined │ │
│ │ • member:left │ │
│ │ • audio:say:done │ │
│ │ • audio:record │ │
│ │ • message:submit │ │
│ └──────────────────┘ │
└──────────────────────────────────────────────────────────────┘
Why This Matters
The magic of the Conversation API is that a customer can start on web chat, escalate to voice, and both interactions live in the same Conversation object. The agent sees the full history.
SAME CUSTOMER — DIFFERENT CHANNELS — ONE CONVERSATION
10:00 AM Customer opens web chat
"Hi, I have a billing question."
│
▼ Conversation CON-abc123 created
10:04 AM Customer frustrated, clicks "Call Me."
Voice call initiated
│
▼ Voice MEMBER added to CON-abc123
10:05 AM Agent answers
Sees: "This customer was chatting
about billing since 10:00."
│
▼ Full context preserved, no repeat needed
10:12 AM Issue resolved
Both chat messages and call recordings
stored in CON-abc123
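To picture the data model, here's a toy version of it in Python. These dataclasses are purely illustrative and are not the Vonage SDK's types; they just mirror the Conversation, Member, and Event boxes in the diagram and replay the timeline above.

```python
from dataclasses import dataclass, field


@dataclass
class Member:
    name: str
    channel: str  # e.g. "app" or "voice", as in the diagram


@dataclass
class Conversation:
    id: str
    display_name: str
    members: list = field(default_factory=list)
    events: list = field(default_factory=list)

    def join(self, member: Member) -> None:
        """Add a member and record the corresponding event."""
        self.members.append(member)
        self.events.append(
            {"type": "member:joined", "who": member.name, "channel": member.channel}
        )


conv = Conversation("CON-abc123", "Support - Alice")
conv.join(Member("Alice", "app"))    # 10:00 web chat
conv.join(Member("Alice", "voice"))  # 10:04 escalates to a call
conv.join(Member("Bob", "app"))      # 10:05 agent answers
```

Because both of Alice's memberships hang off the same conversation ID, the agent's view of history is a single lookup.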
🔵 Zone 4: The AI & Automation Layer
This zone sits between your application logic and the customer. It handles contacts that don't immediately need a human agent.
┌────────────────────────────────────────────────────────────────┐
│ AI & AUTOMATION LAYER │
│ │
│ INBOUND CONTACT │
│ │ │
│ ▼ │
│ ┌─────────────────────────────────┐ │
│ │ IVR / BOT LAYER │ │
│ │ │ │
│ │ Can this be resolved without │ │
│ │ a human agent? │ │
│ └────────────────┬────────────────┘ │
│ │ │
│ ┌──────────────┼──────────────┐ │
│ │ │ │ │
│ ▼ ▼ ▼ │
│ ┌──────────┐ ┌──────────┐ ┌──────────┐ │
│ │ DTMF │ │ NLU │ │ SELF │ │
│ │ IVR │ │ Bot │ │ SERVICE │ │
│ │ │ │ │ │ │ │
│ │ Press 1 │ │ "I want │ │ Balance │ │
│ │ for Sales│ │ to check│ │ check, │ │
│ │ Press 2 │ │ my bill"│ │ order │ │
│ │ for Supp.│ │ │ │ status, │ │
│ └────┬─────┘ └────┬─────┘ │ FAQs │ │
│ │ │ └────┬─────┘ │
│ │ │ │ │
│ └─────────────┼─────────────┘ │
│ │ │
│ Resolved? │
│ │ │ │
│ YES NO │
│ │ │ │
│ ▼ ▼ │
│ End call Escalate to │
│ human agent │
│ (with context) │
└────────────────────────────────────────────────────────────────┘
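The DTMF branch of this flow might look like the sketch below. The `input` action shape and the `dtmf.digits` field in the callback follow the Vonage NCCO reference as I understand it, so treat the exact field names as an assumption; the menu mapping, function names, and URL are hypothetical.

```python
MENU = {"1": "sales", "2": "support"}  # digits from the diagram above


def ivr_prompt_ncco(event_url: str) -> list:
    """Play the menu, then collect a single DTMF digit."""
    return [
        {
            "action": "talk",
            "text": "Press 1 for Sales. Press 2 for Support.",
            "bargeIn": True,  # let the caller press a key mid-prompt
        },
        {
            "action": "input",
            "type": ["dtmf"],
            "dtmf": {"maxDigits": 1},
            "eventUrl": [event_url],
        },
    ]


def route_for_input(payload: dict) -> str:
    """Map the collected digit to a queue; anything else escalates to a human."""
    digits = (payload.get("dtmf") or {}).get("digits", "")
    return MENU.get(digits, "human_agent")


print(route_for_input({"dtmf": {"digits": "2"}}))
```

Defaulting unknown input to a human agent matches the "escalate with context" path in the diagram rather than trapping callers in the menu.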
Real-Time AI Assistants
Beyond IVR, Vonage can stream live call audio to your application over a WebSocket, which lets you run AI models during the call:
LIVE CALL AI PIPELINE
Customer speaking ──► Vonage Media Server
│
│ Audio stream (WebSocket)
▼
Your App / ASR Service
(e.g., Google STT,
AWS Transcribe)
│
│ Text transcript
▼
NLU / LLM Layer
(Intent detection,
sentiment scoring,
suggested responses)
│
│ Insights via WebSocket
▼
Agent Desktop UI
(Shows agent: "Customer
seems frustrated,
suggest discount offer")
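The last two stages of that pipeline (transcript in, agent hint out) can be sketched with a toy stand-in. To be clear, the keyword matcher below is not a real NLU or LLM; it's a placeholder with an invented word list, there only to show the shape of the data you might push to the agent desktop.

```python
NEGATIVE_WORDS = {"frustrated", "angry", "cancel", "unacceptable", "problem"}


def insight_for(transcript_chunk: str) -> dict:
    """Turn one chunk of transcript into an insight for the agent desktop."""
    words = {w.strip(".,!?").lower() for w in transcript_chunk.split()}
    hits = sorted(words & NEGATIVE_WORDS)
    return {
        "sentiment": "negative" if hits else "neutral",
        "matched": hits,
        "hint": "Customer may be frustrated; acknowledge and offer help."
        if hits
        else None,
    }


print(insight_for("I have a problem with my bill."))
```

In a real build you'd swap `insight_for` for a call to your NLU or LLM service and send the resulting dict to the agent's browser over a WebSocket.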
🔵 Zone 5: The Agent Layer
The agent layer is what your agents interact with every day. Understanding its components is critical to designing a good agent experience.
┌──────────────────────────────────────────────────────────────┐
│ AGENT LAYER │
│ │
│ ┌─────────────────────────────────────────────────────┐ │
│ │ AGENT DESKTOP │ │
│ │ │ │
│ │ ┌──────────────┐ ┌──────────────┐ ┌──────────┐ │ │
│ │ │ SOFTPHONE │ │ CUSTOMER │ │ QUEUE │ │ │
│ │ │ │ │ CONTEXT │ │ STATUS │ │ │
│ │ │ Answer/Hold │ │ │ │ │ │ │
│ │ │ Transfer │ │ Name, account│ │ Waiting: │ │ │
│ │ │ Conference │ │ history, │ │ 4 │ │ │
│ │ │ Mute/unmute │ │ previous │ │ Avg wait:│ │ │
│ │ │ │ │ contacts, │ │ 1m 20s │ │ │
│ │ │ [WebRTC] │ │ open tickets │ │ │ │ │
│ │ └──────────────┘ └──────────────┘ └──────────┘ │ │
│ └──────────────────────────┬────────────────────────────┘ │
│ │ │
│ ┌──────────────────────────▼────────────────────────────┐ │
│ │ VONAGE CLIENT SDK │ │
│ │ │ │
│ │ Handles all WebRTC audio in the browser │ │
│ │ Listens for incoming calls │ │
│ │ Emits events: call:received, call:answered, etc. │ │
│ └──────────────────────────┬────────────────────────────┘ │
│ │ │
│ ┌─────────────────┼─────────────────┐ │
│ ▼ ▼ ▼ │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ ROUTING │ │ SKILLS │ │ AGENT │ │
│ │ ENGINE │ │ MANAGEMENT │ │ STATUS │ │
│ │ │ │ │ │ │ │
│ │ Which agent │ │ Define skill│ │ Available │ │
│ │ gets this │ │ groups: │ │ Busy │ │
│ │ contact? │ │ • Billing │ │ Wrap-up │ │
│ │ │ │ • Technical │ │ Offline │ │
│ │ Priority, │ │ • Spanish │ │ │ │
│ │ skills, │ │ • VIP │ │ │ │
│ │ availability│ │ │ │ │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
└──────────────────────────────────────────────────────────────┘
Skills-Based Routing Flow
INBOUND CONTACT ARRIVES
│
▼
What skill does this contact need?
(determined by IVR input or channel)
│
┌──────┴──────┐
│ │
▼ ▼
Billing Technical
Support Support
│ │
▼ ▼
Find available agent Find available agent
with BILLING skill with TECHNICAL skill
│ │
▼ ▼
Agent found? Agent found?
│ │ │ │
YES NO YES NO
│ │ │ │
▼ ▼ ▼ ▼
Connect Queue Connect Queue
now customer now customer
│ │
▼ ▼
Wait for the next Offer callback
available agent or voicemail
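Here's a minimal sketch of that routing flow: find an available agent with the required skill, otherwise queue the contact. The `Agent` and `Router` classes are hypothetical, and a real router would also weigh priority and longest-idle time.

```python
from collections import deque
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class Agent:
    name: str
    skills: set
    status: str = "available"  # available | busy | wrap-up | offline


@dataclass
class Router:
    agents: list
    queues: dict = field(default_factory=dict)  # skill -> deque of contact IDs

    def route(self, contact_id: str, skill: str) -> Optional[str]:
        """Return the assigned agent's name, or None if the contact was queued."""
        for agent in self.agents:
            if agent.status == "available" and skill in agent.skills:
                agent.status = "busy"
                return agent.name
        self.queues.setdefault(skill, deque()).append(contact_id)
        return None


router = Router([Agent("Bob", {"BILLING"}), Agent("Carol", {"TECHNICAL"}, "busy")])
print(router.route("contact-1", "BILLING"))  # Bob is free and has the skill
print(router.route("contact-2", "BILLING"))  # nobody free, so it is queued
```

One queue per skill keeps the "wait for the next available agent" step simple: when an agent becomes available, pop from the queues that match their skills.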
🔵 Zone 6: The Data Layer
Every interaction generates data. This zone is responsible for storing, processing, and surfacing that data.
┌──────────────────────────────────────────────────────────────┐
│ DATA LAYER │
│ │
│ DATA SOURCES │
│ ┌───────────┐ ┌───────────┐ ┌───────────┐ ┌─────────┐ │
│ │ Call │ │ Vonage │ │ Agent │ │ CSAT │ │
│ │ Events │ │ Reports │ │ Activity │ │ Surveys │ │
│ │ (webhooks)│ │ API (CDRs)│ │ (status) │ │ │ │
│ └─────┬─────┘ └─────┬─────┘ └─────┬─────┘ └────┬────┘ │
│ │ │ │ │ │
│ └──────────────┼──────────────┴──────────────┘ │
│ │ │
│ ▼ │
│ ┌────────────────────────────────────────────────────────┐ │
│ │ DATA PIPELINE │ │
│ │ │ │
│ │ Ingest ──► Normalise ──► Store ──► Query ──► Visualise│ │
│ └────────────────────────────────────────────────────────┘ │
│ │ │
│ ┌─────────────┼─────────────┐ │
│ ▼ ▼ ▼ │
│ ┌─────────────┐ ┌─────────┐ ┌──────────────┐ │
│ │ REAL-TIME │ │HISTORICAL│ │ COMPLIANCE │ │
│ │ DASHBOARD │ │REPORTING │ │ & AUDIT │ │
│ │ │ │ │ │ │ │
│ │ Live queue │ │ Daily CDR│ │ Call records │ │
│ │ Agent status│ │ Trends │ │ Retention │ │
│ │ SLA alerts │ │ CSAT avg │ │ GDPR logs │ │
│ │ Active calls│ │ Handle │ │ PCI masking │ │
│ │ │ │ time │ │ │ │
│ └─────────────┘ └─────────┘ └──────────────┘ │
└──────────────────────────────────────────────────────────────┘
What's in a CDR?
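A Call Detail Record (CDR) built from Vonage's Reports API data looks something like this. Fields such as agent_id, queue_time, and handle_time come from your own routing layer rather than from Vonage:
{
"record_index": 0,
"call_uuid": "63f61863-4a51-4f6b-86e1-46edeb100391",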
"conversation_uuid": "CON-abc123",
"direction": "inbound",
"from": "+441234567890",
"to": "+442079460123",
"date_start": "2024-01-15T09:00:00Z",
"date_end": "2024-01-15T09:04:05Z",
"duration": "245",
"rate": "0.00450",
"price": "0.01838",
"network": "GB-FIXED",
"status": "completed",
"recording_url": "https://api.nexmo.com/v1/files/...",
"agent_id": "agent-bob",
"queue_time": "32",
"handle_time": "213"
}
We'll use this data in Week 5 to build real-time dashboards and analytics.
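It's worth sanity-checking a CDR before trusting it in reports. The sketch below re-derives three numbers from the sample record above. It assumes `rate` is a per-minute price billed per second, which is my reading of the sample values and not something the record itself states.

```python
from datetime import datetime
from decimal import Decimal, ROUND_HALF_UP

# Fields copied from the sample CDR above.
cdr = {
    "date_start": "2024-01-15T09:00:00Z",
    "date_end": "2024-01-15T09:04:05Z",
    "duration": "245",
    "rate": "0.00450",
    "price": "0.01838",
    "queue_time": "32",
    "handle_time": "213",
}


def parse_ts(value: str) -> datetime:
    # fromisoformat() only accepts a trailing "Z" from Python 3.11 onwards.
    return datetime.fromisoformat(value.replace("Z", "+00:00"))


# 1. Duration should equal end minus start.
elapsed = int((parse_ts(cdr["date_end"]) - parse_ts(cdr["date_start"])).total_seconds())

# 2. Queue time plus handle time should account for the whole call.
accounted = int(cdr["queue_time"]) + int(cdr["handle_time"])

# 3. Price should equal rate x duration in minutes (assumed billing model).
expected_price = (
    Decimal(cdr["rate"]) * Decimal(cdr["duration"]) / Decimal(60)
).quantize(Decimal("0.00001"), rounding=ROUND_HALF_UP)

print(elapsed, accounted, expected_price)
```

Using `Decimal` instead of `float` avoids rounding surprises when you compare money values.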
🔄 End-to-End Architecture: Putting It All Together
Now let's trace a complete, realistic scenario from start to finish. A customer calls with a billing dispute:
STEP 1: CALL ARRIVES
━━━━━━━━━━━━━━━━━━━━
Customer dials +44 20 7946 0123
│
▼
Vonage receives call on virtual number
│
▼
POST /webhooks/answer fired to your app
│
▼
Your app queries CRM: "Is this a known customer?"
│
YES — VIP customer identified
│
▼
NCCO returned: Skip standard IVR, play VIP greeting
STEP 2: IVR / INTENT DETECTION
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
NCCO: "Welcome back, Sarah. How can we help you today?"
│
▼
Customer speaks: "I have a problem with my bill."
│
▼
Vonage ASR detects speech → sends to NLU
│
▼
Intent: BILLING_DISPUTE, Sentiment: NEGATIVE
│
▼
Routing decision: needs BILLING_SENIOR skill
STEP 3: ROUTING
━━━━━━━━━━━━━━━
Query agent pool: who has BILLING_SENIOR skill?
│
▼
Agent Bob — Available ✅
Agent Carol — Busy ❌
Agent Dave — Wrap-up ❌
│
▼
Assign to Agent Bob
STEP 4: AGENT NOTIFICATION
━━━━━━━━━━━━━━━━━━━━━━━━━━
Vonage Client SDK fires call:received on Bob's browser
│
▼
Agent desktop shows:
• Customer: Sarah Johnson
• Account value: £4,200/yr (VIP)
• Intent: Billing dispute
• Sentiment: Negative ⚠️
• Previous contacts: 2 (last: billing, 3 months ago)
• Suggested action: "Offer goodwill credit"
│
▼
Bob clicks "Answer"
STEP 5: CONNECTED
━━━━━━━━━━━━━━━━━
WebRTC audio channel established
Both parties connected
Recording starts (with consent tone)
Real-time transcription begins → feeds AI assistant
STEP 6: RESOLUTION & POST-CALL
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Call ends (Bob clicks "End")
│
▼
Vonage fires POST /webhooks/event {status: "completed"}
│
├──► CDR written to database
├──► Recording stored in object storage
├──► CRM updated with call outcome
├──► CSAT survey SMS sent to Sarah
└──► Analytics dashboard updated in real time
⚡ Architecture Decision: Stateless vs Stateful
One of the most important architectural decisions you'll make is how your application manages state during a call.
Option A: Stateless (Recommended for production)
┌──────────────────────────────────────────────────────────────┐
│ STATELESS ARCHITECTURE │
│ │
│ Each webhook request carries the call UUID you need │
│ No in-memory state: all state comes from Vonage or the DB │
│ │
│ Webhook arrives │
│ │ │
│ ▼ │
│ Extract call UUID from request │
│ │ │
│ ▼ │
│ Fetch state from Redis/DB using UUID │
│ │ │
│ ▼ │
│ Apply business logic │
│ │ │
│ ▼ │
│ Return NCCO / 200 OK │
│ │
│ ✅ Scales horizontally — run 10 instances, no problem │
│ ✅ Survives server restarts │
│ ✅ Easy to debug — each request is self-contained │
└──────────────────────────────────────────────────────────────┘
Option B: Stateful (Quick to start, but fragile)
┌──────────────────────────────────────────────────────────────┐
│ STATEFUL ARCHITECTURE │
│ │
│ Call state stored in application memory (Map, object) │
│ │
│ // In-memory call store │
│ const calls = new Map(); │
│ calls.set(callUUID, { step: 'ivr', customer: 'sarah' }); │
│ │
│ ❌ Fails if server restarts mid-call │
│ ❌ Doesn't scale — webhook must hit same instance │
│ ❌ Memory leak risk on high volume │
│ ⚠️ OK for local dev / prototypes only │
└──────────────────────────────────────────────────────────────┘
💡 Recommendation: Use Redis as your state store from Day 1. It's fast enough for real-time call state, and because state lives outside your app process, it survives application restarts.
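Here's what the stateless pattern might look like in code. The sketch uses a tiny in-memory stand-in with the same `get`/`set` shape as a Redis client so it runs anywhere; the key format, state fields, and function names are assumptions made for the example.

```python
import json


class InMemoryStore:
    """Test double with the same get/set shape as a Redis client."""

    def __init__(self):
        self._data = {}

    def get(self, key):
        return self._data.get(key)

    def set(self, key, value):
        self._data[key] = value


def handle_webhook(store, payload: dict) -> dict:
    """Load state by call UUID, apply the event, and write the state back."""
    key = f"call:{payload['uuid']}"
    raw = store.get(key)
    state = json.loads(raw) if raw else {"step": "ivr", "events": 0}
    state["events"] += 1
    if payload.get("status") == "answered":
        state["step"] = "connected"
    store.set(key, json.dumps(state))
    return state


store = InMemoryStore()
handle_webhook(store, {"uuid": "u1", "status": "ringing"})
print(handle_webhook(store, {"uuid": "u1", "status": "answered"}))
```

Nothing about the call lives in the handler's process, so any instance behind the load balancer can serve the next webhook for the same call.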
🌐 Deployment Architecture
Where do you run your application backend? Here are the three common patterns:
PATTERN 1: SINGLE SERVER (Dev/Small scale)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Vonage ──► Your VPS/EC2 ──► App + Redis
Simple but single point of failure
PATTERN 2: CONTAINERISED (Recommended)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Vonage ──► Load Balancer
│
┌──────┼──────┐
▼ ▼ ▼
App 1 App 2 App 3 (stateless)
│ │ │
└──────┼──────┘
▼
Redis
(shared state)
Scales horizontally, survives instance failure
PATTERN 3: SERVERLESS (Event-driven)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Vonage ──► API Gateway ──► Lambda/Cloud Functions
│
DynamoDB / Redis
Pay-per-call, scales automatically (within provider concurrency limits)
Watch for cold start latency (must respond in 3s!)
🔒 Security Architecture Overview
Security sits across every zone. Here's where each control lives:
┌──────────────────────────────────────────────────────────────┐
│ SECURITY CONTROLS BY ZONE │
│ │
│ Zone 1 (Channels) │
│ • Number masking for privacy │
│ • TLS 1.2+ for all channel data in transit │
│ │
│ Zone 2 (Vonage Network) │
│ • SRTP for encrypted audio │
│ • Vonage can sign webhooks (JWT / signature header) │
│ │
│ Zone 3 (Your App) │
│ • Validate Vonage webhook signatures on every request │
│ • Store API keys in environment variables / secrets mgr │
│ • HTTPS-only endpoints │
│ │
│ Zone 4 (AI Layer) │
│ • PCI: Pause recording during card number input │
│ • GDPR: Consent before recording │
│ │
│ Zone 5 (Agent Layer) │
│ • JWT-based agent authentication │
│ • Role-based access control (RBAC) │
│ • Screen recording audit logs │
│ │
│ Zone 6 (Data Layer) │
│ • Encryption at rest for recordings │
│ • Retention policies (auto-delete after N days) │
│ • Audit trail for all data access │
└──────────────────────────────────────────────────────────────┘
We'll implement all of these in Week 6. For now, just know where they belong.
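As a preview of the Zone 3 signature check, here's a standard-library sketch of verifying an HS256-signed JWT and comparing a body hash. Two assumptions to flag: that the webhook JWT is signed with HS256 using your signature secret, and that it carries a `payload_hash` claim holding the SHA-256 hex digest of the request body. Confirm both against Vonage's signed-webhook docs, and prefer a maintained JWT library in production.

```python
import base64
import hashlib
import hmac
import json


def _b64url_decode(part: str) -> bytes:
    return base64.urlsafe_b64decode(part + "=" * (-len(part) % 4))


def _b64url_encode(raw: bytes) -> str:
    return base64.urlsafe_b64encode(raw).rstrip(b"=").decode()


def sign_hs256(claims: dict, secret: str) -> str:
    """Demo helper: build a token so the verifier can be exercised locally."""
    header = _b64url_encode(json.dumps({"alg": "HS256", "typ": "JWT"}).encode())
    body = _b64url_encode(json.dumps(claims).encode())
    sig = hmac.new(secret.encode(), f"{header}.{body}".encode(), hashlib.sha256)
    return f"{header}.{body}.{_b64url_encode(sig.digest())}"


def verify_hs256(token: str, secret: str, raw_body: bytes) -> bool:
    """Check the signature, then check the body hash claim (assumed name)."""
    try:
        header_b64, claims_b64, sig_b64 = token.split(".")
        header = json.loads(_b64url_decode(header_b64))
        claims = json.loads(_b64url_decode(claims_b64))
        signature = _b64url_decode(sig_b64)
    except ValueError:  # bad segment count, bad base64, or bad JSON
        return False
    if not isinstance(header, dict) or header.get("alg") != "HS256":
        return False
    if not isinstance(claims, dict):
        return False
    expected = hmac.new(
        secret.encode(), f"{header_b64}.{claims_b64}".encode(), hashlib.sha256
    ).digest()
    if not hmac.compare_digest(expected, signature):
        return False
    return claims.get("payload_hash") == hashlib.sha256(raw_body).hexdigest()


body = b'{"status": "completed"}'
token = sign_hs256({"payload_hash": hashlib.sha256(body).hexdigest()}, "s3cret")
print(verify_hs256(token, "s3cret", body))
```

Note the use of `hmac.compare_digest`, which compares in constant time, and the explicit `alg` check, which rejects tokens that claim a different algorithm.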
📐 Designing for Scale: Numbers to Know
Before you design, know your scale targets. Here are rough reference tiers for sizing a Vonage-based architecture:
| Metric | Small (Startup) | Medium (SME) | Large (Enterprise) |
|---|---|---|---|
| Concurrent calls | < 50 | 50–500 | 500–5,000+ |
| Agents | 5–20 | 20–200 | 200–2,000+ |
| Webhook response budget | 3 seconds | 3 seconds | 3 seconds |
| Recommended state store | Redis single | Redis cluster | Redis cluster + read replicas |
| Recommended deploy | Single container | K8s / ECS | K8s multi-region |
| CDR storage | PostgreSQL | PostgreSQL + S3 | Data warehouse (Snowflake / BigQuery) |
🧪 Architecture Anti-Patterns to Avoid
Learn from common mistakes before you make them:
❌ ANTI-PATTERN 1: Synchronous CRM lookup in webhook handler
Webhook fires → you query Salesforce → Salesforce takes 4s
Result: Vonage times out, call fails
✅ FIX: Cache CRM data in Redis (TTL 5 min)
or query async + use default NCCO if not ready
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
❌ ANTI-PATTERN 2: One webhook URL for everything
POST /webhook handles ALL events
Complex if/else logic, hard to debug
✅ FIX: Separate routes per event type
/webhooks/answer
/webhooks/event
/webhooks/recording
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
❌ ANTI-PATTERN 3: Storing recordings on your app server
Recordings fill disk, no redundancy, not CDN-cached
✅ FIX: Stream recordings directly to S3/GCS on receipt
Use presigned URLs for agent access
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
❌ ANTI-PATTERN 4: No fallback webhook URL
Your server goes down mid-call
Vonage has nowhere to send events → call hangs
✅ FIX: Always configure a fallback URL
Use a secondary server or a simple Lambda
that queues events for retry
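The fix for anti-pattern 1 can be sketched as a small TTL cache with a default greeting on a miss. This version is in-memory so it runs without dependencies; in the stack described here the cache would live in Redis. The class, the greeting text, and the injectable clock are all choices made for the example.

```python
import time


class TTLCache:
    """Minimal cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so expiry can be tested without sleeping
        self._items = {}

    def get(self, key):
        entry = self._items.get(key)
        if entry is None:
            return None
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._items[key]
            return None
        return value

    def set(self, key, value):
        self._items[key] = (value, self._clock() + self._ttl)


def greeting_for(caller: str, cache: TTLCache) -> str:
    """Never call the CRM inline: use cached data or fall back to a default."""
    customer = cache.get(caller)
    if customer is None:
        return "Welcome. Please hold for the next available agent."
    return f"Welcome back, {customer['name']}."


cache = TTLCache(ttl_seconds=300)  # 5 minutes, as suggested above
cache.set("447700900001", {"name": "Sarah"})
print(greeting_for("447700900001", cache))
```

On a miss you'd also kick off a background CRM fetch to warm the cache, so the next webhook for that caller gets the personalised path.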
🗂️ Technology Stack Reference
Here's a recommended stack for building on Vonage. We'll use this throughout the series:
┌──────────────────────────────────────────────────────────────┐
│ RECOMMENDED TECHNOLOGY STACK │
│ │
│ Backend Language │ Node.js 20 (Express or Fastify) │
│ │ Python 3.12 (FastAPI) │
│ │ (examples in both throughout series) │
│ │ │
│ State Store │ Redis 7 │
│ │ │
│ Database │ PostgreSQL 16 (for CDRs, records) │
│ │ │
│ Message Queue │ Redis Pub/Sub or BullMQ │
│ │ (async post-call processing) │
│ │ │
│ Object Storage │ AWS S3 or GCS │
│ │ (recordings, exports) │
│ │ │
│ Agent Desktop │ React 18 + Vonage Client SDK │
│ │ │
│ Infrastructure │ Docker + docker-compose (local dev) │
│ │ Kubernetes or ECS (production) │
│ │ │
│ CI/CD │ GitHub Actions │
│ │ │
│ Local Tunnelling │ ngrok (Vonage webhooks in local dev) │
└──────────────────────────────────────────────────────────────┘
✅ Architecture Checklist
Before starting to build, run through this checklist:
ARCHITECTURE PLANNING CHECKLIST
Channels
□ Which channels do you need? (voice, SMS, chat, WhatsApp)
□ Do you need omnichannel (unified history across channels)?
Routing
□ What skills/queues do you need?
□ Do you need time-of-day routing?
□ What's the overflow strategy (queue, voicemail, callback)?
IVR / Automation
□ DTMF menus or natural language?
□ What can be self-served without an agent?
□ What's the escalation path?
Application
□ What language/framework for the backend?
□ Where are webhook handlers deployed?
□ What's your state store? (Redis recommended)
□ Fallback webhook URL configured?
Agent Experience
□ Custom agent desktop or existing CRM embed?
□ What context does the agent need at call answer?
□ What post-call wrap-up workflow is needed?
Data & Analytics
□ Where are CDRs stored?
□ What real-time metrics do supervisors need?
□ What's the recording retention policy?
Security
□ Webhook signature validation in place?
□ API keys in secrets manager (not code)?
□ Recording encryption and access control?
□ GDPR consent flow for recordings?
🚀 What's Next
In Day 3, we get hands-on. We'll:
- Create your Vonage API account
- Install the Vonage CLI
- Buy a virtual phone number
- Set up ngrok for local webhook development
- Deploy a working "Hello World" call handler that answers a real phone call
By the end of Day 3 you'll be able to call a real phone number and have your code answer it.
💬 Discussion
Now that you've seen the full architecture:
- Which zone are you most interested in building first?
- Are you building for a startup, migrating from an on-premise PBX, or adding call centre capabilities to an existing product?
- Any architectural patterns from your own experience you'd add?
Drop a comment below — your questions shape the depth of future posts.
📌 Bookmark this post — it's your architecture reference for the entire series.
🔔 Follow for Day 3: Setup from Scratch — tomorrow.
Series: Building Cloud Call Centres with Vonage APIs
Day 2 of 30