Shashi Kiran

Series: Building Cloud Call Centres with Vonage APIs — Day 2 of 30

Cloud Call Centre Architecture Deep Dive with Vonage

Day 1: What is a Cloud Call Centre? | Day 3: Setup from Scratch


🎯 What You'll Learn Today

By the end of this post, you'll be able to:

  • Explain every architectural component of a Vonage-powered cloud call centre
  • Understand how data flows between components in real time
  • Identify where your application code plugs into the Vonage platform
  • Make informed design decisions before writing a single line of code
  • Spot common architectural mistakes before they cost you in production

This is the post you'll come back to as a reference throughout the series.


🗺️ The 10,000-Foot View

Before going deep, let's establish the complete picture. A production cloud call centre built on Vonage has six architectural zones:

┌─────────────────────────────────────────────────────────────────────┐
│                  VONAGE CLOUD CALL CENTRE — ZONES                   │
│                                                                     │
│  ┌───────────────────────────────────────────────────────────────┐  │
│  │  ZONE 1: CUSTOMER-FACING CHANNELS                            │  │
│  │  Voice · SMS · WhatsApp · Web Chat · Email · Video           │  │
│  └───────────────────────────┬───────────────────────────────────┘  │
│                              │                                      │
│  ┌───────────────────────────▼───────────────────────────────────┐  │
│  │  ZONE 2: VONAGE NETWORK LAYER                                │  │
│  │  PSTN Gateway · SIP Trunking · WebRTC Bridge · Number Pool   │  │
│  └───────────────────────────┬───────────────────────────────────┘  │
│                              │                                      │
│  ┌───────────────────────────▼───────────────────────────────────┐  │
│  │  ZONE 3: YOUR APPLICATION LAYER  ◄── YOU BUILD THIS          │  │
│  │  Webhook Handlers · NCCO Logic · Business Rules · State       │  │
│  └────────┬──────────────────┬──────────────────┬────────────────┘  │
│           │                  │                  │                   │
│  ┌────────▼────────┐ ┌───────▼───────┐ ┌───────▼────────┐         │
│  │  ZONE 4: AI &   │ │  ZONE 5:      │ │  ZONE 6:       │         │
│  │  AUTOMATION     │ │  AGENT LAYER  │ │  DATA LAYER    │         │
│  │                 │ │               │ │                │         │
│  │ IVR · Bots      │ │ Desktop · CRM │ │ CDRs · Reports │         │
│  │ Transcription   │ │ Skills Routing│ │ Analytics · DB │         │
│  │ Sentiment       │ │ Queuing       │ │ Compliance     │         │
│  └─────────────────┘ └───────────────┘ └────────────────┘         │
└─────────────────────────────────────────────────────────────────────┘

Now let's go deep into each zone.


🔵 Zone 1: Customer-Facing Channels

This is where your customers actually reach you. Vonage supports several distinct channel types, all of which can be unified into a single routing and conversation system.

                    CUSTOMER ENTRY POINTS
                           │
        ┌──────────────────┼──────────────────┐
        │                  │                  │
        ▼                  ▼                  ▼
  ┌──────────┐      ┌──────────┐      ┌──────────┐
  │  📞 PSTN │      │ 💬 Chat  │      │ 📱 Mobile│
  │  Voice   │      │  Web     │      │  Apps    │
  │          │      │  Widget  │      │          │
  │ Dial a   │      │ Embedded │      │ In-app   │
  │ number   │      │ on your  │      │ support  │
  │          │      │ website  │      │          │
  └────┬─────┘      └────┬─────┘      └────┬─────┘
       │                 │                  │
       └─────────────────┼──────────────────┘
                         │
        ┌────────────────┼────────────────┐
        │                │                │
        ▼                ▼                ▼
  ┌──────────┐    ┌──────────┐    ┌──────────┐
  │ 📲 SMS   │    │📱WhatsApp│    │ 📧 Email │
  │          │    │          │    │          │
  │ Two-way  │    │ Business │    │ Inbound  │
  │ text     │    │ messaging│    │ support  │
  │          │    │          │    │ tickets  │
  └────┬─────┘    └────┬─────┘    └────┬─────┘
       │               │               │
       └───────────────┼───────────────┘
                       │
                       ▼
              VONAGE API LAYER
              (Zone 2 receives
               all of the above)

Key insight: Channels are just data

From an architectural perspective, every channel above eventually becomes a Vonage Conversation: a unified data structure that holds messages, participants, and events regardless of which channel they came from. This is the superpower of Vonage's design. More on this in the Conversation API section below.


🔵 Zone 2: The Vonage Network Layer

This is Vonage's infrastructure — the part you don't manage directly but absolutely need to understand.

┌──────────────────────────────────────────────────────────────┐
│                  VONAGE NETWORK LAYER                        │
│                                                              │
│  ┌─────────────────┐     ┌─────────────────────────────┐   │
│  │  PSTN GATEWAY   │     │     WEBRTC BRIDGE           │   │
│  │                 │     │                             │   │
│  │  Converts       │     │  Converts browser/app       │   │
│  │  traditional    │     │  audio (WebRTC) ◄──►        │   │
│  │  phone calls    │     │  traditional telephony      │   │
│  │  to IP packets  │     │  (RTP/SIP)                  │   │
│  └────────┬────────┘     └──────────────┬──────────────┘   │
│           │                             │                    │
│           └──────────────┬──────────────┘                    │
│                          │                                   │
│                 ┌────────▼────────┐                          │
│                 │  MEDIA SERVER   │                          │
│                 │                 │                          │
│                 │ • Audio mixing  │                          │
│                 │ • Recording     │                          │
│                 │ • DTMF detect   │                          │
│                 │ • Transcoding   │                          │
│                 └────────┬────────┘                          │
│                          │                                   │
│  ┌─────────────────┐     │     ┌─────────────────────────┐  │
│  │  NUMBER POOL    │     │     │   SIP TRUNKING          │  │
│  │                 │     │     │                         │  │
│  │  Virtual phone  │     │     │  Connect your existing  │  │
│  │  numbers in     │─────┘     │  PBX or SIP provider    │  │
│  │  160+ countries │           │  to Vonage              │  │
│  └─────────────────┘           └─────────────────────────┘  │
└──────────────────────────────────────────────────────────────┘

What is a Vonage Virtual Number?

When a customer dials your call centre number, they dial a Vonage virtual number — a phone number hosted in Vonage's platform. This number is linked to your application via a webhook URL. The moment a call arrives, Vonage pings your server.

Customer dials: +44 20 7946 0123
                        │
                        ▼
         Is this a Vonage virtual number?
                        │
                       YES
                        │
                        ▼
         Look up the Answer Webhook URL
         for this number → https://yourapp.com/webhook/answer
                        │
                        ▼
         HTTP request to your server (GET by default, POST if configured)
                        │
                        ▼
         Your server returns NCCO JSON
                        │
                        ▼
         Vonage executes the NCCO actions

This webhook-driven model is fundamental. Vonage never makes decisions — your code does.
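
To make the flow above concrete, here is a minimal sketch of the NCCO an answer webhook might return. The helper name `buildAnswerNcco`, the greeting text, and the `/webhooks/dtmf` URL are assumptions made for illustration; `talk` and `input` are standard NCCO actions.

```javascript
// Hypothetical helper: builds the NCCO (a JSON array of actions) that the
// answer webhook returns to Vonage. Wording and URL are illustrative only.
function buildAnswerNcco({ inputEventUrl }) {
  return [
    {
      action: 'talk',
      text: 'Thanks for calling. Press 1 for sales or 2 for support.',
      bargeIn: true, // let the caller press a key before the prompt finishes
    },
    {
      action: 'input',
      type: ['dtmf'],
      dtmf: { maxDigits: 1, timeOut: 5 },
      eventUrl: [inputEventUrl], // Vonage sends the caller's key press here
    },
  ];
}

const ncco = buildAnswerNcco({
  inputEventUrl: 'https://yourapp.com/webhooks/dtmf',
});
console.log(JSON.stringify(ncco, null, 2));
```

Your server would send this array as the JSON body of its response to the answer webhook.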


🔵 Zone 3: Your Application Layer

This is where you live as a developer. Your application is the brain of the call centre.

┌───────────────────────────────────────────────────────────────────┐
│                   YOUR APPLICATION LAYER                          │
│                                                                   │
│                                                                   │
│  ┌─────────────────────────────────────────────────────────────┐  │
│  │                  WEBHOOK HANDLERS                           │  │
│  │                                                             │  │
│  │  POST /webhooks/answer    ← Vonage fires when a call arrives│  │
│  │  POST /webhooks/event     ← Vonage fires on call events     │  │
│  │  POST /webhooks/fallback  ← Vonage fires on errors          │  │
│  │  POST /webhooks/recording ← Vonage fires when rec. ready    │  │
│  │  POST /webhooks/message   ← Vonage fires on SMS/chat msg    │  │
│  └───────────────────┬─────────────────────────────────────────┘  │
│                      │                                            │
│                      ▼                                            │
│  ┌─────────────────────────────────────────────────────────┐      │
│  │                  BUSINESS LOGIC                         │      │
│  │                                                         │      │
│  │  • Which IVR menu to present?                           │      │
│  │  • Which agent skill is needed?                         │      │
│  │  • Is it within business hours?                         │      │
│  │  • Is this customer a VIP? (CRM lookup)                 │      │
│  │  • Should this call be recorded?                        │      │
│  │  • What language should the IVR speak?                  │      │
│  └───────────────────┬─────────────────────────────────────┘      │
│                      │                                            │
│                      ▼                                            │
│  ┌─────────────────────────────────────────────────────────┐      │
│  │                  NCCO BUILDER                           │      │
│  │                                                         │      │
│  │  Dynamically construct NCCO JSON based on logic above   │      │
│  │  Return it to Vonage as the HTTP response               │      │
│  └───────────────────┬─────────────────────────────────────┘      │
│                      │                                            │
│          ┌───────────┼───────────┐                                │
│          ▼           ▼           ▼                                │
│    Vonage API    Database     External                            │
│    Calls         (State)      Services                            │
│    (outbound)                 (CRM, AI)                           │
└───────────────────────────────────────────────────────────────────┘

The Webhook Request/Response Cycle

Every interaction in a Vonage call centre follows this pattern:

  YOUR SERVER                          VONAGE
       │                                  │
       │                                  │  Customer dials in
       │                                  │◄─────────────────
       │                                  │
       │  POST /webhooks/answer           │
       │◄─────────────────────────────────│
       │  Body: { from, to, uuid, ... }   │
       │                                  │
       │  200 OK + NCCO JSON              │
       │─────────────────────────────────►│
       │                                  │
       │                                  │  Executes NCCO
       │                                  │  (plays greeting,
       │                                  │   records, connects)
       │                                  │
       │  POST /webhooks/event            │
       │◄─────────────────────────────────│
       │  Body: { status: "answered" }    │
       │                                  │
       │  200 OK                          │
       │─────────────────────────────────►│
       │                                  │
       │  POST /webhooks/event            │
       │◄─────────────────────────────────│
       │  Body: { status: "completed",    │
       │          duration: 245 }         │
       │                                  │
       │  200 OK                          │
       │─────────────────────────────────►│

⚠️ Critical rule: Your webhook endpoints must respond within 3 seconds. Vonage will time out and try the fallback URL if you take too long. Keep business logic fast — offload heavy work asynchronously.
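
One way to stay inside that budget is to acknowledge the webhook immediately and push slow work onto a queue. The sketch below uses an in-memory array purely as a stand-in for a durable queue (a Redis list or a message broker, for example). The names `acceptEvent` and `drainJobs` and the job shape are assumptions, not Vonage APIs.

```javascript
// Stand-in for a durable queue. In production this would live outside the
// process so that jobs survive restarts.
const pendingJobs = [];

// Called from the /webhooks/event handler: record the event, return at once.
function acceptEvent(event) {
  pendingJobs.push({ receivedAt: Date.now(), event });
  return { statusCode: 200 };
}

// Called by a background worker: runs slow work outside the request path.
function drainJobs(processJob) {
  let processed = 0;
  while (pendingJobs.length > 0) {
    processJob(pendingJobs.shift());
    processed += 1;
  }
  return processed;
}

const response = acceptEvent({ uuid: 'call-123', status: 'completed', duration: 245 });
console.log('responded with', response.statusCode);
drainJobs((job) => console.log('processing', job.event.uuid));
```

The webhook handler only ever pays for a single `push`, so CRM updates, analytics writes, and similar work cannot push the response past the timeout.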


🔵 The Conversation API: Vonage's Unified Data Model

The Conversation API is the single most important API to understand in this stack. It is the data layer that unifies every channel.

┌──────────────────────────────────────────────────────────────┐
│              CONVERSATION API DATA MODEL                     │
│                                                              │
│                    CONVERSATION                              │
│              ┌─────────────────────┐                        │
│              │  id: CON-abc123      │                        │
│              │  display_name:       │                        │
│              │   "Support - Alice"  │                        │
│              │  state: ACTIVE       │                        │
│              └──────────┬──────────┘                        │
│                         │                                    │
│         ┌───────────────┼────────────────┐                  │
│         ▼               ▼                ▼                  │
│    ┌─────────┐     ┌─────────┐     ┌─────────┐             │
│    │ MEMBER  │     │ MEMBER  │     │ MEMBER  │             │
│    │         │     │         │     │         │             │
│    │Customer │     │  Agent  │     │  Bot    │             │
│    │ Alice   │     │  Bob    │     │ (IVR)   │             │
│    │         │     │         │     │         │             │
│    │ channel:│     │ channel:│     │ channel:│             │
│    │  voice  │     │   app   │     │   app   │             │
│    └────┬────┘     └────┬────┘     └────┬────┘             │
│         │               │               │                   │
│         └───────────────┼───────────────┘                   │
│                         │                                    │
│                         ▼                                    │
│              ┌──────────────────┐                           │
│              │     EVENTS       │                           │
│              │                  │                           │
│              │ • member:joined  │                           │
│              │ • member:left    │                           │
│              │ • audio:say:done │                           │
│              │ • audio:record   │                           │
│              │ • message:submit │                           │
│              └──────────────────┘                           │
└──────────────────────────────────────────────────────────────┘

Why This Matters

The magic of the Conversation API is that a customer can start on web chat, escalate to voice, and both interactions live in the same Conversation object. The agent sees the full history.

SAME CUSTOMER — DIFFERENT CHANNELS — ONE CONVERSATION

  10:00 AM  Customer opens web chat
            "Hi, I have a billing question."
                    │
                    ▼ Conversation CON-abc123 created

  10:04 AM  Customer frustrated, clicks "Call Me."
            Voice call initiated
                    │
                    ▼ Voice MEMBER added to CON-abc123

  10:05 AM  Agent answers
            Sees: "This customer was chatting
                   about billing since 10:00."
                    │
                    ▼ Full context preserved, no repeat needed

  10:12 AM  Issue resolved
            Both chat messages and call recordings
            stored in CON-abc123
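
The timeline above can be sketched as a single object that your agent desktop summarises. The object shape and the `summariseForAgent` helper are illustrative assumptions, not the exact Conversation API response schema.

```javascript
// Illustrative shape only: one conversation holding events from two channels.
const conversation = {
  id: 'CON-abc123',
  events: [
    { time: '10:00', channel: 'chat', type: 'message', text: 'Hi, I have a billing question.' },
    { time: '10:04', channel: 'voice', type: 'member:joined' },
    { time: '10:05', channel: 'app', type: 'member:joined' },
  ],
};

// Hypothetical helper: condenses the history into what an agent needs to see.
function summariseForAgent(conv) {
  const channels = [...new Set(conv.events.map((e) => e.channel))];
  const messages = conv.events
    .filter((e) => e.type === 'message')
    .map((e) => `${e.time} ${e.text}`);
  return { conversationId: conv.id, channels, messages };
}

console.log(summariseForAgent(conversation));
```

Because chat and voice events share one id, the agent view needs a single lookup rather than a join across per-channel stores.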

🔵 Zone 4: The AI & Automation Layer

This zone sits between your application logic and the customer. It handles contacts that don't immediately need a human agent.

┌────────────────────────────────────────────────────────────────┐
│                    AI & AUTOMATION LAYER                       │
│                                                                │
│                         INBOUND CONTACT                        │
│                               │                                │
│                               ▼                                │
│             ┌─────────────────────────────────┐               │
│             │         IVR / BOT LAYER         │               │
│             │                                 │               │
│             │  Can this be resolved without   │               │
│             │  a human agent?                 │               │
│             └────────────────┬────────────────┘               │
│                              │                                 │
│               ┌──────────────┼──────────────┐                 │
│               │              │              │                 │
│               ▼              ▼              ▼                 │
│         ┌──────────┐  ┌──────────┐  ┌──────────┐            │
│         │  DTMF    │  │   NLU    │  │ SELF     │            │
│         │  IVR     │  │   Bot    │  │ SERVICE  │            │
│         │          │  │          │  │          │            │
│         │ Press 1  │  │ "I want  │  │ Balance  │            │
│         │ for Sales│  │  to check│  │ check,   │            │
│         │ Press 2  │  │  my bill"│  │ order    │            │
│         │ for Supp.│  │          │  │ status,  │            │
│         └────┬─────┘  └────┬─────┘  │ FAQs     │            │
│              │             │        └────┬─────┘            │
│              │             │             │                   │
│              └─────────────┼─────────────┘                   │
│                            │                                  │
│                     Resolved?                                 │
│                     │       │                                 │
│                    YES      NO                                │
│                     │       │                                 │
│                     ▼       ▼                                 │
│               End call  Escalate to                           │
│                         human agent                           │
│                         (with context)                        │
└────────────────────────────────────────────────────────────────┘
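
The DTMF branch of that flow might look like the sketch below. The `MENU` mapping and the `handleDtmf` helper are hypothetical, and the payload shape (pressed keys under `dtmf.digits`) is assumed to match what Vonage posts for a DTMF `input` action.

```javascript
// Hypothetical menu: pressed digit -> skill queue.
const MENU = { 1: 'sales', 2: 'support' };

// Decides what happens after the caller presses a key.
function handleDtmf(payload) {
  const digits = (payload.dtmf && payload.dtmf.digits) || '';
  const skill = MENU[digits];
  if (!skill) {
    // Unknown or missing input: tell the caller and stay in the IVR.
    return {
      escalate: false,
      ncco: [{ action: 'talk', text: 'Sorry, that is not a valid option.' }],
    };
  }
  // Valid choice: hand over to a human, carrying the context with the call.
  return {
    escalate: true,
    context: { callUuid: payload.uuid, skill },
    ncco: [{ action: 'talk', text: `Connecting you to ${skill}.` }],
  };
}

console.log(handleDtmf({ uuid: 'call-123', dtmf: { digits: '2' } }));
```

The `context` object is the "with context" part of the escalation: whatever the IVR learned travels with the call instead of being asked again.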

Real-Time AI Assistants

Beyond IVR, Vonage's audio streaming capability allows you to run AI models during a live call:

                    LIVE CALL AI PIPELINE

  Customer speaking ──► Vonage Media Server
                              │
                              │ Audio stream (WebSocket)
                              ▼
                      Your App / ASR Service
                       (e.g., Google STT,
                        AWS Transcribe)
                              │
                              │ Text transcript
                              ▼
                        NLU / LLM Layer
                     (Intent detection,
                      sentiment scoring,
                      suggested responses)
                              │
                              │ Insights via WebSocket
                              ▼
                       Agent Desktop UI
                     (Shows agent: "Customer
                      seems frustrated,
                      suggest discount offer")
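
The last hop of that pipeline, turning a transcript fragment into a message for the agent desktop, can be sketched as follows. The keyword list is a deliberately naive placeholder for a real NLU or LLM call, and `buildAgentInsight` and the message shape are assumptions.

```javascript
// Placeholder scoring: a real system would call an NLU or LLM service here.
const NEGATIVE_WORDS = ['problem', 'frustrated', 'cancel', 'wrong'];

// Hypothetical helper: builds the insight message pushed to the agent UI.
function buildAgentInsight(transcript) {
  const words = transcript.toLowerCase().split(/\W+/);
  const matched = words.filter((w) => NEGATIVE_WORDS.includes(w));
  const sentiment = matched.length > 0 ? 'NEGATIVE' : 'NEUTRAL';
  return {
    type: 'insight',
    sentiment,
    matched,
    suggestion: sentiment === 'NEGATIVE' ? 'Acknowledge the issue and offer help' : null,
  };
}

console.log(buildAgentInsight('I have a problem with my bill.'));
```

Keeping the output a small JSON message means the scoring logic can be swapped for a real model without touching the agent desktop.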

🔵 Zone 5: The Agent Layer

The agent layer is what your agents interact with every day. Understanding its components is critical to designing a good agent experience.

┌──────────────────────────────────────────────────────────────┐
│                     AGENT LAYER                              │
│                                                              │
│  ┌─────────────────────────────────────────────────────┐    │
│  │                 AGENT DESKTOP                       │    │
│  │                                                     │    │
│  │  ┌──────────────┐  ┌──────────────┐  ┌──────────┐  │    │
│  │  │  SOFTPHONE   │  │  CUSTOMER    │  │  QUEUE   │  │    │
│  │  │              │  │  CONTEXT     │  │  STATUS  │  │    │
│  │  │ Answer/Hold  │  │              │  │          │  │    │
│  │  │ Transfer     │  │ Name, account│  │ Waiting: │  │    │
│  │  │ Conference   │  │ history,     │  │    4     │  │    │
│  │  │ Mute/unmute  │  │ previous     │  │ Avg wait:│  │    │
│  │  │              │  │ contacts,    │  │  1m 20s  │  │    │
│  │  │ [WebRTC]     │  │ open tickets │  │          │  │    │
│  │  └──────────────┘  └──────────────┘  └──────────┘  │    │
│  └──────────────────────────┬──────────────────────────┘    │
│                             │                                │
│  ┌──────────────────────────▼────────────────────────────┐   │
│  │              VONAGE CLIENT SDK                        │   │
│  │                                                       │   │
│  │  Handles all WebRTC audio in the browser              │   │
│  │  Listens for incoming calls                           │   │
│  │  Emits events: call:received, call:answered, etc.     │   │
│  └──────────────────────────┬────────────────────────────┘   │
│                             │                                 │
│           ┌─────────────────┼─────────────────┐              │
│           ▼                 ▼                 ▼              │
│    ┌─────────────┐  ┌─────────────┐  ┌─────────────┐        │
│    │  ROUTING    │  │    SKILLS   │  │   AGENT     │        │
│    │  ENGINE     │  │  MANAGEMENT │  │   STATUS    │        │
│    │             │  │             │  │             │        │
│    │ Which agent │  │ Define skill│  │ Available   │        │
│    │ gets this   │  │ groups:     │  │ Busy        │        │
│    │ contact?    │  │ • Billing   │  │ Wrap-up     │        │
│    │             │  │ • Technical │  │ Offline     │        │
│    │ Priority,   │  │ • Spanish   │  │             │        │
│    │ skills,     │  │ • VIP       │  │             │        │
│    │ availability│  │             │  │             │        │
│    └─────────────┘  └─────────────┘  └─────────────┘        │
└──────────────────────────────────────────────────────────────┘

Skills-Based Routing Flow

  INBOUND CONTACT ARRIVES
           │
           ▼
  What skill does this contact need?
  (determined by IVR input or channel)
           │
    ┌──────┴──────┐
    │             │
    ▼             ▼
  Billing      Technical
  Support      Support
    │             │
    ▼             ▼
  Find available agent    Find available agent
  with BILLING skill      with TECHNICAL skill
    │                         │
    ▼                         ▼
  Agent found?           Agent found?
  │         │            │         │
 YES        NO          YES        NO
  │         │            │         │
  ▼         ▼            ▼         ▼
Connect   Queue      Connect    Queue
 now     customer    now      customer
          │                      │
          ▼                      ▼
     Wait for the next      Offer callback
     available agent        or voicemail
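
The routing decision above reduces to a small function. This is a sketch under assumed data shapes: the agent objects, the status strings, and `routeContact` are all hypothetical.

```javascript
// Picks the first available agent with the required skill, else queues.
function routeContact(requiredSkill, agentPool) {
  const agent = agentPool.find(
    (a) => a.status === 'available' && a.skills.includes(requiredSkill)
  );
  return agent
    ? { action: 'connect', agentId: agent.id }
    : { action: 'queue', skill: requiredSkill };
}

const agents = [
  { id: 'agent-bob', skills: ['BILLING'], status: 'available' },
  { id: 'agent-carol', skills: ['BILLING', 'TECHNICAL'], status: 'busy' },
  { id: 'agent-dave', skills: ['TECHNICAL'], status: 'wrap-up' },
];

console.log(routeContact('BILLING', agents));
console.log(routeContact('TECHNICAL', agents));
```

A production router would also weigh priority and wait time, but the skill-plus-availability filter is the core of it.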

🔵 Zone 6: The Data Layer

Every interaction generates data. This zone is responsible for storing, processing, and surfacing that data.

┌──────────────────────────────────────────────────────────────┐
│                      DATA LAYER                              │
│                                                              │
│  DATA SOURCES                                                │
│  ┌───────────┐  ┌───────────┐  ┌───────────┐  ┌─────────┐  │
│  │ Call      │  │ Vonage    │  │ Agent     │  │ CSAT    │  │
│  │ Events    │  │ Reports   │  │ Activity  │  │ Surveys │  │
│  │ (webhooks)│  │ API (CDRs)│  │ (status)  │  │         │  │
│  └─────┬─────┘  └─────┬─────┘  └─────┬─────┘  └────┬────┘  │
│        │              │              │              │        │
│        └──────────────┼──────────────┴──────────────┘        │
│                       │                                      │
│                       ▼                                      │
│  ┌────────────────────────────────────────────────────────┐  │
│  │                 DATA PIPELINE                          │  │
│  │                                                        │  │
│  │  Ingest ──► Normalise ──► Store ──► Query ──► Visualise│  │
│  └────────────────────────────────────────────────────────┘  │
│                       │                                      │
│         ┌─────────────┼─────────────┐                       │
│         ▼             ▼             ▼                       │
│  ┌─────────────┐ ┌─────────┐ ┌──────────────┐             │
│  │  REAL-TIME  │ │HISTORICAL│ │  COMPLIANCE  │             │
│  │  DASHBOARD  │ │REPORTING │ │  & AUDIT     │             │
│  │             │ │          │ │              │             │
│  │ Live queue  │ │ Daily CDR│ │ Call records │             │
│  │ Agent status│ │ Trends   │ │ Retention    │             │
│  │ SLA alerts  │ │ CSAT avg │ │ GDPR logs    │             │
│  │ Active calls│ │ Handle   │ │ PCI masking  │             │
│  │             │ │   time   │ │              │             │
│  └─────────────┘ └─────────┘ └──────────────┘             │
└──────────────────────────────────────────────────────────────┘

What's in a CDR?

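A Call Detail Record (CDR), built from Vonage's Reports API data and enriched by your application, looks something like this (fields such as agent_id, queue_time, and handle_time are call-centre data your own app would typically add):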

{
  "record_index": 0,
  "call_uuid": "63f61863-4a51-4f6b-86e1-46edebio0391",
  "conversation_uuid": "CON-abc123",
  "direction": "inbound",
  "from": "+441234567890",
  "to": "+442079460123",
  "date_start": "2024-01-15T09:00:00Z",
  "date_end": "2024-01-15T09:04:05Z",
  "duration": "245",
  "rate": "0.00450",
  "price": "0.01838",
  "network": "GB-FIXED",
  "status": "completed",
  "recording_url": "https://api.nexmo.com/v1/files/...",
  "agent_id": "agent-bob",
  "queue_time": "32",
  "handle_time": "213"
}

We'll use this data in Week 5 to build real-time dashboards and analytics.
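
Before feeding CDRs into a dashboard, it is worth sanity-checking them. The sketch below runs three consistency checks on the sample record above. `checkCdr` is a hypothetical helper, and treating `rate` as a per-minute price is an assumption (the sample numbers happen to be consistent with it).

```javascript
// Subset of the sample CDR shown above.
const cdr = {
  date_start: '2024-01-15T09:00:00Z',
  date_end: '2024-01-15T09:04:05Z',
  duration: '245',
  rate: '0.00450',
  price: '0.01838',
  queue_time: '32',
  handle_time: '213',
};

// Hypothetical helper: cross-checks the fields of one record.
function checkCdr(record) {
  const seconds = (Date.parse(record.date_end) - Date.parse(record.date_start)) / 1000;
  const expectedPrice = Number(record.rate) * (seconds / 60); // assumes per-minute rate
  return {
    durationMatches: seconds === Number(record.duration),
    splitMatches:
      Number(record.queue_time) + Number(record.handle_time) === Number(record.duration),
    priceMatches: Math.abs(expectedPrice - Number(record.price)) < 0.00001,
  };
}

console.log(checkCdr(cdr));
```

Note that the numeric fields arrive as strings in the sample, so convert them before doing any arithmetic.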


🔄 End-to-End Architecture: Putting It All Together

Now let's trace a complete, realistic scenario from start to finish. A customer calls with a billing dispute:

STEP 1: CALL ARRIVES
━━━━━━━━━━━━━━━━━━━━
Customer dials +44 20 7946 0123
        │
        ▼
Vonage receives call on virtual number
        │
        ▼
POST /webhooks/answer fired to your app
        │
        ▼
Your app queries CRM: "Is this a known customer?"
        │
       YES — VIP customer identified
        │
        ▼
NCCO returned: Skip standard IVR, play VIP greeting


STEP 2: IVR / INTENT DETECTION
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
NCCO: "Welcome back, Sarah. How can we help you today?"
        │
        ▼
Customer speaks: "I have a problem with my bill."
        │
        ▼
Vonage ASR detects speech → sends to NLU
        │
        ▼
Intent: BILLING_DISPUTE, Sentiment: NEGATIVE
        │
        ▼
Routing decision: needs BILLING_SENIOR skill


STEP 3: ROUTING
━━━━━━━━━━━━━━━
Query agent pool: who has BILLING_SENIOR skill?
        │
        ▼
Agent Bob — Available ✅
Agent Carol — Busy ❌
Agent Dave — Wrap-up ❌
        │
        ▼
Assign to Agent Bob


STEP 4: AGENT NOTIFICATION
━━━━━━━━━━━━━━━━━━━━━━━━━━
Vonage Client SDK fires call:received on Bob's browser
        │
        ▼
Agent desktop shows:
  • Customer: Sarah Johnson
  • Account value: £4,200/yr (VIP)
  • Intent: Billing dispute
  • Sentiment: Negative ⚠️
  • Previous contacts: 2 (last: billing, 3 months ago)
  • Suggested action: "Offer goodwill credit"
        │
        ▼
Bob clicks "Answer"


STEP 5: CONNECTED
━━━━━━━━━━━━━━━━━
WebRTC audio channel established
Both parties connected
Recording starts (with consent tone)
Real-time transcription begins → feeds AI assistant


STEP 6: RESOLUTION & POST-CALL
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Call ends (Bob clicks "End")
        │
        ▼
Vonage fires POST /webhooks/event {status: "completed"}
        │
        ├──► CDR written to database
        ├──► Recording stored in object storage
        ├──► CRM updated with call outcome
        ├──► CSAT survey SMS sent to Sarah
        └──► Analytics dashboard updated in real time
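
The fan-out in Step 6 can be expressed as a function that maps a call event to a list of follow-up jobs. The job names and `planPostCallJobs` are hypothetical; each job would be handed to whatever background worker you use.

```javascript
// Hypothetical job names, one per arrow in Step 6.
const POST_CALL_JOBS = [
  'write_cdr',
  'store_recording',
  'update_crm',
  'send_csat_sms',
  'refresh_dashboard',
];

// Only a completed call triggers post-call work; other events produce nothing.
function planPostCallJobs(event) {
  if (event.status !== 'completed') return [];
  return POST_CALL_JOBS.map((name) => ({ name, callUuid: event.uuid }));
}

console.log(planPostCallJobs({ uuid: 'call-123', status: 'completed' }));
```

Keeping each job independent means a failed CSAT SMS does not block the CDR write.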

⚡ Architecture Decision: Stateless vs Stateful

One of the most important architectural decisions you'll make is how your application manages state during a call.

Option A: Stateless (Recommended)

┌──────────────────────────────────────────────────────────────┐
│  STATELESS ARCHITECTURE                                      │
│                                                              │
│  Each webhook request contains everything you need           │
│  No in-memory state — all state from Vonage or DB            │
│                                                              │
│  Webhook arrives                                             │
│      │                                                       │
│      ▼                                                       │
│  Extract call UUID from request                              │
│      │                                                       │
│      ▼                                                       │
│  Fetch state from Redis/DB using UUID                        │
│      │                                                       │
│      ▼                                                       │
│  Apply business logic                                        │
│      │                                                       │
│      ▼                                                       │
│  Return NCCO / 200 OK                                        │
│                                                              │
│  ✅ Scales horizontally — run 10 instances, no problem       │
│  ✅ Survives server restarts                                  │
│  ✅ Easy to debug — each request is self-contained           │
└──────────────────────────────────────────────────────────────┘

Option B: Stateful (Simpler but fragile)

┌──────────────────────────────────────────────────────────────┐
│  STATEFUL ARCHITECTURE                                       │
│                                                              │
│  Call state stored in application memory (Map, object)       │
│                                                              │
│  // In-memory call store                                     │
│  const calls = new Map();                                    │
│  calls.set(callUUID, { step: 'ivr', customer: 'sarah' });    │
│                                                              │
│  ❌ Fails if server restarts mid-call                        │
│  ❌ Doesn't scale: webhook must hit same instance           │
│  ❌ Memory leak risk on high volume                           │
│  ⚠️  OK for local dev / prototypes only                      │
└──────────────────────────────────────────────────────────────┘

💡 Recommendation: Use Redis as your state store from Day 1. It's fast enough for real-time call state and survives restarts.
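
Here is a minimal sketch of the stateless pattern. To keep it runnable without a server, a Map-backed fake stands in for Redis behind a small async `get`/`set` interface. `createMemoryStore`, `handleEvent`, the key format, and the state shape are all assumptions.

```javascript
// Fake store with the same async get/set shape you would wrap around Redis.
function createMemoryStore() {
  const data = new Map();
  return {
    async get(key) {
      return data.has(key) ? JSON.parse(data.get(key)) : null;
    },
    async set(key, value) {
      data.set(key, JSON.stringify(value));
    },
  };
}

// Stateless handler: all it needs comes from the request or the store.
async function handleEvent(store, payload) {
  const key = `call:${payload.uuid}`;
  const state = (await store.get(key)) || { step: 'ivr', events: [] };
  state.events.push(payload.status);
  if (payload.status === 'answered') state.step = 'connected';
  if (payload.status === 'completed') state.step = 'done';
  await store.set(key, state);
  return state;
}

const store = createMemoryStore();
handleEvent(store, { uuid: 'call-123', status: 'answered' })
  .then(() => handleEvent(store, { uuid: 'call-123', status: 'completed' }))
  .then((state) => console.log(state));
```

Because the handler receives the store as an argument, any instance behind the load balancer can process any webhook for a given call. With a real Redis store you would also set an expiry on each key so finished calls do not accumulate.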


🌐 Deployment Architecture

Where do you run your application backend? Here are the three common patterns:

PATTERN 1: SINGLE SERVER (Dev/Small scale)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Vonage ──► Your VPS/EC2 ──► App + Redis
  Simple but single point of failure


PATTERN 2: CONTAINERISED (Recommended)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Vonage ──► Load Balancer
                 │
          ┌──────┼──────┐
          ▼      ▼      ▼
       App 1  App 2  App 3  (stateless)
          │      │      │
          └──────┼──────┘
                 ▼
              Redis
              (shared state)
  Scales horizontally, survives instance failure


PATTERN 3: SERVERLESS (Event-driven)
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
  Vonage ──► API Gateway ──► Lambda/Cloud Functions
                                      │
                                   DynamoDB / Redis
  Pay per invocation; scales automatically (within provider concurrency limits)
  Watch for cold start latency (must respond in 3s!)
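As a rough illustration of Pattern 3, the sketch below shows a stateless answer webhook written as an AWS Lambda handler behind an API Gateway proxy integration. The greeting text and the `buildGreetingNcco` helper are placeholders, and reading `uuid` from either the query string or the body is a defensive assumption, since that depends on how your answer URL is configured.

```javascript
// Sketch of a stateless answer webhook for Lambda + API Gateway (proxy integration).
function buildGreetingNcco() {
  // An NCCO is a JSON array of actions; `talk` plays text-to-speech to the caller.
  return [{ action: 'talk', text: 'Thanks for calling. Please hold while we connect you.' }];
}

function parseJsonOrEmpty(text) {
  // Treat a missing or non-JSON body as "no fields" rather than failing the call.
  try {
    return text ? JSON.parse(text) : {};
  } catch {
    return {};
  }
}

async function handler(event) {
  const params = event.queryStringParameters || {};
  const body = parseJsonOrEmpty(event.body) || {};
  const callUUID = params.uuid || body.uuid || 'unknown';

  console.log(`answer webhook for call ${callUUID}`);

  // No per-call state is held in memory, so any warm or cold instance can answer.
  return {
    statusCode: 200,
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify(buildGreetingNcco()),
  };
}

// In a Lambda entry file you would export it, e.g.: exports.handler = handler;
```

Keeping the handler this small is one way to limit cold start cost: nothing is loaded or connected before the NCCO is returned.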

🔒 Security Architecture Overview

Security sits across every zone. Here's where each control lives:

┌──────────────────────────────────────────────────────────────┐
│               SECURITY CONTROLS BY ZONE                     │
│                                                              │
│  Zone 1 (Channels)                                           │
│    • Number masking for privacy                              │
│    • TLS 1.2+ for all channel data in transit                │
│                                                              │
│  Zone 2 (Vonage Network)                                     │
│    • SRTP for encrypted audio                                │
│    • Vonage signs all webhooks (JWT / signature header)      │
│                                                              │
│  Zone 3 (Your App)                                           │
│    • Validate Vonage webhook signatures on every request     │
│    • Store API keys in environment variables / secrets mgr   │
│    • HTTPS-only endpoints                                    │
│                                                              │
│  Zone 4 (AI Layer)                                           │
│    • PCI: Pause recording during card number input           │
│    • GDPR: Consent before recording                          │
│                                                              │
│  Zone 5 (Agent Layer)                                        │
│    • JWT-based agent authentication                          │
│    • Role-based access control (RBAC)                        │
│    • Screen recording audit logs                             │
│                                                              │
│  Zone 6 (Data Layer)                                         │
│    • Encryption at rest for recordings                       │
│    • Retention policies (auto-delete after N days)           │
│    • Audit trail for all data access                         │
└──────────────────────────────────────────────────────────────┘

We'll implement all of these in Week 6. For now, just know where they belong.
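As a preview of the Zone 3 control "validate webhook signatures", here is a hedged sketch using only Node's built-in `crypto` module. It assumes the webhook arrives with an `Authorization: Bearer <JWT>` header signed with HS256 using your signature secret, and that the token carries a `payload_hash` claim holding the SHA-256 hex digest of the raw body. Check both assumptions against the Vonage documentation for the API you use; the function name `verifySignedWebhook` is hypothetical.

```javascript
const crypto = require('node:crypto');

// Verify an HS256-signed JWT and check that its `payload_hash` claim matches the
// SHA-256 of the raw request body. Returns the claims, or null if anything fails.
function verifySignedWebhook(authorizationHeader, rawBody, signatureSecret) {
  const match = /^Bearer (.+)$/.exec(authorizationHeader || '');
  if (!match) return null;

  const parts = match[1].split('.');
  if (parts.length !== 3) return null;
  const [headerB64, payloadB64, signatureB64] = parts;

  let header;
  let claims;
  try {
    header = JSON.parse(Buffer.from(headerB64, 'base64url').toString('utf8'));
    claims = JSON.parse(Buffer.from(payloadB64, 'base64url').toString('utf8'));
  } catch {
    return null;
  }
  // Pin the algorithm so a token cannot downgrade to "none" or another scheme.
  if (!header || header.alg !== 'HS256' || !claims) return null;

  const expected = crypto
    .createHmac('sha256', signatureSecret)
    .update(`${headerB64}.${payloadB64}`)
    .digest();
  const actual = Buffer.from(signatureB64, 'base64url');
  // timingSafeEqual throws on length mismatch, so compare lengths first.
  if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected)) {
    return null;
  }

  // Bind the token to this exact body so a valid token cannot be replayed
  // with different content.
  const bodyHash = crypto.createHash('sha256').update(rawBody).digest('hex');
  if (claims.payload_hash !== bodyHash) return null;

  return claims;
}
```

A production version would also check time-based claims to reject stale tokens, and would need access to the raw, unparsed request body, since re-serialised JSON may not hash to the same value.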


📐 Designing for Scale: Numbers to Know

Before you design, know your scale targets. Here are reference figures for Vonage architecture:

Metric                  | Small (Startup)  | Medium (SME)    | Large (Enterprise)
------------------------|------------------|-----------------|--------------------------------------
Concurrent calls        | < 50             | 50–500          | 500–5,000+
Agents                  | 5–20             | 20–200          | 200–2,000+
Webhook response budget | 3 seconds        | 3 seconds       | 3 seconds
Recommended state store | Redis single     | Redis cluster   | Redis cluster + read replicas
Recommended deploy      | Single container | K8s / ECS       | K8s multi-region
CDR storage             | PostgreSQL       | PostgreSQL + S3 | Data warehouse (Snowflake / BigQuery)

🧪 Architecture Anti-Patterns to Avoid

Learn from common mistakes before you make them:

❌ ANTI-PATTERN 1: Synchronous CRM lookup in webhook handler
   Webhook fires → you query Salesforce → Salesforce takes 4s
   Result: Vonage times out, call fails

   ✅ FIX: Cache CRM data in Redis (TTL 5 min)
           or query async + use default NCCO if not ready

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

❌ ANTI-PATTERN 2: One webhook URL for everything
   POST /webhook handles ALL events
   Complex if/else logic, hard to debug

   ✅ FIX: Separate routes per event type
           /webhooks/answer
           /webhooks/event
           /webhooks/recording

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

❌ ANTI-PATTERN 3: Storing recordings on your app server
   Recordings fill disk, no redundancy, not CDN-cached

   ✅ FIX: Stream recordings directly to S3/GCS on receipt
           Use presigned URLs for agent access

━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

❌ ANTI-PATTERN 4: No fallback webhook URL
   Your server goes down mid-call
   Vonage has nowhere to send events → call hangs

   ✅ FIX: Always configure a fallback URL
           Use a secondary server or a simple Lambda
           that queues events for retry
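The second fix for Anti-Pattern 1 (query async, fall back to a default NCCO) can be sketched as a race between the CRM lookup and a deadline. Everything named here is illustrative: `lookupCustomer` is a hypothetical async function you supply, the 1500 ms budget is an arbitrary value chosen to leave headroom inside the 3-second window, and the spoken text is a placeholder.

```javascript
// Race a slow lookup against a deadline; resolve with `fallback` if the deadline wins.
function withTimeout(promise, ms, fallback) {
  let timer;
  const deadline = new Promise((resolve) => {
    timer = setTimeout(() => resolve(fallback), ms);
  });
  // Map lookup errors to the fallback too, so a CRM outage never fails the call.
  const guarded = Promise.resolve(promise).catch(() => fallback);
  return Promise.race([guarded, deadline]).finally(() => clearTimeout(timer));
}

const DEFAULT_NCCO = [{ action: 'talk', text: 'Please hold while we connect you.' }];

// `lookupCustomer` is a hypothetical async CRM call supplied by the caller.
async function buildAnswerNcco(callerNumber, lookupCustomer, budgetMs = 1500) {
  const customer = await withTimeout(lookupCustomer(callerNumber), budgetMs, null);
  if (!customer) return DEFAULT_NCCO;
  return [{ action: 'talk', text: `Welcome back, ${customer.name}. Connecting you now.` }];
}
```

The caller always gets an NCCO within the budget. A personalised greeting is a bonus when the CRM is fast, not a requirement for the call to connect.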

🗂️ Technology Stack Reference

Here's a recommended stack for building on Vonage. We'll use this throughout the series:

┌──────────────────────────────────────────────────────────────┐
│           RECOMMENDED TECHNOLOGY STACK                       │
│                                                              │
│  Backend Language   │  Node.js 20 (Express or Fastify)       │
│                     │  Python 3.12 (FastAPI)                 │
│                     │  (examples in both throughout series)  │
│                     │                                        │
│  State Store        │  Redis 7                               │
│                     │                                        │
│  Database           │  PostgreSQL 16 (for CDRs, records)     │
│                     │                                        │
│  Message Queue      │  Redis Pub/Sub or BullMQ               │
│                     │  (async post-call processing)          │
│                     │                                        │
│  Object Storage     │  AWS S3 or GCS                         │
│                     │  (recordings, exports)                 │
│                     │                                        │
│  Agent Desktop      │  React 18 + Vonage Client SDK          │
│                     │                                        │
│  Infrastructure     │  Docker + docker-compose (local dev)   │
│                     │  Kubernetes or ECS (production)        │
│                     │                                        │
│  CI/CD              │  GitHub Actions                        │
│                     │                                        │
│  Local Tunnelling   │  ngrok (Vonage webhooks in local dev)  │
└──────────────────────────────────────────────────────────────┘

✅ Architecture Checklist

Before starting to build, run through this checklist:

ARCHITECTURE PLANNING CHECKLIST

  Channels
  □ Which channels do you need? (voice, SMS, chat, WhatsApp)
  □ Do you need omnichannel (unified history across channels)?

  Routing
  □ What skills/queues do you need?
  □ Do you need time-of-day routing?
  □ What's the overflow strategy (queue, voicemail, callback)?

  IVR / Automation
  □ DTMF menus or natural language?
  □ What can be self-served without an agent?
  □ What's the escalation path?

  Application
  □ What language/framework for the backend?
  □ Where are webhook handlers deployed?
  □ What's your state store? (Redis recommended)
  □ Fallback webhook URL configured?

  Agent Experience
  □ Custom agent desktop or existing CRM embed?
  □ What context does the agent need at call answer?
  □ What post-call wrap-up workflow is needed?

  Data & Analytics
  □ Where are CDRs stored?
  □ What real-time metrics do supervisors need?
  □ What's the recording retention policy?

  Security
  □ Webhook signature validation in place?
  □ API keys in secrets manager (not code)?
  □ Recording encryption and access control?
  □ GDPR consent flow for recordings?

🚀 What's Next

In Day 3, we get hands-on. We'll:

  • Create your Vonage API account
  • Install the Vonage CLI
  • Buy a virtual phone number
  • Set up ngrok for local webhook development
  • Deploy a working "Hello World" call handler that answers a real phone call

By the end of Day 3 you'll be able to call a real phone number and have your code answer it.


💬 Discussion

Now that you've seen the full architecture:

  • Which zone are you most interested in building first?
  • Are you building for a startup, migrating from an on-premise PBX, or adding call centre capabilities to an existing product?
  • Any architectural patterns from your own experience you'd add?

Drop a comment below — your questions shape the depth of future posts.


📌 Bookmark this post — it's your architecture reference for the entire series.

🔔 Follow for Day 3: Setup from Scratch — tomorrow.


