Iftekhairul Alam

Posted on Oct 29

NextGenSwitch: Build AI-Powered Telephony (Features, Examples, and a Developer Quickstart)

#ai #tutorial #callcenter #pbx

If you’ve ever hacked together SIP trunks, WebRTC agents, IVRs, recordings—and then tried to bolt on an AI receptionist—you know the pain. NextGenSwitch (NGS) is a modern softswitch with programmable voice and first-class AI hooks that makes this stack delightfully straightforward.
Repo: github.com/nextgenswitch/nextgenswitch

This post gives you:

A whirlwind tour of features and architecture
Exactly how AI integrations plug in (LLMs, realtime voice, function-calling)
A copy-paste Quickstart (outbound calls, XML verbs, status callbacks, live call modify)
Dev procedures and a production checklist

TL;DR: NGS offers Twilio-style call control with softswitch power under the hood—plus clean integration points for LLMs and your backend tools.

What is NextGenSwitch?

NextGenSwitch (NGS) is a cloud softswitch + programmable voice layer:

SIP + WebRTC: bridge PSTN/SIP customers to browser-based agents.
Programmable Calls with XML/JSON-style responses from your webhooks.
IVR, Gather, Queueing, Recording built-in.
Events & CDRs for observability.
AI-native: stream audio to a realtime LLM, or do turn-by-turn ASR → LLM → TTS; call your own tools (helpdesk, CRM, calendar, etc.).

Feature Highlights

Outbound API: originate calls and drive the flow with hosted XML or inline responseXml.
Inbound Webhooks: answer URL returns NGS verbs like <Say>, <Play>, <Gather>, <Dial>, <Record>.
Status Callbacks: get DIALING, RINGING, ESTABLISHED, DISCONNECTED, and more.
Modify-Live: push new instructions mid-call (e.g., speak a message, transfer, or bridge).
Recording + Transcription: trigger, store, and process (e.g., summarize with LLMs).
Queues & Skills: route by priority, business hours, or sticky agent.

Architecture at a Glance

Caller (PSTN/SIP)
        │
        ▼
   NGS Ingress ──► Your Webhook
        │              │
        │              ├─► LLM (realtime or text)
        │              ├─► Tools (tickets, CRM, calendar)
        │              └─► Business logic (IVR/queues/policies)
        ▼
   Media Engine (SIP/WebRTC, TTS/ASR)
        │
        ├─► Agent Browser (WebRTC)
        └─► Upstream SIP / PSTN

You keep your business logic server-side; NGS handles signaling, media, and scale.

AI Integrations (How it actually fits)

You’ve got two common patterns:

Turn-based
NGS records/streams → your server gets transcript → you ask an LLM what to do → return NGS actions (<Say>, <Gather>, enqueue, etc.).
Realtime voice
Full-duplex audio between caller and LLM (e.g., OpenAI Realtime, Gemini Live). The LLM can produce audio or directives; your webhook can still inject actions (transfer, record, ticketing).
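
To make the turn-based pattern concrete, here is a minimal Python sketch of one turn: a transcript comes in, a decision is made, and NGS XML goes out. The `decide` function is a hypothetical stand-in for your LLM call, and the extension `1000` is only an illustrative transfer target; neither comes from NGS itself.

```python
from xml.sax.saxutils import escape


def decide(transcript: str) -> dict:
    """Hypothetical stand-in for an LLM call: pick a reply and an optional transfer."""
    text = transcript.lower()
    if "agent" in text or "human" in text:
        return {"reply": "Connecting you to an agent.", "dial": "1000"}
    return {"reply": "Could you tell me a bit more about the issue?", "dial": None}


def turn_to_xml(transcript: str) -> str:
    """Turn one caller utterance into the next NGS XML document."""
    decision = decide(transcript)
    parts = [f"<Say>{escape(decision['reply'])}</Say>"]
    if decision["dial"]:
        # Assumed transfer form, mirroring the <Dial>1000</Dial> used in the live-modify example.
        parts.append(f"<Dial>{escape(decision['dial'])}</Dial>")
    return '<?xml version="1.0" encoding="UTF-8"?><Response>' + "".join(parts) + "</Response>"


print(turn_to_xml("I want to talk to a human"))
```

In a real flow, `decide` would send the transcript (plus conversation history) to your model and map its answer onto verbs; keeping that mapping in one function makes it easy to test without placing calls.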

Function-calling / Tools: define JSON schemas like create_support_ticket, schedule_appointment, lookup_customer. When the LLM calls a tool, you do the API call, then continue the conversation with the caller.
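
As a rough sketch of the tool-calling side, the snippet below pairs a JSON schema with a local handler and dispatches a tool call by name. The schema fields, the `TCK-1` ticket ID, and the `dispatch_tool_call` helper are illustrative assumptions; the exact shape of a tool call depends on your LLM provider.

```python
import json

# Schema you would advertise to the LLM (fields here are illustrative).
TOOLS = {
    "create_support_ticket": {
        "description": "Open a helpdesk ticket for the caller.",
        "parameters": {
            "type": "object",
            "properties": {
                "subject": {"type": "string"},
                "priority": {"type": "string", "enum": ["low", "normal", "high"]},
            },
            "required": ["subject"],
        },
    },
}


def create_support_ticket(subject: str, priority: str = "normal") -> dict:
    # Placeholder: a real handler would call your helpdesk API here.
    return {"ticket_id": "TCK-1", "subject": subject, "priority": priority}


HANDLERS = {"create_support_ticket": create_support_ticket}


def dispatch_tool_call(name: str, arguments_json: str) -> dict:
    """Run the handler for a tool call, returning an error dict instead of raising."""
    if name not in HANDLERS:
        return {"error": f"unknown tool: {name}"}
    args = json.loads(arguments_json)
    missing = [key for key in TOOLS[name]["parameters"]["required"] if key not in args]
    if missing:
        return {"error": f"missing arguments: {missing}"}
    return HANDLERS[name](**args)


print(dispatch_tool_call("create_support_ticket", '{"subject": "No dial tone"}'))
```

The result dict is what you would hand back to the model so it can continue the conversation with the caller.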

Developer Quickstart (Copy-Paste)

Replace YOUR_NGS_BASE_URL, YOUR_AUTH_ID, YOUR_AUTH_SECRET and public URLs where applicable.

1) Outbound Call — Hosted XML (via `response`) or Inline (via `responseXml`)

Option A: hosted XML

curl -X POST \
  -H "X-Authorization: YOUR_AUTH_ID" \
  -H "X-Authorization-Secre: YOUR_AUTH_SECRET" \
  -d "to=1002" \
  -d "from=1001" \
  -d "statusCallback=https://example.com/voice/status" \
  -d "response=https://example.com/voice/hello.xml" \
  https://YOUR_NGS_BASE_URL/api/v1/call

Option B: inline XML

curl -X POST \
  -H "X-Authorization: YOUR_AUTH_ID" \
  -H "X-Authorization-Secre: YOUR_AUTH_SECRET" \
  --data-urlencode 'to=1002' \
  --data-urlencode 'from=1001' \
  --data-urlencode 'statusCallback=https://example.com/voice/status' \
  --data-urlencode 'responseXml=<?xml version="1.0" encoding="UTF-8"?><Response><Say>Hello from inline XML</Say></Response>' \
  https://YOUR_NGS_BASE_URL/api/v1/call

Minimal XML you can host at https://example.com/voice/hello.xml:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Say>Hello, world! This call is powered by NextGenSwitch.</Say>
</Response>

2) Status Callbacks (Receive live events)

Node/Express

import express from "express";
const app = express();
app.use(express.urlencoded({ extended: false })); // NGS posts form-encoded

app.post("/voice/status", (req, res) => {
  console.log("NGS status:", req.body); // e.g., { call_id, status, timestamp, ... }
  res.sendStatus(200);                   // respond fast; do heavy work async
});

app.listen(3000, () => console.log("Status callback on :3000"));

Sample payload (form-encoded):

call_id=abc123
status=RINGING
timestamp=2025-10-29T08:00:00Z
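
If your callback handler is in Python rather than Node, the same payload can be parsed with the standard library. This sketch assumes only the three fields shown above, and it treats DISCONNECTED as the terminal status, which is an assumption rather than documented NGS behavior.

```python
from urllib.parse import parse_qs

# Assumption: DISCONNECTED marks the end of a call; adjust to what your deployment sends.
TERMINAL_STATUSES = {"DISCONNECTED"}


def parse_status(body: str) -> dict:
    """Flatten a form-encoded body into {field: first_value}."""
    return {key: values[0] for key, values in parse_qs(body).items()}


def is_call_over(event: dict) -> bool:
    return event.get("status") in TERMINAL_STATUSES


event = parse_status("call_id=abc123&status=RINGING&timestamp=2025-10-29T08%3A00%3A00Z")
print(event, is_call_over(event))
```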

3) Modify a Live Call (push new instructions)

curl -X PUT \
  -H "X-Authorization: YOUR_AUTH_ID" \
  -H "X-Authorization-Secre: YOUR_AUTH_SECRET" \
  --data-urlencode 'responseXml=<Response><Pause length="2"/><Say>Call has been modified</Say><Dial>1000</Dial></Response>' \
  https://YOUR_NGS_BASE_URL/api/v1/call/REPLACE_WITH_CALL_ID
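
The same PUT can be built from Python with only the standard library. This is a sketch under the assumptions of the curl command above (same path, same headers, form-encoded `responseXml`); `build_modify_request` is a hypothetical helper name, and the request is constructed but not sent so the snippet runs offline.

```python
from urllib.parse import urlencode
from urllib.request import Request


def build_modify_request(base_url: str, call_id: str, response_xml: str,
                         auth_id: str, auth_secret: str) -> Request:
    """Build (but do not send) the live-modify PUT for one call."""
    body = urlencode({"responseXml": response_xml}).encode("ascii")
    return Request(
        url=f"{base_url}/api/v1/call/{call_id}",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "X-Authorization": auth_id,
            "X-Authorization-Secre": auth_secret,  # header name copied from the curl example
        },
    )


req = build_modify_request(
    "https://YOUR_NGS_BASE_URL",
    "abc123",
    "<Response><Say>Call has been modified</Say></Response>",
    "YOUR_AUTH_ID",
    "YOUR_AUTH_SECRET",
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req)
```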

4) Multi-language Examples (Node, Python, PHP)

Node (fetch)

import fetch from "node-fetch"; // Node <18; otherwise use global fetch
const base = "https://YOUR_NGS_BASE_URL";

const res = await fetch(`${base}/api/v1/call`, {
  method: "POST",
  headers: {
    "Content-Type": "application/x-www-form-urlencoded",
    "X-Authorization": "YOUR_AUTH_ID",
    "X-Authorization-Secre": "YOUR_AUTH_SECRET"
  },
  body: new URLSearchParams({
    to: "1002",
    from: "1001",
    statusCallback: "https://example.com/voice/status",
    responseXml: `<?xml version="1.0"?><Response><Say>Hello from Node</Say></Response>`
  })
});

console.log(await res.text());

Python (requests)

import requests

base = "https://YOUR_NGS_BASE_URL"
headers = {
    "X-Authorization": "YOUR_AUTH_ID",
    "X-Authorization-Secre": "YOUR_AUTH_SECRET",
}
data = {
    "to": "1002",
    "from": "1001",
    "statusCallback": "https://example.com/voice/status",
    "responseXml": '<?xml version="1.0"?><Response><Say>Hello from Python</Say></Response>'
}

r = requests.post(f"{base}/api/v1/call", headers=headers, data=data)
print(r.text)

PHP (Guzzle)

<?php
require 'vendor/autoload.php';
use GuzzleHttp\Client;

$client = new Client(['base_uri' => 'https://YOUR_NGS_BASE_URL']);
$response = $client->post('/api/v1/call', [
  'headers' => [
    'X-Authorization' => 'YOUR_AUTH_ID',
    'X-Authorization-Secre' => 'YOUR_AUTH_SECRET'
  ],
  'form_params' => [
    'to' => '1002',
    'from' => '1001',
    'statusCallback' => 'https://example.com/voice/status',
    'responseXml' => '<?xml version="1.0"?><Response><Say>Hello from PHP</Say></Response>'
  ]
]);

echo $response->getBody();

XML Verb Cheatsheet (NGS Actions)

<Say>: TTS

  <Response><Say loop="2">This message repeats twice.</Say></Response>

<Play>: play a media file

  <Response><Play loop="3">https://example.com/audio/connecting.mp3</Play></Response>

<Gather>: collect DTMF and POST to your action URL

  <Response>
    <Gather action="https://example.com/process_input" method="POST" maxDigits="4" timeout="10">
      <Say>Please enter your 4-digit PIN.</Say>
    </Gather>
  </Response>

<Dial>: bridge to a number/endpoint

  <Response>
    <Dial to="+1234567890" answerOnBridge="true" record="record-from-answer">
      <Play>https://example.com/audio/connecting.mp3</Play>
    </Dial>
  </Response>

<Record>: start a recording

  <Response>
    <Record action="https://example.com/handle_recording" method="POST" timeout="5" finishOnKey="#" beep="true"/>
  </Response>

Other useful controls: <Hangup>, <Pause>, <Redirect>, <Bridge>, <Leave>.
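
To show how the verbs combine, here is a hedged sketch of what the handler behind the <Gather> action URL might return. It takes the collected digits as a plain argument because the name of the form field NGS posts is not stated in this post; the retry counter and the wording of the prompts are illustrative.

```python
from xml.sax.saxutils import quoteattr

GATHER_URL = "https://example.com/process_input"  # action URL from the <Gather> example


def handle_pin(digits: str, expected_pin: str, attempts_left: int) -> str:
    """Return the next NGS XML document for a PIN prompt."""
    if digits == expected_pin:
        body = "<Say>Thank you. Your PIN is verified.</Say>"
    elif attempts_left <= 0:
        body = "<Say>Too many invalid attempts. Goodbye.</Say><Hangup/>"
    else:
        # Re-prompt with the same Gather settings as the cheatsheet example.
        body = (
            f'<Gather action={quoteattr(GATHER_URL)} method="POST" maxDigits="4" timeout="10">'
            "<Say>That PIN was not recognized. Please try again.</Say>"
            "</Gather>"
        )
    return f'<?xml version="1.0" encoding="UTF-8"?><Response>{body}</Response>'


print(handle_pin("0000", "1234", attempts_left=2))
```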

Development Procedures (What to actually do)

Create envs & keys

NGS_AUTH_ID, NGS_AUTH_SECRET, WEBHOOK_SECRET (if verifying signatures)
LLM keys, Helpdesk/CRM tokens

Expose your server

ngrok http 3000 (or Cloudflare Tunnel) for /voice/answer, /voice/status, /voice/ai

Provision routing

Buy/assign a number → set Answer URL (returns XML) and Status Callback URL
Add queues (sales/support) with skills/hours if needed

Logging & Idempotency

Treat status webhooks as retryable; de-dupe by event_id if provided
Correlate call_id across logs, DB, and tickets
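
One way to de-dupe retried webhooks is a small in-memory cache of seen keys, sketched below. It assumes `event_id` may be absent and falls back to `call_id` + `status` + `timestamp`; in production you would more likely back this with Redis or a unique database constraint so it survives restarts.

```python
from collections import OrderedDict


class EventDeduper:
    """Remember recently seen event keys so retried webhooks are processed once."""

    def __init__(self, max_size: int = 10_000):
        self._seen = OrderedDict()
        self._max_size = max_size

    def is_new(self, event: dict) -> bool:
        # Prefer event_id when present; otherwise fall back to a composite key.
        key = event.get("event_id") or (
            event.get("call_id"), event.get("status"), event.get("timestamp")
        )
        if key in self._seen:
            return False
        self._seen[key] = True
        if len(self._seen) > self._max_size:
            self._seen.popitem(last=False)  # evict the oldest key
        return True


deduper = EventDeduper()
ringing = {"call_id": "abc123", "status": "RINGING", "timestamp": "2025-10-29T08:00:00Z"}
print(deduper.is_new(ringing), deduper.is_new(ringing))
```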

Security

Verify webhook signatures/timestamps
Encrypt recordings; redact sensitive entities from transcripts
Rotate secrets, enforce MFA
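
This post does not specify how NGS signs webhooks, so the following is only a generic sketch of signature and timestamp verification. It assumes a hex HMAC-SHA256 over `"<timestamp>.<body>"` keyed with WEBHOOK_SECRET and a Unix-epoch timestamp; check your deployment's documentation for the real scheme before relying on it.

```python
import hashlib
import hmac
import time


def sign(secret: str, timestamp: str, body: str) -> str:
    """Assumed scheme: hex HMAC-SHA256 of "<timestamp>.<body>"."""
    message = f"{timestamp}.{body}".encode("utf-8")
    return hmac.new(secret.encode("utf-8"), message, hashlib.sha256).hexdigest()


def verify(secret: str, timestamp: str, body: str, signature: str,
           now: float = None, tolerance_seconds: int = 300) -> bool:
    """Reject stale timestamps first, then compare signatures in constant time."""
    now = time.time() if now is None else now
    try:
        age = abs(now - float(timestamp))
    except ValueError:
        return False
    if age > tolerance_seconds:
        return False  # too old or too far in the future: possible replay
    expected = sign(secret, timestamp, body)
    return hmac.compare_digest(expected.encode("utf-8"), signature.encode("utf-8"))


sig = sign("s3cret", "1000", "call_id=abc123&status=RINGING")
print(verify("s3cret", "1000", "call_id=abc123&status=RINGING", sig, now=1000.0))
```

The constant-time comparison and the timestamp window are the parts worth keeping regardless of which signing scheme your deployment actually uses.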

Testing

Unit test your XML builders
Simulate DTMF and edge flows (timeouts, invalid digits)
Load test queue wait, ASR → LLM latency
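
As an example of unit testing an XML builder, the sketch below builds a <Say> response with `xml.etree.ElementTree` (so text is escaped for you) and checks it by parsing the output back. `build_say` is a hypothetical helper, not part of any NGS SDK.

```python
import xml.etree.ElementTree as ET


def build_say(text: str, loop: int = 1) -> str:
    """Build <Response><Say>...</Say></Response>, escaping text automatically."""
    response = ET.Element("Response")
    say = ET.SubElement(response, "Say")
    if loop != 1:
        say.set("loop", str(loop))
    say.text = text
    return ET.tostring(response, encoding="unicode")


def test_build_say_escapes_text():
    xml = build_say("Tom & Jerry <support>", loop=2)
    root = ET.fromstring(xml)  # raises ParseError if the builder emitted bad XML
    assert root.tag == "Response"
    assert root[0].attrib == {"loop": "2"}
    assert root[0].text == "Tom & Jerry <support>"


test_build_say_escapes_text()
print(build_say("a & b"))
```

Round-tripping through a parser catches the most common failure, unescaped `&` or `<` in dynamic text, which string concatenation would let through.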

Troubleshooting (Quick)

401/403 → Check X-Authorization / secret headers (your deployment may use X-Authorization-Secre).
400 → XML missing/invalid; start with <Response><Say>…</Say></Response>.
“Rings but no audio” → If using hosted response, ensure your XML URL is public & returns valid XML with correct Content-Type.
No status events → Verify public statusCallback URL; log incoming requests; keep handler fast.
Live modify does nothing → Confirm call_id and that your PUT payload contains a valid responseXml.
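
For the invalid-XML cases, a quick local check before you send or host a document can save a round trip. This sketch only verifies that the XML is well-formed, that the root is <Response>, and that each child is one of the verbs named in this post; that verb list is taken from the cheatsheet above and may not be exhaustive for NGS.

```python
import xml.etree.ElementTree as ET

# Verbs mentioned in this post; NGS may support others.
KNOWN_VERBS = {"Say", "Play", "Gather", "Dial", "Record",
               "Hangup", "Pause", "Redirect", "Bridge", "Leave"}


def check_response_xml(xml_text: str) -> list:
    """Return a list of human-readable problems; an empty list means no issues found."""
    try:
        root = ET.fromstring(xml_text)
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    problems = []
    if root.tag != "Response":
        problems.append(f"root element is <{root.tag}>, expected <Response>")
    if len(root) == 0:
        problems.append("document contains no verbs")
    for child in root:
        if child.tag not in KNOWN_VERBS:
            problems.append(f"unrecognized verb <{child.tag}>")
    return problems


print(check_response_xml("<Response><Say>Hi</Response>"))
```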

Links & Credits

Website: nextgenswitch.com
GitHub: github.com/nextgenswitch/nextgenswitch

Wrap-Up

With NextGenSwitch, you write simple webhooks and XML while the platform handles media and scale. Add AI for triage, summarization, or full conversational agents, and route to humans when it matters. The examples above are ready to paste—use them as a base, then layer in your product logic.

If you ship something with this, share a link—I’d love to see your call flows, response times, and UI patterns!
