DEV Community

Iftekhairul Alam

NextGenSwitch: Build AI-Powered Telephony (Features, Examples, and a Developer Quickstart)

If you’ve ever hacked together SIP trunks, WebRTC agents, IVRs, recordings—and then tried to bolt on an AI receptionist—you know the pain. NextGenSwitch (NGS) is a modern softswitch with programmable voice and first-class AI hooks that makes this stack delightfully straightforward.
Repo: github.com/nextgenswitch/nextgenswitch

This post gives you:

  • A whirlwind tour of features and architecture
  • Exactly how AI integrations plug in (LLMs, realtime voice, function-calling)
  • A copy-paste Quickstart (outbound calls, XML verbs, status callbacks, live call modify)
  • Dev procedures and a production checklist

TL;DR: NGS offers Twilio-style call control with softswitch power under the hood—plus clean integration points for LLMs and your backend tools.


What is NextGenSwitch?

NextGenSwitch (NGS) is a cloud softswitch + programmable voice layer:

  • SIP + WebRTC: bridge PSTN/SIP customers to browser-based agents.
  • Programmable Calls with XML/JSON-style responses from your webhooks.
  • IVR, Gather, Queueing, Recording built-in.
  • Events & CDRs for observability.
  • AI-native: stream audio to a realtime LLM, or do turn-by-turn ASR → LLM → TTS; call your own tools (helpdesk, CRM, calendar, etc.).

Feature Highlights

  • Outbound API: originate calls and drive the flow with hosted XML or inline responseXml.
  • Inbound Webhooks: answer URL returns NGS verbs like <Say>, <Play>, <Gather>, <Dial>, <Record>.
  • Status Callbacks: get DIALING, RINGING, ESTABLISHED, DISCONNECTED, and more.
  • Modify-Live: push new instructions mid-call (e.g., speak a message, transfer, or bridge).
  • Recording + Transcription: trigger, store, and process (e.g., summarize with LLMs).
  • Queues & Skills: route by priority, business hours, or sticky agent.

Architecture at a Glance

Caller (PSTN/SIP)
        │
        ▼
   NGS Ingress ──▶ Your Webhook
        │              │
        │              ├─▶ LLM (realtime or text)
        │              ├─▶ Tools (tickets, CRM, calendar)
        │              └─▶ Business logic (IVR/queues/policies)
        ▼
   Media Engine (SIP/WebRTC, TTS/ASR)
        │
        ├─▶ Agent Browser (WebRTC)
        └─▶ Upstream SIP / PSTN

You keep your business logic server-side; NGS handles signaling, media, and scale.


AI Integrations (How it actually fits)

You’ve got two common patterns:

  1. Turn-based
    NGS records/streams → your server gets the transcript → you ask an LLM what to do → you return NGS actions (<Say>, <Gather>, enqueue, etc.).

  2. Realtime voice
    Full-duplex audio between caller and LLM (e.g., OpenAI Realtime, Gemini Live). The LLM can produce audio or directives; your webhook can still inject actions (transfer, record, ticketing).
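The turn-based pattern can be sketched as one function: take the transcript, ask a model for a decision, and map that decision to NGS verbs. This is a minimal sketch under assumptions: `decide_next_step` is a hypothetical stand-in for your LLM call (plain keyword matching here, so it runs offline), the `intent` labels are invented for illustration, and the bare `<Dial>1000</Dial>` form is borrowed from the live-modify example in the Quickstart.

```python
from xml.sax.saxutils import escape


def decide_next_step(transcript: str) -> dict:
    """Hypothetical stand-in for an LLM call; returns an intent and a reply."""
    text = transcript.lower()
    if "agent" in text or "human" in text:
        return {"intent": "transfer", "reply": "Connecting you to an agent.", "target": "1000"}
    return {"intent": "answer", "reply": f"You said: {transcript}"}


def build_turn_response(transcript: str) -> str:
    """Map the model's decision to an NGS XML response."""
    step = decide_next_step(transcript)
    # escape() protects against &, < and > in model-generated text.
    verbs = [f"<Say>{escape(step['reply'])}</Say>"]
    if step["intent"] == "transfer":
        verbs.append(f"<Dial>{escape(step['target'])}</Dial>")
    return "<Response>" + "".join(verbs) + "</Response>"


print(build_turn_response("I need a human"))
```

Keeping the decision step separate from the XML step means you can swap the keyword matcher for a real model without touching the code that talks to NGS.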

Function-calling / Tools: define JSON schemas like create_support_ticket, schedule_appointment, lookup_customer. When the LLM calls a tool, you do the API call, then continue the conversation with the caller.
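A tool definition plus a small dispatcher might look like the sketch below. Everything in it is illustrative: the schema fields follow the JSON-schema style that most function-calling APIs accept, `create_support_ticket` is a placeholder that returns a fake ticket instead of calling a helpdesk, and the `TICKET-1` id is made up.

```python
import json

# Hypothetical tool schema you would pass to the LLM provider.
TOOLS = [
    {
        "name": "create_support_ticket",
        "description": "Open a helpdesk ticket for the caller.",
        "parameters": {
            "type": "object",
            "properties": {
                "subject": {"type": "string"},
                "priority": {"type": "string", "enum": ["low", "normal", "high"]},
            },
            "required": ["subject"],
        },
    },
]


def create_support_ticket(subject: str, priority: str = "normal") -> dict:
    """Placeholder: replace with a real helpdesk API call."""
    return {"ticket_id": "TICKET-1", "subject": subject, "priority": priority}


HANDLERS = {"create_support_ticket": create_support_ticket}


def dispatch_tool_call(name: str, arguments_json: str) -> dict:
    """Run the tool the model asked for and return a JSON-serializable result."""
    handler = HANDLERS.get(name)
    if handler is None:
        return {"error": f"unknown tool: {name}"}
    try:
        return handler(**json.loads(arguments_json))
    except (json.JSONDecodeError, TypeError) as exc:
        # Malformed JSON or arguments that do not match the handler signature.
        return {"error": f"bad arguments: {exc}"}


result = dispatch_tool_call("create_support_ticket", '{"subject": "No dial tone"}')
print(result)
```

Returning an error dict instead of raising lets you feed the failure back to the model, which can then apologize to the caller or retry with corrected arguments.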


Developer Quickstart (Copy-Paste)

Replace YOUR_NGS_BASE_URL, YOUR_AUTH_ID, YOUR_AUTH_SECRET and public URLs where applicable.

1) Outbound Call — Hosted XML (via response) or Inline (via responseXml)

Option A: hosted XML

curl -X POST \
  -H "X-Authorization: YOUR_AUTH_ID" \
  -H "X-Authorization-Secre: YOUR_AUTH_SECRET" \
  -d "to=1002" \
  -d "from=1001" \
  -d "statusCallback=https://example.com/voice/status" \
  -d "response=https://example.com/voice/hello.xml" \
  https://YOUR_NGS_BASE_URL/api/v1/call

Option B: inline XML

curl -X POST \
  -H "X-Authorization: YOUR_AUTH_ID" \
  -H "X-Authorization-Secre: YOUR_AUTH_SECRET" \
  --data-urlencode 'to=1002' \
  --data-urlencode 'from=1001' \
  --data-urlencode 'statusCallback=https://example.com/voice/status' \
  --data-urlencode 'responseXml=<?xml version="1.0" encoding="UTF-8"?><Response><Say>Hello from inline XML</Say></Response>' \
  https://YOUR_NGS_BASE_URL/api/v1/call

Minimal XML you can host at https://example.com/voice/hello.xml:

<?xml version="1.0" encoding="UTF-8"?>
<Response>
  <Say>Hello, world! This call is powered by NextGenSwitch.</Say>
</Response>

2) Status Callbacks (Receive live events)

Node/Express

import express from "express";
const app = express();
app.use(express.urlencoded({ extended: false })); // NGS posts form-encoded

app.post("/voice/status", (req, res) => {
  console.log("NGS status:", req.body); // e.g., { call_id, status, timestamp, ... }
  res.sendStatus(200);                   // respond fast; do heavy work async
});

app.listen(3000, () => console.log("Status callback on :3000"));

Sample payload (form-encoded):

call_id=abc123
status=RINGING
timestamp=2025-10-29T08:00:00Z
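If your backend is Python rather than Node, the same payload can be decoded with the standard library. This sketch assumes the fields arrive as one `&`-joined, percent-encoded body (the usual form-encoded wire format) and treats `DISCONNECTED` as the only terminal status, which is a guess based on the statuses named in this post.

```python
from urllib.parse import parse_qs

# Assumption: DISCONNECTED ends the call; NGS may send other final statuses.
TERMINAL_STATUSES = {"DISCONNECTED"}


def parse_status(body: str) -> dict:
    """Decode a form-encoded callback body into a flat dict (first value per key)."""
    return {key: values[0] for key, values in parse_qs(body).items()}


def is_call_over(event: dict) -> bool:
    """True when the event reports a status we treat as final."""
    return event.get("status") in TERMINAL_STATUSES


event = parse_status("call_id=abc123&status=RINGING&timestamp=2025-10-29T08%3A00%3A00Z")
print(event["call_id"], event["status"], is_call_over(event))
```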

3) Modify a Live Call (push new instructions)

curl -X PUT \
  -H "X-Authorization: YOUR_AUTH_ID" \
  -H "X-Authorization-Secre: YOUR_AUTH_SECRET" \
  --data-urlencode 'responseXml=<Response><Pause length="2"/><Say>Call has been modified</Say><Dial>1000</Dial></Response>' \
  https://YOUR_NGS_BASE_URL/api/v1/call/REPLACE_WITH_CALL_ID
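The same PUT can be built from Python with only the standard library. The sketch below constructs the request without sending it, so you can inspect or unit test it first; `build_modify_request` is a hypothetical helper name, and the header names are copied from the curl example above.

```python
from urllib.parse import quote, urlencode
from urllib.request import Request


def build_modify_request(base_url: str, call_id: str, response_xml: str,
                         auth_id: str, auth_secret: str) -> Request:
    """Build (but do not send) the PUT request that modifies a live call."""
    body = urlencode({"responseXml": response_xml}).encode("utf-8")
    return Request(
        url=f"{base_url}/api/v1/call/{quote(call_id, safe='')}",
        data=body,
        method="PUT",
        headers={
            "Content-Type": "application/x-www-form-urlencoded",
            "X-Authorization": auth_id,
            "X-Authorization-Secre": auth_secret,  # name as written in the curl example
        },
    )


req = build_modify_request(
    "https://YOUR_NGS_BASE_URL", "abc123",
    "<Response><Say>Call has been modified</Say></Response>",
    "YOUR_AUTH_ID", "YOUR_AUTH_SECRET",
)
print(req.get_method(), req.full_url)
# To send it: urllib.request.urlopen(req, timeout=10)
```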

4) Multi-language Examples (Node, Python, PHP)

Node (fetch)

import fetch from "node-fetch"; // Node <18; otherwise use global fetch
const base = "https://YOUR_NGS_BASE_URL";

const res = await fetch(`${base}/api/v1/call`, {
  method: "POST",
  headers: {
    "Content-Type": "application/x-www-form-urlencoded",
    "X-Authorization": "YOUR_AUTH_ID",
    "X-Authorization-Secre": "YOUR_AUTH_SECRET"
  },
  body: new URLSearchParams({
    to: "1002",
    from: "1001",
    statusCallback: "https://example.com/voice/status",
    responseXml: `<?xml version="1.0"?><Response><Say>Hello from Node</Say></Response>`
  })
});

console.log(await res.text());

Python (requests)

import requests

base = "https://YOUR_NGS_BASE_URL"
headers = {
    "X-Authorization": "YOUR_AUTH_ID",
    "X-Authorization-Secre": "YOUR_AUTH_SECRET",
}
data = {
    "to": "1002",
    "from": "1001",
    "statusCallback": "https://example.com/voice/status",
    "responseXml": '<?xml version="1.0"?><Response><Say>Hello from Python</Say></Response>'
}

# timeout keeps the script from hanging if the NGS host is unreachable
r = requests.post(f"{base}/api/v1/call", headers=headers, data=data, timeout=10)
print(r.text)

PHP (Guzzle)

<?php
require 'vendor/autoload.php';
use GuzzleHttp\Client;

$client = new Client(['base_uri' => 'https://YOUR_NGS_BASE_URL']);
$response = $client->post('/api/v1/call', [
  'headers' => [
    'X-Authorization' => 'YOUR_AUTH_ID',
    'X-Authorization-Secre' => 'YOUR_AUTH_SECRET'
  ],
  'form_params' => [
    'to' => '1002',
    'from' => '1001',
    'statusCallback' => 'https://example.com/voice/status',
    'responseXml' => '<?xml version="1.0"?><Response><Say>Hello from PHP</Say></Response>'
  ]
]);

echo $response->getBody();

XML Verb Cheatsheet (NGS Actions)

  • <Say>: TTS
  <Response><Say loop="2">This message repeats twice.</Say></Response>
  • <Play>: play a media file
  <Response><Play loop="3">https://example.com/audio/connecting.mp3</Play></Response>
  • <Gather>: collect DTMF and POST to your action URL
  <Response>
    <Gather action="https://example.com/process_input" method="POST" maxDigits="4" timeout="10">
      <Say>Please enter your 4-digit PIN.</Say>
    </Gather>
  </Response>
  • <Dial>: bridge to a number/endpoint
  <Response>
    <Dial to="+1234567890" answerOnBridge="true" record="record-from-answer">
      <Play>https://example.com/audio/connecting.mp3</Play>
    </Dial>
  </Response>
  • <Record>: start a recording
  <Response>
    <Record action="https://example.com/handle_recording" method="POST" timeout="5" finishOnKey="#" beep="true"/>
  </Response>

Other useful controls: <Hangup>, <Pause>, <Redirect>, <Bridge>, <Leave>.
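Hand-written XML strings break as soon as a prompt or URL contains `&` or `<`. One way around that is to build responses with an XML library; the sketch below uses Python's `xml.etree.ElementTree` to produce the `<Gather>` example from the cheatsheet. The helper name `gather_pin` and the query string on the action URL are illustrative only.

```python
import xml.etree.ElementTree as ET


def gather_pin(action_url: str, prompt: str, max_digits: int = 4, timeout: int = 10) -> str:
    """Build a <Gather> response; ElementTree escapes text and attribute values."""
    response = ET.Element("Response")
    gather = ET.SubElement(response, "Gather", {
        "action": action_url,
        "method": "POST",
        "maxDigits": str(max_digits),
        "timeout": str(timeout),
    })
    ET.SubElement(gather, "Say").text = prompt
    body = ET.tostring(response, encoding="unicode")
    return '<?xml version="1.0" encoding="UTF-8"?>' + body


xml_doc = gather_pin("https://example.com/process_input?step=1&lang=en",
                     "Enter your PIN & press #.")
print(xml_doc)
```

The `&` in both the URL and the prompt comes out as `&amp;`, which is what an XML parser on the receiving side expects.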


Development Procedures (What to actually do)

  1. Create envs & keys
    • NGS_AUTH_ID, NGS_AUTH_SECRET, WEBHOOK_SECRET (if verifying signatures)
    • LLM keys, Helpdesk/CRM tokens
  2. Expose your server
    • ngrok http 3000 (or Cloudflare Tunnel) for /voice/answer, /voice/status, /voice/ai
  3. Provision routing
    • Buy/assign a number → set Answer URL (returns XML) and Status Callback URL
    • Add queues (sales/support) with skills/hours if needed
  4. Logging & Idempotency
    • Treat status webhooks as retryable; de-dupe by event_id if provided
    • Correlate call_id across logs, DB, and tickets
  5. Security
    • Verify webhook signatures/timestamps
    • Encrypt recordings; redact sensitive entities from transcripts
    • Rotate secrets, enforce MFA
  6. Testing
    • Unit test your XML builders
    • Simulate DTMF and edge flows (timeouts, invalid digits)
    • Load test queue wait and ASR → LLM latency
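The idempotency step can be sketched as a small de-duplication helper. This is an in-memory illustration only: `EventDeduper` is a hypothetical class, `event_id` may or may not be present in your payloads, and the fallback key (a hash of `call_id`, `status`, and `timestamp`) is an assumption about which fields stay stable across retries. In production you would back this with Redis or a database unique constraint.

```python
import hashlib


class EventDeduper:
    """Remember which status events were already processed (in memory only)."""

    def __init__(self) -> None:
        self._seen = set()

    @staticmethod
    def key_for(event: dict) -> str:
        # Prefer event_id if the payload has one; otherwise hash the stable fields.
        if event.get("event_id"):
            return str(event["event_id"])
        raw = "|".join(str(event.get(k, "")) for k in ("call_id", "status", "timestamp"))
        return hashlib.sha256(raw.encode("utf-8")).hexdigest()

    def first_time(self, event: dict) -> bool:
        """Return True once per event and False for retries of the same event."""
        key = self.key_for(event)
        if key in self._seen:
            return False
        self._seen.add(key)
        return True


deduper = EventDeduper()
event = {"call_id": "abc123", "status": "RINGING", "timestamp": "2025-10-29T08:00:00Z"}
print(deduper.first_time(event), deduper.first_time(event))
```

Call `first_time` at the top of your status handler and skip the heavy work when it returns False; still answer 200 so the sender stops retrying.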

Troubleshooting (Quick)

  • 401/403 → Check X-Authorization / secret headers (your deployment may use X-Authorization-Secre).
  • 400 → XML missing/invalid; start with <Response><Say>…</Say></Response>.
  • “Rings but no audio” → If using hosted response, ensure your XML URL is public & returns valid XML with correct Content-Type.
  • No status events → Verify public statusCallback URL; log incoming requests; keep handler fast.
  • Live modify does nothing → Confirm call_id and that your PUT payload contains a valid responseXml.
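Several of these failures come down to malformed XML, which you can catch locally before NGS ever sees it. The sketch below is a hypothetical pre-flight check: it only confirms the document parses, has a `<Response>` root, and contains at least one child element. It does not know which verbs or attributes NGS accepts.

```python
import xml.etree.ElementTree as ET


def check_response_xml(xml_text: str) -> list:
    """Return a list of problems; an empty list means the document looks sendable."""
    try:
        root = ET.fromstring(xml_text.encode("utf-8"))
    except ET.ParseError as exc:
        return [f"not well-formed XML: {exc}"]
    problems = []
    if root.tag != "Response":
        problems.append(f"root element is <{root.tag}>, expected <Response>")
    if len(root) == 0:
        problems.append("<Response> contains no verbs")
    return problems


print(check_response_xml("<Response><Say>Hi</Say></Response>"))
print(check_response_xml("<Response><Say>Hi</Response>"))
```

Running this in your unit tests, or just before a live-modify PUT, turns a silent "nothing happened" into an error message you can act on.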

Wrap-Up

With NextGenSwitch, you write simple webhooks and XML while the platform handles media and scale. Add AI for triage, summarization, or full conversational agents, and route to humans when it matters. The examples above are ready to paste—use them as a base, then layer in your product logic.

If you ship something with this, share a link—I’d love to see your call flows, response times, and UI patterns!
