Code and Trust

Posted on • Originally published at codeandtrust.com

How to Give Your Self-Hosted AI Agent Inbound Phone Calls (OpenClaw + Twilio)

Cross-posted from the Code and Trust blog. Canonical: codeandtrust.com/blog/openclaw-phone-calls

Your self-hosted OpenClaw agent can answer emails, send Slack messages, and query your calendar, but when someone calls your business number, the agent is nowhere to be found. This guide shows how to fix that with a 50-line Twilio webhook and clawcall, an open-source (MIT-licensed) bridge.

The problem: realtime voice mode can't use gateway tools

OpenClaw's voice-call plugin has a realtime mode with a strong conversational feel: sub-second latency and barge-in support. But until recently (#71272), realtime mode ran as an isolated audio session that bypassed the gateway's tool registry entirely. Ask it to check your calendar or send a message, and it would politely decline.

clawcall takes a different approach: route the audio through the gateway's normal agent turn pipeline.

inbound call → Twilio → STT → chat.send() → agent (full tool access) → TTS → Twilio → caller

The agent turn is a standard chat.send message to your gateway, so every skill, tool, and memory lookup works exactly as it does in your Telegram or Discord channel. The only difference is that input arrives as transcribed audio and output leaves as synthesized speech.

What you need

  • A running OpenClaw gateway (any version with the chat API)
  • A Twilio account with a phone number
  • Node.js 18+ (or Bun)
  • ngrok or a public URL for local dev

Setup in 5 minutes

1. Clone clawcall

git clone https://github.com/CODEANDTRUST/clawcall
cd clawcall
npm install

2. Configure environment

OPENCLAW_GATEWAY_URL=http://localhost:4000
OPENCLAW_API_KEY=your-key
TWILIO_ACCOUNT_SID=ACxxxxxxx
TWILIO_AUTH_TOKEN=your-auth-token
STT_PROVIDER=deepgram        # or whisper
TTS_PROVIDER=elevenlabs      # or openai-tts
PORT=3000
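
If you want the bridge to fail fast on a missing value, a startup check along these lines can help. This is a minimal sketch, not clawcall's actual loader: the variable names come from the example above, while the `loadConfig` helper, the choice of which variables are required, and the fallback defaults are assumptions made for illustration.

```javascript
// Hypothetical startup check; variable names match the example above.
// Which values are required and which have defaults is an assumption.
const REQUIRED = [
  'OPENCLAW_GATEWAY_URL',
  'OPENCLAW_API_KEY',
  'TWILIO_ACCOUNT_SID',
  'TWILIO_AUTH_TOKEN',
];

function loadConfig(env = process.env) {
  // Collect every missing name so the error reports them all at once.
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }
  return {
    gatewayUrl: env.OPENCLAW_GATEWAY_URL,
    apiKey: env.OPENCLAW_API_KEY,
    twilioAccountSid: env.TWILIO_ACCOUNT_SID,
    twilioAuthToken: env.TWILIO_AUTH_TOKEN,
    sttProvider: env.STT_PROVIDER || 'deepgram',   // assumed default
    ttsProvider: env.TTS_PROVIDER || 'elevenlabs', // assumed default
    port: Number(env.PORT || 3000),                // assumed default
  };
}
```

Calling `loadConfig()` once before the server starts listening turns a typo in a variable name into an immediate, readable error instead of a failed call later.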

3. Expose with ngrok

ngrok http 3000

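Copy the HTTPS forwarding URL that ngrok prints (for example, https://xxxx.ngrok.io; the exact domain can differ depending on your ngrok version and plan).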

4. Wire Twilio

In your Twilio console → Phone Numbers → your number → Voice Configuration:

  • Incoming call webhook: https://xxxx.ngrok.io/call/incoming (HTTP POST)
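
It helps to know what arrives at that URL. Twilio sends the voice webhook as a form-encoded POST whose fields include `CallSid`, `From`, and `To`. The sketch below parses such a body with Node's built-in `URLSearchParams`; the `parseIncomingCall` helper and the `call:<CallSid>` session key format are hypothetical and only illustrate how a per-call identifier could be derived, not how clawcall does it.

```javascript
// Hypothetical helper: parse the form-encoded body of a Twilio voice webhook.
function parseIncomingCall(rawBody) {
  // URLSearchParams decodes percent-escapes (e.g. %2B becomes "+").
  const params = Object.fromEntries(new URLSearchParams(rawBody));
  if (!params.CallSid) {
    throw new Error('Not a Twilio voice webhook: CallSid is missing');
  }
  return {
    callSid: params.CallSid, // Twilio's unique ID for this call
    from: params.From,       // caller's number
    to: params.To,           // your Twilio number
    // Assumed key format for scoping conversation memory to one call.
    sessionKey: `call:${params.CallSid}`,
    params,
  };
}
```

If you use Express, `express.urlencoded({ extended: false })` does the same decoding and exposes the fields on `req.body`.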

5. Start the bridge

npm start

Call your Twilio number. The bridge answers, transcribes your speech with Deepgram (or Whisper), sends it to your OpenClaw gateway as a chat message, streams the text response through your TTS provider, and plays the audio back to you. If you ask the agent to check your calendar or search memory, it does — because it's a real gateway turn.

How the tool-access problem is solved

The key is that chat.send goes through the full agent runtime, not a sidecar realtime session. The gateway schedules a turn, runs tool calls, awaits results, and returns a response — exactly as it would for a text message. clawcall just wraps this in audio I/O.

// core of the bridge (simplified)
app.post('/call/incoming', async (req, res) => {
  const transcript = await stt(req.body.audioUrl);          // STT
  const response   = await gateway.chat({ message: transcript }); // full agent turn
  const audio      = await tts(response.text);              // TTS
  res.twiml(audio);                                         // back to Twilio
});
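
The last step of that handler, handing audio back to Twilio, means responding with TwiML. One way to do it, assuming the TTS output is hosted at a URL Twilio can fetch, is the `<Play>` verb. The sketch below builds that document by hand; `escapeXml` and `playTwiml` are illustrative names, and clawcall may return audio differently.

```javascript
// Escape text for safe inclusion in an XML document.
function escapeXml(text) {
  return String(text)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&apos;');
}

// Hypothetical helper: TwiML that plays one hosted audio file to the caller.
// Assumes audioUrl is publicly reachable by Twilio.
function playTwiml(audioUrl) {
  return (
    '<?xml version="1.0" encoding="UTF-8"?>' +
    `<Response><Play>${escapeXml(audioUrl)}</Play></Response>`
  );
}
```

In an Express handler this could be sent with `res.type('text/xml').send(playTwiml(url))`. Escaping matters because audio URLs often carry query strings with `&`, which would otherwise produce invalid XML.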

Production checklist

  • [ ] Replace the ngrok tunnel with a stable public URL (Railway, Render, and Fly.io all work)
  • [ ] Add X-Twilio-Signature validation (twilio.validateRequest)
  • [ ] Set OPENCLAW_AGENT_ID to route calls to a specific agent persona
  • [ ] Add a per-call session key if you want conversation memory scoped to the call
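
For the signature item, the official helper is `twilio.validateRequest(authToken, signature, url, params)`. To show what that check does, here is a dependency-free sketch of the scheme Twilio documents for form-encoded webhooks: append each POST parameter's name and value to the full URL in alphabetical order of name, sign the result with HMAC-SHA1 keyed by the auth token, and Base64-encode it. The function names are illustrative; in production the library helper is the safer choice.

```javascript
const crypto = require('node:crypto');

// Compute the signature Twilio is expected to send for this URL and params.
function expectedTwilioSignature(authToken, url, params) {
  // Concatenate URL + key1 + value1 + key2 + value2 ..., keys sorted by name.
  const data = Object.keys(params)
    .sort()
    .reduce((acc, key) => acc + key + params[key], url);
  return crypto
    .createHmac('sha1', authToken)
    .update(Buffer.from(data, 'utf-8'))
    .digest('base64');
}

// Compare against the X-Twilio-Signature header in constant time.
function isValidTwilioRequest(authToken, signatureHeader, url, params) {
  const expected = Buffer.from(expectedTwilioSignature(authToken, url, params));
  const received = Buffer.from(String(signatureHeader || ''));
  // timingSafeEqual throws on length mismatch, so check lengths first.
  return (
    expected.length === received.length &&
    crypto.timingSafeEqual(expected, received)
  );
}
```

The `url` argument must be the exact public URL configured in the Twilio console, including the scheme and path. Behind a proxy or tunnel, the URL your server sees can differ from that one, which is a common cause of failed validation.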

Where to go from here


Built by Code and Trust — AI agent infrastructure for businesses.
