Cross-posted from the Code and Trust blog. Canonical: codeandtrust.com/blog/openclaw-phone-calls
Your self-hosted OpenClaw agent can answer emails, send Slack messages, and query your calendar — but if someone calls your business number, the agent is nowhere to be found. This guide shows how to fix that with a 50-line Twilio webhook and an open-source bridge called clawcall (MIT).
The problem: realtime voice mode can't use gateway tools
OpenClaw's voice-call plugin has a realtime mode with an excellent conversational feel: sub-second latency and barge-in support. But until recently (#71272), realtime mode ran as an isolated audio session that bypassed the gateway's tool registry entirely. Ask it to check your calendar or send a message, and it would politely decline.
clawcall takes a different approach: route the audio through the gateway's normal agent turn pipeline.
inbound call → Twilio → STT → chat.send() → agent (full tool access) → TTS → Twilio → caller
The agent turn is a standard chat.send message to your gateway, so every skill, tool, and memory lookup works exactly as it does in your Telegram or Discord channel. The only difference is that input arrives as transcribed audio and output leaves as synthesized speech.
What you need
- A running OpenClaw gateway (any version with the chat API)
- A Twilio account with a phone number
- Node.js 18+ (or Bun)
- ngrok or a public URL for local dev
Setup in 5 minutes
1. Clone clawcall
git clone https://github.com/CODEANDTRUST/clawcall
cd clawcall
npm install
2. Configure environment
OPENCLAW_GATEWAY_URL=http://localhost:4000
OPENCLAW_API_KEY=your-key
TWILIO_ACCOUNT_SID=ACxxxxxxx
TWILIO_AUTH_TOKEN=your-auth-token
STT_PROVIDER=deepgram # or whisper
TTS_PROVIDER=elevenlabs # or openai-tts
PORT=3000
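A missing or mistyped variable tends to surface only once the first call comes in, so it can help to check the environment at startup. The following is a minimal sketch, not clawcall's actual config loader: the function name `loadConfig`, the choice of which variables are required, and the defaults (taken from the sample values above) are all assumptions.

```javascript
// Hypothetical startup check; clawcall's real config handling may differ.
const REQUIRED = [
  'OPENCLAW_GATEWAY_URL',
  'OPENCLAW_API_KEY',
  'TWILIO_ACCOUNT_SID',
  'TWILIO_AUTH_TOKEN',
];

function loadConfig(env = process.env) {
  // Collect every missing name so one error lists them all.
  const missing = REQUIRED.filter((name) => !env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing environment variables: ${missing.join(', ')}`);
  }

  // new URL() throws on a malformed value, which catches typos early.
  const gatewayUrl = new URL(env.OPENCLAW_GATEWAY_URL);

  const port = Number(env.PORT ?? 3000);
  if (!Number.isInteger(port) || port < 1 || port > 65535) {
    throw new Error(`PORT must be an integer between 1 and 65535, got "${env.PORT}"`);
  }

  return {
    gatewayUrl: gatewayUrl.href,
    apiKey: env.OPENCLAW_API_KEY,
    twilioAccountSid: env.TWILIO_ACCOUNT_SID,
    twilioAuthToken: env.TWILIO_AUTH_TOKEN,
    sttProvider: env.STT_PROVIDER ?? 'deepgram',   // assumed default
    ttsProvider: env.TTS_PROVIDER ?? 'elevenlabs', // assumed default
    port,
  };
}
```

Passing the environment in as a parameter keeps the check easy to exercise in a test without touching `process.env`.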
3. Expose with ngrok
ngrok http 3000
Copy the https://xxxx.ngrok.io URL.
4. Wire Twilio
In your Twilio console → Phone Numbers → your number → Voice Configuration:
- Incoming call webhook: https://xxxx.ngrok.io/call/incoming (HTTP POST)
5. Start the bridge
npm start
Call your Twilio number. The bridge answers, transcribes your speech with Deepgram (or Whisper), sends it to your OpenClaw gateway as a chat message, streams the text response through your TTS provider, and plays the audio back to you. If you ask the agent to check your calendar or search memory, it does — because it's a real gateway turn.
How the tool-access problem is solved
The key is that chat.send goes through the full agent runtime, not a sidecar realtime session. The gateway schedules a turn, runs tool calls, awaits results, and returns a response — exactly as it would for a text message. clawcall just wraps this in audio I/O.
// core of the bridge (simplified)
app.post('/call/incoming', async (req, res) => {
  const transcript = await stt(req.body.audioUrl); // STT
  const response = await gateway.chat({ message: transcript }); // full agent turn
  const audio = await tts(response.text); // TTS
  res.twiml(audio); // back to Twilio
});
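The same three-stage shape can be written so that it runs without Twilio, a gateway, or a speech provider, which makes the "audio I/O around a normal turn" idea easy to test. This is a sketch under assumptions: `handleTurn`, the `deps` object, and the `sessionKey` field are illustrative names, not clawcall's or OpenClaw's API. `CallSid` is the call identifier Twilio includes in its voice webhooks.

```javascript
// Illustrative only: the three stages are injected as plain async functions.
async function handleTurn(deps, { callSid, audio }) {
  const transcript = await deps.stt(audio);
  if (!transcript.trim()) {
    return null; // silence or noise: skip the agent turn entirely
  }
  // The agent sees an ordinary text message; nothing here is voice-specific.
  const reply = await deps.agent({
    message: transcript,
    sessionKey: `call:${callSid}`, // hypothetical per-call memory scope
  });
  return deps.tts(reply.text);
}

// Stub stages standing in for Deepgram/Whisper, the gateway, and the TTS provider.
const fakeDeps = {
  stt: async (audio) => audio.toString('utf8'),
  agent: async ({ message, sessionKey }) => ({
    text: `[${sessionKey}] you said: ${message}`,
  }),
  tts: async (text) => Buffer.from(text, 'utf8'),
};
```

Swapping `fakeDeps` for real provider clients changes the I/O but not the control flow, which is the point of routing voice through the standard turn.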
Production checklist
- [ ] Replace the ngrok tunnel with a real public URL (Railway, Render, and Fly.io all work)
- [ ] Add X-Twilio-Signature validation (twilio.validateRequest)
- [ ] Set OPENCLAW_AGENT_ID to route calls to a specific agent persona
- [ ] Add a per-call session key if you want conversation memory scoped to the call
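For the signature item, it helps to see what the check actually computes. The sketch below reimplements Twilio's documented scheme with Node's built-in crypto module: append the POST parameters, sorted by name, as name-plus-value pairs to the full request URL, sign with HMAC-SHA1 using the auth token, and base64-encode the result. The function names are hypothetical, and in production the `twilio.validateRequest` helper from the checklist is the safer choice; this version only illustrates the mechanics.

```javascript
const { createHmac, timingSafeEqual } = require('node:crypto');

// base64(HMAC-SHA1(authToken, url + sorted params appended as key+value))
function twilioSignature(authToken, url, params) {
  const data = Object.keys(params)
    .sort()
    .reduce((acc, key) => acc + key + params[key], url);
  return createHmac('sha1', authToken).update(data, 'utf8').digest('base64');
}

// Compares the X-Twilio-Signature header against the expected value.
function isValidTwilioRequest(authToken, headerSignature, url, params) {
  const expected = Buffer.from(twilioSignature(authToken, url, params));
  const received = Buffer.from(String(headerSignature ?? ''));
  // timingSafeEqual throws on a length mismatch, so compare lengths first.
  return expected.length === received.length && timingSafeEqual(expected, received);
}
```

The `url` argument has to be the exact public URL Twilio requested (the ngrok or production address, including the path), not the localhost address the bridge listens on. A mismatch there is a common reason valid requests fail validation behind a tunnel or proxy.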
Where to go from here
- clawcall on GitHub — MIT, PRs welcome
- Full guide with architecture diagram
- Related: openclaw/openclaw#71262 — the upstream issue this approach addresses
Built by Code and Trust — AI agent infrastructure for businesses.