OpenClaw launched with great fanfare, and I was curious whether you could truly "vibe code" the entire project on your own, especially since the original creator built it with Codex. We're in the era of "build it yourself instead of setting it up," and I wanted to take that philosophy a step further by recreating it from scratch.
This is the story of how I rebuilt OpenClaw using modern coding agent SDKs, tackled integration challenges across multiple messaging platforms, and deployed it securely in production, all while avoiding the security pitfalls of the original.
Check out the repository here: Secure OpenClaw
Research & Planning
The first thing I did was use GPT Pro mode to research the entire codebase and explain all the features and tools used. The Pro model excels at these broad tasks that require processing large amounts of information in a single shot. It gave me a detailed product spec on how OpenClaw works and what it uses for each functionality.
I decided to use coding agent SDKs because they represent the first real use cases people have had with LLMs beyond writing. Claude provides the Claude Agent SDK, and OpenCode provides a similar SDK. These SDKs natively provide access to tools like read, write, bash, edit, and support for skills and MCP (Model Context Protocol).
Architecture Overview
I wanted to set up two modes:
- Terminal mode: For direct interaction and development
- Gateway mode: For 24/7 operation, listening to WhatsApp, Telegram, Signal, iMessage, and other messaging apps
The gateway architecture is what makes OpenClaw powerful: it runs continuously in the background, monitoring multiple communication channels and responding autonomously.
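To make the gateway idea concrete, here is a minimal sketch of a router that takes messages from any channel, hands them to the agent, and replies on the same channel. The names `Gateway`, `ChannelAdapter`, `IncomingMessage`, and `AgentFn` are hypothetical and chosen for illustration; the article does not describe the actual class layout.

```typescript
// Hypothetical shapes for illustration; not taken from the OpenClaw codebase.
interface IncomingMessage {
  channel: string; // e.g. "whatsapp" or "telegram"
  sender: string;  // channel-specific sender ID
  text: string;
}

interface ChannelAdapter {
  name: string;
  // Deliver a reply to a sender on this channel.
  send(to: string, text: string): Promise<void>;
}

// The agent is modelled as a function from a message to a reply.
type AgentFn = (msg: IncomingMessage) => Promise<string>;

class Gateway {
  private adapters = new Map<string, ChannelAdapter>();
  private agent: AgentFn;

  constructor(agent: AgentFn) {
    this.agent = agent;
  }

  register(adapter: ChannelAdapter): void {
    this.adapters.set(adapter.name, adapter);
  }

  // Called by an adapter whenever its platform delivers a message.
  async handle(msg: IncomingMessage): Promise<void> {
    const adapter = this.adapters.get(msg.channel);
    if (!adapter) return; // unknown channel: drop the message
    const reply = await this.agent(msg);
    await adapter.send(msg.sender, reply);
  }
}
```

Terminal mode could reuse the same agent function directly, with the gateway adding only the routing layer on top.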
Messaging Platform Integrations
WhatsApp
WhatsApp integration uses a library called Baileys to establish a WhatsApp Web connection. Here's how it works:
- Baileys connects to WhatsApp Web's WebSocket
- When a message arrives, WhatsApp's server pushes it via WebSocket
- Baileys emits a messages.upsert event with type 'notify'
- The agent can then process and respond to the message
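The flow above can be sketched as a small function that takes a messages.upsert payload and returns only the newly arrived text messages. The local types below mirror just the fields used here and are an assumed subset of Baileys' message shape; `extractInbound` and `InboundText` are hypothetical names, and skipping the agent's own messages is my addition rather than something the article states.

```typescript
// Minimal local types covering only the fields this sketch reads (assumed
// subset of Baileys' message shape); a real project would use the library's types.
interface WaMessage {
  key: { remoteJid?: string | null; fromMe?: boolean | null };
  message?: {
    conversation?: string | null;
    extendedTextMessage?: { text?: string | null } | null;
  } | null;
}

interface UpsertEvent {
  messages: WaMessage[];
  type: "notify" | "append";
}

interface InboundText {
  from: string;
  text: string;
}

// Extract plain-text messages from a messages.upsert payload.
// Only 'notify' batches are treated as newly arrived messages, and the
// account's own outgoing messages are skipped to avoid reply loops.
function extractInbound(event: UpsertEvent): InboundText[] {
  if (event.type !== "notify") return [];
  const out: InboundText[] = [];
  for (const m of event.messages) {
    if (m.key.fromMe || !m.key.remoteJid) continue;
    const text = m.message?.conversation ?? m.message?.extendedTextMessage?.text;
    if (text) out.push({ from: m.key.remoteJid, text });
  }
  return out;
}
```

With Baileys, a function like this would typically be called from the handler registered for the messages.upsert event on the socket's event emitter.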
One challenge I encountered was creating the allowlist for WhatsApp numbers. WhatsApp doesn't use phone numbers directly in the WebSocket connection; it uses link IDs. Messages arrive with these IDs, and I needed bidirectional conversion between phone numbers and link IDs. Claude Code initially struggled with building the right mapping, but after some iteration, we got it working correctly.
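One way to structure that bidirectional conversion is a pair of maps kept in sync, with the allowlist stored as normalized phone numbers. This is a sketch under assumptions, not the repository's code: `IdentityMap`, `Allowlist`, and `normalizePhone` are hypothetical names, and the ID formats (an "@lid" suffix for link IDs, "number@s.whatsapp.net" for phone-based IDs) are assumed for illustration.

```typescript
// Strip everything except digits so "+1 (555) 010-0000" and "15550100000" match.
function normalizePhone(raw: string): string {
  return raw.replace(/\D/g, "");
}

// Two maps kept in sync so lookups work in both directions.
class IdentityMap {
  private phoneToLid = new Map<string, string>();
  private lidToPhone = new Map<string, string>();

  // Record a pairing, removing any stale entry on either side first.
  link(phone: string, lid: string): void {
    const p = normalizePhone(phone);
    const oldLid = this.phoneToLid.get(p);
    if (oldLid !== undefined) this.lidToPhone.delete(oldLid);
    const oldPhone = this.lidToPhone.get(lid);
    if (oldPhone !== undefined) this.phoneToLid.delete(oldPhone);
    this.phoneToLid.set(p, lid);
    this.lidToPhone.set(lid, p);
  }

  phoneFor(lid: string): string | undefined {
    return this.lidToPhone.get(lid);
  }

  lidFor(phone: string): string | undefined {
    return this.phoneToLid.get(normalizePhone(phone));
  }
}

class Allowlist {
  private allowed: Set<string>;
  private ids: IdentityMap;

  constructor(phones: string[], ids: IdentityMap) {
    this.allowed = new Set(phones.map(normalizePhone));
    this.ids = ids;
  }

  // Link IDs must resolve to a known phone number; unmapped link IDs are rejected.
  isAllowed(senderId: string): boolean {
    if (senderId.endsWith("@lid")) {
      const phone = this.ids.phoneFor(senderId);
      return phone !== undefined && this.allowed.has(phone);
    }
    // Phone-based ID: keep only the part before any ":" or "@".
    const user = senderId.split(/[:@]/)[0] ?? "";
    return this.allowed.has(normalizePhone(user));
  }
}
```

Rejecting unmapped link IDs by default is a deliberate choice here: for an allowlist, failing closed is safer than guessing.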
Telegram
Telegram was much more straightforward thanks to its Bot API. The implementation uses long polling:
- Periodically calls Telegram's getUpdates API
- Waits up to 30 seconds for new messages
- When a message arrives, it immediately returns and calls getUpdates again
- Emits a message event for each new message
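The long-polling loop above can be sketched as follows. To keep the example self-contained, the HTTP call is passed in as a function: `GetUpdates`, `pollOnce`, and `pollLoop` are hypothetical names, and `TgUpdate` models only the fields used here. The offset handling follows the Bot API's convention of requesting the highest seen update_id plus one to acknowledge earlier updates.

```typescript
// Subset of Telegram's Update object used in this sketch.
interface TgUpdate {
  update_id: number;
  message?: { chat: { id: number }; text?: string };
}

// Fetches updates starting at `offset`, waiting up to `timeoutSec` seconds.
// A real implementation would call the Bot API's getUpdates method over HTTPS.
type GetUpdates = (offset: number, timeoutSec: number) => Promise<TgUpdate[]>;

type OnMessage = (chatId: number, text: string) => void;

// One long-poll round: fetch, emit each text message, return the next offset.
async function pollOnce(
  getUpdates: GetUpdates,
  offset: number,
  onMessage: OnMessage,
): Promise<number> {
  const updates = await getUpdates(offset, 30);
  let next = offset;
  for (const u of updates) {
    const msg = u.message;
    if (msg && msg.text !== undefined) onMessage(msg.chat.id, msg.text);
    // Asking for update_id + 1 next time acknowledges this update.
    next = Math.max(next, u.update_id + 1);
  }
  return next;
}

// Keep polling until shouldStop() returns true.
async function pollLoop(
  getUpdates: GetUpdates,
  onMessage: OnMessage,
  shouldStop: () => boolean,
): Promise<void> {
  let offset = 0;
  while (!shouldStop()) {
    offset = await pollOnce(getUpdates, offset, onMessage);
  }
}
```

Error handling and backoff are left out for brevity; a production loop would need both, since a failed request would otherwise end the loop.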
The Bot API is well-documented and significantly easier to set up than WhatsApp.
iMessage
iMessage integration was a fascinating unlock. It uses a library called imsg, built by Peter Steinberger himself. The approach:
- Reads the SQLite database where all iMessages are stored
- Monitors the database using FSEvents, a kernel-level file system monitoring API on macOS
- Detects new messages in real-time as they're written to the database
This gives the agent access to iMessage without requiring any official API.
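The article does not describe how imsg tracks which messages are new, so the following is only one plausible design: keep a cursor on the highest row ID seen and, on every file-change notification, fetch the rows above it. `MessageCursor`, `MessageRow`, and `QueryNewRows` are hypothetical names, and the database query is passed in as a function so the sketch stays self-contained.

```typescript
// Row shape assumed for this sketch (a small subset of what the database holds).
interface MessageRow {
  rowid: number;
  sender: string;
  text: string | null;
  isFromMe: boolean;
}

// Hypothetical query function: returns rows with rowid > afterRowId, ascending.
type QueryNewRows = (afterRowId: number) => MessageRow[];

class MessageCursor {
  private lastRowId: number;
  private query: QueryNewRows;

  constructor(query: QueryNewRows, startAfter: number = 0) {
    this.query = query;
    this.lastRowId = startAfter;
  }

  // Call on every change notification; returns nothing if no new rows exist.
  drain(): MessageRow[] {
    const rows = this.query(this.lastRowId);
    const fresh: MessageRow[] = [];
    for (const row of rows) {
      if (row.rowid <= this.lastRowId) continue; // defensive: skip repeats
      this.lastRowId = row.rowid;
      // Advance past our own messages and empty rows without emitting them.
      if (!row.isFromMe && row.text) fresh.push(row);
    }
    return fresh;
  }
}
```

Because file-system notifications can fire several times for a single write, making `drain` safe to call repeatedly matters more than reacting to each event exactly once.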
Tools & Integrations
As they say, an agent is nothing without the tools it uses. I equipped the agent with:
Core Tools:
- Read, Write, Edit (file operations)
- Bash (command execution)
- Glob, Grep (file searching)
- TodoWrite (task management)
- Skill (access to predefined workflows)
- AskUserQuestion (user interaction)
Custom Tools:
- Cron tools for scheduled tasks
- Gateway tools for WhatsApp and Telegram communication
Third-Party Integrations: For secure integration with services like Slack, GitHub, Teams, and more, I used Composio. Composio lets you securely connect and use these tools in a sandbox environment while handling all the credentials and authentication.
Deployment Challenges
The Docker Setup
I created a Docker setup designed to run in the background on a DigitalOcean droplet. The goal was to make it quickly deployable without too many setup hassles. However, I ran into several issues:
Problem 1: OOM (Out of Memory) Errors
Running on a $6/month instance with 2GB RAM, the container kept crashing. The issue? It tried installing Claude Code and OpenAI's SDK simultaneously, exhausting available memory. Once I identified this, I staggered the installations and the problem was resolved.
Problem 2: Permission Mode Conflicts
The gateway uses permissionMode: 'bypassPermissions' so the agent can run autonomously without human approval for each tool call. However, Claude Code refuses to enable this when running as root, a built-in security feature.
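A startup check can surface this constraint early instead of letting it fail deep inside the agent. The sketch below is an assumption about how such a guard might look, not the project's code: `buildGatewayOptions`, `GatewayAgentOptions`, and `AgentPermissionMode` are hypothetical names, and only the `permissionMode` value comes from the article.

```typescript
// Only the two modes relevant to this sketch are modelled.
type AgentPermissionMode = "default" | "bypassPermissions";

interface GatewayAgentOptions {
  permissionMode: AgentPermissionMode;
  allowedTools: string[];
}

// `uid` is the numeric user ID of the running process (0 is root on POSIX).
// Fails fast with a clear message rather than starting a gateway that cannot work.
function buildGatewayOptions(uid: number, allowedTools: string[]): GatewayAgentOptions {
  if (uid === 0) {
    throw new Error("Refusing to start: bypassPermissions requires a non-root user");
  }
  return { permissionMode: "bypassPermissions", allowedTools };
}
```

In a Node.js process the user ID would come from `process.getuid()`, and the returned object would then be passed to the agent SDK as part of its options.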
The Solution:
I had to restructure the entire Dockerfile to use a non-root user:
```dockerfile
# Create non-root user (Claude Code refuses bypassPermissions as root)
RUN useradd -m -s /bin/bash claw && chown -R claw:claw /app
USER claw
```
This cascaded into fixing:
- All file paths (/root/ → /home/claw/)
- Docker Compose volume mounts
- CLI installation directories
- Workspace permissions
The refactoring took several hours but resulted in a much more secure deployment that adheres to best practices.
Key Takeaways
- Modern coding agents are incredibly capable - With proper tooling and context, they can rebuild complex systems from scratch
- Security by design matters - The forced non-root user setup, while initially frustrating, led to a more secure architecture
- Integration complexity varies wildly - Telegram took 30 minutes, WhatsApp took hours, iMessage required creative solutions
- Resource constraints force better architecture - The 2GB RAM limitation pushed me to optimize installation and runtime behavior
- Documentation is everything - Services with good APIs (like Telegram) are significantly easier to integrate than those requiring reverse engineering
What's Next
The rebuilt OpenClaw is now running in production, handling messages across multiple platforms without the security issues that plagued the original. Future improvements include:
- Adding more messaging platforms (Discord, Slack DMs)
- Implementing better error handling and retry logic
- Creating a web dashboard for monitoring and configuration
- Optimizing memory usage to run on even smaller instances
Building this from scratch was an excellent exercise in understanding how modern AI agents work in production. The combination of LLM capabilities, proper tooling, and careful architecture makes it possible to create powerful autonomous systems that were previously extremely difficult to build.