Sunil Kumar Dash

for Composio

Posted on Feb 16 • Originally published at composio.dev

I rebuilt OpenClaw from scratch without the security flaws

#programming #javascript #ai #productivity

OpenClaw launched with great fanfare, and I was curious whether you could truly "vibe code" the entire project on your own, especially since the original creator built it with Codex. We're in the era of "build it yourself instead of setting it up" and I wanted to take that philosophy a step further by recreating it from scratch.

This is the story of how I rebuilt OpenClaw using modern coding agent SDKs, tackled integration challenges across multiple messaging platforms, and deployed it securely in production, all while avoiding the security pitfalls of the original.

Check out the repository here: Secure OpenClaw

Research & Planning

The first thing I did was use GPT Pro mode to research the entire codebase and explain all the features and tools used. The Pro model excels at these broad tasks that require processing large amounts of information in a single shot. It gave me a detailed product spec on how OpenClaw works and what it uses for each functionality.

I decided to use coding agent SDKs because they represent the first real use cases people have had with LLMs beyond writing. Claude provides the Claude Agent SDK, and OpenCode provides a similar SDK. These SDKs natively provide access to tools like read, write, bash, edit, and support for skills and MCP (Model Context Protocol).

Architecture Overview

I wanted to set up two modes:

Terminal mode: For direct interaction and development
Gateway mode: For 24/7 operation, listening to WhatsApp, Telegram, Signal, iMessage, and other messaging apps

The gateway architecture is what makes OpenClaw powerful: it runs continuously in the background, monitoring multiple communication channels and responding autonomously.
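
To make the gateway idea concrete, here is a minimal sketch, assuming each channel adapter is an EventEmitter that emits `message` events and exposes a `send(to, text)` method. The names `Gateway`, `envelope`, `register`, and `reply` are hypothetical and not taken from the project; the real adapters may be shaped quite differently.

```javascript
const { EventEmitter } = require("node:events");

// Tag a raw adapter message with the channel it came from.
function envelope(channel, msg) {
  return { channel, from: msg.from, text: msg.text, receivedAt: Date.now() };
}

// Hypothetical gateway: one EventEmitter that every channel adapter feeds into.
class Gateway extends EventEmitter {
  constructor() {
    super();
    this.adapters = new Map();
  }

  // adapter is assumed to emit "message" events and expose send(to, text).
  register(channel, adapter) {
    this.adapters.set(channel, adapter);
    adapter.on("message", (msg) => this.emit("message", envelope(channel, msg)));
  }

  // Route a reply back through the adapter the message arrived on.
  reply(env, text) {
    return this.adapters.get(env.channel).send(env.from, text);
  }
}
```

With this shape, the agent only subscribes to the gateway's `message` event and never needs to know which platform a message came from.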

Messaging Platform Integrations

WhatsApp

WhatsApp integration uses a library called Baileys to establish a WhatsApp Web connection. Here's how it works:

Baileys connects to WhatsApp Web's WebSocket
When a message arrives, WhatsApp's server pushes it via WebSocket
Baileys emits a messages.upsert event with type 'notify'
The agent can then process and respond to the message
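
The steps above can be sketched as a small handler for the payload passed to `messages.upsert` listeners. This assumes the payload looks like `{ type, messages }` and that each message carries `key.remoteJid`, `key.fromMe`, and its text in `message.conversation` or `message.extendedTextMessage.text`; check this against the Baileys version you install. `extractIncomingTexts` and `handleMessage` are hypothetical names.

```javascript
// Hypothetical helper: pull plain-text, inbound messages out of a
// messages.upsert payload. Only 'notify' batches are treated as new live
// messages; other batch types are ignored here.
function extractIncomingTexts(upsert) {
  if (!upsert || upsert.type !== "notify") return [];
  const out = [];
  for (const m of upsert.messages ?? []) {
    // Skip empty stubs and messages sent by our own account.
    if (!m.message || !m.key || m.key.fromMe) continue;
    const text = m.message.conversation ?? m.message.extendedTextMessage?.text;
    if (typeof text === "string" && text.length > 0) {
      out.push({ from: m.key.remoteJid, id: m.key.id, text });
    }
  }
  return out;
}

// Wiring sketch (not executed here); sock is assumed to be a connected Baileys socket:
// sock.ev.on("messages.upsert", (upsert) => {
//   for (const msg of extractIncomingTexts(upsert)) handleMessage(msg);
// });
```

Keeping the filtering in a pure function makes it easy to test without a live WhatsApp session.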

One challenge I encountered was creating the allowlist for WhatsApp numbers. WhatsApp doesn't use phone numbers directly in the WebSocket connection; it uses link IDs. Messages arrive with these IDs, so I needed bidirectional conversion between phone numbers and link IDs. Claude Code initially struggled to build the right mapping, but after some iteration, we got it working correctly.
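
One way to picture the bidirectional mapping is the sketch below. It assumes phone-based JIDs look like `<digits>@s.whatsapp.net` and link IDs look like `<digits>@lid`, and it leaves open how a phone/link-ID pair is first discovered, since the post does not say. `IdentityMap`, `learn`, `isAllowed`, and `normalizePhone` are hypothetical names, not the project's code.

```javascript
// Keep digits only so "+1 (555) 010-0000" and "15550100000" compare equal.
function normalizePhone(raw) {
  return String(raw).replace(/\D/g, "");
}

// Hypothetical two-way map between phone numbers and WhatsApp link IDs.
class IdentityMap {
  constructor(allowedPhones) {
    this.allowed = new Set(allowedPhones.map(normalizePhone));
    this.phoneToLid = new Map();
    this.lidToPhone = new Map();
  }

  // Record a phone <-> link ID pair once it has been observed.
  learn(phone, lid) {
    const p = normalizePhone(phone);
    this.phoneToLid.set(p, lid);
    this.lidToPhone.set(lid, p);
  }

  // Accepts either a phone JID or a link-ID JID and checks the allowlist.
  isAllowed(jid) {
    const [user, server] = String(jid).split("@");
    const phone =
      server === "lid"
        ? this.lidToPhone.get(jid)
        : normalizePhone(user.split(":")[0]); // drop an optional ":device" suffix
    return phone !== undefined && this.allowed.has(phone);
  }
}
```

An unknown link ID is rejected by default, which is the safer failure mode for an allowlist.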

Telegram

Telegram was much more straightforward thanks to its Bot API. The implementation uses long polling:

Calls Telegram's getUpdates API in a loop
Each call waits up to 30 seconds for new messages
When a message arrives, the call returns immediately and the loop calls getUpdates again
Emits a message event for each new message

The Bot API is well-documented and significantly easier to set up than WhatsApp.
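
The long-polling loop described above can be sketched as follows, assuming Node 18+ for the global `fetch`. The `getUpdates` method and its `offset` and `timeout` parameters are part of the Bot API; `buildGetUpdatesUrl`, `nextOffset`, `pollTelegram`, `onMessage`, and `signal` are hypothetical names for illustration, and error handling is kept deliberately thin.

```javascript
// Build the getUpdates URL. timeout is the long-poll wait in seconds.
function buildGetUpdatesUrl(token, offset, timeout = 30) {
  const url = new URL(`https://api.telegram.org/bot${token}/getUpdates`);
  url.searchParams.set("timeout", String(timeout));
  if (offset !== undefined) url.searchParams.set("offset", String(offset));
  return url.toString();
}

// Telegram treats updates as confirmed once offset is higher than their
// update_id, so the next request asks for the highest update_id seen plus one.
function nextOffset(updates, current) {
  let next = current;
  for (const u of updates) {
    if (next === undefined || u.update_id >= next) next = u.update_id + 1;
  }
  return next;
}

// Long-poll loop: each request blocks server-side until a message arrives
// or the timeout elapses, then the loop immediately asks again.
async function pollTelegram(token, onMessage, signal) {
  let offset;
  while (!signal?.aborted) {
    const res = await fetch(buildGetUpdatesUrl(token, offset), { signal });
    const body = await res.json();
    if (!body.ok) throw new Error(`getUpdates failed: ${body.description}`);
    for (const update of body.result) {
      if (update.message) onMessage(update.message);
    }
    offset = nextOffset(body.result, offset);
  }
}
```

Passing an `AbortSignal` gives the gateway a clean way to stop the loop on shutdown.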

iMessage

iMessage integration was a fascinating unlock. It uses a library called imsg, built by Peter Steinberger himself. The approach:

Reads the SQLite database where all iMessages are stored
Monitors the database using FSEvents, a kernel-level file system monitoring API on macOS
Detects new messages in real-time as they're written to the database

This gives the agent access to iMessage without requiring any official API.
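
As a rough illustration of the watch-then-query pattern (not imsg's actual implementation), the sketch below uses Node's `fs.watch`, which relies on FSEvents when watching a directory on macOS. It assumes the Messages database lives at `~/Library/Messages/chat.db` and that rows have a numeric `rowid`. `createCursor`, `watchMessages`, and the injected `fetchRowsAfter` query function are hypothetical.

```javascript
const fs = require("node:fs");
const os = require("node:os");
const path = require("node:path");

// Default location of the Messages database directory on macOS.
const MESSAGES_DIR = path.join(os.homedir(), "Library", "Messages");

// Hypothetical cursor: remembers the highest rowid seen and returns only newer rows.
function createCursor(startRowId = 0) {
  let last = startRowId;
  return {
    get last() {
      return last;
    },
    takeNew(rows) {
      const fresh = rows
        .filter((r) => r.rowid > last)
        .sort((a, b) => a.rowid - b.rowid);
      if (fresh.length > 0) last = fresh[fresh.length - 1].rowid;
      return fresh;
    },
  };
}

// Watch the directory and re-query whenever chat.db or a sibling file such as
// chat.db-wal changes. fetchRowsAfter(rowid) is a hypothetical function that
// runs the SQLite query and returns rows shaped like { rowid, text }.
function watchMessages(fetchRowsAfter, onMessage, dir = MESSAGES_DIR) {
  const cursor = createCursor();
  return fs.watch(dir, (_event, filename) => {
    if (!filename || !String(filename).startsWith("chat.db")) return;
    for (const row of cursor.takeNew(fetchRowsAfter(cursor.last))) {
      onMessage(row);
    }
  });
}
```

The cursor keeps repeated file-change events from delivering the same message twice.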

Tools & Integrations

As they say, an agent is nothing without the tools it uses. I equipped the agent with:

Core Tools:

Read, Write, Edit (file operations)
Bash (command execution)
Glob, Grep (file searching)
TodoWrite (task management)
Skill (access to predefined workflows)
AskUserQuestion (user interaction)

Custom Tools:

Cron tools for scheduled tasks
Gateway tools for WhatsApp and Telegram communication

Third-Party Integrations:

For secure integration with services like Slack, GitHub, Teams, and more, I used Composio. Composio lets you securely connect and use these tools in a sandbox environment while handling all the credentials and authentication.

Deployment Challenges

The Docker Setup

I created a Docker setup designed to run in the background on a DigitalOcean droplet. The goal was to make it quickly deployable without too many setup hassles. However, I ran into several issues:

Problem 1: OOM (Out of Memory) Errors

Running on a $6/month instance with 2GB RAM, the container kept crashing. The issue? The build tried to install Claude Code and OpenAI's SDK at the same time, exhausting available memory. Once I identified this, I staggered the installations so they ran one after the other, and the problem was resolved.

Problem 2: Permission Mode Conflicts

The gateway uses permissionMode: 'bypassPermissions' so the agent can run autonomously without human approval for each tool call. However, Claude Code refuses to enable this when running as root, which is a built-in security feature.

The Solution:

I had to restructure the entire Dockerfile to use a non-root user:

# Create non-root user (Claude Code refuses bypassPermissions as root)
RUN useradd -m -s /bin/bash claw && chown -R claw:claw /app
USER claw

This cascaded into fixing:

All file paths (/root/ → /home/claw/)
Docker Compose volume mounts
CLI installation directories
Workspace permissions

The refactoring took several hours but resulted in a much more secure deployment that adheres to best practices.
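
If you want your own gateway to fail fast with a clear message instead of discovering the root restriction deep inside a container log, a startup guard along these lines is one option. This is a minimal sketch that mirrors the behavior described above; `resolvePermissionMode` is a hypothetical helper and not part of Claude Code or any SDK.

```javascript
// Hypothetical startup guard: refuse bypassPermissions when the process
// runs as root (uid 0), mirroring the restriction described above.
function resolvePermissionMode(requested, uid) {
  if (requested === "bypassPermissions" && uid === 0) {
    throw new Error(
      "bypassPermissions is not allowed as root; run the gateway as a non-root user"
    );
  }
  return requested;
}

// Usage sketch (not executed here). process.getuid is unavailable on
// Windows, hence the feature check:
// const uid = typeof process.getuid === "function" ? process.getuid() : undefined;
// const permissionMode = resolvePermissionMode("bypassPermissions", uid);
```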

Key Takeaways

Modern coding agents are incredibly capable - With proper tooling and context, they can rebuild complex systems from scratch
Security by design matters - The forced non-root user setup, while initially frustrating, led to a more secure architecture
Integration complexity varies wildly - Telegram took 30 minutes, WhatsApp took hours, iMessage required creative solutions
Resource constraints force better architecture - The 2GB RAM limitation pushed me to optimize installation and runtime behavior
Documentation is everything - Services with good APIs (like Telegram) are significantly easier to integrate than those requiring reverse engineering

What's Next

The rebuilt OpenClaw is now running in production, handling messages across multiple platforms without the security issues that plagued the original. Future improvements include:

Adding more messaging platforms (Discord, Slack DMs)
Implementing better error handling and retry logic
Creating a web dashboard for monitoring and configuration
Optimizing memory usage to run on even smaller instances

Building this from scratch was an excellent exercise in understanding how modern AI agents work in production. The combination of LLM capabilities, proper tooling, and careful architecture makes it possible to create powerful autonomous systems that were previously extremely difficult to build.
