Sunil Kumar Dash for Composio

Posted on • Originally published at composio.dev

I rebuilt OpenClaw from scratch without the security flaws

OpenClaw launched with great fanfare, and I was curious whether you could truly "vibe code" the entire project on your own, especially since the original creator built it with Codex. We're in the era of "build it yourself instead of setting it up," and I wanted to take that philosophy a step further by recreating it from scratch.

This is the story of how I rebuilt OpenClaw using modern coding agent SDKs, tackled integration challenges across multiple messaging platforms, and deployed it securely in production, all while avoiding the security pitfalls of the original.

Check out the repository here: Secure OpenClaw


Research & Planning

The first thing I did was use GPT Pro mode to research the entire codebase and explain all the features and tools used. The Pro model excels at these broad tasks that require processing large amounts of information in a single shot. It gave me a detailed product spec on how OpenClaw works and what it uses for each functionality.

I decided to use coding agent SDKs because coding agents are among the first real use cases people have found for LLMs beyond writing. Anthropic provides the Claude Agent SDK, and OpenCode provides a similar SDK. These SDKs natively provide access to tools like read, write, bash, and edit, along with support for skills and MCP (Model Context Protocol).


Architecture Overview

I wanted to set up two modes:

  • Terminal mode: For direct interaction and development
  • Gateway mode: For 24/7 operation, listening to WhatsApp, Telegram, Signal, iMessage, and other messaging apps

The gateway architecture is what makes OpenClaw powerful: it runs continuously in the background, monitoring multiple communication channels and responding autonomously.
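
As a rough illustration of the gateway idea, here is a minimal sketch in which several channel adapters feed one handler. All names here (ChannelAdapter, Gateway, InboundMessage) are hypothetical and are not taken from the Secure OpenClaw repository. The handler is synchronous to keep the example short; a real agent call would almost certainly be asynchronous.

```typescript
// Hypothetical types for illustration only.
interface InboundMessage {
  channel: string; // e.g. "whatsapp" or "telegram"
  sender: string;  // channel-specific sender ID
  text: string;
}

// Stand-in for the agent: takes a message, returns the reply text.
type MessageHandler = (msg: InboundMessage) => string;

interface ChannelAdapter {
  name: string;
  // The gateway passes in a callback that the adapter invokes per inbound message.
  onMessage(deliver: (sender: string, text: string) => void): void;
  send(recipient: string, text: string): void;
}

class Gateway {
  private adapters = new Map<string, ChannelAdapter>();
  private handler: MessageHandler;

  constructor(handler: MessageHandler) {
    this.handler = handler;
  }

  // Wire an adapter so its inbound messages reach the handler and the
  // handler's reply goes back out on the same channel.
  register(adapter: ChannelAdapter): void {
    this.adapters.set(adapter.name, adapter);
    adapter.onMessage((sender, text) => {
      const reply = this.handler({ channel: adapter.name, sender, text });
      adapter.send(sender, reply);
    });
  }
}
```

The point of the shape is that each messaging platform only has to implement the small adapter interface, while the agent logic stays in one place.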

Messaging Platform Integrations

WhatsApp

WhatsApp integration uses a library called Baileys to establish a WhatsApp Web connection. Here's how it works:

  • Baileys connects to WhatsApp Web's WebSocket
  • When a message arrives, WhatsApp's server pushes it via WebSocket
  • Baileys emits a messages.upsert event with type 'notify'
  • The agent can then process and respond to the message
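
The event handling above can be sketched as a small filter function. The types below are a trimmed-down local copy of the fields this sketch reads from the messages.upsert payload, not Baileys' own type definitions, and extractIncomingTexts is a hypothetical helper name.

```typescript
// Simplified local types mirroring the parts of the messages.upsert payload
// used here. They are assumptions for illustration, not Baileys' exported types.
interface UpsertMessage {
  key: { remoteJid?: string | null; fromMe?: boolean | null };
  message?: {
    conversation?: string | null;
    extendedTextMessage?: { text?: string | null } | null;
  } | null;
}

interface UpsertEvent {
  messages: UpsertMessage[];
  type: "notify" | "append";
}

interface IncomingText {
  from: string;
  text: string;
}

// Keep only newly pushed ('notify') text messages that we did not send ourselves.
function extractIncomingTexts(event: UpsertEvent): IncomingText[] {
  if (event.type !== "notify") return [];
  const out: IncomingText[] = [];
  for (const m of event.messages) {
    if (m.key.fromMe) continue;
    const from = m.key.remoteJid;
    // Plain text arrives in `conversation`; replies and link previews use
    // `extendedTextMessage.text`.
    const text = m.message?.conversation ?? m.message?.extendedTextMessage?.text;
    if (from && text) out.push({ from, text });
  }
  return out;
}
```

With a Baileys socket, a function like this would typically be called from inside a `sock.ev.on('messages.upsert', ...)` listener, with each result forwarded to the agent.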

One challenge I encountered was creating the allowlist for WhatsApp numbers. WhatsApp doesn't use phone numbers directly in the WebSocket connection; it uses link IDs. Messages arrive with these IDs, and I needed bidirectional conversion between phone numbers and link IDs. Claude Code initially struggled with building the right mapping, but after some iteration, we got it working correctly.
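
One way to picture the bidirectional conversion is a pair of maps kept in sync, with the allowlist check going through the link ID. This is a sketch under assumptions, not the mapping used in the repository: IdentityMap and isAllowed are hypothetical names, and how phone/link-ID pairs are discovered in the first place is left out.

```typescript
// Hypothetical two-way lookup between phone numbers and link IDs.
class IdentityMap {
  private phoneToLink = new Map<string, string>();
  private linkToPhone = new Map<string, string>();

  // Record a pair, dropping any stale pairing so both directions stay consistent.
  link(phone: string, linkId: string): void {
    const oldLink = this.phoneToLink.get(phone);
    if (oldLink !== undefined) this.linkToPhone.delete(oldLink);
    const oldPhone = this.linkToPhone.get(linkId);
    if (oldPhone !== undefined) this.phoneToLink.delete(oldPhone);
    this.phoneToLink.set(phone, linkId);
    this.linkToPhone.set(linkId, phone);
  }

  phoneFor(linkId: string): string | undefined {
    return this.linkToPhone.get(linkId);
  }

  linkFor(phone: string): string | undefined {
    return this.phoneToLink.get(phone);
  }
}

// A sender is allowed only if its link ID resolves to an allowlisted phone number.
// Unknown link IDs are denied by default.
function isAllowed(linkId: string, ids: IdentityMap, allowlist: Set<string>): boolean {
  const phone = ids.phoneFor(linkId);
  return phone !== undefined && allowlist.has(phone);
}
```

Denying unresolved IDs by default matters here: a message whose sender cannot be mapped back to a phone number should not reach the agent.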

Telegram

Telegram was much more straightforward thanks to its Bot API. The implementation uses long polling:

  • Calls Telegram's getUpdates API in a loop
  • Each call waits up to 30 seconds for new messages
  • When a message arrives, the call returns immediately and the loop calls getUpdates again
  • Emits a message event for each new message
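
A minimal sketch of one long-polling cycle follows. The getUpdates method and its offset and timeout parameters come from the Telegram Bot API; everything else (the GetJson transport type, pollOnce, nextOffset, getUpdatesUrl) is a hypothetical name chosen for this example, and the HTTP call is injected so the sketch does not depend on a particular client.

```typescript
// Response shapes follow the Bot API's getUpdates result, trimmed to the
// fields used here.
interface TelegramUpdate {
  update_id: number;
  message?: { chat: { id: number }; text?: string };
}

interface GetUpdatesResponse {
  ok: boolean;
  result: TelegramUpdate[];
}

// Hypothetical transport: any function that GETs a URL and returns parsed JSON.
type GetJson = (url: string) => Promise<GetUpdatesResponse>;

// `timeout` is in seconds; a positive value makes Telegram hold the request
// open until an update arrives or the timeout elapses (long polling).
function getUpdatesUrl(token: string, offset: number, timeoutSec: number): string {
  return `https://api.telegram.org/bot${token}/getUpdates?offset=${offset}&timeout=${timeoutSec}`;
}

// The next offset is one greater than the highest update_id seen. Sending it
// on the next call also tells Telegram the earlier updates were received.
function nextOffset(current: number, updates: TelegramUpdate[]): number {
  let next = current;
  for (const u of updates) next = Math.max(next, u.update_id + 1);
  return next;
}

// One poll cycle: wait for updates, emit each one, return the offset to use next.
async function pollOnce(
  getJson: GetJson,
  token: string,
  offset: number,
  onUpdate: (update: TelegramUpdate) => void,
): Promise<number> {
  const res = await getJson(getUpdatesUrl(token, offset, 30));
  if (!res.ok) return offset;
  for (const u of res.result) onUpdate(u);
  return nextOffset(offset, res.result);
}
```

A gateway would call pollOnce in a loop, feeding each returned offset into the next call. Forgetting to advance the offset is the usual mistake with this pattern, because Telegram then redelivers the same updates on every request.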

The Bot API is well-documented and significantly easier to set up than WhatsApp.

iMessage

iMessage integration was a fascinating unlock. It uses a library called imsg, built by Peter Steinberger himself. The approach:

  • Reads the SQLite database where all iMessages are stored
  • Monitors the database using FSEvents, a kernel-level file system monitoring API on macOS
  • Detects new messages in real-time as they're written to the database

This gives the agent access to iMessage without requiring any official API.
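
A file-change notification only says that the database changed, not which rows are new, so the watcher needs some way to tell fresh messages from ones it has already handled. The sketch below shows one common approach, a row-ID watermark. It is not imsg's actual code: the MessageRow shape and the NewMessageTracker class are assumptions made for illustration, and the query that produces the rows is left out.

```typescript
// Assumed row shape: an increasing numeric row ID plus sender and text,
// as a query over the Messages database might return.
interface MessageRow {
  rowId: number;
  sender: string;
  text: string | null;
  isFromMe: boolean;
}

// Hypothetical helper that remembers the highest row ID already processed.
class NewMessageTracker {
  private lastRowId: number;

  constructor(startAfterRowId: number) {
    this.lastRowId = startAfterRowId;
  }

  // Call this each time the file watcher reports a change.
  // Returns unseen inbound text rows in order and advances the watermark.
  collect(rows: MessageRow[]): MessageRow[] {
    const fresh = rows
      .filter((r) => r.rowId > this.lastRowId)
      .sort((a, b) => a.rowId - b.rowId);
    // Advance past every fresh row, including our own outgoing messages,
    // so they are not re-examined on the next change event.
    for (const r of fresh) this.lastRowId = Math.max(this.lastRowId, r.rowId);
    return fresh.filter((r) => !r.isFromMe && r.text !== null);
  }
}
```

Because the watermark advances even for rows that are filtered out, a burst of change events for the same write yields each message only once.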


Tools & Integrations

As they say, an agent is nothing without the tools it uses. I equipped the agent with:

Core Tools:

  • Read, Write, Edit (file operations)
  • Bash (command execution)
  • Glob, Grep (file searching)
  • TodoWrite (task management)
  • Skill (access to predefined workflows)
  • AskUserQuestion (user interaction)

Custom Tools:

  • Cron tools for scheduled tasks
  • Gateway tools for WhatsApp and Telegram communication

Third-Party Integrations: For secure integration with services like Slack, GitHub, Teams, and more, I used Composio. Composio lets you securely connect and use these tools in a sandbox environment while handling all the credentials and authentication.


Deployment Challenges

The Docker Setup

I created a Docker setup designed to run in the background on a DigitalOcean droplet. The goal was to make it quickly deployable without too many setup hassles. However, I ran into several issues:

Problem 1: OOM (Out of Memory) Errors

Running on a $6/month instance with 2GB RAM, the container kept crashing. The issue? It tried to install Claude Code and OpenAI's SDK at the same time, exhausting available memory. Once I identified this, I staggered the installations and the problem was resolved.

Problem 2: Permission Mode Conflicts

The gateway uses permissionMode: 'bypassPermissions' so the agent can run autonomously without human approval for each tool call. However, Claude Code refuses to enable this when running as root, which is a built-in security feature.

The Solution:

I had to restructure the entire Dockerfile to use a non-root user:

# Create non-root user (Claude Code refuses bypassPermissions as root)
RUN useradd -m -s /bin/bash claw && chown -R claw:claw /app
USER claw

This cascaded into fixing:

  • All file paths (/root/ → /home/claw/)
  • Docker Compose volume mounts
  • CLI installation directories
  • Workspace permissions

The refactoring took several hours but resulted in a much more secure deployment that adheres to best practices.
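
Since the root restriction only shows up once the agent starts, a small pre-flight check in the gateway can surface the misconfiguration earlier with a clearer message. This is a hypothetical addition, not something the repository is known to contain: checkPermissionMode is an invented name, and the uid is passed in as a plain number (in Node.js on POSIX systems it would typically come from process.getuid()).

```typescript
// Hypothetical pre-flight check, run before starting the agent.
// Returns an error message when the configuration cannot work, or null if it is fine.
function checkPermissionMode(mode: string, uid: number): string | null {
  // uid 0 is root; bypassPermissions is the mode that is refused for root.
  if (mode === "bypassPermissions" && uid === 0) {
    return "bypassPermissions requested while running as root (uid 0); run the container as a non-root user";
  }
  return null;
}
```

Calling this at startup and exiting on a non-null result turns a confusing runtime refusal into an explicit configuration error.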


Key Takeaways

  1. Modern coding agents are incredibly capable - With proper tooling and context, they can rebuild complex systems from scratch
  2. Security by design matters - The forced non-root user setup, while initially frustrating, led to a more secure architecture
  3. Integration complexity varies wildly - Telegram took 30 minutes, WhatsApp took hours, iMessage required creative solutions
  4. Resource constraints force better architecture - The 2GB RAM limitation pushed me to optimize installation and runtime behavior
  5. Documentation is everything - Services with good APIs (like Telegram) are significantly easier to integrate than those requiring reverse engineering

What's Next

The rebuilt OpenClaw is now running in production, handling messages across multiple platforms without the security issues that plagued the original. Future improvements include:

  • Adding more messaging platforms (Discord, Slack DMs)
  • Implementing better error handling and retry logic
  • Creating a web dashboard for monitoring and configuration
  • Optimizing memory usage to run on even smaller instances

Building this from scratch was an excellent exercise in understanding how modern AI agents work in production. The combination of LLM capabilities, proper tooling, and careful architecture makes it possible to create powerful autonomous systems that were previously extremely difficult to build.

Top comments (5)

Anmol Baranwal Composio

awesome work by the team! 🔥

when are you sharing about open-claude-cowork? I found that repo through google and found it really interesting.

Shrijal Acharya Composio

Insane!!

Guilherme Zaia

Interesting approach to rebuilding OpenClaw! Your integration of various messaging platforms highlights the practicality of leveraging agent SDKs. Integrating real-time database monitoring for iMessage is particularly clever—avoiding the need for official APIs is a significant advantage. I'm curious, did you encounter any unexpected challenges while handling the security aspects, especially concerning the different architectures of these messaging systems? 🔍

Debajyati Dey

Crazy!

Allen Lyons

I double checked the article, but I don't see what security flaws you've fixed in your release. Also, how do you handle securing against prompt injection?