Building a 24/7 AI Agent with OpenClaw - Architecture Deep Dive

By Siraj Raval

Most AI agent tutorials show you toy examples: chatbots that answer questions, scripts that summarize text, demos that work perfectly on cherry-picked data.

This isn't one of those.

I'm going to show you the architecture behind OpenClaw - a production AI agent that's been running 24/7 for 6 months, handling my email, calendar, finances, and daily operations with minimal downtime.

This is what actually shipping AI agents looks like.

System Overview

Core Requirements

Before jumping into code, let's define what a production AI agent actually needs:

  1. Persistent state - Memory that survives crashes and restarts
  2. Multi-modal tools - Ability to interact with APIs, databases, browsers, and file systems
  3. Proactive execution - Run tasks without waiting for user commands
  4. Safe boundaries - Guardrails to prevent catastrophic mistakes
  5. Observable behavior - Comprehensive logging for debugging and trust

Most AI agent frameworks focus on #2 (tools) while ignoring the others. That's why they fail in production.

High-Level Architecture

┌─────────────────────────────────────────────────┐
│                   User Inputs                    │
│  (WhatsApp, Terminal, Heartbeat Scheduler)       │
└────────────────┬────────────────────────────────┘
                 │
                 ▼
┌─────────────────────────────────────────────────┐
│              OpenClaw Core Engine                │
│  ┌──────────────────────────────────────────┐   │
│  │  LLM Reasoning Layer (Claude Opus 4)     │   │
│  └──────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────┐   │
│  │  Memory System                           │   │
│  │  - Daily logs (memory/YYYY-MM-DD.md)     │   │
│  │  - Long-term memory (MEMORY.md)          │   │
│  │  - Semantic search (ChromaDB)            │   │
│  └──────────────────────────────────────────┘   │
│  ┌──────────────────────────────────────────┐   │
│  │  Tool Execution Layer                    │   │
│  │  - Gmail/Calendar API                    │   │
│  │  - Browser automation                    │   │
│  │  - Voice calls (VAPI)                    │   │
│  │  - File system operations                │   │
│  │  - Shell commands                        │   │
│  └──────────────────────────────────────────┘   │
└─────────────────┬───────────────────────────────┘
                  │
                  ▼
┌─────────────────────────────────────────────────┐
│             External Integrations                │
│  Gmail • Calendar • WhatsApp • Airbnb • Amazon   │
│  Bank APIs • Travel APIs • Voice Services        │
└─────────────────────────────────────────────────┘

Tech Stack

Runtime:

  • Node.js 22 (persistent process)
  • TypeScript for type safety (optional but recommended)
  • PM2 for process management and auto-restart

LLM:

  • Primary: Anthropic Claude Opus 4 (200K context, strong reasoning)
  • Fallback: OpenAI GPT-4o (faster, cheaper for simple tasks)

Storage:

  • Memory: Markdown files + ChromaDB (vector embeddings)
  • Structured data: PostgreSQL (optional, for analytics)
  • Credentials: Encrypted vault (JSON with restricted permissions)

Infrastructure:

  • AWS EC2 t3.medium (4GB RAM, 2 vCPU)
  • Ubuntu 22.04 LTS
  • Nginx reverse proxy (for web interfaces)

Cost: ~$150/month (compute + API calls)

Core Components

1. Memory System

This is the most critical piece. Without reliable memory, your agent is just an expensive chatbot.

Daily Logs

Every session writes to memory/YYYY-MM-DD.md. Format:

# Memory Log - 2026-03-12

## 08:30 - Morning Brief
- Checked email: 3 sponsor inquiries, 1 urgent payment reminder
- Calendar: Meeting with team at 14:00 CET
- Action: Drafted responses to sponsors

## 10:15 - Sponsor Follow-up
- John from TechCorp responded, accepting $3.5K deal
- Generated invoice, sent via email
- Added to sponsor pipeline spreadsheet
- Set reminder for payment (NET30, due April 11)

## 15:45 - Travel Booking
- Siraj requested flights to Lisbon May 15-22
- Searched 3 options: KLM €245, TAP €198, Ryanair €89
- Recommended TAP (balance of cost + convenience)
- Awaiting confirmation to book

Why plain markdown?

  • Human-readable (can grep/edit manually)
  • Version-controllable (git tracks changes)
  • LLM-friendly (easy to inject into context)
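Because the format is just markdown with a predictable header, producing it programmatically is trivial. Here's a minimal sketch of an append helper; `appendMemory` and the exact file layout are my assumptions based on the format shown above, not OpenClaw's actual code:

```javascript
// Sketch: append a timestamped entry to today's memory/YYYY-MM-DD.md.
// appendMemory is a hypothetical helper, not OpenClaw's real API.
import fs from 'node:fs';
import path from 'node:path';

export function appendMemory(title, bullets, dir = 'memory', now = new Date()) {
  const date = now.toISOString().slice(0, 10);   // YYYY-MM-DD
  const time = now.toISOString().slice(11, 16);  // HH:MM (UTC here; adjust for local tz)
  const file = path.join(dir, `${date}.md`);
  fs.mkdirSync(dir, { recursive: true });
  // Write the day header only when the file is first created
  const header = fs.existsSync(file) ? '' : `# Memory Log - ${date}\n`;
  const entry = `\n## ${time} - ${title}\n${bullets.map(b => `- ${b}`).join('\n')}\n`;
  fs.appendFileSync(file, header + entry);
  return file;
}
```

Append-only writes also mean a crash mid-session loses at most the current entry, never the whole day.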

Semantic Search

Daily logs grow large. Searching linearly is slow. ChromaDB provides fast semantic lookup.

Setup:

# memory_store.py - embedding + retrieval
import os
import re
from glob import glob

import chromadb
from chromadb.utils import embedding_functions

client = chromadb.PersistentClient(path="./chroma_db")
ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key=os.environ["OPENAI_API_KEY"],
    model_name="text-embedding-3-small"
)

collection = client.get_or_create_collection(
    name="memory",
    embedding_function=ef
)

def split_into_chunks(text, max_chars=1000):
    """Naive fixed-size chunking; swap in a smarter splitter if needed."""
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def extract_date(filename):
    """Pull the YYYY-MM-DD date out of a memory filename."""
    match = re.search(r"\d{4}-\d{2}-\d{2}", filename)
    return match.group(0) if match else ""

# Index memory files
def index_memories():
    for file in glob("memory/*.md"):
        with open(file) as f:
            content = f.read()
        chunks = split_into_chunks(content, max_chars=1000)
        collection.add(
            documents=chunks,
            ids=[f"{file}_{i}" for i in range(len(chunks))],
            metadatas=[{"source": file, "date": extract_date(file)}] * len(chunks)
        )

# Search
def search_memory(query, n=5):
    results = collection.query(query_texts=[query], n_results=n)
    return results["documents"][0]

Usage:

$ python3 memory_store.py search "sponsor deal TechCorp"
# Returns relevant chunks about TechCorp negotiations

This runs in <200ms and scales to thousands of files.

2. Tool System

AI agents are only as useful as the tools they can use. OpenClaw has 20+ integrated tools.

Tool Definition Pattern

Each tool follows a standard interface:

// tools/gmail.js
export const gmailTool = {
  name: "gmail_search",
  description: "Search Gmail messages with optional filters",
  parameters: {
    query: { type: "string", required: true },
    maxResults: { type: "number", default: 10 },
    includeBody: { type: "boolean", default: false }
  },

  async execute({ query, maxResults, includeBody }) {
    const auth = await getGoogleAuth();
    const gmail = google.gmail({ version: 'v1', auth });

    const response = await gmail.users.messages.list({
      userId: 'me',
      q: query,
      maxResults
    });

    // messages.list returns no `messages` field when there are zero hits
    const messages = await Promise.all(
      (response.data.messages || []).map(msg =>
        gmail.users.messages.get({ userId: 'me', id: msg.id })
      )
    );

    return messages.map(formatMessage);
  }
};

Why this pattern?

  • LLM can understand tool from description alone
  • Type-safe execution (prevents malformed calls)
  • Easy to add new tools without touching core logic
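One payoff of the uniform shape: tool definitions can be converted mechanically into the JSON-schema format that LLM tool-calling APIs expect. A hedged sketch (`toLLMSchema` is my name and the output shape loosely follows Anthropic's `input_schema` convention, not OpenClaw's actual code):

```javascript
// Convert a tool definition of the shape above into an LLM tool schema.
// Hypothetical helper; the exact target shape depends on your LLM provider.
function toLLMSchema(tool) {
  const properties = {};
  const required = [];
  for (const [name, spec] of Object.entries(tool.parameters)) {
    properties[name] = { type: spec.type };
    if (spec.required) required.push(name); // defaults stay optional
  }
  return {
    name: tool.name,
    description: tool.description,
    input_schema: { type: 'object', properties, required },
  };
}
```

With this, registering a new tool means writing one object; the schema the model sees is derived, never hand-maintained.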

Browser Automation

For sites without APIs (Airbnb, Amazon, Marktplaats), browser automation is essential.

Local Playwright (last resort):

// tools/browser.js
import { chromium } from 'playwright';

export async function browserTask(taskDescription) {
  const browser = await chromium.launch({ headless: true });
  try {
    const context = await browser.newContext({
      userAgent: 'Mozilla/5.0...',
      viewport: { width: 1920, height: 1080 }
    });

    const page = await context.newPage();

    // Task-specific logic here (navigate, extract, act)

  } finally {
    // Always close, even if the task throws — leaked headless
    // Chromium processes will eat your 4GB instance
    await browser.close();
  }
}

Browser-Use Cloud (preferred for hostile sites):

// tools/browser-use.js
import axios from 'axios';

export async function browserUseTask({ url, instructions }) {
  const session = await axios.post('https://api.browser-use.com/api/v2/sessions', {
    proxyCountryCode: 'nl',
    keepAlive: true
  }, {
    headers: { 'X-Browser-Use-API-Key': process.env.BROWSER_USE_KEY }
  });

  const result = await axios.post(
    `https://api.browser-use.com/api/v2/sessions/${session.data.id}/tasks`,
    { type: 'navigate_and_extract', url, instructions },
    { headers: { 'X-Browser-Use-API-Key': process.env.BROWSER_USE_KEY } }
  );

  return result.data;
}

When to use which:

  • API exists? → Use API (fastest, cheapest)
  • Simple scraping? → Playwright (no external cost)
  • Login/CAPTCHA/bot detection? → Browser-Use Cloud
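Those three rules are simple enough to encode as a dispatcher. An illustrative sketch (names are mine, not OpenClaw's routing code):

```javascript
// Pick the cheapest access method that will actually work for a site.
function chooseAccessMethod({ hasApi = false, needsLogin = false, botDetection = false }) {
  if (hasApi) return 'api';                                   // fastest, cheapest
  if (needsLogin || botDetection) return 'browser-use-cloud'; // hostile sites
  return 'playwright';                                        // simple scraping, no external cost
}
```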

3. Proactive Heartbeat System

Most AI agents are reactive. OpenClaw is proactive.

Implementation:

// heartbeat.js
import cron from 'node-cron';

// Run every 30 minutes
cron.schedule('*/30 * * * *', async () => {
  const context = await loadContext();

  // Check if this is quiet hours (11 PM - 8 AM Amsterdam)
  if (isQuietHours(context.timezone)) {
    return; // Skip heartbeat
  }

  const prompt = `
You are running a proactive heartbeat check.

Tasks to consider:
1. Check for urgent emails (unread from last 30 min)
2. Review calendar for upcoming events (<2h away)
3. Monitor sponsor deal pipeline for overdue payments
4. Check for new opportunities (Twitter mentions, etc.)

Current time: ${new Date().toISOString()}
Last heartbeat: ${context.lastHeartbeat}

If something needs attention, notify Siraj via WhatsApp.
Otherwise, reply with HEARTBEAT_OK and continue background work.
`;

  const response = await callLLM(prompt, context);

  // trim() because models often pad the sentinel with whitespace
  if (response.trim() !== 'HEARTBEAT_OK') {
    await sendWhatsApp(response);
  }

  await updateLastHeartbeat();
});

Key features:

  • Respects quiet hours
  • Batches multiple checks (efficiency)
  • Only notifies when necessary
  • Runs indefinitely without supervision
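The `isQuietHours` check referenced above might look like this. A sketch using `Intl.DateTimeFormat`; the 23:00–08:00 window comes from the comment in the heartbeat code, and the injectable `date` parameter is my addition for testability:

```javascript
// True between 23:00 and 08:00 in the given IANA timezone.
// `date` defaults to now; passing it explicitly makes the function testable.
function isQuietHours(timezone, date = new Date(), start = 23, end = 8) {
  const hour = Number(new Intl.DateTimeFormat('en-US', {
    hour: 'numeric', hour12: false, timeZone: timezone,
  }).format(date));
  return hour >= start || hour < end;
}
```

Doing the timezone math with `Intl` avoids pulling in a date library just for one check.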

4. Safety & Boundaries

An agent with access to email, calendar, and finances can do a lot of damage. Safety isn't optional.

Permission Levels

const PERMISSION_LEVELS = {
  READ: 'read',         // Safe, always allowed
  DRAFT: 'draft',       // Generate content, don't send
  EXECUTE: 'execute',   // Take action automatically
  ASK: 'ask'           // Require explicit approval
};

const toolPermissions = {
  'gmail_search': PERMISSION_LEVELS.READ,
  'gmail_send': PERMISSION_LEVELS.DRAFT,  // Drafts only
  'calendar_add': PERMISSION_LEVELS.EXECUTE,
  'bank_transfer': PERMISSION_LEVELS.ASK,  // Always ask
};

Validation Layer

async function executeTool(toolName, params) {
  const permission = toolPermissions[toolName];

  if (permission === PERMISSION_LEVELS.ASK) {
    // Blocks until explicitly approved; throws on rejection
    await requestApproval(toolName, params);
    return await tools[toolName].execute(params);
  }

  if (permission === PERMISSION_LEVELS.READ || permission === PERMISSION_LEVELS.EXECUTE) {
    return await tools[toolName].execute(params);
  }

  // DRAFT mode: return generated content without executing
  return { status: 'draft', content: await tools[toolName].preview(params) };
}

This prevents catastrophic mistakes like:

  • Sending unreviewed emails to sponsors
  • Transferring money without approval
  • Deleting important files

5. Error Handling & Resilience

Production systems fail. Plan for it.

Retry Logic

async function robustAPICall(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      if (error.status === 429) {
        // Rate limited, exponential backoff
        await sleep(Math.pow(2, i) * 1000);
      } else if (error.status >= 500) {
        // Server error, retry
        await sleep(1000);
      } else {
        // Client error, don't retry
        throw error;
      }
    }
  }
  throw new Error(`Failed after ${maxRetries} retries`);
}

Graceful Degradation

async function sendEmail(to, subject, body) {
  try {
    await gmail.send({ to, subject, body });
  } catch (error) {
    try {
      // Gmail API failed, try SMTP fallback
      await smtpSend({ to, subject, body });
    } catch (fallbackError) {
      // Both failed, save to drafts and alert
      await saveDraft({ to, subject, body });
      await notify("Email send failed, saved to drafts");
    }
  }
}

Comprehensive Logging

// logger.js
import winston from 'winston';

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.json(),
  transports: [
    new winston.transports.File({ filename: 'logs/error.log', level: 'error' }),
    new winston.transports.File({ filename: 'logs/combined.log' }),
    new winston.transports.Console({ format: winston.format.simple() })
  ]
});

// Log every tool execution
function logToolExecution(toolName, params, result, duration) {
  logger.info({
    event: 'tool_execution',
    tool: toolName,
    params,
    result: result.status,
    durationMs: duration,
    timestamp: new Date().toISOString()
  });
}

Logs have saved me countless times when debugging "why did it do that?" moments.

Deployment & Operations

Infrastructure Setup

# EC2 instance (t3.medium, Ubuntu 22.04)
$ ssh user@your-ec2-ip

# Install dependencies
$ curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
$ sudo apt-get install -y nodejs git
$ npm install -g pm2

# Clone OpenClaw
$ git clone https://github.com/yourusername/openclaw.git
$ cd openclaw
$ npm install

# Set up credentials
$ cp .env.example .env
$ nano .env  # Add API keys

# Start with PM2
$ pm2 start openclaw.js --name "openclaw-agent"
$ pm2 save
$ pm2 startup  # Auto-restart on reboot

Monitoring

# View logs
$ pm2 logs openclaw-agent

# Monitor resource usage
$ pm2 monit

# Check status
$ pm2 status

Backup Strategy

#!/bin/bash
# Daily backup script (run via cron); shebang must be the first line
tar -czf backup-$(date +%Y%m%d).tar.gz memory/ MEMORY.md config/
aws s3 cp backup-$(date +%Y%m%d).tar.gz s3://openclaw-backups/

Performance Optimization

LLM Call Reduction

The biggest cost is LLM API calls. Optimize aggressively.

Before optimization:

  • Every heartbeat: 1 LLM call
  • Every email check: 1 call per email
  • Cost: ~$300/month

After optimization:

  • Batch email processing (1 call for 10 emails)
  • Cache routine decisions (no LLM for spam detection)
  • Use fast models for simple tasks
  • Cost: ~$80/month

Code example:

// Batch processing
async function processEmails(emails) {
  // Instead of N LLM calls, make 1 batched call
  const prompt = `
Review these ${emails.length} emails and categorize:
${emails.map((e, i) => `${i+1}. From: ${e.from}, Subject: ${e.subject}`).join('\n')}

Output a single JSON array: [{"id": 1, "category": "urgent|normal|spam", "action": "reply|archive|flag"}, ...]
`;

  const result = await callLLM(prompt);
  return JSON.parse(result);
}
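One caveat: `JSON.parse` on raw model output is fragile, because models often wrap JSON in prose or a code fence. A small defensive extraction step helps (a sketch; `parseLLMJson` is my name, not OpenClaw's):

```javascript
// Extract the first JSON array embedded in LLM output before parsing.
// Models frequently surround JSON with prose or ``` fences.
function parseLLMJson(text) {
  const match = text.match(/\[[\s\S]*\]/);
  if (!match) throw new Error('No JSON array found in LLM output');
  return JSON.parse(match[0]);
}
```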

Caching Strategies

import NodeCache from 'node-cache';
const cache = new NodeCache({ stdTTL: 3600 }); // 1 hour TTL

async function getCachedGoogleAuth() {
  const cached = cache.get('google_auth');
  if (cached) return cached;

  const auth = await freshGoogleAuth();
  cache.set('google_auth', auth);
  return auth;
}
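The same idea applies to LLM outputs themselves: "cache routine decisions" can be as simple as memoizing verdicts per sender, so repeat emails never hit the API. A hedged sketch (`classifySender` is illustrative, not OpenClaw's code):

```javascript
// Memoize spam/not-spam verdicts per sender: repeat senders cost zero LLM calls.
const verdictCache = new Map();

async function classifySender(from, classifyWithLLM) {
  if (verdictCache.has(from)) return verdictCache.get(from); // cache hit: free
  const verdict = await classifyWithLLM(from);               // cache miss: one LLM call
  verdictCache.set(from, verdict);
  return verdict;
}
```

In practice you'd add a TTL or size bound, but even this naive map eliminates most classification calls for a mailbox dominated by recurring senders.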

Lessons Learned

After 6 months running OpenClaw in production, here's what I've learned:

1. Start with Read-Only

Don't give your agent write permissions on day 1. Start with:

  • Read emails
  • Summarize calendar
  • Generate drafts (don't send)

Build trust over weeks before enabling autonomous actions.
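In the permission-level scheme from earlier, a read-only rollout is just a restrictive permission map. An illustrative sketch:

```javascript
// Hypothetical week-1 permission map: nothing can act autonomously.
const week1Permissions = {
  gmail_search: 'read',
  calendar_list: 'read',
  gmail_send: 'draft',   // generate drafts, never send
};

// Anything not explicitly listed as read or draft is denied.
function isAllowed(toolName) {
  return ['read', 'draft'].includes(week1Permissions[toolName]);
}
```

Promoting a tool to `execute` then becomes a deliberate one-line config change rather than a code change.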

2. Logging Saves Everything

When something goes wrong (and it will), comprehensive logs are the only way to debug. Log:

  • Every tool call
  • Every LLM prompt + response
  • Every decision point
  • Execution time for performance analysis

3. Memory Architecture Matters More Than Model Choice

I've swapped between GPT-4, Claude, and others. The model matters less than:

  • Quality of memory retrieval
  • Context provided in prompts
  • Clear tool descriptions

A mediocre model with great memory > amazing model with no memory.

4. Edge Cases Are Infinite

You will never anticipate all failure modes. Plan for:

  • Malformed API responses
  • Unexpected user inputs
  • Network failures mid-operation
  • Rate limits at 3 AM

Defensive coding isn't optional.

What's Next

OpenClaw is evolving rapidly. Upcoming features:

  1. Multi-agent orchestration - Spawn specialized sub-agents for complex tasks
  2. Voice interface - Talk to OpenClaw naturally via phone
  3. Mobile app - Native iOS/Android apps
  4. Community marketplace - Share and install tool plugins
  5. Self-improvement - Agent analyzes its own failures and updates code

The vision: every developer should be able to deploy their own AI agent in <1 hour.

Get Involved

OpenClaw is open-source. Want to contribute?

  • GitHub: github.com/llsourcell (search for OpenClaw repo)
  • Issues: Bug reports, feature requests
  • PRs: New tools, optimizations, docs

Follow Siraj Raval for updates:

  • YouTube: youtube.com/@SirajRaval (775K+ subscribers)
  • GitHub: github.com/llsourcell
  • LinkedIn: linkedin.com/in/sirajraval

This is what production AI looks like. Not perfect, but real. Ship it.


Tags: #AI #MachineLearning #OpenClaw #AIAgents #ProductionAI #NodeJS #Claude #Automation #DevOps #SirajRaval

Meta Description: Siraj Raval breaks down the architecture of OpenClaw, a production AI agent running 24/7. Code examples, deployment guide, and lessons learned from 6 months in production.
