Building a 24/7 AI Agent with OpenClaw - Architecture Deep Dive
By Siraj Raval
Most AI agent tutorials show you toy examples: chatbots that answer questions, scripts that summarize text, demos that work perfectly on cherry-picked data.
This isn't one of those.
I'm going to show you the architecture behind OpenClaw - a production AI agent that's been running 24/7 for 6 months, handling my email, calendar, finances, and daily operations with minimal downtime.
This is what actually shipping AI agents looks like.
System Overview
Core Requirements
Before jumping into code, let's define what a production AI agent actually needs:
- Persistent state - Memory that survives crashes and restarts
- Multi-modal tools - Ability to interact with APIs, databases, browsers, and file systems
- Proactive execution - Run tasks without waiting for user commands
- Safe boundaries - Guardrails to prevent catastrophic mistakes
- Observable behavior - Comprehensive logging for debugging and trust
Most AI agent frameworks focus on #2 (tools) while ignoring the others. That's why they fail in production.
High-Level Architecture
┌─────────────────────────────────────────────────┐
│                   User Inputs                   │
│    (WhatsApp, Terminal, Heartbeat Scheduler)    │
└────────────────────────┬────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────┐
│              OpenClaw Core Engine               │
│  ┌───────────────────────────────────────────┐  │
│  │    LLM Reasoning Layer (Claude Opus 4)    │  │
│  └───────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────┐  │
│  │  Memory System                            │  │
│  │  - Daily logs (memory/YYYY-MM-DD.md)      │  │
│  │  - Long-term memory (MEMORY.md)           │  │
│  │  - Semantic search (ChromaDB)             │  │
│  └───────────────────────────────────────────┘  │
│  ┌───────────────────────────────────────────┐  │
│  │  Tool Execution Layer                     │  │
│  │  - Gmail/Calendar API                     │  │
│  │  - Browser automation                     │  │
│  │  - Voice calls (VAPI)                     │  │
│  │  - File system operations                 │  │
│  │  - Shell commands                         │  │
│  └───────────────────────────────────────────┘  │
└────────────────────────┬────────────────────────┘
                         │
                         ▼
┌─────────────────────────────────────────────────┐
│             External Integrations               │
│  Gmail • Calendar • WhatsApp • Airbnb • Amazon  │
│    Bank APIs • Travel APIs • Voice Services     │
└─────────────────────────────────────────────────┘
Tech Stack
Runtime:
- Node.js 22 (persistent process)
- TypeScript for type safety (optional but recommended)
- PM2 for process management and auto-restart
LLM:
- Primary: Anthropic Claude Opus 4 (200K context, strong reasoning)
- Fallback: OpenAI GPT-4o (faster, cheaper for simple tasks)
Storage:
- Memory: Markdown files + ChromaDB (vector embeddings)
- Structured data: PostgreSQL (optional, for analytics)
- Credentials: Encrypted vault (JSON with restricted permissions)
Infrastructure:
- AWS EC2 t3.medium (4GB RAM, 2 vCPU)
- Ubuntu 22.04 LTS
- Nginx reverse proxy (for web interfaces)
Cost: ~$150/month (compute + API calls)
Core Components
1. Memory System
This is the most critical piece. Without reliable memory, your agent is just an expensive chatbot.
Daily Logs
Every session writes to memory/YYYY-MM-DD.md. Format:
# Memory Log - 2026-03-12
## 08:30 - Morning Brief
- Checked email: 3 sponsor inquiries, 1 urgent payment reminder
- Calendar: Meeting with team at 14:00 CET
- Action: Drafted responses to sponsors
## 10:15 - Sponsor Follow-up
- John from TechCorp responded, accepting $3.5K deal
- Generated invoice, sent via email
- Added to sponsor pipeline spreadsheet
- Set reminder for payment (NET30, due April 11)
## 15:45 - Travel Booking
- Siraj requested flights to Lisbon May 15-22
- Searched 3 options: KLM €245, TAP €198, Ryanair €89
- Recommended TAP (balance of cost + convenience)
- Awaiting confirmation to book
Why plain markdown?
- Human-readable (can grep/edit manually)
- Version-controllable (git tracks changes)
- LLM-friendly (easy to inject into context)
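Writing these entries can be a tiny helper rather than ad-hoc string handling. Here's a minimal sketch (the `appendMemoryEntry` name and signature are my own illustration, not OpenClaw's actual API) that appends a timestamped section to today's log, creating the file with its header on first write:

```javascript
// Append a timestamped "## HH:MM - Title" section to today's daily log.
// Creates memory/YYYY-MM-DD.md with its header if it doesn't exist yet.
import fs from 'node:fs';
import path from 'node:path';

function appendMemoryEntry(title, bullets, { dir = 'memory', now = new Date() } = {}) {
  const date = now.toISOString().slice(0, 10);   // YYYY-MM-DD
  const time = now.toISOString().slice(11, 16);  // HH:MM (UTC here; use local time in practice)
  const file = path.join(dir, `${date}.md`);

  fs.mkdirSync(dir, { recursive: true });
  if (!fs.existsSync(file)) {
    fs.writeFileSync(file, `# Memory Log - ${date}\n`);
  }
  const entry = `\n## ${time} - ${title}\n${bullets.map((b) => `- ${b}`).join('\n')}\n`;
  fs.appendFileSync(file, entry);
  return file;
}
```

Because entries are append-only, concurrent sessions never rewrite each other's history, and git diffs stay readable.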
Semantic Search
Daily logs grow large. Searching linearly is slow. ChromaDB provides fast semantic lookup.
Setup:
# memory_store.py - embedding + retrieval
import os
import re
import sys
from glob import glob

import chromadb
from chromadb.utils import embedding_functions

client = chromadb.PersistentClient(path="./chroma_db")
ef = embedding_functions.OpenAIEmbeddingFunction(
    api_key=os.environ["OPENAI_API_KEY"],
    model_name="text-embedding-3-small"
)
collection = client.get_or_create_collection(
    name="memory",
    embedding_function=ef
)

def split_into_chunks(text, max_chars=1000):
    # Naive fixed-size chunking; swap in paragraph-aware splitting if you need it
    return [text[i:i + max_chars] for i in range(0, len(text), max_chars)]

def extract_date(path):
    # Pull YYYY-MM-DD out of memory/YYYY-MM-DD.md
    match = re.search(r"\d{4}-\d{2}-\d{2}", path)
    return match.group(0) if match else ""

# Index memory files
def index_memories():
    for file in glob("memory/*.md"):
        with open(file) as f:
            content = f.read()
        chunks = split_into_chunks(content, max_chars=1000)
        collection.add(
            documents=chunks,
            ids=[f"{file}_{i}" for i in range(len(chunks))],
            metadatas=[{"source": file, "date": extract_date(file)}] * len(chunks)
        )

# Search
def search_memory(query, n=5):
    results = collection.query(query_texts=[query], n_results=n)
    return results["documents"][0]

if __name__ == "__main__" and len(sys.argv) > 2 and sys.argv[1] == "search":
    for chunk in search_memory(" ".join(sys.argv[2:])):
        print(chunk)
Usage:
$ python3 memory_store.py search "sponsor deal TechCorp"
# Returns relevant chunks about TechCorp negotiations
This runs in <200ms and scales to thousands of files.
2. Tool System
AI agents are only as useful as the tools they can use. OpenClaw has 20+ integrated tools.
Tool Definition Pattern
Each tool follows a standard interface:
// tools/gmail.js
export const gmailTool = {
  name: "gmail_search",
  description: "Search Gmail messages with optional filters",
  parameters: {
    query: { type: "string", required: true },
    maxResults: { type: "number", default: 10 },
    includeBody: { type: "boolean", default: false }
  },
  async execute({ query, maxResults, includeBody }) {
    const auth = await getGoogleAuth();
    const gmail = google.gmail({ version: 'v1', auth });

    const response = await gmail.users.messages.list({
      userId: 'me',
      q: query,
      maxResults
    });

    // messages is undefined when the search has no hits
    const hits = response.data.messages ?? [];
    const messages = await Promise.all(
      hits.map(msg =>
        gmail.users.messages.get({ userId: 'me', id: msg.id })
      )
    );
    // formatMessage (defined elsewhere) uses includeBody to decide
    // whether to return the full body or just headers/snippet
    return messages.map(m => formatMessage(m, { includeBody }));
  }
};
Why this pattern?
- LLM can understand tool from description alone
- Type-safe execution (prevents malformed calls)
- Easy to add new tools without touching core logic
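To make the registry side of that pattern concrete, here's a minimal sketch of a dispatcher that validates a call against the tool's declared parameters (required flags, defaults, types) before executing. The `registerTool`/`callTool` names are my own illustration, not OpenClaw's actual core API:

```javascript
// Minimal tool registry + validating dispatcher (illustrative names)
const registry = new Map();

function registerTool(tool) {
  registry.set(tool.name, tool);
}

async function callTool(name, params = {}) {
  const tool = registry.get(name);
  if (!tool) throw new Error(`Unknown tool: ${name}`);

  // Check the call against the tool's declared parameter schema
  const args = {};
  for (const [key, spec] of Object.entries(tool.parameters)) {
    if (params[key] === undefined) {
      if (spec.required) throw new Error(`Missing required parameter: ${key}`);
      args[key] = spec.default;   // fall back to the declared default
    } else if (typeof params[key] !== spec.type) {
      throw new Error(`Parameter ${key} must be a ${spec.type}`);
    } else {
      args[key] = params[key];
    }
  }
  return tool.execute(args);
}
```

Rejecting malformed calls here, before any side effect, is what lets you trust LLM-generated tool invocations: a hallucinated parameter fails loudly instead of reaching Gmail.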
Browser Automation
For sites without APIs (Airbnb, Amazon, Marktplaats), browser automation is essential.
Local Playwright (for simple scraping):
// tools/browser.js
import { chromium } from 'playwright';

export async function browserTask(taskDescription) {
  const browser = await chromium.launch({ headless: true });
  try {
    const context = await browser.newContext({
      userAgent: 'Mozilla/5.0...',
      viewport: { width: 1920, height: 1080 }
    });
    const page = await context.newPage();
    // Task-specific logic here (navigate, click, extract, return data)
  } finally {
    // Always close, even if the task throws halfway through
    await browser.close();
  }
}
Browser-Use Cloud (preferred for hostile sites):
// tools/browser-use.js
import axios from 'axios';

export async function browserUseTask({ url, instructions }) {
  const headers = { 'X-Browser-Use-API-Key': process.env.BROWSER_USE_KEY };

  const session = await axios.post('https://api.browser-use.com/api/v2/sessions', {
    proxyCountryCode: 'nl',
    keepAlive: true
  }, { headers });

  const result = await axios.post(
    `https://api.browser-use.com/api/v2/sessions/${session.data.id}/tasks`,
    { type: 'navigate_and_extract', url, instructions },
    { headers }
  );
  return result.data;
}
When to use which:
- API exists? → Use API (fastest, cheapest)
- Simple scraping? → Playwright (no external cost)
- Login/CAPTCHA/bot detection? → Browser-Use Cloud
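Those three rules are simple enough to encode as a plain function, so tool selection never burns an LLM call. A sketch (the function name and flags are my own illustration):

```javascript
// Pick the cheapest access method that will actually work for a site
function chooseAccessStrategy({ hasApi = false, needsLogin = false, botDetection = false } = {}) {
  if (hasApi) return 'api';                                    // fastest, cheapest
  if (needsLogin || botDetection) return 'browser-use-cloud';  // hostile sites
  return 'playwright';                                         // simple local scraping
}
```

In practice you'd keep these flags in a small per-site config so the agent doesn't rediscover bot detection on every run.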
3. Proactive Heartbeat System
Most AI agents are reactive. OpenClaw is proactive.
Implementation:
// heartbeat.js
import cron from 'node-cron';

// Run every 30 minutes
cron.schedule('*/30 * * * *', async () => {
  const context = await loadContext();

  // Skip during quiet hours (11 PM - 8 AM Amsterdam)
  if (isQuietHours(context.timezone)) {
    return;
  }

  const prompt = `
You are running a proactive heartbeat check.

Tasks to consider:
1. Check for urgent emails (unread from last 30 min)
2. Review calendar for upcoming events (<2h away)
3. Monitor sponsor deal pipeline for overdue payments
4. Check for new opportunities (Twitter mentions, etc.)

Current time: ${new Date().toISOString()}
Last heartbeat: ${context.lastHeartbeat}

If something needs attention, notify Siraj via WhatsApp.
Otherwise, reply with HEARTBEAT_OK and continue background work.
`;

  const response = await callLLM(prompt, context);
  if (response.trim() !== 'HEARTBEAT_OK') {
    await sendWhatsApp(response);
  }
  await updateLastHeartbeat();
});
Key features:
- Respects quiet hours
- Batches multiple checks (efficiency)
- Only notifies when necessary
- Runs indefinitely without supervision
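The isQuietHours helper referenced above can be done with the built-in Intl API, so it tracks the owner's timezone (including DST changes) without extra dependencies. A minimal sketch, assuming the 11 PM - 8 AM window:

```javascript
// True between 23:00 and 08:00 in the given IANA timezone (e.g. 'Europe/Amsterdam')
function isQuietHours(timezone, now = new Date()) {
  const hour = Number(new Intl.DateTimeFormat('en-US', {
    hour: 'numeric',
    hourCycle: 'h23',   // avoids the "24" quirk that hour12: false can produce at midnight
    timeZone: timezone
  }).format(now));
  return hour >= 23 || hour < 8;
}
```

Using the timezone from context rather than the server's clock matters: the EC2 box runs in UTC, and a hardcoded offset would drift twice a year with DST.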
4. Safety & Boundaries
An agent with access to email, calendar, and finances can do a lot of damage. Safety isn't optional.
Permission Levels
const PERMISSION_LEVELS = {
  READ: 'read',        // Safe, always allowed
  DRAFT: 'draft',      // Generate content, don't send
  EXECUTE: 'execute',  // Take action automatically
  ASK: 'ask'           // Require explicit approval
};

const toolPermissions = {
  'gmail_search': PERMISSION_LEVELS.READ,
  'gmail_send': PERMISSION_LEVELS.DRAFT,     // Drafts only
  'calendar_add': PERMISSION_LEVELS.EXECUTE,
  'bank_transfer': PERMISSION_LEVELS.ASK,    // Always ask
};
Validation Layer
async function executeTool(toolName, params) {
  const permission = toolPermissions[toolName];

  if (permission === PERMISSION_LEVELS.ASK) {
    // Blocks until approval arrives; throws if denied
    await requestApproval(toolName, params);
    return await tools[toolName].execute(params);
  }

  if (permission === PERMISSION_LEVELS.READ || permission === PERMISSION_LEVELS.EXECUTE) {
    return await tools[toolName].execute(params);
  }

  // DRAFT mode: generate content but never send/execute
  return { status: 'draft', content: await tools[toolName].preview(params) };
}
This prevents catastrophic mistakes like:
- Sending unreviewed emails to sponsors
- Transferring money without approval
- Deleting important files
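The requestApproval call used in the validation layer can be a small promise-based gate: the tool call blocks until a yes/no reply comes back over WhatsApp. A sketch with a pluggable notify callback (all names here are my own illustration, not OpenClaw's actual internals):

```javascript
// Pending approvals, keyed by id; each entry holds the promise's resolve/reject
const pendingApprovals = new Map();
let nextApprovalId = 1;

function requestApproval(toolName, params, notify) {
  const id = nextApprovalId++;
  return new Promise((resolve, reject) => {
    pendingApprovals.set(id, { resolve, reject });
    // In production this message goes out via WhatsApp
    notify(`Approval #${id}: run ${toolName}(${JSON.stringify(params)})? Reply yes/no`);
  });
}

// Called by the inbound-message handler when the owner's reply arrives
function resolveApproval(id, approved) {
  const entry = pendingApprovals.get(id);
  if (!entry) return false;          // unknown or already-handled id
  pendingApprovals.delete(id);
  if (approved) entry.resolve(true);
  else entry.reject(new Error(`Approval #${id} denied`));
  return true;
}
```

A real version would also add a timeout that auto-denies stale requests, so a missed WhatsApp message can't leave a tool call hanging forever.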
5. Error Handling & Resilience
Production systems fail. Plan for it.
Retry Logic
const sleep = (ms) => new Promise((resolve) => setTimeout(resolve, ms));

async function robustAPICall(fn, maxRetries = 3) {
  for (let i = 0; i < maxRetries; i++) {
    try {
      return await fn();
    } catch (error) {
      if (error.status === 429) {
        // Rate limited: exponential backoff (1s, 2s, 4s, ...)
        await sleep(Math.pow(2, i) * 1000);
      } else if (error.status >= 500) {
        // Server error: retry after a short pause
        await sleep(1000);
      } else {
        // Client error (4xx): retrying won't help
        throw error;
      }
    }
  }
  throw new Error(`Failed after ${maxRetries} retries`);
}
Graceful Degradation
async function sendEmail(to, subject, body) {
  try {
    await gmail.send({ to, subject, body });
  } catch (error) {
    try {
      // Gmail API failed, try SMTP fallback
      await smtpSend({ to, subject, body });
    } catch (fallbackError) {
      // Both failed: save to drafts and flag for manual review
      await saveDraft({ to, subject, body });
      await notify('Email send failed, saved to drafts');
    }
  }
}
Comprehensive Logging
// logger.js
import winston from 'winston';

const logger = winston.createLogger({
  level: 'info',
  format: winston.format.json(),
  transports: [
    new winston.transports.File({ filename: 'logs/error.log', level: 'error' }),
    new winston.transports.File({ filename: 'logs/combined.log' }),
    new winston.transports.Console({ format: winston.format.simple() })
  ]
});

// Log every tool execution
function logToolExecution(toolName, params, result, duration) {
  logger.info({
    event: 'tool_execution',
    tool: toolName,
    params,
    result: result.status,
    durationMs: duration,
    timestamp: new Date().toISOString()
  });
}
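Rather than calling logToolExecution by hand at every call site, each tool can be wrapped once so timing and outcome are captured automatically, including on failure. A sketch (`withLogging` is my name for it; it takes the log function as a parameter, which also keeps it testable):

```javascript
// Wrap a tool so every execute() call is timed and logged, success or failure
function withLogging(tool, log) {
  return {
    ...tool,
    async execute(params) {
      const start = Date.now();
      try {
        const result = await tool.execute(params);
        log(tool.name, params, { status: 'ok' }, Date.now() - start);
        return result;
      } catch (error) {
        log(tool.name, params, { status: 'error', message: error.message }, Date.now() - start);
        throw error;  // logging must never swallow the failure
      }
    }
  };
}
```

Apply it once at registration time and every tool in the system gets uniform telemetry for free.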
Logs have saved me countless times when debugging "why did it do that?" moments.
Deployment & Operations
Infrastructure Setup
# EC2 instance (t3.medium, Ubuntu 22.04)
$ ssh user@your-ec2-ip
# Install dependencies
$ curl -fsSL https://deb.nodesource.com/setup_22.x | sudo -E bash -
$ sudo apt-get install -y nodejs git
$ npm install -g pm2
# Clone OpenClaw
$ git clone https://github.com/yourusername/openclaw.git
$ cd openclaw
$ npm install
# Set up credentials
$ cp .env.example .env
$ nano .env # Add API keys
# Start with PM2
$ pm2 start openclaw.js --name "openclaw-agent"
$ pm2 save
$ pm2 startup # Auto-restart on reboot
Monitoring
# View logs
$ pm2 logs openclaw-agent
# Monitor resource usage
$ pm2 monit
# Check status
$ pm2 status
Backup Strategy
#!/bin/bash
# Daily backup script (invoked from cron; the shebang must be line 1)
tar -czf backup-$(date +%Y%m%d).tar.gz memory/ MEMORY.md config/
aws s3 cp backup-$(date +%Y%m%d).tar.gz s3://openclaw-backups/
Performance Optimization
LLM Call Reduction
The biggest cost is LLM API calls. Optimize aggressively.
Before optimization:
- Every heartbeat: 1 LLM call
- Every email check: 1 call per email
- Cost: ~$300/month
After optimization:
- Batch email processing (1 call for 10 emails)
- Cache routine decisions (no LLM for spam detection)
- Use fast models for simple tasks
- Cost: ~$80/month
Code example:
// Batch processing
async function processEmails(emails) {
  // Instead of N LLM calls, make 1 batched call
  const prompt = `
Review these ${emails.length} emails and categorize them:
${emails.map((e, i) => `${i + 1}. From: ${e.from}, Subject: ${e.subject}`).join('\n')}

Reply with ONLY a JSON array, one object per email:
[{"id": 1, "category": "urgent|normal|spam", "action": "reply|archive|flag"}]
`;
  const result = await callLLM(prompt);
  // Models sometimes wrap the JSON in extra prose or fences; parse just the array
  const jsonText = result.slice(result.indexOf('['), result.lastIndexOf(']') + 1);
  return JSON.parse(jsonText);
}
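The "fast models for simple tasks" rule can likewise be a plain function, so no tokens are wasted deciding which model to call. A sketch (the task kinds and token threshold are illustrative assumptions, not OpenClaw's actual routing table):

```javascript
// Route cheap, mechanical tasks to the fast fallback model; keep
// open-ended reasoning on the primary model. Thresholds are illustrative.
function pickModel(task) {
  const mechanical = ['categorize', 'summarize', 'extract'];
  if (mechanical.includes(task.kind) && (task.inputTokens ?? 0) < 4000) {
    return 'gpt-4o';        // fallback: faster, cheaper
  }
  return 'claude-opus-4';   // primary: strongest reasoning
}
```

Keeping the routing deterministic also makes costs predictable: you can estimate the monthly bill from task counts alone.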
Caching Strategies
import NodeCache from 'node-cache';

const cache = new NodeCache({ stdTTL: 3600 }); // 1 hour TTL

async function getCachedGoogleAuth() {
  const cached = cache.get('google_auth');
  if (cached) return cached;

  const auth = await freshGoogleAuth();
  cache.set('google_auth', auth);
  return auth;
}
Lessons Learned
After 6 months of running OpenClaw in production, here's what I've learned:
1. Start with Read-Only
Don't give your agent write permissions on day 1. Start with:
- Read emails
- Summarize calendar
- Generate drafts (don't send)
Build trust over weeks before enabling autonomous actions.
2. Logging Saves Everything
When something goes wrong (and it will), comprehensive logs are the only way to debug. Log:
- Every tool call
- Every LLM prompt + response
- Every decision point
- Execution time for performance analysis
3. Memory Architecture Matters More Than Model Choice
I've swapped between GPT-4, Claude, and others. The model matters less than:
- Quality of memory retrieval
- Context provided in prompts
- Clear tool descriptions
A mediocre model with great memory > amazing model with no memory.
4. Edge Cases Are Infinite
You will never anticipate all failure modes. Plan for:
- Malformed API responses
- Unexpected user inputs
- Network failures mid-operation
- Rate limits at 3 AM
Defensive coding isn't optional.
What's Next
OpenClaw is evolving rapidly. Upcoming features:
- Multi-agent orchestration - Spawn specialized sub-agents for complex tasks
- Voice interface - Talk to OpenClaw naturally via phone
- Mobile app - Native iOS/Android apps
- Community marketplace - Share and install tool plugins
- Self-improvement - Agent analyzes its own failures and updates code
The vision: every developer should be able to deploy their own AI agent in <1 hour.
Get Involved
OpenClaw is open-source. Want to contribute?
- GitHub: github.com/llsourcell (search for OpenClaw repo)
- Issues: Bug reports, feature requests
- PRs: New tools, optimizations, docs
Follow me for updates:
- YouTube: youtube.com/@SirajRaval (775K+ subscribers)
- GitHub: github.com/llsourcell
- LinkedIn: linkedin.com/in/sirajraval
This is what production AI looks like. Not perfect, but real. Ship it.
Tags: #AI #MachineLearning #OpenClaw #AIAgents #ProductionAI #NodeJS #Claude #Automation #DevOps #SirajRaval