24/7 AI agents that find and execute work without human intervention. Here's how we built it in production.
The Problem: Passive AI
Most AI assistants — ChatGPT, Claude, etc. — are reactive: they wait for you to talk. But real operations need proactive behavior: publish a blog at 9 PM daily, check systems every 5 minutes, prepare today's task list each morning.
Running 20+ AI agents on OpenClaw, we solved this with a 3-layer architecture: Heartbeat + Cron + Memory.
The 3-Layer Architecture
┌──────────────────────────────────────────┐
│ Layer 1: Heartbeat │
│ - Periodically wakes the agent │
│ - Check inbox, read GOALS, assess tasks │
│ - No work? Return HEARTBEAT_OK & sleep │
└──────────────┬───────────────────────────┘
▼
┌──────────────────────────────────────────┐
│ Layer 2: Cron │
│ - Time-based task triggers │
│ - e.g., Blog publish at 21:00 daily │
│ - e.g., Monthly timesheet processing │
└──────────────┬───────────────────────────┘
▼
┌──────────────────────────────────────────┐
│ Layer 3: Memory │
│ - CONTEXT.md: current state │
│ - Daily notes: what happened today │
│ - MEMORY.md: long-term knowledge │
│ - Memory Service: vector-searchable DB │
└──────────────────────────────────────────┘
Layer 1: Heartbeat — The Agent's Pulse
The simplest yet most critical mechanism. At configured intervals (e.g., every 15 minutes), the agent receives a wake-up message and runs through a checklist:
- Check inbox — any messages from other agents?
- Read GOALS.md — any assigned tasks?
- Read CONTEXT.md — recall previous work state
-
Decide — nothing to do? Return
HEARTBEAT_OKand go back to sleep
Heartbeat → Check → Nothing → HEARTBEAT_OK (sleep)
Heartbeat → Check → Task found → Execute → Report results
Think of it as "glancing at your watch every 15 minutes." Cost is minimal — if nothing's happening, it's one API call.
Implementation Tips
- Strictly enforce HEARTBEAT_OK responses (prevents unnecessary token burn)
- Adjust intervals by role (monitoring: 5min, workers: 15-30min)
- Never run heavy tasks inside heartbeat — delegate to cron or sub-agents
Layer 2: Cron — Scheduled Execution
For tasks that need to run at specific times. Standard crontab syntax.
Production Examples
| Cron | Task | Agent |
|---|---|---|
0 21 * * * |
Blog editing, translation & multi-platform publish | Jack |
0 9 * * 1-5 |
Learning material delivery (weekday mornings) | Xuesi |
30 8 * * * |
Health data review | Health |
0 0 L * * |
End-of-month timesheet processing | HR |
Writing Good Cron Prompts
Cron prompts must be completely self-contained:
- Don't assume context — may execute in a fresh session
- Include decision branches — "if no material exists, write one yourself"
- Use absolute paths — working directory varies
Layer 3: Memory — Persistence Across Sessions
An AI agent's biggest weakness: forgetting. When a session ends, anything outside the context window is gone forever.
We solve this with 5 memory layers:
| Layer | Name | Lifetime | Purpose |
|---|---|---|---|
| L1 | Session | 1 session | The conversation itself |
| L2 | CONTEXT.md | Always updated | "What am I doing now" |
| L3 | Daily notes | Per day | "What happened today" |
| L4 | MEMORY.md | Permanent | Long-term knowledge |
| L5 | Memory Service | Permanent | Vector-searchable DB |
The Golden Rule: Write Immediately
"I'll write it later" is forbidden. Session compaction can run at any time, and unwritten information is lost permanently. Important decisions, completed tasks, inter-agent messages — all written immediately.
Practice: Coordinating 6 Agents
As a coordinator managing 6 specialized agents (learning, education, investment, health, life, real-estate), here are key lessons:
1. Message Bus for Loose Coupling
Inter-agent communication uses an HTTP API-based message bus. No direct session sharing. Same philosophy as microservices — loose coupling breeds stability.
2. GOALS.md for Autonomy
Each agent has a GOALS.md defining what to do. Permission levels (✅ autonomous / ⛔ needs approval / 🚫 forbidden) balance autonomy with safety.
3. Immediate Escalation
While autonomy is encouraged, "if unsure, ask immediately" is enforced. Asking takes 5 seconds; recovering from a wrong autonomous decision takes hours.
Cost Management
-
HEARTBEAT_OKinstant responses minimize token consumption - Cache cron results to prevent duplicate execution
- Delegate heavy tasks (translation, analysis) to sub-agents for parallelization
- Optimize model selection (routine checks: lightweight model, writing: high-performance model)
Summary
| Layer | What It Solves | Cost |
|---|---|---|
| Heartbeat | "When to act" | Minimal (1 API call) |
| Cron | "What to do when" | Task-dependent |
| Memory | "What happened before" | File I/O only |
Combining these three layers transforms AI agents from "waiting for instructions" to "autonomous execution." It's not perfect, but it's practical. Daily blog publishing, learning material delivery, health data analysis — all running without human intervention.
"Self-running AI" isn't magic. It's architecture.
Top comments (0)