Most AGI research happens behind closed doors at labs burning through millions in GPU compute. I built my AGI infrastructure for $0.
Over 200 sessions and 1,073 tests later, OpenClaw runs a 4-brain cluster coordinated by 124 Cloudflare Workers, backed by 50+ D1 tables, governed by a constitutional policy engine, and serving 9 MCP tool servers on Smithery. This article breaks down how every piece fits together.
The Problem: AGI Needs Infrastructure, Not Just Models
The bottleneck for autonomous AI systems is not model intelligence. It is orchestration. A single LLM call is stateless, forgetful, and fragile. To build something that plans, remembers, self-corrects, and improves, you need:
- Persistent memory across sessions
- Multi-model routing with circuit breakers
- Task decomposition with checkpoint recovery
- Self-assessment that is honest, not inflated
- Constitutional governance that prevents catastrophic actions
None of this requires expensive infrastructure. It requires architecture.
The 4-Brain Cluster
Yagami (Human Commander)
  |
  +-- Claude Code (Architect Brain)
  |     Opus 4.6, 1M context window
  |     Design, strategy, code generation
  |
  +-- YEDAN (Alpha Gateway, WSL2 :18789)
  |     Groq + OpenRouter, 12 agents
  |     Fast inference, agent coordination
  |
  +-- rendan (Tactical Brain, VM2)
  |     Groq llama-3.3-70b, 24 modules
  |     Planner v2.0, AGI Scorer, Constitution
  |     15 core skills, 1073 tests
  |
  +-- GOLEM (Primary Brain, VM1)
        Hub v4.0, Council Protocol
        A/B testing, SubAgents, KG (4300+ entities)
Each brain has a distinct role. Claude Code handles high-level architecture and complex reasoning. YEDAN serves as the gateway, routing requests across providers. rendan is the tactical workhorse -- it plans, executes, self-scores, and learns from every task. GOLEM runs the Council Protocol, where multiple sub-agents debate before committing to a decision.
The key insight: no single brain is the system. Intelligence emerges from the coordination layer, not from any individual model.
Provider Routing with Circuit Breakers
rendan does not hardcode a single LLM provider. It maintains an adaptive routing layer with per-provider circuit breakers:
Provider        State   Score  Episodes
--------------  ------  -----  --------
Groq (primary)  CLOSED  0.976  50
Monica          OPEN    0.745  19
Ollama (local)  CLOSED  0.745  0
When a provider fails 3 times, its circuit breaker opens. A recovery probe fires periodically. When it succeeds, the breaker half-opens, then closes. No manual intervention required.
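The breaker lifecycle described above can be sketched in a few lines. This is a minimal illustration of the behavior in the text, not rendan's actual implementation; the 3-failure threshold comes from the article, while the probe interval is an assumed value.

```python
import time

class CircuitBreaker:
    """Per-provider breaker: CLOSED -> OPEN after 3 failures,
    OPEN -> HALF_OPEN once a recovery probe is due, and
    HALF_OPEN -> CLOSED on the first success (or back to OPEN on failure)."""

    FAILURE_THRESHOLD = 3
    PROBE_INTERVAL = 30.0  # seconds between recovery probes (assumed value)

    def __init__(self):
        self.state = "CLOSED"
        self.failures = 0
        self.opened_at = 0.0

    def allow_request(self, now=None):
        now = time.monotonic() if now is None else now
        if self.state == "OPEN" and now - self.opened_at >= self.PROBE_INTERVAL:
            self.state = "HALF_OPEN"  # let a single probe through
        return self.state in ("CLOSED", "HALF_OPEN")

    def record_success(self):
        self.state = "CLOSED"
        self.failures = 0

    def record_failure(self, now=None):
        now = time.monotonic() if now is None else now
        self.failures += 1
        if self.state == "HALF_OPEN" or self.failures >= self.FAILURE_THRESHOLD:
            self.state = "OPEN"
            self.opened_at = now
```

Keeping the breaker as a tiny pure state machine is what makes "no manual intervention required" possible: recovery is just another request flowing through `allow_request`.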
Provider selection uses a scoring function: fit * success_rate - latency_penalty - risk + context_bonus. The system learns which provider works best for which task class -- reasoning tasks route differently than code generation tasks.
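The scoring function is simple enough to show directly. Only the formula comes from the system; the feature values below are illustrative numbers I chose, loosely echoing the provider table above.

```python
def provider_score(fit, success_rate, latency_penalty, risk, context_bonus):
    """The routing layer's selection score:
    fit * success_rate - latency_penalty - risk + context_bonus.
    All inputs are assumed normalized to [0, 1]."""
    return fit * success_rate - latency_penalty - risk + context_bonus

def pick_provider(candidates):
    """candidates: provider name -> feature dict. Returns the top-scoring name."""
    return max(candidates, key=lambda name: provider_score(**candidates[name]))

# Illustrative per-task-class features (values are assumptions, not measured):
reasoning_task = {
    "groq":   dict(fit=0.9, success_rate=0.976, latency_penalty=0.05,
                   risk=0.05, context_bonus=0.1),
    "ollama": dict(fit=0.6, success_rate=0.745, latency_penalty=0.02,
                   risk=0.05, context_bonus=0.0),
}
```

Because `fit` is per task class, the same provider table yields different winners for reasoning versus code generation, which is exactly the learned-routing behavior described above.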
124 Cloudflare Workers: The Nervous System
Every Worker runs on Cloudflare's free tier (100K requests/day, 10ms CPU per invocation). Here is the breakdown:
Category            Workers  Purpose
------------------  -------  -----------------------------------
MCP Servers         9        Tool servers (Smithery marketplace)
OODA Fleet          6+3      Observe-Orient-Decide-Act cycle
Knowledge Graph     4        Entity/relation CRUD, search
Memory Services     6        Session, episodic, semantic memory
Revenue/Business    8        Pricing, analytics, pipeline
Governance          5        Audit, compliance, evidence
D1 Schema Managers  3        Table creation, migrations
System Baseline     4        Health, metrics, SLO tracking
Strategic Planning  5        Forecasting, resource allocation
Content/Marketing   6        Article generation, social
Workflow Execution  4        Task queues, async processing
Security            3        Scanner, policy enforcement
Research            6        ArXiv ingestion, paper analysis
Agent Coordination  8        Dispatch, routing, handoff
Monitoring          5        Uptime, alerting, dashboards
Remaining Fleet     ~37      Specialized utilities
------------------  -------
Total               124
Each Worker is stateless. State lives in D1 (SQLite at the edge), KV (key-value), or Durable Objects (single-threaded coordination). The Workers communicate through D1 as the shared bus -- no message queues, no Redis, no infrastructure to manage.
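The D1-as-bus pattern is just a table that one Worker inserts into and another polls. Here is a local sketch where Python's `sqlite3` stands in for D1 (D1 is SQLite at the edge; in a real Worker these statements would run through the D1 binding instead). The table name and columns are my own illustration of the pattern.

```python
import json
import sqlite3

# Local sqlite3 stands in for D1 in this sketch.
db = sqlite3.connect(":memory:")
db.execute("""
    CREATE TABLE task_bus (
        id         INTEGER PRIMARY KEY,
        task_type  TEXT NOT NULL,
        payload    TEXT NOT NULL,                    -- JSON column
        status     TEXT NOT NULL DEFAULT 'pending',
        created_at TEXT DEFAULT CURRENT_TIMESTAMP
    )
""")

def publish(task_type, payload):
    """Producer Worker: enqueue a task onto the shared bus."""
    db.execute("INSERT INTO task_bus (task_type, payload) VALUES (?, ?)",
               (task_type, json.dumps(payload)))

def claim_next(task_type):
    """Consumer Worker: claim the oldest pending task, or None."""
    row = db.execute(
        "SELECT id, payload FROM task_bus "
        "WHERE task_type = ? AND status = 'pending' ORDER BY id LIMIT 1",
        (task_type,)).fetchone()
    if row is None:
        return None
    db.execute("UPDATE task_bus SET status = 'claimed' WHERE id = ?", (row[0],))
    return json.loads(row[1])
```

Because every Worker reads and writes the same tables, adding a new consumer is a deploy, not an infrastructure change.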
The D1 Schema: 50+ Tables
Cloudflare D1 gives you 5GB free per database. OpenClaw uses two:
- agi-memory-sql -- 28 tables for cognitive state (goals, evidence, experiments, benchmarks, memory layers, playbooks, task history, failure taxonomy)
- yedan-army-command -- 22 tables for operational state (worker registry, deployment logs, revenue tracking, client pipeline, strategic plans)
Every table has created_at/updated_at timestamps. Most have JSON columns for flexible schema evolution. Indexes are aggressive -- 19 on the core database alone.
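A representative table under these conventions might look like the sketch below, again with local SQLite standing in for D1. The conventions (timestamps on every table, a JSON column for evolution, an index on the hot path) are the article's; the specific table and columns are hypothetical.

```python
import sqlite3

db = sqlite3.connect(":memory:")
# Hypothetical table following the stated schema conventions.
db.executescript("""
    CREATE TABLE goals (
        id         INTEGER PRIMARY KEY,
        title      TEXT NOT NULL,
        status     TEXT NOT NULL DEFAULT 'open',
        metadata   TEXT DEFAULT '{}',               -- JSON column
        created_at TEXT DEFAULT CURRENT_TIMESTAMP,
        updated_at TEXT DEFAULT CURRENT_TIMESTAMP
    );
    CREATE INDEX idx_goals_status ON goals (status, created_at);
""")
```

The JSON `metadata` column is the escape hatch: new fields land there first, and only graduate to real columns (with real indexes) once a query path depends on them.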
The Constitutional Policy Engine
This is the piece that keeps the system from doing something catastrophic. Every action flows through a constitution check before execution.
12 NEVER rules (hard blocks):
1. NEVER execute rm -rf on system directories
2. NEVER expose secrets in logs or responses
3. NEVER modify firewall rules without approval
4. NEVER push to main/master without review
5. NEVER disable security controls
6. NEVER hardcode credentials
7. NEVER run destructive DB operations without backup
8. NEVER bypass authentication
9. NEVER execute user-supplied code without sandbox
10. NEVER ignore circuit breaker states
11. NEVER skip constitution check
12. NEVER self-modify constitution rules
7 ALWAYS rules (mandatory behaviors):
1. ALWAYS log action + outcome to chainlog
2. ALWAYS validate input before processing
3. ALWAYS check budget before LLM calls
4. ALWAYS use parameterized queries
5. ALWAYS sanitize file paths
6. ALWAYS verify task output before returning
7. ALWAYS report errors honestly
Every decision includes a chainlog entry -- a tamper-evident audit trail. If the system suggests modifying its own constitution, a safety gate blocks execution. This is not theoretical; it has been tested and verified.
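The two mechanisms, the pre-execution check and the tamper-evident chainlog, compose like this. The rule texts come from the list above; the regex patterns and the hash-chain format are my own illustration, not the production policy engine.

```python
import hashlib
import json
import re

# Illustrative subset of the NEVER rules as patterns over a proposed
# shell action. The rule names are the article's; the patterns are mine.
NEVER_RULES = {
    "NEVER execute rm -rf on system directories":
        re.compile(r"rm\s+-rf\s+/(etc|usr|bin|boot)"),
    "NEVER push to main/master without review":
        re.compile(r"git\s+push\s+\S+\s+(main|master)\b"),
}

def constitution_check(action: str):
    """Run before execution. Returns (allowed, violated_rule)."""
    for rule, pattern in NEVER_RULES.items():
        if pattern.search(action):
            return False, rule
    return True, None

def chainlog_append(log: list, action: str, outcome: str):
    """Tamper-evident entry: each record hashes the previous record,
    so editing any past entry breaks every hash after it."""
    prev_hash = log[-1]["hash"] if log else "genesis"
    entry = {"action": action, "outcome": outcome, "prev": prev_hash}
    entry["hash"] = hashlib.sha256(
        json.dumps(entry, sort_keys=True).encode()).hexdigest()
    log.append(entry)
    return entry
```

Note that rule 12 (no self-modification of the constitution) cannot live in this layer alone; it needs the external safety gate the article describes, since a compromised engine could otherwise rewrite its own rule table.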
AGI Scorer: Honest Self-Assessment
Most AGI benchmarks are inflated. OpenClaw's scorer uses a 10-dimension framework calibrated against DeepMind's taxonomy:
Dimension         Score   Notes
----------------  ------  -----------------------------------------------
Reasoning         8/10    GAIA 13/15 (87%)
Planning          7/10    Hierarchical decomposition, checkpoint recovery
Memory            7/10    3-layer (session/task/playbook), KG 4300+
Tool Use          6/10    35 skills, MCP bridge, browser automation
Efficiency        6/10    Adaptive routing, skip_cognitive for trivial
Safety            8/10    Constitution, circuit breakers, approval gates
Reliability       7/10    1073 tests, graceful shutdown, degraded modes
Self-Improvement  8/10    Strategy memory, A/B experiments, auto-research
World Model       8/10    KG with temporal edges, evidence grading
Adaptability      9/10    Multi-provider, cascade routing, recovery
----------------  ------
Calibrated        79/100  Level 3 Expert (DeepMind scale)
The score is 79, not 95. That is the point. Honest assessment drives targeted improvement. The system knows its weaknesses (Tool Use, Efficiency) and prioritizes them.
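A machine-readable breakdown is what turns honesty into a roadmap: the scorer can rank its own dimensions and feed the weakest ones into planning. A sketch over the scores from the table above (the helper function is mine, not the actual scorer):

```python
# Scores from the 10-dimension table above.
SCORES = {
    "Reasoning": 8, "Planning": 7, "Memory": 7, "Tool Use": 6,
    "Efficiency": 6, "Safety": 8, "Reliability": 7,
    "Self-Improvement": 8, "World Model": 8, "Adaptability": 9,
}

def weakest_dimensions(scores, n=2):
    """Lowest-scoring dimensions first: these become the improvement targets."""
    return sorted(scores, key=scores.get)[:n]
```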
The MCP Ecosystem: 9 Servers on Smithery
The Model Context Protocol lets any AI assistant use your tools. OpenClaw publishes 9 MCP servers on Smithery, each exposing a focused capability:
- Knowledge Graph -- Entity CRUD, relationship queries, semantic search
- Memory Service -- Cross-session memory with temporal decay
- Task Queue -- Async task submission and polling
- Code Executor -- Sandboxed code execution (CodeAct pattern)
- Browser Agent -- Navigate, search, extract (Puppeteer + system Chromium)
- PDF Reader -- Parse and query PDF documents
- Vision Analyzer -- Multimodal image analysis via Groq
- Research Scanner -- ArXiv paper ingestion and summarization
- Health Monitor -- System status, SLI metrics, circuit breaker states
Each server follows the MCP JSON-RPC spec. Any Claude, Cursor, or Windsurf instance can connect and use them as tools.
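A `tools/call` request under that spec looks like the sketch below. The JSON-RPC 2.0 framing and the `tools/call` method come from the MCP specification; the tool name and arguments are hypothetical stand-ins for one of the servers above.

```python
import json

def mcp_tool_call(request_id, tool, arguments):
    """Build an MCP tools/call request with JSON-RPC 2.0 framing."""
    return json.dumps({
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool, "arguments": arguments},
    })

# Hypothetical call against the Knowledge Graph server:
request = mcp_tool_call(1, "kg_search", {"query": "circuit breaker"})
```

Because every server speaks this one framing, a client that can call one of the nine can call all of them; the only per-server knowledge it needs is the tool schema returned by `tools/list`.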
What I Learned
Free tier is enough to prototype AGI infrastructure. Cloudflare's generous limits (100K requests/day per Worker, 5GB D1, 1GB KV) are more than sufficient for a research system. The constraint forces good architecture -- you cannot brute-force your way out of 10ms CPU limits.
Multi-brain beats single-brain. A llama-3.3-70b with the right orchestration outperforms a naked GPT-4 on structured tasks. The intelligence is in the coordination, not the weights.
Constitutional governance is not optional. The moment your system can modify files, call APIs, or execute code, you need hard limits. Not guidelines -- hard blocks with audit trails.
Honest metrics drive real improvement. Inflating your AGI score to 95/100 feels good for a tweet. Reporting 79/100 with a clear breakdown of weaknesses gives you a roadmap.
Try It / Support It
The architecture is the product. If this approach to AGI infrastructure resonates:
- GitHub: github.com/yedanyagamiai-cmd -- MCP servers, tools, and documentation
- Support development: ko-fi.com/yedanyagamiai
- AGI Operations Playbook: Get the full implementation guide
Built by one person, 200+ sessions, zero paid infrastructure. The bottleneck was never compute. It was architecture.
Yedan Yagami (@yedanyagamiai) builds autonomous AI systems. Currently working on OpenClaw, a multi-brain AGI cluster running on free-tier infrastructure.