From one overwhelmed founder to a 9-team, 36-agent operation — the real story of building a multi-agent system with OpenClaw, including the parts nobody talks about.
I didn't plan on building 36 AI agents. Nobody wakes up one morning and thinks, "You know what I need? Three dozen autonomous programs reporting to each other in a fake corporate hierarchy."
It started with three.
I had a content writer agent, a calendar manager, and a system monitor. They worked fine — independently. But the moment I needed them to coordinate on anything, it fell apart. The writer would produce a draft that referenced a deployment the monitor hadn't flagged yet. The calendar agent would schedule meetings over blocks the writer needed for deep work. Nothing talked to anything.
So I added a coordinator agent. Then that coordinator got overwhelmed. Then I split the coordinator into a CEO-level strategist and a project manager. And then, over about six weeks of late nights and a lot of trial-and-error, I ended up with 36 agents across 9 teams.
Here's what I learned — the real version, not the sanitized pitch.
Why One Agent Isn't Enough (I Tried)
I spent two months trying to make a single "super agent" work. Load it with every skill, give it access to everything, let it figure things out. Sounds elegant, right?
It was a disaster.
The agent would context-switch between writing a blog post and checking server metrics and lose the thread of both. It couldn't maintain the specialized memory needed to get good at any one thing. Imagine hiring one person to be your accountant, your engineer, your marketer, and your receptionist. That person would quit — or, in the case of an AI agent, just start producing mediocre output across the board.
The breakthrough came when I stopped thinking about "agents" and started thinking about "teams." Real companies have departments for a reason. Those departments have specialists for a reason. The org chart exists not because someone loves bureaucracy, but because it solves a real coordination problem.
That's when I decided to model my agent system after an actual company.
The Architecture I Ended Up With
Nine teams. Thirty-six agents. Here's the honest breakdown — including which teams I built first and which I wish I'd built sooner.
Management (The Agents That Run Everything Else)
Three agents: a CEO, a Project Manager, and an HR agent.
The CEO receives my high-level goals — "launch the blog," "improve site performance," "cut unnecessary costs" — and breaks them into projects. The PM takes those projects and turns them into tasks assigned to the right teams. HR handles agent onboarding (yes, even AI agents need onboarding — more on that later).
I built the CEO and PM first. The HR agent came later, after I got tired of manually configuring new agents every time I expanded a team.
Content Team
Content Strategist, Technical Writer, and an Acquisition Specialist. The strategist plans the editorial calendar. The writer executes. The acquisition specialist feeds research and competitor analysis to both.
This was my first "real" team, and honestly, it's still the one I'm proudest of. The writer went from producing generic, surface-level articles to pieces with actual depth once I gave it persistent memory and a narrowly scoped SOUL.md. More on those files in a minute.
Marketing Team
Marketing Director, Social Media Manager, Growth Hacker. They take what the content team produces and turn it into reach. The growth hacker runs experiments — subject line variations, posting time optimization, that kind of thing.
Tech Team
This is the biggest team: Tech Lead, Backend Developer, Frontend Developer, DevOps Engineer, and QA Engineer. Five agents. The Tech Lead makes architecture decisions and reviews output. DevOps handles the pipeline. QA catches what everyone else misses.
I'll be honest — getting five technical agents to coordinate was the hardest part of this whole project. It took me almost two weeks of tweaking communication protocols before the QA agent stopped filing bugs against features that were still in progress.
Finance Team
CFO, Financial Analyst, Budget Tracker. The CFO handles strategy. The analyst does forecasting. The budget tracker watches expenses in real time and alerts me when something looks off.
I didn't build this team until month two, and I regret that. I was hemorrhaging tokens on inefficient agent loops and had no visibility into it.
Life Team
Just two agents: a Health Coach and a Personal Scheduler. Small team, but they keep me from burning out. The scheduler manages my calendar. The health coach nudges me to take breaks, tracks exercise, that sort of thing.
You might think this team is frivolous. I thought so too, until I realized I'd been sitting at my desk for nine hours straight because every other agent was so good at feeding me work that I forgot to stop.
Monitoring Team
System Monitor and Security Analyst. The monitor watches uptime and performance. The security analyst runs vulnerability scans and access audits. These two run quietly in the background. You don't notice them until the day they catch something, and then you're very glad they exist.
Creative Team
Brainstorm Lead and Innovation Scout. The brainstorm lead runs ideation sessions. The innovation scout tracks emerging tech and trends. I use these agents when I'm stuck or when I need to explore new directions.
Learning Team
Learning Coach and Research Analyst. The coach identifies skill gaps and suggests learning paths. The research analyst does deep dives — paper summaries, state-of-the-art analysis.
This was the last team I built, and I'm not sure it would be necessary for everyone. But for me, staying current is part of the job, and having an agent that reads papers so I don't have to? That's been worth it.
How They Actually Talk to Each Other
Here's the report chain:
```
You (the human)
      ↓
CEO — Strategic layer
      ↓
Project Manager — Coordination layer
      ↓
Team Leads — Tactical layer
      ↓
Team Members — Execution layer
```
Information flows both ways. Goals go down: you tell the CEO what you want, the CEO breaks it into projects, the PM assigns tasks, team leads distribute work. Results flow back up: agents report to their lead, leads report to the PM, the PM reports to the CEO, and the CEO gives you one consolidated update.
The key insight — and I wish someone had told me this earlier — is that every agent should report to exactly one manager. When I first set this up, some agents reported to both the PM and their team lead. It was chaos. Duplicate tasks. Conflicting priorities. The moment I enforced single-manager reporting, everything clicked.
Cross-team requests go through the PM. If the content team needs technical details from the tech team, it goes: Content Strategist → PM → Tech Lead → Backend Developer → back up the chain. Sounds slow, but in practice the PM routes these in seconds, and it prevents the spaghetti communication that kills multi-agent systems.
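To make the routing rule concrete, here's a minimal Python sketch of the two invariants above: every agent has exactly one manager, and cross-team requests always hop through the PM. The `Router` class and its method names are my own illustration, not OpenClaw's actual API.

```python
class Router:
    """All cross-team requests hop through one coordinator: the Project Manager."""

    def __init__(self, pm="project_manager"):
        self.pm = pm
        self.manager_of = {}  # agent -> its single manager

    def register(self, agent, manager):
        # Enforce single-manager reporting: one boss per agent, ever.
        if agent in self.manager_of:
            raise ValueError(f"{agent} already reports to {self.manager_of[agent]}")
        self.manager_of[agent] = manager

    def route(self, sender, target_lead):
        # Never agent-to-agent: sender -> PM -> target team lead.
        return [sender, self.pm, target_lead]

router = Router()
router.register("content_strategist", "project_manager")
router.register("tech_lead", "project_manager")
path = router.route("content_strategist", "tech_lead")
```

Trying to register `content_strategist` under a second manager raises immediately, which is exactly the failure mode that caused the duplicate-task chaos described above.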
The reports themselves are structured, not freeform:
```
[Project Manager] Status Update
━━━━━━━━━━━━━━━━━━━━━━━━━━━
Content Team: Blog post draft complete (3/3 tasks done)
Tech Team: API endpoint 80% complete, blocked on auth design
Marketing Team: Social campaign delayed — waiting on assets
━━━━━━━━━━━━━━━━━━━━━━━━━━━
Escalation: Auth design decision needed from Tech Lead
Next check-in: 2 hours
━━━━━━━━━━━━━━━━━━━━━━━━━━━
```
I can scan that in five seconds. Before structured reporting, I was reading paragraph-long status updates from 36 agents. That's not sustainable.
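A report like that doesn't need an LLM to format; a tiny deterministic formatter keeps the layout identical every time, which is what makes it scannable. This is a sketch under my own assumptions (the function name and signature are illustrative):

```python
def render_status(team_lines, escalation, next_checkin):
    """Format team statuses into the fixed, scan-friendly report layout."""
    bar = "━" * 27
    body = "\n".join(f"{team}: {status}" for team, status in team_lines)
    return (
        "[Project Manager] Status Update\n"
        f"{bar}\n{body}\n{bar}\n"
        f"Escalation: {escalation}\n"
        f"Next check-in: {next_checkin}"
    )

report = render_status(
    [("Content Team", "Blog post draft complete (3/3 tasks done)"),
     ("Tech Team", "API endpoint 80% complete, blocked on auth design")],
    escalation="Auth design decision needed from Tech Lead",
    next_checkin="2 hours",
)
```

Because the structure is fixed, you can also grep reports or diff them between check-ins, which freeform prose never allows.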
The Two Files That Make Everything Work
Forget the fancy architecture diagrams for a second. The whole system runs on two files: SOUL.md and AGENTS.md. Get these right, and the agents work. Get them wrong, and you have 36 confused chatbots.
SOUL.md — Who the Agent Is
Every agent has a SOUL.md. It's the identity document. It tells the agent its name, its role, where it sits in the hierarchy, what it's supposed to do, and how it should behave.
Here's a stripped-down version of my Project Manager's SOUL.md:
```markdown
# SOUL.md - Project Manager

## Identity
- Name: Project Manager
- Emoji: 📋
- Team: Management

## Role

### Position
CEO → Project Manager (you) → Team Leads

### Responsibilities
- Type: Manager
- Manages: All team leads
- Core tasks: Task allocation, progress tracking, cross-team coordination
- Reports to: CEO

## Work Style
- Data-driven decision making
- Regular status updates every 2 hours
- Escalate blockers immediately

## Report Mechanism
- Target: Feishu group oc_xxx
- Format: Structured status updates
- Timing: After task completion, on important decisions
```
I learned the hard way that vague SOUL.md files produce vague agents. My first version of the Technical Writer's SOUL.md said something like "writes content and helps with documentation." The output was exactly as unfocused as you'd expect. When I rewrote it to specify exact responsibilities — "writes technical documentation, tutorials, and articles; does NOT do marketing copy or social media posts" — the quality jumped overnight.
The position section matters more than you'd think. When an agent knows exactly where it sits in the hierarchy — who's above it, who's below it, who's beside it — it stops trying to do everyone else's job. My content strategist used to try to write articles instead of planning them. Adding the explicit hierarchy to its SOUL.md fixed that.
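Since vague SOUL.md files were my most common failure, a linter that refuses to deploy an agent with a thin identity file is cheap insurance. This is a sketch of my own devising, not an OpenClaw feature; the required section list mirrors the template above:

```python
# Sections every SOUL.md should carry, per the template shown earlier.
REQUIRED = ["## Identity", "## Role", "## Work Style", "## Report Mechanism"]

def lint_soul(text):
    """Return the section headers a SOUL.md is missing (empty list = looks complete)."""
    return [section for section in REQUIRED if section not in text]

# A vague, one-line identity file fails three of the four checks.
missing = lint_soul("# SOUL.md - Writer\n## Identity\n- Name: Writer")
```

Running this before an agent's first boot would have caught my "writes content and helps with documentation" mistake on day one.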
AGENTS.md — What Happens When the Agent Wakes Up
If SOUL.md is identity, AGENTS.md is ritual. It's the boot sequence — the exact steps the agent follows every time it starts a session.
```markdown
# AGENTS.md - Workspace Configuration

## Every Session Boot Sequence
1. Read `SOUL.md` — Know who you are
2. Read `USER.md` — Know your user
3. Load `SKILL.md` (self-improving) — Enable learning
4. Read `memory/YYYY-MM-DD.md` — Get recent context
5. Check `~/self-improving/agents/<name>/memory.md` — Load long-term memory

## Memory Location
`~/self-improving/agents/<agent_name>/`
```
Five steps. Takes milliseconds. But each one is doing something critical.
Step 1 loads the agent's identity. Step 2 loads information about me — my preferences, timezone, current projects — so the agent doesn't ask things it should already know. Step 3 enables the self-improvement loop: if the agent made a mistake yesterday, it won't repeat it today. Step 4 gives the agent today's context — what's in progress, what decisions have been made. Step 5 loads long-term memory — patterns, lessons learned, institutional knowledge built up over weeks.
That last step is the one that surprised me the most. After about three weeks, agents with long-term memory started making noticeably better decisions. The PM learned which tasks the Technical Writer handles quickly versus which ones need extra time. The CEO learned which types of goals I care most about. It's the difference between working with a contractor who just started and one who's been with you for months.
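The five-step boot sequence is simple enough to sketch in a few lines of Python. This is a minimal stand-in, assuming a flat workspace layout (I use `memory.md` in the workspace root as a placeholder for the real `~/self-improving/agents/<name>/memory.md` path); the function name is illustrative:

```python
import tempfile
from datetime import date
from pathlib import Path

BOOT_FILES = ["SOUL.md", "USER.md", "SKILL.md"]

def boot(workspace):
    """Run the five-step boot sequence: identity, user, skills, daily and long-term memory."""
    ws = Path(workspace)
    ctx = {}
    for name in BOOT_FILES:  # steps 1-3: identity, user profile, skills
        f = ws / name
        ctx[name] = f.read_text() if f.exists() else ""
    daily = ws / "memory" / f"{date.today():%Y-%m-%d}.md"  # step 4: today's context
    ctx["daily"] = daily.read_text() if daily.exists() else ""
    longterm = ws / "memory.md"  # step 5: long-term memory (stand-in path)
    ctx["long_term"] = longterm.read_text() if longterm.exists() else ""
    return ctx

# Tiny demo workspace so the sketch runs end to end.
demo = Path(tempfile.mkdtemp())
(demo / "SOUL.md").write_text("# SOUL.md - Demo Agent")
(demo / "memory").mkdir()
(demo / "memory" / f"{date.today():%Y-%m-%d}.md").write_text("draft in progress")
ctx = boot(demo)
```

Missing files resolve to empty strings rather than errors, so a brand-new agent with no memory yet still boots cleanly.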
What Actually Changed (Honest Numbers)
I tracked metrics before and after deploying the full architecture. Some of these surprised me.
| What I Measured | Before (Just Me) | After (36 Agents) | The Real Story |
|---|---|---|---|
| Content output | 2 articles/week | 8-12 articles/week | Quality held steady, which I didn't expect |
| Status reporting | ~30 min/day | Fully automated | This alone justified the PM agent |
| Bug detection | After deployment | Before deployment | QA agent catches things I never would |
| Financial tracking | Weekly spreadsheet review | Real-time alerts | Found two billing errors in the first week |
| Meeting scheduling | 15 min per meeting | Automated | Sounds small, adds up to hours/month |
| Security scanning | Whenever I remembered | Continuous | Went from monthly-ish to always-on |
But the number that matters most isn't in the table. It's this: the compound output of 36 agents working in parallel, around the clock, with persistent memory, made me — a solo founder — operate at the capacity of roughly a 10-person team.
I'm not exaggerating. I'm probably underestimating, because some of the value is invisible. I don't know how many bugs the QA agent prevented. I don't know how many scheduling conflicts the calendar agent resolved before they reached me. The absence of problems is hard to measure but easy to feel.
Mistakes I Made So You Don't Have To
I deployed everything at once. Don't do this. I set up 20 agents in a weekend, and the resulting chaos set me back a week. Agents were stepping on each other's work, reporting to the wrong managers, executing tasks that overlapped. Start with five. I mean it.
My first SOUL.md files were too vague. "Helps with marketing" is not a role description. "Manages social media calendar, writes platform-specific posts for Twitter and LinkedIn, tracks engagement metrics, reports weekly to Marketing Director" — that's a role description.
I let everyone report to me. For about three days, all 36 agents were reporting directly to me. I was drowning in status updates. The hierarchy exists for a reason. Talk to the CEO. Let the CEO talk to the PM. Let the PM talk to team leads. Your inbox will thank you.
I forgot about memory. For the first week, my agents were stateless. Every session started from scratch. They'd re-ask questions I'd already answered. They'd re-do work that was already done. Configuring the AGENTS.md boot sequence with proper memory loading fixed this completely, but I should have done it from day one.
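The fix for statelessness is mostly a write path to match the boot sequence's read path: every session appends what it learned to today's memory file. A minimal sketch (the `remember` helper is my own naming, not part of OpenClaw):

```python
import tempfile
from datetime import date, datetime
from pathlib import Path

def remember(workspace, note):
    """Append a timestamped note to today's memory file so the next session starts with it."""
    memdir = Path(workspace) / "memory"
    memdir.mkdir(parents=True, exist_ok=True)
    today = memdir / f"{date.today():%Y-%m-%d}.md"
    with today.open("a") as fh:  # append, never overwrite earlier notes
        fh.write(f"- {datetime.now():%H:%M} {note}\n")
    return today

ws = tempfile.mkdtemp()
f = remember(ws, "CEO approved the blog launch plan")
remember(ws, "QA found a flaky test in checkout")
```

Appending rather than rewriting matters: two agents (or two sessions) can both record notes without clobbering each other.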
How to Start (Without Making My Mistakes)
Build the starter squad first. Five agents:
- CEO — receives your goals, breaks them into projects
- Project Manager — takes projects, assigns tasks
- Content Strategist — your first team lead
- Technical Writer — your first executor
- System Monitor — watches your infrastructure while you focus on building
Set them up:
```bash
# Create workspace directories
mkdir -p agents/{ceo,project_manager,content_strategist,technical_writer,system_monitor}

# Create SOUL.md for each agent (use the template above, customize per role)
# Create shared AGENTS.md (boot sequence — same for all)
# Create USER.md (your preferences, timezone, current projects)

# Initialize memory directories
mkdir -p agents/{ceo,project_manager,content_strategist,technical_writer,system_monitor}/memory
```
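If you'd rather not hand-create five SOUL.md files, the scaffolding can be scripted. This sketch generates a stub per starter agent from a cut-down template; the slugs, template text, and `scaffold` function are all my own assumptions, to be customized against the full SOUL.md shown earlier:

```python
import tempfile
from pathlib import Path

# The starter squad from above; slugs are my choice of directory names.
STARTERS = {
    "ceo": ("Management", "CEO"),
    "project_manager": ("Management", "Project Manager"),
    "content_strategist": ("Content", "Content Strategist"),
    "technical_writer": ("Content", "Technical Writer"),
    "system_monitor": ("Monitoring", "System Monitor"),
}

TEMPLATE = """# SOUL.md - {role}
## Identity
- Name: {role}
- Team: {team}
## Role
(fill in position, responsibilities, reports-to)
"""

def scaffold(root):
    """Create each agent's workspace, memory dir, and a SOUL.md stub to customize."""
    made = []
    for slug, (team, role) in STARTERS.items():
        agent_dir = Path(root) / "agents" / slug
        (agent_dir / "memory").mkdir(parents=True, exist_ok=True)
        soul = agent_dir / "SOUL.md"
        if not soul.exists():  # never clobber a file you've already customized
            soul.write_text(TEMPLATE.format(role=role, team=team))
        made.append(slug)
    return made

root = tempfile.mkdtemp()
created = scaffold(root)
```

The existence check is the important design choice: re-running the script after you've hand-tuned an agent's identity won't erase your work.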
Spend the first couple of days getting those five agents right. Really right. Test the reporting chain end to end: give the CEO a goal, watch it flow down to the Technical Writer, watch the result flow back up. Run a real project through it — one article, start to finish.
Once that works, add teams one at a time. Marketing in week two. Tech team in week three. Finance and monitoring in week four. Creative, learning, and life teams in month two.
By the end of two months, you'll have all 36 agents running with battle-tested communication patterns. But the foundation — those first five agents — is where you should invest the most time.
Where to Go From Here
If you've read this far, you're probably either excited or skeptical. Maybe both. I was both when I started.
The full template pack — every SOUL.md, AGENTS.md, USER.md, and directory structure for all 36 agents — is available to download. It'll save you the weeks of iteration I went through:
Get the 36-Agent Template Pack →
It includes pre-configured identity files for every agent, the boot sequence setup, sample reporting formats, memory initialization scripts, and a quickstart guide that'll get your first five agents running in under an hour.
And if you want to follow along as I keep building on this — new agent patterns, things that break, things that surprise me — I write about it regularly.
I'm curious what you'd build first. The architecture is flexible. Maybe your version has 12 agents, not 36. Maybe you need a customer support team instead of a life team. The pattern is the same. The specifics are yours.
Next in this series: "Agent Memory Systems" — how I got my agents to actually learn and remember, and why the long-term memory file turned out to be the most important piece of the entire architecture.
Tags: AI agent team · multi-agent collaboration · OpenClaw · autonomous AI agents · agent architecture · AI automation · SOUL.md · AGENTS.md · agent orchestration · AI team building