UNTAKA corp

Posted on May 24

How I structured Claude Code to run 6 autonomous agents without losing control

#claudecode #ai #productivity #tools

This is Part 2 of Building with Claude Code. Part 1 covers the basic .claude/ folder setup for freelance web dev.

I've been using Claude Code for several months. Like most developers, I started by treating it as a fast question-and-answer tool: type a question, get code, repeat.

The problem: every session started from scratch. No memory of the project state, no way to pick up where I left off, no structure that would hold across sessions.

So I built a structured system. Here's the architecture and the key insight that made it work.

The Core Problem With "Chatbot Mode"

When you use Claude Code as a chatbot, you're implicitly rebuilding context every single session. You re-explain the project, re-explain the constraints, re-explain where you are. It's fast — but you're paying the setup cost every time.

The other problem: a chatbot doesn't have a decision framework. It improvises. Sometimes that's great. For long-running autonomous work, improvisation is the failure mode.

The Architecture: 4 Components

1. CLAUDE.md as Project DNA

This is the mandatory startup file. Claude Code loads it at the start of every session, before it takes any action.

A good CLAUDE.md has five sections:

Identity — what the project is, what stack, who operates it
Startup sequence — exact steps to execute at session start (read this file, read RUNBOOK.md, run diagnostic)
Autonomous permissions — what Claude can do without asking, what requires human approval
Current state — 3 lines: current phase, last action, next action
Rules — 5-10 non-negotiable constraints

The autonomous permissions section is the one most people skip. It's the most important. Without it, the system either asks permission for everything (annoying) or assumes permission for everything (dangerous). With it, you define the boundary precisely.

Update the current state section at the end of every session, and any new session can orient itself in about 30 seconds.
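One way to keep the file honest is a small check that all five sections are actually present. This is a minimal sketch, assuming each section is a Markdown heading with the names listed above; the `missing_sections` helper and the sample text are hypothetical and not part of Claude Code.

```python
import re

# Assumed heading names, taken from the five sections listed above.
REQUIRED_SECTIONS = [
    "Identity",
    "Startup sequence",
    "Autonomous permissions",
    "Current state",
    "Rules",
]

# Matches Markdown headings such as "## Current state" and captures the title.
HEADING = re.compile(r"^#{1,6}[ \t]+(.+?)[ \t]*$", re.MULTILINE)


def missing_sections(claude_md: str) -> list:
    """Return the required sections that have no matching heading."""
    headings = {title.lower() for title in HEADING.findall(claude_md)}
    return [name for name in REQUIRED_SECTIONS if name.lower() not in headings]


# Hypothetical CLAUDE.md that skips the permissions section.
example = """# CLAUDE.md
## Identity
(project, stack, operator)
## Startup sequence
1. Read this file. 2. Read RUNBOOK.md. 3. Run diagnostic.
## Current state
Phase: Build. Last: guide completed. Next: human uploads ZIP.
## Rules
- No paid advertising spend.
"""

print(missing_sections(example))
```

Here the check flags the autonomous permissions section, which is the one the text says most people skip.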

2. Specialized Agents with YAML Frontmatter

Agents live in .claude/agents/. Each is a markdown file with YAML frontmatter:

---
name: judge
description: "Scores opportunities 0-100, selects Top5/Top3. Read-only access."
tools: Read, Glob, Grep
model: sonnet
permissionMode: default
---

The key insight: give each agent exactly the tools it needs for its role, nothing more.

Scout: WebSearch, WebFetch, Write (to data/pipeline/ only)
Judge: Read, Glob, Grep (scoring only, no writes)
Builder: Read, Write, Edit — but only inside experiments/<id>/
Compliance: Read everything, write only to docs/swarm/COMPLIANCE_REVIEW_*.md
Treasury: Read data/portfolio/, Write data/portfolio/ only

Constraints are features. They make the system predictable. A Builder that can't touch governance files is a Builder you can trust to run autonomously.
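The post does not show how the per-path limits are enforced, so the following is only an illustration of the idea: a lookup table of write scopes per agent and a check against it. The `WRITE_PATTERNS` table and `can_write` helper are hypothetical names, and the Builder pattern is simplified to any experiment folder rather than one specific `<id>`.

```python
import fnmatch
import posixpath

# Write scopes copied from the role list above. An empty list means read-only.
# Note: fnmatch's "*" also matches "/", so nested files are covered.
WRITE_PATTERNS = {
    "scout": ["data/pipeline/*"],
    "judge": [],
    "builder": ["experiments/*/*"],
    "compliance": ["docs/swarm/COMPLIANCE_REVIEW_*.md"],
    "treasury": ["data/portfolio/*"],
}


def can_write(agent: str, path: str) -> bool:
    """True if the path falls inside one of the agent's write scopes."""
    # Collapse "../" segments first so a path cannot climb out of its scope.
    normalized = posixpath.normpath(path)
    patterns = WRITE_PATTERNS.get(agent, [])
    return any(fnmatch.fnmatchcase(normalized, p) for p in patterns)


print(can_write("builder", "experiments/exp_002/guide.md"))
print(can_write("builder", "CLAUDE.md"))
print(can_write("scout", "data/pipeline/../../CLAUDE.md"))
```

An unknown agent gets an empty scope by default, so forgetting to register a role fails closed rather than open.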

Model layering matters for cost:

Scout runs on Haiku: cheap, fast, good enough for web search and extraction
Judge and Builder run on Sonnet: better reasoning for decision-critical steps
Not everything needs the most expensive model

3. settings.json as the Safety Layer

This is where governance becomes real — not just documented, but enforced at runtime:

{
  "permissions": {
    "deny": [
      "Read(./.env)",
      "Read(./.env.*)",
      "Bash(rm -rf *)",
      "Bash(curl *)"
    ],
    "ask": ["Bash(*)"],
    "allow": ["Read", "Write", "Edit", "Glob", "Grep"]
  }
}

The .env files are unreadable at the runtime level, not just documented as off-limits. Commands matching the rm -rf and curl patterns are refused outright, and every other Bash command requires human approval before execution. The deny list is the actual safety boundary.
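To reason about how the three lists interact, it can help to model them in a few lines. This is a deliberately simplified sketch, assuming deny is checked before ask and ask before allow, and using plain glob matching; Claude Code's real matcher is more involved (path resolution, for example), and `rule_matches` and `decide` are hypothetical helpers.

```python
import fnmatch

# Same rules as the settings.json above.
PERMISSIONS = {
    "deny": ["Read(./.env)", "Read(./.env.*)", "Bash(rm -rf *)", "Bash(curl *)"],
    "ask": ["Bash(*)"],
    "allow": ["Read", "Write", "Edit", "Glob", "Grep"],
}


def rule_matches(rule: str, tool: str, argument: str) -> bool:
    """A bare rule like "Read" matches every call to that tool;
    a rule like "Bash(curl *)" also globs the argument."""
    if "(" not in rule:
        return rule == tool
    name, _, rest = rule.partition("(")
    pattern = rest[:-1]  # drop the closing ")"
    return name == tool and fnmatch.fnmatchcase(argument, pattern)


def decide(tool: str, argument: str) -> str:
    """Return "deny", "ask", or "allow" for a tool call."""
    # Assumed precedence: the strictest matching list wins.
    for outcome in ("deny", "ask", "allow"):
        if any(rule_matches(r, tool, argument) for r in PERMISSIONS[outcome]):
            return outcome
    return "ask"  # assumption: anything unmatched falls back to prompting


print(decide("Read", "./.env.local"))
print(decide("Bash", "git status"))
print(decide("Edit", "./src/app.py"))
```

The ordering is the design choice that matters: because deny is checked first, a broad allow entry such as `Read` cannot reopen access to `.env`.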

4. RUNBOOK.md as the Heartbeat

One file, always current:

# RUNBOOK
Last updated: 2026-03-31 14:32

## Current Phase: Build — exp_002
## Last Action: Builder completed guide at 14:32
## Next Action: Human to create ZIP and upload to Gumroad
## Scheduled: /portfolio-review at D+1, /kill-or-scale at D+14

Any agent reading this knows exactly what's happening. No re-explaining. 30-second context restore.
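Since a stale RUNBOOK.md is the main failure mode, writing it from a script at session end is one option. A minimal sketch, assuming the four `## Key: value` lines shown above are the whole format; `render_runbook` and `parse_runbook` are hypothetical helpers, and the phase string is written as `Build (exp_002)` purely for illustration.

```python
import re
from datetime import datetime

# Matches lines such as "## Next Action: Human to create ZIP".
FIELD = re.compile(r"^## ([^:]+): (.*)$", re.MULTILINE)


def render_runbook(phase, last_action, next_action, scheduled, now):
    """Build the RUNBOOK.md text with a fresh timestamp."""
    return "\n".join([
        "# RUNBOOK",
        f"Last updated: {now.strftime('%Y-%m-%d %H:%M')}",
        "",
        f"## Current Phase: {phase}",
        f"## Last Action: {last_action}",
        f"## Next Action: {next_action}",
        f"## Scheduled: {scheduled}",
        "",
    ])


def parse_runbook(text):
    """Read the state fields back into a dict keyed by field name."""
    return {key.strip(): value.strip() for key, value in FIELD.findall(text)}


text = render_runbook(
    phase="Build (exp_002)",
    last_action="Builder completed guide at 14:32",
    next_action="Human to create ZIP and upload to Gumroad",
    scheduled="/portfolio-review at D+1, /kill-or-scale at D+14",
    now=datetime(2026, 3, 31, 14, 32),
)
print(parse_runbook(text)["Next Action"])
```

Keeping the format this rigid is what makes the 30-second restore possible: an agent or a script can read the same four fields without interpreting prose.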

The Decision Pipeline

The system runs every idea through a pipeline before touching code:

DISCOVERY → SCORING → COMPLIANCE → DECISION → BUILD → LAUNCH

Each stage has explicit rejection criteria. The pipeline has killed more ideas than it's built — that's the point.
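The stage ordering can be sketched as a loop that stops at the first rejection. The stage names come from the pipeline above; the gate functions, the `run_pipeline` helper, and the idea fields (`score`, `paid_ads`) are hypothetical stand-ins for whatever each stage really checks.

```python
from enum import IntEnum


class Stage(IntEnum):
    DISCOVERY = 1
    SCORING = 2
    COMPLIANCE = 3
    DECISION = 4
    BUILD = 5
    LAUNCH = 6


# Each gate returns a rejection reason, or None to let the idea through.
def scoring_gate(idea):
    return "score below 50" if idea["score"] < 50 else None


def compliance_gate(idea):
    return "requires paid advertising" if idea.get("paid_ads") else None


GATES = {Stage.SCORING: scoring_gate, Stage.COMPLIANCE: compliance_gate}


def run_pipeline(idea):
    """Walk the stages in order; return (stage, reason) at the first
    rejection, or (Stage.LAUNCH, None) if every gate passes."""
    for stage in Stage:
        gate = GATES.get(stage)
        reason = gate(idea) if gate else None
        if reason is not None:
            return stage, reason
    return Stage.LAUNCH, None


print(run_pipeline({"score": 42}))
print(run_pipeline({"score": 80, "paid_ads": True}))
print(run_pipeline({"score": 80}))
```

Because the loop returns on the first failing gate, a rejected idea never reaches BUILD, which is how the pipeline ends up killing more ideas than it builds.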

Scoring model: 10 dimensions

Buyer clarity (does the buyer know they have this problem?)
Urgency (does it hurt now, or is it a nice-to-have?)
Build speed (can it ship in under 8 hours?)
Support burden (will this generate support tickets?)
Stack fit (can we build it with what we have?)
+ 5 more

Auto-reject rules: score < 50, build time > 8 hours, support burden > 2 hours/month.
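The three auto-reject rules translate directly into code. A minimal sketch, assuming the total score has already been computed (the post does not say how the 10 dimensions combine); the `Opportunity` class and its field names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Opportunity:
    score: int                       # 0-100 total from the scoring model
    build_hours: float               # estimated time to ship
    support_hours_per_month: float   # estimated ongoing support load


def auto_reject_reasons(opp: Opportunity) -> list:
    """Return every auto-reject rule the opportunity trips (empty = keep)."""
    reasons = []
    if opp.score < 50:
        reasons.append("score < 50")
    if opp.build_hours > 8:
        reasons.append("build time > 8 hours")
    if opp.support_hours_per_month > 2:
        reasons.append("support burden > 2 hours/month")
    return reasons


print(auto_reject_reasons(Opportunity(72, 6, 1)))
print(auto_reject_reasons(Opportunity(45, 12, 1)))
```

Returning all tripped rules rather than just the first makes the rejection log more useful when reviewing why an idea was dropped.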

The result: you only build things that have a real shot at working.

What I Learned Running This

The governance layer is the most valuable part, not the least. Writing hard prohibitions forces clarity about what the system is for. "No paid advertising spend" and "no daily manual intervention required" aren't constraints — they're design decisions made in advance, when you're thinking clearly, before you're in the middle of a build and tempted to cut corners.

RUNBOOK.md matters more than I expected. Every time I skipped updating it, the next session was painful. Every time I kept it current, the next session started in 30 seconds.

Model layering saves real money. Running discovery on Haiku and escalating to Sonnet only for actual decisions kept the cost of running the whole system sustainable.

The Quick Start (3 Files, 10 Minutes)

You don't need the full 6-agent system. Start with this:

CLAUDE.md — identity, startup sequence, autonomous permissions, current state, 5 rules
RUNBOOK.md — current phase, last action, next action
.claude/settings.json — deny .env, deny rm -rf, ask on all Bash

10 minutes of setup. Your next Claude Code session will feel fundamentally different.

DEV Community