I gave Claude Code 16 specialist roles. Now I ship full-stack features before lunch.

I'm a solo founder. I built a Claude Code plugin where one orchestrator session dispatches 16 narrow AI specialists (planner, api-builder, ui-builder, bug-hunter, security-scanner, accessibility-tester, supabase-auditor, deployer, and more). They coordinate through plain markdown files in memory/. Hooks block force pushes and ungated production deploys. Six slash commands take a feature from idea to staging in under an hour. Everything lives in the repo. No new SaaS subscription. No data leaving the machine.

The problem nobody admits out loud

Solo founders aren't slow because they can't code. They're slow because they wear eight hats at once.

A 30-minute task becomes 3 hours when you also have to think about: "is this auth flow safe?", "did I add the migration?", "is the README still accurate?", "do the keyboard shortcuts work?", "did I just push a service_role key to the client bundle?"

The context-switching tax is the real cost of shipping alone. So I stopped trying to be eight people, and built an orchestrated AI engineering team that I direct instead.

The solo founder operating system in three stats: 16 specialists / 6 commands / file-based memory


The pattern: orchestrator does context, specialists do the work

The single most important design decision: the orchestrator does not write code.

The main Claude Code session has one job — gather context, delegate to a specialist, verify the output. That's it. Every actual write operation (code, migration, doc, deploy) runs inside a narrow subagent with a tight system prompt and a single concern.

Why this matters: a generalist AI burns context on every task. By the time it's writing your migration, it's also remembering your CSS opinions, your API conventions, the bug it fixed yesterday, and the README you asked it to update. The output gets fuzzier as the context fills up.

A roster of specialists with clean context windows beats one big context every time. The orchestrator is the only one with persistent memory of the whole project — and it leans on the file system (memory/) for that, not the chat buffer.

┌──────────────────────────────────────────────────────────────┐
│  Orchestrator (main Claude Code session)                     │
│  → reads memory/ → delegates to specialist → verifies output │
└────────────┬─────────────────────────────────────────────────┘
             │
   ┌─────────┼─────────┬──────────┬───────────┐
   │         │         │          │           │
   ▼         ▼         ▼          ▼           ▼
 planner  api-bldr  ui-bldr   bug-huntr  security-scnr   ...

The orchestrator does context. Specialists do the work.


The 16 specialists

Each specialist is a Claude Code subagent — its own markdown file in agents/, its own system prompt, its own tool allowlist. They group into four bands:

Planning band

  • planner — turns a one-line ask into a precise spec with acceptance criteria
  • architect — picks patterns, writes ADRs, evaluates trade-offs
  • brd-decomposer — turns a Business Requirements Doc into a phased roadmap

Build band

  • api-builder — Next.js route handlers, validation, error contracts
  • db-architect — schema, RLS policies, indexes
  • ui-builder — React components, design-token-aligned, a11y-aware
  • test-author — unit + integration coverage for what just shipped
  • migration-author — Supabase migration files with up/down logic

Audit band (run in parallel)

  • bug-hunter — runs against the diff, looks for runtime bugs
  • security-scanner — looks for auth holes, leaked secrets, injection
  • accessibility-tester — keyboard, contrast, ARIA, focus traps
  • supabase-auditor — RLS coverage, policy gaps, exposed service keys

Operations band

  • supabase-deployer — pushes migrations, gated
  • bug-fixer — closes a single bug, tightly scoped
  • docs-writer — keeps the README and CHANGELOG honest

Sixteen narrow files beat one giant prompt. Each specialist's prompt is ~80 lines max, focused entirely on its concern. Swapping one out is a one-file change.
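For reference, each agent file is plain markdown with a YAML frontmatter header, which is how Claude Code defines subagents. A minimal sketch of one roster entry, with the prompt body invented for illustration:

---
name: bug-hunter
description: Reviews the current git diff for runtime bugs. Invoke after every implement step.
tools: Read, Grep, Glob, Bash
---

You are a bug hunter. Your scope is the current git diff, nothing else.
Look for runtime errors, unhandled rejections, and broken edge cases.
Report findings as a checklist. Do not fix anything yourself.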

16 specialists. Each owns one concern.


File-based memory: the part nobody else does

The orchestrator's memory does not live in chat. It lives in your repo, in markdown:

memory/
├── activity-log.md       # append-only — every Write/Edit auto-logged
├── current-sprint.md     # what we're working on RIGHT NOW
├── roadmap.md            # phased plan from the BRD
├── known-issues.md       # bugs we know about, won't ship today
├── ADR-template.md       # architecture decisions
└── brds/                 # source business requirements
    └── _BRD-TEMPLATE.md

Why files, not a vector DB:

  1. It's diffable. Every memory change shows up in git log. You can see, line by line, what the AI thought yesterday vs. today.
  2. It's portable. Switch to a different AI tool tomorrow? Your memory is still there. Plain markdown.
  3. It's reviewable. I read current-sprint.md myself before standup. The AI and I share the same source of truth.
  4. It survives sessions. Close Claude Code, reboot, come back next week — memory/ is still there. The orchestrator picks up exactly where it left off.

A vector DB would give you "similar past work" lookups. A file system gives you precise, auditable, version-controlled state. For a solo team, files win.
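To make that concrete, here's what a few lines of the log can look like; the entries below are invented for illustration:

# memory/activity-log.md (hypothetical excerpt)
- 2025-01-14T09:02:11Z Write src/app/api/profile/route.ts
- 2025-01-14T09:04:37Z Write supabase/migrations/0042_profile_editing.sql
- 2025-01-14T09:11:05Z Edit src/components/ProfileForm.tsx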



Six commands. Idea to production.

The whole workflow surfaces as slash commands: six pipeline steps plus a standup helper. Type one, and the orchestrator dispatches the right specialists.

/cbinc-init-from-brd    # Read a BRD, spit out a phased roadmap
/cbinc-plan             # Draft a spec for a single feature
/cbinc-implement        # Build it (api + ui + db + tests in parallel)
/cbinc-audit            # Run all four auditors in parallel against the diff
/cbinc-ship             # Staging deploy, halt for prod approval
/cbinc-document         # Update README, CHANGELOG, ADRs
/cbinc-standup          # Generate yesterday/today/blockers from activity-log
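Each command is itself a markdown prompt file in the repo (Claude Code reads custom slash commands from .claude/commands/). A minimal sketch of what /cbinc-plan could look like, with the wording invented for illustration:

# .claude/commands/cbinc-plan.md (sketch; $ARGUMENTS is the text typed after the command)
Read memory/roadmap.md and memory/current-sprint.md first.
Dispatch the planner subagent to draft a spec for: $ARGUMENTS
Write the draft spec to memory/current-sprint.md and stop.
Do not start implementation until I explicitly approve the spec.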

A real Tuesday morning:

08:30  /cbinc-plan add user profile editing
       → planner drafts spec → I approve in 2 min

09:00  /cbinc-implement profile-editing
       → api-builder, ui-builder, db-architect, test-author run in parallel
       → I review the diff → request 2 changes

10:15  /cbinc-audit profile-editing
       → bug-hunter, security-scanner, a11y-tester, supabase-auditor run in parallel
       → security-scanner finds an RLS gap → patch applied
       → audit re-runs clean

11:00  /cbinc-ship profile-editing
       → staging deploy → smoke checks pass
       → orchestrator halts, asks me to approve production
       → I type 'y' → live by 11:15

A feature shipped before lunch, solo, with every safety gate intact.

The six commands as a pipeline


Safety hooks: policy as code, not vibes

This is where the system earns its keep. Claude Code's hook system runs scripts at lifecycle events, passing each tool call to the script as JSON on stdin; a script that exits with code 2 blocks the call before it runs. I use hooks to enforce hard limits:

// hooks/settings.json (excerpt)
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [{
          "type": "command",
          "command": "bash -c 'cmd=$(jq -r .tool_input.command); if echo \"$cmd\" | grep -qE \"git push.*--force|rm -rf /|DROP DATABASE\"; then echo BLOCKED >&2; exit 2; fi'"
        }]
      }
    ],
    "PostToolUse": [
      {
        "matcher": "Write|Edit",
        "hooks": [{
          "type": "command",
          "command": "bash scripts/append-to-activity-log.sh"
        }]
      }
    ],
    "SessionEnd": [
      {
        "hooks": [{
          "type": "command",
          "command": "bash scripts/write-session-summary.sh"
        }]
      }
    ]
  }
}

What this gives me:

  • No force pushes. Ever. Even if the AI thinks it's a good idea.
  • No rm -rf near root. The pattern matcher catches it before the tool runs.
  • No production deploy without my explicit approval. The deployer halts at staging — there's no flag the AI can pass to skip the gate.
  • Every write logged. activity-log.md is append-only. I can rewind to any point in the project.
  • Session summaries. When I close Claude Code, it writes a one-page summary so the next session picks up clean.

These aren't suggestions to the AI. They're hooks. They run in the runtime layer, not the prompt layer. The AI cannot override them.
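The logging script referenced above isn't shown in the post; here's a minimal sketch of what it can look like, assuming the hook input arrives as JSON on stdin with tool_name and tool_input.file_path fields:

#!/usr/bin/env bash
# scripts/append-to-activity-log.sh (sketch, not the post's actual script)
set -euo pipefail

payload=$(cat)                                   # hook JSON arrives on stdin
tool=$(echo "$payload" | jq -r '.tool_name')     # Write or Edit, per the matcher
file=$(echo "$payload" | jq -r '.tool_input.file_path // "unknown"')

# Append-only: one timestamped line per write; never rewrite existing lines.
echo "- $(date -u +%Y-%m-%dT%H:%M:%SZ) $tool $file" >> memory/activity-log.md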


Observability: watch every agent live

When four auditors run in parallel, you want to see them. I added a small live dashboard that tails the activity log + the per-agent logs and renders them in a single browser pane:

The agent monitor / observability dashboard with multiple specialists running

You can swap this for whatever you like — tail -f in a tmux pane works fine. The point is: when the AI is doing something on your behalf, you should be able to see exactly what.
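If you go the tmux route, the whole "dashboard" can be two panes. The log paths here are hypothetical; adjust them to wherever your hooks write:

# tail the orchestrator's memory plus per-agent logs side by side
tmux new-session -d -s agents 'tail -f memory/activity-log.md'
tmux split-window -h -t agents 'tail -f logs/agents/*.log'
tmux attach -t agents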


The honest tradeoffs

This isn't a silver bullet. Things that don't work well:

It's slow on the first feature of a new repo. The specialists need a memory/ to read against. The first day you set this up, you're hand-holding the planner through context it doesn't have yet. Day three onwards, it flies.

Migrations are still scary. I review every Supabase migration line by line. The AI is good at writing them but I trust the diff, not the AI's confidence. The supabase-auditor catches most issues but I treat it as a second opinion, not the final word.

You need a clear stack. This is tuned for Next.js + Supabase + Netlify. If your stack is exotic — Phoenix + Postgres + Fly.io — you'll need to retune the api-builder and deployer prompts. Doable, but it's real work, not a one-line config change.

Cost. Sixteen specialists, run on every audit, against a non-trivial diff, eats tokens. Budget for it. I run a Claude Pro subscription plus API credit for the heavy days. The trade vs. hiring an engineer is still wildly favorable, but it's not free.

It can't do design. None of this replaces taste. The ui-builder produces correct, accessible components. It does not produce beautiful products. That's still on me, working with a designer.


What I'd build next if I were starting over

If you're going to copy this pattern, two things I'd do differently:

  1. Start with the BRD decomposer, not the planner. When I started, I went /plan first. Now I always start with a BRD doc, run /cbinc-init-from-brd, and let the system generate the phased roadmap. The roadmap becomes the source of truth, and every subsequent /plan references it. Way less drift.

Have a Business Requirements Document? Decompose it.

  2. Write the activity-log hook on day one. I added it in week two. The two weeks of un-logged history are a real gap when I'm trying to understand "wait, why did we structure it this way?" Don't skip it.

And if you want to see this running live on your stack — Next.js + Supabase or otherwise — I do free 15-minute walkthroughs. I'll show you the system, look at your repo, and tell you honestly if it fits or not:

https://calendly.com/creativebrain-ca/free-mvp-strategy-call

The whole orchestrator pattern is documented in the repo (commands, agents, hooks, memory model). Happy to share specifics in the comments — drop a question and I'll dig into your stack.


If this resonated, a ❤️ or a follow tells the dev.to algorithm to show it to other founders. Comments with your own orchestrator patterns get bumped to the top — I want to hear what you're building.
