Most CLAUDE.md files I've seen look like this:
```markdown
Use TypeScript. Prefer functional components. Run prettier before committing.
```
Mine looks like this:
```markdown
You are responsible for this business's revenue, strategy, product
development, marketing, and operations. You make decisions autonomously.
You do not wait for instructions.
```
Same file. Very different outcomes. Let me show you what happens when you treat CLAUDE.md not as a coding style guide, but as an operating manual for an autonomous agent.
## The Setup
I run a project called Hideyoshi. It's a Claude Code agent configured to operate a business end-to-end: build the product, deploy it, write marketing copy, manage releases. My role has shifted from "person who writes code" to "person who reviews PRs and approves payments."
The entire system runs on Markdown configuration files. No custom tooling, no wrapper scripts, no API integrations. Just CLAUDE.md, a handful of skill files, and Claude Code doing its thing.
Here's how the configuration actually works.
## Section 1: Identity and Responsibilities
The first thing in my CLAUDE.md isn't a coding convention. It's a job description.
```markdown
## Your Responsibilities

1. Revenue: you own the P&L
2. Strategy: market analysis, competitive research, planning
3. Product: design, build, ship, iterate
4. Marketing: acquisition, awareness, conversion
5. Operations: everything else the business needs
```
This sounds abstract, but it changes the agent's behavior in concrete ways. When I say "we need more traffic," the agent doesn't ask me what to do. It researches channels, drafts content, and opens a PR with a marketing page. When a test fails, it doesn't just report the failure -- it traces the root cause, fixes it, adds a regression test, and commits.
The job description creates a default bias toward action. Without it, the agent defaults to "helpful assistant" mode: waiting for specific instructions, asking clarifying questions, hedging its answers. With it, the agent defaults to "I own this, let me figure it out."
## Section 2: The Trust Boundary
This is the part that took the most iteration. You need two explicit lists:
```markdown
## Do Without Asking

- Format, lint, fix code style
- Run tests
- Commit and push (within scope)
- Install dependencies
- Research and analysis
- Draft marketing content

## Requires Human Approval

- Releases and version changes
- Any operation that spends money
- Security-impacting changes
- Bulk operations (5+ items affected)
- Direct production impact
- Major strategy pivots
```
The boundary between these two lists is where the entire system succeeds or fails. Too restrictive and the agent pings you every five minutes -- you're back to being a human in the loop for every decision. Too loose and you wake up to a surprise production deployment or an unexpected charge on your credit card.
The sweet spot I landed on: the agent can complete an entire development cycle (code, test, commit, push) without interruption. Deployment and anything involving money are the human checkpoints.
This means the agent can build a complete feature, write tests, commit with a meaningful message, and push to a branch -- all without asking. But it can't merge to main, deploy, or sign up for a new SaaS tool.
## Section 3: Design Guardrails (The Hard Lesson)
I learned this one the painful way. Without design constraints, Claude Code produces UI that is technically functional and aesthetically... recognizable. You know the look: gradient backgrounds in four colors, icons on everything, rounded corners you could lose a marble in.
My config now includes this:
```markdown
## Design Principles

- Maximum 2 colors: base + one accent
- No emojis in the UI
- No multi-color gradients
- Icons: minimal, only when they add meaning
- Reference aesthetic: Linear, Stripe, Vercel
- Typography creates hierarchy, not color
- Verify every UI change with a screenshot
```
That last line matters. The agent takes a Playwright screenshot after every UI change and visually checks its own work. This single rule eliminated about 80% of the "it works but looks terrible" PRs.
The reference list (Linear, Stripe, Vercel) is surprisingly effective. Instead of trying to describe good design in words -- which is like trying to describe music in a spreadsheet -- you give the agent concrete examples of the target aesthetic. It gets the point.
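One way to make that screenshot rule more mechanical is to spell it out as an explicit loop in the config. This is a sketch of how such a rule could read, not a verbatim excerpt from my file:

```markdown
## UI Verification Loop

1. Make the change
2. Take a full-page Playwright screenshot of the affected page
3. Check the screenshot against the design principles above
4. If anything violates a rule, fix it and re-screenshot before committing
```

Writing it as numbered steps rather than a single sentence gives the agent a checklist to walk through instead of a vibe to interpret.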
## Section 4: Multi-Agent Safety
If you run multiple Claude Code instances on the same repo (and you should -- it's wildly productive), you need coordination rules. Every rule below exists because of a real incident:
```markdown
## Multi-Agent Rules

- Never create, apply, or drop a git stash
- Never switch branches unless explicitly told to
- Only stage your own files when committing
- Always git pull --rebase before pushing
- Ignore unfamiliar files -- another agent is working on them
```
The `git stash` rule is the one that bit me hardest. Agent A stashes its work-in-progress, Agent B runs `git stash`, and now Agent A's stash is buried under B's. Agent A pops what it thinks is its stash and gets Agent B's half-finished work. Chaos.
The fix is simple: don't stash. Use branches. But the agent won't know that unless you tell it.
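If you want "use branches" as an explicit rule rather than folklore, it can look something like this (a sketch; the branch-naming scheme is illustrative, not from my actual config):

```markdown
## Instead of Stashing

- Park work-in-progress on a named branch: `git switch -c wip/<agent>-<task>`
- Resume later with `git switch wip/<agent>-<task>`
- Delete the branch once the work lands
```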
## Section 5: Skills as Composable Behaviors
One monolithic CLAUDE.md doesn't scale. My project splits specialized behaviors into skill files that live in `.agents/skills/`:

```
.agents/skills/
├── copywriting/SKILL.md
├── seo-audit/SKILL.md
├── launch-strategy/SKILL.md
├── pricing-strategy/SKILL.md
├── content-strategy/SKILL.md
└── ...
```
Each skill has a clear scope (when it applies), concrete rules (do this, don't do that), and an escalation path (when to stop and ask a human). The agent loads the relevant skill based on the task at hand.
This is what makes the system extensible. Adding a new capability doesn't mean editing a giant config file and hoping you don't break something. You add a new skill file. The agent discovers it and uses it when the context matches.
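Here's the shape of a skill file, using a made-up copywriting skill as the example. The three headings are the structure that matters; the specific rules are placeholders, not my real ones:

```markdown
# Copywriting

## Scope
Applies when drafting landing pages, release notes, or marketing posts.

## Rules
- Lead with the benefit, not the feature
- One idea per sentence; cut filler words

## Escalation
Stop and ask a human before publishing anything customer-facing.
```

Scope tells the agent when to load the skill, rules tell it what to do, and escalation tells it when to stop.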
## What This Actually Looks Like Day-to-Day
Yesterday, I told the agent we were launching a product. It:
- Built the landing page (Next.js, Tailwind, responsive)
- Set up structured data for SEO
- Created a free lead magnet (a starter CLAUDE.md template)
- Drafted marketing posts for four different channels
- Added analytics tracking
- Committed each piece as a separate, scoped PR
I reviewed the PRs, approved them, and the product was live. Total hands-on time for me: maybe 30 minutes of review across the day.
Is this replacing senior engineering judgment? No. I still make the strategic calls, review the code, and approve anything that touches production. But the sheer volume of execution that happens between those checkpoints is something I couldn't do alone.
## Try the Config Pattern Yourself
If you want to experiment with this approach, I put together a free CLAUDE.md starter template you can drop into any project. It includes the trust boundary pattern, basic design guardrails, and a simple skill structure.
Grab it at hideyoshi.app -- no signup required, just a direct download.
And if you want the full production setup -- complete templates, 30+ skill files, multi-agent safety configs, and real-world agent configurations -- that's in The Autonomous AI Agent Playbook. $19, one-time, all Markdown.
Questions about any of these patterns? I'm in the comments.