Everyone's talking about AI coding agents. Most people are still writing CLAUDE.md files that look like this:
```
Use TypeScript. Follow best practices. Be helpful.
```
That's a style guide, not a system prompt. Here are 5 patterns I've tested in production that actually change how the agent behaves.
## 1. Constrained Autonomy
The biggest unlock wasn't giving the agent more freedom. It was defining exactly where the fence is.
```markdown
## Constrained Autonomy

### Do without asking:
- Code formatting, lint fixes
- Running tests
- Commits and pushes (within scope)
- Installing dependencies (one auto-retry on failure)
- Research, analysis, reports
- Drafting marketing content

### Ask first:
- Releases, version changes
- Anything that costs money
- Security-impacting changes
- Bulk operations (5+ PRs/Issues — show count, then confirm)
- Direct production impact
- Major strategy pivots
```
Why this works: the agent stops asking permission for trivial work, but you keep a kill switch on anything expensive or irreversible. The 5+ threshold is oddly specific because it came from a real incident: an agent once tried to close 47 issues at once.
## 2. Skill System (Modular Behavior)
One massive CLAUDE.md doesn't scale. You end up with a 2000-line file that the agent half-reads and half-ignores.
The fix: skills as separate Markdown files that load on demand.
```markdown
## Skill Extension

Specialized behaviors live in `.agents/skills/*/SKILL.md`.
Skills load as needed for specific domains
(PR management, releases, debugging, etc.).

Skill structure:
- Frontmatter with `name` and `description`
- Clear scope (when this skill applies / doesn't apply)
- Specific do/don't rules
- Escalation paths
```
Each skill is a self-contained behavior module. A debugging skill forces root cause analysis before any fix attempt. A PR skill enforces single-topic commits. A release skill runs a full pre-flight checklist.
The agent loads 2-3 relevant skills per task instead of processing your entire configuration every time. Context stays clean, behavior stays predictable.
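As a concrete illustration, here is what a minimal skill file might look like. It follows the structure listed above, but the file name, frontmatter values, and rules are hypothetical, not taken from any real configuration:

```markdown
---
name: debugging
description: Root-cause-first debugging workflow. Load for bug reports and failing tests.
---

## Scope
- Applies: reported bugs, failing tests, unexplained behavior
- Does not apply: new features, refactors, style changes

## Rules
- Do: reproduce the symptom before proposing any change
- Do: state the root cause explicitly (file and line)
- Don't: apply a fix you cannot tie to a specific root cause

## Escalation
- If the root cause is in a third-party dependency, report it instead of patching vendored code
```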
## 3. Multi-Agent Safety
Running two Claude Code instances on the same repo is incredibly productive. Running three without safety rules is how you lose a day of work.
```markdown
## Multi-Agent Safety

When multiple agents work in parallel:
- Never create, apply, or drop stashes (`git stash` can destroy another agent's work)
- Never switch branches unless explicitly told to
- Never create/modify git worktrees unless explicitly told to
- Only stage your own files when committing
- `git pull --rebase` before pushing — never discard others' work
- Unknown files in the repo? Ignore them and focus on your task
```
Every single one of these rules exists because of a real incident. The git stash one was particularly painful — Agent A stashed Agent B's uncommitted work, then Agent B couldn't find its changes.
The "ignore unknown files" rule prevents a cascade where Agent A sees Agent B's files, decides they look wrong, and "helpfully" reverts them.
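To see the `git pull --rebase` rule in action, here is a small self-contained sketch (temporary repositories, hypothetical file names; assumes git ≥ 2.28 for `init -b`). Two "agents" push to the same remote, and the second one rebases first, so neither commit is lost:

```shell
#!/bin/sh
# Sketch: two agents share one remote; agent B rebases before pushing
# so agent A's commit is preserved. Paths and file names are hypothetical.
set -e
tmp=$(mktemp -d)
cfg="-c user.name=agent -c user.email=agent@example.com"

git init -q --bare -b main "$tmp/origin.git"

# Seed the remote with a base commit so both agents share history.
git clone -q "$tmp/origin.git" "$tmp/seed"
cd "$tmp/seed"
echo "readme" > README.md
git add README.md
git $cfg commit -q -m "base"
git push -q origin HEAD:main

# Each agent works in its own clone.
git clone -q "$tmp/origin.git" "$tmp/agentA"
git clone -q "$tmp/origin.git" "$tmp/agentB"

# Agent A commits and pushes first.
cd "$tmp/agentA"
echo "a" > feature_a.txt
git add feature_a.txt            # stage only your own files
git $cfg commit -q -m "A: feature a"
git push -q origin main

# Agent B committed in parallel; it rebases onto A's push before pushing,
# instead of force-pushing or discarding A's work.
cd "$tmp/agentB"
echo "b" > feature_b.txt
git add feature_b.txt
git $cfg commit -q -m "B: feature b"
git $cfg pull -q --rebase origin main   # replay B's commit on top of A's
git push -q origin main

git log --oneline origin/main
```

After the final push, the shared history contains the base commit plus both agents' commits, with A's first.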
## 4. Design Guardrails (Kill the AI Aesthetic)
This one surprised me. Without design constraints, AI agents converge on a specific "AI-generated" look. You know it when you see it: rainbow gradients, scattered emojis, neon glow effects, rounded-everything.
```markdown
## Design Principles

Design is a first-class priority. Working isn't enough — it must look good.

### Eliminate AI aesthetic:
- No lazy gradients (rainbow, multi-color). Subtle single-color only
- No emojis in UI (or text content, in principle)
- Minimal icons. Avoid generic sets (rockets, lightbulbs, gears)
- Avoid "AI-looking" defaults: neon, heavy drop shadows, over-rounded cards, meaningless animations
- References: Linear, Stripe, Vercel, Notion — restrained colors, hierarchy through typography
- Max 2 colors (base + 1 accent)
```
The reference list is critical. Without it, "make it look professional" is too vague. With "look at how Linear does it," the agent has a concrete visual target.
The emoji ban alone improved output quality more than I expected. AI agents love emojis. Every heading gets a rocket, every feature gets a sparkle. Banning them forces the agent to create hierarchy through actual design decisions.
## 5. Verify-Then-Act (Evidence Before Fixes)
The default AI behavior: you report a bug, it immediately proposes a fix. The fix is often wrong because it's guessing at the root cause.
```markdown
## Verify Before Acting

- Don't guess. Read the code, check the data, then decide.
- Bug fixes require evidence:
  1. Proof of symptom
  2. Root cause identification (file/line)
  3. Fix that addresses the root cause
  4. Regression test
- Read npm dependency source and local code before concluding where a bug lives
```
This pattern forces a diagnostic workflow. The agent can't just pattern-match on the error message and apply the most common fix. It has to trace the actual execution path, find the actual broken line, and prove its fix addresses that specific cause.
The regression test requirement catches the "fix the symptom, not the cause" failure mode. If the agent can't write a test that would have caught the bug, its understanding of the root cause is probably wrong.
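One way to operationalize this (a hypothetical template, not part of the config shown above) is to make the agent fill in the evidence before it is allowed to edit any code:

```markdown
## Bug Evidence (fill in before any fix)
- Symptom: exact error message or failing behavior, reproduced locally
- Root cause: file and line, plus a one-sentence explanation
- Fix rationale: why the change addresses the cause, not just the symptom
- Regression test: a test that fails before the fix and passes after
```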
## The Compound Effect
None of these patterns is revolutionary on its own. But together, they create something that behaves less like a chatbot and more like a disciplined engineer:
- Constrained autonomy means it doesn't waste your time with trivial approvals
- Skills keep behavior focused and predictable
- Multi-agent safety lets you scale without chaos
- Design guardrails produce output you're not embarrassed to ship
- Verify-then-act catches bugs properly instead of cargo-culting fixes
The difference between a useful AI agent and an impressive demo is almost entirely in the configuration layer.
I've packaged these patterns (and about 15 more) into a complete playbook with ready-to-use templates. Free starter template at hideyoshi.app. Full playbook with 20+ files: 50% off this week with code LAUNCH50 at hideyoshi.app/playbook.