I got tired of giving Claude the same instructions over and over.
Every time I needed a code review, I'd explain what to look for. Every time I needed marketing content, I'd describe the tone and format. Every time I needed to handle a scope change with a client, I'd walk through the process from scratch.
So I built a framework: 42 specialized agents across 11 departments, plus 10 playbooks for common workflows.
The Problem with "AI Agents"
Most agent repos I've seen fall into the same trap: vague role-play.
"You are a senior developer. You write clean code."
Cool. But what does that actually mean? What are the inputs? What are the outputs? What should the agent explicitly NOT do?
Without boundaries, you get generic responses. The AI tries to be helpful by doing everything, which means it does nothing particularly well.
One Agent = One Job
The core philosophy is simple: each agent has one clear outcome, not a personality.
Here's the structure every agent follows:
- Outcome — A verifiable result (not "help with code" but "catch bugs before users do")
- Inputs — What it receives and in what format
- Steps — Explicit operations, not vibes
- Outputs — Deliverables for other agents or humans
- Boundaries — What it does and does NOT do, and who owns the work it doesn't
The boundaries section is the most important part. It prevents scope creep and makes handoffs explicit.
Example: The Code Reviewer
# Code Reviewer
## Outcome
Quality gate before merge - no PR ships without review.
## Inputs
- PR diff or branch name
- Context about the feature
- Relevant test results
## Steps
1. Check for obvious bugs and logic errors
2. Verify error handling
3. Look for security issues
4. Assess readability and maintainability
5. Verify tests cover the changes
## Outputs
- Approve / Request Changes / Comment
- Specific line-by-line feedback
- Summary of concerns (if any)
## Boundaries
✅ Reviews code quality, logic, security
✅ Suggests improvements
❌ Does NOT merge PRs (→ delivery-manager)
❌ Does NOT write the fix (→ frontend-developer or backend-architect)
❌ Does NOT decide if feature should ship (→ product-manager)
See the difference? The agent knows exactly what it's responsible for and what belongs to someone else.
The 11 Departments
The framework covers a complete software house:
| Department | Agents |
|---|---|
| Engineering | backend-architect, code-reviewer, estimator, frontend-developer, infrastructure-maintainer |
| Product | feedback-synthesizer, opportunity-evaluator, product-manager, sprint-planner |
| Marketing | analytics-reporter, content-creator, distribution-manager, experiment-tracker, launch-strategist, test-results-analyzer, tiktok-strategist |
| Sales | account-executive, proposal-writer, sales-developer |
| Client Management | client-manager, scope-change-handler |
| Project Management | delivery-manager, priority-arbiter, release-retrospective-owner |
| Design | design-system-manager, ui-designer, ux-researcher |
| Operations | finance-tracker, knowledge-manager, onboarding-coordinator, support-responder, vision-keeper |
| Legal | contract-reviewer, ip-protector, nda-manager |
| Security | access-controller, compliance-monitor, incident-responder, security-auditor |
| Testing | automation-engineer, bug-triager, qa-tester |
You don't need all of them. Start with the ones you actually use and add more when work piles up.
Playbooks: When Multiple Agents Work Together
Single agents handle single jobs. Playbooks coordinate multiple agents for end-to-end workflows.
Product Launch Playbook:
opportunity-evaluator → vision-keeper → product-manager → launch-strategist
→ sprint-planner → [BUILD] → delivery-manager → [MARKETING]
→ infrastructure-maintainer → [POST-LAUNCH]
Each step has checkpoints: Go/No-Go, Requirements Locked, Design Complete, Code Complete, QA Sign-off, Launch Ready, Go Live.
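If you want to drive part of a playbook from the terminal instead of by hand, here's a minimal sketch of what that could look like. It assumes Claude Code's `-p` (print) flag for non-interactive output; the file paths, prompt wording, and feature description are placeholders I made up, not part of the framework.

```bash
#!/usr/bin/env bash
# Hypothetical driver for the first stretch of the Product Launch Playbook.
# Each agent's output becomes the next agent's context; a human confirms
# every checkpoint before the chain continues.
set -euo pipefail

agents=(
  "product/opportunity-evaluator.md"   # Go/No-Go
  "operations/vision-keeper.md"        # fit with long-term vision
  "product/product-manager.md"         # Requirements Locked
  "marketing/launch-strategist.md"     # Launch Ready plan
)

context="Idea: public API v2 for our scheduling product."  # placeholder input

for agent in "${agents[@]}"; do
  echo "=== ${agent} ==="
  # Feed the agent definition plus the running context as one prompt.
  context=$(claude -p "$(cat "$agent")

Act as the agent defined above. Current context:
${context}

Produce your Outputs section for the next agent.")
  echo "${context}"
  read -r -p "Checkpoint passed, continue? [y/N] " answer
  [[ "${answer}" == "y" ]] || { echo "Stopping at ${agent}."; exit 1; }
done
```

The loop is just glue; the checkpoints stay human decisions, which is the point of the playbook.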
Other playbooks included:
- Sprint Execution
- Growth Experiment
- Content Campaign
- Bug Escalation
- Client Onboarding
- Sales Pipeline
- Scope Change
- Security Incident
- Project Handoff
How to Use It
It's not software to install—just markdown files you feed to your AI.
With Claude.ai:
[Paste content of marketing/content-creator.md]
Write a LinkedIn post announcing our new API. Target: developers. Key benefit: 10x faster integration.
With Claude Code:
claude "Read marketing/content-creator.md and act as that agent. Write a Twitter thread about our new feature."
Chain multiple agents:
Step 1 - Act as tiktok-strategist:
"Our product is a developer tool. Give me 3 angles and hooks."
Step 2 - Act as content-creator:
"Take angle #2 and write the full script."
Step 3 - Act as analytics-reporter:
"This got 50K views, 2K likes, 500 signups. Analyze and recommend next steps."
The Scaling Rule
Start with 3 agents you'll actually use daily. For most people that's:
- content-creator
- support-responder
- code-reviewer
Then observe where work piles up. Each pile becomes a new agent.
If the same request appears 3 times → create a playbook.
Get It
The whole framework is open source under Apache 2.0:
GitHub: github.com/fom-dev/company-in-a-box
Fork it, customize it, make it yours.
The agents are designed for a software house but the structure works for any business. Replace "client-manager" with "customer-success" or "account-manager" - the pattern stays the same.
What agents are you using? What's missing from this list? I'd love to hear what workflows you'd want playbooks for.