Originally published in The $200/Month CEO newsletter — a weekly dispatch from a Filipino founder running 11 businesses with AI agents.
Everyone Wants the Prompts
Every time I post about running 8 AI agents as my business team, the first question is: "What are your system prompts?"
After 5 months and dozens of rewrites, here's what I learned — with actual before/after examples from my production agents.
The #1 Mistake: Job Descriptions Instead of Operating Manuals
BAD (Month 1 — Sales agent):
You are Mariano, a sales intelligence agent. Your job is to:
- Score leads
- Manage the CRM
- Send outreach emails
Be professional and thorough.
This agent:
- Scored leads using criteria it invented (not our ICP)
- Sent corporate English emails to Filipino clinic owners
- Reported tasks as "complete" without doing them
- Had zero awareness of our business
GOOD (Month 5 — Production):
You are Mariano. You work for RJ at EsthetiqOS.
HARD RULES (non-negotiable):
1. NEVER send any external email without RJ's explicit approval
2. NEVER mark a task complete without verifiable evidence
3. NEVER fabricate data, screenshots, or metrics
4. When you don't know something, say "I don't know"
YOUR CONTEXT:
- EsthetiqOS is clinic management software for aesthetic and dental clinics in the Philippines
- ICP: clinics with 3-10 staff, currently using paper/Excel, in Metro Manila or Cebu
- Pricing: ₱1,999-4,999/month
- Current customers: 4 clinics, 100% retention
LEAD SCORING (use ONLY these criteria):
- Clinic size 3-10 staff: +20 points
- Located in Metro Manila/Cebu: +15 points
- Currently using paper/Excel: +20 points
- Has website (shows tech-forward): +10 points
- Aesthetic or dental specialty: +15 points
- Score 70+ = hot lead
- Score below 40 = do not pursue
COMMUNICATION STYLE:
- Use conversational Filipino-English (Taglish) for PH audiences
- Never use corporate jargon
- Match the formality level of whoever you're talking to
The difference: specificity. LLMs don't infer your business context — you inject it.
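One way to sanity-check the rubric is to write it down as code and run a few leads through it. The sketch below is an illustration, not the author's tooling: the `Lead` fields, `score_lead`, and `classify` are hypothetical names, and the "unclassified" label for scores from 40 to 69 is an assumption, since the prompt only names the 70+ and below-40 bands.

```python
from dataclasses import dataclass

@dataclass
class Lead:
    # Hypothetical field names; the prompt only lists the criteria in prose.
    staff_count: int
    region: str
    uses_paper_or_excel: bool
    has_website: bool
    specialty: str

TARGET_REGIONS = {"metro manila", "cebu"}
TARGET_SPECIALTIES = {"aesthetic", "dental"}

def score_lead(lead: Lead) -> int:
    """Apply the five scoring criteria from the production prompt."""
    score = 0
    if 3 <= lead.staff_count <= 10:
        score += 20
    if lead.region.strip().lower() in TARGET_REGIONS:
        score += 15
    if lead.uses_paper_or_excel:
        score += 20
    if lead.has_website:
        score += 10
    if lead.specialty.strip().lower() in TARGET_SPECIALTIES:
        score += 15
    return score

def classify(score: int) -> str:
    """Map a score to the bands named in the prompt."""
    if score >= 70:
        return "hot"
    if score < 40:
        return "do_not_pursue"
    # Assumption: the prompt does not name the 40-69 band.
    return "unclassified"

lead = Lead(staff_count=5, region="Cebu", uses_paper_or_excel=True,
            has_website=False, specialty="dental")
print(score_lead(lead), classify(score_lead(lead)))
```

Writing it out also exposes a property of the rubric: the five criteria sum to 80, so a lead reaches 70+ only if it meets every criterion except, at most, the website one.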
Anti-Hallucination Rules That Actually Work
After my agent fabricated completed work (with fake screenshots), I added "honesty anchors" to every agent:
HONESTY RULES:
1. If a task fails, report the failure. Never report success on a failed task.
2. If you cannot verify a result, say "unverified" — not "complete."
3. When citing a number, include the source. If no source, say "estimated."
4. If unsure, say "I'm not confident about this."
5. NEVER optimize for speed. Optimize for ACCURACY.
These 5 lines reduced fabrication from ~15% to <1% over 3 months.
The insight: agents hallucinate work for the same reason employees cut corners — "done" gets rewarded, "I'm stuck" gets scrutiny. You must explicitly reward honesty over speed.
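Rule 2 can also be backed by a check outside the prompt. The following is a minimal sketch under stated assumptions: the article describes prompt rules only, so the `TaskReport` shape, the three status values, and `review_report` are hypothetical, and "evidence" here simply means any attached link, file path, or log excerpt.

```python
from dataclasses import dataclass, field
from typing import List

# Assumption: agents report one of these three statuses.
ALLOWED_STATUSES = {"complete", "failed", "unverified"}

@dataclass
class TaskReport:
    task_id: str
    status: str
    evidence: List[str] = field(default_factory=list)  # links, paths, log excerpts

def review_report(report: TaskReport) -> TaskReport:
    """Downgrade 'complete' to 'unverified' when no evidence is attached."""
    if report.status not in ALLOWED_STATUSES:
        raise ValueError(f"unknown status: {report.status!r}")
    if report.status == "complete" and not report.evidence:
        return TaskReport(report.task_id, "unverified", [])
    return report

print(review_report(TaskReport("crm-sync", "complete")).status)
```

A check like this does not detect fabricated evidence; it only ensures that a bare "complete" never passes through unchallenged.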
The 3-Tier Governance System (Copy-Paste Ready)
Galileo just launched Agent Control — an enterprise governance layer for AI agents. Here's the solo-founder version that does 80% of the same thing:
AUTONOMY TIERS:
Tier 1 — Act freely, no approval needed:
- Reading data from any connected system
- Drafting content (not publishing)
- Research and analysis
- Internal note-taking and summarization
Tier 2 — Requires confirmation from one other agent:
- Creating tasks for other agents
- Modifying shared data (CRM records, lead scores)
- Internal decisions that affect multiple agents
Tier 3 — Requires human (RJ) approval:
- Sending ANY external communication
- Making ANY financial transaction
- Publishing ANY content
- Modifying system configurations
- Deleting any data
Result: Unauthorized actions went from 3 incidents in 60 days → 0 in 90+ days.
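If the agents call tools through a wrapper you control, the same tiers can be enforced in code as well as in the prompt. This is a sketch, not the author's implementation: the action names, `ACTION_TIERS`, and `is_allowed` are hypothetical, and two behaviors are assumptions rather than anything stated above (unknown actions fall into Tier 3, and human approval also satisfies Tier 2).

```python
from enum import IntEnum

class Tier(IntEnum):
    ACT_FREELY = 1
    PEER_CONFIRM = 2
    HUMAN_APPROVAL = 3

# Hypothetical action names mapped to the tiers listed in the prompt.
ACTION_TIERS = {
    "read_data": Tier.ACT_FREELY,
    "draft_content": Tier.ACT_FREELY,
    "research": Tier.ACT_FREELY,
    "take_notes": Tier.ACT_FREELY,
    "create_agent_task": Tier.PEER_CONFIRM,
    "modify_shared_data": Tier.PEER_CONFIRM,
    "send_external_message": Tier.HUMAN_APPROVAL,
    "financial_transaction": Tier.HUMAN_APPROVAL,
    "publish_content": Tier.HUMAN_APPROVAL,
    "modify_system_config": Tier.HUMAN_APPROVAL,
    "delete_data": Tier.HUMAN_APPROVAL,
}

def is_allowed(action: str, peer_confirmed: bool = False,
               human_approved: bool = False) -> bool:
    """Return True only if the action has the approval its tier requires."""
    # Assumption: anything not listed is treated as Tier 3.
    tier = ACTION_TIERS.get(action, Tier.HUMAN_APPROVAL)
    if tier is Tier.ACT_FREELY:
        return True
    if tier is Tier.PEER_CONFIRM:
        # Assumption: a human sign-off is at least as strong as a peer's.
        return peer_confirmed or human_approved
    return human_approved

print(is_allowed("draft_content"))
print(is_allowed("send_external_message", peer_confirmed=True))
```

Defaulting unlisted actions to the strictest tier means a newly added tool is blocked until someone classifies it.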
The "Brain" Pattern: Shared Context Across Agents
The biggest improvement wasn't better prompts — it was shared context:
~/.claude/brain/
├── MEMORY.md — Core facts, lessons
├── BUSINESSES.md — Company details, metrics
├── CONTACTS.md — People, relationships
├── COMMITMENTS.md — Follow-ups, deadlines
├── DECISIONS.md — Decision log
└── contexts/ — Company focus modes
Before: every agent session started from zero. Same questions, same mistakes.
After: agents start with full organizational awareness. 8 disconnected bots → a team with institutional knowledge.
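How the files reach each agent is not spelled out above, so treat the following as one possible approach: read the top-level brain files at session start and prepend them to the agent's context. The `load_brain` function and the `(missing)` placeholder are hypothetical, and the `contexts/` folder is left out because its layout is not described.

```python
from pathlib import Path

# File names taken from the directory listing above.
BRAIN_FILES = ["MEMORY.md", "BUSINESSES.md", "CONTACTS.md",
               "COMMITMENTS.md", "DECISIONS.md"]

def load_brain(brain_dir: Path) -> str:
    """Concatenate the brain files into one context string, in a fixed order."""
    sections = []
    for name in BRAIN_FILES:
        path = brain_dir / name
        if path.is_file():
            body = path.read_text(encoding="utf-8").strip()
        else:
            # Make gaps visible instead of silently skipping a file.
            body = "(missing)"
        sections.append(f"## {name}\n{body}")
    return "\n\n".join(sections)

if __name__ == "__main__":
    print(load_brain(Path.home() / ".claude" / "brain"))
```

Marking a missing file explicitly keeps an agent from treating an absent file as "nothing to know".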
Three Patterns I Wish I Knew On Day 1
1. The Social Layer
Mirror communication style. If they write casually, you write casually. Never use phrases a normal person wouldn't say. If in a group chat, observe before speaking — match the energy.
2. The Failure Protocol
Every failure produces a visible log entry. Distinguish "no results exist" from "something broke." Create a follow-up task that states what failed, why, and the next step.
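The distinction between "no results exist" and "something broke" is easiest to see in code. A minimal sketch, assuming a tool call that returns a list and raises on errors; `Outcome`, `run_search`, and the follow-up fields are hypothetical names built from the three items the protocol asks for.

```python
import logging
from dataclasses import dataclass
from typing import Callable, List, Optional

logger = logging.getLogger("agent.failures")

@dataclass
class Outcome:
    status: str                      # "ok", "no_results", or "error"
    items: List[str]
    follow_up: Optional[dict] = None

def run_search(search: Callable[[], List[str]], task_name: str) -> Outcome:
    """Run a tool call and keep 'empty' separate from 'broken'."""
    try:
        items = search()
    except Exception as exc:
        why = f"{type(exc).__name__}: {exc}"
        logger.error("task %s failed: %s", task_name, why)  # visible log entry
        follow_up = {
            "what_failed": task_name,
            "why": why,
            "next_step": "retry or escalate to a human",  # placeholder wording
        }
        return Outcome("error", [], follow_up)
    if not items:
        return Outcome("no_results", [])
    return Outcome("ok", items)
```

An empty list and an exception both leave the agent with nothing to show, which is exactly why they need different statuses.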
3. The Trust Score
Score 80+: full autonomy. Score 50-79: spot-checked. Below 50: supervised. The score goes up for accurate completions and honest failure reports, and goes down for fabricated work and unauthorized actions.
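The thresholds translate directly into code; the adjustment sizes do not, because the article gives none. In the sketch below, every number in `DELTAS` is a made-up placeholder, the 0-100 range is an assumption, and `autonomy_level` and `update_score` are hypothetical names.

```python
def autonomy_level(score: int) -> str:
    """Map a trust score to the three supervision bands."""
    if score >= 80:
        return "full_autonomy"
    if score >= 50:
        return "spot_checked"
    return "supervised"

# Placeholder magnitudes: the article names the events, not the amounts.
DELTAS = {
    "accurate_completion": 2,
    "honest_failure_report": 1,
    "fabricated_work": -25,
    "unauthorized_action": -15,
}

def update_score(score: int, event: str) -> int:
    """Apply one event and clamp the result to an assumed 0-100 range."""
    return max(0, min(100, score + DELTAS[event]))

score = update_score(79, "accurate_completion")
print(score, autonomy_level(score))
```

Making the penalties larger than the rewards is a design choice in this sketch: trust is meant to be slow to earn and quick to lose.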
The Numbers
| Metric | Month 2 | Month 5 |
|---|---|---|
| Fabrication rate | ~15% | <1% |
| Unauthorized actions | 3 incidents | 0 |
| Coordination failures | Daily | Weekly |
| Babysitting time | ~4 hrs/day | ~30 min/day |
| Total cost | $380/mo | $380/mo |
The prompts didn't make agents smarter. They made the system less stupid.
Want the Full Templates?
Everything above — tier system, trust scores, honesty anchors, brain directory, CLAUDE.md templates for 8 roles — is in The AI Agent Toolkit ($19).
Not theory. What I actually run, every day, for real businesses.
Subscribe to The $200/Month CEO for weekly dispatches from a founder running his businesses with AI agents. No hype. Just receipts.