I've been using Claude Code daily for months. And I kept hitting the same wall:
The agent would just start doing things. No plan. No approval. Just... acting.
It deleted files I didn't want deleted. It refactored things I didn't ask it to refactor. It made "helpful" assumptions that broke my architecture.
So I built Full Stack HQ — a configuration kit that enforces a permission-first workflow. Here's what I learned.
The core problem with AI coding agents
Most people configure their AI agent once (or never) and just... let it go. The result is an agent that:
- Makes assumptions about what you want
- Takes irreversible actions without asking
- Mixes planning and execution in the same step
- Has no consistent code style or architectural awareness
The agent is powerful but unpredictable. That's the worst combination in software development.
The solution: permission-first workflow
Nothing happens without your explicit approval. The agent plans, shows you what it intends to do, and waits.
You: "Add user authentication with JWT"
Agent: Here's my plan:
Phase 1: Create auth module + JWT strategy
Phase 2: Add guards to protected routes
Phase 3: Implement refresh token rotation
[APPROVAL NEEDED] Should I proceed with Phase 1?
You: PLAN APPROVED
Agent: [implements Phase 1 only, then stops and reports]
The only valid approval keywords:
PLAN APPROVED
IMPLEMENTATION APPROVED
PROCEED
DO IT
Anything else — the agent waits. No exceptions.
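The gate described above can be sketched in a few lines. This is only an illustration, not code from Full Stack HQ, which (per this post) expresses the rule in its rules file. The names `isApproval`, `handleReply`, and `GateState` are hypothetical, and exact, case-sensitive matching is an assumption, since the post does not say how strictly the keywords are compared.

```typescript
// Hypothetical sketch: none of these names come from Full Stack HQ itself.
const APPROVAL_KEYWORDS = ['PLAN APPROVED', 'IMPLEMENTATION APPROVED', 'PROCEED', 'DO IT'] as const

// True only when the whole reply is one of the approval keywords.
// Exact, case-sensitive matching is an assumption of this sketch.
const isApproval = (reply: string): boolean => {
  const text = reply.trim()
  return APPROVAL_KEYWORDS.some((keyword) => keyword === text)
}

type GateState = { phases: string[]; nextPhase: number }
type GateResult = { state: GateState; ran: string | null }

// Advance by at most one phase per approval.
// Any other reply (or an exhausted plan) leaves the state untouched.
const handleReply = (state: GateState, reply: string): GateResult => {
  const phase = state.phases[state.nextPhase]
  if (!isApproval(reply) || phase === undefined) return { state, ran: null }
  return { state: { ...state, nextPhase: state.nextPhase + 1 }, ran: phase }
}
```

Calling `handleReply` with a reply such as "looks fine" returns the same state with `ran: null`, which is the "agent waits" branch. An approval keyword runs exactly one phase and then needs a fresh approval for the next.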
What's inside Full Stack HQ
| Component | Count | Description |
|---|---|---|
| CLAUDE.md | 1 | Global rules for Claude Code |
| GEMINI.md | 1 | Global rules for Google Antigravity IDE |
| Agents | 10 | Specialist AI personas |
| Skills | 28 | Domain-specific knowledge modules |
| Workflows | 10 | Slash command procedures |
10 Specialist Agents
Instead of one generic agent trying to do everything, you get domain experts:
| Agent | What it handles |
|---|---|
| frontend-specialist | React, Next.js, Tailwind |
| backend-specialist | NestJS, Node.js, APIs |
| database-specialist | Prisma, PostgreSQL, migrations |
| architect | System design, trade-offs, ADRs |
| code-reviewer | Quality, patterns, best practices |
| test-engineer | Vitest, Jest, Playwright |
| security-auditor | Auth, OWASP, input validation |
| performance-optimizer | Bundle, queries, rendering |
| devops-engineer | Docker, CI/CD |
| documentation-writer | READMEs, technical writing |
Calling them is simple:
Use the database-specialist to design a user schema with soft deletes.
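The prompt above names the specialist by hand. If you wanted to make the same choice mechanically, a keyword match is one simple option. The sketch below is a guess at what that could look like, not how Full Stack HQ routes tasks: the agent names come from the table above, while `ROUTES`, `pickSpecialist`, the keyword lists, and the fallback to `architect` are all assumptions.

```typescript
// Hypothetical routing table: agent names are from the table above,
// but the keyword lists are illustrative guesses, not part of the kit.
type Route = { agent: string; keywords: string[] }

const ROUTES: Route[] = [
  { agent: 'database-specialist', keywords: ['schema', 'migration', 'prisma', 'postgres'] },
  { agent: 'frontend-specialist', keywords: ['react', 'next.js', 'tailwind', 'component'] },
  { agent: 'test-engineer', keywords: ['vitest', 'jest', 'playwright'] },
]

// Pick the agent with the most keywords present in the task text.
// Falls back to the architect (an assumption) when nothing matches.
const pickSpecialist = (task: string): string => {
  const text = task.toLowerCase()
  let best = { agent: 'architect', hits: 0 }
  for (const route of ROUTES) {
    const hits = route.keywords.filter((keyword) => text.indexOf(keyword) !== -1).length
    if (hits > best.hits) best = { agent: route.agent, hits }
  }
  return best.agent
}
```

Substring matching like this is crude (it cannot tell "jest" from "majestic"), so naming the specialist explicitly in the prompt remains the more predictable route.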
28 Skills
Deep knowledge modules for the tools you actually use, including:
- Frontend: nextjs-app-router, react-best-practices, ui-ux-pro-max, frontend-design
- Backend: nestjs-patterns, prisma-workflow, software-architecture
- Testing: test-driven-development, systematic-debugging, webapp-testing
- Meta: brainstorming, prompt-engineering, skill-creator
10 Workflows (Slash Commands)
/plan → phased breakdown with approval checkpoints
/brainstorm → explore architecture options
/debug → systematic root-cause analysis
/create → implement an approved plan
/enhance → improve existing code quality
/test → generate or fix tests
/orchestrate → coordinate multiple agents
Install in 30 seconds
Mac/Linux:
curl -fsSL https://raw.githubusercontent.com/sabahattink/antigravity-fullstack-hq/main/install.sh | bash
Windows (PowerShell):
irm https://raw.githubusercontent.com/sabahattink/antigravity-fullstack-hq/main/install.ps1 | iex
Options (for a local copy of the script):
./install.sh --only-claude # Claude Code only
./install.sh --only-antigravity # Antigravity only
./install.sh --force # Overwrite existing configs
The script detects which of the two tools (Claude Code, Antigravity) you have installed and configures them automatically.
What gets installed where
~/.claude/
├── CLAUDE.md ← global rules (Claude Code)
├── agents/ ← 10 specialist agents
└── skills/ ← 28 skill modules
~/.gemini/
├── GEMINI.md ← global rules (Antigravity)
└── antigravity/
    ├── agents/
    ├── skills/
    └── workflows/
The CLAUDE.md philosophy
The rules file enforces several things I found critical in practice:
1. Separation of planning and execution
The agent never does both in the same step. First it plans, then you approve, then it executes. In my experience, this alone eliminates roughly 80% of unwanted surprises.
2. Role-based reasoning
Before acting, the agent asks: "Who is the right specialist for this?" A database schema question goes to the database specialist, not the frontend agent pretending to know Prisma.
3. Explicit code style
No semicolons. Single quotes. 2-space indentation. Arrow functions. Named exports. These aren't suggestions — they're enforced rules the agent follows on every file, every time.
4. Security checklist
Before every commit: no hardcoded secrets, all inputs validated, no unbounded queries, rate limiting on public endpoints. The agent checks these automatically.
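Some of these checks are mechanical enough to script. As a rough sketch of the first item only (hardcoded secrets), here is a line-by-line regex scan. The names `SECRET_PATTERNS` and `findSecrets` and the three patterns are illustrative assumptions, not part of the kit, and a real scanner would need a much larger rule set.

```typescript
// Illustrative patterns only: a real scanner needs a far larger rule set.
const SECRET_PATTERNS: { name: string; pattern: RegExp }[] = [
  { name: 'AWS access key ID', pattern: /AKIA[0-9A-Z]{16}/ },
  { name: 'private key block', pattern: /-----BEGIN [A-Z ]*PRIVATE KEY-----/ },
  { name: 'inline credential', pattern: /(password|secret|api_?key)\s*[:=]\s*['"][^'"]{8,}['"]/i },
]

type Finding = { line: number; rule: string }

// Scan source text line by line and report 1-based line numbers of suspected secrets.
const findSecrets = (source: string): Finding[] => {
  const findings: Finding[] = []
  source.split('\n').forEach((text, index) => {
    for (const { name, pattern } of SECRET_PATTERNS) {
      if (pattern.test(text)) findings.push({ line: index + 1, rule: name })
    }
  })
  return findings
}
```

A commit hook could call `findSecrets` on each staged file and refuse the commit when the result is non-empty. Values read from the environment (for example `process.env.API_KEY`) are not flagged, because the inline-credential pattern only matches quoted literals.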
Why it works
The mental model I was missing: AI agents should behave like senior engineers, not interns with root access.
Senior engineers don't start typing when you describe a problem. They think, propose a plan, get sign-off, then execute — one reversible step at a time.
Full Stack HQ enforces this discipline by default.
Repo
⭐ github.com/sabahattink/antigravity-fullstack-hq
MIT license. Open to PRs — especially new agents and skills.
What does your current CLAUDE.md look like? I'd love to see what rules others have found valuable.
Top comments (3)
I like the permission-first framing because agent stacks can get risky quickly when every specialist has broad access. The role separation also maps well to how real engineering teams divide responsibility across frontend, backend, testing, security, performance, and DevOps. One thing I’d be interested in is how you inspect cross-agent handoffs when a task moves between these roles. For multi-agent coding workflows, the handoff trace is often where the most useful debugging context lives.
Great point on handoff tracing: that's exactly where things get opaque in multi-agent setups.
Right now Full Stack HQ handles this through the /orchestrate workflow: each agent reports what it did and what it's passing to the next specialist before handing off. It's explicit in the conversation thread, not automatic.
But you're right that a proper trace log would be much more valuable, especially for debugging "why did the backend-specialist make that decision?" after the fact.
That's actually a gap I want to close. A structured handoff manifest (agent → action → output → next agent) that gets written to a file during orchestration would make the whole system inspectable. Might turn this into a skill.
Thanks for the push. This is going on the roadmap.
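As a sketch of what such a manifest might look like (it does not exist in the kit yet, so everything here is hypothetical): the four fields mirror the "agent → action → output → next agent" shape from the reply, while the timestamp, the JSON Lines encoding, and the names `Handoff`, `recordHandoff`, and `traceAgents` are assumptions.

```typescript
// Field names mirror "agent → action → output → next agent" from the reply.
// The timestamp and the one-JSON-object-per-line encoding are assumptions.
type Handoff = {
  agent: string
  action: string
  output: string
  nextAgent: string | null // null marks the end of the chain
  at: string // ISO 8601 timestamp
}

// Append one serialized handoff to the log and return the new log.
const recordHandoff = (log: string[], entry: Omit<Handoff, 'at'>, now: Date = new Date()): string[] =>
  [...log, JSON.stringify({ ...entry, at: now.toISOString() })]

// Rebuild the chain of agents from the log for after-the-fact inspection.
const traceAgents = (log: string[]): string[] =>
  log.map((line) => (JSON.parse(line) as Handoff).agent)
```

Each string in the log is one self-contained record, so it could be appended to a file as orchestration proceeds and read back later to answer questions like which agent handed what to whom.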
Permission-first is absolutely the right floor. Most of the “agent did something I didn’t want” failures I’ve seen come down to the agent technically having permission while the policy only existed as prose or convention.
Enforcing policy at tool dispatch instead of describing it in markdown/docs is the real architectural shift, and I think this post frames that correctly.
One related failure surface I ran into pretty quickly though: the permission grant usually lives in the running process, not in durable workspace state.
Concretely:
At that point one of two weird things happens:
Neither outcome really matches the user’s mental model of “allow always.”
The pattern that ended up working best for me was treating permission grants as durable objects keyed by something like:
(user, workspace, tool, target_pattern)
…with an explicit TTL and persistence layer.
That way successor processes can rehydrate the permission model consistently instead of rebuilding it from scratch every session.
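A minimal sketch of that idea, under stated assumptions: the comment only gives the key shape and the TTL, so the names `GrantKey`, `GrantStore`, `grant`, and `isAllowed` are hypothetical, the store is a plain object so it can be serialized, and `targetPattern` is compared as an exact string and not expanded as a glob.

```typescript
// All names are hypothetical; the comment only specifies the key shape and a TTL.
type GrantKey = { user: string; workspace: string; tool: string; targetPattern: string }
type Grant = GrantKey & { expiresAt: number } // epoch milliseconds

// Serialize the tuple so it can be used as an object key.
const keyOf = (k: GrantKey): string =>
  JSON.stringify([k.user, k.workspace, k.tool, k.targetPattern])

// A plain serializable object, so it can be written to disk
// and rehydrated by a successor process.
type GrantStore = Record<string, Grant>

// Return a new store containing a grant that expires ttlMs after now.
const grant = (store: GrantStore, key: GrantKey, ttlMs: number, now: number): GrantStore =>
  ({ ...store, [keyOf(key)]: { ...key, expiresAt: now + ttlMs } })

// A grant applies only if the exact key exists and has not expired.
const isAllowed = (store: GrantStore, key: GrantKey, now: number): boolean => {
  const found = store[keyOf(key)]
  return found !== undefined && found.expiresAt > now
}
```

Because the store round-trips through `JSON.stringify` and `JSON.parse` unchanged, a new process can load it and get the same answers from `isAllowed` until each grant's TTL runs out.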
What gets interesting is that once permission-first works at session scope, the next architectural boundary becomes workspace scope. The permission model stops being “what can this process do?” and starts becoming “what is this workspace allowed to do across agent lifecycles?”
Also: MIT licensing this is genuinely useful to the ecosystem. Appreciate you putting the work out publicly.