Sabahattin Kalkan

I built a permission-first CLAUDE.md + agent stack for Claude Code (free, MIT)

I've been using Claude Code daily for months. And I kept hitting the same wall:

The agent would just start doing things. No plan. No approval. Just... acting.

It deleted files I didn't want deleted. It refactored things I didn't ask it to refactor. It made "helpful" assumptions that broke my architecture.

So I built Full Stack HQ — a configuration kit that enforces a permission-first workflow. Here's what I learned.


The core problem with AI coding agents

Most people configure their AI agent once (or never) and just... let it go. The result is an agent that:

  • Makes assumptions about what you want
  • Takes irreversible actions without asking
  • Mixes planning and execution in the same step
  • Has no consistent code style or architectural awareness

The agent is powerful but unpredictable. That's the worst combination in software development.


The solution: permission-first workflow

Nothing happens without your explicit approval. The agent plans, shows you what it intends to do, and waits.

You:    "Add user authentication with JWT"

Agent:  Here's my plan:
        Phase 1: Create auth module + JWT strategy
        Phase 2: Add guards to protected routes  
        Phase 3: Implement refresh token rotation

        [APPROVAL NEEDED] Should I proceed with Phase 1?

You:    PLAN APPROVED

Agent:  [implements Phase 1 only, then stops and reports]

The only valid approval keywords:

PLAN APPROVED
IMPLEMENTATION APPROVED
PROCEED
DO IT

Anything else — the agent waits. No exceptions.
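How strict "anything else" is can be sketched in a few lines. This is illustrative Python, not code from the Full Stack HQ repo: `is_approval` is a hypothetical helper, and the choice to ignore surrounding whitespace while requiring exact case and wording is my assumption.

```python
# Hypothetical sketch of an exact-match approval gate. Not part of Full Stack HQ.
APPROVAL_KEYWORDS = frozenset({
    "PLAN APPROVED",
    "IMPLEMENTATION APPROVED",
    "PROCEED",
    "DO IT",
})


def is_approval(message: str) -> bool:
    """Return True only if the whole message is one of the approval keywords.

    Assumption: surrounding whitespace is ignored, but case and wording must
    match exactly, so "ok", "proceed" or "sounds good, DO IT" keep the agent waiting.
    """
    return message.strip() in APPROVAL_KEYWORDS
```

Matching the whole message, instead of searching for a keyword inside it, is what keeps a casual reply from being read as a green light.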


What's inside Full Stack HQ

| Component | Count | Description |
| --- | --- | --- |
| CLAUDE.md | 1 | Global rules for Claude Code |
| GEMINI.md | 1 | Global rules for Google Antigravity IDE |
| Agents | 10 | Specialist AI personas |
| Skills | 28 | Domain-specific knowledge modules |
| Workflows | 10 | Slash command procedures |

10 Specialist Agents

Instead of one generic agent trying to do everything, you get domain experts:

| Agent | What it handles |
| --- | --- |
| frontend-specialist | React, Next.js, Tailwind |
| backend-specialist | NestJS, Node.js, APIs |
| database-specialist | Prisma, PostgreSQL, migrations |
| architect | System design, trade-offs, ADRs |
| code-reviewer | Quality, patterns, best practices |
| test-engineer | Vitest, Jest, Playwright |
| security-auditor | Auth, OWASP, input validation |
| performance-optimizer | Bundle, queries, rendering |
| devops-engineer | Docker, CI/CD |
| documentation-writer | READMEs, technical writing |

Calling them is simple:

Use the database-specialist to design a user schema with soft deletes.

28 Skills

Deep knowledge modules for the tools you actually use. A selection, by category:

  • Frontend: nextjs-app-router, react-best-practices, ui-ux-pro-max, frontend-design
  • Backend: nestjs-patterns, prisma-workflow, software-architecture
  • Testing: test-driven-development, systematic-debugging, webapp-testing
  • Meta: brainstorming, prompt-engineering, skill-creator

10 Workflows (Slash Commands)

/plan       → phased breakdown with approval checkpoints
/brainstorm → explore architecture options
/debug      → systematic root-cause analysis
/create     → implement an approved plan
/enhance    → improve existing code quality
/test       → generate or fix tests
/orchestrate → coordinate multiple agents

Install in 30 seconds

Mac/Linux:

curl -fsSL https://raw.githubusercontent.com/sabahattink/antigravity-fullstack-hq/main/install.sh | bash

Windows (PowerShell):

irm https://raw.githubusercontent.com/sabahattink/antigravity-fullstack-hq/main/install.ps1 | iex

Options:

./install.sh --only-claude        # Claude Code only
./install.sh --only-antigravity   # Antigravity only
./install.sh --force              # Overwrite existing configs

The script detects which IDEs you have installed and configures them automatically.


What gets installed where

~/.claude/
├── CLAUDE.md          ← global rules (Claude Code)
├── agents/            ← 10 specialist agents
└── skills/            ← 28 skill modules

~/.gemini/
├── GEMINI.md          ← global rules (Antigravity)
└── antigravity/
    ├── agents/
    ├── skills/
    └── workflows/

The CLAUDE.md philosophy

The rules file enforces several things I found critical in practice:

1. Separation of planning and execution

The agent never does both in the same step. First it plans, then you approve, then it executes. In my experience, this alone eliminates roughly 80% of unwanted surprises.
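One way to picture the separation is as a small state machine in which approval is consumed by each phase. The sketch below is illustrative Python, not code from the repo: `PhasedPlan` and its methods are hypothetical names, and having a single approval cover exactly one phase is my assumption.

```python
# Hypothetical sketch of the plan -> approve -> execute loop. Not from the repo.
from dataclasses import dataclass, field

APPROVAL_KEYWORDS = {"PLAN APPROVED", "IMPLEMENTATION APPROVED", "PROCEED", "DO IT"}


@dataclass
class PhasedPlan:
    phases: list[str]
    approved: bool = False
    completed: list[str] = field(default_factory=list)

    def approve(self, message: str) -> None:
        """Grant approval only for an exact keyword; anything else changes nothing."""
        if message.strip() in APPROVAL_KEYWORDS:
            self.approved = True

    def execute_next(self) -> str:
        """Run exactly one phase, then go back to waiting for approval."""
        if not self.approved:
            raise PermissionError("waiting for explicit approval")
        if len(self.completed) == len(self.phases):
            raise RuntimeError("no phases left to execute")
        phase = self.phases[len(self.completed)]
        self.completed.append(phase)  # stand-in for doing the real work
        self.approved = False  # assumption: each approval covers one phase only
        return phase
```

Calling `execute_next()` before `approve("PLAN APPROVED")` raises, and a second call right after a completed phase raises again until a new approval arrives.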

2. Role-based reasoning

Before acting, the agent asks: "Who is the right specialist for this?" A database schema question goes to the database specialist, not the frontend agent pretending to know Prisma.
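To make the routing idea concrete, here is a deliberately naive keyword router. It is illustrative Python only: in practice the agent makes this decision itself from the rules file, and `ROUTES`, `pick_specialist`, the keyword lists and the `architect` fallback are all assumptions of mine. Only the agent names come from the table above.

```python
# Hypothetical sketch: keyword-based routing to a specialist. Not from the repo.
ROUTES = {
    "database-specialist": ("prisma", "postgresql", "schema", "migration"),
    "frontend-specialist": ("react", "next.js", "tailwind"),
    "backend-specialist": ("nestjs", "node.js", "api"),
    "test-engineer": ("vitest", "jest", "playwright"),
}


def pick_specialist(task: str, default: str = "architect") -> str:
    """Return the agent with the most keyword hits, or the default if none match.

    Caveat: this is plain substring matching, so "api" also matches "rapid".
    """
    text = task.lower()
    scores = {
        agent: sum(keyword in text for keyword in keywords)
        for agent, keywords in ROUTES.items()
    }
    best = max(scores, key=scores.get)
    return best if scores[best] > 0 else default
```

A request such as "Design a Prisma schema with soft deletes" scores two hits for `database-specialist` and none for the others, so it is routed there.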

3. Explicit code style

No semicolons. Single quotes. 2-space indentation. Arrow functions. Named exports. These aren't suggestions — they're enforced rules the agent follows on every file, every time.

4. Security checklist

Before every commit: no hardcoded secrets, all inputs validated, no unbounded queries, rate limiting on public endpoints. The agent checks these automatically.
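The first item on that checklist is the easiest to illustrate. The sketch below is illustrative Python, not how Full Stack HQ performs the check (there the agent follows the checklist from the rules file). `find_hardcoded_secrets` is a hypothetical helper, and the two patterns are examples I picked, far from a complete scanner.

```python
# Hypothetical sketch of a "no hardcoded secrets" check. Not from the repo.
import re

# Assumption: two illustrative patterns only; real secret scanners use many more.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),  # shape of an AWS access key ID
    re.compile(r"(?i)(password|secret|api_key|token)\s*[:=]\s*['\"][^'\"]{8,}['\"]"),
]


def find_hardcoded_secrets(source: str) -> list[int]:
    """Return the 1-based line numbers that match any secret pattern."""
    return [
        number
        for number, line in enumerate(source.splitlines(), start=1)
        if any(pattern.search(line) for pattern in SECRET_PATTERNS)
    ]
```

Values read from the environment, such as `process.env.API_KEY`, do not match because no quoted literal follows the name.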


Why it works

The mental model I was missing: AI agents should behave like senior engineers, not interns with root access.

Senior engineers don't start typing when you describe a problem. They think, propose a plan, get sign-off, then execute — one reversible step at a time.

Full Stack HQ enforces this discipline by default.


Repo

github.com/sabahattink/antigravity-fullstack-hq

MIT license. Open to PRs — especially new agents and skills.

What does your current CLAUDE.md look like? I'd love to see what rules others have found valuable.

Top comments (3)

Raju Dandigam

I like the permission-first framing because agent stacks can get risky quickly when every specialist has broad access. The role separation also maps well to how real engineering teams divide responsibility across frontend, backend, testing, security, performance, and DevOps. One thing I’d be interested in is how you inspect cross-agent handoffs when a task moves between these roles. For multi-agent coding workflows, the handoff trace is often where the most useful debugging context lives.

Sabahattin Kalkan

Great point on handoff tracing: that's exactly where things get opaque in multi-agent setups.

Right now Full Stack HQ handles this through the /orchestrate workflow: each agent reports what it did and what it's passing to the next specialist before handing off. It's explicit in the conversation thread, not automatic.

But you're right that a proper trace log would be much more valuable, especially for debugging "why did the backend-specialist make that decision?" after the fact.

That's actually a gap I want to close. A structured handoff manifest (agent → action → output → next agent) that gets written to a file during orchestration would make the whole system inspectable. Might turn this into a skill.

Thanks for the push — this is going on the roadmap.
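A handoff manifest of that shape (agent → action → output → next agent) could be as simple as one JSON object per line. The following is a rough sketch under stated assumptions, not an existing Full Stack HQ feature: `record_handoff`, `read_trace`, the JSON Lines format and the field names are all hypothetical.

```python
# Hypothetical sketch of a handoff manifest written as JSON Lines. Not from the repo.
import json
import time
from pathlib import Path
from typing import Optional


def record_handoff(log_path: Path, agent: str, action: str, output: str,
                   next_agent: Optional[str]) -> dict:
    """Append one handoff entry to the log file and return it."""
    entry = {
        "timestamp": time.time(),
        "agent": agent,
        "action": action,
        "output": output,
        "next_agent": next_agent,  # None marks the end of the chain
    }
    with log_path.open("a", encoding="utf-8") as log:
        log.write(json.dumps(entry) + "\n")
    return entry


def read_trace(log_path: Path) -> list[dict]:
    """Load every recorded handoff, oldest first."""
    with log_path.open(encoding="utf-8") as log:
        return [json.loads(line) for line in log if line.strip()]
```

An append-only file keeps earlier entries intact when a later agent fails, which is when the trace is most useful.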

Kyle Carriedo

Permission-first is absolutely the right floor. Most of the “agent did something I didn’t want” failures I’ve seen come down to the agent technically having permission while the policy only existed as prose or convention.

Enforcing policy at tool dispatch instead of describing it in markdown/docs is the real architectural shift, and I think this post frames that correctly.

One related failure surface I ran into pretty quickly though: the permission grant usually lives in the running process, not in durable workspace state.

Concretely:

  • user grants edit access to a directory
  • agent receives “always allow for this session”
  • session dies (sleep, restart, rate-limit, process crash, etc.)
  • user reconnects later
  • new process spins up without the original in-memory permission map

At that point one of two weird things happens:

  • the user gets re-prompted for permissions they already thought they granted, or
  • the successor process has a more permissive default and proceeds differently than the previous session

Neither outcome really matches the user’s mental model of “allow always.”

The pattern that ended up working best for me was treating permission grants as durable objects keyed by something like:

(user, workspace, tool, target_pattern)

…with an explicit TTL and persistence layer.

That way successor processes can rehydrate the permission model consistently instead of rebuilding it from scratch every session.
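A minimal sketch of that pattern, assuming SQLite as the persistence layer and shell-style globs for `target_pattern`. Both choices, the `GrantStore` class and the `user_id` column name are illustrative assumptions, not a description of any particular tool:

```python
# Hypothetical sketch of durable, TTL-bound permission grants.
import sqlite3
import time
from fnmatch import fnmatchcase


class GrantStore:
    """Grants keyed by (user, workspace, tool, target_pattern), stored on disk."""

    def __init__(self, db_path: str) -> None:
        self.db = sqlite3.connect(db_path)
        self.db.execute(
            "CREATE TABLE IF NOT EXISTS grants ("
            " user_id TEXT, workspace TEXT, tool TEXT, target_pattern TEXT,"
            " expires_at REAL,"
            " PRIMARY KEY (user_id, workspace, tool, target_pattern))"
        )
        self.db.commit()

    def grant(self, user, workspace, tool, target_pattern, ttl_seconds, now=None):
        """Store or refresh a grant that expires ttl_seconds from now."""
        now = time.time() if now is None else now
        self.db.execute(
            "INSERT OR REPLACE INTO grants VALUES (?, ?, ?, ?, ?)",
            (user, workspace, tool, target_pattern, now + ttl_seconds),
        )
        self.db.commit()

    def is_allowed(self, user, workspace, tool, target, now=None) -> bool:
        """True if an unexpired grant for this key has a pattern matching target."""
        now = time.time() if now is None else now
        rows = self.db.execute(
            "SELECT target_pattern FROM grants"
            " WHERE user_id = ? AND workspace = ? AND tool = ? AND expires_at > ?",
            (user, workspace, tool, now),
        ).fetchall()
        return any(fnmatchcase(target, pattern) for (pattern,) in rows)
```

A successor process that opens the same database file sees the same grants, and an expired grant simply stops matching, so the fallback is a re-prompt and never a more permissive default.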

What gets interesting is that once permission-first works at session scope, the next architectural boundary becomes workspace scope. The permission model stops being “what can this process do?” and starts becoming “what is this workspace allowed to do across agent lifecycles?”

Also: MIT licensing this is genuinely useful to the ecosystem. Appreciate you putting the work out publicly.