Alexandru Cioc

AI Agents Don't Follow Your Rules. Here's a Compiler That Makes Them.

The Problem Nobody Talks About

You've set up Claude Code with a careful CLAUDE.md. Your Cursor rules are dialed in. Your AGENTS.md covers Codex. Maybe you've got copilot-instructions.md too.

They all say roughly the same thing:

  • Run `npm test` before committing
  • Use TypeScript strict mode
  • Don't use `any`; use `unknown`
  • Follow conventional commits

But they're separate files. Maintained separately. And they drift.

We cloned 50 of the highest-profile open-source repos — grafana, django, vue, prisma, supabase, airflow, tokio — and ran a governance audit on each one.

23 of the 50 (46%) had drift: rules that reference commands that don't exist, compiled configs older than the governance they were compiled from, AI agents being told to run lint scripts that were removed months ago.

The Fix: Treat It As a Compilation Problem

crag is a CLI that takes one governance.md and compiles it to every AI tool format your team uses.

npx @whitehatd/crag

That single command:

  1. Analyzes your repo — reads CI workflows, package.json, tsconfig, Makefiles, directory structure — and generates governance.md with your actual gates, architecture, testing profile, code style, and anti-patterns.

  2. Compiles to 13 targets — each in the tool's native format, with correct frontmatter and activation patterns.

Here's what one command generates:

| Target | Output | Consumer |
| --- | --- | --- |
| agents-md | AGENTS.md | Codex, Aider, Factory |
| cursor | .cursor/rules/governance.mdc | Cursor |
| copilot | .github/copilot-instructions.md | GitHub Copilot |
| gemini | GEMINI.md | Gemini, Gemini CLI |
| claude | CLAUDE.md | Claude Code |
| cline | .clinerules | Cline |
| continue | .continuerules | Continue.dev |
| windsurf | .windsurf/rules/governance.md | Windsurf |
| zed | .rules | Zed |
| amazonq | .amazonq/rules/governance.md | Amazon Q |
| github | .github/workflows/gates.yml | GitHub Actions |
| husky | .husky/pre-commit | husky |
| pre-commit | .pre-commit-config.yaml | pre-commit.com |

One file in, thirteen files out. Change a rule, recompile, done.
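
The one-to-many mapping in the table can be pictured as a small registry from target name to output path. This is only a sketch: `TARGETS` and `planOutputs` are hypothetical names, not crag's internals, and only the target names and paths are taken from the table above.

```typescript
// Hypothetical registry. Target names and output paths are copied from the
// table in the article; the data structure and function are illustrative.
const TARGETS = new Map<string, string>([
  ["agents-md", "AGENTS.md"],
  ["cursor", ".cursor/rules/governance.mdc"],
  ["copilot", ".github/copilot-instructions.md"],
  ["gemini", "GEMINI.md"],
  ["claude", "CLAUDE.md"],
  ["cline", ".clinerules"],
  ["continue", ".continuerules"],
  ["windsurf", ".windsurf/rules/governance.md"],
  ["zed", ".rules"],
  ["amazonq", ".amazonq/rules/governance.md"],
  ["github", ".github/workflows/gates.yml"],
  ["husky", ".husky/pre-commit"],
  ["pre-commit", ".pre-commit-config.yaml"],
]);

// Resolve a selection ("all" or explicit target names) to output paths.
// Unknown names throw instead of being silently skipped.
function planOutputs(selection: string[]): string[] {
  const names =
    selection.indexOf("all") !== -1 ? Array.from(TARGETS.keys()) : selection;
  return names.map((target) => {
    const outputPath = TARGETS.get(target);
    if (outputPath === undefined) {
      throw new Error(`unknown target: ${target}`);
    }
    return outputPath;
  });
}
```

Under these assumptions, `planOutputs(["all"])` yields the thirteen paths in table order, and a typo in a target name fails loudly.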

What the Analyzer Actually Finds

Run `crag analyze` on a real project and it draws on:

  • 25+ language detectors — Node, TypeScript, Python, Go, Rust, Java, .NET, Swift, Elixir, and more
  • 11 CI system extractors — GitHub Actions, GitLab CI, CircleCI, Jenkins, Travis, Azure Pipelines...
  • 8 framework convention engines — Next.js, Django, Spring Boot, Rails...
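
To make "detector" concrete, here is a minimal sketch of marker-file detection, the kind of pattern matching a non-LLM analyzer could rely on. The `MARKERS` table and `detectStacks` function are hypothetical and cover only a handful of well-known marker files. They are not crag's detector list.

```typescript
// Well-known marker files mapped to the stack they usually indicate.
// Hypothetical subset for illustration only.
const MARKERS: Array<[string, string]> = [
  ["package.json", "Node"],
  ["tsconfig.json", "TypeScript"],
  ["pyproject.toml", "Python"],
  ["go.mod", "Go"],
  ["Cargo.toml", "Rust"],
  ["pom.xml", "Java"],
  ["mix.exs", "Elixir"],
  ["Package.swift", "Swift"],
];

// Given repo-relative file paths, return the detected stacks in MARKERS order.
function detectStacks(repoFiles: string[]): string[] {
  // Compare on the basename so markers inside monorepo subdirectories count.
  const basenames = new Set(repoFiles.map((f) => f.split("/").pop() ?? f));
  return MARKERS.filter(([marker]) => basenames.has(marker)).map(
    ([, stack]) => stack
  );
}
```

A real analyzer would combine many more signals (CI workflows, scripts, directory layout), as the list above suggests. The sketch only illustrates why this kind of detection is fast and repeatable.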

The output is what a senior engineer would write after spending a week with your codebase:

## Gates (run in order, stop on failure)
### Lint
- npm run lint
### Test
- npm run test
### Build
- npm run build
- npm run typecheck

## Architecture
- Type: monolith
- Entry: bin/app.js

## Testing
- Framework: vitest
- Naming: *.test.ts

## Code Style
- Indent: 2 spaces
- Formatter: prettier
- Linter: eslint

## Anti-Patterns
Do not:
- Use `any` in TypeScript — use `unknown`
- Use `getServerSideProps` with App Router — use Server Components

Under 1 second. Zero config.
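
Because the gates section is plain markdown with a regular shape, it is easy to read back mechanically. The following sketch, assuming the exact layout shown above (`## Gates` heading, `###` stage headings, `- ` command items), extracts the ordered stages. `GateStage` and `parseGates` are hypothetical names, not part of crag.

```typescript
// One stage of the gate pipeline, e.g. { stage: "Lint", commands: ["npm run lint"] }.
interface GateStage {
  stage: string;
  commands: string[];
}

// Parse the "## Gates" section of a governance.md-style document.
// Hypothetical helper; it assumes the layout shown in the sample above.
function parseGates(markdown: string): GateStage[] {
  const stages: GateStage[] = [];
  let inGates = false;
  let current: GateStage | undefined;
  for (const raw of markdown.split(/\r?\n/)) {
    const line = raw.trim();
    if (line.startsWith("## ")) {
      // A new top-level section begins; only "## Gates..." is of interest.
      inGates = line.startsWith("## Gates");
      current = undefined;
    } else if (inGates && line.startsWith("### ")) {
      current = { stage: line.slice(4).trim(), commands: [] };
      stages.push(current);
    } else if (inGates && current !== undefined && line.startsWith("- ")) {
      current.commands.push(line.slice(2).trim());
    }
  }
  return stages;
}
```

Run against the sample above, this returns three stages (Lint, Test, Build) in order, with two commands under Build. List items in later sections such as Architecture are ignored.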

Then It Watches Your Back

$ crag audit

  Compiled configs
  ✗ .cursor/rules/governance.mdc     stale — governance.md is newer
  ✗ AGENTS.md                        stale — governance.md is newer
  ✓ .github/workflows/gates.yml      in sync
  ✓ .husky/pre-commit                in sync

  Gate reality
  ✗ npx tsc --noEmit                 tsc not in devDependencies
  ✗ npm run lint                     "lint" script not in package.json

  2 stale · 2 drift
  Fix: crag compile --target all
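
The two kinds of finding in that report can be sketched as two small checks. This is an illustration under assumptions, not crag's implementation: `isStale` and `missingScripts` are hypothetical names, and the article does not say whether crag compares timestamps, hashes, or something else.

```typescript
// A compiled file counts as stale when governance.md was modified after it.
// Timestamps are milliseconds since the epoch.
function isStale(governanceMtimeMs: number, compiledMtimeMs: number): boolean {
  return governanceMtimeMs > compiledMtimeMs;
}

// Return every "npm run <script>" gate whose script is absent from the
// "scripts" object of package.json. Other gate shapes are not checked here.
function missingScripts(
  gates: string[],
  scripts: Record<string, string>
): string[] {
  const missing: string[] = [];
  for (const gate of gates) {
    const script = /^npm run (\S+)/.exec(gate.trim())?.[1];
    if (
      script !== undefined &&
      !Object.prototype.hasOwnProperty.call(scripts, script)
    ) {
      missing.push(gate);
    }
  }
  return missing;
}
```

In a Node script the timestamps could come from `fs.statSync(path).mtimeMs` and the scripts from the parsed `package.json`. The sketch keeps both as plain arguments so the logic stays easy to test.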

Install the hook and compiled configs are regenerated whenever governance changes; add --drift-gate to also block commits when drift is detected:

crag hook install              # auto-recompile when governance changes
crag hook install --drift-gate # also block commits if drift detected

The Benchmark

50 repos. 20 languages. 7 CI systems. Monorepos to single-crate Rust libraries.

| Metric | Result |
| --- | --- |
| Repos tested | 50 |
| Crashes | 0 |
| Total gates inferred | 1,809 |
| Mean gates/repo | 36.2 |
| Repos with drift | 23 (46%) |
| Time per repo | ~1.2s |

Some highlights:

| Repo | Stack | Gates Found |
| --- | --- | --- |
| grafana/grafana | Go + React + Docker | 67 |
| calcom/cal.com | Next.js + React + Docker | 53 |
| hashicorp/vault | Go + Docker + Node | 50 |
| biomejs/biome | Rust + React + TS | 47 |
| django/django | Python | 38 |

Plus a 101-repo stress test: 4,400 invocations, 0 crashes.

Full benchmark results

How It's Different

The agent governance space is heating up. Microsoft just released their Agent Governance Toolkit. Coder has an AI Governance Add-On. Kong has a gateway approach.

crag takes a fundamentally different angle:

  • Compile-time, not runtime. Your rules are compiled to static files that each tool reads natively. No sidecar, no proxy, no MCP server required.
  • Deterministic. Same input → byte-identical output, SHA-verified across Linux, macOS, and Windows.
  • No LLM. The analyzer uses pattern matching, not inference. It finds what's actually in your repo, not what an LLM thinks should be there.
  • Zero dependencies. Node built-ins only, so no third-party packages are added to your supply chain.
  • Works offline. No network, no API keys, no cloud account required (cloud sync is optional).
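
Byte-identical output across operating systems usually means writing text in one canonical form before it reaches disk. The sketch below shows one way that could look, assuming rules arrive as a list of strings. `renderCanonical` is a hypothetical helper, and the article does not describe how crag itself achieves this.

```typescript
// Render rule lines in a canonical form: LF line endings, no stray
// whitespace, no empty entries, exactly one trailing newline.
// Order is preserved on purpose, since gates run in order.
function renderCanonical(rules: string[]): string {
  const cleaned = rules
    .map((rule) => rule.replace(/\r\n?/g, "\n").trim())
    .filter((rule) => rule.length > 0);
  return cleaned.map((rule) => `- ${rule}`).join("\n") + "\n";
}
```

With output normalized like this, a SHA-256 of the result (for example via Node's built-in `crypto.createHash("sha256")`) would match whether the input was checked out with CRLF or LF line endings.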

Get Started

# One command — analyze + compile
npx @whitehatd/crag

# Step by step
npx @whitehatd/crag analyze           # generate governance.md
npx @whitehatd/crag compile --target all   # compile to 13 targets
npx @whitehatd/crag audit             # check for drift
npx @whitehatd/crag hook install      # enforce on every commit

Requirements: Node.js 18+ and git. That's it.

