DEV Community

Alexandru Cioc


I checked 13 top open-source repos. 9 have zero AI agent config.

Django. Angular. Vue. Svelte. Tokio. Remix. Cal.com. Airflow. Tauri.

None of them have a CLAUDE.md. No .cursorrules. No AGENTS.md. No copilot-instructions.md. Nothing.

These are projects with hundreds of contributors. If they don't have AI agent config, your project almost certainly doesn't either.

The problem is worse than "no config"

The 4 projects that DO have AI configs?

Grafana has a CLAUDE.md. It's literally one line: @AGENTS.md. Their hand-written AGENTS.md has 157 lines, but it misses quality gates from their CI.

Prisma has a 166-line AGENTS.md that says:

"Your training data contains a lot of outdated information that doesn't apply to Prisma 7. Always analyze this codebase like you would analyze a project you are not familiar with."

If Prisma's own maintainers don't trust AI training data, why do you?

Supabase has three separate AI configs — one for Claude, one for Copilot, one for Cursor. Three tools, three configs, zero overlap.

(Image: crag scoreboard)

What this means for your code

When you use Cursor, Claude, Copilot, or any AI coding agent, it needs to know:

  • What quality gates CI enforces (lint, test, build, typecheck)
  • Your architecture and key directories
  • Anti-patterns to avoid
  • Code style conventions

Without this, your AI is working off stale training data. It writes code that breaks in CI.
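Concretely, those four areas might condense into a config like this. This is a hypothetical example, not taken from any of the projects above; the file names, commands, and rules are placeholders:

```markdown
# CLAUDE.md (hypothetical example)

## Quality gates (CI rejects anything that fails these)
- `npm run lint` — ESLint, zero warnings allowed
- `npx tsc --noEmit` — typecheck must pass
- `npm test` — unit tests must pass
- `npm run build` — production build must succeed

## Architecture
- `src/` holds the library; `examples/` must never be imported from `src/`

## Anti-patterns
- Never use `any`; prefer explicit types

## Style
- Two-space indent, named exports only
```

A file this small already tells an agent which commands will be run against its output before a human ever sees it.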

One command

I built crag to solve this.

```shell
npx @whitehatd/crag
```

It reads your project — CI workflows, package manifests, configs, directory structure — and generates a single governance.md that compiles to every AI tool's native format.
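A generated governance.md might contain sections like these. This is a hypothetical sketch assembled from the section names the post describes (quality gates, architecture, testing profile, anti-patterns); the commands and paths are assumptions, not crag's actual output:

```markdown
# governance.md (hypothetical sketch)

## Quality gates
- lint: `npm run lint` (zero warnings)
- typecheck: `npx tsc --noEmit`
- test: `npm test`

## Architecture
- `src/core/` — compiler pipeline
- `src/detectors/` — language and CI detection

## Testing profile
- Unit tests colocated as `*.test.ts`; no network access in tests

## Anti-patterns
- Do not add runtime dependencies; the package ships with zero
```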

(Image: crag command output)

One file in, twelve files out:

| Target | File | Consumer |
| --- | --- | --- |
| agents-md | AGENTS.md | Codex, Aider, Gemini CLI |
| cursor | .cursor/rules/ | Cursor |
| copilot | copilot-instructions.md | GitHub Copilot |
| claude | CLAUDE.md | Claude Code |
| gemini | GEMINI.md | Gemini |
| cline | .clinerules | Cline |
| continue | .continuerules | Continue |
| windsurf | .windsurf/rules/ | Windsurf |
| zed | .rules | Zed |
| amazonq | .amazonq/rules/ | Amazon Q |
| github | gates.yml | GitHub Actions |
| husky | .husky/pre-commit | husky |

Change a rule in governance.md, run crag compile, all 12 update.
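As an illustration of what "compiling to a native format" means, here is how a single rule might render as a Cursor MDC file. This is hypothetical content, not crag's actual output; the frontmatter keys shown (`description`, `alwaysApply`) are standard Cursor rule fields:

```markdown
<!-- .cursor/rules/quality-gates.mdc (hypothetical) -->
---
description: Quality gates enforced by CI
alwaysApply: true
---
Run `npm run lint` and `npm test` before proposing a commit.
```

The same rule would appear as a numbered step in AGENTS.md and as a plain line in `.clinerules`, with the wording kept in sync by the compiler.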

The benchmark

We tested on 50 of the most important open-source projects:

  • 1,809 gates inferred across 50 repos
  • 96.4% accuracy — 187/194 gates verified against codebase
  • 20 languages, 7 CI systems, 0 crashes

| Repo | Stack | Gates | Finding |
| --- | --- | --- | --- |
| grafana/grafana | Go + React + Docker | 67 | CLAUDE.md: 1 line |
| supabase/supabase | TS + React + Docker | 43 | 3 configs, fragmented |
| prisma/prisma | TypeScript + Rust | 40 | "data outdated" |
| django/django | Python | 38 | No config |
| angular/angular | TypeScript | 38 | No config |

How it works

  1. Analyze. Reads CI workflows (GitHub Actions, GitLab, Jenkins, etc.), package manifests, tool configs. 25+ language detectors, 11 CI extractors.

  2. Generate. Writes governance.md with quality gates, architecture, testing profile, code style, anti-patterns, framework conventions.

  3. Compile. Converts to each tool's native format — MDC frontmatter for Cursor, numbered steps for AGENTS.md, YAML triggers for Windsurf.

  4. Audit. Detects stale configs, missing tools, drift.

  5. Hook. Pre-commit auto-recompile. Optional drift gate blocks commits.

No LLM. No network. No API key. 500ms. Deterministic.
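The analyze step can be sketched without an LLM at all: pattern-match the `run:` commands in a CI workflow against a table of known tools. This is a deliberately simplified sketch, not crag's implementation; the pattern table and function names are assumptions, and a real detector would parse the YAML tree rather than scan lines:

```typescript
// Hypothetical sketch: infer quality gates from a GitHub Actions workflow
// by pattern-matching its run commands. Deterministic, no network, no LLM.

type Gate = { name: string; command: string };

// Known command patterns mapped to gate names (an assumed, partial table).
const PATTERNS: Array<[RegExp, string]> = [
  [/\beslint\b|\brun lint\b/, "lint"],
  [/\btsc\b|\btypecheck\b/, "typecheck"],
  [/\bjest\b|\bvitest\b|\brun test\b/, "test"],
  [/\brun build\b/, "build"],
];

function inferGates(workflowYaml: string): Gate[] {
  const gates: Gate[] = [];
  for (const line of workflowYaml.split("\n")) {
    // Only inspect `run:` steps; a real extractor would walk the YAML tree.
    const m = line.match(/^\s*(?:-\s*)?run:\s*(.+)$/);
    if (!m) continue;
    const command = m[1].trim();
    for (const [re, name] of PATTERNS) {
      if (re.test(command) && !gates.some((g) => g.name === name)) {
        gates.push({ name, command });
      }
    }
  }
  return gates;
}

const workflow = `
jobs:
  ci:
    steps:
      - run: npm ci
      - run: npm run lint
      - run: npm run test -- --coverage
      - run: npm run build
`;

console.log(inferGates(workflow).map((g) => g.name));
// → [ "lint", "test", "build" ]
```

Because the mapping is a fixed table, the same workflow always yields the same gates, which is what makes auditing for drift possible.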

Try it

```shell
npx @whitehatd/crag
```

Node.js 18+. Zero dependencies. MIT.

GitHub: WhitehatD/crag

Top comments (1)

Apex Stack

The Supabase finding is the one that jumped out to me — three separate AI configs with zero overlap. That's basically three different agents getting three different versions of the truth about the same codebase. No wonder AI-assisted PRs break CI so often.

I maintain a fairly large CLAUDE.md for a multilingual site I run (100K+ pages across 12 languages), and the single biggest lesson I've learned is that the config file needs to encode failures, not just rules. Things like "TSX stock tickers do NOT use .to suffix in URLs — this caused a false alarm that wasted a full agent cycle" or "non-English pages with English body text are EXPECTED right now, do NOT file duplicate tickets about this." Without those failure memories baked in, every new agent session repeats the same mistakes.

The "one governance file compiled to many" approach makes a lot of sense for teams using multiple tools. The drift problem is real — I've seen configs get stale within days when they're manually maintained across tools. The pre-commit hook auto-recompile is the right answer there.

Curious about one thing: does crag handle project-specific anti-patterns that aren't inferrable from CI? Like "never use trailing slashes on URLs" or "always check for duplicates before creating tickets" — those are the rules that save the most agent cycles but live entirely in human knowledge until someone writes them down.