Alexandru Cioc

I checked 13 top open-source repos. 9 have zero AI agent config.

Fragmented configs and stale training data

Django. Angular. Vue. Svelte. Tokio. Remix. Cal.com. Airflow. Tauri.

None of them have a CLAUDE.md. No .cursorrules. No AGENTS.md. No copilot-instructions.md. Nothing.

These are projects with hundreds of contributors. If they don't have AI agent config, your project almost certainly doesn't either.

The problem is worse than "no config"

The 4 projects that DO have AI configs?

Grafana has a CLAUDE.md. It's literally one line: @AGENTS.md. Their hand-written AGENTS.md has 157 lines, but it omits the quality gates their CI actually enforces.

Prisma has a 166-line AGENTS.md that says:

"Your training data contains a lot of outdated information that doesn't apply to Prisma 7. Always analyze this codebase like you would analyze a project you are not familiar with."

If Prisma's own maintainers don't trust AI training data, why do you?

Supabase has three separate AI configs — one for Claude, one for Copilot, one for Cursor. Three tools, three configs, zero overlap.

[Image: crag scoreboard]

What this means for your code

When you use Cursor, Claude, Copilot, or any AI coding agent, it needs to know:

  • What quality gates CI enforces (lint, test, build, typecheck)
  • Your architecture and key directories
  • Anti-patterns to avoid
  • Code style conventions

Without this, your AI is working off stale training data. It writes code that breaks in CI.

One command

I built crag to solve this.

```bash
npx @whitehatd/crag
```

It reads your project — CI workflows, package manifests, configs, directory structure — and generates a single governance.md that compiles to every AI tool's native format.

[Image: crag command]
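
Here's a hedged sketch of what a generated governance.md might look like. The section names match what crag emits; every gate, command, and path below is invented for illustration:

```markdown
# governance.md (illustrative sketch, not crag's literal output)

## Quality Gates
- lint: `npm run lint` must pass (inferred from .github/workflows/ci.yml)
- test: `npm test` must pass
- typecheck: `tsc --noEmit` must pass

## Architecture
- src/: application code, grouped by feature
- tests/: mirrors the src/ layout

## Anti-Patterns
- hand-written rules live here and survive recompilation
```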

One file in, twelve files out:

| Target | File | Consumer |
| --- | --- | --- |
| agents-md | AGENTS.md | Codex, Aider, Gemini CLI |
| cursor | .cursor/rules/ | Cursor |
| copilot | copilot-instructions.md | GitHub Copilot |
| claude | CLAUDE.md | Claude Code |
| gemini | GEMINI.md | Gemini |
| cline | .clinerules | Cline |
| continue | .continuerules | Continue |
| windsurf | .windsurf/rules/ | Windsurf |
| zed | .rules | Zed |
| amazonq | .amazonq/rules/ | Amazon Q |
| github | gates.yml | GitHub Actions |
| husky | .husky/pre-commit | husky |

Change a rule in governance.md, run crag compile, all 12 update.
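
That loop, concretely (assuming the subcommands are invoked through npx the same way as the generator):

```bash
# edit the canonical source of truth
$EDITOR governance.md

# recompile every tool-specific config from it
npx @whitehatd/crag compile
```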

The benchmark

We tested on 50 of the most important open-source projects:

  • 1,809 gates inferred across 50 repos
  • 96.4% accuracy — 187/194 gates verified against codebase
  • 20 languages, 7 CI systems, 0 crashes

| Repo | Stack | Gates | Finding |
| --- | --- | --- | --- |
| grafana/grafana | Go + React + Docker | 67 | CLAUDE.md: 1 line |
| supabase/supabase | TS + React + Docker | 43 | 3 configs, fragmented |
| prisma/prisma | TypeScript + Rust | 40 | "data outdated" |
| django/django | Python | 38 | No config |
| angular/angular | TypeScript | 38 | No config |

How it works

  1. Analyze. Reads CI workflows (GitHub Actions, GitLab, Jenkins, etc.), package manifests, tool configs. 25+ language detectors, 11 CI extractors.

  2. Generate. Writes governance.md with quality gates, architecture, testing profile, code style, anti-patterns, framework conventions.

  3. Compile. Converts to each tool's native format — MDC frontmatter for Cursor, numbered steps for AGENTS.md, YAML triggers for Windsurf.

  4. Audit. Detects stale configs, missing tools, drift.

  5. Hook. Pre-commit auto-recompile. Optional drift gate blocks commits.

No LLM. No network. No API key. 500ms. Deterministic.
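
End to end, that's three commands plus a hook (subcommand names are the ones used in this post; exact flags may differ):

```bash
npx @whitehatd/crag           # analyze the repo, generate governance.md
npx @whitehatd/crag compile   # emit all 12 tool-native configs
npx @whitehatd/crag audit     # flag stale configs, missing tools, drift
# step 5 wires up .husky/pre-commit so compile re-runs on every commit
```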

Try it

```bash
npx @whitehatd/crag
```

Node.js 18+. Zero dependencies. MIT.

GitHub: WhitehatD/crag

Top comments (12)

Jonathan Murray

the prisma note is the tell. when the people who built a framework are saying "your training data is outdated, treat this codebase as unfamiliar" that's gonna be the default for every fast-moving project within a year. static config files are a good start but drift compounds fast. teams that figure out how to give agents live runtime context and not just a months-old snapshot are gonna have a real edge.

Alexandru Cioc

Spot on about Prisma — they're ahead of the curve but the problem is universal. Any project shipping faster than its AI configs update is feeding agents stale context.

The runtime context piece is interesting. crag sits at the static layer right now — it reads your CI, manifests, and code patterns to generate configs, then crag audit catches when they've drifted. The pre-commit hook auto-recompiles so the gap between "reality changed" and "configs reflect reality" shrinks to one commit.

But you're right that there's a layer beyond this — live runtime context that no config file can capture. That's a different problem and probably needs a different shape of solution. For now, keeping the static layer honest feels like table stakes that most teams haven't even reached yet.

Mykola Kondratiuk

not sure this is actually a gap - for big OSS projects, adding AI config prescribes a workflow for contributors who use different tools. it's a coordination problem, not missing infrastructure.

Alexandru Cioc

I actually agree it looks like coordination on the surface
but what I’m seeing is that coordination fails because there’s no canonical source of truth

every tool defines its own “rules” → they drift → humans become the sync layer

crag just makes the rules executable + compiled across tools
so coordination becomes a byproduct, not a requirement

Mykola Kondratiuk

the human-as-sync-layer framing landed - that's where most multi-agent setups fall apart too. curious how crag handles it when the underlying tool updates its config schema, that's usually where the drift restarts

Alexandru Cioc

Good question, schema changes are the exact reason crag uses a compile step instead of a shared template.

Each compile target has its own emitter that knows the tool's native format (MDC frontmatter for Cursor, YAML triggers for Windsurf, numbered steps for AGENTS.md, etc.). When a tool updates its schema, the emitter gets updated once in crag, you recompile, and all your repos get the new format.

The alternative (hand-updating 13 files across every repo when Cursor changes how .mdc frontmatter works) is exactly the kind of thing that restarts the drift cycle.

crag audit also catches this: if a compiled config no longer matches what the emitter would produce from the current governance, it flags it as stale. So even if you don't recompile immediately, you know something's out of date.

tl;dr: the schema knowledge lives in the compiler, not in your head.

Mykola Kondratiuk

that compile step pattern makes total sense - it's basically the adapter problem. one canonical source, per-tool emitter. way cleaner than trying to maintain separate templates and hoping they stay in sync when Cursor or Windsurf changes their schema again.

Alexandru Cioc

Exactly, that’s the core idea behind it.
Treat everything as a canonical intermediate representation, then compile down into per-tool adapters.

Trying to maintain separate templates per tool just doesn't scale: schemas drift, features diverge, things silently break.

With a compile step, you isolate all that volatility at the edges. New tool = new emitter, not a rewrite. Much cleaner model overall.

Mykola Kondratiuk

yeah 'new tool = new emitter, not a rewrite' is exactly what I was getting at. that constraint makes the whole thing maintainable as the ecosystem keeps shifting.

Abhishek Tripathi

I built an open-source runtime that tracks cost per decision step — not per request, per step. Tool call: $0.00005. Reasoning: $0.0005. Total: $0.0008. It also routes tool calls to cheap models and reasoning to expensive ones automatically. Would love to get feedback from developers who are handling multiple agents.
github.com/atripati/ark

Apex Stack

The Supabase finding is the one that jumped out to me — three separate AI configs with zero overlap. That's basically three different agents getting three different versions of the truth about the same codebase. No wonder AI-assisted PRs break CI so often.

I maintain a fairly large CLAUDE.md for a multilingual site I run (100K+ pages across 12 languages), and the single biggest lesson I've learned is that the config file needs to encode failures, not just rules. Things like "TSX stock tickers do NOT use .to suffix in URLs — this caused a false alarm that wasted a full agent cycle" or "non-English pages with English body text are EXPECTED right now, do NOT file duplicate tickets about this." Without those failure memories baked in, every new agent session repeats the same mistakes.

The "one governance file compiled to many" approach makes a lot of sense for teams using multiple tools. The drift problem is real — I've seen configs get stale within days when they're manually maintained across tools. The pre-commit hook auto-recompile is the right answer there.

Curious about one thing: does crag handle project-specific anti-patterns that aren't inferrable from CI? Like "never use trailing slashes on URLs" or "always check for duplicates before creating tickets" — those are the rules that save the most agent cycles but live entirely in human knowledge until someone writes them down.

Alexandru Cioc

The Supabase case was wild to find — three AI configs, zero overlap, maintained by (presumably) three different people who never compared notes. That's the drift problem in miniature.

Your point about encoding failures is gold. "TSX stock tickers do NOT use .to suffix" — no static analyzer will ever infer that. That's human knowledge earned through pain.

To your question: crag analyze infers what it can from code and CI — linters, test frameworks, build steps, directory conventions. But the ## Anti-Patterns section in governance.md is exactly where those hard-won rules live. You write them once ("never use trailing slashes on URLs", "always check for duplicates before creating tickets") and crag compiles them into every AI tool's native format. The key part: custom content you add survives recompilation. So crag handles the mechanical stuff automatically, you add the stuff only a human would know, and both get distributed everywhere together.
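
Using your examples, the hand-written part looks something like this (a sketch, not crag's literal schema):

```markdown
## Anti-Patterns
<!-- hand-written rules; crag preserves this section across recompiles -->
- Never use trailing slashes on URLs.
- Always check for duplicates before creating tickets.
- TSX stock tickers do NOT use the .to suffix in URLs
  (a false alarm here once wasted a full agent cycle).
```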

The failure-memory pattern you described is something I want to make easier to capture. Right now it's manual — you edit governance.md. An interactive crag add-rule that prompts for the failure context would lower the barrier.