codegraph: The Missing Knowledge Graph for 5 Coding Agents

#ai #claude #agents #coding

colbymchenry/codegraph added 2,434 stars in 24 hours and climbed to #2 on GitHub Trending on May 23. It's a local-first, multi-agent code knowledge graph built specifically for Claude Code, Codex CLI, Cursor, OpenCode, and Hermes Agent. Its reported median benchmark results across seven real-world codebases are 59% fewer tokens, 49% faster responses, and 70% fewer tool calls.

That ranking matters because of the company it keeps. On the same trending day, multica-ai/andrej-karpathy-skills gained 3,372 stars at #1, Lum1104/Understand-Anything gained 2,331 stars at #3 with the same knowledge-graph thesis, and NousResearch/hermes-agent gained 1,334 stars. Five of the day's top ten trending repos are coding-agent infrastructure.

What codegraph Actually Builds

Most coding agents waste 60–70% of their tokens re-discovering code structure on every task. codegraph replaces that with a one-time, local indexing pass:

The pipeline: Tree-sitter parses source → language-specific queries extract nodes (functions, classes, files) and edges (calls, imports, inheritance) → SQLite (.codegraph/codegraph.db) stores it with FTS5 full-text search → post-extraction reference resolution links calls to definitions → native OS file watchers keep it current with 2-second debouncing.
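The extract-then-resolve shape of that pipeline can be sketched in a few lines. This is a minimal illustration, not codegraph's implementation: it swaps Python's standard-library ast module in for Tree-sitter, keeps everything in memory instead of SQLite, and resolves calls by bare name only. The sample files and the extract, resolve, and build_graph helpers are hypothetical.

```python
import ast

# Two tiny in-memory "files"; their contents are made up for illustration.
SOURCES = {
    "config.py": (
        "import yaml_utils\n"
        "\n"
        "def parse_config(path):\n"
        "    return yaml_utils.load(path)\n"
    ),
    "app.py": (
        "from config import parse_config\n"
        "\n"
        "def main():\n"
        "    return parse_config('app.yaml')\n"
    ),
}


def extract(path, source):
    """Pass 1: collect definition nodes plus unresolved import/call edges."""
    nodes, raw_edges = [], []
    for item in ast.walk(ast.parse(source)):
        if isinstance(item, ast.ImportFrom):
            raw_edges.append((path, "imports", item.module))
        elif isinstance(item, ast.Import):
            for alias in item.names:
                raw_edges.append((path, "imports", alias.name))
        elif isinstance(item, ast.FunctionDef):
            node_id = f"{path}:{item.name}"
            nodes.append({"id": node_id, "name": item.name,
                          "file": path, "line": item.lineno})
            # Record plain-name calls made inside this function body.
            for sub in ast.walk(item):
                if isinstance(sub, ast.Call) and isinstance(sub.func, ast.Name):
                    raw_edges.append((node_id, "calls", sub.func.id))
    return nodes, raw_edges


def resolve(nodes, raw_edges):
    """Pass 2: link each call edge to the definition node(s) with that name."""
    by_name = {}
    for node in nodes:
        by_name.setdefault(node["name"], []).append(node["id"])
    resolved = []
    for src, kind, target in raw_edges:
        if kind == "calls":
            # Name-only matching; a real resolver would also use imports and scope.
            for node_id in by_name.get(target, []):
                resolved.append((src, "calls", node_id))
        else:
            resolved.append((src, kind, target))
    return resolved


def build_graph(sources):
    nodes, raw_edges = [], []
    for path, text in sources.items():
        file_nodes, file_edges = extract(path, text)
        nodes += file_nodes
        raw_edges += file_edges
    return nodes, resolve(nodes, raw_edges)


nodes, edges = build_graph(SOURCES)
for edge in edges:
    print(edge)
```

Resolution has to run after extraction because a call site can appear in a file that is parsed before the file holding the definition.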

The agent doesn't read files to understand structure; it queries the graph. The graph already knows that parseConfig has 3 callers, that it imports from ./yaml-utils.ts, and that its definition lives at src/config/parser.ts:147. The agent reads only the file it's editing.
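To make the parseConfig example concrete, here is a sketch of what such lookups could look like against a SQLite store. The two-table schema, the caller names, and the helper functions are assumptions for illustration; codegraph's actual schema in .codegraph/codegraph.db is not described in this article and may differ.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical schema: one table of nodes, one table of typed edges.
conn.executescript("""
CREATE TABLE nodes (id INTEGER PRIMARY KEY, name TEXT, kind TEXT, file TEXT, line INTEGER);
CREATE TABLE edges (src INTEGER, dst INTEGER, kind TEXT);
""")
conn.executemany(
    "INSERT INTO nodes VALUES (?, ?, ?, ?, ?)",
    [
        (1, "parseConfig", "function", "src/config/parser.ts", 147),
        # The three callers below are invented names.
        (2, "loadSettings", "function", "src/app.ts", 12),
        (3, "runCli", "function", "src/cli.ts", 40),
        (4, "reloadConfig", "function", "src/watch.ts", 8),
        (5, "./yaml-utils.ts", "file", "./yaml-utils.ts", None),
    ],
)
conn.executemany(
    "INSERT INTO edges VALUES (?, ?, ?)",
    [(2, 1, "calls"), (3, 1, "calls"), (4, 1, "calls"), (1, 5, "imports")],
)


def definition(db, name):
    """Return 'file:line' for the function definition with this name."""
    row = db.execute(
        "SELECT file, line FROM nodes WHERE name = ? AND kind = 'function'",
        (name,),
    ).fetchone()
    return f"{row[0]}:{row[1]}"


def neighbors(db, name, kind, incoming):
    """Names on the other end of edges of `kind` touching `name`."""
    near, far = ("dst", "src") if incoming else ("src", "dst")
    rows = db.execute(
        f"SELECT other.name FROM edges e "
        f"JOIN nodes me ON me.id = e.{near} "
        f"JOIN nodes other ON other.id = e.{far} "
        f"WHERE me.name = ? AND e.kind = ? ORDER BY other.name",
        (name, kind),
    ).fetchall()
    return [r[0] for r in rows]


print(definition(conn, "parseConfig"))
print(neighbors(conn, "parseConfig", "calls", incoming=True))
print(neighbors(conn, "parseConfig", "imports", incoming=False))
```

Each answer is a single indexed lookup, so the agent gets callers, imports, and the definition site without opening any source file.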

The MCP Surface — Why Multi-Agent Works

codegraph exposes itself as an MCP server with nine tools: codegraph_search, codegraph_context, codegraph_callers, codegraph_callees, codegraph_impact, codegraph_explore, codegraph_node, codegraph_files, and codegraph_status. Because the surface is MCP, the same .codegraph/codegraph.db works for any agent that speaks the protocol. Today that's Claude Code, Codex CLI, Cursor, OpenCode, and Hermes Agent.
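What makes this agent-agnostic is that every client sends the same kind of message. MCP tool invocations are JSON-RPC 2.0 requests with the method tools/call, so a call to one of the tools above might look roughly like the sketch below. The tool name comes from the list above; the "symbol" argument key and the tool_call helper are assumptions, since the real input schema for each tool is whatever the server advertises via tools/list.

```python
import json
from itertools import count

_request_ids = count(1)


def tool_call(name, arguments):
    """Build a JSON-RPC 2.0 request for an MCP tools/call invocation."""
    return {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# "symbol" is a hypothetical argument name used only for illustration.
request = tool_call("codegraph_callers", {"symbol": "parseConfig"})
print(json.dumps(request, indent=2))
```

Any agent that can emit this message over an MCP transport can use the graph, which is why the index does not need to know which agent is asking.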

None of them have to ship their own indexer. None of them have to re-invent semantic exploration. Switch agents, keep the graph.

The Swift Compiler Benchmark

The most-quoted figure comes from indexing the Swift compiler: 25,874 files and 272,898 nodes, indexed in under 4 minutes. Asked a complex question against that index, an agent answered in 35 seconds with 6 explore calls and zero file reads.

The same question through vanilla Claude Code would routinely take 90–180 seconds, 25–40 tool calls, and 200K–400K tokens of context. codegraph compresses it to a half-minute conversation that fits in context without truncation.
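For a sense of scale, the ratios implied by those figures can be worked out directly. This is plain arithmetic on the numbers quoted in this section; the baseline ranges are the article's estimates for a single question, not a controlled measurement, and the helper names are made up.

```python
def speedup(before, after):
    """How many times faster `after` is than `before`."""
    return before / after


def reduction(before, after):
    """Fractional reduction going from `before` to `after`."""
    return 1 - after / before


# Response time: 90-180 s baseline vs 35 s with the graph.
time_speedup = (speedup(90, 35), speedup(180, 35))

# Tool calls: 25-40 baseline vs 6 explore calls.
call_reduction = (reduction(25, 6), reduction(40, 6))

# Indexing: 25,874 files and 272,898 nodes in under 4 minutes (240 s),
# so the throughput figure is a lower bound.
min_files_per_second = 25_874 / 240
nodes_per_file = 272_898 / 25_874

print(f"speedup: {time_speedup[0]:.1f}x to {time_speedup[1]:.1f}x")
print(f"fewer tool calls: {call_reduction[0]:.0%} to {call_reduction[1]:.0%}")
print(f"indexing: at least {min_files_per_second:.0f} files/s, "
      f"{nodes_per_file:.1f} nodes per file")
```

On these numbers the single Swift example lands at roughly 2.6x to 5.1x faster with 76% to 85% fewer tool calls, which is somewhat better than the 70% median tool-call reduction reported across the seven codebases.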

The Parallel Implementations Tell the Real Story

If codegraph were a one-off, the trending page would have one knowledge-graph tool. It has at least three (codegraph, Understand-Anything, code-review-graph). Three independent implementations of the same primitive shipped within the same trending window.

That's the signal. The agent ecosystem has collectively discovered that the bottleneck on coding agents stopped being model capability and became context efficiency, and that pre-indexing is the obvious answer. It's the same lesson search engines learned in 1998.

Where codegraph Sits in the Multi-Agent Stack

For production agent setups, codegraph belongs in your stack when the codebase exceeds 5,000 files, you run multi-agent workflows, and your privacy posture requires 100% local processing.

The Buy/Build/Wait Read

Should you adopt today? If your codebase is over 5,000 files and you're paying Claude Code or Codex bills above $200/month per developer, yes. Install via npx @colbymchenry/codegraph, then run codegraph init -i.

Should you bet on the category? Yes, but loosely. Expect Anthropic, OpenAI, and Cursor to each ship their own native indexer within 6 months. The MCP-based tools, codegraph among them, have a path to surviving that squeeze because they are not tied to a single agent.

The deeper bet — that something like a pre-indexed code knowledge graph becomes table stakes for serious coding agents in 2026 — is the safe one. Three independent implementations on GitHub Trending in the same week is the market making that call out loud.

Originally published at AgentConn.