<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: akshay sharma</title>
    <description>The latest articles on DEV Community by akshay sharma (@akshay_sharma_06637368320).</description>
    <link>https://dev.to/akshay_sharma_06637368320</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3976348%2F17bec652-bc41-4639-843f-ad5518ea9f0b.jpg</url>
      <title>DEV Community: akshay sharma</title>
      <link>https://dev.to/akshay_sharma_06637368320</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/akshay_sharma_06637368320"/>
    <language>en</language>
    <item>
      <title>I cut my coding agent's token usage 61% by giving it a code graph</title>
      <dc:creator>akshay sharma</dc:creator>
      <pubDate>Wed, 10 Jun 2026 15:27:31 +0000</pubDate>
      <link>https://dev.to/akshay_sharma_06637368320/i-cut-my-coding-agents-token-usage-61-by-giving-it-a-code-graph-n3g</link>
      <guid>https://dev.to/akshay_sharma_06637368320/i-cut-my-coding-agents-token-usage-61-by-giving-it-a-code-graph-n3g</guid>
      <description>&lt;p&gt;My coding agent has a goldfish memory. Every session runs the same way: I ask, "Who calls &lt;code&gt;parseToken&lt;/code&gt;?" and it opens seven files, reads forty kilobytes, and, half a minute later, tells me something it could have told me on day one if it remembered the shape of my code.&lt;/p&gt;

&lt;p&gt;It never remembers. Every conversation starts from zero, so it greps, reads, burns tokens, and every so often invents a function name that was never there.&lt;/p&gt;

&lt;p&gt;I got tired of paying for that, so I built GraphPilot. It's a local MCP server that indexes your TypeScript/JavaScript repo once and lets the agent query its structure instead of re-reading files. Here's what I found when I measured it.&lt;/p&gt;

&lt;h2&gt;The actual problem&lt;/h2&gt;

&lt;p&gt;Coding agents reason well and remember nothing. The structural questions they ask all day ("where is this defined?", "who calls it?", "what breaks if I change it?") have exact answers sitting in the code, but the agent re-derives them from raw text every time, because nothing survives between sessions.&lt;/p&gt;

&lt;p&gt;You pay for that three ways. The obvious one is tokens: re-reading files to answer a question you already answered is pure waste. Then there's accuracy, because grep finds the string &lt;code&gt;save&lt;/code&gt; without knowing which &lt;code&gt;save&lt;/code&gt; you meant, so the agent guesses. And the one that actually worries me is refactor safety. "What does renaming this break?" is the question that matters most, and a file-by-file agent is worst at exactly that question.&lt;/p&gt;

&lt;p&gt;GraphPilot fills that gap by acting as persistent structural memory.&lt;/p&gt;

&lt;h2&gt;How it works&lt;/h2&gt;

&lt;p&gt;The CLI parses your repo with tree-sitter and builds a graph. Every function, class, method, interface, type, and enum becomes a node; every call becomes an edge. The graph gets written to &lt;code&gt;~/.graphpilot/&lt;/code&gt;, and an MCP server hands it to the agent over stdio.&lt;/p&gt;

&lt;p&gt;Four tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;gp_recall&lt;/code&gt; finds where a symbol is defined&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;gp_callers&lt;/code&gt; lists who calls a symbol (the reverse lookup)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;gp_impact&lt;/code&gt; computes the blast radius of a change, depth-bounded&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;gp_index&lt;/code&gt; re-indexes after edits&lt;/li&gt;
&lt;/ul&gt;
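
&lt;p&gt;To make the reverse lookup and the depth bound concrete, here is a minimal TypeScript sketch of what &lt;code&gt;gp_callers&lt;/code&gt; and &lt;code&gt;gp_impact&lt;/code&gt; compute conceptually. It is not GraphPilot's implementation: the &lt;code&gt;CallEdge&lt;/code&gt; type, the function names, and every sample symbol other than &lt;code&gt;parseToken&lt;/code&gt; are made up for illustration.&lt;/p&gt;

```typescript
// Hypothetical edge shape: one record per call site, caller -> callee.
type CallEdge = { caller: string; callee: string };

// Reverse lookup: every distinct symbol that calls `symbol` directly.
function callersOf(edges: CallEdge[], symbol: string): string[] {
  const callers = edges.filter((e) => e.callee === symbol).map((e) => e.caller);
  return Array.from(new Set(callers));
}

// Blast radius: breadth-first walk up the caller chain, at most maxDepth hops.
function impactOf(edges: CallEdge[], symbol: string, maxDepth: number): string[] {
  const seen = new Set([symbol]);
  const affected: string[] = [];
  let frontier = [symbol];
  let hopsLeft = maxDepth;
  while (hopsLeft > 0) {
    const next: string[] = [];
    for (const current of frontier) {
      for (const caller of callersOf(edges, current)) {
        if (seen.has(caller)) continue; // already counted, also guards against cycles
        seen.add(caller);
        affected.push(caller);
        next.push(caller);
      }
    }
    if (next.length === 0) break; // nothing further up the chain
    frontier = next;
    hopsLeft -= 1;
  }
  return affected;
}

// Made-up sample graph for illustration.
const sampleEdges: CallEdge[] = [
  { caller: "verifyRequest", callee: "parseToken" },
  { caller: "refreshSession", callee: "parseToken" },
  { caller: "handleLogin", callee: "verifyRequest" },
  { caller: "router", callee: "handleLogin" },
];

console.log(callersOf(sampleEdges, "parseToken"));   // [ 'verifyRequest', 'refreshSession' ]
console.log(impactOf(sampleEdges, "parseToken", 2)); // [ 'verifyRequest', 'refreshSession', 'handleLogin' ]
```

&lt;p&gt;With a depth of 1 the blast radius is just the direct callers. Each extra hop adds the callers of those callers, which is why a bound keeps the answer small on a large repo.&lt;/p&gt;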

&lt;p&gt;Every response carries a &lt;code&gt;file:line @ sha&lt;/code&gt; anchor, so the agent can quote its source and you can jump straight to the line. And when a name is ambiguous, say two files both export &lt;code&gt;save&lt;/code&gt;, the answer tells you instead of quietly picking one and pretending it was sure.&lt;/p&gt;
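
&lt;p&gt;A rough sketch of that behaviour in TypeScript, assuming a simple node record: the &lt;code&gt;SymbolNode&lt;/code&gt; and &lt;code&gt;RecallResult&lt;/code&gt; shapes, the file paths, and the sha below are hypothetical and are not GraphPilot's actual schema or output format. The point is only that every match comes back with its anchor, and that more than one match is flagged rather than resolved silently.&lt;/p&gt;

```typescript
// Hypothetical node record; the real schema may differ.
type SymbolNode = { name: string; kind: string; file: string; line: number };

// Hypothetical response: all matches, plus a flag when the name is not unique.
type RecallResult = { ambiguous: boolean; anchors: string[] };

// Format an anchor in the "file:line @ sha" style described above.
function anchorFor(node: SymbolNode, sha: string): string {
  return `${node.file}:${node.line} @ ${sha}`;
}

// Look a name up and report every definition instead of picking one.
function recall(nodes: SymbolNode[], name: string, sha: string): RecallResult {
  const matches = nodes.filter((n) => n.name === name);
  return {
    ambiguous: matches.length > 1,
    anchors: matches.map((n) => anchorFor(n, sha)),
  };
}

// Made-up sample data: two files both define `save`.
const sampleNodes: SymbolNode[] = [
  { name: "save", kind: "function", file: "src/user.ts", line: 12 },
  { name: "save", kind: "function", file: "src/order.ts", line: 40 },
  { name: "parseToken", kind: "function", file: "src/auth.ts", line: 7 },
];

console.log(recall(sampleNodes, "save", "abc1234"));
```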

&lt;h2&gt;The numbers&lt;/h2&gt;

&lt;p&gt;I ran a real coding agent (claude-sonnet-4-5) against fastify, a Node.js framework of about 300 files, on 40 structural questions. First with nothing but file reads. Then with GraphPilot's four tools. Same model, same questions, same repo.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Without&lt;/th&gt;
&lt;th&gt;With GraphPilot&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Total tokens&lt;/td&gt;
&lt;td&gt;2,796,760&lt;/td&gt;
&lt;td&gt;1,088,276&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API cost&lt;/td&gt;
&lt;td&gt;$8.88&lt;/td&gt;
&lt;td&gt;$3.68&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Correct answers&lt;/td&gt;
&lt;td&gt;33/40&lt;/td&gt;
&lt;td&gt;37/40&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That's 61% fewer tokens and about 59% lower API cost, and it got four more answers right rather than trading accuracy for savings. By question type, token usage fell 82% on "who calls X?" questions and 73% on impact analysis.&lt;/p&gt;
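
&lt;p&gt;If you want to check the headline figure, the arithmetic from the table is short. The helper name &lt;code&gt;percentSaved&lt;/code&gt; is just something I made up for this snippet. The inputs are the table's totals.&lt;/p&gt;

```typescript
// Percentage saved when a metric falls from `before` to `after`, rounded to one decimal place.
function percentSaved(before: number, after: number): number {
  return Math.round((1 - after / before) * 1000) / 10;
}

// Figures copied from the table above.
const tokenSaving = percentSaved(2796760, 1088276); // 61.1
const costSaving = percentSaved(8.88, 3.68); // 58.6
const extraCorrect = 37 - 33; // 4 more correct answers out of 40

console.log(`tokens: -${tokenSaving}%, cost: -${costSaving}%, correct answers: +${extraCorrect}`);
```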

&lt;p&gt;Now the part most launch posts skip. It doesn't help everywhere. Flow-tracing questions like "trace a request through the middleware" come out roughly even, because the agent still has to read code to answer them. Plain dependency checks save about 7%. I publish those numbers too. A tool that claims to win at everything is hiding something.&lt;/p&gt;

&lt;h2&gt;Local-first, and I mean it&lt;/h2&gt;

&lt;p&gt;GraphPilot never makes a network call. Indexing is tree-sitter running on your machine, and your source never leaves it. There's no telemetry and no update check, and that part is enforced rather than promised: an ESLint rule bans &lt;code&gt;http&lt;/code&gt;, &lt;code&gt;fetch&lt;/code&gt;, and &lt;code&gt;axios&lt;/code&gt; imports in the source, and CI fails any PR that tries to add one. The graph sits in &lt;code&gt;~/.graphpilot/&lt;/code&gt; at mode 0600.&lt;/p&gt;
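
&lt;p&gt;For readers who want the same guard in their own project, a lint setup along these lines is one way to do it. This is a sketch using ESLint's built-in &lt;code&gt;no-restricted-imports&lt;/code&gt; and &lt;code&gt;no-restricted-globals&lt;/code&gt; rules in a flat config. It is not copied from the GraphPilot repo, and the file glob and the exact module list are assumptions.&lt;/p&gt;

```typescript
// eslint.config.js (flat config). A sketch only: the rules in the GraphPilot repo may differ.
// Parser setup for TypeScript files is omitted here.
const noNetworkRules = {
  files: ["src/**/*.ts"], // assumed source layout
  rules: {
    // Fail on ES imports of networking modules.
    "no-restricted-imports": [
      "error",
      { paths: ["http", "https", "node:http", "node:https", "axios", "node-fetch"] },
    ],
    // fetch is a global rather than a module, so it needs its own rule.
    "no-restricted-globals": ["error", "fetch"],
  },
};

export default [noNetworkRules];
```

&lt;p&gt;Running ESLint in CI then turns any new networking import into a failed build instead of a review comment.&lt;/p&gt;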

&lt;h2&gt;Try it&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; @graphpilot-oss/graphpilot
graphpilot index ~/code/my-app
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then point your agent's MCP config at &lt;code&gt;graphpilot mcp&lt;/code&gt;. It works with Claude Code, Cursor, Cline, Windsurf, and Continue.&lt;/p&gt;
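
&lt;p&gt;Several MCP clients read a JSON config with an &lt;code&gt;mcpServers&lt;/code&gt; map, so the entry usually looks something like the object below. Treat the shape as an assumption and check your client's docs for the exact file and format. The server label &lt;code&gt;graphpilot&lt;/code&gt; is arbitrary. The snippet just prints the JSON so you can paste it.&lt;/p&gt;

```typescript
// Assumed config shape: an "mcpServers" map keyed by an arbitrary server label.
const mcpConfig = {
  mcpServers: {
    graphpilot: {
      command: "graphpilot", // the CLI installed by the npm command above
      args: ["mcp"], // starts the MCP server, which talks over stdio
    },
  },
};

// Print the JSON to paste into your client's MCP config file.
console.log(JSON.stringify(mcpConfig, null, 2));
```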

&lt;p&gt;It's TypeScript and JavaScript only right now (the tree-sitter-typescript package covers TS, TSX, JSX, and JS through its two grammars, &lt;code&gt;typescript&lt;/code&gt; and &lt;code&gt;tsx&lt;/code&gt;). Python is probably next if people ask for it.&lt;/p&gt;

&lt;p&gt;It's Apache-2.0, on GitHub at &lt;a href="https://github.com/graphpilot-oss/graphpilot" rel="noopener noreferrer"&gt;graphpilot-oss/graphpilot&lt;/a&gt;. The benchmark is reproducible: the script and method are in the repo, so if you think I measured it wrong, you can check for yourself.&lt;/p&gt;

&lt;p&gt;One thing I'd actually like to know: what structural questions do you ask your agent most, and which ones am I not measuring yet?&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>typescript</category>
      <category>devtools</category>
    </item>
  </channel>
</rss>
