<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Akash Goenka</title>
    <description>The latest articles on DEV Community by Akash Goenka (@akashgoenka).</description>
    <link>https://dev.to/akashgoenka</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3899636%2F94881d4b-e46e-4c0e-a976-9d9432a911d9.jpeg</url>
      <title>DEV Community: Akash Goenka</title>
      <link>https://dev.to/akashgoenka</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/akashgoenka"/>
    <language>en</language>
    <item>
      <title>Building coldstart: what broke, what held up</title>
      <dc:creator>Akash Goenka</dc:creator>
      <pubDate>Mon, 27 Apr 2026 04:35:43 +0000</pubDate>
      <link>https://dev.to/akashgoenka/building-coldstart-what-broke-what-held-up-447l</link>
      <guid>https://dev.to/akashgoenka/building-coldstart-what-broke-what-held-up-447l</guid>
      <description>&lt;p&gt;This is the long version. If you want the short pitch for &lt;a href="https://www.npmjs.com/package/coldstart-mcp" rel="noopener noreferrer"&gt;coldstart&lt;/a&gt; — what it is and why it exists — read &lt;a href="https://dev.to/akashgoenka/your-agent-is-spending-more-time-finding-code-than-understanding-it-38in"&gt;the main post&lt;/a&gt; first. This one is for people who want to see the iteration story: the design decisions that didn't work, the ones that did, and why.&lt;/p&gt;

&lt;p&gt;The arc is roughly: I started with a simple idea, hit real codebases, and watched it break in interesting ways. Each section below is a thing that broke and what I did about it.&lt;/p&gt;

&lt;h2&gt;The starting point: one folder-path domain per file&lt;/h2&gt;

&lt;p&gt;The first version of the index assigned each file a single "domain" based on its folder path. The idea was straightforward — files in &lt;code&gt;src/auth/&lt;/code&gt; belong to the &lt;code&gt;auth&lt;/code&gt; domain, files in &lt;code&gt;src/billing/&lt;/code&gt; belong to &lt;code&gt;billing&lt;/code&gt;, and so on.&lt;/p&gt;

&lt;p&gt;This held up for about ten minutes of real-world testing.&lt;/p&gt;

&lt;p&gt;The problem: deeply nested files lost all specificity. A file at &lt;code&gt;src/features/billing/components/invoices/list/InvoiceListRow.tsx&lt;/code&gt; would get tagged with &lt;code&gt;list&lt;/code&gt; or &lt;code&gt;invoices&lt;/code&gt; or &lt;code&gt;billing&lt;/code&gt; depending on where you cut, and none of those is uniquely identifying. Two completely unrelated &lt;code&gt;Row.tsx&lt;/code&gt; files in different feature trees would collide on the same domain. Worse, the agent had no way to query for "the invoice list row" because the domain was just one slice of the path.&lt;/p&gt;

&lt;p&gt;The fix was to stop thinking about a single domain and start thinking about &lt;em&gt;all the meaningful tokens&lt;/em&gt;.&lt;/p&gt;

&lt;h2&gt;Domains as an array of tokens&lt;/h2&gt;

&lt;p&gt;I moved to a &lt;code&gt;domains[]&lt;/code&gt; array — every meaningful token from the path segments plus every exported symbol from the file. So &lt;code&gt;InvoiceListRow.tsx&lt;/code&gt; at &lt;code&gt;src/features/billing/components/invoices/list/InvoiceListRow.tsx&lt;/code&gt; would index as something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nl"&gt;path&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;src/features/billing/components/invoices/list/InvoiceListRow.tsx&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="nx"&gt;domains&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;features&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;billing&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;components&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoices&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;list&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;invoice&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;list&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;row&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                    &lt;span class="c1"&gt;// from filename, split on case&lt;/span&gt;
    &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;InvoiceListRow&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;InvoiceListRowProps&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;      &lt;span class="c1"&gt;// exported symbols&lt;/span&gt;
  &lt;span class="p"&gt;],&lt;/span&gt;
  &lt;span class="nx"&gt;exports&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;InvoiceListRow&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;InvoiceListRowProps&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This worked much better. A query for &lt;code&gt;"invoice list"&lt;/code&gt; would hit both the path segments and the symbol name. A query for &lt;code&gt;"InvoiceListRow"&lt;/code&gt; would hit the export directly.&lt;/p&gt;
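&lt;p&gt;A minimal sketch of how a token list like the one above could be derived. The helper names (&lt;code&gt;splitOnCase&lt;/code&gt;, &lt;code&gt;buildDomains&lt;/code&gt;), the ignore list, and the de-duplication step are assumptions for illustration, not coldstart's actual tokenizer.&lt;/p&gt;

```typescript
// Hypothetical helpers: the split rules below are assumptions for illustration.
const IGNORED_SEGMENTS = ["src"]; // assumed: generic roots carry no signal

// Splits an identifier on case boundaries and lowercases the parts.
// "InvoiceListRow" -> ["invoice", "list", "row"]
function splitOnCase(name: string): string[] {
  return name
    .replace(/([a-z0-9])([A-Z])/g, "$1 $2")
    .split(/[^A-Za-z0-9]+/)
    .filter((part) => part.length > 0)
    .map((part) => part.toLowerCase());
}

// Collects tokens from folder segments, the file name, and exported symbols.
function buildDomains(path: string, exportedSymbols: string[]): string[] {
  const segments = path.split("/").filter((s) => s.length > 0);
  const fileName = (segments.pop() ?? "").replace(/\.[^.]+$/, "");
  const folderTokens = segments
    .filter((s) => IGNORED_SEGMENTS.indexOf(s) === -1)
    .map((s) => s.toLowerCase());
  const tokens = folderTokens.concat(splitOnCase(fileName), exportedSymbols);
  // De-duplicate while keeping first-seen order.
  return Array.from(new Set(tokens));
}

const domains = buildDomains(
  "src/features/billing/components/invoices/list/InvoiceListRow.tsx",
  ["InvoiceListRow", "InvoiceListRowProps"]
);
console.log(domains);
```

&lt;p&gt;Unlike the hand-written example above, this sketch drops the repeated &lt;code&gt;list&lt;/code&gt; token. Whether duplicates should be kept is a design choice the sketch does not try to settle.&lt;/p&gt;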

&lt;p&gt;Then I tried adding import paths as a token source — the reasoning being "files that import from &lt;code&gt;auth/&lt;/code&gt; are probably auth-related." This was a mistake.&lt;/p&gt;

&lt;p&gt;A middleware file that imports from many feature-specific files (a common pattern — global router config, a top-level layout component, an API client setup file) would start matching every query for any of those features. Pure noise. The middleware file was structurally important but not &lt;em&gt;about&lt;/em&gt; any one feature, and indexing its imports made it look like it was about all of them. I pulled it back out.&lt;/p&gt;

&lt;p&gt;The lesson: &lt;strong&gt;what a file imports tells you about its dependencies, not its identity.&lt;/strong&gt; Identity comes from the file's own path and exports. That's the boundary I drew.&lt;/p&gt;

&lt;h2&gt;The substring-matching disaster&lt;/h2&gt;

&lt;p&gt;Early on, matching was substring-based. A query token would match an index token if it appeared anywhere inside it. This seemed reasonable — &lt;code&gt;"user"&lt;/code&gt; should match &lt;code&gt;"UserProfile"&lt;/code&gt;, after all.&lt;/p&gt;

&lt;p&gt;It caused cascade failures.&lt;/p&gt;

&lt;p&gt;The token &lt;code&gt;"in"&lt;/code&gt; is a substring of &lt;code&gt;"login"&lt;/code&gt;, &lt;code&gt;"signin"&lt;/code&gt;, &lt;code&gt;"settings"&lt;/code&gt;, &lt;code&gt;"admin"&lt;/code&gt;, &lt;code&gt;"binding"&lt;/code&gt;, &lt;code&gt;"PluginConfig"&lt;/code&gt;, and roughly a thousand other tokens. A query like &lt;code&gt;"sign in form"&lt;/code&gt; would tokenize to &lt;code&gt;["sign", "in", "form"]&lt;/code&gt;, with &lt;code&gt;"in"&lt;/code&gt; matching as a substring across hundreds of unrelated files, and the result list would balloon with files that had nothing to do with sign-in flows.&lt;/p&gt;
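&lt;p&gt;The blow-up is easy to reproduce. The sketch below compares substring matching with exact matching over a small, made-up set of index tokens. The function names are hypothetical and the token list is only an illustration.&lt;/p&gt;

```typescript
// Made-up index tokens for illustration only.
const indexTokens = [
  "login", "signin", "settings", "admin", "binding", "PluginConfig", "sign", "form",
];

// Substring matching: a query token matches if it appears anywhere inside an index token.
function substringMatches(queryToken: string, tokens: string[]): string[] {
  const needle = queryToken.toLowerCase();
  return tokens.filter((t) => t.toLowerCase().indexOf(needle) !== -1);
}

// Exact matching: a query token matches only an identical index token (case-insensitive).
function exactMatches(queryToken: string, tokens: string[]): string[] {
  const needle = queryToken.toLowerCase();
  return tokens.filter((t) => t.toLowerCase() === needle);
}

for (const queryToken of ["sign", "in", "form"]) {
  console.log(
    queryToken,
    "substring:", substringMatches(queryToken, indexTokens),
    "exact:", exactMatches(queryToken, indexTokens)
  );
}
```

&lt;p&gt;With these tokens, &lt;code&gt;"in"&lt;/code&gt; matches six of the eight entries as a substring and none exactly, which is the noise pattern described above in miniature.&lt;/p&gt;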

&lt;p&gt;I tried fixes in this order:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Length-based penalties&lt;/strong&gt; — penalize matches where the query token is much shorter than the index token. Helped a little; broke for legitimate short tokens like &lt;code&gt;id&lt;/code&gt;, &lt;code&gt;db&lt;/code&gt;, &lt;code&gt;api&lt;/code&gt;.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Minimum length thresholds for substring matching&lt;/strong&gt; — only allow substring matches if the query token is at least N characters. Cut some noise; introduced new false negatives where a 3-character token was actually meaningful.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Exact-match-only with a fallback&lt;/strong&gt; — match exactly first, fall back to substring only if no exact matches exist. Better, but the fallback still triggered noise on rare-but-real exact-zero queries.
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;None of these felt principled. They were all heuristics layered on a fundamentally noisy signal.&lt;/p&gt;

&lt;h2&gt;IDF-based rarity scoring&lt;/h2&gt;

&lt;p&gt;What finally worked was scoring tokens by inverse document frequency at index-build time. Common tokens — &lt;code&gt;index&lt;/code&gt;, &lt;code&gt;utils&lt;/code&gt;, &lt;code&gt;helper&lt;/code&gt;, &lt;code&gt;component&lt;/code&gt;, &lt;code&gt;types&lt;/code&gt; — get low weight. Rare tokens — your specific feature names, your specific symbol names — get high weight.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// At index build time, compute IDF for every token&lt;/span&gt;
&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;computeIDF&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tokenCounts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;totalFiles&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;idf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nb"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;tokenCounts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;idf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;token&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;Math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;totalFiles&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;count&lt;/span&gt;&lt;span class="p"&gt;)));&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;idf&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The match score for a file became the sum of IDF weights of matched tokens, scaled by the fraction of query tokens matched:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;scoreFile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;queryTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[],&lt;/span&gt; &lt;span class="nx"&gt;fileTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Set&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;idf&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Map&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kr"&gt;number&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;matched&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;queryTokens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;fileTokens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;has&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;matched&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;idfSum&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;matched&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reduce&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;s&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;s&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;idf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;t&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;??&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;coverage&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;matched&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nx"&gt;queryTokens&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;idfSum&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;coverage&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nx"&gt;coverage&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;  &lt;span class="c1"&gt;// squared to favor higher coverage&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This alone wasn't enough — even with IDF, common-token files were still slipping through if they happened to match one rare token incidentally. So I added a two-predicate filter on top:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A file qualifies as a result if it matches a rare token (IDF above threshold) OR satisfies multiple distinct concept groups in the query.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The "concept groups" thing matters. A query like &lt;code&gt;"invoice list row"&lt;/code&gt; is conceptually one thing — invoices, list views, row components — and a file that hits all three is structurally relevant even if no individual token is super rare. Either rare-token-match or multi-group-match gets you in. Neither, you're out.&lt;/p&gt;
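&lt;p&gt;One way the two-predicate filter could look in code. This is a sketch under stated assumptions: the &lt;code&gt;ConceptGroup&lt;/code&gt; shape, the &lt;code&gt;qualifies&lt;/code&gt; name, and both thresholds are hypothetical, and a group is assumed to be satisfied when any one of its tokens appears in the file.&lt;/p&gt;

```typescript
// Assumed shapes for illustration; coldstart's real types may differ.
type IdfTable = { [token: string]: number };
type ConceptGroup = string[]; // tokens treated as one concept, e.g. ["invoice", "invoices"]

function qualifies(
  groups: ConceptGroup[],
  fileTokens: string[],
  idf: IdfTable,
  rareThreshold: number, // assumed tunable cut-off for "rare"
  minGroups: number // assumed: "multiple" means at least 2
): boolean {
  const present = (token: string) => fileTokens.indexOf(token) !== -1;
  const weight = (token: string) =>
    Object.prototype.hasOwnProperty.call(idf, token) ? idf[token] : 0;

  // Predicate 1: the file contains at least one query token whose IDF is above the threshold.
  const hasRareMatch = groups.some((group) =>
    group.some((token) => (present(token) ? weight(token) > rareThreshold : false))
  );

  // Predicate 2: the file satisfies several distinct concept groups.
  const groupsHit = groups.filter((group) => group.some(present)).length;

  return hasRareMatch || groupsHit >= minGroups;
}

// Made-up IDF values for the usage example.
const sampleIdf: IdfTable = { invoice: 0.5, list: 0.2, row: 0.3, stripe: 3.0 };
console.log(qualifies([["invoice"], ["list"], ["row"]], ["invoice", "list", "row"], sampleIdf, 2, 2));
console.log(qualifies([["list"]], ["list", "utils"], sampleIdf, 2, 2));
```

&lt;p&gt;In the usage lines, the first file gets in through the multi-group predicate even though none of its tokens is rare, and the second is rejected because it matches a single common token.&lt;/p&gt;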

&lt;p&gt;This was the version that held up across a wide range of real queries. I stopped tweaking it.&lt;/p&gt;

&lt;h2&gt;Tree-sitter and the nested-function problem&lt;/h2&gt;

&lt;p&gt;Symbol extraction is done with Tree-sitter. The first pass walked top-level declarations only — &lt;code&gt;function foo()&lt;/code&gt;, &lt;code&gt;const bar = ...&lt;/code&gt;, &lt;code&gt;class Baz&lt;/code&gt;, &lt;code&gt;export default ...&lt;/code&gt;. This works for most languages.&lt;/p&gt;

&lt;p&gt;It does not work for React components.&lt;/p&gt;

&lt;p&gt;In React, handlers are typically defined &lt;em&gt;inside&lt;/em&gt; the component body:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight tsx"&gt;&lt;code&gt;&lt;span class="k"&gt;export&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;UserProfile&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="nx"&gt;userId&lt;/span&gt; &lt;span class="p"&gt;}:&lt;/span&gt; &lt;span class="nx"&gt;Props&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;handleSubmit&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;handleDelete&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nt"&gt;form&lt;/span&gt; &lt;span class="na"&gt;onSubmit&lt;/span&gt;&lt;span class="p"&gt;=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nx"&gt;handleSubmit&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt;...&lt;span class="p"&gt;&amp;lt;/&lt;/span&gt;&lt;span class="nt"&gt;form&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;handleSubmit&lt;/code&gt; and &lt;code&gt;handleDelete&lt;/code&gt; are real symbols. They're referenced in tests, they show up in stack traces, they're things an agent might reasonably search for. But a top-level walk misses them entirely — Tree-sitter sees &lt;code&gt;UserProfile&lt;/code&gt; as the only declaration in the file.&lt;/p&gt;

&lt;p&gt;The fix is to walk one level deeper into function bodies when the parent is a component-shaped function (PascalCase name, returns JSX). I don't go arbitrarily deep — that opens the door to indexing every closure and helper in every callback chain — just one level into the immediate component body.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;extractSymbols&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;Parser&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;Tree&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;source&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kr"&gt;string&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;root&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;tree&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;rootNode&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;child&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;root&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;namedChildren&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;topLevel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;extractTopLevelSymbol&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;child&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;topLevel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;topLevel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="c1"&gt;// If it's a component-shaped function, walk one level into its body&lt;/span&gt;
      &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;topLevel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;kind&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;function&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nf"&gt;isPascalCase&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;topLevel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;name&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nx"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;push&lt;/span&gt;&lt;span class="p"&gt;(...&lt;/span&gt;&lt;span class="nf"&gt;extractNestedHandlers&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;topLevel&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;body&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
      &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;symbols&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This isn't perfect — it'll over-index a function that happens to be PascalCase but isn't actually a component, and under-index components defined as arrow functions assigned to lowercase variables — but it covered the cases that mattered in practice.&lt;/p&gt;
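&lt;p&gt;The snippet above calls &lt;code&gt;isPascalCase&lt;/code&gt; without showing it. A plausible version is sketched here; the exact rules (letters and digits only, at least one lowercase letter) are an assumption, not the check coldstart ships.&lt;/p&gt;

```typescript
// Hypothetical implementation of the isPascalCase check.
// Assumed rules: starts with an uppercase letter, uses only letters and digits,
// and contains at least one lowercase letter so that all-caps names are excluded.
function isPascalCase(name: string): boolean {
  if (!/^[A-Z][A-Za-z0-9]*$/.test(name)) return false;
  return /[a-z]/.test(name);
}

for (const name of ["UserProfile", "handleSubmit", "MAX_RETRIES", "API", "Row"]) {
  console.log(name, isPascalCase(name));
}
```

&lt;p&gt;A name-only check like this is what produces the over-indexing described above: a PascalCase factory or class-like helper passes it just as a component does.&lt;/p&gt;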

&lt;h2&gt;What's still unsolved: cross-file call resolution&lt;/h2&gt;

&lt;p&gt;I want to be honest about this one. Tracing impact (what depends on this function?) requires resolving function calls across files. Named calls work — if &lt;code&gt;foo&lt;/code&gt; is defined in &lt;code&gt;A.ts&lt;/code&gt; and &lt;code&gt;B.ts&lt;/code&gt; has &lt;code&gt;foo()&lt;/code&gt; somewhere, I can resolve that. Member expression calls do not work:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// A.ts&lt;/span&gt;
&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;service&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;./services&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;service&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;handleUpdate&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;        &lt;span class="c1"&gt;// unresolved — I can't tell what handleUpdate is&lt;/span&gt;

&lt;span class="c1"&gt;// B.ts&lt;/span&gt;
&lt;span class="kd"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;Service&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nf"&gt;handleUpdate&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="cm"&gt;/* ... */&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Resolving member expressions requires either (a) full type inference, which is most of the way to a language server and dramatically more work, or (b) a heuristic match on the method name plus the import graph, which is fast but has false positives.&lt;/p&gt;

&lt;p&gt;For now I'm taking the false-positive hit on heuristic matching and exposing it honestly in the response — &lt;code&gt;trace-impact&lt;/code&gt; returns named matches separately from heuristic matches so the agent knows which to trust. That's not a fix, it's an accommodation. Real type-aware resolution is on the list but not next.&lt;/p&gt;
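&lt;p&gt;To make the named-versus-heuristic split concrete, here is a sketch of what such a response could look like. The &lt;code&gt;CallSite&lt;/code&gt; and &lt;code&gt;ImpactResult&lt;/code&gt; shapes and the &lt;code&gt;traceImpact&lt;/code&gt; function are hypothetical; the real &lt;code&gt;trace-impact&lt;/code&gt; output may be structured differently.&lt;/p&gt;

```typescript
// Hypothetical shapes; not the actual trace-impact response format.
interface CallSite {
  file: string;
  callee: string; // e.g. "foo" or "handleUpdate"
  isMemberCall: boolean; // true for service.handleUpdate(), false for foo()
}

interface ImpactResult {
  named: CallSite[]; // plain calls resolved by name
  heuristic: CallSite[]; // member calls matched on method name plus import graph
}

// importers: files assumed to import the module that defines the symbol.
function traceImpact(symbol: string, callSites: CallSite[], importers: string[]): ImpactResult {
  const result: ImpactResult = { named: [], heuristic: [] };
  for (const site of callSites) {
    if (site.callee !== symbol) continue;
    if (!site.isMemberCall) {
      result.named.push(site);
    } else if (importers.indexOf(site.file) !== -1) {
      // Plausible but unverified: the receiver's type is never checked.
      result.heuristic.push(site);
    }
  }
  return result;
}

const sites: CallSite[] = [
  { file: "A.ts", callee: "handleUpdate", isMemberCall: true },
  { file: "C.ts", callee: "handleUpdate", isMemberCall: true },
  { file: "D.ts", callee: "foo", isMemberCall: false },
];
console.log(traceImpact("handleUpdate", sites, ["A.ts"]));
```

&lt;p&gt;Keeping the two lists separate lets the caller decide how much weight to give the unverified matches instead of receiving one blended list.&lt;/p&gt;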

&lt;h2&gt;The live index&lt;/h2&gt;

&lt;p&gt;Agents work in active codebases. Files change while the agent is mid-task — it edits a file, runs a test, the index is now stale. A stale index means wrong results, which means the agent goes hunting for files that no longer exist or doesn't find files it just wrote.&lt;/p&gt;

&lt;p&gt;The watcher uses &lt;code&gt;fs.watch&lt;/code&gt; with a 400ms debounce:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;watcher&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;fs&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;watch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rootDir&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;recursive&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;event&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="nx"&gt;pendingChanges&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;filename&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nf"&gt;clearTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rebuildTimer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;rebuildTimer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setTimeout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;rebuildIndex&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Below 30 changed files, it does an incremental patch — re-parse only the changed files, update the index in place. Above 30, it does a full rebuild. The threshold is a guess that's been fine in practice; below 30 the patch is faster, above 30 the bookkeeping cost outweighs the savings.&lt;/p&gt;
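&lt;p&gt;The patch-or-rebuild decision can be sketched as a small planning function. The names here are hypothetical, and treating exactly 30 changed files as still incremental is an assumption, since the text only says "below" and "above".&lt;/p&gt;

```typescript
// Threshold taken from the text; the behavior at exactly 30 is an assumption.
const INCREMENTAL_LIMIT = 30;

type RebuildPlan =
  | { mode: "incremental"; files: string[] }
  | { mode: "full" };

// Decides how to bring the index up to date for a batch of changed paths.
function planRebuild(pendingChanges: string[]): RebuildPlan {
  // The watcher can report the same file several times within one debounce window.
  const unique = Array.from(new Set(pendingChanges));
  if (unique.length > INCREMENTAL_LIMIT) {
    return { mode: "full" };
  }
  return { mode: "incremental", files: unique };
}

console.log(planRebuild(["src/a.ts", "src/b.ts", "src/a.ts"]));
```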

&lt;p&gt;The rebuild swaps the index atomically — agents querying during a rebuild see the old index until the new one is ready, then see the new one. They never see a half-built index.&lt;/p&gt;
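&lt;p&gt;A minimal sketch of the swap, assuming the index lives behind a single module-level reference. The &lt;code&gt;CodeIndex&lt;/code&gt; shape and function names are illustrative, not coldstart's internals.&lt;/p&gt;

```typescript
// Hypothetical index holder; names and fields are illustrative.
interface CodeIndex {
  version: number;
  files: string[];
}

let currentIndex: CodeIndex = { version: 1, files: ["a.ts"] };

// Queries go through this accessor and always receive a complete index.
function getIndex(): CodeIndex {
  return currentIndex;
}

// Builds the replacement off to the side, then publishes it with one assignment.
function rebuild(files: string[]): void {
  const next: CodeIndex = { version: currentIndex.version + 1, files: files.slice() };
  // Any slow parsing work would populate `next` here, before publication.
  currentIndex = next; // single reference swap: readers see the old or the new index
}

const snapshot = getIndex(); // a query that started before the rebuild
rebuild(["a.ts", "b.ts"]);
console.log(snapshot.version, getIndex().version);
```

&lt;p&gt;The old object is never mutated, so a query that grabbed &lt;code&gt;snapshot&lt;/code&gt; before the rebuild keeps working against a consistent view.&lt;/p&gt;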

&lt;h2&gt;The four tools&lt;/h2&gt;

&lt;p&gt;That index supports four MCP tools:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;get-overview&lt;/code&gt; — given a query, return the most relevant files. The thing this whole post has been about.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;get-structure&lt;/code&gt; — given a file or folder, return its symbol structure. No semantic overlap with embeddings; it's just "what's in here."&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;trace-deps&lt;/code&gt; — what does this file import and depend on?&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;trace-impact&lt;/code&gt; — given a symbol, what depends on it? (with the cross-file-call caveat above)
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The first one is where most of the design effort went, but the other three are where coldstart genuinely can't be replaced by semantic search: embeddings don't answer "who imports this file."&lt;/p&gt;

&lt;h2&gt;What I'd tell anyone building something similar&lt;/h2&gt;

&lt;p&gt;Three things that took me longer to internalize than they should have.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Soft failures beat hard failures for agent tools.&lt;/strong&gt; Returning 50 results the agent can narrow is recoverable. Returning zero with no recovery signal is a hard stop. Design for the agent to try again, not to be right the first time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Decorative scores actively mislead.&lt;/strong&gt; I removed confidence scores because they didn't differentiate results. If a number doesn't carry information, deleting it makes the tool more honest, not less professional.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Subtraction is the main design move.&lt;/strong&gt; Almost every iteration ended with me removing a feature, not adding one. The final tool is much smaller than the first version.&lt;/p&gt;




&lt;p&gt;If you got this far and you're building agent tools, I'd genuinely like to compare notes. Issues and discussions on GitHub are the best way to reach me.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; coldstart-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://github.com/AkashGoenka/coldstart" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt; · &lt;a href="https://www.npmjs.com/package/coldstart-mcp" rel="noopener noreferrer"&gt;npm&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>webdev</category>
      <category>programming</category>
    </item>
    <item>
      <title>Your agent is spending more time finding code than understanding it</title>
      <dc:creator>Akash Goenka</dc:creator>
      <pubDate>Mon, 27 Apr 2026 04:33:40 +0000</pubDate>
      <link>https://dev.to/akashgoenka/your-agent-is-spending-more-time-finding-code-than-understanding-it-38in</link>
      <guid>https://dev.to/akashgoenka/your-agent-is-spending-more-time-finding-code-than-understanding-it-38in</guid>
      <description>&lt;p&gt;I kept watching agents do the same thing. Given a real task in a real codebase, they'd spend the first half of the session navigating — grepping, opening files, reading imports, going back, grepping again — and only after that get to the part they're actually good at: reasoning, planning, writing code.&lt;/p&gt;

&lt;p&gt;The context window isn't infinite. Every token spent locating a file is a token not spent thinking about it.&lt;/p&gt;

&lt;p&gt;So I built &lt;a href="https://www.npmjs.com/package/coldstart-mcp" rel="noopener noreferrer"&gt;coldstart&lt;/a&gt;. It's an MCP server that gives agents a static index over a codebase — file paths, exported symbol names, path segments — built once with Tree-sitter, queried instantly. Four tools, no embeddings, no vector database, no model. The whole point is to find the right file fast and get out of the agent's way.&lt;/p&gt;

&lt;p&gt;This is a post about why it ended up that small, what I deliberately didn't build, and what I learned from agents using it on real codebases.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem isn't intelligence, it's navigation
&lt;/h2&gt;

&lt;p&gt;Agents are most valuable when they're reasoning about code, not searching for it. But navigation is where they quietly burn context. A typical session in a large codebase looks something like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Grep for a likely keyword. Get 200 lines of matches.&lt;/li&gt;
&lt;li&gt;Open three or four files to figure out which one is actually relevant.&lt;/li&gt;
&lt;li&gt;Realize the right file is named something different and grep again.&lt;/li&gt;
&lt;li&gt;Trace imports manually.&lt;/li&gt;
&lt;li&gt;Finally start the actual task, with significantly less context window left.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Larger context windows don't fix this; they just delay it. An agent that wastes tokens navigating has less left for the work that matters. The cost isn't theoretical: it shows up as worse answers near the end of long sessions, more re-reading of files the agent already saw, and tasks that get abandoned mid-flight because there's no room left to finish.&lt;/p&gt;

&lt;p&gt;The framing I landed on: &lt;strong&gt;the context window is finite regardless of price&lt;/strong&gt;. Cheaper tokens don't buy more room, so an agent that navigates efficiently keeps more of its window for reasoning.&lt;/p&gt;

&lt;h2&gt;
  
  
  What coldstart actually is
&lt;/h2&gt;

&lt;p&gt;Straight from the README:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;coldstart is a lightweight navigation layer for AI agents. It answers one question: which files are relevant to this task? No embeddings, no graph, no model to run or maintain. Just a fast, static index over your codebase — file paths, symbol names, exports — built once, queried instantly. Agents are already good at reading code, tracing logic, and reasoning about structure. What they don't need is another system trying to do that for them. coldstart stays out of the way: find the file, hand it off, done. 4 tools. Minimal context overhead. No infrastructure to babysit.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The mental model is "smarter grep." Grep matches strings; coldstart matches strings &lt;em&gt;and&lt;/em&gt; knows which ones are exported symbols, which are path segments, and which are rare enough to be meaningful signals. Once it points the agent at the right file, the agent does the rest. That's the whole tool.&lt;/p&gt;
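
&lt;p&gt;As a rough illustration of that idea, and not the ranking coldstart actually ships, the sketch below scores files by query tokens, counts a hit in an exported symbol more than a hit in a path segment, and weights rare tokens higher with an IDF term. The sample &lt;code&gt;FILES&lt;/code&gt; data, the function names, and the 2.0 and 1.0 weights are all assumptions.&lt;/p&gt;

```python
import math
import re

# Hypothetical index entries: path -> exported symbol names.
FILES = {
    "src/auth/session.ts": ["createSession", "refreshSession"],
    "src/auth/login.ts": ["loginUser"],
    "src/billing/invoice.ts": ["createInvoice"],
    "src/utils/index.ts": ["formatDate"],
}


def tokens(text):
    """Split paths and camelCase names into lowercase tokens."""
    spaced = re.sub(r"([a-z0-9])([A-Z])", r"\1 \2", text)
    return [t.lower() for t in re.findall(r"[A-Za-z0-9]+", spaced)]


def build(files):
    """Per file, keep path tokens and export tokens as separate sets."""
    docs = {}
    for path, exports in files.items():
        docs[path] = {
            "path": set(tokens(path)),
            "exports": set(t for name in exports for t in tokens(name)),
        }
    return docs


def idf(token, docs):
    """Rare tokens get a higher weight than ones that appear everywhere."""
    df = sum(1 for d in docs.values() if token in d["path"] or token in d["exports"])
    return math.log((1 + len(docs)) / (1 + df)) + 1.0


def rank(query, docs):
    scored = []
    for path, doc in docs.items():
        score = 0.0
        for t in tokens(query):
            if t in doc["exports"]:
                score += 2.0 * idf(t, docs)  # exported symbol: stronger signal
            elif t in doc["path"]:
                score += 1.0 * idf(t, docs)  # path segment: weaker signal
        if score > 0:
            scored.append((score, path))
    # Highest score first; ties broken by path so the order is stable.
    return [path for score, path in sorted(scored, key=lambda s: (-s[0], s[1]))]


DOCS = build(FILES)
print(rank("create invoice", DOCS))
```

&lt;p&gt;With this sample data, &lt;code&gt;src/billing/invoice.ts&lt;/code&gt; ranks above &lt;code&gt;src/auth/session.ts&lt;/code&gt; for "create invoice": both export a symbol containing "create", but only one file contains the rarer token "invoice".&lt;/p&gt;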

&lt;h2&gt;
  
  
  What I deliberately didn't build
&lt;/h2&gt;

&lt;p&gt;Most of the design effort went into things I chose &lt;em&gt;not&lt;/em&gt; to add. Each one was tempting and each one would have made the tool worse for what agents actually need.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No recommendations or next-step suggestions.&lt;/strong&gt; That's the agent's job. A navigation tool that tries to also be a reasoning tool ends up doing both poorly, and worse, anchors the agent on whatever heuristic the tool used. I'd rather hand back a clean list of files and let the agent decide.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No semantic search or embeddings.&lt;/strong&gt; Embeddings add an entire failure mode — model versioning, index rebuilds when the embedding model changes, cost, latency, dependency on a hosted service or a shipped model file — without proportional gain for &lt;em&gt;navigation&lt;/em&gt;. For finding files by symbol name, lexical retrieval is faster, more predictable, and easier to debug. (For conceptual queries — "where's the retry logic" — embeddings genuinely help. coldstart isn't trying to be that tool.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;No path prefix filters or file-type filters.&lt;/strong&gt; That's grep's job. coldstart's value is what grep can't do: finding files by exported symbol names and structural signals. If you need a glob, you already have one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Ranking is fixed.&lt;/strong&gt; I went through a lot of iterations on ranking — penalties, exact-match filters, length thresholds, IDF weighting — and what shipped is what held up across real queries on real codebases. I'm done tweaking it. If I keep tuning, I'm just overfitting to whatever I tested last week.&lt;/p&gt;

&lt;p&gt;The pattern across all four: every feature I didn't add was one that would have made coldstart bigger without making it more useful for the specific job of "find the file fast and disappear."&lt;/p&gt;

&lt;h2&gt;
  
  
  What I learned from real agent runs
&lt;/h2&gt;

&lt;p&gt;Two things stood out once I watched agents actually use it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Soft failures are recoverable; hard failures aren't.&lt;/strong&gt; When coldstart returns too many results, the agent narrows the query and tries again — that's a soft failure, and agents handle it fine. What kills an agent is a tool that returns zero results with no signal, or a confident wrong answer. coldstart is designed so its failure mode is "too much" rather than "nothing" — agents can always work with too much.&lt;/p&gt;
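
&lt;p&gt;One way to get that behavior, sketched under assumptions: try a strict match first, fall back to partial matches before giving up, and always attach a hint the agent can act on. The &lt;code&gt;search&lt;/code&gt; function, its result shape, and the hint wording are hypothetical, and the sketch matches on paths only for brevity.&lt;/p&gt;

```python
def search(query, files, limit=50):
    """Return matching paths plus a hint, never a bare empty result."""
    tokens = [t for t in query.lower().split() if t]
    if not tokens:
        return {"results": [], "hint": "empty query: pass at least one term"}

    def hits(path):
        return sum(1 for t in tokens if t in path.lower())

    # Strict pass: every token must appear in the path.
    strict = [p for p in files if hits(p) == len(tokens)]
    if strict:
        pool, hint = strict, None
    else:
        # Relaxed pass: any token counts, best coverage first.
        pool = sorted((p for p in files if hits(p) > 0), key=hits, reverse=True)
        hint = "no file matched every term; showing partial matches"
    if len(pool) > limit:
        hint = f"{len(pool)} matches, showing {limit}; add a term to narrow"
    if not pool:
        hint = "no matches; try a shorter or more general term"
    return {"results": pool[:limit], "hint": hint}


FILES = ["src/auth/login.ts", "src/auth/session.ts", "src/billing/invoice.ts"]
print(search("auth session", FILES))  # strict match, no hint needed
print(search("auth refund", FILES))   # falls back to partial matches
print(search("src", FILES, limit=2))  # too many: truncated, with a hint
```

&lt;p&gt;Even the truly empty case comes back with a next step, so the agent always has something to retry with.&lt;/p&gt;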

&lt;p&gt;&lt;strong&gt;Confidence scores are decoration unless they mean something.&lt;/strong&gt; I tried adding confidence scores early on. They were meaningless — every result came back at roughly the same number — and the agent would over-anchor on them. I removed them. If a score doesn't differentiate results, it's just noise the agent has to interpret.&lt;/p&gt;

&lt;p&gt;There's a more detailed benchmark coming, comparing coldstart to a graph-based codebase analysis approach on real queries. I want to do that properly with numbers rather than vibes, so it's getting its own post.&lt;/p&gt;

&lt;h2&gt;
  
  
  The honest limitations
&lt;/h2&gt;

&lt;p&gt;A few things coldstart doesn't do well, in case you're evaluating it:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Cross-file call resolution is partial.&lt;/strong&gt; Named function calls across files are resolved. Member-expression calls (&lt;code&gt;this.foo()&lt;/code&gt;, &lt;code&gt;obj.foo()&lt;/code&gt;) are not: resolving them means knowing the receiver's type, and Tree-sitter only provides syntax. I haven't fully solved this.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;It's lexical, not semantic.&lt;/strong&gt; If you ask "where do we handle authentication" and no file has the word "auth" in its path or exports, coldstart won't find it. Use it for symbol-shaped queries, not concept-shaped ones.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ten languages parsed for symbols.&lt;/strong&gt; TypeScript, JavaScript, Java, Ruby, Python, Go, Rust, C#, PHP, Kotlin. Swift, Dart, and C++ files are walked and indexed by path, so agents can still find them, but symbols, imports, and call edges aren't extracted yet, so &lt;code&gt;trace-deps&lt;/code&gt; and &lt;code&gt;trace-impact&lt;/code&gt; stop at those boundaries. Adding full parsing is a Tree-sitter grammar wire-up per language.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; coldstart-mcp
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;GitHub: &lt;a href="https://github.com/AkashGoenka/coldstart" rel="noopener noreferrer"&gt;github.com/AkashGoenka/coldstart&lt;/a&gt;&lt;br&gt;
npm: &lt;a href="https://www.npmjs.com/package/coldstart-mcp" rel="noopener noreferrer"&gt;npmjs.com/package/coldstart-mcp&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you want the long version of how the design got here — the failed ranking iterations, the AST decisions, what broke and what held up — there's a &lt;a href="https://dev.to/akashgoenka/building-coldstart-what-broke-what-held-up-447l"&gt;deep-dive post&lt;/a&gt; covering that.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>mcp</category>
      <category>opensource</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
