<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Shreyash</title>
    <description>The latest articles on DEV Community by Shreyash (@corpsekiller).</description>
    <link>https://dev.to/corpsekiller</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F394175%2F4424b914-d218-46ef-841a-61388734bde0.png</url>
      <title>DEV Community: Shreyash</title>
      <link>https://dev.to/corpsekiller</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/corpsekiller"/>
    <language>en</language>
    <item>
      <title>AI Coding Agents Search Like It's 2009. Provenant Cuts Tokens by 65x.</title>
      <dc:creator>Shreyash</dc:creator>
      <pubDate>Thu, 28 May 2026 14:40:02 +0000</pubDate>
      <link>https://dev.to/corpsekiller/ai-coding-agents-search-like-its-2009-provenant-cuts-tokens-by-65x-3jg9</link>
      <guid>https://dev.to/corpsekiller/ai-coding-agents-search-like-its-2009-provenant-cuts-tokens-by-65x-3jg9</guid>
      <description>&lt;p&gt;Here's what happens every time you ask an AI coding agent a question:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;It greps your codebase&lt;/li&gt;
&lt;li&gt;It returns 15 files&lt;/li&gt;
&lt;li&gt;It stuffs ~69,000 tokens of raw source code into your context window&lt;/li&gt;
&lt;li&gt;It answers your question using maybe 3 of those files&lt;/li&gt;
&lt;li&gt;You pay for all 69,000 tokens anyway&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is BM25 keyword search on raw source code. It's the same algorithm that powered web search in 2009. And it's still the shape of most coding-agent retrieval systems: keyword search, grep, file search, context stuffing.&lt;/p&gt;

&lt;p&gt;I spent the last few months building something better. Here's what I found.&lt;/p&gt;




&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3pc1ivoiiu81vco84c3c.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3pc1ivoiiu81vco84c3c.png" alt="provenant ad" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The Vocabulary Gap Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;When you ask &lt;em&gt;"how does Flask handle URL routing?"&lt;/em&gt;, you're writing in English. The answer lives in &lt;code&gt;scaffold.py&lt;/code&gt;, &lt;code&gt;app.py&lt;/code&gt;, and &lt;code&gt;wrappers.py&lt;/code&gt; — files full of Python syntax, decorator patterns, and Werkzeug internals.&lt;/p&gt;

&lt;p&gt;BM25 tries to match your words against those files. It mostly fails.&lt;/p&gt;

&lt;p&gt;The word "routing" appears 4 times in Flask's source. "URL" appears 31 times — mostly in docstrings and variable names scattered across 70+ files. BM25 retrieves 15 of them and hopes for the best.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;The agent doesn't just have a retrieval problem. It has a vocabulary problem.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Natural language queries describe &lt;em&gt;behavior&lt;/em&gt;. Source code implements &lt;em&gt;syntax&lt;/em&gt;. These are different vocabularies, and no amount of BM25 tuning bridges that gap.&lt;/p&gt;




&lt;h2&gt;
  
  
  What If You Searched a Wiki Instead?
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Generate a human-readable wiki page for every file and module, then search the wiki.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A wiki page for &lt;code&gt;flask/sansio/scaffold.py&lt;/code&gt; reads like this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Scaffold is the shared base class for Flask and Blueprint. &lt;code&gt;@route()&lt;/code&gt; calls &lt;code&gt;add_url_rule()&lt;/code&gt;, which creates a Werkzeug Rule and inserts it into &lt;code&gt;url_map&lt;/code&gt;. View callables are stored in &lt;code&gt;view_functions&lt;/code&gt; keyed by endpoint name.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Search that for &lt;em&gt;"how does Flask handle URL routing?"&lt;/em&gt; — the query and the document speak the same language. No vocabulary gap.&lt;/p&gt;
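
&lt;p&gt;To make the gap concrete, here is a minimal sketch of Okapi BM25 in plain Python, scoring the same question against a code-like string and against the wiki text quoted above. The &lt;code&gt;raw_source&lt;/code&gt; string is an illustrative fragment I made up for the comparison, not a verbatim quote from Flask, and &lt;code&gt;tokenize&lt;/code&gt; and &lt;code&gt;bm25_scores&lt;/code&gt; are hypothetical helper names rather than Provenant's API.&lt;/p&gt;

```python
import math
import re
from collections import Counter


def tokenize(text):
    """Lowercase and split on anything that is not a letter or digit."""
    return re.findall(r"[a-z0-9]+", text.lower())


def bm25_scores(query, docs, k1=1.5, b=0.75):
    """Score each document in `docs` (name -> text) against `query` with Okapi BM25."""
    tokens = {name: tokenize(text) for name, text in docs.items()}
    n_docs = len(tokens)
    avg_len = sum(len(t) for t in tokens.values()) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for t in tokens.values() for term in set(t))
    scores = {}
    for name, toks in tokens.items():
        counts = Counter(toks)
        score = 0.0
        for term in set(tokenize(query)):
            tf = counts[term]
            if tf == 0:
                continue
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            norm = tf + k1 * (1 - b + b * len(toks) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores[name] = score
    return scores


docs = {
    # Illustrative code-like text (an assumption, not real Flask source).
    "raw_source": "def decorator(f): self.add_url_rule(rule, endpoint, f, **options); return f",
    # Wiki text as quoted in the article.
    "wiki_page": (
        "Scaffold is the shared base class for Flask and Blueprint. "
        "@route() calls add_url_rule(), which creates a Werkzeug Rule and "
        "inserts it into url_map. View callables are stored in view_functions "
        "keyed by endpoint name."
    ),
}

scores = bm25_scores("how does Flask handle URL routing?", docs)
for name, score in sorted(scores.items(), key=lambda item: -item[1]):
    print(name, round(score, 3))
```

&lt;p&gt;With these two toy documents the wiki text outranks the code fragment, because it shares more of the question's words. Two documents are far too few to prove anything; the benchmark below is the real evidence.&lt;/p&gt;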

&lt;p&gt;That's &lt;strong&gt;Provenant&lt;/strong&gt;. Index once, search a wiki forever.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Numbers (Benchmarked on SWE-bench Verified)
&lt;/h2&gt;

&lt;p&gt;I ran this against &lt;strong&gt;SWE-bench Verified&lt;/strong&gt; — 500 real GitHub issues across 12 major Python repos. The metric is &lt;strong&gt;Coverage@5&lt;/strong&gt;: does the correct file appear in the top 5 retrieved results?&lt;/p&gt;
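
&lt;p&gt;For reference, the metric can be sketched in a few lines. This is my reading of Coverage@5, assuming a task counts as a hit when at least one of its gold files shows up in the top k; the function name and the data layout are hypothetical, not the benchmark harness itself.&lt;/p&gt;

```python
def coverage_at_k(retrieved, gold, k=5):
    """Fraction of tasks whose top-k retrieved files include a gold file.

    retrieved: task id -> ranked list of file paths (best first)
    gold:      task id -> set of files the reference patch touches
    """
    if not gold:
        return 0.0
    hits = 0
    for task_id, gold_files in gold.items():
        top_k = retrieved.get(task_id, [])[:k]
        if set(top_k).intersection(gold_files):
            hits += 1
    return hits / len(gold)


# Toy data: the first task is a hit, the second is a miss.
gold = {"task-1": {"src/app.py"}, "task-2": {"src/models.py"}}
retrieved = {
    "task-1": ["src/utils.py", "src/app.py"],
    "task-2": ["src/views.py"],
}
print(coverage_at_k(retrieved, gold))
```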

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Method&lt;/th&gt;
&lt;th&gt;Coverage@5&lt;/th&gt;
&lt;th&gt;Tokens/query&lt;/th&gt;
&lt;th&gt;Delta&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Raw BM25 (baseline)&lt;/td&gt;
&lt;td&gt;~40%&lt;/td&gt;
&lt;td&gt;~65,000&lt;/td&gt;
&lt;td&gt;—&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Provenant (wiki + BM25)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;63.8%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~1,030&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;+24pp&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Provenant + HyDE&lt;/td&gt;
&lt;td&gt;66.2%&lt;/td&gt;
&lt;td&gt;~1,030&lt;/td&gt;
&lt;td&gt;+26pp&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Roughly +24 percentage points.&lt;/strong&gt; From ~40% to 63.8%. On 500 tasks. Across 12 repos.&lt;/p&gt;

&lt;p&gt;And the token numbers aren't rounding errors:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Repo&lt;/th&gt;
&lt;th&gt;Naive tokens&lt;/th&gt;
&lt;th&gt;Provenant tokens&lt;/th&gt;
&lt;th&gt;Reduction&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Flask (30 queries)&lt;/td&gt;
&lt;td&gt;69,044&lt;/td&gt;
&lt;td&gt;1,070&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;64.5×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Django (20 queries)&lt;/td&gt;
&lt;td&gt;59,634&lt;/td&gt;
&lt;td&gt;994&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;60.0×&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
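
&lt;p&gt;The reduction column is just naive tokens divided by Provenant tokens. A quick check of the arithmetic, using only the numbers from the table (the variable and function names are mine):&lt;/p&gt;

```python
# Token counts per query, copied from the table above: (naive, Provenant).
measurements = {
    "Flask": (69_044, 1_070),
    "Django": (59_634, 994),
}


def reduction_factor(naive_tokens, provenant_tokens):
    """How many times fewer tokens the compact context uses."""
    return naive_tokens / provenant_tokens


factors = {
    repo: round(reduction_factor(naive, compact), 1)
    for repo, (naive, compact) in measurements.items()
}
for repo, factor in factors.items():
    print(repo, factor)
```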

&lt;p&gt;Answer quality delta: &lt;strong&gt;−0.15 on a 5-point blind-judge scale.&lt;/strong&gt; In this sample, that was not a meaningful drop. The model answers nearly as well with 1k tokens as it does with 69k; it wasn't using the other 68k anyway.&lt;/p&gt;




&lt;h2&gt;
  
  
  How It Works (The 60-Second Version)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbj1ohca2qz91rf2m9g1s.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbj1ohca2qz91rf2m9g1s.gif" alt="provenant ad" width="600" height="338"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Index your repo once.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;provenant init /path/to/your/repo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Provenant parses every file with tree-sitter, generates a wiki page per module via LLM, and stores everything in SQLite/FTS5 + LanceDB. For the benchmark, that came to 6,122 pages across 12 repos. Done in minutes.&lt;/p&gt;
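
&lt;p&gt;To give a feel for the storage half of that step, here is a rough sketch of wiki pages in an SQLite FTS5 table, queried with the built-in &lt;code&gt;bm25()&lt;/code&gt; ranking. The table name, schema, and helper functions are assumptions for illustration, not Provenant's actual layout, and the tree-sitter parsing, LLM generation, and LanceDB parts are left out. It needs a Python build whose &lt;code&gt;sqlite3&lt;/code&gt; module includes FTS5, which most do.&lt;/p&gt;

```python
import re
import sqlite3


def build_index(pages):
    """Create an in-memory FTS5 index from a mapping of path -> wiki text."""
    conn = sqlite3.connect(":memory:")
    # `path` is stored but not searched; only the wiki body is indexed.
    conn.execute("CREATE VIRTUAL TABLE wiki USING fts5(path UNINDEXED, body)")
    conn.executemany("INSERT INTO wiki (path, body) VALUES (?, ?)", pages.items())
    return conn


def search(conn, question, k=3):
    """Return up to k page paths, best BM25 match first."""
    terms = re.findall(r"[a-z0-9]+", question.lower())
    if not terms:
        return []
    # Quote each term so punctuation and FTS5 keywords cannot break the query.
    match = " OR ".join('"%s"' % term for term in terms)
    rows = conn.execute(
        "SELECT path FROM wiki WHERE wiki MATCH ? ORDER BY bm25(wiki) LIMIT ?",
        (match, k),
    ).fetchall()
    return [row[0] for row in rows]


pages = {
    "flask/sansio/scaffold.py": (
        "Scaffold is the shared base class for Flask and Blueprint. "
        "@route() calls add_url_rule(), which creates a Werkzeug Rule and "
        "inserts it into url_map."
    ),
    # Hypothetical second page so the index has something to rank against.
    "example/serializer.py": "Serializes response data to JSON through a pluggable provider class.",
}
conn = build_index(pages)
results = search(conn, "how does Flask handle URL routing?")
print(results)
```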

&lt;p&gt;&lt;strong&gt;Step 2: Start the MCP server.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;provenant serve &lt;span class="nt"&gt;--repo&lt;/span&gt; /path/to/your/repo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;That's it. Provenant is now a local MCP server exposing tools your agent can call natively.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Just use Claude. No special commands.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Add it to your &lt;code&gt;claude_desktop_config.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"provenant"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"provenant"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"serve"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"--repo"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/path/to/your/repo"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;p&gt;Now when you ask Claude &lt;em&gt;"how does authentication work?"&lt;/em&gt; — it doesn't grep your codebase. It calls &lt;code&gt;provenant_ask&lt;/code&gt;, gets 3 wiki pages (~1k tokens), and answers. You never change how you work. The retrieval layer is just better.&lt;br&gt;
&lt;/p&gt;
&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You ask Claude a question
         ↓
Claude calls provenant_ask (MCP tool)
         ↓
Provenant: BM25 over wiki pages → top-k results
         ↓
Claude synthesizes answer from ~1,030 tokens
         ↓
Attribution confidence logged → weak pages auto-repaired
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What Claude Actually Said
&lt;/h2&gt;

&lt;p&gt;I pointed Claude at a fresh repo, a Java Android music player it had never seen, and asked &lt;em&gt;"How does this app play music?"&lt;/em&gt; Here's the actual response after calling &lt;code&gt;provenant_ask&lt;/code&gt;:&lt;/p&gt;





&lt;p&gt;&lt;em&gt;Screenshot: Claude's unedited response after Provenant retrieved 3 wiki pages (~1k tokens). Discovery phase: ~30 seconds.&lt;/em&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;"Provenant compressed the discovery phase from ~5–10 minutes of grepping/reading to ~30 seconds. It's like having an experienced teammate say 'here's the 3 files you need and what they do' before you dive in."&lt;/strong&gt;&lt;br&gt;
— Claude, unprompted&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's on a Java codebase. Provenant indexes Python — but the wiki pages are plain English, and Claude reads English just fine.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Part That Surprised Me: Attribution Confidence
&lt;/h2&gt;

&lt;p&gt;Nobody measures when a retrieval index is wrong. BM25 returns 5 results and acts confident. The model uses 2. The other 3 were noise. The index degrades silently as your codebase changes.&lt;/p&gt;

&lt;p&gt;I built a metric for this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;attribution confidence = pages actually cited / pages retrieved
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Zero extra LLM calls. Derived from the citation structure already in the answer. It correlates with answer quality (r = 0.415 against a blind LLM judge) — high-confidence retrievals score 5.0/5 on average; low-confidence score 4.5.&lt;/p&gt;
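
&lt;p&gt;A minimal sketch of the ratio, assuming the answer cites pages with bracketed markers such as &lt;code&gt;[auth/session.md]&lt;/code&gt;. That citation format and the function name are assumptions for the example; the real citation structure may look different.&lt;/p&gt;

```python
import re


def attribution_confidence(answer, retrieved_pages):
    """Share of retrieved pages that the answer cites at least once."""
    if not retrieved_pages:
        return 0.0
    # Assumed citation format: page names wrapped in square brackets.
    cited = set(re.findall(r"\[([^\[\]]+)\]", answer))
    used = [page for page in retrieved_pages if page in cited]
    return len(used) / len(retrieved_pages)


retrieved_pages = ["auth/session.md", "auth/tokens.md", "db/models.md", "cli.md", "utils.md"]
answer = (
    "Sessions are created on login [auth/session.md] and validated "
    "against signed tokens [auth/tokens.md]."
)
confidence = attribution_confidence(answer, retrieved_pages)
print(confidence)
```

&lt;p&gt;Here five pages were retrieved and two were cited, so the confidence is 0.4. The other three are the noise described above.&lt;/p&gt;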

&lt;p&gt;When a page's confidence drops below 0.35, Provenant queues a background repair:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Fires silently after low-confidence answers
&lt;/span&gt;&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create_task&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;_background_repair&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uncited_pages&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
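
&lt;p&gt;For context, a self-contained sketch of how that fire-and-forget call could sit inside an async handler. Only the 0.35 threshold and the &lt;code&gt;_background_repair&lt;/code&gt; name come from this post; the function bodies, the &lt;code&gt;schedule_repair&lt;/code&gt; helper, and the page names are placeholders I made up, and the "repair" here only records which pages it would regenerate.&lt;/p&gt;

```python
import asyncio

REPAIR_THRESHOLD = 0.35  # threshold quoted in the article


async def _background_repair(pages, repaired):
    """Placeholder repair: a real version would regenerate each wiki page."""
    for page in pages:
        await asyncio.sleep(0)  # yield control, as an LLM call would
        repaired.append(page)


def schedule_repair(page_confidence, repaired):
    """Start a repair task for pages below the threshold; return it or None."""
    weak = [page for page, conf in page_confidence.items() if REPAIR_THRESHOLD > conf]
    if not weak:
        return None
    return asyncio.create_task(_background_repair(weak, repaired))


async def main():
    repaired = []
    task = schedule_repair({"auth.md": 0.2, "routing.md": 0.8}, repaired)
    if task is not None:
        # Awaited here so the script can exit cleanly; a long-running
        # server would hold a reference to the task instead.
        await task
    return repaired


result = asyncio.run(main())
print(result)
```

&lt;p&gt;One detail worth copying: keep a reference to the task returned by &lt;code&gt;asyncio.create_task&lt;/code&gt;. The event loop only holds a weak reference, so an unreferenced task can be garbage-collected before it finishes.&lt;/p&gt;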



&lt;p&gt;&lt;strong&gt;75% of low-confidence queries improved after one repair cycle.&lt;/strong&gt; Cost: ~$0.02. Touches only 0.7% of pages.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The index improves the more you use it.&lt;/strong&gt; Without you doing anything.&lt;/p&gt;




&lt;h2&gt;
  
  
  Per-Repo Breakdown
&lt;/h2&gt;

&lt;p&gt;Some repos benefit more than others. The pattern: &lt;strong&gt;small, well-documented repos see the biggest gains.&lt;/strong&gt; Large monoliths still improve, just from a harder baseline.&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Repo&lt;/th&gt;
&lt;th&gt;Coverage@5&lt;/th&gt;
&lt;th&gt;Improvement&lt;/th&gt;
&lt;th&gt;Wiki pages&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;requests&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;78%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;+38pp&lt;/td&gt;
&lt;td&gt;58&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;pytest&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;72%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;+32pp&lt;/td&gt;
&lt;td&gt;186&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;seaborn&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;71%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;+31pp&lt;/td&gt;
&lt;td&gt;94&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;flask&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;69%&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;+29pp&lt;/td&gt;
&lt;td&gt;74&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;xarray&lt;/td&gt;
&lt;td&gt;66%&lt;/td&gt;
&lt;td&gt;+26pp&lt;/td&gt;
&lt;td&gt;218&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;sphinx&lt;/td&gt;
&lt;td&gt;63%&lt;/td&gt;
&lt;td&gt;+23pp&lt;/td&gt;
&lt;td&gt;412&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;django&lt;/td&gt;
&lt;td&gt;61%&lt;/td&gt;
&lt;td&gt;+21pp&lt;/td&gt;
&lt;td&gt;1,393&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;scikit-learn&lt;/td&gt;
&lt;td&gt;57%&lt;/td&gt;
&lt;td&gt;+17pp&lt;/td&gt;
&lt;td&gt;1,124&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;matplotlib&lt;/td&gt;
&lt;td&gt;55%&lt;/td&gt;
&lt;td&gt;+15pp&lt;/td&gt;
&lt;td&gt;634&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;requests at 78% makes sense — it's a small, well-structured library with clean module boundaries. Each file does one thing. The wiki pages are precise. The retrieval is nearly perfect.&lt;/p&gt;

&lt;p&gt;Django at 61% is still a +21pp improvement on a 1,393-page codebase. That's not nothing.&lt;/p&gt;




&lt;h2&gt;
  
  
  One More Thing: HyDE
&lt;/h2&gt;

&lt;p&gt;For the ~3% of queries where even wiki vocabulary doesn't match, Provenant generates a hypothetical wiki snippet that &lt;em&gt;would&lt;/em&gt; answer the question, then searches against that. Merged with BM25 via Reciprocal Rank Fusion.&lt;/p&gt;

&lt;p&gt;+2.4pp Coverage@5. One extra LLM call. Not the headline, but it's there when it helps. The fact that it only fires 3% of the time is the point: the wiki handles the rest.&lt;/p&gt;
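
&lt;p&gt;Reciprocal Rank Fusion itself is small enough to sketch. Each ranked list contributes 1 / (k + rank) per page, and pages are sorted by the summed score. The constant k = 60 is the commonly used default; whether Provenant uses that value is an assumption, and the page names below are placeholders.&lt;/p&gt;

```python
def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked lists (best first) into one list via RRF."""
    scores = {}
    for ranking in rankings:
        for rank, page in enumerate(ranking, start=1):
            scores[page] = scores.get(page, 0.0) + 1.0 / (k + rank)
    # Highest fused score first; ties broken by name for a stable order.
    return sorted(scores, key=lambda page: (-scores[page], page))


bm25_ranking = ["scaffold", "app", "wrappers"]
hyde_ranking = ["app", "blueprints", "scaffold"]
fused = reciprocal_rank_fusion([bm25_ranking, hyde_ranking])
print(fused)
```

&lt;p&gt;A page that both lists rank highly ends up on top, even if neither list put it first. RRF only looks at ranks, so the BM25 and HyDE scores never need to be on the same scale.&lt;/p&gt;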




&lt;h2&gt;
  
  
  Honest About What Didn't Work
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Speculative prefetching&lt;/strong&gt; — I built a hook that pre-fetches wiki context whenever your agent greps a file, warming the cache. Median speedup: 1.0×. The DB reads were already fast enough. Keeping the code, not claiming a win.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Compression/pruning&lt;/strong&gt; — removing low-attribution pages before synthesis. Firing rate on test set: 0%. The threshold was too conservative. Needs tuning before it's useful.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Self-healing at scale&lt;/strong&gt; — the repair loop is only evaluated on Django (20 questions). I can't claim it generalises yet. It's early evidence, not a proven result.&lt;/p&gt;




&lt;h2&gt;
  
  
  Try It in 2 Minutes
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;pip &lt;span class="nb"&gt;install &lt;/span&gt;provenant

&lt;span class="c"&gt;# Index&lt;/span&gt;
provenant init /path/to/your/repo

&lt;span class="c"&gt;# Serve (MCP)&lt;/span&gt;
provenant serve &lt;span class="nt"&gt;--repo&lt;/span&gt; /path/to/your/repo
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Works with Claude Code, Cursor, or anything MCP-compatible. Your agent gets &lt;code&gt;provenant_ask&lt;/code&gt;, &lt;code&gt;provenant_search&lt;/code&gt;, &lt;code&gt;provenant_context&lt;/code&gt;, and &lt;code&gt;provenant_risk&lt;/code&gt; as native tools. It stops grepping. It starts reading the wiki.&lt;/p&gt;

&lt;p&gt;⭐ &lt;strong&gt;GitHub: &lt;a href="https://github.com/shreyashsharma/provenant" rel="noopener noreferrer"&gt;github.com/shreyashsharma/provenant&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Scale self-healing across all 12 repos (not just Django)&lt;/li&gt;
&lt;li&gt;SWE-bench end-to-end: patch generation, not just retrieval&lt;/li&gt;
&lt;li&gt;Figure out when HyDE helps vs hurts on different repo types&lt;/li&gt;
&lt;li&gt;Paper: &lt;em&gt;"Provenant: Attribution-Guided Wiki Indexing for Repository-Level AI Coding Agents"&lt;/em&gt; — submitted to IEEE ICAITPR 2026, under review&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;The retrieval problem in AI coding tools is real and under-measured. BM25 on raw source code is the floor, not the ceiling.&lt;/p&gt;

&lt;p&gt;If you try Provenant on your repo, I'm especially interested in two numbers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;How many tokens your agent was reading before&lt;/strong&gt; — run with a token counter on your current setup, then compare&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Whether the retrieved wiki pages match the files you would have opened manually&lt;/strong&gt; — that's the real test, independent of benchmarks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Those two data points are more honest than any eval I can run on my own repos. Happy to compare notes.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Benchmarked with DeepSeek-V3.2 · nomic-embed-text-v1.5 · SWE-bench Verified (500 tasks) · 12 Python OSS repos&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>rag</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
