<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Amithu Uysen</title>
    <description>The latest articles on DEV Community by Amithu Uysen (@amithuuysen).</description>
    <link>https://dev.to/amithuuysen</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2371046%2F1ed01f35-1944-40db-adc7-e64c99656ec6.jpg</url>
      <title>DEV Community: Amithu Uysen</title>
      <link>https://dev.to/amithuuysen</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/amithuuysen"/>
    <language>en</language>
    <item>
      <title>Cursor charges $20/mo. Copilot uploads your code. I built a free, local alternative.</title>
      <dc:creator>Amithu Uysen</dc:creator>
      <pubDate>Tue, 07 Apr 2026 16:02:28 +0000</pubDate>
      <link>https://dev.to/amithuuysen/cursor-charges-20mo-copilot-uploads-your-code-i-built-a-free-local-alternative-ol8</link>
      <guid>https://dev.to/amithuuysen/cursor-charges-20mo-copilot-uploads-your-code-i-built-a-free-local-alternative-ol8</guid>
      <description>&lt;h1&gt;
  
  
  I Reverse-Engineered Cursor's Codebase Search and Built an Open-Source Alternative
&lt;/h1&gt;

&lt;p&gt;&lt;strong&gt;TL;DR:&lt;/strong&gt; I built &lt;a href="https://github.com/amithuuysen/codebase-context" rel="noopener noreferrer"&gt;CodeContext&lt;/a&gt;, an open-source Python MCP server that replicates the hybrid search architecture Cursor IDE uses internally — combining FAISS vector search, BM25 keyword search, Merkle tree sync, and AST-aware chunking. It works with VS Code Copilot, Claude Desktop, or any MCP client. No paid subscription required.&lt;/p&gt;




&lt;h2&gt;
  
  
  Let's Be Honest About What You're Paying For
&lt;/h2&gt;

&lt;p&gt;First, let me be clear: &lt;strong&gt;Cursor's $20/month isn't just for codebase search.&lt;/strong&gt; You're paying for an entire IDE experience — AI chat, tab completion, agent mode, cloud agents, frontier model access, and the search indexing that powers all of it. The search is one piece of a much larger product.&lt;/p&gt;

&lt;p&gt;Similarly, &lt;strong&gt;GitHub Copilot's value isn't just indexing&lt;/strong&gt; — it's code completion, agent mode, PR reviews, and deep GitHub integration.&lt;/p&gt;

&lt;p&gt;So why did I build just the search piece? Because &lt;strong&gt;search quality is the foundation that determines how well everything else works.&lt;/strong&gt; When the AI can't find the right code, every other feature suffers — completions are wrong, chat hallucinates, agent mode edits the wrong file.&lt;/p&gt;

&lt;p&gt;And for many developers, the search is the missing piece. You might already have a Copilot Free or Pro plan that works great — except when you ask about code in a 20K-file codebase and it falls back to grep because the local index caps at ~2,500 files.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Real Cost Picture
&lt;/h2&gt;

&lt;p&gt;Here's what the AI coding landscape actually costs in 2026:&lt;/p&gt;

&lt;h3&gt;
  
  
  Individual Plans
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Plan&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;th&gt;What You Get&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GitHub Copilot&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Free&lt;/td&gt;
&lt;td&gt;$0/mo&lt;/td&gt;
&lt;td&gt;50 agent/chat requests, 2K completions, GPT-5 mini&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GitHub Copilot&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pro&lt;/td&gt;
&lt;td&gt;$10/mo&lt;/td&gt;
&lt;td&gt;Unlimited completions, 300 premium requests, Claude/Codex&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GitHub Copilot&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pro+&lt;/td&gt;
&lt;td&gt;$39/mo&lt;/td&gt;
&lt;td&gt;5× premium requests, all models including Opus 4.6&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cursor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Hobby&lt;/td&gt;
&lt;td&gt;$0/mo&lt;/td&gt;
&lt;td&gt;Limited agent requests, limited tab completions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cursor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pro&lt;/td&gt;
&lt;td&gt;$20/mo&lt;/td&gt;
&lt;td&gt;Extended agent limits, frontier models, MCPs, cloud agents&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cursor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Pro+&lt;/td&gt;
&lt;td&gt;$60/mo&lt;/td&gt;
&lt;td&gt;3× usage on all models&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cursor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Ultra&lt;/td&gt;
&lt;td&gt;$200/mo&lt;/td&gt;
&lt;td&gt;20× usage, priority features&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Team/Enterprise Plans
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tool&lt;/th&gt;
&lt;th&gt;Plan&lt;/th&gt;
&lt;th&gt;Price&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cursor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Teams&lt;/td&gt;
&lt;td&gt;$40/user/mo (+ SSO, analytics, privacy controls)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cursor&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Enterprise&lt;/td&gt;
&lt;td&gt;Custom pricing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Copilot&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Business&lt;/td&gt;
&lt;td&gt;Contact sales&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  The Cost Gap
&lt;/h3&gt;

&lt;p&gt;For a team of 10 developers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Copilot Pro&lt;/strong&gt;: $100/month ($10 × 10)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cursor Pro&lt;/strong&gt;: $200/month ($20 × 10)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cursor Teams&lt;/strong&gt;: $400/month ($40 × 10)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Annual difference: $1,200–$3,600&lt;/strong&gt; — just between Copilot Pro and Cursor Pro/Teams. And neither price includes the LLM API costs you might incur on top.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where CodeContext Fits
&lt;/h2&gt;

&lt;p&gt;CodeContext doesn't replace Cursor or Copilot. It &lt;strong&gt;fills a specific gap&lt;/strong&gt;: bringing Cursor-quality codebase search to tools that don't have it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scenario 1: You use Copilot Free/Pro but have a large codebase
&lt;/h3&gt;

&lt;p&gt;You love Copilot's VS Code integration, but your 20K-file enterprise repo exceeds the ~2,500-file local index limit. Remote indexing only works with GitHub.com repos — and yours is on GitLab/Bitbucket/self-hosted.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before CodeContext:&lt;/strong&gt; Copilot falls back to grep and file search. Agent mode can't find relevant code across the project.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After CodeContext:&lt;/strong&gt; Add one MCP server config. Copilot now has hybrid semantic + keyword search across all 20K files via &lt;code&gt;@workspace&lt;/code&gt;. &lt;strong&gt;No change to your Copilot plan needed.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Scenario 2: You're evaluating Cursor vs Copilot
&lt;/h3&gt;

&lt;p&gt;Cursor's search is genuinely better (12.5% accuracy improvement from hybrid retrieval). But you'd need to switch IDEs, retrain muscle memory, and pay $20/mo minimum.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With CodeContext:&lt;/strong&gt; Stay in VS Code, keep your Copilot plan, and get the same hybrid search architecture. The $10–$30/month you save per developer can go toward more premium model requests.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scenario 3: Enterprise team on a budget
&lt;/h3&gt;

&lt;p&gt;10 devs × $40/user/month = $4,800/year just for Cursor Teams. CodeContext is free, runs locally, and gives you the search piece.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cost reduction:&lt;/strong&gt; Keep everyone on Copilot Pro ($1,200/year for 10 devs) + CodeContext (free) instead of Cursor Teams ($4,800/year). &lt;strong&gt;Save $3,600/year&lt;/strong&gt; while getting comparable search quality.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scenario 4: Privacy-sensitive projects
&lt;/h3&gt;

&lt;p&gt;Both Cursor and Copilot send code to external servers for indexing and inference. Copilot's remote index &lt;strong&gt;uploads your entire codebase to GitHub's cloud&lt;/strong&gt; (&lt;code&gt;api.github.com&lt;/code&gt;) for processing — community reports indicate ~500MB uploads during indexing, even for private repos. Cursor similarly processes code on their proprietary servers.&lt;/p&gt;

&lt;p&gt;For teams working on proprietary code, regulated industries (healthcare, finance, defense), or codebases under NDA, this is a non-starter.&lt;/p&gt;

&lt;p&gt;CodeContext runs 100% locally with Ollama — your code never leaves your machine. The embedding model runs on your GPU/CPU, the FAISS index is stored on local disk (&lt;code&gt;~/.context/&lt;/code&gt;), and nothing is sent to an external server. The HMAC path obfuscation adds an extra privacy layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  What CodeContext Does NOT Replace
&lt;/h2&gt;

&lt;p&gt;Let me be honest about the limitations:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Feature&lt;/th&gt;
&lt;th&gt;Cursor/Copilot&lt;/th&gt;
&lt;th&gt;CodeContext&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Code completion (tab)&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Real-time, context-aware&lt;/td&gt;
&lt;td&gt;❌ Not a completion tool&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;AI chat&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Multi-model, streaming&lt;/td&gt;
&lt;td&gt;❌ Not a chat interface&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Agent mode&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Multi-file editing&lt;/td&gt;
&lt;td&gt;❌ Search only — feeds agents via MCP&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Cloud agents&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ (Cursor Pro, Copilot Pro)&lt;/td&gt;
&lt;td&gt;❌ Local only&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;PR code review&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ (Cursor Bugbot, Copilot)&lt;/td&gt;
&lt;td&gt;❌ Not in scope&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;IDE integration&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Deep, native&lt;/td&gt;
&lt;td&gt;⚠️ Via MCP protocol (works but less seamless)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Codebase search&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ Proprietary, optimized&lt;/td&gt;
&lt;td&gt;✅ &lt;strong&gt;Open-source, comparable architecture&lt;/strong&gt;
&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Custom embedding model&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;✅ (Cursor trains their own)&lt;/td&gt;
&lt;td&gt;⚠️ Uses open models (nomic-embed-text)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;CodeContext is a search engine, not an IDE.&lt;/strong&gt; It makes your existing IDE + AI assistant better at finding code. That's it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problems (Being Honest)
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;First-index time on huge codebases&lt;/strong&gt; — Indexing 20K files with Ollama takes time (minutes, not seconds). Cursor has optimized proprietary infrastructure. We use pipelining and caching to mitigate this, but first run is still slow.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Embedding quality&lt;/strong&gt; — Cursor trains a custom embedding model on real coding sessions. We use &lt;code&gt;nomic-embed-text&lt;/code&gt; (137M params, open-source). It's good, but not fine-tuned for code search specifically. Voyage AI's &lt;code&gt;voyage-code-3&lt;/code&gt; would be better but costs money.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;No tab completion or chat&lt;/strong&gt; — If search quality is your only pain point, CodeContext solves it. If you want the full AI IDE experience, you need Cursor or Copilot.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;MCP overhead&lt;/strong&gt; — Communication via the MCP protocol adds latency compared to Cursor's native in-process search. Typically ~50–100 ms per query.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;No cloud infrastructure&lt;/strong&gt; — Cursor can share indexes between team members (SimHash, 92% overlap). CodeContext is local-only right now.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Mac-first optimization&lt;/strong&gt; — The pipelined engine is optimized for Apple Silicon (M4 Pro). It works on Linux/Windows but hasn't been tuned for those platforms.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Bottom Line
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;If you...&lt;/th&gt;
&lt;th&gt;Recommendation&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Have budget + want the best experience&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Cursor Pro&lt;/strong&gt; ($20/mo) — best-in-class AI IDE&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Want great completions + affordable&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Copilot Pro&lt;/strong&gt; ($10/mo) — best value&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Need search on large codebases with Copilot&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Copilot + CodeContext&lt;/strong&gt; — fills the gap for free&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Have privacy requirements (no cloud)&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;CodeContext + Ollama&lt;/strong&gt; — 100% local&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Team on budget, need enterprise search&lt;/td&gt;
&lt;td&gt;
&lt;strong&gt;Copilot Pro + CodeContext&lt;/strong&gt; — save $3,600+/year vs Cursor Teams&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  The Problem Nobody Talks About
&lt;/h2&gt;

&lt;p&gt;Every AI coding assistant has the same bottleneck: &lt;strong&gt;retrieval&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When you ask Copilot &lt;em&gt;"where do we handle authentication?"&lt;/em&gt; in a 20,000-file codebase, it needs to find the right files in milliseconds. Get this wrong and the AI hallucinates, suggests fixes in the wrong file, or just says "I don't have enough context."&lt;/p&gt;

&lt;p&gt;Cursor's engineering blog revealed their approach: &lt;strong&gt;hybrid retrieval with Reciprocal Rank Fusion produces 12.5% better results&lt;/strong&gt; than either semantic or keyword search alone, and up to 23.5% on large codebases.&lt;/p&gt;

&lt;p&gt;So I built it.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture: How Cursor Does It (And How I Replicated It)
&lt;/h2&gt;

&lt;p&gt;Cursor's codebase search isn't magic — it's a well-engineered pipeline. Here's what I reverse-engineered and implemented:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Codebase (on disk)
    │
    ▼
Merkle Tree Sync ─── O(changes) not O(files)
    │
    ▼
Tree-sitter AST Splitter ─── functions, classes, methods
    │
    ▼
┌──────────┐    ┌──────────┐
│  FAISS   │    │  BM25    │
│ (dense)  │    │ (sparse) │
│  cosine  │    │ inverted │
└────┬─────┘    └────┬─────┘
     │               │
     ▼               ▼
  RRF Fusion (k=60) ─── merges both rankings
     │
     ▼
  Cross-Encoder Reranker (optional)
     │
     ▼
  MCP Server (stdio / HTTP)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Why Hybrid &amp;gt; Pure Semantic
&lt;/h3&gt;

&lt;p&gt;Consider three different queries against the same codebase:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Query&lt;/th&gt;
&lt;th&gt;FAISS (semantic)&lt;/th&gt;
&lt;th&gt;BM25 (keyword)&lt;/th&gt;
&lt;th&gt;Hybrid (RRF)&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;em&gt;"where do we handle authentication?"&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;✅ Finds &lt;code&gt;session.ts&lt;/code&gt; by meaning&lt;/td&gt;
&lt;td&gt;❌ Word "authentication" absent&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;em&gt;"find all imports of PaymentService"&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;⚠️ Returns similar but wrong&lt;/td&gt;
&lt;td&gt;✅ Exact keyword match&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;em&gt;"how does the tax calculation work?"&lt;/em&gt;&lt;/td&gt;
&lt;td&gt;✅ Good conceptual match&lt;/td&gt;
&lt;td&gt;✅ Matches "tax" + "calculation"&lt;/td&gt;
&lt;td&gt;✅ Best&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Neither approach alone covers all query types. RRF fusion combines them without needing score normalization — FAISS cosine scores and BM25 IDF scores are on completely different scales, but RRF only uses rank positions.&lt;/p&gt;
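&lt;p&gt;A minimal sketch of rank-only RRF fusion (the standard formula with k=60, not CodeContext's actual implementation; the file names are made up):&lt;/p&gt;

```python
def rrf_fuse(rankings, k=60):
    """Merge ranked result lists with Reciprocal Rank Fusion.

    rankings: list of ranked lists of document ids, best first.
    Scores depend only on rank position, so FAISS cosine scores and
    BM25 scores never need to be normalized against each other.
    """
    scores = {}
    for ranked in rankings:
        for rank, doc_id in enumerate(ranked):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank + 1)
    return sorted(scores, key=scores.get, reverse=True)

faiss_hits = ["session.ts", "auth.ts", "user.ts"]   # semantic ranking
bm25_hits = ["auth.ts", "login.ts", "session.ts"]   # keyword ranking
print(rrf_fuse([faiss_hits, bm25_hits]))
# → ['auth.ts', 'session.ts', 'login.ts', 'user.ts']
```

&lt;p&gt;Note that &lt;code&gt;auth.ts&lt;/code&gt; wins despite topping only one list: appearing high in both rankings beats a single first place, which is exactly why the fusion helps on mixed query types.&lt;/p&gt;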

&lt;h2&gt;
  
  
  The Performance Problem (And How I Solved It)
&lt;/h2&gt;

&lt;p&gt;The naive pipeline — scan files → split → embed → insert — is painfully slow on large codebases. On a 20,000-file enterprise codebase (Zoho CRM), the first version was far too slow to be practical.&lt;/p&gt;

&lt;h3&gt;
  
  
  The bottleneck analysis:
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;File I/O + AST splitting&lt;/strong&gt; is CPU-bound (Tree-sitter parsing)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedding&lt;/strong&gt; is GPU/API-bound (waiting for Ollama or OpenAI)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;FAISS persistence&lt;/strong&gt; is disk I/O-bound (writing after every batch)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;These three are mostly independent — the classic producer/consumer problem.&lt;/p&gt;

&lt;h3&gt;
  
  
  Solution: Pipelined Indexing Engine
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Producer (thread pool, 14 workers)     Consumer (async)
┌──────────────────────────┐          ┌──────────────────────┐
│ Read files in parallel   │          │ Check embedding cache │
│ AST split via Tree-sitter│  Queue   │ Embed ~4 sub-batches  │
│ Push chunk batches       │────────▶ │   concurrently        │
│                          │ maxsize=4│ Insert into FAISS     │
└──────────────────────────┘          │ Add to BM25 index     │
                                      └──────────────────────┘
                                               │
                                      FAISS persist (once at end)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Key optimizations:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;asyncio.Queue pipeline&lt;/strong&gt; — While batch N is embedding, batch N+1 is being split. CPU and GPU work overlap.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Concurrent embedding sub-batches&lt;/strong&gt; — Each flush splits into ~4 sub-batches, sent to Ollama in parallel threads. Set &lt;code&gt;OLLAMA_NUM_PARALLEL=4&lt;/code&gt; to saturate your GPU.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deferred FAISS persistence&lt;/strong&gt; — One disk write at the end instead of hundreds during indexing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Embedding cache&lt;/strong&gt; — SHA-256 content hash → embedding vector. Re-indexing unchanged code costs zero API calls.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Adaptive thread pool&lt;/strong&gt; — Scales to your CPU cores (14 on M4 Pro, 8 on older machines).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The engine's reported &lt;code&gt;overlap&lt;/code&gt; value directly measures pipeline efficiency — it's the time saved by running splitting and embedding concurrently.&lt;/p&gt;
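&lt;p&gt;The producer/consumer overlap can be sketched with a bounded &lt;code&gt;asyncio.Queue&lt;/code&gt;. This is a toy illustration of the idea, not CodeContext's actual engine: hashing stands in for AST splitting, and a short sleep stands in for the embedding call.&lt;/p&gt;

```python
import asyncio
import hashlib

async def producer(files, queue):
    # CPU-bound stage: read + split files, push chunk batches.
    for i in range(0, len(files), 3):
        batch = files[i:i + 3]
        chunks = [hashlib.sha256(f.encode()).hexdigest()[:8] for f in batch]
        await queue.put(chunks)      # blocks when the queue is full (backpressure)
    await queue.put(None)            # sentinel: no more batches

async def consumer(queue, index):
    # GPU/API-bound stage: embeds batch N while batch N+1 is being split.
    while True:
        chunks = await queue.get()
        if chunks is None:
            break
        await asyncio.sleep(0.01)    # stands in for the embedding call
        index.extend(chunks)         # stands in for FAISS + BM25 insert

async def run(files):
    queue = asyncio.Queue(maxsize=4)  # bounded queue, as in the diagram
    index = []
    await asyncio.gather(producer(files, queue), consumer(queue, index))
    return index                      # persist FAISS once, here, at the end

index = asyncio.run(run([f"file_{i}.py" for i in range(7)]))
print(len(index))  # → 7
```

&lt;p&gt;The bounded &lt;code&gt;maxsize=4&lt;/code&gt; is what keeps memory flat on a 20K-file repo: the producer can only run a few batches ahead of the consumer before it blocks.&lt;/p&gt;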

&lt;h2&gt;
  
  
  The AST Chunking Approach
&lt;/h2&gt;

&lt;p&gt;Most embedding-based search tools split code at arbitrary character boundaries. This produces chunks that start mid-function and end mid-class — meaningless to both humans and embedding models.&lt;/p&gt;

&lt;p&gt;CodeContext uses &lt;strong&gt;Tree-sitter&lt;/strong&gt; to parse code into an AST, then splits at logical boundaries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Python: splits at function_definition, class_definition, decorated_definition
# JavaScript: function_declaration, class_declaration, arrow functions
# Go: function_declaration, method_declaration, type_declaration
# ... 9 languages supported
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Control flow (&lt;code&gt;if&lt;/code&gt;, &lt;code&gt;for&lt;/code&gt;, &lt;code&gt;while&lt;/code&gt;, &lt;code&gt;try&lt;/code&gt;) stays inside its parent function — it's never split into a separate chunk. Gap text (imports, comments between functions) is handled separately. This matches Cursor's documented approach.&lt;/p&gt;
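&lt;p&gt;CodeContext does this with Tree-sitter grammars. As a self-contained illustration of the same boundary rule, Python's built-in &lt;code&gt;ast&lt;/code&gt; module can split a file at top-level definitions (the sample source is made up):&lt;/p&gt;

```python
import ast
import textwrap

def split_at_defs(source):
    """Illustrative only: split a Python file at top-level function/class
    boundaries, the way an AST-aware splitter does. Control flow inside a
    function stays with its parent chunk."""
    tree = ast.parse(source)
    chunks = []
    for node in tree.body:
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef, ast.ClassDef)):
            chunks.append(ast.get_source_segment(source, node))
    return chunks

src = textwrap.dedent("""
    import os

    def handle_auth(token):
        if token:   # control flow stays inside its parent chunk
            return True
        return False

    class Session:
        def refresh(self):
            pass
""")
for chunk in split_at_defs(src):
    print(chunk.splitlines()[0])
# → def handle_auth(token):
# → class Session:
```

&lt;p&gt;Gap text like the &lt;code&gt;import os&lt;/code&gt; line is not inside any definition, which is why it gets handled as a separate case.&lt;/p&gt;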

&lt;h2&gt;
  
  
  Merkle Tree Sync: O(changes) Not O(files)
&lt;/h2&gt;

&lt;p&gt;For a 50,000-file repo where 3 files changed:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Flat scan&lt;/strong&gt;: Hash all 50K files → compare → O(50K)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Merkle tree&lt;/strong&gt;: Compare root hash → walk only divergent branches → O(log N + changes)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The Merkle tree is SHA-256 based and directory-aware. Unchanged subtrees are skipped entirely. On a 20K-file codebase, re-indexing after a few file changes takes seconds instead of minutes.&lt;/p&gt;
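&lt;p&gt;A toy sketch of the idea (a real implementation hashes file bytes and caches subtree hashes on disk; this just shows why an unchanged subtree can be skipped):&lt;/p&gt;

```python
import hashlib

def merkle_hash(tree):
    """Illustrative sketch: hash a nested {name: content-or-subtree} dict.
    A directory's hash is the hash of its children's hashes, so an
    unchanged subtree yields an identical hash and its files are never
    re-read."""
    if isinstance(tree, str):                      # a file: hash its content
        return hashlib.sha256(tree.encode()).hexdigest()
    child_hashes = "".join(
        name + merkle_hash(child) for name, child in sorted(tree.items())
    )
    return hashlib.sha256(child_hashes.encode()).hexdigest()

repo = {"src": {"auth.py": "def login(): ...", "db.py": "pool = ..."},
        "docs": {"readme.md": "hello"}}
before = merkle_hash(repo)
repo["src"]["auth.py"] = "def login(): return True"   # change one file
after = merkle_hash(repo)
print(before != after)             # → True: the root hash diverges
print(merkle_hash(repo["docs"]))   # the docs subtree hash is unchanged, so skip it
```

&lt;p&gt;Sync only walks branches whose hashes diverge from the stored index, which is where the O(log N + changes) behavior comes from.&lt;/p&gt;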

&lt;h2&gt;
  
  
  How to Use It
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Install
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/amithuuysen/codebase-context.git
&lt;span class="nb"&gt;cd &lt;/span&gt;codebase-context
uv &lt;span class="nb"&gt;sync&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Run (with local Ollama — no API key needed)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Start Ollama with parallel embedding&lt;/span&gt;
&lt;span class="nv"&gt;OLLAMA_NUM_PARALLEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;4 ollama serve

&lt;span class="c"&gt;# Pull the embedding model (137M params, fast)&lt;/span&gt;
ollama pull nomic-embed-text

&lt;span class="c"&gt;# Start CodeContext&lt;/span&gt;
&lt;span class="nv"&gt;EMBEDDING_PROVIDER&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;ollama &lt;span class="nv"&gt;OLLAMA_MODEL&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;nomic-embed-text uv run codecontext
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Connect to VS Code Copilot
&lt;/h3&gt;

&lt;p&gt;Add to &lt;code&gt;.vscode/mcp.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"servers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"codecontext"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"uv"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"run"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"codecontext"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"cwd"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/path/to/codebase-context"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"MCP_TRANSPORT"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stdio"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"EMBEDDING_PROVIDER"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ollama"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now Copilot's &lt;code&gt;@workspace&lt;/code&gt; agent uses CodeContext for semantic search across your entire codebase — no file limit.&lt;/p&gt;

&lt;h3&gt;
  
  
  Connect to Claude Desktop
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"codecontext"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"uv"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"run"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"codecontext"&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"cwd"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"/path/to/codebase-context"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"env"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"MCP_TRANSPORT"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"stdio"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"EMBEDDING_PROVIDER"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"ollama"&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Embedding Provider Options
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Provider&lt;/th&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Speed&lt;/th&gt;
&lt;th&gt;Quality&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;
&lt;strong&gt;Ollama&lt;/strong&gt; (recommended)&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;nomic-embed-text&lt;/code&gt; (137M)&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Free, local&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;OpenAI&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;text-embedding-3-small&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Fast&lt;/td&gt;
&lt;td&gt;Best&lt;/td&gt;
&lt;td&gt;~$0.02/1M tokens&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;HuggingFace&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;code&gt;all-MiniLM-L6-v2&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;Free, local&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;For enterprise codebases (10K+ files), Ollama with &lt;code&gt;nomic-embed-text&lt;/code&gt; hits the sweet spot — fast enough for batch indexing, good enough for accurate retrieval, and completely local (no data leaves your machine).&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/amithuuysen/codebase-context" rel="noopener noreferrer"&gt;github.com/amithuuysen/codebase-context&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you're hitting Copilot's 2,500-file limit or don't want to pay for Cursor, give it a try. It's open source, runs locally, and works with any MCP-compatible client.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built with Python, FAISS, Tree-sitter, LlamaIndex, and the MCP protocol. Inspired by Cursor IDE's engineering blog on hybrid search architecture.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>mcp</category>
      <category>opensource</category>
      <category>python</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
