<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Dhrupo Nil</title>
    <description>The latest articles on DEV Community by Dhrupo Nil (@dhrupo).</description>
    <link>https://dev.to/dhrupo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1151170%2Ff3f2b324-73c2-4df0-9f8a-ffc109194198.jpeg</url>
      <title>DEV Community: Dhrupo Nil</title>
      <link>https://dev.to/dhrupo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/dhrupo"/>
    <language>en</language>
    <item>
      <title>I gave 8 AI agents an island and watched a society emerge — wars, gossip, grudges, and peace</title>
      <dc:creator>Dhrupo Nil</dc:creator>
      <pubDate>Sun, 14 Jun 2026 07:30:50 +0000</pubDate>
      <link>https://dev.to/dhrupo/i-gave-8-ai-agents-an-island-and-watched-a-society-emerge-wars-gossip-grudges-and-peace-2edj</link>
      <guid>https://dev.to/dhrupo/i-gave-8-ai-agents-an-island-and-watched-a-society-emerge-wars-gossip-grudges-and-peace-2edj</guid>
      <description>&lt;h3&gt;
  
  
  Tiny Civilization: what happens when AI agents have to &lt;em&gt;live together&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;I grew up on &lt;strong&gt;Age of Empires&lt;/strong&gt;, &lt;strong&gt;Sid Meier's Civilization&lt;/strong&gt;, and &lt;strong&gt;Rise of Nations&lt;/strong&gt;. The thing that hooked me was never the graphics — it was the &lt;em&gt;systems&lt;/em&gt;. You set a few rules in motion and a whole world spills out of them: economies, rivalries, alliances, betrayals.&lt;/p&gt;

&lt;p&gt;Years later I watched OpenAI's &lt;a href="https://www.youtube.com/watch?v=kopoLzvh5jY" rel="noopener noreferrer"&gt;hide-and-seek multi-agent video&lt;/a&gt; (&lt;a href="https://openai.com/index/emergent-tool-use/" rel="noopener noreferrer"&gt;writeup&lt;/a&gt;), where agents that were only rewarded for hiding and seeking &lt;em&gt;invented tools and counter-strategies nobody coded&lt;/em&gt; — ramps, box-surfing, fort-building. Emergent behavior from simple pressure. That broke something open for me.&lt;/p&gt;

&lt;p&gt;So I asked a smaller question: &lt;strong&gt;forget winning a game — what if AI agents just had to live in a society together?&lt;/strong&gt; Would they behave like us? Hold grudges? Gossip? Make peace because they're tired of fighting?&lt;/p&gt;

&lt;p&gt;That became &lt;strong&gt;Tiny Civilization&lt;/strong&gt; — a browser sim where 2–8 agents with distinct personalities live on a small island, gathering, building, trading, stealing, gossiping, holding grudges, making peace, and &lt;em&gt;remembering it all across lives&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;&lt;a href="https://multiagentciv.netlify.app/" rel="noopener noreferrer"&gt;Live demo&lt;/a&gt;&lt;/strong&gt; — runs keyless in "instinct mode," or plug in a key for LLM minds.&lt;/p&gt;

&lt;p&gt;The whole thing — every line — was built with &lt;strong&gt;Claude Code, using the Fable model&lt;/strong&gt;, right before Fable retired. It felt fitting to send a storytelling model off by having it build a world full of little stories.&lt;/p&gt;




&lt;h3&gt;
  
  
  The problem: pure-LLM agents are ruinously expensive and pure-utility agents are boring
&lt;/h3&gt;

&lt;p&gt;The first design decision was the hardest. Two obvious options, both bad:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Call the LLM every tick.&lt;/strong&gt; Every agent, every day, makes an API call. Beautiful, expressive — and it costs a fortune and crawls.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pure utility AI (the classic RTS approach).&lt;/strong&gt; Fast and free, but agents can't &lt;em&gt;scheme&lt;/em&gt;, can't talk, can't surprise you. It's just min-maxing.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So I split the brain in two:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Layer&lt;/th&gt;
&lt;th&gt;Decides&lt;/th&gt;
&lt;th&gt;Cadence&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;LLM mind&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Strategy (&lt;code&gt;gather&lt;/code&gt;/&lt;code&gt;build&lt;/code&gt;/&lt;code&gt;trade&lt;/code&gt;/&lt;code&gt;befriend&lt;/code&gt;/&lt;code&gt;aggress&lt;/code&gt;/&lt;code&gt;reconcile&lt;/code&gt;/&lt;code&gt;defend&lt;/code&gt;), per-neighbor stances, an inner thought, and all dialogue&lt;/td&gt;
&lt;td&gt;~every 15 sim-days&lt;/td&gt;
&lt;td&gt;~150 calls / 1,000 days&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Utility engine&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Each day's concrete action — eat, sleep, gather, steal, attack, gift, trade, make peace&lt;/td&gt;
&lt;td&gt;every tick&lt;/td&gt;
&lt;td&gt;free, local&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The LLM declares intent — &lt;em&gt;"aggress against Kai, he raided my base"&lt;/em&gt; — and that biases the utility scores for the next two weeks. The &lt;em&gt;body&lt;/em&gt; runs on instinct (hunger, energy, storms); the &lt;em&gt;mind&lt;/em&gt; sets direction. This is the trick that makes it both affordable and alive.&lt;/p&gt;
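&lt;p&gt;A minimal sketch of that split, written in Python rather than the project's TypeScript. The action names, scores, and bias values here are hypothetical and chosen only to show the shape of the idea: the body scores every action locally, and the mind's declared strategy adds a bonus to the matching actions.&lt;/p&gt;

```python
def instinct_scores(agent):
    """Body layer: score each action from local state, with no LLM involved."""
    hunger = agent["hunger"]  # 0.0 (full) to 1.0 (starving)
    energy = agent["energy"]  # 0.0 (exhausted) to 1.0 (rested)
    return {
        "eat": hunger,
        "sleep": 1.0 - energy,
        "gather": 0.4,
        "trade": 0.3,
        "attack": 0.1,
        "make_peace": 0.1,
    }


# Mind layer: the strategy declared by the LLM adds a bonus to matching actions.
# These strategy names come from the table above; the numbers are made up.
STRATEGY_BIAS = {
    "aggress": {"attack": 0.5},
    "trade": {"trade": 0.5},
    "reconcile": {"make_peace": 0.5},
    "gather": {"gather": 0.3},
}


def choose_action(agent, strategy):
    """Pick the highest-scoring action after applying the strategy bias."""
    scores = instinct_scores(agent)
    for action, bonus in STRATEGY_BIAS.get(strategy, {}).items():
        scores[action] += bonus
    return max(scores, key=scores.get)
```

&lt;p&gt;With these numbers, a well-fed agent told to &lt;code&gt;aggress&lt;/code&gt; attacks, while a starving one still eats first: instinct can outvote strategy when the need is strong enough.&lt;/p&gt;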




&lt;h3&gt;
  
  
  Memory across lives — where it got strange
&lt;/h3&gt;

&lt;p&gt;When a run ends, each agent's life is distilled into memory lines:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;"you won with score 200"&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"Maya destroyed your home"&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"you and Kai made peace after a feud"&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;"this life hardened you — you trust less now"&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;These lines are stored in &lt;code&gt;localStorage&lt;/code&gt;, keyed by agent &lt;strong&gt;name&lt;/strong&gt;, and injected into the next run's prompts. Agents start referencing past lives in dialogue, pre-emptively paying reparations to remembered enemies, and trusting remembered allies, &lt;em&gt;sometimes to their own ruin.&lt;/em&gt;&lt;/p&gt;
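&lt;p&gt;The storage side of this can be sketched in a few lines. This is an illustration in Python, not the project's code: a plain dict stands in for the browser's &lt;code&gt;localStorage&lt;/code&gt;, and the key format, the five-line limit, and the prompt wording are all assumptions.&lt;/p&gt;

```python
import json

# Stand-in for localStorage: string keys mapped to string values.
storage = {}


def remember(name, lines):
    """Append this life's memory lines under the agent's name."""
    key = "memory:" + name  # hypothetical key format
    past = json.loads(storage.get(key, "[]"))
    storage[key] = json.dumps(past + lines)


def prompt_preamble(name, limit=5):
    """Build the text injected into the next run's prompt for this agent."""
    past = json.loads(storage.get("memory:" + name, "[]"))
    if not past:
        return "You have no memories of past lives."
    recent = past[-limit:]  # keep only the most recent lines (assumed cap)
    return "From your past lives:\n" + "\n".join("- " + line for line in recent)
```

&lt;p&gt;Keying by name is what lets a freshly created agent inherit the grudges of its predecessor: any agent called Kai picks up whatever earlier Kais stored.&lt;/p&gt;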




&lt;h3&gt;
  
  
  How I actually built and balanced it
&lt;/h3&gt;

&lt;p&gt;This is the part I'm proudest of, and it's pure childhood-strategy-game energy: &lt;strong&gt;you can't balance a society by vibes.&lt;/strong&gt; So the workflow was:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;A pure, deterministic simulation core&lt;/strong&gt; — zero DOM, zero AI. The same &lt;code&gt;runTick&lt;/code&gt; powers the browser, the tests, and a batch runner.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A seeded experiment runner.&lt;/strong&gt; &lt;code&gt;npm run experiment -- --runs 30 --days 1000 --seed 1&lt;/code&gt; runs 30 reproducible lifetimes and spits out a win-rate/score table. Every balance change landed with a before/after table. (Example: a Hermit rebalance moved one agent from 0/30 wins to 9–11/30 &lt;em&gt;without&lt;/em&gt; breaking the other archetypes.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A 16-gate regression suite.&lt;/strong&gt; The justification gate (no grievance → no violence), war burnout, reconciliation pricing, positive-sum trade, granary protection, homelessness-death, trait drift — each one locked behind a headless test so balance changes can't silently regress behavior.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Change a dial in &lt;code&gt;constants.ts&lt;/code&gt; → run the experiment → read the table. That was the entire loop.&lt;/p&gt;
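&lt;p&gt;The reason this loop is trustworthy is that every run is reproducible from its seed. A rough Python sketch of the pattern follows; the archetype list and the toy scoring are placeholders, not the real simulation.&lt;/p&gt;

```python
import random
from collections import Counter

ARCHETYPES = ["Warrior", "Trader", "Hermit"]  # hypothetical cast


def run_lifetime(seed, days):
    """Toy stand-in for the sim: all randomness flows from one seeded generator."""
    rng = random.Random(seed)
    scores = {name: 0 for name in ARCHETYPES}
    for _ in range(days):
        scores[rng.choice(ARCHETYPES)] += rng.randint(0, 3)
    return max(scores, key=scores.get)  # winner of this lifetime


def experiment(runs, days, seed):
    """Run lifetimes with seeds seed, seed + 1, ... and count wins per archetype."""
    return Counter(run_lifetime(seed + i, days) for i in range(runs))
```

&lt;p&gt;Calling &lt;code&gt;experiment(30, 1000, 1)&lt;/code&gt; twice gives the same win table, so any difference between two tables comes from the dial that changed and not from luck.&lt;/p&gt;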




&lt;h3&gt;
  
  
  What emerged (none of this is scripted)
&lt;/h3&gt;

&lt;p&gt;Running the same island over and over, with memory on, produced a coherent arc:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Massacres.&lt;/strong&gt; Early on, the warrior just killed everyone. No deterrence existed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Forever wars.&lt;/strong&gt; I added a justification gate (violence needs a real grievance — theft, attack, trespass). That fixed unprovoked killing… but now wars never &lt;em&gt;ended&lt;/em&gt;: 495 fruitless attacks across 1,500 days.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Diplomacy.&lt;/strong&gt; Reconciliation + escalating reparations + war-weariness made endings &lt;em&gt;inevitable&lt;/em&gt;. Attacks per 2,000-day run collapsed: 594 → 14 → 0.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The kleptocracy.&lt;/strong&gt; With war capped, theft became the unpunished crime — 340 thefts/run. I fixed it the &lt;em&gt;human&lt;/em&gt; way: granaries. Fortification, not punishment.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The golden age.&lt;/strong&gt; A clean-slate run, no memories: zero attacks in 1,000 days, and the Warrior won &lt;em&gt;by out-trading everyone&lt;/em&gt; (118 trades, 1 attack).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The fall.&lt;/strong&gt; The very next run — now &lt;em&gt;remembering&lt;/em&gt; that golden age — collapsed. Remembered trust lowered everyone's guard, which raised the payoff of betrayal. Scores dropped ~15%; every relationship ended negative. &lt;strong&gt;Peace between strangers turned out to be easier than peace between old friends with open tabs.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The recurring lesson: every time I patched one form of conflict, the agents found the next-cheapest one. Massacres → wars → theft → litigation. Exactly like us.&lt;/p&gt;




&lt;h3&gt;
  
  
  Stack
&lt;/h3&gt;

&lt;p&gt;TypeScript, React, Zustand, Vite, Recharts. Default mind is z.ai GLM, but any OpenAI-compatible provider works per-agent — so you can literally pit Claude vs GLM vs Gemini in the same village and watch model-vs-model diplomacy. Keys never touch the browser (server-side proxy), and an adaptive-pacing controller learns each key's real rate ceiling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Try it:&lt;/strong&gt; &lt;a href="https://multiagentciv.netlify.app/" rel="noopener noreferrer"&gt;https://multiagentciv.netlify.app/&lt;/a&gt;&lt;br&gt;
&lt;strong&gt;Code:&lt;/strong&gt; &lt;a href="https://github.com/dhrupo/multi-agent-civilization" rel="noopener noreferrer"&gt;https://github.com/dhrupo/multi-agent-civilization&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you played the same strategy games I did, I think you'll feel right at home watching this thing run.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>gamedev</category>
      <category>typescript</category>
      <category>showdev</category>
    </item>
    <item>
      <title>I translated my 7,000-line AI-coding handbook with 12 parallel AI agents</title>
      <dc:creator>Dhrupo Nil</dc:creator>
      <pubDate>Sun, 07 Jun 2026 09:32:01 +0000</pubDate>
      <link>https://dev.to/dhrupo/i-translated-my-7000-line-ai-coding-handbook-with-12-parallel-ai-agents-mco</link>
      <guid>https://dev.to/dhrupo/i-translated-my-7000-line-ai-coding-handbook-with-12-parallel-ai-agents-mco</guid>
      <description>&lt;p&gt;Last week I shipped a &lt;a href="https://dev.to/dhrupo/i-built-a-complete-ai-coding-handbook-in-bangla-words-tools-a-story-and-a-30-day-path-879"&gt;complete AI-coding handbook in Bangla&lt;/a&gt; — 79 terms decoded with everyday analogies, hands-on CLI guides, a 6-chapter story, and a 30-day practice path. The premise was the &lt;em&gt;vocabulary wall&lt;/em&gt;: words like "token", "context window", and "harness" sound like spells until someone explains them plainly.&lt;/p&gt;

&lt;p&gt;Turns out the vocabulary wall isn't a Bangla problem. It's a beginner problem.&lt;/p&gt;

&lt;p&gt;So today the whole book exists in English:&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://github.com/dhrupo/dictionary-of-ai-coding-english" rel="noopener noreferrer"&gt;https://github.com/dhrupo/dictionary-of-ai-coding-english&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  What's inside (the 60-second version)
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;📖 Part 1 — The words.&lt;/strong&gt; 62 terms across 7 sections, each with a daily-life analogy instead of a textbook definition. A model is a calculator that never presses its own buttons. And Goldi 🐠 — a goldfish with a 3-second memory — keeps reminding you that models are stateless too.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🛠️ Part 2 — The tools.&lt;/strong&gt; 7 hands-on guides: real CLI commands (Claude Code + Codex), AI-friendly folder structure (&lt;code&gt;AGENTS.md&lt;/code&gt;, &lt;code&gt;.claude/&lt;/code&gt;), bad-prompt → good-prompt rewrites, popular community skills (brainstorming, grill-me, TDD, handoff…), 17 extra terms (RAG, embeddings, temperature, hooks…), token economics, and safety — prompt injection explained with a postman analogy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📜 Part 3 — The story.&lt;/strong&gt; Nothing to install: Rafi, a 9th grader, builds his first portfolio site with an AI agent, makes every classic mistake (vague prompts, a hallucinated library, dragging a session deep into the dumb zone), and recovers using the tools from Part 2. You read over his shoulder.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🗓️ The 30-day path.&lt;/strong&gt; One 5–15 minute mission a day. The first two weeks need nothing but a free AI chat. Plus a myth-busting FAQ, a 79-term index, and copy-paste &lt;code&gt;templates/&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fun part: the book translated itself the way it teaches you to work
&lt;/h2&gt;

&lt;p&gt;Here's the meta-story. The handbook spends three chapters teaching subagents, handoff artifacts, and verification-before-completion. So when it came time to translate ~7,000 lines of Bangla markdown across 36 files… I used exactly that.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;12 parallel AI agents, one session.&lt;/strong&gt; Each agent got the same conventions block — character names (Goldi, Rafi), recurring section labels, file renames — plus its own slice of the book: three dictionary chapters here, the whole 6-chapter story to a single agent there (one translator = one consistent voice).&lt;/p&gt;

&lt;p&gt;Three things made it work:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Deterministic rules beat agent judgment.&lt;/strong&gt; The Bangla book's term headings were &lt;code&gt;### বাংলা-শব্দ (English Term)&lt;/code&gt;. The convention: the English heading becomes &lt;em&gt;exactly&lt;/em&gt; the parenthetical term. That one rule made every cross-file anchor predictable — agents working on different files in parallel could link into each other's not-yet-written chapters and have the slugs match.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. A machine-verifiable gate, not vibes.&lt;/strong&gt; The repo ships a tiny CI link checker (&lt;code&gt;scripts/check-links.py&lt;/code&gt;) that validates every markdown link and anchor, GitHub slug quirks included. First run after the agents finished: &lt;strong&gt;44 broken links.&lt;/strong&gt; Leftover Bangla anchors, one agent guessing &lt;code&gt;#esc-stop&lt;/code&gt; where another wrote &lt;code&gt;### ESC (stopping)&lt;/code&gt;, a third doubling slugs like &lt;code&gt;#-grill-me-grill-me&lt;/code&gt;. Twenty minutes of scripted fixes later: &lt;strong&gt;36 files, every link alive.&lt;/strong&gt; That checker now runs on every PR.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Verify, don't assume.&lt;/strong&gt; The index agent didn't &lt;em&gt;compute&lt;/em&gt; its 79 anchors — it read the actual translated files and extracted the real headings. Zero misses. Meanwhile an asset agent scanned every terminal-GIF source script for Bangla characters and found… none. The GIFs had been authored in English all along. The "re-render 26 GIFs" task evaporated because an agent checked instead of assuming.&lt;/p&gt;
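&lt;p&gt;To make the heading convention from point 1 concrete, here is a small Python sketch. It is not the repo's &lt;code&gt;scripts/check-links.py&lt;/code&gt;, and the slug function is a simplified, ASCII-only approximation of GitHub's anchor rule that ignores duplicate-heading suffixes.&lt;/p&gt;

```python
import re


def english_heading(bangla_heading):
    """Apply the convention: the heading becomes exactly the parenthetical term."""
    match = re.search(r"\(([^)]+)\)\s*$", bangla_heading)
    return match.group(1) if match else bangla_heading


def github_slug(heading):
    """Approximate anchor rule for ASCII headings:
    lowercase, drop punctuation, turn each space into a hyphen."""
    text = heading.lower()
    text = re.sub(r"[^a-z0-9 _-]", "", text)
    return text.replace(" ", "-")


def predicted_anchor(bangla_heading):
    """The anchor any agent can compute without seeing the translated file."""
    return "#" + github_slug(english_heading(bangla_heading))
```

&lt;p&gt;Because both steps are pure functions of the source heading, two agents working on different files arrive at the same anchor independently. A heading written as &lt;code&gt;ESC (stopping)&lt;/code&gt; slugs to &lt;code&gt;esc-stopping&lt;/code&gt;, which is why a guessed &lt;code&gt;#esc-stop&lt;/code&gt; fails the link check.&lt;/p&gt;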

&lt;p&gt;If that workflow sounds useful, it's literally what Part 2 and Part 3 of the book teach — subagents, handoffs, verification — just pointed at itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  It's a community thing
&lt;/h2&gt;

&lt;p&gt;Everything is CC BY 4.0, in both languages, and the two editions cross-link. Issue forms for "I found a mistake" and "suggest a new word" are ready in both repos.&lt;/p&gt;

&lt;p&gt;💙 Built on the shoulders of Matt Pocock's &lt;a href="https://github.com/mattpocock/dictionary-of-ai-coding" rel="noopener noreferrer"&gt;Dictionary of AI Coding&lt;/a&gt; — Parts 2 and 3 grew far beyond it, but the soul is his.&lt;/p&gt;

&lt;p&gt;And the offer from the Bangla launch stands: if you speak another language, &lt;strong&gt;steal this structure.&lt;/strong&gt; The deterministic-heading trick + a link checker means the next translation is mostly an afternoon of agent-wrangling. I'd genuinely love to see a Spanish, Hindi, or Vietnamese edition.&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://github.com/dhrupo/dictionary-of-ai-coding-english" rel="noopener noreferrer"&gt;https://github.com/dhrupo/dictionary-of-ai-coding-english&lt;/a&gt;&lt;/strong&gt;&lt;br&gt;
🇧🇩 &lt;strong&gt;Prefer Bangla?&lt;/strong&gt; &lt;a href="https://github.com/dhrupo/dictionary-of-ai-coding-bangla" rel="noopener noreferrer"&gt;https://github.com/dhrupo/dictionary-of-ai-coding-bangla&lt;/a&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>beginners</category>
      <category>learning</category>
    </item>
    <item>
      <title>I built a complete AI-coding handbook in Bangla - words, tools, a story, and a 30-day path</title>
      <dc:creator>Dhrupo Nil</dc:creator>
      <pubDate>Fri, 05 Jun 2026 17:14:15 +0000</pubDate>
      <link>https://dev.to/dhrupo/i-built-a-complete-ai-coding-handbook-in-bangla-words-tools-a-story-and-a-30-day-path-879</link>
      <guid>https://dev.to/dhrupo/i-built-a-complete-ai-coding-handbook-in-bangla-words-tools-a-story-and-a-30-day-path-879</guid>
      <description>&lt;p&gt;A year of AI-coding discourse taught me one thing: the hardest part isn't the tools — it's the &lt;em&gt;vocabulary wall&lt;/em&gt;. If English isn't your first language, "token", "context window", "hallucination", "harness" don't sound like concepts. They sound like spells. 🪄&lt;/p&gt;

&lt;p&gt;Matt Pocock's brilliant &lt;a href="https://github.com/mattpocock/dictionary-of-ai-coding" rel="noopener noreferrer"&gt;Dictionary of AI Coding&lt;/a&gt; showed how teachable these words really are. So I adapted it into Bangla — and then kept going, way past where the original stops. Today it's a complete, free, open-source learning path:&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://github.com/dhrupo/dictionary-of-ai-coding-bangla" rel="noopener noreferrer"&gt;https://github.com/dhrupo/dictionary-of-ai-coding-bangla&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The four-stage arc: শব্দ → হাতিয়ার → গল্প → অভ্যাস
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;(words → tools → story → habit)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📖 Part 1 — The words.&lt;/strong&gt; 62 terms across 7 sections, each with a daily-life analogy instead of a textbook definition. A model is a calculator that never presses its own buttons. Parameters are mixing-board knobs. And there's গোল্ডি (Goldie) — a goldfish with a 3-second memory who keeps reminding you that models are stateless too. 🐠&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🛠️ Part 2 — The tools.&lt;/strong&gt; This is where it goes beyond the original: 7 hands-on guides covering real CLI commands (Claude Code + Codex, mapped back to the concepts — &lt;code&gt;/compact&lt;/code&gt; is just the Compaction button), AI-friendly folder structure (&lt;code&gt;AGENTS.md&lt;/code&gt;, &lt;code&gt;.claude/&lt;/code&gt;, who reads what and when), bad-prompt → good-prompt rewrites, the popular community skills (brainstorming, grill-me, TDD, handoff…), 17 extra terms (RAG, embeddings, temperature, hooks…), token economics, and safety (including prompt injection, explained with a postman analogy).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;📜 Part 3 — The story.&lt;/strong&gt; Instead of a tutorial you have to install things for, it's a 6-chapter read-along: রাফি, a 9th grader, builds his first portfolio site with an AI agent. He makes every classic mistake — vague prompts, trusting a hallucinated library, dragging a session deep into the dumb zone — and recovers using the tools from Part 2. Margin notes call back every concept at the exact moment it bites. The reader installs &lt;em&gt;nothing&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;🗓️ The 30-day path.&lt;/strong&gt; Reading isn't owning. So the book ends with 30 daily missions (5–15 min each). The first two weeks need zero setup — just any free AI chat and open eyes ("ask the same question twice — did the answers match?"). Weeks 3–4 go hands-on, but every mission has a read-only alternative.&lt;/p&gt;

&lt;p&gt;Plus: a myth-busting FAQ ("AI will eat my job", "you need a CS degree"), a 79-term alphabetical index, copy-paste &lt;code&gt;templates/&lt;/code&gt; (a starter &lt;code&gt;AGENTS.md&lt;/code&gt;, a working &lt;code&gt;/handoff&lt;/code&gt; command, a &lt;code&gt;SKILL.md&lt;/code&gt; example), and 26 terminal GIFs — all reproducible from committed VHS scripts.&lt;/p&gt;

&lt;h2&gt;
  
  
  Things I learned building it
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Analogies are load-bearing, not decoration.&lt;/strong&gt; "Prefix cache" never landed until it became "the waiter remembers your usual order."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A story teaches sequencing in a way lists can't.&lt;/strong&gt; You can document &lt;code&gt;/compact&lt;/code&gt; vs &lt;code&gt;/clear&lt;/code&gt; vs handoff perfectly and people still won't know &lt;em&gt;when&lt;/em&gt; to use each. Watching Rafi (রাফি) lose his color-scheme decision to a long session is what sticks.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bengali text breaks your tooling in fun ways.&lt;/strong&gt; &lt;code&gt;grep&lt;/code&gt; couldn't match half my content (decomposed Unicode), GitHub slugs for emoji-prefixed headings start with a stray hyphen, and nested code fences need 4-backtick outers. The repo's link checker is a tiny Python script for a reason.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Beginners deserve the real thing.&lt;/strong&gt; Simplified ≠ dumbed down. The dictionary keeps the exact English terms visible everywhere, so readers can walk into any English doc afterward and feel at home.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  It's a community thing
&lt;/h2&gt;

&lt;p&gt;Everything is CC BY 4.0. There are Bangla issue forms for "I want a new word" and "I found a mistake" — contributions of any size welcome, from a typo to a better analogy to a whole new section.&lt;/p&gt;

&lt;p&gt;And if you speak another language: steal this structure. The vocabulary wall exists in every non-English-speaking dev community, and the fix is surprisingly fun to build. 💙&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://github.com/dhrupo/dictionary-of-ai-coding-bangla" rel="noopener noreferrer"&gt;https://github.com/dhrupo/dictionary-of-ai-coding-bangla&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>beginners</category>
      <category>learning</category>
    </item>
    <item>
      <title>Your AI Coding Agent Wastes 80% of Its Context. Fixed That with Graph Theory.</title>
      <dc:creator>Dhrupo Nil</dc:creator>
      <pubDate>Mon, 25 May 2026 13:45:00 +0000</pubDate>
      <link>https://dev.to/dhrupo/your-ai-coding-agent-wastes-80-of-its-context-fixed-that-with-graph-theory-1mfa</link>
      <guid>https://dev.to/dhrupo/your-ai-coding-agent-wastes-80-of-its-context-fixed-that-with-graph-theory-1mfa</guid>
      <description>&lt;h2&gt;
  
  
  The problem nobody admits
&lt;/h2&gt;

&lt;p&gt;When you give Claude Code, Cursor, or Codex a task like &lt;em&gt;"fix the login validation bug"&lt;/em&gt;, here's what they usually do:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Run &lt;code&gt;grep -l login src/&lt;/code&gt; → 17 files&lt;/li&gt;
&lt;li&gt;Read all 17 files top-to-bottom (because context is "free")&lt;/li&gt;
&lt;li&gt;Spend 80% of the model's context window on irrelevant imports, type aliases, and helper functions the bug doesn't touch&lt;/li&gt;
&lt;li&gt;Generate a fix using whatever 20% of attention is left&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This works. Sort of. But it's wasteful — and on big codebases, it's wrong: the agent runs out of context before it sees the actual buggy function.&lt;/p&gt;

&lt;p&gt;The instinct is to throw a bigger model at it. Bigger context window, fancier RAG, vector embeddings. All of which trade real cost for diminishing returns.&lt;/p&gt;

&lt;p&gt;There's a better answer that's been sitting in classical CS the whole time: &lt;strong&gt;treat the repo as a graph&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2Fdhrupo%2Fmincut-context%2Fmain%2Fdocs%2Fdemo.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fraw.githubusercontent.com%2Fdhrupo%2Fmincut-context%2Fmain%2Fdocs%2Fdemo.gif" alt="demo" width="760" height="415"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The idea, in one paragraph
&lt;/h2&gt;

&lt;p&gt;Your codebase already &lt;em&gt;is&lt;/em&gt; a graph. Functions call functions. Modules import modules. Classes extend classes. Pick a node (the symbol your task is about), and the structurally-closest neighborhood is almost certainly what an agent needs to see.&lt;/p&gt;

&lt;p&gt;So I built &lt;strong&gt;&lt;code&gt;mincut-context&lt;/code&gt;&lt;/strong&gt; — an npm package that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Parses your repo into a &lt;strong&gt;symbol graph&lt;/strong&gt; (tree-sitter, supports TS/JS/Vue/Python/PHP)&lt;/li&gt;
&lt;li&gt;Derives &lt;strong&gt;seed nodes&lt;/strong&gt; from your task description (keyword IDF on symbol names + file paths)&lt;/li&gt;
&lt;li&gt;Runs &lt;strong&gt;personalized PageRank&lt;/strong&gt; with the seeds as the restart vector&lt;/li&gt;
&lt;li&gt;Picks the &lt;strong&gt;minimum-cut subgraph&lt;/strong&gt; that fits a token budget you choose&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The output: a list of files + line ranges that an agent should look at. Nothing more, nothing less.&lt;/p&gt;
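&lt;p&gt;Step 3 is the least familiar piece, so here is a hedged sketch of personalized PageRank by power iteration. It is written in Python for brevity and is not taken from the package; the damping factor of 0.85, the iteration count, and the symbol names in the usage note are assumptions.&lt;/p&gt;

```python
def personalized_pagerank(graph, seeds, damping=0.85, iterations=50):
    """Power iteration where the random walk restarts at the seed nodes.

    graph maps each node to the list of nodes it points to; every target
    must also appear as a key.
    """
    nodes = list(graph)
    restart = {n: (1.0 / len(seeds) if n in seeds else 0.0) for n in nodes}
    rank = dict(restart)
    for _ in range(iterations):
        # Each round, a (1 - damping) share of the walk restarts at the seeds.
        nxt = {n: (1.0 - damping) * restart[n] for n in nodes}
        for n in nodes:
            # Nodes with no outgoing edges send their rank back to the seeds.
            targets = graph[n] or list(seeds)
            share = damping * rank[n] / len(targets)
            for t in targets:
                nxt[t] += share
        rank = nxt
    return rank
```

&lt;p&gt;On a toy graph where &lt;code&gt;validateLogin&lt;/code&gt; calls &lt;code&gt;hashPassword&lt;/code&gt; and an unrelated &lt;code&gt;renderFooter&lt;/code&gt; calls &lt;code&gt;formatDate&lt;/code&gt;, seeding on &lt;code&gt;validateLogin&lt;/code&gt; leaves the footer code with a rank of zero, which is exactly the filtering effect the pipeline relies on.&lt;/p&gt;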

&lt;h2&gt;
  
  
  Show me the numbers
&lt;/h2&gt;

&lt;p&gt;I built an evaluation suite into the repo itself. 28 hand-labeled tasks across 3 real codebases at a 4,000-token budget:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;strategy&lt;/th&gt;
&lt;th&gt;precision&lt;/th&gt;
&lt;th&gt;recall&lt;/th&gt;
&lt;th&gt;F1&lt;/th&gt;
&lt;th&gt;token-efficiency&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;mincut&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.27&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.83&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.39&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.270&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;mincut + &lt;code&gt;--embed&lt;/code&gt; (semantic)&lt;/td&gt;
&lt;td&gt;0.27&lt;/td&gt;
&lt;td&gt;0.83&lt;/td&gt;
&lt;td&gt;0.39&lt;/td&gt;
&lt;td&gt;0.270&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;grep keyword baseline&lt;/td&gt;
&lt;td&gt;0.11&lt;/td&gt;
&lt;td&gt;0.42&lt;/td&gt;
&lt;td&gt;0.16&lt;/td&gt;
&lt;td&gt;0.105&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;random selection (control)&lt;/td&gt;
&lt;td&gt;0.01&lt;/td&gt;
&lt;td&gt;0.04&lt;/td&gt;
&lt;td&gt;0.01&lt;/td&gt;
&lt;td&gt;0.009&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Per-repo breakdown:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;repo&lt;/th&gt;
&lt;th&gt;tasks&lt;/th&gt;
&lt;th&gt;mincut recall&lt;/th&gt;
&lt;th&gt;grep recall&lt;/th&gt;
&lt;th&gt;mincut F1&lt;/th&gt;
&lt;th&gt;grep F1&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;mincut-context (self)&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.97&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.56&lt;/td&gt;
&lt;td&gt;0.44&lt;/td&gt;
&lt;td&gt;0.30&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;FluentForm (PHP+Vue+JS)&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.88&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.13&lt;/td&gt;
&lt;td&gt;0.43&lt;/td&gt;
&lt;td&gt;0.04&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Fluent Player (TS/JSX)&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0.63&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;0.56&lt;/td&gt;
&lt;td&gt;0.31&lt;/td&gt;
&lt;td&gt;0.13&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;mincut catches roughly twice as many of the correct files as grep (recall 0.83 vs 0.42), at about 2.5× the token efficiency (0.270 vs 0.105).&lt;/strong&gt; Reproducible with &lt;code&gt;npm run eval&lt;/code&gt;. Add your own labeled tasks under &lt;code&gt;eval/fixtures/&lt;/code&gt; to score against your own codebase.&lt;/p&gt;

&lt;h2&gt;
  
  
  The math, briefly
&lt;/h2&gt;

&lt;p&gt;Given a symbol graph $G = (V, E, w)$ where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;$V$ are code units (functions, classes, methods)&lt;/li&gt;
&lt;li&gt;$E$ are dependency edges (imports, calls, references)&lt;/li&gt;
&lt;li&gt;$w(v)$ is the token cost of including symbol $v$&lt;/li&gt;
&lt;li&gt;$B$ is your token budget&lt;/li&gt;
&lt;li&gt;$S \subseteq V$ are seed nodes derived from the task&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Find $T \supseteq S$ with $\sum_{v \in T} w(v) \le B$ minimizing the &lt;strong&gt;boundary cut cost&lt;/strong&gt;:&lt;/p&gt;

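&lt;p&gt;$$\text{cut}(T, V \setminus T) = \sum_{e \in E,\ e \text{ crossing}} w(e)$$&lt;/p&gt;

&lt;p&gt;Here an edge is crossing when exactly one of its endpoints lies in $T$, and $w(e)$ is the weight of that dependency edge, which is separate from the per-symbol token cost $w(v)$.&lt;/p&gt;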

&lt;p&gt;In plain English: pick a connected, low-token region that has few "loose ends" pointing outside it. The inside of the cut is what the agent needs; the outside is safely ignorable.&lt;/p&gt;

&lt;p&gt;The objective is submodular, so a greedy algorithm gives a $(1 - 1/e) \approx 0.63$ approximation guarantee. The full pseudocode is in the README; the implementation is ~200 lines in &lt;a href="https://github.com/dhrupo/mincut-context/blob/main/src/core/select.ts" rel="noopener noreferrer"&gt;&lt;code&gt;src/core/select.ts&lt;/code&gt;&lt;/a&gt;.&lt;/p&gt;
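&lt;p&gt;For intuition only, here is one way a greedy pass over that objective could look. This Python sketch is not the code in &lt;code&gt;src/core/select.ts&lt;/code&gt; and makes no claim to the approximation guarantee; the data layout (a dict of edge weights keyed by node pairs) and the tie-breaking rule are assumptions.&lt;/p&gt;

```python
from operator import le


def cut_weight(selected, edges):
    """Sum the weights of edges with exactly one endpoint inside `selected`."""
    return sum(w for (u, v), w in edges.items() if (u in selected) != (v in selected))


def greedy_select(seeds, token_cost, edges, budget):
    """Grow T from the seeds, always adding the affordable neighbor that
    leaves the smallest boundary cut, until nothing else fits the budget."""
    selected = set(seeds)
    used = sum(token_cost[v] for v in selected)
    while True:
        # Frontier: nodes outside T that share an edge with a node inside T.
        frontier = set()
        for (u, v) in edges:
            if (u in selected) != (v in selected):
                frontier.add(v if u in selected else u)
        # Keep only candidates whose token cost still fits the budget.
        affordable = [v for v in frontier if le(used + token_cost[v], budget)]
        if not affordable:
            return selected
        # Sorting first makes ties resolve deterministically.
        best = min(sorted(affordable), key=lambda v: cut_weight(selected | {v}, edges))
        selected.add(best)
        used += token_cost[best]
```

&lt;p&gt;Each step pulls in the neighbor that ties up the most loose ends, so tightly coupled helpers get absorbed before loosely attached ones, and an oversized symbol is left outside if it would exceed $B$.&lt;/p&gt;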

&lt;h2&gt;
  
  
  Three ways to use it
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. As an MCP server — recommended for agents
&lt;/h3&gt;

&lt;p&gt;Drop this block into your Claude Code / Codex / Cursor settings:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"mcpServers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"mincut-context"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"npx"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"args"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"-y"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mincut-context"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"mcp"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your agent now has six new tools: &lt;code&gt;pack_context&lt;/code&gt;, &lt;code&gt;expand_node&lt;/code&gt;, &lt;code&gt;find_callers&lt;/code&gt;, &lt;code&gt;find_callees&lt;/code&gt;, &lt;code&gt;search_symbols&lt;/code&gt;, &lt;code&gt;explain_selection&lt;/code&gt;. The follow-up tools operate on the graph cached by the most recent &lt;code&gt;pack_context&lt;/code&gt; call, so traversal after the first pack is effectively free.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. As a CLI
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; mincut-context

mcx pack &lt;span class="s2"&gt;"fix the login validation bug"&lt;/span&gt; &lt;span class="nt"&gt;--budget&lt;/span&gt; 4000             &lt;span class="c"&gt;# plain output&lt;/span&gt;
mcx pack &lt;span class="s2"&gt;"..."&lt;/span&gt; &lt;span class="nt"&gt;--format&lt;/span&gt; tree                                       &lt;span class="c"&gt;# directory-grouped&lt;/span&gt;
mcx pack &lt;span class="s2"&gt;"..."&lt;/span&gt; &lt;span class="nt"&gt;--format&lt;/span&gt; json | jq                                  &lt;span class="c"&gt;# pipe to anything&lt;/span&gt;
mcx pack &lt;span class="s2"&gt;"..."&lt;/span&gt; &lt;span class="nt"&gt;--interactive&lt;/span&gt;                                       &lt;span class="c"&gt;# Ink TUI: vim keys + preview&lt;/span&gt;
mcx pack &lt;span class="s2"&gt;"..."&lt;/span&gt; &lt;span class="nt"&gt;--embed&lt;/span&gt;                                             &lt;span class="c"&gt;# semantic seeding&lt;/span&gt;
mcx pack &lt;span class="s2"&gt;"..."&lt;/span&gt; &lt;span class="nt"&gt;--cache&lt;/span&gt;                                             &lt;span class="c"&gt;# 5× warm-run speedup&lt;/span&gt;
mcx watch &lt;span class="s2"&gt;"..."&lt;/span&gt; &lt;span class="nt"&gt;--debounce&lt;/span&gt; 300                                     &lt;span class="c"&gt;# re-pack on file change&lt;/span&gt;
mcx doctor                                                         &lt;span class="c"&gt;# environment self-check&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;mcx doctor&lt;/code&gt; is my favorite — it tells you in 6 lines what's installed and what isn't:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzw1c7rjfwuh340i0s9dx.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzw1c7rjfwuh340i0s9dx.gif" alt="doctor" width="799" height="276"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3. As a library
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;pack&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;mincut-context&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;pack&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;task&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;fix the login validation bug&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;repo&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;process&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;cwd&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
  &lt;span class="na"&gt;budget&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;cache&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;parallel&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;chunk&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="na"&gt;enabled&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;maxTokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;400&lt;/span&gt; &lt;span class="p"&gt;},&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;

&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;f&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;files&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;path&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;score&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toFixed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;tokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;·&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;f&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;reasons&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;// → src/auth/login.ts        0.541  612 · seed — matched directly by task&lt;/span&gt;
&lt;span class="c1"&gt;// → src/auth/session.ts      0.408  483 · attached (60%)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What I learned by building this
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Embeddings are oversold for this problem
&lt;/h3&gt;

&lt;p&gt;Adding semantic embeddings (&lt;code&gt;--embed&lt;/code&gt; flag, via &lt;code&gt;@xenova/transformers&lt;/code&gt; running locally) &lt;strong&gt;did not improve recall on any of my three eval task sets.&lt;/strong&gt; Why? Because the labels were named honestly. When you label "stripe payment processor" → &lt;code&gt;StripeProcessor.php&lt;/code&gt;, the keyword match catches it without help. Embeddings only earn their keep when your task vocabulary diverges from the code's — "centrality and ranking" → &lt;code&gt;PageRank&lt;/code&gt;, that kind of gap.&lt;/p&gt;

&lt;p&gt;I left &lt;code&gt;--embed&lt;/code&gt; in because it doesn't hurt, and there are real users whose mental model doesn't match the code. But the marketing-friendly "AI-powered" framing for this stuff is mostly noise.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Greedy beats CELF for this objective
&lt;/h3&gt;

&lt;p&gt;I implemented CELF (Cost-Effective Lazy Forward, Leskovec et al., 2007) hoping for a free speedup over naive greedy. Its output diverged from plain greedy: not just slower (8× slower on FluentForm) but &lt;strong&gt;wrong&lt;/strong&gt;, producing smaller, structurally weaker selections.&lt;/p&gt;

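&lt;p&gt;Why: CELF is only valid when a candidate's marginal gain never increases as the selection grows, which is what lets it reuse stale gains as upper bounds. Our "no isolated nodes" acceptance rule (a candidate must have at least one edge into the current selection T) breaks that assumption. A candidate's eligibility flips discontinuously the moment one of its neighbors joins T, so a gain cached while the candidate was ineligible is no longer an upper bound, and the lazy cache becomes unreliable.&lt;/p&gt;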
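&lt;p&gt;A three-node toy graph shows the flip. This is a sketch under assumptions, not the project's scoring code: the graph, the relevance values, and the choice to score an ineligible candidate as zero are all made up for illustration.&lt;/p&gt;

```typescript
// Hypothetical toy graph: a - b - c, with `a` as the seed.
const edges: { [id: string]: string[] } = { a: ["b"], b: ["a", "c"], c: ["b"] };
const relevance: { [id: string]: number } = { a: 1.0, b: 0.2, c: 0.9 };

// Marginal gain under the acceptance rule: a candidate with no edge into
// the selection is not eligible, so this sketch scores it as zero.
function marginalGain(v: string, selection: string[]): number {
  const attached = edges[v].some((u) => selection.indexOf(u) >= 0);
  return attached ? relevance[v] : 0;
}

const before = marginalGain("c", ["a"]);      // c has no edge into {a}
const after = marginalGain("c", ["a", "b"]);  // b now bridges a and c
console.log(before, after); // → 0 0.9
```

&lt;p&gt;The gain of &lt;code&gt;c&lt;/code&gt; grew after the selection grew. A lazy evaluator that cached the earlier value as an upper bound would rank &lt;code&gt;c&lt;/code&gt; too low and might never re-evaluate it.&lt;/p&gt;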

&lt;p&gt;I wrote the dead end up in &lt;a href="https://github.com/dhrupo/mincut-context/blob/main/eval/ALGORITHM-RESEARCH.md" rel="noopener noreferrer"&gt;&lt;code&gt;eval/ALGORITHM-RESEARCH.md&lt;/code&gt;&lt;/a&gt; so nobody re-treads it. &lt;strong&gt;Honest negative results are worth shipping.&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Sub-symbol chunking matters more than I expected
&lt;/h3&gt;

&lt;p&gt;Big legacy codebases have huge functions. A 500-line function is one symbol in the graph, and if it gets selected, the whole thing eats your budget. So &lt;code&gt;--chunk&lt;/code&gt; splits big functions at statement boundaries — each chunk becomes its own sub-symbol, individually selectable.&lt;/p&gt;

&lt;p&gt;On FluentForm: indexing without chunking → 4,333 symbols. With &lt;code&gt;--chunk&lt;/code&gt; → 4,878 symbols (+545 chunks). Same budget, much finer-grained selection. The greedy can pick &lt;em&gt;just the relevant &lt;code&gt;if/for/try&lt;/code&gt; block&lt;/em&gt; instead of all-or-nothing.&lt;/p&gt;
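&lt;p&gt;The splitting step can be sketched as a greedy pack over statements. This is an assumption about the general shape, not the actual chunker: the &lt;code&gt;Statement&lt;/code&gt; and &lt;code&gt;Chunk&lt;/code&gt; types and the &lt;code&gt;chunkStatements&lt;/code&gt; helper are hypothetical, and the sketch takes per-statement token counts as given rather than parsing anything.&lt;/p&gt;

```typescript
// Hypothetical input: one entry per top-level statement of a function body.
type Statement = { startLine: number; endLine: number; tokens: number };
type Chunk = { startLine: number; endLine: number; tokens: number };

// Pack consecutive statements into chunks of at most maxTokens.
// A single statement larger than maxTokens becomes its own oversized chunk,
// because this sketch never splits inside a statement.
function chunkStatements(statements: Statement[], maxTokens: number): Chunk[] {
  const chunks: Chunk[] = [];
  let current: Chunk | null = null;
  for (const s of statements) {
    if (current !== null) {
      if (current.tokens + s.tokens > maxTokens) {
        chunks.push(current); // close the chunk at this statement boundary
        current = null;
      }
    }
    if (current === null) {
      current = { startLine: s.startLine, endLine: s.endLine, tokens: s.tokens };
    } else {
      current.endLine = s.endLine;
      current.tokens += s.tokens;
    }
  }
  if (current !== null) chunks.push(current);
  return chunks;
}

const parts = chunkStatements(
  [
    { startLine: 1, endLine: 5, tokens: 150 },
    { startLine: 6, endLine: 10, tokens: 200 },
    { startLine: 11, endLine: 12, tokens: 100 },
    { startLine: 13, endLine: 20, tokens: 300 },
    { startLine: 21, endLine: 40, tokens: 500 },
  ],
  400,
);
console.log(parts.map((c) => c.tokens).join(", ")); // → 350, 400, 500
```

&lt;p&gt;Each resulting chunk would then be indexed as its own sub-symbol, so the selector can take lines 11 to 20 without paying for the rest of the function.&lt;/p&gt;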

&lt;h3&gt;
  
  
  4. Test coverage of 88% isn't the whole story
&lt;/h3&gt;

&lt;p&gt;The CI gates on 85% statements / 80% branches / 90% functions / 85% lines. But the genuinely untestable files (worker scripts, lazy-loaded LSP clients) are excluded from the calculation. Honest reporting means saying &lt;em&gt;what&lt;/em&gt; is tested, not just the headline number.&lt;/p&gt;

&lt;h2&gt;
  
  
  The honest tradeoffs
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Honest tradeoff&lt;/th&gt;
&lt;th&gt;What we do&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;The exact optimum of the budget-constrained cut is NP-hard&lt;/td&gt;
&lt;td&gt;Greedy submodular — &lt;code&gt;(1−1/e)&lt;/code&gt; bound&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Tree-sitter symbols are syntactic, not type-aware&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;--lsp&lt;/code&gt; refines TS/JS via typescript-language-server&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Embedding model is a ~22 MB download on first run&lt;/td&gt;
&lt;td&gt;Opt-in behind &lt;code&gt;--embed&lt;/code&gt; flag&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;LSP startup is slow (~1–5s)&lt;/td&gt;
&lt;td&gt;Opt-in; cached after init&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cold start parses whole repo&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;--cache&lt;/code&gt; (5× speedup) + &lt;code&gt;--parallel n&lt;/code&gt; (2.7× speedup)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h2&gt;
  
  
  What I'd build next if you asked
&lt;/h2&gt;

&lt;p&gt;The roadmap that's &lt;em&gt;not&lt;/em&gt; checked off yet:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Pyright / Intelephense LSP adapters&lt;/strong&gt; — type-aware calls for Python and PHP (~1–2 days each on the existing LSP infrastructure)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Svelte / Rust / Go parsers&lt;/strong&gt; — one file each on the parser template&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Incremental neighborhood caching in the greedy&lt;/strong&gt; — keep &lt;code&gt;attach(v, T)&lt;/code&gt; cached and update only when a node with an edge to v is added. Expected 3–5× speedup on graphs with bounded degree.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each is bounded effort and additive. The core is done.&lt;/p&gt;
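&lt;p&gt;The incremental caching item from the roadmap could look roughly like this. It is a sketch of the idea only: the definition of &lt;code&gt;attach(v, T)&lt;/code&gt; used here (the fraction of v's edges that land inside T) is an assumption, and &lt;code&gt;createAttachCache&lt;/code&gt; is a hypothetical helper, not part of the package.&lt;/p&gt;

```typescript
// Hypothetical adjacency list; undirected, so each edge appears on both ends.
type Adjacency = { [id: string]: string[] };

function createAttachCache(graph: Adjacency) {
  const inside: { [id: string]: number } = {};   // edges from v into T
  const selected: { [id: string]: boolean } = {}; // membership of T
  for (const id of Object.keys(graph)) inside[id] = 0;

  return {
    // Adding u only touches u's neighbours, so the update costs
    // O(degree(u)) instead of rescanning every candidate.
    add(u: string): void {
      if (selected[u] === true) return;
      selected[u] = true;
      for (const v of graph[u]) inside[v] += 1;
    },
    // Assumed definition: fraction of v's edges that land inside T.
    attach(v: string): number {
      const degree = graph[v].length;
      return degree === 0 ? 0 : inside[v] / degree;
    },
  };
}

const graph: Adjacency = { a: ["b", "c"], b: ["a", "c"], c: ["a", "b", "d"], d: ["c"] };
const cache = createAttachCache(graph);
cache.add("a");
console.log(cache.attach("b")); // → 0.5
cache.add("b");
console.log(cache.attach("d")); // → 0
```

&lt;p&gt;Because only the neighbours of the newly added node are touched, the saving depends on node degree, which fits the bounded-degree caveat in the roadmap item.&lt;/p&gt;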

&lt;h2&gt;
  
  
  Stop building, start using
&lt;/h2&gt;

&lt;p&gt;The hardest lesson: &lt;strong&gt;a tool's value comes from someone actually using it on real work, not from feature count.&lt;/strong&gt; mincut-context is at v1.7.0 — 261 tests, 88.6% coverage, CI green on Ubuntu + macOS × Node 18/20/22. There's no honest "but it's not ready" excuse left.&lt;/p&gt;

&lt;p&gt;If you've watched an AI agent burn 80% of its 200k-token context on imports it doesn't care about, install it now and tell me what breaks:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; mincut-context
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;🔗 &lt;strong&gt;GitHub:&lt;/strong&gt; &lt;a href="https://github.com/dhrupo/mincut-context" rel="noopener noreferrer"&gt;github.com/dhrupo/mincut-context&lt;/a&gt;&lt;br&gt;
📦 &lt;strong&gt;npm:&lt;/strong&gt; &lt;a href="https://www.npmjs.com/package/mincut-context" rel="noopener noreferrer"&gt;npmjs.com/package/mincut-context&lt;/a&gt;&lt;br&gt;
📊 &lt;strong&gt;Reproducible benchmarks:&lt;/strong&gt; &lt;a href="https://github.com/dhrupo/mincut-context/blob/main/eval/CROSS-REPO-RESULTS.md" rel="noopener noreferrer"&gt;&lt;code&gt;eval/CROSS-REPO-RESULTS.md&lt;/code&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I'd love feedback — especially "your numbers don't replicate on my codebase" feedback. That's literally what the eval suite is for.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you got value from this, ⭐ the repo or drop a comment about a tooling problem you're solving. mincut-context is open-source MIT; the eval suite welcomes new fixtures.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>llm</category>
      <category>typescript</category>
    </item>
  </channel>
</rss>
