<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Thi An</title>
    <description>The latest articles on DEV Community by Thi An (@anlor1002alt).</description>
    <link>https://dev.to/anlor1002alt</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3978730%2F2196a27d-cf7a-46fe-908a-5528ffecdddd.png</url>
      <title>DEV Community: Thi An</title>
      <link>https://dev.to/anlor1002alt</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/anlor1002alt"/>
    <language>en</language>
    <item>
      <title>Your AI coding agent keeps re-trying the fix that already failed. I built a zero-dependency cure.</title>
      <dc:creator>Thi An</dc:creator>
      <pubDate>Thu, 11 Jun 2026 05:22:43 +0000</pubDate>
      <link>https://dev.to/anlor1002alt/your-ai-coding-agent-keeps-re-trying-the-fix-that-already-failed-i-built-a-zero-dependency-cure-4k7e</link>
      <guid>https://dev.to/anlor1002alt/your-ai-coding-agent-keeps-re-trying-the-fix-that-already-failed-i-built-a-zero-dependency-cure-4k7e</guid>
      <description>&lt;p&gt;There's a specific loop that anyone who codes with AI agents knows by heart.&lt;/p&gt;

&lt;p&gt;The agent tries a fix. The test fails. It tries something else. Fails again. Then — a few prompts later, or after its context window gets compacted — it confidently re-applies the &lt;strong&gt;exact first fix that already failed&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;The failure got erased from its memory. The confidence didn't.&lt;/p&gt;

&lt;p&gt;I watched Claude Code do this four times in one session, each time presenting the same broken patch as a fresh idea. Every loop burned tokens, time, and a little bit of my soul. So I built &lt;a href="https://github.com/anlor1002-alt/regressionledger" rel="noopener noreferrer"&gt;RegressionLedger&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The core idea: verdicts should outlive the context window
&lt;/h2&gt;

&lt;p&gt;An agent's context is ephemeral — compaction, new sessions, &lt;code&gt;/clear&lt;/code&gt; all wipe it. But &lt;em&gt;what was tried and how it went&lt;/em&gt; is exactly the knowledge that shouldn't die with the context.&lt;/p&gt;

&lt;p&gt;RegressionLedger is a Claude Code hook + CLI that:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Fingerprints every edit&lt;/strong&gt; the agent makes — a normalized token stream where whitespace, comments, and string/number literals are abstracted away (but &lt;code&gt;true&lt;/code&gt; ≠ &lt;code&gt;false&lt;/code&gt;; opposite behavior must never collide)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Links each edit to the next test/build outcome&lt;/strong&gt; — it recognizes 12+ toolchains (jest, vitest, pytest, go, cargo, tsc, eslint, gradle…) and parses pass/fail plus the error signature&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Persists everything to a local JSON ledger&lt;/strong&gt; that survives restarts and compaction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hard-blocks re-applying a fix that already failed&lt;/strong&gt; — a PreToolUse deny with the original error attached:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;RegressionLedger: you already tried the same fix to src/auth.js 2 hours ago.
It failed with: AssertionError: expected 200, got 401
Re-applying it will reproduce the same failure. Change strategy instead.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The key design decision: &lt;strong&gt;enforced, not advisory&lt;/strong&gt;. Memory systems that merely &lt;em&gt;suggest&lt;/em&gt; get ignored exactly when the agent is most confident — which is exactly when it's about to repeat the mistake. The block is at the tool layer; the agent can't argue its way past it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The parts I haven't seen anywhere else
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;The agent wakes up knowing its dead ends.&lt;/strong&gt; A SessionStart hook injects a compact "what already failed here" briefing every time a session starts — &lt;em&gt;including right after compaction wipes the agent's memory&lt;/em&gt;. Failures get blocked before they're re-conceived, not just re-applied.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Renaming variables doesn't help.&lt;/strong&gt; A second, structure-only fingerprint (identifiers erased, shape kept) catches "the same fix, renamed" — it never blocks on that weaker signal, but it annotates: &lt;em&gt;this may be the same fix, paraphrased&lt;/em&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Thrash detection.&lt;/strong&gt; Blocking identical fixes catches one doom loop. The sneakier one: &lt;em&gt;different&lt;/em&gt; fixes all dying on the same error. At 3 distinct failed approaches on one error signature, the hook escalates: "The patches differ; the error doesn't. The diagnosis is wrong, not the patches. State 2–3 root-cause hypotheses before editing again."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Herd immunity.&lt;/strong&gt; &lt;code&gt;rl export&lt;/code&gt; / &lt;code&gt;rl import&lt;/code&gt; — your agent inherits the dead ends a teammate's agent already paid for, attributed and auditable.&lt;/p&gt;

&lt;h2&gt;
  
  
  Trust, but verify — literally
&lt;/h2&gt;

&lt;p&gt;A guardrail that blocks wrongly gets uninstalled within the hour. So:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Reproducible benchmark&lt;/strong&gt;: &lt;code&gt;npm run bench&lt;/code&gt; runs a deterministic 310-case corpus — 120/120 cosmetically-disguised repeat fixes caught, 0/190 false blocks on genuinely different fixes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;rl doctor&lt;/code&gt;&lt;/strong&gt; proves the install works end-to-end: it fires synthetic events through the real hook process — a first-time edit must pass, a seeded repeat failure must be denied&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;warn&lt;/code&gt; mode + hit auditing&lt;/strong&gt;: run it advisory-only for a week, then check &lt;code&gt;rl stats&lt;/code&gt; to see exactly what it &lt;em&gt;would&lt;/em&gt; have blocked on your codebase before enabling hard-block&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;rl unblock &amp;lt;file&amp;gt;&lt;/code&gt;&lt;/strong&gt; for when the context genuinely changed (postgres → aurora) — deliberately a &lt;em&gt;human&lt;/em&gt; decision, because an agent claiming "it's different this time" is exactly the confidence loop this tool exists to break&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Honest limitation: the fingerprint is a lexer, not an AST. A heavily restructured-but-equivalent patch can slip under the similarity threshold (it then earns its own ledger entry when the test fails again). Opt-in tree-sitter mode is on the roadmap — but a lexer that ships beats a tree-sitter that's roadmapped.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# inside your project&lt;/span&gt;
npx regressionledger init
npx regressionledger doctor   &lt;span class="c"&gt;# proves the guardrail works end-to-end&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Or as a Claude Code plugin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="go"&gt;/plugin marketplace add anlor1002-alt/regressionledger
/plugin install regressionledger@anlor1002-plugins
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Zero dependencies. Fully offline — no network calls, no API keys, nothing leaves your machine. MIT.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Repo (with a 30-second demo GIF):&lt;/strong&gt; &lt;a href="https://github.com/anlor1002-alt/regressionledger" rel="noopener noreferrer"&gt;https://github.com/anlor1002-alt/regressionledger&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you live in the doom loop too, I'd love feedback — especially on the false-positive/false-negative balance. The best features so far were all requested by people who tried to poke holes in it.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>opensource</category>
      <category>node</category>
      <category>productivity</category>
    </item>
  </channel>
</rss>
