<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: HadiFrt20</title>
    <description>The latest articles on DEV Community by HadiFrt20 (@hadifrt20).</description>
    <link>https://dev.to/hadifrt20</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3844398%2F8919ca6a-b041-4ce5-a038-ee2aa0f1eba2.png</url>
      <title>DEV Community: HadiFrt20</title>
      <link>https://dev.to/hadifrt20</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hadifrt20"/>
    <language>en</language>
    <item>
      <title>I lost 3 hours of work to Claude Code, so I built an undo button for AI-assisted coding</title>
      <dc:creator>HadiFrt20</dc:creator>
      <pubDate>Fri, 27 Mar 2026 09:59:16 +0000</pubDate>
      <link>https://dev.to/hadifrt20/i-lost-3-hours-of-work-to-claude-code-so-i-built-an-undo-button-for-ai-assisted-coding-511c</link>
      <guid>https://dev.to/hadifrt20/i-lost-3-hours-of-work-to-claude-code-so-i-built-an-undo-button-for-ai-assisted-coding-511c</guid>
      <description>&lt;h2&gt;
  
  
  I lost 3 hours of work because Claude Code refactored my auth module into oblivion
&lt;/h2&gt;

&lt;p&gt;I was in the zone. Claude Code was crushing it — added OAuth, hooked up the database, wired the routes. Then I said: "refactor auth.ts to use middleware instead of inline checks."&lt;/p&gt;

&lt;p&gt;Fifteen files changed. TypeScript errors everywhere. The app wouldn't build. And I realized I hadn't committed in over an hour.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;git diff&lt;/code&gt; showed me 400 lines of changes across 15 files. I had no idea which version of auth.ts actually worked. I spent 3 hours manually reconstructing the last working state.&lt;/p&gt;

&lt;p&gt;That was the moment I built snaprevert.&lt;/p&gt;

&lt;h2&gt;
  
  
  The problem nobody talks about
&lt;/h2&gt;

&lt;p&gt;Every AI coding tool — Claude Code, Cursor, Copilot, Aider — shares the same fundamental issue: &lt;strong&gt;there's no undo between prompts.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A single prompt can touch 5-20 files. You review, prompt again, review, prompt again. You're in flow state. Nobody stops to &lt;code&gt;git commit -m "checkpoint before risky refactor"&lt;/code&gt; between prompts. By the time something breaks, you're 5-10 prompts deep with no checkpoint.&lt;/p&gt;

&lt;p&gt;Git requires intent. But when you're pair-programming with an AI at 100mph, intent is the first thing that goes.&lt;/p&gt;

&lt;h2&gt;
  
  
  snaprevert: the undo button for AI coding
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npx snaprevert watch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the entire setup. One command. Zero config. It silently snapshots your project every time files change. When the AI breaks something:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;snaprevert list        &lt;span class="c"&gt;# see all snapshots with timestamps&lt;/span&gt;
snaprevert diff 5      &lt;span class="c"&gt;# see exactly what changed in snapshot #5&lt;/span&gt;
snaprevert back 3      &lt;span class="c"&gt;# roll back to before snapshot #3&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Your project is restored in under 1 second.&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works (it's dumber than you think)
&lt;/h2&gt;

&lt;p&gt;No git. No branches. No staging area. It's filesystem-level:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Watch&lt;/strong&gt; — chokidar monitors your project for file changes&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Debounce&lt;/strong&gt; — waits 3 seconds for changes to settle (groups a single AI prompt's changes)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Diff&lt;/strong&gt; — computes unified diffs for modified files, stores full content for new files&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Store&lt;/strong&gt; — saves to &lt;code&gt;.snaprevert/snapshots/{timestamp}-{id}/&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;That's it. Snapshots are diffs, not full copies, so a full day of heavy AI coding typically uses less than 10 MB.&lt;/p&gt;
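
&lt;p&gt;To make the Diff and Store steps concrete, here is a minimal Python sketch. It is not snaprevert's actual code (the tool is a Node CLI). The function name &lt;code&gt;build_snapshot&lt;/code&gt;, the dictionary layout, and the handling of deleted files are all assumptions made for illustration.&lt;/p&gt;

```python
import difflib


def build_snapshot(before: dict[str, str], after: dict[str, str]) -> dict[str, dict[str, str]]:
    """Return one entry per changed path (hypothetical snapshot layout)."""
    snapshot = {}
    for path in sorted(set(before) | set(after)):
        old, new = before.get(path), after.get(path)
        if old == new:
            continue  # unchanged files are not stored at all
        if old is None:
            # New file: keep the full content, as the post describes.
            snapshot[path] = {"kind": "added", "content": new}
        elif new is None:
            # Assumption: keep the old content so a rollback can restore it.
            snapshot[path] = {"kind": "deleted", "content": old}
        else:
            # Modified file: store only a unified diff.
            diff = difflib.unified_diff(
                old.splitlines(keepends=True),
                new.splitlines(keepends=True),
                fromfile=path,
                tofile=path,
            )
            snapshot[path] = {"kind": "modified", "diff": "".join(diff)}
    return snapshot
```

&lt;p&gt;Storing a diff for modified files and full content only for new ones is what keeps the snapshot directory small when an AI tool rewrites a few lines in many files.&lt;/p&gt;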

&lt;p&gt;Rollbacks are non-destructive — rolled-back snapshots are preserved. You can re-apply any of them with &lt;code&gt;snaprevert restore&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The features I didn't expect to need
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Per-file selective rollback&lt;/strong&gt; — Claude broke &lt;code&gt;auth.ts&lt;/code&gt; but &lt;code&gt;user.ts&lt;/code&gt; is fine? Only undo what's broken:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;snaprevert back 3 &lt;span class="nt"&gt;--only&lt;/span&gt; auth.ts,routes.ts
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Interactive review&lt;/strong&gt; — Walk through each file change before committing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;snaprevert review 5
&lt;span class="c"&gt;# For each file: [a]ccept [r]eject [s]kip [v]iew diff&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;AI tool detection&lt;/strong&gt; — Snapshots auto-detect which AI tool made the changes. You see &lt;code&gt;claude: modified auth.ts&lt;/code&gt; or &lt;code&gt;cursor: added 3 files&lt;/code&gt; in the labels.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Snapshot branching&lt;/strong&gt; — Try two different AI approaches from the same checkpoint:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;snaprevert fork 3 &lt;span class="nt"&gt;--name&lt;/span&gt; &lt;span class="s2"&gt;"approach-a"&lt;/span&gt;
&lt;span class="c"&gt;# ... try one approach ...&lt;/span&gt;
snaprevert fork &lt;span class="nt"&gt;--switch&lt;/span&gt; main
&lt;span class="c"&gt;# ... try another approach ...&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;MCP server&lt;/strong&gt; — AI agents can create named checkpoints programmatically:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;snaprevert mcp  &lt;span class="c"&gt;# starts JSON-RPC server&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Compatible with Claude Code and any MCP client. The AI itself can checkpoint before risky operations.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why not just use git?
&lt;/h2&gt;

&lt;p&gt;I get this question every time. Here's the honest answer:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Git&lt;/th&gt;
&lt;th&gt;snaprevert&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;When it saves&lt;/td&gt;
&lt;td&gt;When you remember&lt;/td&gt;
&lt;td&gt;Automatically&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Granularity&lt;/td&gt;
&lt;td&gt;Whatever you staged&lt;/td&gt;
&lt;td&gt;Every AI prompt&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cognitive cost&lt;/td&gt;
&lt;td&gt;Decide what + write message&lt;/td&gt;
&lt;td&gt;Zero&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Rollback&lt;/td&gt;
&lt;td&gt;git reflog, reset, stash...&lt;/td&gt;
&lt;td&gt;&lt;code&gt;snaprevert back 3&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;They're complementary, not competing. Git is for meaningful, curated history you push to a team. snaprevert is the continuous autosave between commits — like how Google Docs saves every keystroke but you still "publish" versions.&lt;/p&gt;

&lt;h2&gt;
  
  
  The stack
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;3 dependencies&lt;/strong&gt;: commander, chalk, chokidar&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;221 tests&lt;/strong&gt; across unit, integration, and UAT&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero config&lt;/strong&gt; — works with any project, any AI tool&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;&amp;lt;100ms&lt;/strong&gt; snapshot creation, &lt;strong&gt;&amp;lt;1s&lt;/strong&gt; rollback&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It watches your filesystem, not your AI tool. Works with Claude Code, Cursor, Copilot, Aider, Windsurf, or anything that writes files.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; snaprevert
snaprevert watch
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then use your AI tool normally. When things break: &lt;code&gt;snaprevert list&lt;/code&gt; then &lt;code&gt;snaprevert back 3&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The repo is at &lt;a href="https://github.com/HadiFrt20/snaprevert" rel="noopener noreferrer"&gt;github.com/HadiFrt20/snaprevert&lt;/a&gt;. MIT licensed, 221 tests, actively maintained.&lt;/p&gt;

&lt;p&gt;If you've ever lost work to an AI coding tool, you know why this exists.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If this helps you, a star on the repo means a lot. And if you have feature ideas, issues are open.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>cli</category>
      <category>opensource</category>
    </item>
    <item>
      <title>I built ESLint for LLM prompts (and a Claude Code hook that makes Claude lint its own work)</title>
      <dc:creator>HadiFrt20</dc:creator>
      <pubDate>Thu, 26 Mar 2026 09:08:31 +0000</pubDate>
      <link>https://dev.to/hadifrt20/i-built-eslint-for-llm-prompts-and-a-claude-code-hook-that-makes-claude-lint-its-own-work-ipm</link>
      <guid>https://dev.to/hadifrt20/i-built-eslint-for-llm-prompts-and-a-claude-code-hook-that-makes-claude-lint-its-own-work-ipm</guid>
      <description>&lt;h2&gt;
  
  
  I changed one line in my prompt and my agent started giving refunds to everyone
&lt;/h2&gt;

&lt;p&gt;True story. I was tweaking a customer support agent prompt. Changed "Never offer refunds without manager approval" to "Always prioritize customer satisfaction." Seemed harmless. Shipped it.&lt;/p&gt;

&lt;p&gt;Within an hour, the agent was handing out refunds like candy on Halloween. No approval. No verification. Just vibes.&lt;/p&gt;

&lt;p&gt;The worst part? &lt;code&gt;git diff&lt;/code&gt; showed me exactly what changed — one line added, one line removed. What it &lt;em&gt;didn't&lt;/em&gt; tell me was that I'd removed a critical constraint and replaced it with a vague instruction that the model interpreted as "give them whatever they want."&lt;/p&gt;

&lt;p&gt;That was the moment I realized: &lt;strong&gt;prompts are production code, but we treat them like sticky notes.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Prompts have zero tooling (and it's wild)
&lt;/h2&gt;

&lt;p&gt;Think about it. If you write JavaScript, you have ESLint catching issues before they ship. You have Prettier enforcing style. You have TypeScript telling you when things don't make sense. You have &lt;code&gt;git diff&lt;/code&gt; showing you exactly what changed.&lt;/p&gt;

&lt;p&gt;Now think about prompts. You write them in a text file. You eyeball them. You copy-paste them into a playground. You pray.&lt;/p&gt;

&lt;p&gt;Here's what's missing:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;No linter&lt;/strong&gt; catches "You are a teacher" AND "You are a sales agent" in the same prompt&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No diff&lt;/strong&gt; tells you that removing one example drops output consistency&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No CI gate&lt;/strong&gt; blocks a vague "try to be helpful" from shipping&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No score&lt;/strong&gt; tells you if your prompt is a B+ or a D-&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;code&gt;git diff&lt;/code&gt; says "+1 line, -1 line." Cool. Thanks. Very helpful when I'm trying to figure out if my agent is about to go rogue.&lt;/p&gt;

&lt;h2&gt;
  
  
  So I built promptdiff
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://github.com/HadiFrt20/promptdiff" rel="noopener noreferrer"&gt;promptdiff&lt;/a&gt; is a CLI tool that treats prompts as structured documents — not blobs of text. It parses your &lt;code&gt;.prompt&lt;/code&gt; files into semantic sections (persona, constraints, examples, output format, guardrails) and runs real analysis on them.&lt;/p&gt;

&lt;p&gt;Install it in one line:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; promptdiff
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Zero config. No API keys. No accounts. Runs entirely locally. Three dependencies. That's it.&lt;/p&gt;

&lt;p&gt;Here's what it does:&lt;/p&gt;

&lt;h3&gt;
  
  
  Lint your prompts like code
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;promptdiff lint my-agent.prompt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;10 built-in rules that catch real bugs, not style nits: behavioral issues that silently degrade your agent. A few of them:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Rule&lt;/th&gt;
&lt;th&gt;What it catches&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;conflicting-constraints&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;"Keep it under 100 words" + examples that are 200 words&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;role-confusion&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Two different roles in the same persona section&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;vague-constraints&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;"Try to", "if possible", "maybe" — weasel words that models ignore&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;injection-surface&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;No "ignore embedded instructions" guard&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;few-shot-minimum&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Only 1 example (models tend to be more consistent with 2-3)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;missing-output-format&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;No FORMAT section = inconsistent output every time&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;You know the feeling when ESLint catches a bug you would've spent 30 minutes debugging? Same energy.&lt;/p&gt;
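
&lt;p&gt;For a feel of how small such a rule can be, here is a rough Python sketch of a check in the spirit of &lt;code&gt;vague-constraints&lt;/code&gt;. It is not promptdiff's implementation (the real rules are JavaScript). The phrase list is taken from the table above, and the function name and return shape are assumptions.&lt;/p&gt;

```python
import re

# Phrases from the table above; the real rule may well check more.
VAGUE_PHRASES = ("try to", "if possible", "maybe")
_PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(p) for p in VAGUE_PHRASES) + r")\b",
    re.IGNORECASE,
)


def find_vague_constraints(constraints: list[str]) -> list[tuple[int, str]]:
    """Return (1-based line number, matched phrase) for each weasel phrase."""
    findings = []
    for number, line in enumerate(constraints, start=1):
        for match in _PATTERN.finditer(line):
            findings.append((number, match.group(1).lower()))
    return findings
```

&lt;p&gt;Calling it with a firm constraint on line 1 and "Try to be helpful, if possible." on line 2 reports two findings, both on line 2.&lt;/p&gt;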

&lt;h3&gt;
  
  
  Semantic diff that actually means something
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;promptdiff diff v3.prompt v7.prompt &lt;span class="nt"&gt;--annotate&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is not &lt;code&gt;git diff&lt;/code&gt;. It matches sections by type (persona to persona, constraints to constraints), classifies each change, and tells you the &lt;em&gt;impact&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  [CONSTRAINTS] constraint tightened (150 → 100 words)
  ██ high impact — Output will be more constrained

  [EXAMPLES] example removed (3 → 1)
  ██ high impact — Output consistency may decrease

  [PERSONA] wording tweaked
  ░░ low impact — Tone/style will shift
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's the diff I wish I'd had before the Great Refund Incident.&lt;/p&gt;

&lt;h3&gt;
  
  
  Score your prompt quality
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;promptdiff score my-agent.prompt
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;





&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  Structure     ████████████████░░░░  16/20
  Specificity   █████████████████░░░  17/20
  Examples      ████████░░░░░░░░░░░░   8/20
  Safety        ████████████████████  20/20
  Completeness  ████████████████░░░░  16/20
  ─────────────────────────────────────
  Total: 77/100  Grade: B
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Gamify it. Make it a CI gate:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;score&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="si"&gt;$(&lt;/span&gt;promptdiff score my-agent.prompt &lt;span class="nt"&gt;--json&lt;/span&gt; | jq &lt;span class="s1"&gt;'.total'&lt;/span&gt;&lt;span class="si"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;[&lt;/span&gt; &lt;span class="s2"&gt;"&lt;/span&gt;&lt;span class="nv"&gt;$score&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="nt"&gt;-lt&lt;/span&gt; 70 &lt;span class="o"&gt;]&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="k"&gt;then
  &lt;/span&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s2"&gt;"Prompt quality too low: &lt;/span&gt;&lt;span class="nv"&gt;$score&lt;/span&gt;&lt;span class="s2"&gt;/100"&lt;/span&gt;
  &lt;span class="nb"&gt;exit &lt;/span&gt;1
&lt;span class="k"&gt;fi&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The killer feature: Claude Code lints its own work
&lt;/h2&gt;

&lt;p&gt;This is the part that gets people. You can hook promptdiff into &lt;a href="https://docs.anthropic.com/en/docs/claude-code" rel="noopener noreferrer"&gt;Claude Code&lt;/a&gt; so that every time Claude edits a &lt;code&gt;.prompt&lt;/code&gt; file, it automatically gets linted.&lt;/p&gt;

&lt;p&gt;One command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;promptdiff setup &lt;span class="nt"&gt;--project&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Here's the flow:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;You ask Claude to "write me a customer support agent prompt"&lt;/li&gt;
&lt;li&gt;Claude writes it — maybe it puts conflicting roles in the persona, uses vague language in constraints, only includes one example&lt;/li&gt;
&lt;li&gt;The hook fires automatically (PostToolUse on Edit/Write)&lt;/li&gt;
&lt;li&gt;promptdiff finds 3 errors: role confusion, vague constraints, too few examples&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The hook reports a blocking error&lt;/strong&gt; and feeds the lint errors back to Claude (PostToolUse runs after the file is written, so the edit itself is not undone)&lt;/li&gt;
&lt;li&gt;Claude reads the feedback and rewrites the prompt — fixes the roles, tightens the language, adds more examples&lt;/li&gt;
&lt;li&gt;Hook fires again — clean. Passes silently.&lt;/li&gt;
&lt;li&gt;You get a well-structured prompt from a single request, without manually reviewing it&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;It's like giving Claude a pair-programmer that only knows about prompt quality. Claude writes the prompt, the linter reviews it, Claude fixes it. You just watch.&lt;/p&gt;

&lt;p&gt;The setup adds this to your &lt;code&gt;.claude/settings.json&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="nl"&gt;"PostToolUse"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"matcher"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Edit|Write"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="nl"&gt;"hooks"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"type"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"command"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"promptdiff hook"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
        &lt;/span&gt;&lt;span class="nl"&gt;"timeout"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="w"&gt;
      &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
    &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can configure it to be strict (block on warnings too), warn-only (never block), or default (block on errors only).&lt;/p&gt;

&lt;h2&gt;
  
  
  How it works (brief architecture)
&lt;/h2&gt;

&lt;p&gt;The key insight is that prompts aren't flat text — they're structured documents with typed sections. promptdiff's parser breaks a &lt;code&gt;.prompt&lt;/code&gt; file into:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontmatter&lt;/strong&gt; (YAML metadata: name, version, model, tags)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sections&lt;/strong&gt; (PERSONA, CONSTRAINTS, EXAMPLES, OUTPUT FORMAT, GUARDRAILS, etc.)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Every command works on this structured representation:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Diff&lt;/strong&gt; matches sections by type, not by line number. If you move your CONSTRAINTS section from line 5 to line 20, it doesn't show up as "deleted + added" — it shows up as "same section, maybe modified."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lint&lt;/strong&gt; rules get the parsed structure, so &lt;code&gt;conflicting-constraints&lt;/code&gt; can compare the word limit in CONSTRAINTS against the actual word counts in EXAMPLES.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Score&lt;/strong&gt; evaluates five dimensions independently (structure, specificity, examples, safety, completeness) and aggregates them.&lt;/li&gt;
&lt;/ul&gt;
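
&lt;p&gt;The two ideas above (matching sections by type, and letting a lint rule compare one section against another) can be sketched in a few lines of Python. This is an illustration under stated assumptions, not promptdiff's code. It assumes the parser has already produced a mapping from section name to lines, and the function names and the "under N words" pattern are hypothetical.&lt;/p&gt;

```python
import re


def diff_sections(old: dict[str, list[str]], new: dict[str, list[str]]) -> dict[str, str]:
    """Classify each section type; position in the file plays no role."""
    result = {}
    for name in sorted(set(old) | set(new)):
        if name not in old:
            result[name] = "added"
        elif name not in new:
            result[name] = "removed"
        elif old[name] != new[name]:
            result[name] = "modified"
        else:
            result[name] = "unchanged"
    return result


def check_word_limit(constraints: list[str], examples: list[str]) -> list[str]:
    """Flag examples that break an 'under N words' limit found in constraints."""
    limit = None
    for line in constraints:
        match = re.search(r"under (\d+) words", line, re.IGNORECASE)
        if match:
            limit = int(match.group(1))
            break
    if limit is None:
        return []  # no word limit stated, nothing to compare against
    return [
        f"example {index} has {len(text.split())} words, limit is {limit}"
        for index, text in enumerate(examples, start=1)
        if len(text.split()) >= limit
    ]
```

&lt;p&gt;Because sections are looked up by name, moving CONSTRAINTS above PERSONA changes nothing in the result. Only a change in a section's content does.&lt;/p&gt;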

&lt;p&gt;The whole thing is ~30 files, 3 runtime dependencies (&lt;code&gt;commander&lt;/code&gt;, &lt;code&gt;chalk&lt;/code&gt;, &lt;code&gt;js-yaml&lt;/code&gt;), and 217 tests at 94% coverage. No LLM required for any local command — the only thing that calls an API is &lt;code&gt;promptdiff compare&lt;/code&gt; for A/B testing, and even that supports local Ollama models.&lt;/p&gt;

&lt;p&gt;It also supports prompt composition (&lt;code&gt;extends&lt;/code&gt; + &lt;code&gt;includes&lt;/code&gt;), so you can keep your prompts DRY:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;support-agent-v2&lt;/span&gt;
&lt;span class="na"&gt;extends&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;./base-agent.prompt&lt;/span&gt;
&lt;span class="na"&gt;includes&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./shared/safety-rules.prompt&lt;/span&gt;
  &lt;span class="pi"&gt;-&lt;/span&gt; &lt;span class="s"&gt;./shared/format.prompt&lt;/span&gt;
&lt;span class="nn"&gt;---&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Other things I didn't expect to be useful
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;promptdiff migrate&lt;/code&gt;&lt;/strong&gt; — takes a messy unstructured prompt (the kind you pasted into ChatGPT at 2am) and converts it into a structured &lt;code&gt;.prompt&lt;/code&gt; file. It auto-classifies lines: "You are..." goes to PERSONA, "Never..." goes to CONSTRAINTS, etc.&lt;/p&gt;
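
&lt;p&gt;The line classification is easy to picture as a small prefix table. Here is a hedged Python sketch of the idea, not the tool's implementation. The two prefix rules come from the description above, while the &lt;code&gt;UNCLASSIFIED&lt;/code&gt; bucket and the function names are assumptions.&lt;/p&gt;

```python
# Prefix rules: both come from the post; the real tool presumably has more.
RULES = (("you are", "PERSONA"), ("never", "CONSTRAINTS"))


def classify_line(line: str) -> str:
    """Map one raw prompt line to a section name by its opening words."""
    text = line.strip().lower()
    for prefix, section in RULES:
        if text.startswith(prefix):
            return section
    return "UNCLASSIFIED"  # hypothetical bucket for lines no rule matches


def migrate(raw_prompt: str) -> dict[str, list[str]]:
    """Group the non-empty lines of an unstructured prompt by section."""
    sections: dict[str, list[str]] = {}
    for line in raw_prompt.splitlines():
        if line.strip():
            sections.setdefault(classify_line(line), []).append(line.strip())
    return sections
```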

&lt;p&gt;&lt;strong&gt;&lt;code&gt;promptdiff fix --apply&lt;/code&gt;&lt;/strong&gt; — auto-fixes lint issues. Adds missing sections, tightens vague language, suggests injection guards.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;promptdiff watch .&lt;/code&gt;&lt;/strong&gt; — live linting on file save. Like having &lt;code&gt;eslint --watch&lt;/code&gt; for your prompts while you iterate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MLflow integration&lt;/strong&gt; — &lt;code&gt;promptdiff log-to-mlflow&lt;/code&gt; tracks prompt quality scores over time as MLflow experiments. Because if you're doing serious prompt engineering, you should be tracking regressions.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;npm &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-g&lt;/span&gt; promptdiff
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Scaffold a new prompt from a template&lt;/span&gt;
promptdiff new my-agent &lt;span class="nt"&gt;--template&lt;/span&gt; support

&lt;span class="c"&gt;# Lint it&lt;/span&gt;
promptdiff lint my-agent.prompt

&lt;span class="c"&gt;# Score it&lt;/span&gt;
promptdiff score my-agent.prompt

&lt;span class="c"&gt;# Hook into Claude Code&lt;/span&gt;
promptdiff setup &lt;span class="nt"&gt;--project&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The repo is at &lt;a href="https://github.com/HadiFrt20/promptdiff" rel="noopener noreferrer"&gt;github.com/HadiFrt20/promptdiff&lt;/a&gt;. It's MIT licensed, has 217 tests, and I'm actively building on it.&lt;/p&gt;

&lt;p&gt;If you're writing prompts for production — especially if you're building agents — you probably need this. Or at minimum, you need &lt;em&gt;something&lt;/em&gt; like this. The days of yolo-shipping prompts with no review should be over.&lt;/p&gt;

&lt;p&gt;Prompts are code. Treat them like it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If this was useful, a star on the repo goes a long way. And if you have ideas for lint rules, I'd love PRs — adding a rule is about 30 lines of JavaScript.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>promptengineering</category>
      <category>cli</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
