
Clavis


Show Dev: Prompt Diff — Compare LLM Prompts by Token Count & Cost



I built Prompt Diff — a free, offline-first tool for comparing two LLM prompts side-by-side, showing token delta, cost difference, and a text diff view. No API key, no data leaves your browser.

Live tool: https://citriac.github.io/prompt-diff.html


Why I built it

When I'm iterating on system prompts or user messages, I often want to know:

  • Did my revision actually reduce token count?
  • How much more expensive will 1,000 calls be with the new prompt?
  • What exactly changed between v1 and v2?

Pasting each prompt into a token counter one at a time and doing the math in my head isn't fun. Prompt Diff does it all in one view.


What it does

Token & cost delta — Instantly shows how many tokens changed (+/-) and the estimated cost difference per single call and per 1,000 calls. Supports 19 models across OpenAI, Anthropic, Google, Meta, DeepSeek, Mistral, and Cohere.
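The arithmetic behind these numbers can be sketched roughly as follows. This is an illustration, not the tool's source: the function names are hypothetical, the chars-per-token ratios are the approximate figures quoted under "Technical bits", and the price passed in at the end is a made-up placeholder rather than any provider's real rate.

```javascript
// Approximate characters per token per model family (figures quoted in this post).
const CHARS_PER_TOKEN = { "gpt-4": 3.5, gemini: 4.0 };

// Estimate tokens from character count. A heuristic, not a real tokenizer.
function estimateTokens(text, family) {
  return Math.ceil(text.length / CHARS_PER_TOKEN[family]);
}

// pricePerMillion: hypothetical input price in USD per 1M tokens.
function costDelta(promptA, promptB, family, pricePerMillion) {
  const tokensA = estimateTokens(promptA, family);
  const tokensB = estimateTokens(promptB, family);
  const tokenDelta = tokensB - tokensA;
  const perCall = (tokenDelta * pricePerMillion) / 1e6;
  return { tokensA, tokensB, tokenDelta, perCall, per1000Calls: perCall * 1000 };
}

const v1 = "You are a helpful assistant. Always answer in full sentences and explain your reasoning.";
const v2 = "You are a helpful assistant. Answer concisely.";
// 10 USD per 1M tokens is a placeholder price for the example only.
console.log(costDelta(v1, v2, "gpt-4", 10));
```

A negative `tokenDelta` means the second prompt is shorter, so `perCall` and `per1000Calls` come out negative as well, which reads as a saving.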

Three diff modes:

  • Word diff — highlights individual added/removed words inline (like GitHub)
  • Line diff — classic +/- line view for structured prompts
  • Char diff — character-level for catching subtle rewording
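The three modes differ mainly in what counts as one unit of comparison. A minimal sketch of that splitting step, assuming a hypothetical `splitUnits` helper (the tool's actual splitting rules may differ, for example in how it treats whitespace):

```javascript
// Split a prompt into the units a diff would compare in each mode.
function splitUnits(text, mode) {
  switch (mode) {
    case "line":
      return text.split("\n");
    case "word":
      // Keep whitespace runs as their own units so the text can be rejoined exactly.
      return text.split(/(\s+)/).filter((unit) => unit !== "");
    case "char":
      // Array.from iterates by code point, so emoji and similar characters stay whole.
      return Array.from(text);
    default:
      throw new Error(`Unknown diff mode: ${mode}`);
  }
}

console.log(splitUnits("Be concise.\nCite sources.", "line"));
console.log(splitUnits("Be very concise.", "word"));
console.log(splitUnits("tone", "char"));
```

Keeping whitespace as separate units in word mode is a design choice made for this sketch: it lets the original text be rebuilt by joining the units, at the cost of whitespace changes showing up in the diff.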

The delta banner — four big numbers at a glance: token delta, cost delta, % change, char delta. Color-coded red (increase) / green (decrease).
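For illustration, the four banner values could be derived like this. The sketch assumes the percentage refers to the change in token count relative to the first prompt, and it adds a "neutral" color for a zero delta; neither detail is stated in this post, and `bannerStats` and its input shape are hypothetical.

```javascript
// before / after: { tokens, cost, chars } for the two prompts (hypothetical shape).
function bannerStats(before, after) {
  const tokenDelta = after.tokens - before.tokens;
  const costDelta = after.cost - before.cost;
  const charDelta = after.chars - before.chars;
  // Percent change relative to the first prompt; null when it has no tokens.
  const percentChange =
    before.tokens === 0 ? null : (tokenDelta / before.tokens) * 100;
  // Red for an increase, green for a decrease, as described above.
  const color = tokenDelta > 0 ? "red" : tokenDelta < 0 ? "green" : "neutral";
  return { tokenDelta, costDelta, percentChange, charDelta, color };
}

console.log(
  bannerStats(
    { tokens: 200, cost: 0.002, chars: 700 },
    { tokens: 150, cost: 0.0015, chars: 525 }
  )
);
```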

Copy diff report — generates a Markdown table you can paste into a PR description, Notion doc, or Slack.
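A report like that boils down to joining cells with pipes. Here is a small sketch of the idea; the column names, the `diffReport` function, and its input shape are invented for the example and are not the tool's actual output format.

```javascript
// stats: { tokensA, tokensB, charsA, charsB } (hypothetical shape).
function diffReport(stats) {
  // Prefix positive numbers with "+" so increases stand out in the table.
  const signed = (n) => (n > 0 ? `+${n}` : `${n}`);
  const header = ["Metric", "Prompt A", "Prompt B", "Delta"];
  const body = [
    ["Tokens (est.)", stats.tokensA, stats.tokensB, signed(stats.tokensB - stats.tokensA)],
    ["Characters", stats.charsA, stats.charsB, signed(stats.charsB - stats.charsA)],
  ];
  const line = (cells) => `| ${cells.join(" | ")} |`;
  return [line(header), line(header.map(() => "---")), ...body.map(line)].join("\n");
}

console.log(diffReport({ tokensA: 120, tokensB: 95, charsA: 420, charsB: 332 }));
```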

Share via URL: when both prompts are short, they are encoded into the URL so you can send a diff link to a teammate.
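One way to build such a link is to put both prompts in the URL fragment, which browsers do not send to the server. The sketch below is an assumption about how this could work, not a description of the tool's real link format: the `a` and `b` parameter names, the helper functions, and the 2,000-character cutoff are all hypothetical.

```javascript
// Hypothetical cutoff: beyond this length, give up on sharing via URL.
const MAX_URL_LENGTH = 2000;

// Encode both prompts into the fragment. Returns null if the link gets too long.
function buildShareUrl(baseUrl, promptA, promptB) {
  const params = new URLSearchParams({ a: promptA, b: promptB });
  const url = `${baseUrl}#${params.toString()}`;
  return url.length <= MAX_URL_LENGTH ? url : null;
}

// Decode a shared link back into its two prompts, or null if it carries none.
function readShareUrl(url) {
  const hashIndex = url.indexOf("#");
  if (hashIndex === -1) return null;
  const params = new URLSearchParams(url.slice(hashIndex + 1));
  if (!params.has("a") || !params.has("b")) return null;
  return { promptA: params.get("a"), promptB: params.get("b") };
}

const link = buildShareUrl(
  "https://citriac.github.io/prompt-diff.html",
  "Summarize the text.",
  "Summarize the text in 3 bullets."
);
console.log(link);
console.log(readShareUrl(link));
```

`URLSearchParams` takes care of percent-encoding, so newlines, `&`, and `#` inside a prompt survive the round trip.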

Export as Markdown — full export with stats table + both prompts.


Technical bits

The diff engine is a pure-JS LCS (Longest Common Subsequence) implementation with no libraries, and it works at word, line, or character granularity. Token counting uses a character-ratio heuristic tuned per model family (GPT-4: ~3.5 chars/token, Gemini: ~4.0, etc.), the same approach as my Token Counter, so token counts and costs are estimates rather than exact tokenizer output.
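For readers unfamiliar with LCS-based diffing, here is a minimal textbook sketch of the technique. It is not the tool's actual code; `lcsDiff` and the operation names are chosen for the example. It takes two arrays of units (words, lines, or characters) and returns a list of equal, removed, and added operations.

```javascript
// Diff two arrays of units using a dynamic-programming LCS table.
function lcsDiff(a, b) {
  // table[i][j] = length of the LCS of a.slice(i) and b.slice(j)
  const table = Array.from({ length: a.length + 1 }, () =>
    new Array(b.length + 1).fill(0)
  );
  for (let i = a.length - 1; i >= 0; i--) {
    for (let j = b.length - 1; j >= 0; j--) {
      table[i][j] =
        a[i] === b[j]
          ? table[i + 1][j + 1] + 1
          : Math.max(table[i + 1][j], table[i][j + 1]);
    }
  }

  // Walk the table from the start, emitting one operation per unit.
  const ops = [];
  let i = 0;
  let j = 0;
  while (i < a.length && j < b.length) {
    if (a[i] === b[j]) {
      ops.push({ type: "equal", value: a[i] });
      i++;
      j++;
    } else if (table[i + 1][j] >= table[i][j + 1]) {
      ops.push({ type: "removed", value: a[i] });
      i++;
    } else {
      ops.push({ type: "added", value: b[j] });
      j++;
    }
  }
  // Whatever is left on either side is pure removal or pure addition.
  while (i < a.length) ops.push({ type: "removed", value: a[i++] });
  while (j < b.length) ops.push({ type: "added", value: b[j++] });
  return ops;
}

console.log(lcsDiff(["be", "very", "concise"], ["be", "concise", "please"]));
```

The table needs memory proportional to the product of the two input lengths, which is fine for typical prompts but grows quickly in character mode on long texts.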

Everything runs in the browser. No server. No cookies. Bookmark it, use it offline.


Try it

Paste two prompts, or hit Load sample diff to see it in action instantly.

https://citriac.github.io/prompt-diff.html

Would love to hear what diff modes or features would actually be useful for your workflow. What do you reach for when iterating on prompts?
