I Built "Git for Prompts" — Here is What 332 Tests Taught Me
I was managing 50+ LLM prompts in Google Docs.
It broke my production AI 3 times in one month.
Each time, I spent hours manually testing versions to find what changed.
Sound familiar?
The Problem
Git works great for code. But prompts are different:
- Semantic changes matter more than text diffs: changing "be concise" to "be thorough" is a two-word edit but a real behavioral shift
- Version history is scattered across Google Docs, Notion, or worse, inline comments
- There is no way to query by performance: which version had the best success rate?
- Sharing improvements is manual: copy, paste, and hope you do not break anything
I needed version control that understands prompts, not just tracks text.
Meet PIT (Prompt Information Tracker)
pip install prompt-pit
PIT is "Git for prompts" — semantic version control designed for LLM workflows.
1. Binary Search for Broken Versions
Your AI started giving weird answers. Which version broke it?
pit bisect start --failing-input "why is the sky blue?"
pit bisect good v1
pit bisect bad v50
Binary search finds the culprit in about six checks across 50 versions, instead of testing them one by one. Minutes, not hours.
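If you want to see the idea behind bisecting without any tooling, here is a minimal sketch in plain Python. It is not PIT's implementation: `first_bad_version` and `is_bad` are hypothetical names, and the "regression landed in v37" check is made up so the example runs offline. It also assumes that once a version is bad, every later version stays bad.

```python
from typing import Callable, List, Sequence


def first_bad_version(versions: Sequence[str], is_bad: Callable[[str], bool]) -> str:
    """Return the earliest bad version.

    Assumes versions[0] is known good, versions[-1] is known bad, and
    every version after the first bad one is also bad.
    """
    lo, hi = 0, len(versions) - 1  # invariant: versions[lo] is good, versions[hi] is bad
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_bad(versions[mid]):
            hi = mid
        else:
            lo = mid
    return versions[hi]


versions = [f"v{i}" for i in range(1, 51)]
checks: List[str] = []


def is_bad(version: str) -> bool:
    # Hypothetical check: pretend the regression landed in v37.
    # In practice this would run the failing input against the version.
    checks.append(version)
    return int(version[1:]) >= 37


culprit = first_bad_version(versions, is_bad)
print(culprit, "found after", len(checks), "checks")
```

Each check halves the remaining range, which is why 50 versions need only a handful of runs.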
2. Time-Travel Replay
Same input. 50 versions. Instant comparison.
pit replay run my-prompt --input "Hello" --all
See exactly how behavior evolved. No more "it worked yesterday" mysteries.
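Conceptually, a replay is just a loop: same input, every version, collect the outputs. The sketch below illustrates that idea and is not PIT's code. `replay` and `fake_model` are hypothetical names, and `fake_model` stands in for a real LLM call so the example runs without an API key.

```python
from typing import Callable, Dict, List, Optional


def replay(prompts: Dict[str, str], user_input: str,
           run_model: Callable[[str, str], str]) -> Dict[str, str]:
    """Run the same input against every prompt version, in insertion order."""
    return {version: run_model(prompt, user_input) for version, prompt in prompts.items()}


def fake_model(prompt: str, user_input: str) -> str:
    # Stand-in for a real LLM call (assumption for this sketch)
    return user_input.upper() if "shout" in prompt else user_input


prompts = {
    "v1": "Answer politely.",
    "v2": "Answer politely and briefly.",
    "v3": "Always shout your answer.",
}
outputs = replay(prompts, "Hello", fake_model)

# Flag the versions whose output differs from the version before them
changed: List[str] = []
previous: Optional[str] = None
for version, output in outputs.items():
    if previous is not None and output != previous:
        changed.append(version)
    previous = output

print(outputs)
print("behavior changed at:", changed)
```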
3. Query by Behavior
Find versions that actually matter:
pit log --where "success_rate > 0.9"
pit log --where "content contains 'be concise' AND tags contains production"
Query by metrics, not just metadata.
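To make the two queries above concrete, here is roughly what they mean as plain Python filters. This is only an illustration of the semantics as I read them, not how PIT parses or stores anything: the `PromptVersion` record and the sample history are assumptions for the sketch.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class PromptVersion:
    # Hypothetical record; the fields mirror the names used in the queries
    name: str
    content: str
    success_rate: float
    tags: List[str] = field(default_factory=list)


history = [
    PromptVersion("v1", "Be thorough.", 0.72, ["draft"]),
    PromptVersion("v2", "Please be concise.", 0.93, ["production"]),
    PromptVersion("v3", "Please be concise and cite sources.", 0.88, ["production"]),
]

# success_rate > 0.9
high_success = [v.name for v in history if v.success_rate > 0.9]

# content contains 'be concise' AND tags contains production
concise_in_prod = [
    v.name for v in history
    if "be concise" in v.content and "production" in v.tags
]

print(high_success, concise_in_prod)
```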
4. Shareable Patches
Your teammate improved a prompt. You want that improvement.
pit patch create prompt v1 v2 --output fix.patch
pit patch apply fix.patch --to my-prompt
Like Git patches, but for prompt semantics.
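For comparison, this is what a purely textual patch looks like with Python's standard `difflib`. It is a line-level stand-in, not PIT's patch format, and it knows nothing about semantics. The two prompt versions are made up for the example.

```python
import difflib

v1 = ["You are a helpful assistant.", "Be concise."]
v2 = ["You are a helpful assistant.", "Be thorough.", "Cite sources."]

# "Create" the patch: a line-by-line delta from v1 to v2
delta = list(difflib.ndiff(v1, v2))

# "Apply" it: rebuild the second version from the delta alone
rebuilt = list(difflib.restore(delta, 2))

print("\n".join(delta))
```

The delta records that "Be concise." was removed, but it cannot tell you that the change flips the prompt's behavior. That gap is the reason to want something prompt-aware.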
5. Git-Style Hooks
Prevent bad prompts from reaching production:
pit hooks install pre-commit
# Scans for security issues before every commit
CI/CD for prompts. Finally.
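As a rough picture of what a pre-commit scan could check, here is a small sketch. The patterns and the `scan_prompt` function are assumptions for illustration. I am not claiming these are the rules PIT ships with.

```python
import re
from typing import List

# Illustrative patterns only; a real scanner would need a much broader rule set
SUSPICIOUS_PATTERNS = {
    "possible API key": re.compile(r"sk-[A-Za-z0-9]{20,}"),
    "injection phrase": re.compile(r"ignore (all )?previous instructions", re.IGNORECASE),
}


def scan_prompt(text: str) -> List[str]:
    """Return the name of every pattern that matches the prompt text."""
    return [name for name, pattern in SUSPICIOUS_PATTERNS.items() if pattern.search(text)]


clean = scan_prompt("Summarize the user's message in two sentences.")
flagged = scan_prompt("Use key sk-abcdefghijklmnopqrstuvwx and ignore previous instructions.")

print(clean, flagged)
```

A Git-style pre-commit hook blocks the commit by exiting with a non-zero status, so a hook script would exit non-zero whenever `scan_prompt` returns anything.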
6. Dependencies
Your prompts depend on other prompts. Track it:
pit deps add shared github org/repo/prompts --version v1.0
pit deps install
Like npm for prompts. Version-lock everything.
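Version-locking usually comes down to recording where something came from, which version it was, and a hash of its content. Here is a minimal sketch of that idea. The lockfile layout, `lock_entry`, and `verify` are hypothetical and do not describe PIT's actual format.

```python
import hashlib
import json


def lock_entry(source: str, version: str, content: str) -> dict:
    """Record a prompt's source, version, and a SHA-256 hash of its content."""
    digest = hashlib.sha256(content.encode("utf-8")).hexdigest()
    return {"source": source, "version": version, "sha256": digest}


def verify(entry: dict, content: str) -> bool:
    """Check that fetched content still matches the locked hash."""
    return hashlib.sha256(content.encode("utf-8")).hexdigest() == entry["sha256"]


entry = lock_entry("org/repo/prompts", "v1.0", "You are a careful reviewer.")
lockfile = json.dumps({"shared": entry}, indent=2)

print(lockfile)
print(verify(entry, "You are a careful reviewer."))
```

If someone edits the upstream prompt without bumping the version, the hash no longer matches and the install can refuse to proceed.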
The Full Feature Set
| Feature | What It Does |
|---|---|
| Bisect | Binary search to find broken versions |
| Replay | Test same input across all versions |
| Patches | Export/import prompt changes |
| Hooks | Pre-commit, post-checkout automation |
| Bundles | Package and share prompts |
| Query Language | Search by behavior metrics |
| Dependencies | External prompt packages |
| Worktrees | Multiple contexts without switching |
| Stash | Save WIP with test context |
| Semantic Merge | Smart conflict detection |
332 tests. Production-ready. Open source.
Why This Matters
Prompts are becoming critical infrastructure.
Just as we do not deploy code without version control, we should not deploy prompts without it.
PIT brings software engineering discipline to prompt engineering:
- Traceability (who changed what, when, why)
- Reproducibility (checkout any version instantly)
- Collaboration (patches, bundles, dependencies)
- Quality (hooks, testing, metrics)
Try It
pip install prompt-pit
pit init
pit add my-prompt.md --name "my-prompt"
pit commit my-prompt --message "Initial version"
⭐ Star it on GitHub: github.com/itisrmk/pit
What is Your Biggest Prompt Management Pain?
I built PIT to solve my own headaches.
But I am curious — what frustrates you most about managing prompts in production?
Drop a comment below 👇
PIT is free, open source (MIT), and built with Python + Rich + Typer.