DEV Community

Aman Bhandari

The five-gate publishing contract that keeps Claude-written READMEs defensible

When Claude writes your README, every claim in it becomes a hostage to fortune. An interviewer opens the repo. An ATS scanner parses the bullets. A recruiter AI drills into a link from your resume. If any sentence in there points at a file, a command, or a capability that does not actually behave as advertised, your credibility collapses in the 90 seconds it takes them to notice.

The fix is not to stop using Claude for public artifacts. The fix is a pre-push contract: a claim-evidence map (Gate 0) followed by five gates in a bash script. It takes five minutes to re-run before every push and reduces the blast radius of a bad claim from "career-ending" to "caught on disk."

I run this contract on every public repo in my framework. Reference surfaces: claude-code-agent-skills-framework and claude-code-mcp-qa-automation.

Gate 0 — Claim-evidence mapping (the hardest gate)

Before any artifact pushes, create docs/claim-evidence.md. Every README claim and every resume line pointing at the artifact maps to a specific file, command, or screenshot that evidences it. Every row is marked verified before push.

| Claim | Evidence | Verified |
|---|---|---|
| "8 Python modules implement the pipeline" | `src/pipeline/` has 8 .py files, each with tests | ✅ |
| "Byte-identical output under the same flags" | `tests/test_determinism.py::test_html_hash` | ✅ |
| "7-table SQLite trending store" | `src/trending/schema.sql` has 7 CREATE TABLE | ✅ |
| "Sub-agent fan-out coordinator" | `src/coordinator.py::run_parallel` uses ThreadPoolExecutor | ✅ |

The discipline is not the table itself. It is that every aspirational sentence in the README forces a row. If a claim cannot produce evidence, it is either reworded (to something you can evidence) or deleted. No orphan claims survive to push.
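One way to make "every row is marked verified" mechanical is to parse the table rather than eyeball it. The function below is a minimal sketch, assuming the three-column layout shown above with the verified mark in the last column and no literal pipe characters inside a cell. The name `check_claims` is hypothetical.

```bash
# Print every claim row whose last column lacks a ✅ mark.
# Returns 0 when all rows are verified, 1 when a row is unverified
# or when the file contains no claim rows at all.
check_claims() {
    local file=$1
    awk -F'|' '
        !/^\|/      { next }    # skip anything that is not a table row
        ++row <= 2  { next }    # skip the header row and the |---| separator
        { total++ }
        index($(NF-1), "✅") == 0 { bad++; print "UNVERIFIED:" $2 }
        END {
            if (total == 0) { print "no claim rows found"; exit 1 }
            exit (bad > 0)
        }
    ' "$file"
}
```

Calling `check_claims docs/claim-evidence.md` before a push lists the rows that still need evidence, which is usually a more useful failure message than a bare exit code.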

Gate 1 — Hype-word deny list

Run the grep before every push. Every hit is operationalized or deleted.

HYPE_WORDS='production-ready|production-grade|enterprise-grade|robust|seamless|scalable|best-in-class|world-class|state-of-the-art|cutting-edge|unlock|leverage|streamline|battle-tested|proven|comprehensive|powerful'

grep -r -i -n -E "$HYPE_WORDS" README.md docs/ || echo "✓ no hype words"

"Operationalize or delete" means: if you genuinely want to claim "robust," replace it with the measurable behavior — "retries with exponential backoff on 5xx, p99 under 200ms under 100 RPS." If you cannot name the measurable behavior, the original sentence was vibes and it goes.

This is the same pattern as Gate Zero for prompts. At the prompt, you catch the vibe before it becomes a spec. At publish, you catch the vibe before it becomes a public claim.

Gate 2 — Runnable demo from a fresh clone

Every command in the README must run from a fresh clone and produce the advertised output. Not "works on my machine." Not "runs after you install these seven unnamed dependencies." Fresh clone, one make target or one README.md command block, produces the screenshot the README promises.

Test protocol: git clone into /tmp/demo-$$, run the advertised command, diff the output against the claimed output. If sanitization (secret removal, placeholder substitution) broke the demo, you do not ship the broken demo and add a comment. You narrow the scope or rebuild against simpler fixtures.

This gate catches the failure mode where a repo looks impressive in the README and collapses when anyone actually runs it — which is exactly what a curious interviewer will do.
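The test protocol above could be sketched roughly as follows. This is an illustration, not the author's `run-demo-fresh.sh`: the function name and its three arguments (repo URL or path, demo command, file holding the claimed output) are assumptions, and it compares stdout only.

```bash
# Clone a repo into a throwaway directory, run one command inside it,
# and diff its stdout against the output the README claims.
# Returns 0 on an exact match, 1 on any clone, run, or diff failure.
run_fresh_demo() {
    local repo=$1 demo_cmd=$2 expected=$3
    local workdir status=0
    workdir=$(mktemp -d)

    git clone --quiet "$repo" "$workdir/repo" || status=1
    if [ "$status" -eq 0 ]; then
        # Run in a subshell so the caller's working directory is untouched.
        ( cd "$workdir/repo" && bash -c "$demo_cmd" ) > "$workdir/actual.txt" || status=1
    fi
    if [ "$status" -eq 0 ]; then
        diff -u "$expected" "$workdir/actual.txt" || status=1
    fi

    rm -rf "$workdir"
    return "$status"
}
```

A call might look like `run_fresh_demo "$REPO_URL" 'make demo' docs/expected-output.txt`, where the make target and the expected-output path are placeholders. Cloning from the remote rather than the local checkout is the point: untracked files on your machine cannot prop up the demo.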

Gate 3 — Private-identifier grep

Session-specific. Anything from the private corpus that must never appear in a public artifact: internal project names, client names, private repo paths, personal email addresses, workspace paths, coworker names. Maintain the list explicitly — new names get added when a new private context enters.

PRIVATE='taksha|ai-engineer-lab|/home/ubuntu/workspace|bhandari\.aman0101'

grep -r -n -E "$PRIVATE" README.md docs/ src/ && {
    echo "BLOCKED — private identifier leaked"
    exit 1
}

The grep covers source as well as docs. A private path checked into a config file is the same leak as one in the README, so add any directory that holds config to the path list. The block must be exit-code-fatal: warnings get ignored, exits do not.
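One design wrinkle: a deny list written inline in a committed script publishes the very names it is meant to protect. A possible variation, sketched below, keeps the patterns in a file outside the repo and passes it to grep with `-f`. The function name and the idea of an external list file are assumptions, not part of the original gate.

```bash
# Scan the given paths for any pattern in a deny-list file.
# The file holds one extended regex per line and lives outside the repo.
# Note: a blank line in the file matches every input line, so keep it free of blanks.
# Returns 1 on any hit, 0 when clean, 2 when the list is missing or empty.
scan_private() {
    local list=$1; shift
    if [ ! -s "$list" ]; then
        echo "deny list missing or empty: $list" >&2
        return 2
    fi
    if grep -r -n -E -f "$list" -- "$@"; then
        echo "BLOCKED: private identifier leaked" >&2
        return 1
    fi
    return 0
}
```

Usage would be along the lines of `scan_private ~/.config/private-identifiers.txt README.md docs/ src/ || exit 1`, with the list path being whatever location suits your setup. Treating a missing list as its own failure avoids the quiet case where the gate passes because it had nothing to look for.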

Gate 4 — Secret pattern grep

Not session-specific. The same patterns run on every repo.

SECRETS='api[_-]?key|secret[_-]?key|password|BEGIN (RSA|OPENSSH) PRIVATE KEY|ghp_[A-Za-z0-9]{36}|sk-[A-Za-z0-9]{40,}'

grep -r -n -E -i "$SECRETS" . --exclude-dir=.git && {
    echo "BLOCKED — secret leaked"
    exit 1
}

Do not roll this yourself in production — use gitleaks, trufflehog, or a pre-commit hook. The grep above is the floor, not the ceiling. It catches the obvious leaks; the real tools catch more. A floor is still better than no check at all.

Gate 5 — Artifact-specific checks

Every repo has its own failure modes the generic gates cannot see. The gate file names them explicitly and runs them before push.

Examples of what an artifact-specific check looks like:

  • A repo that ships HTML reports: grep 'src=' report.html must return zero external references (inline CSS, no JS, no CDN — ensures reports render offline without surprises).
  • A repo that claims "byte-identical output under the same flags": a determinism test that runs the pipeline twice and diffs the outputs.
  • A repo that claims "offline-capable": run the advertised command with --network=none in a container and verify it still works.

The artifact-specific gate is where the README's differentiation claim gets tested. If the repo's pitch is "deterministic," the test is for determinism. If the pitch is "offline," the test is offline.
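Two of the example checks above can be sketched as small functions. Both names are hypothetical, and both are approximations: the first is a pattern match on `src=` attributes rather than an HTML parser (it does not look at `<link href>`), and the second compares stdout only.

```bash
# Pass only if no src= attribute points at an http(s) or protocol-relative URL.
# Inline data: URIs and relative paths are allowed.
check_no_external_refs() {
    local html=$1
    if grep -n -i -E 'src[[:space:]]*=[[:space:]]*["'\'']?(https?:)?//' "$html"; then
        return 1
    fi
}

# Pass only if running the command twice yields byte-identical stdout.
check_deterministic() {
    local cmd=$1 first second status=0
    first=$(mktemp)
    second=$(mktemp)
    bash -c "$cmd" > "$first"
    bash -c "$cmd" > "$second"
    cmp -s "$first" "$second" || status=1
    rm -f "$first" "$second"
    return "$status"
}
```

For the offline claim, the equivalent check would run the advertised command in a container started with `--network=none` and assert on its exit status. That one is left out here because it depends on the container image the repo uses.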

The integrity-check.sh skeleton

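#!/usr/bin/env bash
set -euo pipefail

# Patterns from Gates 1, 3, and 4. Under set -u, an unset pattern aborts the run.
HYPE_WORDS='production-ready|production-grade|enterprise-grade|robust|seamless|scalable|best-in-class|world-class|state-of-the-art|cutting-edge|unlock|leverage|streamline|battle-tested|proven|comprehensive|powerful'
PRIVATE='taksha|ai-engineer-lab|/home/ubuntu/workspace|bhandari\.aman0101'
SECRETS='api[_-]?key|secret[_-]?key|password|BEGIN (RSA|OPENSSH) PRIVATE KEY|ghp_[A-Za-z0-9]{36}|sk-[A-Za-z0-9]{40,}'

# Block when grep finds a match. A bare "! grep ..." never trips set -e,
# because errexit ignores an exit status inverted with "!".
deny() {
    local label=$1; shift
    if grep "$@"; then
        echo "BLOCKED: $label"
        exit 1
    fi
}

echo "Gate 0: claim-evidence mapping"
grep -q '| ✅ |' docs/claim-evidence.md || { echo "no verified claims"; exit 1; }
grep '| ❌ \|pending' docs/claim-evidence.md && { echo "unverified claims"; exit 1; }

echo "Gate 1: hype-word deny"
deny "hype word" -r -i -n -E "$HYPE_WORDS" README.md docs/

echo "Gate 2: fresh-clone demo"
./scripts/run-demo-fresh.sh

echo "Gate 3: private-identifier grep"
deny "private identifier leaked" -r -n -E "$PRIVATE" README.md docs/ src/

echo "Gate 4: secret grep"
# Skip this script: its own SECRETS line contains the word "password".
deny "secret leaked" -r -n -E -i "$SECRETS" . --exclude-dir=.git --exclude=integrity-check.sh

echo "Gate 5: artifact-specific"
./scripts/artifact-checks.sh

echo "✓ integrity check passed"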

Wire the script into pre-push if you trust the hook, or run it manually. The discipline is "re-run before every push," not "configure a CI job once and forget." A CI-only check is a check you stop reading.
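If you do wire it into pre-push, the hook itself can stay tiny. The installer below is a sketch under one assumption that is not stated in the post: that the script lives at `scripts/integrity-check.sh` relative to the repo root. Adjust the path to your layout.

```bash
# Write a pre-push hook that runs the integrity check.
# Git aborts the push when the hook exits non-zero.
install_pre_push_hook() {
    local repo_root=$1
    local hook="$repo_root/.git/hooks/pre-push"
    cat > "$hook" <<'EOF'
#!/usr/bin/env bash
# Assumed location of the integrity script; change if yours differs.
exec "$(git rev-parse --show-toplevel)/scripts/integrity-check.sh"
EOF
    chmod +x "$hook"
}
```

Run `install_pre_push_hook .` once from the repo root. Keep in mind that hooks are local to a clone and that `git push --no-verify` skips them, which is one reason the manual re-run habit still matters.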

Why this earns its presence

This is not code review. It is reputation-risk review — the quality bar is higher because the audience is not a teammate, it is a stranger deciding whether to take you seriously.

The gates exist because every one of them corresponds to a failure mode I have seen (in my own drafts or in public repos I have read): a README that claimed "production-ready" for a script that did not handle retries. A demo that could not be run from a fresh clone. A commit that leaked an API key into the public history. A claim that could not be backed by any file in the repo.

The total cost of the five gates is five minutes before each push. The cost of one bad claim discovered by an interviewer is the interview. The asymmetry is what makes the contract earn its presence every time.

Pick one public repo you own. Write the first ten rows of its claim-evidence.md. You will either tighten ten README lines or delete ten README lines. Either outcome is a win.


Aman Bhandari. Operator of an AI-engineering research lab running Claude Opus as the coaching partner, plus a QA-automation surface shipping against a real sprint workload. Public artifacts: claude-code-agent-skills-framework and claude-code-mcp-qa-automation. github.com/aman-bhandari.
