My agent started lying. It took me two days to realize the system prompt had changed.

This is a submission for the Hermes Agent Challenge.

My Hermes research agent started giving shorter, vaguer answers. I assumed it was a model issue. I tweaked temperatures, changed sampling params, added more context. Two days later I found the real cause: someone had edited the system prompt template to "clean it up," and the rendered output had changed.

I had no audit trail. No way to know when the prompt had changed or what it had looked like before.

That's the gap prompt-version-pin fills.

The idea

Pin a prompt at the moment you decide it's correct. At runtime, verify the actual prompt matches the pin. If it drifted, you know immediately — before the agent produces a single output — instead of debugging backward from weird behavior.
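
To make the idea concrete, here is a minimal standard-library sketch of the pin-then-verify flow. It is not the library's implementation: `fingerprint` and `assert_pinned` are hypothetical names, and this version hashes the raw string with no normalization.

```python
import hashlib


def fingerprint(prompt: str) -> str:
    """Return a SHA-256 fingerprint of the prompt text (hypothetical helper)."""
    return "sha256:" + hashlib.sha256(prompt.encode("utf-8")).hexdigest()


def assert_pinned(prompt: str, pinned_hash: str) -> None:
    """Raise before the agent runs if the prompt no longer matches the pin."""
    actual = fingerprint(prompt)
    if actual != pinned_hash:
        raise RuntimeError(
            f"prompt has drifted: expected {pinned_hash}, actual {actual}"
        )


# Pin once, when the prompt is known to be good.
pinned = fingerprint("You are a research agent. Be concise.")

# At startup: an unchanged prompt passes silently.
assert_pinned("You are a research agent. Be concise.", pinned)
```

Any edit to the prompt, however small, produces a different fingerprint, so the check fails at startup rather than surfacing later as odd agent behavior.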

One function to pin

from prompt_version_pin import pin_prompt

pin = pin_prompt("You are a research agent. Be concise.", label="research-v1")
pin.save("pins/research-agent.json")

The pin file:

{
  "label": "research-v1",
  "hash": "sha256:3a7f9c...",
  "pinned_at": "2026-05-24T14:22:01Z",
  "length": 37
}

One function to verify

from prompt_version_pin import PromptPin, verify_prompt

pin = PromptPin.load("pins/research-agent.json")
result = verify_prompt(actual_system_prompt, pin)

if not result.ok:
    raise RuntimeError(str(result))

When the prompt has drifted, the raised error carries a message like this:

[FAIL] research-v1: prompt has drifted
  expected: sha256:3a7f9c...
  actual:   sha256:b8c2d1...

Put that check at agent startup. If the prompt changed, the agent never runs. You get an error, not a mystery.

Works with message lists

Most chat-based agents pass a messages list, not a raw string. prompt-version-pin handles that too, and by default pins only the system message, not user turns:

from prompt_version_pin import pin_messages, verify_messages

messages = [
    {"role": "system", "content": "Be helpful."},
    {"role": "user", "content": "Hello"},
]
pin = pin_messages(messages, label="v1")

# Later — user message changed, system prompt is the same
modified = [
    {"role": "system", "content": "Be helpful."},
    {"role": "user", "content": "A completely different question"},
]
result = verify_messages(modified, pin)
assert result.ok  # True — only the system message is pinned

Supports Anthropic content blocks (content: [{"type": "text", "text": "..."}]) automatically.
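
As a rough illustration of what pinning only the system message involves, the sketch below pulls the system text out of a messages list, accepting both plain-string content and a list of text blocks. The helper name `system_text` and the choice to join multiple blocks with newlines are assumptions, not the library's documented behavior.

```python
from typing import Any


def system_text(messages: list[dict[str, Any]]) -> str:
    """Collect the text of all system messages, ignoring other roles."""
    parts: list[str] = []
    for message in messages:
        if message.get("role") != "system":
            continue
        content = message.get("content", "")
        if isinstance(content, str):
            parts.append(content)
        else:
            # Content-block form: keep only blocks of type "text".
            parts.extend(
                block.get("text", "")
                for block in content
                if isinstance(block, dict) and block.get("type") == "text"
            )
    # Joining with newlines is an assumption made for this sketch.
    return "\n".join(parts)


plain = [
    {"role": "system", "content": "Be helpful."},
    {"role": "user", "content": "Hello"},
]
blocks = [
    {"role": "system", "content": [{"type": "text", "text": "Be helpful."}]},
    {"role": "user", "content": "A completely different question"},
]

# Both forms yield the same text, so they would hash identically.
assert system_text(plain) == system_text(blocks) == "Be helpful."
```

Because user turns never reach the extracted text, changing them cannot affect the hash.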

Normalization prevents false alarms

The library strips trailing whitespace from each line and normalizes line endings to LF before hashing. Copy-pasting from a Windows machine or a trailing space added by your editor won't trigger a false alarm. Real content changes still fail verification.

pin1 = pin_prompt("line1  \nline2", label="a")   # trailing spaces
pin2 = pin_prompt("line1\r\nline2", label="b")    # CRLF
pin3 = pin_prompt("line1\nline2", label="c")      # clean

assert pin1.hash == pin2.hash == pin3.hash  # all match
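
For readers curious how such normalization could work, here is a small sketch using only the standard library. `normalize` and `normalized_hash` are hypothetical names; treating a lone CR as a line ending is my assumption, and the library's exact rules may differ.

```python
import hashlib


def normalize(prompt: str) -> str:
    """Convert CRLF (and lone CR) to LF, then strip trailing whitespace per line."""
    unified = prompt.replace("\r\n", "\n").replace("\r", "\n")
    return "\n".join(line.rstrip() for line in unified.split("\n"))


def normalized_hash(prompt: str) -> str:
    """Hash the normalized text so cosmetic differences do not change the result."""
    digest = hashlib.sha256(normalize(prompt).encode("utf-8")).hexdigest()
    return "sha256:" + digest


# Cosmetic variants collapse to the same text and therefore the same hash.
assert normalize("line1  \nline2") == "line1\nline2"
assert normalize("line1\r\nline2") == "line1\nline2"
assert normalized_hash("line1  \nline2") == normalized_hash("line1\nline2")

# A real content change still produces a different hash.
assert normalized_hash("line1\nline2") != normalized_hash("line1\nline3")
```

Leading whitespace and interior spacing are left untouched here, since those can carry meaning in a prompt.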

CLI for shell scripts and CI

# Pin a prompt
prompt-version-pin pin "You are a helpful assistant." \
  --label my-agent \
  -o pins/agent.json

# Pin from a file
cat system_prompt.txt | prompt-version-pin pin - --label my-agent -o pins/agent.json

# Verify (exit 0 = ok, exit 1 = drifted)
prompt-version-pin verify "$(cat system_prompt.txt)" pins/agent.json

The exit code makes this composable with CI checks. Add a step that verifies your agent's production prompt against the pinned version in your repo.

The full VerifyResult

When verification fails, there's more than a boolean:

from dataclasses import dataclass

@dataclass
class VerifyResult:
    ok: bool
    label: str
    expected_hash: str   # what was pinned
    actual_hash: str     # what you gave it
    length_mismatch: bool  # quick hint: did length change too?

length_mismatch is a fast sanity signal. If the hash differs but the length is the same, that's a subtle edit. If both differ, something larger changed.
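
One way to act on those fields is to turn them into a one-line hint for logs. The sketch below defines a local stand-in for `VerifyResult` with the same fields so it runs on its own. The `describe` helper and the placeholder hash strings are hypothetical, not part of the library.

```python
from dataclasses import dataclass


@dataclass
class VerifyResult:
    # Local stand-in mirroring the fields listed above, so the sketch runs alone.
    ok: bool
    label: str
    expected_hash: str
    actual_hash: str
    length_mismatch: bool


def describe(result: VerifyResult) -> str:
    """Turn a result into a one-line hint about how large the change is."""
    if result.ok:
        return f"{result.label}: prompt matches pin"
    if result.length_mismatch:
        return f"{result.label}: hash and length differ, likely a larger edit"
    return f"{result.label}: hash differs at the same length, likely a subtle edit"


# Placeholder hashes for illustration only.
drifted = VerifyResult(
    ok=False,
    label="research-v1",
    expected_hash="sha256:aaa",
    actual_hash="sha256:bbb",
    length_mismatch=False,
)
print(describe(drifted))
```

A same-length change, such as a swapped word of equal length, is easy to miss by eye, so surfacing it separately can save time.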