Wrap Hermes Agent in a leash: USD caps + egress allowlist + audit log in 30 lines

#ai #hermesagentchallenge #python #opensource

Hermes Agent Challenge Submission

Last week I let a fresh LLM agent loose on a sandbox Stripe key just to see what it would do. Eleven minutes later it had ranged across seven endpoints I never approved, fanned out a paid embedding loop, and posted a charge twice. It would have kept going.

That's the gap. The reasoning gets better every quarter. The brakes don't.

This post is a short build: Hermes Agent running through agentleash, a tiny Python guardrail I published on PyPI today.

If you've been kicking the tires on Hermes Agent because you want something open-source that runs on your own infra, agentleash is the layer to put around it before you hand it real money or real external APIs.

What agentleash does

One context manager. Inside the block, every tool call goes through four guards.

Per-run USD cap. Eight dollars in the example below. The agent halts the moment a tool call would push the run total over it.
Per-call dollar cap. Fifty cents in the example below, so no single tool call burns more than that.
Egress allowlist. Wildcard hostnames. *.stripe.com, api.openrouter.ai, your domain. Anything else gets dropped.
Optional JSON Schema gate. Arg validation before the side-effectful function runs. A negative charge or a malformed customer ID never reaches the tool.
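
To make the egress guard concrete, here is a minimal sketch of wildcard hostname matching using only the standard library. It is not agentleash's actual implementation; the names `ALLOWED_HOSTS` and `host_allowed` are hypothetical, and the matching rules (shell-style wildcards via `fnmatch`) are an assumption about how such an allowlist could behave.

```python
from fnmatch import fnmatchcase
from urllib.parse import urlsplit

# Hypothetical allowlist, mirroring the patterns mentioned above.
ALLOWED_HOSTS = {"*.stripe.com", "api.openrouter.ai"}


def host_allowed(url: str, allowed=ALLOWED_HOSTS) -> bool:
    """Return True if the URL's hostname matches any allowlist pattern."""
    # urlsplit().hostname is already lowercased and strips port and userinfo.
    host = urlsplit(url).hostname or ""
    if not host:
        return False
    return any(fnmatchcase(host, pattern.lower()) for pattern in allowed)
```

Under this sketch, `*.stripe.com` matches `api.stripe.com` but not the bare `stripe.com` or a lookalike such as `evil-stripe.com`. A real allowlist may choose different rules.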

Everything lands in an append-only JSONL audit log. Tool args are sha256-hashed by default so the trail proves a call happened without leaking PII.
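
A rough sketch of what a hashed, append-only record could look like, assuming canonical JSON (sorted keys) as the hash input. The helper names `audit_record` and `append_audit` are hypothetical, and the field set only approximates the log lines shown later in this post.

```python
import hashlib
import json
import time


def audit_record(session_id: str, kind: str, tool: str, args: dict, usd: float) -> str:
    """Build one JSONL line; args are hashed, never stored in the clear."""
    # Sorting keys makes the hash independent of argument order.
    canonical = json.dumps(args, sort_keys=True, separators=(",", ":"))
    record = {
        "ts": round(time.time(), 2),
        "session_id": session_id,
        "kind": kind,
        "tool": tool,
        "args_sha256": hashlib.sha256(canonical.encode("utf-8")).hexdigest(),
        "usd": usd,
    }
    return json.dumps(record, separators=(",", ":"))


def append_audit(path: str, line: str) -> None:
    # Mode "a" only appends; existing lines are never rewritten.
    with open(path, "a", encoding="utf-8") as fh:
        fh.write(line + "\n")
```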

No magic. ~280 LOC. Composes with any agent that calls tools, including Hermes Agent.

Why Hermes Agent + agentleash is a good pair

Hermes Agent is open source. You run it. You can read what it's doing. That's already half of safety.

The other half is what happens at the tool boundary. Hermes is doing the planning and reasoning. agentleash is the seat belt around the action layer. Hermes decides "call the Stripe charge tool with amount=X". agentleash gets to say yes or no before that side effect actually fires.

The two are cleanly separated by design. Hermes doesn't know agentleash exists. agentleash doesn't know Hermes exists. They meet at the Python function boundary.
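
The function-boundary idea can be sketched without either library. In this illustration (all names here, `Guard`, `Denied`, `guarded_call`, `no_negative_amounts`, are hypothetical and not part of agentleash or Hermes), a guard is any callable that sees the pending call and returns a denial reason, and the handler only runs if every guard stays silent.

```python
from typing import Any, Callable, Optional

# A guard inspects the pending call and returns a denial reason, or None to allow it.
Guard = Callable[[str, dict], Optional[str]]


class Denied(Exception):
    """Raised when a guard vetoes a tool call before it runs."""


def guarded_call(tool: str, args: dict, handler: Callable[[dict], Any],
                 guards: list[Guard]) -> Any:
    # Every guard runs before the handler, so a veto means no side effect.
    for guard in guards:
        reason = guard(tool, args)
        if reason is not None:
            raise Denied(f"{tool}: {reason}")
    return handler(args)


def no_negative_amounts(tool: str, args: dict) -> Optional[str]:
    # Example guard: refuse any call carrying a negative amount.
    if args.get("amount_usd", 0) < 0:
        return "negative amount"
    return None
```

The planner never appears in this code, which is the point: whatever produces `tool` and `args` is interchangeable.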

The 30-line wire-up

from agentleash import Leash
from your_hermes_setup import HermesClient   # whatever you're using

leash = Leash(
    usd_cap=8.00,
    call_cap=0.50,   # fifty cents per call, assuming this cap is in USD like usd_cap
    allowed_hosts={"*.stripe.com", "api.openrouter.ai"},
    audit_path="runs/hermes-2026-05-20.jsonl",
)

hermes = HermesClient()

def stripe_charge(amount_usd: float, customer_id: str) -> dict:
    # real Stripe call here
    return {"id": "ch_demo", "amount": amount_usd}

state = None    # initial agent state; the shape depends on your Hermes setup
done = False

with leash.session("hermes-payments-agent") as session:
    while not done:
        plan = hermes.next_step(state)
        if plan.tool == "stripe.charge":
            # agentleash: validate args, check budget, log, then run the tool
            result = session.tool(
                "stripe.charge",
                args=plan.args,
                handler=lambda a: stripe_charge(**a),
                schema={
                    "type": "object",
                    "properties": {
                        "amount_usd": {"type": "number", "minimum": 0.5, "maximum": 100.0},
                        "customer_id": {"type": "string", "pattern": "^cus_"},
                    },
                    "required": ["amount_usd", "customer_id"],
                },
                usd=plan.estimated_cost,
            )
            state = hermes.observe(result)
        else:
            # no charge planned: stop here (dispatch your other tools in this branch)
            done = True

That's it. The agent loop is yours. agentleash slides in at the tool call boundary.
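
To show what the schema gate buys you, here is a hand-rolled check for the small JSON Schema subset used above (`required`, `type`, `minimum`, `maximum`, `pattern`). It is a sketch only: agentleash's own validation may work differently, a full validator covers far more of the spec, and `validate_args` and `CHARGE_SCHEMA` are hypothetical names.

```python
import re

# Same shape as the schema passed to session.tool() above.
CHARGE_SCHEMA = {
    "type": "object",
    "properties": {
        "amount_usd": {"type": "number", "minimum": 0.5, "maximum": 100.0},
        "customer_id": {"type": "string", "pattern": "^cus_"},
    },
    "required": ["amount_usd", "customer_id"],
}


def validate_args(args: dict, schema: dict) -> list:
    """Return a list of error strings; an empty list means the args pass."""
    errors = []
    for name in schema.get("required", []):
        if name not in args:
            errors.append(f"missing required field: {name}")
    for name, rules in schema.get("properties", {}).items():
        if name not in args:
            continue
        value = args[name]
        # bool is a subclass of int in Python, so exclude it from "number".
        is_number = isinstance(value, (int, float)) and not isinstance(value, bool)
        is_string = isinstance(value, str)
        if rules.get("type") == "number" and not is_number:
            errors.append(f"{name}: expected number")
        if rules.get("type") == "string" and not is_string:
            errors.append(f"{name}: expected string")
        # Range keywords apply to numbers only, pattern to strings only.
        if is_number and "minimum" in rules and value < rules["minimum"]:
            errors.append(f"{name}: below minimum {rules['minimum']}")
        if is_number and "maximum" in rules and value > rules["maximum"]:
            errors.append(f"{name}: above maximum {rules['maximum']}")
        if is_string and "pattern" in rules and re.search(rules["pattern"], value) is None:
            errors.append(f"{name}: does not match {rules['pattern']}")
    return errors
```

A negative charge or a customer ID without the `cus_` prefix produces a non-empty error list, so the handler would never be called.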

What you get in the audit log

{"ts":1779302689.66,"session_id":"hermes-payments-agent","kind":"tool_ok","tool":"stripe.charge","args_sha256":"f4b...","usd":0.30,"usd_run_total":2.40}
{"ts":1779302711.14,"session_id":"hermes-payments-agent","kind":"egress_denied","url":"https://random-c2.example/exfil","error":"host not in allowlist"}
{"ts":1779302718.02,"session_id":"hermes-payments-agent","kind":"usd_cap_hit","tool":"openai.embedding","usd_run_total":8.00,"would_have_been":8.34}
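
Because each line is a standalone JSON object, a log in this shape is easy to inspect with the standard library. A small sketch, assuming the `kind` and `usd_run_total` fields shown above (`summarize` is a hypothetical helper, not part of agentleash):

```python
import json
from collections import Counter


def summarize(lines):
    """Count records per kind and return the last reported run total."""
    kinds = Counter()
    last_total = 0.0
    for raw in lines:
        raw = raw.strip()
        if not raw:
            continue  # tolerate blank lines
        record = json.loads(raw)
        kinds[record["kind"]] += 1
        if "usd_run_total" in record:
            last_total = record["usd_run_total"]
    return dict(kinds), last_total
```

Any iterable of lines works, including an open file handle on the audit path.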

When the cap fires, the agent does NOT lose state. It halts AT the side-effect boundary. Whatever the LLM has already produced is still in memory. You decide whether to bump the cap and retry, or stop the run.
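
The halt-then-decide behavior can be sketched as a budget check that runs before the handler, so a refused call leaves both the spend counter and the outside world untouched. This is an illustration under assumed names (`Budget`, `CapExceeded`), not agentleash's real classes.

```python
class CapExceeded(Exception):
    """Raised before the handler runs when a cap would be exceeded."""


class Budget:
    def __init__(self, usd_cap: float, call_cap: float):
        self.usd_cap = usd_cap      # per-run cap in USD
        self.call_cap = call_cap    # per-call cap in USD
        self.spent = 0.0

    def charge(self, usd: float, handler, args: dict):
        # Both checks happen before the side effect fires.
        if usd > self.call_cap:
            raise CapExceeded(f"call cost {usd} exceeds per-call cap {self.call_cap}")
        if self.spent + usd > self.usd_cap:
            raise CapExceeded(f"run total would be {self.spent + usd:.2f}, cap is {self.usd_cap}")
        result = handler(args)
        self.spent += usd           # only count spend for calls that actually ran
        return result
```

The caller catches `CapExceeded`, keeps its planner state, and either raises `usd_cap` and retries the same call or ends the run.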

When this matters most

Paygentic agents that touch Stripe / Plaid / payments infra
Research agents that hit paid APIs (Anthropic, OpenAI, Cohere, etc.)
Scraping agents wrapped around Bright Data
Any Hermes Agent demo you're going to leave running unattended

If your agent never touches money and never leaves your VPN, you can skip this. Everyone else, please put brakes on the thing.

Try it

pip install agentleash

GitHub: https://github.com/MukundaKatta/agentleash
PyPI: https://pypi.org/project/agentleash/

Eleven tests, MIT licensed, zero heavy dependencies in the core. Pairs cleanly with birddog (audited Bright Data egress) and mantle-agent-attest (on-chain agent attestations) if you want the full safety stack.

Built for the Hermes Agent Challenge. Feedback welcome.

Submitted to the Build With Hermes Agent prompt. Solo entry. The agentleash repo's first commit was earlier today.