Thread a Run ID Through Every Agent Call So You Can Debug Anything

Something went wrong in production. The logs say "tool call failed" at 3:14am. Which user? Which session? Which specific tool call in which specific agent run?

Without a run ID threaded through every log entry, every tool call result, and every LLM request, "something went wrong" is where your investigation starts and where it ends.

agent-run-id generates run IDs and threads them through your agent code, so every log entry, tool call, and LLM request in a run carries the same ID.

The Shape of the Fix

from agent_run_id import RunId, RunContext

# Assumes `logger` accepts keyword fields (structlog-style) and that
# call_tool / call_llm are your own wrappers around tools and the LLM client.

# Generate a run ID at the entry point
run_id = RunId.generate()

# Use it everywhere
with RunContext(run_id=run_id):
    logger.info("agent_start", run_id=run_id.value)

    result = call_tool("search_web", query=query, run_id=run_id)

    response = call_llm(messages, extra_headers={"X-Run-Id": run_id.value})

    logger.info("agent_complete", run_id=run_id.value, tokens=response.usage.input_tokens)

One ID per agent run. Threaded through logs, tool calls, LLM requests, and error reports. When something goes wrong, search your logs for the run ID and see everything that happened.

What It Does NOT Do

agent-run-id does not centralize logging. It generates and manages IDs. Your logging infrastructure decides what to do with them.

It does not propagate IDs across process boundaries automatically. If your agent spawns a subprocess or makes an HTTP request to another service, you must pass the ID explicitly via headers, query params, or message body.
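Explicit propagation can be small. The sketch below uses only the standard library and is not part of agent-run-id; the header name X-Run-Id matches the example above, while the environment variable AGENT_RUN_ID and all function names are assumptions made for illustration.

```python
import os
import subprocess
import sys
import uuid

RUN_ID_HEADER = "X-Run-Id"    # header name used in this article's example
RUN_ID_ENV = "AGENT_RUN_ID"   # hypothetical environment variable name


def outgoing_headers(run_id: str) -> dict:
    """Headers to attach to an HTTP request sent to another service."""
    return {RUN_ID_HEADER: run_id}


def run_id_from_headers(headers: dict) -> str:
    """Receiving side: reuse the caller's ID, or mint a new one if it is missing or invalid."""
    lowered = {k.lower(): v for k, v in headers.items()}
    candidate = lowered.get(RUN_ID_HEADER.lower())
    if not candidate:
        return str(uuid.uuid4())
    try:
        return str(uuid.UUID(candidate))
    except ValueError:
        return str(uuid.uuid4())


def run_child(run_id: str) -> str:
    """Pass the ID to a subprocess through its environment and read back what it saw."""
    child_code = f"import os; print(os.environ[{RUN_ID_ENV!r}])"
    completed = subprocess.run(
        [sys.executable, "-c", child_code],
        env={**os.environ, RUN_ID_ENV: run_id},
        capture_output=True,
        text=True,
        check=True,
    )
    return completed.stdout.strip()
```

On the receiving side, the recovered ID would typically be wrapped with RunId.from_string and entered with RunContext so that downstream logs line up with the caller's.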

It does not correlate IDs across parent/child agent runs. If your outer agent spawns inner agents, each gets its own run ID. Correlation across them is a separate concern (store the parent run ID in the child's context).

Inside the Library

RunId is a thin wrapper around a UUID:

import uuid

class RunId:
    def __init__(self, value: str):
        self.value = value

    @classmethod
    def generate(cls) -> "RunId":
        return cls(str(uuid.uuid4()))

    @classmethod
    def from_string(cls, s: str) -> "RunId":
        # Validate UUID format
        uuid.UUID(s)  # raises ValueError if invalid
        return cls(s)

    def short(self) -> str:
        """First 8 chars for display in logs."""
        return self.value[:8]

    def __str__(self) -> str:
        return self.value

    def __repr__(self) -> str:
        return f"RunId({self.value!r})"

RunContext is a context manager that stores the current run ID in a contextvars.ContextVar:

from contextvars import ContextVar

_current_run_id: ContextVar[RunId | None] = ContextVar("run_id", default=None)

class RunContext:
    def __init__(self, run_id: RunId):
        self._run_id = run_id
        self._token = None

    def __enter__(self):
        self._token = _current_run_id.set(self._run_id)
        return self

    def __exit__(self, *args):
        _current_run_id.reset(self._token)

def current_run_id() -> RunId | None:
    return _current_run_id.get()

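ContextVar is safe to use from both threads and asyncio code. Each thread and each asyncio task has its own context, so you can run multiple concurrent agent runs in the same process and each sees its own run ID via current_run_id(). Two caveats: coroutines awaited directly inside a task share that task's context, and a worker thread you start yourself (for example through ThreadPoolExecutor) normally starts with an empty context, so copy the caller's context with contextvars.copy_context() or use asyncio.to_thread, which copies it for you.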
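The isolation is easy to check. The sketch below is self-contained and uses its own ContextVar rather than the library's; it runs three concurrent tasks, has each set an ID, and confirms that code deeper in the call stack (including a function run via asyncio.to_thread, Python 3.9+) sees only that task's ID.

```python
import asyncio
import uuid
from contextvars import ContextVar
from typing import Optional

_run_id: ContextVar[Optional[str]] = ContextVar("run_id", default=None)


async def deep_in_the_stack() -> Optional[str]:
    await asyncio.sleep(0)  # yield control so the tasks interleave
    return _run_id.get()


async def agent_run():
    mine = str(uuid.uuid4())
    _run_id.set(mine)  # visible only inside this task's context
    seen = await deep_in_the_stack()
    # asyncio.to_thread copies the current context into the worker thread
    from_thread = await asyncio.to_thread(_run_id.get)
    return mine, seen, from_thread


async def main():
    # gather() wraps each coroutine in its own task, each with its own context copy
    results = await asyncio.gather(*(agent_run() for _ in range(3)))
    leaked = _run_id.get()  # values set inside the tasks do not leak back out
    return results, leaked


if __name__ == "__main__":
    results, leaked = asyncio.run(main())
    for mine, seen, from_thread in results:
        print(mine[:8], seen == mine, from_thread == mine)
    print("leaked into caller:", leaked)
```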

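The log integration pattern: a logging filter that adds run_id to every log record automatically. Attach it to your handlers rather than to the root logger, because a filter on a logger is skipped for records that propagate up from child loggers:

import logging

class RunIdFilter(logging.Filter):
    def filter(self, record):
        run_id = current_run_id()
        record.run_id = run_id.short() if run_id else "no-run"
        return True

# Assumes handlers are already configured (e.g. via logging.basicConfig)
for handler in logging.getLogger().handlers:
    handler.addFilter(RunIdFilter())

After this, every record that reaches those handlers carries a run_id attribute without any explicit argument passing. Add %(run_id)s to your formatter's format string to make it show up in the output.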
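To see the filter work end to end, here is a self-contained sketch that stands in for the library with a plain ContextVar. The logger names demo_agent and demo_agent.tools and the format string are arbitrary choices for the demo. The filter sits on the handler, so it also applies to records that propagate up from the child logger.

```python
import io
import logging
import uuid
from contextvars import ContextVar
from typing import Optional

_run_id: ContextVar[Optional[str]] = ContextVar("run_id", default=None)


class RunIdFilter(logging.Filter):
    def filter(self, record: logging.LogRecord) -> bool:
        rid = _run_id.get()
        record.run_id = rid[:8] if rid else "no-run"
        return True  # never drop the record, only annotate it


def build_logger(stream) -> logging.Logger:
    handler = logging.StreamHandler(stream)
    handler.addFilter(RunIdFilter())
    handler.setFormatter(logging.Formatter("[%(run_id)s] %(name)s %(message)s"))

    parent = logging.getLogger("demo_agent")
    parent.handlers = [handler]
    parent.setLevel(logging.INFO)
    parent.propagate = False  # keep the demo output out of the root logger

    # Records from this child logger propagate to the parent's handler
    return logging.getLogger("demo_agent.tools")


stream = io.StringIO()
log = build_logger(stream)

log.info("before any run")
token = _run_id.set(str(uuid.uuid4()))
log.info("tool call started")
_run_id.reset(token)

print(stream.getvalue(), end="")
```

The first line is tagged no-run and the second carries the first 8 characters of the generated ID, even though neither log call mentions a run ID.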

When to Use It

Use it from day one for any agent that handles multiple concurrent users or tasks. Adding run IDs retroactively to a production system means revisiting every entry point, tool wrapper, and outbound request, so it is much easier to start with them.

Use it for batch jobs where you need to correlate logs across multiple items in the same batch, or distinguish runs from Monday's batch vs Tuesday's batch.

The short form run_id.short() (first 8 chars) is good for display in logs where space is limited. The full UUID is good for trace correlation and exact lookups.
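The trade-off can be quantified. Eight hex characters carry 32 bits, so short IDs start to collide once you have tens of thousands of runs in the same search window. The sketch below uses the standard birthday-bound approximation; the function name is made up for this example.

```python
import math


def collision_probability(n_runs: int, bits: int) -> float:
    """Birthday-bound approximation: P ~= 1 - exp(-n(n-1) / (2 * 2**bits))."""
    x = n_runs * (n_runs - 1) / (2 * 2 ** bits)
    return -math.expm1(-x)  # expm1 keeps precision when x is tiny


SHORT_BITS = 8 * 4   # 8 hex characters, 4 bits each
FULL_BITS = 122      # random bits in a version-4 UUID

for n in (1_000, 10_000, 100_000):
    print(f"{n:>7} runs: short={collision_probability(n, SHORT_BITS):.4f} "
          f"full={collision_probability(n, FULL_BITS):.1e}")
```

Under this approximation the chance that at least two short IDs collide is roughly 1% at 10,000 runs and roughly 69% at 100,000, which is why exact lookups should use the full UUID.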

Install

pip install git+https://github.com/MukundaKatta/agent-run-id

Then wire it up once at startup and use it in your request handler:

from agent_run_id import RunId, RunContext, current_run_id
import logging

# Setup once at startup
class RunIdFilter(logging.Filter):
    def filter(self, record):
        rid = current_run_id()
        record.run_id = str(rid) if rid else "-"
        return True

# Attach the filter to a handler so it also sees records from child loggers
handler = logging.StreamHandler()
handler.addFilter(RunIdFilter())
handler.setFormatter(
    logging.Formatter("%(asctime)s %(levelname)s run_id=%(run_id)s %(message)s")
)
logging.basicConfig(level=logging.INFO, handlers=[handler])

logger = logging.getLogger(__name__)

# In your request handler (run_agent is your own agent entry point)
async def handle_agent_request(user_id: str, task: str) -> dict:
    run_id = RunId.generate()

    with RunContext(run_id=run_id):
        logger.info("agent_started user_id=%s task=%s", user_id, task[:50])

        try:
            result = await run_agent(task)
            logger.info("agent_completed result_length=%d", len(str(result)))
            return {"run_id": str(run_id), "result": result}
        except Exception as e:
            logger.error("agent_failed error=%s", e, exc_info=True)
            return {"run_id": str(run_id), "error": str(e)}

Sibling Libraries

Library	What it solves
`agenttap`	Wire-level LLM call capture (use run_id as session key)
`agent-decision-log`	WHY-layer decisions (include run_id in every entry)
`agent-step-log`	Per-step JSONL logging (keyed by run_id)
`agentsnap`	Usage snapshots (tag with run_id for correlation)
`agent-event-bus`	In-process pub/sub (include run_id in event payloads)

The observability stack with run IDs: agent-run-id generates the ID, RunContext propagates it automatically, the logging filter adds it to every log line, agenttap captures wire-level calls tagged with the run ID, agentsnap records usage under the run ID.

What's Next

Parent/child run correlation: RunId.generate(parent=parent_run_id) that records the parent relationship. When an outer agent spawns an inner agent, the inner agent's run ID records the parent, enabling tree-shaped trace navigation.

OpenTelemetry integration: RunContext could set a span attribute so that the run ID is automatically propagated through OTel traces. This would correlate agent run IDs with distributed traces without manual plumbing.

Run registry: a RunRegistry that stores run metadata (start time, user ID, task summary, status) and allows lookup by run ID. This is the beginning of an agent observability dashboard — run IDs are the primary key.
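None of this exists yet, but the registry idea is simple enough to sketch. The version below is a hypothetical in-memory shape, not a planned API: the field names follow the metadata listed above, and everything else (method names, status strings) is an assumption.

```python
import time
import uuid
from dataclasses import dataclass, field
from typing import Dict, Optional


@dataclass
class RunRecord:
    run_id: str
    user_id: str
    task_summary: str
    status: str = "running"
    started_at: float = field(default_factory=time.time)
    finished_at: Optional[float] = None


class RunRegistry:
    """Hypothetical in-memory registry keyed by run ID."""

    def __init__(self) -> None:
        self._runs: Dict[str, RunRecord] = {}

    def start(self, user_id: str, task: str) -> RunRecord:
        record = RunRecord(
            run_id=str(uuid.uuid4()),
            user_id=user_id,
            task_summary=task[:50],
        )
        self._runs[record.run_id] = record
        return record

    def finish(self, run_id: str, status: str = "completed") -> None:
        record = self._runs[run_id]  # raises KeyError for unknown IDs
        record.status = status
        record.finished_at = time.time()

    def get(self, run_id: str) -> Optional[RunRecord]:
        return self._runs.get(run_id)


registry = RunRegistry()
run = registry.start(user_id="u-123", task="summarize the quarterly report")
registry.finish(run.run_id, status="failed")
print(registry.get(run.run_id))
```

A real registry would need persistence and eviction; the point here is only that the run ID is the primary key everything else hangs off.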

Built as part of the agent-stack family: composable Python primitives for production LLM agents.