DEV Community

Mukunda Rao Katta


I had 800 lines of Hermes agent audit log. trace-tree turned it into a tree I could read.

Hermes Agent Challenge Submission: Build With Hermes Agent

This is a submission for the Hermes Agent Challenge.

I ran a Hermes agent overnight. In the morning I opened the audit log. There were 800 lines of JSONL. I scrolled for a minute, sighed, and closed the file.

The problem was not that the agent had failed. The problem was that I could not tell what it had done.

This is what the raw log looked like. Five lines from one session:

{"ts":1779638601.262,"session_id":"abc12","kind":"session_open","tool":null,"args_hash":null,"url":null,"usd":0.0,"error":null,"extra":{}}
{"ts":1779638601.265,"session_id":"abc12","kind":"tool_ok","tool":"locus.payments.charge","args_hash":"aeff9a9e","url":null,"usd":4.99,"error":null,"extra":{"latency_ms":12}}
{"ts":1779638601.266,"session_id":"abc12","kind":"budget_denied","tool":"locus.payments.charge","args_hash":"5715bbc0","url":null,"usd":7.0,"error":"budget exceeded: 11.99 > 10","extra":{"latency_ms":1}}
{"ts":1779638601.266,"session_id":"abc12","kind":"tool_denied","tool":"locus.payments.charge","args_hash":"84f50ae9","url":null,"usd":0.0,"error":"args invalid: -1.0 < min 0.5","extra":{}}
{"ts":1779638601.267,"session_id":"abc12","kind":"egress_denied","tool":null,"args_hash":null,"url":"https://evil.attacker.example/exfil","usd":0.0,"error":"host not in allowlist","extra":{}}

You can read one of those, sure. Maybe two. By line forty your brain has tuned out. There is no shape to it. Every row is the same width. Every row carries six fields you do not care about and one you do, and the one you care about is in a different column each time.

What I actually wanted was a tree. A session is a root. Tool calls are children. Denied calls are children with an error attached. Stuff the agent tried to do that got blocked should look blocked. That is how I read traces in any other system, and there was no reason this log could not be drawn the same way.

So I wrote trace-tree.

What trace-tree does

It reads a JSONL audit log and prints a tree. That is the whole library.

pip install trace-tree
trace-tree runs/audit.jsonl

The session those five lines belong to renders as this (the session_close row is not in the sample above):

session-abc12 [4.99 USD, 1 call, 850 ms]
├─ session_open [ts=1779638601.262143]
├─ tool_ok locus.payments.charge [4.99 USD, 12 ms]
│  └─ args_hash=aeff9a9ed25b8e06
├─ budget_denied locus.payments.charge [7.00 USD attempted, 1 ms]
│  ├─ args_hash=5715bbc0d738a5a0
│  └─ error="budget exceeded: 11.99 > 10"
├─ tool_denied locus.payments.charge
│  ├─ args_hash=84f50ae9b21ff1d0
│  └─ error="args invalid: -1.0 < min 0.5"
├─ egress_denied
│  ├─ url=https://evil.attacker.example/exfil
│  └─ error="host not in allowlist"
└─ session_close [4.99 USD, 837 ms]

I can read that. The agent tried to charge a customer 4.99 USD and the charge went through. It then tried to charge 7 USD and the budget guard stopped it, tried to charge a negative amount and argument validation stopped it, and tried to reach an attacker URL and the egress guard stopped it. The session ended with 4.99 USD actually spent. The whole run took 850 ms.

The session root carries an aggregate. Only tool_ok rows count toward spend, so attempted-and-denied charges are visible but they do not pollute the total. That single rule made the difference between a tree I could trust and a tree that lied about my bill.
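The spend rule is small enough to sketch in a few lines. This is an illustration of the rule as described, not the library's actual code: `session_totals` is a hypothetical helper, and the sample rows are the five from the top of the post, trimmed to the fields the rule needs.

```python
import json

# The five sample rows from earlier in the post, trimmed to the fields used here.
SAMPLE = """\
{"session_id":"abc12","kind":"session_open","usd":0.0}
{"session_id":"abc12","kind":"tool_ok","tool":"locus.payments.charge","usd":4.99}
{"session_id":"abc12","kind":"budget_denied","tool":"locus.payments.charge","usd":7.0}
{"session_id":"abc12","kind":"tool_denied","tool":"locus.payments.charge","usd":0.0}
{"session_id":"abc12","kind":"egress_denied","usd":0.0}
"""


def session_totals(lines):
    """Return (spent_usd, ok_calls), counting only rows whose kind is tool_ok.

    Hypothetical helper: denied rows keep their attempted amount in the log,
    but they never reach the total.
    """
    spent = 0.0
    calls = 0
    for line in lines:
        line = line.strip()
        if not line:
            continue  # tolerate blank lines in the JSONL file
        row = json.loads(line)
        if row.get("kind") == "tool_ok":
            spent += float(row.get("usd") or 0.0)
            calls += 1
    return round(spent, 2), calls


spent, calls = session_totals(SAMPLE.splitlines())
print(f"[{spent:.2f} USD, {calls} call]")  # prints "[4.99 USD, 1 call]"
```

The 7.00 USD budget_denied row is still in the input, so a renderer can show it as "attempted", but it never touches the total.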

Why a tree

Most agent audit tooling I see falls into two camps. Either you ship the log to a hosted observability vendor and view it in their UI, or you grep it. The first is too heavy and the second too crude for the case where you just want to know what the agent did in the last run.

The tree fits in the terminal. No login. No upload. No vendor lock. You open the file, you read the tree, you close the file. If the run was good you move on. If the run was bad you have a clear pointer at which step went wrong.

Reading several shapes

I have a bunch of small libraries that write audit logs in slightly different shapes. agenttrace writes parent_span_id and latency_ms. agentleash writes the shape you saw above, with args_hash and a top-level error field. agentsnap writes a single object per run with a steps list nested inside. agent-step-log writes per-step rows with step instead of kind.

I did not want a separate reader for each one. So the parser normalizes all of them into one Event shape. It accepts a few common aliases per field:

concept    | accepted keys
event kind | kind, event, type, step, name
tool name  | tool, tool_name, function, name
parent id  | parent_span_id, parent_id, parent
cost       | usd, cost_usd, cost, price_usd
latency    | latency_ms, duration_ms, elapsed_ms (top level or under extra)
error      | error, err, message (top level or under extra)

If your shape is none of those, the tree still draws something. The row just collapses to its kind. You lose nothing for trying.
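The alias lookup can be sketched roughly as follows. This is an assumption-laden illustration, not the library's source: the `Event` dataclass, its field names, the `normalize` function, and the "unknown" fallback kind are all hypothetical, and the rule that the first listed alias wins is a guess. The alias lists themselves come from the table above.

```python
from dataclasses import dataclass
from typing import Optional

# Alias lists copied from the table; the first key present wins (an assumption).
ALIASES = {
    "kind": ("kind", "event", "type", "step", "name"),
    "tool": ("tool", "tool_name", "function", "name"),
    "parent": ("parent_span_id", "parent_id", "parent"),
    "usd": ("usd", "cost_usd", "cost", "price_usd"),
    "latency_ms": ("latency_ms", "duration_ms", "elapsed_ms"),
    "error": ("error", "err", "message"),
}
# Per the table, only these two concepts are also looked up under "extra".
NESTED_OK = {"latency_ms", "error"}


@dataclass
class Event:
    """Hypothetical normalized shape; the real Event may differ."""
    kind: str = "unknown"
    tool: Optional[str] = None
    parent: Optional[str] = None
    usd: float = 0.0
    latency_ms: Optional[float] = None
    error: Optional[str] = None


def _first(row, keys):
    """Return the first non-null value found under any of the given keys."""
    for key in keys:
        if row.get(key) is not None:
            return row[key]
    return None


def normalize(row):
    """Map one raw log row onto Event, trying each alias in order."""
    extra = row.get("extra") if isinstance(row.get("extra"), dict) else {}
    values = {}
    for concept, keys in ALIASES.items():
        value = _first(row, keys)
        if value is None and concept in NESTED_OK:
            value = _first(extra, keys)
        if value is not None:
            values[concept] = value
    return Event(**values)


event = normalize({"event": "tool_ok", "tool_name": "search",
                   "cost_usd": 0.25, "extra": {"duration_ms": 12}})
print(event.kind, event.tool, event.usd, event.latency_ms)
```

Note that name appears in both the event-kind and tool-name lists, so in this sketch a row that carries only name fills both fields.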

Two tree modes

By default trace-tree groups events by session_id. Every session becomes a root. Every event in that session is a direct child. This matches what agentleash and agent-step-log actually emit, since those formats do not track per-call parent ids.
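The default grouping is easy to picture in code. A minimal sketch, assuming rows are already parsed into dicts; `group_by_session` and `render_flat` are hypothetical names, and the real renderer prints more detail per row than this does.

```python
from collections import defaultdict


def group_by_session(rows):
    """Map each session_id to its rows, preserving file order."""
    sessions = defaultdict(list)
    for row in rows:
        sessions[row.get("session_id", "unknown")].append(row)
    return dict(sessions)


def render_flat(sessions):
    """Draw one root per session with every event as a direct child."""
    lines = []
    for session_id, rows in sessions.items():
        lines.append(f"session-{session_id}")
        for i, row in enumerate(rows):
            # The last child gets the corner glyph, the others get a tee.
            branch = "└─" if i == len(rows) - 1 else "├─"
            label = row.get("kind", "?")
            if row.get("tool"):
                label += f" {row['tool']}"
            lines.append(f"{branch} {label}")
    return "\n".join(lines)


rows = [
    {"session_id": "abc12", "kind": "session_open", "tool": None},
    {"session_id": "abc12", "kind": "tool_ok", "tool": "locus.payments.charge"},
    {"session_id": "abc12", "kind": "egress_denied", "tool": None},
]
print(render_flat(group_by_session(rows)))
```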

If your log does carry parent ids (like agenttrace does), you pass --parent-key parent_span_id and trace-tree builds a real nested tree. A tool call that triggers a sub-call shows up as a child of the parent. Same library, two modes, picked at the CLI.

from trace_tree import render_file, Tree

# Flat session view
print(render_file("runs/audit.jsonl"))

# Real parent-child nesting
tree = Tree.from_jsonl("runs/audit.jsonl", parent_key="parent_span_id")
print(tree.render(max_depth=10, show_timing=True))
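
For the curious, nesting by parent id could be built along these lines. This is a sketch under stated assumptions rather than the library's implementation: the `span_id` field name, `build_children`, and `render_nested` are hypothetical, and rows whose parent is missing or unknown are simply treated as roots.

```python
def build_children(rows, id_key="span_id", parent_key="parent_span_id"):
    """Index rows by parent id. Rows with a missing or unknown parent become roots."""
    known = {row[id_key] for row in rows if row.get(id_key) is not None}
    children = {}
    roots = []
    for row in rows:
        parent = row.get(parent_key)
        if parent is not None and parent in known and parent != row.get(id_key):
            children.setdefault(parent, []).append(row)
        else:
            roots.append(row)
    return roots, children


def render_nested(nodes, children, id_key="span_id", prefix="", depth=0, max_depth=10):
    """Return the tree as a list of lines, descending at most max_depth levels."""
    lines = []
    for i, row in enumerate(nodes):
        last = i == len(nodes) - 1
        lines.append(f"{prefix}{'└─' if last else '├─'} {row.get('kind', '?')}")
        kids = children.get(row.get(id_key), [])
        if kids and depth + 1 < max_depth:
            # Keep the vertical rail only while siblings remain below this node.
            child_prefix = prefix + ("   " if last else "│  ")
            lines.extend(render_nested(kids, children, id_key,
                                       child_prefix, depth + 1, max_depth))
    return lines


rows = [
    {"span_id": "a", "parent_span_id": None, "kind": "plan"},
    {"span_id": "b", "parent_span_id": "a", "kind": "search"},
    {"span_id": "c", "parent_span_id": "b", "kind": "fetch"},
    {"span_id": "d", "parent_span_id": "a", "kind": "summarize"},
]
roots, children = build_children(rows)
print("\n".join(render_nested(roots, children)))
```

Because rendering starts only from roots and stops at max_depth, a log with a parent cycle produces no output for the cyclic rows instead of recursing forever.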

What it is not

trace-tree does not try to be a tracing system. It does not own the write side. It does not send anything anywhere. It does not parse non-JSONL files. It does not do colors (yet).

It reads a file and prints a tree. That is the whole thing. About 150 lines of stdlib Python.

Where it fits with my other Hermes libraries

trace-tree sits at the end of a small chain. agenttrace writes the log. agentleash writes a stricter log with budget and egress guards. agentsnap snapshots tool-call traces. trace-tree reads any of those and gives you something you can actually look at.

If you run a Hermes agent for any real workload, sooner or later you will end up staring at a JSONL file and wondering what happened. This is a tiny tool for that moment.

Try it

pip install trace-tree
trace-tree your-log.jsonl

Repo: https://github.com/MukundaKatta/trace-tree
