
Mukunda Rao Katta


I logged 300 Hermes runs to one file. trace-session-split cut it into 300.

Hermes Agent Challenge Submission: Build With Hermes Agent

This is a submission for the Hermes Agent Challenge.

Early in my Hermes agent project I made a classic mistake: I logged everything to one file.

It made sense at the time. One file per day, all runs concatenated. After 30 days I had a file with 300 runs and maybe 15,000 events. Every analysis tool I had worked on single-run files. I needed to split the log.

trace-session-split does that in one command.

One command

python3 -m trace_session_split all_runs.jsonl ./runs/
'run-2026-05-01-a'    -> runs/run-2026-05-01-a.jsonl
'run-2026-05-01-b'    -> runs/run-2026-05-01-b.jsonl
'run-2026-05-02-a'    -> runs/run-2026-05-02-a.jsonl
...
300 files written to runs/

The tool auto-detects the split key by trying run_id, session_id, session, run, trace_id, and lane in that order against the first event. If your logs use a different field name, pass --key my_field.

Then the whole toolchain works again

Once I had 300 per-run files, I could use the rest of my tools normally:

# Cost distribution across all runs
for f in runs/*.jsonl; do python3 -m trace_cost "$f"; done | grep "total:"

# Find the run with anomalous latency
for f in runs/*.jsonl; do
    echo "=== $f ==="
    python3 -m trace_anomaly "$f" duration_ms
done

# Stats across all runs
cat runs/*.jsonl | python3 -m trace_stats /dev/stdin duration_ms

Python API

from trace_session_split import split_file, split_by_key, write_splits, load_jsonl

# One-shot
written = split_file("all_runs.jsonl", "./runs/")

# Or step by step for more control
events = load_jsonl("all_runs.jsonl")
groups = split_by_key(events, key="run_id")

for key, run_events in groups.items():
    print(f"{key}: {len(run_events)} events")

written = write_splits(groups, "./runs/", prefix="run_")

What split_by_key returns

A dict mapping field value -> list of events (in input order). You can work with these groups in memory instead of writing to disk:

from trace_session_split import load_jsonl, split_by_key
from trace_stats import field_stats

events = load_jsonl("all_runs.jsonl")
groups = split_by_key(events, key="session_id")

# Compute stats per session without writing files
for session_id, sess_events in groups.items():
    s = field_stats(sess_events, "duration_ms")
    print(f"{session_id}: p95={s.p95:.0f}ms n={s.count}")

Handles mixed-key logs

If some events are missing the split key (e.g., a header event that predates the run_id), they're grouped under the special key "" (empty string). You can drop that group or handle it separately.
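The grouping behavior described here, including the "" bucket, can be sketched in a few lines of plain Python. This is an illustration under assumptions, not the library's implementation: `group_events` is a hypothetical stand-in for `split_by_key`, and the sample events are made up.

```python
from collections import defaultdict


def group_events(events: list[dict], key: str) -> dict[str, list[dict]]:
    """Group events by their value for `key`, preserving input order.

    Events that lack the key are collected under "" (empty string).
    Hypothetical stand-in for split_by_key, for illustration only.
    """
    groups: dict[str, list[dict]] = defaultdict(list)
    for event in events:
        groups[event.get(key, "")].append(event)
    return dict(groups)


# Made-up sample data: one header event without a run_id, two runs.
events = [
    {"type": "header"},
    {"run_id": "a", "duration_ms": 120},
    {"run_id": "b", "duration_ms": 95},
    {"run_id": "a", "duration_ms": 310},
]

groups = group_events(events, key="run_id")

# Set aside the events that had no run_id before any per-run analysis.
orphans = groups.pop("", [])

print(sorted(groups))  # ['a', 'b']
print(len(orphans))    # 1
```

Popping the "" group with a default means the same code works whether or not the log contains any key-less events.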

Safe filenames

Special characters in key values (slashes, colons, spaces) are replaced with underscores by default. run/2026:05:01 becomes run_2026_05_01.jsonl. Pass safe_filenames=False to skip that.
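As a rough idea of what this kind of sanitizing involves, here is a sketch using a regular expression. The exact character set the library replaces is not stated above, so the allow-list here (letters, digits, dot, underscore, hyphen) is an assumption, and `safe_filename` is a hypothetical helper name.

```python
import re


def safe_filename(key: str, suffix: str = ".jsonl") -> str:
    """Replace every character outside [A-Za-z0-9._-] with an underscore.

    The allow-list is an assumption made for this sketch; the library
    may use a different set of replaced characters.
    """
    return re.sub(r"[^A-Za-z0-9._-]", "_", key) + suffix


print(safe_filename("run/2026:05:01"))    # run_2026_05_01.jsonl
print(safe_filename("run-2026-05-01-a"))  # run-2026-05-01-a.jsonl
```

One thing to watch with any scheme like this: two distinct keys such as `a/b` and `a:b` map to the same filename, so it is worth checking for collisions if your key values contain many special characters.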

Technical notes

15 tests. Zero runtime dependencies. Python 3.10+. The test suite covers explicit key splitting, auto-detection of all 6 candidate keys, missing-key grouping under "", filename safety, prefix option, the end-to-end split_file path, and the load_jsonl error case.

Repo: https://github.com/MukundaKatta/trace-session-split

pip install trace-session-split
