This is a submission for the Hermes Agent Challenge.
Early in my Hermes agent project I made a classic mistake: I logged everything to one file.
It made sense at the time. One file per day, all runs concatenated. After 30 days I had a file with 300 runs and maybe 15,000 events. Every analysis tool I had expected a single-run file, so I needed to split the log.
trace-session-split does that in one command.
One command
python3 -m trace_session_split all_runs.jsonl ./runs/
'run-2026-05-01-a' -> runs/run-2026-05-01-a.jsonl
'run-2026-05-01-b' -> runs/run-2026-05-01-b.jsonl
'run-2026-05-02-a' -> runs/run-2026-05-02-a.jsonl
...
300 files written to runs/
The tool auto-detects the split key by trying run_id, session_id, session, run, trace_id, and lane in that order against the first event. If your logs use a different field name, pass --key my_field.
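To make the detection order concrete, here is a minimal sketch of that kind of lookup. It is not the tool's actual source: the names CANDIDATE_KEYS and detect_split_key are hypothetical, and only the candidate list and the "check the first event" rule come from the description above.

```python
import json

# Candidate field names, in the order described above.
CANDIDATE_KEYS = ("run_id", "session_id", "session", "run", "trace_id", "lane")


def detect_split_key(first_event, candidates=CANDIDATE_KEYS):
    """Return the first candidate field present in the event, or None.

    Hypothetical helper for illustration; the real tool may differ.
    """
    for name in candidates:
        if name in first_event:
            return name
    return None


first = json.loads('{"ts": 1, "session_id": "s-1", "lane": "main"}')
print(detect_split_key(first))  # session_id
```

Because only the first event is inspected, a log whose first line is a header without any of these fields is a case where passing --key explicitly is the safer choice.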
Then the whole toolchain works again
Once I had 300 per-run files, I could use the rest of my tools normally:
# Cost distribution across all runs
for f in runs/*.jsonl; do python3 -m trace_cost "$f"; done | grep "total:"
# Find the run with anomalous latency
for f in runs/*.jsonl; do
  echo "=== $f ==="
  python3 -m trace_anomaly "$f" duration_ms
done
# Stats across all runs
cat runs/*.jsonl | python3 -m trace_stats /dev/stdin duration_ms
Python API
from trace_session_split import split_file, split_by_key, write_splits, load_jsonl
# One-shot
written = split_file("all_runs.jsonl", "./runs/")
# Or step by step for more control
events = load_jsonl("all_runs.jsonl")
groups = split_by_key(events, key="run_id")
for key, run_events in groups.items():
    print(f"{key}: {len(run_events)} events")
written = write_splits(groups, "./runs/", prefix="run_")
What split_by_key returns
A dict mapping field value -> list of events (in input order). You can work with these groups in memory instead of writing to disk:
groups = split_by_key(events, key="session_id")
# Compute stats per session without writing files
from trace_stats import field_stats
for session_id, sess_events in groups.items():
    s = field_stats(sess_events, "duration_ms")
    print(f"{session_id}: p95={s.p95:.0f}ms n={s.count}")
Handles mixed-key logs
If some events are missing the split key (e.g., a header event that predates the run_id), they're grouped under the special key "" (empty string). You can drop that group or handle it separately.
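Here is a rough sketch of how that grouping behaves, using a stand-in function rather than the library itself. group_by_key is a hypothetical name, and the sample events are made up for illustration. The only behavior taken from the description above is that events without the key land under "".

```python
from collections import defaultdict


def group_by_key(events, key):
    """Group events by event[key], preserving input order.

    Stand-in for illustration: events lacking the key go under "".
    """
    groups = defaultdict(list)
    for event in events:
        groups[event.get(key, "")].append(event)
    return dict(groups)


# Made-up sample data: one header event with no run_id, then two runs.
events = [
    {"type": "header"},
    {"run_id": "a", "duration_ms": 12},
    {"run_id": "b", "duration_ms": 7},
    {"run_id": "a", "duration_ms": 30},
]

groups = group_by_key(events, "run_id")
orphans = groups.pop("", [])  # drop the keyless group, or keep it for later
print(sorted(groups), len(orphans))  # ['a', 'b'] 1
```

Popping the "" entry before writing files is one way to keep header events out of the per-run output while still being able to inspect them.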
Safe filenames
Special characters in key values (slashes, colons, spaces) are replaced with underscores by default. run/2026:05:01 becomes run_2026_05_01.jsonl. Pass safe_filenames=False to skip that.
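The replacement rule can be approximated with a single regular expression. This is a sketch under assumptions: safe_filename is a hypothetical helper, and the exact set of characters the tool treats as unsafe is a guess beyond the slashes, colons, and spaces mentioned above.

```python
import re


def safe_filename(key_value, prefix="", suffix=".jsonl"):
    """Turn a key value into a filesystem-friendly file name.

    Assumption: anything outside letters, digits, dot, underscore,
    and hyphen is replaced with an underscore.
    """
    cleaned = re.sub(r"[^A-Za-z0-9._-]", "_", str(key_value))
    return f"{prefix}{cleaned}{suffix}"


print(safe_filename("run/2026:05:01"))  # run_2026_05_01.jsonl
```

One thing to watch with any scheme like this: two different key values can map to the same file name (for example "a/b" and "a:b"), so it is worth checking for collisions if your run IDs contain a lot of punctuation.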
Technical notes
15 tests. Zero runtime dependencies. Python 3.10+. The test suite covers explicit key splitting, auto-detection of all 6 candidate keys, missing-key grouping under "", filename safety, prefix option, the end-to-end split_file path, and the load_jsonl error case.
Repo: https://github.com/MukundaKatta/trace-session-split
pip install trace-session-split