I recently wrote about a session distiller for Claude Code that cuts 70MB sessions down to 7MB. That post covered the what and the how. This one covers the why, with data.
Before I wrote a single line of code, I needed to answer three questions:
- What's actually inside a 70MB session?
- What's safe to throw away?
- How do I prove I'm not losing anything that matters?
Dissecting a 70MB Session
I opened a real 70MB session JSONL and categorized every byte. Here's the breakdown:
- JSON envelope (sessionId, cwd, version, gitBranch): ~54%
- Tool results (Read, Bash, Edit, Write, Agent): ~25%
- Base64 images (screenshots, UI captures): ~12%
- Thinking blocks (internal reasoning): ~4%
- Actual conversation text: ~3%
- Progress lines, file-history-snapshots: ~2%
That first line is the surprise. Every single JSONL line repeats the same envelope fields: sessionId, userType, cwd, version, gitBranch, entrypoint. Same values, every line. On a 70MB file with thousands of lines, that's 38MB of identical JSON keys and values repeated over and over.
The actual conversation, the words you and Claude exchanged, is 3% of the file. Everything else is either redundant metadata or tool output that served its purpose hours ago.
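If you want to reproduce this kind of breakdown on your own sessions, a rough sketch is below. It assumes standard Claude Code JSONL, where each line is a JSON object carrying envelope fields alongside a `message` field; the field list mirrors the breakdown above and the function names are mine, not the distiller's.

```python
import json

# Envelope fields that repeat on every JSONL line (per the breakdown above)
ENVELOPE_FIELDS = {"sessionId", "userType", "cwd", "version", "gitBranch", "entrypoint"}

def measure_line(raw: str) -> dict:
    """Approximate byte counts for one JSONL line, split into
    envelope overhead vs. total size."""
    obj = json.loads(raw)
    envelope = {k: v for k, v in obj.items() if k in ENVELOPE_FIELDS}
    return {
        "total": len(raw.encode("utf-8")),
        "envelope": len(json.dumps(envelope).encode("utf-8")),
    }

def measure_session(lines) -> dict:
    """Sum the per-line measurements across a whole session."""
    totals = {"total": 0, "envelope": 0}
    for raw in lines:
        m = measure_line(raw)
        totals["total"] += m["total"]
        totals["envelope"] += m["envelope"]
    return totals
```

Point `measure_session(open("session.jsonl"))` at a real file and divide `envelope` by `total` to see your own overhead ratio.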
Why Tool Results Are Safe to Strip
Not all tool results are equal. Each tool type has different re-obtainability:
| Tool | Safe to strip? | Why |
|---|---|---|
| Read | Yes | The file is still on disk. Claude can Read it again in 50ms. Keeping 28MB of file contents that haven't changed is pure waste. |
| Bash | Mostly | Build outputs, test runs, git log results. Stale the moment they were captured. Keep head 5 + tail 5 lines: the command itself and whether it succeeded or failed. |
| Edit | Partially | The file path and what changed matter. But we don't need the full file content. Keep old_string and new_string previews (200 chars each), enough to remember the intent. |
| Write | Partially | Same idea as Edit. Keep file path and a head/tail preview. |
| Agent | Keep more | Research reports and analysis from subagents contain synthesized knowledge. Up to 2000 chars preserved. |
| Screenshots | Yes | Base64 images from hours ago showing yesterday's UI state. Claude can't even display them after a session grows past certain limits. |
There's a research finding that backs this up. A JetBrains NeurIPS 2025 study tested two approaches to handling tool outputs in coding agents: observation masking (replacing results with placeholders) vs LLM summarization. Task performance was identical. The model doesn't need the raw output once it's already processed it and responded. The response is the knowledge.
Or as one researcher put it: "When Claude reads 847 lines and responds 'this uses JWT with refresh tokens in httpOnly cookies,' that sentence is the knowledge. The 847 lines were consumed to produce it."
The Senior Engineer Audit
Before shipping, I had an AI agent role-play as a senior engineer and audit the entire distiller design. The audit found real bugs and real design problems, categorized by severity.
P0 — Breaks resume:
- Dangling tool_results. 164 MCP tool messages (chrome-devtools, TodoWrite) were being dropped, but their corresponding `tool_result` blocks in user messages weren't. Resume would show orphan results with no matching tool call. Claude gets confused.
- MCP tool inputs missing. Fields like `selector`, `text`, and `key` from chrome-devtools weren't in the generic fallback handler. Click and type actions became empty hints, then got dropped entirely.
P1 — Silent data corruption:
- Parallel tool call mismatch. The distiller tracked a global `lastToolName` variable. When Claude fires 3 parallel tool calls in one message, only the last tool name gets remembered, and all 3 results get the wrong distillation rule applied. Fix: match results by `tool_use_id` instead of by position.
- 93% reduction too aggressive. The initial approach stripped too much content to hit a target number. The audit revealed that redundant JSON envelope fields account for 54% of file size. Strip the envelope instead of stripping content, and you hit 60-80% reduction while preserving more meaningful data.
P2 — Quality issues:
- Agent result limit too low (600 chars → raised to 2000)
- Bash results only kept tail (headers disappear) → changed to head 5 + tail 5
- Thinking blocks all dropped → now preserve first 200 chars of decision rationale
The parallel tool call bug was the scariest. It would have silently applied Read-stripping rules to Bash results and vice versa. No error, no crash, just wrong data in the output. The kind of bug you don't notice until someone resumes a distilled session and Claude has amnesia about half the commands that were run.
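The fix is worth sketching. Instead of remembering the last tool name, build a map from `tool_use_id` to tool name while processing the assistant message, then look each result up by id. The block field names (`type`, `id`, `name`, `tool_use_id`) follow the Anthropic message format; the surrounding structure is simplified:

```python
def index_tool_calls(assistant_content: list) -> dict:
    """Map tool_use_id -> tool name for every tool_use block in one
    assistant message, so parallel calls can't shadow each other."""
    return {
        block["id"]: block["name"]
        for block in assistant_content
        if block.get("type") == "tool_use"
    }

def route_results(user_content: list, call_index: dict) -> list:
    """Pair each tool_result with the tool that produced it, via its id.
    Results whose call is unknown are flagged as orphans so they can be
    dropped too (avoiding the dangling tool_result bug from the P0 list)."""
    routed = []
    for block in user_content:
        if block.get("type") != "tool_result":
            continue
        tool = call_index.get(block.get("tool_use_id"), "<orphan>")
        routed.append((tool, block))
    return routed
```

With id matching, three parallel calls each get their own rule, and an orphaned result is detected instead of silently passed through.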
Proving Information Retention
I ran the distiller against a real session and then had two independent agents reconstruct what happened in the session from (a) the original and (b) the distilled version. Then I compared:
| Content | Original | Distilled | Missed? |
|---|---|---|---|
| Project objectives and goals | Yes | Yes | — |
| Market research findings | Yes | Yes | — |
| Code architecture decisions | Yes | Yes | — |
| File changes (which files, what changed) | Yes | Yes | — |
| Feature modes and configurations | Yes | Yes (via tooltip/cinematic/plain) | — |
| FFmpeg multi-threading bug details | Yes (detailed) | No | Distilled missed |
| Zoom merge logic bug specifics | Yes (detailed) | No | Distilled missed |
| Event timing bug (end < start) | Yes (detailed) | No | Distilled missed |
| New file creation (tooltip.js, bubble.js) | No | Yes | Original missed |
| Git push SSH configuration | No | Yes | Original missed |
| Demo GIF file sizes | No | Yes (sizes noted) | Original missed |
| API tool switch (Tavily → Exa) | No | Yes | Original missed |
Overall information retention: 85-90%. The distilled version actually caught some things the original-based reconstruction missed, because the distiller preserves conversation text verbatim while the original is so large that even an AI agent skims past details buried in thousands of lines of tool output.
The 10-15% loss is debugging process details, specific error messages buried deep in Bash outputs. These are things you'd need the backup for, which is why the distiller always creates one.
Why Not LLM Summarization?
Claude Code itself uses LLM summarization for its auto-compact feature. There's a good reason for that in their context, and a good reason I chose differently.
Why CC uses summarization:
- Prompt cache economics. Summary goes at the start of the conversation as a stable prefix. Every subsequent API call gets a cache hit on that prefix. Extractive filtering changes the content each time, invalidating the cache. The token savings from distillation get eaten by cache misses.
- API constraints. Claude's API requires strict alternating user/assistant turns. Extractive filtering can easily break that alternation if you're not careful about which messages you keep.
Why I chose extractive:
- Zero cost. No API calls, no tokens burned. The distiller runs in under 2 seconds on a 70MB file using pure JSON parsing.
- Zero hallucination risk. LLM summarization can lose details or introduce inaccuracies. Extractive filtering keeps original words or drops them. Nothing in between.
- Different use case. CC auto-compact runs during live sessions where cache hits matter. Distillation runs offline, after the session, where cache economics don't apply.
The trade-off is real. Summarization gives you better compression. Extractive gives you perfect fidelity on what it keeps. For a tool whose job is "make old sessions resumable without losing context," I'll take fidelity.
Four Versions to Get It Right
The distiller went through four iterations. Here's how each one improved:
| Feature | V1 | V2 | V3 | V4 |
|---|---|---|---|---|
| Output size | 7.2M (90%) | 5.3M (93%) | 7.2M (90%) | 7.1M (90%) |
| Edit handling | Path only | old/new 200ch | old/new 200ch | old/new 200ch |
| Thinking blocks | All dropped | All dropped | First 200ch | First 200ch |
| MCP tool messages | Partial regex | Partial drop | All preserved | All preserved |
| Bash results | Head 300ch | Tail 8 lines | Head 5 + tail 5 | Head 5 + tail 5 |
| Agent results | 300 chars | 600 chars | 2000 chars | 2000 chars |
| Parallel tool safety | Bug | Bug | tool_use_id match | tool_use_id match |
| Envelope stripping | None | None | Strip redundant | Strip redundant |
| Backup + index | — | — | — | Yes |
V2 achieved 93% reduction but was too aggressive. V3 fixed the critical bugs (parallel tools, MCP inputs) and backed off on compression. V4 added the backup/index system so nothing is ever truly lost.
The counterintuitive lesson: more compression isn't better. V2's 93% broke things. V3/V4's 90% preserves everything that matters. The extra 3% bought correctness.
The Envelope Insight
The single biggest optimization wasn't about tool results at all. It was about the JSON envelope.
Every JSONL line in a Claude Code session looks like this:
```json
{"sessionId":"abc-123","userType":"external","version":4,"cwd":"/home/user/project","gitBranch":"main","entrypoint":"cli","message":{...}}
```
That's roughly 150-200 bytes of identical metadata per line. On a session with 15,000 lines, that's 2-3MB of pure repetition. The distiller strips these after the first occurrence: if the sessionId and cwd haven't changed (they never do), subsequent lines only contain the message field.
This one change accounts for a significant chunk of the reduction without touching any content at all. No information lost. Just deduplication.
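A minimal sketch of that dedup is below. The field names come from the example envelope above; the real distiller presumably handles more fields and edge cases, but the core idea fits in one pass:

```python
import json

ENVELOPE_FIELDS = ("sessionId", "userType", "cwd", "version", "gitBranch", "entrypoint")

def strip_envelopes(jsonl_lines):
    """Drop envelope fields whose value matches the first occurrence.
    If a value ever changes, it is kept, so no information is lost."""
    seen = {}
    for raw in jsonl_lines:
        obj = json.loads(raw)
        slim = {}
        for key, value in obj.items():
            if key in ENVELOPE_FIELDS:
                if seen.get(key) == value:
                    continue             # identical to a prior line: drop
                seen[key] = value        # first occurrence (or changed): keep
            slim[key] = value
        yield json.dumps(slim)
```

Because the drop condition is equality against a recorded value, a session whose `gitBranch` actually changes mid-way keeps both values.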
Real Numbers Across Sessions
| Session | Original | Distilled | Reduction |
|---|---|---|---|
| 4-hour coding session (Pagecast) | 70MB | 7.1MB | 90% |
| 2-hour refactor session | 17MB | 353KB | 98% |
| Short debugging session | 465KB | 321KB | 31% |
The pattern: longer sessions with more tool calls benefit more. A short session that's mostly conversation barely shrinks. A marathon session with hundreds of file reads and screenshots drops 90-98%.
CC Auto-Compact Destroys the Original
One detail that forced the backup design: Claude Code's auto-compact is an in-place rewrite. When compact triggers, the original JSONL gets a compact_boundary marker inserted. Old conversation turns are replaced with a summary. Original tool results and thinking blocks disappear. Same file, overwritten.
If you distill a session and later CC auto-compacts the original, the distilled version's backup is the only complete copy that exists. That's not duplication. That's preservation.
70MB in 2026 is nothing. 50 sessions backed up is 3.5GB. One external drive holds thousands of sessions. The cost of keeping everything is negligible. The cost of losing something you needed is not.
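Given that, it can be worth checking whether an original file has already been compacted before trusting it as the source of truth. The exact JSON shape of the marker entry is an assumption on my part, so this sketch deliberately uses a loose substring scan for the `compact_boundary` token mentioned above:

```python
def has_compact_boundary(jsonl_lines) -> bool:
    """Crude check: scan each line for the compact_boundary marker.
    A substring match is deliberately loose, since the exact JSON shape
    of the marker entry may vary across Claude Code versions."""
    return any("compact_boundary" in line for line in jsonl_lines)
```

If this returns True, the pre-compact content is already gone from the original, and the distiller's backup is the complete copy.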
Try It
```shell
npx @mcpware/claude-code-organizer --distill <session.jsonl>
```
The distiller ships with CCO v0.17.0. Dashboard button, CLI flag, API endpoint. Backup is automatic, index is generated, and the original is never modified until you explicitly choose to replace it.
About Me
CS dropout. Building tools for the Claude Code ecosystem. github.com/ithiria894
⭐ Star the repo if you've ever lost context in a bloated session.