ithiria894

Posted on Apr 6 • Originally published at dev.to

My Claude Code Sessions Hit 70MB. So I Built a Distiller.

#ai #showdev #opensource #productivity

I had a 4-hour coding session with Claude Code. Felt productive. Fixed a bunch of bugs, refactored a module, reviewed some screenshots Claude took of the UI along the way.

Then I tried to --resume it the next day.

The session file was 73MB. Claude loaded it, burned through half the context window on old tool outputs and base64-encoded screenshots from yesterday, and started forgetting things I'd said 20 minutes ago. The conversation was fine. The cargo it was dragging around was not.

I opened the JSONL. Here's what 73MB of "session" actually looks like:

Conversation text:          ~4MB  (what we actually said)
Tool results (Read):       ~28MB  (file contents Claude already read)
Tool results (Bash):        ~9MB  (build outputs, test runs, logs)
Base64 screenshots:        ~22MB  (UI screenshots, now stale)
Tool results (Edit/Write):  ~6MB  (diffs and file previews)
Everything else:            ~4MB  (metadata, tool_use blocks)

Roughly 90% of the file (the 65MB of tool results and screenshots) is stuff Claude doesn't need to resume the conversation. The Read results are files that still exist on disk. The screenshots are from yesterday's UI state. The Bash outputs are build logs from 6 hours ago.
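If you want a breakdown like the one above for your own sessions, a rough approach is to walk the JSONL and tally serialized bytes per content-block type. This is a minimal sketch, not the tool's actual code: `tallySessionBytes` is a hypothetical helper, and the `message.content` shape with typed blocks is an assumption about the session format.

```javascript
// Hypothetical helper: tally serialized size per content-block category.
// Assumes each JSONL line looks like { message: { content: string | [blocks] } }.
function tallySessionBytes(jsonlText) {
  const totals = {};
  const add = (key, value) => {
    const bytes = Buffer.byteLength(JSON.stringify(value), "utf8");
    totals[key] = (totals[key] ?? 0) + bytes;
  };

  for (const line of jsonlText.split("\n")) {
    if (!line.trim()) continue; // skip blank lines
    let entry;
    try {
      entry = JSON.parse(line);
    } catch {
      continue; // skip malformed lines rather than abort the whole tally
    }
    const content = entry?.message?.content;
    if (typeof content === "string") {
      add("text", content);
    } else if (Array.isArray(content)) {
      // One bucket per block type: text, tool_use, tool_result, image, ...
      for (const block of content) add(block?.type ?? "unknown", block);
    } else {
      add("other", entry); // metadata lines with no message content
    }
  }
  return totals;
}
```

Images nested inside a `tool_result` are counted under `tool_result` here, so separating screenshots from other tool output would need one more level of inspection.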

So I built a distiller.

What Session Distiller Does

It reads a session JSONL, keeps every word of the actual conversation verbatim, and applies per-tool-type rules to strip results down to what's useful for context:

Tool type	What's kept	Why
Read	Nothing (stripped entirely)	The file is still on disk. Claude can re-read it.
Bash	First 5 + last 5 lines	You need the command and whether it succeeded. Not 800 lines of webpack output.
Edit	File path + 200-char preview of old/new	Enough to remember what changed.
Write	File path + head/tail preview	Same idea.
Agent	Up to 2000 chars	Research reports are worth keeping. Build logs aren't.

The key decision was extractive filtering, not summarization. I don't pass anything through an LLM. Every word of conversation text is preserved exactly as-is. Tool results are either kept (trimmed) or dropped based on deterministic rules. No tokens spent, no hallucination risk, no "the AI summarized away the one detail I needed."
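To make "deterministic rules" concrete, here is a small sketch of what per-tool trimming could look like when it works on the result text alone. The function names and constants are hypothetical, the Edit and Write rules are simplified (the file path and old/new strings would really come from the matching tool call, which this sketch ignores), and leaving unknown tools untouched is my assumption, not something the rules table specifies.

```javascript
// Hypothetical sketch of deterministic, per-tool trimming rules.
const HEAD_TAIL_LINES = 5; // Bash/Write: first 5 + last 5 lines
const PREVIEW_CHARS = 200; // Edit: preview length
const AGENT_MAX_CHARS = 2000; // Agent: cap on report text

// Keep the first n and last n lines, with a marker for what was cut.
function headTail(text, n = HEAD_TAIL_LINES) {
  const lines = text.split("\n");
  if (lines.length <= n * 2) return text; // already short enough
  const omitted = lines.length - n * 2;
  return [
    ...lines.slice(0, n),
    `[... ${omitted} lines omitted ...]`,
    ...lines.slice(-n),
  ].join("\n");
}

// Returns the trimmed text, or null when the result is dropped entirely.
function distillResultText(toolName, text) {
  switch (toolName) {
    case "Read":
      return null; // file is still on disk
    case "Bash":
    case "Write":
      return headTail(text);
    case "Edit":
      return text.slice(0, PREVIEW_CHARS);
    case "Agent":
      return text.slice(0, AGENT_MAX_CHARS);
    default:
      return text; // assumption: unknown tools are left untouched
  }
}
```

Because every branch is a plain string operation, running it twice on the same input gives the same output, which is the property that summarization can't offer.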

Typical result: 70MB session → 7MB distilled. 90% reduction.

The original session is backed up before anything changes. You always have the full version if you need it.

The Tool-ID Matching Problem

This sounds simple until you hit parallel tool calls.

Claude Code often fires multiple tool calls in a single assistant message. A tool_result block references its parent by tool_use_id, not by position. My first implementation tracked a global lastToolName variable: "the most recent tool_use was a Read, so the next tool_result must be a Read result." That breaks immediately when an assistant message contains three parallel tool calls.

The fix: build a toolIdMap from every tool_use block (mapping id → tool name), then look up each tool_result.tool_use_id to find the correct tool type. Now parallel calls work correctly. A Read result and a Bash result in the same message get their own distillation rules applied independently.

// Build map: tool_use_id → tool name
if (block?.type === "tool_use" && block.id && block.name) {
  toolIdMap.set(block.id, block.name);
}

// Look up correct tool for each result
if (block?.type === "tool_result") {
  const toolName = toolIdMap.get(block.tool_use_id);
  // Now we know: this result came from "Read", "Bash", etc.
  return distillByToolType(toolName, block);
}

Small detail. Would have caused silent data corruption without it.
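For anyone who wants to see the failure mode, here is a self-contained sketch of the ID-based lookup run against three parallel calls. `labelToolResults` and the sample messages are hypothetical, and the `{ role, content }` message shape is an assumption. A "last tool name" tracker would label all three results as Edit; the map labels each one correctly.

```javascript
// Sketch: resolve each tool_result to its tool by ID, not by position.
// Assumes messages shaped like { role, content: [ ...blocks ] }.
function labelToolResults(messages) {
  const toolIdMap = new Map();
  const blocksOf = (msg) => (Array.isArray(msg?.content) ? msg.content : []);

  // Pass 1: record every tool_use id → tool name.
  for (const msg of messages) {
    for (const block of blocksOf(msg)) {
      if (block?.type === "tool_use" && block.id && block.name) {
        toolIdMap.set(block.id, block.name);
      }
    }
  }

  // Pass 2: pair each tool_result with the tool that produced it.
  const labeled = [];
  for (const msg of messages) {
    for (const block of blocksOf(msg)) {
      if (block?.type === "tool_result") {
        labeled.push({
          toolUseId: block.tool_use_id,
          toolName: toolIdMap.get(block.tool_use_id) ?? "unknown",
        });
      }
    }
  }
  return labeled;
}

// Three parallel calls in one assistant message; results arrive out of order.
const parallelMessages = [
  {
    role: "assistant",
    content: [
      { type: "tool_use", id: "toolu_a", name: "Read", input: {} },
      { type: "tool_use", id: "toolu_b", name: "Bash", input: {} },
      { type: "tool_use", id: "toolu_c", name: "Edit", input: {} },
    ],
  },
  {
    role: "user",
    content: [
      { type: "tool_result", tool_use_id: "toolu_b", content: "build ok" },
      { type: "tool_result", tool_use_id: "toolu_a", content: "file contents" },
      { type: "tool_result", tool_use_id: "toolu_c", content: "edit applied" },
    ],
  },
];

console.log(labelToolResults(parallelMessages));
```

A result whose ID never appeared in a `tool_use` block comes back as `"unknown"`, so the caller can decide whether to keep it as-is rather than guess.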

Image Trimmer: The Targeted Fix

Sometimes you don't need full distillation. You just need to remove the screenshots.

I kept hitting Claude Code's "image exceeds dimension limit" warning after long sessions with a lot of UI review. The session file was fine except for 20-30MB of base64 image data that Claude couldn't even display anymore.

So I wrote a separate tool that does exactly one thing: find every image block in the JSONL, replace it with [image redacted], leave everything else untouched.

node src/trim-images.mjs ~/.claude/projects/.../session.jsonl
# → Redacted 47 image(s), saved 24832K

It also handles images nested inside tool_result blocks (which is where most screenshots end up, since they come back as results of Bash commands that ran adb screencap or similar).
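The nested case is the part that's easy to miss, so here is a rough sketch of the idea: redact image blocks at the top level and recurse into `tool_result` content. This is not the contents of `trim-images.mjs`. The function names are hypothetical and the block shapes (`type: "image"`, array-valued `tool_result.content`) are assumptions about the session format.

```javascript
// Sketch: replace image blocks with a text placeholder, recursing into
// tool_result content. `stats.count` tracks how many images were redacted.
function redactImages(blocks, stats) {
  if (!Array.isArray(blocks)) return blocks;
  return blocks.map((block) => {
    if (block?.type === "image") {
      stats.count += 1;
      return { type: "text", text: "[image redacted]" };
    }
    if (block?.type === "tool_result" && Array.isArray(block.content)) {
      return { ...block, content: redactImages(block.content, stats) };
    }
    return block;
  });
}

// Process one JSONL line. Lines without images are returned byte-for-byte.
function trimImagesInLine(line, stats) {
  if (!line.trim()) return line;
  let entry;
  try {
    entry = JSON.parse(line);
  } catch {
    return line; // leave malformed lines untouched
  }
  const content = entry?.message?.content;
  if (!Array.isArray(content)) return line;

  const before = stats.count;
  const redacted = redactImages(content, stats);
  if (stats.count === before) return line; // nothing to redact
  return JSON.stringify({
    ...entry,
    message: { ...entry.message, content: redacted },
  });
}
```

Returning the original string whenever nothing was redacted keeps the "leave everything else untouched" promise literal: unaffected lines are never re-serialized.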

The whole script is 35 lines. It's also available as a Claude Code skill: type /trim-images when you see the dimension warning and it runs automatically.

How to Use It

From the dashboard:

If you're using Claude Code Organizer, every session row now has a Distill button. Click it and the session is distilled in place. The result shows up as an expandable bundle in the tree view, with the backup and index files grouped together.

From the command line:

# Full distillation (conversation + trimmed tool results + backup)
npx @mcpware/claude-code-organizer --distill ~/.claude/projects/.../session.jsonl

# Just strip images
node src/trim-images.mjs ~/.claude/projects/.../session.jsonl

The distiller outputs stats showing before/after sizes, number of index entries, and where the backup landed.

What's Actually in the Backup

The distiller creates a folder named after the session ID:

{sessionId}/
  backup-{originalId}.jsonl    ← full original session, untouched
  index.md                     ← summary of what was kept/stripped

The distilled session gets a context message injected at the top telling Claude where the backup lives and how to retrieve specific tool results if needed (Read with offset). So if Claude needs the full output of a Bash command from 3 hours ago, it knows exactly where to look.
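As an illustration of the injected pointer, a sketch along these lines would prepend one extra JSONL line to the distilled output. Everything here is an assumption: `injectBackupNote` is a hypothetical name, the wording of the note is invented, and a real session line likely needs more fields than the bare `type` and `message` shown.

```javascript
// Hypothetical sketch: prepend a pointer to the backup as the first JSONL line.
// The entry shape is an assumption; real session lines may need extra fields.
function injectBackupNote(distilledLines, backupPath, indexPath) {
  const note = {
    type: "user",
    message: {
      role: "user",
      content:
        "[Session distilled] Tool results in this session were trimmed. " +
        `Full original session: ${backupPath}. ` +
        `Index of what was kept or stripped: ${indexPath}. ` +
        "To recover a specific tool result, Read the backup file with an offset.",
    },
  };
  // The note goes first so it is the first thing seen on resume.
  return [JSON.stringify(note), ...distilledLines];
}
```

Keeping the note as ordinary conversation text means no special handling is needed on resume: it is read like any other message.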

Performance

Distillation runs in under 2 seconds on a 70MB file. It's pure JSON parsing and string manipulation. No LLM calls, no network, no dependencies.

The backup means you keep both copies on disk, but the overhead is only the size of the distilled file: if your session was 70MB and the distilled version is 7MB, you're at 77MB total instead of 70MB. Not a meaningful difference on any modern machine.

The context window savings are the real win. At a rough 4 bytes per token, a 70MB session file is on the order of 15-20M tokens of raw text, almost all of it tool output and far more than any context window can hold. The 7MB distilled file is closer to 1-2M tokens, most of it actual conversation. Claude remembers what you talked about instead of drowning in stale build logs.

Try It

mcpware / claude-code-organizer

Dashboard to manage Claude Code memories, configs, and MCP servers — security scanner for tool poisoning, context token budget tracker, duplicate cleanup, scope management. npx @mcpware/claude-code-organizer

The distiller is part of CCO v0.17.0. Dashboard button, CLI flag, and API endpoint all included. Image trimmer works standalone or as a /trim-images skill.

If your sessions are small, you don't need this. If your sessions regularly push 50MB+, this is the difference between "--resume working" and "--resume followed by Claude forgetting your name."

About Me

CS dropout. Building tools for the Claude Code ecosystem. github.com/ithiria894

⭐ Star the repo if bloated sessions have ever ruined your day.