TL;DR
Claude Code stores conversations locally in ~/.claude/projects/ as JSONL files. Every message you send hits Anthropic's servers with the full conversation history attached. Running /compact creates a summary checkpoint, shrinking API payloads by ~85%, but you lose granular context. I ran an experiment to see exactly how this works.
Prerequisites
Let's start by installing the tools we'll use to dig in:
brew install mitmproxy
brew install jq
pip install tiktoken
mitmproxy is a free, open source HTTPS proxy that lets you intercept and inspect HTTPS traffic.
I ran mitmproxy once to generate SSL certificates:
mitmproxy
# Press q to quit after it starts
Certs get created at ~/.mitmproxy/
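Before pointing Claude Code at the proxy, it's worth confirming the cert actually exists. A small sketch (the helper name is mine; ~/.mitmproxy is mitmproxy's default location):

```python
from pathlib import Path
from typing import Optional

def mitmproxy_ca_cert(base: str = "~/.mitmproxy") -> Optional[Path]:
    """Return the path to mitmproxy's CA certificate, or None if not generated yet."""
    cert = Path(base).expanduser() / "mitmproxy-ca-cert.pem"
    return cert if cert.is_file() else None

# Example: point NODE_EXTRA_CA_CERTS at this path if it is not None
```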
Setting up the experiment
I created a fresh directory:
mkdir ~/compact-experiment && cd ~/compact-experiment
Started mitmproxy in one terminal:
mitmproxy
Once it was running, I pressed f and set the filter to api.anthropic.com so only requests to Anthropic's servers show up.
In another terminal, configured Claude Code to route through the proxy:
export HTTPS_PROXY=http://127.0.0.1:8080
export HTTP_PROXY=http://127.0.0.1:8080
export NODE_EXTRA_CA_CERTS=~/.mitmproxy/mitmproxy-ca-cert.pem
cd ~/compact-experiment
claude
Building conversation history
I ran 10 prompts to build up a realistic conversation:
1. Create a Python FastAPI server with a health endpoint at /health
2. Add a POST /users endpoint that accepts name and email with pydantic validation
3. Add rate limiting using slowapi with max 100 requests per 15 minutes per IP
4. Write pytest unit tests for the rate limiter
5. Add centralized exception handling
6. Add request logging middleware
7. Create a /users GET endpoint with pagination
8. Add input sanitization using bleach
9. Add a config module using pydantic settings
10. Add graceful shutdown handling
At this point, mitmproxy showed each request going to api.anthropic.com/v1/messages with increasingly large payloads.
Capturing pre-compaction state
Found my JSONL file:
ls ~/.claude/projects/
Copied it:
cp ~/.claude/projects/<project-folder>/*.jsonl ~/compact-experiment/pre-compact.jsonl
Checked the size:
$ wc -l pre-compact.jsonl
24 pre-compact.jsonl
$ ls -lh pre-compact.jsonl
-rw-r--r-- 1 user staff 38K pre-compact.jsonl
Looked at the structure:
$ head -1 pre-compact.jsonl | jq .
User message entry:
{
"uuid":"msg_01XK7Qv...",
"type":"human",
"message":{
"role":"user",
"content":"Add rate limiting using slowapi with max 100 requests per 15 minutes per IP"
},
"timestamp":"2025-01-17T10:23:45.123Z",
"sessionId":"sess_abc123..."
}
Assistant message entry:
{
"uuid":"msg_02YL8Rw...",
"type":"assistant",
"message":{
"role":"assistant",
"content":"I'll add rate limiting using slowapi..."
},
"timestamp":"2025-01-17T10:23:52.456Z",
"sessionId":"sess_abc123..."
}
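With those two entry shapes in mind, a quick way to get an overview of a session file is to tally entries by their top-level type field, one JSON document per line (a sketch; tally_entry_types is my own helper, not part of Claude Code):

```python
import json
from collections import Counter

def tally_entry_types(path: str) -> Counter:
    """Count JSONL entries by their top-level 'type' field."""
    counts = Counter()
    with open(path) as f:
        for line in f:
            if line.strip():
                counts[json.loads(line).get("type", "unknown")] += 1
    return counts

# Example: tally_entry_types("pre-compact.jsonl")
```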
Capturing API request payload
In mitmproxy, I selected the latest POST request, pressed Enter to view it, then e → request body, and saved it to ~/compact-experiment/pre-compact-request.json
Counting tokens
tiktoken is OpenAI's tokenizer library. Anthropic uses its own tokenizer, so the counts are approximate, but cl100k_base is close enough for relative comparisons.
count_tokens.py:
import json
import os
import sys

import tiktoken

# cl100k_base is OpenAI's encoding; Anthropic's differs, but it's a usable approximation
enc = tiktoken.get_encoding("cl100k_base")

def text_of(content):
    # Anthropic message content can be a plain string or a list of content blocks
    if isinstance(content, str):
        return content
    return " ".join(b.get("text", "") for b in content if isinstance(b, dict))

filepath = sys.argv[1]
with open(filepath) as f:
    messages = json.load(f).get("messages", [])

tokens = sum(len(enc.encode(text_of(m.get("content", "")))) for m in messages)
print(f"Messages: {len(messages)}")
print(f"Tokens: {tokens}")
print(f"Bytes: {os.path.getsize(filepath)}")
$ python count_tokens.py pre-compact-request.json
Messages: 21
Tokens: 14280
Bytes: 41847
Running compaction
In Claude Code:
/compact
Waited a few seconds for confirmation.
Capturing post-compaction state
Copied the updated JSONL:
cp ~/.claude/projects/<project-folder>/*.jsonl ~/compact-experiment/post-compact.jsonl
Compared:
$ wc -l pre-compact.jsonl post-compact.jsonl
24 pre-compact.jsonl
27 post-compact.jsonl
$ ls -lh pre-compact.jsonl post-compact.jsonl
-rw-r--r-- 1 user staff 38K pre-compact.jsonl
-rw-r--r-- 1 user staff 41K post-compact.jsonl
The post-compact file is larger. Compaction doesn't delete history; it appends a compact-boundary record.
The compact boundary
$ grep -i "compact_boundary" post-compact.jsonl | jq .
{
"parentUuid":null,
"logicalParentUuid":"24aae814-9669-4858-aefc-6c12cb1b023f",
"isSidechain":false,
"userType":"external",
"cwd":"/Users/<username>/srccode/compact-conv-experiment",
"sessionId":"596539c1-3d3d-46c4-af79-346807029a3d",
"version":"2.1.11",
"gitBranch":"",
"slug":"idempotent-herding-marshmallow",
"type":"system",
"subtype":"compact_boundary",
"content":"Conversation compacted",
"isMeta":false,
"timestamp":"2026-01-17T16:47:51.316Z",
"uuid":"d79a9fff-b6f9-4877-99d0-1c88267f26b6",
"level":"info",
"compactMetadata":{
"trigger":"manual",
"preTokens":37418
}
}
This is the compact boundary. A few things stand out:
- `type` is `system` and `subtype` is `compact_boundary`
- `compactMetadata.trigger` shows it was manual (via the `/compact` command)
- `compactMetadata.preTokens` captures the token count before compaction (37,418 in my case)
- `logicalParentUuid` links to the last message before compaction
Future API requests use the summary generated at this boundary instead of the full history.
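The boundary is easy to locate programmatically too. A sketch that scans a session file for compact_boundary records (find_compact_boundaries is a hypothetical helper of mine; the type/subtype field names match the record above):

```python
import json

def find_compact_boundaries(path: str) -> list:
    """Return every compact_boundary entry in a Claude Code session file."""
    found = []
    with open(path) as f:
        for line in f:
            if not line.strip():
                continue
            entry = json.loads(line)
            if entry.get("type") == "system" and entry.get("subtype") == "compact_boundary":
                found.append(entry)
    return found

# Example:
# for b in find_compact_boundaries("post-compact.jsonl"):
#     meta = b.get("compactMetadata", {})
#     print(meta.get("trigger"), meta.get("preTokens"))
```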
Capturing the post-compaction request
Sent one more message:
Add a /metrics endpoint that returns request count and average response time
Saved the new request payload from mitmproxy to post-compact-request.json
$ python count_tokens.py post-compact-request.json
Messages: 3
Tokens: 1820
Bytes: 6234
Comparing payloads
compare.py:
import json
import os

import tiktoken

enc = tiktoken.get_encoding("cl100k_base")

def text_of(content):
    # content can be a plain string or a list of content blocks
    if isinstance(content, str):
        return content
    return " ".join(b.get("text", "") for b in content if isinstance(b, dict))

def analyze(path):
    """Return (message count, approx token count, file size) for a request payload."""
    with open(path) as f:
        msgs = json.load(f).get("messages", [])
    tokens = sum(len(enc.encode(text_of(m.get("content", "")))) for m in msgs)
    return len(msgs), tokens, os.path.getsize(path)

pre = analyze("pre-compact-request.json")
post = analyze("post-compact-request.json")

print(f"{'Metric':<20} {'Pre':>10} {'Post':>10} {'Reduction':>12}")
print("=" * 55)
print(f"{'Messages':<20} {pre[0]:>10} {post[0]:>10} {pre[0]-post[0]:>12}")
print(f"{'Tokens':<20} {pre[1]:>10} {post[1]:>10} {(pre[1]-post[1])/pre[1]*100:>11.1f}%")
print(f"{'Bytes':<20} {pre[2]:>10} {post[2]:>10} {(pre[2]-post[2])/pre[2]*100:>11.1f}%")
$ python compare.py
Metric Pre Post Reduction
=======================================================
Messages 21 3 18
Tokens 14280 1820 87.3%
Bytes 41847 6234 85.1%
Results summary
JSONL file
| Metric | Pre compact | Post compact |
|---|---|---|
| Lines | 24 | 27 |
| Size | 38 KB | 41 KB |
Local history grows because compact adds a boundary record; it doesn't remove anything.
API request payload
| Metric | Pre compact | Post compact | Reduction |
|---|---|---|---|
| Messages | 21 | 3 | 86% |
| Tokens | 14,280 | 1,820 | 87% |
| Bytes | 41.8 KB | 6.2 KB | 85% |
What's in the post-compact payload
$ jq -r '.messages[] | .role' post-compact-request.json
user
assistant
user
System context now contains the summary. Only messages after the compact boundary are sent as individual entries.
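You can cross-check this against the local file: only entries after the last boundary correspond to the messages sent individually. A sketch under that assumption (entries_after_last_boundary is a hypothetical helper):

```python
import json

def entries_after_last_boundary(path: str) -> list:
    """Return entries after the last compact_boundary line; the whole file if none."""
    with open(path) as f:
        entries = [json.loads(line) for line in f if line.strip()]
    last = -1
    for i, entry in enumerate(entries):
        if entry.get("subtype") == "compact_boundary":
            last = i
    return entries[last + 1:]

# Example: entries_after_last_boundary("post-compact.jsonl")
```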
Files generated
~/compact-experiment/
├── pre-compact.jsonl
├── post-compact.jsonl
├── pre-compact-request.json
├── post-compact-request.json
├── count_tokens.py
└── compare.py
Takeaways
- Local JSONL keeps everything; compact doesn't delete history
- The compact boundary is a system record with `subtype: compact_boundary` that marks the checkpoint
- `compactMetadata` tracks the trigger type and pre-compaction token count
- Post-compact requests send only the summary plus recent messages to Anthropic
- ~85% reduction in payload size
- The summary is lossy: it captures what was built but loses implementation details
Use /compact when switching tasks or hitting context limits. Avoid it mid task when you need to reference earlier decisions.
Curious about how other coding assistants manage long conversation history.