DEV Community

JessYT
JessYT

Posted on • Originally published at jessinvestment.com

5 Token-Saving Habits From 3 Months With Claude Code

5 Token-Saving Habits From 3 Months With Claude Code

EDIBLOG · Phase 2-1 · 3-Month Retrospective

Hi, I'm Eddie. After 3 months running blog automations, I settled on 5 token-saving habits. Honestly — I broke each one at least once before they stuck. This is the first deep dive I promised in the CLAUDE.md memory index post.

The 5 habits: /compact, agent split, /clear, CLAUDE.md split, 3 Skills.

01 — /compact: press it on the alert, judge by the next answer

Bottom line. I learned /compact fires when the context alert appears. The way I see it, it summarizes and compresses the conversation to free up token space. It is not lossless.

My take after 3 months: "Sometimes I lose things. When critical info disappears, the next answer can go off the rails."

My pattern — see the alert, run /compact, ask one question, and if it looks wrong I /clear and restart. No blind trust.

I think being honest about limits is the better move. I accepted that compaction is not lossless. Right after a critical code change, I now habitually restate the key facts before compacting.

# See the alert → /compact (first attempt)
[Claude] context window 80% used. /compact recommended.
$ /compact
[Claude] Compaction done — 60% token space recovered.

# Immediate one-line check after compacting
$ "Show me the contents of [key file] you just worked on"
# If clean, continue. If loss detected, /clear and restart
Enter fullscreen mode Exit fullscreen mode

02 — Important work goes to a separate agent, not main

Bottom line. The second pattern, in my view, is role separation. I run main Claude as the coordinator and throw detail work to sub-agents.

My separation rule: "Anything I consider important — code review, benchmarking, post quality control — doesn't touch main Claude. Dedicated agents only."

I see two payoffs. Main context stays protected + only the final output of the detail work surfaces back to main. The biggest win is that reading a huge document once doesn't eat main context.

I keep 4 custom agents as one-pager markdown files under .claude/agents/:

  • monetization-analyst — Monetization analysis — VSD Pro · jessinvestment stage checks
  • opportunity-ranker — Opportunity ranking — sort next-sequence candidates
  • dev-finance-explorer — Dev × finance crossover — post idea discovery
  • asset-health-checker — Asset health check — jobs · blog operations status

On top of that, there's the /review slash command and the weekly weekly-blog-review job (6-AI panel for post quality). This was the single biggest token-saving move.

# Don't ask main Claude
$ Task(opportunity-ranker, "rank the next 10 post idea candidates")
# → sub-agent works in a separate context, only the result returns to main

# Main plays coordinator only
# Heavy analysis / 6-AI panels / benchmarking all go to separate agents
Enter fullscreen mode Exit fullscreen mode

03 — /clear: not compression, full reset

Bottom line. Third, I actively use /clear.

My timing rule: "When the context completely shifts, I /clear and rebuild from scratch."

A typical example for me — finishing real-estate blog work and switching to debugging the trading system. I realized mixing the two contexts causes incidents.

I see the two tools as different roles. /compact is compress, /clear is full empty. After clear, the first message reloads only CLAUDE.md and I rebuild context from there.

This was a bit counterintuitive: clear isn't wasteful — NOT clearing is more expensive in most cases. Without clearing, every turn has to spend tokens replaying the entire prior context.

# /clear if you hit ANY of these
[ ] Complete domain switch (real-estate → trading system, etc.)
[ ] Answer feels off even after /compact (compression loss)
[ ] Conversation getting long, answers slowing down (context bloat)
[ ] Moving to another blog · project (separate CLAUDE.md folder)

# 1+ above → don't hesitate, /clear
# Clear, reload CLAUDE.md, restart clean
Enter fullscreen mode Exit fullscreen mode

04 — Slim CLAUDE.md, move the rest to side files

Bottom line. The fourth is the CLAUDE.md diet.

My split rule: "Only what's needed every task lives in CLAUDE.md. Everything else is request-specific — that's how you save tokens."

The reason is clear to me. CLAUDE.md is auto-loaded every session. Every line in there spends tokens every time. The conclusion: anything you rarely look at has to come out.

My actual split structure:

File type Examples
Main CLAUDE.md (auto-loaded every session) automation/CLAUDE.md — ops rules, schedules, incident log
Side files (loaded only on request) real-estate/_series_guide.md · ediblog/SKILL.md · docs/INFRA_2026_05.md
Archive (one-line index only) CLAUDE-archive-2026-04.md — old incident log moved out

I split old incident logs into CLAUDE-archive-2026-04.md and kept only a one-line index in main. The main file had been swelling past 1,000 lines — you can feel the relief. To me, this is the heart of the main-slim pattern.

# Main CLAUDE.md (currently ~80 lines)

## 5. Incident log (learnings)

| Date | Incident | Action |
|---|---|---|
| 2026-05-09 | 4 publish failures + telegram loop | 4 self-recovery layers added |

> 10 incidents from 2026-04-14 ~ 2026-04-18 moved to `CLAUDE-archive-2026-04.md` §A
> 8 incidents from 2026-04-21 ~ 2026-05-04 moved to `CLAUDE-archive-2026-05.md` §A
> Main keeps only post-05-05 incidents
Enter fullscreen mode Exit fullscreen mode

05 — 3 skills, loaded only on trigger

Bottom line. The last one, in my view, is skills. I keep 3 custom skills under automation/.claude/skills/.

  1. add-launch-job — Auto-triggered when adding a new LaunchAgent job. Checks plist TCC policy · FD limit 3-layer defense. Used heavily this past week alone creating 5 new jobs.
  2. blog-style-guide — Auto-triggered on post writing · review. Applies 9 base patterns + 4 evolved patterns + reviewer gate. Managed in one place across 4 daily-publishing blogs.
  3. captcha-recovery — Auto-triggered on captcha · session corruption. Telegram relay + zombie Chrome cleanup. Rare but mandatory when it hits.

I couldn't measure the split effect quantitatively. My read: inline = main CLAUDE.md swells past 1,000 lines and gets fully loaded every session. Split = main keeps a one-line index, and SKILL.md only loads when triggered. I estimate ~30% main-token savings.

  • ~30% main tokens saved (estimated, not measured)
  • 3 custom skills (add · style · captcha)
  • 1 rule-change point (one SKILL.md)
---
name: blog-style-guide
description: 9 writing patterns + reviewer gate thresholds for blog post
  writing · review · publishing in the automation project. Trigger on
  ediblog · jessinvestment · luna-pick · jesslab publishing
  tasks or post quality review requests.
---

# Automation project blog style guide
...

# When Claude matches keywords like "write this post" / "review" / "before publishing"
# Auto-trigger → load SKILL.md → apply 9 patterns
Enter fullscreen mode Exit fullscreen mode

Coda — If I had to nail it in one line

Token saving isn't about tools — it's about deciding how much to pin in main.

— Eddie · Phase 2-1 wrap

All 5 only stuck after I broke them at least once each. I'd recommend looking at what info you actually need every session before memorizing tools. That, I realized, is the essence of token saving.

Phase 2 · next deep dives

  • 2-2 Slash commands deep dive — /run-daily differences across 5 blogs (soon)
  • 2-3 3 custom skills deep dive — why I built them · structure
  • 2-4 CLAUDE.md operations — how the incident log accumulates

Every line that auto-loads is a cost. I only pin what every session actually needs.

Sources · References

Claude Code Official

Series index

  • Can't leave my desktop — Claude Code 3 months, 6 patterns (Phase 1, series index)

Disclosure: Personal 3-month retro. No ads, no affiliates.


Original with full infographics and visual structure: https://jessinvestment.com/5-token-saving-habits-from-3-months-with-claude-code/

Top comments (0)