JackWu
20 Days, 20,000 Conversations, 1.2 Billion Tokens: What I Learned from Heavy Claude Code Use

I started using Claude Code in November thinking it was just a smarter Cursor. After 20,000 conversations and 1.2 billion tokens — verified by Anthropic as a top 1% user — here's the full retrospective.


1. The Mental Model Shift: From AI Assistant to AI Team

When I first picked up Claude Code, I treated it like an autocomplete tool with better reasoning. A smarter linter. A more accurate Cursor.

That framing was wrong.

After enough usage and conversations with others doing the same, the better mental model clicked: Claude Code is not an AI assistant. It's a capable, low-cost, reliable team.

Not an intern. A team.


2. What 20 Days of Output Actually Looks Like

Claude Code can now carry a product from zero to shipped — independently, end to end. Here's what I built in 20 days:

| Product | Stack | Time | Notes |
|---|---|---|---|
| AI ChatBot | Flutter | 2 days | Voice input, resume upload, AI interviewer |
| iOS Launcher App | Native iOS | 2 days | |
| GroAsk | Native macOS | 2 weeks | Claude Code session manager — the tooling behind everything in this post |

All three were vibe-coded. No lines written by hand. During the Opus 4.5 era I'd still glance at the code occasionally. By Opus 4.6, I stopped reading it entirely.

GroAsk came out of my own frustration: constantly switching terminals, losing track of which session was doing what, context windows silently filling up with no visibility. I built it to scratch my own itch. More on that below.


3. Beyond Coding: What Else Claude Code Handles

  • Auto-posting to Twitter — via browser MCP + custom Skills
  • Article writing and multi-platform distribution — via custom Skills
  • Product analytics — running against real operational data
  • Market research — gathering user pain points, validating assumptions

4. My Iteration Loop

Small changes: one sentence to shipped

```
Describe the change → Claude Code executes → Ship → One-line fix if needed
```

Large features: structured pipeline

```
Gather requirements → PRD → Market validation → Tech spec → Execution plan → Build
```

In practice:

  1. Requirements: personal pain points + community feedback
  2. PRD: describe the raw need, let Claude Code draft the PRD, validate with research
  3. Tech spec: Claude Code writes it, then I review it against the market research
  4. Execution plan: Claude Code uses TDD + subagents to break down tasks
  5. Build: execute the plan

A 3,000-word PRD → 10,000-word tech spec → 40,000-word execution plan. The AI handles the expansion. I handle the decisions.

The principle that matters most

Product decisions and technical decisions are the human's job. Fully delegating decisions to AI still breaks down on complex tasks — it looks like it works, but the maintenance cost is brutal. Get the design right first.

Break problems into small, concrete tasks. A focused, small prompt outperforms a large, vague one in reliability, speed, and outcome.


5. Techniques That Actually Matter

5.1 Skills: The Highest-Leverage Feature

Core idea: encoding your own workflows.

A Skill has two parts:

  • Prompt: natural language description of what needs model judgment
  • Scripts: code for the deterministic parts

Only use prompts where judgment is required. Everything else should be a script.

My regular Skills: atomic commits, auto-build-and-run, pulling operational metrics, posting drafts, writing articles from outlines.
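As a concrete sketch, here is roughly what one of these might look like on disk. The `atomic-commits` name, the prompt body, and the `scripts/commit.sh` helper are all illustrative, not the author's actual Skill; the `SKILL.md` file with `name` and `description` frontmatter follows Claude Code's Skills convention.

```markdown
---
name: atomic-commits
description: Split working-tree changes into small, single-purpose commits
---

Group the current working-tree changes by logical concern.
For each group, run `scripts/commit.sh "<message>"` with a one-line
imperative commit message.

Use judgment only for the grouping and the messages — staging and
committing are handled deterministically by the script.
```

Note the split: the prompt covers only the parts that need model judgment (grouping, wording), and the script does the mechanical work, exactly as the rule above prescribes.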

5.2 Subagents: Context Management for Complex Tasks

The built-in Task tool is Claude Code's subagent mechanism. The main session delegates work to internal sessions and only cares about inputs and outputs. This keeps the primary context clean on long-running tasks.
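A hypothetical prompt that leans on this mechanism might look like:

```
Use subagents to audit each module under src/ for unused exports.
Each subagent handles one module and reports back only the file
paths and symbol names it found — keep the details out of this session.
```

The main session never sees the intermediate file contents, only the distilled results, which is what keeps its context clean.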

5.3 MCP: Extending What Claude Code Can See

  • I run 8 MCPs personally, but I'd recommend fewer to start
  • MCPs consume a lot of raw context — use parameters to expose minimal interfaces
  • Prefer Rust or Go compiled MCPs over Node — significantly lower memory footprint
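A minimal sketch of a project-scoped MCP configuration, assuming a `.mcp.json` file with the standard `mcpServers` shape; the server choice and the `./src` path argument are examples of exposing a narrow interface rather than the whole filesystem:

```json
{
  "mcpServers": {
    "filesystem": {
      "command": "npx",
      "args": ["-y", "@modelcontextprotocol/server-filesystem", "./src"]
    }
  }
}
```

Scoping the server to a single directory via its arguments is one way to keep the tool surface — and the context it consumes — small.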

5.4 Hooks: The Safety Net

The highest-value hook I've set up: back up anything untracked before Claude Code deletes it.
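A sketch of how such a hook could be wired up in Claude Code's hooks configuration, assuming a hypothetical `backup-untracked.sh` script that copies git-untracked files (e.g. from `git ls-files --others --exclude-standard`) to a backup directory before any Bash command runs:

```json
{
  "hooks": {
    "PreToolUse": [
      {
        "matcher": "Bash",
        "hooks": [
          {
            "type": "command",
            "command": "./scripts/backup-untracked.sh"
          }
        ]
      }
    ]
  }
}
```

This is a blunt instrument — it runs before every Bash command, not just deletions — but untracked files are exactly the ones git can't recover, so the redundancy is cheap insurance.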

5.5 The Most Effective Debugging Technique

Write logs to disk. Let Claude Code read the logs and diagnose from there.

This is the most reliable way to resolve bugs — more reliable than anything else I've tried. When you think Claude Code is stuck, try logs. It will surprise you.


6. Long-Term Memory: Making It Stick

Forgetting context across sessions is a real friction point. My solution: layered knowledge files.

  1. Maintain a hierarchical knowledge/ folder. Tell Claude Code to store useful findings there.
  2. Use CLAUDE.md files per sub-project, plus a rules/ directory for principles
  3. The result: a local long-term memory that gets better the more you use it
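One hypothetical layout for such a setup (the directory names and comments are illustrative, not a prescribed structure):

```
project/
├── CLAUDE.md            # project-wide conventions and pointers
├── rules/
│   └── code-review.md   # stable principles, reused across sessions
├── knowledge/
│   ├── build/           # findings about the build system
│   └── api/             # quirks discovered in third-party APIs
└── packages/
    └── app/
        └── CLAUDE.md    # sub-project specifics
```

The layering matters: broad conventions live at the top, narrow specifics live next to the code they describe, so each session loads only what its scope needs.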

7. Context Management: The Biggest Performance Variable

| Strategy | Why |
|---|---|
| Load MCPs on-demand | Expose minimal interfaces via parameters |
| Layer CLAUDE.md by scope | Avoid information overload |
| Hard limit at 200k tokens | Switch sessions before you hit this |
| Never compress context | Open a new session instead |
| Monitor in real time | Know what each session is consuming |

I keep context under 200k (roughly 20% of the 1M window). The difference between a fresh session and one near the limit is like the difference between a colleague at 9am versus 2am — technically still working, but the quality has dropped.

The "monitor in real time" row is the one most people skip. You keep going, context quietly fills up, and response quality drops off a cliff before you notice. That's actually the primary reason I built GroAsk — a menu bar panel that shows each Claude Code terminal's context usage, current status, and queue at a glance. No more switching terminals to check.


8. Multi-Session: Stop Watching One Window

Running a single Claude Code process at a time is leaving most of the throughput on the table.

  • Comfortable baseline: 2–3 parallel sessions
  • High-output mode: 5–10 sessions
  • Only actively manage sessions waiting for input; let the others run

The practical problem: once you're running 5+ terminals, just finding "which session is waiting for me" becomes its own overhead. My current setup uses GroAsk's dispatch panel — all Claude Code terminals in one view, with status (running / waiting / context usage) visible without switching windows. ⌥Space opens the panel, one click jumps to the right terminal.

If you're running multiple sessions regularly: groask.com


9. Where the Model Hits Its Limits

The one thing models can't do well: genuinely novel work.

When there's no training data to draw from — when you're building something that hasn't been built before — the model is guessing. That's where human creativity and judgment still matter. Know where that line is.


10. Three Takeaways

  1. Claude Code is not an AI assistant. It's an AI team. Manage it like one.
  2. Decisions are your job. Product calls, architecture calls, novel problems — those stay with you.
  3. Engineer your AI usage. Skills encode workflows. Subagents manage complexity. Knowledge files build memory. Context discipline keeps quality high.

I use Claude Code daily to build GroAsk — a macOS menu bar launcher that opens any AI with ⌥Space and monitors all Claude Code terminal sessions in one panel. If you're running Claude Code heavily, it's worth a look.

Top comments (2)

Sitt Shein

Thanks for sharing

Ethan Watkins

nice