Claude Code's April update added /recap, 1-hour prompt caching, a forced 5-minute caching mode, and 500K-character MCP tool output, none of which made the marketing rounds.
/recap rebuilds context when you return to a session; it is configurable in /config or invocable manually.
1-hour caching plus the 5-minute force option finally make long-running plan-execute loops cheap and consistent.
The 500K-character MCP output ceiling unblocks tools that were silently truncating large search results or build logs.
Combined, these add up to a 30-50% real-world cost-and-latency win on agent-style work, even though the changelog calls them small.
If you build with Claude Code, update to 2.1.101 or later and turn /recap on this week.
The big April news was Mythos and Project Glasswing. The actually useful April news was a quiet block of Claude Code changes that landed across versions 2.1.69 through 2.1.101. Almost nobody is talking about them. They are the most significant productivity update Claude Code has shipped this year.
This is the boring kind of update where the demo screenshot looks identical to last month. The numbers underneath are not boring at all.
/recap And The Death Of "Where Was I"
Long sessions in Claude Code used to have a familiar failure mode. You step away for an hour, come back, and the model has either compacted aggressively or you have lost the thread entirely. The fix was to scroll back, copy-paste the last few exchanges into a new prompt, and ask Claude to re-read them.
The April update added /recap. When you return to a stale session, /recap rebuilds working context from the conversation transcript and tells you, in two paragraphs, exactly where you were, what you decided, and what is still pending. It is configurable in /config to run automatically when you re-attach, and you can invoke it manually any time the session starts to feel cold.
Why this matters: the single biggest tax on agent-style work is the human cost of reloading context. /recap is the first time Claude Code actively helps with that instead of leaving it to you. For people running long planning sessions, multi-day refactors, or research loops, it changes the cost of stepping away from "expensive" to "free."
Combined with the Claude Code /loop skill, you can now run multi-hour automation cycles with checkpointed recap on every wake. That was theoretically possible before. It is actually pleasant now.
1-Hour Prompt Caching, Plus A Forced 5-Minute Mode
Prompt caching has existed in the Claude API for a while. The cache window was 5 minutes by default, with a recently added 1-hour tier behind a flag. Until April, Claude Code did not expose either of these controls cleanly. The UI assumed defaults that made sense for short conversations and quietly cost you money on long ones.
The April update gives you both. You can set caching to 1 hour for sessions where you know you will keep returning to the same large system prompt, codebase summary, or document. Or you can force the 5-minute cache to refresh aggressively for sessions where the context shifts often and you want fresh reads.
The math is unkind to people who have not turned this on. Cached reads bill at 10% of the base input price, while 1-hour cache writes bill at 2x. A typical agent loop with a 30,000-token context that runs 20 iterations in an hour was paying full price on roughly 600,000 tokens before. With 1-hour caching enabled, the same loop drops to roughly 117,000 effective tokens: one 2x cache write, then 19 cheap reads. That is about a 5x reduction on what for many builders is the dominant line item.
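As a back-of-envelope sketch, the loop cost works out like this. The multipliers below are Anthropic's published relative prices (cached reads at 0.1x base, 1-hour cache writes at 2x base); the loop shape is the 30,000-token, 20-iteration example from the text, and the helper name is mine:

```python
BASE = 1.0            # relative cost of a fresh input token
CACHE_READ = 0.1      # cached reads bill at 10% of base
CACHE_WRITE_1H = 2.0  # 1-hour cache writes bill at 2x base

def loop_cost(context_tokens: int, iterations: int, cached: bool) -> float:
    """Effective token cost of an agent loop that resends the same context."""
    if not cached:
        return context_tokens * iterations * BASE
    # First iteration writes the cache; every later iteration reads it.
    write = context_tokens * CACHE_WRITE_1H
    reads = context_tokens * CACHE_READ * (iterations - 1)
    return write + reads

uncached = loop_cost(30_000, 20, cached=False)  # 600,000
cached = loop_cost(30_000, 20, cached=True)     # 60,000 write + 57,000 reads = 117,000
print(uncached, cached, round(uncached / cached, 1))  # ratio ≈ 5.1
```

Note the 2x write premium is why the saving lands near 5x rather than the naive 10x you would get from the read discount alone; the longer the loop runs, the closer it creeps toward 10x.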
This is also the change that makes long-running Claude Agent SDK work cheap enough to run as a default. If you have been holding off on shipping production agents because the per-run cost felt unbounded, that math just changed.
MCP 500K-Character Output
The third change is the most niche-sounding and the most consequential for tool builders. MCP tools used to be capped at a much smaller output character count. In practice that meant tools returning large files, big search results, or full build logs were silently truncated. The model would then hallucinate the rest of the output, which is the worst failure mode.
500K characters is enough to return a full mid-sized repository worth of file content, a complete CI log, or a many-thousand-row database query result. For the MCP server ecosystem this is the difference between "useful in a demo" and "useful in production." Tool authors should bump their output strategies to take advantage of the headroom rather than pre-truncating defensively.
If you maintain an MCP server, audit your output chunking now. Anything you used to slice into multiple round-trips can probably go in a single response. That removes a class of context-loss bugs where the model only sees the first chunk and acts on incomplete data.
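One way to frame that audit: make chunking the fallback rather than the default. Here is a minimal sketch of an output-size guard a tool handler might use; the helper name and structure are my own, not part of any MCP SDK:

```python
# New Claude Code ceiling on MCP tool output, in characters.
OUTPUT_LIMIT = 500_000

def package_output(payload: str, limit: int = OUTPUT_LIMIT) -> list[str]:
    """Return the payload as one response if it fits, else hard chunks.

    Splitting or truncating by default is what caused the old failure
    mode: the model sees only the first chunk and hallucinates the rest.
    """
    if len(payload) <= limit:
        return [payload]  # single response, no context-loss risk
    return [payload[i:i + limit] for i in range(0, len(payload), limit)]

build_log = "x" * 420_000           # e.g. a full CI log
chunks = package_output(build_log)  # fits in one response now
print(len(chunks))                  # 1
```

Anything that used to take multiple round-trips under the old limit can be checked against this guard and, in most cases, collapsed to a single response.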
How To Configure /recap Properly
Default behavior is fine for most people. /recap fires on session re-attach if the session has been idle for more than 30 minutes, summarizes what happened in two paragraphs, and lists open threads. If you want different behavior, /config has three knobs.
First, the threshold. You can drop it to 5 minutes for high-frequency context-switching, or push it to 4 hours if you only care about /recap on long absences. Setting it too low gets noisy. Setting it too high defeats the point. 30 minutes is the right default for most builders.
Second, the summary depth. The default summary is two paragraphs. You can switch to a single-line ultra-brief mode for sessions where you mostly remember and just want a nudge, or to an extended five-paragraph mode for hand-offs to a different agent or a different machine. The brief mode is the secret weapon for indie use. It costs almost nothing in tokens and reads like a Post-It note.
Third, the trigger. Off, manual-only, or auto. Manual-only is what I run because I want to control when context loads. Auto is the right default for anyone who shares Claude Code sessions across machines or hands them to scheduled cron-style automations like the ones in our Claude Code routines guide.
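As a mental model, the three knobs map onto a settings fragment shaped roughly like this. The key names and values below are illustrative guesses, not the actual schema; run /config to see the real options:

```json
{
  "recap": {
    "trigger": "manual",
    "idleThresholdMinutes": 30,
    "summaryDepth": "brief"
  }
}
```

The values shown match the setup described above: manual-only trigger, the 30-minute default threshold, and the cheap single-line summary mode.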
Run the same /recap command after a /loop iteration finishes if you want a clean checkpoint between cycles. That pattern alone saved me roughly an hour of "where did I leave off" reading per week.
What Actually Got Faster
Beyond the three headline changes, the April releases included a 60% Write tool speedup, the new NO_FLICKER rendering engine that finally fixed terminal redraw issues (covered in our NO_FLICKER guide), Focus View for cleaner long sessions, and PowerShell as an automatic fallback shell when Git Bash is missing on Windows.
The performance numbers are real. On a moderately large repository, end-to-end agent loops are landing 30-50% faster than they did in February. Some of that is caching. Some is rendering. Some is the Write tool change. None of it shows up as a flashy benchmark, which is probably why it is not getting press.
For Norman's team-of-one workflow, the cumulative effect is that the same automations finish noticeably sooner and cost noticeably less. That compounds. Across a month, the difference between "I have to think about whether to run this" and "I just run it" is hundreds of completed loops you would have skipped.
Bottom Line
Update to 2.1.101 or later this week. Turn on /recap in /config, set prompt caching to 1 hour for any session that uses a large system prompt, and audit your MCP servers to take advantage of the 500K output ceiling. None of this is glamorous. All of it is real money and real time.
The big AI news cycle is going to keep being about Mythos, Project Glasswing, and whatever OpenAI ships next. The actual builder-facing wins keep landing in changelogs that almost nobody reads. If you ship code with Claude Code, those changelogs are where your competitive edge lives. Skim them every release. The April batch was a good reminder why.
If you have not already locked in the rest of your setup, our Claude Code setup guide plus the .claude/ folder explainer cover the rest of the stack you want next to /recap.
One last note for builders running scheduled jobs. Pair /recap with your routine wake-up step so each fired iteration logs a brief context handoff into a file you can scan later. That gives you a flight recorder for autonomous runs without needing a heavier observability stack. It is the cheapest debugging upgrade you can install this month, and it will save you hours the first time a long-running automation goes sideways while you were asleep.
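A flight recorder like that can be as simple as appending each handoff to a newline-delimited JSON file. The sketch below is a minimal stdlib-only version under my own naming; the recap text itself would come from your wake-up step, however you capture it:

```python
import json
import time
from pathlib import Path

# Default log location; one JSON object per line, append-only.
LOG = Path("recap-flight-recorder.jsonl")

def record_handoff(run_id: str, summary: str, log: Path = LOG) -> None:
    """Append one context-handoff entry per loop iteration."""
    entry = {
        "ts": time.strftime("%Y-%m-%dT%H:%M:%SZ", time.gmtime()),
        "run": run_id,
        "summary": summary.strip(),
    }
    with log.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(entry) + "\n")

# Usage from a wake-up step: record_handoff("nightly-refactor", recap_text)
```

Because each line is standalone JSON, you can scan a week of autonomous runs with nothing more than `tail` and `grep`, which is the whole point of skipping a heavier observability stack.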