brian austin

Posted on Apr 1

Claude Code memory: how to manage context windows across long sessions

#claudecode #ai #productivity #programming

Claude Code memory: how to manage context windows across long sessions

If you've used Claude Code for more than 30 minutes on a complex project, you've hit the wall: the context window fills up, Claude starts forgetting earlier decisions, and you get contradictory behavior.

Here's exactly how I manage this — and the CLAUDE.md tricks that make long sessions actually work.

The problem: Claude Code's memory is a sliding window

Claude Code doesn't have persistent memory. It has a context window — a fixed amount of text it can "see" at once. On long sessions:

Early decisions get pushed out of context
Claude forgets constraints you established 2 hours ago
You get contradictory code that breaks earlier architecture
Worst case: Claude starts refactoring things you explicitly told it not to touch

This isn't a bug. It's how transformers work. But it's completely manageable once you understand it.

The 3-layer memory system I use

Layer 1: CLAUDE.md as permanent memory

Anything that must survive the entire session goes in CLAUDE.md:

# CLAUDE.md

## Architecture decisions (DO NOT change without asking)
- Using PostgreSQL, NOT SQLite — we have concurrent writes
- Auth is JWT, NOT sessions — we're stateless for horizontal scaling  
- All API routes under /api/v1/ — versioning is non-negotiable
- NO ORM — raw SQL only, performance is critical

## Active constraints
- Node 18 compatibility required (production is Node 18)
- No new npm dependencies without checking with me first
- TypeScript strict mode is ON

## Current sprint goal
- Building the payment webhook handler
- Do NOT refactor existing code during this sprint

CLAUDE.md is injected at the START of every context. This means architecture decisions survive session resets.

Layer 2: Session summary file

Every hour (or at major milestones), I ask Claude to write a summary:

> Write a summary of what we've built so far and what decisions we made. 
> Save it to SESSION_NOTES.md

Claude writes something like:

# Session notes — 2024-01-15

## What we built
- Stripe webhook handler at /api/v1/webhooks/stripe
- Idempotency key storage in Redis (TTL 24h)
- Retry logic with exponential backoff (3 attempts)

## Decisions made
- Using Redis for idempotency (NOT database) — faster, cheaper
- Webhook secret stored in env var STRIPE_WEBHOOK_SECRET
- Failed webhooks logged to webhook_failures table

## Next steps
- Test with Stripe CLI
- Add monitoring alerts for failure rate > 5%

Then I add to CLAUDE.md:

@SESSION_NOTES.md

The @filename syntax imports the file into context. Now the session notes are permanent memory.

Layer 3: Explicit context refresh

When Claude starts acting confused (contradicting earlier decisions), I run:

> Re-read CLAUDE.md and SESSION_NOTES.md. Summarize the current state 
> of the project before we continue.

This forces Claude to process the permanent memory files and realign.

The /compact command (built-in reset)

Claude Code has a /compact slash command that compresses the conversation history. Run it when:

The context bar in the UI shows > 80% full
Claude starts getting slow to respond (large context = more tokens to process)
You're switching to a different part of the codebase

/compact

Claude summarizes everything important and resets the window. You lose raw conversation history but keep the compressed summary.

Pro tip: Before running /compact, make sure your CLAUDE.md is up to date. The compact summary + CLAUDE.md = your full context after reset.

The checkpoint pattern

For very long projects (multiple days), I use explicit checkpoints:

# CHECKPOINT.md — Last updated 2024-01-15 14:30

## Project state
- [x] Database schema (complete, DO NOT change)
- [x] Auth system (complete, DO NOT change)  
- [x] User API (complete, DO NOT change)
- [ ] Payment system (IN PROGRESS)
- [ ] Email notifications (NOT STARTED)

## Files that are DONE (do not refactor)
- src/models/user.ts
- src/routes/auth.ts
- src/middleware/auth.ts

## Current focus
- src/routes/payments.ts
- src/services/stripe.ts

This prevents the #1 Claude Code failure mode: refactoring finished code while you're working on new features.

Context budget math

Claude's context window is roughly 200K tokens. In a typical Claude Code session:

Item	Tokens
System prompt	~2,000
CLAUDE.md	~500-2,000
Conversation so far	grows to ~50,000
Files being edited	~5,000-20,000
Budget left	~130,000

You have more room than you think. But if you're asking Claude to read large files + have a long conversation + maintain complex state, you'll feel it.

Tip: Keep CLAUDE.md under 1,000 tokens. Use @imports for longer reference docs.

The API angle: flat-rate context

One thing worth knowing: Claude Code's rate limits are based on tokens, not messages. Long context sessions hit limits faster.

If you're using Claude via API (via ANTHROPIC_BASE_URL), you pay per token — so a 200K token session costs more than ten 20K token sessions that accomplish the same thing.

This is why context management isn't just about Claude's memory — it's also about cost. Compact aggressively. Use CLAUDE.md for permanent state instead of re-explaining context every time.

(I use SimplyLouie for flat-rate Claude API access — $2/month removes the per-token anxiety entirely.)

Quick reference

Problem	Solution
Claude forgets architecture	Put it in CLAUDE.md
Claude refactors finished code	Add "DO NOT change" list to CLAUDE.md
Context getting full	Run `/compact`
Multi-day project	Use CHECKPOINT.md with `@import`
Claude acting confused	Ask it to re-read CLAUDE.md before continuing

The CLAUDE.md → SESSION_NOTES.md → CHECKPOINT.md stack has made my Claude Code sessions dramatically more reliable. No more context drift, no more contradictory refactors.

What's your memory management approach? Curious what others have found works.

Top comments (1)

Some comments may only be visible to logged-in visitors. Sign in to view all comments.