# Claude Code performance: why your AI slows down after 2 hours (and how to fix it)
You've noticed it. An hour into your Claude Code session, responses start dragging. Two hours in, it feels like you're talking to a different, slower AI. By hour three, you're getting weird refusals and context errors.
This isn't your imagination. Here's exactly what's happening and how to fix it.
## What's actually causing the slowdown
Claude Code maintains a context window for your session. Every file you read, every command output, every back-and-forth exchange fills that window. When it gets full:
- Response latency increases (Claude is processing more tokens per request)
- Accuracy drops (early context gets compressed or dropped)
- Rate limit errors appear more frequently
- Claude starts "forgetting" decisions made earlier in the session
The context window isn't infinite. Once it saturates, you're effectively working with a degraded version of the model.
## Fix 1: Compact your context proactively

Don't wait until Claude gets slow. Every 45-60 minutes, run:

```
/compact
```

This compresses your conversation history while preserving the key decisions and file state. It's like defragging your session.
Pro tip: Compact before switching to a new feature, not after Claude starts acting weird. Reactive compacting loses more context than proactive compacting.
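Claude Code also accepts a free-text instruction after `/compact` telling it what to prioritize during compression. The wording below is illustrative, not a fixed syntax:

```
/compact Keep the payment webhook decisions and the list of files we changed
```

This steers the summary toward what the next hour of work actually needs.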
## Fix 2: Use CLAUDE.md to front-load critical context

Every token Claude spends rediscovering your project is a token wasted. Put it in CLAUDE.md once:

```markdown
# Project: MyApp

## Architecture
- Frontend: React 18, TypeScript, Tailwind
- Backend: Express + PostgreSQL
- Auth: JWT with refresh tokens

## Current sprint goals
- Fix payment webhook handler
- Add user export feature
- Write integration tests for /api/orders

## Do not touch
- Legacy billing code in /src/billing/v1/
- Auth middleware (security audit pending)
```
Claude reads this at session start. You don't have to re-explain your stack every time.
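If you want to bootstrap this quickly, here's a minimal sketch. The project details inside the heredoc are placeholders to replace with your own:

```shell
# Scaffold a starter CLAUDE.md at the repo root.
# Everything between the heredoc markers is placeholder content.
cat > CLAUDE.md <<'EOF'
# Project: MyApp

## Architecture
- Frontend: React 18, TypeScript, Tailwind
- Backend: Express + PostgreSQL

## Current sprint goals
- Fix payment webhook handler

## Do not touch
- Legacy billing code in /src/billing/v1/
EOF

# Confirm the sections Claude will see at session start.
grep '^##' CLAUDE.md
```

Keep it short: a CLAUDE.md that balloons past a screenful starts eating the same context budget it's supposed to save.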
## Fix 3: Start fresh sessions for fresh tasks
This is counterintuitive. Keeping a single long session "feels" more productive but isn't.
Start a new Claude Code session when:
- Switching from debugging to new feature work
- Moving from one service to another in a monorepo
- Starting the next day's work
Fresh sessions start with full context capacity. Your CLAUDE.md ensures continuity without the baggage.
## Fix 4: Use parallel sessions for parallel work

Instead of one bloated session, run multiple focused sessions:

```shell
# Terminal 1: Frontend work
cd /frontend && claude

# Terminal 2: Backend work
cd /backend && claude

# Terminal 3: Tests
cd /tests && claude
```
Each session stays lean. No session knows what the others are doing, so they can't pollute each other's context.
## Fix 5: Scope your file reads

Every `cat large-file.js` dumps the entire file into context. Instead:

```
# BAD - dumps 2000 lines into context
Read src/api/handlers.js

# GOOD - targeted read
Show me lines 45-90 of src/api/handlers.js

# BETTER - semantic request
Find the authentication middleware in src/api/handlers.js
```
Claude can search and extract. You don't need to load entire files.
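The same principle applies when you pre-extract content yourself before pasting it into the conversation. A sketch using standard Unix tools, with a generated file standing in for a large source file:

```shell
# Stand-in for a large source file (2000 lines).
seq 1 2000 | sed 's/^/const x = /' > handlers_demo.js

# Targeted extract: only lines 45-90 (46 lines) would reach the
# context window, instead of all 2000.
sed -n '45,90p' handlers_demo.js | wc -l
```

The ratio is the point: a 46-line excerpt costs roughly 2% of the tokens the full file would.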
## Fix 6: Remove the rate limit problem entirely

Here's something most Claude Code users don't know: a significant part of the "slowdown" you're experiencing isn't context saturation at all. It's rate limiting.

When your session gets heavy, you hit Anthropic's API rate limits more frequently. Claude Code starts queuing requests, which feels like slowness.

The fix is to route Claude Code through a proxy that removes rate limit caps:

```shell
export ANTHROPIC_BASE_URL=https://simplylouie.com
claude
```

With this set, Claude Code sends requests to the proxy instead of directly to Anthropic. The proxy handles rate limit management so you never hit a wall mid-session.
SimplyLouie costs $2/month and removes the rate limit interruptions entirely. For heavy Claude Code users, this is the most effective performance fix.
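If you'd rather not export the variable globally, a small wrapper keeps the proxy scoped to a single command. This is a sketch: `claude_via_proxy` is a made-up helper name, not part of Claude Code:

```shell
# Hypothetical wrapper: sets ANTHROPIC_BASE_URL only for the wrapped
# command, so other shells keep talking to Anthropic directly.
claude_via_proxy() {
  ANTHROPIC_BASE_URL="https://simplylouie.com" "$@"
}

# Usage: claude_via_proxy claude
# Demonstration that the variable reaches the child process:
claude_via_proxy sh -c 'echo "$ANTHROPIC_BASE_URL"'
```

This makes it easy to A/B test whether the proxy is actually what's fixing your slowdowns.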
## The compound effect

The real problem with context saturation isn't just the slowdown. It's the cascade:

1. Context fills up
2. Accuracy drops
3. You re-explain things to compensate
4. Re-explaining fills more context
5. Performance gets worse faster

Break the cascade early. Compact at 45 minutes, use CLAUDE.md, scope your reads, and if rate limits are hitting you, fix that separately with `ANTHROPIC_BASE_URL`.
## Summary

| Problem | Fix |
|---|---|
| Context filling up | `/compact` every 45-60 min |
| Re-explaining project every session | CLAUDE.md with architecture + goals |
| Session bloat | Fresh session per task |
| Reading full files | Targeted line/semantic reads |
| Rate limit interruptions | `ANTHROPIC_BASE_URL` proxy |
| Parallel work slowing each other | Separate terminal sessions |
Claude Code is a multiplier on your productivity — but only if the sessions stay healthy. These fixes keep you in flow.
Running Claude Code heavily? The `ANTHROPIC_BASE_URL` trick works with any compatible proxy. SimplyLouie is $2/month and handles the rate limit management automatically.