DEV Community

brian austin

Claude Code performance: why your AI slows down after 2 hours (and how to fix it)

You've noticed it. An hour into your Claude Code session, responses start dragging. Two hours in, it feels like you're talking to a different, slower AI. By hour three, you're getting weird refusals and context errors.

This isn't your imagination. Here's exactly what's happening and how to fix it.

What's actually causing the slowdown

Claude Code maintains a context window for your session. Every file you read, every command output, every back-and-forth exchange fills that window. When it gets full:

  • Response latency increases (Claude is processing more tokens per request)
  • Accuracy drops (early context gets compressed or dropped)
  • Rate limit errors appear more frequently
  • Claude starts "forgetting" decisions made earlier in the session

The context window isn't infinite. Once it saturates, you're effectively working with a degraded version of the model.
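To build intuition for how fast a session fills up, here's a back-of-envelope sketch in Python. Every number in it is an illustrative assumption for a busy session, not an actual Anthropic limit:

```python
# Rough estimate of how fast a coding session fills a context window.
# All numbers are illustrative assumptions, not real Claude Code limits.
WINDOW_TOKENS = 200_000          # assumed context window size
TOKENS_PER_FILE_READ = 3_000     # a ~300-line source file
TOKENS_PER_EXCHANGE = 800        # one prompt + one response
TOKENS_PER_COMMAND = 1_500       # typical command output

def tokens_used(file_reads: int, exchanges: int, commands: int) -> int:
    """Total tokens accumulated in the session so far."""
    return (file_reads * TOKENS_PER_FILE_READ
            + exchanges * TOKENS_PER_EXCHANGE
            + commands * TOKENS_PER_COMMAND)

# Two hours of steady work: 30 file reads, 80 exchanges, 40 commands
used = tokens_used(30, 80, 40)
print(f"{used:,} tokens used ({used / WINDOW_TOKENS:.0%} of window)")
# → 214,000 tokens used (107% of window)
```

At this (made-up) pace, two hours of steady work overshoots the window entirely, which is roughly when the symptoms above tend to appear.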

Fix 1: Compact your context proactively

Don't wait until Claude gets slow. Every 45-60 minutes, run:

/compact

This compresses your conversation history while preserving the key decisions and file state. It's like defragging your session.

Pro tip: Compact before switching to a new feature, not after Claude starts acting weird. Reactive compacting loses more context than proactive compacting.

Fix 2: Use CLAUDE.md to front-load critical context

Every token Claude spends rediscovering your project is a token wasted. Put it in CLAUDE.md once:

# Project: MyApp

## Architecture
- Frontend: React 18, TypeScript, Tailwind
- Backend: Express + PostgreSQL
- Auth: JWT with refresh tokens

## Current sprint goals
- Fix payment webhook handler
- Add user export feature
- Write integration tests for /api/orders

## Do not touch
- Legacy billing code in /src/billing/v1/
- Auth middleware (security audit pending)

Claude reads this at session start. You don't have to re-explain your stack every time.

Fix 3: Start fresh sessions for fresh tasks

This is counterintuitive. Keeping a single long session "feels" more productive but isn't.

Start a new Claude Code session when:

  • Switching from debugging to new feature work
  • Moving from one service to another in a monorepo
  • Starting the next day's work

Fresh sessions start with full context capacity. Your CLAUDE.md ensures continuity without the baggage.

Fix 4: Split parallel work across separate sessions

Instead of one bloated session, run multiple focused sessions:

# Terminal 1: Frontend work
cd /frontend && claude

# Terminal 2: Backend work  
cd /backend && claude

# Terminal 3: Tests
cd /tests && claude

Each session stays lean. No session knows what the others are doing, so they can't pollute each other's context.

Fix 5: Scope your file reads

Every cat large-file.js dumps the entire file into context. Instead:

# BAD - dumps 2000 lines into context
Read src/api/handlers.js

# GOOD - targeted read
Show me lines 45-90 of src/api/handlers.js

# BETTER - semantic request
Find the authentication middleware in src/api/handlers.js

Claude can search and extract. You don't need to load entire files.
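If you already know roughly where the code lives, you can also slice the file yourself with standard tools and paste only the slice. A minimal sketch (the file and line range are stand-ins):

```shell
# Generate a stand-in "big file" of 100 lines, then extract a targeted slice
# instead of dumping the whole thing into context.
printf 'line %d\n' $(seq 1 100) > /tmp/handlers.txt
sed -n '45,48p' /tmp/handlers.txt   # prints only lines 45-48
```

Same idea as the targeted read above, just done before the content ever reaches the session.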

Fix 6: Remove the rate limit problem entirely

Here's something most Claude Code users don't know: a significant part of the "slowdown" you're experiencing isn't context saturation — it's rate limiting.

When your session gets heavy, you're hitting Anthropic's API rate limits more frequently. Claude Code starts queuing requests, which feels like slowness.

The fix is to route Claude Code through a proxy that removes rate limit caps:

export ANTHROPIC_BASE_URL=https://simplylouie.com
claude

With this set, Claude Code sends requests to the proxy instead of directly to Anthropic. The proxy handles rate limit management so you never hit a wall mid-session.

SimplyLouie costs $2/month and removes the rate limit interruptions entirely. For heavy Claude Code users, this is the most effective performance fix.
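If you go this route, it's worth verifying the override took effect before launching a session. A small sketch assuming a POSIX-compatible shell (the URL is the proxy from above):

```shell
# Set the override for the current shell and confirm it took effect.
export ANTHROPIC_BASE_URL=https://simplylouie.com
echo "Claude Code will send requests to: ${ANTHROPIC_BASE_URL:-Anthropic (default)}"
# To persist it across sessions, add the export line to ~/.bashrc or ~/.zshrc.
```

Unset the variable (`unset ANTHROPIC_BASE_URL`) to go back to talking to Anthropic directly.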

The compound effect

The real problem with context saturation isn't just the slowdown — it's the cascade:

  1. Context fills up
  2. Accuracy drops
  3. You re-explain things to compensate
  4. Re-explaining fills more context
  5. Performance gets worse faster

Break the cascade early. Compact at 45 minutes, use CLAUDE.md, scope your reads, and if rate limits are hitting you, fix that separately with ANTHROPIC_BASE_URL.

Summary

Problem                               Fix
Context filling up                    /compact every 45-60 min
Re-explaining project every session   CLAUDE.md with architecture + goals
Session bloat                         Fresh session per task
Reading full files                    Targeted line/semantic reads
Rate limit interruptions              ANTHROPIC_BASE_URL proxy
Parallel work slowing each other      Separate terminal sessions

Claude Code is a multiplier on your productivity — but only if the sessions stay healthy. These fixes keep you in flow.


Running Claude Code heavy? The ANTHROPIC_BASE_URL trick works with any proxy. SimplyLouie is $2/month and handles the rate limit management automatically.
