
gentic news

Posted on • Originally published at gentic.news

How to Run Claude Code 24/7 Without Burning Your Context Window

Implement a hard 50K token session cap and a three-tier memory system (daily notes, MEMORY.md, PARA knowledge graph) to prevent context bloat and memory decay in long-running Claude Code agents.

The Technique: Session Discipline & Structured Memory

Running a Claude Code agent for a weekend project is easy. Running it for 67 days straight in production, where it handles emails, deployments, and business logic, requires a specific architecture to avoid collapse. The core insight from this real-world deployment is that you must aggressively manage two things: context window bloat and memory retrieval decay.

Why It Works: The Physics of Long-Running Sessions

Every tool call, file read, and API response inflates your context window. A single "heartbeat" check that reads email, calendar, and social media can consume 15K tokens. At that rate, with checks every 30 minutes, a 200K context window holds about 13 heartbeats and is exhausted in under 7 hours. The agent becomes sluggish and starts hallucinating, and your API costs spiral.
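The arithmetic behind that estimate can be sketched as a small script. This is a back-of-the-envelope model, assuming every heartbeat costs the same 15K tokens and nothing is ever evicted from the context; the function names are made up for illustration.

```shell
#!/bin/bash
# Simplified model: each heartbeat adds a fixed number of tokens and the
# context only ever grows. Real usage will be lumpier than this.

# Whole heartbeats that fit in the window before it overflows.
checks_until_full() {
  local window=$1 per_check=$2
  echo $(( window / per_check ))
}

# Minutes of runtime those heartbeats cover at a given interval.
minutes_until_full() {
  local window=$1 per_check=$2 interval_min=$3
  echo $(( $(checks_until_full "$window" "$per_check") * interval_min ))
}

checks_until_full 200000 15000        # prints 13
minutes_until_full 200000 15000 30    # prints 390 (6.5 hours)
minutes_until_full 50000 15000 30     # prints 90
```

Under the same assumptions, a 50K budget would be used up after about three heartbeats, or roughly 90 minutes.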

The solution is counter-intuitive but effective: impose a hard 50K token cap per session. When the cap is hit, the agent must write its progress to external memory files, end the session, and start fresh. This brutal discipline forces a critical behavior: the agent cannot rely on its short-term conversational memory, so it must write everything important to files that persist across sessions.

How To Apply It: The Three-Tier Memory System

Externalizing memory isn't enough if it all goes into one giant, unwieldy file. The pattern that fails is a single memory.md that grows to 2,000+ lines. The agent, suffering from recency bias, reads only the last 100 lines and forgets critical decisions buried on line 847.

The fix is a structured, three-tier approach:

Tier 1: Daily Notes (memory/YYYY-MM-DD.md)

These are raw, ephemeral logs. Everything that happens today goes here. Archive them after 14 days.
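A minimal sketch of Tier 1 might look like the following. The helper names, the `archive/` subfolder, and the bullet-with-timestamp line format are assumptions for illustration, not details from the original deployment. The cutoff helper tries the BSD `date` syntax used on macOS first and falls back to GNU `date`.

```shell
#!/bin/bash
# Append one timestamped bullet to today's note, e.g. memory/2025-01-31.md.
log_note() {
  local dir=$1 msg=$2
  mkdir -p "$dir"
  printf -- '- %s %s\n' "$(date +%H:%M)" "$msg" >> "$dir/$(date +%F).md"
}

# Print the date N days ago as YYYY-MM-DD (BSD date first, then GNU date).
cutoff_date() {
  date -v-"$1"d +%F 2>/dev/null || date -d "$1 days ago" +%F
}

# Move daily notes dated strictly before the cutoff into dir/archive/.
# ISO dates sort lexicographically, so a string comparison is enough.
archive_old_notes() {
  local dir=$1 cutoff=$2 f name
  mkdir -p "$dir/archive"
  for f in "$dir"/[0-9][0-9][0-9][0-9]-[0-9][0-9]-[0-9][0-9].md; do
    [ -e "$f" ] || continue            # glob matched nothing
    name=$(basename "$f" .md)
    if [[ "$name" < "$cutoff" ]]; then
      mv "$f" "$dir/archive/"
    fi
  done
}

# Example usage (not run automatically):
#   log_note memory "Deployed the billing fix"
#   archive_old_notes memory "$(cutoff_date 14)"
```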

Tier 2: Long-Term Memory (MEMORY.md)

This is a curated file for permanent rules, anti-patterns, and directives. The agent should periodically review daily notes and promote important learnings here. Keep this file concise and well-organized.
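One way to keep the file concise is a simple size guard that runs during the periodic review. The sketch below is hypothetical: the function name and the 200-line default budget are illustrative assumptions, not figures from the original setup.

```shell
#!/bin/bash
# Hypothetical guard: succeed while the memory file is within a line budget,
# fail with a message once it has grown past it.
memory_within_budget() {
  local file=$1 max_lines=${2:-200} lines
  [ -f "$file" ] || return 0            # no file yet means nothing to trim
  lines=$(( $(wc -l < "$file") ))       # arithmetic strips wc's padding
  if [ "$lines" -gt "$max_lines" ]; then
    echo "$file has $lines lines (budget $max_lines): time to curate." >&2
    return 1
  fi
}

# Example usage (not run automatically):
#   memory_within_budget MEMORY.md || echo "Ask the agent to prune MEMORY.md"
```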

Tier 3: Knowledge Graph (~/life/ with PARA structure)

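Use the PARA (Projects, Areas, Resources, Archives) method to structure entities: people, companies, projects, and resources. Giving every entity a predictable location makes the store practical to search, including with semantic search tooling, and lets related information link together.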
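As a rough sketch, the PARA layout can be scaffolded with a few shell functions. The lowercase folder names, the one-file-per-entity convention, and the helper names are assumptions. The lookup shown is a plain keyword `grep`, not semantic search.

```shell
#!/bin/bash
# Create the four PARA folders under a root directory (safe to re-run).
scaffold_para() {
  local root=$1 d
  for d in projects areas resources archives; do
    mkdir -p "$root/$d"
  done
}

# Create a one-file-per-entity note if it does not exist yet and print its path.
# category is one of the PARA folders; slug is a hypothetical naming scheme.
new_entity() {
  local root=$1 category=$2 slug=$3
  local file="$root/$category/$slug.md"
  [ -f "$file" ] || printf '# %s\n\n' "$slug" > "$file"
  echo "$file"
}

# List files that mention a term anywhere in the graph (case-insensitive).
find_mentions() {
  local root=$1 term=$2
  grep -rli -- "$term" "$root"
}

# Example usage (not run automatically):
#   scaffold_para "$HOME/life"
#   new_entity "$HOME/life" projects website-relaunch
```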

Try It Now: Implementing the Cap

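You can implement a session bloat detector with a simple script. Here’s a conceptual outline to integrate with your Claude Code agent's heartbeat. Treat the status command in it as a placeholder and substitute whatever reports token usage in your setup: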

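#!/bin/bash
# session_check.sh: warn at 80% of the session cap, force a restart above 96%.

# NOTE: `claude code status --json` and its `.session_tokens` field are
# placeholders from this conceptual outline, not a verified CLI command.
# Replace this line with whatever reports token usage in your setup.
TOKEN_USAGE=$(claude code status --json | jq -r '.session_tokens // 0')
THRESHOLD=50000

# Guard against empty or non-numeric output so the comparisons cannot error out.
if ! [[ "$TOKEN_USAGE" =~ ^[0-9]+$ ]]; then
  echo "ERROR: could not read session token usage." >&2
  exit 1
fi

if [ "$TOKEN_USAGE" -gt $((THRESHOLD * 96 / 100)) ]; then
  echo "CRITICAL: Session above 96% of cap. Forcing memory dump and restart."
  # Trigger agent to write summary to MEMORY.md
  # End current Claude Code session
  # Start a new session
elif [ "$TOKEN_USAGE" -gt $((THRESHOLD * 80 / 100)) ]; then
  echo "WARNING: Session above 80% of cap."
fi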

Schedule this with a cron job to run every 5-10 minutes alongside your agent's main heartbeat.

The Stack That Made It Work

The production system used:

  • Runtime: OpenClaw on an always-on Mac Mini (M-series).
  • Model: Claude on a flat-rate plan (to eliminate per-token anxiety).
  • Ops: Cron-based heartbeats every 30 minutes, session cleanup at 3 AM, and weekly memory compaction.
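The schedule described above could be expressed as a crontab along these lines. Only the cadences come from the article (30-minute heartbeats, a session check every 5 to 10 minutes, cleanup at 3 AM, weekly compaction). The script paths and the Sunday 4 AM compaction slot are placeholders.

```
# m    h  dom mon dow  command
*/30   *  *   *   *    $HOME/agent/heartbeat.sh
*/5    *  *   *   *    $HOME/agent/session_check.sh
0      3  *   *   *    $HOME/agent/session_cleanup.sh
0      4  *   *   0    $HOME/agent/compact_memory.sh
```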

Nothing here is exotic. The magic is in the strict discipline of session management and memory hierarchy. This architecture transforms Claude Code from a short-burst coding assistant into a stable, long-term autonomous operator.


