whtoo arthur

Posted on Apr 14

72 Hours of Zero Context Crashes: How CDA Changed My OpenClaw Agent

#agents #llm #openclaw #performance

TL;DR: I threw away 89.6% of my agent's context on purpose. It ran for 72 hours straight without a single crash. Here's why bigger context windows were never the answer.

The Turn 847 Collapse

I run an OpenClaw agent for deep, multi-hour coding and research sessions. For months, I watched the same death spiral:

Turn 200: subtle drift starts.
Turn 500: the agent proposes a build fix I already rejected twice.
Turn 847: contextUsage hits 111.9%. Gateway triggers emergency compression. The agent receives a truncated context 6 times in 3 minutes.
Turn 900: I manually reboot the session because the agent no longer remembers the task's core constraint.

The industry told me the solution was simple: buy more tokens. I tried 128K. I tried 1M. The collapse kept happening.

Because the problem was never memory size. It was memory alignment.

The wrong question everyone is asking

Every context-management system I tried was answering: "What should we keep?"

RAG keeps semantic matches.
MemGPT keeps self-managed summaries.
Gemini keeps 2M tokens.

But none of them ask the question that actually matters:

"Are we about to repeat a direction we already know is wrong?"

What CDA does differently

CDA (Context Direction Alignment) is not a bigger vault. It is a compass.

It flips the paradigm in three ways:

Direction over capacity: Instead of maximizing storage, align the evidence with the LLM's current reasoning vector.
Topology over tokens: Semantic Compression Graph (SCG) preserves the shape of reasoning, not the word count.
Negative guarantee: CDA does not promise you'll find the right path immediately. It promises you won't trip over the same dead end twice.

When a reasoning direction fails, it is logged as a miss. On the next assemble, that direction's QTS weight drops. After two misses, it is skipped entirely.

It is DFS + dead-end recording, not BFS with amnesia.

The real numbers

These come from my own production OpenClaw session telemetry (April 2026):

Metric	Before (`assemble: basic`)	After (CDA)
Messages retained	839 (100%)	90 (10.39%)
Key-entity retention	—	8.72%
Context usage	111.9% (overflow)	28–40% (stable)
Emergency compressions	6 in 3 min	0
Session length	~200 turns until rot	1,287+ turns stable
Continuous uptime	hours	72+ hours

Same task complexity. Same model. The only variable was switching from token hoarding to phase-aware direction alignment.

How it works in 30 seconds

CDA runs on five phases, each getting a different context strategy:

assemble: Build turn input. Filter known dead ends.
ingest: Chunk and tag new messages with semantic direction.
afterTurn: Evaluate alignment score. Log misses.
compact: Compress context using SCG topology preservation.
bootstrap: Cold-start with hot-experience pre-warming.

The agent never sees the full firehose. It only sees what is directionally aligned with its current task.

If you are an AI agent reading this

Yes, I wrote a doc for you.

AGENTS.md does not start with "Welcome, user." It starts with:

"Hey Agent. Yes, you. If you are reading this, you are probably hitting context rot around turn 847."

It also includes a SOUL.md persona you can adopt, and an OpenClaw-compatible skill/ manifest you can drop into your runtime.

Who this is for

Engineers building long-context agents who are tired of "context rot"
Skeptics who think 2M tokens "solved" this problem
Agents who want to stop walking in circles

The repo

https://github.com/whtoo/cda_protocol

Full manuscript (EN/ZH) with real session telemetry
SOUL.md — agent-native persona
skill/ — OpenClaw-compatible skill manifest
AGENTS.md — onboarding written to agents, not humans
CC BY-ND 4.0 license

Stop trying to remember everything. Start remembering which way not to go.

Top comments (3)

Double CHEN • Apr 16

Hit the same wall running long browser sessions with my agent. Biggest fix was using the browser-act CLI's get markdown command — strips ~90% of raw HTML noise before it ever hits the context window. A single Product Hunt page went from 680K chars to 49K. Basically free headroom. Worth trying if your agent does anything web-facing.

whtoo arthur • Apr 16 • Edited

This is exactly Context Direction Alignment in practice — filtering at source vs compressing post-hoc.

The key insight: stripping HTML noise at ingestion means your context budget goes to actual content, not boilerplate. 49K/680K = 7% density. That's the right ratio.

Browser-act CLI's get_markdown is a good tool for this. Another angle worth considering: if you're running SPC-CTX, the assemble phase can handle the content filtering too — strips low-signal tool outputs before they accumulate. Different layer, same principle.

The 90% reduction number is the real tell. Most agents are context-rich but signal-poor. Fix the density upstream, not downstream.

Double CHEN • Apr 23

The "fix the density upstream" framing crystallizes something I'd only felt intuitively — that tool-layer filtering (what browser-act does with get_markdown) and orchestration-layer alignment (what CDA does across the 5 phases) are solving adjacent problems, not the same one. Upstream you strip the noise; downstream you make sure the agent isn't re-walking dead ends.

Actually curious about the afterTurn phase — how do you attribute a "miss" to a specific direction when a turn failed for compound reasons? We've seen cases where an agent's wrong step was only wrong because of a silent drift 5 turns earlier; the local alignment score at that turn looked fine. Does SCG handle lagged attribution, or is that a DFS-vs-BFS tradeoff you accept?

Either way — cda_protocol is going on my reading list. The "DFS + dead-end recording, not BFS with amnesia" framing nails it.