Gregory Shevchenko

Posted on May 26 • Originally published at gregshevchenko.com

Autocompaction Is Not Memory


Long-context agents already summarize.

That is useful.

It is not memory.

Built-in autocompaction helps Claude Code, Codex, Cursor, Windsurf, or another coding-agent surface survive a long session. But a team workflow needs something stricter than "the chat got summarized."

It needs portable operational state.

That is the difference I keep coming back to while working on a local MCP token-economy stack: compression keeps one conversation alive; handoff lets a workspace continue.

What built-in autocompaction does well

Autocompaction is good at shrinking a session's raw context inside one product when the window gets too full.

For a single agent in a single chat, that can be enough:

preserve the broad goal;
compress prior discussion;
keep the model moving;
avoid forcing the user to start from zero.

That is real value.

The problem appears when the work is no longer just one chat.

In a real repo, the same task may move between Claude Code, Codex, Cursor, Windsurf, a remote Mac Mini, MCP tools, CI gates, and human review. At that point, a narrative summary is not the same thing as an operational contract.

What autocompaction usually loses

The details that matter most are often the least summary-shaped:

which approvals were actually granted;
which files or services are off-limits;
which exact values must not drift;
which sources are trusted, semi-trusted, or untrusted;
which errors were already tried and fixed;
which commands passed;
which checks are still pending;
what the next agent must not redo.

Those are not just "context."

They are control-plane state.

If that state disappears during compaction, the next agent can sound confident while silently re-opening risks the previous agent had already closed.

The missing layer: local handoff MCP

The mechanism I want is a local handoff MCP that writes a structured handoff before the window is full.

The point is not to make a prettier summary.

The point is to make a resume contract that another agent can use safely.

A minimal handoff should preserve:

objective and done condition;
loaded instructions and constraints;
approval state;
exact values that must not change;
risks and red flags;
actions already taken;
errors and fixes;
pending verification;
next recommended step;
what not to redo.

That contract should live in the workspace, not only inside the product's private chat memory.
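As a minimal sketch of what "living in the workspace" could look like, the fields above can be modeled as a small record and written to a file any agent can read. Everything named here is an assumption for illustration: the `Handoff` class, its field names, and the `.handoff/current.json` path are hypothetical, not part of any existing MCP server or product.

```python
import json
from dataclasses import asdict, dataclass, field
from pathlib import Path


@dataclass
class Handoff:
    """Hypothetical resume contract; fields mirror the minimal list above."""
    objective: str
    done_condition: str
    constraints: list[str] = field(default_factory=list)
    approvals: dict[str, bool] = field(default_factory=dict)     # action -> granted?
    pinned_values: dict[str, str] = field(default_factory=dict)  # values that must not drift
    red_flags: list[str] = field(default_factory=list)
    actions_taken: list[str] = field(default_factory=list)
    errors_and_fixes: list[str] = field(default_factory=list)
    pending_verification: list[str] = field(default_factory=list)
    next_step: str = ""
    do_not_redo: list[str] = field(default_factory=list)


def handoff_path(workspace: Path) -> Path:
    # Assumed location; any stable, repo-local path would do.
    return workspace / ".handoff" / "current.json"


def write_handoff(handoff: Handoff, workspace: Path) -> Path:
    """Persist the contract in the workspace, outside any product's chat memory."""
    target = handoff_path(workspace)
    target.parent.mkdir(parents=True, exist_ok=True)
    target.write_text(
        json.dumps(asdict(handoff), indent=2, sort_keys=True), encoding="utf-8"
    )
    return target


def read_handoff(workspace: Path) -> Handoff:
    """Load the contract so a fresh agent can resume from it."""
    data = json.loads(handoff_path(workspace).read_text(encoding="utf-8"))
    return Handoff(**data)
```

The design choice that matters is the plain file: a second agent in a different product only needs read access to the repo, not to the first product's session.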

Autocompaction vs local handoff

The timing difference matters.

Autocompaction often happens after context pressure is already high. A handoff protocol can pre-score the session earlier and decide whether the next transition needs a normal summary, a red-flag handoff, or a hard stop for human review.
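One way to picture the pre-scoring step is a small classifier over session signals. This is only a sketch under stated assumptions: the `SessionSignals` fields, the rules, and the 0.6 context threshold are hypothetical placeholders, not measured values or an existing tool's behavior.

```python
from dataclasses import dataclass
from enum import Enum


class Transition(Enum):
    NORMAL_SUMMARY = "normal_summary"
    RED_FLAG_HANDOFF = "red_flag_handoff"
    HARD_STOP = "hard_stop"


@dataclass
class SessionSignals:
    context_used: float        # fraction of the window in use, 0.0 to 1.0
    touched_secrets: bool
    touched_live_ops: bool
    pending_approvals: int
    same_defect_repeats: int


def pre_score(s: SessionSignals) -> Transition:
    """Classify the next transition. All rules here are illustrative assumptions."""
    risky = s.touched_secrets or s.touched_live_ops
    # Risky work with unresolved approvals goes to a human, not to another agent.
    if risky and s.pending_approvals > 0:
        return Transition.HARD_STOP
    # A defect class seen three times is treated as a loop (assumed cutoff).
    if s.same_defect_repeats >= 3:
        return Transition.HARD_STOP
    if risky or s.pending_approvals > 0 or s.same_defect_repeats >= 2:
        return Transition.RED_FLAG_HANDOFF
    return Transition.NORMAL_SUMMARY


def should_write_now(s: SessionSignals, threshold: float = 0.6) -> bool:
    """Write the handoff before the window is full, or as soon as risk appears."""
    return s.context_used >= threshold or pre_score(s) is not Transition.NORMAL_SUMMARY
```

The threshold sits well below a full window on purpose, so the handoff is written while the agent still has room to state approvals and pending checks precisely.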

Why a 1M context window does not remove the need

A larger context window is valuable.

I want it. I will use it.

It lets the agent keep more code, logs, source material, and prior reasoning available before compression becomes necessary.

But a larger window mostly delays the failure mode. It does not automatically make state portable, trusted, auditable, or shared across products.

One million tokens can still contain:

stale approvals;
buried secrets;
contradictory instructions;
obsolete diagnostics;
repeated failed attempts;
unlabeled source trust;
no clear next step.

More room is not the same thing as better state management.

A simple handoff contract

Here is the shape I want agents to produce at real transition points:

## Objective
What we are trying to finish.

## Done Condition
The exact observable state that means this task is complete.

## Constraints
Loaded repo rules, user constraints, risk boundaries, and trust labels.

## Approval State
What the user approved, what remains unapproved, and what requires a checkpoint.

## Actions Taken
Commands, edits, deploys, external publications, or tool calls already completed.

## Verification
Checks that passed, checks that failed, and checks still pending.

## Red Flags
Secrets, live ops, destructive commands, ambiguous ownership, or same-defect loops.

## Next Step
The recommended next action for a fresh agent.

## Do Not Redo
Work already completed or paths already ruled out.

This is intentionally boring.

A good handoff is not supposed to be clever. It is supposed to be hard to misunderstand.
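Because the template is boring, it is also easy to check mechanically. The sketch below assumes a handoff note is stored as Markdown with exactly the `## ` headings shown above; `parse_sections` and `missing_sections` are hypothetical helper names, and the check only catches absent or empty sections, not wrong content.

```python
import re

REQUIRED_SECTIONS = [
    "Objective", "Done Condition", "Constraints", "Approval State",
    "Actions Taken", "Verification", "Red Flags", "Next Step", "Do Not Redo",
]


def parse_sections(markdown: str) -> dict[str, str]:
    """Split a handoff note into {heading: body} on level-2 '## ' headings."""
    sections: dict[str, str] = {}
    current = None
    for line in markdown.splitlines():
        match = re.match(r"^##\s+(.+?)\s*$", line)
        if match:
            current = match.group(1)
            sections[current] = ""
        elif current is not None:
            sections[current] += line + "\n"
    return {name: body.strip() for name, body in sections.items()}


def missing_sections(markdown: str) -> list[str]:
    """Return required sections that are absent or empty, in template order."""
    sections = parse_sections(markdown)
    return [name for name in REQUIRED_SECTIONS if not sections.get(name)]
```

A receiving agent could refuse to resume when `missing_sections` returns anything, which turns "hard to misunderstand" into a gate rather than a hope.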

What we can measure

The useful question is not "did the summary look nice?"

The useful question is whether the next agent can continue with less waste and fewer mistakes.

I would measure:

resume success: can a fresh agent take the next step from the handoff alone?
re-read rate: how often does it need to reopen old files or old chat context?
token estimate: how many tokens of context did the resume avoid reloading?
leak rate: did secrets, private implementation details, or off-limits facts enter the handoff?
approval preservation: did the resumed agent retain the correct permission boundary?
redo rate: did the agent repeat completed work?
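To make these measures concrete, here is a rough sketch that aggregates them over a batch of resume trials. The `ResumeTrial` record and its fields are assumptions about what a team might log per resume attempt; how each value is observed (for example, how a leak is detected) is left open.

```python
from dataclasses import dataclass


@dataclass
class ResumeTrial:
    """One resume attempt by a fresh agent (hypothetical log record)."""
    resumed_from_handoff_alone: bool
    rereads: int               # old files or old chat context reopened
    tokens_avoided: int        # estimated context not reloaded
    leaked_items: int          # secrets or off-limits facts found in the handoff
    approvals_preserved: bool
    redone_steps: int          # completed work that was repeated


def summarize(trials: list[ResumeTrial]) -> dict[str, float]:
    """Aggregate per-trial observations into the six measures listed above."""
    n = len(trials)
    if n == 0:
        raise ValueError("need at least one trial")
    return {
        "resume_success": sum(t.resumed_from_handoff_alone for t in trials) / n,
        "reread_rate": sum(t.rereads > 0 for t in trials) / n,
        "avg_tokens_avoided": sum(t.tokens_avoided for t in trials) / n,
        "leak_rate": sum(t.leaked_items > 0 for t in trials) / n,
        "approval_preservation": sum(t.approvals_preserved for t in trials) / n,
        "redo_rate": sum(t.redone_steps > 0 for t in trials) / n,
    }
```

Rates are computed per trial rather than per event, so one noisy session with many re-reads does not dominate the picture.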

This is where the MCP token-economy angle becomes practical. The point is not just fewer tokens. The point is fewer unsafe or wasteful recovery loops.

Where this helps

This pattern is useful when:

a coding session is approaching context pressure;
a task is moving from Claude Code to Codex or Cursor;
a local agent hands work to a remote machine;
a background MCP workflow needs to resume later;
the work touched live deploys, credentials, publications, or approvals;
the same defect class has already appeared twice.

In those cases, "the chat will summarize itself" is not enough of a process.

Practical takeaway

Use autocompaction.

Use bigger context windows when they are available.

But do not confuse either one with memory.

Memory for agentic engineering is not just remembering what was said. It is preserving the operational state that lets the next actor continue safely.

Autocompaction helps a chat survive.

Handoff helps a workspace continue.

Sources

MCP stack token economy — the local measurement frame behind byte saving, cache-friendliness, and prompt-context economics.
Agentic engineering for marketing teams — the shared operator vocabulary for Claude Code, Codex, Cursor, Windsurf, n8n, MCP, proof loops, and quality gates.
AI agent failure loops — the QA and stop-rule note behind red-first gates, blind validation, rejected examples, and failure-loop control.

Full canonical note:

https://gregshevchenko.com/notes/autocompaction-is-not-memory/