Athreya aka Maneshwar

Posted on Jun 15

Don't Do Your Taxes at a Party

#ai #machinelearning #programming #beginners

Hello, I'm Maneshwar. I'm building git-lrc, a Micro AI code reviewer that runs on every commit. It is free and source-available on Github. Star git-lrc to help devs discover the project. Do give it a try and share your feedback.

Part 3 of a series on context engineering and the finale of the pillar tour. Part 1 was the map; Part 2 was compaction. This one's about the move that questions the whole premise: why are we cramming everything onto one desk in the first place?

You've met this agent. You may have built it.

It does everything.

It answers support tickets, writes code, queries the database, drafts the customer apology email, and give it one more tool, it waters your plants xD

One magnificent system prompt, one giant context window, every capability you could imagine bolted on.

And it's mediocre at all of it.

In Part 1 I called this the cognitive equivalent of doing your taxes at a party.

The SQL instructions bleed into the email-drafting instructions.

The tool the model needs is buried under nine tools it doesn't.

Information from one task contaminates another.

The window isn't full in the way compaction worries about, it's full of the wrong kinds of things at once, all elbowing each other for the model's attention.

The whole series so far has been about managing one desk well: clearing it, curating it, compacting it.

Isolation is the pillar that asks the heretical question, why one desk?

One desk per job

Isolation is simple to state: give each task its own clean, dedicated context, with only the instructions, tools, and information that job needs.

Instead of one omniscient agent, you run several focused ones.

The research agent doesn't get your code style guide.

The coder doesn't need to know how to phrase a refund apology.

Each context stays small, sharp, and uncontaminated, and a mess in one doesn't leak into the others.

There are two ways to actually pull this off, and they sit at very different weights.

The heavy way: sub-agents

The headline form of isolation is spawning sub-agents.

The orchestrator hands a sub-agent a compressed task "find every API endpoint that modifies user data" not the entire conversation history.

The sub-agent gets a fresh window, explores freely, reads thirty files, runs ten searches, makes a mess.

Then it returns one thing: a 500-token answer.

The orchestrator never sees the thirty files. It sees the answer.

That asymmetry is the whole point.

The expensive, noisy exploration happens inside a context that gets thrown away, and only the distilled result survives into the main conversation. Claude Code does exactly this, dedicated sub-agents for navigation, planning, and docs lookup, each in its own window, running in parallel when the tasks are independent.

Anthropic's own multi-agent researcher found that many agents with isolated contexts beat a single agent, precisely because each subagent's window could be spent entirely on one narrow question instead of being split across all of them.

The shape, in one line:

The sub-agent does the reading. The orchestrator gets the summary. The mess stays quarantined.

The light way: isolate the stuff, not the agent

You don't always need a second brain.

Often you just need to keep a heavy object out of the model's face.

The trick: let the work happen in an environment, and hand the model a reference instead of the payload.

HuggingFace's CodeAgent runs tool calls as code in a sandbox, so a 5 MB image or a giant log blob can live as a variable in the environment, and the model gets a handle to it, not the bytes.

Same idea with a runtime state object: write the token-heavy tool output to a field the LLM doesn't see, and expose only the fields it needs this turn.

If that sounds familiar, it should — it's the write/external-memory pillar wearing an isolation hat.

Keeping something off the desk so it can't distract is the same gesture whether you call it "memory" or "isolation."

The pillars were always the same idea in different costumes.

The bill comes due

Isolation is the most powerful pillar and the most expensive one, so be honest about the tradeoffs before you reach for it.

It costs tokens, Anthropic reported multi-agent setups burning up to 15× more tokens than a single chat.

It costs coordination: someone has to plan the sub-agents, route the work, and stitch the results back together, and that orchestration is its own prompt-engineering problem.

And there's a sharper failure underneath the token bill.

Isolation works beautifully for independent tasks and badly for interdependent ones.

Two sub-agents exploring different questions in parallel? Great, that's what isolation is for.

Two sub-agents writing code that has to fit together, each blind to what the other decided? Now you've got two halves of a feature built on contradictory assumptions, and the orchestrator gets to discover the conflict at integration time.

There's a real camp of builders who think multi-agent is overhyped for exactly this reason when subtasks need to agree, splitting their context is how you manufacture disagreement.

So the rule of thumb isn't "isolate the busy tasks." It's:

Isolate by independence, not by busyness.

If two pieces of work need to stay consistent with each other, they probably want the same desk. If they genuinely don't touch — parallel research, read-only exploration, throwaway investigation — give each its own and let them run.

The desk, reassembled

That's the last pillar, which means we can finally step back and look at the whole thing.

Pillar	The question it answers
External memory	Where does knowledge live when it's not on the desk?
RAG / retrieval	How do I pull the right slice back onto it?
Compaction	How do I shrink what's on it without losing the plot?
Isolation	How do I give each job a desk of its own?

Stare at that table and the punchline of the entire series falls out: they are all the same move.

Every one of them is a decision about that scarce, finite desk, the context window, so the model sees exactly what it needs and nothing it doesn't.

Write it down so it's not in the way.

Pull it back only when it's relevant.

Compress it when it's bloated.

Split it when the jobs interfere.

None of it is begging the model in capital letters.

It's the boring, deliberate work of deciding what information goes where, and when.

The line I ended Part 1 with still holds, and it's a decent place to end the series too:

The prompt is the question. The context is everything that makes the answer good.

Disclaimer: This article was written by me; AI was used to fix grammar and improve readability.

AI agents write code fast. They also silently remove logic, change behavior, and introduce bugs — without telling you. You often find out in production.

git-lrc fixes this. It hooks into git commit and reviews every diff before it lands. 60-second setup. Completely free.

Any feedback or contributors are welcome! It's online, source-available, and ready for anyone to use.

⭐ Star it on GitHub:

HexmosTech / git-lrc

Free, Micro AI Code Reviews That Run on Commit

git-lrc

Free, Micro AI Code Reviews That Run on Commit

GenAI today is a race car without brakes. It accelerates fast -- you describe something, and large blocks of code appear instantly. But AI agents silently break things: they remove logic, relax constraints, introduce expensive cloud calls, leak credentials, and change behavior -- without telling you. You often find out in production.

git-lrc is your braking system. It hooks into git commit and runs an AI review on every diff before it lands. 60-second setup. Completely free.

In short, git-lrc helps Prevent Outages, Breaches, and Technical Debt Before They Happen

At a glance: 10 risk categories · 100+ failure patterns tracked · every commit…

View on GitHub