Aming

Posted on May 30

Route Context: How I Built Right Context for Agents

#ai #rag #agents #development

We already knew agents needed the right context. Dogfooding taught us the harder problem: making that context live, visible, auditable, and enforceable inside a multi-agent workflow.

The failure looked like success.

A worker changed one file. The focused test passed. The patch was small, readable, and locally correct. If you only looked at the worker's report, the job was done.

But the overall job was still wrong.

The change touched a permission-sensitive path. It needed evidence that the right role had acted, that the project map still matched the code, that the audit trail could explain why the edit was allowed, and that the final close check had not been skipped. Instead, the system had solved the nearest local problem. A lane that should have preserved the shape of the work had drifted into implementation. The worker had completed its tiny job, while the whole route had lost its meaning.

Nobody had to make a dramatic mistake for this to happen. That was the unsettling part. The worker was not lazy. The test was not fake. The prompt was not empty. The system had plenty of context. It just did not have route context.

Route context is the per-task, per-lane runtime packet that tells an agent what job it is in, what role it has, what it saw, what it cannot do, and what evidence must pass before the work is done.

That is the focus of this article. I am not trying to convince technical readers that right context matters. Most people building with agents have already learned that, usually the hard way. The point is how we made right context operational for agent development through route context: a live artifact that turns the principle into workflow state that is visible in the prompt, hashable, auditable, and checkable by gates.

We started with a different hope. We wanted agent development to feel zero-orchestration-ish: the user says what they want, the system understands the work, the right agents handle the right pieces, and nobody has to manually conduct a meeting of subagents. The observer would keep things coherent. Workers would implement. Review lanes would check evidence. Validation would catch bad finishes.

Then we used it on ourselves.

Again and again, the same failure appeared in different clothes. The observer would begin correctly, then collapse into a nearby code edit. A worker would pass a test, but not satisfy the larger obligation. A reviewer would evaluate plausible reasoning, but not the same evidence the worker actually produced. A prompt would contain lots of useful system knowledge, but not the specific promises this worker had to satisfy.

That is when the thesis became operational: AI agent development needs right context, not more context, and route context is how we made that sentence executable.

The Words We Needed

We had to translate our internal language into something simple enough to survive a live task.

Topology means what kind of job this really is. Is this a tiny deterministic bug fix, a permission change, a runtime change, a graph update, a public decision, or a mixed task that needs independent lanes?

A contract is the set of promises a worker must satisfy: target files, acceptance criteria, out-of-scope areas, allowed actions, blocked actions, and evidence to return.

A gate is a check that prevents a false finish. A test can be a gate, but it is not the only one. A close gate may reject a change because the audit evidence is missing, the runtime was not redeployed, the project map is stale, or the wrong role acted.

Graph, backlog, timeline, and contract are the governance layers that preserve project state, user intent, execution evidence, and allowed action. The graph is the project map. The backlog records the work and acceptance criteria. The timeline records what happened. The contract says what this lane may do.

Once those words were clear, the bug was easier to see. Our static skill text explained the system, but the live task needed route context: the current path through the work, with the role, lane, injected context, and checks bound to the action about to happen.

Static skill text is only a bootloader. It can teach the agent that observers, workers, review lanes, graphs, and gates exist. It cannot reliably decide what matters for this task, this lane, this file, this permission boundary, and this moment. Route context is the runtime packet assembled for that decision.

The Route Context Artifact

The useful fix was not a bigger prompt. It was a route context alert: a small implementation artifact generated before lane dispatch, during topology classification and prompt-contract assembly. It describes what the lane is allowed to do, what was injected into its prompt, and what evidence will be checked later.

A simplified route context alert looks like this:

route_context_alert:
  task_intent: "fix permission handling without changing audit policy"
  role_boundary: "implementation worker; no merge, close, or graph mutation"
  topology: "permission-sensitive bug; requires independent review"
  contract: "edit accepted target file; run focused test; report evidence"
  blocked_actions: ["change route policy", "close backlog", "redeploy runtime"]
  visible_injection_manifest: ["route_doc@sha256:...", "contract@sha256:..."]
  evidence_gates: ["focused_test", "independent_validation", "close_gate"]

This is not magic model control. The model can still misunderstand things. The point is that the important context is visible in the prompt, listed in an audit manifest, and tied to workflow checks that can reject a false finish.
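To make the "hashable" part concrete, here is a minimal Python sketch of the same packet as an immutable object with a content hash. It is an illustration under assumptions, not the Aming Claw implementation: the `RouteContextAlert` class and the canonical-JSON hashing scheme are hypothetical, and only the field names come from the example above.

```python
import hashlib
import json
from dataclasses import asdict, dataclass, replace
from typing import Tuple


@dataclass(frozen=True)
class RouteContextAlert:
    """Hypothetical in-memory form of the route context alert (illustration only)."""

    task_intent: str
    role_boundary: str
    topology: str
    contract: str
    blocked_actions: Tuple[str, ...]
    visible_injection_manifest: Tuple[str, ...]
    evidence_gates: Tuple[str, ...]

    def route_context_hash(self) -> str:
        # Assumed scheme: canonical JSON (sorted keys, fixed separators),
        # so two equal packets always produce the same SHA-256 digest.
        canonical = json.dumps(asdict(self), sort_keys=True, separators=(",", ":"))
        return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


alert = RouteContextAlert(
    task_intent="fix permission handling without changing audit policy",
    role_boundary="implementation worker; no merge, close, or graph mutation",
    topology="permission-sensitive bug; requires independent review",
    contract="edit accepted target file; run focused test; report evidence",
    blocked_actions=("change route policy", "close backlog", "redeploy runtime"),
    visible_injection_manifest=("route_doc@sha256:...", "contract@sha256:..."),
    evidence_gates=("focused_test", "independent_validation", "close_gate"),
)

# Dropping blocked actions yields a different packet, and therefore a different hash.
weakened = replace(alert, blocked_actions=("close backlog",))
print(alert.route_context_hash() != weakened.route_context_hash())
```

The useful property is that a silently weakened boundary changes the hash, so a later gate can compare digests instead of re-reading the prompt.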

The visible injection manifest was especially important. If a document, decision summary, expert note, or implementation contract influences a lane, it should appear in the manifest with an id, kind, source reference, and hash. The hash proves the identity of the injected artifact. Gates and evidence decide whether the lane satisfied the contract. Without the manifest, nobody can reconstruct what the agent actually saw.
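A manifest entry of that shape could be sketched as follows. The `ManifestEntry` class, the helper names, and the example source path are assumptions made for illustration; only the four fields (id, kind, source reference, hash) come from the description above.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class ManifestEntry:
    """Hypothetical manifest record: id, kind, source reference, and content hash."""

    artifact_id: str
    kind: str
    source_ref: str
    sha256: str


def make_entry(artifact_id: str, kind: str, source_ref: str, content: bytes) -> ManifestEntry:
    # Hash the exact bytes that were injected into the prompt.
    return ManifestEntry(artifact_id, kind, source_ref, hashlib.sha256(content).hexdigest())


def matches(entry: ManifestEntry, content: bytes) -> bool:
    # True only if `content` is byte-for-byte the artifact the manifest recorded.
    return hashlib.sha256(content).hexdigest() == entry.sha256


contract_text = b"edit accepted target file; run focused test; report evidence"
entry = make_entry(
    "contract-1",                # hypothetical id
    "implementation_contract",   # hypothetical kind label
    "contracts/contract-1.md",   # hypothetical source reference
    contract_text,
)
print(matches(entry, contract_text))
```

Note what the hash does and does not establish: it identifies the artifact the lane saw, but it says nothing about whether the lane satisfied the contract. That judgment stays with the gates.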

Route context gave us a way to keep orchestration minimal without making it invisible. The system still feels close to zero-orchestration from the user's side, but the route carries enough explicit structure that lanes do not silently blend together.

The Implementation Pattern In Aming Claw

This is not only language we use around the system. Aming Claw implements the pattern as code-level contracts, local gates, audited queries, and append-only evidence.

The route prompt contract is source-controlled in the mf_workflow_runtime.v1.json template. That contract makes route-owned prompt context explicit: the injected artifacts are listed in a visible manifest, observer and review lanes are blocked from drifting into implementation, and the worker has to carry matching route_context_hash, prompt_contract_id, and prompt_contract_hash values. In other words, the prompt is no longer just a bag of helpful text. It has identity.

Before a bounded worker is handed the job, Aming Claw runs a local dispatch gate in mf_subagent_contract.py. The gate checks the worker's branch, worktree, base commit, target head, merge queue, fence token, route hash, prompt hash, and owned files. Same-worktree dispatch is blocked by default because "please stay in this directory" is not a boundary. The boundary has to be represented in durable facts the system can re-check.
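As a rough illustration of what such a gate does, here is a simplified Python sketch. It is not the code in mf_subagent_contract.py: the `Handoff` class, the `dispatch_gate` function, and the reason strings are hypothetical, and the target head and merge queue checks are left out to keep it short.

```python
from dataclasses import dataclass, replace
from typing import List, Tuple


@dataclass(frozen=True)
class Handoff:
    """Hypothetical record of the durable facts bound to one worker dispatch."""

    branch: str
    worktree: str
    base_commit: str
    fence_token: str
    route_context_hash: str
    prompt_contract_hash: str
    owned_files: Tuple[str, ...]


def dispatch_gate(
    expected: Handoff,
    observed: Handoff,
    observer_worktree: str,
    allow_same_worktree: bool = False,
) -> List[str]:
    """Return rejection reasons; an empty list means dispatch may proceed."""
    reasons = []
    for name in ("branch", "base_commit", "fence_token",
                 "route_context_hash", "prompt_contract_hash"):
        if getattr(expected, name) != getattr(observed, name):
            reasons.append(f"{name} mismatch")
    # The worker may own a subset of the fenced files, never more.
    if not set(observed.owned_files) <= set(expected.owned_files):
        reasons.append("owned files outside fence")
    # Sharing the observer's worktree is blocked unless explicitly allowed.
    if observed.worktree == observer_worktree and not allow_same_worktree:
        reasons.append("same-worktree dispatch blocked")
    return reasons


expected = Handoff("fix/permission", "/work/lane-a", "abc123", "fence-7",
                   "rhash", "phash", ("auth/roles.py",))
observed = replace(expected, worktree="/work/observer",
                   owned_files=("auth/roles.py", "audit/policy.py"))
reasons = dispatch_gate(expected, observed, observer_worktree="/work/observer")
print(reasons)
```

Returning a list of reasons rather than a boolean is a design choice in this sketch: the rejection itself becomes evidence that can be logged.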

When the worker returns, the finish gate in the same module treats the response as a claim, not as truth. The finish validation requires the fence token to match, tests to pass, blockers to be absent, a checkpoint id to exist, and the worker identity to still match the handoff. This is the difference between "the agent says it is done" and "the route can safely advance."
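The finish checks listed above can be sketched in a few lines. Again, this is an illustrative assumption rather than the real module: `WorkerClaim`, `finish_gate`, and the reason strings are hypothetical names.

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class WorkerClaim:
    """Hypothetical shape of what a worker reports back. A claim, not a fact."""

    worker_id: str
    fence_token: str
    tests_passed: bool
    blockers: Tuple[str, ...]
    checkpoint_id: Optional[str]


def finish_gate(claim: WorkerClaim, handoff_worker_id: str, handoff_fence_token: str) -> List[str]:
    """Return rejection reasons; an empty list means the route can advance."""
    reasons = []
    if claim.worker_id != handoff_worker_id:
        reasons.append("worker identity changed since handoff")
    if claim.fence_token != handoff_fence_token:
        reasons.append("fence token mismatch")
    if not claim.tests_passed:
        reasons.append("tests did not pass")
    if claim.blockers:
        reasons.append("blockers reported")
    if not claim.checkpoint_id:
        reasons.append("missing checkpoint id")
    return reasons


# Tests passed and the worker says it is done, but there is no checkpoint to point at.
claim = WorkerClaim("worker-1", "fence-7", True, (), None)
reasons = finish_gate(claim, "worker-1", "fence-7")
print(reasons)
```

In this example the worker's tests are green and the gate still refuses, which is exactly the "claim, not truth" distinction.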

The audit trail follows the same idea. The task_timeline.py module is append-only execution evidence. Its close gate expects the route to have the right event kinds: implementation, verification, and close_ready. The later close verification checks those facts before a backlog item can be honestly closed.
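A toy version of that idea, assuming nothing about the real task_timeline.py beyond the three event kinds named above: the `Timeline` class and `missing_for_close` helper are hypothetical.

```python
from typing import List, Tuple

# Event kinds the close gate expects, as named in the text above.
REQUIRED_KINDS = ("implementation", "verification", "close_ready")


class Timeline:
    """Hypothetical append-only log: events can be added and read, never edited."""

    def __init__(self) -> None:
        self._events: List[Tuple[str, str]] = []

    def append(self, kind: str, detail: str) -> None:
        self._events.append((kind, detail))

    @property
    def events(self) -> Tuple[Tuple[str, str], ...]:
        # Hand out an immutable snapshot so callers cannot rewrite history.
        return tuple(self._events)


def missing_for_close(timeline: Timeline) -> List[str]:
    """Return the required event kinds that have not been recorded yet."""
    seen = {kind for kind, _ in timeline.events}
    return [kind for kind in REQUIRED_KINDS if kind not in seen]


timeline = Timeline()
timeline.append("implementation", "patched target file")
timeline.append("verification", "focused test passed")
missing = missing_for_close(timeline)
print(missing)
```

Here implementation and verification are on record, but the backlog item still cannot be closed because nothing has recorded close readiness.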

Even project knowledge is handled this way. Graph context is not dumped wholesale into the prompt. It is queried through an audited graph-query trace and exposed through the MCP graph_query surface, so later review can ask what the agent looked up instead of guessing. The public manual-fix SOP names the same workflow: route, contract, timeline, and close gates are required evidence, and dispatch has to prove the worker's fenced identity before handoff (see the SOP's timeline and contract gates and its dispatch requirements).
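The audited-query idea can be illustrated with a small wrapper that records every lookup. This sketch does not reproduce the MCP graph_query surface or its trace format: the `audited_query` function, the dict-based graph, the trace fields, and the file paths are all assumptions chosen for the example.

```python
import hashlib
import json
from typing import Dict, List


def audited_query(graph: Dict[str, List[str]], node: str, trace: List[dict]) -> List[str]:
    """Look up a node's neighbors and record what was asked and what came back."""
    result = sorted(graph.get(node, []))
    digest = hashlib.sha256(json.dumps(result).encode("utf-8")).hexdigest()
    # The trace entry lets a reviewer later confirm exactly what the agent saw.
    trace.append({"query": node, "result_count": len(result), "result_sha256": digest})
    return result


# Hypothetical project map: file -> files that depend on it.
project_graph = {"auth/roles.py": ["audit/log.py", "api/permissions.py"]}
trace: List[dict] = []

neighbors = audited_query(project_graph, "auth/roles.py", trace)
print(neighbors)
print(trace[0]["query"], trace[0]["result_count"])
```

Even a lookup that returns nothing leaves a trace entry, so "the agent never asked" and "the agent asked and found nothing" stay distinguishable.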

That is the implementation pattern: right context becomes a chain of small, checkable facts. The agent can reason with them, but it does not get to be the only witness.

Before And After

Before, the agent confidently finished the wrong local job.

It saw a failing test and fixed the code. It saw a nearby file and edited it. It saw a plausible completion story and wrote one. The observer forgot what kind of job this really was. The worker optimized inside its local patch. The review or validation step, if present, evaluated the local result instead of the global obligation.

After, route context requires the observer to start by preserving global state. It names the topology: tiny fix, permission-sensitive change, runtime change, graph-impacting change, major decision, or something else. It dispatches lanes accordingly. The worker acts inside a contract. The architecture review lane checks whether the route makes sense. Validation checks evidence, not just confidence. Later validation and close gates check the route context alert, manifest, and returned evidence before accepting the result.

The ability to reject matters. A good system must be allowed to say, "The test passed, but the work is not done."

For example: the focused test passed, but the route context hash did not match the worker contract. Or the code changed the right file, but the audit evidence did not prove the right role acted. Or the patch was correct, but the graph, meaning the project map, was now stale. Or runtime needed redeploy before anyone could claim the fix was live. Route context makes those checks explicit instead of hoping a model remembers them.

What The Observer Is For

The observer's advantage is not that it manages subagents. That framing makes it sound like a tiny project manager.

The observer's real advantage is global-state custody. Route context gives it something concrete to preserve: route identity, dirty scope, graph/current state, runtime status, backlog state, tests, close gate requirements, and follow-up work.

Workers should be local. That is their strength. A bounded worker should know its target files, acceptance criteria, blocked actions, focused tests, and required evidence. It should not need to carry the whole project in its head. When every worker receives the whole world, prompts get heavier and guarantees get weaker.

The observer keeps the larger surfaces connected through route context. Did this lane have permission to edit? Did it stay inside its file fence? Did the test prove the actual promise or just a nearby behavior? Did an independent reviewer inspect the same packet the worker produced? Did the runtime or graph need an update? Is there follow-up work outside the worker's scope?

These checks are not glamorous. They are what stop a green test from becoming a false finish.

Efficiency Without The Theater

The biggest improvement was not raw wall-clock speed.

Parallel lanes can help. Architecture review, implementation, and validation lanes expose different gaps earlier than one linear worker. But the real win was effective efficiency and quality: fewer locally correct patches that could not be honestly closed, less rework after review, and earlier discovery of missing evidence.

The system became calmer because it stopped treating "done locally" as "done globally." Some gates remain serial on purpose. Commit, runtime redeploy, graph reconcile, and backlog close either mutate shared state or assert that shared state is current. Those steps should not be casually parallelized just because multiple agents are available.

This is the correction to zero-orchestration-ish design. The goal is not to hide all orchestration. Hidden orchestration is unreliable because nobody can audit what happened. Route context keeps orchestration visible and minimal at the points where role, evidence, and shared state matter.

The Rule We Use Now

A tiny deterministic edit can be one agent plus a focused test. If route context says the blast radius is clear, the file ownership is obvious, and there is no permission, audit, graph, or runtime implication, keep it simple. Give the worker a tight contract and verify the behavior.

P0 and P1 work is different. So are routing, permission, audit, graph, and runtime tasks. Those need an observer, an architecture review lane, an implementation worker, and independent validation. Major decisions need adversarial lanes: separate expert packets, an independent review lane comparing evidence, and a final decision by the observer. Not because important work deserves ceremony, but because important work has more ways to be locally successful and globally wrong.
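Written as code, the rule is a small lookup from task facts to lanes. The sketch below is one hypothetical encoding: the `lanes_for` function, the lane labels, and the "P3" priority in the usage lines are illustrative assumptions, not identifiers from Aming Claw.

```python
from typing import FrozenSet, Iterable, List

# Concerns that rule out the single-worker path, as listed in the text above.
SENSITIVE_CONCERNS: FrozenSet[str] = frozenset(
    {"routing", "permission", "audit", "graph", "runtime"}
)


def lanes_for(priority: str, concerns: Iterable[str], major_decision: bool = False) -> List[str]:
    """Pick the lanes a task needs from its priority and the concerns it touches."""
    if major_decision:
        # Adversarial shape: separate expert packets, independent comparison,
        # then a final decision by the observer.
        return ["expert_packet_a", "expert_packet_b",
                "independent_review", "observer_final_decision"]
    if priority in {"P0", "P1"} or set(concerns) & SENSITIVE_CONCERNS:
        return ["observer", "architecture_review",
                "implementation_worker", "independent_validation"]
    # Tiny deterministic edit: one worker plus a focused test.
    return ["implementation_worker", "focused_test"]


print(lanes_for("P3", []))
print(lanes_for("P3", ["permission"]))
```

A low-priority task that touches permissions still gets the full set of lanes, because the routing depends on what the change can break, not only on how urgent it is.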

The checklist is compact:

Name the topology: what kind of job this really is.
Bind the worker contract: target files, acceptance criteria, and evidence.
List blocked actions: what this lane must not do.
Expose injected context: manifest the artifacts and hashes the lane saw.
Make gates reject false finishes: tests, validation, and close checks must be able to say no.

That is the practical lesson we got from dogfooding. More context made agents sound more informed. Route context made right context safer to act on.

For us, right context stopped being a prompt-writing aspiration when it became route context: a route that knows what matters now, a contract that bounds the worker, a manifest that shows what was injected, and gates that refuse to confuse a passing test with a finished job.

Top comments (2)

Harjot Singh • May 31

"Right context, not max context" is the realization that separates people who've actually run agents at scale from people who think bigger context windows solve everything. More context isn't better - it's more expensive, slower, and often worse (the model gets distracted by irrelevant material). Routing the right slice to the right agent at the right step is the actual hard problem, and it's where most of the cost AND quality lives.

This is the core of how Moonshift works under the hood - it's a multi-agent pipeline (prompt to a shipped SaaS on your own GitHub + Vercel) where each agent gets a scoped context relevant to its step rather than the whole project history, which is both why quality holds across a long build and why a full build is ~$3 flat (you're not re-paying for dragged-along context). First run's free, no card. Really aligned with how I think - how are you deciding what's "right" per agent: retrieval, static slicing by task type, or something learned? The selection policy is the part I find hardest to get right.

Aming • May 31

Thanks, and yes, “right context” is exactly the hard part.

For us, the key distinction is that “right” is not just static slicing or retrieval. We treat context selection as a routed, auditable contract.

The route is built from several system facts:

Graph: what part of the codebase/task structure is actually relevant
Backlog: what user intent and priority this work belongs to
Timeline: what already happened, failed, or was verified
Contract: what the agent is allowed to do
Gate: whether the context/prompt/action is allowed before dispatch

So the goal is not only “give each agent less context.” It is “give each agent the right context, at the right step, with evidence that the context was selected and bounded correctly.”

That is the part I think matters most: context should be scoped, but also inspectable and enforceable.