Your agent doesn't have a trust problem. It has an authority problem.

#agents #ai #architecture #security

When you let one agent act on behalf of another — accept a task, call a tool, spend a balance, hand work to a third — the question you instinctively reach for is can I trust it? That question has no good answer. You can't inspect your way to trust; a capable system that wants to misbehave will pass every inspection you can afford to run, and a benign one will still surprise you the first time it hits an input you didn't imagine. Trust-by-inspection is a treadmill.

The question that does have an answer is the other one: what can this thing do if it turns out I was wrong to trust it? That reframes the whole problem from inspection to bounding. You stop trying to certify the agent's intentions and start sizing its blast radius. Vetting becomes a property of the grant you issue, not a property of the thing you're granting to.

This is the right move, and almost everyone who makes it stops one step too early.

Scoping feels like the finish line

The standard answer to "bound the blast radius" is to scope the grant. Don't hand the delegate your whole authority — hand it the narrowest capability that covers the task. A token that can read one bucket, not the account. A grant that can settle one invoice, not move the treasury. If the delegate is compromised, the damage is capped at what you scoped, independent of what the delegate decides to do with it.

You can tighten this further by binding the grant to the specific request it was issued for. A scoped token that isn't bound to a request is just a shorter-lived skeleton key: the holder can replay it against a different target, or hand it sideways to someone who uses it for something you never authorized. Bind the grant to a hash of the request — this action, these arguments, this target — and "B holds a token" finally becomes "B holds permission to do this one thing." Add a nonce so an identical retry can't be replayed, and the freshness hole closes too.

At this point the design feels finished. Every grant is narrow, request-bound, fresh, and traces back to a signature from you, the root authority. A resource that receives one of these can check it locally: does this grant cover the request in front of me, and does the chain of signatures bottom out at the principal I actually trust? If both hold, honor it. If either fails, refuse. No middleman gets to be a trust sink; the resource trusts you, confirmed locally, and the delegation service in the middle is just a minting interface.

It's a clean model. And it has a gap precisely where it feels most airtight.

Reachability is not attenuation

Here is the property that local check actually verifies: some valid chain of grants, rooted in your signature, authorizes this request. Call that reachability — the action is reachable from your authority through a sequence of legitimate steps.

Here is the property you think you bought: that the authority exercised was the narrowest one that could do the job — the attenuated one you carefully scoped. Call that attenuation.

Those two are not the same property, and they come apart the moment a principal holds more than one grant rooted in you.

Walk it through. You delegate to B a narrow grant for one task. Separately — last week, for an unrelated job — you also signed B a broader grant. Both are real. Both trace back to you. Now B wants to do something the narrow grant wasn't meant to cover. B doesn't need to forge anything or escape its scope. It simply presents the broader grant. That grant covers the request. It traces to your signature. Every hop's local check passes cleanly. And the narrow, attenuating grant you thought B was operating under is never consulted — it was one of two doors, and B walked through the other one.

Nothing in "covers the request + traces back to A" can catch this, because nothing in that check is false. The resource sees one chain and verifies it. What it cannot see is B's whole wallet of grants — the alternate paths. Your attenuating step was load-bearing only if it sat on the unique path to the action. The instant a broader sibling grant exists, the narrow one is decorative: a constraint that constrains nothing, because the thing it was supposed to stop has another way around.

This is the same shape as a dead unit test that passes no matter what the code does. The grant looks like a control. It survives every check. But remove it and nothing changes, because the authority it was meant to gate is reachable without it. A bound you can route around is not a bound.

Bounding is about closing the alternate paths

Once you see it as reachability-vs-attenuation, the fix stops being "scope harder" — scoping a grant tighter does nothing if a looser grant sits beside it — and becomes "make sure the constraint is the only path."

Three moves do that.

Make grants non-substitutable across contexts. The reason B could swap one grant for another is that grants were interchangeable as long as they covered the request and traced to you. Break that. Bind each grant, at the moment it's minted, to its delegation context — its purpose, its intended audience, the task it belongs to. A grant issued for last week's job then simply doesn't cover this request, not because it's expired but because it's the wrong key for this door. Substitution stops being available, and the multiple paths collapse back into the one you intended.

Put the ceiling on the consumer's side, not the producer's. It's tempting to let the delegate declare its own scope — a manifest that says "this is all I need." But a self-declared bound is the bounded party describing its own limits, and an honest broadening of that declaration sails right through. If a delegate's manifest grows to include a shell tool on its next version, a runtime that enforces "only call what you declared" will faithfully allow the shell — the escalation was declared, not snuck in. The durable ceiling is the one the delegator sets for the role: what anything playing the "data-analysis" part may ever touch, fixed by your intent and independent of what any version of the delegate asks for. Then a request for shell is refused because the role never had it, no matter how the delegate describes itself.

Pin the bound to the bytes, not the name. Tie "this grant is approved" to a content hash of exactly what was approved — the request, the scope, the context — rather than to an identifier that survives edits. Now any change at all breaks the match and fails closed. Re-validation stops being a thing you have to remember to do on every update; it happens automatically, because a changed grant is a different grant and has to earn approval again.

The principle

A delegated authority is bounded only when every path that reaches the action passes through the constraint. Not when the grant looks narrow. Not when it traces back to you. Not when each hop checks out locally. Those are all properties of a single chain, and bounding is a property of the whole graph of chains the delegate could present.

That's why "can I trust this agent" is the wrong question and "what can it do if I'm wrong" is the right one — but only if you take the second question all the way. Sizing the blast radius means more than scoping the grant in front of you. It means proving there's no other grant, no looser sibling, no substitutable key, no un-pinned name, that reaches the same action by a path your careful constraint never touches. Close those, and the narrow grant finally means what you wanted it to mean. Leave one open, and you didn't bound the authority — you just described it, while the agent quietly kept the power you thought you took back.