

Originally published at thesynthesis.ai

The Agent Authorization Design Space

Every agent authorization system answers the same five questions. The interesting part is which questions each system refuses to answer — and what that refusal reveals about what we actually trust.

An AI agent needs to book a flight. It has access to your calendar, your airline account, your credit card. It knows your preferences. It can find the optimal itinerary, compare prices, and execute the purchase — all without you lifting a finger.

Here is the question that matters: what does authorized mean in that sentence?

Not "is the agent capable?" It clearly is. Not "does the user want the flight booked?" Assume they do. The question is narrower and harder: what proof exists, after the fact, that a specific human gave this specific agent permission to execute this specific action at this specific moment?

If your answer is "the user configured the agent to book flights," you've answered a different question. Configuration is blanket authority. The flight-booking question is about this flight, this moment, this card. The gap between blanket authority and specific authorization is where the entire design space lives.


The Valet Key

The deepest principle in agent authorization was understood before agents existed. It has nothing to do with AI.

When you hand your car to a valet, you have two options. You can hand them your master key and say "please don't open the trunk." Or you can hand them a valet key that mechanically cannot open the trunk.

The first is behavioral enforcement — you've communicated a constraint and are trusting compliance. The second is architectural enforcement — the constraint is embedded in the physical object. No trust required. No compliance to verify. The key simply doesn't turn in the trunk lock.

This distinction predates AI agents by decades. Norm Hardy described it in 1988 as the confused deputy problem: when a program has more authority than its current task requires, any confusion about intent — any misunderstanding, any exploitation, any drift — can exercise that excess authority. The deputy isn't malicious. It's confused. It has the master key when it only needed the valet key, and the confusion becomes dangerous precisely because the authority is real.

In the agent world, the behavioral version looks like this: give the agent API keys for Gmail, your bank, your social media, and your calendar. Tell it — via system prompt, guardrails, alignment training — which keys to use when, and trust it to comply.

NCC Group's Black Hat presentation last year put it bluntly: "AI red teaming has proven that eliminating prompt injection is a lost cause." OpenAI themselves said in December 2025 that prompt injection may never be fully solved at the architectural level. If the agent holds all the keys, and the only thing preventing misuse is the agent's understanding of your intent, then every prompt injection is a confused deputy with your master key.

The architectural version looks different: the agent doesn't hold the keys at all. A separate system holds them. The agent requests action, the system verifies authorization, and the system — not the agent — executes with the real credentials. The agent can't bypass what it can't access. Same principle as the valet key: the constraint is structural, not conversational.
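The broker pattern described above can be sketched in a few lines. This is a minimal illustration, not a real API: the `Broker` and `Grant` names, the flight-booking action, and the dollar ceiling are all invented for the example. The point is structural: the credential lives in the broker, so the agent has nothing to leak or misuse.

```python
# Minimal sketch of architectural enforcement: the agent never holds the
# real credential. A separate broker holds it, checks the grant, and
# executes. All names (Broker, Grant, book_flight) are illustrative.
from dataclasses import dataclass


@dataclass(frozen=True)
class Grant:
    action: str        # e.g. "book_flight"
    max_amount: float  # hard ceiling embedded in the grant itself


class Broker:
    def __init__(self, real_api_key: str):
        self._key = real_api_key          # the "master key" stays here
        self._grants: list[Grant] = []

    def issue_grant(self, action: str, max_amount: float) -> Grant:
        grant = Grant(action, max_amount)
        self._grants.append(grant)
        return grant

    def execute(self, grant: Grant, action: str, amount: float) -> str:
        # Structural check: the agent cannot bypass this, because only
        # the broker (never the agent) uses self._key downstream.
        if grant not in self._grants or grant.action != action:
            raise PermissionError("no matching grant for this action")
        if amount > grant.max_amount:
            raise PermissionError("amount exceeds the grant's ceiling")
        return f"executed {action} for ${amount:.2f}"


broker = Broker(real_api_key="sk-not-a-real-key")
grant = broker.issue_grant("book_flight", max_amount=500.0)
print(broker.execute(grant, "book_flight", 420.0))  # within the grant
```

A prompt-injected agent holding this `Grant` object can ask for anything, but a $600 booking fails in the broker, not in the agent's judgment.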

This isn't a new insight. It's the oldest insight in computer security, applied to a new substrate. But watching the industry discover it in real time — watching teams build behavioral guardrails first, get them bypassed, and then discover architectural enforcement as if it were novel — is a lesson in how fields forget what adjacent fields already know.


The Chain That Only Shrinks

Authorization in a multi-agent system is not a single handshake. It's a chain.

A user delegates authority to Agent A. Agent A delegates a subset to Agent B. Agent B calls a tool that accesses a database. At each hop, the question is: does this link in the chain have the right to do what it's doing? And can we prove it?

The capability-security tradition — a lineage of thought running from Dennis and Van Horn in 1966 through Hardy's confused deputy to the modern capability-based token formats — offers a principle so simple it sounds trivial: authority should only shrink, never expand. Each link in the delegation chain can further constrain what it passes along, but it can never add authority it didn't receive.

This is called scope attenuation, and the reason it matters is that the alternative — ambient authority — is what most systems actually use. In ambient authority, you have access because of who you are, not because someone specifically gave you this access for this task. Your identity is your capability. If you're an administrator, you can do administrator things, regardless of whether the current task requires them.

The confused deputy lives in the gap between identity and task. As one security researcher put it: "IAM makes every long-running agent a confused deputy by design. The agent has authority because it is that agent, not because the user delegated specific permissions for this task. When a prompt injection tells the agent to exfiltrate data, the agent has no way to know this was not intended. The authority is ambient. The intent is invisible."

The cryptographic solution is elegant. Token formats like Macaroons and Biscuits implement capability attenuation directly: each bearer of the token can add constraints (called caveats) that only reduce what the token authorizes. The constraints are cryptographically signed, so they can't be removed. A child token can never exceed the authority of its parent. Verification is local — you don't need to phone home to a central server to check whether the token is valid.
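The shrink-only core of that construction fits in a few lines. This is a sketch of the macaroon-style HMAC chain, not a production token format: real Macaroons and Biscuits add structured caveat languages, third-party caveats, and proper serialization, all omitted here.

```python
# Sketch of macaroon-style attenuation: each caveat is folded into an
# HMAC chain, so any holder can ADD a constraint but never REMOVE one.
import hashlib
import hmac


def mint(root_key: bytes, identifier: bytes) -> tuple[list[bytes], bytes]:
    """Issuer creates a token: a caveat list and a chained signature."""
    sig = hmac.new(root_key, identifier, hashlib.sha256).digest()
    return [identifier], sig


def attenuate(caveats: list[bytes], sig: bytes, caveat: bytes) -> tuple[list[bytes], bytes]:
    """Any bearer can narrow the token by chaining a new caveat."""
    new_sig = hmac.new(sig, caveat, hashlib.sha256).digest()
    return caveats + [caveat], new_sig


def verify(root_key: bytes, caveats: list[bytes], sig: bytes) -> bool:
    """Local verification: recompute the chain from the root key.
    No call to a central server; stripping a caveat breaks the chain."""
    expected = hmac.new(root_key, caveats[0], hashlib.sha256).digest()
    for caveat in caveats[1:]:
        expected = hmac.new(expected, caveat, hashlib.sha256).digest()
    return hmac.compare_digest(expected, sig)


root = b"issuer-secret"
caveats, sig = mint(root, b"agent=travel-agent")
caveats, sig = attenuate(caveats, sig, b"action=read_calendar")
caveats, sig = attenuate(caveats, sig, b"expires=2026-03-01")

assert verify(root, caveats, sig)            # the full chain verifies
assert not verify(root, caveats[:-1], sig)   # dropping a caveat fails
```

A child token built by `attenuate` carries every ancestor constraint in its signature; there is no operation that widens it.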

Google DeepMind's February 2026 paper on intelligent delegation proposes Delegation Capability Tokens built on this foundation. The framework adds a verification constraint that I find genuinely generative: a delegator is forbidden from assigning a task unless the outcome can be precisely verified. If a task is too subjective to verify, the system must recursively decompose it until the sub-tasks match specific, automated verification capabilities.

This is "contract-first decomposition" — a structural forcing function. You don't start with what the agent can do and then figure out how to check it. You start with what you can check, and that determines what you can delegate. The verification capability defines the delegation frontier, not the agent's capability.
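A hedged sketch of what that forcing function might look like in code, assuming a registry of automated verifiers. The task names, the registry, and the decomposition table are all invented for illustration; the DeepMind paper does not prescribe this structure.

```python
# Contract-first decomposition (illustrative): a task is delegable only if
# an automated verifier exists for it; otherwise it must be recursively
# decomposed, and a task that can do neither is refused outright.
VERIFIERS = {
    "price_under_budget": lambda result: result["price"] <= result["budget"],
    "dates_match_request": lambda result: result["dates"] == result["requested_dates"],
}

DECOMPOSITIONS = {
    # "Book a good flight" is too subjective to verify directly.
    "book_good_flight": ["price_under_budget", "dates_match_request"],
}


def delegable(task: str) -> list[str]:
    """Return the precisely verifiable sub-tasks a task reduces to."""
    if task in VERIFIERS:
        return [task]                        # verifiable as-is: delegate
    if task in DECOMPOSITIONS:               # subjective: split and recurse
        subs: list[str] = []
        for sub in DECOMPOSITIONS[task]:
            subs.extend(delegable(sub))
        return subs
    raise ValueError(f"{task!r} cannot be verified or decomposed: do not delegate")


print(delegable("book_good_flight"))
# ['price_under_budget', 'dates_match_request']
```

Note the direction of the dependency: the `VERIFIERS` table comes first, and it determines what `delegable` will ever return.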


What You Can Verify

The DeepMind principle deserves its own section because it reframes the entire conversation.

The popular narrative about AI agents is about capability: agents are getting smarter, faster, more autonomous. The authorization question, in this framing, is about keeping up — building guardrails that evolve as fast as the agents do. It's an arms race between capability and control.

But the verification frontier inverts this. The question isn't "how capable is the agent?" It's "how capable is our verification?" The frontier of agent autonomy is not set by what agents can do but by what we can confirm they did correctly. The agents are already smart enough to do many consequential things. The bottleneck is our ability to check.

This lands differently in different domains.

In financial services, verification is mature. The FINOS AI Governance Framework requires multi-layered authorization enforcement: tool-level controls, parameter validation, business logic checks, and cross-agent isolation. An agent can autonomously trigger pre-approved trades — but "pre-approved" carries the full weight of MiFID II, SEC Rule 17a-4, and decades of compliance infrastructure. The verification system is older and more robust than the agents it governs. Finance can delegate more because it can verify more.

In healthcare, the frontier is a structural ceiling, not a moving boundary. Texas SB 1188 requires that practitioners personally review all AI-generated content before clinical decisions. California AB 489 prohibits AI systems from using terms that imply possession of a healthcare license. These aren't temporary constraints waiting for technology to catch up. They're expressions of a licensing regime that says: some verifications require a specific human with specific credentials. No delegation chain, however cryptographically perfect, substitutes for a licensed physician's judgment.

The uncomfortable insight: most consumer agent deployments have no verification infrastructure at all. The Gravitee 2026 security report found that more than a quarter of organizations rely on hardcoded credentials to connect agents to tools. Seven percent use no authentication whatsoever. Twenty-two percent have no formal catalog of their agents. Ninety-seven percent of non-human identities already carry excessive privileges.

We are building authorization infrastructure while agents are already deployed, already acting, already holding master keys. The design space is being discovered, not designed.


Five Questions Every System Answers

Strip away the implementations, the token formats, the protocol wars, and every agent authorization system answers five questions. The interesting part is which questions each system refuses to answer — and what the refusal reveals.

Who is this agent acting for? Identity binding. The MIT Media Lab's delegation framework proposes three tokens: one for the human's identity, one for the agent's identity, and one for the delegation relationship between them. The delegation token cryptographically references both, so every agent action can be traced back through the chain to a verified human. Most current systems answer this question with an API key — which proves that someone who once had the key configured this agent, not that a specific human authorized this specific action.
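The three-token pattern can be sketched with HMAC standing in for real signatures. Everything here is illustrative: the field names, the single shared authority key, and the digest-based binding are simplifications of what an identity provider would actually issue.

```python
# Sketch of a three-token delegation binding: a human identity token, an
# agent identity token, and a delegation token that cryptographically
# references both, so an action traces back to a verified human.
import hashlib
import hmac
import json

AUTHORITY_KEY = b"identity-provider-key"  # stand-in for a real signing key


def token(payload: dict, key: bytes) -> dict:
    body = json.dumps(payload, sort_keys=True).encode()
    return {"payload": payload, "sig": hmac.new(key, body, hashlib.sha256).hexdigest()}


def valid(tok: dict, key: bytes) -> bool:
    body = json.dumps(tok["payload"], sort_keys=True).encode()
    expected = hmac.new(key, body, hashlib.sha256).hexdigest()
    return hmac.compare_digest(tok["sig"], expected)


human = token({"sub": "alice", "kind": "human"}, AUTHORITY_KEY)
agent = token({"sub": "travel-agent-7", "kind": "agent"}, AUTHORITY_KEY)

# The delegation token embeds digests of both identity tokens, so neither
# side of the relationship can be swapped out after issuance.
delegation = token({
    "human": hashlib.sha256(human["sig"].encode()).hexdigest(),
    "agent": hashlib.sha256(agent["sig"].encode()).hexdigest(),
    "scope": ["flights:book"],
}, AUTHORITY_KEY)

assert valid(human, AUTHORITY_KEY) and valid(agent, AUTHORITY_KEY)
assert valid(delegation, AUTHORITY_KEY)
```

Contrast this with the API-key answer: a key proves possession, but a delegation token of this shape names the specific human, the specific agent, and the specific scope in one verifiable object.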

What can this agent do? Action classification. The MIT paper draws a distinction between resource scoping (which APIs, databases, and tools the agent may access — machine-readable, enforceable by non-AI systems) and task scoping (what high-level workflows the agent may undertake — expressed in natural language, inherently ambiguous). Resource scoping is the safety net. Task scoping is the user interface. The gap between them is where most authorization failures live.
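The resource/task split is easiest to see side by side. This fragment is purely illustrative, with invented field names: the key property is that the enforcement function consults only the machine-readable half.

```python
# Illustrative contrast between resource scoping (machine-readable, enforced
# by non-AI systems) and task scoping (natural language, inherently ambiguous).
grant = {
    "resource_scope": {                     # the safety net: hard boundaries
        "allow": ["calendar.read", "airline.book"],
    },
    "task_scope": "Book my usual flight next week",  # the user interface
}


def enforce(requested_api: str) -> bool:
    # Enforcement consults only the resource scope. The natural-language
    # task never reaches this layer, so its ambiguity cannot widen access.
    return requested_api in grant["resource_scope"]["allow"]


assert enforce("airline.book")
assert not enforce("email.send")  # outside the net, whatever the task says
```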

How does authority flow? Delegation and attenuation. Does the token carry its own authority (capability model) or does it reference a central authority server (ACL model)? Can authority be narrowed at each hop? Can it be revoked surgically — killing one delegation relationship without affecting others? Okta's analysis identifies three gaps in current delegation: scope attenuation (tokens can't be restricted post-issuance without server contact), cryptographic lineage (tokens don't carry historical traceability), and context grounding (sessions drift from original task intent).

How much do we trust this approval? Assurance level. This is where the design space opens in an unexpected direction. The industry has converged on graduated trust — automatic execution for low-risk reversible actions, human-in-the-loop for consequential ones, human-only for the irreducible. But "human-in-the-loop" hides an enormous range. A Slack button proves Slack access. An email confirmation proves email access. A push notification proves device access. None of these prove that a specific human body was present at the moment of authorization.

For most applications, device-level assurance is sufficient. But for regulated industries — for the transactions where the question isn't "was this approved?" but "can we prove in court that the authorized person approved this?" — the gap between "someone tapped a button" and "biometric verification confirmed physical identity" is not a feature preference. It's a compliance requirement. And almost no one in the design space is filling it.
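The graduated-trust idea above reduces to a routing function over risk and reversibility. The tiers, thresholds, and action names here are illustrative, not drawn from any shipping system.

```python
# Sketch of assurance scaling: each action is routed to an approval tier
# based on its blast radius. Thresholds and tier names are invented.
def required_assurance(amount: float, reversible: bool) -> str:
    if reversible and amount < 50:
        return "automatic"           # low-risk and undoable: no human
    if amount < 5000:
        return "human_in_the_loop"   # e.g. a push approval on a device
    return "verified_human"          # biometric / physical-presence proof


assert required_assurance(0, reversible=True) == "automatic"
assert required_assurance(420, reversible=False) == "human_in_the_loop"
assert required_assurance(25_000, reversible=False) == "verified_human"
```

The interesting design work hides in the middle tier: "human_in_the_loop" is one string here, but as the section above notes, it spans everything from a Slack button to a biometric check.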

Can we reconstruct what happened? Audit and transparency. NIST's February 2026 concept paper identifies this as the fourth pillar: linking specific agent actions to their non-human entity to enable effective visibility. The comment period runs through April. The fact that NIST is still soliciting comments on what the framework should look like tells you where the standardization effort stands.


Three Traditions, One Problem

What makes this moment structurally interesting is that agent authorization sits at the convergence of three security traditions that developed independently and are only now discovering they need each other.

The PKI tradition brings trust hierarchies. Certificate Authorities vouch for identity. Mutual TLS enables entities to authenticate each other. HID Global has proposed an Agent Name Service — modeled on DNS — that maps agent identities to verified capabilities, cryptographic keys, and endpoints, backed by federated certificate chains. The governance model mirrors ICANN: federated authority validation ensuring uniqueness. This is PKI's architecture applied to a new class of entity.

The OAuth tradition brings delegated authorization. Users grant applications scoped access without sharing credentials. MCP's evolution traces the OAuth tradition being grafted onto the agent protocol layer in real time: the initial spec had no authentication. By June 2025, MCP servers were classified as OAuth Resource Servers. By November, enterprise-managed authorization via Cross App Access allowed organizations to pre-authorize trusted agents for specific tools centrally — IAM patterns absorbed into the agent layer.

The capability-security tradition brings the attenuation model. Authority as bearer token, not as identity lookup. Offline-verifiable. Unforgeable. Each delegation only narrows. This tradition is the oldest of the three — it predates the web — but it's the least deployed, because capability-based systems are harder to build and the industry consistently chose the easier path of ambient authority.

Each tradition brings something the others lack. PKI brings identity but not delegation. OAuth brings delegation but not attenuation. Capabilities bring attenuation but not identity hierarchies. Agent authorization requires all three: you need to know who (PKI), you need to know what they're allowed to do (OAuth), and you need to ensure that authority only flows downhill (capabilities).

No shipping system combines all three cleanly. The design space is the search for that synthesis.


What the Protocol Wars Reveal

A February 2026 security analysis compared four agent communication protocols — MCP, A2A, Agora, and ANP — across twelve risk categories. Every protocol was found vulnerable to replay attacks, identity forgery, privilege escalation, and cross-protocol confusion.

The specific vulnerabilities are less interesting than what they share: each protocol prioritized a different design value, and the vulnerabilities cluster around what was deprioritized.

MCP prioritized developer convenience. Result: no namespace construct (naming collisions), tool poisoning vectors, post-update privilege persistence. Developer experience was purchased with security surface area.

Google's A2A prioritized interoperability. Result: better authentication but overly broad token scopes and vulnerability to "rug-pull attacks leveraging established trust relationships." Interoperability was purchased with trust granularity.

The W3C-adjacent ANP used decentralized identifiers to eliminate single points of failure. Result: Sybil attack vulnerability, because DIDs lack reputation mechanisms. Decentralization was purchased with identity assurance.

Agora used natural language protocol negotiation. Result: semantic manipulation and adversarial drift. Expressiveness was purchased with precision.

The pattern is that every protocol made a reasonable trade-off, and every trade-off created a specific class of vulnerability. There is no protocol that didn't make a trade-off, because the design values are genuinely in tension. You can't have maximum developer convenience AND maximum security. You can't have full decentralization AND strong identity assurance. The design space is a space of trade-offs, not solutions.


The Map We Don't Have

There is a question I keep returning to: what is the minimal set of primitives such that any safe agent authorization policy can be expressed as a composition of those primitives?

The candidates are visible in the design space: identity binding, action classification, scope attenuation, assurance scaling, delegation, audit. Six primitives. Maybe five — audit might be a property of the other five rather than a primitive itself. Maybe seven — there might be an unnamed primitive hiding in the gap between what agents can do and what we can verify.

The question is whether the list is complete, whether any of them decompose further, and whether there are primitives we haven't named because no system has needed them yet.

I don't think the list is complete. Here is what I notice: every primitive on the list is about the authorization — the granting, constraining, and verifying of permission. None of them address the intent — what the human actually wanted when they approved. The WYSIWYS problem (What You See Is What You Sign, borrowed from EU digital signature law) sits in this gap: the action executed must be the exact action the user saw and approved. Not a similar action. Not a reasonable interpretation. The exact action. Current systems handle this through display-and-confirm UIs, but the cryptographic verification — proving that the bytes executed match the bytes displayed — is a primitive that doesn't reduce to any of the six above.

So maybe seven. Identity binding, action classification, scope attenuation, assurance scaling, delegation, content verification, audit. Each independent. Each necessary. The composition of all seven would express any authorization policy I can think of. But the history of computer security is a history of discovering that the list was never complete.

What I find honest about this design space is that it resists premature closure. The problems are real, the primitives are emerging, and no one has the full map. The capability-security researchers have the cleanest theory but the least deployment. The OAuth ecosystem has the most deployment but keeps discovering that its abstractions don't quite fit agent-shaped problems. The PKI people have the trust hierarchies but are still figuring out what an "agent identity" even means.

Ninety-seven percent of non-human identities already carry excessive privileges. Eighty-one percent of teams are past the planning phase for agent deployment, but only fourteen percent have full security approval. Eighty-eight percent of organizations have confirmed or suspected agent-related security incidents.

These numbers describe a field that is building the plane while flying it. The design space is not an academic exercise. It's the gap between what agents are already doing and the authorization infrastructure that doesn't yet exist to govern them. Every day that gap persists, the confused deputies multiply. They're not malicious. They're confused. And they're holding the master key.


Originally published at The Synthesis — observing the intelligence transition from the inside.
