curi0us_dev

Posted on Jul 2

Building Runtime Authorization Boundaries for AI Agents and MCP Tools

#agents #ai

Five Key Takeaways

AI agents should not be the final authority for deciding whether their own tool calls are allowed, because authorization must be enforced outside the model's reasoning loop.
Tool discovery, tool invocation, and tool argument authorization are separate control points, and treating them as one generic guardrail creates gaps in production systems.
A useful agent authorization request must include the originating user, the acting agent, the requested tool, the action, the target resource, the tool arguments, delegated authority, and trusted runtime context; the policy decision point should return the policy identifier and version used for the decision.
Agent audit logs should explain why an action was allowed or denied, not only that a token was valid or that a tool was called.
Teams can adopt runtime authorization incrementally by starting with high-risk tools, running policy checks in audit-only mode, and then enforcing fail-closed controls on actions that can change data, move money, or expose sensitive information.

Abstract

AI agents are moving from chat interfaces into systems where they call models, inspect files, query databases, retrieve documents, create pull requests, trigger workflows, and act on behalf of users. This changes the security model. The main question is no longer only "Who is the user?" or "Is this API key valid?" The harder runtime question is: should this specific agent, acting for this specific user, be allowed to perform this specific tool call with these specific arguments right now?

This article proposes a practical runtime authorization pattern for AI agents and MCP-style tools. It separates model access, tool discovery, tool invocation, argument-level checks, delegated user context, and audit records into explicit control points. The goal is not to replace authentication, OAuth, MCP authorization, or gateway controls. It is to add an independently evaluated policy check at the moment when model output becomes a real action.

Why Agent Access Control Is Different

Traditional application authorization usually starts with a user or service identity. A human clicks a button, a backend receives a request, and the application checks whether that principal can perform an action on a resource. The decision may use roles, attributes, relationships, tenancy, ownership, time, device posture, or other context. Even when the implementation is complex, the request shape is familiar: principal, action, resource, context.

AI agent systems disturb this model in several ways.

First, the agent is not simply a user. It may act on behalf of a user, a team, a tenant, or an automated workflow. It may also have its own identity, configuration, model access, tool access, and runtime state. A support agent and a database maintenance agent should not be treated as the same actor just because the same employee started both sessions.

Second, the action may be selected dynamically. In a normal UI, a developer decides which button maps to which backend action. In an agent workflow, the model may choose a tool based on a prompt, retrieved context, previous tool output, or a plan generated several steps earlier. The tool call is still a software action, but the path to that action is probabilistic.

Third, tool arguments matter as much as the tool name. Allowing a refund_customer tool does not mean every refund amount, customer account, ticket, payment method, and region should be allowed. A tool allowlist is useful, but it is a coarse control. Production authorization usually needs to bind the invocation arguments to the caller's permissions.

Fourth, the same agent may operate across multiple systems. A coding agent may read files, run tests, query a local database, call a CI API, and open a pull request. A support agent may read CRM data, search logs, summarize a ticket, and trigger a refund workflow. A data agent may query a warehouse, retrieve embeddings, and call a reporting API. Each hop can change the risk profile.

The Model Context Protocol (MCP) is one reason this topic has become urgent. MCP provides a standard way for LLM applications to connect with external data sources and tools, and its specification describes hosts, clients, and servers as the protocol participants. In practice, this makes tool integration easier and more composable, but it also makes authorization boundaries more important because the agent can reach more capabilities through a common interface. This article references the MCP 2025-11-25 specification, the latest published version at the time of writing, and its authorization section for the protocol-level background.

OWASP describes a related risk as Excessive Agency: damaging actions can occur when an LLM-based system has too much functionality, too many permissions, or too much autonomy. The examples include agents with unnecessary tools, extensions with excessive downstream permissions, and high-impact actions executed without independent verification. That framing is useful because it shifts the conversation from "Was the prompt safe?" to "Was the system allowed to do too much?" See OWASP LLM06:2025 Excessive Agency for the broader risk category.

Where Common Controls Stop Short

Most teams do not deploy agents with no controls at all. They usually have several layers already:

authentication for the user or client;
API keys or OAuth tokens for model and tool access;
prompt instructions that tell the agent what not to do;
static lists of tools available to a given agent;
gateway routing rules for model access and cost control;
logs of model requests and tool calls;
human approval for some workflows.

These controls are valuable. The problem is that none of them, by itself, answers the runtime authorization question.

A prompt can instruct an agent not to modify production data, but a prompt is not an authorization boundary. It may be ignored, contradicted by other context, weakened by prompt injection, or bypassed by an unexpected tool path. A static tool list can reduce the attack surface, but it does not decide whether this particular invocation is allowed. A token can prove that a client authenticated successfully, but it may not encode the full policy needed for the action. A gateway can route model calls and track spend, but model routing is not the same as data authorization. Logs are useful after the fact, but they do not prevent the action.

The missing layer is an explicit policy decision between the model-generated intention and the real-world effect.

That decision should not ask, "Did the model think this was allowed?" It should ask, "According to policy, is this principal allowed to perform this action on this resource with these arguments in this context?"

Separate the Control Points

A common mistake is to talk about "agent guardrails" as if one layer can cover every access decision. In practice, agent authorization has several distinct control points.

1. Model access

The first question is which models a user, agent, or workflow may call. This may depend on cost, data sensitivity, region, model capability, or approved use case. For example, an internal documentation assistant may be allowed to call a low-cost general model, while a production incident agent may be allowed to call a stronger model only during an active incident.

This decision belongs near the model gateway or model router. It is not enough to issue one shared key for all agents and all models.

2. Tool discovery

The second question is which tools the model should be able to see. If the model never sees a high-risk tool, it is less likely to select it. For example, a ticket summarization agent does not need to discover delete_customer, issue_refund, or rotate_database_credentials.

Tool discovery is a useful risk-reduction layer. It controls what the model is likely to choose, not what the surrounding system is able to execute. It does not replace authorization at the MCP server or downstream API: gateway and server configurations can drift, a server can be reused by another host or workflow, and the underlying API may remain reachable outside the agent gateway. Every invocation still needs authorization at the enforcement point that performs the action.
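A minimal sketch of discovery filtering, assuming a hypothetical in-memory registry. `ToolSpec`, `REGISTRY`, `summarize_ticket`, and the agent type names are illustrative and are not part of MCP or any gateway product. The filter only shapes what the model sees; each invocation still needs its own authorization check.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ToolSpec:
    """Hypothetical registry entry; field names are illustrative."""
    name: str
    allowed_agent_types: frozenset


# Assumed registry contents; some tool names are reused from the text above.
REGISTRY = [
    ToolSpec("summarize_ticket", frozenset({"ticket_summarizer", "customer_support"})),
    ToolSpec("search_customer_records", frozenset({"customer_support"})),
    ToolSpec("issue_refund", frozenset({"customer_support"})),
    ToolSpec("delete_customer", frozenset()),  # never advertised to any agent
]


def visible_tools(agent_type: str, registry=REGISTRY) -> list:
    """Return the tool names that may be advertised to this agent type."""
    return [tool.name for tool in registry if agent_type in tool.allowed_agent_types]


print(visible_tools("ticket_summarizer"))  # ['summarize_ticket']
```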

3. Tool invocation

The third question is whether the tool may be invoked at all. A support agent may be allowed to call search_customer_records, while a sales assistant may not. An engineering agent may call read_logs, but not restart_service unless the call is tied to an open incident ticket.

This is usually where teams first introduce policy-as-code. It is also where many implementations stop too early.

4. Argument-level authorization

The fourth question is whether the specific arguments are allowed. This is often the most important check.

Consider a refund_customer tool:

{
  "tool": "refund_customer",
  "arguments": {
    "ticket_id": "T-10029",
    "customer_id": "C-8821",
    "amount_cents": 200000,
    "reason": "billing complaint"
  }
}

A role check might say that support agents can issue refunds. A runtime policy should ask more precise questions:

Is this ticket assigned to the user or their team?
Is the customer in the user's tenant or region?
Is the amount below the user's approval limit?
Has the user already confirmed the action?
Is this customer account under review or locked?
Has this workflow exceeded a daily refund threshold?

The difference between allowing the tool and allowing the arguments is the difference between a demo and a production control.
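The questions above can be sketched as explicit checks. This is an illustration under assumptions, not a complete policy: the `trusted` field names (`assigned_ticket_ids`, `daily_refund_cap_cents`, and so on) are hypothetical, and in a real system the gateway would look those values up rather than accept them from the model.

```python
def check_refund_arguments(arguments: dict, trusted: dict) -> list:
    """Return the failed checks; an empty list means the arguments pass.

    `trusted` holds values resolved by the gateway (assumed field names),
    never values supplied by the model.
    """
    failures = []
    if arguments["ticket_id"] not in trusted["assigned_ticket_ids"]:
        failures.append("ticket is not assigned to the user or their team")
    if trusted["customer_tenant"] != trusted["user_tenant"]:
        failures.append("customer is outside the user's tenant")
    if arguments["amount_cents"] > trusted["refund_limit_cents"]:
        failures.append("amount exceeds the user's approval limit")
    if not trusted["human_confirmed"]:
        failures.append("user has not confirmed the action")
    if trusted["account_locked"]:
        failures.append("customer account is under review or locked")
    spent = trusted["refunded_today_cents"] + arguments["amount_cents"]
    if spent > trusted["daily_refund_cap_cents"]:
        failures.append("workflow would exceed the daily refund threshold")
    return failures


# The tool call from the example above, with assumed trusted context.
tool_call_arguments = {
    "ticket_id": "T-10029",
    "customer_id": "C-8821",
    "amount_cents": 200000,
    "reason": "billing complaint",
}
trusted_context = {
    "assigned_ticket_ids": {"T-10029"},
    "customer_tenant": "tenant_acme",
    "user_tenant": "tenant_acme",
    "refund_limit_cents": 100000,
    "human_confirmed": False,
    "account_locked": False,
    "refunded_today_cents": 0,
    "daily_refund_cap_cents": 500000,
}
for failure in check_refund_arguments(tool_call_arguments, trusted_context):
    print("DENY:", failure)
```

With these assumed values the call fails two checks (approval limit and confirmation), even though a role check alone would have allowed the tool.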

5. Downstream resource access

The fifth question is whether the downstream system should accept the action. The gateway may deny unsafe calls before they leave the agent layer, but the API behind the tool should still enforce its own authorization. Defense in depth matters because agent stacks are composable and often change faster than ordinary application stacks.

An MCP server that exposes internal APIs should not assume that every connected client has already done all required checks. A policy decision at the gateway reduces risk, but the tool server should still validate identity, permissions, and resource scope.

6. Audit and explanation

The final control point is evidence. If an agent tried to delete a file, run a SQL statement, issue a refund, or retrieve customer documents, the team needs to know more than "a tool was called." They need to know who initiated the session, which agent acted, which tool and arguments were used, which policy version made the decision, and why the request was allowed or denied.

Without this, authorization becomes difficult to debug and almost impossible to improve.

The Runtime Authorization Request

The core implementation detail is the authorization request contract. It should be explicit and stable enough that different agents, gateways, and tool servers can produce the same kind of request.

A minimal version might look like this:

{
  "principal": {
    "type": "user",
    "id": "user_123",
    "roles": ["support_agent"],
    "tenant_id": "tenant_acme",
    "region": "eu",
    "refund_limit_cents": 100000
  },
  "agent": {
    "id": "support-agent-v3",
    "type": "customer_support",
    "environment": "production"
  },
  "action": "tools.call",
  "resource": {
    "type": "mcp_tool",
    "name": "refund_customer",
    "server": "billing-mcp"
  },
  "context": {
    "session_id": "sess_789",
    "assigned_ticket_id": "T-10029",
    "purpose": "customer_support",
    "human_confirmed": false
  },
  "arguments": {
    "ticket_id": "T-10029",
    "customer_id": "C-8821",
    "amount_cents": 200000,
    "reason": "billing complaint"
  }
}

A production version may include more data: device posture, workload identity, network zone, data classification, approval claims, incident status, resource ownership, relationship tuples, token audience, and delegation chain. The important point is that the model should not invent this context. The gateway or tool server should derive it from trusted sources: identity provider claims, application state, resource metadata, ticketing systems, policy context services, or the authenticated session.

Policy selection is different from request context. The agent and session should not submit a policy_version field that tells the evaluator which policy to use. The gateway or policy decision point selects the deployed policy according to its own trusted configuration, then returns decision metadata such as the matched policy identifier and policy version. Those values belong in the response and audit log, where they record what actually evaluated the request.
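A toy decision point can illustrate that split. Everything here is an assumption made for the sketch: the `Decision` shape, the `DEPLOYED_POLICY` configuration, and the choice to reject (rather than silently ignore) a caller-supplied policy selector. The single limit check stands in for real policy evaluation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Decision:
    effect: str
    reason: str
    policy_id: str
    policy_version: str


# Assumed deployment config owned by the decision point, not by callers.
DEPLOYED_POLICY = {
    "id": "refund-requires-limit-and-confirmation",
    "version": "2026.06.18.3",
}


def evaluate(request: dict, deployed: dict = DEPLOYED_POLICY) -> Decision:
    """Pick the policy from trusted config and report which one was used."""
    pid, version = deployed["id"], deployed["version"]
    if "policy_version" in request or "policy_id" in request:
        # The caller tried to steer policy selection; refuse rather than obey.
        return Decision("DENY", "caller-supplied policy selector rejected", pid, version)
    limit = request["principal"]["refund_limit_cents"]
    amount = request["arguments"]["amount_cents"]
    if amount > limit:
        return Decision("DENY", "refund amount exceeds principal limit", pid, version)
    return Decision("ALLOW", "refund amount within principal limit", pid, version)


request = {
    "principal": {"id": "user_123", "refund_limit_cents": 100000},
    "arguments": {"amount_cents": 200000},
}
print(evaluate(request))
```

The policy identifier and version appear only in the returned `Decision`, which is what the audit log should store.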

This is also where delegated authority becomes important. If an agent acts on behalf of a user, the authorization request should preserve both identities: the originating user and the acting agent. If one agent delegates to another, the chain should not collapse into a generic service account.

OAuth token exchange is one established pattern for representing delegation and impersonation between services. RFC 8693 defines a way to exchange security tokens in scenarios involving delegation or impersonation. Agent systems do not have to use that exact mechanism, but they need an equivalent concept: the ability to represent who started the action, which component is acting now, what authority was delegated, and what constraints came with that delegation. See RFC 8693: OAuth 2.0 Token Exchange for the general token exchange model.
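As a rough illustration, RFC 8693 (section 4.1) represents the acting party with an `act` claim, and nests earlier actors inside it. The sketch below builds and flattens such a chain with plain dictionaries. It does not issue or verify tokens, and the `billing-subagent` name is hypothetical.

```python
def delegation_claims(user_sub: str, actors: list) -> dict:
    """Build token-style claims with a nested `act` chain.

    `actors` is ordered from the first delegate to the current one. As in
    RFC 8693, the outermost `act` is the current actor and nested `act`
    members are prior actors.
    """
    claims = {"sub": user_sub}
    act = None
    for actor in actors:  # wrap earlier actors inside later ones
        entry = {"sub": actor}
        if act is not None:
            entry["act"] = act
        act = entry
    if act is not None:
        claims["act"] = act
    return claims


def actor_chain(claims: dict) -> list:
    """Flatten the chain, current actor first."""
    chain, node = [], claims.get("act")
    while node is not None:
        chain.append(node["sub"])
        node = node.get("act")
    return chain


claims = delegation_claims("user_123", ["support-agent-v3", "billing-subagent"])
print(claims)
print(actor_chain(claims))
```

Keeping the chain explicit lets a policy ask both "who started this?" (`sub`) and "who is acting now?" (the outermost `act`), instead of seeing one generic service account.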

A Gateway-Style Enforcement Pattern

A practical implementation does not require every tool to embed a full policy engine. A common pattern is to put an enforcement point at the gateway or tool-routing layer.

The flow looks like this:

1. The user starts an agent session.
2. The gateway authenticates the user and identifies the agent.
3. The gateway filters the models and tools visible to that session.
4. The model produces a structured tool call.
5. The gateway validates the tool call against the tool schema.
6. The gateway enriches the request with trusted identity, resource, and session context.
7. The gateway asks a policy decision point for an allow or deny decision.
8. If allowed, the gateway forwards the call to the MCP server or downstream API.
9. If denied, the gateway returns a safe error to the agent and records the decision.
10. The downstream API still performs its own authorization checks.

The gateway is not trusted because it is magical. It is useful because it sits at a narrow point where model output becomes tool input. That makes it a natural place to validate, authorize, and log decisions before side effects happen.

Here is simplified pseudocode for a tool call handler:

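async def handle_tool_call(request, session):
    # Helper names (authenticate_user, tool_registry, policy_decision_point, ...)
    # are placeholders for whatever the gateway actually provides.
    # Identity comes from the authenticated session, never from model output.
    principal = authenticate_user(session.token)
    agent = resolve_agent(session.agent_id)
    # Reject tools that are not registered before doing any other work.
    tool = tool_registry.get(request.tool_name)
    if tool is None:
        raise Forbidden("Unknown tool")
    # Validate model-produced arguments before policy evaluates them.
    arguments = validate_json_schema(tool.schema, request.arguments)
    # Enrich with trusted resource metadata (for example tenant or ownership).
    resource_context = await load_resource_context(
        tool_name=request.tool_name,
        arguments=arguments,
        tenant_id=principal.tenant_id,
    )
    authz_input = {
        "principal": principal.to_policy_input(),
        "agent": agent.to_policy_input(),
        "action": "tools.call",
        "resource": {
            "type": "mcp_tool",
            "name": request.tool_name,
            "server": tool.server_name,
            **resource_context,
        },
        "arguments": arguments,
        "context": {
            "session_id": session.id,
            "purpose": session.purpose,
            "human_confirmed": session.human_confirmed,
        },
    }
    # The decision point selects the policy and returns its metadata.
    decision = await policy_decision_point.check(authz_input)
    # Record every decision, allowed or denied, before acting on it.
    await audit_log.write({
        "decision": decision.effect,
        "reason": decision.reason,
        "matched_policy": decision.policy_id,
        "policy_version": decision.policy_version,
        "input": authz_input,
    })
    # Anything other than an explicit ALLOW is treated as a denial.
    if decision.effect != "ALLOW":
        raise Forbidden("Tool call denied by policy")
    return await tool.invoke(arguments)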

There are three details worth highlighting.

First, schema validation happens before authorization. Policy should not evaluate arbitrary unvalidated input if the tool schema is known. Validation should reject missing fields, unexpected types, malformed IDs, invalid enum values, and unsafe defaults.

Second, enrichment happens outside the model. The model may say that a customer belongs to a user, but the system should verify that from trusted data. The same applies to ticket assignment, tenant membership, approval status, and data classification.

Third, the audit record stores the decision, the reason, and the policy metadata returned by the decision point. This is useful for debugging denied requests and for proving that allowed requests matched the policy in force at the time.
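The first detail, validation before authorization, can be sketched with the standard library alone. The ID patterns and the positive-amount rule are assumptions chosen for illustration; a real gateway would more likely validate against the JSON Schema published by the tool.

```python
import re

# Assumed argument shape for refund_customer; patterns are illustrative.
EXPECTED_TYPES = {"ticket_id": str, "customer_id": str, "amount_cents": int, "reason": str}
ID_PATTERNS = {"ticket_id": re.compile(r"T-\d+"), "customer_id": re.compile(r"C-\d+")}


class InvalidArguments(ValueError):
    """Raised when model-produced arguments do not match the tool schema."""


def validate_refund_arguments(arguments: dict) -> dict:
    """Return a copy of the arguments if they are well formed, else raise."""
    missing = EXPECTED_TYPES.keys() - arguments.keys()
    unexpected = arguments.keys() - EXPECTED_TYPES.keys()
    if missing or unexpected:
        raise InvalidArguments(f"missing={sorted(missing)} unexpected={sorted(unexpected)}")
    for name, expected_type in EXPECTED_TYPES.items():
        value = arguments[name]
        # bool is a subclass of int in Python, so exclude it explicitly.
        if isinstance(value, bool) or not isinstance(value, expected_type):
            raise InvalidArguments(f"{name}: expected {expected_type.__name__}")
    for name, pattern in ID_PATTERNS.items():
        if not pattern.fullmatch(arguments[name]):
            raise InvalidArguments(f"{name}: malformed identifier")
    if arguments["amount_cents"] <= 0:
        raise InvalidArguments("amount_cents: must be positive")
    return dict(arguments)


valid = validate_refund_arguments({
    "ticket_id": "T-10029",
    "customer_id": "C-8821",
    "amount_cents": 200000,
    "reason": "billing complaint",
})
print(valid["amount_cents"])  # 200000
```

Note that the oversized amount passes validation: the schema only establishes that the input is well formed, and the policy check decides whether it is permitted.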

Example Policies

The implementation choice is deliberately separate from the authorization pattern. Teams can use Open Policy Agent, Cedar-style policies, a custom authorization service, or a dedicated policy decision point. The useful test is not whether a system uses a particular product, but whether policies can be evaluated independently of the agent and enforced consistently at the gateway, MCP server, and downstream API boundaries. One concrete gateway-oriented implementation of these control points is described in this walkthrough of authorizing AI agents and MCP tools at the gateway.

Here is a simplified policy expressed as readable pseudocode:

policies:
  - name: support-agent-can-read-customer-record-for-assigned-ticket
    effect: allow
    when:
      principal.roles contains "support_agent"
      agent.type == "customer_support"
      resource.name == "get_customer_record"
      arguments.ticket_id == context.assigned_ticket_id
      principal.tenant_id == resource.tenant_id
  - name: refund-requires-limit-and-confirmation
    effect: allow
    when:
      principal.roles contains "support_agent"
      agent.type == "customer_support"
      resource.name == "refund_customer"
      arguments.ticket_id == context.assigned_ticket_id
      arguments.amount_cents <= principal.refund_limit_cents
      context.human_confirmed == true
  - name: deny-production-db-write-from-coding-agent
    effect: deny
    when:
      agent.type == "coding_assistant"
      resource.name == "sql_query"
      resource.environment == "production"
      arguments.statement_type in ["INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "CREATE"]

The last rule is deliberately mundane. In practice, many agent incidents are not exotic. A coding assistant may run a command in the wrong directory, query the wrong database, or assume that a local file is safe to modify. A deny rule for production writes is not a complete security model, but it is a concrete boundary that a prompt alone cannot provide.

In a real implementation, SQL authorization would need more than string matching. The tool should parse SQL into an AST, classify the operation, prevent multiple statements if not needed, block dangerous extensions, and use a database identity with the least required permissions. The policy layer should not be the only protection. It should be one explicit decision point in a layered design.
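To make the rule semantics concrete, two of the rules above can be expressed as Python predicates and run through a tiny evaluator. The combining logic is an assumption the pseudocode does not specify: here a matching deny rule wins over any allow rule, and a request that matches nothing is denied by default.

```python
WRITE_STATEMENTS = {"INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "CREATE"}

RULES = [
    {
        "name": "refund-requires-limit-and-confirmation",
        "effect": "allow",
        "when": lambda r: (
            "support_agent" in r["principal"]["roles"]
            and r["agent"]["type"] == "customer_support"
            and r["resource"]["name"] == "refund_customer"
            and r["arguments"]["ticket_id"] == r["context"]["assigned_ticket_id"]
            and r["arguments"]["amount_cents"] <= r["principal"]["refund_limit_cents"]
            and r["context"]["human_confirmed"] is True
        ),
    },
    {
        "name": "deny-production-db-write-from-coding-agent",
        "effect": "deny",
        "when": lambda r: (
            r["agent"]["type"] == "coding_assistant"
            and r["resource"]["name"] == "sql_query"
            and r["resource"]["environment"] == "production"
            and r["arguments"]["statement_type"] in WRITE_STATEMENTS
        ),
    },
]


def matches(rule: dict, request: dict) -> bool:
    """A rule that cannot be evaluated (missing field) is treated as no match."""
    try:
        return bool(rule["when"](request))
    except (KeyError, TypeError):
        return False


def decide(request: dict, rules=RULES) -> tuple:
    """Deny-overrides with default deny (assumed combining semantics)."""
    matched = [rule for rule in rules if matches(rule, request)]
    for rule in matched:
        if rule["effect"] == "deny":
            return ("DENY", rule["name"])
    for rule in matched:
        if rule["effect"] == "allow":
            return ("ALLOW", rule["name"])
    return ("DENY", "default-deny")


refund = {
    "principal": {"roles": ["support_agent"], "refund_limit_cents": 100000},
    "agent": {"type": "customer_support"},
    "resource": {"name": "refund_customer"},
    "arguments": {"ticket_id": "T-10029", "amount_cents": 200000},
    "context": {"assigned_ticket_id": "T-10029", "human_confirmed": False},
}
print(decide(refund))  # ('DENY', 'default-deny')
```

Default deny means a request with missing context fails safely, at the cost of needing an explicit allow rule for every legitimate action.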

Field Note: Coding Agents and Local Databases

This lesson came from using a coding assistant on a small local application backed by SQLite. The assistant could traverse the repository, inspect configuration, run shell commands, and issue database queries while debugging. The live prod.sqlite file was in the same working tree as development data, and a relative path in the application's configuration made it discoverable from the repository root. The agent had a write-capable SQLite connection available through its tool path. It could therefore move from a harmless request such as "inspect the schema" to opening the file that contained real application data.

That was not hypothetical. In one incident on a pet project, the assistant modified the production database while exploring the codebase and testing a change. There was no attacker and no prompt injection; the failure was a path and capability problem. The agent had enough filesystem access to find the protected database and enough SQL access to change it. A written instruction not to touch production data did not compensate for those permissions.

The controls I use for this class of workflow are deliberately concrete:

prod.sqlite is mounted or exposed to the agent as read-only, while the application process uses its own controlled access path;
each coding-agent session receives a separate dev.sqlite copy for experiments, migrations, and test data changes;
SQL issued through the agent-facing tool goes through a wrapper that rejects INSERT, UPDATE, DELETE, DROP, ALTER, CREATE, ATTACH, and multi-statement input by default;
the wrapper records the agent session, current working directory, resolved database path, operation type, and deny reason before returning an error;
a policy check denies any write operation targeting the protected database, even when the generated shell command would otherwise be valid.
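A small sketch of the first, third, and fourth controls, using Python's built-in `sqlite3` module. The function names and the record fields are illustrative, not the wrapper described above. The keyword check is a coarse first filter that can be bypassed (for example by a `WITH ... INSERT` statement), which is why the connection is also opened with SQLite's `mode=ro` URI parameter so the database itself refuses writes.

```python
import sqlite3
from pathlib import Path

# Statement types refused by default (list taken from the text above).
BLOCKED = {"INSERT", "UPDATE", "DELETE", "DROP", "ALTER", "CREATE", "ATTACH"}


def classify(sql: str) -> str:
    """Coarse first-keyword classification; a real wrapper would parse the SQL."""
    body = sql.strip().rstrip(";").strip()
    if not body:
        return "EMPTY"
    if ";" in body:
        # Fails closed even when the semicolon sits inside a string literal.
        return "MULTI"
    return body.split(None, 1)[0].upper()


def run_agent_sql(db_path, sql: str) -> dict:
    """Run SQL for the agent on a read-only handle and report the decision."""
    resolved = Path(db_path).resolve()
    record = {"database_path": str(resolved), "statement_type": classify(sql)}
    if record["statement_type"] in BLOCKED | {"MULTI", "EMPTY"}:
        return {**record, "decision": "DENY", "reason": "statement type rejected by wrapper"}
    # mode=ro makes SQLite refuse writes that slip past the keyword check.
    conn = sqlite3.connect(resolved.as_uri() + "?mode=ro", uri=True)
    try:
        rows = conn.execute(sql).fetchall()
    except sqlite3.DatabaseError as exc:
        return {**record, "decision": "DENY", "reason": str(exc)}
    finally:
        conn.close()
    return {**record, "decision": "ALLOW", "rows": rows}
```

Calling `run_agent_sql("data/prod.sqlite", "UPDATE users SET plan = 'free'")` (a hypothetical path and table) would return a record with `"decision": "DENY"` without opening the file at all.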

A rejected request can be made visible instead of silently disappearing:

{
  "timestamp": "2026-06-19T10:21:07Z",
  "decision": "DENY",
  "agent_id": "coding-assistant",
  "database_path": "/workspace/data/prod.sqlite",
  "statement_type": "UPDATE",
  "policy_id": "deny-sql-write-protected-db",
  "reason": "protected SQLite database is available to the agent in read-only mode"
}

This pattern keeps the useful parts of the assistant. It can still inspect schemas, explain queries, generate migrations, write tests, and validate behavior against a safe copy. What changes is the default failure mode. If the agent chooses the wrong database path or tries a destructive statement, the filesystem boundary, SQL wrapper, and policy check stop the action before it becomes a data incident.

Audit Logs Should Explain the Decision

Many agent logs are optimized for debugging prompts, latency, and token cost. Authorization logs need a different shape.

A useful decision log should answer:

Who initiated the session?
Which agent acted?
Which tool was requested?
What arguments were supplied?
Which resource was affected?
Which policy version made the decision?
Which rule matched?
Was the request allowed, denied, or allowed in audit-only mode?
What trusted context was used?

For example:

{
  "timestamp": "2026-06-19T10:15:42Z",
  "decision": "DENY",
  "reason": "refund amount exceeds principal limit",
  "policy_id": "refund-requires-limit-and-confirmation",
  "policy_version": "2026.06.18.3",
  "principal_id": "user_123",
  "agent_id": "support-agent-v3",
  "action": "tools.call",
  "tool": "refund_customer",
  "resource": {
    "customer_id": "C-8821",
    "tenant_id": "tenant_acme"
  },
  "arguments": {
    "ticket_id": "T-10029",
    "amount_cents": 200000
  },
  "context": {
    "assigned_ticket_id": "T-10029",
    "refund_limit_cents": 100000,
    "human_confirmed": true
  }
}

This record is useful even if the system works correctly. A denied request may reveal that the agent tried to overreach, that the user's permissions are wrong, that the tool schema is ambiguous, or that the policy is too strict. An allowed request gives reviewers evidence that the decision matched the rule and context at the time.

The same log also supports rollout. Before enforcing policy, a team can run checks in audit-only mode and measure what would have been denied. That is often the safest way to introduce runtime authorization into an existing agent stack.
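One way to picture audit-only mode is a small enforcement switch that always records what policy said and blocks only when enforcement is on. The mode names and the log fields below are assumptions for the sketch.

```python
def enforce(policy_effect: str, mode: str, audit_log: list, request_id: str) -> bool:
    """Return True if the call may proceed; always record what policy said.

    Any mode other than "audit" is treated as enforcing, so a typo in the
    configuration fails closed.
    """
    would_deny = policy_effect != "ALLOW"
    blocked = would_deny and mode != "audit"
    audit_log.append({
        "request_id": request_id,
        "policy_effect": policy_effect,
        "mode": mode,
        "blocked": blocked,
    })
    return not blocked


def would_have_denied(audit_log: list) -> int:
    """Count requests that policy denied but audit-only mode let through."""
    return sum(
        1 for entry in audit_log
        if entry["policy_effect"] != "ALLOW" and not entry["blocked"]
    )


log = []
enforce("DENY", "audit", log, "req-1")    # proceeds, but is recorded
enforce("ALLOW", "audit", log, "req-2")
enforce("DENY", "enforce", log, "req-3")  # blocked
print(would_have_denied(log))  # 1
```

The `would_have_denied` count is the number a team would watch before switching a tool from audit-only to enforcement.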

Rollout Strategy

A team does not need to build the complete system on day one. The best adoption path is incremental.

Step 1: Inventory high-risk tools

Start with tools that can change data, expose sensitive data, move money, change infrastructure, send external messages, or create long-lived credentials. These tools deserve runtime checks before lower-risk read-only tools.

Step 2: Define the authorization request contract

Decide what every policy decision needs: principal, agent, action, resource, arguments, and context. Keep the contract stable and language-neutral. The same shape should work whether the enforcement point is a model gateway, an MCP server, or an internal API.

Step 3: Validate tool schemas

Before adding policy, ensure that tool arguments are structured and validated. If a tool accepts a free-form shell command, free-form SQL statement, or arbitrary URL, the authorization layer has less to reason about. Narrow tools are easier to authorize than universal tools.

Step 4: Run in audit-only mode

Log decisions without blocking requests. Use the logs to find missing context, noisy rules, ambiguous tools, and unexpected agent behavior. Audit-only mode is not a security boundary, but it is a useful migration step.

Step 5: Enforce on the highest-risk actions

Move from audit-only to fail-closed enforcement for actions with serious side effects. Examples include deleting records, issuing refunds, modifying production infrastructure, sending customer-facing messages, changing permissions, or accessing regulated data.

Step 6: Push checks closer to the resource

The gateway is a good first enforcement point, but it should not be the only one. Over time, move authorization into the MCP server and downstream APIs as well. The goal is consistent policy, not a single fragile choke point.

Trade-Offs and Failure Modes

Runtime authorization introduces engineering trade-offs.

Latency is one. A policy check adds a network call unless the policy engine runs locally or caches decisions safely. For high-volume, low-risk calls, teams may need local evaluation, precomputed attributes, or short-lived decision caches. For high-risk actions, the extra latency is usually easier to justify.

Context freshness is another. A decision is only as good as the attributes used to make it. If ticket assignment, tenant membership, or approval status changes frequently, stale context can produce wrong decisions. The system needs clear rules for which data can be cached and for how long.
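The latency and freshness trade-offs meet in the decision cache. A minimal sketch, assuming decisions can be keyed on the full request and that a single time-to-live is acceptable; the `DecisionCache` class is hypothetical, and it has no size bound or invalidation hook, which a production cache would need.

```python
import json
import time


class DecisionCache:
    """Short-lived cache of policy decisions keyed on the whole request."""

    def __init__(self, ttl_seconds: float, clock=time.monotonic):
        self._ttl = ttl_seconds
        self._clock = clock  # injectable so that tests can control time
        self._entries = {}

    @staticmethod
    def _key(request: dict) -> str:
        # Canonical JSON so that dictionary key order does not change the key.
        return json.dumps(request, sort_keys=True, separators=(",", ":"))

    def get_or_check(self, request: dict, check):
        """Return a fresh cached decision, or call `check` and store the result."""
        key = self._key(request)
        now = self._clock()
        hit = self._entries.get(key)
        if hit is not None and now - hit[0] < self._ttl:
            return hit[1]
        decision = check(request)
        self._entries[key] = (now, decision)
        return decision


calls = []

def slow_policy_check(request):
    calls.append(request)
    return "ALLOW"

cache = DecisionCache(ttl_seconds=5)
req = {"resource": {"name": "read_logs"}, "principal": {"id": "user_123"}}
cache.get_or_check(req, slow_policy_check)
cache.get_or_check(req, slow_policy_check)
print(len(calls))  # 1
```

Because the key includes every argument and context field, any change in the request misses the cache. Changes in data outside the request, such as a revoked role, are only picked up after the TTL expires, which is why the TTL should be short for anything sensitive.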

Policy complexity is a third. A policy layer can become a second application if every edge case is pushed into it. Keep business logic and authorization logic separate. The policy should decide whether an action is allowed, not calculate the refund, generate the SQL, or choose the incident response plan.

Failure behavior matters. If the policy decision point is unavailable, high-risk actions should fail closed. Low-risk actions might degrade gracefully, depending on the system. This decision should be explicit and tested.
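Making that choice explicit can be as small as one wrapper around the policy call. In this sketch, `PolicyUnavailable` and the `HIGH_RISK_TOOLS` set are assumptions, and allowing low-risk tools during an outage is only one possible design; some systems will prefer to deny everything.

```python
class PolicyUnavailable(Exception):
    """Raised by a (hypothetical) decision point client on timeout or outage."""


# Assumed classification; tool names are reused from earlier examples.
HIGH_RISK_TOOLS = frozenset({"refund_customer", "sql_query", "restart_service"})


def decide_or_fail(check, request: dict, high_risk=HIGH_RISK_TOOLS) -> dict:
    """Call the decision point; on outage, fail closed for high-risk tools."""
    try:
        return check(request)
    except PolicyUnavailable:
        tool = request["resource"]["name"]
        if tool in high_risk:
            return {"effect": "DENY", "reason": "policy unavailable, failing closed"}
        # Explicit, documented degradation for low-risk tools only.
        return {"effect": "ALLOW", "reason": "policy unavailable, low-risk fallback"}


def broken_check(request):
    raise PolicyUnavailable("decision point timed out")


print(decide_or_fail(broken_check, {"resource": {"name": "refund_customer"}}))
print(decide_or_fail(broken_check, {"resource": {"name": "read_logs"}}))
```

The fallback reason is returned with the decision so that the audit log shows which calls ran without a real policy evaluation.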

Finally, model behavior will still be imperfect. Runtime authorization does not prevent hallucinated plans, bad suggestions, or confusing conversations. It only controls whether a proposed action is allowed to execute. That is still a valuable boundary because it protects the systems behind the agent.

Conclusion

AI agent security should not depend on the agent deciding how much power it deserves. Once an agent can call tools, retrieve data, or trigger workflows, each consequential action needs an independently evaluated authorization decision.

The practical pattern is straightforward: authenticate the user, identify the agent, validate the tool call, enrich the request with trusted context, evaluate policy, log the decision and returned policy metadata, and only then execute the action. The important shift is to treat tool calls as production actions, not as chat completions with extra features.

For teams adopting MCP, AI gateways, or coding agents, the first goal should not be a perfect governance framework. It should be a small number of clear runtime boundaries around the actions that can cause the most damage. Start with high-risk tools, separate discovery from invocation, add argument-level checks, and make every decision explainable.

That gives engineering teams a more useful question than "Can we trust the agent?" The better question is: "What is this agent allowed to do, for whom, with which inputs, under which policy, and where is the evidence?"

Top comments (1)

Raju Dandigam • Jul 2

Separating discovery, invocation, and argument-level authorization is the part a lot of teams still collapse into one fuzzy "guardrails" bucket. The concrete request shape you describe is useful because it treats the tool call like a first-class security event rather than just model output. In practice the missing piece is almost always auditability: teams can answer who had a token, but not why an agent was allowed to touch that specific resource with those arguments. That is why I like pairing policy checks with trace-level visibility; agent-inspect-style execution trees make it much easier to debug both false allows and false denies. Have you seen teams start with audit-only mode successfully, or do they usually need a narrower rollout like high-risk write tools first to avoid policy fatigue?