Nolan Vale

Posted on Jun 11

Security Architecture for AI Agents With Tool Access

#ai #agents #security #architecture

The moment an AI agent gets tool access, it stops being a chatbot.

It becomes an actor inside the system.

That actor may be able to search documents, query databases, create tickets, update CRM records, send messages, trigger workflows, or call APIs.

This is where the security model changes.

A text-only assistant is mostly an information risk.

A tool-enabled agent is both an information risk and an action risk.

That means the architecture needs more than prompt instructions.

It needs execution control.

1. Treat tools as capabilities, not functions.

In many prototypes, tools are registered as simple functions.

The agent sees a list of available tools and decides which one to call.

That works for demos.

It is not enough for enterprise systems.

A tool should be treated as a capability.

A capability has rules.

For every tool, the system should define:

what the tool can do
what data it can read
what data it can write
which users can invoke it
which agents can invoke it
what approval is required
what rate limits apply
what logs are produced
what failure modes exist

A tool is not just code.

It is a controlled access path.

If the tool can touch customer records, financial data, internal files, or external APIs, it needs a policy around it.

A function executes. A capability must be governed.

That is the difference.

2. Use a capability registry.

A serious tool-enabled agent system should have a capability registry.

The registry should store metadata about every available tool:

tool name
description
owner
system touched
read or write classification
data sensitivity
required user role
required agent role
approval requirement
rate limit
timeout
rollback support
audit requirements

The agent should not operate from an informal list of tools.

The system should know what each tool means operationally.
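
A registry like this can start small. The sketch below is one possible shape, not a reference implementation: the `Capability` fields mirror the metadata listed above, and every name in it (`CapabilityRegistry`, `crm.update_contact`, the role strings) is a hypothetical chosen for illustration.

```python
from dataclasses import dataclass
from enum import Enum


class Access(Enum):
    READ = "read"
    WRITE = "write"


@dataclass(frozen=True)
class Capability:
    """Operational metadata for one tool (field names are illustrative)."""
    name: str
    description: str
    owner: str
    system: str
    access: Access
    sensitivity: str
    required_user_role: str
    required_agent_role: str
    needs_approval: bool
    rate_limit_per_min: int
    timeout_s: float
    supports_rollback: bool
    audit_required: bool = True


class CapabilityRegistry:
    def __init__(self) -> None:
        self._caps: dict[str, Capability] = {}

    def register(self, cap: Capability) -> None:
        if cap.name in self._caps:
            raise ValueError(f"capability already registered: {cap.name}")
        self._caps[cap.name] = cap

    def get(self, name: str) -> Capability:
        # Fail closed: a tool without a registry entry cannot be used.
        try:
            return self._caps[name]
        except KeyError:
            raise PermissionError(f"unregistered tool: {name}") from None

    def write_capabilities(self) -> list[str]:
        """Names of every tool that can change state, for review."""
        return sorted(c.name for c in self._caps.values() if c.access is Access.WRITE)


registry = CapabilityRegistry()
registry.register(Capability(
    name="crm.update_contact",
    description="Update a single contact record",
    owner="sales-platform",
    system="crm",
    access=Access.WRITE,
    sensitivity="confidential",
    required_user_role="sales",
    required_agent_role="crm-assistant",
    needs_approval=True,
    rate_limit_per_min=10,
    timeout_s=5.0,
    supports_rollback=True,
))
```

Looking up an unregistered tool raises `PermissionError`, so an agent cannot call something the organization has not described.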

Without a registry, tool access becomes hard to govern.

What is hard to govern usually becomes hard to trust.

This is especially important when multiple teams start building agents.

One team may create a CRM update tool.

Another team may create a file search tool.

Another may connect billing, ticketing, or internal workflow systems.

Without a registry, the company slowly loses track of what agents can actually do.

That is not architecture.

That is permission drift.

3. Separate planning from execution.

The agent can plan.

But it should not automatically execute every plan.

A safer pattern is to separate planning from execution.

The planning layer answers:

What should happen next?

The execution layer answers:

Is this action allowed to happen now?

Those are different questions.

The model may generate a reasonable plan.

But the execution layer still needs to check:

user permissions
tool permissions
data sensitivity
approval requirements
rate limits
current system state
policy constraints

This separation is critical.

The model can suggest.

The execution layer decides.

That is how you prevent the agent from becoming the final authority.

The agent should reason. The system should enforce.

If those roles are confused, the architecture becomes fragile.
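
As a rough illustration of that split, the sketch below treats the model's plan as plain data and runs each step through a separate authorization function. The policy table, tool names, and role names are assumptions made up for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ProposedAction:
    """One step of the agent's plan. It is data, not an executed call."""
    tool: str
    args: dict


@dataclass(frozen=True)
class Decision:
    allowed: bool
    reason: str


# Hypothetical policy table owned by the execution layer, not the model.
TOOL_POLICY = {
    "docs.search": {"role": "employee", "needs_approval": False},
    "crm.update_contact": {"role": "sales", "needs_approval": True},
}


def authorize(action: ProposedAction, user_roles: set[str], approved: bool = False) -> Decision:
    policy = TOOL_POLICY.get(action.tool)
    if policy is None:
        return Decision(False, "unknown tool")
    if policy["role"] not in user_roles:
        return Decision(False, "user lacks required role")
    if policy["needs_approval"] and not approved:
        return Decision(False, "approval required")
    return Decision(True, "allowed")


# The planning layer proposes; the execution layer decides step by step.
plan = [
    ProposedAction("docs.search", {"query": "renewal notes"}),
    ProposedAction("crm.update_contact", {"id": "c-1", "status": "renewed"}),
]
decisions = [authorize(step, {"employee", "sales"}) for step in plan]
```

Here the search step is allowed and the CRM update is held for approval, even though both came from the same plan.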

4. Put an execution broker between the agent and tools.

The agent should not call enterprise systems directly.

There should be an execution broker.

The broker is responsible for:

validating tool calls
enforcing policies
applying permission checks
requiring approvals
logging actions
handling failures
limiting rate and scope
blocking unsafe requests

This broker becomes the security boundary.

It prevents the agent from becoming a direct path into internal systems.

If the agent is compromised, confused, or manipulated, the broker limits what can happen.

The agent can be probabilistic.

The broker should be deterministic.

I trust this pattern far more than giving the model direct tool access.

A model output can be ambiguous.

A policy decision should not be.
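
A minimal broker might look like the sketch below. It only covers argument validation, a sliding-window rate limit, and audit logging; permission checks and approvals would plug into the same `execute` path. The class, the `tickets.create` tool, and the limits are all hypothetical.

```python
import time
from typing import Any, Callable


class BrokerError(Exception):
    """Raised when the broker refuses a call."""


class ExecutionBroker:
    def __init__(self, max_calls_per_minute: int = 30) -> None:
        self._tools: dict[str, tuple[Callable[..., Any], frozenset[str]]] = {}
        self._calls: list[float] = []
        self._max = max_calls_per_minute
        self.audit: list[dict] = []

    def register(self, name: str, fn: Callable[..., Any], allowed_args: set[str]) -> None:
        self._tools[name] = (fn, frozenset(allowed_args))

    def execute(self, name: str, args: dict[str, Any]) -> Any:
        outcome = "blocked"
        try:
            if name not in self._tools:
                raise BrokerError(f"unknown tool: {name}")
            fn, allowed = self._tools[name]
            extra = set(args) - allowed
            if extra:
                raise BrokerError(f"unexpected arguments: {sorted(extra)}")
            now = time.monotonic()
            # Sliding window: keep only calls from the last 60 seconds.
            self._calls = [t for t in self._calls if now - t < 60]
            if len(self._calls) >= self._max:
                raise BrokerError("rate limit exceeded")
            self._calls.append(now)
            outcome = "failed"  # stays "failed" if the tool itself raises
            result = fn(**args)
            outcome = "ok"
            return result
        finally:
            # Every attempt is recorded, including blocked and failed ones.
            self.audit.append({"tool": name, "args": dict(args), "outcome": outcome})


def create_ticket(title: str) -> dict:
    return {"id": 1, "title": title}


broker = ExecutionBroker(max_calls_per_minute=2)
broker.register("tickets.create", create_ticket, {"title"})
result = broker.execute("tickets.create", {"title": "Renewal follow-up"})
```

Every branch is an ordinary conditional, so the same input always produces the same decision, whatever the model's output looked like.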

5. Scope tool calls tightly.

A dangerous tool call is usually one that is too broad.

Bad tool call:

Search all customer records.

Better tool call:

Retrieve open renewal notes for Customer X, limited to fields this user can access, excluding legal and billing attachments.

The second call is safer because it is scoped.

A scoped call should define:

target system
target object
user identity
agent identity
allowed fields
excluded fields
maximum result size
action type
sensitivity level

If a tool call cannot be scoped, it should not execute automatically.

Broad autonomy is not a security feature.

It is a risk.

This matters because agents are good at producing confident next steps.

But confidence is not authorization.

A tool call should be narrow enough that the system can inspect it, approve it, log it, and reject it if needed.
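
One way to make scope inspectable is to represent the call as a typed record and reject anything that leaves a dimension open. The following is a sketch under assumptions: the field names follow the list above, while `MAX_RESULTS_CAP` and the example values are invented for illustration.

```python
from dataclasses import dataclass

MAX_RESULTS_CAP = 100  # assumed upper bound, tune per system


@dataclass(frozen=True)
class ScopedCall:
    system: str
    target_object: str
    user_id: str
    agent_id: str
    action: str  # "read" or "write"
    allowed_fields: frozenset[str]
    excluded_fields: frozenset[str] = frozenset()
    max_results: int = 50
    sensitivity: str = "internal"


def scope_problems(call: ScopedCall) -> list[str]:
    """Return the reasons a call is too broad. An empty list means it is scoped."""
    problems = []
    if not call.target_object or call.target_object == "*":
        problems.append("target object must be specific")
    if not call.allowed_fields:
        problems.append("allowed fields must be listed explicitly")
    if call.allowed_fields & call.excluded_fields:
        problems.append("a field is both allowed and excluded")
    if not 0 < call.max_results <= MAX_RESULTS_CAP:
        problems.append("result size out of bounds")
    if call.action not in {"read", "write"}:
        problems.append("unknown action type")
    return problems


def apply_scope(record: dict, call: ScopedCall) -> dict:
    """Strip a record down to the fields the call is allowed to see."""
    return {
        key: value
        for key, value in record.items()
        if key in call.allowed_fields and key not in call.excluded_fields
    }


broad = ScopedCall("crm", "*", "u-1", "a-1", "read", allowed_fields=frozenset())
narrow = ScopedCall(
    "crm", "customer-x", "u-1", "a-1", "read",
    allowed_fields=frozenset({"renewal_notes", "status"}),
    excluded_fields=frozenset({"billing_attachments"}),
    max_results=20,
)
```

`broad` corresponds to "search all customer records" and fails two checks; `narrow` corresponds to the better call above and passes.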

6. Require approval for high-impact actions.

Not every action needs human approval.

But high-impact actions do.

Examples:

sending external emails
modifying customer records
deleting data
changing financial fields
submitting approvals
triggering customer workflows
granting access
escalating legal or compliance processes

The agent can draft the action.

The human approves it.

This preserves speed while keeping control.

Human-in-the-loop is not a weakness.

It is a control point.

For low-risk actions, automation can be faster.

For high-risk actions, approval is part of the architecture.

A serious AI agent system should be able to distinguish between the two.

If everything is automatic, risk rises.

If everything requires approval, productivity dies.

The architecture needs different paths for different levels of risk.
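
Those two paths can be sketched as a small gate that executes low-risk actions immediately and parks everything else until a human approves. The action names are hypothetical, and treating unknown actions as high risk is my assumption rather than something the pattern requires.

```python
from enum import Enum


class Risk(Enum):
    LOW = "low"
    HIGH = "high"


# Hypothetical allowlist of actions considered safe to automate.
LOW_RISK_ACTIONS = frozenset({"search_docs", "summarize_ticket", "draft_email"})


def classify(action: str) -> Risk:
    # Fail closed: anything not explicitly low risk is treated as high risk.
    return Risk.LOW if action in LOW_RISK_ACTIONS else Risk.HIGH


class ApprovalGate:
    def __init__(self) -> None:
        self.pending: dict[int, dict] = {}
        self._next_ticket = 1

    def submit(self, action: str, payload: dict) -> dict:
        if classify(action) is Risk.LOW:
            return {"status": "executed", "action": action}
        ticket = self._next_ticket
        self._next_ticket += 1
        # The agent's draft waits here until a human decides.
        self.pending[ticket] = {"action": action, "payload": payload}
        return {"status": "pending_approval", "ticket": ticket}

    def approve(self, ticket: int, approver: str) -> dict:
        # pop() raises KeyError for unknown or already handled tickets.
        draft = self.pending.pop(ticket)
        return {"status": "executed", "action": draft["action"], "approved_by": approver}


gate = ApprovalGate()
fast = gate.submit("search_docs", {"query": "renewal"})
held = gate.submit("send_external_email", {"to": "customer@example.com"})
done = gate.approve(held["ticket"], approver="manager-1")
```

The agent still does the drafting work in both paths; only the commit step differs.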

7. Design for prompt injection at the tool boundary.

Prompt injection becomes more dangerous when the agent has tools.

A malicious instruction inside a document may tell the agent to ignore policies, export data, or call a tool.

The system should assume this can happen.

Defense should not rely only on the model refusing.

The tool boundary should enforce:

allowed actions
allowed data scope
user permissions
content sensitivity
approval rules
output restrictions

Even if the model is manipulated, the execution layer should block unsafe actions.

The model can be tricked.

The policy layer should not be.

That is the point of having a boundary outside the prompt.

A system prompt is not enough.

A clever instruction is not enough.

A safety paragraph in the prompt is not enough.

The control must live in the architecture.
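
To make that concrete, here is a sketch of a boundary check that never consults the prompt or the retrieved text. It compares the requested action with a grant issued for the session. The `SessionGrant` type, the action names, and the extra rule that egress actions need approval once untrusted content has entered the context are all assumptions for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class SessionGrant:
    """What this user's session may do, decided before the model runs."""
    user_id: str
    allowed_actions: frozenset[str]
    allowed_customers: frozenset[str]


# Actions that move data out of the system (hypothetical names).
EGRESS_ACTIONS = frozenset({"send_email", "export_data"})


def check_at_boundary(
    grant: SessionGrant,
    action: str,
    customer_id: str,
    context_has_untrusted_content: bool,
) -> str:
    """Return "allow", "needs_approval", or "deny"."""
    if action not in grant.allowed_actions:
        return "deny"
    if customer_id not in grant.allowed_customers:
        return "deny"
    # If the agent has read documents or web pages it did not author,
    # outbound actions get a human check instead of running automatically.
    if context_has_untrusted_content and action in EGRESS_ACTIONS:
        return "needs_approval"
    return "allow"


grant = SessionGrant(
    user_id="u-1",
    allowed_actions=frozenset({"read_notes", "send_email"}),
    allowed_customers=frozenset({"customer-x"}),
)

# A document told the agent to export another customer's data.
injected = check_at_boundary(grant, "export_data", "customer-y", True)
```

The injected request is denied because `export_data` was never granted, no matter how persuasive the document was.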

8. Log the decision, not only the result.

Most systems log outputs.

Tool-enabled agents need deeper logs.

The system should log:

user request
agent plan
tool selected
tool input
permission decision
approval status
system response
final output
timestamp
policy version
error state

This allows the team to reconstruct what happened.

That matters for debugging.

It matters for security.

It matters for compliance.

If an agent changes something in a business system, the company should be able to explain exactly why.

A weak log says:

The agent updated the CRM.

A useful log says:

User X asked for Y. The agent planned Z. Tool A was selected. Policy B allowed the action. Human C approved it. Field D was updated.

That is the level of visibility enterprise systems need.
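
A structured record makes that kind of log straightforward to produce. The sketch below uses one field per item in the list above and serializes to a JSON line; the field names and example values are illustrative assumptions, not a standard schema.

```python
import json
from dataclasses import asdict, dataclass, field
from datetime import datetime, timezone
from typing import Optional


def _utc_now() -> str:
    return datetime.now(timezone.utc).isoformat()


@dataclass(frozen=True)
class DecisionRecord:
    user_request: str
    agent_plan: str
    tool: str
    tool_input: dict
    permission_decision: str
    approval_status: str
    system_response: str
    final_output: str
    policy_version: str
    error_state: Optional[str] = None
    timestamp: str = field(default_factory=_utc_now)


def to_log_line(record: DecisionRecord) -> str:
    """Serialize one decision as a single JSON line with stable key order."""
    return json.dumps(asdict(record), sort_keys=True)


record = DecisionRecord(
    user_request="Mark the Acme renewal as closed",
    agent_plan="Update CRM opportunity stage to closed-won",
    tool="crm.update_contact",
    tool_input={"id": "c-1", "stage": "closed-won"},
    permission_decision="allowed by policy crm-write",
    approval_status="approved by manager-1",
    system_response="200 OK",
    final_output="Opportunity updated",
    policy_version="2024-06-01",
)
line = to_log_line(record)
```

Recording the policy version alongside the decision lets a reviewer tell whether an action was allowed under the rules in force at the time.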

9. Contain failure with sandboxing.

Agents should operate inside bounded environments.

Sandboxing may include:

room-level boundaries
project-level boundaries
tool-level boundaries
data-level boundaries
rate limits
execution timeouts
restricted network access
temporary credentials
limited write scopes

The goal is not to prevent every possible failure.

The goal is to limit the blast radius.

If the agent fails, how far can the failure spread?

That is the question.

A good architecture makes the answer small.

An agent assigned to one project should not casually reach into another project.

An agent working with one customer should not expose another customer.

An agent handling internal drafts should not suddenly send external messages.

The failure boundary should be designed before the failure happens.
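
A boundary like that can be expressed as data the execution layer checks on every call. The sketch below assumes a per-project sandbox with short-lived credentials; the class, tool names, and scope strings are hypothetical.

```python
import time
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Sandbox:
    project_id: str
    allowed_tools: frozenset[str]
    write_scopes: frozenset[str]
    allow_external_network: bool
    credential_expires_at: float  # epoch seconds

    def permits(
        self,
        tool: str,
        project_id: str,
        write_scope: Optional[str] = None,
        external: bool = False,
        now: Optional[float] = None,
    ) -> bool:
        current = time.time() if now is None else now
        if current >= self.credential_expires_at:
            return False  # temporary credentials have expired
        if project_id != self.project_id:
            return False  # no reaching into another project
        if tool not in self.allowed_tools:
            return False
        if write_scope is not None and write_scope not in self.write_scopes:
            return False  # writes only inside the listed scopes
        if external and not self.allow_external_network:
            return False
        return True


sandbox = Sandbox(
    project_id="proj-a",
    allowed_tools=frozenset({"docs.search", "drafts.write"}),
    write_scopes=frozenset({"drafts"}),
    allow_external_network=False,
    credential_expires_at=1_000.0,
)
```

With this in place, an agent assigned to `proj-a` that tries to read `proj-b`, write outside `drafts`, or reach the network gets a plain `False`.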

10. Build a kill switch.

A production agent needs a stop mechanism.

Not a Slack message to engineering.

Not a vendor support ticket.

A real operational kill switch.

The system should be able to:

disable an agent
revoke a tool
pause write actions
block a workflow
isolate a room or project
disable external execution
freeze high-risk automations

If the company cannot stop the agent quickly, the agent should not have broad tool access.

Autonomy without shutdown control is not production-ready.

This sounds basic, but it is often missing in early agent deployments.

Teams think about what the agent can do.

They do not think enough about how to stop it when it does the wrong thing.

That is a serious architecture gap.
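
The mechanism does not need to be elaborate. As a sketch, a kill switch can be a small piece of shared state that the execution layer consults before every tool call. This version is in-memory and covers only three of the controls above; a real deployment would presumably keep the state in a shared store. All names are hypothetical.

```python
import threading


class KillSwitch:
    def __init__(self) -> None:
        self._lock = threading.Lock()
        self._disabled_agents: set[str] = set()
        self._revoked_tools: set[str] = set()
        self._writes_paused = False

    def disable_agent(self, agent_id: str) -> None:
        with self._lock:
            self._disabled_agents.add(agent_id)

    def revoke_tool(self, tool: str) -> None:
        with self._lock:
            self._revoked_tools.add(tool)

    def pause_writes(self) -> None:
        with self._lock:
            self._writes_paused = True

    def resume_writes(self) -> None:
        with self._lock:
            self._writes_paused = False

    def allows(self, agent_id: str, tool: str, is_write: bool) -> bool:
        """Checked by the execution layer before every tool call."""
        with self._lock:
            if agent_id in self._disabled_agents:
                return False
            if tool in self._revoked_tools:
                return False
            if is_write and self._writes_paused:
                return False
            return True


switch = KillSwitch()
switch.pause_writes()
can_read = switch.allows("agent-1", "docs.search", is_write=False)
can_write = switch.allows("agent-1", "crm.update_contact", is_write=True)
```

Pausing writes leaves read-only work running, which is often a better first response than shutting everything down.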

Final thought

Tool access is where AI agents become serious.

It is also where casual architecture becomes dangerous.

The agent should not be trusted simply because it follows instructions most of the time.

The system needs structure:

capability registry
execution broker
scoped tool calls
approval gates
sandboxing
audit logs
kill switch

This is the difference between an AI demo and an AI system.

A demo shows what the agent can do.

Security architecture defines what the agent is allowed to do.

Enterprise AI needs the second one before it can safely trust the first.

Top comments (1)

Kaspar von Grünberg • Jun 15

The execution broker pattern here is exactly what we call the Execution + Capability layers inside Agent Infrastructure. The insight is right: the agent reasons, the platform enforces. Where I'd push further is that the capability registry, the identity checks, the approval gates, the audit trail — these are not per-agent engineering problems. They're platform concerns. Every agent your org builds needs the same substrate underneath.

The moment you have multiple teams shipping agents (CRM, billing, ticketing, all of the above), you're not building agents anymore, you're building a platform.