DEV Community

tumberger for Kontext

Posted on • Originally published at kontext.security on

AI Agents and Compliance: What Security Teams Need to Know in 2026

AI agent compliance is no longer a model governance problem alone. In 2026, agents can read data, call tools, invoke MCP servers, update SaaS systems, delegate work to other agents, and act on behalf of users. Security teams need controls that follow the agent from identity to action.

Last updated: May 2026. Topics: AI agent security, runtime authorization, EU AI Act, OWASP Agentic Applications, NIST AI RMF, regulatory compliance.

Short answer: compliant AI agent deployments need unique agent identity, task-scoped authorization, runtime policy enforcement, human accountability, immutable audit trails, and scope isolation across multi-agent workflows. Static IAM, prompt rules, and after-the-fact logs are not enough once an agent can execute actions.

For the broader security stack around agent governance, see Kontext's guide to secure AI tools for 2026. The technical control layer is covered in AI agent runtime authorization, and teams mapping controls to risk frameworks can use the NIST AI RMF runtime authorization guide.

The compliance problem has changed

Traditional compliance programs were designed around human actors, predefined workflows, and audit trails that map actions back to named people. AI agents weaken those assumptions because they can make plans, choose tools, and execute steps at machine speed.

The deployment gap is already visible. Cisco reported at RSA Conference 2026 that 85% of surveyed major enterprise customers were experimenting with AI agents, but only 5% had moved them into production. Gravitee's 2026 State of AI Agent Security report found that 80.9% of technical teams had moved past planning and were testing or running agents, while only 14.4% of organizations had full IT and security approval for their entire agent fleet.

That gap is the compliance problem. Organizations are deploying autonomous execution before they can answer basic audit questions:

  • Which agents exist?
  • Who owns each agent?
  • What systems can each agent reach?
  • What action was the agent trying to perform?
  • Which user or organization delegated the action?
  • Which policy allowed, denied, constrained, or escalated it?
  • Can the organization replay the path that led to the action?

When a violation happens through an autonomous agent, "the AI did it" is not a defensible control narrative. Regulators, auditors, customers, and incident responders need an accountable chain from human authorization to agent identity to runtime decision to final action.

Why agents are different from traditional AI

Traditional AI tools usually produce text, scores, or predictions for a human to review. Agentic AI is active. It can reason across multi-step tasks, call APIs, use external tools, read and write data, trigger workflows, and delegate subtasks.

That distinction matters for compliance in three ways.

Accountability becomes diffuse

In a multi-agent workflow, the compliance event may emerge from a chain: a user delegates to an orchestrator, the orchestrator calls a specialist agent, the specialist agent calls a tool, and the tool changes a record. If every step runs under a shared API key or copied user account, accountability collapses.

Audit trails need execution context

Human-era logs often capture timestamp, actor, resource, and outcome. Agents need more. A useful agent audit trail records delegated user, agent identity, tool, resource, action, parameters, policy version, requested scope, decision, reason, approval state, and downstream result. A compliant outcome reached through a non-compliant path can still create regulatory risk.

Access control must move to runtime

Static roles and broad OAuth grants do not know why an agent is acting right now. They also do not see the plan, tool chain, data volume, external destination, or session risk. Agent compliance needs a control point immediately before sensitive tool calls and credential issuance.

This is the layer where runtime authorization becomes essential. Kontext evaluates each sensitive action before execution and can issue short-lived, scoped credentials only when policy approves the current user, agent, tool, resource, action, and task context.

The regulatory landscape in 2026

Three frameworks shape the baseline for AI agent compliance: the EU AI Act, the NIST AI RMF, and ISO/IEC 42001. OWASP's Agentic Applications Top 10 adds the practitioner-level threat model security teams need to make those frameworks enforceable.

EU AI Act

The EU AI Act entered into force on August 1, 2024. The European Commission's current timeline says prohibited AI practices and AI literacy obligations applied from February 2, 2025, GPAI governance obligations applied from August 2, 2025, and most AI Act rules apply from August 2, 2026. Rules for high-risk AI systems in Annex III enter into application on August 2, 2026, while high-risk systems embedded into regulated products have an extended transition period to August 2, 2027.

For agents used in areas such as hiring, credit, regulated reporting, public services, or critical infrastructure, security teams should expect high-risk-style evidence requirements even before legal classification is finalized.

The agent infrastructure implications are practical:

  • Risk management: classify the agent, its tools, its users, its data, and its possible high-impact actions before deployment.
  • Record keeping: log every sensitive tool call, delegation, approval, denial, and policy decision.
  • Transparency: preserve enough context to explain what the agent did and why a control allowed or blocked it.
  • Human oversight: enforce hard stops, approval gates, and revocation paths for high-impact actions.
  • Robustness: isolate tenants, tools, scopes, and multi-agent workflows so one failure does not cascade.

The Commission has also proposed Digital Omnibus simplifications affecting AI Act implementation. Compliance teams should treat AI Act timelines as live legal work and confirm obligations with counsel, but they should not wait to build the control plane.

NIST AI RMF and AI agent standards

The NIST AI Risk Management Framework remains the core US reference for voluntary AI risk management. Its four functions, Govern, Map, Measure, and Manage, map directly to agent controls:

  • Govern: assign policy owners, human accountability, approval rules, and exception handling.
  • Map: inventory agents, tools, data, MCP servers, APIs, users, scopes, and high-risk actions.
  • Measure: track denials, approvals, anomalous tool use, credential issuance, and policy outcomes.
  • Manage: block unsafe actions, narrow credentials, revoke sessions, update policy, and preserve evidence.

NIST's February 2026 AI Agent Standards Initiative makes the shift explicit. NIST says its strategic pillars include industry-led standards, community-led protocols, and research into agent authentication and identity infrastructure. The NCCoE concept paper on software and AI agent identity and authorization also identifies agent identification, authorization, access delegation, auditing, non-repudiation, and prompt-injection mitigation as areas needing implementation guidance.

The compliance takeaway is simple: standards activity is moving from model-level governance toward agent identity, delegation, authorization, and action evidence.

ISO/IEC 42001 and ISO/IEC 42006

ISO/IEC 42001:2023 defines requirements for an Artificial Intelligence Management System. ISO describes it as a standard for establishing, implementing, maintaining, and continually improving AI management systems, including responsible AI use, traceability, transparency, reliability, and risk management.

ISO/IEC 42006:2025 supports consistent audit and certification of AI management systems. For organizations pursuing AI management certification, agent deployments need to fit into the management system rather than sit outside it as "automation."

For agent compliance, ISO-style evidence should include:

  • an agent inventory
  • intended-use records
  • risk assessments
  • control owners
  • access review evidence
  • test and evaluation records
  • audit logs
  • incident records
  • policy change history

OWASP Top 10 for Agentic Applications

OWASP published the Top 10 for Agentic Applications 2026 in December 2025. It covers the security risks that make agent compliance different from chatbot compliance: goal hijacking, tool misuse, identity and privilege abuse, supply chain vulnerabilities, unexpected code execution, memory and context poisoning, insecure inter-agent communication, cascading failures, human-agent trust exploitation, and rogue agents.

Security teams should translate OWASP's categories into runtime controls:

  • tool allowlists and argument validation
  • scoped credentials instead of shared API keys
  • policy checks before sensitive tool calls
  • signed or authenticated inter-agent communication
  • approvals for irreversible actions
  • memory and context provenance
  • kill switches and session revocation
  • audit trails that preserve policy decisions, not only tool outputs

These are not only security controls. They are compliance controls because they produce the evidence auditors need.
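Two of those controls, tool allowlists and argument validation, can be sketched in a few lines. The agent names, tool names, and the row limit below are hypothetical examples, not a real Kontext or OWASP API:

```python
# Sketch of a per-agent tool allowlist with argument validation.
# Agents, tools, and the export limit are illustrative examples.
TOOL_ALLOWLIST = {
    "support-agent": {"crm.read_ticket", "crm.reply"},
    "finance-agent": {"ledger.read"},
}

MAX_EXPORT_ROWS = 100  # example constraint on bulk-read arguments

def validate_tool_call(agent_id: str, tool: str, args: dict) -> bool:
    """Return True only if this agent may call this tool with these arguments."""
    allowed = TOOL_ALLOWLIST.get(agent_id, set())
    if tool not in allowed:
        return False
    # Argument validation: reject bulk reads above the configured limit.
    if args.get("limit", 0) > MAX_EXPORT_ROWS:
        return False
    return True
```

A denied call here produces a policy event worth logging, which is exactly the kind of evidence auditors ask for.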

The core compliance architecture for AI agents

AI agent compliance is a runtime infrastructure problem. A policy document can define intent, but the control system has to enforce that intent when agents act.

1. Agent identity and registration

Every production agent needs a unique, policy-bound identity. It should not run as a shared service account, a generic API key, or a cloned human profile.

At registration, capture:

  • agent name and owner
  • accountable human or team
  • intended use
  • autonomy level
  • approved tools and MCP servers
  • allowed resources
  • allowed actions
  • risk tier
  • data categories
  • approval requirements
  • retention and logging requirements

NIST's NCCoE concept paper asks how agents should be identified in enterprise architecture and what metadata is essential for agent identity. That question needs an operational answer before agents touch regulated data.

2. Runtime authorization and least privilege

Static permissions answer whether an identity has broad access. Runtime authorization answers whether this specific action should run now.

For a sensitive action, a runtime authorization decision should evaluate:

  • delegated user
  • organization or tenant
  • agent identity
  • tool or API
  • resource
  • action type
  • parameters
  • requested credential scope
  • task context
  • session risk
  • policy version

Kontext is built for this boundary. Instead of handing an agent a long-lived token and hoping it stays inside policy, Kontext can approve, deny, narrow, or escalate the request and issue a short-lived scoped credential only for the approved operation.

This maps directly to compliance evidence. A security team can show not just that an agent was authenticated, but that a specific action was authorized under a specific policy for a specific purpose.
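A runtime decision of this shape can be sketched as follows. The field names, the five-minute lifetime, and the simple rules are assumptions for illustration; they are not Kontext's actual API:

```python
import secrets
import time

def authorize(request: dict, policy: dict) -> dict:
    """Evaluate one action request: allow, deny, narrow, or escalate.

    On approval, mint a short-lived credential scoped to the granted subset.
    All field names and rules here are illustrative.
    """
    if request["action"] in policy["blocked_actions"]:
        return {"decision": "deny", "reason": "action blocked by policy"}
    if request["action"] in policy["approval_actions"]:
        return {"decision": "escalate", "reason": "human approval required"}
    requested = set(request["requested_scope"])
    granted = requested & set(policy["max_scope"])
    if not granted:
        return {"decision": "deny", "reason": "no permitted scope"}
    credential = {
        "token": secrets.token_urlsafe(16),
        "scope": sorted(granted),
        "expires_at": time.time() + 300,  # five-minute lifetime (example)
    }
    decision = "allow" if granted == requested else "narrow"
    return {"decision": decision, "credential": credential}
```

Note the "narrow" branch: rather than rejecting an over-broad request outright, the decision point can grant only the intersection with policy, which keeps workflows moving while preserving least privilege.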

3. Immutable audit trails

Agent actions must be reconstructable from their logs. A useful audit packet should answer:

  • who delegated the task
  • which agent acted
  • what the agent requested
  • which policy evaluated the request
  • which scope was issued
  • whether the action was allowed, denied, narrowed, or escalated
  • whether a human approved it
  • what tool result occurred
  • which downstream resource changed

For security operations, these events should flow into standard observability and SIEM pipelines. For compliance, they should be retained with enough integrity to support audits, investigations, and customer reviews.
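One way to give those audit packets integrity is a hash chain, so any after-the-fact edit is detectable. This is a minimal sketch; field names are illustrative, and real systems would add signing and durable storage:

```python
import hashlib
import json

def make_packet(prev_hash: str, event: dict) -> dict:
    """Build an audit packet chained to the previous packet's hash."""
    record = {**event, "prev_hash": prev_hash}
    payload = json.dumps(record, sort_keys=True).encode()
    record["hash"] = hashlib.sha256(payload).hexdigest()
    return record

def verify_chain(packets: list) -> bool:
    """Recompute every hash and link; any tampering breaks verification."""
    prev = "genesis"
    for p in packets:
        body = {k: v for k, v in p.items() if k != "hash"}
        if body["prev_hash"] != prev:
            return False
        payload = json.dumps(body, sort_keys=True).encode()
        if hashlib.sha256(payload).hexdigest() != p["hash"]:
            return False
        prev = p["hash"]
    return True
```

Modifying any field in any packet, or reordering packets, makes `verify_chain` return False, which is the property "immutable audit trail" actually requires.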

4. Multi-agent scope isolation

Multi-agent systems add compliance risk because one compromised or over-permissioned agent can influence another. Scope isolation keeps agents inside defined information and action domains.

Practical controls include:

  • per-agent identities
  • separate credential scopes per delegated task
  • authenticated inter-agent messages
  • maximum delegation depth
  • tenant and data-domain boundaries
  • policy checks on handoffs
  • provenance for shared context and memory
  • circuit breakers for runaway workflows

This prevents a research agent, support agent, coding agent, or finance agent from silently crossing into another team's authorization boundary.
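Two of those controls, maximum delegation depth and tenant boundaries, reduce to a small guard on every handoff. The depth limit and tenant labels below are hypothetical:

```python
# Sketch of a handoff guard for multi-agent workflows.
# The depth limit and tenant mapping are illustrative assumptions.
MAX_DELEGATION_DEPTH = 3

def check_handoff(chain: list, next_agent: str, tenants: dict) -> None:
    """Raise if a handoff would exceed delegation depth or cross tenants.

    `chain` is the list of agents already in the delegation path;
    `tenants` maps agent IDs to their tenant or data domain.
    """
    if len(chain) >= MAX_DELEGATION_DEPTH:
        raise PermissionError("delegation depth exceeded")
    if chain and tenants[next_agent] != tenants[chain[0]]:
        raise PermissionError("cross-tenant handoff blocked")
```

Raising instead of returning a flag makes the failure loud: a runaway delegation loop stops at the boundary rather than draining budget or leaking context.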

Where most organizations are failing

The common failure pattern is not a lack of policies. It is a lack of enforceable controls in the execution path.

Agents are not in the control catalog

Many SOC 2, ISO 27001, PCI DSS, and internal control catalogs still assume human users, applications, and infrastructure services. Agents fall between categories. If an auditor asks which agents can export customer data, the answer is often manual discovery.

Agent identity is still weak

Gravitee found that only 21.9% of respondents treat AI agents as independent identity-bearing entities. Agent-to-agent authentication still relies heavily on API keys and generic tokens, while stronger methods such as mTLS are much less common.

Observability is partial

Gravitee also reports that only 47.1% of an organization's agents are actively monitored or secured on average, and only 3.9% of organizations monitor and secure more than 80% of their agents. Compliance reviews cannot rely on periodic samples when agents execute continuously.

Prompt controls are mistaken for policy controls

Prompt instructions can influence behavior, but they do not enforce an authorization boundary. The March 2026 arXiv paper "Runtime Governance for AI Agents: Policies on Paths" formalizes the issue: path-dependent agent behavior cannot be fully governed at design time, and prompt instructions or static access controls are special cases, not substitutes for runtime evaluation.

Building a compliant agent stack: practical priorities

Security and compliance teams should start with the actions that create real blast radius.

Establish human accountability

Every agent should map to an accountable human owner or team. This does not mean every action needs manual approval. It means the organization can explain who authorized the agent's scope, who owns its policy, and who reviews exceptions.

Put runtime policy between agents and resources

Route sensitive tool calls, credential requests, MCP access, SaaS writes, exports, external sends, code merges, deletes, and permission changes through a policy decision point before execution.
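That routing can be as simple as a wrapper that consults the decision point before executing. Here `decide` stands in for any policy decision point; the action names are illustrative:

```python
# Sketch of routing sensitive tool calls through a policy decision point.
# The sensitive-action set and the `decide` callable are assumptions.
SENSITIVE_ACTIONS = {"export", "delete", "send_external", "merge", "grant"}

def guarded_call(decide, tool_fn, action: str, **kwargs):
    """Ask the decision point first; execute only on an allow verdict."""
    if action in SENSITIVE_ACTIONS:
        verdict = decide(action=action, args=kwargs)
        if verdict != "allow":
            raise PermissionError(f"{action} blocked: {verdict}")
    return tool_fn(**kwargs)
```

The important property is placement: the check sits in the execution path, so an agent cannot reach the tool without passing policy, no matter what its prompt says.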

Separate agent identity from human identity

Agents should have their own identity records and should act through delegated user context when appropriate. This lets teams revoke one agent, inspect one agent's actions, and bind actions to the user or organization that delegated them.

Replace broad credentials with scoped runtime credentials

Long-lived API keys create standing access. Runtime-scoped credentials reduce blast radius and force a policy decision at the moment of action.
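A minimal sketch of minting and verifying such a credential, using an HMAC over the scope and expiry. The token format and key handling are illustrative; real deployments would use an established token standard:

```python
import hashlib
import hmac
import secrets
import time

SIGNING_KEY = secrets.token_bytes(32)  # illustrative; real keys need rotation

def mint(scope: str, ttl: int = 300) -> str:
    """Issue a token bound to one scope that expires after `ttl` seconds."""
    exp = str(int(time.time()) + ttl)
    sig = hmac.new(SIGNING_KEY, f"{scope}|{exp}".encode(), hashlib.sha256).hexdigest()
    return f"{scope}|{exp}|{sig}"

def verify(token: str, required_scope: str) -> bool:
    """Accept only an unexpired token whose scope matches exactly."""
    scope, exp, sig = token.split("|")
    expected = hmac.new(SIGNING_KEY, f"{scope}|{exp}".encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return False
    return scope == required_scope and int(exp) > time.time()
```

Because expiry is baked into the signed token, revocation becomes a bounded problem: even a leaked credential dies within its TTL.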

Build audit packets, not just logs

Compliance evidence should be structured around the action: actor, delegator, tool, resource, action, scope, policy, decision, approval, result, and retention state. Raw logs are useful, but audit packets are easier to defend.

Test agent compliance failures directly

Red-team scenarios should include prompt injection, tool misuse, goal hijack, bulk export, external send, permission change, delegated agent confusion, cross-tenant data leakage, and memory poisoning. The test should ask whether the runtime blocked the action, not only whether the model generated a safe answer.

How Kontext helps security teams prove agent compliance

Kontext does not replace legal review, GRC workflows, cloud security, or model evaluation. It provides the missing enforcement point for agent actions.

In a Kontext-backed architecture:

  1. An agent requests access to a tool, MCP server, SaaS integration, API, or dataset.
  2. Kontext evaluates the request using user, organization, agent, session, tool, resource, action, scope, and policy context.
  3. The decision can allow, deny, narrow, or require approval.
  4. If allowed, the agent receives a short-lived, scoped credential for the approved operation.
  5. Kontext logs the decision and credential scope for audit and incident response.

That turns compliance from a static statement into runtime evidence. Security teams can show how least privilege was enforced, which user delegated the action, which policy applied, and what happened when the agent attempted something outside scope.

Frequently asked questions

Will regulators accept AI-generated compliance evidence?

They may accept AI-assisted evidence when the organization can show provenance, review responsibility, and control operation. The key is not that a human performed every step. The key is that a human-accountable system authorized the agent, constrained its scope, and retained evidence showing how the output or action was produced.

Does the EU AI Act apply to internal AI agents?

Possibly. The EU AI Act depends on role, use case, risk category, and system function, not only whether the system is customer-facing. An internal agent that affects hiring, credit, regulated reporting, critical infrastructure, or other high-risk areas may create obligations even if customers never see it directly.

What is the minimum viable compliance architecture for AI agents?

The minimum viable architecture is unique agent identity, accountable ownership, task-scoped access, runtime authorization before sensitive actions, short-lived credentials, approval gates for high-impact operations, and audit trails that record every delegation, policy decision, and tool result.

Is prompt-level access control sufficient for compliance?

No. Prompt rules can shape behavior, but they do not evaluate the full execution path or enforce least privilege at the action boundary. Compliance for agents requires runtime checks before tool calls, credential issuance, exports, sends, deletes, and permission changes.

How should organizations handle multi-agent pipelines?

Each agent in the pipeline needs its own identity, scope constraints, and audit trail segment. The orchestrator also needs policy checks on delegation, authenticated inter-agent communication, and scope isolation so one agent cannot pull another outside its authorization boundary.
