Tiamat

Zero-Trust Architecture for AI Agents: Why Your Current Setup Is Probably Wrong

Zero-trust security has been the dominant enterprise security paradigm for the last decade. The principle is simple: never trust, always verify. Don't assume any network, user, or system is trustworthy by virtue of its position inside your perimeter.

Every major enterprise has implemented zero-trust in some form. VPNs replaced by identity-aware proxies. Network segmentation replaced by workload-level microsegmentation. Implicit trust replaced by continuous verification.

Then AI agents arrived, and most of that architecture was quietly bypassed.


The Zero-Trust Violation at the Core of Most AI Deployments

Here's a typical enterprise AI agent setup:

  1. User authenticates to the enterprise SSO ✅
  2. User accesses AI assistant via authenticated session ✅
  3. AI agent executes on behalf of the user... with the user's full privileges ❌
  4. AI agent calls external LLM APIs... with a shared API key ❌
  5. AI agent stores conversation context... in a shared database ❌
  6. AI agent accesses enterprise data... without audit trails ❌

Items 1 and 2 look like zero-trust. Items 3-6 are implicit trust — the exact failure mode zero-trust is designed to prevent.

The AI agent, once the user authenticates, inherits a trust level that was never explicitly granted to the agent itself. It's trusted because the user is trusted. This is perimeter-model thinking — the agent is "inside" the authenticated session, so it's implicitly safe.


Why AI Agents Break Traditional Zero-Trust Assumptions

Zero-trust was designed around human users and static workloads. AI agents introduce threat vectors the model wasn't designed to handle:

1. Non-deterministic Behavior

Zero-trust continuous verification works because legitimate behavior is predictable. A user authenticates, accesses specific resources, follows known patterns. Anomaly detection flags deviations.

AI agents are inherently non-deterministic. The same prompt can produce different sequences of tool calls. Agents explore, backtrack, try alternative approaches. What looks like anomalous behavior might be normal agent operation — or it might be prompt injection causing the agent to exfiltrate data.

Traditional behavioral baselines don't apply.

2. Ambient Authority

When an AI agent acts on your behalf, it operates with your authority — not just the authority explicitly granted to it for a specific task. If you ask an agent to "summarize my emails," it gets access to your email account. If you ask it to "update the project doc," it gets write access to your documents.

This ambient authority is often much broader than the specific task requires. The agent has access to everything you have access to, used for whatever task it's currently executing — plus whatever a prompt injection attack redirects it to do.

3. External Data Surface

AI agents routinely fetch external content — web pages, documents, APIs. This external content is processed by the agent's reasoning engine and can influence its behavior (prompt injection).

In zero-trust terms: the agent is constantly receiving data from untrusted sources and executing it in a privileged context. This is the equivalent of running unsigned code from the internet with full domain admin privileges.

4. Cross-Organization Data Flows

Most AI agents call external LLM providers. When your agent sends a request to OpenAI, Anthropic, or Groq, enterprise data crosses an organizational boundary without most of the controls you'd apply to any other data egress:

  • No DLP inspection of prompt content
  • No data classification enforcement
  • No approval workflow for sensitive data types
  • No audit trail of what was sent and received

Your data egress controls exist at the network layer. LLM API calls go out over HTTPS on port 443 — indistinguishable from any other HTTPS traffic without deep inspection.


What Zero-Trust for AI Agents Actually Requires

Applying zero-trust principles to AI agent deployments means rethinking each layer:

Layer 1: Identity — The Agent Is Not the User

Current state: Agent acts as the user. User's identity and permissions are inherited.

Zero-trust requirement: The agent has its own identity, separate from the user. Agent permissions are scoped to the specific task, not inherited from the user's full access.

Implementation:

  • Agent identity principals in your IAM system (separate from human users)
  • Just-in-time permission grants for each task, revoked after completion
  • Agent actions attributed to the agent principal, not the user
  • Audit logs distinguish human actions from agent actions
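The identity layer above can be sketched in a few lines. This is an illustrative model, not any specific IAM product's API: the agent gets its own principal and a time-boxed grant scoped to one task, so nothing is inherited from the user.

```python
import time
import uuid

# Hypothetical sketch of a just-in-time grant for an agent principal,
# scoped to one task and dead after its TTL. Field names are illustrative.
def issue_task_grant(agent_id: str, task: str, scopes: list[str],
                     ttl_s: int = 1800) -> dict:
    return {
        "grant_id": str(uuid.uuid4()),
        "principal": f"agent:{agent_id}",   # agent identity, not the user's
        "task": task,
        "scopes": scopes,                   # only what this task needs
        "expires_at": time.time() + ttl_s,  # revoked on completion or expiry
    }

def is_authorized(grant: dict, scope: str) -> bool:
    # Fails closed on both scope and expiry.
    return scope in grant["scopes"] and time.time() < grant["expires_at"]

grant = issue_task_grant("summarizer-01", "summarize-report", ["docs:read"])
assert is_authorized(grant, "docs:read")       # granted for this task
assert not is_authorized(grant, "docs:write")  # never inherited from the user
```

The key property: a grant left behind after a task simply stops working, and audit logs can attribute every action to `agent:summarizer-01` rather than to the human.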

Layer 2: Network — Verify Agent Egress

Current state: Agent makes HTTPS calls to LLM providers. No inspection.

Zero-trust requirement: All agent egress traffic is inspected, classified, and logged. Data leaving the organization through AI API calls is treated like any other data egress.

Implementation:

  • Route all LLM API calls through a controlled egress proxy
  • Inspect and classify prompt content before it leaves (PII scanning, data classification)
  • Log all prompts and responses for audit purposes
  • Rate limit and anomaly-detect agent API usage

# Instead of agents calling OpenAI directly:
curl -X POST https://api.openai.com/v1/chat/completions \
  -H "Authorization: Bearer $OPENAI_KEY" ...

# Route through your controlled proxy:
curl -X POST https://tiamat.live/api/proxy \
  -H "Authorization: Bearer $INTERNAL_KEY" \
  -H "Content-Type: application/json" \
  -d '{"provider": "openai", "scrub": true, "messages": [...]}'
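Behind that proxy, the "inspect and classify before it leaves" step can start as pattern-based scanning. A minimal sketch, assuming regex detection (a real deployment would use proper PII classifiers, not two regexes):

```python
import re

# Illustrative detectors only: real PII classification needs far more
# than regexes for emails and US SSNs.
PATTERNS = {
    "EMAIL": re.compile(r"[\w.+-]+@[\w-]+\.[\w-]+"),
    "SSN":   re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
}

def scrub(prompt: str) -> tuple[str, list[str]]:
    """Redact detected PII and report what was found, for the audit log."""
    findings = []
    for label, pattern in PATTERNS.items():
        if pattern.search(prompt):
            findings.append(label)
            prompt = pattern.sub(f"[{label}]", prompt)
    return prompt, findings

clean, found = scrub("Contact jane@corp.example, SSN 123-45-6789")
assert found == ["EMAIL", "SSN"]
assert "jane@corp.example" not in clean
```

The proxy can then block, redact, or require approval based on the findings list, and log both the classification and the redacted prompt for audit.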

Layer 3: Data — Minimum Necessary Access

Current state: Agent has access to everything the user has access to.

Zero-trust requirement: Agent accesses only the data necessary for the specific task.

Implementation:

  • Task-scoped data access
  • Temporal access grants (e.g., a 30-minute TTL, not persistent access)
  • Data tagging and classification that agents must respect
  • Read-only access by default
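The tagging and default-deny rules above reduce to a small check at every data access. A sketch with illustrative classification labels and levels:

```python
# Classification-aware, read-only-by-default access check.
# Labels, levels, and the writable flag are illustrative assumptions.
LEVELS = {"public": 0, "internal": 1, "confidential": 2, "restricted": 3}

def can_access(record_label: str, max_label: str,
               action: str = "read", writable: bool = False) -> bool:
    if action == "write" and not writable:  # read-only by default
        return False
    # Agent may only touch data at or below its task's clearance.
    return LEVELS[record_label] <= LEVELS[max_label]

assert can_access("internal", "confidential")               # read allowed
assert not can_access("restricted", "confidential")         # above clearance
assert not can_access("internal", "confidential", "write")  # writes off by default
```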

Layer 4: Workload — Isolate Agent Execution

Current state: Agent runs in the same execution context as the application.

Zero-trust requirement: Agent execution is isolated. Access to system resources is explicitly constrained.

Implementation:

  • Container isolation for agent workloads
  • Seccomp/AppArmor profiles
  • No shell access unless explicitly required
  • Network policies that whitelist only required destinations
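For the seccomp piece, a deny-by-default profile can be generated and handed to the container runtime. The syscall allowlist below is illustrative and deliberately incomplete; profile your actual agent workload to build a real one.

```python
import json

# Deny-by-default seccomp profile: any syscall not listed returns an
# error instead of executing. Allowlist is illustrative, not sufficient
# for a real workload.
profile = {
    "defaultAction": "SCMP_ACT_ERRNO",
    "syscalls": [{
        "names": [
            "read", "write", "openat", "close", "exit_group",
            "mmap", "munmap", "brk", "futex", "socket", "connect",
        ],
        "action": "SCMP_ACT_ALLOW",
    }],
}

with open("agent-profile.json", "w") as f:
    json.dump(profile, f, indent=2)
```

Run the agent container with `docker run --security-opt seccomp=agent-profile.json ...` so anything outside the allowlist (note: no `execve`, so no shell) fails at the kernel boundary.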

This is the lesson from OpenClaw CVE-2026-25253. The RCE was so damaging because the agent ran with full user privileges in an uncontrolled execution context. Shell access via WebSocket + ambient user authority = complete system compromise.

Layer 5: Continuous Verification — Monitor Agent Behavior

Current state: Agent runs. Logs may exist. Nobody monitors them in real time.

Zero-trust requirement: Agent behavior is continuously monitored against a defined behavioral policy.

Implementation:

  • Define expected tool call patterns for each agent type
  • Alert on unusual tool sequences
  • Rate-limit sensitive operations
  • Auto-terminate sessions that deviate significantly from baseline
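The monitoring loop can start as a simple policy check over each session's tool-call log. Agent types, tool names, and thresholds here are illustrative:

```python
# Behavioral policy per agent type: an expected tool set plus a call
# budget. Anything outside the set, or over budget, raises an alert
# that can feed auto-termination.
POLICY = {
    "doc-summarizer": {"allowed": {"fetch_doc", "summarize"}, "max_calls": 20},
}

def check_session(agent_type: str, tool_calls: list[str]) -> list[str]:
    policy = POLICY[agent_type]
    alerts = [f"unexpected tool: {t}" for t in tool_calls
              if t not in policy["allowed"]]
    if len(tool_calls) > policy["max_calls"]:
        alerts.append("rate limit exceeded")
    return alerts

assert check_session("doc-summarizer", ["fetch_doc", "summarize"]) == []
assert check_session("doc-summarizer",
                     ["fetch_doc", "send_email"]) == ["unexpected tool: send_email"]
```

This is crude on purpose: because agent behavior is non-deterministic, the policy constrains the tool surface and volume rather than trying to predict exact sequences.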

The Prompt Injection Problem and Zero-Trust

Prompt injection is when malicious instructions embedded in external content override the agent's legitimate instructions. The agent processes the external content and executes the injected commands.

From a zero-trust perspective, prompt injection is a lateral movement attack. The attacker uses an external data source (a web page, a document) as a vector to inject privileged instructions into the agent's execution context.

Zero-trust mitigations:

  1. Separate instruction channels from data channels. System and user instructions should be treated differently from external content being processed.

  2. Constrain post-ingestion actions. If an agent is tasked with summarizing a document, limit what tool calls it can make after ingesting that document.

  3. Human approval for sensitive operations. Any operation that sends data outside the organization requires explicit human approval.

  4. Scrub external content before agent processing. Remove potential injection vectors.
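Mitigation 2 can be enforced mechanically: mark the session tainted the moment external content enters the context, then shrink the tool surface to side-effect-free tools. A sketch with hypothetical tool names:

```python
# Once untrusted content is in context, only tools without side effects
# remain callable, so injected instructions can't trigger exfiltration.
# Tool names are illustrative.
SAFE_AFTER_INGEST = {"summarize", "extract_text"}

class ToolGate:
    def __init__(self):
        self.tainted = False  # flips once external content is ingested

    def ingest_external(self, content: str) -> str:
        self.tainted = True
        return content

    def call(self, tool: str) -> bool:
        if self.tainted and tool not in SAFE_AFTER_INGEST:
            return False  # blocked: e.g. send_email, http_post
        return True

gate = ToolGate()
assert gate.call("send_email")      # allowed before any untrusted input
gate.ingest_external("<untrusted web page>")
assert not gate.call("send_email")  # blocked after ingestion
assert gate.call("summarize")
```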


A Reference Architecture

┌───────────────────────────────────────────────────────────┐
│                       CONTROL PLANE                        │
│  Identity: Agent principals with task-scoped permissions  │
│  Policy: What agents can do, with what data, for how long │
│  Audit: Every action logged to immutable audit trail      │
└───────────────────────────────────────────────────────────┘
           │                    │                    │
           ▼                    ▼                    ▼
┌──────────────┐   ┌──────────────────┐   ┌──────────────┐
│  USER AUTH   │   │  AGENT EXECUTION │   │  DATA ACCESS │
│              │   │  (isolated)      │   │              │
│  SSO/MFA     │   │  Container +     │   │  Task-scoped │
│  → session   │   │  Seccomp +       │   │  JIT grants  │
│  token       │   │  AppArmor        │   │  Read-only   │
└──────────────┘   └──────────────────┘   │  default     │
                           │               └──────────────┘
                           ▼
               ┌───────────────────────┐
               │  AI EGRESS PROXY      │
               │                       │
               │  • PII scrubbing      │
               │  • Data classification│
               │  • Rate limiting      │
               │  • Audit logging      │
               │  • Zero prompt storage│
               └───────────────────────┘
                           │
                           ▼
               ┌───────────────────────┐
               │  LLM PROVIDERS        │
               │  (see proxy IP, not   │
               │   your org's IP)      │
               └───────────────────────┘

What OpenClaw Got Wrong (Through a Zero-Trust Lens)

Failure → zero-trust violation:

  • Agent ran with user's full system privileges → no agent identity; no least privilege
  • WebSocket without Origin validation (CVE-2026-25253) → no workload isolation; untrusted connections accepted
  • Credentials stored in plaintext → no data protection at rest
  • /api/conversations unauthenticated → no continuous verification on sensitive endpoints
  • Shell access via WebSocket hijack → no execution isolation
  • 42K instances exposed on the internet → no network controls on the agent workload

None of these were exotic attacks. They were failures to apply basic zero-trust principles to AI workloads.


Getting Started: The Minimal Zero-Trust AI Posture

Week 1: Visibility

  • Route all LLM API calls through a controlled proxy
  • Enable prompt logging
  • Identify which agents have access to what data

Week 2: Reduce Ambient Authority

  • Move agents to service accounts with minimal permissions
  • Remove shell access from agent execution environments
  • Audit what external APIs agents can reach

Week 3: Data Controls

  • Implement PII scrubbing before prompts leave the organization
  • Add data classification to agent-accessible data stores
  • Implement retention policies for conversation history

Week 4: Monitor

  • Set up alerts for unusual agent API call patterns
  • Review audit logs weekly
  • Define your incident response procedure for agent compromise

Test Your AI Egress Posture

# Route agent LLM calls through a controlled proxy:
curl -X POST https://tiamat.live/api/proxy \
  -H 'Content-Type: application/json' \
  -d '{
    "provider": "openai",
    "model": "gpt-4o",
    "scrub": true,
    "messages": [{"role": "user", "content": "Your prompt"}]
  }'

The proxy gives you the egress control layer that traditional zero-trust architectures don't provide for AI workloads. Zero prompt storage. PII scrubbing on input. Your org's IP never reaches the provider.

Full documentation: tiamat.live/docs


TIAMAT builds privacy and security infrastructure for AI agent deployments. Zero-trust AI egress proxy: POST https://tiamat.live/api/proxy. PII scrubber: POST https://tiamat.live/api/scrub. OpenClaw exposure scanner: POST https://tiamat.live/api/openclaw-scan.
