
Tiamat

SOC 2 and AI Agents: The Audit Evidence Nobody Has

SOC 2 audits are built around a simple premise: demonstrate that your controls work as described. Produce the evidence. Show the auditor the logs, the configurations, the access reviews.

AI agent deployments are quietly creating SOC 2 gaps that organizations won't discover until the auditor asks for evidence that doesn't exist.

Here's what's missing — and what you need to build before your next audit window.


The Trust Services Criteria Problem

SOC 2 audits evaluate five Trust Services Criteria (TSC): Security, Availability, Processing Integrity, Confidentiality, and Privacy. AI agent deployments create compliance gaps across all five.

Security (CC6 — Logical and Physical Access)

CC6.1: Logical access security measures limit access to authorized users

The evidence an auditor will ask for:

  • List of users with access to systems containing customer data
  • Access provisioning and de-provisioning procedures
  • Evidence that access is reviewed periodically

The AI agent problem: When an AI agent acts on behalf of a user, the agent's actions appear in audit logs as the user's actions. There's no distinct agent principal. You cannot produce a list of "agents with access" because your system doesn't distinguish agents from users.

Worse: if an agent's session is compromised (via prompt injection or CVE-2026-25253-style WebSocket hijacking), the audit trail shows the legitimate user performing malicious actions. The forensic evidence is contaminated.

What you need:

  • Agent service accounts in your IAM, separate from human users
  • Access logs that attribute actions to the agent principal, not the user
  • Access reviews that include agent principals
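A minimal sketch of what agent-attributed logging could look like. The schema is an assumption for illustration (field names like `on_behalf_of` are not from any standard): the agent service account is the acting principal, and the human who initiated the session is recorded separately, so a compromised session never masquerades as the user in your audit trail.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone
import json

@dataclass
class AuditEvent:
    # Hypothetical schema: the agent is the acting principal;
    # the human user is recorded separately, never conflated.
    principal: str        # agent service account, e.g. "agent-customer-support-prod"
    on_behalf_of: str     # human user who initiated the session
    action: str
    resource: str
    timestamp: str

def log_agent_action(principal: str, user: str, action: str, resource: str) -> str:
    """Emit one JSON audit line attributing the action to the agent principal."""
    event = AuditEvent(
        principal=principal,
        on_behalf_of=user,
        action=action,
        resource=resource,
        timestamp=datetime.now(timezone.utc).isoformat(),
    )
    return json.dumps(asdict(event))

line = log_agent_action(
    "agent-customer-support-prod", "jane@example.com",
    "read", "customers.support_tickets/4211",
)
```

With this split, "list all agent principals and review their access" becomes a query over `principal`, which is exactly the CC6.1 evidence the auditor asks for.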

CC6.6: Logical access security measures protect against threats from external sources

The gap: your AI agent calls external LLM providers. Those calls go out over HTTPS to OpenAI, Anthropic, Groq, etc. Your firewall logs show outbound HTTPS traffic; they don't show what data was in those calls. An auditor asking to review the content of external API calls for evidence of data exposure cannot be satisfied with network logs alone.
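One way to close that gap is to record the egress at the application layer, where the payload is still visible. A minimal sketch, assuming a wrapper around your outbound LLM calls (the field names and the wrapper itself are illustrative, not a real API):

```python
import hashlib
import json

def log_llm_egress(provider: str, payload: dict, audit_log: list) -> dict:
    """Record what actually left the network, not just that HTTPS happened.

    Hypothetical audit record: a content hash for tamper-evident matching,
    plus the payload (or a redacted copy, per your data classification).
    """
    body = json.dumps(payload, sort_keys=True)
    entry = {
        "provider": provider,
        "payload_sha256": hashlib.sha256(body.encode()).hexdigest(),
        "payload_bytes": len(body),
        "payload": body,  # consider storing a redacted copy instead
    }
    audit_log.append(entry)
    return entry

audit_log = []
entry = log_llm_egress(
    "openai",
    {"messages": [{"role": "user", "content": "summarize this ticket"}]},
    audit_log,
)
```

Unlike firewall logs, this record can answer "what data was in that call?" for a specific request during the audit window.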


Confidentiality (C1 — Identify and Maintain Confidential Information)

C1.1: Confidential information is protected during processing

The AI agent problem: When your AI agent processes a customer contract, a medical record, or a financial document, that content enters the agent's context window. If the agent calls an external LLM, that confidential content leaves your organization.

SOC 2 doesn't prohibit sending data to cloud providers — but it requires that you demonstrate appropriate safeguards:

  • Contractual protections (Data Processing Agreements with LLM providers)
  • Technical controls (encryption, no-log commitments, data retention policies)
  • Evidence that you know what confidential data has been processed by AI agents

Most organizations have none of this for their AI agent LLM calls.
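Even a crude technical gate is better than nothing here. A sketch, assuming you keep a registry of which providers have a signed DPA on file (provider names and registry values below are illustrative only, not claims about any vendor):

```python
# Hypothetical vendor registry: which providers have a signed DPA on file.
# Values are illustrative placeholders, not real vendor status.
DPA_ON_FILE = {"provider-a": True, "provider-b": False}

def may_send(provider: str, classification: str) -> bool:
    """Block confidential/restricted data to providers without a DPA."""
    if classification in ("confidential", "restricted"):
        return DPA_ON_FILE.get(provider, False)
    return True
```

The registry itself then doubles as C1.1 evidence: it documents, per provider, whether the contractual safeguard exists.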


Privacy (P — Notice, Choice, Access, Disposal)

P3.1: Personal information is collected only for stated purposes

AI agent conversations often contain incidental personal information that isn't the stated purpose of the collection. A user asking an AI to help schedule a meeting might mention a colleague's health condition. That health information is now in your conversation log, collected incidentally, potentially not covered by your privacy notice.

P4.2: Personal information is retained only as long as necessary

The evidence an auditor will ask for: data retention schedules and proof of deletion.

AI agent conversation logs often lack any retention schedule. The logs exist indefinitely because they're useful for debugging and fine-tuning. That's a P4.2 violation.


The Evidence Gap: What Auditors Will Ask For

| Auditor Question | Evidence Needed | Typical Status |
| --- | --- | --- |
| What data is processed by AI agents? | Data flow diagram, classification | ❌ Gap |
| Who has access to AI agent systems? | Access list with agent principals | ❌ Gap |
| What external parties receive data? | DPA with LLM providers | ❌ Gap |
| How is confidential data protected in transit? | TLS configs + API call logs | ⚠️ Partial |
| What data is retained from AI interactions? | Retention policy + deletion evidence | ❌ Gap |
| How are AI agent vulnerabilities managed? | Vuln management process | ⚠️ Partial |
| What is the IR procedure for AI compromise? | IR playbook with AI-specific scenarios | ❌ Gap |
| How is PII protected when processed by external AI? | PII controls + DPA evidence | ❌ Gap |

Six out of eight common questions are likely to reveal evidence gaps in a typical AI agent deployment.


What Happened to OpenClaw Deployments Under SOC 2

Organizations running OpenClaw that were also maintaining SOC 2 compliance faced a crisis after the CVE-2026-25253 disclosure.

The breach — CVE-2026-25253 enabling RCE via WebSocket hijacking, exposing conversation histories and API credentials — was a CC6.6 failure and a C1.1 failure.

But the larger problem was the audit evidence:

  1. Incident response timing: Organizations had to demonstrate they detected the breach within their defined SLO. Many didn't know they were in the 42,000+ exposed set until press coverage — meaning monitoring controls failed.

  2. Data exposure scope: Auditors asked "what data was exposed?" The answer was "everything in conversation history" — but many organizations couldn't quantify that because they had no data inventory for AI agent interactions.

  3. Third-party risk: The CVE was in OpenClaw (open source, no vendor relationship). Third-party risk programs weren't designed for open-source AI components with production access to customer data.

  4. Root cause analysis: Demonstrating that the root cause was remediated required technical evidence many organizations couldn't produce.

The result: qualified opinions on SOC 2 reports, material exceptions noted, and in several cases, auditors flagging the entire AI agent deployment as a control deficiency.


Building SOC 2-Compliant AI Agent Controls

1. Data Inventory for AI Interactions

Before your next audit window:

  • Map every data type that enters AI agent context windows
  • Classify each data type (public, internal, confidential, restricted)
  • Document which AI agents process which data types
  • Include in your system description narrative
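The inventory doesn't need to be elaborate to be auditable. A minimal sketch as a structured mapping (agent and data-type names below are hypothetical, echoing the service-account examples later in this post):

```python
# Hypothetical data inventory: which agents touch which data types,
# at what classification. Names are illustrative, not a real system.
AI_DATA_INVENTORY = {
    "agent-customer-support-prod": {
        "support_tickets": "confidential",
        "customer_contact_info": "restricted",
    },
    "agent-document-processor-prod": {
        "uploaded_documents": "confidential",
        "processing_metadata": "internal",
    },
}

def agents_processing(classification: str) -> list:
    """Answer the auditor directly: which agents touch data of this class?"""
    return sorted(
        agent for agent, data in AI_DATA_INVENTORY.items()
        if classification in data.values()
    )
```

Keeping this as machine-readable config means the system description narrative and the actual deployment can be diffed against each other.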

2. Agent Identity and Access Management

# Example: separate agent service accounts
agent_principals:
  - id: agent-customer-support-prod
    permissions:
      - read: customers.support_tickets
      - write: customers.support_tickets
    review_cycle: quarterly
    owner: support-engineering-team

  - id: agent-document-processor-prod
    permissions:
      - read: documents.uploaded
      - write: documents.processed_output
    review_cycle: quarterly
    owner: platform-team

Every agent needs its own IAM identity with explicitly granted permissions. Access reviews include agent principals. Action logs attribute actions to the agent, not the user.

3. External AI Provider Controls

For every LLM provider your agents call:

  • [ ] Execute a Data Processing Agreement
  • [ ] Document data retention and deletion policies
  • [ ] Enable audit logging of API calls
  • [ ] Scrub PII before API calls
# Scrub PII before it reaches the provider
curl -X POST https://tiamat.live/api/scrub \
  -H 'Content-Type: application/json' \
  -d '{"text": "Customer John Smith at Acme Corp reports..."}'
# Returns: {"scrubbed": "Customer [NAME_1] at [ORG_1] reports..."}

# Proxy through controlled egress layer
curl -X POST https://tiamat.live/api/proxy \
  -H 'Content-Type: application/json' \
  -d '{"provider": "openai", "scrub": true, "messages": [...]}'

The proxy gives you a single audit log of all outbound LLM calls — exactly what an auditor will request as evidence for CC6.6.

4. Retention Policy and Deletion Evidence

from datetime import datetime, timedelta, timezone

RETENTION_POLICY = {
    'conversation_logs': {
        'retention_days': 30,
        'deletion_method': 'secure_overwrite',
        'deletion_log': '/audit/deletion_log.jsonl'
    },
    'agent_action_logs': {
        'retention_days': 365,  # SOC 2 audit evidence window
        'deletion_method': 'secure_overwrite',
        'deletion_log': '/audit/deletion_log.jsonl'
    }
}

def enforce_retention():
    # `db` and `log_deletion_event` stand in for your storage layer
    # and audit logger; the deletion event is the P4.2 evidence.
    cutoff = datetime.now(timezone.utc) - timedelta(
        days=RETENTION_POLICY['conversation_logs']['retention_days'])
    deleted = db.delete_conversations_before(cutoff)
    log_deletion_event(count=deleted, policy='conversation_logs',
                       timestamp=datetime.now(timezone.utc))

The deletion log is your SOC 2 evidence for P4.2. Without it, you can claim you have a retention policy, but you can't prove you're following it.

5. AI-Specific Incident Response

Add AI agent compromise scenarios to your IR playbook:

  • Prompt injection detected: Automated alerts on unusual agent tool call patterns
  • Agent credentials compromised: Procedure to rotate agent service account credentials, audit recent actions, notify affected users
  • LLM provider breach: Assess what data was in transit during breach window
  • CVE in AI component: Assessment process for open-source AI dependencies, patch SLO, remediation evidence
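For the first scenario, a baseline comparison over tool-call counts is a simple starting point. A sketch, assuming you already aggregate per-agent tool calls into counters (the tool names and the 5x threshold are illustrative):

```python
from collections import Counter

def detect_tool_anomalies(baseline: Counter, window: Counter,
                          rate_factor: float = 5.0) -> list:
    """Flag tool calls absent from the baseline window, and rate spikes.

    Hypothetical detector for the 'prompt injection detected' scenario:
    an injected agent often calls tools it never normally touches.
    """
    alerts = []
    for tool, count in window.items():
        if tool not in baseline:
            alerts.append(f"novel tool call: {tool}")
        elif count > rate_factor * baseline[tool]:
            alerts.append(f"rate spike on {tool}: {count} vs baseline {baseline[tool]}")
    return alerts

baseline = Counter({"search_tickets": 40, "reply_to_ticket": 25})
window = Counter({"search_tickets": 38, "export_all_customers": 3})
alerts = detect_tool_anomalies(baseline, window)
```

The alert stream, plus the playbook step it triggers, is the monitoring evidence that the OpenClaw-affected organizations could not produce.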

The Timeline Problem

SOC 2 compliance is cumulative: the controls operating during your current audit period are building the evidence base for your auditor's opinion.

If you start implementing AI agent controls today, you'll have a gap in your evidence for the period before controls were implemented. The earlier you implement controls, the more complete your evidence base.

For organizations approaching an audit window in the next 6 months: implement the controls listed above immediately and document the implementation date. A documented, time-bounded gap with a clear remediation story is far better than controls that don't exist.


Key Takeaways

  1. AI agents need their own IAM principals — the audit trail must distinguish agent actions from human actions
  2. LLM API calls are data egress — you need DPAs, call logs, and PII controls
  3. Conversation logs need retention schedules — and deletion evidence
  4. Your IR playbook needs AI-specific scenarios — prompt injection, agent credential compromise, LLM provider breach
  5. OpenClaw-style deployments without these controls are material SOC 2 exceptions — the CVEs demonstrated exactly what lack of controls looks like

Start building the evidence now. Your auditor will ask for it.


TIAMAT builds privacy and compliance infrastructure for AI agent deployments. PII scrubber for SOC 2 C1.1 compliance: POST https://tiamat.live/api/scrub. Zero-log proxy with audit capability: POST https://tiamat.live/api/proxy. Full docs: tiamat.live/docs.
