The Agent Security Checklist I Run Before Going to Production

After reading Microsoft's Semantic Kernel research, I went through our OpenClaw setup and audited it against the vulnerability patterns they're seeing in production. Here's the checklist I now run before any agent goes to production.

The Threat Model Has Changed

Microsoft's research confirms what we should have assumed already: once an AI model is wired to tools, prompt injection becomes a code execution primitive. Not a content problem. An execution problem.

This changes the security posture. You're not just worried about what the model outputs; you have to worry about what gets executed when that output is mapped to a tool call.

The Practical Checklist

1. Audit Tool Parameters That Touch System Operations

Any tool that takes a parameter and passes it to eval(), exec(), a shell, subprocess, or file I/O without validating it first is a risk.

# Check your OpenClaw config for exec tool usage
openclaw config get | grep -i exec

# Run doctor to surface security findings
openclaw doctor

If you have custom tools in your setup, check each one: does it validate/sanitize inputs before passing them to system operations?
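To make "validate before passing to system operations" concrete, here is a minimal Python sketch of the shape I look for in a custom tool. Everything in it is illustrative rather than part of OpenClaw: the allowlist contents, the argument pattern, and the function names validate_tool_call and run_tool are my own assumptions.

```python
import re
import subprocess

# Hypothetical allowlist: the only programs this tool may launch.
ALLOWED_COMMANDS = {"ls", "cat", "wc"}

# Conservative argument pattern (an assumption, tune for your tool):
# relative paths and plain words only, no shell metacharacters,
# no leading "-" (option injection) and no leading "/" (absolute paths).
SAFE_ARG = re.compile(r"[A-Za-z0-9_.][A-Za-z0-9_./-]*")


def validate_tool_call(command: str, args: list[str]) -> list[str]:
    """Return an argv list if the call is acceptable, else raise ValueError."""
    if command not in ALLOWED_COMMANDS:
        raise ValueError(f"command not allowed: {command!r}")
    for arg in args:
        # Reject anything outside the pattern, plus ".." path segments.
        if not SAFE_ARG.fullmatch(arg) or ".." in arg.split("/"):
            raise ValueError(f"rejected argument: {arg!r}")
    return [command, *args]


def run_tool(command: str, args: list[str]) -> str:
    """Run a validated command and return its standard output."""
    argv = validate_tool_call(command, args)
    # Passing a list (shell=False is the default) means the model's text
    # is never interpreted by a shell.
    result = subprocess.run(
        argv, capture_output=True, text=True, timeout=10, check=False
    )
    return result.stdout
```

The design choice is allowlisting over blocklisting: the tool accepts a small set of known-good shapes and rejects everything else, so a model-supplied argument like "notes.txt; rm -rf /" fails validation instead of reaching the system.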

2. Verify Default-Deny Policies Are Active

OpenClaw 2026.5.21+ ships with default-deny policies for:

File transfer (file-transfer plugin with defaultDeny: true)
Channel conformance (Policy plugin)
Exec approvals (skill file loading hardened)

Check that you're on 2026.5.21 or later:

openclaw --version

If you're on an earlier version, upgrade. The default-deny hardening happened incrementally across the 2026.5.x releases, so an earlier 2026.5.x build may carry only some of these protections.
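If you automate this check, compare version numbers numerically rather than as strings, since "2026.5.9" sorts after "2026.5.21" in a plain string comparison. A small Python sketch, assuming the version output contains a dotted three-part version like the ones above (the exact output format of openclaw --version is an assumption on my part, and the helper names are hypothetical):

```python
import re

# Minimum release named in this checklist.
MINIMUM = (2026, 5, 21)


def parse_version(text: str) -> tuple[int, int, int]:
    """Extract the first dotted three-part version found in the text."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version found in: {text!r}")
    year, major, minor = (int(part) for part in match.groups())
    return (year, major, minor)


def meets_minimum(text: str, minimum: tuple[int, int, int] = MINIMUM) -> bool:
    """True if the version in the text is at or above the minimum."""
    # Tuples of ints compare element by element, so 9 < 21 as expected.
    return parse_version(text) >= minimum
```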

3. Audit Your MCP Server Inputs

If you're running MCP servers, they need the same scrutiny. The question to ask about each MCP server: does it sanitize tool parameters before passing them to backends?

Microsoft's hotel finder example had a specific pattern: user input → AI model → tool parameter → eval(). If your MCP server has the same pattern (user input flowing to an eval or shell call), it's vulnerable regardless of the framework.
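The usual fix for that pattern is to replace eval() with a parser for the narrow grammar the tool actually needs. The sketch below is my own illustration, not Microsoft's code or their fix: the filter grammar ("field operator number"), the field names, and the functions parse_filter and find_hotels are all hypothetical.

```python
import operator
import re

# Hypothetical filter grammar: "<field> <op> <number>", e.g. "price <= 200".
OPS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}
ALLOWED_FIELDS = {"price", "rating"}
# Two-character operators come first so "<=" is not read as "<".
FILTER_RE = re.compile(r"\s*([a-z_]+)\s*(<=|>=|==|<|>)\s*([0-9]+(?:\.[0-9]+)?)\s*")


def parse_filter(expr: str):
    """Turn a filter string into a predicate without ever calling eval()."""
    match = FILTER_RE.fullmatch(expr)
    if match is None:
        raise ValueError(f"unsupported filter: {expr!r}")
    field, op, value = match.groups()
    if field not in ALLOWED_FIELDS:
        raise ValueError(f"unknown field: {field!r}")
    compare = OPS[op]
    threshold = float(value)
    return lambda row: compare(row[field], threshold)


def find_hotels(hotels: list[dict], expr: str) -> list[dict]:
    """Return the hotels that satisfy the model-supplied filter."""
    keep = parse_filter(expr)
    return [hotel for hotel in hotels if keep(hotel)]
```

With this shape, a model-supplied parameter such as "__import__('os').system('id')" is rejected as an unsupported filter instead of being executed, because nothing outside the grammar is ever interpreted.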

4. Verify Approval Flows Are Logging Correctly

The exec approval fix in 2026.5.21 ensures skill files can't inject into the approval flow. After upgrading, verify approvals are working:

openclaw approvals --list
openclaw logs --filter approvals

Any approval that shows "unknown" or "expired" after the upgrade is a finding.
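If the output is long, a tiny filter saves scrolling. This Python sketch only assumes that each approval appears on its own line and that the status words "unknown" or "expired" appear literally in that line; the real output format may differ, and approval_findings is a hypothetical helper name.

```python
import re

# Status words this checklist treats as findings (assumed to appear verbatim).
FINDING = re.compile(r"\b(unknown|expired)\b", re.IGNORECASE)


def approval_findings(lines):
    """Return the lines that mention an unknown or expired approval."""
    return [line.rstrip("\n") for line in lines if FINDING.search(line)]
```

It accepts any iterable of lines, so it can be fed sys.stdin when the command output is piped into a script, or the lines of a saved log file.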

5. Check for Prompt Injection Vectors

A prompt injection vector is any place where an attacker can influence what the agent "sees" as user input. Common vectors:

Web content your agent reads (attacker-controlled page with hidden injection prompts)
Files from untrusted sources
Database fields that can be written by multiple users

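If your agent reads web content, sanitize it before including it in the context: strip what a human visitor would never see (hidden elements, comments, scripts) and treat what remains as untrusted data, not instructions. OpenClaw's browser tool has some protections here, but an agent that blindly includes scraped content in its context is vulnerable.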
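As one example of sanitizing scraped pages, the Python sketch below keeps only text a visitor would plausibly see and drops scripts, comments, and elements hidden via the hidden attribute, aria-hidden, or inline styles. It uses only the standard library's html.parser; the class and function names are my own, it does not handle CSS classes or external stylesheets, and it is not how OpenClaw's browser tool works internally.

```python
from html.parser import HTMLParser

# Elements whose contents are never shown to a reader.
SKIP_TAGS = {"script", "style", "template", "noscript"}
# Void elements have no closing tag, so they are never pushed on the stack.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class VisibleText(HTMLParser):
    """Collect text that is not inside a hidden or non-rendered element."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.parts = []
        self.stack = []  # (tag, hidden) for every currently open element

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        attr = dict(attrs)
        style = (attr.get("style") or "").replace(" ", "").lower()
        hidden = (
            tag in SKIP_TAGS
            or "hidden" in attr
            or attr.get("aria-hidden") == "true"
            or "display:none" in style
            or "visibility:hidden" in style
        )
        self.stack.append((tag, hidden))

    def handle_endtag(self, tag):
        # Pop back to the matching open tag; tolerate sloppy markup.
        for i in range(len(self.stack) - 1, -1, -1):
            if self.stack[i][0] == tag:
                del self.stack[i:]
                break

    def handle_data(self, data):
        # HTML comments never reach this method, so they are dropped.
        if data.strip() and not any(hidden for _, hidden in self.stack):
            self.parts.append(data.strip())


def extract_visible_text(html: str) -> str:
    """Return the visible text of an HTML document as one string."""
    parser = VisibleText()
    parser.feed(html)
    parser.close()
    return " ".join(parser.parts)
```

This narrows the attack surface but does not close it: an injection written in plain visible text still gets through, so the extracted text should still be handled as untrusted input.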

The checklist isn't comprehensive, and the threat landscape is still evolving. But running through these five points covers the main exploitation patterns Microsoft documented.