Mavericksantander

Why agent safety policies need AND/OR logic (and how we added it to Canopy)

When I shipped Canopy v0.1 last week, the most common question I got was some version of this:

"Can I write a policy that only blocks in production? Because I want the same agent to run freely in staging."

The answer was no. And that was a real problem.


The Limitation in v0.1

Canopy's original policy engine matched on substrings and regex patterns against the action payload. Simple and fast. But the rules had no awareness of context.

This meant you could write:

execute_shell:
  deny_if:
    - "rm -rf"
    - "mkfs"

But you couldn't write:

DENY rm -rf only when env == "production"

The only options were to block everywhere or block nowhere. For teams running the same agent in multiple environments, that's a non-starter.
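To make the limitation concrete, here is a minimal Python sketch of v0.1-style matching. It is not Canopy's actual code: `POLICY` and `is_denied` are hypothetical names, and the only thing taken from the post is that patterns are matched against the payload with no access to context.

```python
import re

# Hypothetical stand-in for a v0.1-style policy: action type -> deny patterns.
POLICY = {"execute_shell": {"deny_if": ["rm -rf", "mkfs"]}}


def is_denied(action_type: str, payload: str) -> bool:
    """Return True if any deny pattern matches the payload.

    Patterns are treated as regular expressions; the plain strings used
    here contain no metacharacters, so they behave like substring checks.
    """
    patterns = POLICY.get(action_type, {}).get("deny_if", [])
    return any(re.search(p, payload) for p in patterns)


# The environment never enters the decision, so staging and production
# get the same answer for the same command.
print(is_denied("execute_shell", "rm -rf /tmp/logs"))  # True
print(is_denied("execute_shell", "ls /tmp/logs"))      # False
```

Nothing in the function signature can carry an environment or a role, which is exactly the gap described above.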


The Real-World Case That Broke It

Here's the scenario that made this concrete. Imagine an agent that manages log cleanup across environments. In staging, rm -rf /tmp/logs is fine — that's exactly what it should do. In production, you want a human to approve that before it runs.

With v0.1 policy syntax:

execute_shell:
  deny_if:
    - "rm -rf"

This blocks everywhere. Useless in staging.

execute_shell:
  require_approval_if:
    - "rm -rf"

This requires approval everywhere. Annoying in staging, but at least safe.

There was no way to express the actual intent: context-aware governance.


What v0.2 Adds: Conditions on agent_ctx and Payload

Canopy v0.2 introduces when_all, when_any, and when_not — composable condition blocks that let you express policies against both the action payload and the agent context.

Here's what the log cleanup scenario looks like now:

execute_shell:
  rules:
    - decision: REQUIRE_APPROVAL
      when_all:
        - 'agent_ctx.env == "production"'
        - 'payload contains "rm -rf"'
      reason: "Destructive shell command requires approval in production"

    - decision: ALLOW
      when_all:
        - 'agent_ctx.env == "staging"'
        - 'payload contains "rm -rf"'
      reason: "Destructive commands allowed in staging"

Staging runs freely. Production requires sign-off. Same agent, same code, different behavior based on context.
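For readers who want to see how condition strings like these could be evaluated, here is a rough Python sketch. The two regular expressions, `check`, and `when_all` are illustrative assumptions that cover only the two condition forms shown in this post; Canopy's real parser may work differently and support more.

```python
import re

# Hypothetical grammar covering only the two condition forms used in this post.
_EQUALS = re.compile(r'^agent_ctx\.(\w+)\s*==\s*"(.*)"$')
_CONTAINS = re.compile(r'^payload contains "(.*)"$')


def check(condition: str, agent_ctx: dict, payload: str) -> bool:
    """Evaluate one condition string against the context and payload."""
    m = _EQUALS.match(condition)
    if m:
        key, expected = m.groups()
        return agent_ctx.get(key) == expected
    m = _CONTAINS.match(condition)
    if m:
        return m.group(1) in payload
    raise ValueError(f"unsupported condition: {condition!r}")


def when_all(conditions, agent_ctx, payload) -> bool:
    """AND: every condition must hold."""
    return all(check(c, agent_ctx, payload) for c in conditions)


RULE = ['agent_ctx.env == "production"', 'payload contains "rm -rf"']
cmd = "rm -rf /tmp/logs"

print(when_all(RULE, {"env": "production"}, cmd))  # True
print(when_all(RULE, {"env": "staging"}, cmd))     # False
```

The same command produces different answers only because `agent_ctx` is now an input to the check.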


The Full Condition Syntax

Three operators, composable:

when_all — all conditions must be true (AND):

when_all:
  - 'agent_ctx.env == "production"'
  - 'payload contains "rm"'

when_any — at least one condition must be true (OR):

when_any:
  - 'payload contains "rm -rf"'
  - 'payload contains "mkfs"'
  - 'payload contains "dd if="'

when_not — negation:

when_not:
  - 'agent_ctx.env == "staging"'

And they can be combined in a single rule:

- decision: DENY
  when_all:
    - 'agent_ctx.env == "production"'
    - 'payload contains "drop table"'
  when_not:
    - 'agent_ctx.role == "db_admin"'
  reason: "Only db_admin can drop tables in production"
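
One way to read a rule that carries several blocks is sketched below in Python. The semantics are assumptions, not something the post or the README excerpt confirms: every block present on the rule must pass, an omitted block imposes no constraint, and `when_not` passes only when none of its conditions is true. Conditions are modeled as plain Python functions, and all names are hypothetical.

```python
from typing import Callable, Sequence

Cond = Callable[[dict, str], bool]  # (agent_ctx, payload) -> bool


def rule_matches(agent_ctx: dict, payload: str,
                 when_all: Sequence[Cond] = (),
                 when_any: Sequence[Cond] = (),
                 when_not: Sequence[Cond] = ()) -> bool:
    """Assumed semantics: every block present on the rule must pass."""
    if not all(c(agent_ctx, payload) for c in when_all):
        return False
    # An omitted when_any block is treated as "no constraint".
    if when_any and not any(c(agent_ctx, payload) for c in when_any):
        return False
    # when_not passes only if none of its conditions is true.
    return not any(c(agent_ctx, payload) for c in when_not)


def in_prod(ctx, payload):
    return ctx.get("env") == "production"


def drops_table(ctx, payload):
    return "drop table" in payload


def is_db_admin(ctx, payload):
    return ctx.get("role") == "db_admin"


def denied(ctx, payload):
    """Mirror of the DENY rule above, expressed with predicates."""
    return rule_matches(ctx, payload,
                        when_all=[in_prod, drops_table],
                        when_not=[is_db_admin])


q = "drop table users"
print(denied({"env": "production", "role": "analyst"}, q))   # True
print(denied({"env": "production", "role": "db_admin"}, q))  # False
print(denied({"env": "staging", "role": "analyst"}, q))      # False
```

If the engine instead treats a multi-item `when_not` as "not all of these", the last line of `rule_matches` is the only place that would change.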

Three Real Policies from the Cookbook

We shipped a POLICY_COOKBOOK.md with v0.2. Here are three examples worth highlighting:

1. External API — allow only your own domain:

call_external_api:
  rules:
    - decision: ALLOW
      when_all:
        - 'payload contains "api.yourcompany.com"'
      reason: "Internal API calls always allowed"

    - decision: REQUIRE_APPROVAL
      when_not:
        - 'payload contains "api.yourcompany.com"'
      reason: "External API calls require approval"

2. CI/CD agent — strict production gates:

execute_shell:
  rules:
    - decision: ALLOW
      when_all:
        - 'agent_ctx.env == "ci"'
        - 'agent_ctx.role == "deploy_bot"'
      reason: "CI deploy bot has full shell access"

    - decision: REQUIRE_APPROVAL
      when_all:
        - 'agent_ctx.env == "production"'
      reason: "All shell commands in production require approval"

    - decision: DENY
      when_not:
        - 'agent_ctx.role == "deploy_bot"'
      when_all:
        - 'payload contains "deploy"'
      reason: "Only deploy_bot can run deploy commands"

3. Database agent — protect writes:

execute_query:
  rules:
    - decision: DENY
      when_any:
        - 'payload contains "drop table"'
        - 'payload contains "truncate"'
        - 'payload contains "delete from"'
      when_all:
        - 'agent_ctx.env == "production"'
      reason: "Destructive queries blocked in production"

    - decision: REQUIRE_APPROVAL
      when_any:
        - 'payload contains "update"'
        - 'payload contains "insert"'
      when_all:
        - 'agent_ctx.env == "production"'
      reason: "Write operations require approval in production"

What Else Shipped in v0.2

Beyond conditions, v0.2 closes three other gaps:

Thread-safe audit log. The hash-chain audit log now uses a lockfile (audit.log.lock) to serialize writes across threads and processes. Before this, concurrent writers could silently produce a chain that later failed verify_integrity(). Now every append happens under the lock, one writer at a time.

CLI verifier. You can now verify your audit log from the command line:

canopy-verify audit.log
# exit 0 if valid, exit 1 if tampered or broken

Useful for CI checks, incident response, or just confirming your agent ran cleanly.

Documented REQUIRE_APPROVAL contract. The README now explicitly states the behavior: authorize_action() never blocks. REQUIRE_APPROVAL means "do not proceed without human sign-off"; what that looks like in your system is your responsibility. guard_tool() and guard_tool_http() treat it as a block by default and raise PermissionError.
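
To illustrate the shape of that contract, here is a self-contained Python sketch. It deliberately does not call Canopy: `Decision`, `decide`, `guarded`, and the `approver` callback are hypothetical stand-ins, since the real signatures of authorize_action() and guard_tool() are not shown in this post. The only behavior borrowed from the description above is that the check returns immediately and that a wrapper blocks with PermissionError by default.

```python
from enum import Enum
from functools import wraps


class Decision(Enum):
    ALLOW = "ALLOW"
    DENY = "DENY"
    REQUIRE_APPROVAL = "REQUIRE_APPROVAL"


def decide(agent_ctx: dict, payload: str) -> Decision:
    """Hypothetical policy check; it returns immediately and never waits."""
    if "rm -rf" in payload and agent_ctx.get("env") == "production":
        return Decision.REQUIRE_APPROVAL
    return Decision.ALLOW


def guarded(agent_ctx: dict, approver=None):
    """Wrap a tool so anything other than ALLOW is blocked by default.

    `approver` is an optional callback (payload -> bool) standing in for
    whatever human sign-off flow the surrounding system provides.
    """
    def decorator(tool):
        @wraps(tool)
        def wrapper(payload: str):
            decision = decide(agent_ctx, payload)
            if decision is Decision.REQUIRE_APPROVAL and approver is not None:
                decision = Decision.ALLOW if approver(payload) else Decision.DENY
            if decision is not Decision.ALLOW:
                raise PermissionError(f"{decision.value}: {payload}")
            return tool(payload)
        return wrapper
    return decorator


def run_shell(payload: str) -> str:
    return f"ran: {payload}"  # placeholder; nothing is executed


prod = {"env": "production"}
strict = guarded(prod)(run_shell)
approved = guarded(prod, approver=lambda p: True)(run_shell)

print(approved("rm -rf /tmp/logs"))  # ran: rm -rf /tmp/logs
try:
    strict("rm -rf /tmp/logs")
except PermissionError as exc:
    print(exc)  # REQUIRE_APPROVAL: rm -rf /tmp/logs
```

The point of the split is that the decision function stays synchronous and side-effect free, while the waiting, paging, or ticketing that "approval" implies lives in code you own.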


Upgrade

pip install --upgrade canopy-runtime

Full changelog: CHANGELOG.md

Policy examples: POLICY_COOKBOOK.md

GitHub: https://github.com/Mavericksantander/Canopy


If you're using Canopy in a multi-environment setup and have policy patterns worth sharing, drop them in the comments. The cookbook is a living document.
