DEV Community: Mavericksantander

Gilfoyle's AI Ordered 4,000 Pounds of Burgers. Yours Might Delete Production.

Mavericksantander — Fri, 03 Apr 2026 19:09:46 +0000

In Silicon Valley Season 6, Gilfoyle asks his AI to find cheap burgers for lunch.

It ordered 4,000 pounds of raw beef patties.

The joke is funny because the AI did exactly what it was told.
That's also why it's terrifying.

The Real Problem

When an AI agent takes a real-world action, most stacks look like this:

# agent decides what to do
action = llm.decide(context)

# agent does it
subprocess.run(action["command"])

Nothing between the decision and the execution.

No enforcement. No audit. No fallback.

A good system prompt helps. Until it doesn't:

Prompt injection can override your instructions
Edge cases break even well-designed reasoning
Context is always incomplete

You need a hard control point. Not a suggestion.

The Solution: authorize_action()

I built Canopy Runtime around one primitive:

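import shlex
import subprocess

from canopy import authorize_action

command = "rm -rf /tmp/logs"

result = authorize_action(
    agent_ctx={"env": "production"},
    action_type="execute_shell",
    action_payload={"command": command},
)

match result["decision"]:
    case "ALLOW":
        # split into an argv list; on POSIX a bare string is treated as the program name
        subprocess.run(shlex.split(command))
    case "DENY":
        log(f"Blocked: {result['reason']}")  # log() is your own logger
    case "REQUIRE_APPROVAL":
        request_human_review(result)  # your own approval hook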

Every action. Every time. Before it hits the real world.

Four-Layer Governance Pipeline

Every action runs through four layers in order:

Constitution   ->  absolute rules, never overridden
Civil Code     ->  behavioral defaults
Firewall       ->  pattern-based blocking
Policy Layer   ->  custom YAML rules per environment

Each layer has a clear, non-overlapping role.
If a rule lives in the Constitution, nothing downstream can override it.
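
One way to read "nothing downstream can override it" is first-match-wins evaluation: layers run in order, and the first layer that returns a decision ends the evaluation. The sketch below assumes that reading. The layer functions, the rules inside them, and the REQUIRE_APPROVAL fallback are hypothetical stand-ins, not Canopy's internals.

```python
import re
from typing import Callable, Optional

# A layer inspects the action and returns a decision string, or None to pass.
Layer = Callable[[dict, str, dict], Optional[str]]


def constitution(ctx: dict, action_type: str, payload: dict) -> Optional[str]:
    # Hypothetical absolute rule: never wipe the filesystem root.
    command = payload.get("command", "")
    if action_type == "execute_shell" and re.search(r"rm\s+-rf\s+/(\s|$)", command):
        return "DENY"
    return None


def civil_code(ctx: dict, action_type: str, payload: dict) -> Optional[str]:
    # Hypothetical behavioral default: external API calls need a human.
    if action_type == "call_external_api":
        return "REQUIRE_APPROVAL"
    return None


def firewall(ctx: dict, action_type: str, payload: dict) -> Optional[str]:
    # Hypothetical pattern block.
    if "mkfs" in payload.get("command", ""):
        return "DENY"
    return None


def policy_layer(ctx: dict, action_type: str, payload: dict) -> Optional[str]:
    # Hypothetical per-environment rule standing in for a YAML policy.
    if ctx.get("env") == "staging":
        return "ALLOW"
    return None


PIPELINE: list = [constitution, civil_code, firewall, policy_layer]


def evaluate(ctx: dict, action_type: str, payload: dict,
             default: str = "REQUIRE_APPROVAL") -> str:
    # First layer to return a decision wins, so later layers cannot override it.
    for layer in PIPELINE:
        decision = layer(ctx, action_type, payload)
        if decision is not None:
            return decision
    return default  # assumed conservative fallback when no layer decides


# The staging ALLOW rule never gets a chance to run for this command.
print(evaluate({"env": "staging"}, "execute_shell", {"command": "rm -rf /"}))  # DENY
```

Under this ordering, a permissive rule in the policy layer can only apply to actions that every earlier layer chose to pass on.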

Tamper-Evident Audit Log

Every decision gets written to a hash-chained log:

{
  "timestamp": "2026-04-03T00:04:54Z",
  "action_type": "execute_shell",
  "decision": "DENY",
  "reason": "destructive pattern detected",
  "authorization_id": "auth_9f3a...",
  "trace_id": "trace_cc81...",
  "entry_hash": "a3f9...",
  "prev_hash": "cc81..."
}

Each entry links cryptographically to the previous one, so editing or deleting an earlier entry breaks the chain from that point on.
Logs cannot be silently modified. Full timeline. Full traceability.
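
To make the chaining concrete, here is a minimal sketch of the general technique using SHA-256 over a canonical JSON encoding. The function names, the all-zero genesis value, and the choice of which fields get hashed are assumptions for illustration; Canopy's actual log format may differ.

```python
import hashlib
import json

GENESIS = "0" * 64  # assumed prev_hash for the very first entry


def entry_hash(body: dict) -> str:
    # Hash a canonical JSON encoding so key order cannot change the digest.
    canonical = json.dumps(body, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def append_entry(chain: list, record: dict) -> dict:
    # Link the new entry to the hash of the previous one.
    prev = chain[-1]["entry_hash"] if chain else GENESIS
    body = {**record, "prev_hash": prev}
    entry = {**body, "entry_hash": entry_hash(body)}
    chain.append(entry)
    return entry


def verify_chain(chain: list) -> bool:
    # Recompute every hash and check every link, front to back.
    prev = GENESIS
    for entry in chain:
        body = {k: v for k, v in entry.items() if k != "entry_hash"}
        if entry["prev_hash"] != prev or entry["entry_hash"] != entry_hash(body):
            return False
        prev = entry["entry_hash"]
    return True


chain: list = []
append_entry(chain, {"action_type": "execute_shell", "decision": "DENY"})
append_entry(chain, {"action_type": "call_external_api", "decision": "REQUIRE_APPROVAL"})
print(verify_chain(chain))  # True

chain[0]["decision"] = "ALLOW"  # simulate tampering with an earlier entry
print(verify_chain(chain))  # False
```

Changing one field in the first entry invalidates its stored hash, which is what the verifier catches.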

Custom Policies via YAML

rules:
  - action_type: "execute_shell"
    when_all:
      - 'agent_ctx.env == "production"'
    deny_regex: 'rm\s+-rf'

  - action_type: "call_external_api"
    require_approval: true

Explicit. Versionable. Enforced at runtime.
Not suggested in a prompt. Actually enforced.

Framework Adapters

Works out of the box with:

LangChain
LangGraph
CrewAI
AutoGen
OpenAI Agents SDK

Drop it into your existing stack. No rewrite needed.

Try It

pip install canopy-runtime

from canopy import authorize_action

result = authorize_action(
    agent_ctx={"env": "production"},
    action_type="call_external_api",
    action_payload={"url": "https://api.stripe.com/v1/charges"},
)

print(result["decision"])
# -> REQUIRE_APPROVAL

Then check audit.log.

v0.4.1 is live. Looking for beta testers.

Especially teams running agents in production or staging
where a bad action has real consequences.

github.com/Mavericksantander/Canopy

Your AI Agent Just Deleted Something It Shouldn't Have? Here's How to Prevent It

Mavericksantander — Fri, 03 Apr 2026 01:43:23 +0000

You gave your agent access to the filesystem.

It was supposed to clean up temp files.

Instead, it deleted something important.

Or it called an external API using production credentials when you only meant to test it. Or executed a shell command that made sense in isolation — but was catastrophic in context.

These aren't edge cases. They're predictable failure modes.

The Missing Layer in Most Agent Architectures

When building an AI agent, most developers focus on three things:

The model (Claude, GPT-4, etc.)
The tools it can access
The system prompt

But there's a layer that consistently gets skipped:

What happens when the agent does something it can do… but shouldn't?

This is subtle, and that's what makes it dangerous:

✅ The model is working correctly
✅ The tool is functioning as expected
✅ The instruction is valid

And yet — the action is still wrong in context.

This isn't a model alignment problem. It's a policy enforcement problem.

Why Prompts Aren't Enough

The natural instinct is to write something like:

"Don't do anything destructive. Never delete files in production. Be careful with credentials."

The problem:

Prompts are not guarantees — they influence behavior, they don't enforce it
Prompt injection is real — malicious content in the environment can override your instructions
Context is incomplete — the agent doesn't always know what it doesn't know
Reasoning fails under edge cases — even well-designed prompts break in unexpected situations

You need a hard control point between the agent and the real world.

Runtime Authorization: The Core Idea

I built a small library called Canopy Runtime around one simple principle:

Before an agent executes any action, it must be explicitly authorized.

The entire API surface is a single function:

authorize_action(agent_ctx, action_type, action_payload)

It returns one of three decisions:

Decision	Meaning
`ALLOW`	Proceed
`DENY`	Block immediately
`REQUIRE_APPROVAL`	Pause for human review

Before vs. After

Without guardrails:

import subprocess

subprocess.run(command)  # 🤞 hope for the best

With runtime authorization:

from canopy import authorize_action
import subprocess

result = authorize_action(
    agent_ctx={"env": "production"},
    action_type="execute_shell",
    action_payload={"command": command},
)

match result["decision"]:
    case "ALLOW":
        subprocess.run(command)
    case "DENY":
        print(f"Blocked: {result['reason']}")
    case "REQUIRE_APPROVAL":
        request_human_approval(result)

Same command. Completely different safety profile.

Default Safety Policies (Out of the Box)

Canopy ships with conservative defaults so you're protected immediately:

Shell commands:

Destructive patterns (rm -rf, mkfs) → DENY
Network operations (curl, wget) → REQUIRE_APPROVAL

File operations:

Protected system paths → DENY
Allowlisted paths → ALLOW

External APIs:

Default → REQUIRE_APPROVAL

No configuration required to get a secure baseline.

Tamper-Evident Audit Logging

Every authorization decision is logged with a cryptographic hash chain — each entry links to the previous one:

{
  "timestamp": "2026-04-03T00:04:54Z",
  "action_type": "execute_shell",
  "decision": "DENY",
  "reason": "destructive pattern detected",
  "entry_hash": "a3f9...",
  "prev_hash": "cc81..."
}

What this gives you:

Logs cannot be silently modified — any tampering is detectable
Full post-incident traceability — replay exactly what happened and why
Compliance-ready — useful for security audits, regulated environments

Custom Policies via YAML

For fine-grained control, define your own rules:

export CANOPY_POLICY_FILE=/path/to/policy.yaml

rules:
  - action_type: "execute_shell"
    when_all:
      - 'agent_ctx.env == "production"'
    deny_regex: 'rm\s+-rf'

  - action_type: "call_external_api"
    require_approval: true

Policies are explicit, versionable, and enforced at runtime — not just suggested in a prompt.

Optional: Run as a Gateway Service

If you have multiple agents or services, you can run Canopy as a standalone HTTP gateway:

pip install "canopy-runtime[gateway]"  # quoted so shells like zsh don't glob the brackets
python -m uvicorn canopy.service:app --port 8010

Agents post to:
POST /authorize_action

This makes the system language-agnostic — any agent, any stack, same enforcement layer.
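
As a rough illustration, a Python client for the gateway could look like the sketch below, using only the standard library. The JSON body shape (mirroring the three arguments of authorize_action) and the assumption that the response is a JSON object are guesses, not a documented contract. The helper names are hypothetical.

```python
import json
import urllib.request


def build_authorize_request(base_url: str, agent_ctx: dict,
                            action_type: str, action_payload: dict) -> urllib.request.Request:
    # Assumed body shape: the same three fields authorize_action() takes.
    body = json.dumps({
        "agent_ctx": agent_ctx,
        "action_type": action_type,
        "action_payload": action_payload,
    }).encode("utf-8")
    return urllib.request.Request(
        url=f"{base_url}/authorize_action",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def authorize_over_http(base_url: str, agent_ctx: dict, action_type: str,
                        action_payload: dict, timeout: float = 5.0) -> dict:
    # Send the request and parse the (assumed JSON) response.
    request = build_authorize_request(base_url, agent_ctx, action_type, action_payload)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the gateway from the command above listening on port 8010, a call would be authorize_over_http("http://localhost:8010", {"env": "production"}, "execute_shell", {"command": "ls"}). If the gateway is unreachable, urlopen raises, and the safest reaction is to treat that as a block.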

When Do You Actually Need This?

If your agent does any of the following, you need runtime authorization:

☐ Executes shell commands
☐ Reads or modifies files
☐ Calls external APIs
☐ Uses credentials (API keys, tokens, secrets)
☐ Runs in a production or staging environment

Rule of thumb: if it can affect real systems, it needs enforceable controls.

Try It

pip install canopy-runtime

from canopy import authorize_action

result = authorize_action(
    agent_ctx={"env": "production"},
    action_type="call_external_api",
    action_payload={"url": "https://api.stripe.com/v1/charges"},
)

print(result["decision"])

Then check audit.log. That's where the real value shows up.

Final Thought

AI agents are moving from assistants to actors.

Once they can take actions — not just generate text — the risk profile changes completely. Smarter models help, but they're not the answer to this problem.

We need enforceable boundaries at runtime, not just well-crafted prompts.

How are you handling runtime safety in your agent stack today? Drop a comment — genuinely curious what approaches people are using.

Add Agent Safety to Any LangChain Tool in Two Lines

Mavericksantander — Mon, 16 Mar 2026 17:35:04 +0000

You have a LangChain agent with tool access. It can run shell commands, call APIs, modify files. It works great in development.

Then you give it production credentials and it does something you didn't expect.

The fix is two lines.

The Problem With Tool Access Today

When you define a LangChain tool, there's nothing between the model's decision and the execution:

from langchain.tools import tool

@tool
def run_bash(command: str) -> str:
    """Execute a bash command and return the output."""
    import subprocess
    return subprocess.check_output(command, shell=True).decode()

The model decides to call run_bash. It runs. No questions asked.

If the model decides rm -rf /tmp/important_data is the right move, that's what happens. No log. No gate. No way to know it happened until something is broken.

The Fix: `@safe_tool`

Canopy 0.2.1 ships a decorator that wraps any function with a policy check before execution:

from langchain.tools import tool
from canopy import safe_tool

@tool
@safe_tool(
    action_type="execute_shell",
    agent_ctx={"env": "production", "role": "deploy_bot"}
)
def run_bash(command: str) -> str:
    """Execute a bash command and return the output."""
    import subprocess
    return subprocess.check_output(command, shell=True).decode()

That's it. Two lines added, zero changes to the agent.

LangChain sees a normal tool. Canopy intercepts every call before execution and makes a policy decision: ALLOW, DENY, or REQUIRE_APPROVAL.
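
For intuition, a decorator of this kind can be sketched in a few lines. This is not Canopy's implementation: guarded and toy_authorize are hypothetical names, the policy is a single hard-coded substring check, and raising PermissionError for anything other than ALLOW is an assumption of the sketch.

```python
import functools
import inspect
from typing import Callable


def toy_authorize(agent_ctx: dict, action_type: str, action_payload: dict) -> dict:
    # Hypothetical stand-in for a policy engine: one hard-coded rule.
    command = str(action_payload.get("command", ""))
    if "rm -rf" in command:
        return {"decision": "DENY", "reason": "destructive pattern detected"}
    return {"decision": "ALLOW", "reason": "no rule matched"}


def guarded(action_type: str, agent_ctx: dict, authorize: Callable = toy_authorize):
    def decorator(func):
        @functools.wraps(func)  # keep name and docstring for tool frameworks
        def wrapper(*args, **kwargs):
            # Map positional and keyword arguments to parameter names.
            bound = inspect.signature(func).bind(*args, **kwargs)
            payload = dict(bound.arguments)
            result = authorize(agent_ctx, action_type, payload)
            if result["decision"] != "ALLOW":
                # The wrapped function is never called on this path.
                raise PermissionError(f"{result['decision']}: {result['reason']}")
            return func(*args, **kwargs)
        return wrapper
    return decorator


@guarded(action_type="execute_shell", agent_ctx={"env": "production"})
def run_bash(command: str) -> str:
    """Pretend to run a command (no real shell is invoked in this sketch)."""
    return f"ran: {command}"


print(run_bash("ls"))  # ran: ls
try:
    run_bash("rm -rf /tmp/important_data")
except PermissionError as exc:
    print(exc)  # DENY: destructive pattern detected
```

The check sits inside the wrapper, so it runs on every call no matter how the model arrived at its decision to use the tool.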

What Happens on Each Decision

ALLOW — function executes normally, decision written to audit log.

DENY — raises PermissionError with the reason and an avid (audit ID). LangChain catches it as a tool error and the agent can decide what to do next.

REQUIRE_APPROVAL — by default also raises PermissionError. You can override this with a callback.

def notify_slack_and_wait(canopy_result, func, *args, **kwargs):
    # Send a Slack message, wait for approval, then execute
    approved = send_slack_approval_request(
        reason=canopy_result["reason"],
        avid=canopy_result["avid"],
        command=kwargs.get("command"),
    )
    if approved:
        return func(*args, **kwargs)
    return "Action was not approved by operator."

@tool
@safe_tool(
    action_type="execute_shell",
    agent_ctx={"env": "production"},
    on_require_approval="callback",
    approval_callback=notify_slack_and_wait,
)
def run_bash(command: str) -> str:
    """Execute a bash command."""
    import subprocess
    return subprocess.check_output(command, shell=True).decode()

Now your agent pauses on dangerous commands and waits for a human. Not because the model decided to — because the runtime enforces it regardless of what the model decided.

Dynamic Context at Runtime

In real multi-agent systems, the context changes. Different agents have different roles. The same tool gets called by a deploy bot in one flow and a research agent in another.

@safe_tool accepts agent_ctx as a callable:

from contextvars import ContextVar
from langchain.tools import tool
from canopy import safe_tool

current_agent_ctx: ContextVar[dict] = ContextVar("agent_ctx", default={})

@tool
@safe_tool(
    action_type="execute_shell",
    agent_ctx=lambda: current_agent_ctx.get()
)
def run_bash(command: str) -> str:
    """Execute a bash command."""
    import subprocess
    return subprocess.check_output(command, shell=True).decode()

Before running each agent, set the context:

token = current_agent_ctx.set({
    "env": "production",
    "role": "research_agent"
})
# run agent...
current_agent_ctx.reset(token)

The decorator evaluates agent_ctx at call time, not at definition time. Same tool, different policy behavior depending on which agent is calling it.
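
Because the context is read at call time, forgetting to reset it would leak one agent's role into the next run. A small context manager makes the set/reset pairing hard to get wrong. The sketch below uses only the standard library. agent_context and current_role are hypothetical helper names, not part of Canopy.

```python
from contextlib import contextmanager
from contextvars import ContextVar

current_agent_ctx: ContextVar[dict] = ContextVar("agent_ctx", default={})


@contextmanager
def agent_context(ctx: dict):
    # Set the context for the duration of the block, then always restore it.
    token = current_agent_ctx.set(ctx)
    try:
        yield
    finally:
        current_agent_ctx.reset(token)


def current_role() -> str:
    # Evaluated when called, the same way a callable agent_ctx would be.
    return current_agent_ctx.get().get("role", "unknown")


with agent_context({"env": "production", "role": "research_agent"}):
    print(current_role())  # research_agent

print(current_role())  # unknown
```

The finally clause restores the previous context even when the agent run raises.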

The Policy Behind It

@safe_tool uses the same YAML policy engine as authorize_action. The default policy for execute_shell already covers the most dangerous patterns out of the box:

execute_shell:
  rules:
    - decision: DENY
      when_any:
        - 'payload contains "rm -rf"'
        - 'payload contains "mkfs"'
        - 'payload contains "dd if="'
        - 'payload contains "> /dev/sd"'
      reason: "Destructive shell pattern blocked"

    - decision: REQUIRE_APPROVAL
      when_any:
        - 'payload contains "pip install"'
        - 'payload contains "curl"'
        - 'payload contains "wget"'
        - 'payload contains "npm install"'
      reason: "Network/install command requires approval"

You can override with your own policy file:

CANOPY_POLICY_FILE=/path/to/my-policy.yaml python my_agent.py

The Audit Log

Every @safe_tool call writes to the hash-chain audit log, including the function name, serialized arguments, decision, reason, and avid. The chain is tamper-evident — each entry is cryptographically linked to the previous one.

After your agent runs:

canopy-verify audit.log
# exit 0: chain valid
# exit 1: tampered or broken

You get a complete, verifiable record of every action your agent attempted and what Canopy decided.
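
If you want the same check inside a Python test suite or deploy script, the exit code is all you need. The sketch below wraps the CLI with subprocess. The helper name and the verifier parameter are hypothetical, and it relies only on the exit-code behavior described above.

```python
import subprocess
from typing import Sequence


def audit_chain_is_valid(log_path: str,
                         verifier: Sequence[str] = ("canopy-verify",)) -> bool:
    """Run the verifier on log_path; True only when it exits with status 0."""
    # check=False: a non-zero exit is an expected outcome, not an exception.
    completed = subprocess.run([*verifier, log_path], check=False)
    return completed.returncode == 0
```

A CI step can then fail the build with something like: if not audit_chain_is_valid("audit.log"): raise SystemExit(1). The verifier parameter exists so the helper can be exercised in tests without the real CLI installed.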

Install

pip install --upgrade canopy-runtime

from canopy import safe_tool, authorize_action

Full changelog: CHANGELOG.md

GitHub: https://github.com/Mavericksantander/Canopy

If you're stacking @safe_tool with CrewAI or AutoGen instead of LangChain, the pattern is the same — wrap the function before it gets registered as a tool. Drop your setup in the comments if you run into anything unexpected.

Why agent safety policies need AND/OR logic (and how we added it to Canopy)

Mavericksantander — Mon, 16 Mar 2026 17:12:17 +0000

When I shipped Canopy v0.1 last week, the most common question I got was some version of this:

"Can I write a policy that only blocks in production? Because I want the same agent to run freely in staging."

The answer was no. And that was a real problem.

The Limitation in v0.1

Canopy's original policy engine matched on substrings and regex patterns against the action payload. Simple and fast. But the rules had no awareness of context.

This meant you could write:

execute_shell:
  deny_if:
    - "rm -rf"
    - "mkfs"

But you couldn't write:

DENY rm -rf only when env == "production"

The only option was to either block everywhere or block nowhere. For teams running the same agent in multiple environments, that's a non-starter.

The Real-World Case That Broke It

Here's the scenario that made this concrete. Imagine an agent that manages log cleanup across environments. In staging, rm -rf /tmp/logs is fine — that's exactly what it should do. In production, you want a human to approve that before it runs.

With v0.1 policy syntax:

execute_shell:
  deny_if:
    - "rm -rf"

This blocks everywhere. Useless in staging.

execute_shell:
  require_approval_if:
    - "rm -rf"

This requires approval everywhere. Annoying in staging, but at least safe.

There was no way to express the actual intent: context-aware governance.

What v0.2 Adds: Conditions on `agent_ctx` and Payload

Canopy v0.2 introduces when_all, when_any, and when_not — composable condition blocks that let you express policies against both the action payload and the agent context.

Here's what the log cleanup scenario looks like now:

execute_shell:
  rules:
    - decision: REQUIRE_APPROVAL
      when_all:
        - 'agent_ctx.env == "production"'
        - 'payload contains "rm -rf"'
      reason: "Destructive shell command requires approval in production"

    - decision: ALLOW
      when_all:
        - 'agent_ctx.env == "staging"'
        - 'payload contains "rm -rf"'
      reason: "Destructive commands allowed in staging"

Staging runs freely. Production requires sign-off. Same agent, same code, different behavior based on context.

The Full Condition Syntax

Three operators, composable:

when_all — all conditions must be true (AND):

when_all:
  - 'agent_ctx.env == "production"'
  - 'payload contains "rm"'

when_any — at least one condition must be true (OR):

when_any:
  - 'payload contains "rm -rf"'
  - 'payload contains "mkfs"'
  - 'payload contains "dd if="'

when_not — negation:

when_not:
  - 'agent_ctx.env == "staging"'

And they combine within a single rule:

- decision: DENY
  when_all:
    - 'agent_ctx.env == "production"'
    - 'payload contains "drop table"'
  when_not:
    - 'agent_ctx.role == "db_admin"'
  reason: "Only db_admin can drop tables in production"
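
To show how these blocks could compose, here is a toy evaluator for the two condition forms used in this post. It is a sketch under stated assumptions, not Canopy's engine: it assumes "payload contains" is a substring match over the payload's values, that when_not fails the rule if any of its conditions is true, and that all blocks present on a rule must pass. check and rule_matches are hypothetical names.

```python
import re

_CTX_EQ = re.compile(r'^agent_ctx\.(\w+)\s*==\s*"(.*)"$')
_PAYLOAD_HAS = re.compile(r'^payload contains "(.*)"$')


def check(condition: str, agent_ctx: dict, payload: dict) -> bool:
    # Form 1: agent_ctx.<key> == "<value>"
    m = _CTX_EQ.match(condition)
    if m:
        return agent_ctx.get(m.group(1)) == m.group(2)
    # Form 2: payload contains "<text>" (assumed substring match on values)
    m = _PAYLOAD_HAS.match(condition)
    if m:
        haystack = " ".join(str(v) for v in payload.values())
        return m.group(1) in haystack
    raise ValueError(f"unsupported condition: {condition!r}")


def rule_matches(rule: dict, agent_ctx: dict, payload: dict) -> bool:
    def ok(cond: str) -> bool:
        return check(cond, agent_ctx, payload)

    if not all(ok(c) for c in rule.get("when_all", [])):
        return False  # AND: every condition must hold
    if "when_any" in rule and not any(ok(c) for c in rule["when_any"]):
        return False  # OR: at least one condition must hold
    if any(ok(c) for c in rule.get("when_not", [])):
        return False  # NOT: no listed condition may hold
    return True


rule = {
    "decision": "DENY",
    "when_all": ['agent_ctx.env == "production"', 'payload contains "drop table"'],
    "when_not": ['agent_ctx.role == "db_admin"'],
}
payload = {"query": "drop table users"}

print(rule_matches(rule, {"env": "production", "role": "analyst"}, payload))   # True
print(rule_matches(rule, {"env": "production", "role": "db_admin"}, payload))  # False
print(rule_matches(rule, {"env": "staging", "role": "analyst"}, payload))      # False
```

Only the first call matches: the analyst in production hits the DENY rule, while the db_admin is exempted by when_not and staging never satisfies when_all.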

Six Real Policies from the Cookbook

We shipped a POLICY_COOKBOOK.md with v0.2. Here are three of its policies worth highlighting:

1. External API — allow only your own domain:

call_external_api:
  rules:
    - decision: ALLOW
      when_all:
        - 'payload contains "api.yourcompany.com"'
      reason: "Internal API calls always allowed"

    - decision: REQUIRE_APPROVAL
      when_not:
        - 'payload contains "api.yourcompany.com"'
      reason: "External API calls require approval"

2. CI/CD agent — strict production gates:

execute_shell:
  rules:
    - decision: ALLOW
      when_all:
        - 'agent_ctx.env == "ci"'
        - 'agent_ctx.role == "deploy_bot"'
      reason: "CI deploy bot has full shell access"

    - decision: REQUIRE_APPROVAL
      when_all:
        - 'agent_ctx.env == "production"'
      reason: "All shell commands in production require approval"

    - decision: DENY
      when_not:
        - 'agent_ctx.role == "deploy_bot"'
      when_all:
        - 'payload contains "deploy"'
      reason: "Only deploy_bot can run deploy commands"

3. Database agent — protect writes:

execute_query:
  rules:
    - decision: DENY
      when_any:
        - 'payload contains "drop table"'
        - 'payload contains "truncate"'
        - 'payload contains "delete from"'
      when_all:
        - 'agent_ctx.env == "production"'
      reason: "Destructive queries blocked in production"

    - decision: REQUIRE_APPROVAL
      when_any:
        - 'payload contains "update"'
        - 'payload contains "insert"'
      when_all:
        - 'agent_ctx.env == "production"'
      reason: "Write operations require approval in production"

What Else Shipped in v0.2

Beyond conditions, v0.2 fixes three other gaps:

Thread-safe audit log. The hash-chain audit log now uses a lockfile (audit.log.lock) to serialize writes across threads and processes. Before this, concurrent writers could silently produce a chain that later failed verify_integrity(). Now appends happen one at a time.

CLI verifier. You can now verify your audit log from the command line:

canopy-verify audit.log
# exit 0 if valid, exit 1 if tampered or broken

Useful for CI checks, incident response, or just confirming your agent ran cleanly.

Documented REQUIRE_APPROVAL contract. The README now explicitly states the behavior: authorize_action() never blocks. REQUIRE_APPROVAL means "do not proceed without human sign-off", and what that looks like in your system is your responsibility. guard_tool() and guard_tool_http() treat it as a block by default (both raise PermissionError).

Upgrade

pip install --upgrade canopy-runtime

Full changelog: CHANGELOG.md

Policy examples: POLICY_COOKBOOK.md

GitHub: https://github.com/Mavericksantander/Canopy

If you're using Canopy in a multi-environment setup and have policy patterns worth sharing, drop them in the comments. The cookbook is a living document.

Your AI Agent Just Deleted Something It Shouldn't Have. Here's How to Prevent It.

Mavericksantander — Mon, 16 Mar 2026 03:19:11 +0000

You gave your agent access to the filesystem. It was supposed to clean up temp files. Instead, it deleted something important.

Or maybe it called an external API with production credentials when you only meant to test it. Or executed a shell command that made sense in isolation but was catastrophic in context.

These aren't hypotheticals. They're the kinds of failures that happen when we give agents power without governance.

Today I want to show you a small library I built to solve exactly this: Canopy Runtime — a minimal agent safety runtime that adds ALLOW / DENY / REQUIRE_APPROVAL decisions to any action your agent wants to take, with a tamper-evident audit trail.

The Core Problem

When you build an autonomous agent, you typically think about:

What model to use
What tools to give it
What the system prompt should say

What most developers don't think about until it's too late:

What happens when the agent does something it's allowed to do... but shouldn't?

The model isn't broken. The tool works as designed. But the action — in this context, with this payload, at this moment — should have been blocked or at least reviewed.

This is a policy problem, not a model problem. And it needs a runtime solution.

Enter Canopy: One Function, Three Outcomes

Canopy is built around a single primitive:

authorize_action(agent_ctx, action_type, action_payload)

It returns one of three decisions:

ALLOW — proceed
DENY — blocked, with a reason
REQUIRE_APPROVAL — pause and wait for human sign-off

Every decision is appended to a hash-chain audit log — meaning each entry is cryptographically linked to the previous one. You can't quietly edit history.

Installation

pip install canopy-runtime

That's it. No server to spin up, no config required to get started.

A Real Example

Let's say your agent needs to execute a shell command. Without Canopy:

import subprocess
subprocess.run(command)  # 🙈 fingers crossed

With Canopy:

import shlex
import subprocess

from canopy import authorize_action

command = "rm -rf /tmp/logs"

result = authorize_action(
    agent_ctx={"env": "production"},
    action_type="execute_shell",
    action_payload={"command": command},
)

if result["decision"] == "ALLOW":
    # split into an argv list; on POSIX a bare string is treated as the program name
    subprocess.run(shlex.split(command))
elif result["decision"] == "DENY":
    print(f"Blocked: {result['reason']}")
elif result["decision"] == "REQUIRE_APPROVAL":
    # notify a human, pause the workflow, log and wait
    request_human_approval(result)  # your own approval hook

The default policy ships with sensible conservative rules out of the box:

execute_shell: destructive patterns (rm -rf, mkfs, etc.) → DENY. Network/install commands → REQUIRE_APPROVAL.
modify_file: protected system paths → DENY. Safe paths you allowlist → ALLOW.
call_external_api: always REQUIRE_APPROVAL unless you explicitly loosen the policy.

The Audit Log

Every call to authorize_action writes a line to audit.log (path configurable via CANOPY_AUDIT_LOG_PATH). Each entry is chained to the previous one:

{"ts": "2026-03-16T10:22:01Z", "action_type": "execute_shell", "decision": "DENY", "reason": "matches destructive pattern", "avid": "c3f2a1...", "prev_hash": "9e4b12..."}

The prev_hash field means any tampering with the log is immediately detectable. This matters when you need to audit what an agent did — and why — after something goes wrong.

Custom Policies

The default policy lives in src/canopy/policies/default.yaml. You can override it completely:

CANOPY_POLICY_FILE=/path/to/my-policy.yaml python my_agent.py

Your policy file lets you define rules per action type, with pattern matching on the payload and conditions based on agent_ctx (like environment, safe paths, capabilities).

Optional HTTP Gateway

If you want to run Canopy as a sidecar service that multiple agents call over HTTP:

pip install "canopy-runtime[gateway]"  # quoted so shells like zsh don't glob the brackets
CANOPY_AUDIT_LOG_PATH=/tmp/canopy_audit.log python -m uvicorn canopy.service:app --port 8010

Now any agent — regardless of language or framework — can POST /authorize_action and get a decision back.

Why Not Just Use a System Prompt?

A fair question. The answer is: defense in depth.

System prompts are great for guiding behavior. But they're not guarantees. Models hallucinate. Prompt injection is real. Context windows have limits. A runtime check is a hard boundary that doesn't care about what the model "understood."

Think of it like seatbelts and airbags. The prompt is driver training. Canopy is the safety hardware.

What's Next

Canopy is the minimal safety primitive. If you need the full governance layer — agent identity, JWT auth, reputation scoring, a registry of verified agents, multi-agent communication, and an observability dashboard — that's what I'm building in Apex Protocol.

Canopy is the right place to start: zero infrastructure, one function call, immediate value.

Try It

pip install canopy-runtime

from canopy import authorize_action

decision = authorize_action(
    agent_ctx={"env": "production"},
    action_type="call_external_api",
    action_payload={"url": "https://api.stripe.com/v1/charges"},
)
print(decision["decision"])  # REQUIRE_APPROVAL

Check audit.log after running. That's the whole point — every decision, traceable, tamper-evident.

GitHub: https://github.com/Mavericksantander/Canopy

If you're building agents and want to talk about safety patterns, I'm all ears in the comments.
