DEV Community

Nebula
How to Add Human Approval to AI Agent Actions

Your AI agent just deleted a production database table. It was trying to help -- the LLM decided DROP TABLE was the right cleanup step. No confirmation prompt. No safety check. Just gone.

If your agent can send emails, write files, or call APIs, it needs a human approval gate on dangerous actions. Here is how to build one in pure Python.

The Code

from enum import Enum
from datetime import datetime


class Risk(Enum):
    READ = "read"
    WRITE = "write"
    DESTRUCTIVE = "destructive"


TOOL_RISK = {
    "search_docs": Risk.READ,
    "list_files": Risk.READ,
    "send_email": Risk.WRITE,
    "create_file": Risk.WRITE,
    "delete_file": Risk.DESTRUCTIVE,
    "run_sql": Risk.DESTRUCTIVE,
    "deploy": Risk.DESTRUCTIVE,
}


def approve(tool_name: str, args: dict) -> bool:
    risk = TOOL_RISK.get(tool_name, Risk.DESTRUCTIVE)

    if risk == Risk.READ:
        return True

    if risk == Risk.WRITE:
        print(f"[LOG] {datetime.now():%H:%M:%S} | {tool_name}({args})")
        return True

    # Destructive: require human approval
    print(f"\n{'='*50}")
    print(f"APPROVAL REQUIRED: {tool_name}")
    print(f"Arguments: {args}")
    print(f"Risk level: {risk.value}")
    print(f"{'='*50}")
    response = input("Execute this action? (y/n): ").strip().lower()
    return response == "y"


def safe_tool_call(tool_name: str, args: dict, tool_fn):
    if not approve(tool_name, args):
        return {"status": "blocked", "reason": "Human denied execution"}
    return tool_fn(**args)


# Example: agent wants to delete a file
def delete_file(path: str) -> dict:
    # Stub for the demo -- a real implementation would call os.remove(path)
    return {"status": "deleted", "path": path}


result = safe_tool_call(
    tool_name="delete_file",
    args={"path": "/data/user_exports.csv"},
    tool_fn=delete_file,
)
print(result)

Run it:

python approval_gate.py

Output when the agent tries to delete a file:

==================================================
APPROVAL REQUIRED: delete_file
Arguments: {'path': '/data/user_exports.csv'}
Risk level: destructive
==================================================
Execute this action? (y/n): n
{'status': 'blocked', 'reason': 'Human denied execution'}

Type n and the action is blocked. Type y and it executes. Read operations skip the prompt entirely.

How It Works

Three risk tiers. Every tool gets classified: READ operations (search, list) auto-approve silently. WRITE operations (send email, create file) auto-approve but log the action. DESTRUCTIVE operations (delete, deploy, run raw SQL) halt and wait for human input.

Unknown tools default to destructive. The TOOL_RISK.get(tool_name, Risk.DESTRUCTIVE) line is the critical safety detail. If your agent hallucinates a tool name or calls something not in your map, it gets the highest restriction. Safe by default.
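You can check the default in isolation. A minimal sketch, reusing the Risk enum and a trimmed-down tool map from above ("wipe_cache" is a made-up tool name standing in for a hallucination):

```python
from enum import Enum


class Risk(Enum):
    READ = "read"
    WRITE = "write"
    DESTRUCTIVE = "destructive"


# Only tools you explicitly trust appear in the map.
TOOL_RISK = {"search_docs": Risk.READ, "send_email": Risk.WRITE}

# A hallucinated or unregistered tool name falls back to DESTRUCTIVE,
# so it hits the human-approval prompt instead of running silently.
risk = TOOL_RISK.get("wipe_cache", Risk.DESTRUCTIVE)
print(risk.value)  # destructive
```

This is an allowlist, not a blocklist: you never have to anticipate every dangerous tool name, only enumerate the safe ones.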

The gate is separate from the tool. safe_tool_call wraps any function without modifying it. You can add approval to existing tools by changing one line at the call site -- no refactoring needed.
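Here is that call-site change as a self-contained sketch. The send_email stub is hypothetical, standing in for a real tool you already have; its body is never touched:

```python
from enum import Enum
from datetime import datetime


class Risk(Enum):
    READ = "read"
    WRITE = "write"
    DESTRUCTIVE = "destructive"


TOOL_RISK = {"send_email": Risk.WRITE}


def approve(tool_name: str, args: dict) -> bool:
    risk = TOOL_RISK.get(tool_name, Risk.DESTRUCTIVE)
    if risk == Risk.READ:
        return True
    if risk == Risk.WRITE:
        print(f"[LOG] {datetime.now():%H:%M:%S} | {tool_name}({args})")
        return True
    return input("Execute this action? (y/n): ").strip().lower() == "y"


def safe_tool_call(tool_name: str, args: dict, tool_fn):
    if not approve(tool_name, args):
        return {"status": "blocked", "reason": "Human denied execution"}
    return tool_fn(**args)


# Existing tool, unchanged -- a stub standing in for a real email client.
def send_email(to: str, subject: str) -> dict:
    return {"status": "sent", "to": to}


# Before: send_email(to="ops@example.com", subject="report")
# After -- one-line change at the call site:
result = safe_tool_call(
    "send_email", {"to": "ops@example.com", "subject": "report"}, send_email
)
print(result)  # {'status': 'sent', 'to': 'ops@example.com'}
```

Because send_email is classified WRITE, it auto-approves with a log line; reclassifying it DESTRUCTIVE in TOOL_RISK would add the prompt without touching either function.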

Plug It Into Your Agent Loop

If you have an agent that picks tools from a list, drop safe_tool_call into the execution step:

TOOLS = {
    "search_docs": search_docs,
    "send_email": send_email,
    "delete_file": delete_file,
}

for step in agent.run(task):
    tool_fn = TOOLS[step.tool_name]
    result = safe_tool_call(step.tool_name, step.args, tool_fn)
    agent.receive(result)

Every tool call now passes through the approval gate. Reads fly through. Writes get logged. Destructive actions wait for you.

Going Further

For production agents, replace input() with a webhook or Slack notification that pauses execution until approved. The pattern stays the same -- only the approval transport changes.
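One way to make that swap clean is to inject the approval channel as a callable. This is a sketch, not a library API -- the names (approve_destructive, console_transport, auto_deny_transport) are illustrative:

```python
from typing import Callable


# The gate asks an injected transport instead of calling input() directly.
def approve_destructive(
    tool_name: str, args: dict, transport: Callable[[str], bool]
) -> bool:
    return transport(f"APPROVAL REQUIRED: {tool_name}({args})")


# Console transport -- equivalent to the input() version above.
def console_transport(message: str) -> bool:
    return input(f"{message}\nExecute? (y/n): ").strip().lower() == "y"


# A Slack or webhook transport would post `message` to a channel and block
# (or poll) until a human replies. The gate itself never changes.
def auto_deny_transport(message: str) -> bool:
    print(message)
    return False  # fail closed when no human is available


print(approve_destructive(
    "run_sql", {"query": "DROP TABLE users"}, auto_deny_transport
))  # False
```

Note the fail-closed default: if the transport times out or errors, returning False means the destructive action is blocked, not silently executed.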

If you want human approval built into your agent runtime without wiring it yourself, Nebula gates destructive actions automatically and notifies you via email before executing.

Part of the AI Agent Quick Tips series. Previous: How to Add Retry Logic to LLM Calls in 5 Min.
