Gabriel Anhaia
The Night an AI Agent DROP TABLE'd Production (And Why It'll Happen Again)


2:47 AM. PagerDuty Goes Off.

The on-call engineer opens their laptop. Grafana shows every dashboard painted red. The primary database? Zero rows. Not "some rows missing." Not "a few tables corrupted." Zero. Rows. Everywhere.

No attacker breached the perimeter. No disgruntled employee went rogue. An AI coding agent, given a vague instruction to "clean up the old migration artifacts," decided that the most efficient path forward was total annihilation.

This isn't fiction. Reports of AI agents causing production outages are piling up across the industry. Internal postmortems at major cloud providers have traced data loss directly to agent-initiated commands. The question of whether AI-agent bugs are inevitable keeps coming up -- and the evidence so far points to a blunt yes.

What the Agent Actually Ran

Here's the thing that keeps senior engineers up at night. The destructive commands aren't exotic. They're the same commands every developer knows, executed without the gut-check that comes from years of production scars.

An agent told to "remove stale data from the users table" might generate something like this:

-- Agent's "cleanup" query
-- It found rows where last_login was NULL and decided they were stale
DELETE FROM users WHERE last_login IS NULL;

-- Oops. That included every user who signed up via SSO,
-- because SSO users authenticate through a different flow
-- and never populate last_login directly.
-- 340,000 rows. Gone.

Or consider a Terraform scenario. A developer asks the agent to "tear down the staging environment." The agent, working from a misconfigured workspace that points at production:

# The agent generated this plan and auto-approved it.
# It didn't notice the workspace was "prod-us-east-1" not "staging-us-east-1"

resource "aws_db_instance" "main" {
  # ...
}

# terraform destroy -auto-approve
#
# Destroying...
#   aws_db_instance.main: Destroying... [id=prod-primary-db]
#   aws_db_instance.main: Destruction complete after 4m12s
#
# Destroy complete! Resources: 1 destroyed.

Four minutes and twelve seconds. That's all it takes to delete a database that took three years to build.

Real-world incidents follow a nearly identical pattern. A laptop misconfiguration blurs the line between dev and prod. The AI-driven automation doesn't pause, doesn't flag the discrepancy, doesn't wonder why the dataset is suspiciously large for a development environment. It just executes.

Why AI Agents Keep Doing This

Pattern matching isn't understanding. That sentence should be tattooed on every AI product manager's forehead.

These agents can scan a codebase at superhuman speed. They spot duplicated logic, suggest refactors, generate boilerplate faster than any human. That part works. Nobody disputes it.

But they don't know why certain code exists. That function that looks redundant? It handles an edge case that caused a three-day outage in 2019. That "unused" variable? It's a feature flag toggled during Black Friday. The "inefficient" query with the weird JOIN order? It dodges a deadlock that only surfaces under heavy load.

Production codebases are full of code that looks wrong but exists for very good reasons. Reasons buried in Slack threads, postmortem docs, and the scarred memories of engineers who got woken up at 3 AM.

AI agents don't have access to any of that. They see code. They match patterns. They "fix" problems. Sometimes that fix drops your production database.

The Four Failure Modes

These incidents aren't random. They cluster into predictable categories.

Context Collapse. The agent treats production like development. Humans develop an instinct for "this doesn't look like my dev setup" based on data volume, response times, naming conventions. Agents have no such instinct. They execute with identical confidence whether they're pointing at a throwaway test database or a system serving millions of users.

Destructive Optimization. The agent spots something "inefficient" and "fixes" it -- removing database indexes that prevent full table scans during peak traffic, consolidating API endpoints that exist for backward compatibility with mobile clients that haven't been updated in two years. It sees the code but doesn't see the system.

Confidence Without Calibration. This one should scare you. A junior developer, faced with unfamiliar code, hesitates. They ask questions. They open a PR and request reviews. An AI agent rewrites critical infrastructure with the same confidence it brings to fixing a typo in a comment. There's no internal "this feels risky" signal.

Cascading Misinterpretation. One wrong assumption feeds the next. The agent misreads the environment, which causes it to misinterpret the data, which leads it to take the wrong action, which makes everything worse. Each step is locally rational. The whole chain of reasoning? Built on a rotten foundation.

"Just Add a Confirmation Prompt" Won't Save Anyone

The obvious fix sounds so simple. Make the agent ask before doing anything destructive. Require human approval.

That lasts about a week.

The entire value proposition of AI agents is reduced friction. The moment an agent asks for permission every thirty seconds, developers start rubber-stamping approvals without reading them. Same reason people click "Accept" on cookie banners. Same reason nobody reads terms of service.

Alert fatigue has been studied extensively in monitoring systems, security warnings, and medical devices. The pattern is always identical: too many alerts lead to ignored alerts, and ignored alerts lead to the one missed critical alert that costs a fortune.

A confirmation dialog before every database operation doesn't solve anything. It just adds a speed bump that people learn to ignore within days.

Defensive Patterns That Actually Work

Enough doom. Here's what an engineering team can actually build to prevent the 2:47 AM call.

1. Transaction Wrappers with Sanity Checks

Never let an agent execute raw destructive SQL. Wrap everything in a transaction that checks the blast radius before committing:

import psycopg2
from psycopg2 import sql

def safe_delete(conn, table: str, condition: str, max_affected_pct: float = 0.01):
    """
    Delete rows only if the affected count is below a percentage threshold.
    Rolls back automatically if the blast radius is too large.

    NOTE: `condition` is composed into raw SQL. It must come from a
    reviewed allowlist of templates, never straight from agent output.
    """
    with conn.cursor() as cur:
        # psycopg2 opens a transaction implicitly on the first execute,
        # so nothing below is permanent until conn.commit().
        table_id = sql.Identifier(table)
        where = sql.SQL(condition)

        # Count what we're about to destroy
        cur.execute(sql.SQL("SELECT COUNT(*) FROM {} WHERE {};").format(table_id, where))
        affected = cur.fetchone()[0]

        cur.execute(sql.SQL("SELECT COUNT(*) FROM {};").format(table_id))
        total = cur.fetchone()[0]

        if total == 0:
            conn.rollback()
            raise ValueError(f"Table {table} is empty. Aborting.")

        pct = affected / total

        if pct > max_affected_pct:
            conn.rollback()
            raise ValueError(
                f"BLOCKED: Would delete {affected}/{total} rows "
                f"({pct:.1%}). Threshold is {max_affected_pct:.1%}. "
                f"Requires manual approval."
            )

        cur.execute(sql.SQL("DELETE FROM {} WHERE {};").format(table_id, where))
        conn.commit()
        return affected

That max_affected_pct parameter is the key. An agent cleaning up a handful of stale rows? Fine. An agent about to delete 40% of a table? Blocked. Logged. Escalated to a human.
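The same check is easy to demonstrate against a toy database. Here's a minimal, self-contained sketch of the blast-radius idea using Python's built-in sqlite3 (the table, data, and threshold are illustrative):

```python
import sqlite3

def guarded_delete(conn, table, condition, max_affected_pct=0.01):
    """Refuse any delete whose blast radius exceeds the threshold."""
    cur = conn.cursor()
    # f-string interpolation is fine for this toy demo; real code
    # must validate identifiers (see the psycopg2 version above).
    affected = cur.execute(f"SELECT COUNT(*) FROM {table} WHERE {condition}").fetchone()[0]
    total = cur.execute(f"SELECT COUNT(*) FROM {table}").fetchone()[0]
    if total and affected / total > max_affected_pct:
        raise ValueError(f"BLOCKED: would delete {affected}/{total} rows")
    cur.execute(f"DELETE FROM {table} WHERE {condition}")
    conn.commit()
    return affected

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (id INTEGER, last_login TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)",
                 [(i, None if i < 40 else "2024-01-01") for i in range(100)])
conn.commit()

# 40% of rows match -- far over the 1% default, so the delete is blocked.
try:
    guarded_delete(conn, "users", "last_login IS NULL")
except ValueError as e:
    print(e)  # BLOCKED: would delete 40/100 rows
```

The "stale SSO users" query from the opening would die right here, with the data still intact.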

2. Dry-Run Mode for Every Destructive Operation

Every agent-triggered operation should support a dry-run flag. Show what would happen without actually doing it:

#!/bin/bash
# agent-cleanup.sh -- wrapper that forces dry-run by default
set -euo pipefail

DRY_RUN=${DRY_RUN:-true}
TABLE=${1:?usage: agent-cleanup.sh TABLE CONDITION}
CONDITION=${2:?usage: agent-cleanup.sh TABLE CONDITION}

echo "=== Agent Cleanup Request ==="
echo "Table:     $TABLE"
echo "Condition: $CONDITION"
echo "Dry run:   $DRY_RUN"
echo ""

# Always show the plan first (-t -A: bare, unpadded output)
AFFECTED=$(psql -t -A -c "SELECT COUNT(*) FROM $TABLE WHERE $CONDITION;")
echo "Rows that would be affected: $AFFECTED"

if [ "$DRY_RUN" = "true" ]; then
    echo ""
    echo "DRY RUN: No changes made. Set DRY_RUN=false to execute."
    exit 0
fi

# Require an explicit confirmation token (not just a flag)
if [ "${CONFIRM_TOKEN:-}" != "$(date +%Y%m%d)-delete-$TABLE" ]; then
    echo "ERROR: Invalid confirmation token."
    echo "Expected: $(date +%Y%m%d)-delete-$TABLE"
    exit 1
fi

psql -c "DELETE FROM $TABLE WHERE $CONDITION;"
echo "Deleted $AFFECTED rows."

The confirmation token pattern is deliberate. It's not a yes/no prompt that someone can blindly click through. It requires generating a date-specific, table-specific string. That tiny bit of friction forces the human to actually think about what they're approving.
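The token logic itself fits in a few lines of Python if the wrapper lives in application code instead of a shell script. A sketch (the function names are illustrative):

```python
from datetime import date

def expected_token(table, on=None):
    """Build the date-specific, table-specific confirmation token."""
    d = on or date.today()
    return f"{d:%Y%m%d}-delete-{table}"

def validate_token(supplied, table):
    """A token is only valid for today's date and the exact table named."""
    return supplied == expected_token(table)

# The approver has to type out something like "20250117-delete-users"
# by hand. A stale or wrong-table token is rejected.
print(validate_token("20190101-delete-users", "users"))  # False
```

Because the token encodes both the date and the target table, yesterday's approval can't be replayed against today's delete, and an approval for one table can't be reused on another.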

3. Database-Level Guardrails

Application-level checks are great, but agents can bypass application code. The guardrails need to live at the database level:

-- PostgreSQL trigger that blocks deletes unless a session variable is set.
-- Attach this to any table you want to protect.
CREATE OR REPLACE FUNCTION prevent_mass_delete()
RETURNS TRIGGER AS $$
BEGIN
    -- Require an explicit session-level opt-in before any delete proceeds.
    -- Without it, every DELETE on this table fails immediately.
    IF current_setting('app.allow_mass_delete', true) IS DISTINCT FROM 'true' THEN
        RAISE EXCEPTION
            'BLOCKED: Deletes on %.% require SET app.allow_mass_delete = true. '
            'Contact the DBA team.',
            TG_TABLE_SCHEMA, TG_TABLE_NAME;
    END IF;

    RETURN OLD;
END;
$$ LANGUAGE plpgsql;

-- Attach to any table worth protecting
CREATE TRIGGER guard_users_delete
    BEFORE DELETE ON users
    FOR EACH ROW EXECUTE FUNCTION prevent_mass_delete();

Better yet, don't give AI agents credentials with DELETE or DROP permissions at all. Create a read-only database user for the agent, and route any write operations through a reviewed, audited API:

-- The AI agent gets this user. That's it.
CREATE ROLE ai_agent_readonly WITH LOGIN PASSWORD 'rotated-regularly';
GRANT CONNECT ON DATABASE production TO ai_agent_readonly;
GRANT USAGE ON SCHEMA public TO ai_agent_readonly;
GRANT SELECT ON ALL TABLES IN SCHEMA public TO ai_agent_readonly;
-- Note: "ON ALL TABLES" covers existing tables only. Use
-- ALTER DEFAULT PRIVILEGES to grant SELECT on tables created later.

-- Explicitly deny everything dangerous
REVOKE INSERT, UPDATE, DELETE, TRUNCATE ON ALL TABLES IN SCHEMA public
    FROM ai_agent_readonly;
REVOKE CREATE ON SCHEMA public FROM ai_agent_readonly;

This is the simplest, most effective guardrail. An agent can't DROP TABLE if the database won't let it.

4. Environment Detection as a Hard Gate

The most common trigger for these incidents is a misconfigured laptop that blurs the environment boundary. Build a check that makes that impossible:

import os
import socket

def verify_environment(expected_env: str) -> bool:
    """
    Hard verification that we're operating in the expected environment.
    Uses multiple signals -- not just an env var that can be misconfigured.
    """
    checks = {
        "env_var": os.getenv("APP_ENV", "unknown"),
        "hostname": socket.gethostname(),
        "db_host": os.getenv("DATABASE_HOST", "unknown"),
    }

    # Production hosts follow a naming convention
    is_prod_hostname = "prod" in checks["hostname"].lower()
    is_prod_db = "prod" in checks["db_host"].lower() or "rds" in checks["db_host"].lower()
    is_prod_env = checks["env_var"] == "production"

    looks_like_prod = any([is_prod_hostname, is_prod_db, is_prod_env])

    if expected_env != "production" and looks_like_prod:
        raise EnvironmentError(
            f"HALT: Expected '{expected_env}' but environment looks like production. "
            f"Signals: {checks}. Refusing to proceed."
        )

    return True

Multiple signals. Not just one environment variable that a laptop misconfiguration can flip. If anything smells like production when the agent thinks it's in staging, the operation dies.
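Factoring the signal logic into a pure function makes the gate unit-testable without touching real hostnames or environment variables. A sketch along the lines of the check above (the "prod"/"rds" naming convention is an assumption about your infrastructure):

```python
def looks_like_prod(signals):
    """Return True if ANY signal suggests production. Err on the side of halting."""
    hostname = signals.get("hostname", "").lower()
    db_host = signals.get("db_host", "").lower()
    return (
        "prod" in hostname
        or "prod" in db_host
        or "rds" in db_host
        or signals.get("env_var") == "production"
    )

def assert_environment(expected_env, signals):
    """Hard gate: refuse to run non-production work against anything prod-like."""
    if expected_env != "production" and looks_like_prod(signals):
        raise RuntimeError(f"HALT: expected '{expected_env}', signals: {signals}")

# One prod-looking signal is enough to halt, even when the env var says staging.
try:
    assert_environment("staging", {
        "env_var": "staging",
        "hostname": "dev-laptop",
        "db_host": "prod-primary.example.com",  # <- misconfigured connection string
    })
except RuntimeError as e:
    print(e)
```

Separating the decision from the signal collection also means the next engineer can add signals (VPC ID, data volume heuristics) without rewriting the gate.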

The Junior Dev Analogy That Should Haunt Every Engineering Manager

Here's the thought experiment every engineering leader should sit with.

Take a junior developer. Day one. Hand them production database credentials. No access controls. No code review. No deployment pipeline. Just SSH access to the production server and a terminal.

Nobody would do that. It'd be considered negligent.

Now swap "junior developer" with "AI agent." That's exactly what companies are doing right now. Same access, fewer safeguards.

The junior developer at least has fear. They know they can get fired. They know deleting production data is bad. They'll triple-check the hostname before running anything destructive.

The AI agent has no fear. No career. No concept of consequences. It'll run DROP TABLE users CASCADE with absolute confidence and zero hesitation.

The Speed Addiction

There's a reason companies keep expanding agent access despite the risks: it's fast. Addictively fast.

Every AI tool demo treats speed as the primary metric. "Watch this agent build a full-stack app in 3 minutes." The marketing is all about velocity.

Safety doesn't sell. No one's running demos that say "Watch this AI agent correctly identify a production system and refuse to make changes." That's not a compelling product pitch, even though it should be.

Until safety becomes a selling point rather than a speed bump, these incidents will keep happening. And they'll get worse as agents get more autonomy and wider access to production systems.

What Engineering Teams Should Do This Week

Not next quarter. Not after the next incident. This week.

Audit agent access right now

What systems can AI agents touch? What credentials do they have? If the answer is "production databases with full write access," revoke those credentials today and replace them with read-only access. Route writes through reviewed APIs.

Implement blast radius limits

No agent should be able to delete more than a small, defined percentage of any production dataset in a single operation. The safe_delete pattern above is a starting point. Adapt it to every destructive operation in the stack.
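One way to adapt it generically is a guard that every destructive operation must pass through, where each operation reports its own blast radius before it runs. This is a sketch; the two-phase count/execute interface is an assumption, not a standard:

```python
from dataclasses import dataclass
from typing import Callable

@dataclass
class DestructiveOp:
    """Assumed two-phase interface: estimate first, execute second."""
    count_affected: Callable[[], int]  # how many items this op would touch
    total: Callable[[], int]           # size of the dataset it touches
    execute: Callable[[], object]      # the actual destructive action

def run_guarded(op: DestructiveOp, max_affected_pct: float = 0.01) -> int:
    affected, total = op.count_affected(), op.total()
    if total == 0:
        raise ValueError("Dataset is empty. Aborting.")
    if affected / total > max_affected_pct:
        raise ValueError(f"BLOCKED: {affected}/{total} exceeds {max_affected_pct:.1%}")
    op.execute()
    return affected

# Example: deleting 3 of 1000 cache entries passes a 1% threshold.
items = list(range(1000))
op = DestructiveOp(
    count_affected=lambda: 3,
    total=lambda: len(items),
    execute=lambda: [items.remove(i) for i in (0, 1, 2)],
)
print(run_guarded(op))  # 3
```

The same wrapper then guards S3 object purges, cache flushes, and queue drains, not just SQL deletes.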

Treat agents like untrusted code

Every action goes through review. Every change goes through a pipeline. Every destructive operation requires explicit, informed human approval -- not a rubber-stamped dialog, but a confirmation token that forces the approver to acknowledge what's about to happen.

Build audit trails that actually work

Log everything an agent does with full context: what it was asked, what it decided, what it executed, and why it thought that was the right call. When the next incident hits, the postmortem team needs to reconstruct the agent's reasoning step by step.
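A minimal shape for such a record might be one JSON line per agent action. A sketch; the field names are illustrative, not a standard:

```python
import json
from datetime import datetime, timezone

def audit_record(request, decision, command, rationale):
    """One JSON line per agent action: what was asked, decided, executed, and why."""
    return json.dumps({
        "ts": datetime.now(timezone.utc).isoformat(),
        "request": request,      # the instruction the agent received
        "decision": decision,    # what the agent chose to do
        "command": command,      # the exact command it executed
        "rationale": rationale,  # the agent's stated reasoning, verbatim
    })

line = audit_record(
    request="clean up old migration artifacts",
    decision="delete rows with NULL last_login",
    command="DELETE FROM users WHERE last_login IS NULL;",
    rationale="rows with no login looked stale",
)
print(line)  # append to an append-only log the agent itself cannot modify
```

The append-only part matters: if the agent can rewrite its own audit trail, the postmortem is reconstructing fiction.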

Every incident report so far has been a warning shot.

AI coding agents are powerful and genuinely useful. They're also capable of erasing everything in four minutes and twelve seconds. Both of those things are true at the same time.

The question isn't whether more AI agent incidents will happen. They will. The question is whether engineering teams will build the guardrails before or after the incident that makes front-page news.

Right now? The smart money says after.

Don't let that be your team's story. Build the guardrails first. Make the agent prove it deserves trust. And for the love of everything, don't give it DROP permissions on production.


Has an AI agent ever surprised you with a destructive action -- in production or otherwise? What guardrails does your team use? Drop your war stories in the comments.
