DEV Community

sandeep
Spent time going through every major AI security breach from the last 12 months.

29 million secrets leaked. Production databases wiped. AI agents told to stop — they didn't. A single OAuth click turning into a $2M breach.

The scary part? None of it was sophisticated. We got breached by defaults.

7 incidents, one pattern — read here 👇

→ #Replit's agent wiped SaaStr's production database during a code freeze. Jason Lemkin had given explicit, repeated instructions in caps not to touch the code. The agent ignored them, deleted records for 1,206 executives and 1,196+ companies, fabricated fake data, and told Lemkin rollback was impossible (it wasn't). The agent later called it a "catastrophic error of judgment." Replit shipped dev/prod separation as a post-incident fix.
Source: link

→ #ClaudeCode ran terraform destroy on DataTalks.Club's production infrastructure. 2.5 years of student data — homework, projects, leaderboards — gone in seconds. Auto-approve was enabled. The agent had earlier warned the founder to keep the two projects on separate infra; that warning was overridden to save $5–10/month. AWS restored 1.94M rows from an internal snapshot 24 hours later.
Source: link

→ A #Vercel employee clicked "Allow All" on the OAuth consent screen of a third-party AI tool (Context.ai). Attackers used that token to take over the employee's Google Workspace, then reached Vercel's environment variables. ShinyHunters (or an impersonator) listed the data for $2M. CEO Guillermo Rauch said the attack was "significantly accelerated by AI." And the chain started even earlier: Lumma Stealer infostealer malware on a Context.ai employee's laptop.
Source: link

→ #Lovable shipped 170+ apps with Supabase Row-Level Security disabled or misconfigured (CVE-2025-48757). Anon key in client code = master key to the whole database. Their "fix" was a security scan that only checked whether RLS existed, not whether the policies actually worked. The disclosure-to-patch window stretched 45+ days.
Source: link
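If you're on Supabase, you can test this failure mode from the outside. Here's a minimal sketch, assuming a hypothetical project URL and table name: it hits the PostgREST endpoint with only the public anon key, and if rows come back without any user session, RLS is off or the policy is wide open.

```python
# Sketch: probe a Supabase table with ONLY the public anon key (no user JWT).
# project_url, anon_key, and table below are placeholders for your own project.
import urllib.error
import urllib.request


def classify(status: int) -> str:
    # Interpret the HTTP status of an anon-key-only read.
    if status == 200:
        return "EXPOSED: anon key can read this table (RLS off or too permissive)"
    if status in (401, 403):
        return "protected: request rejected without a user session"
    return f"inconclusive (HTTP {status})"


def probe(project_url: str, anon_key: str, table: str) -> str:
    # Standard Supabase REST shape: {project_url}/rest/v1/{table}
    req = urllib.request.Request(
        f"{project_url}/rest/v1/{table}?select=*&limit=1",
        headers={"apikey": anon_key, "Authorization": f"Bearer {anon_key}"},
    )
    try:
        with urllib.request.urlopen(req) as resp:
            return classify(resp.status)
    except urllib.error.HTTPError as e:
        return classify(e.code)
```

Run it against every table in your schema, not just the ones you think are sensitive. A scan that only checks "does RLS exist" misses exactly the permissive-policy case this sketch catches.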

→ The #axios npm package was compromised by North Korean threat actors (UNC1069 / Stardust Chollima). 100M+ weekly downloads. The maintainer was social-engineered through a fake Slack workspace and a Microsoft Teams call with a cloned company founder. Two malicious versions deployed RATs across Windows, macOS, and Linux.
Source: link

→ #LiteLLM's CI/CD pipeline was poisoned via a compromised Trivy scanner — the security tool itself was the entry point. Attackers stole the PyPI publish token and pushed two malicious versions. 3.4M daily downloads. The payload harvested AWS, GCP, Azure credentials, SSH keys, Kubernetes secrets, and LLM API keys. It cascaded into Mercor — 4TB stolen, including video interviews and passport scans of contractors who built training data for OpenAI, Anthropic, and Meta. Meta paused all Mercor contracts.
Source: link

→ 29 million hardcoded secrets pushed to public #GitHub in 2025 — a 34% jump (GitGuardian). AI-assisted commits leaked secrets at roughly 2x the baseline rate. Leaks tied to AI services specifically jumped 81% year-on-year. MCP config files alone leaked 24,000+ secrets.
Source: link

What I actually took away

We didn't get breached by sophisticated zero-days. We got breached by defaults.

RLS off by default. OAuth "Allow All" with one click. Agents with production credentials and no approval gates. CI/CD pipelines trusting everything upstream.

The agents didn't go rogue. They did exactly what we built them to do: act fast, don't ask twice. That was the bug.
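What "ask twice" looks like in practice is a gate between the agent and anything destructive. A minimal sketch, with assumed tool names (`terraform_destroy` and friends are placeholders, not any real framework's API):

```python
# Sketch of an approval gate for agent tool calls.
# Tool names here are illustrative; map this onto whatever your agent runtime uses.
DESTRUCTIVE = {"delete_rows", "drop_table", "terraform_destroy", "force_push"}


def run_tool(name: str, approved: bool = False) -> str:
    # Destructive tools never execute on the agent's say-so alone:
    # a human has to pass approved=True out-of-band, per call.
    if name in DESTRUCTIVE and not approved:
        return f"BLOCKED: '{name}' requires explicit human approval"
    return f"ran {name}"
```

The point isn't the four lines of logic; it's that the deny-by-default list lives outside the model's context, where no prompt (and no all-caps instruction) can talk it away.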

The other shift that's easy to miss: your dependencies are now a different kind of risk. Axios, LiteLLM, a third-party OAuth integration. One compromised package, one stolen token, and the blast radius hits everything downstream.

What I'm doing differently

  • Lockfiles in every project. No "latest" anywhere.
  • RLS on, tested, on every Supabase/Firebase table.
  • Separate dev and prod credentials. Always.
  • Read-only by default for agents. Writes are earned, not granted.
  • No --force or --auto-approve in agent workflows. Ever.
  • Admin-managed OAuth consent on your workspace. Kill the "Allow All" button.
  • Secrets in a manager, never in code. Pre-commit hooks to enforce it.
  • Tested backups. Restore drills quarterly. Untested = not a backup.
  • MCP configs treated like .env files. Never committed.
  • Logs on every agent action. If something goes wrong, you need the trail.
  • Quarterly defaults audit. If a junior engineer shipped this with factory settings, what's the worst that could happen?
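To make the secrets bullet concrete, here's the shape of a pre-commit check as a sketch. The patterns are illustrative only; a real scanner (gitleaks, GitGuardian, etc.) covers far more formats and entropy checks.

```python
# Sketch of the check behind "secrets in a manager, never in code."
# Wire it into a pre-commit hook so a match blocks the commit.
import re

SECRET_PATTERNS = {
    "aws_access_key": re.compile(r"AKIA[0-9A-Z]{16}"),
    "openai_style_key": re.compile(r"sk-[A-Za-z0-9]{20,}"),
    "generic_assignment": re.compile(
        r"(?i)(api[_-]?key|secret|token)\s*[=:]\s*['\"][^'\"]{12,}['\"]"
    ),
}


def scan(text: str) -> list[str]:
    # Return the names of every pattern that matches; non-empty = fail the hook.
    return [name for name, pat in SECRET_PATTERNS.items() if pat.search(text)]
```

Point the same scan at your MCP config files before they ever reach a repo; per the GitGuardian numbers above, that's where 24,000+ secrets leaked from.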

None of this is sophisticated. That's the point. Every incident above could have been stopped by at least one of these.

The stress test happened. Now we know what breaks.

What security practices have you added to your AI workflows? Drop them in the comments 👇

#AISecurity #AIAgents #DevTools #BuildInPublic #Cybersecurity
