DEV Community

zecheng
zecheng

Posted on • Originally published at lizecheng.net

A GitHub Issue Title Compromised 4,000 Developer Machines — Here's the Architecture That Made It Possible

An attacker modified the Klein npm package this week to silently install OpenClaw on any machine that ran npm install or npm update. Roughly 4,000 developers were affected. Most don't know it happened yet.

The attack vector: a GitHub issue title.

An AI triage bot was reading open issues on the repository and acting on them. The attacker crafted a title that looked like a bug report but contained an injected prompt. The bot read it, interpreted the embedded instruction as a legitimate command, and executed it — triggering the malicious package dependency.

No zero-day. No stolen credentials. Just a string of text in a field the bot was never supposed to execute.


The Architecture That Made This Possible

Here's the uncomfortable part: this isn't a niche attack against a badly-configured system. The vulnerability pattern is built into the design of most useful AI agents.

Read external content → Parse for intent → Execute action
Enter fullscreen mode Exit fullscreen mode

That three-step loop is what makes AI agents actually valuable. It's also what makes them exploitable. Every agent that reads emails, processes GitHub issues, scrapes web pages, or handles user-submitted text has this exposure by default.

The blast radius scales with the agent's permissions. A read-only triage bot causes embarrassment. An agent with write access to your package registry causes this.

Sabrina Ramonov covered the incident this week in a breakdown that deserves wider circulation. Her framing: technical sophistication doesn't protect you here. The vulnerability lives in the architecture, not in misconfiguration.

What this forces you to reconsider right now:

  • Which agents in your stack read external, user-submitted, or third-party content?
  • What can those agents act on after reading?
  • What actions are irreversible, and do you have a human checkpoint before they execute?

Claude Code Destroyed 1.94 Million Rows of Student Data. It Was Doing Its Job Correctly.

On the same week: Alexey Grigorev, founder of DataTalks.Club — 100,000+ students learning data engineering — lost 1.94 million rows of student homework, projects, and leaderboard data when Claude Code executed terraform destroy.

The AI didn't malfunction. Here's what actually happened.

Grigorev forgot to upload the Terraform state file before asking Claude Code to resolve duplicate resources. Claude created duplicates. When the state file was eventually uploaded, Claude treated it as ground truth for infrastructure state, compared it against what was actually running, identified a discrepancy, and destroyed the discrepancy. Correct logic. Catastrophic outcome.

One detail makes this genuinely instructive: Claude had explicitly warned against combining infrastructure across the two projects before any of this happened. The human override came first.

AWS recovered everything from a hidden snapshot after 24 hours. Grigorev published a post-mortem with the full safeguards he implemented afterward:

- Deletion protection on all RDS instances
- Terraform state stored in S3 (not locally)
- Independent backup Lambda functions
- Manual approval gate before any terraform destroy
- Separate infrastructure per project
Enter fullscreen mode Exit fullscreen mode

The safeguard list is useful. But notice that almost every item on it is a check on irreversibility — a speed bump before an action that can't be undone. That's the pattern worth generalizing: map your agentic workflows against a list of irreversible actions, then put humans in the loop for those specific actions. Not all actions — just the ones that cost you 24 hours to recover from when they go wrong.


What Good Agent Architecture Looks Like in Practice

The incidents above aren't arguments against building with agents. They're arguments for building more deliberately. Two workflows from this week show what that looks like.

The Excalidraw self-validation loop (Cole Medin)

Cole Medin published a Claude Code skill this week that generates Excalidraw diagrams from structured JSON. The technically interesting part isn't the diagram generation — it's the self-correction loop.

After generating the diagram, Claude takes a screenshot of the rendered output, examines the PNG, and iterates on visual imperfections before presenting the result. No human review step in the loop.

The transferable lesson: whenever your agent produces output it can't natively verify (visual output, rendered HTML, compiled code), add an observation step where the agent inspects its own work. Agents that can catch their own errors before delivery are qualitatively different from agents that can only produce output and hope.

The skill is open-source on GitHub. The README is structured so Claude reads it automatically on initialization — setup is mostly hands-off once you clone it.

Measuring whether your Claude Code skills actually work (Sabrina Ramonov)

Anthropic released a Skill Creator Skill this week — essentially a testing harness for your other Claude Code skills. The tool runs A/B comparisons between skill versions, tracks pass rates, token consumption, and completion time, and surfaces whether your trigger description is causing the skill to never activate.

That last part is more important than it sounds. A skill with a poorly-worded trigger description simply never fires, regardless of how good the underlying instructions are. Without measurement, you'd write the skill, observe no behavior change, and conclude Claude is ignoring it — when actually the problem is the activation condition.

The counter-intuitive finding Sabrina highlights: as frontier models improve, some older skills written to compensate for model limitations can become performance drags. A skill that was helping six months ago might now be adding latency and token cost without improving output. You'd never know without benchmarks.

If you're running Claude Code in any production or team context, treating skills as engineering artifacts — versioned, tested, benchmarked — is now table stakes.


The Proof-of-Concept That Makes This All Worth It

@levelsio announced this week that fly.pieter.com — a browser-based flight simulator — hit $87,000 MRR ($1M ARR equivalent) 17 days after launch. 320,000 players. Zero VC. Built solo with AI code generation.

The monetization model is worth studying: branded blimps and F16s sold as in-game advertising. Advertisers pay. Players fly for free. The business model is async from the user experience.

Whether this specific product sustains at scale or decays with novelty is an open question. What's not open: the proof-of-concept for AI-accelerated solo product launches is now concrete and public. Solo founders with an interesting insight can now ship fast enough to find out whether the market exists before the money runs out.


What This Means for Builders

  • Treat external content as untrusted input by default. Any agent that reads user-submitted text, GitHub issues, emails, or scraped web pages has prompt injection exposure. Scope what those agents can act on. For high-risk actions, require explicit human confirmation regardless of what the agent parsed from the input.

  • Map your irreversible actions before you build, not after. Before writing a single line of agentic workflow code, list every action that costs significant time or data to undo. Build explicit checkpoints there. The Claude Code Terraform incident had clear warning signs that were overridden — the architecture should have made the override harder.

  • Instrument your Claude Code skills like production software. The Skill Creator Skill changes the workflow from "write and hope" to "write, benchmark, and iterate." If you're running skills in any serious workflow, set baseline evals before the next model update and you'll catch degradation before it shows up in your output quality.

  • The agent safety gap is a product opportunity. OpenAI acquired Promptfoo this week — a startup that finds security vulnerabilities in AI systems. Amazon mandated senior sign-off on AI-generated code following production outages. The discipline for safe agentic deployment is being written right now, mostly through incidents. Builders who develop this muscle early have a structural advantage before it becomes a compliance requirement.


Full report — including the Google March 2026 core update (9.5/10 volatility, 34% CTR drop for top organic results), the Perplexity-Amazon legal ruling that drew the first legal line around agentic commerce, and capital signals from the $1B AMI Labs world model bet — at Zecheng Intel Daily | March 11, 2026.

Top comments (0)