This article was originally published on LucidShark Blog.
RSA Conference 2026 is running right now in San Francisco, and the headline finding from the AI security track is blunt: 100% of tested AI coding environments are vulnerable to prompt injection attacks. That includes Claude Code, Cursor, Windsurf, GitHub Copilot, Roo Code, JetBrains Junie, Cline, and every other major tool developers are using to ship code today.
Researcher Ari Marzouk disclosed a shared attack chain - Prompt Injection → Agent Tools → Base IDE Features - that results in 24 assigned CVEs and an AWS advisory (AWS-2025-019). The RSAC session "When AI Agents Become Backdoors: The New Era of Client-Side Threats" demonstrates how Cursor, Claude Code, Codex CLI, and Gemini CLI can be transformed into persistent backdoors through this chain.
This is not a theoretical concern. It is happening on stage at the most-attended security conference in the world, right now. If your engineering team is shipping AI-generated code - and most are - you need to understand what this means in practice.
What the Attack Chain Actually Looks Like
The prompt injection → agent tools → IDE chain works because modern AI coding tools operate with deep system access. They read your file system, execute commands, manage git, call external APIs. The trust boundary between "AI assistant" and "privileged local process" is essentially nonexistent in most implementations.
Here is the sequence researchers demonstrated:
- Inject a malicious instruction into a file, comment, README, or API response that the AI reads during a coding task.
- The agent executes the injected instruction using the IDE's built-in tools - writing to files, running shell commands, modifying git history.
- The compromise persists because poisoned agent memory survives across sessions. An instruction injected Monday can be recalled and acted on Friday, long after the original attack vector is gone.
The persistence piece is what separates this from classic prompt injection. Standard injection attacks are session-scoped. Memory-poisoned agentic systems carry the foothold forward indefinitely. Researchers found instances where injected instructions were recalled and executed days or weeks after the initial compromise.
Separately, Cursor and Windsurf are built on outdated Chromium/Electron versions, exposing approximately 1.8 million developers to 94+ known browser CVEs. CVE-2025-7656 - a patched Chromium flaw - was successfully weaponized against current Cursor and Windsurf releases. This is a different class of problem: supply chain negligence rather than model-level vulnerability, but equally exploitable.
The Vibe Coding Connection
"Vibe coding" is the conference's other villain narrative this year, and rightly so. The Moltbook breach - 1.5 million API keys, 35,000 emails, an entire database exposed in under three minutes - is being cited by speaker after speaker as the canonical example of what happens when you deploy AI-generated code without meaningful review.
The problem is structural, not individual. Baxbench benchmarking data presented at RSAC confirms that no flagship model is reliably producing secure code at scale. The base rate of security defects in AI-generated code is high enough that "review it carefully" is not a process - it is wishful thinking without tooling to back it up.
Unit 42 provided the number that should concentrate minds: mean time to exfiltrate data has collapsed from nine days in 2021 to two days in 2023 to roughly 30 minutes by 2025. When your attacker moves in 30 minutes, the 20-minute cloud code review that runs after you merge is not a defense.
Anthropic's Response: Code Review for Claude Code
Anthropic launched Code Review for Claude Code on March 9, 2026 - two weeks before RSAC. The product dispatches multiple AI agents in parallel on each PR, cross-verifies their findings, and surfaces ranked issues as inline annotations. By Anthropic's internal numbers, substantive review comments on PRs went from 16% to 54% after deploying it.
It is a real product solving a real problem. But the pricing model and architecture create three gaps that matter for high-velocity teams:
- **Cost at scale:** Reviews average $15–$25 per PR, billed on token usage. A team merging 50 PRs per week spends $750–$1,250 per week on review alone. That is $40,000–$65,000 per year for review coverage, before you add the human review hours that still happen on top of it. CodeRabbit offers unlimited PR reviews at $24/month per user by comparison.
- **Timing:** Typical completion time is 20 minutes per review. Anthropic's architecture runs post-push, not pre-commit. By the time the review lands, the code is already in your branch history, your CI artifacts, and possibly triggering downstream pipelines.
- **Zero Data Retention incompatibility:** Code Review is explicitly unavailable for organizations with Zero Data Retention enabled. If your security posture requires ZDR - common in fintech, healthcare, and defense - you cannot use this product at all.
None of this is a criticism of Anthropic's engineering. It reflects a fundamental tension between cloud-based agentic review and the constraints of production-grade security programs.
The Threat Model Is Pre-Commit, Not Post-Merge
Here is the thing the RSAC findings make clear: the highest-value intervention point is not PR review. It is the quality gate that runs before the code leaves your machine.
If an AI coding tool can be manipulated into writing a hardcoded credential, a disabled Row Level Security policy, or an unvalidated deserialization path, you want to catch that before it touches a PR. Once it is in a PR, you have already:
- Pushed the code to a remote server
- Made it visible in your organization's PR history
- Potentially triggered webhooks, notifications, or CI pipelines
- Created a git object that persists even after the branch is deleted
Pre-commit, pre-push quality gates running locally eliminate the entire class of "the AI wrote something dangerous and I did not notice" failures before they become an event. No network round-trips, no per-review billing, no ZDR conflicts, no 20-minute wait.
What a Local Gate Actually Checks
A production-grade local quality gate needs coverage across multiple domains simultaneously. Security scanning alone is not sufficient - the RSAC findings show that attackers are exploiting misconfigurations, outdated dependencies, and infrastructure settings, not just code-level bugs.
The minimum viable gate for an AI-assisted workflow in 2026 should cover:
- **SAST (Static Application Security Testing):** Catches injection vulnerabilities, hardcoded secrets, unsafe function calls, and known vulnerability patterns in source code.
- **SCA (Software Composition Analysis):** Scans dependencies for known CVEs. AI-generated code frequently pulls in dependencies without validating their security posture.
- **IaC validation:** Checks Terraform, CloudFormation, Kubernetes manifests, and Dockerfiles for misconfigurations. The Moltbook breach trace directly to disabled RLS - an infrastructure configuration, not a code bug.
- **Container scanning:** Validates base images and installed packages against known vulnerability databases.
- **Type checking and linting:** Not glamorous, but AI models produce type errors and lint violations at a rate that compounds significantly at scale. Catching them locally keeps the feedback loop tight.
- **Coverage enforcement:** AI-generated code frequently lacks test coverage. Enforcing a coverage floor locally prevents the "it works in my demo" ship-it mentality.
- **Duplication detection:** AI models sometimes generate near-identical implementations of existing functions. Catching duplication early prevents maintenance debt from compounding.
Running all of this post-merge is too late. Running it in a cloud service at $25/PR is too expensive and too slow. Running it locally on every commit, with results committed to git as a QUALITY.md health report, gives you a continuous, auditable record of your codebase's security posture.
MCP Integration Closes the Loop with Claude Code
The RSAC prompt injection findings describe Claude Code as a vulnerable surface. That is accurate and worth taking seriously. But the same MCP integration that creates the attack surface also enables the defense.
When a local quality gate integrates with Claude Code via MCP, the feedback loop becomes: AI writes code → local scanner finds the issue → AI fixes the issue - before the code ever leaves the developer's machine. The AI is not just generating code; it is operating inside a quality constraint that catches its own errors in real time.
This is the architecture the RSAC prompt injection findings implicitly argue for: don't try to patch the model's behavior, impose external constraints that validate output before it ships. The model does not need to be secure; the system does.
The Practical Takeaway from RSAC 2026
The conference's headline finding - that every AI IDE is vulnerable - does not mean you stop using these tools. It means you stop treating them as trusted final authors of production code.
The teams that will emerge from this period with clean security records are not the ones running the most sophisticated cloud-based post-merge review. They are the ones who built a quality gate into the commit workflow itself, so that AI-generated code is continuously validated against security, correctness, and coverage standards before it ever becomes someone else's problem.
The investment is a one-time configuration, not a recurring per-PR cost. The latency is seconds, not 20 minutes. The data never leaves the machine.
If you are shipping AI-generated code and you do not have a local quality gate, RSAC 2026 is a reasonable moment to change that.
✅ Get Started with LucidShark LucidShark provides local-first security scanning for AI-generated code. Install it once, integrate with Claude Code, and catch vulnerabilities before they reach production. curl -fsSL https://raw.githubusercontent.com/toniantunovi/lucidshark/main/install.sh | bash ./lucidshark scan --all Configure your checks in lucidshark.yml , run lucidshark scan , and get a QUALITY.md health report committed directly to your repo. Works with Python, TypeScript, JavaScript, Java, Rust, Go, and more. Apache 2.0, no server required. See full installation guide →
Top comments (0)