Originally published on CoreProse KB-incidents
In March 2026, security teams logged 35 new CVEs where AI-generated or AI-assisted code was a direct factor.
The cause was not a novel exploit, but AI-written code and AI-heavy libraries shipped without updated AppSec practices.
More than 40,000 vulnerabilities were tracked in NVD in 2025, already overwhelming traditional workflows [6].
AI accelerates both development and exploitation, widening the gap between change velocity and control coverage.
The task: treat AI as a structural shift in how vulnerabilities are created and exploited, and redesign engineering and security practices accordingly.
1. Why 35 AI Code CVEs in One Month Is a Structural Warning
The March 2026 spike reflects a broader trend:
40,000+ vulnerabilities in NVD in 2025, exceeding what traditional tools can handle [6]
16,200 AI-related incidents in 2025 across 3,000 U.S. companies, up 49% YoY [3]
Finance and healthcare made up over half of those incidents [3]
📊 Structural stress indicators
Exploding CVE volume overall [6]
Fast-rising AI-specific incidents [3]
High-value industries disproportionately hit [3]
Why this matters:
Coding assistants are evaluated on “passes the test,” not “is secure in production.”
Sonar’s analysis of 4,000+ Java assignments shows higher-performing models often produce more verbose, cognitively complex code, harder to review and secure [5].
AI output frequently introduces outdated or vulnerable dependencies when developers accept suggestions without checking packages and versions [1].
The 35 March CVEs are therefore:
Not a fluke, but evidence that AI accelerates both feature delivery and exploit-ready defects [3][5]
A sign that “it compiles and passes tests” is dangerously insufficient
⚠️ Mini-conclusion
Treat the spike as a structural warning: AI amplifies existing fragility in software supply chains rather than creating a separate risk category.
2. How AI Code and AI Libraries Became Exploit Delivery Vehicles
The March CVEs involved both unsafe snippets and vulnerable AI/ML libraries, echoing earlier RCE flaws in NeMo, Uni2TS, and FlexTok [2].
Core pattern:
Libraries over-trusted model metadata
Loading a malicious model file caused attacker-controlled metadata to be parsed and executed
Result: RCE in environments using popular AI frameworks with tens of millions of downloads [2]
flowchart LR
A[Malicious model file] --> B[Load by AI library]
B --> C[Metadata parsed]
C --> D{Unsafe eval / deserialization}
D -->|Yes| E[Arbitrary code execution]
style A fill:#f97316,color:#fff
style E fill:#ef4444,color:#fff
AI-powered development tools also became exploit paths:
Copilot RCE (CVE‑2025‑53773) allowed hostile prompts in code comments to make the assistant generate and execute dangerous commands on 100,000+ developer machines [3].
The IDE assistant effectively became a remote shell via prompt injection.
💡 AI supply chain as attack surface
Research shows attackers increasingly target AI supply chains—models, plugins, datasets, orchestration frameworks—as entry points for:
Data exfiltration
Lateral movement
Privilege escalation [7]
The 2026 AI/ML threat landscape notes:
The effective perimeter has shifted from firewalls to model logic and data layers
Vulnerabilities in AI code paths now map directly to business-critical exposure [8]
Qualys’ evaluation of DeepSeek-R1 variants shows:
Widely redistributed LLMs can carry jailbreak and safety-bypass weaknesses into downstream apps
The model itself becomes part of the exploitable surface [10]
⚠️ Mini-conclusion
The risk is not just “bad snippets from ChatGPT.” The entire AI software and tooling supply chain—from models to plugins to assistants—is now an exploit delivery mechanism.
3. Why Traditional AppSec Misses AI-Generated Vulnerabilities
AI-native architectures collide with AppSec practices built for deterministic web and API logic:
Classic frameworks assume fixed control flows, not probabilistic, context-driven behavior
OWASP’s LLM Top 10 emerged because early adopters deployed LLMs without tailored security models; legacy checklists missed prompt injection and model abuse [9]
Prompt injection illustrates the gap:
A model ingests attacker-controlled text (web page, ticket, PDF)
Treats it as instructions
Exfiltrates secrets or triggers tools [7]
Legacy SAST/DAST rarely model this risk in CI/CD [8]
📊 Hidden fragility in “passing” code
LLM-generated code can pass tests yet degrade structural quality and maintainability, correlating with higher defect density [5].
Verbose, complex code paths are exactly where traditional tools struggle.
New datasets highlight the blind spots:
CVE‑Genie, a multi-agent framework, reproduced and exploited ~51% of 841 CVEs from 2024–2025 at ~$2.77 per CVE [4][6].
Many issues live in intricate, environment-specific paths that trivial scanners miss.
Common early AI misuse patterns:
Sensitive data leaked via prompts and RAG pipelines
Over-permissive tool calls by AI agents
Misconfigured connectors between AI platforms and internal systems
These often appear as data leaks, not classic perimeter breaches, so legacy monitoring under-detects AI-enabled intrusion chains [3][7].
⚡ Mini-conclusion
You cannot just “point existing scanners at AI” and expect coverage. AI-aware policies, datasets, and detection logic are required.
4. Production Guardrails: A Concrete Checklist for AI-Generated Code
Engineering leaders need practical guardrails that integrate with existing SDLC tooling.
💼 1. Dependency hygiene by default
AI-generated code often suggests deprecated or insecure libraries [1]. Enforce:
Validation of maintenance status and ecosystem health
CVE scans for all AI-suggested packages before merge
Removal of unnecessary dependencies to reduce attack surface
Tools: Snyk, Dependabot, OWASP Dependency Check [1].
💼 2. CVE-aware CI gates
Run automated vulnerability scans in CI for every AI-generated change
Block deployments on high/critical issues in direct or transitive dependencies introduced by AI [1][6]
💼 3. Static analysis tuned for LLM output
Use engines that flag cognitive complexity, security smells, and anti-patterns in verbose AI code
Follow approaches similar to Sonar’s LLM leaderboard, which measures complexity and maintainability, not just correctness [5]
💼 4. OWASP LLM Top 10 in code and pipelines
Translate guidance into controls:
Strict isolation between system prompts and untrusted inputs
Schema-validated outputs before touching databases, shells, or payment APIs
Adversarial red-teaming for jailbreak and prompt-override patterns [8][9]
💼 5. Treat AI assistants as untrusted components
Monitor coding assistants like powerful agents, not benign helpers
Detect patterns similar to Copilot RCE: unusual comments, embedded prompts, or system-level commands generated by tools [3]
flowchart LR
A[AI code change] --> B[Dependency scan]
B --> C[Static analysis]
C --> D[LLM-specific checks]
D --> E{Policy gate}
E -->|Pass| F[Deploy]
E -->|Fail| G[Block & remediate]
style F fill:#22c55e,color:#fff
style G fill:#ef4444,color:#fff
⚠️ Mini-conclusion
Guardrails must be automated in pipelines. If they rely on developers “being careful” with AI suggestions, they will fail at scale.
5. Operationalizing AI Security: Detection, Response, and Governance
Pre-deployment controls will miss some AI-enabled attacks. Security operations must explicitly understand AI behavior.
💡 Map to the AI incident kill chain
Model incidents across:
Seed: hostile text (prompt, document, email)
Model action: instructions updated, ignored, or subverted
Tool invocation: code execution, data access, workflow triggers
Exfiltration: data leaves via responses, logs, or connectors [7]
Place monitoring and alerts at each stage, not only at network egress.
💡 Layered prompt injection defenses
Use multiple layers [8]:
Clear separation between trusted instructions and untrusted content
Guardrail LLMs to pre-screen inputs for malicious intent
Output sanitization and strict schema enforcement before downstream actions
💡 Governance for AI assets
Continuously inventory and assess:
Models and distilled variants
Prompts and system instructions
Datasets and embeddings
Plugins, tools, and connectors
Qualys’ DeepSeek-R1 evaluation shows jailbreak-prone models can silently propagate unless you maintain an AI bill of materials and vulnerability assessments [10][9].
💡 AI-first incident response
Design IR for AI-specific failure modes:
Treat leaks via prompts, RAG content, or tool calls as first-class incidents
Contain by revoking API keys, rotating model credentials, disabling tools/connectors
Update prompts, guardrails, routing logic, and access policies after incidents, not just network rules [7]
Automated exploit frameworks like CVE‑Genie—reproducing ~51% of CVEs at low cost—show the value of:
Internal pipelines to reproduce, validate, and regression-test AI-related vulnerabilities [4]
Turning ad hoc panics into repeatable security checks
⚡ Mini-conclusion
Operational excellence for AI security means treating models, prompts, and agents as monitored, governed assets, on par with microservices and databases.
Conclusion: Treat AI as High-Speed, Untrusted Input
The 35 AI-linked CVEs in March 2026 show what happens when AI-accelerated development meets legacy security assumptions.
LLMs and AI libraries act like high-speed, semi-trusted contributors that can introduce vulnerable code, unsafe dependencies, and new attack surfaces faster than traditional reviews can keep up.
Organizations must:
Recognize AI as a structural change in how vulnerabilities are created and exploited
Extend AppSec to cover AI-specific risks, from prompt injection to model supply chain compromise
Embed AI-aware guardrails into CI/CD and operations, treating models and assistants as untrusted components requiring continuous monitoring and governance
Teams that adapt now can harness AI’s speed without inheriting its worst security liabilities. Those that do not should expect March 2026 to look mild in hindsight.
Sources & References (10)
1Before You Deploy AI-Generated Code: A Production Checklist AI can generate working code in seconds. Tools like ChatGPT, Claude, and GitHub Copilot have dramatically accelerated development.
But generating code is not the same as shipping production-ready sof...2Remote Code Execution With Modern AI/ML Formats and Libraries Executive Summary
We identified vulnerabilities in three open-source artificial intelligence/machine learning (AI/ML) Python libraries published by Apple, Salesforce and NVIDIA on their GitHub reposi...3AI Agents Security Incidents and related CVEs for Enterprise Security Teams - DataBahn AI Agents Security Incidents and related CVEs for Enterprise Security Teams - DataBahn
Overall Incident Trends
- 16,200 AI-related security incidents in 2025 (49% increase YoY)
- ~3.3 incidents per ...4From CVE Entries to Verifiable Exploits: An Automated Multi-Agent Framework for Reproducing CVEs # From CVE Entries to Verifiable Exploits:
An Automated Multi-Agent Framework for Reproducing CVEs
Report issue for preceding element
Saad Ullah
Boston University
[email protected] Praneeth
Bala...- 5New data on code quality: GPT-5.2 high, Opus 4.5, Gemini 3, and more Functional benchmarks remain a standard for evaluating AI models, effectively measuring whether generated code can pass a test case. As LLMs evolve, they are becoming increasingly proficient at solvin...
6From CVE Entries to Verifiable Exploits: An Automated Multi-Agent Framework for Reproducing CVEs From CVE Entries to Verifiable Exploits: An Automated Multi-Agent Framework for Reproducing CVEs
Abstract
High-quality datasets of real-world vulnerabilities and their corresponding verifiable exploi...- 7Minimum Viable AI Incident Response Playbook The first real AI incidents are not sci-fi. They look like classic data leaks that start from non-classic places: prompts, retrieved documents, model outputs, tool calls, and misconfigured AI pipeline...
8The 2026 AI/ML Threat Landscape Executive Overview
In 2026, the integration of Artificial Intelligence into core business operations has shifted the security perimeter from traditional firewalls to the logic and data layers of the ...9OWASP LLM Top 10: AI Security Risks to Know in 2026 Elevate Consult — March 20, 2026
The OWASP LLM Top 10 framework addresses the most critical security vulnerabilities threatening AI applications today. Organizations deploy large language models in p...- 10DeepSeek Jailbreak Vulnerability Analysis | Qualys DeepSeek-R1, a groundbreaking Large Language Model recently released by a Chinese startup, DeepSeek, has captured the AI industry’s attention. The model demonstrates competitive performance while bein...
Generated by CoreProse in 1m 40s
10 sources verified & cross-referenced 1,479 words 0 false citationsShare this article
X LinkedIn Copy link Generated in 1m 40s### What topic do you want to cover?
Get the same quality with verified sources on any subject.
Go 1m 40s • 10 sources ### What topic do you want to cover?
This article was generated in under 2 minutes.
Generate my article 📡### Trend Radar
Discover the hottest AI topics updated every 4 hours
Explore trends ### Related articles
AI Code Generation Vulnerabilities in 2026: An Architecture-First Defense Plan
Hallucinations#### Over‑Privileged AI: Why Excess Permissions Trigger 4.5x More Incidents
Hallucinations#### The 2026 Surge in Remote & Freelance AI Jobs: Opportunities, Skills, and Risks
trend-radar#### Rogue AI Agents: Inside the Real-World Incidents of Autonomous Systems Going Off-Script
Hallucinations
About CoreProse: Research-first AI content generation with verified citations. Zero hallucinations.
Top comments (0)