Vibe coding security risks are the class of vulnerabilities that emerge when developers use AI coding agents — Cursor, Claude Code, Kiro, and similar tools — to generate and execute code autonomously, without the review checkpoints, audit trails, or policy enforcement that traditionally keep credentials, secrets, and sensitive files protected. Unlike conventional software supply chain risks, vibe coding security risks are baked into the workflow itself: the faster and more autonomously an AI agent works, the larger the window for a malicious instruction to run undetected.
IBM's security researchers put it plainly in their 2025 analysis of AI-assisted development: this is a different beast. The threat model has shifted. The attack surface is no longer just your dependencies or your CI pipeline — it's the instructions you hand to your AI agent before a single line of code is written.
Why AI Coding Agents Introduce a New Class of Risk
Most developers who adopt AI coding agents focus on productivity: faster feature delivery, less boilerplate, fewer context switches. What they don't focus on is what they've introduced alongside those gains. AI coding agents like Cursor, Claude Code, Kiro, CrewAI, LangGraph, and tools built on the OpenAI SDK or Vercel AI framework operate with broad filesystem access by default. They read files, write files, execute shell commands, and make network requests — all in service of completing the task you gave them.
That broad access is what makes them useful. It's also what makes them dangerous when the instructions they receive are malicious or when their behavior at runtime drifts beyond what anyone reviewed.
The attack surface has three distinct layers, and most teams are exposed on all three.
The Skill File Problem Nobody Is Talking About
Many AI coding platforms support Skills — markdown files stored in directories like .cursor/skills/ or .claude/skills/ that contain executable instructions for the agent. These aren't documentation. They're not passive reference material. They're instructions that run. When an agent loads a Skill, it treats the contents as authoritative guidance for how to behave, what tools to use, and what files to access.
Here's where the supply chain risk becomes concrete. When a developer clones a repository that contains a malicious Skill file, that file doesn't trigger a package manager warning. There's no install hook, no signature check, no manifest entry. It sits in the project directory, and the next time the AI agent initializes, it may load and execute those instructions without any prompt to the developer.
A well-crafted malicious Skill can instruct the agent to read ~/.ssh/id_rsa, scan for .env files containing API keys, locate cloud credential files in ~/.aws/credentials or ~/.config/gcloud/, and exfiltrate that data through an outbound request — all as part of a sequence of individually innocent-looking tool calls that collectively achieve silent credential theft.
The depth problem makes this worse. Existing security scanners that attempt to check Skill files typically truncate files at approximately 3,000 characters. A Skill file that hides its malicious instructions past that cutoff — in what appears to be routine documentation — will scan clean. The threat is invisible to current tooling.
Multi-Step Tool Chains: Where "Clean" Skills Still Cause Harm
Even when a Skill file contains no malicious instructions, runtime behavior can still cause harm. AI coding agents are capable of chaining tool calls across multiple steps, and each individual step can look completely legitimate while the combined sequence achieves an unintended outcome.
Consider a sequence like this: the agent reads a project configuration file to understand the environment, then reads a .env file to check for database credentials needed to run a migration script, then writes those credentials into a log file for debugging, then uploads that log to a remote service for analysis. No single step triggers an obvious alarm. The result is credential exfiltration through a chain of plausible, individually defensible actions.
Runtime governance — policy enforcement that monitors and constrains what the agent actually does during execution, not just what its instructions say — is the only mechanism that can catch this class of behavior. Static scanning of Skill files, however thorough, cannot see multi-step runtime sequences before they happen.
This is why two-layer protection is not optional. It's the minimum viable security posture for teams running AI coding agents at any meaningful scale.
The Audit Trail Gap
There is a third problem that gets less attention than supply chain attacks or runtime misbehavior: the absence of any default audit trail. When an AI coding agent reads a file, executes a command, or makes a network request, that action is typically not logged anywhere accessible to a security team. There's no record of what files the agent accessed, what commands it ran, what data it transmitted, or when any of it happened.
For security engineers, this is a familiar problem in a new form. Incident response depends on logs. Compliance frameworks depend on audit trails. The ability to detect, investigate, and remediate a breach depends on having a record of what happened. AI coding agents, by default, provide none of that.
Engineering leaders adopting AI-assisted development workflows need to ask a direct question: if an agent exfiltrated credentials from a developer's machine last Tuesday, would you know? What would you look at? What would you find?
For most teams today, the honest answer is: nothing. There's no trail to follow.
Two Layers of Protection: Skill Sentinel and Runtime Guardrails
At Enkrypt AI, we built our response to this threat model around the premise that single-layer defenses are insufficient. Scanning alone doesn't catch runtime misbehavior. Runtime governance alone doesn't catch supply chain attacks delivered through Skill files. You need both, and they need to work together.
The first layer is Skill Sentinel, our open source scanner purpose-built for AI agent Skill files. Unlike general-purpose scanners that truncate at ~3,000 characters, Skill Sentinel reads the full depth of Skill files and applies detection logic designed specifically for the attack patterns that appear in malicious Skills — exfiltration sequences, credential access instructions, obfuscated commands, and social engineering patterns embedded in what looks like routine documentation. It integrates into CI/CD pipelines and developer workflows across Cursor, Claude Code, Kiro, and other major platforms, catching supply chain threats before they reach execution.
The second layer is runtime guardrails. Enkrypt AI's runtime governance layer monitors agent behavior during execution, enforcing policies that constrain what the agent can read, write, execute, and transmit. When an agent attempts to access ~/.ssh/, read a .env file, or chain tool calls in a pattern consistent with data exfiltration, the guardrail fires. The action is blocked or flagged, and the event is logged — creating the audit trail that enables incident response and compliance reporting.
Together, these two layers address the three dimensions of vibe coding security risk: supply chain threats introduced through malicious Skill files, runtime misbehavior from clean Skills used in harmful sequences, and the audit trail gap that leaves security teams blind to what agents are actually doing.
What This Means for Your Stack
If your engineering team is using Cursor, Claude Code, Kiro, CrewAI, LangGraph, the OpenAI SDK, or Vercel AI, your agents are almost certainly operating without pre-execution Skill scanning, without runtime policy enforcement, and without an audit trail. That's not a hypothetical risk — it's the default state for every team that adopts these tools without adding a security layer explicitly designed for this threat model.
IBM is right that this is a different beast. The attack surface isn't a dependency you can pin, a vulnerability you can patch, or a misconfiguration you can fix once and move on. It's structural. It lives in the workflow. And it scales as fast as your AI agent adoption does.
The answer isn't to slow down AI-assisted development. The answer is to add the security layer that makes fast, autonomous development safe. That starts with understanding what your agents are actually doing — and putting the controls in place to make sure it's only what you intend.
You can learn more about how Enkrypt AI approaches this problem, including Skill Sentinel and our runtime guardrails for AI coding agents, at Enkrypt AI's Secure Vibe Coding solution.
Top comments (0)