<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Ahmed Ibrahim</title>
    <description>The latest articles on DEV Community by Ahmed Ibrahim (@ahmedibrahim).</description>
    <link>https://dev.to/ahmedibrahim</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3808627%2F25f7c7fb-fc5d-4108-86e3-8950baf21ab6.jpg</url>
      <title>DEV Community: Ahmed Ibrahim</title>
      <link>https://dev.to/ahmedibrahim</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/ahmedibrahim"/>
    <language>en</language>
    <item>
      <title>The AI Code Review Bottleneck Is Already Here. Most Teams Haven’t Noticed.</title>
      <dc:creator>Ahmed Ibrahim</dc:creator>
      <pubDate>Wed, 04 Mar 2026 18:00:55 +0000</pubDate>
      <link>https://dev.to/ahmedibrahim/the-ai-code-review-bottleneck-is-already-here-most-teams-havent-noticed-50fa</link>
      <guid>https://dev.to/ahmedibrahim/the-ai-code-review-bottleneck-is-already-here-most-teams-havent-noticed-50fa</guid>
      <description>&lt;p&gt;&lt;em&gt;A 3-lane, 90-day playbook for teams that ship AI-assisted code.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F56wk8bcvuzzsxb3sgygl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F56wk8bcvuzzsxb3sgygl.png" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I’m not anti-AI, but I am anti-surprises. I’ve been working in infrastructure long enough to know how things break, and it’s almost never dramatic. Nobody deploys a rootkit on purpose. Unless they’re the bad guy, and in that case: congrats on being proactive. What actually happens is someone pastes a “quick helper” into a repo on a Friday afternoon, the code compiles, tests are green, and everyone wants to go home. Two weeks later, you’re on a call at midnight because the helper logs the full request body including the authorization header “just for debugging.” Or it catches every exception and silently returns success, so the function never actually fails, until it fails in a way nobody notices for three days. Nobody did anything malicious. It just happened fast.&lt;/p&gt;

&lt;p&gt;That’s the shift worth paying attention to. We can now generate a lot of code, quickly, and it often looks confident while being slightly wrong. Stack Overflow’s 2025 Developer Survey captures the tension across multiple AI questions: 84% of respondents say they are using or planning to use AI tools (Stack Overflow Developer Survey 2025 — AI). When asked about AI accuracy, more developers distrust it than trust it (46% vs 33%) (Stack Overflow Developer Survey 2025 — AI). And when asked about frustrations, “almost right, but not quite” is the most commonly cited issue (66%) (Stack Overflow Developer Survey 2025).&lt;/p&gt;

&lt;p&gt;If you’ve ever debugged someone else’s code, you know that almost right is sometimes worse than completely wrong. At least completely wrong fails loudly.&lt;/p&gt;

&lt;p&gt;The Veracode 2025 GenAI Code Security Report put numbers on what that looks like in practice: 45% of AI-generated code samples introduced OWASP Top 10 vulnerabilities across 100+ large language models, with Java hitting a 72% security failure rate and Python, C#, and JavaScript all falling in a similar range. (Veracode 2025 GenAI Code Security Report) And this isn’t just a lab finding anymore. Aikido Security’s 2026 State of AI report, surveying 450 developers and security leaders across Europe and the US, found that one in five organizations have already suffered a security incident caused by AI-generated code, and 69% have found vulnerabilities introduced by it in their own systems. (Aikido Security: State of AI in Security &amp;amp; Development, 2026)&lt;/p&gt;

&lt;p&gt;So this isn’t the scare piece. This is the boring follow-up where we actually do something about it.&lt;/p&gt;

&lt;h3&gt;The Principle That Makes Everything Else Work&lt;/h3&gt;

&lt;p&gt;Don’t create an “AI lane.” Create risk lanes. There’s a temptation to treat AI-generated code as something that needs its own special review process, a separate track with a label and a checkbox on the PR template. The intent is usually good because we want visibility into what’s being generated versus what’s being written by hand.&lt;/p&gt;

&lt;p&gt;But there’s a real risk this backfires. A KPMG and University of Melbourne study surveying over 48,000 workers across industries in 47 countries found that 57% of employees conceal how they use AI at work. (KPMG Trust in AI, 2025) And a study published in Harvard Business Review showed that when engineers evaluated identical Python code, they rated the author’s competence 9% lower if they believed AI was used: same code, lower score, just because of the label. (HBR: The Hidden Penalty of Using AI at Work, 2025)&lt;/p&gt;

&lt;p&gt;These are studies about perception and behavior broadly, not about engineering teams specifically, but the pattern they describe is hard to ignore. If you build a process that singles out AI-generated code, you’re likely creating an incentive to hide it, and then you end up with the worst of both worlds: AI-generated code everywhere, zero visibility into where it is.&lt;/p&gt;

&lt;p&gt;Here’s what I think works better. Route reviews by what the code touches, not by who or what wrote it. In practice, that means splitting your changes into something like three lanes:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fast lane:&lt;/strong&gt; Documentation, comments, test descriptions, CSS and styling, localization strings. These carry minimal blast radius. No CODEOWNERS requirement on these paths, lighter CI checks (skip SAST, skip IaC scanning), standard branch protection still applies. One reviewer, automated checks pass, merge and move on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Standard lane:&lt;/strong&gt; Application logic, API endpoints, frontend components, database queries that don’t touch schema. This is most of your PRs and your default review process. These can still introduce security issues, and that is what your SAST checks and review process are for. One or two reviewers, all status checks green, CODEOWNERS approval where relevant.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Critical lane:&lt;/strong&gt; Anything that touches authentication or authorization logic, CI/CD workflows and pipeline definitions, infrastructure-as-code, secrets management, database schema migrations, CODEOWNERS itself, or network and firewall rules. Enforce this by adding CODEOWNERS entries for critical paths (like .github/workflows/** and your infra/ directory, for example) and requiring code owner approval in your branch protection rules. That turns the lane from a suggestion into a gate. The designated reviewer actually understands the blast radius of the file they're approving.&lt;/p&gt;
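
&lt;p&gt;To make the critical lane concrete, here is a sketch of what those CODEOWNERS entries might look like. The paths and team handles are illustrative; map them to wherever your workflows, infrastructure, and auth code actually live:&lt;/p&gt;

```
# Critical-lane paths: merges touching these require owner approval
# (with "Require review from Code Owners" enabled on the branch).
/.github/workflows/   @org/platform-team
/infra/               @org/platform-team
/src/auth/            @org/security-team
/CODEOWNERS           @org/platform-team
```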

&lt;p&gt;The lanes aren’t about AI. A human writing a Terraform change that opens port 22 to the internet is just as dangerous as Copilot doing the same thing. The point is that your review effort goes where the damage potential actually lives, and code that can’t hurt you in production doesn’t sit in a queue waiting for the same level of scrutiny as code that can.&lt;/p&gt;

&lt;p&gt;One thing to watch for: people can try to work around the lanes for many reasons. Someone splits a PR so the auth logic change lands in one diff and the “harmless refactor” that makes it work lands in another. Or they rename a file to dodge a CODEOWNERS path. The mitigations are: keep your CODEOWNERS paths broad enough to catch common renames (own the directory, not just the filename), add CI checks that scan for security-sensitive patterns like credential handling or permission changes regardless of which file they appear in, and be honest that if someone is actively working around your review process, you have a trust problem that no amount of tooling will fix on its own.&lt;/p&gt;
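
&lt;p&gt;A minimal sketch of that second mitigation, a content-based CI check that keys on what a diff adds rather than which file it touches. The patterns and function name here are illustrative, not a complete ruleset:&lt;/p&gt;

```python
import re

# Illustrative patterns only: the point is that the check keys on content,
# not on file paths, so a rename does not dodge it.
SENSITIVE_PATTERNS = {
    "workflow write permission": re.compile(r"permissions:\s*write-all"),
    "hardcoded credential": re.compile(
        r"(?i)(api[_-]?key|secret|token)\s*[:=]\s*['\"][^'\"]+['\"]"
    ),
    "world-open ingress": re.compile(r"0\.0\.0\.0/0"),
}

def scan_diff(diff_text):
    """Return (label, line_number) findings for added lines in a unified diff."""
    findings = []
    for lineno, line in enumerate(diff_text.splitlines(), start=1):
        # Only inspect added lines; skip the "+++" file header lines.
        if not line.startswith("+") or line.startswith("+++"):
            continue
        for label, pattern in SENSITIVE_PATTERNS.items():
            if pattern.search(line):
                findings.append((label, lineno))
    return findings
```

&lt;p&gt;In CI this would run over the PR diff (for example the output of git diff against the base branch) and fail the status check whenever the findings list is non-empty.&lt;/p&gt;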

&lt;h3&gt;The Volume Problem Nobody Planned For&lt;/h3&gt;

&lt;p&gt;Before we get into the playbook, it’s worth understanding why this is urgent rather than just important.&lt;/p&gt;

&lt;p&gt;AI tools don’t just change what code looks like, they change how much of it shows up in your review queue. Faros AI published research in 2025 based on telemetry from over 10,000 developers across 1,255 teams. Teams with high AI adoption completed 21% more tasks and merged 98% more pull requests, which sounds great until you see the other side: PR review times increased by 91%, and the PRs were also larger. (Faros AI: The AI Productivity Paradox, 2025). The bottleneck moved. It used to be writing code. Now it’s reviewing it, and most teams haven’t adjusted.&lt;/p&gt;

&lt;p&gt;I want to be honest here: the playbook below does not magically solve the volume problem. Nobody has a clean, proven answer to “how do you review two or three times as many PRs with the same number of senior engineers.” What the controls below are designed to do is make sure the increased volume goes through actual checks instead of getting rubber-stamped because the reviewer has 47 PRs in their queue and a sprint review in two hours.&lt;/p&gt;

&lt;p&gt;The closest thing to a real strategy right now is layering. Automated checks catch the surface-level problems before a human ever opens the diff. Risk-based routing through CODEOWNERS means expensive human attention goes where it actually matters, which is why the lane system above exists: your senior engineers should never spend their limited review time on a docs change when there’s a workflow permission change three PRs down in the queue. Generation-time guardrails like AGENTS.md mean the PR that arrives in your queue is already cleaner because the agent ran linting and tests before opening it. And AI-assisted code review tools like GitHub Copilot code review or CodeRabbit are becoming a practical first-pass layer that catches obvious bugs and known vulnerability patterns before a human reviewer ever sees the diff. None of these layers are perfect on their own, and the AI-assisted review tools in particular are still early enough that your team will spend the first few weeks calibrating what to ignore versus what to act on.&lt;/p&gt;

&lt;p&gt;But the net effect is that reviewers spend their time on logic, architecture, and security design instead of catching hardcoded secrets and missing null checks. That’s the difference between a review process that scales and one that quietly collapses under weight.&lt;/p&gt;

&lt;h3&gt;Day 1: Stop the Bleeding&lt;/h3&gt;

&lt;p&gt;The lanes decide who reviews what. The 90-day plan below decides what controls run before a reviewer ever sees the diff.&lt;/p&gt;

&lt;p&gt;Day 1 is about what you can do this week, not in a perfect world, not after the next planning cycle.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Protect the branch, protect your future self.&lt;/strong&gt; Start with the basics that everything else depends on: disable force pushes and branch deletion on protected branches. A bad PR that gets merged leaves a trail you can investigate, but a force push rewrites that trail entirely, and a deleted branch takes it with it. Once those are locked down, build on top of them by requiring pull requests for main, requiring at least one reviewer, and requiring status checks before merge with build and tests at minimum. This isn’t about distrust, it’s about stopping “oops” from becoming “incident.” If you don’t have this foundation already, everything else in this article is academic.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;CODEOWNERS for the files that can actually hurt you.&lt;/strong&gt; Not every file in your repo carries the same risk. A CSS change and a workflow permission change are not the same thing, and pretending every reviewer is equally qualified for both is how you end up with a junior approving a change to your CI/CD pipeline because the diff looked small. Add CODEOWNERS for .github/workflows/**, your infrastructure directory, wherever your authentication logic lives, and the CODEOWNERS file itself, because people are creative. If you enable "Require review from Code Owners," GitHub enforces that the right people approve the right files. This is where the critical lane becomes real: CODEOWNERS turns “this needs the right reviewer” from a suggestion into a gate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Secret scanning with push protection.&lt;/strong&gt; Push protection stops the commit before the secret reaches the remote, so turn it on even if you think your team would “never” commit secrets, because they will, just not on purpose. With AI tools generating config files and helper scripts at volume, the probability goes up, not down.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Dependency scanning and a basic SAST pass.&lt;/strong&gt; You don’t need perfect tooling on Day 1, you need consistent signal. Turn on dependency alerts and run a basic SAST scan on PRs. It will be noisy, but you’re not trying to catch everything yet, just trying to stop shipping something obviously avoidable while you build the rest.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AGENTS.md if your team uses AI coding agents.&lt;/strong&gt; This one surprised me. AGENTS.md is an open format that multiple AI coding agents now support, including Codex and Cursor among others (see agents.md for the current list). Think of it as a README but for agents: you put it in your repo root and it tells the agent things like “run linting before opening a PR,” “never modify workflow files without flagging for review,” and “do not commit credentials even in test files.”&lt;/p&gt;

&lt;p&gt;It’s not enforcement, since the agent could still ignore it the same way a human could ignore your CONTRIBUTING.md. But it creates a shared expectation at the repo level, which means you’re not relying on every developer to individually configure their AI tool correctly. If you’re setting up repo templates for new projects, add an AGENTS.md alongside your CODEOWNERS file. It sets a baseline before the first AI-generated PR ever lands.&lt;/p&gt;
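
&lt;p&gt;A minimal AGENTS.md along those lines might look like this. The specific instructions are examples, not a standard set; the format is just markdown that the agent reads from the repo root:&lt;/p&gt;

```markdown
# AGENTS.md

## Before opening a PR
- Run the linter and the full test suite; fix failures before pushing.

## Hard rules
- Never modify files under .github/workflows/ without flagging the change for human review.
- Never commit credentials, tokens, or API keys, even in test fixtures.
- Keep PRs small and single-purpose; split unrelated changes.
```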

&lt;p&gt;&lt;strong&gt;Day 1 goal:&lt;/strong&gt; reduce the probability of shipping something that would embarrass you in an incident review.&lt;/p&gt;

&lt;h3&gt;Day 30: Make Quality Repeatable&lt;/h3&gt;

&lt;p&gt;By Day 30 you’ve had the Day 1 controls running long enough to see what they catch and what they miss. You’ve probably had at least one PR where the status checks saved you from something, and at least one where they didn’t catch something they should have. That’s the signal you use to tighten things up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mandatory checks before merge.&lt;/strong&gt; At this point, make these non-negotiable in your branch protection: build, tests, dependency scan, secret scan, and an IaC scan if infrastructure-as-code lives in the repo. If a PR can bypass any of these and still land in main, you have a policy, not a control. A policy says “we expect people to do this”, but a control means the system won’t let you skip it. By Day 30, you should be running controls.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;AI-assisted code review as a first pass.&lt;/strong&gt; This is the most direct answer to the volume problem from earlier. Tools like GitHub Copilot code review or CodeRabbit can review every PR before a human touches it. The better ones combine static analysis with LLM reasoning, so they can tell whether that SQL query is actually dangerous in context rather than just flagging every string concatenation. Your human reviewers should be spending their time on whether the architecture makes sense and whether the security design holds up under edge cases, not on spotting a hardcoded API key in line 47. For context on cost, GitHub Copilot Business, which includes the code review feature, is $19/user/month. Third-party alternatives have their own pricing. The specific tool matters less than the principle: if your team generates more PRs than your reviewers can thoughtfully evaluate, you either add an automated first-pass layer or you accept that human review becomes performative.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SBOM as a practical upgrade.&lt;/strong&gt; I didn’t think much about Software Bill of Materials until I thought through what happens without one. If a customer asks what libraries are inside your product, the answer is an actual list instead of silence and a follow-up email. If a dependency you didn’t even know was in the chain turns out to have a CVE, impact analysis becomes a query instead of detective work. If you want practical guidance on how SBOM fits into broader supply chain security, SLSA (Supply-chain Levels for Software Artifacts) is worth reading. It’s an OpenSSF framework that provides incremental levels for improving supply chain integrity from basic provenance tracking up through tamper-proof builds. (SLSA Framework)&lt;/p&gt;
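
&lt;p&gt;The “impact analysis becomes a query” point can be made literal. Given a CycloneDX-style SBOM, checking whether a newly disclosed CVE touches your chain is a lookup. The structure below is a minimal illustrative subset of the real schema, and the function name is my own:&lt;/p&gt;

```python
def find_affected(sbom, package_name, affected_versions):
    """Return components in a CycloneDX-style SBOM matching a vulnerable package.

    sbom: dict with a "components" list of {"name": ..., "version": ...} entries.
    This is a minimal subset of the real CycloneDX schema, for illustration.
    """
    hits = []
    for component in sbom.get("components", []):
        if component.get("name") == package_name:
            if component.get("version") in affected_versions:
                hits.append(component)
    return hits
```

&lt;p&gt;Without the SBOM, answering the same question means grepping lockfiles across every repo and hoping transitive dependencies show up.&lt;/p&gt;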

&lt;p&gt;&lt;strong&gt;Day 30 goal:&lt;/strong&gt; the quality of what ships no longer depends on who happens to be reviewing that day.&lt;/p&gt;

&lt;h3&gt;The Line Between Suggesting and Executing&lt;/h3&gt;

&lt;p&gt;There’s a line that a lot of guardrail discussions skip over, and I think this is an important distinction in this conversation.&lt;/p&gt;

&lt;p&gt;When AI is only generating text in your IDE, you’re dealing with a code quality problem where the developer sees the suggestion, maybe accepts it without reading carefully, and it goes through whatever review process exists. The human is still in the loop, even if they’re not paying enough attention.&lt;/p&gt;

&lt;p&gt;When AI can execute actions, that is a different problem entirely. Tools like Codex, Cursor in agent mode, and Cline don’t just suggest code, they read your repository, run terminal commands, modify files across your codebase, and create pull requests autonomously.&lt;/p&gt;

&lt;p&gt;Research on LLM agent security has been accelerating. A comprehensive survey published in late 2025 on attacks and defenses targeting LLM-based agents identifies how tool use and iterative execution expand the attack surface compared to single-turn text generation: an agent that can read files and execute commands is a fundamentally different risk surface than a chatbot that answers questions. The practical risks include prompt injection through repository content, exfiltration of code or secrets through tool calls, and manipulation of agent behavior through poisoned context.&lt;/p&gt;

&lt;p&gt;Two disclosed CVEs show what this looks like when it reaches production. CVE-2025-53773 is a command injection vulnerability in GitHub Copilot and Visual Studio (CVSS 7.8, patched August 2025). CVE-2025-54135, nicknamed “CurXecute,” is a similar class of issue in the Cursor editor (CVSS 8.6 per vendor advisory, patched in Cursor 1.3.9, July 2025). In both cases, public write-ups describe how prompt-injection style inputs in files an agent reads can translate into unintended command execution. The agent does exactly what it was told, just not by the developer. (Survey: Security of LLM-based agents, ScienceDirect 2025) (Embrace The Red: CVE-2025-53773) (Aim Security: CurXecute)&lt;/p&gt;

&lt;p&gt;What this means for the playbook: execution rights require sandboxing, least privilege, and strong audit trails, and the controls in Day 90 are the minimum when you have tools that can run commands.&lt;/p&gt;

&lt;h3&gt;Day 90: Make It Survive the Worst Week&lt;/h3&gt;

&lt;p&gt;By Day 90 the basics are muscle memory. PRs get reviewed, checks run, secrets get caught. The question shifts from “are we doing something?” to “could we survive an audit, an incident, or a very pointed question from a customer?”&lt;/p&gt;

&lt;p&gt;This phase is about one thing: being able to answer “what happened, who approved it, and why?” without scrambling through Slack threads at 2 AM.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Least privilege identities per environment.&lt;/strong&gt; This sounds obvious until you audit what’s actually running in your pipelines. In practice, it means no long-lived production credentials stored as CI secrets. Replace them with short-lived identities: OIDC federation between your CI provider and your cloud, workload identity in Kubernetes, or federated tokens where those aren’t available. In GitHub Actions, that looks like configuring the id-token: write permission and using your cloud provider's OIDC login action instead of storing a long-lived secret. AWS, Azure, and GCP all support this pattern, and the setup is similar across all three: you create a trust relationship between your CI provider and your cloud identity system, scoped to a specific repo and branch, so the token only works for that pipeline running against that branch. Scope RBAC tightly and separate dev, staging, and production identities completely, because if a credential can reach production, it is production infrastructure whether you labeled it that way or not. Teams that do this audit almost always discover at least one identity with broader access than anyone intended.&lt;/p&gt;
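
&lt;p&gt;In GitHub Actions terms, the pattern looks roughly like this, using AWS as the example cloud. The role ARN, region, and deploy step are placeholders; the important parts are the id-token permission and the absence of any stored cloud secret:&lt;/p&gt;

```yaml
name: deploy
on:
  push:
    branches: [main]

permissions:
  id-token: write   # lets the job request a short-lived OIDC token from GitHub
  contents: read

jobs:
  deploy:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: aws-actions/configure-aws-credentials@v4
        with:
          # Placeholder role. The trust policy on the AWS side is scoped to
          # this repo and branch, so the token only works for this pipeline.
          role-to-assume: arn:aws:iam::123456789012:role/ci-deploy
          aws-region: eu-west-1
      - run: ./deploy.sh   # placeholder deploy step
```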

&lt;p&gt;&lt;strong&gt;Environment approvals and separation of duties.&lt;/strong&gt; Production deployments should require an approval gate, and that approval should be auditable. In GitHub Actions, this means configuring environment protection rules on your production environment: add required reviewers (at least one person who is not the PR author), and enable "Prevent self-review" so the person who triggered the deployment cannot also approve it. The deployment logs then show exactly who approved, what commit SHA was deployed, and what time the approval was given. That paper trail is the difference between a 30-minute incident review and a three-day forensic investigation.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Runner isolation for sensitive jobs.&lt;/strong&gt; If a runner can deploy to production, treat it as production-adjacent infrastructure. For self-hosted runners, use dedicated runner groups for production deployments, give them a distinct label such as production-deploy that jobs target via runs-on, restrict those runners so only specific workflows can use them, and put them in a separate network segment with access only to production endpoints and your artifact registry. Ephemeral runners that spin up clean for each job and get destroyed after are even better, because you eliminate the possibility of state leaking between workflow runs. If you're on GitHub-hosted runners, be aware that standard hosted runners share infrastructure and don't offer network-level isolation out of the box. Private networking options exist but they require enterprise-tier plans and additional cloud configuration, so evaluate whether your deployment security requirements justify self-hosted runners instead. This matters more now that AI tools generate deployment scripts and workflow files at volume, because the damage a bad workflow can do is not "this PR has a bug," it's "this workflow has permissions to push artifacts to production and nobody noticed the scope."&lt;/p&gt;
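
&lt;p&gt;The job-side routing for that setup is small; the real work is restricting the runner group and segmenting its network. A sketch, with illustrative label and script names:&lt;/p&gt;

```yaml
jobs:
  deploy-prod:
    # Routes the job to the restricted self-hosted group only; the group
    # itself must be limited to this repository and workflow in org settings.
    runs-on: [self-hosted, production-deploy]
    steps:
      - run: ./release.sh   # placeholder deploy step
```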

&lt;p&gt;&lt;strong&gt;Policy as code for the patterns that should never pass.&lt;/strong&gt; By Day 90, certain things should be automatically blocked before they reach a reviewer. Use Open Policy Agent (OPA) with Rego policies, or if you’re in GitHub, branch rulesets combined with repository rules. The patterns to automate first: workflows requesting permissions: write-all or contents: write without explicit justification, pull requests that modify .github/workflows/** without approval from the platform team (enforced through CODEOWNERS plus a required status check that validates the approval), infrastructure-as-code changes that open inbound access from 0.0.0.0/0 on non-HTTP ports (which means "allow the entire internet to connect," and is one of the most common cloud misconfigurations when someone writes a security group rule for SSH or a database port and forgets to restrict the source IP range), and deployment artifacts that are unsigned when your policy requires signing. You can enforce these through pre-merge checks that run OPA against the PR diff, or through CI steps that validate the final configuration state before deployment proceeds. Start the highest-confidence rules as blocking. For noisier ones, start as warnings but set an explicit deadline to tune or delete them, because advisory mode without a deadline is just a more expensive way to generate alerts nobody reads.&lt;/p&gt;
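
&lt;p&gt;As one example of what policy as code looks like here, this is a sketch of the write-all check in Rego. The input shape is an assumption (the workflow YAML parsed into JSON) and the package and rule names are illustrative:&lt;/p&gt;

```rego
package ci.workflow_policy

import rego.v1

# Deny workflows that request blanket write permissions at the top level.
deny contains msg if {
    input.permissions == "write-all"
    msg := "workflow requests write-all permissions"
}

# Deny individual jobs that request blanket write permissions.
deny contains msg if {
    some name, job in input.jobs
    job.permissions == "write-all"
    msg := sprintf("job %q requests write-all permissions", [name])
}
```

&lt;p&gt;Wired into CI as a pre-merge step, a non-empty deny set fails the status check before a reviewer ever opens the diff.&lt;/p&gt;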

&lt;p&gt;&lt;strong&gt;Day 90 goal:&lt;/strong&gt; your security posture survives the worst week of the year, not just the best.&lt;/p&gt;

&lt;h3&gt;Final Thought&lt;/h3&gt;

&lt;p&gt;If you’re already using AI tooling in your development workflow, you’re not early, you’re normal. The Stack Overflow numbers say 84% of developers are in the same position, and the real question is whether your guardrails grew alongside the tools or whether they’re still where they were two years ago when the main concern was someone copy-pasting from the wrong Stack Overflow answer.&lt;/p&gt;

&lt;p&gt;Every section in this article is something you can start without a massive initiative. Day 1 is branch protection and CODEOWNERS. Day 30 is mandatory checks and automated review layers. Day 90 is least privilege and policy as code. None of it requires a new department or a six-month project, it just requires deciding that the speed of shipping doesn’t get to outrun the speed of knowing what you shipped.&lt;/p&gt;

&lt;p&gt;The controls that protect you from a bad human commit are the same ones that protect you from a bad AI-generated commit. The only difference is volume, and now you know where to start.&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;Sources&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://survey.stackoverflow.co/2025/ai/" rel="noopener noreferrer"&gt;Stack Overflow 2025 Developer Survey&lt;/a&gt; — 49,000+ respondents (May-June 2025). Stats cited are from separate questions in the AI section: 84% using or planning to use AI tools; 46% distrust vs 33% trust AI accuracy; 66% cite “almost right, but not quite” as the most common frustration.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.veracode.com/resources/analyst-reports/2025-genai-code-security-report/" rel="noopener noreferrer"&gt;Veracode 2025 GenAI Code Security Report&lt;/a&gt; — Tested 80 coding tasks across 100+ LLMs in Java, Python, C#, and JavaScript. Published July 2025. Stats cited: 45% introduced OWASP Top 10 vulnerabilities, Java at 72%. (&lt;a href="https://www.veracode.com/wp-content/uploads/2025_GenAI_Code_Security_Report_Final.pdf" rel="noopener noreferrer"&gt;Full PDF&lt;/a&gt;)&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.aikido.dev/state-of-ai-security-development-2026" rel="noopener noreferrer"&gt;Aikido Security: State of AI in Security &amp;amp; Development 2026&lt;/a&gt; — Survey of 450 developers, AppSec engineers, and CISOs across Europe and the US. Found one in five organizations suffered a security incident from AI-generated code, 69% found vulnerabilities in AI code.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://kpmg.com/xx/en/our-insights/ai-and-technology/trust-attitudes-and-use-of-ai.html" rel="noopener noreferrer"&gt;KPMG and University of Melbourne: Trust in AI 2025&lt;/a&gt; — Global study of 48,000+ workers across industries in 47 countries. Found 57% of employees conceal AI usage at work.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://hbr.org/2025/08/research-the-hidden-penalty-of-using-ai-at-work" rel="noopener noreferrer"&gt;Harvard Business Review: The Hidden Penalty of Using AI at Work&lt;/a&gt; — Controlled experiment with 1,026 engineers evaluating identical Python code. Found 9% competence penalty when reviewers believed AI was used. Published August 2025.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.faros.ai/blog/ai-software-engineering" rel="noopener noreferrer"&gt;Faros AI: The AI Productivity Paradox 2025&lt;/a&gt; — Telemetry from 10,000+ developers across 1,255 teams. Published July 2025. Found 98% more PRs merged, 91% longer review times.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://slsa.dev/" rel="noopener noreferrer"&gt;SLSA Framework&lt;/a&gt; — Supply-chain Levels for Software Artifacts. OpenSSF project. Incremental levels for supply chain security.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.sciencedirect.com/science/article/abs/pii/S1566253525010036" rel="noopener noreferrer"&gt;Security of LLM-based Agents: A Comprehensive Survey&lt;/a&gt; — Academic survey published on ScienceDirect (2025) covering attack methods, defense mechanisms, and real-world vulnerabilities in LLM agent systems.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://embracethered.com/blog/posts/2025/github-copilot-remote-code-execution-via-prompt-injection/" rel="noopener noreferrer"&gt;CVE-2025–53773: GitHub Copilot and Visual Studio&lt;/a&gt; — CVSS 7.8 (High). Command injection issue; patched August 2025. Additional analysis: &lt;a href="https://www.persistent-security.net/post/part-iii-vscode-copilot-wormable-command-execution-via-prompt-injection" rel="noopener noreferrer"&gt;Persistent Security&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://www.aim.security/post/when-public-prompts-turn-into-local-shells-rce-in-cursor-via-mcp-auto-start" rel="noopener noreferrer"&gt;CVE-2025–54135: Cursor RCE via MCP Prompt Injection (CurXecute)&lt;/a&gt; — CVSS 8.6 (High) per vendor advisory. Poisoned MCP server data could rewrite global MCP config and execute attacker-controlled commands. Patched in Cursor 1.3.9, July 2025. Detailed exploit chain documented by Aim Security.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://agents.md/" rel="noopener noreferrer"&gt;AGENTS.md Open Format&lt;/a&gt; — Open format for project-level instructions to AI coding agents. Supported by multiple tools including Codex and Cursor. Stewarded by the Agentic AI Foundation under the Linux Foundation.&lt;/p&gt;




</description>
      <category>devops</category>
      <category>security</category>
      <category>ai</category>
      <category>programming</category>
    </item>
    <item>
      <title>Developers Have Been Shipping AI-Generated Vulnerabilities Since 2021</title>
      <dc:creator>Ahmed Ibrahim</dc:creator>
      <pubDate>Sun, 30 Nov 2025 20:26:57 +0000</pubDate>
      <link>https://dev.to/ahmedibrahim/developers-have-been-shipping-ai-generated-vulnerabilities-since-2021-13pd</link>
      <guid>https://dev.to/ahmedibrahim/developers-have-been-shipping-ai-generated-vulnerabilities-since-2021-13pd</guid>
      <description>&lt;p&gt;Why Is Vibe Coding the Problem?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnz2f8fncje9xd925e6in.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnz2f8fncje9xd925e6in.png" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;&lt;strong&gt;The Pattern I Keep Recognizing&lt;/strong&gt;&lt;/h3&gt;

&lt;p&gt;I’ve worked in cloud engineering long enough to recognize how these things go. A new technology arrives and everyone gets excited. Proof of concepts appear everywhere, usually without anyone asking too many questions. Someone calls it the future of everything. Then months or years later, someone finally asks whether this is actually safe, and by that point the logs are already filling up and half the company depends on it.&lt;/p&gt;

&lt;p&gt;I’ve watched this movie enough times to recognize the opening credits.&lt;/p&gt;

&lt;p&gt;The newest version is what people are calling vibe coding. If you’ve managed to avoid the term, it describes non-technical people using AI tools to generate full applications and deploying them without deep knowledge of the underlying code. Marketing teams building dashboards with Bolt. Product managers generating internal tools with Lovable. People producing full-stack applications with authentication and databases without writing a single line manually.&lt;/p&gt;

&lt;p&gt;A colleague of mine, Fred, brought this up in a meeting recently. We were discussing how the organization should handle vibe coding, and he asked a question that stuck with me: “What’s the opposite of vibe coding? Boring coding?” He laughed, but then made a serious point. His argument was that it shouldn’t matter where code comes from. Vibe coded, Copilot-assisted, or typed out manually by someone who thinks tabs are superior to spaces. All of it should go through the same process. No special lane for anything.&lt;/p&gt;

&lt;p&gt;He’s right in principle. The problem is that the “process” he’s describing often consists of someone glancing at a diff and typing “LGTM” before moving on to the next ticket. And look, sometimes that’s fine. Not every three-line config change needs a security review. But when the code is generated by a tool that might have confidently produced something subtly broken, that quick approval becomes a different kind of gamble.&lt;/p&gt;

&lt;p&gt;The industry reaction has followed the predictable path. Security teams are writing policies. Leaders are blocking tools. LinkedIn is full of warnings about the dangers of letting non-developers write code. It took roughly three months for this conversation to shift from curiosity to concern, and some would go further and call it a crisis.&lt;/p&gt;

&lt;p&gt;While everyone panicked about beginners, I kept thinking about something else entirely. The tools the industry is already comfortable with, such as GitHub Copilot, have been generating code since 2021. Copilot completes functions, scaffolds files, and frequently provides entire implementations. Developers adopted it immediately. Security teams approved it. It came from Microsoft, so it felt safe.&lt;/p&gt;

&lt;p&gt;This led to an obvious question. If the concern is that non-developers generate code they don’t fully understand, what about developers generating code they also don’t fully understand?&lt;/p&gt;

&lt;p&gt;That felt worth digging into. So I started reading research papers. What I found was a mix of reassuring and concerning information, sometimes appearing in the same paragraph. Honestly, the pattern suggested that the industry might be having the wrong conversation entirely.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;What the Research Actually Says&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;The most referenced study on this topic came from NYU in 2022. Researchers evaluated GitHub Copilot across 89 security-related scenarios aligned with MITRE CWE Top 25, generating 1,689 programs in Python, C, and Verilog. About 40 percent of those programs contained security vulnerabilities.&lt;/p&gt;

&lt;p&gt;Forty percent. From a tool that millions of developers use daily.&lt;/p&gt;

&lt;p&gt;Now, this study came from a respected institution and underwent peer review at IEEE S&amp;amp;P, which makes it credible. It’s also worth noting that this was Copilot in 2021, not today. Things have improved. A targeted replication published in 2023 and presented at SANER in 2024 focused only on Python and found that the rate of insecure solutions had dropped from 36.54 percent to 27.25 percent for comparable scenarios.&lt;/p&gt;

&lt;p&gt;That’s genuine progress. It also means more than a quarter of completions still contained vulnerabilities. Progress, yes. Problem solved, no.&lt;/p&gt;

&lt;p&gt;Then there’s industry research. Veracode published a GenAI Code Security Report in 2025 where they evaluated 80 coding tasks across more than 100 large language models. They found that 45 percent of generated code contained vulnerabilities when assessed against OWASP Top 10 categories. Java was the worst at roughly 72 percent. Python, JavaScript, and C# ranged from 38 to 45 percent.&lt;/p&gt;

&lt;p&gt;The consistent pattern across these studies isn’t the exact percentage. The numbers vary by task, language, and model version. What doesn’t vary is that vulnerability rates are significant and measurable, regardless of which specific tool or year you examine.&lt;/p&gt;

&lt;p&gt;The degree varies. The direction doesn’t.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Understanding Problem Nobody Wants to Discuss
&lt;/h3&gt;

&lt;p&gt;A common argument in vibe coding debates is that non-developers don’t understand the code they generate. That’s obviously true.&lt;/p&gt;

&lt;p&gt;But it misses something important sitting right next to it.&lt;/p&gt;

&lt;p&gt;Understanding OAuth conceptually is one thing. Understanding a 200-line OAuth implementation that appears instantly from an AI model is something else entirely. Even experienced developers can miss subtle issues when the code looks clean, compiles successfully, and passes a quick manual test. The code appears correct. That’s precisely what makes it dangerous.&lt;/p&gt;

&lt;p&gt;The 2025 Stack Overflow Developer Survey provides some uncomfortable context here. They surveyed over 49,000 developers across 177 countries, which makes it one of the largest annual snapshots of how developers actually work.&lt;/p&gt;

&lt;p&gt;The headline number is that 84 percent of respondents are using or planning to use AI tools in their development process, up from 76 percent in 2024. Among professional developers, 51 percent use these tools daily. This isn’t experimental adoption anymore. This is how a significant portion of code gets written now.&lt;/p&gt;

&lt;p&gt;Here’s where it gets interesting.&lt;/p&gt;

&lt;p&gt;At the same time, 46 percent of developers say they don’t trust the accuracy of AI-generated output. That’s up sharply from 31 percent the previous year. Only 33 percent say they trust it, and a mere 3 percent report high trust.&lt;/p&gt;

&lt;p&gt;The most common frustration, reported by 66 percent of respondents, is that AI solutions are “almost right, but not quite.” Another 45 percent say debugging AI-generated code takes longer than writing it themselves.&lt;/p&gt;

&lt;p&gt;Read that combination again. Developers are widely adopting tools that they don’t entirely trust, that often produce code requiring more debugging effort, and that can still generate vulnerable logic even when the syntax is perfect.&lt;/p&gt;

&lt;p&gt;I don’t know about you, but that combination made me pause.&lt;/p&gt;

&lt;p&gt;The conversation about non-developers generating code they don’t understand is real, but it’s incomplete. The broader issue is that code is being produced faster than it can be carefully reviewed and understood. And that applies to everyone, regardless of job title.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Copilot Feels Safe and Lovable Feels Dangerous
&lt;/h3&gt;

&lt;p&gt;There’s an interesting psychological aspect to all this that nobody seems to discuss directly. Copilot feels safe because it works incrementally. It completes a line here, suggests a function there. The process feels like traditional autocomplete. You stay in your editor. You feel in control. You’re still writing the code, even if the model produced most of it.&lt;/p&gt;

&lt;p&gt;The small code suggestion feels like a helpful colleague finishing your sentence.&lt;/p&gt;

&lt;p&gt;Tools like Lovable or Bolt feel riskier because the generation is large and immediate. Entire applications appear at once. The user might not be a developer. The velocity feels higher, so the perceived risk feels higher.&lt;/p&gt;

&lt;p&gt;It feels like someone else wrote your entire novel and put your name on it. Same words. Different anxiety.&lt;/p&gt;

&lt;p&gt;In practice, both paths can lead to thousands of lines of generated code landing in a repo without thorough review. In my opinion, the difference is emotional, not technical. And I think that emotional difference is driving a lot of the panic.&lt;/p&gt;

&lt;p&gt;When Stack Overflow asked developers whether vibe coding was part of their professional work, 72 percent said no and another 5 percent answered emphatically no. Yet these same developers are using AI coding tools every day. They just don’t call it vibe coding when they do it.&lt;/p&gt;

&lt;p&gt;The behavior is already here. The terminology is what arrived late.&lt;/p&gt;

&lt;h3&gt;
  
  
  When the Risk Actually Changes
&lt;/h3&gt;

&lt;p&gt;There’s one area where the risk genuinely shifts, and it’s worth being accurate about it. Some newer tools and agent frameworks allow the model not just to generate code, but to execute it. Once execution enters the picture, the threat model is fundamentally different.&lt;/p&gt;

&lt;p&gt;A 2024 study by researchers at the University of Illinois built an agent framework on GPT-4 and tested it against sandboxed web vulnerabilities. Under their experimental setup, the agent successfully exploited 73.3 percent of the vulnerabilities when given five attempts per vulnerability. GPT-3.5 achieved only 6.7 percent. Every open-source model they tested failed completely.&lt;/p&gt;

&lt;p&gt;A related preprint from the same research group tested GPT-4 on known one-day vulnerabilities when provided the CVE description. The agent exploited 87 percent of them, while GPT-3.5 and all open-source models couldn’t exploit any.&lt;/p&gt;

&lt;p&gt;Both papers are preprints, which means their findings should be viewed as early evidence rather than final conclusions. The peer review process exists for good reasons. Still, the mechanism they demonstrate is clear.&lt;/p&gt;

&lt;p&gt;If a model can execute code, test hypotheses, and iterate on its own output, the boundary between suggestion and action becomes critically important. A tool that suggests vulnerable code is one kind of problem. A tool that can act on vulnerable systems autonomously, that’s a different kind of problem entirely.&lt;/p&gt;

&lt;p&gt;This is where traditional security engineering principles apply directly. And honestly, this is where the conversation should probably be spending more time.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Actually Makes Sense
&lt;/h3&gt;

&lt;p&gt;For tools that only generate code, existing development processes still work. Code review, automated security scanning, tests, deployment pipelines. All of it remains essential. The research suggests that AI-generated code may need more thorough review, not less, but the fundamental model remains the same. Fred’s point holds: the same process for everything.&lt;/p&gt;

&lt;p&gt;For tools that can execute code, additional safeguards are essential. The principles aren’t new. They’re the same controls applied to CI/CD pipelines, infrastructure automation, and any privileged tooling.&lt;/p&gt;

&lt;p&gt;Sandboxed execution environments that can’t reach production systems, data, or credentials. Least-privilege access where agents get only what they need for their specific function. Immutable audit logging for every action, tool call, and access event. Human approval gates before any high-impact operation reaches production.&lt;/p&gt;

&lt;p&gt;None of this is revolutionary. It’s the same security engineering that applies to any untrusted execution environment. The difference is that AI tools can produce and sometimes execute code much faster than traditional systems, which means the controls need to be in place before someone connects an agent to something important, not after.&lt;/p&gt;
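&lt;p&gt;As a rough sketch (hypothetical names, not any real framework’s API), an allow-list, an approval gate, and an audit log can be combined in a single wrapper around the agent’s tool calls:&lt;/p&gt;

```python
# Illustrative sketch only: hypothetical names, not a real agent framework's API.
import json
import time

HIGH_IMPACT = {"restart_service", "delete_resource", "modify_dns"}

class GuardedExecutor:
    """Wraps agent tool calls with least-privilege checks, audit logging,
    and a human approval gate for high-impact operations."""

    def __init__(self, allowed_tools, audit_log, approver=None):
        self.allowed_tools = set(allowed_tools)  # least privilege: explicit allow-list
        self.audit_log = audit_log               # append-only record of every action
        self.approver = approver                 # callback that asks a human

    def call(self, tool, args, run_tool):
        entry = {"ts": time.time(), "tool": tool, "args": args, "status": "denied"}
        try:
            if tool not in self.allowed_tools:
                raise PermissionError(f"{tool} not in allow-list")
            if tool in HIGH_IMPACT and not (self.approver and self.approver(tool, args)):
                raise PermissionError(f"{tool} requires human approval")
            entry["status"] = "executed"
            return run_tool(tool, args)
        finally:
            self.audit_log.append(json.dumps(entry))  # logged even when denied
```

&lt;p&gt;A read-only monitoring agent would get only the tools it needs, and anything destructive stalls until a human says yes.&lt;/p&gt;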

&lt;h3&gt;
  
  
  The Actual Problem Beneath the Panic
&lt;/h3&gt;

&lt;p&gt;The public panic about vibe coding focuses on non-developers generating code. The research points to something broader.&lt;/p&gt;

&lt;p&gt;Code is being generated faster than teams can reliably understand, review, and secure it. This applies to developers and non-developers alike. The peer-reviewed studies show vulnerability rates from 27 to 40 percent depending on task and language. The industry analysis shows rates around 45 percent across a wide set of models. Developer surveys show high adoption alongside low trust and significant debugging overhead. Research into AI agents raises legitimate questions about execution rights and boundary enforcement.&lt;/p&gt;

&lt;p&gt;A reasonable conclusion is that some AI-generated vulnerabilities are probably sitting in production systems right now.&lt;/p&gt;

&lt;p&gt;This isn’t a claim that every system is compromised. It’s an acknowledgment that when you combine meaningful vulnerability rates, widespread adoption, and the typical time organizations take to detect issues, undiscovered problems become statistically likely.&lt;/p&gt;
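&lt;p&gt;The arithmetic behind that acknowledgment is simple. With purely illustrative numbers (say a 30 percent vulnerability rate and a review process that catches 90 percent of issues), the odds of at least one vulnerable change surviving grow fast with volume:&lt;/p&gt;

```python
# Purely illustrative numbers, not measurements from the studies above.
p_vulnerable = 0.30    # fraction of AI-generated changes containing a vulnerability
p_caught = 0.90        # fraction of those that review and scanning catch
p_escape = p_vulnerable * (1 - p_caught)  # 3% of changes ship a missed vulnerability

def p_at_least_one_escape(n_changes):
    """Probability that at least one of n independent changes ships a vulnerability."""
    return 1 - (1 - p_escape) ** n_changes

for n in (10, 100, 500):
    print(n, round(p_at_least_one_escape(n), 3))
```

&lt;p&gt;Even with those generous assumptions, by a few hundred merged changes the probability is effectively certainty. The exact inputs are debatable; the shape of the curve isn’t.&lt;/p&gt;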

&lt;h3&gt;
  
  
  What Organizations Can Actually Do
&lt;/h3&gt;

&lt;p&gt;Several practical steps reduce the risks without requiring organizations to abandon tools that provide genuine productivity benefits.&lt;/p&gt;

&lt;p&gt;First, improve visibility into where AI-generated code is coming from and which tools are being used. Without this baseline, security teams can’t assess risk accurately.&lt;/p&gt;

&lt;p&gt;Second, increase review depth for code known to be AI-generated, particularly in security-relevant sections. Human review catches many vulnerabilities that automated tools miss, especially the subtle logic issues that AI models tend to produce. And yes, this means “LGTM” might need to become a longer conversation sometimes.&lt;/p&gt;

&lt;p&gt;Third, apply execution boundaries to any tools capable of running code. Separate networks, separate credentials, separate data. If an agent can execute code, it shouldn’t be able to reach anything important without explicit human approval.&lt;/p&gt;

&lt;p&gt;Fourth, update security tooling. Many SAST tools were built to detect patterns common in human-written code. AI-generated code follows different patterns, and detection capabilities need to evolve alongside the threat.&lt;/p&gt;

&lt;p&gt;These steps cost time and resources. They also reduce the likelihood of shipping vulnerabilities that prove expensive to discover later.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Conclusion
&lt;/h3&gt;

&lt;p&gt;The panic around vibe coding highlights a real concern, but it’s focused too narrowly. The broader issue is that AI-powered development is introducing code faster than it can be reviewed and understood. The data supports this even though the exact rates vary by model, language, and context.&lt;/p&gt;

&lt;p&gt;The research is clear about the presence of vulnerabilities. It’s also clear that proper security controls reduce these risks meaningfully. The decision ahead for organizations isn’t whether to use AI coding tools. Adoption is already high and climbing. The decision is whether to build the right guardrails now or learn these lessons through incidents.&lt;/p&gt;

&lt;p&gt;Both paths are possible. One is proactive. One is reactive.&lt;/p&gt;

&lt;p&gt;I’ve been watching the tech industry long enough to have predictions about which path most organizations will take. But I hope some choose the other one.&lt;/p&gt;

&lt;p&gt;It would be a refreshing change from the usual pattern.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Research Sources&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Peer-Reviewed Studies:&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Pearce, H., Ahmad, B., Tan, B., Dolan-Gavitt, B., and Karri, R. (2022). “Asleep at the keyboard? Assessing the security of GitHub Copilot’s code contributions.” 2022 IEEE Symposium on Security and Privacy (SP).&lt;/p&gt;

&lt;p&gt;Majdinasab, V., Bishop, M., Rasheed, S., Moradidakhel, A., Tahir, A., and Khomh, F. (2024). “Assessing the security of GitHub Copilot generated code: A targeted replication study.” 2024 IEEE International Conference on Software Analysis, Evolution and Reengineering (SANER).&lt;/p&gt;

&lt;p&gt;Fang, R., Bindu, R., Gupta, A., and Kang, D. (2024). “LLM agents can autonomously hack websites.” arXiv preprint arXiv:2402.06664.&lt;/p&gt;

&lt;p&gt;Fang, R., Bindu, R., Gupta, A., and Kang, D. (2024). “LLM agents can autonomously exploit one-day vulnerabilities.” arXiv preprint arXiv:2404.08144.&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Industry Reports:&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;Veracode. (2025). “2025 GenAI Code Security Report.”&lt;/p&gt;

&lt;p&gt;Stack Overflow. (2025). “2025 Developer Survey.”&lt;/p&gt;




</description>
      <category>technology</category>
      <category>softwaredevelopment</category>
      <category>artificialintelligen</category>
      <category>programming</category>
    </item>
    <item>
      <title>An Infrastructure Engineer’s Guide to MCP and A2A</title>
      <dc:creator>Ahmed Ibrahim</dc:creator>
      <pubDate>Thu, 16 Oct 2025 20:24:31 +0000</pubDate>
      <link>https://dev.to/ahmedibrahim/an-infrastructure-engineers-guide-to-mcp-and-a2a-24b</link>
      <guid>https://dev.to/ahmedibrahim/an-infrastructure-engineers-guide-to-mcp-and-a2a-24b</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftcg3eqgpu0ewq3xtw7ix.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftcg3eqgpu0ewq3xtw7ix.png" width="800" height="1200"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I’ve spent years thinking in layers: networks, subnets, firewalls, storage, and workloads. Things you could trace, measure, or at least restart at 3 AM and hope for the best. Then came AI, and suddenly everyone was talking about models, context windows, and agents.&lt;/p&gt;

&lt;p&gt;It felt like walking into a data center where every rack had been replaced by something invisible and no one could tell me where the traffic was going.&lt;/p&gt;

&lt;p&gt;So, like many of you, I started trying to make sense of it. And in my experience, the best way to learn something is to try to explain it. Writing forces clarity. If I can explain it, I probably understand it, or at least, I will by the end of this series.&lt;/p&gt;

&lt;p&gt;That’s how this article started. I wanted to understand how AI systems actually work underneath all the hype. And I found myself reaching for something familiar: the OSI model.&lt;/p&gt;

&lt;h3&gt;
  
  
  Finding Layers in the Chaos
&lt;/h3&gt;

&lt;p&gt;The OSI model has been around for decades. It gave us a common language for networks, a standard way to describe what happens from the physical cable all the way to the application data.&lt;/p&gt;

&lt;p&gt;AI systems don’t have that yet. They’re still an evolving mix of tools, APIs, and models that somehow cooperate to sound intelligent. But thinking in layers still helps.&lt;/p&gt;

&lt;p&gt;For AI systems, I’ve been using this mental model to make sense of things (not an official standard, just a way I’ve found helpful for thinking through problems):&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Infrastructure Layer&lt;/strong&gt;  — Physical compute, GPUs, servers, networking&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Model Layer&lt;/strong&gt;  — The AI models themselves (GPT-4, Claude, etc.)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Protocol Layer&lt;/strong&gt;  — How components communicate (MCP for tools, A2A for agents)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Application Layer&lt;/strong&gt;  — Your application logic and orchestration&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Interface Layer&lt;/strong&gt;  — What users interact with&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This isn’t the OSI model, and it’s not an industry standard. It’s a framework I’m using to organize my thinking as I learn these protocols. Will it help with debugging? That’s the theory, but I haven’t put it to a real test yet. For now, having some structure (even an imperfect one) beats treating everything as a black box.&lt;/p&gt;

&lt;p&gt;At the bottom, we still have infrastructure: compute, storage, and networking. Then we have the models doing the thinking. Above that are the protocols that let models use tools (MCP) and agents talk to each other (A2A). And at the top is your application logic and user interface.&lt;/p&gt;

&lt;p&gt;That’s where MCP and A2A live, in that protocol layer.&lt;/p&gt;

&lt;p&gt;They’re not decades-old, committee-blessed standards like OSI. They’re newer open standards, the kind that evolve fast and have backing from companies like Google, Microsoft, OpenAI, and IBM. Think of them more like HTTP or OAuth: practical protocols that become standards through adoption rather than committee decree. And right now, they’re the closest things we have to a common language for how AI systems actually talk to tools, data, and each other.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A Note on Timing&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before we dive deeper, it’s worth noting how recent all of this is. Anthropic released MCP in November 2024. OpenAI officially adopted it in March 2025. Google announced A2A in April 2025, and by June it was already a Linux Foundation project with over 100 companies backing it.&lt;/p&gt;

&lt;p&gt;We’re watching the foundation of AI infrastructure being laid in real time. These aren’t battle-tested protocols with decades of refinement; they’re emerging standards that could shape how AI systems work for years to come. That makes understanding them now even more valuable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The OSI Disclaimer&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Another quick note before we go further: AI protocols don’t map perfectly to the OSI model. The OSI model was designed for networks, where data flows in predictable ways through well-defined layers. AI systems are messier: they’re about intelligence, context, and decision-making, not just data transport.&lt;/p&gt;

&lt;p&gt;But the thinking still helps. When you’re troubleshooting why your AI system isn’t working, asking “which layer is failing?” is just as useful as it was with networks. So bear with me while I stretch the analogy where it helps.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is MCP (Model Context Protocol)?
&lt;/h3&gt;

&lt;p&gt;MCP stands for Model Context Protocol. It’s an open protocol created by Anthropic (the company behind Claude) and adopted by others, including Microsoft and OpenAI, to define how models connect to tools and data sources.&lt;/p&gt;

&lt;p&gt;You can think of MCP as a translator that sits between an AI model and the real world. It defines how the model can access tools like a database, a calendar, or a filesystem, and how those interactions are formatted and logged.&lt;/p&gt;

&lt;p&gt;Let’s say you’re building a chatbot that can read files from a shared folder.&lt;/p&gt;

&lt;p&gt;Without MCP, you’d probably just give it a custom API endpoint or a Python function and hope for the best. With MCP, you define a connector that follows a standard contract. The model doesn’t just “call random code”; it sends a structured request, like “list available files” or “read file X”, through a controlled channel.&lt;/p&gt;

&lt;p&gt;Instead of just calling readFile('secrets.txt') and crossing your fingers, the model sends a structured MCP request that looks something like: {action: 'read_resource', resource_id: 'file://shared/report.pdf', permissions: 'read-only'}. The exact schema varies by implementation, but you get the idea: structured, validated, and enforceable.&lt;/p&gt;
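&lt;p&gt;To make that concrete, here’s a minimal sketch of the connector side (field names are invented; real MCP servers speak JSON-RPC): every request is checked against the connector’s scope before anything touches disk.&lt;/p&gt;

```python
# Illustrative sketch with invented field names; the real MCP wire format is
# JSON-RPC. The point is that a structured request can be validated centrally.
ALLOWED_ROOTS = ("file://shared/",)          # the connector's exposed scope
ALLOWED_ACTIONS = {"list_resources", "read_resource"}

def handle_request(request: dict) -> dict:
    """Validate a structured request before dispatching to any real handler."""
    action = request.get("action")
    if action not in ALLOWED_ACTIONS:
        return {"error": f"unknown action: {action}"}
    resource = request.get("resource_id", "")
    if not resource.startswith(ALLOWED_ROOTS):
        return {"error": "resource outside connector scope"}  # secrets.txt unreachable
    if request.get("permissions") != "read-only":
        return {"error": "connector is read-only"}
    return {"ok": True, "resource": resource}  # would dispatch to the real reader here
```

&lt;p&gt;The raw readFile('secrets.txt') call has nowhere to go: it never even parses as a valid request.&lt;/p&gt;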

&lt;h4&gt;
  
  
  The Three Primitives
&lt;/h4&gt;

&lt;p&gt;Under the hood, MCP defines three core primitives that make this work:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Resources&lt;/strong&gt; : Data that models can read (like files, database records, or API responses)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Tools&lt;/strong&gt; : Actions that models can execute (like “send email” or “create file”)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Prompts&lt;/strong&gt; : Pre-built templates that MCP servers can expose for common tasks&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;In practice, this looks like:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Resources&lt;/strong&gt; : “Give the model read access to our customer database”&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# The model can then query:
resource = mcp.get_resource("database://customers/table")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Tools&lt;/strong&gt; : “Allow the model to restart a failed service”&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# The model can call:
result = mcp.call_tool("restart_service", {"service": "api-gateway"})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Prompts&lt;/strong&gt; : “Provide a template for analyzing error logs”&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# The model can use a pre-defined prompt:
analysis = mcp.use_prompt("analyze_error_logs", {"log_file": "errors.log"})
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The beauty is that once you’ve defined these through MCP, any MCP-compatible model can use them. You’re not locked into one vendor’s custom integration format. It’s like SNMP for AI tools: one protocol, every vendor has to speak it, and suddenly your monitoring actually works.&lt;/p&gt;

&lt;p&gt;This structured approach means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You can authenticate the connector, using normal credentials or tokens.&lt;/li&gt;
&lt;li&gt;You can log every call and enforce permissions (read-only, write, delete).&lt;/li&gt;
&lt;li&gt;You can control context, what the model is allowed to see and remember.&lt;/li&gt;
&lt;/ul&gt;
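&lt;p&gt;A toy registry shows the shape of all three guarantees at once (helper names here are hypothetical; real MCP SDKs declare tools with a name, description, and a JSON Schema for inputs):&lt;/p&gt;

```python
# Toy registry sketch with hypothetical helper names; real MCP SDKs differ,
# but the idea is the same: one declaration point, one enforcement point.
TOOLS = {}
CALL_LOG = []

def register_tool(name, description, input_schema, handler):
    TOOLS[name] = {"description": description, "schema": input_schema, "handler": handler}

def list_tools():
    """What any MCP-compatible client would discover."""
    return {name: t["description"] for name, t in TOOLS.items()}

def call_tool(name, args):
    tool = TOOLS[name]
    missing = [k for k in tool["schema"]["required"] if k not in args]
    if missing:
        raise ValueError(f"missing arguments: {missing}")  # schema enforced centrally
    CALL_LOG.append((name, args))                          # every call is logged
    return tool["handler"](**args)

register_tool(
    "restart_service",
    "Restart a failed service",
    {"required": ["service"]},               # simplified stand-in for a JSON Schema
    lambda service: f"restarted {service}",
)
```

&lt;p&gt;Swap out the model or the client and the declarations don’t change; that’s the portability the protocol buys you.&lt;/p&gt;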

&lt;h4&gt;
  
  
  Where MCP Fits
&lt;/h4&gt;

&lt;p&gt;So where does MCP fit in terms of protocols?&lt;/p&gt;

&lt;p&gt;If we loosely map it to the OSI model, MCP would live around the application layer, sitting above the transport protocol (usually HTTP or WebSocket). It doesn’t move bytes over the network like HTTP or SMTP; instead, it defines a &lt;em&gt;semantic layer&lt;/em&gt; that tells the model what actions exist, how to call them, and what rules apply.&lt;/p&gt;

&lt;p&gt;In other words, HTTP delivers the request. MCP explains what the request &lt;em&gt;means&lt;/em&gt;. It’s the contract layer between the model and its tools responsible for structure, permissions, and safety, not routing packets.&lt;/p&gt;

&lt;p&gt;MCP gives structure to how AI connects with the world. But it doesn’t decide &lt;em&gt;what&lt;/em&gt; the model should do, that’s where A2A comes in.&lt;/p&gt;

&lt;h3&gt;
  
  
  What is A2A (Agent2Agent)?
&lt;/h3&gt;

&lt;p&gt;A2A, short for Agent2Agent, is an open protocol that defines how different AI agents collaborate. It was announced by Google in April 2025 and became an official Linux Foundation project just two months later, with backing from major players like Google, Microsoft, IBM, and over 100 other companies.&lt;/p&gt;

&lt;p&gt;While MCP standardized how agents talk to tools, A2A standardizes how agents talk to each other.&lt;/p&gt;

&lt;p&gt;Imagine you have three agents in a system:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A planner agent that breaks down a task.&lt;/li&gt;
&lt;li&gt;A researcher agent that finds information.&lt;/li&gt;
&lt;li&gt;A writer agent that generates the final text.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;A2A is the communication layer that allows these agents to exchange structured messages and results, like “Here are the search results” or “Now summarize this.”&lt;/p&gt;

&lt;h4&gt;
  
  
  Making It Concrete
&lt;/h4&gt;

&lt;p&gt;Let’s make this real. Say you’re on-call and you ask your AI system: “Investigate why the production API is slow and create an incident report.”&lt;/p&gt;

&lt;p&gt;Without A2A, you’d manually check logs, metrics, and traces yourself, then write the report.&lt;/p&gt;

&lt;p&gt;With A2A, here’s what happens:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The coordinator agent&lt;/strong&gt; receives your request and breaks it into subtasks:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Check error logs&lt;/li&gt;
&lt;li&gt;Analyze metrics&lt;/li&gt;
&lt;li&gt;Query traces&lt;/li&gt;
&lt;li&gt;Generate incident report&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;It sends a structured A2A message to the logs agent:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "task": "analyze_errors",
  "service": "production-api",
  "timeframe": "last_30_min"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The logs agent&lt;/strong&gt; responds with structured data:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "results": {
    "error_rate": "15%",
    "top_error": "DatabaseConnectionTimeout",
    "affected_endpoints": ["/api/users", "/api/orders"]
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The coordinator&lt;/strong&gt; then sends that data to the metrics agent:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "task": "correlate_metrics",
  "errors": ["DatabaseConnectionTimeout"],
  "timeframe": "last_30_min"
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;The metrics agent&lt;/strong&gt; responds:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "results": {
    "db_connections": "maxed out at 100",
    "cpu_usage": "85%",
    "likely_cause": "connection pool exhaustion"
  }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Finally, the coordinator&lt;/strong&gt; sends everything to the report agent to generate a markdown incident report.&lt;/p&gt;

&lt;p&gt;Each agent stays focused on what it does best: logs analysis, metrics correlation, or report generation. They speak a common language that both sides understand.&lt;/p&gt;
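&lt;p&gt;The coordinator’s job in that exchange is mostly plumbing. A stripped-down sketch, with message shapes invented to echo the ones above:&lt;/p&gt;

```python
# Stripped-down coordinator sketch; agents and message shapes are invented
# for illustration and simply echo the exchange described above.
def logs_agent(msg):
    return {"results": {"top_error": "DatabaseConnectionTimeout", "error_rate": "15%"}}

def metrics_agent(msg):
    assert "errors" in msg  # it expects the logs agent's findings as input
    return {"results": {"likely_cause": "connection pool exhaustion"}}

def report_agent(msg):
    findings = msg["findings"]
    return "# Incident Report\n" + "\n".join(f"- {k}: {v}" for k, v in findings.items())

def coordinate(request):
    """Fan a request out across agents, threading each result into the next message."""
    logs = logs_agent({"task": "analyze_errors", **request})["results"]
    metrics = metrics_agent({"task": "correlate_metrics",
                             "errors": [logs["top_error"]], **request})["results"]
    return report_agent({"task": "write_report", "findings": {**logs, **metrics}})
```

&lt;p&gt;The coordinator never analyzes a log line itself; it just routes structured messages and merges structured answers.&lt;/p&gt;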

&lt;p&gt;If MCP defines how a model talks to tools, A2A defines how multiple intelligent components talk to each other.&lt;/p&gt;

&lt;h4&gt;
  
  
  Where A2A Fits
&lt;/h4&gt;

&lt;p&gt;In our layer model, A2A sits in the Protocol Layer alongside MCP, but at a higher level of abstraction. While MCP is about “how do I use this tool,” A2A is about “how do we coordinate to solve this problem.”&lt;/p&gt;

&lt;p&gt;A2A systems can use MCP underneath to talk to data sources, or they can communicate through APIs, message queues, or shared memory. The key idea is that there’s a standardized language between agents, so they can cooperate instead of just chaining prompts.&lt;/p&gt;
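&lt;p&gt;A2A’s version of that standardized language starts with discovery: agents publish “agent cards” describing who they are and what they can do. A simplified card and matching step might look like this (real agent cards carry more fields, such as endpoints and authentication requirements):&lt;/p&gt;

```python
# Simplified sketch of A2A-style discovery; real agent cards are richer JSON
# documents, but the matching step is the same idea.
AGENT_CARDS = [
    {"name": "logs_agent",    "skills": ["analyze_errors"]},
    {"name": "metrics_agent", "skills": ["correlate_metrics"]},
]

def find_agent(skill):
    """Pick the first registered agent advertising the needed skill."""
    for card in AGENT_CARDS:
        if skill in card["skills"]:
            return card["name"]
    raise LookupError(f"no agent offers skill: {skill}")
```

&lt;p&gt;A coordinator doesn’t need to know in advance which agent does what; it asks the registry, the same way a service mesh resolves a service name.&lt;/p&gt;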

&lt;h4&gt;
  
  
  Quick Reference
&lt;/h4&gt;

&lt;p&gt;Here’s how they compare:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MCP (Model Context Protocol):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Purpose:&lt;/strong&gt; Connect models to tools and data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analogy:&lt;/strong&gt; Model’s “hands and eyes”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Example:&lt;/strong&gt; “Read this file” or “Query this database”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Protocol Level:&lt;/strong&gt; Application layer (Layer 7)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security Focus:&lt;/strong&gt; Tool permissions, data access&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;A2A (Agent2Agent):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Purpose:&lt;/strong&gt; Connect agents to each other&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analogy:&lt;/strong&gt; Agents’ “conversation”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Example:&lt;/strong&gt; “I found the data, now you analyze it”&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Protocol Level:&lt;/strong&gt; Coordination layer (above application)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Security Focus:&lt;/strong&gt; Agent identity, task authorization&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Seeing the Layers in Action
&lt;/h3&gt;

&lt;p&gt;Now that we’ve defined our layer model, let’s see how the layers work together.&lt;/p&gt;

&lt;p&gt;Think about what happens when you ask an AI assistant: “Check if any of our production services are down and notify the team.”&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Interface Layer (You):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You type the request into a terminal or chat interface&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Application Layer (Orchestration):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Your application receives the message&lt;/li&gt;
&lt;li&gt;Routes it to the coordinator agent&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Protocol Layer — A2A (Agent Coordination):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Coordinator agent decides: “I need the monitoring agent”&lt;/li&gt;
&lt;li&gt;Sends A2A message: {"to": "monitoring_agent", "task": "check_service_health"}&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Protocol Layer — MCP (Tool Access):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Monitoring agent uses MCP to call: check_health_status(service="all")&lt;/li&gt;
&lt;li&gt;MCP validates: “Does this agent have permission to query health status?”&lt;/li&gt;
&lt;li&gt;Tool is invoked through standardized protocol&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Model Layer (AI Processing):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GPT-4 or Claude processes the health data&lt;/li&gt;
&lt;li&gt;Decides which services are critical&lt;/li&gt;
&lt;li&gt;Formats the notification message&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Infrastructure Layer (Physical Work):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;API calls to your monitoring system&lt;/li&gt;
&lt;li&gt;Database queries for service metadata&lt;/li&gt;
&lt;li&gt;Network requests to send Slack notification&lt;/li&gt;
&lt;li&gt;All over standard networking (TCP/IP)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;The point:&lt;/strong&gt; Each layer has a specific job. When debugging, you can ask “which layer failed?” Just like with networks, this layered thinking helps you isolate problems. The logs tell you if it’s an A2A coordination issue, an MCP permission problem, a model decision error, or an infrastructure failure.&lt;/p&gt;
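&lt;p&gt;The walkthrough above can be sketched as a toy call chain that records which layer handled each step. The handlers and health data are made up; what matters is the trace itself, which is what lets you pin a failure to a single layer.&lt;/p&gt;

```python
# Toy trace of the request above -- all handlers and data are
# hypothetical. Each entry names the layer that acted, so a failure
# can be attributed to one layer, like a packet capture for networks.
trace = []

def interface(request):
    trace.append(("interface", request))            # you type the request
    return application(request)

def application(request):
    trace.append(("application", "route to coordinator"))
    return a2a_dispatch("monitoring_agent", "check_service_health")

def a2a_dispatch(agent, task):
    trace.append(("protocol/a2a", f"{agent}:{task}"))  # agent coordination
    return mcp_call("check_health_status", service="all")

def mcp_call(tool, **params):
    trace.append(("protocol/mcp", tool))            # standardized tool access
    return model_decide({"api": "down", "db": "ok"})  # canned health data

def model_decide(health):
    trace.append(("model", "format notification"))  # AI processing step
    down = [svc for svc, status in health.items() if status == "down"]
    return f"ALERT: {', '.join(down)} down" if down else "all healthy"

result = interface("Check if any services are down")
print(result)
print([layer for layer, _ in trace])
```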

&lt;h3&gt;
  
  
  Security, Authentication, and Resource Control
&lt;/h3&gt;

&lt;p&gt;When you give an AI model the ability to act (to access files, call APIs, or delegate tasks), security becomes critical. Let me show you what can go wrong, and how MCP and A2A address it.&lt;/p&gt;

&lt;h4&gt;
  
  
  What Can Go Wrong
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;MCP Security Risk:&lt;/strong&gt; Imagine you give an AI assistant access to your infrastructure through a poorly configured MCP connector. Without proper authorization, a prompt like “show me all configuration files” could expose secrets like API keys or database passwords. Worse, if write permissions aren’t restricted, a malicious prompt could modify firewall rules or delete production data.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;A2A Security Risk:&lt;/strong&gt; Picture an A2A system where a “deployment agent” can push code to production. If there’s no proper authentication between agents, a compromised “code-review agent” could approve malicious code, triggering an unauthorized production deployment. Or a “monitoring agent” could be tricked into sending false alerts, causing unnecessary emergency responses.&lt;/p&gt;

&lt;p&gt;Both scenarios are prevented by the same security principles, applied at different layers.&lt;/p&gt;

&lt;h4&gt;
  
  
  MCP Security Model
&lt;/h4&gt;

&lt;p&gt;MCP provides the framework for addressing these risks through three mechanisms you should implement:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authentication:&lt;/strong&gt; Every MCP connector requires its own credentials. The model itself never holds secrets directly. Instead, the runtime environment provides scoped access (like Azure Managed Identity or temporary tokens).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authorization:&lt;/strong&gt; Each MCP tool defines what operations are allowed. Can this model read files? Write? Delete? Execute commands? Those policies live in the MCP server configuration, not in the model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Audit:&lt;/strong&gt; Every MCP tool call can be logged with the tool name, parameters, caller identity, and timestamp, though you’ll want to balance completeness with logging costs. This is your packet capture for AI systems.&lt;/p&gt;
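&lt;p&gt;Here’s a minimal sketch of the authorization and audit mechanics above, with made-up tool and policy names; real MCP servers define their own schemas and enforcement lives in the server configuration, not the model.&lt;/p&gt;

```python
# Hypothetical MCP-style gate: policy names and tool names are
# illustrative. Authorization is checked before the tool runs,
# and every call (allowed or denied) lands in the audit log.
import datetime

PERMISSIONS = {"monitoring_agent": {"check_health_status"}}  # authz policy
AUDIT_LOG = []

def call_tool(caller: str, tool: str, **params):
    # Authorization: is this caller allowed to invoke this tool?
    if tool not in PERMISSIONS.get(caller, set()):
        AUDIT_LOG.append((caller, tool, params, "DENIED"))
        raise PermissionError(f"{caller} may not call {tool}")
    # Audit: record caller, tool, parameters, and timestamp.
    stamp = datetime.datetime.now(datetime.timezone.utc).isoformat()
    AUDIT_LOG.append((caller, tool, params, stamp))
    return {"status": "ok"}  # stand-in for the real tool result

call_tool("monitoring_agent", "check_health_status", service="all")
try:
    call_tool("monitoring_agent", "delete_database")
except PermissionError as err:
    print(err)
```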

&lt;h4&gt;
  
  
  A2A Security Model
&lt;/h4&gt;

&lt;p&gt;A2A works similarly, providing the structure for security without enforcing it. Implementation is up to you as the engineer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authentication:&lt;/strong&gt; Each agent has an identity. When Agent A sends a message to Agent B, B can verify it’s really from A, not a spoofed message.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authorization:&lt;/strong&gt; Agents operate under different permission scopes. The “read-only monitoring agent” can’t trigger the “deployment agent.” The system enforces this at the A2A protocol level.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Audit:&lt;/strong&gt; Every A2A message is logged: who sent it, who received it, what task was requested, and what the result was. If an agent starts misbehaving, you can trace the entire conversation chain.&lt;/p&gt;
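&lt;p&gt;One common way to implement agent identity is a shared-secret signature on each message. This sketch assumes keys are already distributed (a real system needs proper key management); it shows how Agent B can verify a message really came from Agent A and wasn’t tampered with.&lt;/p&gt;

```python
# Sketch of agent authentication via an HMAC signature over the
# message body. Key distribution and the envelope format are
# assumptions; the signature scheme itself is standard HMAC-SHA256.
import hmac, hashlib, json

AGENT_KEYS = {"code_review_agent": b"shared-secret-for-demo"}  # per-agent keys

def sign(sender: str, body: dict) -> str:
    payload = json.dumps(body, sort_keys=True).encode()
    return hmac.new(AGENT_KEYS[sender], payload, hashlib.sha256).hexdigest()

def verify(sender: str, body: dict, signature: str) -> bool:
    # compare_digest avoids timing side channels on the comparison
    return hmac.compare_digest(sign(sender, body), signature)

body = {"task": "approve_deployment", "commit": "abc123"}
sig = sign("code_review_agent", body)
print(verify("code_review_agent", body, sig))                          # genuine
print(verify("code_review_agent", {**body, "commit": "evil"}, sig))    # tampered
```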

&lt;h4&gt;
  
  
  Cost as a Security Boundary
&lt;/h4&gt;

&lt;p&gt;There’s one more dimension: cost. Every model call, every token processed, every agent message has a price. If you don’t limit context size or message frequency, you’ll burn through your budget before you can say “debug logs.”&lt;/p&gt;

&lt;p&gt;But cost isn’t just about money. It’s also about preventing abuse. Rate limiting and token budgets are security controls as much as they are financial ones. An attacker who gains access to your AI system can rack up huge bills with endless API calls.&lt;/p&gt;
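&lt;p&gt;A token budget can be as simple as a counter that refuses calls once a limit is reached. The numbers and names here are illustrative; the design choice is that the guard fails closed instead of silently running up the bill.&lt;/p&gt;

```python
# Toy token-budget guard: a cost control that doubles as a security
# control. Limits and names are illustrative.
class TokenBudget:
    def __init__(self, limit: int):
        self.limit = limit
        self.spent = 0

    def charge(self, tokens: int) -> bool:
        # Refuse the call rather than exceed the budget.
        if self.spent + tokens > self.limit:
            return False
        self.spent += tokens
        return True

budget = TokenBudget(limit=10_000)
print(budget.charge(4_000))   # allowed, 4k spent
print(budget.charge(4_000))   # allowed, 8k spent
print(budget.charge(4_000))   # refused -- would exceed 10k
```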

&lt;p&gt;Both security and cost are solved by the same thing: &lt;strong&gt;visibility&lt;/strong&gt;. The moment you can trace which layer the activity belongs to (whether it’s MCP tool calls or A2A messages) you can start controlling it.&lt;/p&gt;

&lt;h3&gt;
  
  
  Learning Out Loud
&lt;/h3&gt;

&lt;p&gt;I’m not writing this because I have all the answers. I’m writing this because teaching is how I learn.&lt;/p&gt;

&lt;p&gt;The act of breaking things down, comparing them to what we already know, is how I build my own understanding.&lt;/p&gt;

&lt;p&gt;MCP and A2A aren’t perfect or final, but they’re promising open standards that are making AI systems more predictable and secure. They give us mental handles, layers we can observe, audit, and refine as the ecosystem matures.&lt;/p&gt;

&lt;p&gt;In the next article, I’ll take this exploration further. I’ll build a minimal demo using MCP connectors and A2A messages to watch how requests move across layers and where the cracks appear. We’ll look at what can go wrong, and what security measures actually help when the theory hits reality.&lt;/p&gt;

&lt;p&gt;If AI is going to be everywhere, we might as well understand what’s moving inside it, and make sure it’s both smart and safe.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If you found this helpful, follow me for Part 2 where I’ll build a working demo of MCP and A2A in action. We’ll see what breaks and how to debug it.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Questions? Thoughts? Drop them in the comments. I’m learning this alongside you.&lt;/em&gt;&lt;/p&gt;




</description>
      <category>ai</category>
      <category>programming</category>
      <category>cloud</category>
      <category>software</category>
    </item>
  </channel>
</rss>
