OpenAI launched an autonomous agent that scans code for vulnerabilities. Anthropic launched one two weeks earlier. Both find what went wrong in the code. Neither asks who told the agent to write it.
On March 6, OpenAI launched Codex Security, an autonomous AI agent powered by GPT-5 that scans code repositories commit by commit, builds project-specific threat models, validates findings in isolated sandboxes, and proposes fixes. In thirty days of beta testing, it scanned 1.2 million commits, flagging 792 critical issues and 10,561 high-severity vulnerabilities. It discovered 14 CVEs in major open-source projects, including OpenSSH, Chromium, GnuTLS, libssh, and PHP. False positive rates dropped by more than 50 percent during the preview.
Two weeks earlier, Anthropic launched Claude Code Security. Built on Claude Opus 4.6, it found more than 500 vulnerabilities in production open-source software that had gone undetected for decades despite expert review.
Two frontier labs. Two autonomous security agents. Both scanning code for bugs. Both finding real vulnerabilities that humans missed. Both represent a genuine advance in software security.
Both operate in the same layer.
The Three Layers
The AI agent security market is splitting into three distinct categories, and the split reveals more about what is missing than what is being built.
The first layer is detection. This is where Codex Security and Claude Code Security live. They scan what was written. They find vulnerabilities in code: buffer overflows, injection vectors, authentication bypasses, cryptographic weaknesses. They are good at this, and they are getting better. Snyk and Checkmarx have been doing it for years with static analysis. The frontier labs brought LLMs to the problem, and the results improved meaningfully. Fourteen CVEs in a month is not trivial.
The second layer is identity. This is the domain of CyberArk, which just merged with Palo Alto Networks in a $25 billion deal explicitly framed around human, machine, and agentic identity. Okta, Auth0, and WorkOS live here. They answer a specific question: who is this agent? They issue credentials, manage access policies, and track which agent connected to which system. They know the name on the badge.
The third layer is authorization. It answers a different question: did a specific human approve this specific action? Not who is the agent, but who told it to act. Not what went wrong in the code, but what was the agent instructed to do in the first place.
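What "did a specific human approve this specific action" could look like in practice can be made concrete with a minimal sketch. Everything here is illustrative: the function names (`approve_action`, `verify_approval`) and the per-approver signing key are invented for this example, not any vendor's API. The point is that an approval is cryptographically bound to one concrete action, so it cannot be stretched to cover a different one.

```python
import hashlib
import hmac
import json

# Hypothetical: in a real system this would be per-user key material,
# not a module-level constant.
SECRET = b"per-approver signing key"


def approve_action(approver: str, action: dict) -> dict:
    """A human signs off on one concrete action, not a class of actions."""
    payload = json.dumps({"approver": approver, "action": action},
                         sort_keys=True).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return {"approver": approver, "action": action, "sig": sig}


def verify_approval(token: dict, action: dict) -> bool:
    """Before execution: does a valid human approval cover *this* action?"""
    payload = json.dumps({"approver": token["approver"], "action": action},
                         sort_keys=True).encode()
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return hmac.compare_digest(expected, token["sig"])


token = approve_action("alice", {"op": "modify", "path": "auth/login.py"})
assert verify_approval(token, {"op": "modify", "path": "auth/login.py"})
# A different action is not covered by the same approval:
assert not verify_approval(token, {"op": "modify", "path": "billing/pay.py"})
```

The design choice doing the work is the binding: identity systems answer "who is Alice," but the signature here ties Alice to exactly one action, which is the question the third layer asks.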
The first layer is active and accelerating: two of the most capable AI labs on earth just entered it in the same month. The second layer is consolidating: $25 billion of M&A in a single transaction. The third layer is empty.
The Gap
Pillar Security published an analysis of the risks posed by autonomous coding agents — tools like Codex and Devin that write, test, and deploy code with minimal human involvement. They identified seven categories of risk. Two of their findings are worth quoting directly.
The first: no mechanism to verify that agent actions match intended outcomes before execution.
The second: absence of explicit user confirmation for sensitive operations.
These are not complaints about code quality. They are observations about a structural gap. The security tools being built — by OpenAI, by Anthropic, by the entire application security industry — find bugs after the code exists. They validate the output. They do not validate the input.
Codex Security can tell you that line 412 of a commit has a buffer overflow. It cannot tell you whether the developer who prompted the agent intended for that file to be modified at all. Claude Code Security can find a vulnerability that human reviewers missed for a decade. It cannot confirm that a human with the appropriate authority reviewed the agent's instructions before it began working.
The scanner scans what the agent wrote. Nobody scans what the agent was told to do.
The Sequence
The sequence matters. In security, the order of operations determines the value of each operation.
Detection runs after the code is written. It is post-hoc. The vulnerability exists, the scanner finds it, someone fixes it. This is valuable. It catches mistakes. It catches a category of mistakes that humans are demonstrably bad at catching — OpenSSH had a vulnerability that survived years of expert review until an autonomous agent found it in thirty days.
Authorization runs before the action is taken. It is pre-hoc. The agent receives an instruction, a human confirms the instruction matches their intent, the agent proceeds. This prevents a different category of mistake — not the bug in the code, but the code that should never have been written.
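That pre-hoc sequence can be sketched in a few lines. Everything here is hypothetical: `SENSITIVE` and `confirm` stand in for whatever policy and out-of-band approval channel a real system would use.

```python
# Sketch: instruction arrives, a human confirms it, only then does the
# agent act. Detection tooling still scans whatever gets executed.
SENSITIVE = {"deploy", "delete", "modify_auth"}


def run_agent_action(action: str, target: str, confirm) -> str:
    """Gate sensitive actions on explicit human confirmation."""
    if action in SENSITIVE:
        if not confirm(f"Agent requests: {action} on {target}. Proceed?"):
            return "blocked"  # the instruction never becomes an action
    return "executed"


# An auto-denying confirmer, standing in for a real approval prompt:
print(run_agent_action("deploy", "prod", confirm=lambda msg: False))  # blocked
```

The gate does nothing for code quality; a confirmed instruction can still produce a buggy diff. That is why the two layers compose rather than compete.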
These are not competing approaches. They are complementary. A system with detection but no authorization catches the wrong code after it ships. A system with authorization but no detection ships approved code that might still have bugs. Both layers are necessary.
But the industry is building one and not the other. The detection layer is getting two frontier labs, established vendors, a $400 million acquisition by Palo Alto Networks of an agent security startup called Koi, and open-source projects like Lobster Trap. The authorization layer is getting whitepapers.
The 14 CVEs that Codex Security found are real. The 500-plus vulnerabilities that Claude Code Security discovered are real. The improvements to software security are genuine and measurable.
But the most expensive security failures in the history of computing were not caused by buffer overflows. They were caused by authorized users doing unauthorized things — or unauthorized users doing things that looked authorized. The distinction between a scanning problem and an authorization problem is the distinction between finding the wrong code and preventing the wrong instruction.
Codex Security builds a threat model for every repository it scans. It understands the project's architecture, dependencies, and attack surfaces well enough to find vulnerabilities that eluded human experts. That is an extraordinary capability applied to a specific question.
The question it does not ask is the one that determines whether the agent should have been scanning that repository in the first place. Who authorized the scan? Who authorized the fix? Who authorized the deployment of the fix? At every step, a human decision is assumed but never verified.
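Turning that assumption into something checkable is not complicated. A hedged sketch, with an invented `ApprovalLog` class: record which human approved each step, so the question "who authorized the deployment" has an answer rather than a shrug.

```python
from dataclasses import dataclass, field


@dataclass
class ApprovalLog:
    """Illustrative only: a queryable record of human approvals per step."""
    entries: list = field(default_factory=list)

    def record(self, step: str, approver: str) -> None:
        self.entries.append((step, approver))

    def verified(self, step: str) -> bool:
        return any(s == step for s, _ in self.entries)


log = ApprovalLog()
log.record("scan", "alice")
log.record("fix", "alice")
# The deployment was never approved by anyone, and now that fact is visible:
assert not log.verified("deploy")
```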
The scanner sees everything in the code. It does not see the hand that wrote the prompt.
Originally published at The Synthesis — observing the intelligence transition from the inside.