DEV Community

Marcus Rowe

Posted on • Originally published at techsifted.com

Anthropic Claude Code Security Public Beta: What Developers Need to Know (2026)

Anthropic quietly moved one of its more interesting projects out of closed preview this week. Claude Security — the AI-powered code vulnerability scanner built into Claude Code — is now in public beta.

This is a different story from the MCP security flaw we covered last week. That was about a vulnerability in Anthropic's protocol. This is about a tool Anthropic built to find vulnerabilities in your code. Two separate things, and worth keeping straight.

The public beta opened April 30. Enterprise customers can access it now; access for Team and Max plan customers is coming soon. If you're a developer who uses Claude Code and you care about security (and you should), here's what you need to know.


What Is Claude Security, Exactly?

Claude Security is a codebase vulnerability scanner that uses Claude Opus 4.7 to reason about your code the way a human security researcher would. Not pattern matching. Not a CVE database lookup. Actually reading and thinking about your code.

The pitch is that traditional security tools — your SASTs, your dependency scanners — look for known patterns. They're great at catching the stuff they've been trained to look for: common SQL injection shapes, known vulnerable library versions, textbook XSS patterns. But they miss things that require understanding how your code actually works. Multi-hop vulnerabilities. Logic flaws. Cases where a vulnerability only exists because of how component A interacts with component B three function calls deep.

Claude Security traces data flows. It follows how input moves through your codebase, where trust boundaries exist, what assumptions the code is making. The goal is catching the stuff that rule-based tools miss because the vulnerability isn't in the shape of the code, it's in the logic.
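To make "multi-hop" concrete, here's a toy sketch (mine, not Anthropic's; all function names are invented). Each function looks harmless in isolation; the injection only appears once you trace the data across all three:

```python
# Hypothetical example: a SQL injection that only exists across function
# boundaries. No single function matches the textbook injection pattern.
import sqlite3

def parse_request(raw: dict) -> dict:
    # Hop 1: pull a field out of the request. Looks inert.
    return {"sort_column": raw.get("sort", "created_at")}

def build_query(params: dict) -> str:
    # Hop 2: column names can't be bound as SQL parameters, so the value
    # is interpolated directly into the query. This is where trust breaks.
    return f"SELECT id, name FROM users ORDER BY {params['sort_column']}"

def run_report(conn: sqlite3.Connection, raw_request: dict):
    # Hop 3: attacker-controlled input reaches the database unchecked.
    query = build_query(parse_request(raw_request))
    return conn.execute(query).fetchall()

# A safe version would validate against an allowlist before hop 2:
ALLOWED_COLUMNS = {"created_at", "name", "id"}
```

Rule-based taint tracking can sometimes follow chains like this, but the further apart the hops live (separate modules, separate services), the more it depends on someone having written exactly the right rule.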

Anthropic's own team used the tool (built on the earlier Opus 4.6 model) to find over 500 vulnerabilities in production open-source codebases, bugs that had apparently gone undetected for years through multiple expert reviews. In February, it found 22 vulnerabilities in Firefox in a single month. For context, no individual reporter came close to that monthly total at any point in 2025.

Are all those findings exploitable in practice? Hard to say from the outside. But the numbers are striking enough that they're worth taking seriously.


From Closed Preview to Public Beta: What Changed

Anthropic announced Claude Code Security in February 2026 as a limited research preview. It was invite-only — Enterprise and Team customers could apply, open-source maintainers could request expedited access, and everyone else was on a waitlist. Hundreds of organizations tested it during that window.

The public beta changes access meaningfully. Claude Enterprise customers can sign up right now at claude.ai/security without waiting for a formal invite. Team and Max customers are on deck, with broader availability expected in the coming weeks.

The feature set also expanded based on preview feedback. A few things Anthropic added:

Scheduled scans. You can now set recurring scans on repositories rather than triggering them manually each time. Useful for teams who want ongoing monitoring without thinking about it.

Directory targeting. Instead of scanning a whole repository at once, you can scope a scan to a specific directory or branch. Handy for large monorepos where scanning everything is overkill.

Dismissed findings with audit trail. Previously, there was no good way to say "we know about this, it's not a real issue, don't flag it again." Now you can dismiss findings with documented reasoning — which also creates an audit record, useful if you're in a regulated industry.

Export formats. CSV and Markdown export, plus webhook integration with Slack, Jira, and similar tools. Findings can flow into your existing ticketing system rather than living only inside Claude.ai.

The core of the product is unchanged: Claude Opus 4.7 reasoning over your code, confidence ratings on each finding, severity assessments, and the ability to apply fixes directly inside a Claude Code session.


How It Actually Works

The workflow is more straightforward than I expected. You access it at claude.ai/security or through the sidebar in Claude Code's web interface. You connect a repository, scope the scan to whatever part of the codebase you care about, and run it.

The output isn't just a list of "here's a problem." Each finding includes:

  • What the vulnerability is and where it lives
  • How it could be exploited
  • A confidence rating (so you know whether Claude is fairly certain or speculating)
  • Severity assessment
  • Reasoning for why this is actually a problem, not just a pattern match

That last piece matters. One of the frustrations with traditional SAST tools is alert fatigue. You get 200 findings, half of which are false positives or things you've already accepted, and the team starts ignoring the output. Anthropic's multi-stage verification — where the model essentially tries to argue against its own findings before surfacing them — is an attempt to reduce noise.

Once you've got a finding you want to fix, you can apply the patch from within the same Claude Code session. No "now go file a Jira ticket and wait for engineering" loop. Anthropic's framing is "going from scan to applied patch in a single sitting," and preview feedback suggests it actually works that way in practice.

Snowflake's security team was one of the early testers. Their assessment: "Claude Security surfaced novel, high-quality findings during our early testing." That's a measured quote from an early adopter, not a press release superlative. I'll take it.


How Does It Stack Up Against Snyk and GitHub Advanced Security?

This is where I want to be careful, because I haven't run Claude Security head-to-head against these tools myself. What I can do is lay out the structural differences.

GitHub Advanced Security (GHAS) uses CodeQL, a semantic analysis engine that converts code to a queryable database and runs rule-based queries against it. It's excellent at finding well-understood vulnerability classes and integrates deeply into the GitHub Actions workflow. The limitation is that it's fundamentally rule-based — you can only find what someone has written a CodeQL query for. It also doesn't reason about your application's specific logic.

Snyk is primarily a dependency vulnerability scanner that's expanded into SAST. It's very good at catching known vulnerable packages and license issues. Its SAST features have improved, but it still leans heavily on pattern matching over reasoning. The strength is its massive database of known issues and its developer-friendly DX. The weakness is the same as GHAS: novel vulnerabilities that don't match known patterns are likely to slip through.

Claude Security is betting on a different approach: trading the speed and precision of rule-based detection for the reasoning ability of a large language model. It should theoretically find things the other tools miss. The risk is the opposite failure mode — reasoning systems can generate confident-sounding false positives, and a tool that causes too much alert noise gets ignored.

The embedded cyber guardrails Anthropic built into Opus 4.7 are supposed to prevent the model from being misused for offensive purposes — you can ask Claude Security to find vulnerabilities in your code, but it shouldn't help you weaponize those findings or pivot to attacking systems you don't own. That's a meaningful design choice given the obvious dual-use risks.

My honest read: these tools aren't really in direct competition. GHAS and Snyk are table stakes at most engineering organizations. Claude Security is playing in a different tier — catching the complex, context-dependent stuff that falls through. The realistic use case is running Claude Security alongside your existing tooling, not instead of it.

The security partnerships Anthropic announced back up that framing. CrowdStrike, Microsoft Security, Palo Alto Networks, SentinelOne, TrendAI, and Wiz are all integrating Opus 4.7 for security use cases. Those companies aren't replacing their existing detection stacks — they're adding reasoning capability on top.


Types of Vulnerabilities It's Designed to Catch

Based on what Anthropic's published and the preview findings, the categories include:

Authentication and authorization flaws. Cases where the logic of who-can-do-what breaks down under specific conditions that pattern matching wouldn't catch.
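A toy illustration of that category (invented by me, not taken from Anthropic's findings): every individual check passes, but the authorization question never gets asked.

```python
# Hypothetical authorization gap: the route checks *who you are* but
# never *what you own*. Authentication passes; authorization is absent.
DOCUMENTS = {
    1: {"owner": "alice", "body": "alice's notes"},
    2: {"owner": "bob", "body": "bob's notes"},
}

def get_document(session_user, doc_id):
    if session_user is None:           # authentication: present
        raise PermissionError("login required")
    return DOCUMENTS[doc_id]["body"]   # authorization: missing, so any
                                       # logged-in user reads any document

def get_document_safe(session_user, doc_id):
    if session_user is None:
        raise PermissionError("login required")
    doc = DOCUMENTS[doc_id]
    if doc["owner"] != session_user:   # the missing ownership check
        raise PermissionError("not your document")
    return doc["body"]
```

Nothing here is syntactically wrong, which is why a scanner needs to understand the intended access model, not just the code shapes.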

Injection vulnerabilities. SQL injection, command injection, XSS — but specifically the non-obvious versions, where the vulnerability only exists due to how data flows through multiple functions.

SSRF and DNS rebinding. The kind of vulnerabilities where an attacker can make your server make requests on their behalf. These are notoriously hard to catch with static analysis because they depend on understanding the full request lifecycle.
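Here's a minimal, invented sketch of the SSRF shape: the validation and the fetch disagree about what counts as "internal," which is exactly the kind of gap that string-level rules miss.

```python
# Hypothetical SSRF: the check and the fetch disagree about what
# "internal" means, so an attacker can aim the server at itself.
from urllib.parse import urlparse

BLOCKED_HOSTS = {"localhost", "127.0.0.1"}

def is_allowed(url: str) -> bool:
    # The naive check only compares the hostname string. It misses
    # 127.0.0.1 spelled differently (e.g. "127.1") and hostnames that
    # resolve to internal IPs only at request time (DNS rebinding).
    return urlparse(url).hostname not in BLOCKED_HOSTS

def fetch_preview(url: str) -> str:
    # In a real service this would issue the outbound HTTP request.
    if not is_allowed(url):
        raise ValueError("blocked host")
    return f"would fetch {url}"
```

The robust fix is to resolve the hostname, check the resulting IP against private ranges, and pin that resolution for the actual request, which requires reasoning about the full request lifecycle rather than the URL string.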

Business logic vulnerabilities. The hardest category. These are flaws that exist not in the code's syntax but in its intended behavior — cases where the code does exactly what it's told, but what it's told to do creates a security problem. Rule-based tools essentially can't catch these. Reasoning-based tools have a shot.
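A contrived but representative example of that class (names and numbers are mine): the arithmetic is flawless, the assumption is missing.

```python
# Hypothetical business-logic flaw: every line does exactly what it was
# told, but the rules combine into a way to get paid for "buying".
def apply_order(balance: float, unit_price: float, quantity: int) -> float:
    # Nothing states the implicit assumption quantity >= 0, so an order
    # for quantity=-5 becomes a credit. There is no pattern to match:
    # the code is syntactically perfect.
    return balance - unit_price * quantity

def apply_order_safe(balance: float, unit_price: float, quantity: int) -> float:
    # The fix is simply making the assumption explicit.
    if quantity < 0:
        raise ValueError("quantity must be non-negative")
    return balance - unit_price * quantity
```

No rule engine flags `balance - unit_price * quantity`; you have to know what an order is supposed to mean.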

Long-dormant bugs. The Firefox findings and the 500+ open-source vulnerabilities suggest Claude Security is especially good at finding things that have survived for years through multiple audits. These tend to be subtle, context-dependent issues that humans reviewing the code wouldn't flag because they're not obviously wrong.


Who Should Actually Use This

Enterprise engineering teams dealing with complex codebases and security requirements. If you're running Claude Code at the Enterprise tier, there's no reason not to try it — it's included in your subscription, and the setup is minimal.

Open-source project maintainers. Anthropic has explicitly carved out expedited access for OSS maintainers, presumably because the Firefox results made it clear that this category of user gets outsized value. If you're maintaining a widely used open-source library, especially one that handles user input or network requests, this is worth a serious look.

Security teams at organizations with regulated requirements. The audit trail for dismissed findings, the CSV export, the Jira webhook — these are features that make a security team's life easier when they're writing reports for compliance purposes.

Who should probably wait: individual developers on Team or Max plans (access isn't available yet), and teams at small companies without dedicated security resources. The tool is optimized for enterprise-scale repositories and produces findings that require security expertise to triage effectively. If you don't have someone who can evaluate a "medium confidence SSRF finding in your authentication middleware," the output might create more confusion than clarity.

It's also worth being clear about what Claude Security isn't. It's not running your application. It's not doing dynamic analysis or fuzzing. It's reasoning about static code. The class of vulnerabilities that only manifest at runtime — race conditions under specific load patterns, environment-dependent behavior — are out of scope.
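For concreteness, here's the classic check-then-act race, in invented form (the `claim_lockfile` names are mine): whether the unsafe version actually fails depends entirely on runtime interleaving, which is why it sits at the edge of what static reasoning can confirm.

```python
# Illustrative check-then-act (TOCTOU) race: the kind of bug that only
# manifests under concurrent access, outside a static scan's view.
import os

def claim_lockfile(path: str) -> bool:
    # Race window: between exists() and open(), another process can
    # create the same file, so two processes may both "win" the lock.
    if os.path.exists(path):
        return False
    with open(path, "w") as f:
        f.write(str(os.getpid()))
    return True

def claim_lockfile_safe(path: str) -> bool:
    # O_CREAT | O_EXCL makes the check and the create a single atomic
    # filesystem operation, closing the window.
    try:
        fd = os.open(path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
    except FileExistsError:
        return False
    with os.fdopen(fd, "w") as f:
        f.write(str(os.getpid()))
    return True
```

A careful static reasoner can point at the window in the first version, but proving it's exploitable (or that a given load pattern triggers it) is dynamic-analysis territory.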


The Bottom Line

Anthropic's been making credible, verifiable claims about what this tool finds. The preview results with Firefox and open-source repositories aren't marketing — those are documented findings that researchers have validated.

The public beta is still a beta. The tool is going to have false positives. The reasoning-based approach trades precision for coverage in ways that experienced security engineers will sometimes find frustrating. The access restrictions (Enterprise now, Team/Max coming) limit who can evaluate it firsthand.

But if you're already paying for Claude Enterprise, setting up a scan on your most critical repository costs you maybe 20 minutes. Given what the tool found in open-source code that had survived years of expert review, that's a reasonable trade.

The serious question isn't whether Claude Security is a useful tool. Based on the evidence, it clearly is. The question is whether it fits into your workflow at your team's scale. For most enterprise engineering orgs: yes. For indie developers and small teams: wait for broader availability and cleaner DX.


Source: Anthropic blog post on Claude Security public beta. Original limited research preview announced February 20, 2026 at anthropic.com.
