A bug in OpenBSD. It had been there for 27 years. 27 years of code reviews, security audits, version updates. Nobody caught it. Another one in FFmpeg, 16 years old, after 5 million automated fuzzing iterations. Still standing.
Then an AI model showed up, read the code, and found them. Without being told where to look.
This is Project Glasswing. And it changes a few things.
## What Glasswing Found (and Nobody Else Did)
Anthropic just published results from Claude Mythos Preview, a model specialized in vulnerability detection. The findings speak for themselves.
A flaw in OpenBSD's TCP SACK implementation dating back to 1999. A signed integer overflow allowing remote denial-of-service. The kind of bug that survived hundreds of reviews, dozens of major releases, thousands of pairs of eyes. Still there.
A defect in FFmpeg's H.264 decoder, 16 years old. A sentinel collision causing an out-of-bounds write. Automated tools never caught it. Not for lack of trying: 5 million fuzz tests. Zero results. Mythos found it by analyzing the code directly.
It didn't stop there. The model chained multiple Linux kernel vulnerabilities to build a full privilege escalation path, defeating hardened protections: stack canaries, KASLR, W^X. Not an isolated flaw. A working attack chain.
On FreeBSD, Mythos autonomously identified and exploited a 17-year-old remote code execution vulnerability in the NFS service. Unauthenticated root access. Fully autonomous. No human steering.
And then there's this: against Firefox 147, the model successfully developed JavaScript shell exploits 181 times. Claude Opus 4.6, the previous best model? Twice.
Bugs that survived decades of human review and millions of automated tests. And a model that digs them up on its own.
The key word here is "on its own". Nobody told Mythos "look in this file" or "focus on this function". It went through the code, identified the flaws, and built the corresponding exploits. Black box testing, binary analysis, full pentesting. It even reverse-engineered closed-source browsers and operating systems to find vulnerabilities through reconstructed source code analysis.
The numbers are staggering. Anthropic reports over 1,000 critical-severity vulnerabilities and thousands more high-severity bugs identified across every major operating system and every major web browser. Human validators reviewed 198 reports manually: 89% matched Claude's severity assessment exactly, 98% within one severity level.
## Claude Mythos: Not a Chatbot, an Auditor
The benchmarks are brutal.
| Benchmark | Claude Opus 4.6 | Claude Mythos Preview |
|---|---|---|
| CyberGym (vuln reproduction) | 66.6% | 83.1% |
| SWE-bench Verified | 80.8% | 93.9% |
| SWE-bench Pro | 53.4% | 77.8% |
| Terminal-Bench 2.0 | 65.4% | 82.0% |
This isn't incremental improvement. On SWE-bench Pro, that's nearly +25 points. On CyberGym, +16.5 points. A generational leap, not a minor update.
For context, CyberGym was developed at UC Berkeley to evaluate AI agents on real-world cybersecurity tasks in realistic environments. Not textbook capture-the-flag exercises. Scenarios modeled on actual enterprise security operations.
And the detail that matters: Mythos doesn't just find known vulnerabilities. It discovers new ones. The model constructs sophisticated exploits autonomously: stack overflows, Return-Oriented Programming (ROP) chains with 20+ gadgets, JIT heap sprays with sandbox escapes, and privilege escalation built by chaining two to four vulnerabilities.
Nicholas Carlini, a well-known security researcher, put it simply: "I've found more bugs in the last couple of weeks than I found in the rest of my life combined."
## 12 Companies, $100M, and a Clear Message
Look at the launch partners: AWS, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, Linux Foundation, Microsoft, NVIDIA, Palo Alto Networks. And Anthropic.
Companies that compete on pretty much everything. United around the same project. When Apple and Google sit at the same table, the topic is serious.
Simon Willison noted one notable absence: OpenAI. Whether that's politics or timing, it's worth watching.
Anthropic is putting $100M in Mythos Preview usage credits toward research. $2.5M for Alpha-Omega and OpenSSF through the Linux Foundation. $1.5M for the Apache Software Foundation. Beyond the 12 launch partners, over 40 additional organizations maintaining critical software have been granted access.
This isn't an empty press release. It's real funding for open source security.
The reason for the urgency? The window between vulnerability discovery and exploitation has collapsed. What used to take months now happens in minutes with AI. Greg Kroah-Hartman, a senior Linux kernel maintainer, observed the shift firsthand: "Something happened a month ago, and the world switched. Now we have real reports." Security teams went from filtering AI-generated "slop" to dealing with genuinely sophisticated vulnerability findings overnight.
If attackers have access to these capabilities (and they will), defense needs a head start. No choice.
## Safety: Why You Can't Use Mythos (Yet)
Anthropic was clear: no general release planned. This is deliberate.
The system card details why. Over 99% of discovered vulnerabilities remain unpatched at time of publication. Releasing a model that can find and exploit them autonomously would be handing attackers a loaded weapon.
Access goes through two channels:
- The Claude for Open Source program for eligible organizations
- A Cyber Verification Program for security professionals whose legitimate work is affected by safeguards
Partners use Mythos exclusively for finding and fixing vulnerabilities in their own software or in open-source projects they maintain. All discovered vulnerabilities go through coordinated disclosure with SHA-3 cryptographic commitments and a 90+45 day window. Professional human triagers validate findings before vendor notification to prevent maintainer flooding.
Announced pricing for participants: $25/$125 per million input/output tokens, via Claude API, Amazon Bedrock, Google Cloud Vertex AI, and Microsoft Foundry.
Simon Willison's take: "I can live with that." The security risks are credible. Giving trusted teams preparation time is justified.
## What This Changes for Devs
Even without direct access, the direction is set. What Mythos does today in preview, mainstream models will do tomorrow. That means:
Automated code auditing will become a standard. Not a nice-to-have. A mandatory step in CI/CD, just like unit tests. If a model can find a 27-year-old bug in a few passes, the "we did a code review" excuse won't hold up anymore.
Supply chain security moves to the front. Glasswing is already working on update automation, supply chain security standards, and automated patching frameworks. Those dependencies nobody audits in your package.json? That's becoming less and less acceptable.
Memory-safe languages won't save you either. Mythos found a guest-to-host memory corruption vulnerability in a virtual machine monitor written in a memory-safe language. Logic bugs don't care about your type system.
The uncomfortable question stays open. If AI finds zero-days faster than any human, who gets there first? The defender who patches, or the attacker who exploits? Glasswing is a bet on defense. But it's a race, and it's only getting started.
## Sources
### Official Anthropic Resources
- Project Glasswing: Securing critical software for the AI era (Anthropic)
- Claude Mythos Preview: Technical Report (Anthropic Red Team)
- Alignment Risk Update: Claude Mythos Preview (Anthropic System Card)
### Press Coverage
- Anthropic says its most powerful AI cyber model is too dangerous to release publicly (VentureBeat)
- Anthropic debuts preview of powerful new AI model Mythos in new cybersecurity initiative (TechCrunch)
- Anthropic limits Mythos AI rollout over fears hackers could use model for cyberattacks (CNBC)
- Anthropic is giving some firms early access to Claude Mythos to bolster cybersecurity defenses (Fortune)
- Anthropic will use its biggest, baddest AI model to protect against cyberattacks (Fast Company)
- Tech giants launch AI-powered 'Project Glasswing' to identify critical software vulnerabilities (CyberScoop)
- Anthropic's latest AI model identifies 'thousands of zero-day vulnerabilities' (Tom's Hardware)
- Anthropic just unveiled its most dangerous AI model (Neowin)
### Analysis & Commentary
- Anthropic's Project Glasswing: restricting Claude Mythos to security researchers sounds necessary to me (Simon Willison)
- Claude Mythos Benchmarks Explained: 93.9% SWE-bench & Every Record Broken (NxCode)
- What Is Inside Claude Mythos Preview? Dissecting the System Card (Ken Huang)
- Everything You Need to Know About Claude Mythos (Vellum)
- Hacker News Discussion: System Card (Hacker News)