Anthropic built a model so dangerous they refused to release it publicly. Then a Discord group got in anyway.
The Model They Wouldn't Ship
Claude Mythos Preview is Anthropic's most capable model to date for coding and agentic tasks (Anthropic). But it was never meant to reach the public. During testing, Mythos improved to the point where it mostly saturated existing cybersecurity benchmarks, prompting Anthropic to shift focus to novel real-world security tasks: specifically zero-day vulnerabilities, bugs that were not previously known to exist (Anthropic).
What they found was stark. Mythos Preview had already identified thousands of zero-day vulnerabilities across critical infrastructure, many of them critical in severity, in every major operating system and every major web browser (Anthropic). In one documented case, Mythos fully autonomously identified and exploited a 17-year-old remote code execution vulnerability in FreeBSD that allows anyone to gain root on a machine running NFS. No human was involved in either the discovery or the exploitation of this vulnerability after the initial request to find the bug (Anthropic).
This is why the model never went public.
Project Glasswing: The Controlled Release
Announced on April 7, Anthropic's "Project Glasswing" is the controlled initiative under which Mythos was instead deployed: select organizations are permitted to use the unreleased Claude Mythos Preview model for defensive cybersecurity (Yahoo!).
Launch partners included Amazon Web Services, Anthropic, Apple, Broadcom, Cisco, CrowdStrike, Google, JPMorganChase, the Linux Foundation, Microsoft, NVIDIA, and Palo Alto Networks. Access was also extended to over 40 additional organizations that build or maintain critical software infrastructure (Anthropic). The logic was clear: get defenders ahead of the curve before the capabilities proliferate to actors who won't use them carefully.
Claude Mythos Preview is available to Project Glasswing participants at $25/$125 per million input/output tokens, accessible via the Claude API, Amazon Bedrock, Google Cloud's Vertex AI, and Microsoft Foundry. Anthropic committed $100M in model usage credits to cover Project Glasswing throughout the research preview (Anthropic).
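For participants, access looks like any other Claude API integration. The TypeScript sketch below shows roughly what a Glasswing partner's request might look like; the model ID "claude-mythos-preview" and the prompt are assumptions for illustration, not confirmed identifiers, and real access is gated to approved organizations and their credentialed keys.

```typescript
// Minimal sketch of a defensive-security request against the preview model.
// Assumes the official @anthropic-ai/sdk package and an API key scoped to a
// Project Glasswing organization; the model ID below is hypothetical.
import Anthropic from "@anthropic-ai/sdk";

const client = new Anthropic(); // reads ANTHROPIC_API_KEY from the environment

const message = await client.messages.create({
  model: "claude-mythos-preview", // hypothetical ID, not a published one
  max_tokens: 2048,
  messages: [
    {
      role: "user",
      content:
        "Audit this C parser for memory-safety bugs and propose hardening patches.",
    },
  ],
});

console.log(message.content);
```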
The perimeter was tight by design. The news today is that it didn't hold.
How the Discord Group Got In
A "private online forum," the members of which have not been publicly identified, managed to gain access to the tool through a third-party vendor. The unauthorized group tried a number of different strategies to gain access to the model, including using "access" enjoyed by a person currently employed at a third-party contractor that works for Anthropic. TechCrunch
Members of the group are part of a Discord channel that seeks out information about unreleased AI models. The group has been using Mythos regularly since gaining access to it, and provided evidence to Bloomberg in the form of screenshots and a live demonstration of the software (TechCrunch).
The method they used to find the endpoint is particularly revealing. The group, which gained access on the very same day Mythos was publicly announced, "made an educated guess about the model's online location based on knowledge about the format Anthropic has used for other models" (TechCrunch). This wasn't a sophisticated breach; it was pattern recognition applied to a known naming convention. The group reportedly described themselves as being interested in exploring new models, not causing harm.
Anthropic said it is investigating the claims and, so far, has seen no sign that its own systems were affected; the allegation points to possible misuse of access outside Anthropic's core network, not a confirmed breach of the company's internal defenses (Prism News).
Why This Is a Bigger Deal Than It Looks
The immediate reassurance — no core systems compromised, the group wasn't malicious — is accurate but beside the point. The problem isn't what this specific group did. It's what this incident reveals about the entire premise of Project Glasswing.
Anthropic's controlled release strategy rests on the assumption that access can be meaningfully gated through vendor relationships. A small group of unauthorized users reportedly accessed Mythos on the same day Anthropic announced limited testing (Prism News), meaning the access controls failed within hours of the first public announcement, before most Glasswing partners had even begun their work. If the group could guess the model's endpoint from Anthropic's known URL patterns, so can threat actors with more resources and worse intentions.
There's also a pattern here worth naming. This is the third significant information control failure at Anthropic in recent weeks. The Claude Code source leak in March exposed 512,000 lines of unobfuscated TypeScript via a missing .npmignore entry. Before that, a draft blog post describing Mythos as "by far the most powerful AI model" ever built at Anthropic was left in a publicly accessible data store. That March 26 leak of draft materials, which Anthropic said resulted from human error in its content-management configuration, was actually Mythos's first public exposure (Prism News).
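For context on how that kind of packaging slip happens: an npm publish ships whatever the ignore rules don't exclude, so one missing .npmignore entry is enough to put raw sources in the public tarball. Below is a hedged TypeScript sketch of a pre-publish check that inspects the tarball via `npm pack --dry-run --json` and fails if .ts sources would ship; the exact JSON shape is assumed from recent npm versions, so treat it as illustrative rather than a statement of how Anthropic's pipeline works.

```typescript
// Sketch: fail the publish pipeline if raw TypeScript sources would be included
// in the npm tarball. Assumes Node 18+ and an npm version whose
// `npm pack --dry-run --json` output is an array of reports with a `files` list.
import { execSync } from "node:child_process";

type PackReport = { files: { path: string }[] }[];

const report = JSON.parse(
  execSync("npm pack --dry-run --json", { encoding: "utf8" })
) as PackReport;

const leakedSources = report[0].files
  .map((f) => f.path)
  .filter((p) => p.endsWith(".ts") && !p.endsWith(".d.ts"));

if (leakedSources.length > 0) {
  console.error("Refusing to publish; source files in tarball:", leakedSources);
  process.exit(1);
}
console.log("Tarball looks clean: no raw .ts sources included.");
```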
Then there's the government subplot. The National Security Agency is using Mythos Preview despite top officials at the Department of Defense, which oversees the NSA, insisting Anthropic is a "supply chain risk." The department moved in February to cut off Anthropic and force its vendors to follow suit. The military is now broadening its use of Anthropic's tools while simultaneously arguing in court that using those tools threatens U.S. national security (Axios). Meanwhile, CISA, the agency whose entire mandate is critical infrastructure protection, reportedly does not have access to the model (Axios).
The entity designed to defend critical systems can't get in. A Discord group can.
What Anthropic Actually Said
"We're investigating a report claiming unauthorized access to Claude Mythos Preview through one of our third-party vendor environments," an Anthropic spokesperson said. The company found no evidence that the supposedly unauthorized activity impacted Anthropic's systems at all. TechCrunch
That's a factually careful statement. It's also a familiar shape: acknowledge the narrow, deny the broader implication. Anthropic has been here before.
The Vendor Problem Nobody Wants to Solve
The deeper structural issue is that enterprise AI deployments at frontier capability levels require trust chains that extend across dozens of organizations. Anthropic's 40-organization Glasswing rollout means 40 distinct security postures, 40 sets of contractors, and 40 potential lateral entry points for anyone who knows what they're looking for.
Anthropic said it does not plan to make Mythos Preview generally available, but its eventual goal is to enable users to safely deploy Mythos-class models at scale, for cybersecurity purposes but also for the myriad other benefits that such highly capable models will bring (Simon Willison). That goal is legitimate. But reaching it requires solving vendor access governance at a level the industry hasn't had to reckon with before. This incident is an early indication of what the stakes look like when the effort falls short.
A model capable of finding zero-days in every major operating system and browser has now been accessed by people outside the intended perimeter. The question isn't whether the Discord group caused harm. It's whether the perimeter can hold when the people on the other side are actually trying.
The line between "interested in playing around" and "interested in breaking things" isn't enforced by intent. It's enforced by access controls. Anthropic's have now failed twice in the same month.
Follow for more coverage on MCP, agentic AI, and AI infrastructure.