DEV Community

Damien Gallagher

Posted on • Originally published at buildrlab.com

Anthropic's 'Claude Mythos' Leaked: A Step Change in AI Power — and Risk

Anthropic has been caught with its drafts down. A data leak this week exposed the existence of a powerful new AI model called Claude Mythos — and according to Anthropic itself, it represents "a step change" in performance. The company confirmed the model exists, is in early access testing, and is "the most capable we've built to date."

How It Got Out

The leak wasn't exactly a hack. A "human error" in the configuration of Anthropic's content management system left roughly 3,000 unpublished blog assets sitting in a publicly searchable data store — including what appeared to be a full draft announcement post for the new model. Security researchers Roy Paz (LayerX Security) and Alexandre Pauwels (University of Cambridge) separately discovered and reviewed the material before Fortune reported on it Thursday.

Anthropic quickly removed public access to the data store after being contacted by Fortune. The company characterised the exposed content as "early drafts of content considered for publication." That's one way to put it.

What Is Mythos — and What Is Capybara?

The leaked draft post refers to the model using two names: Claude Mythos and Capybara. The latter appears to be the internal tier name. According to the draft, Capybara represents an entirely new tier sitting above Opus in Anthropic's model hierarchy — something the company has never had before.

Anthropic currently offers three model tiers — Haiku (fast and cheap), Sonnet (balanced), and Opus (most capable). Capybara/Mythos would sit above all of them: larger, more intelligent, more expensive. The draft described it as dramatically outperforming Claude Opus 4.6 on benchmarks — though the specific scores were cut off before the cache was pulled.

If that holds, this is a significant jump. Claude Opus 4.6 is already among the top models in production today. A "step change" above that would put Anthropic solidly ahead of the best publicly benchmarked models from OpenAI, Google, and Meta.

The Cybersecurity Warning

Here's where it gets interesting — and a little uncomfortable. According to the leaked draft, Anthropic's own post warned that Mythos poses unprecedented cybersecurity risks. The company was apparently planning to be upfront about this, framing it as part of responsible disclosure around a dramatically more capable model.

That kind of public self-criticism from a frontier AI lab is not nothing. It suggests Anthropic ran its own risk evaluation on the model and found capabilities that crossed some internal threshold worth flagging publicly. Given that we're talking about a model the company says surpasses its previous best by a meaningful margin, the concern is credible.

This aligns with the broader debate in AI safety circles about what to do when you build something genuinely powerful. Anthropic's approach has historically been to publish before releasing, to give the security community time to prepare. Whether Mythos will follow that pattern remains to be seen.

What This Means for the Race

The timing is worth noting. OpenAI has been shipping rapidly. Google has been pushing Gemini Ultra into enterprise aggressively. Meta's Llama 4 is threatening the open-source ecosystem. Anthropic has been somewhat quieter publicly — but apparently not in the lab.

A model that meaningfully leapfrogs current Opus performance would be a significant commercial play. Anthropic is also clearly thinking enterprise: the leaked cache revealed details of a planned invite-only CEO summit in Europe, part of a drive to sell to large corporate customers. Mythos arriving alongside that sales push is no coincidence.

For developers and engineers watching the space: pay attention to the Capybara tier. If Anthropic's internal benchmarks are even half as impressive as the leak implies, the next few months could shift what "state of the art" means again.

The Irony

There's a particular irony in a company known for safety-first AI development accidentally leaking its most capable model via a misconfigured data store. The lesson here isn't really about AI — it's the same lesson we've been learning since the cloud era began: security configuration errors are the silent killer of enterprise credibility.
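Nothing in the leak shows exactly how Anthropic's data store was configured, but the classic version of this failure is an object-storage policy that quietly grants read access to the anonymous principal. As a minimal sketch (the AWS-style policy document and bucket name below are hypothetical, purely for illustration), a few lines of Python can lint a policy for that mistake before it ships:

```python
import json

def grants_public_read(policy_json: str) -> bool:
    """Return True if any Allow statement in an AWS-style bucket policy
    grants object reads (or a wildcard action) to the anonymous principal."""
    policy = json.loads(policy_json)
    for stmt in policy.get("Statement", []):
        if stmt.get("Effect") != "Allow":
            continue
        principal = stmt.get("Principal")
        # "Principal": "*" or {"AWS": "*"} both mean "anyone on the internet".
        is_public = principal == "*" or (
            isinstance(principal, dict) and principal.get("AWS") == "*"
        )
        actions = stmt.get("Action", [])
        if isinstance(actions, str):
            actions = [actions]
        if is_public and any(a in ("*", "s3:*", "s3:GetObject") for a in actions):
            return True
    return False

# A policy that makes every object in the bucket world-readable.
public_policy = json.dumps({
    "Version": "2012-10-17",
    "Statement": [{
        "Effect": "Allow",
        "Principal": "*",
        "Action": "s3:GetObject",
        "Resource": "arn:aws:s3:::example-drafts-bucket/*",
    }],
})
print(grants_public_read(public_policy))  # → True
```

A check like this, wired into CI alongside whatever the cloud provider's own public-access guardrails offer, is cheap insurance against exactly the kind of exposure described above.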

Anthropic's transparency in confirming the leak and the model's existence is, at least, in character. Now the question is: when does Mythos actually ship?


Sources: Fortune (exclusive reporting), LayerX Security, University of Cambridge cybersecurity research.
