TL;DR
Claude Mythos Preview is a restricted Anthropic model available via Project Glasswing—a cybersecurity-focused preview, not a public release. Benchmarks suggest it’s much stronger than Claude Opus 4.6 for software engineering tasks, but Anthropic hasn’t made it widely available, likely due to dual-use risks: powerful models can help both defenders and attackers.
Introduction
Most AI labs claim to take safety seriously, but few actually delay rolling out new models for safety reasons. That’s what makes Claude Mythos Preview notable—Anthropic has not launched it as a regular Claude release, nor offered a public API or chat interface. Instead, it’s surfaced via Project Glasswing, a limited program focused on defensive cybersecurity.
The reported benchmarks are significant: Claude Mythos Preview reportedly outperforms Claude Opus 4.6 by a wide margin on coding tasks like SWE-Bench. If these numbers hold, Anthropic may already have a model that shifts the balance in cyber defense and offense.
💡 Pro Tip: If you’re building around AI model releases, API-first testing is critical. Labs often give early access via restricted programs before public rollout. Apidog helps you model upcoming API flows, mock restricted endpoints, and test integrations before public access.
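The mock-before-access idea above can be sketched in a few lines. This is a minimal illustration, not Apidog-specific tooling: the model name `claude-mythos-preview` and the response shape are assumptions loosely modeled on Anthropic's public Messages API, since the restricted endpoint's real schema is unknown.

```python
# Minimal sketch: mocking a not-yet-public model endpoint so integration
# code can be written and tested before access is granted.
# The model name and response shape below are assumptions, not confirmed
# details of the restricted API.

def mock_messages_api(payload: dict) -> dict:
    """Stand-in for a POST /v1/messages call while real access is restricted."""
    return {
        "model": payload["model"],
        "role": "assistant",
        "content": [{"type": "text", "text": "[mocked response]"}],
        "stop_reason": "end_turn",
    }

def run_agent_step(send, model: str, prompt: str) -> str:
    """Integration code under test. `send` is injected, so the mock or a
    real HTTP client can be swapped in without changing this function."""
    response = send({
        "model": model,
        "max_tokens": 1024,
        "messages": [{"role": "user", "content": prompt}],
    })
    return response["content"][0]["text"]

# Until public access exists, wire in the mock:
text = run_agent_step(mock_messages_api, "claude-mythos-preview", "Review this diff")
print(text)  # [mocked response]
```

The key design choice is dependency injection: call sites never hard-code the transport, so swapping the mock for a real client later is a one-line change.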
What is Claude Mythos Preview?
According to current reporting, Claude Mythos Preview is an unreleased Anthropic model, available only to selected cybersecurity partners and researchers.
This is not a typical Claude launch like Sonnet or Opus. Instead, it’s a tightly controlled preview model, with access limited to specific defensive cybersecurity use cases. Reuters notes that Anthropic is working with partners such as Amazon, Microsoft, Apple, Google, Nvidia, CrowdStrike, and Palo Alto Networks through Project Glasswing, focusing on security—not mass-market access.
Right now, the most accurate description: Claude Mythos Preview is a restricted-access model for defensive cybersecurity work, not a public Claude release.
Why the Model is Getting So Much Attention
The attention stems from the reported benchmarks, which are strikingly high compared to previous Claude models.
| Benchmark | Claude Mythos Preview | Claude Opus 4.6 |
|---|---|---|
| SWE-Bench Verified | 93.9% | 80.8% |
| SWE-Bench Pro | 77.8% | 53.4% |
If these results are accurate, this is a substantial upgrade. SWE-Bench tests a model’s ability to read repositories, understand issues, modify code, and solve tasks under realistic conditions. A jump of this size indicates Anthropic has made significant progress on coding-heavy and agentic tasks.
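One way to see the size of the jump is to look at the reduction in unsolved tasks rather than the raw score delta. The figures below are the reported numbers from the table above and remain unverified:

```python
# Quick arithmetic on the reported SWE-Bench figures: the raw point deltas
# look large, and the error-rate reduction makes the jump even clearer.
# All numbers are reported, not independently verified.

scores = {
    "SWE-Bench Verified": (93.9, 80.8),  # (Mythos Preview, Opus 4.6), % solved
    "SWE-Bench Pro":      (77.8, 53.4),
}

for name, (mythos, opus) in scores.items():
    delta = mythos - opus
    # Fraction of Opus 4.6's unsolved tasks that the new model reportedly solves
    err_reduction = delta / (100 - opus) * 100
    print(f"{name}: +{delta:.1f} pts, ~{err_reduction:.0f}% fewer unsolved tasks")
```

On the reported numbers, that works out to roughly a two-thirds reduction in unsolved Verified tasks and about half on Pro, which is why the comparison reads as a generational step rather than an incremental one.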
The bigger story: Anthropic may already be running this model, but is choosing not to release it publicly.
Why Anthropic Might Be Keeping Claude Mythos Private
The main reason is dual-use risk.
A model powerful enough to help defenders find vulnerabilities, analyze attack paths, review unsafe code, and automate remediation can also empower attackers. The same features that help a blue team can help a red team, too.
This risk increases with capabilities like:
- Repository-scale code understanding
- Autonomous tool use
- Vulnerability reproduction
- Long-horizon problem solving
- Chaining many actions without losing context
These are exactly the capabilities developers want in advanced coding agents, but they are also the ones that raise cybersecurity concerns.
Recently, Anthropic has hinted that new frontier models may require more targeted rollouts. Claude Mythos Preview appears to be the leading example: restricted access, feedback from vetted users, and then a decision on broader release.
What Project Glasswing Means
Project Glasswing is the context behind Claude Mythos Preview.
The aim isn’t “here’s a better model for everyone”; it’s “here’s a better model, but only trusted defensive partners can use it right now.” This changes the product strategy completely.
Instead of a consumer launch driven by growth, the focus is on controlled evaluation: test what the model can do for defenders, assess misuse risks, and evaluate the effectiveness of safeguards.
This approach could signal a new industry norm: the strongest models may first appear in limited, sector-specific deployments, not public releases.
Is Claude Mythos Stronger Than Opus 4.6?
Based on reported benchmarks, it appears so.
- Claude Mythos Preview is reportedly well ahead of Opus 4.6 on SWE-Bench software engineering tasks.
- Anthropic is treating it as a higher-risk model.
- It’s not being rolled out like a standard Claude release.
However:
- It’s not proven to be better across every category.
- We don’t know if all comparison conditions were identical.
- Public users may not see the same gains as restricted partners.
Summary: Claude Mythos Preview seems materially stronger than Claude Opus 4.6 on key coding tasks, and Anthropic is restricting access due to risk.
What This Means for Developers
You probably can’t use Claude Mythos today, but it’s important for several reasons:
1. Public Claude Models May Not Reflect the Real Frontier
The best public Claude may be much weaker than Anthropic’s best internal models. Claude Mythos Preview suggests a bigger capability gap than many developers assumed.
2. Cybersecurity Capabilities Are Now the Release Bottleneck
Model quality isn’t the only release constraint. If a model’s offensive misuse risk is too high, labs may hold it back, regardless of performance.
3. The Strongest Models May Launch Through Restricted Programs
Top models may debut in enterprise pilots, partner networks, or limited previews—not in public APIs or chat apps.
This impacts how developers should plan, evaluate providers, and assess access risk.
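One practical response to the planning problem above is to keep model choice behind a small registry with explicit fallbacks, so a restricted model can be slotted in later without touching call sites. This is an illustrative sketch; the model names and the `restricted` flag are assumptions, not a real provider API:

```python
# Sketch: plan for restricted-vs-public access by routing model selection
# through a registry with fallbacks. Model names here are illustrative.

from dataclasses import dataclass

@dataclass(frozen=True)
class ModelOption:
    name: str
    restricted: bool  # True if access requires a partner program

# Ordered best-first; hypothetical identifiers for illustration only.
REGISTRY = [
    ModelOption("claude-mythos-preview", restricted=True),
    ModelOption("claude-opus-4-6", restricted=False),  # public fallback
]

def pick_model(have_restricted_access: bool) -> str:
    """Return the best model the caller can actually reach."""
    for option in REGISTRY:
        if option.restricted and not have_restricted_access:
            continue
        return option.name
    raise RuntimeError("no reachable model configured")

print(pick_model(have_restricted_access=False))  # claude-opus-4-6
```

If a restricted program later opens up, flipping `have_restricted_access` upgrades every call site at once, which is the point of isolating the access decision in one place.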
What This Means for the AI Industry
Claude Mythos Preview is important as a signal—even more than as a product.
If Anthropic withholds a model due to cyber risk, other labs may do the same. We may see a split market:
- Public models: Broad access, more constraints
- Restricted models: Higher capabilities, limited access
This affects benchmarking—labs could maintain stronger internal systems while only releasing safer, weaker versions. It also complicates how outsiders assess the true state of the art.
For policymakers and security researchers, this is the scenario they’ve been preparing for: powerful models that require new release mechanisms to balance defensive utility and offensive risk.
Claude Mythos Preview may be the first major case of this happening in real time.
Should Developers Care Right Now?
Yes, but not because you need to change your stack immediately.
You should care because model announcements now mean something different. “Best available” may only mean “best we’re willing to release,” not “best we have.” This impacts provider comparisons—public Claude, GPT, Gemini, GLM, and open-weight models may all lag behind private frontier systems.
Conclusion
Claude Mythos Preview is not a standard product launch. It’s a restricted Anthropic model, possibly much stronger than Claude Opus 4.6 for software engineering, but kept private due to risk.
If the reported benchmarks are accurate, the key story isn’t just that Anthropic built a better model—it’s that some frontier models are now considered too capable, or too risky, for immediate public release.
This marks a significant shift in how advanced AI systems may be delivered to the market.
FAQ
What is Claude Mythos Preview?
A restricted Anthropic preview model being tested with selected defensive cybersecurity partners, not the public.
Is Claude Mythos available to the public?
No public release. Access is restricted via Project Glasswing.
Is Claude Mythos stronger than Claude Opus 4.6?
Reported benchmarks suggest it’s much stronger on coding tasks, but not necessarily across all categories.
What is Project Glasswing?
Anthropic’s restricted-access program for evaluating Claude Mythos Preview in defensive cybersecurity settings.
Why would Anthropic refuse to release a stronger model?
Dual-use risk: the same capabilities that help defenders can also assist attackers.
Can developers use Claude Mythos today?
No—access is limited to select partners and researchers, not public API users.