XOOMAR

Posted on • Originally published at xoomar.com

US Order Kills Anthropic's Mythos 5, Fable 5 for All

Washington just showed it can force a frontier AI model offline first and explain the evidence later, and that is the wrong template for governing powerful AI.

The U.S. Department of Commerce directed Anthropic to suspend access to Fable 5 and Mythos 5 by any foreign national, including foreign national employees, according to American Banker. Anthropic then shut down both models for all customers to comply. The stated concern centers on a possible jailbreak of Fable 5, while Anthropic says the demonstrated issue involved only “previously known, minor vulnerabilities” that other public models could also find.

This is not a small operational hiccup. Claude Mythos 5 is described as Anthropic’s most advanced model, capable of detecting and exploiting zero-day software vulnerabilities at high speed. Fable 5 is the guarded version. As XOOMAR readers who followed “Claude Fable 5 Sells Mythos-Class AI on a Short Leash” know, the core tension was always obvious: how do you sell cyber capability without handing attackers a power tool?

The answer cannot be a black-box shutdown order.


A disputed jailbreak claim is too thin a basis for killing live AI models

A jailbreak in this context means a user finds a way around model safeguards to make the system produce restricted outputs. That risk is real. Nobody serious should pretend frontier models with cyber capability are harmless just because their vendor says the guardrails are strong.

But regulation has to distinguish between three very different things:

| Risk level | What it means | Appropriate response |
| --- | --- | --- |
| Theoretical vulnerability | A safeguard might be bypassed in some cases | Testing, disclosure, monitoring |
| Demonstrated harmful use | The bypass produces dangerous output or operational uplift | Mandatory mitigation, limited restrictions |
| Imminent public danger | The model can reliably enable serious harm at scale | Emergency suspension with review |

Anthropic says this incident belongs near the first category, not the third. In its June 12 statement, the company said the government letter “did not provide specific details of its national security concern” and that the disclosed technique identified “a small number of previously known, minor vulnerabilities.”

“We have not even received a disclosure of a concerning non-universal potential jailbreak that led to a harmful result,” Anthropic said.

That does not prove Anthropic is right. Companies downplay risk when their commercial product is at stake. But it does mean the government’s burden of proof matters. If officials can cut off deployed AI systems based on a narrow, disputed technique without a public technical summary, then the standard becomes fear, not evidence.

Secretive takedown orders create bad incentives for safety work

Anthropic has built much of its public identity around AI safety. That does not make it virtuous. It does make this case especially awkward for regulators.

The company says it worked with the U.S. government, the UK AISI, private third parties, and internal teams to red-team Fable’s safeguards for “thousands of hours in total.” It also says it adopted a defense-in-depth strategy because “perfect jailbreak resistance is not currently possible for any model provider.”

That admission should matter. The safer regulatory lesson is: document risk, test aggressively, monitor real use, and patch quickly. The worse lesson is: if a company discloses too much about residual risk, it becomes easier to force the product offline.

This is analysis, not a claim about Anthropic’s future behavior. The point is institutional. Enforcement systems shape incentives. A process that treats every disclosed weakness as a potential trigger for a total shutdown will not make AI labs more candid. It will make candor more expensive.

That is why XOOMAR’s earlier report, “Foreign National Ban Makes Anthropic Pull Fable, Mythos,” matters beyond one company. The order did not just target overseas access. Because it covered foreign nationals inside and outside the United States, including Anthropic employees, Anthropic said the practical effect was disabling both models for everyone.

The off switch should sit at the top of an AI risk ladder, not the first rung

The government needs a visible escalation path before it reaches for the kill switch. Powerful AI systems with cyber capability deserve hard supervision. But hard supervision is not the same as improvised suspension.

A better process would look like this:

  • Notice: Tell the company what capability or failure mode triggered concern.
  • Technical findings: Provide enough detail for engineers to reproduce or contest the issue.
  • Mitigation deadline: Require a fix, monitoring change, or access restriction within a defined window.
  • Independent review: Bring in vetted third-party experts when the evidence is sensitive.
  • Limited restriction: Narrow access if the risk is tied to specific users, outputs, or workflows.
  • Shutdown: Suspend the model only when the danger remains serious after the prior steps, or when the evidence supports immediate action.
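For readers who think in code, the ladder above can be sketched as a small state machine. This is a minimal illustration, not a description of any real regulatory procedure. The names `Step` and `next_step`, and the two boolean inputs, are hypothetical and chosen only to mirror the six steps listed above.

```python
from enum import IntEnum
from typing import Optional


class Step(IntEnum):
    """Hypothetical rungs of the escalation ladder, in the order listed above."""
    NOTICE = 1
    TECHNICAL_FINDINGS = 2
    MITIGATION_DEADLINE = 3
    INDEPENDENT_REVIEW = 4
    LIMITED_RESTRICTION = 5
    SHUTDOWN = 6


def next_step(
    current: Step,
    risk_resolved: bool,
    immediate_action_supported: bool = False,
) -> Optional[Step]:
    """Return the next rung, or None when the process can stop.

    Assumption: a shutdown is reached either by climbing every rung
    or by evidence that supports immediate action.
    """
    if risk_resolved:
        return None  # the concern was addressed; no further escalation
    if immediate_action_supported:
        return Step.SHUTDOWN  # the emergency path skips the remaining rungs
    if current is Step.SHUTDOWN:
        return Step.SHUTDOWN  # already at the top of the ladder
    return Step(current + 1)  # otherwise move up exactly one rung
```

The design choice worth noticing is that the shutdown is reachable from any rung, but only through an explicit evidence flag. Everything else moves one step at a time.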

Anthropic says the government has provided only verbal evidence of a potential narrow, non-universal jailbreak. David Sacks, tech investor and co-chair of the President's Council of Advisors on Science and Technology, gave the other side on X. He said the government was responding to a “highly credible trusted partner” and that Anthropic CEO Dario Amodei refused to fix the jailbreak or de-deploy the model.

Sacks also wrote:

“It's difficult to fathom how they could claim a jailbreak allowing operability of a cyber weapon could be defined as not ‘serious.’”

That is the strongest factual counterweight in the record supplied. If the government has evidence that Fable 5 can be reliably pushed into cyber-weapon behavior, then urgent action is defensible. But the word “if” is doing serious work. The public record still lacks the technical detail needed to evaluate the claim.

For banks, security teams, and enterprise AI buyers, predictability is not a luxury. American Banker reported that bank CEOs and Treasury Secretary Scott Bessent met with Anthropic in April to discuss Mythos, and that JPMorganChase joined Project Glasswing, a group of organizations allowed to test Mythos in their own environments. When models used in serious security testing can vanish by directive, customers need a rulebook, not a rumor chain.

Public safety is the best argument for the shutdown, but it still needs proof

The government’s best case is straightforward. If a frontier model can be reliably jailbroken to assist cybercrime, weapons misuse, fraud, or other serious harm, officials should not wait for damage to pile up. AI companies also have commercial incentives. They should not be the sole judges of whether their own products are safe enough.

That argument is strong. It still does not justify vague emergency governance as the default.

Emergency authority is legitimate when three conditions exist: credible evidence, narrow scope, and a path to repair or appeal. Credible evidence does not require publishing exploit details that would help attackers. It does require enough public explanation for outsiders to know whether the government is acting on demonstrated harm or a disputed edge case. Narrow scope matters because Anthropic says the directive’s foreign-national structure forced a total shutdown. A repair path matters because Sacks and Anthropic both say they want this resolved quickly, but the public does not yet know what exact fix would satisfy the government.
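The three-condition test can be written down as a simple checklist. The sketch below is illustrative only: `EmergencyOrder`, `missing_conditions`, and `is_legitimate` are hypothetical names, and the booleans stand in for judgments that in practice would need evidence and review.

```python
from dataclasses import dataclass, fields


@dataclass(frozen=True)
class EmergencyOrder:
    """Hypothetical record of the three conditions named above."""
    credible_evidence: bool
    narrow_scope: bool
    repair_or_appeal_path: bool


def missing_conditions(order: EmergencyOrder) -> list[str]:
    """List the conditions an order fails, in declaration order."""
    return [f.name for f in fields(order) if not getattr(order, f.name)]


def is_legitimate(order: EmergencyOrder) -> bool:
    """An order passes only when all three conditions hold."""
    return not missing_conditions(order)
```

Returning the list of missing conditions, rather than a bare yes or no, reflects the argument here: the useful question is which condition an order fails, because that is what a regulator would need to fix.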

Due process does not weaken AI safety. It strengthens it. It forces companies to defend their claims, and it forces regulators to defend theirs.

AI oversight has to become public, technical, and reviewable

Fable 5 and Mythos 5 may come back online quickly. Anthropic says it is “working to restore access as soon as possible,” and Sacks wrote that the administration wants the issue fixed and Fable returned to general release. That would solve the customer outage. It would not solve the governance problem.

The next framework should include public summaries of shutdown orders, confidential technical review when disclosure would create security risk, and independent audits that can test claims without exposing sensitive model details. It should also apply across frontier AI providers under the same standard. Selective, unclear enforcement will not produce safer systems. It will produce legal uncertainty around systems that banks, security teams, and developers are already testing in real environments.

The practical takeaway is simple: lawmakers and regulators need to build the AI enforcement process before the next model is pulled offline. If officials want trust in AI safety decisions, they cannot govern frontier models from inside a black box.

Impact Analysis

  • The order shows the U.S. government can force major AI systems offline before publicly explaining its evidence.
  • The case highlights the unresolved challenge of selling powerful cyber-capable AI without enabling misuse.
  • A black-box shutdown process could set a precedent for AI regulation that lacks transparency and proportionality.
