DEV Community

Shovon Saha
Shovon Saha

Posted on

Anthropic Asked the Government to Be Able to Shut Down AI. 48 Hours Later, It Happened to Them

Here's a timeline that sounds made up but isn't.

June 9, 2026: Anthropic, the company behind Claude, launches its two most powerful AI models ever. Claude Fable 5 (for the public) and Claude Mythos 5 (for cybersecurity defenders). State-of-the-art on almost every benchmark that exists.

June 10, 2026: Anthropic's CEO, Dario Amodei, publishes a massive policy essay. The core ask: the US government should have the legal power to shut down AI models that fail safety testing. He compares it to how the FAA can ground airplanes.

June 12, 2026: The US government sends Anthropic a letter. Using exactly that kind of power. Both brand-new Claude models go dark for everyone, worldwide, within hours.

You could not write a tighter turnaround if you tried.

What Amodei Actually Asked For

His essay is called "Policy on the AI Exponential." The short version: AI is moving so fast that governments can't keep up, and that gap is now dangerous. His proposed fix, in his own words, treats powerful AI models like airplanes or drugs. Things that are useful but can hurt people if they go wrong.

The specific ask: the government should have the power to block or reverse the deployment of an AI model if independent testing finds it poses serious risk in four areas. Cybersecurity. Bioweapons. Loss of control. AI that builds better AI.

He pointed to one model as proof this is needed: Claude Mythos Preview. Anthropic's own description was that Mythos Preview "scrambled the global cybersecurity landscape" and proved AI models are now tools with national security consequences.

Mythos Preview was the earlier version of the exact model that got shut down two days later.

What Happened Two Days Later

June 12, 5:21pm ET. Anthropic gets a letter from the US government. Citing national security, the government says: no foreign nationals can access the new Claude models, Fable 5 or Mythos 5. Anywhere. Including Anthropic's own foreign employees, even inside US offices.

Anthropic's problem: you can't surgically block "foreign nationals" from a model used by hundreds of millions of people across the internet. So they pulled both models entirely. Everyone. Every region. Gone in hours.

Every other Claude model kept running fine.

The Reason Is the Wild Part

Why did the government do this? Their stated concern: someone found a way to "jailbreak" Mythos, meaning bypass its safety restrictions.

Anthropic looked at what was actually demonstrated. Their description: the jailbreak amounts to asking the model to read a piece of code and find bugs in it.

That's the headline national security concern. A model reading code and finding flaws.

Here's why that's strange. At launch, just three days earlier, Anthropic published their own safety testing results for these exact models. The numbers were genuinely impressive:

  • An external bug bounty ran for over 1,000 hours and found zero "universal" jailbreaks (a universal jailbreak means a method that breaks ALL the safety rules at once, not just one narrow thing).
  • One outside testing partner ran 30 different public jailbreak techniques against the cybersecurity safeguards. The model refused all of them, every time.
  • Anthropic explicitly told the world: "it is likely impossible to completely prevent universal jailbreaks." They never claimed perfection. They said their goal was making jailbreaks slow, costly, and detectable.

So either the government found something genuinely new and serious that hasn't been shared publicly yet, or there's a big gap between "a narrow jailbreak exists" (which Anthropic already expected and said so) and "this justifies an emergency global shutdown."

Anthropic's own statement says it plainly: "we disagree that the finding of a narrow potential jailbreak should be cause for recalling a commercial model deployed to hundreds of millions of people."

The Part Nobody's Talking About

Mythos 5 wasn't just "a powerful model that happens to be risky." It was specifically built for cyberdefenders, deployed through something called Project Glasswing, run in collaboration with the US government itself. The pitch was: this model has the strongest cybersecurity capabilities in the world, and we're giving it to the people defending critical infrastructure.

The government just used its own national security power to shut down the cyberdefense tool it was helping build.

And this isn't even the first friction point. Back in April, the earlier version of this same technology got Anthropic labeled a "supply chain risk" by the Department of Defense after talks broke down. That label is normally reserved for foreign adversaries. Anthropic is suing to get it reversed. That case is still ongoing.

Why This Matters Even If You Don't Use These Models

Three things are true at the same time right now:

One. The CEO of a major AI company spent an entire essay asking governments to take exactly this kind of action, faster, on more models, with real legal teeth. That's not normal CEO behavior. Most companies lobby against the power to shut them down.

Two. Less than 48 hours later, a government used a version of that exact power on Anthropic's own newest, most advanced product. Without warning. Without detailed evidence shared publicly. Within hours of the letter arriving.

Three. Anthropic is complying while publicly saying they think it's a mistake, and that if this standard gets applied consistently, it would stop new AI model releases industry-wide. For every company. Not just them.

This is the first time a government has forced a major AI lab to pull a publicly deployed model entirely. Not "add more restrictions." Not "submit to an audit." Gone, within hours, for everyone on Earth.

If you build anything on top of these models, here's the new risk that didn't exist last week: the model you're using can become legally inaccessible overnight, for reasons that have nothing to do with how well it worked. No bug in your code fixes that. No amount of careful engineering protects you from it. It's a risk that lives one whole layer above your product.

Anthropic Connected the Dots Themselves

Here's the part that makes this not just a coincidence of timing. In their own suspension statement, Anthropic explicitly links back to the policy essay. Their words:

"As we have stated publicly, we believe the government should have the ability to block unsafe deployments, as part of a statutory process that is transparent, fair, clear, and grounded in technical facts. This action does not adhere to those principles."

That's not me drawing a connection. That's Anthropic saying, on the record, two days after asking for this exact power: yes, this is the thing we asked for, and no, this isn't how we said it should work.

They didn't get a transparent process. They got a letter at 5:21pm citing unspecified national security concerns, with verbal-only evidence of "a method... which essentially consists of asking the model to read a specific codebase and fix any software flaws."

They asked for an FAA. They got a phone call.

The Honest Open Question

Anthropic says they're still working to restore access and called this "a misunderstanding." They promised more details within 24 hours.

But step back. A company asked, in writing, for governments to have more power to do exactly this, with specific guardrails: transparent, fair, clear, grounded in technical facts. Two days later, they got the power without the guardrails.

Either the essay's framework still works in principle, and this was just a rough, ungoverned first use of an idea that needs the rules Anthropic proposed. Or asking for this kind of power was always going to mean someone else decides when and how it gets used, on whatever timeline and evidence bar they choose, regardless of what was proposed.

Both of those are uncomfortable. And we're going to find out which one this was in real time, because the next AI policy fight just got a real-world example to point at, three days old, with the receipts already public.


Sources

All claims above are sourced directly from Anthropic's own published statements and verified, dated reporting. No secondhand summaries.

Note on what's confirmed vs. not: The exact nature of the jailbreak the government is concerned about has not been publicly disclosed beyond Anthropic's characterization above. Anthropic said they would share more details within 24 hours of their June 12 statement; check their news page for updates before treating this as the final word.

Top comments (0)