
Alister Baroi

Posted on • Originally published at linkedin.com

Anthropic Mythos Broke Firefox: 271 zero-day vulnerabilities

271 zero-day vulnerabilities. One AI model. One Firefox release. And that's just one of four stories worth your attention this fortnight.

If you run engineering, security, or AI at your company, the message of this fortnight is clear: AI is no longer something your team uses; it's something your team (and your attackers) deploys. Here are the four moves that matter, and the numbers behind each.

1. Mythos found 271 zero-day vulnerabilities in Firefox 150

On April 22, Mozilla shipped Firefox 150 with patches for 271 security vulnerabilities, all identified by Anthropic's unreleased Mythos model during what Mozilla calls its initial evaluation. For context: across all of 2025, Mozilla patched roughly 73 high-severity Firefox bugs. Mythos delivered almost 4× that count in one evaluation window.

  • Mythos is distributed under Anthropic's restricted Project Glasswing programme: not a public model, and not available via API.
  • Firefox 150's security advisory lists 41 CVEs; three of those CVEs are memory-safety roll-ups that bundle many of the 271 individual defects.
  • The most serious finds were use-after-free bugs in the DOM and WebRTC, the same bug class that has driven browser exploitation for two decades.
  • Mozilla's caveat (worth quoting verbatim): "Mythos did not find any category of bug that an elite human researcher could not have found." The gain is scale and speed, not new capability.

"A gap between machine-discoverable and human-discoverable bugs favors the attacker, who can concentrate many months of costly human effort to find a single bug. Closing this gap erodes the attacker's long-term advantage by making all discoveries cheap." (Mozilla, on the shift in attacker/defender economics.)

If Anthropic can hand Mozilla 271 real bugs in a single evaluation, assume your own vendors (and your adversaries) are running similar passes on your stack. The question to ask this quarter is no longer "do we use AI in our security review?" — it is "which of our vendors do, and what does our threat model look like if attackers scale this before we do?"

2. Anthropic launched Claude Design

On April 17, Anthropic released Claude Design, a new Anthropic Labs product built on Claude Opus 4.7. It turns Claude into a design tool that produces real deliverables: prototypes, slide decks, one-pagers, marketing collateral.

  • Reads your codebase and existing design files to apply brand rules automatically.
  • Accepts 5+ input formats: text prompts, images, DOCX, PPTX, XLSX.
  • Exports to Canva, PDF, PPTX, HTML, or a shareable internal URL.
  • Hands off to Claude Code when a prototype needs real implementation.
  • Available in research preview across 4 subscription tiers: Pro, Max, Team, Enterprise.
  • Datadog's quantified claim: prototyping that took one week of back-and-forth now happens in one conversation.

This is Anthropic stepping out of "model behind an API" and into "end-user product", competing directly with Figma, Canva, and the slide-building half of Microsoft 365. If your product organisation still treats model vendors as neutral infrastructure, that assumption has a shorter shelf life than your next budget cycle. The vendor now competes with some of your tooling.

3. Google open-sourced DESIGN.md

Google Labs released a draft open-source specification called DESIGN.md, a format that describes design systems in a way AI agents can read, reason about, and validate against. It shipped alongside Stitch (Google's AI UI tool), but the format itself is platform-agnostic and hosted on GitHub.

  • Encodes design intent so AI agents stop guessing: "agents can know exactly what a color is for".
  • Includes built-in WCAG accessibility validation.
  • Portable across any tool or platform, not locked to Stitch.
  • Released as a draft spec, open to contribution.
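The article doesn't quote the draft spec, but the idea of "encoding design intent" is easy to picture. Purely as a hypothetical sketch (field names and values invented here, not taken from Google's draft), a DESIGN.md might pair each token with machine-readable rules an agent can validate against:

```markdown
# DESIGN.md (hypothetical sketch; not the actual draft spec)

## Colors
- `--color-danger: #B3261E`
  - intent: destructive actions only (delete, revoke); never decorative
  - contrast: must meet WCAG AA (4.5:1) against `--color-surface`

## Components
- `Button/Primary`
  - usage: at most one per view, for the single most important action
  - never: a disabled state as a substitute for validation messaging
```

The point of a format like this is that "what a color is for" becomes checkable: an agent can refuse to emit a danger color on a decorative element, and a linter can verify the contrast claim automatically.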

Watch the format, not the tool. Markdown files that AI agents read for persistent context — CLAUDE.md, AGENTS.md, README.md, and now DESIGN.md — are becoming the lingua franca of AI-native workflows. The standard here is being set in public, right now. Whichever spec wins becomes the default your engineering teams (and their AI copilots) work against for the next decade. API.md, SECURITY.md, and ONBOARDING.md are the obvious next chapters. If you have a design system or a platform team, this is a draft you want an opinion on.

4. OpenAI is quietly building "Hermes" — always-on agents inside ChatGPT

Leaked internal screenshots, surfaced by TestingCatalog between April 6–21, show OpenAI actively developing a platform codenamed Hermes. It adds persistent, 24/7 agents to ChatGPT — agents that run even when you are not at the keyboard.

  • Custom workflows and skill assembly.
  • Task scheduling and event-triggered actions.
  • External messaging connectors: agents can reach users outside ChatGPT.
  • Role-based templates: leaked screenshots show CTO and CPO archetypes.
  • Multi-agent orchestration, integrated with OpenAI's existing Workflows builder.
  • Status: internal beta. No release date confirmed. Unofficial: treat as leak, not announcement.

Signal for engineering leaders: if Hermes ships in the form shown, ChatGPT stops being a chat interface and becomes a runtime for autonomous systems, directly competing with Salesforce Agentforce, Microsoft Copilot Studio, and every agent startup built on the OpenAI API. Those startups would then be competing with their own platform provider, which can observe their agent patterns in aggregate across hundreds of millions of users. If your 2026 roadmap includes an AI agent strategy built on vendor APIs, this is the risk line item you want in your Q3 review.

The Thread

Four announcements, two weeks, one pattern. AI this fortnight was not about bigger models or cleaner benchmarks. It was about AI doing the work — finding real zero-days in shipped software, producing design artifacts that replace a week of iteration, standardizing how agents read intent, and (in OpenAI's case) running as always-on infrastructure your teams have not yet budgeted for.

The message for leaders is simple: the operational reality of AI is moving faster than most roadmaps were written to handle.
