The shift happened faster than most teams budgeted for. As of April 2026, Anthropic's Claude has overtaken ChatGPT on the three metrics that matter for enterprise software: annualized revenue, enterprise daily active users, and seat counts inside large dev orgs. Anthropic's ARR jumped from roughly $9B to $30B inside a single fiscal stretch. That's not a vanity-metric story; it's a procurement signal. If you've been treating "AI vendor" as synonymous with OpenAI on your architecture diagrams, that assumption is now stale.
We tested both stacks across coding, research, and agent workflows over the last quarter. The gap isn't about benchmarks anymore. It's about where the developer surface area lives.
The Numbers That Actually Shifted
The headline metric is the $9B → $30B ARR move, but the more interesting figure is the composition. Anthropic's revenue isn't dominated by consumer chat — it's API and enterprise. That changes how you read the lead. ChatGPT still has a larger consumer footprint; Claude pulled ahead specifically where developers and enterprises pay per token and per seat.
DAUs flipped too. For the first time since GPT-4 launched, Claude has more daily active users than ChatGPT in measured enterprise environments. The qualifier matters — this is enterprise DAU, not the global consumer count, where ChatGPT's brand recognition still dominates. But if you're staffing an internal AI platform team, enterprise DAU is the number that predicts which models your engineers will actually call from production code.
Enterprise DAU and total DAU tell different stories. ChatGPT's consumer reach is still larger globally. Claude's lead is concentrated where teams pay for API access and developer tooling — which is also where vendor lock-in becomes expensive to unwind.
Three things drove the shift, in our reading: Claude Code shipped as a first-class CLI rather than a chat skin, API pricing on Sonnet held steady while quality climbed across two model generations, and the long-context window started clearing the bar developers actually need (full repos, not just files).
Why Claude Code Changed the Math
Claude Code is the part most teams underestimated. OpenAI shipped Codex CLI and a parade of agent products, but Claude Code arrived with a different bet: terminal-native, skills-based, and built to work alongside existing IDEs rather than replace them. For teams already running Cursor, VS Code, or JetBrains, that's the difference between "another tool to standardize on" and "something that drops into the workflow you already have."
The skills system in particular changed how teams think about reusable AI workflows. Instead of duplicating prompts across projects, you write a skill once — a markdown file with frontmatter — and it activates when relevant. We've watched internal teams cut prompt-engineering overhead substantially by collapsing scattered system prompts into shared skills repos.
API pricing is the other half. Sonnet 4.6 and 4.7 held the same per-token rate as 4.5 while landing measurably better on SWE-bench and the kind of long-context retrieval tasks that matter for codebase work. For finance teams, "same price, better model" is the easiest renewal conversation possible. Compare that to model upgrades that came with price hikes elsewhere.
The Opus tier is more nuanced. It costs roughly 5x Sonnet per million tokens, which means it only pencils out for tasks where the quality delta justifies the spend — deep refactors, architecture review, multi-step agents that can't afford to retry. For most everyday coding, Sonnet is the sensible default. We default to Sonnet 4.6 for routine work and reach for Opus 4.7 when a task fails on Sonnet twice.
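The "Sonnet first, Opus after two failures" rule is easy to write down as code. What follows is a minimal sketch, not our production router: the tier labels, the `call_model` and `is_acceptable` callables, and the flat 5x cost ratio are all assumptions for illustration, and it treats every attempt as using the same number of tokens.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical tier labels. The 5x ratio is the rough figure quoted above,
# not a published price.
CHEAP_TIER = "sonnet"
EXPENSIVE_TIER = "opus"
RELATIVE_COST = {CHEAP_TIER: 1.0, EXPENSIVE_TIER: 5.0}


@dataclass
class Attempt:
    tier: str
    output: str
    ok: bool


def run_with_escalation(
    task: str,
    call_model: Callable[[str, str], str],   # (tier, task) -> output; assumed signature
    is_acceptable: Callable[[str], bool],    # your own pass/fail check
    cheap_attempts: int = 2,
) -> List[Attempt]:
    """Try the cheap tier up to `cheap_attempts` times, then escalate once."""
    attempts: List[Attempt] = []
    for _ in range(cheap_attempts):
        output = call_model(CHEAP_TIER, task)
        ok = is_acceptable(output)
        attempts.append(Attempt(CHEAP_TIER, output, ok))
        if ok:
            return attempts
    output = call_model(EXPENSIVE_TIER, task)
    attempts.append(Attempt(EXPENSIVE_TIER, output, is_acceptable(output)))
    return attempts


def relative_cost(attempts: List[Attempt]) -> float:
    """Cost in units of one cheap-tier attempt, assuming equal token counts."""
    return sum(RELATIVE_COST[a.tier] for a in attempts)
```

Under these assumptions, a task that fails twice on the cheap tier and then escalates costs 1 + 1 + 5 = 7 units, against 5 for going straight to the expensive tier. The policy only saves money when most tasks pass on the cheap tier, which is worth checking against your own failure rate.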
How to Evaluate Your AI Stack in 2026
Single-vendor strategies aged badly in 2025. The teams that avoided the worst migrations are the ones that wired their internal tools against a model-agnostic interface from day one, usually a thin wrapper around the OpenAI-style chat completions API, which Anthropic also exposes through a compatibility layer.
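To make "model-agnostic interface" concrete, here is one possible shape for it. This is a sketch under stated assumptions: `ChatBackend`, `ModelRegistry`, and `EchoBackend` are hypothetical names, and the echo backend stands in for a real adapter that would wrap a vendor SDK call. The point is that call sites ask for a workload, never a model name.

```python
from dataclasses import dataclass
from typing import Dict, Protocol, Sequence


@dataclass(frozen=True)
class Message:
    role: str      # "system", "user", or "assistant"
    content: str


class ChatBackend(Protocol):
    """Anything that can turn a message list into a reply."""

    def complete(self, messages: Sequence[Message]) -> str: ...


class EchoBackend:
    """Stand-in backend for tests; a real adapter would call a vendor SDK here."""

    def __init__(self, label: str) -> None:
        self.label = label

    def complete(self, messages: Sequence[Message]) -> str:
        return f"[{self.label}] {messages[-1].content}"


class ModelRegistry:
    """Maps logical workload names to backends so call sites never name a model."""

    def __init__(self) -> None:
        self._backends: Dict[str, ChatBackend] = {}

    def register(self, workload: str, backend: ChatBackend) -> None:
        self._backends[workload] = backend

    def complete(self, workload: str, messages: Sequence[Message]) -> str:
        if workload not in self._backends:
            raise KeyError(f"no backend registered for workload {workload!r}")
        return self._backends[workload].complete(messages)
```

Swapping vendors for a workload then becomes a single `register` call in configuration code instead of an edit at every call site.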
A practical audit for the next quarter:
- Inventory every production code path that calls a single hardcoded model name. Each one is a future migration cost.
- Measure your actual token mix. Most teams discover 80%+ of spend is on a handful of repeated workflows — those are the ones to benchmark across Claude, GPT, and Gemini before committing.
- Run the same eval suite against Sonnet 4.7 and GPT-5 on your own data. Public benchmarks are directionally useful and contractually useless. Your domain is the only benchmark that matters.
- Treat coding-agent work and chat work as separate procurement decisions. Claude leads coding right now; the chat story is more contested.
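The first audit item, finding hardcoded model names, can be roughed out with a short script. This is a sketch, not a complete linter: the regex assumes model identifiers appear as quoted string literals starting with `gpt-`, `claude-`, or `gemini-`, and the file suffixes are placeholders to adjust for your codebase. It will miss names built at runtime or read from config.

```python
import re
from pathlib import Path
from typing import Iterator, NamedTuple, Tuple

# Assumed naming prefixes; extend this for the vendors you actually call.
MODEL_LITERAL = re.compile(r"""["']((?:gpt|claude|gemini)-[A-Za-z0-9._-]+)["']""")


class Hit(NamedTuple):
    path: str
    line_no: int
    model: str


def scan_text(path: str, text: str) -> Iterator[Hit]:
    """Yield one Hit per quoted model-like literal found in `text`."""
    for line_no, line in enumerate(text.splitlines(), start=1):
        for match in MODEL_LITERAL.finditer(line):
            yield Hit(path, line_no, match.group(1))


def scan_tree(root: str, suffixes: Tuple[str, ...] = (".py", ".ts", ".go")) -> Iterator[Hit]:
    """Walk `root` and scan every file whose suffix is in `suffixes`."""
    for file in Path(root).rglob("*"):
        if file.is_file() and file.suffix in suffixes:
            yield from scan_text(str(file), file.read_text(errors="ignore"))
```

Running `scan_tree(".")` at the repo root gives a first-pass inventory; each hit is a place where a model name would need to move behind configuration.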
The migration cost from one model vendor to another is rarely the API surface. It's the prompts, the evals, the agent scaffolds, and the institutional knowledge your team built around model-specific quirks. Budget for that, not just the per-token delta.
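Because evals are part of what has to move between vendors, it helps to keep them vendor-neutral from the start. A minimal sketch of a harness that runs the same cases against several models follows; `Case`, `run_suite`, and the plain `prompt -> text` callables are hypothetical names chosen for illustration, and real checks would be domain-specific rather than string comparisons.

```python
from typing import Callable, Dict, List, NamedTuple


class Case(NamedTuple):
    prompt: str
    check: Callable[[str], bool]   # returns True when the output is acceptable


def run_suite(
    models: Dict[str, Callable[[str], str]],   # label -> function(prompt) -> output
    cases: List[Case],
) -> Dict[str, float]:
    """Return the pass rate per model over the same list of cases."""
    if not cases:
        raise ValueError("eval suite is empty")
    results: Dict[str, float] = {}
    for name, ask in models.items():
        passed = sum(1 for case in cases if case.check(ask(case.prompt)))
        results[name] = passed / len(cases)
    return results
```

Each entry in `models` would wrap one vendor call; the cases and checks stay untouched when a vendor is added or dropped.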
The longer-term question is whether Anthropic's lead in coding tools persists or whether OpenAI's distribution advantage closes the gap once Codex and the GPT-5 family reach feature parity. Our read: the lead compounds for at least the next two quarters because Claude Code's skills ecosystem is now self-reinforcing — third-party skills are shipping faster than first-party features can be cloned.
If you're standardizing today, the defensible move is dual-vendor: Claude as primary for coding and long-context work, with GPT or Gemini as the fallback for chat-shaped workloads where you want a different model's instincts as a sanity check. The cost of running two providers is small; the cost of being single-vendor when the leader changes is large.
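One way to express the dual-vendor setup is an ordered preference list per workload, with failover to the next backend when a call raises. This is a sketch under assumptions: the workload keys, backend labels, and `prompt -> text` callables are hypothetical, and catching every exception is a simplification that a real system would narrow to timeouts and rate limits.

```python
from typing import Callable, Dict, Sequence, Tuple

Backend = Callable[[str], str]                        # prompt -> output
Routes = Dict[str, Sequence[Tuple[str, Backend]]]     # workload -> ordered (label, backend)


class AllBackendsFailed(RuntimeError):
    """Raised when every backend configured for a workload has errored."""


def complete_with_fallback(workload: str, prompt: str, routes: Routes) -> Tuple[str, str]:
    """Return (label, output) from the first backend that answers without raising."""
    errors = []
    for label, backend in routes[workload]:
        try:
            return label, backend(prompt)
        except Exception as exc:  # broad on purpose for the sketch
            errors.append(f"{label}: {exc}")
    raise AllBackendsFailed("; ".join(errors))
```

A `routes` table might list the primary vendor first for `"coding"` and the other vendor first for `"chat"`, which keeps the preference per workload in one place.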
Originally published at pickuma.com. Subscribe to the RSS or follow @pickuma.bsky.social for new reviews.