Today I got talking with someone about AI coding. I told them it's a strange space. Everyone's bullish on the direction, yet the earliest true believers still can't turn a profit.
I've been in AI from the start, and AI coding was a direction I locked onto early. Back when we were building edge devices, this was the first scenario we went after. But looking back from 2026, trying to build AI coding software looks more and more like a dead end.
The direction isn't wrong. It's too right.
The Completion Era
Roll back to 2021. GitHub Copilot went live: OpenAI's Codex model, wrapped by GitHub into an editor plugin and sold as code completion. That move started the whole race.
By 2023, Copilot had over a million paying users and $100 million in annualized revenue. At the time, those were eye-popping numbers for a brand-new category. But that year, Microsoft itself was still losing over $20 per Copilot user per month. The subscription price couldn't cover inference costs, but no one cared; grab the ecosystem first.
Then the first wave of startups rushed in. The logic was simple: programming is its own vertical, so why not train a model just for code? General models were expensive and slow; a small, specialized coding model should be cheaper and better.
Microsoft itself released Phi-1, a 1.3B-parameter model that scored 50.6% on HumanEval, matching models ten times its size at the time. In China, Tsinghua and Zhipu's CodeGeeX began iterating in September 2022, with a new generation every six months. A Peking University team called Silicon Intelligence open-sourced the 7B CodeShell at the end of 2023, surpassing CodeLlama-7B and StarCoderBase-7B on HumanEval, and released the IDE plugin alongside it. The shared bet: programming data is concentrated, so a specialized model should beat general-purpose ones on both price and quality.
A whole ecosystem of tool-chain builders sprang up too: evaluation, RAG, vector databases, completion integrations. RAG was hot then, and in programming it gave people plenty to pitch.
But in 2023, there was no real business model yet. Model coding ability was shaky, and developers only used it lightly. The whole sector felt like a sure thing that just hadn't popped.
The Workflow Era
The real inflection point was 2024.
In June 2024, Anthropic dropped Claude 3.5 Sonnet and coding ability jumped to a new level. Anysphere's Cursor was the first to plug this model into its own IDE. It wasn't just an API call; they redesigned the whole workflow around it. Reading context, cross-file edits, multi-turn iterations, acceptance-rate feedback—features that had been scattered across completions were finally strung into an actual pipeline.
The business model changed too. Cursor used monthly subscriptions to subsidize the underlying token costs. You paid them $20 a month, and they bundled in Claude API quota. Buy that much token volume straight from Anthropic and you'd easily pay over a thousand dollars. This arbitrage—selling model capacity cheap to buy growth—pushed user numbers up fast. Anysphere hit $500 million ARR and a $9.9 billion valuation in June 2025; by November 2025, its valuation had reached $29.3 billion.
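The arbitrage is easy to sketch numerically. Below is a back-of-the-envelope calculation in which every usage figure is an illustrative assumption, not Cursor's or Anthropic's actual contract terms; the list prices roughly match Claude 3.5 Sonnet's public API pricing at the time (about $3/$15 per million input/output tokens).

```python
# Back-of-the-envelope sketch of the subscription-arbitrage math.
# All usage figures are illustrative assumptions, not real contract terms.

SUBSCRIPTION_PRICE = 20.00          # what the user pays per month (USD)

# Assumed list prices per million tokens, roughly Claude 3.5 Sonnet's
# public API pricing at the time.
INPUT_PRICE_PER_M = 3.00
OUTPUT_PRICE_PER_M = 15.00

def monthly_api_cost(input_tokens_m: float, output_tokens_m: float) -> float:
    """API cost for a month of usage, given token volumes in millions."""
    return input_tokens_m * INPUT_PRICE_PER_M + output_tokens_m * OUTPUT_PRICE_PER_M

# A heavy user pushing, say, 250M input / 25M output tokens a month:
cost = monthly_api_cost(250, 25)
subsidy = cost - SUBSCRIPTION_PRICE
print(f"API cost: ${cost:,.2f}, subsidy per user: ${subsidy:,.2f}")
```

Under these assumed volumes, a single heavy user consumes over $1,100 of list-price tokens against a $20 subscription, which is the "selling model capacity cheap to buy growth" trade in one line of arithmetic.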
But that's where the problems started. Cursor's workflow itself had no real moat.
Why didn't anyone copy it right away? Because only Claude 3.5 Sonnet could actually run that workflow; plug in other models and they were either too slow or too stupid. So the second wave of AI coding companies mostly got stuck between two paths: integrate Claude and race Cursor to the bottom on price, or use their own or open-source models—and watch users bounce after one comparison.
The One-Two Punch of DeepSeek and Claude Code
At the end of 2024, DeepSeek V3 went open source. That changed the game.
V3 still hadn't caught Claude 3.5 Sonnet on coding—82.6% against Claude's 93.7% on HumanEval, a gap of more than ten points. But it was absurdly cheap, with token prices about one-forty-second of Claude's. That price-to-performance ratio meant players who couldn't afford Claude before, especially domestic vendors, could suddenly run the same workflow Cursor had built.
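The price-to-performance argument can be made concrete. The HumanEval scores below come from this article; the absolute prices are illustrative assumptions chosen only to reproduce the ~1/42 price ratio.

```python
# Sketch of the price-to-performance comparison described above.
# Scores are from the article; prices are illustrative assumptions
# chosen to reproduce the ~1/42 ratio, not published rate cards.

claude = {"humaneval": 93.7, "price_per_m_tokens": 15.00}
deepseek = {"humaneval": 82.6, "price_per_m_tokens": 15.00 / 42}  # ~1/42 of Claude

def cost_per_point(model: dict) -> float:
    """Dollars of output tokens per HumanEval percentage point."""
    return model["price_per_m_tokens"] / model["humaneval"]

ratio = cost_per_point(claude) / cost_per_point(deepseek)
print(f"Claude costs ~{ratio:.0f}x more per benchmark point")
```

The point the arithmetic makes: a ten-point quality gap shrinks to nothing once you normalize by price, which is exactly why V3 reopened the market to players who couldn't afford Claude.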
During that stretch, pretty much every AI coding software company integrated V3. Their products started looking identical, and the differentiation window slammed shut fast.
Right after that, Anthropic launched Claude Code in February 2025. No IDE, just a CLI. It chats in your terminal, reads your project, edits code, runs tests. When it first came out, I treated it like a demo. After a week of using it, I had a sudden realization: when the model is strong enough, you might not need an IDE at all.
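The "no IDE, just a CLI" loop is structurally very simple, which is part of why it has no moat. Here is a minimal sketch of that read/act/repeat loop; the model call is a stub standing in for an LLM API, and nothing below is Claude Code's actual implementation.

```python
# Minimal sketch of a CLI agent loop: read the project, ask a model for
# an action, apply it, repeat. call_model is a stub; a real tool would
# call an LLM API here. This is NOT Claude Code's actual implementation.

import pathlib
import subprocess

def read_project(root: str, suffix: str = ".py") -> dict[str, str]:
    """Gather source files as context for the model."""
    return {str(p): p.read_text() for p in pathlib.Path(root).rglob(f"*{suffix}")}

def call_model(context: dict[str, str], instruction: str) -> dict:
    """Stub for an LLM call. Returns an 'action' for the loop to apply."""
    # A real agent would send context + instruction to a model and parse
    # its reply into one of: edit, run_tests, done.
    return {"type": "done", "summary": f"nothing to change for: {instruction}"}

def agent_loop(root: str, instruction: str, max_steps: int = 5) -> str:
    for _ in range(max_steps):
        action = call_model(read_project(root), instruction)
        if action["type"] == "edit":
            pathlib.Path(action["path"]).write_text(action["content"])
        elif action["type"] == "run_tests":
            subprocess.run(action["command"], shell=True)
        elif action["type"] == "done":
            return action["summary"]
    return "step budget exhausted"

print(agent_loop(".", "fix the failing test"))
```

Everything hard lives inside `call_model`; the shell around it is a weekend project. That asymmetry is exactly why the CLI form couldn't stay differentiated.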
The business numbers were even more striking. Claude Code hit $1 billion in annualized revenue within a year, and by early 2026 estimates had it near $2 billion. That growth rate blew the gap between software makers and model makers wide open.
The CLI form itself has no software moat either. Open source quickly produced OpenCode, Aider, Cline, and a bunch of others; Cursor itself later started moving toward a CLI. The result: even the IDE layer, the last place left to differentiate, got flattened.
All Three Layers Were Eaten
Look back at what got eaten.
The first wave of small coding model companies is basically silent now. The original logic was "general models aren't specialized, so I'll build a small one." But every new general model release treated coding as a core benchmark to beat, so the window where specialized small models had an edge barely opened. CodeShell, CodeGeeX, Phi—today they're footnotes in papers, not products.
The middleware folks building toolchains and workflows had it even worse. As models got stronger, all the engineering value built around patching model weaknesses evaporated overnight. RAG integrations, prompt orchestration, and completion enhancers that were selling six months ago were rendered useless by the next model drop.
Even standalone IDEs started feeling the heat. Cursor is still growing, but its core paradigm is being drained away by CLI tools like Claude Code. When the model itself can do most of the work inside a terminal, the IDE's value gets pushed back to the margins.
It isn't that these companies did anything wrong. They picked the right direction and nailed every inflection point. The problem is that this sector is too important and too lucrative, so every foundation model company treats it as a primary battleground. And a primary battleground means the layers above it get cleared out.
From Corsets to Bras
I've been thinking about this recently, and it reminded me of a story from fashion history.
In the early 20th century, women in Europe and America still wore corsets—stiff structures built with steel boning. It was a centuries-old industry, and countless small workshops lived off it. In 1914, a New York woman named Mary Phelps Jacob made a simple brassiere from two handkerchiefs and ribbon, and patented it. No one took it seriously at the time.
The real turning point was World War I. The U.S. War Department asked women to stop wearing corsets—not because they cared about women's comfort, but because the steel bones had to be melted down for weapons. One order killed the entire market.
Most corset makers died. But one company, Warner Brothers Corset, read the shift and switched to bras, hitting $12.6 million in annual revenue by the 1920s.
What's interesting is that the underlying demand never changed. Women still wanted to look good; only the vessel changed. First corsets, then bras. The corset makers died because they were defending a product shape, not a need.
AI coding is at that stage now. Everyone who bet on this direction five years ago was right. Developers using AI to write code, efficiency up tenfold, production relations changing—it's all happening. No one called it wrong. But the vessel carrying that demand is no longer IDE plugins, workflow software, or specialized small models. It's converging fast on "foundation model + CLI/agent."
Everything caught in the middle is being fast-tracked to obsolescence.
What Can Still Be Sold
The sector has been eaten, but the demand is still there. From another angle, what's left to sell probably collapses into three buckets: quality, cost, and everything else.
First, quality. Programming is high-stakes; one bad line can blow up production. Developers will keep paying extra for output that's more accurate and more reliable. That demand will only grow, but the carriers are mostly foundation model companies. If you can push your model to the front of the pack, the business works. The so-called "high-quality token" is essentially this tier.
Second, cost. Same capability, lower price wins. Open-source models, inference optimization, caching, scheduling—every layer has room to cut costs. This is actually an opening for application-layer companies, because not every model company is good at engineering optimization, and not all of them want to push prices to the floor.
Third, everything else. This bucket is bigger than the first two combined. Right now most teams adopt AI coding chaotically: everyone runs their own setup, and the efficiency gains never accrue to the organization. Stitching workflow, code review, knowledge retention, and accountability into agent work patterns is a real need. Further out, security, privacy, recoverability, and compliance haven't really unfolded yet, but they'll soon be the core reasons big companies pay. None of this looks like traditional SaaS; it's consulting, engineering, and training rolled together, and the pricing has to be rebuilt from scratch.
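Of the three buckets, cost is the most mechanical, and one of its levers is simple enough to sketch: don't pay for the same inference twice. The toy cache below is illustrative only; real serving stacks cache at the prompt-prefix or KV-cache level, and `expensive_model_call` is a stub standing in for a paid API call.

```python
# Toy sketch of one cost lever: memoize identical model calls so repeated
# prompts (re-reading the same files, re-running the same review) don't
# pay for inference twice. Real systems cache inside the serving stack;
# expensive_model_call is a stub for a paid API call.

import hashlib

_cache: dict[str, str] = {}
calls_to_backend = 0

def expensive_model_call(prompt: str) -> str:
    """Stub for a paid inference call."""
    global calls_to_backend
    calls_to_backend += 1
    return f"completion for: {prompt}"

def cached_call(prompt: str) -> str:
    key = hashlib.sha256(prompt.encode()).hexdigest()
    if key not in _cache:
        _cache[key] = expensive_model_call(prompt)
    return _cache[key]

cached_call("review this diff")
cached_call("review this diff")   # served from cache; no second backend call
print(f"backend calls: {calls_to_backend}")
```

This is the kind of engineering optimization the cost bucket rewards: invisible to the user, pure margin to whoever ships it.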
No matter which bucket, the old model of "how much for this software package" is basically broken. You either convert to tokens, or convert to services.
Closing Thoughts
We've watched this pattern repeat throughout the AI wave. The more certain your directional bet, the faster your original business form dies. Getting the direction right means the space matters; if it matters, foundation model companies will turn it into a primary battleground; and that means every layer above gets stripped away.
AI coding is the clearest example of this pattern so far, but probably not the last. In the coming months I expect AI customer service, AI design, AI sales, and other domains to go through the same cycle.
Small companies can turn on a dime. The most dangerous spot is the middle—mid-sized companies with sales, delivery, and customer-success teams already built around selling software. They're the ones who find it hardest to walk away from their engineering assets and pick a new angle.
If you're grinding away in AI coding too, or you've seen a similar "being eaten" process in another sector, feel free to leave a comment.
References
- Microsoft has over a million paying GitHub Copilot users (2023)
- GitHub Copilot subscriber count and revenue growth (CIO Dive)
- Phi-2: The surprising power of small language models (Microsoft Research)
- CodeGeeX: A Multilingual Code Generation Model (Tsinghua KEG)
- Silicon Intelligence CodeShell-7B Open Sourced (ITHome)
- WisdomShell/codeshell (GitHub)
- Introducing Claude 3.5 Sonnet (Anthropic)
- Cursor AI coding startup valuation $9.9B (CNBC)
- Cursor maker raises at $29.3B valuation (Bloomberg)
- DeepSeek V3 release notes (DeepSeek API Docs)
- DeepSeek-V3 vs Claude 3.5 Sonnet detailed comparison (DocsBot AI)
- Claude Code product page (Anthropic)
- Anthropic Claude Code revenue (The Information)
- From Bombs to Bras: WWI Conservation Measures (Connecticut History)
- How World War I Helped Women Ditch the Corset (HISTORY)