Frontier AI is getting more expensive while open models keep getting cheaper

#pricing #openweightmodels #inference #economics

The price of frontier closed AI models is rising while open-weight models keep getting cheaper, splitting the AI market into a premium, gated tier and a cheap, abundant tier. According to an analysis by the inference company Doubleword, the gap between the two is widening, and government access policies are pushing it wider.

Key facts

What: Providers of closed frontier models are raising prices and tightening access just as makers of Chinese open-weight models slash theirs, a structural reversal with big consequences for who builds with AI.
When: 2026-06-26

API pricing is set by two factors: how expensive the model is to run, and how much pricing power the provider has. For a while, competition and efficiency gains pushed prices down across the board. That shared downward trend has now split in two. The top closed models (the new GPT-5.6 flagship and Anthropic's Fable and Mythos line) are priced at a premium and, in some cases, gated behind government clearance. Meanwhile, Chinese open-weight models like DeepSeek's latest have seen permanent price cuts, and you can download them and run them yourself for the cost of hardware.

Frontier labs are spending colossal sums on training and on the specialized chips to serve their models, and as their models pull ahead on the hardest tasks, they can charge for that lead. At the same time, the supply of strong open-weight models keeps growing. Open models compete on price because no single company controls them, anyone can host them, and hosts undercut each other. The result is a market splitting into a premium, gated tier and a cheap, abundant tier, with the middle hollowing out.

Doubleword's own approach illustrates one way the cheap tier gets cheaper: not every job needs an instant answer. The company offers async and batch processing, where you accept a wait—sometimes up to a day—in exchange for a steep discount. For workloads like running AI agents overnight, scoring thousands of evaluations, or bulk-processing documents, latency does not matter, so paying a premium for instant responses is pure waste.

Think of the split as shipping. Frontier closed models are overnight express: fast, premium-priced, and increasingly requiring a verified account to use the fastest tier. Open models on batch infrastructure are ground freight: slower, dramatically cheaper, and good enough for anything that is not urgent. The smart operator reserves express for the few jobs that truly need it and sends the rest by freight. As frontier express prices rise and gates go up, more cargo moves to freight.
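The express-versus-freight rule can be written down as a small routing function. The sketch below is illustrative only: the prices, the 24-hour batch window, and the names `Job`, `route`, and `cost_usd` are all assumptions made for the example, not figures or APIs from Doubleword or any other provider.

```python
from dataclasses import dataclass

# Hypothetical placeholder prices (USD per million tokens). These are NOT
# quotes from any provider; substitute numbers measured on your own workload.
EXPRESS_PRICE_PER_M_TOKENS = 10.00  # assumed realtime, closed frontier model
FREIGHT_PRICE_PER_M_TOKENS = 1.00   # assumed batch, open-weight model
FREIGHT_MAX_WAIT_HOURS = 24.0       # assumed worst-case batch turnaround


@dataclass
class Job:
    name: str
    tokens: int                    # total tokens the job will consume
    deadline_hours: float          # how long the caller can wait for results
    needs_frontier: bool = False   # True if only a frontier model will do


def route(job: Job) -> str:
    """Send a job by 'express' only if it is urgent or needs a frontier model."""
    if job.needs_frontier or job.deadline_hours < FREIGHT_MAX_WAIT_HOURS:
        return "express"
    return "freight"


def cost_usd(job: Job) -> float:
    """Estimated cost of a job on the tier that route() picks for it."""
    if route(job) == "express":
        price = EXPRESS_PRICE_PER_M_TOKENS
    else:
        price = FREIGHT_PRICE_PER_M_TOKENS
    return job.tokens / 1_000_000 * price


jobs = [
    Job("live chat reply", tokens=2_000, deadline_hours=0.001),
    Job("overnight agent run", tokens=5_000_000, deadline_hours=24),
    Job("bulk document processing", tokens=20_000_000, deadline_hours=48),
]

for job in jobs:
    print(f"{job.name}: {route(job)}, ${cost_usd(job):.2f}")
```

With real prices plugged in, the same loop shows how much of a monthly bill comes from jobs that never needed an instant answer.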

This reversal, layered on top of government vetting for the best closed models, is pushing builders toward open weights, not as a budget compromise but as a strategy. If your application can run on a strong open model you host yourself, you are insulated from price hikes, rate limits, and the risk that a model you depend on gets switched off by a directive, as nearly happened with Mythos. For startups and researchers without government clearance, the open tier is increasingly the only frontier they can actually reach.

Cheaper is not free, and open is not effortless. Doubleword is a vendor making a vendor's argument, and its cost comparisons are its own, so treat the specific multipliers as marketing until you measure your own workload. Running open models yourself means owning the hardware, the scaling, the reliability, and the security—a real operational burden that the per-token price hides. And the very best models, for now, still tend to live on the closed, premium, gated side of the line. The reversal is real and important, but it is a shift in the trade-offs, not a verdict that one side has won.
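One way to measure your own workload is to turn self-hosting costs into an effective per-token price and compare that with an API quote. The sketch below assumes a simple model: a fixed hourly cost covering hardware, power, and operations; a sustained throughput; and a utilization fraction. The function name and every input value are hypothetical.

```python
def self_hosted_cost_per_m_tokens(
    hourly_cost_usd: float,    # amortized hardware + power + ops, per hour
    tokens_per_second: float,  # sustained throughput at full load
    utilization: float,        # fraction of each hour spent serving, in (0, 1]
) -> float:
    """Effective USD per million tokens for a self-hosted deployment."""
    if hourly_cost_usd < 0:
        raise ValueError("hourly_cost_usd must be non-negative")
    if tokens_per_second <= 0:
        raise ValueError("tokens_per_second must be positive")
    if not 0 < utilization <= 1:
        raise ValueError("utilization must be in (0, 1]")
    tokens_per_hour = tokens_per_second * 3600 * utilization
    return hourly_cost_usd / tokens_per_hour * 1_000_000


# Hypothetical inputs: a $4/hour machine sustaining 1,000 tokens/second.
for utilization in (1.0, 0.5, 0.1):
    price = self_hosted_cost_per_m_tokens(4.0, 1_000, utilization)
    print(f"utilization {utilization:.0%}: ${price:.2f} per million tokens")
```

The point of the exercise is the utilization term: idle hardware still costs money, so a machine that sits mostly unused can end up more expensive per token than the hosted price it was meant to beat.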

Originally published on Ground Truth, where every claim is checked against the primary source.