GitHub Copilot's flat fee is gone. Flat-rate rivals smell an opening.

Santosh — Mon, 08 Jun 2026 05:17:02 +0000

GitHub Copilot moved all of its plans to usage-based billing on June 1, replacing the old premium-request system with one that meters token consumption, according to GitHub's own announcement. The change ends the flat seat price that made Copilot easy to budget, and several smaller coding tools are using the moment to pitch developers who want a bill they can predict.

Under the new model, every plan now ships a monthly allotment of GitHub AI Credits, where one credit equals one cent of model usage. Copilot Pro stays at $10 a month and includes 1,500 credits. Pro+ stays at $39 with 7,000 credits. A new Max plan arrives at $100 with 20,000 credits. Once the allotment runs out, GitHub says each additional credit is charged to the card on file at the same penny-per-credit rate.
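The billing rule above reduces to a few lines of arithmetic. The sketch below is only an illustration built from the prices and allotments quoted in this article; the function and table names are made up for the example and are not part of any GitHub API.

```python
# Plan prices and credit allotments as quoted in the article.
# One credit equals one cent, so everything is kept in integer cents.
PLANS = {
    "Pro": {"price_cents": 1_000, "credits": 1_500},
    "Pro+": {"price_cents": 3_900, "credits": 7_000},
    "Max": {"price_cents": 10_000, "credits": 20_000},
}


def monthly_bill_cents(plan: str, credits_used: int) -> int:
    """Return the month's total in cents: seat price plus any overage."""
    info = PLANS[plan]
    # Credits beyond the allotment are billed at one cent each.
    overage_credits = max(0, credits_used - info["credits"])
    return info["price_cents"] + overage_credits


if __name__ == "__main__":
    for used in (1_000, 1_500, 5_000):
        total = monthly_bill_cents("Pro", used) / 100
        print(f"Pro, {used} credits used: ${total:.2f}")
```

On these numbers a Pro subscriber who burns 5,000 credits pays the $10 seat price plus $35 of overage.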

For light users, little changes. For anyone running sustained agentic workflows, the math is different, because usage is calculated on input, output and cached tokens at each model's API rate. A developer who leaves an agent running can move from a fixed monthly line item to an open-ended one.
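To see how token counts turn into credits, here is a rough sketch. The per-million-token rates are hypothetical placeholders chosen for round numbers, not GitHub's or any model vendor's published prices, and the function name is invented for the example.

```python
# HYPOTHETICAL rates in US dollars per million tokens.
# These are placeholders for illustration, not any vendor's real prices.
RATES_PER_MILLION = {"input": 3.00, "output": 15.00, "cached": 0.30}


def credits_for_usage(input_tokens, output_tokens, cached_tokens,
                      rates=RATES_PER_MILLION):
    """Convert token counts into credits, where one credit is one cent."""
    dollars = (
        input_tokens * rates["input"]
        + output_tokens * rates["output"]
        + cached_tokens * rates["cached"]
    ) / 1_000_000
    return dollars * 100  # dollars to cents, and cents are credits


if __name__ == "__main__":
    used = credits_for_usage(1_000_000, 100_000, 2_000_000)
    print(f"Credits consumed: {used:.0f}")
```

Under these assumed rates, an agent session that reads a million fresh input tokens would use about a third of the Pro allotment on its own, which is why a long unattended run is the case to watch.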

That is the opening the flat-rate tools are aiming at.

"The whole reason people liked a flat fee was that the worst case was knowable," said Santosh Arron, who builds Dropstone, an AI coding CLI that charges one monthly price with no usage overage. "The minute the worst case is whatever your agent did overnight, the product feels different to leave running."

Dropstone charges $15 a month for a plan it says sustains about 450 coding turns a week, with no per-token overage and no five-hour reset window. The company defines a turn as one full exchange with the agent at roughly 15,000 input and 800 output tokens, and says anyone can run the same conversion against a vendor's published quota. It keeps the cost flat, the company says, by running on cache-friendly open-weight models and amortizing the repeated parts of each session, where it says it measures a cache-hit rate above 95% in sustained use.

The trade-off is the usual one for the cheaper tools. They run on open-weight models rather than a closed lab's flagship. Dropstone says it leads SWE-bench Verified at 91.2%, ahead of Claude Opus 4.7 at 87.6%, but concedes Opus still wins the harder SWE-bench Pro, that GPT-5.5 Pro leads on long-context coding, and that Gemini edges it on multilingual translation.

GitHub's case for the change is that usage-based billing matches what developers actually consume, and that the credit allotments cover typical use. The company also kept Copilot Business at $19 per user and handed existing business and enterprise customers promotional credits for June, July and August to ease the shift.

Whether developers move will depend on how often they hit an overage they did not expect. The Pragmatic Engineer's 2026 survey found that 70% of developers already run two to four AI coding tools at once, so the realistic outcome is not a wholesale switch but a reshuffling of which tool runs the all-day, leave-it-running work. The flat-rate tools are betting that job is now theirs to take.

For developers comparing the options, I wrote a fuller breakdown of how the major plans stack up once you normalize them into the same unit, including where the cheaper ones win and where they still lose: https://blankline.org/research/dropstone-1-5.

Dropstone is betting developers are tired of getting rate-limited

Santosh — Mon, 08 Jun 2026 04:55:10 +0000

Dropstone, an AI coding CLI built by Blankline Research, is pitching developers on a single idea: an entire workday of agentic coding for $15 a month, with no session limit and no reset window.

The company's argument is that the thing slowing developers down is not the model. It is the quota. Both Claude Code Pro and OpenAI's Codex run on rolling five-hour reset windows, and Dropstone says a developer on either plan can lose roughly five hours of a nine-to-nine workday waiting for the meter to refill.

"You do not need a faster model when you are mid-debug. You need the one you have to keep going," said Santosh Arron, who builds Dropstone. "The reasoning is usually fine. The lockout is the problem."

The pitch lands at a moment when developers are leaning on these tools harder than ever. Around 84% now use or plan to use AI coding assistants, according to industry surveys published this year, and more than half use them daily. The more central the tool, the more a midday lockout costs.

The comparison problem

Part of what Dropstone is selling is a way to compare plans at all.

Every vendor publishes its quota in a different unit. Anthropic sells hours of Sonnet or Opus per week. OpenAI sells messages or Codex hours. Google sells tokens per day. Cursor sells requests per month. None of them answer the question a working developer actually asks, which is how many hours of coding the plan buys before it stops.

To compare them, Dropstone converts every plan into one unit it calls a heavy-coding turn: one full exchange with the agent, sized at roughly 15,000 input tokens and 800 output tokens, which the company says reflects a typical agentic turn in production. The conversion is mechanical, and the company says anyone can run the same math against each vendor's published quota.
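For a quota published in tokens, the conversion can be sketched as below. The turn size comes from the company's definition; the one-million-tokens-per-day quota is a made-up figure for illustration, and the sketch assumes the quota counts input and output tokens together, which a given vendor may not do.

```python
# Turn size as defined by Dropstone in the article.
INPUT_TOKENS_PER_TURN = 15_000
OUTPUT_TOKENS_PER_TURN = 800
TOKENS_PER_TURN = INPUT_TOKENS_PER_TURN + OUTPUT_TOKENS_PER_TURN


def turns_from_token_quota(tokens_per_period: int) -> int:
    """Whole heavy-coding turns that fit into a token quota."""
    return tokens_per_period // TOKENS_PER_TURN


def weekly_turns_from_daily_tokens(tokens_per_day: int) -> int:
    """Scale a daily token quota to a week, then convert to turns."""
    return turns_from_token_quota(tokens_per_day * 7)


if __name__ == "__main__":
    # HYPOTHETICAL quota, not any vendor's published number.
    daily_quota = 1_000_000
    print(weekly_turns_from_daily_tokens(daily_quota), "turns per week")
```

Quotas published in hours, messages or requests need an extra assumption about how many tokens each unit consumes, so those conversions are less mechanical than this one.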

On that basis, Dropstone says its $15 Pro plan sustains about 450 turns a week. It puts Claude Code Pro, at $20 a month, at roughly 150 turns before a recent boost and about 225 after. The gap between 450 and 225 is the doubling the company keeps pointing to.
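Folding price into the comparison gives a cost per turn. The sketch below uses only the prices and turn counts Dropstone cites; the 52/12 weeks-per-month factor and the function name are assumptions added for the example.

```python
# Assumption: an average calendar month has 52 / 12 weeks.
WEEKS_PER_MONTH = 52 / 12


def dollars_per_turn(monthly_price: float, turns_per_week: float) -> float:
    """Monthly price divided by the turns a plan sustains in a month."""
    return monthly_price / (turns_per_week * WEEKS_PER_MONTH)


if __name__ == "__main__":
    # Figures as claimed by Dropstone in the article.
    plans = {
        "Dropstone Pro": (15, 450),
        "Claude Code Pro (after boost)": (20, 225),
        "Claude Code Pro (before boost)": (20, 150),
    }
    for name, (price, turns) in plans.items():
        print(f"{name}: ${dollars_per_turn(price, turns):.4f} per turn")
```

On the company's own figures the per-turn price gap is wider than the 2x gap in turn counts, because the plan with more turns is also the cheaper one.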

Why the cost is lower

The gap comes down to caching, not a cheaper model alone.

Inside a single coding session, most of the input repeats from turn to turn. The system prompt does not change. The tool definitions do not change. The open files move slowly. When the inference provider caches that repeated prefix, every turn after the first is served at a steep discount.

Dropstone says it measured this on real sessions. The first turn hits nothing, because the cache is cold. By the sixth turn in a sustained session the hit rate holds above 95%, and long sessions push past 99%. Across a mix of short and long sessions the average sits around 82%. With the provider discounting cached input by roughly 92%, the company says those hit rates cut the per-turn cost by close to an order of magnitude, which is what lets a $15 plan clear 450 turns instead of 200.
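The arithmetic behind that claim can be checked with a short sketch. It assumes the simplest possible model, in which a cached input token costs 8% of a fresh one and everything else is unchanged; the function name is invented for the example, and the hit rates and discount are the figures the company cites.

```python
def input_cost_multiplier(hit_rate: float, cache_discount: float = 0.92) -> float:
    """Fraction of the uncached input cost still paid at a given hit rate.

    Simplified model: cached tokens are billed at (1 - cache_discount)
    of the normal input price, uncached tokens at full price.
    """
    return 1.0 - hit_rate * cache_discount


# Hit rates as reported by Dropstone in the article.
for label, hit in [
    ("mixed-session average", 0.82),
    ("sustained session", 0.95),
    ("long session", 0.99),
]:
    m = input_cost_multiplier(hit)
    print(f"{label}: pays {m:.1%} of uncached input cost, about {1 / m:.1f}x cheaper")
```

Under this model the sustained-session figure works out to roughly an eightfold reduction in input cost and the long-session figure to about elevenfold, which is consistent with "close to an order of magnitude". The 82% blended average yields closer to fourfold.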

There is a limit to the trick, and the company concedes it. Caching discounts only the input. Output tokens are generated fresh every turn at full price, so a workload dominated by long generated output sees less of the benefit.
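A rough sketch shows how quickly output erodes the saving. The 5x output-to-input price ratio is a hypothetical value picked for illustration, not a quoted rate, and costs are expressed in units of one uncached input token.

```python
def turn_cost_units(input_tokens=15_000, output_tokens=800, hit_rate=0.0,
                    cache_discount=0.92, output_price_ratio=5.0):
    """Cost of one turn, in units of one uncached input token.

    output_price_ratio is a HYPOTHETICAL multiple of the input price;
    real ratios vary by model and provider.
    """
    # Only the input side benefits from the cache discount.
    input_cost = input_tokens * (1.0 - hit_rate * cache_discount)
    # Output is generated fresh every turn at full price.
    output_cost = output_tokens * output_price_ratio
    return input_cost + output_cost


if __name__ == "__main__":
    for out_tokens in (800, 8_000):
        cold = turn_cost_units(output_tokens=out_tokens)
        warm = turn_cost_units(output_tokens=out_tokens, hit_rate=0.95)
        print(f"{out_tokens} output tokens: {cold / warm:.2f}x cheaper with a warm cache")
```

With the company's 800-token turn the whole-turn saving under these assumptions is a little over 3x, smaller than the input-only figure, and a turn that generates ten times as much output barely benefits at all.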

Where it loses

Dropstone runs on open-weight models rather than a closed lab's flagship, and the company is direct about what that costs.

On SWE-bench Verified, a standard benchmark of real GitHub issues, the company puts its Pro tier at 91.2%, ahead of Claude Opus 4.7 at 87.6%, GPT-5.5 Pro at 85.0% and Gemini 3.1 Pro at 80.6%. On a broader agentic-coding composite it scores 85.2% at the Pro tier and 90.2% at the Heavy tier, trading blows with Opus-class models on routine work like bug fixes, feature builds and test writing.

That focus on routine tasks fits how developers actually work. The Pragmatic Engineer's 2026 survey found that 70% of developers run two to four AI coding tools at once. Dropstone is not pitching a replacement for the hardest one-shot reasoning tasks. It is pitching the all-day tool that does not lock you out at lunch.

The product also runs inference on US-hosted providers with no retention, the company says, and gates every file write and shell command behind explicit user approval. There is a free tier, and the CLI installs from npm.

Whether developers switch will come down to whether the 450-turn claim holds up in their own workday. Dropstone's bet is that enough of them have hit the two o'clock wall to find out.

DEV Community: Santosh
