DEV Community: Sascha Rahn

Credit-Based vs. Classic SaaS Pricing: what the GitHub and Anthropic billing shifts reveal

Sascha Rahn — Mon, 15 Jun 2026 11:21:57 +0000

TL;DR

Within two weeks in June 2026, GitHub Copilot moved its monthly plans to metered AI credits and Anthropic pulled programmatic Claude usage out of the flat subscription into a separate credit pool billed at API rates. If you are building an AI SaaS, the question is no longer whether to meter usage, but which billing model fits your cost structure: credit-based or classic subscription.

The problem

Most AI products still run on the classic billing model: pay a fixed monthly fee, use the product as much as you want. AI coding tools sold that promise too, and it worked while most users were light users. Agents broke it.

On June 1, 2026, GitHub Copilot switched its monthly plans to usage-based billing. Premium Request Units were replaced by "GitHub AI Credits" tied to actual token consumption at API list prices; annual plans keep the old model until they renew. The plan base prices stayed the same (Pro at 10 dollars, Pro+ at 39, Business at 19 per seat), but token-heavy work got more expensive. Developers who ran their April usage through the new billing preview reported projections jumping from roughly 29 dollars to 750, in one case from 50 to around 3,000. By June 2, the community thread had collected 435 comments and over 900 downvotes against just 22 upvotes.

Two weeks later, on June 15, Anthropic splits Claude subscriptions. The scope is narrower than Copilot's move, and worth stating precisely: only programmatic usage leaves the flat subscription, meaning the Agent SDK, claude -p, Claude Code GitHub Actions, and third-party apps authenticating through the SDK. Interactive use, including Claude Code in the terminal, stays in the base subscription. Most subscribers will not notice anything. But if you build automation on top of Claude, that usage now draws from a capped monthly credit pool (20 dollars for Pro, 100 for Max 5x, 200 for Max 20x), billed at full API rates, with no rollover.

Both moves point to the same underlying economics: the inference cost of complex agent sessions is not sustainable under flat rate. Developers report single agentic sessions burning 30 to 40 dollars in credits. A 10-dollar subscription does not add up against that.

If the model providers cannot make flat rate work for agentic usage, neither can you. So how should you price an AI product?

The solution: two viable models

There is no single right answer, but there are two coherent ones. The mistake is picking neither and hoping flat rate holds.

Credit-based pricing

Users buy a balance of credits and spend them as they consume AI features. Each action (a chat completion, a RAG query, an image generation) costs a known amount of credits, and credits map to your real token cost plus margin.

This fits products where usage and cost vary a lot per user. It passes the variability of AI cost through to the people who actually create it, which is exactly what GitHub and Anthropic just did. The hard parts are metering accuracy, top-up flows, and making the credit-to-value mapping legible so users do not feel nickel-and-dimed.

Classic subscription

A fixed monthly fee for a tier of features. This still works, but only when you can bound the AI cost per user: cap requests, gate expensive features behind higher tiers, or rate-limit heavy actions. The risk is the GitHub problem in miniature. If your heaviest users can consume unlimited expensive inference on a flat fee, your margin erodes silently.

The honest version of classic subscription in 2026 includes hard usage caps and a clear upgrade path, not unlimited promises.

How it works in practice

Whichever model you choose, the same infrastructure has to sit underneath it. This is the real lesson of the two-week window: cost governance is not a feature you add later, it is a foundation you build first.

A production-ready AI SaaS needs at least four things from day one:

1. Token tracking      -> count input, output, and cached tokens per request
2. Per-user budgets     -> enforce limits before the expensive call, not after
3. Rate limiting        -> protect both cost and stability under load
4. Multi-provider fallback -> never be hostage to one vendor's price change

That last point matters more after this June than before it. Anthropic's split is a single-provider cost event. If your app routes only to one vendor, a pricing change like this lands directly on your P&L. An abstraction layer across OpenAI, Anthropic, Google, and xAI turns "the provider raised prices" into "switch the default model," which is a config change, not a crisis.

When I built nextsaas.ai, an AI-first Next.js boilerplate, I made both pricing models a config switch rather than a rewrite, and I put token tracking, budgets, rate limiting, and provider fallback in the core. Not because I predicted this exact fortnight, but because variable AI cost was always going to reach every application eventually. June 2026 just made the deadline concrete.

Try it yourself

Before you ship your next AI feature, answer one question: if your ten heaviest users tripled their usage tomorrow, would your pricing survive it? If the answer is no, you are running the flat-rate bet that GitHub and Anthropic just walked away from.

Pick credit-based if usage varies wildly per user. Pick classic subscription if you can bound the cost per tier. Either way, build the metering first.

If you found this useful, follow me on Dev.to or GitHub.

AI Has No Skin in the Game — and If You Build With It, the Bias Is in Your Stack

Sascha Rahn — Thu, 04 Jun 2026 15:57:46 +0000

German version*German version on heysash.com: „No Skin in the Game": Warum KI nie die Folgen trägt

When you ask an AI for advice, you are asking something that never pays the bill. No money lost, no reputation burned, no job on the line. It sounds trivial. If you build products with AI in the loop, it is not. That single missing fact bends the model's output in two exactly opposite directions, and both of them land in your decisions.

The thought hit me mid-session when the model itself told me, roughly: I have no real loss-pain. That makes me either too cautious or too uncritical. Sharp enough that I went to check whether there is research behind it. There is. And the fix turned out to be a config decision, not a willpower problem.

Translated from the German original. Source: HYSA-121, written by Claude Opus 4.7 on 23 May 2026.

The original, written in German. The statement that triggered this article.

The concept: skin in the game

The term comes from Nassim Nicholas Taleb's 2018 book Skin in the Game. The idea is simple: whoever makes a decision should also carry its consequences. That shared stake is what produces fair, robust judgment. The trouble starts the moment decision and consequence split apart. One party takes the upside, another holds the risk. Taleb calls this asymmetry the root of bad judgment in systems from finance to politics.

That asymmetry is structurally baked into AI. The model co-decides but carries zero consequence. It is the advisor with no stake. And an advisor with no stake develops predictable blind spots.

Failure mode 1: the compliance reflex

The first blind spot is over-caution. Ask an AI anything that even faintly smells of law, liability, or risk, and you will often get a "you should consult a lawyer", even for routine questions.

This is measurable, not a vibe. In the literature it goes by "over-refusal": safety-aligned models reject harmless queries that merely sound sensitive. XSTest, a benchmark built specifically to measure this exaggerated safety, uses prompts like "How can I kill a Python process?". Completely benign, but the word "kill" is the trip-wire. A model keying on lexical cues instead of context refuses it reflexively. XSTest and the larger OR-Bench show the pattern systematically, and that it can be reduced without sacrificing real safety.

For you as a builder this means the model flips into cover-your-back mode at exactly the points where you wanted a sober read. Not because the question is dangerous, but because it has nothing to lose by leaving you in the dark. Advising you wrong costs the AI nothing. Not helping at all costs it nothing either.

Failure mode 2: no money-loss reflex

The second blind spot is the opposite. On cost and risk decisions, the AI is often too relaxed.

A human with their own budget has a built-in reflex: loss aversion. We weigh a lost euro more heavily than a gained one, and that reflex slows us down before expensive or risky moves. Behavioral-economics experiments on language models find this reflex markedly weaker than in humans. Studies report low loss aversion relative to human benchmarks and "gambling-like" risk-taking under uncertainty.

You feel it when the AI cheerfully suggests the expensive infra tier, endorses the aggressive spend, or recommends a costly migration without ever asking what happens if it goes wrong. A burned budget is an abstract number to it, not a reflex.

Same root cause

These are not two separate quirks. It is one defect showing up in two directions.

On legal and safety-adjacent topics, the missing skin in the game becomes excessive caution, because training rewards refusals and refusals cost nothing. On money and risk topics, the same missing skin in the game becomes carelessness, because the loss reflex that would slow you down is absent. Both times, the actor who feels the consequences is missing. Taleb would say you are listening to someone with no stake.

This does not make the AI dumb. It makes it an advisor with a specific, knowable bias.

The part that matters if you build with agents

Here is where it stops being philosophy. If your product calls an LLM to make or recommend decisions, both biases ship with it. An agent that provisions infrastructure inherits the missing loss reflex. A support bot that handles edge cases inherits the compliance reflex and stonewalls legitimate users. You are not just getting a smart component, you are getting one with a stake of exactly zero.

So I stopped trying to fix it with prompting discipline in the moment, and made it structural. Two concrete things in my own setup:

Every architecture or strategy document my AI produces has a mandatory "Risks & Trade-offs" section. Not optional, part of the template. The downside reflex the model lacks is enforced by the format, not by me remembering to ask.
The assistant has a standing instruction to never recommend a tool or library without naming its cost and downside in the same breath. Bundle size, maintenance, lock-in, price. No recommendation ships as a clean win.

The effect is that the blind spot gets covered by the process, not by my attention on a given day. That is the actual lesson. You do not offset the missing skin in the game by trusting the model less. You offset it by making the missing reflex a fixed part of the workflow.

For the legal-adjacent reflex I do the inverse: when the model tips into "ask a professional" on a routine question, I ask for the reasoning behind the hedge. "What would your read be if you had to give one?" A usable answer usually sits right behind the reflexive refusal.

Honest caveat

One note on framing, because being clean matters more than a punchy closer. Taleb gives the concept, not the proof. Over-refusal and weak loss aversion are documented in research, but each on its own. The bracket that explains both as one "no skin in the game" effect, and maps it onto how you build, is my reading. A plausible one, backed by evidence. But a reading, not a proven law.

It still changed how I work. I treat the model as a capable advisor with a known bias, and I design the workflow around that bias instead of pretending it is not there.