DEV Community: Arnon Shimoni

How to design pricing for AI APIs and LLM-powered products

Arnon Shimoni — Thu, 04 Jun 2026 15:31:57 +0000

AI API pricing comes down to six decisions, in order: what to meter, which primitive to charge for, what to charge per unit, how to structure tiers, hard cap or soft cap, and how the credit wallet behaves. This guide walks through each one with worked examples and dollar amounts. There's a Claude prompt at the end you can paste in to diagnose your own pricing.

I'm writing this because I was talking to a founder who showed me his pricing page, which said "$0.02 per 1k tokens"… I asked what the customer sees on the invoice, and… I didn't like the answer. So, this is for founders, PMs, and monetization leads about to make the decisions that will haunt the next 18 months of their P&L.

The structure of an AI pricing plan

With classic API pricing, a plan is the subscription tier (Free, Pro, Scale).

Inside each plan there are phases (a 14-day trial, then the default phase) that make it up.

Inside each phase there are rate cards that tie a meter to a price and an entitlement. During a trial, you may get more features than during the "evergreen" phase.
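The plan, phase, and rate card nesting can be sketched as plain data classes. This is a minimal illustration: the class names, field names, and the trial numbers are assumptions, not any particular billing system's schema.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass
class RateCard:
    meter: str          # what is counted, e.g. "credits"
    unit_price: float   # price per unit of the meter, in dollars
    entitlement: int    # units included while this phase is active


@dataclass
class Phase:
    name: str                      # e.g. "trial" or "default"
    duration_days: Optional[int]   # None means open-ended ("evergreen")
    rate_cards: list[RateCard] = field(default_factory=list)


@dataclass
class Plan:
    name: str                      # subscription tier: Free, Pro, Scale
    phases: list[Phase] = field(default_factory=list)

    def phase_on_day(self, day: int) -> Phase:
        """Return the phase active `day` days after signup (day 0 = signup)."""
        elapsed = 0
        for phase in self.phases:
            if phase.duration_days is None or day < elapsed + phase.duration_days:
                return phase
            elapsed += phase.duration_days
        return self.phases[-1]


# Hypothetical Pro plan: a 14-day trial, then the default phase.
pro = Plan("Pro", [
    Phase("trial", 14, [RateCard("credits", 0.0, 1_000)]),
    Phase("default", None, [RateCard("credits", 0.005, 100_000)]),
])
```

Keeping phases as an ordered list means a trial extension is a change to one `duration_days` value rather than a new plan.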

AI pricing adds three twists to this structure.

The meter is no longer just "requests." It's tokens (input and output), credits (abstracted units), or outcomes (a completed task, a resolved ticket, a generated document).

The rate card has to handle multiple model costs sitting under one customer-facing price.

The entitlement is a wallet, not a counter. Credits accumulate, expire, roll over, and get spent across multiple features.

Also, phases matter more than they did in classic SaaS, because AI pricing changes more often. A model provider cuts their price, you ship a new credit rate, or a customer extends their trial - each of those creates a phase transition, not necessarily just a contract amendment.

That gives you six decisions, in order.

Step 1: Pick what you're metering

The meter is the thing you count. Pick it wrong and you'll fight your customers about whether the bill is fair for the next two years.

For AI products, four meters dominate:

Input + output tokens (OpenAI, Anthropic, Mistral)
Credits (Cursor, Notion AI, Linear AI)
Outcomes (Intercom Fin at $0.99 per resolved conversation)
Compute time

The meter has to correlate with two things at once: customer-perceived value, and your cost of goods. A flat per-request meter on a multi-model product will wreak havoc on your (already thin) gross margin the moment a customer sends 50k-token prompts.

💡 Decision to lock: which meter, with the exact definition.

Write it down - "1 token = 1 GPT-5.5-style token, input + output combined". My experience is that vague definitions are a source of customer disputes that you can very easily avoid.

Step 2: Pick your pricing primitive

I suggest picking one customer-facing unit, even if you do the math in another unit internally.

Primitive	What you charge for	When it works well	When not to use it
Token	Input + output tokens at a $/1k rate	Direct API products, technical buyers, OpenAI-style resellers	Multi-model usage, when output volume varies wildly, non-technical buyers
Credit	Abstracted units redeemable against any feature	LLM-powered SaaS, mixed model usage, packaged AI features	When credit rates don't track cost changes, when customers can't predict burn rate
🏆 Outcome	Per generated document, resolved ticket, completed task	Clear value units, enterprise contracts, vertical AI	When outcome definitions are fuzzy, when failure modes aren't priced in

Most products that do well in a pricing change end up running all three at once.

For example: per-token pricing for power users on the API, credits for the packaged product, and outcome pricing on top for enterprise deals.

Agentforce charges per conversation, Fin runs outcome-based, Cursor runs credits, and OpenAI runs tokens. All are valid, but highly dependent on the business.

So again, outcomes first if you can define them, credits second for flexibility, tokens last if you can't avoid them.

💡 Decision to lock: which primitive appears on the invoice.

Step 3: Calculate your cost and set your price

Here's a good example to work with: Your product makes one Claude Opus 4.8 call per user request. Anthropic charges $5 per 1M input tokens and $25 per 1M output tokens.

If your average request uses 1k input tokens and 500 output tokens, your cost per request is $0.005 + $0.0125 = $0.0175.

Three pricing approaches from that cost to consider:

Cost-plus markup. Charge $0.035 per request. 50% gross margin. Predictable, defensible, boring. Works in the API reseller category. Breaks the moment you add a second model with a different cost profile.
Value-based. The customer would pay $1 per generated document because that's the value to them. Charge $0.50. ~96.5% gross margin. Risky when a competitor undercuts you. Excellent when you have a moat. (The classic Salesforce play. Different now in AI because the moat is thinner.)
Credit-based abstraction. Define a credit as "one standard request" and charge $0.05 per credit. Heavy-input requests cost 2 credits. Document generation costs 5 credits. The customer sees a balance, not a token count. Your margin moves smoothly across model changes because you control the credit-to-cost mapping internally.
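The arithmetic behind those three approaches, as a small sketch. The provider rates and prices are the ones from the example above; the function names are made up for illustration.

```python
# Provider rates from the worked example, in dollars per 1M tokens.
INPUT_PER_M = 5.00
OUTPUT_PER_M = 25.00


def cost_per_request(input_tokens: int, output_tokens: int) -> float:
    """Model cost of one request, in dollars."""
    return (input_tokens * INPUT_PER_M + output_tokens * OUTPUT_PER_M) / 1_000_000


def gross_margin(price: float, cost: float) -> float:
    """Gross margin as a fraction of the price charged."""
    return (price - cost) / price


# 1k input tokens and 500 output tokens, as in the example.
cost = cost_per_request(1_000, 500)

for label, price in [("cost-plus", 0.035), ("value-based", 0.50), ("credit", 0.05)]:
    print(f"{label:12s} price=${price:.3f}  margin={gross_margin(price, cost):.1%}")
```

Swapping in your own token counts and rates is the quickest way to see how far a heavy-input request moves the margin on each approach.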

Side-note: Opus 4.7 introduced a new tokenizer that can use up to 35% more tokens for the same input text compared to earlier Opus models. Same prompt, same model family can result in a 35% bigger bill. If your pricing is per-token, you just absorbed a price hike you can't pass through easily. If your pricing is per-credit, you adjust the credit-to-token ratio internally and the customer notices nothing. Credits as architecture, again.

A practical rule for me is that if you can't predict the customer's monthly bill within ±20%, your pricing is too tightly coupled to your cost. Credits with controlled rates kinda solve it, while cost-plus per-token definitely doesn't.

💡 Decision to lock: per-unit price (in whatever primitive you picked) and target gross margin.

Step 4: Build your tier structure

Most AI products land on 4-5 tiers. The structure matters more than the numbers. You can change the numbers. You can't easily change the shape.

Here's a starting point to consider:

Tier	Price/mo	Included	Limit type	Best for
Sandbox	$0	100 credits/month	Hard cap	Devs evaluating, demos
Starter	$29	5,000 credits/month	Soft cap, $0.01/credit overage	Indie devs, early prod
Pro	$299	100,000 credits/month	Soft cap, $0.005/credit overage	Growing teams, full prod
Scale	From $2,500	Annual commit, custom volume	Negotiated overage	$1M+ ARR mid-market
Enterprise	Custom	Custom + SLA + multi-entity	Custom	Regulated, multi-region

I like a hard cap on free (protects margin from abuse), a soft cap with overages on paid (lets customers scale), and commit-based for the top.

Note that a free tier is typically an acquisition cost, so price it for the abuse boundary… 100 credits gets a developer through a "hello world" and 10,000 credits can get a hobbyist building real things on your dime. That's fine if you're planning for it, but the abuse line is wherever the cost per signup makes your CAC start to really hurt.

Also worth noting that charging 2-3x the included rate as overage is standard. Going higher than 3x makes customers hard-cap their own usage internally which suppresses your expansion revenue and may sabotage your product (it feels like you're robbing them). Going lower than 1.5x definitely leaks margin on power users.
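A quick way to check where a tier sits against that band is to divide the overage rate by the effective per-credit rate inside the plan. The sketch below does that for the Starter and Pro rows of the tier table in Step 4; the helper name is an assumption.

```python
def overage_multiple(price: float, included: int, overage_rate: float) -> float:
    """Overage rate divided by the effective per-credit rate inside the plan."""
    included_rate = price / included
    return overage_rate / included_rate


# (monthly price, included credits, overage per credit) from the tier table.
tiers = {
    "Starter": (29, 5_000, 0.01),
    "Pro": (299, 100_000, 0.005),
}

for name, (price, included, overage) in tiers.items():
    print(f"{name}: {overage_multiple(price, included, overage):.2f}x")
```

With the starting-point numbers from the table, both tiers come out at roughly 1.7x: above the 1.5x floor, below the 2-3x band.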

💡 Decision to lock: tier count, included usage per tier, price per tier, overage rate per tier.

Step 5: Decide hard cap vs soft cap

Hard limits block requests when the entitlement is exhausted. Customers get an error (consider an HTTP 429!) and stop until they upgrade or the period rolls over.

Soft limits let usage continue past the entitlement and bill the overage at the period boundary.

Use hard limits on:

Free tiers and trials
Anything where the customer hasn't agreed to overage terms
Developer sandboxes and test environments

Use soft limits on:

Paid tiers where overage terms are in the contract
Anything where blocking would break a production system
Scale and Enterprise customers who want to maximise usage and have credit lines

The common mistake I see people make is putting hard limits on paid tiers because "we don't want to surprise the customer" with a big bill. That's noble, but it means their app stops working in production. Getting a huge, huge bill at the end of the month isn't great either, so a soft limit with a generous spend alert is the move.

💡 Decision to lock: hard or soft per tier, the overage price for soft, the alert thresholds (50%, 80%, 100% of entitlement).
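Here is a minimal sketch of that enforcement decision, assuming usage is counted in whole units and checked per request. The `Decision` shape and the function name are illustrative; only the 429 status and the 50/80/100% thresholds come from the text.

```python
from dataclasses import dataclass

ALERT_THRESHOLDS = (0.5, 0.8, 1.0)  # 50%, 80%, 100% of entitlement


@dataclass
class Decision:
    allowed: bool
    http_status: int      # 200 when served, 429 when blocked
    overage_units: int    # units from this request billed past the entitlement
    alerts: list[float]   # thresholds crossed by this request


def check_usage(used: int, requested: int, entitlement: int, hard_cap: bool) -> Decision:
    """Decide whether a request for `requested` units may proceed."""
    new_total = used + requested
    if hard_cap and new_total > entitlement:
        # Hard limit: block the request and leave usage unchanged.
        return Decision(False, 429, 0, [])
    # Soft limit (or still within entitlement): serve it, note crossed thresholds.
    crossed = [t for t in ALERT_THRESHOLDS if used < t * entitlement <= new_total]
    overage = max(0, new_total - entitlement) - max(0, used - entitlement)
    return Decision(True, 200, overage, crossed)


print(check_usage(95, 10, 100, hard_cap=True))
print(check_usage(95, 10, 100, hard_cap=False))
```

The same request gets blocked on a hard-capped tier and served with 5 overage units plus a 100% alert on a soft-capped one.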

Step 6: Design your credit wallet

If your primitive is credits (Step 2), the wallet is the entire pricing experience. Get the four decisions wrong and customers will tolerate you but never trust you.

Expiry

Do credits expire? Month-end is harsh, year-end is generous, contract-end is the enterprise default.

Having no expiry creates a pretty serious liability on your balance sheet - so talk to a CFO before you ship "credits never expire" as a marketing line.

Rollover

What happens to unused credits at period boundaries? Full rollover is most user-friendly, but some do a partial rollover.

Having no rollover sucks. Customers will eventually notice and resent it.

Top-ups

Can customers buy more mid-cycle?

At the same price, or a premium?

Self-serve, or sales-assist?

Top-ups are where customer happiness lives. Friction here is the single biggest cause of "we love the product but the billing is annoying."

Multi-meter redemption

Can one credit pool be spent across multiple features (chat, search, generation)? Or does each feature have its own pool?

A single pool is friendlier and easier, but a multi-pool is easier to revenue-recognize under ASC 606. You'll end up with a single pool because customers ask for it.

💡 Decision to lock: expiry policy, rollover policy, top-up flow, single vs multi pool.
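A minimal single-pool wallet, to make the expiry and top-up decisions concrete. This sketch assumes each top-up is a grant with its own expiry date and that spending draws from the soonest-expiring grant first; both are design assumptions, and rollover could be modelled as a new grant with a later expiry.

```python
from dataclasses import dataclass, field
from datetime import date


@dataclass
class Grant:
    credits: int
    expires: date


@dataclass
class Wallet:
    grants: list[Grant] = field(default_factory=list)

    def top_up(self, credits: int, expires: date) -> None:
        self.grants.append(Grant(credits, expires))

    def balance(self, today: date) -> int:
        """Credits that have not expired as of `today`."""
        return sum(g.credits for g in self.grants if g.expires >= today)

    def spend(self, amount: int, today: date) -> bool:
        """Debit `amount`, drawing from the soonest-expiring grants first."""
        if amount > self.balance(today):
            return False  # caller decides: block (hard cap) or bill overage
        live = sorted((g for g in self.grants if g.expires >= today),
                      key=lambda g: g.expires)
        for grant in live:
            used = min(grant.credits, amount)
            grant.credits -= used
            amount -= used
            if amount == 0:
                break
        return True


w = Wallet()
w.top_up(100, date(2026, 6, 30))   # monthly grant
w.top_up(50, date(2026, 12, 31))   # mid-cycle top-up with a longer expiry
w.spend(120, date(2026, 6, 15))    # uses all 100 June credits, then 20 of the top-up
```

Drawing down the soonest-expiring grant first is the customer-friendly order, because it minimizes what expires unused.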

Planning for when the model prices drop

Anthropic took Opus from $15 input / $75 output per 1M tokens (Opus 4.1) to $5 / $25 (Opus 4.5 and later). A 3x cut on the most premium model in the lineup.

OpenAI ships similar moves every quarter, and… Yeah, well, it's hard to plan around this.

If you priced per-token at a fixed price (cost plus a markup), your cost just fell by two thirds while your price stayed put. A 50% gross margin becomes roughly 83%. Good for the P&L, but not great for the customer who now feels the bill is too high.

Expect a renegotiation request within 60 days, or a switch to a vendor who reprices automatically.

If you priced credit-based, you have a choice. Hold the credit rate and bank the margin. Lower it and pass savings through. Or rebalance: lower the rate for older models, hold for newer ones.

If you priced outcome-based, you barely notice. The customer pays $0.99 per resolution whether you used Opus 4.7 or Haiku 4.5. Margin compounds automatically. (Why outcome pricing is the most durable shape, and the hardest to ship on day one.)

The infrastructure question you must be able to answer: can your billing system change a credit rate mid-cycle without rewriting prior invoices? If not, every model price drop becomes a project and a migration, which can really hurt.

Diagnose your pricing with Claude

Copy this prompt into Claude (or any capable model). It runs the discovery with you and returns a tier table and reasoning. It won't get the price points exactly right. That's your judgment and your data. It does get the shape right, and it saves 60 days of design work.

I need help designing pricing for my AI API or LLM-powered product. Please work with me in two phases.

Phase 1 — Ask me these questions in turn. Wait for my answer to each (or each small group) before moving on. Don't move to Phase 2 until you have enough from me to work with.

What does your product do? (What problem does it solve? Who uses it?)

Which models do you use? (Single model, multi-model, with fallback?)

What does an average request look like? (Approximate input tokens, output tokens, latency budget, any multimodal cost.)

What's your cost per request? (Or per token, by model, if multi-model.)

What unit do you want customers to pay for? (Tokens, credits, completed outcomes, hybrid.)

Who are your target customers? (Developers, prosumers, startups, enterprises, mix.)

Are there comparable products? Their pricing if you know it.

What's your goal? (Maximise adoption, maximise revenue, hit a specific gross margin target.)

Do you want a free tier? If so, what's the abuse boundary?

Phase 2 — Design the strategy. Present it as a comparison table covering each plan tier.

For each tier, specify:

Plan name and target audience

Pricing primitive (token, credit, outcome, or hybrid)

Included usage (tokens / credits / outcomes per period)

Hard limit or soft limit, and overage price if soft

Credit wallet behaviour if applicable (expiry, rollover, top-ups, multi-meter)

Model fallback or routing behaviour

Differentiating features (SLA, priority support, advanced models, observability)

Free trial behaviour (yes/no, duration, hard or soft)

After the table, explain:

Why this structure for this product

Why these price points (margin maths)

What tradeoffs I'm making (margin vs predictability, complexity vs adoption)

What to test in the first 90 days

Run it once, refine the inputs, run it again - because the second pass is usually better!

What billing infrastructure has to handle

5 things that we've seen really hurt with AI pricing if you didn't design for them:

Per-event metering at high cardinality. Millions of events per day. Aggregating at month-close doesn't scale.
Credit wallets as first-class entities. Not invoices with line items. Wallets with balances, debit/credit history, expiry, rollover, and multi-meter redemption rules.
Real-time balance updates. Customers want to see credits drop as they use the product. If your system updates nightly, they'll build their own UI on top of your API. Or churn.
Margin visibility at the unit level. Per-customer, per-feature, per-model cost-vs-revenue, in a dashboard the finance team can actually read.
Rate-card versioning. You'll change credit rates 3-5 times in the first year. Your billing system has to handle "this customer was on the old rate until July 15, new rate from July 16" without rewriting February's invoice.
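Rate-card versioning, the last item in that list, usually comes down to effective-dated rates: each usage record is priced at the rate in force on its own date, so old invoices never need rewriting. A minimal sketch, with made-up rates and the July 15 / July 16 cutover from the example:

```python
from bisect import bisect_right
from datetime import date

# (effective_from, dollars per credit), sorted by date. The rates are made up.
RATE_VERSIONS = [
    (date(2026, 1, 1), 0.050),
    (date(2026, 7, 16), 0.040),
]


def rate_on(day: date) -> float:
    """Return the credit rate in force on `day`."""
    starts = [start for start, _ in RATE_VERSIONS]
    idx = bisect_right(starts, day) - 1
    if idx < 0:
        raise ValueError(f"no rate card in force on {day}")
    return RATE_VERSIONS[idx][1]


def bill(usage: list[tuple[date, int]]) -> float:
    """Price each usage record at the rate in force on its own date."""
    return sum(credits * rate_on(day) for day, credits in usage)


july = [(date(2026, 7, 15), 100), (date(2026, 7, 16), 100)]
print(f"${bill(july):.2f}")  # 100 credits at the old rate, 100 at the new one
```

Adding a new rate is an append to `RATE_VERSIONS`; nothing before the effective date changes.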

Solvimon treats credits and wallets as first-class billing primitives, which is different from gluing a metering service to a subscription billing tool and writing the wallet logic yourself.

Frequently Asked Questions

How do I price AI tokens?

Calculate your cost per request from the model provider's per-token rates, multiply by your expected input and output tokens per request, and apply either a cost-plus markup, a value-based price, or a credit abstraction. Credit abstraction is more durable across model price changes.

Should I let credits expire?

Yes, with a long enough horizon. 12 months or contract-end is standard. No expiry creates a growing balance-sheet liability. Aggressive expiry (monthly) feels like a gift-card scam and erodes trust at renewal.

When should I use a hard cap vs a soft cap?

Hard cap on free tiers, trials, and developer sandboxes. Soft cap on paid tiers, production workloads, and Enterprise contracts. Hard cap on a paid production tier is the most common mistake. Customers would rather pay an overage than have their app go down.

How many tiers should I have?

Three to five. Sandbox (free), Starter, Pro, Scale, optional Enterprise. More than five and the pricing page stops being a decision aid and starts being a maze.

How do I handle multiple AI models with different costs?

Either pass the cost variance to the customer (different credit rates per model), or absorb it internally (fixed credit rate, smart routing to the cheapest qualifying model). Absorbing it requires a billing system that supports credit-to-cost mapping at the route level.

How do credits get revenue-recognized under ASC 606?

Credits are typically recognized as revenue when consumed (the performance obligation is satisfied at redemption), not when sold. Unredeemed credits sit on the balance sheet as deferred revenue. Breakage (unredeemed credits at expiry) is recognized as revenue at expiry, subject to your historical breakage rate analysis. Ask your auditor before you finalize the wallet policy.

How to design usage-based pricing

Arnon Shimoni — Wed, 03 Jun 2026 19:47:31 +0000

Usage-based pricing is four decisions in a trenchcoat: what you meter, what unit you charge for, how you structure the rate card, and how you handle commits and overages. I've seen many teams get one wrong and only discover it later - forcing a redesign.

Usually, a founder reads a Snowflake retrospective, a post from Tomasz Tunguz, maybe a leaked board deck - and then someone writes "let's go usage-based" into a Notion doc and has the built-in AI design some principles. A few weeks later it finally goes live, and you discover… lots of issues…

Most of what I read on UBP today is consulting-flavoured, with ideas like "align pricing with value" or "optimise for customer success". I've been known to write that too, for full disclosure. Fine, but it can still be unhelpful when you're a few days from launch and need to decide whether the meter is the API call or the successful transaction.

The design problem is rooted in reality, so let's have a look at how to do it.

What usage-based pricing is, briefly

Usage-based pricing (UBP) is a model where customers pay for the volume of a product they actually consume, instead of a flat fee like a seat or a platform fee.

Commonly, the unit can be API calls (Twilio, OpenAI), gigabytes (Snowflake, Datadog), events processed (Segment), characters translated (DeepL), or any other measurable quantity tied to value.

Outcome-based pricing often fits under usage-based too: the result is the unit, and it's charged in the same way.

Usage can be metered per request, batched, or summarised at a period boundary. Then, the price can be linear, tiered, or volume-discounted. The contract can be pay-as-you-go, prepaid credits, or a committed minimum with overages.

That's quite a lot of surface area for usage-based pricing. Now let's look at the decisions:

What's the right meter?

The meter is the thing you count, so picking the wrong one means fighting your customers about whether the bill is fair for a long time.

What makes a good meter? I think there are four properties:

It correlates with value the customer receives. For example, Adyen charges per successful transaction, not per API call. Snowflake charges per second of compute, not per query. The customer's bill goes up exactly when their business goes up. When it doesn't correlate, the bill feels like paying an even bigger tax and customers churn (or worse, they negotiate it down to zero).
It correlates with your COGS (cost of goods). If you're an AI company, your inference cost is per-token. A flat per-request meter will ravage your gross margin the moment a customer sends very, very long prompts. There was a story recently that a consultancy spent $500m on tokens…
It's auditable. Both you and the customer need to be able to count it independently and arrive at the same number. If your finance team can't reconstruct yesterday's usage from raw events, your customers can't either, and that's the bill they'll dispute.
It's stable over time. The meter's definition shouldn't change every quarter. Customers build forecasts on it. If you redefine "active user" between Q2 and Q3, you've just burned your renewal cycle.

For most B2B products, the meter is either an event (an API call, a generated document, a workflow run) or a resource over time (compute-seconds, storage-GB-months, active users per month).

Pick one! Don't ship two and hope the customer works out which one is the dominant one…

Here's an example: Twilio could have charged per API call but instead they charge per delivered message. A customer who sends 10k and gets 9k delivered pays for 9k. The 1k that didn't deliver were Twilio's network problem. When the meter is honest and defensible, so is the bill.

Snowflake could have charged per query or per data loaded - which was the common thing to do. Instead, they charge per second of compute. A poorly-written query that scans the whole table costs more than one that hits an index. The meter aligns customer behaviour with Snowflake's COGS. (Meter design as competitive lever. Most teams discover the lever only after shipping the wrong meter.)

What's the rate card?

The rate card is the price per unit of the meter.

The first question you should ask is: linear, tiered, or volume-discount?

Linear

Linear is simplest. For example, $0.01 per API call, no matter how many you do. Use it when your cost of goods is also linear and when your competitive landscape allows it.

(chart via BVP)

Tiered

Tiered means rates change at thresholds. First 100k calls free, next 1M at $0.02, anything above at $0.005. Tiered rate cards work when usage is heterogeneous (some customers do 5k/month, some do 5M) and when you want to acquire small customers at a low price point without losing margin on the large ones. Vercel runs tiered. Datadog runs tiered. AWS runs tiered with a thousand-page footnote.

Volume discounts

Volume discount is the SaaS-style continuation of tiered: same per-unit price, applied across all units once a threshold is crossed. Easier to explain to customers, harder to model internally. Pick whichever your customers will read.
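The difference between tiered and volume pricing is easiest to see side by side. The sketch below uses the thresholds from the tiered example (first 100k free, next 1M at $0.02, anything above at $0.005) and, as an assumption for comparison, applies the same thresholds as a volume schedule.

```python
# (upper bound of the tier in units, price per unit); None = no upper bound.
TIERS = [(100_000, 0.0), (1_100_000, 0.02), (None, 0.005)]


def graduated(units: int) -> float:
    """Tiered: each unit is billed at the rate of the tier it falls in."""
    total, lower = 0.0, 0
    for upper, price in TIERS:
        if upper is None or units < upper:
            total += (units - lower) * price
            break
        total += (upper - lower) * price
        lower = upper
    return total


def volume(units: int) -> float:
    """Volume: every unit is billed at the rate of the tier the total lands in."""
    for upper, price in TIERS:
        if upper is None or units <= upper:
            return units * price
    raise AssertionError("unreachable: the last tier is unbounded")


for units in (50_000, 600_000, 2_000_000):
    print(f"{units:>9,} units  tiered=${graduated(units):>9,.2f}  volume=${volume(units):>9,.2f}")
```

Note the cliff in the volume schedule: crossing a threshold reprices every unit, so the total bill can drop as usage goes up.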

A note on penny pricing, though: $0.0001 per token reads like an honest price. It also means your customer has to multiply by 10M to understand. Penny pricing creates emotional distance from the bill, which is great for adoption and terrible for trust at renewal. Round it. Bundle it. Don't manufacture units of consumption that customers can't reason about.

(chart via BVP)

How to structure commits and overages?

Most companies start with pay-as-you-go and graduate into commits as deals get bigger. The shape that works best:

Element	What it is
a base commit	an annual or monthly minimum the customer agrees to pay regardless of usage
overage	an overage rate that kicks in past the commit
true-up (sometimes)	at the period boundary - if usage exceeded the commit, the customer pays the difference
true-down (sometimes)	at the period boundary - if usage was lower than the commit, the customer usually doesn't get the difference back, because contracts
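As a sketch, the base commit plus overage reduces to one line of arithmetic per period. The commit size, fee, and overage rate below are hypothetical numbers chosen for illustration.

```python
def period_charge(used: int, committed: int, commit_fee: float, overage_rate: float) -> float:
    """Charge for one period: the commit is always owed, overage only past it."""
    overage_units = max(0, used - committed)
    return commit_fee + overage_units * overage_rate


def stranded_units(used: int, committed: int) -> int:
    """Committed units the customer paid for but did not use."""
    return max(0, committed - used)


# Hypothetical deal: 10M calls/month for $10,000, overage at $0.0015 per call.
COMMITTED, FEE, OVERAGE = 10_000_000, 10_000.0, 0.0015

print(period_charge(6_000_000, COMMITTED, FEE, OVERAGE))    # under the commit
print(period_charge(12_000_000, COMMITTED, FEE, OVERAGE))   # 2M calls of overage
print(stranded_units(6_000_000, COMMITTED))                 # paid for, never used
```

Tracking `stranded_units` per period is a cheap early warning for renewals that are about to go cold.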

The two failure modes to avoid:

Commits without rollover create the gift-card problem. The customer committed to 10M API calls/month, used 6M in January because they were ramping. By December they realized they'd been paying for 4M calls a month they never used. That's stranded value. They feel cheated. Renewal goes cold.
Overages priced too aggressively kill expansion. If your overage rate is 3x your committed rate, customers will hard-cap usage internally before they hit the commit, just to avoid the penalty. You've optimized your bill while suppressing your revenue.

Credits sit somewhere in between… They're a prepaid balance, in money or some other unit, that customers draw down against any meter, often with expiry rules. When done well they give flexibility (the customer can spend their 10M calls on whatever endpoint they need). When used sloppily they become "breakage" revenue and a finance audit liability.

Credits are an architectural decision, not a pricing model.

How do you migrate existing customers onto usage-based pricing?

This is the part nobody writes about because it's the hard part that requires you to communicate well, and understand what your customers value.

There isn't a playbook or template and you typically also can't just flip a switch.

Every customer on the old plan has a contract, a budget, and an expectation - if you surprise them, you lose renewal trust.

However, there is somewhat of a sequence you can follow that works:

Step 1: shadow billing. For 60-90 days, calculate what each customer would pay under the new model and put it on the invoice as a memo line. No financial impact. The customer can see what's coming. Finance can model the cohort impact before any contract changes.
Step 2: opt-in for new accounts only. Ship UBP as the default for net-new customers. Let the existing book run on the old terms. Product feedback without breaking anyone.
Step 3: voluntary migration with a sweetener. Offer existing customers a price-protection guarantee or a one-time credit grant to move. Some will. Most won't until step 4.
Step 4: forced migration at renewal. At contract renewal, the new model is the only option. By this point, you have 6-12 months of shadow data and customer references. The conversation is "here's your bill, here's the precedent, here's the upside on flexibility." Some customers churn. Plan for it.
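Shadow billing (step 1) is mostly a reporting exercise: compute the would-be usage bill and print it next to the current one without charging it. A minimal sketch, assuming a flat current fee and a single linear meter; the function name, customer, and numbers are hypothetical.

```python
def shadow_line(customer: str, current_fee: float, usage_units: int, unit_price: float) -> str:
    """Build a memo line showing the would-be usage bill; nothing is charged."""
    shadow = usage_units * unit_price
    delta = shadow - current_fee
    return (f"MEMO (no charge): {customer} under usage pricing would be "
            f"${shadow:,.2f} vs ${current_fee:,.2f} today ({delta:+,.2f})")


line = shadow_line("acme", 500.0, 60_000, 0.01)
print(line)
```

Collecting the deltas across the whole book gives finance the cohort impact before any contract changes.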

This can still take a really long time. We've had customers whose migration took just a few days for the technical work and another 9 months for the contract rollover. That's the realistic shape, unfortunately.

What billing infrastructure has to handle

Usage-based pricing fails in production for boring reasons. Most are billing-infrastructure problems, not pricing problems.

The system has to ingest events at scale, deduplicate them, reconcile them to a customer, apply the right rate card, handle commits and overages without double-counting, and produce an invoice that a finance team can audit. Most teams glue this together from Stripe Billing, a metering service, a spreadsheet, and 4,000 lines of orchestration code. That code is now their actual billing system. It's fragile.
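The first two steps of that pipeline, deduplicating events and reconciling them to a customer and meter, can be sketched as follows. This assumes every event carries a unique `id` that is reused on retries; the event shape is an illustration, not a real ingestion API.

```python
from collections import defaultdict


def aggregate(events: list[dict]) -> dict[tuple[str, str], int]:
    """Sum usage per (customer, meter), counting each event id only once."""
    seen: set[str] = set()
    totals: dict[tuple[str, str], int] = defaultdict(int)
    for event in events:
        if event["id"] in seen:
            continue  # retried or replayed delivery; already counted
        seen.add(event["id"])
        totals[(event["customer"], event["meter"])] += event["quantity"]
    return dict(totals)


events = [
    {"id": "e1", "customer": "acme", "meter": "api_calls", "quantity": 3},
    {"id": "e2", "customer": "acme", "meter": "api_calls", "quantity": 2},
    {"id": "e2", "customer": "acme", "meter": "api_calls", "quantity": 2},  # duplicate
    {"id": "e3", "customer": "globex", "meter": "api_calls", "quantity": 7},
]
usage = aggregate(events)
```

At real volumes the `seen` set would live in a store with a retention window rather than in memory, but the invariant is the same: one event id, one count.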

The infrastructure questions to ask before you ship usage-based pricing:

Can you compute usage per customer per meter per period in under a minute? If not, your monthly close will take a week.
Can your finance team audit any line item back to raw events? If not, you'll lose every dispute.
Can a customer self-serve a usage breakdown that matches the invoice exactly? If not, your support ticket volume is about to triple.
Can you change a rate card mid-cycle without rewriting historical invoices? If not, every pricing experiment becomes a six-week project.

Solvimon runs the metering, ledger, and rate-card engine as one system, so the infrastructure questions above stop being engineering problems. Different from gluing five tools together.

Frequently Asked Questions

What is usage-based pricing?

Usage-based pricing is a billing model where customers pay based on their actual consumption of a product (API calls, gigabytes, events, compute-seconds), rather than a flat subscription fee. Each meter is tracked and billed at a defined rate, often with tiered or volume discounts.

How is UBP different from hybrid pricing?

Pure UBP is consumption-only. Hybrid pricing combines a base subscription (or seats) with usage on top, often with credits or commits. Most companies that say "we do usage-based" actually run hybrid in practice, because flat usage-only pricing is unpredictable for both sides.

When should I avoid usage-based pricing?

When your cost of goods doesn't scale with the meter, when your customers value billing predictability over flexibility (most enterprise CFOs), or when your unit of consumption isn't legible to a non-technical buyer.

Can I run usage-based pricing on Stripe Billing?

Kinda - Stripe Billing supports basic metered usage but doesn't natively handle complex hybrid configurations (credits across meters, multi-entity, true-ups with proration).

How long does it take to design and ship UBP?

Designing the model takes a couple of weeks of focused work. Implementing it in production typically takes 4-12 weeks depending on existing billing complexity. The harder part is the customer communication when you migrate existing customers onto the new model.

What's the most common mistake teams make with UBP?

Picking a meter that doesn't correlate with cost of goods. The second most common is shipping a rate card with no commit structure, which makes revenue forecasting impossible.

How do I migrate existing customers without losing them?

Shadow billing for 60-90 days, opt-in for new accounts, voluntary migration with a sweetener, forced migration at renewal. Total elapsed time is typically 12-18 months. Skipping the shadow billing phase is the most common way to lose enterprise customers.

What's the difference between a meter and a rate card?

The meter is what you count (API calls, gigabytes, events). The rate card is what you charge per unit (linear, tiered, volume-discounted). One product can have multiple meters, each with its own rate card. Most legacy billing systems handle one rate card per customer at a time, which is why companies outgrow them.

Why we treat credits and wallets as first-class billing primitives

Arnon Shimoni — Sat, 16 May 2026 21:09:02 +0000

tl;dr: Most billing systems model a credit wallet as a prepaid cash balance. That works at day zero. It breaks the moment your product has multiple types of credits with different per-unit costs, different margins, and different rate cards sitting between your token layer and your customer-facing price. Our customer Reson8 needed multiple wallets, structured around a rate card layer instead of one prepaid counter.

Lots and lots of billing systems model credits as prepaid cash. Pay $100, get $100 of credit. Burn the credit, the balance goes down. That's because it's often treated as "just an engineering thing" where it's a fancy counter.

The model is clean until your product has more than one thing customers buy credits for.

Why we'd rather hold credits than money

There's a line we come back to internally:

I'd rather have credits in a wallet than money, because money I have to give back. Credits I can expire.

A money wallet creates a liability. Whatever your customer deposits, you owe it back if they don't spend it. Credits work on your terms: when they expire, what they cover, under what conditions they're valid. The obligation is yours to design.

Not great for customers necessarily, but great for businesses running with AI.

A wallet targeting a single product category is what tax law calls single-purpose: intended use is known at purchase, so VAT is calculated when the funds go in. A general-purpose wallet, where a customer might spend across your product, is multiple-purpose: intended use is unknown at purchase, so VAT is calculated when they actually use it. Credits are a third category. You're selling a product (100 credits for $90), VAT applies at purchase, and the credit has its own exchange mechanics entirely separate from currency.

So if you have three wallet types, you get three different VAT treatments and three different points of revenue recognition.

If you model them as one thing and finance finds out later, they won't be happy with you.

Why single wallets aren't right

The single-wallet model assumes interchangeability. All credits are equal. One credit buys one unit of anything in the product.

That assumption is kinda true for simple products but not for modern AI stuff where the cost of delivering one unit varies across workloads. A minute of transcription on a custom-trained domain model costs more to deliver than a minute of batch generic transcription. If you change languages it becomes even clearer, because the underlying token counts are not the same.

When those two workloads draw from the same credit pool, the billing system can't enforce that difference. That means the customer's credits become fungible across products that aren't, and the margin doesn't match what was invoiced.

That's a data model problem!

What Reson8 needed

Reson8 builds hyper-customizable speech recognition for European languages: real-time, domain-adaptive, running on EU GPUs with no audio retention. They bill by the minute across multiple workload types: standard transcription, custom-domain models adapted on up to 1M tokens of customer context, and real-time processing versus batch. Each has a different cost basis.

The moment a customer at Reson8 pre-purchases minutes, the question becomes: which kind? Standard-transcription minutes and custom-domain real-time minutes are different products. You can't let customers use one pool for the other: the product doesn't allow it, and the margin profile doesn't support it.

What companies like Reson8, ElevenLabs, and Wispr need is at least three distinct rating systems that translate to wallet types: one for standard minutes, one for custom-domain minutes, and one that handles real-time processing overages as a metered charge when the pre-purchased pool runs empty.

In theory, each can have its own top-up schedule, expiry rules, and invoicing behavior.

A single prepaid balance doesn't model that. It guesses at it at best.

The token layer is why

Speech AI adds a dimension most billing systems weren't built around. The model thinks in tokens. The customer thinks in minutes. Contracts denominate in credits. Finance works in dollars (or Euros).

Each link is a rate:

tokens → minutes → credits → dollars

A minute of real-time transcription in a custom-adapted model on a dedicated EU cluster consumes more tokens than a standard batch job on a generic model. The conversion factor varies by workload, by model version, by language pack, by the level of domain adaptation the customer has configured.

Most billing systems let you set a price per event, or a price per unit of consumption. What they don't support is a rate card layer sitting between the metering layer and the wallet layer. A rule that says: one meter event tagged workload: custom-realtime burns 3 credits from wallet B, while one meter event tagged workload: standard-batch burns 1 credit from wallet A.
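
To make that rule concrete, here is a minimal sketch of a rate card sitting between meter events and wallets. The `Wallet`, `Rate`, `RATE_CARD`, and `apply_event` names are mine, made up for illustration, and the 3-credit and 1-credit rates are the ones from the rule above. It's not how any particular billing system implements it.

```python
from dataclasses import dataclass


@dataclass
class Wallet:
    name: str
    credits: int


@dataclass(frozen=True)
class Rate:
    wallet: str             # which wallet the event draws from
    credits_per_event: int  # how many credits one meter event burns


# Hypothetical rate card: workload tag -> (wallet, burn rate).
RATE_CARD = {
    "custom-realtime": Rate(wallet="B", credits_per_event=3),
    "standard-batch": Rate(wallet="A", credits_per_event=1),
}


def apply_event(wallets: dict[str, Wallet], workload: str, quantity: int = 1) -> int:
    """Burn credits for `quantity` meter events and return the amount deducted."""
    rate = RATE_CARD[workload]
    cost = rate.credits_per_event * quantity
    wallet = wallets[rate.wallet]
    if wallet.credits < cost:
        raise ValueError(f"wallet {wallet.name} has insufficient credits")
    wallet.credits -= cost
    return cost


wallets = {"A": Wallet("A", 100), "B": Wallet("B", 100)}
apply_event(wallets, "custom-realtime", quantity=10)  # burns 30 credits from wallet B
apply_event(wallets, "standard-batch", quantity=10)   # burns 10 credits from wallet A
```

The point of the sketch is that `RATE_CARD` is data. Changing a burn rate is a config change, and the metering code never has to know which wallet pays.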

Remember I said engineers see it as a counter? That's not how counters work.

When that rate card lives in application code rather than billing configuration, it's invisible to finance and re-coded every time the cost model changes. Which, in AI, is quite often.

Why the rate card has to be in the billing layer

Lots of speech vendors also have custom model adaptation as a product in its own right: a one-time charge to adapt the model on a customer's data, then a recurring credit pool to use it. The initial adaptation and the ongoing credit wallet are related but distinct events.

The whole revenue stack has to handle both on the same invoice and recognize revenue at the right moment for each.

You could call this hybrid, where a one-time charge triggers a wallet provisioning event, the wallet burns down as the customer runs workloads, it auto-tops-up on a schedule or on demand, and overages switch to real-time metering when the pool hits zero.

For that to work, the wallet has to carry more than a counter. It needs some extra metadata: which plan provisioned it, which meters feed into it, which rate card converts events into credit deductions. A counter doesn't do that.
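
A rough sketch of what "more than a counter" could look like. The field names, the meter name, and the $0.05 overage rate are assumptions for illustration. Top-ups and the one-time adaptation charge are left out to keep it short.

```python
from dataclasses import dataclass
from decimal import Decimal


@dataclass
class Wallet:
    plan_id: str             # which plan provisioned the wallet
    meters: tuple[str, ...]  # which meters feed into it
    rate_card_id: str        # which rate card converts events into deductions
    balance: int             # remaining prepaid units (minutes here)


def consume(wallet: Wallet, meter: str, minutes: int, overage_rate: Decimal) -> Decimal:
    """Burn prepaid minutes first; return the metered overage charge, if any."""
    if meter not in wallet.meters:
        raise ValueError(f"meter {meter!r} does not feed this wallet")
    from_pool = min(wallet.balance, minutes)
    wallet.balance -= from_pool
    return (minutes - from_pool) * overage_rate


wallet = Wallet("custom-domain-plan", ("custom_realtime_minutes",), "rc-2026-05", 500)
first = consume(wallet, "custom_realtime_minutes", 450, Decimal("0.05"))   # fully prepaid
second = consume(wallet, "custom_realtime_minutes", 100, Decimal("0.05"))  # 50 prepaid, 50 metered
```

The second call is the moment the pool hits zero: 50 minutes come out of the wallet and the other 50 turn into a metered charge.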

Stripe (like lots of other vendors) has gift cards or "balances" given to a customer, but they're not separate objects: you can't spin them up and move them around. Connecting them requires orchestration code you write and maintain. Lago's credit primitives don't compose naturally with multi-wallet, multi-rate-card configurations, and if you're building on top of them, you need your own custom logic that reimplements the billing layer on top of the billing layer.

In Solvimon, Wallets are a first-class primitive. Multiple wallet types per customer, each tied to a specific meter and a rate card, with configurable top-up and expiry rules. The rate card lives in configuration. Finance can see it, and the billing system enforces it.

What changes for revenue recognition

Good wallet modeling cleans up revenue recognition.

A customer pre-purchasing 10,000 standard minutes creates a deferred revenue liability. Each minute consumed triggers a recognition event against the right wallet. When the pool empties and the customer tips into metered overages, billing switches from credit burn to real-time usage. Finance sees one invoice, one ledger. The wallet's depletion is the revenue recognition schedule.

During billing calculation, credits are reserved against pending invoices, then deducted only when the invoice goes final. Available balance is real balance minus reservations. That gap between reservation and deduction prevents customers from overspending mid-cycle, and makes the recognition schedule traceable without a spreadsheet.
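
Here is a small sketch of the reserve-then-deduct idea, assuming a hypothetical `CreditWallet` class and invoice ids of my own invention. The only rule it encodes is the one above: available balance is real balance minus reservations.

```python
from dataclasses import dataclass, field


@dataclass
class CreditWallet:
    balance: int
    reservations: dict[str, int] = field(default_factory=dict)  # invoice id -> credits held

    @property
    def available(self) -> int:
        # Available balance is the real balance minus everything reserved.
        return self.balance - sum(self.reservations.values())

    def reserve(self, invoice_id: str, credits: int) -> None:
        if credits > self.available:
            raise ValueError("reservation exceeds available balance")
        self.reservations[invoice_id] = self.reservations.get(invoice_id, 0) + credits

    def finalize(self, invoice_id: str) -> int:
        """Deduct the reserved credits once the invoice goes final."""
        credits = self.reservations.pop(invoice_id)
        self.balance -= credits
        return credits


wallet = CreditWallet(balance=1_000)
wallet.reserve("inv-draft-1", 300)
before = (wallet.balance, wallet.available)  # real balance untouched, 300 held back
wallet.finalize("inv-draft-1")
after = (wallet.balance, wallet.available)   # deduction happens only now
```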

For Reson8, that matters. EU customers care about data handling and contract structure. An invoice that reflects custom-domain usage, standard usage, and real-time overages as distinct line items, each traced to its wallet and rate card, holds up in procurement, in audit, and in the customer relationship. That's not my favourite thing to have to design around, but healthcare and financial services procurement teams care quite a lot about this.

The big decision/question behind credit pricing you need to make

Credits carry a pricing decision inside them: what value to attach to each unit. The structural question is whether your billing system can model the relationships between your metering layer, your credit pools, and your revenue recognition without engineers maintaining the translation.

For AI companies running multiple workloads with different cost bases, that requires multiple wallets and a rate card layer that understands what each pool represents.

Frequently Asked Questions

Why would a company prefer credits over a money wallet?

A money wallet creates a financial liability: you owe the deposited amount back if the customer doesn't spend it. Credits are a product sale: the customer buys a defined unit with defined terms, including when those credits expire. That distinction affects your balance sheet, your VAT treatment, and how you recognize revenue. For many AI companies, credits are preferable precisely because you set the expiry terms rather than holding an open-ended obligation.

What is the difference between a credit wallet and a prepaid balance?

A prepaid balance is a single pool of value denominated in currency. A credit wallet is a typed pool denominated in a product-specific unit (minutes, tokens, API calls) with a rate card defining how it converts to currency and a meter defining what consumes it. For simple products, they're equivalent. For products with multiple workload types, only the wallet model holds up.

Why would an AI company need multiple credit wallets for the same customer?

When a product has multiple distinct workloads with different unit costs and different margins, a single pool treats all credit consumption as equivalent regardless of delivery cost. Multiple wallets enforce the boundary. Each wallet is tied to the workloads it covers, with its own rate card and top-up behavior.

What is a rate card in AI billing?

A rate card is a configuration layer that defines how a metering event converts into a credit deduction. For a speech AI company, this might be: one minute of real-time custom-domain transcription = 3 credits from the custom wallet. The rate card sits between the metering layer (which counts consumption) and the wallet layer (which tracks the balance).

When rate cards live in application code, they're invisible to finance and hard to update - which is why you should keep them in a billing system.

How does token-to-credit conversion work in practice?

The model layer consumes tokens. The product layer exposes a customer-facing unit (e.g., minutes). The billing layer converts that unit into credits at a rate defined by the rate card. A minute of transcription in a given model configuration costs a known number of tokens to produce. When the model changes and the per-minute token cost changes, the rate card updates in configuration.
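
A worked version of that chain, with numbers I made up. The token counts per minute are illustrative assumptions, not real model costs, and the $0.90 per credit comes from a "100 credits for $90" style pack.

```python
from decimal import Decimal

# All numbers below are illustrative assumptions, not real model costs.
TOKENS_PER_MINUTE = {"standard-batch": 1_500, "custom-realtime": 4_500}
TOKENS_PER_CREDIT = 1_500             # rate card: tokens covered by one credit
DOLLARS_PER_CREDIT = Decimal("0.90")  # e.g. 100 credits sold for $90


def credits_for(workload: str, minutes: int) -> int:
    tokens = TOKENS_PER_MINUTE[workload] * minutes
    # Ceiling division, so a partial credit is never given away.
    return -(-tokens // TOKENS_PER_CREDIT)


def dollars_for(workload: str, minutes: int) -> Decimal:
    return credits_for(workload, minutes) * DOLLARS_PER_CREDIT


standard = credits_for("standard-batch", 10)   # 10 credits
custom = credits_for("custom-realtime", 10)    # 30 credits
custom_value = dollars_for("custom-realtime", 10)
```

If a new model version changes the tokens per minute, only `TOKENS_PER_MINUTE` moves. The customer still sees minutes and credits.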

Can Stripe Billing handle multiple credit wallet types?

Stripe Billing supports basic combinations of subscription billing and metered usage, and has credit grants, but they are not tied to wallets. Getting them to interact, especially across multiple credit types with different rate cards, requires custom orchestration code. Most teams building multi-workload AI products end up maintaining that orchestration layer themselves.

How does Solvimon model credit wallets?

Solvimon treats Wallets as a first-class primitive: multiple wallet types per customer, each associated with a specific meter and a rate card, each with configurable top-up and expiry rules.

The rate card layer handles the conversion between consumption units and credit deductions. Revenue recognition is calculated against wallet events. Wallet configuration lives in Solvimon rather than in application code.

The hard part of usage-based billing isn't the metering

Arnon Shimoni — Thu, 14 May 2026 08:13:17 +0000

Every AI company I talk to is running two pricing motions at once:

A subscription for platform access
+ consumption charges on top.

Sometimes a third if we consider the prepaid credits…

The single-bill view is the easy part. The hard part is what's underneath with three different rating engines arguing about who owns the customer's invoice this month.

This is a piece about what usage-based billing actually requires from your stack, the pricing models people are running, and where the comparison between Stripe Billing + Metronome and Solvimon actually stops being relevant (hint: not cost).

How usage-based billing works

There are typically three components, working in sequence in UBB:

Event collection. Your product emits an event for every billable action (an API call, a token consumed, a task completed). High volume, no data loss, audit trail, the whole shebang.
Metering and aggregation. Raw events get aggregated into billable quantities, per customer, per product line, per billing period, you name it.
Rating and invoicing. Aggregated quantities get multiplied against the applicable rate to produce an invoice.

The hard part is in step 3 because of the complexities of "actioning" the math.

Two customers can consume identical volume and owe different amounts because they're on different plans, different commitment tiers, or have a custom rate the AE negotiated at 2am on the contract close. The billing engine has to hold all of that in its head simultaneously, and apply the right logic per customer per cycle.
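
The simplest possible illustration of that: same volume, different amount owed. The customer names, the list rate, and the negotiated rate are all hypothetical. Real rating engines hold far more than one override per customer.

```python
from decimal import Decimal

LIST_RATE = Decimal("0.020")  # hypothetical list price per unit

# Hypothetical per-customer terms, held as structured data rather than in a PDF.
CUSTOM_RATES = {"acme": Decimal("0.015")}


def rate_usage(customer: str, units: int) -> Decimal:
    """Multiply aggregated usage by whichever rate applies to this customer."""
    rate = CUSTOM_RATES.get(customer, LIST_RATE)
    return (units * rate).quantize(Decimal("0.01"))


acme_total = rate_usage("acme", 100_000)      # negotiated rate
globex_total = rate_usage("globex", 100_000)  # list rate
```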

You may think that's a minor thing but that's most of the work.

The pricing models people are running

Here's what I see across our customers at Solvimon:

Pure consumption. Pay per unit, fixed rate, no minimum. Common in dev-facing AI APIs (e.g., the early OpenAI API). Volume varies wildly, customers refuse to commit, you eat the variance.

Tiered. Per-unit rate decreases with volume, in two types:

Style	What it means
Volume	Final tier rate applies to all units
Graduated	Each tier's rate applies only to units in that band

You need to support one or both, and you need to do it consistently. (The number of pricing pages I've seen that claim "graduated" and silently rate as "volume"... well.)
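
The difference is easier to see in numbers. A sketch with three made-up tiers (the bounds and rates are assumptions for illustration):

```python
from decimal import Decimal

# Hypothetical tiers: (upper bound of the band in units, price per unit).
# None marks the open-ended last tier.
TIERS = [(1_000, Decimal("0.10")), (10_000, Decimal("0.08")), (None, Decimal("0.05"))]


def volume_price(units: int) -> Decimal:
    """The rate of the tier the total lands in applies to every unit."""
    for upper, rate in TIERS:
        if upper is None or units <= upper:
            return units * rate
    raise AssertionError("tiers must end with an open-ended band")


def graduated_price(units: int) -> Decimal:
    """Each tier's rate applies only to the units inside its band."""
    total, lower = Decimal("0"), 0
    for upper, rate in TIERS:
        band_top = units if upper is None else min(units, upper)
        if band_top > lower:
            total += (band_top - lower) * rate
        lower = upper if upper is not None else units
        if upper is None or units <= upper:
            break
    return total


volume_total = volume_price(12_000)        # 12,000 units all at 0.05
graduated_total = graduated_price(12_000)  # 1,000 at 0.10 + 9,000 at 0.08 + 2,000 at 0.05
```

With these tiers, 12,000 units come to 600 under volume pricing and 920 under graduated pricing. That gap is what gets silently misrated when a page says one and the system does the other.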

Committed consumption with overage. Customer commits to a minimum spend or volume at a negotiated rate. Usage inside the commit bills at that rate. Usage above bills at a higher overage rate. This is the dominant structure for enterprise AI contracts right now. Predictable revenue for the vendor, predictable cost for the buyer, until the commit doesn't cover the year-end traffic spike. Then it gets interesting.
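
A minimal sketch of the arithmetic, assuming the commit is owed in full whether or not it is used (that's my reading of "minimum", and real contracts vary). The volumes and rates are invented.

```python
from decimal import Decimal


def commit_invoice(committed_units: int, commit_rate: Decimal,
                   overage_rate: Decimal, used_units: int) -> dict[str, Decimal]:
    """Assumes the commit is owed in full; usage above it bills at the overage rate."""
    commit_fee = committed_units * commit_rate
    overage_units = max(0, used_units - committed_units)
    overage_fee = overage_units * overage_rate
    return {"commit": commit_fee, "overage": overage_fee, "total": commit_fee + overage_fee}


# Hypothetical contract: 1M units committed at $0.010, overage at $0.015.
under = commit_invoice(1_000_000, Decimal("0.010"), Decimal("0.015"), 800_000)
over = commit_invoice(1_000_000, Decimal("0.010"), Decimal("0.015"), 1_200_000)
```

The customer at 800,000 units still owes the full 10,000 commit. The customer at 1.2M owes the commit plus 3,000 of overage. That second case is the year-end traffic spike.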

Credit-based. Customer prepays into a balance. Usage draws it down, and new credits can be bought at any time. You then have to decide:

Do credits expire? At period end, never, or with a max balance roll-over?
How is breakage (unused expired credits) recognized?
How fresh does the balance need to look in the customer's UI? (Real-time, if you want to keep enterprise customers.)
If one balance spans multiple products, who tracks which product drew which credits?

Again, like before: these aren't things you can just guess at, because they're the architecture you'll be stuck with for eons.
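
To show why the expiry question is structural, here is a sketch of a balance made of separate grants, each with its own expiry. The `Grant` shape and the "spend the soonest-expiring grant first" policy are assumptions of mine. Your policy may differ, which is rather the point.

```python
from dataclasses import dataclass
from datetime import date
from typing import Optional


@dataclass
class Grant:
    credits: int
    expires_on: Optional[date]  # None means the grant never expires


def draw_down(grants: list[Grant], amount: int, today: date) -> int:
    """Spend from the soonest-expiring live grant first; return credits not covered."""
    live = [g for g in grants if g.expires_on is None or g.expires_on >= today]
    live.sort(key=lambda g: (g.expires_on is None, g.expires_on or date.max))
    for grant in live:
        spend = min(grant.credits, amount)
        grant.credits -= spend
        amount -= spend
        if amount == 0:
            break
    return amount


def breakage(grants: list[Grant], today: date) -> int:
    """Unused credits sitting on grants that have already expired."""
    return sum(g.credits for g in grants
               if g.expires_on is not None and g.expires_on < today)


grants = [Grant(100, date(2026, 6, 30)), Grant(200, None), Grant(50, date(2026, 12, 31))]
uncovered = draw_down(grants, 60, today=date(2026, 6, 1))
expired_unused = breakage(grants, today=date(2026, 7, 1))
```

Here 60 credits come out of the June grant, and the 40 left on it become breakage on July 1. A single counter can't tell you that.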

Hybrid. The catch-all for recurring base fee plus usage on top, consolidated onto a single invoice. Most enterprise B2B AI deals are shaped this way (e.g., a $50K/quarter platform fee plus tokens-consumed overage). The system runs subscription logic and consumption metering and then reconciles them.

What the billing infrastructure actually has to do

The list is long, and the quality bar for each item is surprisingly hard to reach:

Event ingestion from a warehouse (Snowflake, BigQuery, Redshift) or directly via API. AI products at scale are doing millions of events a day. The pipeline has to handle that without dropping events.
Real-time metering. Enterprise customers expect to watch usage accumulate in their dashboard, not wait for the invoice. For credit-based products this is non-negotiable.
A rating engine that holds custom terms. Per-customer overrides, ramp schedules, mid-cycle plan changes, currency conversion, recalculation when terms change. Pricing exceptions stay attached to subscriptions as structured data, not as PDF amendments stuffed in Salesforce.
Subscription and contract management at the structured-data level. Bulk migrations, i.e., moving hundreds of customers from one plan version to another while preserving per-customer custom terms, is a routine operation in mature systems. In immature ones it's a JIRA epic.
Multi-entity, multi-currency. Different legal entities, different currencies, different tax jurisdictions, different invoice formats, intercompany reconciliation. Becomes a hard requirement the day you sign a customer in a country that wasn't on the roadmap.
Tax. My favorite topic - because you can either go native, or via Avalara, Anrok, Sphere, etc. Usage-based tax is harder than subscription tax because the taxable amount changes every invoice.
Revenue recognition. My second favorite because under ASC 606 / IFRS 15, you recognize revenue as performance obligations are satisfied. For consumption: as usage occurs. For credits: deferred until consumed. Your billing system needs to feed period actuals, deferred revenue balances, and contract modification records into the finance team's stack in a form they can actually use. This is where most billing-system replacements get triggered.
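
For the credits case in that last item, a deliberately simplified sketch of "deferred until consumed". It assumes every credit carries the same value and ignores breakage and contract modifications. The amounts and period labels are made up.

```python
from decimal import Decimal


def recognize(prepaid_amount: Decimal, credits_sold: int,
              consumed_by_period: dict[str, int]) -> tuple[dict[str, Decimal], Decimal]:
    """Return per-period recognized revenue and the remaining deferred balance."""
    value_per_credit = prepaid_amount / credits_sold
    recognized = {period: used * value_per_credit
                  for period, used in consumed_by_period.items()}
    deferred = prepaid_amount - sum(recognized.values(), Decimal("0"))
    return recognized, deferred


# Hypothetical: 1,000 credits sold for $900, consumed over two months.
recognized, deferred = recognize(Decimal("900"), 1_000, {"2026-05": 250, "2026-06": 400})
```

Period actuals (225 and 360) and the deferred balance (315) fall out of the same consumption data the invoice was built from, which is what finance wants to receive.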

Stripe Billing + Metronome vs a purpose-built system like Solvimon

Series B AI companies scaling internationally usually look at two architectures:

Stripe Billing + Metronome. Stripe runs subscriptions, invoicing, and payment collection. Metronome sits above it as the metering, rating, and event processing layer. You're integrating two systems and owning the seams.
Solvimon. Catalog, Metering, Subscriptions, Wallets, Entitlements, Invoicing, Revenue, Workflows in one system. Solvimon connects to Adyen, Stripe, Checkout.com as payment gateways rather than replacing them. (Built by the team that ran billing at Adyen at over €1T in annual payment volume.)

The cost question is the wrong question to lead with because it doesn't matter as much as the architecture:

Capability	Stripe Billing + Metronome	Purpose built system like Solvimon
Usage metering	Metronome (separate system)	Native
Pricing model configuration	Code/API for non-trivial cases	UI, no engineering required
Tax handling	Third-party required	Native but third-party also available
Revenue recognition	Stripe RevRec add-on or separate tool	Period actuals into the finance team's stack
Multi-entity	Engineering work	Native
Payment gateways	Stripe	Adyen, Stripe, Checkout.com
Bulk plan migrations	Engineering work	UI-based bulk operations
Best suited for	Engineering-led teams already on Stripe	Finance and RevOps-led teams, complex enterprise contracts

The Stripe + Metronome stack works when you have engineering capacity to own the seam between the two systems and your pricing changes infrequently. Solvimon makes more sense when finance or RevOps wants to own pricing configuration directly, when enterprise custom terms are accumulating faster than you can wrangle them in code, and when you want pricing iteration to move at the speed of a product manager rather than an engineering sprint.

These are different bets. They're not better-or-worse bets.

What to evaluate when choosing a billing architecture

Forget feature lists. Five questions cut through:

Configuration ownership. Can finance or RevOps change a pricing model directly, or does every change require an engineering ticket? This determines how fast you can respond to a competitor or to model-cost shifts.
Mid-cycle changes. A customer upgrades in week two. How does the system prorate? The right answer is automatic, auditable, consistent across customers.
Credit architecture. Expiration rules, breakage handling, balance freshness, multi-product allocation. Hard to change post-launch.
Bulk operations. Moving customers across plan versions while preserving per-customer custom terms. How the platform handles this tells you whether it was designed for enterprise scale or shipped with the assumption that you'd never need it.
The data pipeline to your finance stack. Where do usage events enter, what's the lag from usage to billing record, and what does reconciliation look like when event counts don't match invoice amounts? This is where most billing-system errors originate. Ask to see the data model before you commit.
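
On the mid-cycle question, this is the kind of calculation you want the system to do the same way for every customer. A sketch assuming day-based proration with an exclusive period end. The prices and dates are invented, and other proration policies exist.

```python
from datetime import date
from decimal import Decimal, ROUND_HALF_UP


def prorate_upgrade(old_price: Decimal, new_price: Decimal,
                    period_start: date, period_end: date, change_date: date) -> Decimal:
    """Charge the price difference for the days left in the period.

    period_end is exclusive; the new price takes effect on change_date.
    """
    total_days = (period_end - period_start).days
    remaining_days = (period_end - change_date).days
    charge = (new_price - old_price) * remaining_days / total_days
    return charge.quantize(Decimal("0.01"), rounding=ROUND_HALF_UP)


# Hypothetical: upgrade from $100 to $250 in week two of a 30-day June cycle.
charge = prorate_upgrade(Decimal("100"), Decimal("250"),
                         date(2026, 6, 1), date(2026, 7, 1), date(2026, 6, 11))
```

With 20 of 30 days remaining, the upgrade charge is 100.00. Ask the vendor to show you where this rule lives and whether it is logged per customer.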

Best subscription billing software for SaaS in 2026: A decision guide

Arnon Shimoni — Mon, 11 May 2026 14:58:41 +0000

tl;dr:

There is no single best subscription billing software for SaaS in 2026, and there never was before that. There is, however, one that is best for your situation. Eight platforms compete in this market and each one wins a specific buyer.

Pure subscription lifecycle: Chargebee, Maxio, plus Recurly if you're B2C
Pure usage: Metronome
Hybrid pricing (seats, usage, credits, outcomes): Solvimon
Merchant of record for global tax: Paddle
Already deep in Stripe: Stripe Billing
Large enterprise multi-product, where money is no object: Zuora

I'll walk you through five questions that decide your branch, and the trade-offs that matter for each platform.

Why this question is harder in 2026

The category split that mattered in 2020, subscription billing versus usage metering, has collapsed. Most modern SaaS companies sell hybrid pricing now. Seats for the dashboard. Usage for the API. Credits for the AI features. Sometimes outcomes for the agent runs.

The platforms that grew up serving one of those models are scrambling to add the others. The ones built native to hybrid have a structural lead. (And the buyers who picked the wrong tool in 2022 are migrating in 2026.)

The decision today is not "which tool has more features." It is "which tool fits the pricing I'll be selling in two years, in the geographies I'll be selling in, with the engineering team I'll have." That's the question this guide answers.

The 5 questions that decide the billing choice

Before evaluating vendors, think about these five questions:

1. What's your pricing model? Pure subscription (seats, plans, tiers), pure usage (API calls, tokens, events), or hybrid (some combination, including credits and outcomes). What is your FUTURE pricing model - is it going to always remain the same?

2. What's your buyer mix? Pure PLG self-serve, pure sales-led contracts, or both. Hybrid GTM is now the default for B2B SaaS over $5M ARR.

3. What's your geographic scope? Single entity, single currency. Or multi-entity, multi-currency, with VAT and GST obligations across regions.

4. What's your tax posture? You handle compliance yourself with a tax engine. Or you want a merchant of record to take it off your plate (and a percentage of revenue with it).

5. What's your engineering posture? Comfortable configuring billing through a vendor dashboard. Or you want billing as code, versioned, scriptable, integrated into CI/CD.

The answers map cleanly to one of the six decision branches below.

The decision tree

If you sell pure subscriptions: Chargebee, Recurly, or Maxio

These three platforms grew up serving pure-subscription B2B SaaS. They handle plan lifecycle, prorations, dunning, and revenue recognition well. They are weaker on usage-based pricing and hybrid models.

Chargebee

The lifecycle leader for mid-market subscription SaaS. Strong on trial-to-paid conversion, plan upgrades and downgrades, multi-currency, and tax. Used by finance and RevOps teams that want defined workflows around billing.

Where it wins: established mid-market SaaS with stable subscription models that change pricing once or twice a year, not weekly.

Where it doesn't: pure usage-based pricing, AI metering, or hybrid models that mix seats with consumption. Chargebee has added usage features over the past two years, but the core architecture is subscription-first.

Pricing: contact sales.

Recurly

The dunning specialist. Recurly's involuntary churn reduction is its defining feature, with smart retry logic that reduces failed-payment churn meaningfully for subscription businesses. ASC 606 revenue recognition is built in. Accounting integrations are deep.

Where it wins: subscription businesses where involuntary churn is the primary revenue leakage problem.

Where it doesn't: anything outside the subscription and dunning core. Limited usage support, no CPQ, narrower feature surface than full-stack platforms.

Pricing: contact sales.

Maxio

Born from the SaaSOptics and Chargify merger. Maxio is the finance-led choice. GAAP-compliant revenue recognition, deep ARR and NRR reporting, churn cohort analysis. CFOs like it because the reporting is audit-ready out of the box.

Where it wins: finance-led B2B SaaS with strict revenue recognition needs and ARR / NRR reporting requirements.

Where it doesn't: API-first or developer-led teams. Maxio is a finance tool that does billing, not a billing platform that does finance.

Pricing: contact sales.

If you sell pure usage or consumption pricing: Metronome

Metronome is purpose-built for consumption billing. Real-time event ingestion, complex rating logic, high-fidelity metering. The customer base skews developer-first: API companies, infra platforms, AI inference providers.

Where it wins: pure usage-based companies that need a metering engine more than a subscription manager.

Where it doesn't: hybrid pricing with subscriptions, CPQ, or multi-entity invoicing. Metronome's scope is narrower than full-stack platforms.

Pricing: contact sales.

If you sell (or plan to sell) hybrid pricing (seats, usage, credits, outcomes): Solvimon

Solvimon was built native to hybrid pricing where seats and usage and credits and outcomes coexist in the same ledger, with no orchestration code stitching subsystems together. The primitive vocabulary is explicit: Catalog (the products and prices), Metering (real-time usage ingestion), Wallets (credit pools and rollovers), Subscriptions (recurring lifecycle), Entitlements (what each customer can access), Invoicing (multi-entity, multi-currency), Revenue (recognition and reporting), Workflows (the connective tissue).

Two architectural choices set Solvimon apart from the rest of this list.

PSP-agnostic - Solvimon works with Stripe, Adyen, and Checkout.com. You can switch payment processor without rebuilding billing. Stripe Billing locks you into Stripe. Solvimon doesn't lock you anywhere.
Headless - configuration happens through the dashboard, the CLI, the API, and an MCP server. Pricing changes ship as code, in pull requests, with tests. Engineers don't sit in dashboards. They ship pricing the same way they ship features.

Where it wins: AI-native companies, hybrid pricing, multi-entity global SaaS, teams that want billing as code rather than billing as a dashboard. The "outgrown your first system" buyer, the one who started on Stripe Billing or Chargebee and hit the wall somewhere between $5M and $15M ARR.

Where it doesn't: simple pure-subscription businesses that will never need usage or hybrid. Chargebee or Recurly will get you there with less surface area.

Customer proof: Yapily migrated 700 plans off custom code in 90 days.

Pricing: free up to $1M billed on the Essential plan, then $1,000/month up to €2M billed and 750K events/month. Solvimon for AI is free up to $3M billed, then 0.4%. Growth and Enterprise tiers are custom. [VERIFY against live /pricing page before publishing.]

If you want a merchant of record for global tax: Paddle

Paddle takes the merchant-of-record role. They become the seller of record in your transactions, which means they handle VAT, GST, US sales tax, and the compliance overhead that comes with selling internationally. You give up a percentage of revenue. You get back the time and risk of running tax compliance yourself.

Where it wins: SMB and mid-market SaaS selling internationally, especially self-serve. The buyer who'd rather pay 5% to Paddle than hire a tax compliance team.

Where it doesn't: enterprise contracts, hybrid pricing, custom CPQ, or any scenario where you need control over the customer relationship. The MoR model trades flexibility for compliance simplicity.

Pricing: revenue share. Contact sales for enterprise tiers.

If you're already deep in the Stripe ecosystem: Stripe Billing

Stripe Billing is the path of least resistance for teams already running Stripe Payments at depth. The integration is tight. Developer experience is strong. For simple recurring subscriptions, the time-to-launch is hard to beat.

Where it wins: PLG SaaS with simple subscription plans, already on Stripe, with no immediate need for hybrid pricing or multi-entity billing.

Where it doesn't: PSP optionality (you're locked to Stripe), enterprise contracts, multi-entity global billing, or hybrid pricing at scale. Teams routinely hit the Stripe Billing wall between $5M and $15M ARR and migrate.

Pricing: 0.5% to 0.8% of billing volume depending on plan. [VERIFY current Stripe Billing pricing tiers.]

If you're large enterprise with multi-product complexity: Zuora

Zuora is the enterprise incumbent. 50+ pricing models, deep compliance reporting, customizable discounting, multi-product billing. The platform CFOs at Fortune 500s default to.

Where it wins: large enterprises with complex multi-product billing, deep compliance requirements, and the budget for a multi-quarter implementation with consultants.

Where it doesn't: anything that needs to ship fast. Zuora is a heavy implementation that can take a year. Total cost of ownership is high. Smaller teams will spend more time configuring than selling.

Pricing: contact sales.

Side-by-side comparison

Platform	Pricing model fit	Best for	Pricing
Solvimon	Hybrid: seats, usage, credits, outcomes	AI-native, scaling SaaS, multi-entity, headless monetization	Free to $1M billed, then custom
Chargebee	Subscription	Mid-market B2B SaaS	Contact sales
Zuora	All models, enterprise-grade	Large enterprise multi-product	Contact sales
Recurly	Subscription with strong dunning	Subscription-first SaaS	Contact sales
Maxio	Subscription with finance reporting	Finance-led B2B SaaS	Contact sales
Stripe Billing	Subscription, simple plans	PLG already on Stripe	0.5% to 0.8% of volume
Paddle	Subscription with MoR	Global self-serve SaaS	Revenue share
Metronome	Pure usage / consumption	Developer-first usage businesses	Contact sales

What's changing in 2026?

The reason this list looks different from a 2022 buyers' guide is that the underlying pricing models have changed: usage-based pricing has become more popular and seat pricing less so.

Three years ago, most SaaS companies sold seats. Billing platforms sold subscription lifecycle and the world was happy and tidy.

As of May 2026 the typical B2B SaaS company has three pricing surfaces in production at once: a self-serve plan (seats), an API or AI feature (usage or credits), and a custom enterprise contract (negotiated mix). On Solvimon, customers have five on average. Companies are rightfully pricing on whichever model fits the surface, not picking one.

That puts pressure on the billing layer because a platform that grew up doing subscriptions has to bolt on usage, while a platform that grew up doing usage has to bolt on subscriptions. The seams show up in the customer experience, the engineering backlog, and the revenue leakage at month-close.

The platforms that win the next decade will be the ones built native to hybrid pricing, with PSP optionality, multi-entity infrastructure, and a configuration model that doesn't require an engineering ticket for every price change. Solvimon's wedge phrase, "billing and payments for companies that have outgrown their first system," names the buyer this market is producing in volume.

If you're picking a platform now, pick for where you'll be in 2028, not where you are today.

How I evaluated these platforms

The criteria that drove this guide:

Pricing model coverage: subscription, usage, credits, outcomes, hybrid combinations
Metering: real-time ingestion, rating logic, event deduplication
CPQ: quote-to-cash, custom contracts, multi-year ramps, tiered discounts
Global capability: multi-entity, multi-currency, VAT and GST tax engines, FX
PSP relationships: locked to one processor, or agnostic
Configuration model: dashboard, API, CLI, code, MCP
Revenue reporting: MRR, ARR, NRR, ASC 606 compliance, deferred revenue
Total cost of ownership: implementation time, engineering lift, pricing
Buyer evidence: G2, Gartner, customer migration patterns

This is not a Gartner Magic Quadrant. There is no single winner. There are six decision branches and eight platforms that win different ones.

FAQs

What is subscription billing software?

Subscription billing software automates recurring invoicing, payment collection, and revenue recognition across the customer lifecycle. Modern platforms also handle usage metering, CPQ, multi-entity invoicing, and hybrid pricing models that combine subscriptions with usage, credits, and outcomes.

What's the difference between subscription billing and usage-based billing?

Subscription billing charges a recurring amount on a defined cycle (monthly, annually). Usage-based billing charges based on consumption events (API calls, tokens, transactions). Hybrid models combine both, often with credits or wallets layered on top. Most modern SaaS uses hybrid in some form.

How is Solvimon different from Stripe Billing?

Stripe Billing is the billing layer of the Stripe payments ecosystem. Solvimon is PSP-agnostic billing infrastructure that works with Adyen, Stripe, or Checkout.com. Stripe Billing is optimized for simple subscription plans on Stripe. Solvimon is optimized for hybrid pricing, multi-entity global billing, and headless monetization.

When do SaaS companies typically migrate from Stripe Billing?

The pattern: companies hit the Stripe Billing wall somewhere between $5M and $15M ARR, usually triggered by one of three needs. Hybrid pricing (adding usage to seats). Multi-entity billing (multiple legal entities or currencies). Or an enterprise contract that Stripe Billing's CPQ can't handle.

What's headless monetization?

Headless monetization is the model where billing is configured the way modern frontends are configured. Through APIs, CLIs, code, and MCP servers, not through dashboards. Pricing changes ship as pull requests, with tests, in CI/CD. Engineers stay in their IDE. Finance still gets the dashboard for reporting. Solvimon is the platform built native to this model.

What's the best free subscription billing software?

The Solvimon Essential plan is free up to $1M billed annually, including 750K events per month and one billing entity. The Solvimon for AI plan extends free usage up to $3M billed.

Which billing platform handles multi-entity global SaaS best?

Solvimon and Zuora handle multi-entity global billing natively. Solvimon ships it from day one with built-in VAT, GST, and FX. Zuora handles it through enterprise configuration with longer implementation cycles. Stripe Billing requires some stitching for multi-entity at scale.