Originally published on the BuildWithHermes blog, where the full per-client P&L table and the 30-minute audit worksheet live.
At 50 active client accounts on a Vapi reseller stack, the average agency is leaving $500 to $3,000 per month on the table in margin that the P&L thinks it is earning. That number comes from Viirtue's 2026 MSP buyer's guide to AI voice billing, which measured a 1.8% to 11.6% margin gap across reseller stacks and concluded that the gap compounds fast at 50 clients.
The cause sits in plain sight. The $0.05/min line on the pricing page is the orchestration fee. It is one of five invoices your stack actually generates, and the real all-in cost lands at $0.15 to $0.36 per minute once STT, LLM, TTS, and Twilio are layered in.
The agencies bleeding at 50 clients are not the ones who picked the wrong platform. They are the ones who priced retainers against a headline rate that only describes one fifth of the bill, and then scaled before the gap surfaced.
The 60,000-minute example
Run the numbers against a representative book. Fifty active accounts, average 1,200 billable minutes per client per month, 60,000 total minutes through the stack.
At the $0.05/min headline rate, your cost-of-service line reads $3,000 for the month. At a $0.23/min real all-in cost for a production GPT-4o setup, the line reads $13,800. The gap is $10,800 of variable cost the P&L did not see.
The lower bound: a tuned stack on GPT-4o mini and Deepgram Aura lands the same 60,000 minutes at $9,000. The delta from headline is still $6,000. That is the 1.8% to 5% margin gap range most agencies actually experience once they discover cheaper model routing.
"A 1.8% to 11.6% margin gap compounds fast. At 50 clients, agencies are looking at $500 to $3,000 per month in lost margin plus 5x more vendor management overhead." — Viirtue, 2026 MSP Buyer's Guide to AI Voice Billing
Where the money actually disappears
Five places. None of them show up on the same statement.
LLM tokens. $0.08 to $0.20 per minute on production conversations. A chatty client on an unbudgeted prompt can double their own LLM line in one month with no change in call volume. The agency absorbs the difference.
TTS. ElevenLabs at $0.18 per 1,000 characters lands at $0.036 to $0.072 per minute on average-verbosity turns. Cheaper catalogs clear about $0.011/min, but switching requires sign-off from the end-client who fell in love with the demo voice.
Concurrency fees. Default plans include 10 concurrent lines; extra lines are billed monthly. At 50 clients running outbound, 40 to 60 extra lines is routine: $400 to $600 a month in fixed cost the self-serve dashboard never warned you about.
Telephony. The per-minute carrier rate is only the base. Carrier filtering, campaign registration, and number rental stack on top.
Reconciliation labor. Five invoices means five logins, five rate cards, five support escalation paths, and five reconciliation tabs at month-end. At 50 clients that runs 8 to 12 founder-hours per month. It is not a cost line on the P&L. It is the most expensive line in the business.
The per-client P&L
Average retainer of $1,500 per client per month, 50 clients, $75,000 gross monthly revenue. Priced against the headline rate, the founder budgets roughly $3,000 of cost of service. The real stack bill, with concurrency, one HIPAA workspace add-on, and reconciliation labor, lands over $16,000. Roll that forward four quarters and the leak is $160,000 of margin against $900,000 of annual revenue. That is the difference between hiring a second AE and capping out at solo.
The 30-minute audit
Pull last month's invoices from every vendor in the stack: orchestration, LLM, TTS, STT, telephony. Sum the total. Divide by total billable minutes. That is your real per-minute cost. Compare it to the per-minute assumption inside your retainer pricing. The gap, times monthly minutes, is your monthly leak. Most owners who run this exercise find a number between $500 and $3,000, matching the Viirtue range.
The structural fix
The leak is not a discipline problem. It is an architecture problem: five vendors, five margins stacked against you, and no single system of record. The fix is consolidating to one platform with one invoice and transparent per-minute pricing, so cost of service is a number you read instead of a number you reconstruct.
That is the design principle behind Hermes: one platform from $149/month, included minutes per plan ($149/300 min, $399/1,000 min, $699/2,000 min), $0.24/min overage, and every client call metered in one usage ledger under your own brand.
Full P&L table, vendor line-item sources, and the audit worksheet: buildwithhermes.com.
Top comments (0)