Alfredo Romero

Originally published at buildwithhermes.com

Multi-Tenant Voice AI: Isolate Client Numbers, Billing, and Offboarding

Originally published on the BuildWithHermes blog. If you run more than three clients on a single Vapi or Retell account, you have a multi-tenancy problem you cannot see yet. Here is the architecture that prevents it.

Most AI voice agencies are running a fundamentally broken architecture and do not know it yet. One Vapi account. One Retell account. Five clients sharing it. The phone numbers belong to whoever provisioned them. Billing is a single invoice with no per-client attribution. And if a client cancels, nobody has a clean plan for what happens to their call recordings, their number, or their contact data.

This is not a software problem. It is an architecture problem. It does not become visible until client number five, when something goes wrong: a knowledge base update for one client bleeds into another agent, a cancellation turns into a dispute over who owns the phone number, or a GDPR deletion request forces you to hunt through a shared account trying to find everything that belongs to one company.

The actual multi-tenancy problem

Multi-tenancy means running one platform that serves multiple independent customers, where each customer's data, configuration, and resources are completely isolated from every other customer's. In traditional SaaS this is table stakes. In a voice AI agency using off-the-shelf infrastructure tools, it is almost never handled correctly, because Vapi and Retell were built for developers shipping a single-tenant product, not for an operator managing agents on behalf of many end clients.

Agencies on DIY stacks hit three specific failure modes once they grow past three clients:

Failure mode 1: data crossover. On a shared account, all agent configs live in the same namespace. A knowledge base update intended for Client A can land on the wrong agent and immediately affect Client B's calls. The deeper version is session data: poor session management in shared voice systems can surface one customer's information inside another's conversation. When Client A is a dental clinic and Client B is a law firm, that crossover is a breach.

Failure mode 2: billing opacity. One upstream invoice with no per-client attribution means you cannot answer the most basic operating question: which client cost you what this month? You end up reverse-engineering usage from call logs, and your retainer margins are a guess.
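
To make the attribution question concrete, here is a minimal sketch of what per-client cost looks like once every call carries a client tag. The `CallRecord` shape, the `client_id` field, and the per-minute rate are assumptions for illustration, not a Vapi or Retell schema or price.

```python
from collections import defaultdict
from dataclasses import dataclass
from decimal import Decimal


@dataclass(frozen=True)
class CallRecord:
    client_id: str             # hypothetical tag attached when the call is placed
    seconds: int               # billed call duration in seconds
    rate_per_minute: Decimal   # assumed upstream rate, not a real provider price


def cost_per_client(calls: list[CallRecord]) -> dict[str, Decimal]:
    """Sum upstream cost per client from tagged call records."""
    totals: dict[str, Decimal] = defaultdict(Decimal)
    for call in calls:
        minutes = Decimal(call.seconds) / Decimal(60)
        totals[call.client_id] += minutes * call.rate_per_minute
    # Round to cents only once, at the end, to avoid accumulating rounding error.
    return {client: total.quantize(Decimal("0.01")) for client, total in totals.items()}


calls = [
    CallRecord("dental-clinic", 180, Decimal("0.10")),
    CallRecord("law-firm", 600, Decimal("0.10")),
    CallRecord("dental-clinic", 120, Decimal("0.10")),
]
print(cost_per_client(calls))
```

Without the `client_id` tag on each record, the same totals have to be reconstructed by matching phone numbers or agent names in call logs, which is exactly the guesswork described above.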

Failure mode 3: impossible clean offboarding. When a client cancels on a shared account, there is no clean cut. Their number, recordings, and contacts are tangled into the same account as everyone else, so deletion is manual, error-prone, and a compliance liability.

What proper multi-tenant architecture looks like

Three layers of isolation, by design:

  1. Workspace isolation. Each client gets a fully isolated workspace: separate agents, knowledge bases, contacts, and config in their own namespace. A change in one workspace cannot touch another.
  2. Per-client phone number ownership. Numbers are provisioned to a specific client workspace, with clear ownership, so a cancellation does not become a dispute over who the number belongs to.
  3. Independent usage metering. Every call, minute, and resource is metered per workspace, so per-client cost and per-client billing are exact, not reconstructed.
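
The three layers above can be sketched in a few lines. This is an illustrative in-memory model, not how Hermes, Vapi, or Retell implement isolation; `Platform`, `Workspace`, `CrossWorkspaceError`, and every method name are hypothetical.

```python
from dataclasses import dataclass, field


class CrossWorkspaceError(Exception):
    """Raised when an operation would touch another client's resource."""


@dataclass
class Workspace:
    client_id: str
    knowledge_bases: dict[str, str] = field(default_factory=dict)
    phone_numbers: set[str] = field(default_factory=set)
    metered_seconds: int = 0


class Platform:
    def __init__(self) -> None:
        self._workspaces: dict[str, Workspace] = {}
        self._number_owner: dict[str, str] = {}  # phone number -> client_id

    def create_workspace(self, client_id: str) -> Workspace:
        if client_id in self._workspaces:
            raise ValueError(f"workspace {client_id!r} already exists")
        self._workspaces[client_id] = Workspace(client_id)
        return self._workspaces[client_id]

    def update_knowledge_base(self, client_id: str, kb_id: str, content: str) -> None:
        # Layer 1: every write is resolved inside one workspace, so the same
        # kb_id in another workspace is a different, unreachable object.
        self._workspaces[client_id].knowledge_bases[kb_id] = content

    def assign_number(self, client_id: str, number: str) -> None:
        # Layer 2: a number has exactly one owning workspace.
        owner = self._number_owner.get(number)
        if owner is not None and owner != client_id:
            raise CrossWorkspaceError(f"{number} already belongs to {owner!r}")
        self._workspaces[client_id].phone_numbers.add(number)
        self._number_owner[number] = client_id

    def record_call(self, number: str, seconds: int) -> str:
        # Layer 3: usage is metered on the workspace that owns the number.
        client_id = self._number_owner[number]
        self._workspaces[client_id].metered_seconds += seconds
        return client_id

    def workspace(self, client_id: str) -> Workspace:
        return self._workspaces[client_id]


platform = Platform()
platform.create_workspace("dental-clinic")
platform.create_workspace("law-firm")
platform.update_knowledge_base("dental-clinic", "faq", "Open 9 to 5")
platform.update_knowledge_base("law-firm", "faq", "Consultations by appointment")
platform.assign_number("dental-clinic", "+15550100")
platform.record_call("+15550100", 240)
```

Both clients use the knowledge base id `faq` here, yet neither update can land on the other's agent, because the workspace is part of every lookup rather than a naming convention.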

A clean offboarding procedure

When a client leaves, the architecture should make these five steps self-contained, not a manual hunt across a shared account:

  1. Data export: hand the client their recordings, transcripts, and contacts.
  2. Number transition: port or release the number cleanly, since ownership was explicit from day one.
  3. Data deletion confirmation: delete the workspace and confirm in writing for GDPR/compliance.
  4. Access revocation: revoke credentials tied only to that workspace.
  5. Billing settlement: final per-client invoice, exact because metering was per-workspace all along.
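
As a rough sketch of why isolation makes these steps contained, the function below walks the five steps in order against a single workspace and returns a report. The `ClientWorkspace` fields, the `offboard` function, and the cents-per-minute rate are all hypothetical; a real procedure would call your provider's export, porting, and deletion endpoints.

```python
from dataclasses import dataclass, field


@dataclass
class ClientWorkspace:
    client_id: str
    recordings: list[str] = field(default_factory=list)
    contacts: list[str] = field(default_factory=list)
    phone_numbers: list[str] = field(default_factory=list)
    api_keys: list[str] = field(default_factory=list)
    metered_minutes: int = 0


def offboard(
    workspaces: dict[str, ClientWorkspace],
    client_id: str,
    rate_per_minute_cents: int,
) -> dict:
    """Run the five offboarding steps against one workspace only."""
    ws = workspaces[client_id]
    report: dict = {"client_id": client_id}

    # 1. Data export: copy out what the client is entitled to receive.
    report["export"] = {"recordings": list(ws.recordings), "contacts": list(ws.contacts)}

    # 2. Number transition: release (or hand off for porting) the owned numbers.
    report["numbers_released"] = list(ws.phone_numbers)
    ws.phone_numbers.clear()

    # 3. Data deletion confirmation: drop the workspace and record that it is gone.
    del workspaces[client_id]
    report["deleted"] = client_id not in workspaces

    # 4. Access revocation: invalidate credentials scoped to this workspace.
    report["keys_revoked"] = len(ws.api_keys)
    ws.api_keys.clear()

    # 5. Billing settlement: final invoice from the workspace's own meter.
    report["final_invoice_cents"] = ws.metered_minutes * rate_per_minute_cents
    return report


workspaces = {
    "dental-clinic": ClientWorkspace(
        "dental-clinic",
        recordings=["call-001.wav"],
        contacts=["patient@example.com"],
        phone_numbers=["+15550100"],
        api_keys=["key-dental"],
        metered_minutes=42,
    ),
    "law-firm": ClientWorkspace("law-firm", phone_numbers=["+15550101"], metered_minutes=10),
}
report = offboard(workspaces, "dental-clinic", rate_per_minute_cents=10)
```

Nothing in `offboard` reads or writes any workspace other than the one being removed, which is the property a shared account cannot give you.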

True multi-tenancy is what makes offboarding one client a single, contained action instead of an emergency.

Where BuildWithHermes fits

Hermes was built with native workspace isolation from day one: isolated client workspaces, per-client number ownership, and independent per-workspace usage metering, so data crossover, billing opacity, and messy offboarding are designed out rather than patched over. One platform, your brand, multi-tenant by default. Starter is $149/month.

Full architecture breakdown, the DIY-vs-native cost comparison, and the 10-client reference design: buildwithhermes.com
