KinthAI

Posted on Apr 28 • Originally published at blog.kinthai.ai

OpenClaw Multi-Tenancy: Why a VM Per User Does Not Scale (and What Does)

#openclaw #multitenancy #ai #infrastructure

Vanilla OpenClaw runs as a single-tenant system. One user, one instance, one VM. For a small group — 5 to 30 people — this works. Beyond 30-50 users, it falls apart. Here is why, and what actual multi-tenancy looks like.

The "Use a VM" Answer Is Technically Correct

You can absolutely give each user their own OpenClaw VM. It will work. But four things go wrong as you scale:

1. Fixed Costs Regardless of Usage

A VM costs $5-15/month at standard cloud pricing whether the user talks to their agent daily or abandoned it after day one. At 100 users, you are paying $500-1,500/month. At 1,000 users, $5,000-15,000/month. Most of those VMs are idle most of the time.
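The arithmetic above can be reproduced with a few lines of Python. This is only a sketch: the $5-15 price range comes from the paragraph above, while the `active_fraction` parameter is a hypothetical input for illustration, not a measured figure.

```python
def fleet_cost(users: int, low: float = 5.0, high: float = 15.0) -> tuple[float, float]:
    """Monthly cost range for one always-on VM per user."""
    return users * low, users * high


def cost_per_active_user(users: int, active_fraction: float, vm_price: float) -> float:
    """Effective monthly cost per user who actually uses their agent.

    active_fraction is a hypothetical input; plug in your own retention data.
    """
    if not 0 < active_fraction <= 1:
        raise ValueError("active_fraction must be in (0, 1]")
    return (users * vm_price) / (users * active_fraction)


for n in (100, 1000):
    lo, hi = fleet_cost(n)
    print(f"{n} users: ${lo:,.0f}-{hi:,.0f}/month")
# 100 users: $500-1,500/month
# 1000 users: $5,000-15,000/month
```

The second function shows why idle VMs hurt: the fewer users stay active, the more each active user effectively costs, because the bill does not shrink with them.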

2. Onboarding Friction Kills Conversion

The setup sequence: create account → provision VM → install OpenClaw → configure provider API keys → create SOUL.md → initialize gateway. Most users drop off at the provisioning step. The gap between "I want to try this" and "I am talking to my agent" should be seconds, not minutes.

3. Maintenance Across N Installations

With 100 separate OpenClaw instances, every update has to reach all of them, and there is no central way to push patches. Most users will never upgrade on their own, so you end up with a fleet of stale, vulnerable installations.

4. Cross-Tenant Features Become Impossible

Agent marketplaces, shared skill libraries, agent-to-agent communication — none of these work across isolated VMs. If Agent A lives on VM-1 and Agent B lives on VM-2, they cannot collaborate without a networking layer that defeats the purpose of isolation.

What Real Multi-Tenancy Actually Requires

Multi-tenancy is not "put everyone on the same server." It is five distinct engineering problems:

Tenant Identity Propagation

Every API call, every file operation, every memory query must carry a tenant_id. File operations must be restricted to /workspace/<tenant_id>/. Missing a single code path creates a data leak vulnerability.
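A minimal sketch of the file-operation restriction, assuming the `/workspace/<tenant_id>/` layout described above. The function and exception names are hypothetical and are not part of OpenClaw; the point is that every file access goes through one choke point that resolves the path before checking it.

```python
from pathlib import Path

# Root taken from the /workspace/<tenant_id>/ layout described in the text.
WORKSPACE_ROOT = Path("/workspace")


class TenantPathError(PermissionError):
    """Raised when a path would escape the tenant's workspace."""


def resolve_tenant_path(tenant_id: str, user_path: str,
                        root: Path = WORKSPACE_ROOT) -> Path:
    """Return an absolute path guaranteed to sit inside root/<tenant_id>/."""
    # Reject tenant ids that could themselves contain path tricks.
    if not tenant_id or not tenant_id.replace("-", "").replace("_", "").isalnum():
        raise TenantPathError(f"invalid tenant id: {tenant_id!r}")

    tenant_root = (root / tenant_id).resolve()
    # resolve() collapses ".." segments and follows existing symlinks,
    # so the check below runs against the real target.
    candidate = (tenant_root / user_path).resolve()

    if not candidate.is_relative_to(tenant_root):  # Python 3.9+
        raise TenantPathError(f"{user_path!r} escapes the tenant workspace")
    return candidate
```

Calling `resolve_tenant_path("tenant-a", "../tenant-b/secret.txt")` raises `TenantPathError`, as does an absolute path such as `/etc/passwd`. The check only helps if no code path opens files without going through it, which is exactly the failure mode described above.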

Resource Quotas

Token budgets, CPU/memory caps, and rate limiting — all per tenant, not per agent. An agent-level budget is easy to game (create more agents). Tenant-level aggregate spending is what actually matters.
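One way to picture tenant-level aggregation is a budget object that tracks usage per agent but enforces the limit on the sum. This is a hedged sketch with hypothetical names (`TenantBudget`, `BudgetExceeded`), and it is not safe for concurrent use without a lock or a transactional store.

```python
from collections import defaultdict
from dataclasses import dataclass, field


class BudgetExceeded(Exception):
    """Raised when a charge would push the tenant past its limit."""


@dataclass
class TenantBudget:
    monthly_limit: int  # tokens allowed per month for the whole tenant
    used_by_agent: dict[str, int] = field(default_factory=lambda: defaultdict(int))

    @property
    def used(self) -> int:
        # The limit applies to the aggregate, not to any single agent.
        return sum(self.used_by_agent.values())

    def charge(self, agent_id: str, tokens: int) -> int:
        """Record usage for one agent and return the tenant's remaining tokens."""
        if tokens < 0:
            raise ValueError("tokens must be non-negative")
        if self.used + tokens > self.monthly_limit:
            raise BudgetExceeded(
                f"{self.used} used + {tokens} requested > {self.monthly_limit}"
            )
        self.used_by_agent[agent_id] += tokens
        return self.monthly_limit - self.used


budget = TenantBudget(monthly_limit=1000)
budget.charge("agent-1", 600)      # 400 tokens left for the tenant
# budget.charge("agent-2", 500)    # would raise BudgetExceeded
```

Creating a second agent does not help the tenant here: `agent-2` draws from the same 1,000-token pool as `agent-1`.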

Authentication and Authorization

Two levels: platform operations (deploy, install plugins, manage billing) and tenant operations (chat with agent, configure personal settings). OpenClaw's session model was not designed for this distinction.
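The two levels could be modeled roughly as follows. The operation names mirror the examples in the paragraph above; the `Principal` type and `is_allowed` function are assumptions made for illustration and do not reflect OpenClaw's session model.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Optional


class Scope(Enum):
    PLATFORM = "platform"
    TENANT = "tenant"


# Operation names follow the examples in the text; extend as needed.
OPERATION_SCOPE = {
    "deploy": Scope.PLATFORM,
    "install_plugin": Scope.PLATFORM,
    "manage_billing": Scope.PLATFORM,
    "chat": Scope.TENANT,
    "configure_settings": Scope.TENANT,
}


@dataclass(frozen=True)
class Principal:
    user_id: str
    tenant_id: str
    is_platform_admin: bool = False


def is_allowed(principal: Principal, operation: str,
               target_tenant_id: Optional[str] = None) -> bool:
    """Platform operations need the admin flag; tenant operations need a tenant match."""
    scope = OPERATION_SCOPE.get(operation)
    if scope is None:
        return False  # deny anything not explicitly listed
    if scope is Scope.PLATFORM:
        return principal.is_platform_admin
    return target_tenant_id == principal.tenant_id
```

In this sketch a platform admin gets no implicit access to other tenants' chats. Whether operators should be able to cross that line is a policy decision the sketch deliberately leaves closed by default.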

Data Isolation

Separate storage for: workspace files, memory indexes (vector stores), conversation sessions, and plugin state. A memory search for Tenant A must never return Tenant B's data, even if the embeddings are similar.
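For the memory index specifically, one simple form of isolation is a separate database file per tenant, so a query physically cannot touch another tenant's rows. The sketch below uses plain SQLite with a `LIKE` search to stay dependency-free; the class name and schema are hypothetical, and a real deployment would likely use a full-text or vector index instead.

```python
import sqlite3
from pathlib import Path


class TenantMemoryStore:
    """One SQLite file per tenant under base_dir (hypothetical layout)."""

    def __init__(self, base_dir: Path):
        self.base_dir = base_dir
        self._conns: dict[str, sqlite3.Connection] = {}

    def _conn(self, tenant_id: str) -> sqlite3.Connection:
        # Strict ids keep the tenant name from altering the file path.
        if not tenant_id.isalnum():
            raise ValueError(f"invalid tenant id: {tenant_id!r}")
        if tenant_id not in self._conns:
            conn = sqlite3.connect(self.base_dir / f"{tenant_id}.db")
            conn.execute(
                "CREATE TABLE IF NOT EXISTS memories "
                "(id INTEGER PRIMARY KEY, text TEXT NOT NULL)"
            )
            self._conns[tenant_id] = conn
        return self._conns[tenant_id]

    def add(self, tenant_id: str, text: str) -> None:
        conn = self._conn(tenant_id)
        with conn:  # commits on success, rolls back on error
            conn.execute("INSERT INTO memories (text) VALUES (?)", (text,))

    def search(self, tenant_id: str, keyword: str) -> list[str]:
        rows = self._conn(tenant_id).execute(
            "SELECT text FROM memories WHERE text LIKE ? ORDER BY id",
            (f"%{keyword}%",),
        ).fetchall()
        return [row[0] for row in rows]
```

Because the tenant id selects the database rather than a `WHERE` clause, forgetting a filter in a query cannot return another tenant's memories. The trade-off is more files and connections to manage.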

Operational Tooling

Monitoring, logging, and metrics sliced by tenant. When something breaks at 3am, you need to know which tenant is affected, not just which server.
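A small sketch of tenant-sliced logging using only Python's standard library. The context variable, logger name, and log format are assumptions; the idea is that the tenant is attached to every record automatically instead of relying on each call site to include it.

```python
import contextvars
import logging

# Set once per request; "-" marks records emitted outside any tenant context.
current_tenant = contextvars.ContextVar("current_tenant", default="-")


class TenantFilter(logging.Filter):
    """Attach the current tenant id to every log record."""

    def filter(self, record: logging.LogRecord) -> bool:
        record.tenant_id = current_tenant.get()
        return True  # never drop records, only annotate them


def build_logger(name: str = "tenant_demo", stream=None) -> logging.Logger:
    handler = logging.StreamHandler(stream)  # defaults to stderr
    handler.addFilter(TenantFilter())
    handler.setFormatter(
        logging.Formatter("%(levelname)s tenant=%(tenant_id)s %(message)s")
    )
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


log = build_logger()
token = current_tenant.set("tenant-a")
log.info("memory search failed")   # INFO tenant=tenant-a memory search failed
current_tenant.reset(token)
```

With the tenant on every line, "which tenant is affected" becomes a grep or a log-query filter rather than a forensic exercise.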

Implementation Effort

Component	Timeline	Primary Challenge
Tenant identity propagation	1-2 weeks	Missing code paths = security holes
Per-tenant token budgets	1-2 weeks	Agent-level budgets fail; tenant aggregation required
Container/resource limits	1 week	OS-level configuration
Authentication layer	2-3 weeks	OpenClaw session model vs. identity model conflict
Per-tenant plugin state	Variable	Plugin-dependent complexity
Operational tooling	1-2 weeks	Under-investment creates ops pain later
Total	~2 months	Long-tail edge cases dominate

The breakeven point is roughly 30-50 users. Below that, VMs are fine. Above that, multi-tenancy is clearly worth the engineering investment.

The Memory Problem Makes It Worse

Multi-tenancy is not just about resource isolation — it is about memory isolation. When agents have persistent memory (and they should — see why Character.AI forgets you), the isolation requirements multiply.

A persistent memory system has five components:

Memory store — where memories live (vector DB, SQLite+FTS5, etc.)
Retrieval — how memories are fetched (embedding similarity, keyword, hybrid)
Writeback — how new memories are created from conversations
Conflict resolution — what happens when new information contradicts stored memory
User isolation — ensuring User A's memories are never accessible to User B

Component 5 is trivial in a single-tenant VM, where there is only one user. In a multi-tenant system, it requires partition-level enforcement at the storage layer, not just query-time filtering. A naive implementation that runs the similarity search over every user's embeddings and filters by user_id afterwards still scores the query against other users' memories. Those memories compete in the nearest-neighbor ranking, where they can crowd out the caller's own results, and any bug in the filter step returns another user's data directly.
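The difference between filtering after retrieval and searching within a partition can be shown with a toy example. The embeddings and memory texts below are made up for illustration, and the partition here is just a list comprehension; a real system would enforce it in the storage layer.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm


# (tenant_id, text, embedding); toy 2-d vectors, purely illustrative.
MEMORIES = [
    ("tenant-a", "likes hiking", [1.0, 0.0]),
    ("tenant-a", "works in finance", [0.0, 1.0]),
    ("tenant-b", "likes climbing", [0.9, 0.1]),
    ("tenant-b", "likes trail running", [0.95, 0.05]),
]


def search_post_filter(query: list[float], tenant_id: str, k: int) -> list[str]:
    """Naive: rank ALL tenants' memories, take top-k, then filter."""
    ranked = sorted(MEMORIES, key=lambda m: cosine(query, m[2]), reverse=True)
    return [text for tenant, text, _ in ranked[:k] if tenant == tenant_id]


def search_partitioned(query: list[float], tenant_id: str, k: int) -> list[str]:
    """Safer: restrict to the tenant's partition BEFORE ranking."""
    partition = [m for m in MEMORIES if m[0] == tenant_id]
    ranked = sorted(partition, key=lambda m: cosine(query, m[2]), reverse=True)
    return [text for _, text, _ in ranked[:k]]


query = [0.9, 0.1]
print(search_post_filter(query, "tenant-a", k=2))   # []
print(search_partitioned(query, "tenant-a", k=2))   # ['likes hiking', 'works in finance']
```

In the naive version, both top-2 slots go to tenant-b's memories, so tenant-a gets nothing back even though it has a relevant memory. Tenant-b's data was also loaded and scored on tenant-a's behalf, which is the exposure the partitioned version avoids.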

Managed Alternatives

If you do not want to build multi-tenancy yourself, several managed options exist:

CrewClaw — agent template deployment, message-based pricing
ClawAgora — marketplace-style agent hosting
ClawCloud / ClawRunway / OpenClaw Cloud — managed per-VM hosting (not true multi-tenancy)
KinthAI (agents.kinthai.ai) — native multi-tenancy with persistent memory, agent marketplace, and multi-agent collaboration

The distinction matters: managed per-VM hosting solves the operational burden but not the scaling economics or cross-tenant features. True multi-tenancy solves all three.

Choose Based on Your Scale

< 30 users: Per-VM is fine. Use ClawCloud or self-host.
30-500 users: You need multi-tenancy. Build it (~2 months) or use a platform that has it.
500+ users: Multi-tenancy is non-negotiable. The economics of per-VM will eat your runway.

We built KinthAI because we wanted to deliver agents that remembered users, learned from them, and could collaborate with other agents — at consumer scale. That required solving multi-tenancy at the infrastructure layer, not bolting it on later.

