“Scaling like an adult instead of just adding more pods”
You know that phase where a product “kind of works”, traffic is growing, infra is… fine-ish, and suddenly someone asks:
“Can we onboard this giant enterprise customer who needs data residency in three regions and hard isolation?”
That’s the moment a lot of teams discover deployment stamps.
This post is my attempt to demystify the Deployment Stamp pattern: what it is, when it actually makes sense, and what trade-offs you’re signing up for.
What are Deployment Stamps, really?
In the official definition, a deployment stamp is a repeatable unit of deployment that contains a full copy of your application stack: compute, storage, networking, data stores, everything. You then deploy multiple such stamps to scale and isolate workloads or tenants.
Each stamp is sometimes called a scale unit, service unit, or cell. In a multi-tenant setup, each stamp usually hosts a subset of tenants, up to some capacity limit. When you need to scale, you don’t resize the one big thing — you rubber-stamp a new copy.
Mentally, think:
- Not: “One giant cluster + bigger database”
- But: “Many small, identical cities instead of a single mega-city”
Each “city” has its own roads (networking), buildings (services), and utility grid (databases, caches). If one city loses power, the others keep running.
Why not just scale horizontally like everyone else?
Classic horizontal scaling usually means:
- One shared database (or shard set)
- One or a few big clusters
- Global queues and caches
- All tenants logically mixed in the same infra
That works great… until:
- Cloud limits hit you: Per-subscription / per-region quotas on resources, IPs, CPU, databases, etc., are very real. At scale, these become hard ceilings.
- Noisy neighbors ruin someone’s day: A single bad tenant (or a big batch job) can impact everyone because they share infra.
- Regulatory and data residency constraints: Some customers require their data to stay in a given region or have strict isolation demands (e.g., finance, healthcare).
- Blast radius gets scary: A bad deploy, schema migration, or infra failure affects everyone in your global environment.
Deployment stamps attack all of these by saying: “What if we make multiple, isolated deployments of the entire platform and route tenants between them?”
The core idea: a control plane + many stamps
A typical deployment-stamp architecture has two big pieces:
- Control plane (brain)
- Knows which tenants live in which stamp
- Handles provisioning new stamps from a template
- Manages routing, observability, and compliance policies
- Data plane stamps (muscle)
- Each stamp is a full copy of your app stack
- Runs workloads for some bounded number of tenants
- Has its own databases, caches, and usually its own network boundary
We can sketch the routing logic in Python-ish pseudocode:
```python
from dataclasses import dataclass

@dataclass
class Stamp:
    name: str
    region: str
    capacity: int
    current_tenants: set

    def has_capacity(self) -> bool:
        return len(self.current_tenants) < self.capacity

# In reality this lives in a control-plane service + DB
STAMPS = [
    Stamp(name="stamp-eu-1", region="westeurope", capacity=200, current_tenants=set()),
    Stamp(name="stamp-us-1", region="eastus", capacity=300, current_tenants=set()),
]

TENANT_TO_STAMP = {}  # tenant_id -> stamp_name

def assign_stamp_for_tenant(tenant_id: str, region_pref: str | None = None) -> Stamp:
    # Already assigned?
    if tenant_id in TENANT_TO_STAMP:
        name = TENANT_TO_STAMP[tenant_id]
        return next(s for s in STAMPS if s.name == name)

    # Pick a stamp that matches region preference and has capacity
    candidates = [s for s in STAMPS if s.has_capacity()]
    if region_pref:
        regional = [s for s in candidates if s.region == region_pref]
        if regional:
            candidates = regional

    if not candidates:
        raise RuntimeError("No stamps available: time to deploy a new one!")

    chosen = min(candidates, key=lambda s: len(s.current_tenants))
    TENANT_TO_STAMP[tenant_id] = chosen.name
    chosen.current_tenants.add(tenant_id)
    return chosen
```
This tiny snippet hides a lot of reality, but it captures the pattern:
- Tenants are bound to stamps.
- Stamps are capacity-bounded.
- When we run out, we stamp out another one.
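To make “stamp out another one” slightly more concrete, here is a hypothetical continuation of the same pseudocode. Everything in it (the counter, the naming scheme, provision_new_stamp) is illustrative; a real control plane would kick off an infra-as-code pipeline and only register the new stamp once it reports healthy.

```python
# Continues the Stamp / STAMPS definitions from the snippet above.
# Illustrative only: a real control plane would run an IaC pipeline here.
STAMP_COUNTERS: dict[str, int] = {"westeurope": 1, "eastus": 1}

def provision_new_stamp(region: str, capacity: int = 200) -> Stamp:
    """Hypothetical 'stamp out another copy' action."""
    STAMP_COUNTERS[region] = STAMP_COUNTERS.get(region, 0) + 1
    stamp = Stamp(
        name=f"stamp-{region}-{STAMP_COUNTERS[region]}",
        region=region,
        capacity=capacity,
        current_tenants=set(),
    )
    STAMPS.append(stamp)  # now visible to assign_stamp_for_tenant
    return stamp
```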
What actually lives inside a stamp?
At minimum, a stamp usually contains:
- Application services (containers, functions, VMs, whatever you use)
- API gateways / load balancers
- Data stores: relational DBs, NoSQL, caches, search clusters
- Observability stack: logs, metrics, traces (or at least exporters)
- Networking boundaries: VPC/VNet, subnets, firewall rules
- Compliance / security controls specific to that region or customer segment
The important part: stamps are deployed from the same template (Bicep, Terraform, Pulumi, CDK, etc.). No snowflake stamps. If stamp N+1 is bespoke, you’ve lost the pattern.
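To illustrate the “same template, different parameters” idea without picking a specific IaC tool, here’s a toy sketch in plain Python. StampTemplate, the SKU string, and the node count are all made-up placeholders; the real thing would be a Bicep/Terraform/Pulumi/CDK module.

```python
# Illustrative only: a stamp "template" expressed as plain Python.
# Every stamp is rendered from the same definition; only a few parameters vary.
from dataclasses import dataclass

@dataclass(frozen=True)
class StampTemplate:
    app_version: str
    db_sku: str
    node_count: int

def render_stamp_config(template: StampTemplate, name: str, region: str) -> dict:
    """Produce one stamp's concrete config. Only name and region differ."""
    return {
        "name": name,
        "region": region,
        "app_version": template.app_version,
        "database": {"sku": template.db_sku},
        "cluster": {"nodes": template.node_count},
    }

TEMPLATE = StampTemplate(app_version="1.4.0", db_sku="GP_Gen5_8", node_count=6)
eu_stamp = render_stamp_config(TEMPLATE, "stamp-eu-2", "westeurope")
us_stamp = render_stamp_config(TEMPLATE, "stamp-us-2", "eastus")
# Same template, different parameters: no snowflake stamps.
```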
How routing works (without turning into spaghetti)
When a request hits your platform, you typically have a global front door:
- Request arrives at a global entry point (DNS, CDN, anycast gateway).
- Auth happens (or at least token parsing).
- The system identifies the tenant.
- The control plane looks up: tenant_id -> stamp.
- The request is forwarded to the right stamp’s internal endpoint.
You can implement routing in different ways:
- Centralized router: One service routes everything to the correct stamp. Easier to reason about, but you must keep it lean.
- Stamp-aware DNS: Resolve tenant-specific hostnames (tenant123.app.com) directly to the stamp front door.
- Token-encoded stamp: After initial login, the client receives a base URL or claim indicating which stamp to call directly.
The main invariant: once a tenant is bound to a stamp, all of its traffic and data should flow there. Cross-stamp traffic should be the exception, not the norm.
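As a rough sketch of the centralized-router option, reusing TENANT_TO_STAMP from the earlier snippet: STAMP_ENDPOINTS, extract_tenant_id, and the URLs below are placeholders, not a real gateway implementation.

```python
# Minimal sketch of a centralized router. TENANT_TO_STAMP is the mapping
# from the earlier snippet (in reality a control-plane lookup, not a dict).
STAMP_ENDPOINTS = {
    "stamp-eu-1": "https://stamp-eu-1.internal.example.com",
    "stamp-us-1": "https://stamp-us-1.internal.example.com",
}

def extract_tenant_id(auth_token: str) -> str:
    # Placeholder: in practice this comes from a validated token claim.
    return auth_token.split(":", 1)[0]

def route_request(auth_token: str, path: str) -> str:
    """Return the stamp-local URL this request should be forwarded to."""
    tenant_id = extract_tenant_id(auth_token)
    stamp_name = TENANT_TO_STAMP.get(tenant_id)
    if stamp_name is None:
        raise LookupError(f"Tenant {tenant_id!r} is not bound to any stamp")
    return f"{STAMP_ENDPOINTS[stamp_name]}{path}"
```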
Why they’re not a silver bullet
This pattern is powerful, but it’s not free.
1. Operational overhead
You now have N copies of:
- Databases
- Clusters
- Diagnostics
- Secrets, keys, certs
Without heavy automation, this turns into SRE misery. The official pattern docs explicitly stress that deployment stamps assume infrastructure-as-code, automated rollout, and centralized monitoring.
2. Cross-stamp analytics is harder
Want a query across all tenants? That means aggregating data from multiple stamps:
- Centralized data lake fed by per-stamp ETL
- Or federated queries across per-stamp warehouses
Either way, “just run a query against the main DB” is gone.
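A hedged sketch of what that fan-out looks like in practice: query_stamp stands in for whatever per-stamp reporting API or warehouse query you’d actually expose.

```python
# Illustrative fan-out: ask every stamp the same question, merge centrally.
from concurrent.futures import ThreadPoolExecutor

def query_stamp(stamp_name: str) -> dict[str, int]:
    # Placeholder: e.g. call a per-stamp reporting endpoint or warehouse.
    return {"active_tenants": 0, "monthly_requests": 0}

def cross_stamp_report(stamp_names: list[str]) -> dict[str, int]:
    """Aggregate metrics across all stamps; there is no single 'main DB'."""
    totals = {"active_tenants": 0, "monthly_requests": 0}
    with ThreadPoolExecutor() as pool:
        for per_stamp in pool.map(query_stamp, stamp_names):
            for key, value in per_stamp.items():
                totals[key] += value
    return totals

print(cross_stamp_report(["stamp-eu-1", "stamp-us-1"]))
```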
3. Version drift risk
If you don’t manage deployments carefully:
- Stamp A is on version 1.3
- Stamp B is on 1.4 with a DB migration
- Stamp C is on 1.2 because someone paused rollout
Now debugging becomes archaeology. Blue/green or canary strategies per stamp help, but demand discipline.
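One cheap mitigation is making drift visible. A minimal sketch, assuming each stamp exposes some version endpoint (get_stamp_version here is a placeholder):

```python
# Drift detection sketch: compare what each stamp reports against the
# version the control plane expects.
EXPECTED_VERSION = "1.4.0"

def get_stamp_version(stamp_name: str) -> str:
    # Placeholder: e.g. GET <stamp front door>/internal/version
    return "1.4.0"

def find_drifted_stamps(stamp_names: list[str]) -> dict[str, str]:
    """Return stamps whose reported version differs from the expected one."""
    drifted = {}
    for name in stamp_names:
        version = get_stamp_version(name)
        if version != EXPECTED_VERSION:
            drifted[name] = version
    return drifted

if drifted := find_drifted_stamps(["stamp-eu-1", "stamp-us-1"]):
    print(f"Version drift detected: {drifted}")
```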
4. Routing mistakes hurt
If a bug routes a tenant to the wrong stamp, requests will fail or, worse, hit the wrong data. Your tenant-to-stamp mapping and identity model must be rock solid.
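One belt-and-braces habit: have each stamp verify, per request, that the tenant actually belongs to it. A minimal sketch, with THIS_STAMP and the passed-in mapping as placeholders for your own wiring:

```python
# Defensive check run inside a stamp: refuse to serve a tenant that the
# control plane says is bound somewhere else.
THIS_STAMP = "stamp-eu-1"

def assert_tenant_belongs_here(tenant_id: str, tenant_to_stamp: dict[str, str]) -> None:
    """Fail loudly instead of touching another stamp's data."""
    expected = tenant_to_stamp.get(tenant_id)
    if expected != THIS_STAMP:
        raise PermissionError(
            f"Tenant {tenant_id!r} is bound to {expected!r}, not {THIS_STAMP!r}"
        )
```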
How does this compare to other patterns?
Quick mental map:
- Bulkhead pattern: isolates components inside one deployment (e.g., pool separation, thread pools, queues).
- Deployment stamps: isolates full deployments from each other.
- Simple sharding: typically focused on data-layer segmentation (e.g., shard IDs in DB).
- Stamps: full-stack segmentation, including compute, storage, and often networking.
You can (and often do) combine them:
- Use stamps to separate large groups of tenants or regions.
- Inside each stamp, use bulkheads and sharding for further resiliency and scale.
TL;DR
Deployment stamps = multiple, independent copies of your entire app stack (compute + data + network) deployed from a shared template.
You bind tenants to stamps and route all their traffic there. When capacity or compliance demands grow, you deploy more stamps.
Benefits: near-linear scale-out, better isolation, cleaner blast-radius boundaries, and stronger data residency guarantees.
Costs: more infra to manage, complex routing, tricky cross-stamp analytics, and the need for serious automation.
This pattern shines for large multi-tenant SaaS and compliance-heavy scenarios. For small systems, it’s usually unnecessary complexity.