The Worlds of Distributed Systems — Chapter 2
“As long as nothing has left the room, you can retry forever.”
RML-1 (Closed World) is the least flashy of the three worlds, but it’s the one that quietly determines whether everything else becomes painful.
In this chapter, we’ll reframe RML-1 as:
- a room where you can fail safely, and
- a design discipline for deciding:
  - what must stay inside RML-1, and
  - where you must “graduate” into RML-2 / RML-3.
1) What is the Closed World?
If we summarize RML-1 in one line:
A temporary world that is still not observable from the outside.
1.1 Conditions for RML-1
A process belongs to RML-1 when it satisfies (roughly) these conditions:
- No writes to external databases
- No calls to external APIs that cause real effects
- No notifications to humans (email / Slack / push / etc.)
- No logs that later become inputs to business decisions or audits
In other words:
If this process fails, it leaves no external trace of having happened.
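The four conditions can be read as a review checklist. The following is a minimal sketch, assuming a hypothetical `ProcessProfile` record that someone fills in by hand during a design review; the type, field names, and `rml1Violations` are illustrative and not part of any RML definition.

```typescript
// Hypothetical, hand-filled description of what a process does.
type ProcessProfile = {
  writesExternalDb: boolean;
  callsEffectfulApi: boolean;
  notifiesHumans: boolean;
  writesBusinessOrAuditLogs: boolean;
};

// Returns the conditions the process violates.
// An empty list means the process satisfies the RML-1 conditions above.
function rml1Violations(p: ProcessProfile): string[] {
  const violations: string[] = [];
  if (p.writesExternalDb) violations.push("writes to an external database");
  if (p.callsEffectfulApi) violations.push("calls an external API with real effects");
  if (p.notifiesHumans) violations.push("notifies humans");
  if (p.writesBusinessOrAuditLogs) {
    violations.push("writes logs used for business decisions or audits");
  }
  return violations;
}

const dryRunScoring: ProcessProfile = {
  writesExternalDb: false,
  callsEffectfulApi: false,
  notifiesHumans: false,
  writesBusinessOrAuditLogs: false,
};
const isClosedWorld = rml1Violations(dryRunScoring).length === 0;
```

Returning the list of violations rather than a bare boolean keeps the review conversation concrete: each entry names the exit that has to be removed or moved.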
1.2 Typical examples
- Form validation in the browser
- Building a query from search conditions (not executed yet)
- Scoring/inference in a dry run where results are discarded
- Simulation that reads production data but never writes
```mermaid
flowchart LR
  subgraph ClosedWorld["RML-1: Closed World"]
    A[Memory] --> B[Temporary files]
    B --> A
  end
  ClosedWorld -->|If OK| OUT["External world (RML-2/3)"]
```
You can experiment inside the room forever without changing the outside world.
1.3 “Sandbox” is not the same as RML-1
A common confusion:
“Isn’t this just a sandbox environment?”
The key distinction:
- Sandbox / staging = isolated environment
  - separate cluster/account/URL
  - can still perform DB writes and external integrations inside that environment
- RML-1 = isolated state / behavior
  - even in the same environment (sometimes even in production), you can run in a mode where outcomes are not externally observable
A useful phrase:
Sandbox is about where. RML-1 is about what leaks out.
This is why:
- staging can still contain RML-2/RML-3 behavior, and
- production can still host an RML-1 “mode” (dry-run preview, simulation).
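One way to keep the two ideas apart in code is to model them as independent fields. This is a sketch under assumptions: `Environment`, `RunContext`, and `mayProduceExternalEffects` are hypothetical names chosen for illustration.

```typescript
// Two independent axes: the environment says *where* code runs,
// the world says *what leaks out*. Names are illustrative.
type Environment = "local" | "staging" | "production";
type World = "RML1" | "RML2" | "RML3";

type RunContext = { environment: Environment; world: World };

// Whether effects are allowed depends on the world only;
// the environment plays no part in the decision.
function mayProduceExternalEffects(ctx: RunContext): boolean {
  return ctx.world !== "RML1";
}

// A dry-run preview hosted in production is still RML-1.
const productionPreview: RunContext = { environment: "production", world: "RML1" };
// A checkout flow in staging that writes to a DB is already RML-2.
const stagingCheckout: RunContext = { environment: "staging", world: "RML2" };
```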
2) Why you should deliberately carve out RML-1
RML-1 isn’t academic—it creates concrete operational advantages.
2.1 A place where you can “try messy things safely”
If you can keep work inside RML-1, you can do:
- experiments on real data
- algorithm comparisons
- the offline side of A/B testing
…with a healthy loop:
“Try quickly.”
“If it’s bad, erase it completely.”
If you don’t design RML-1 explicitly, a small PoC tends to leak into RML-2/3 immediately, and teams fall into the trap of:
- “Production data is scary, so we’ll only test with toy data” → models and logic never mature.
2.2 You make the boundary visible
When RML-1 is explicit, the boundary becomes discussable:
- transaction begin
- queue publish
- sending HTTP requests
You start to talk about exits as:
“This is the doorway from RML-1 → RML-2.”
In design reviews, it enables a simple and powerful question:
“Are we still inside the room here?”
2.3 It changes how you think about staging and testing
Many “staging” setups only look safe:
- they read production DBs directly
- they connect to real external services
- they send real emails/notifications
That’s not RML-1. That’s already RML-2/3 behavior.
Once you think in worlds, you can do better:
- embed RML-1 inside production (dry-run APIs)
- forbid RML-2/3 effects even in staging
- enforce “world boundaries” in code, not only in infrastructure.
3) RML-1 design patterns
Now let’s get practical: patterns for keeping computation inside the room.
3.1 Read-only + Dry Run
Idea
Read production-like data, but never write and never notify.
Use cases
- evaluate a new algorithm on real data
- “If we apply this rule, what changes?” preview
- dry-run a batch job (compute + discard; optional non-business debug logs)
Implementation sketch
- use a DB role with read-only privileges
- external clients support no-op mode
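```typescript
type World = "RML1" | "RML2" | "RML3";

// Hypothetical placeholders so the sketch type-checks on its own.
type Input = { userId: string };
type RawData = unknown;
type SimulationResult = { score: number };
interface DbClient { fetchProductionData(input: Input): Promise<RawData>; }
interface Notifier { send(message: string): Promise<void>; }
// The algorithm under evaluation; declared only, implementation not shown.
declare function runNewAlgorithm(raw: RawData): SimulationResult;

type Env = {
  world: World;
  db: DbClient; // may be read-only
  notifier: Notifier; // no-op in RML-1
};

async function simulate(env: Env, input: Input): Promise<SimulationResult> {
  // Refuse to run outside the room: this function is a dry run by contract.
  if (env.world !== "RML1") {
    throw new Error("simulate() is RML-1 only");
  }
  const raw = await env.db.fetchProductionData(input);
  const result = runNewAlgorithm(raw);
  // In RML-1 we return results but do not persist or notify.
  return result;
}
```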
Key habit
- make “world” explicit in code
- in RML-1, `db.update(...)` / `notifier.send(...)` should either throw or be a no-op.
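The "throw or no-op" habit can be sketched as a thin wrapper around the clients. This is a minimal illustration, not a prescribed implementation: the `Db` and `Notifier` interfaces, `guardDb`, `noopNotifier`, and `inMemoryDb` are all hypothetical stand-ins for whatever clients a real project uses.

```typescript
type World = "RML1" | "RML2" | "RML3";

// Hypothetical minimal interfaces; a real DbClient/Notifier will differ.
interface Db {
  read(key: string): string | undefined;
  update(key: string, value: string): void;
}
interface Notifier {
  send(to: string, message: string): void;
}

// Wraps a DB so that reads pass through but writes throw in RML-1.
function guardDb(world: World, inner: Db): Db {
  if (world !== "RML1") return inner;
  return {
    read: (key) => inner.read(key),
    update: () => {
      throw new Error("db.update() is not allowed in RML-1");
    },
  };
}

// A notifier that silently drops messages, for use in RML-1.
const noopNotifier: Notifier = { send: () => {} };

// In-memory stand-in for a real database, for demonstration only.
function inMemoryDb(initial: Record<string, string>): Db {
  const store = new Map<string, string>(Object.entries(initial));
  return {
    read: (key) => store.get(key),
    update: (key, value) => {
      store.set(key, value);
    },
  };
}
```

Throwing tends to surface accidental writes early, while a no-op suits code paths that must run unmodified in both worlds. A read-only DB role remains the stronger guarantee; the wrapper only makes the rule visible in code.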
3.2 Consolidate exits into one place (Effect Dispatcher)
If effects can happen anywhere, you can’t reason about RML-1.
Anti-patterns
- random HTTP calls scattered across the codebase
- “a library casually sends an email”
- business logic directly publishes to queues
Pattern
Collect all external effects as values, and execute them only at a single commit point.
```typescript
type Effect =
  | { type: "SendEmail"; to: string; subject: string; body: string }
  | { type: "UpdateDb"; table: string; id: string; payload: unknown }
  | { type: "EmitEvent"; topic: string; payload: unknown };

type World = "RML1" | "RML2" | "RML3";

// Minimal placeholders to keep the example self-contained.
type Input = { userEmail: string };
type Result = { ok: true };

type CommitResult =
  | { mode: "DRY_RUN"; planned: Effect[] }
  | { mode: "EXECUTED" };

async function runBusinessLogic(input: Input): Promise<{ result: Result; effects: Effect[] }> {
  const effects: Effect[] = [];

  // compute/validate (keep this RML-1-like)
  const result: Result = { ok: true };

  // Propose effects as data (do not execute here).
  effects.push({
    type: "SendEmail",
    to: input.userEmail,
    subject: "Notice",
    body: "...",
  });

  return { result, effects };
}

async function commit(world: World, effects: Effect[]): Promise<CommitResult> {
  if (world === "RML1") {
    // Dry-run: do not execute. Return the plan for preview/UI.
    return { mode: "DRY_RUN", planned: effects };
  }

  // In RML-2/3, effects are executed here and only here.
  for (const e of effects) {
    // dispatch by type
  }
  return { mode: "EXECUTED" };
}
```
Why this matters:
- `runBusinessLogic()` stays inside the room
- `commit()` becomes the world boundary and the responsibility boundary
In reviews you can ask:
“Inside this function, we only accumulate effects, right?”
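The "dispatch by type" step at the commit point could look like the sketch below. It assumes a hypothetical `EffectHandlers` object whose functions wrap the real email, database, and event clients; `dispatch` and `executeAll` are illustrative names, and error handling is reduced to "stop at the first failure".

```typescript
type Effect =
  | { type: "SendEmail"; to: string; subject: string; body: string }
  | { type: "UpdateDb"; table: string; id: string; payload: unknown }
  | { type: "EmitEvent"; topic: string; payload: unknown };

// Hypothetical handler set; each function would wrap a real client.
type EffectHandlers = {
  sendEmail(e: Extract<Effect, { type: "SendEmail" }>): Promise<void>;
  updateDb(e: Extract<Effect, { type: "UpdateDb" }>): Promise<void>;
  emitEvent(e: Extract<Effect, { type: "EmitEvent" }>): Promise<void>;
};

async function dispatch(e: Effect, h: EffectHandlers): Promise<void> {
  switch (e.type) {
    case "SendEmail":
      return h.sendEmail(e);
    case "UpdateDb":
      return h.updateDb(e);
    case "EmitEvent":
      return h.emitEvent(e);
    default: {
      // Compile-time exhaustiveness check: this stops type-checking if a
      // new Effect variant is added without a matching case.
      const unreachable: never = e;
      throw new Error(`Unhandled effect: ${JSON.stringify(unreachable)}`);
    }
  }
}

// Executes effects in order and stops at the first failure.
async function executeAll(effects: Effect[], h: EffectHandlers): Promise<void> {
  for (const e of effects) {
    await dispatch(e, h);
  }
}
```

Passing the handlers in as a parameter means tests can supply recording fakes, so the dispatcher itself can be exercised without leaving the room.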
3.3 RML-1 inside production: safe “cheat mode”
A powerful step is:
Embed RML-1 features inside production.
Examples:
- users see a preview that isn’t committed yet
- admins simulate “what would happen” before applying changes
UI-wise it’s often:
- “Save” button = exit to RML-2/3
- “Simulate” button = stay in RML-1 and return
Another example is safe shadow validation:
- run new payment logic on real traffic
- keep results in RML-1
- production behavior remains old logic until explicitly promoted
In world terms, a “shadow deployment” is new logic running in RML-1 alongside old logic that still owns every exit to RML-2/3.
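A shadow run of this kind can be sketched as a function that always serves the old logic's answer and compares the candidate's answer in memory only. This is a minimal sketch; `runShadow`, `ShadowReport`, and the fee functions are hypothetical, and it assumes the candidate logic is pure (no writes, no notifications).

```typescript
// The old logic's answer is served; the new logic's answer is only compared.
type ShadowReport<T> = { served: T; candidate: T | undefined; matched: boolean };

function runShadow<I, T>(
  input: I,
  oldLogic: (input: I) => T,
  newLogic: (input: I) => T,
  equal: (a: T, b: T) => boolean,
): ShadowReport<T> {
  const served = oldLogic(input);
  let candidate: T | undefined;
  try {
    candidate = newLogic(input); // assumed pure: stays inside RML-1
  } catch (_e) {
    candidate = undefined; // a failing candidate never affects the served result
  }
  const matched = candidate !== undefined && equal(served, candidate);
  return { served, candidate, matched };
}

// Illustrative fee rules: the new rule adds a minimum fee of 1.
const oldFee = (amount: number): number => Math.round(amount * 0.03);
const newFee = (amount: number): number => Math.max(1, Math.round(amount * 0.03));

const report = runShadow(10, oldFee, newFee, (a, b) => a === b);
```

Where the report goes still matters. Kept in process or in purely technical metrics it stays in RML-1, while writing it to storage that other teams consult is already an exit.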
3.4 Sidebar: idempotency and side-effect gradients
Advanced readers may ask:
“If writes are perfectly idempotent, can they be treated like RML-1?”
For this series, we intentionally draw a strict line:
RML-1 assumes zero externally observable side effects.
Because in practice:
- proving idempotency is hard
- “we thought it was safe” often means “we missed how it’s observed”
However, there is a practical gradient:
- in-process memory cache updates
- temporary files nobody else can observe
- single-process internal state
These can be treated as “internal implementation detail” inside RML-1.
But the moment you touch something observable by others:
- shared caches (Redis)
- shared temp tables referenced by other services
- logs used for operations / audits
…you should treat it as RML-2/3, even if the write is idempotent.
A simple rule:
- RML-1: only unobservable side effects
- RML-2+: any side effect observable by another service or human
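The rule can be written down as a small classifier. The sketch below is illustrative only: `Observer`, `SideEffect`, and `minimumWorld` are hypothetical names, and deciding who can observe an effect remains a human judgment that the code merely records.

```typescript
// Who can observe a side effect. Illustrative categories only.
type Observer = "same-process" | "other-service" | "human";

type SideEffect = { description: string; observableBy: Observer[] };

// Applies the rule above. Idempotency is deliberately not an input:
// only observability decides the world.
function minimumWorld(effects: SideEffect[]): "RML-1" | "RML-2+" {
  const leaks = effects.some((e) =>
    e.observableBy.some((o) => o !== "same-process"),
  );
  return leaks ? "RML-2+" : "RML-1";
}

const localCache: SideEffect = {
  description: "in-process memory cache update",
  observableBy: ["same-process"],
};
const sharedRedis: SideEffect = {
  description: "shared Redis cache write",
  observableBy: ["same-process", "other-service"],
};
```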
4) Common pitfalls that break RML-1
4.1 “Logging is harmless” (it often isn’t)
A classic trap:
“It’s just logs, so it’s still RML-1.”
But logs come in two categories:
- purely technical logs (latency, internal state)
  - often fine in RML-1
- business-meaning / audit logs
  - if they will be referenced later, you’re effectively writing history
If “writing logs” becomes “writing history,” that’s RML-3 behavior.
Guideline:
- keep RML-1 logs to those not used for business decisions
- treat audit/transaction logs as RML-3 artifacts
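One way to make the guideline enforceable is to give the two kinds of log distinct types and reject audit entries inside the room. This is a sketch under assumptions: `LogEntry`, `writeLog`, and the array-based sink are hypothetical, and the sketch only guards RML-1 without taking a position on how RML-2 should treat audit entries.

```typescript
type World = "RML1" | "RML2" | "RML3";

// Technical logs: diagnostics only, never read by business processes.
type TechnicalLog = { kind: "technical"; message: string };
// Audit logs: part of recorded history, so they count as effects.
type AuditLog = { kind: "audit"; actor: string; action: string };
type LogEntry = TechnicalLog | AuditLog;

// Appends an entry to the sink, refusing audit entries while in RML-1.
function writeLog(world: World, entry: LogEntry, sink: LogEntry[]): void {
  if (entry.kind === "audit" && world === "RML1") {
    throw new Error("audit logs write history and are not allowed in RML-1");
  }
  sink.push(entry);
}
```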
4.2 “It’s staging, so it’s safe” (worlds don’t change)
Staging can still be RML-2/3 if:
- it connects to real payment gateways
- it sends real emails
- it hits real partners
Staging is an environment; the world is defined by observability and effect leakage.
4.3 “Let’s add a notification to our simulation feature”
This is the temptation:
“Since we computed it, let’s email the result to someone.”
The moment you do that:
- the world graduates to RML-2/3
- humans act on it, and you’ve created irreversible reality
Countermeasure:
- decide up front: RML-1 features must not send external notifications
- if sharing is needed, make it a two-step flow:
  - copy results out
  - share via an explicit RML-2/3 UI path
5) Detecting “we thought it was RML-1, but it wasn’t”
The scariest state is:
everyone believes it’s RML-1
but it actually behaves like RML-2/3.
5.1 Checklist
If any of these are “yes,” be suspicious:
- Does the result leave logs that change someone’s decision-making later?
- Does it write to a DB referenced by other services?
- Have you ever received a question about data produced by this flow?
- Do you need to explain this behavior via ToS/SLA/compliance?
If you get even one yes:
“Are we sure this isn’t at least RML-2?”
5.2 A pragmatic approach
Perfect boundaries are hard. Start from extremes:
- expand “clearly RML-1-safe” areas
- label “clearly not RML-1” areas
Then fill the middle over time.
6) How to use this at work: label RML-1 in your project
6.1 Label features
Try labels like:
- `RML-1 strict` (no observable effects)
- `RML-1 read-only` (production reads allowed, no writes)
- `RML-2+` (effects exist)
Example:
| Feature | RML label | Notes |
|---|---|---|
| Scoring simulation | RML-1 read-only | production reads, no writes |
| Auto assignment for A/B | RML-2+ | writes to user attributes |
| CSV export preview | RML-1 strict | view-only, no file creation |
This makes a new kind of proposal easy:
“Let’s keep this RML-1 strict and move effects into a separate flow.”
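If the labels live in code as well as in a document, they can be checked and queried. The sketch below mirrors the example table; the `Feature` type, the `features` registry, and `staysInRoom` are hypothetical names used for illustration.

```typescript
type RmlLabel = "RML-1 strict" | "RML-1 read-only" | "RML-2+";

type Feature = { name: string; label: RmlLabel; notes: string };

// Registry mirroring the example table above.
const features: Feature[] = [
  { name: "Scoring simulation", label: "RML-1 read-only", notes: "production reads, no writes" },
  { name: "Auto assignment for A/B", label: "RML-2+", notes: "writes to user attributes" },
  { name: "CSV export preview", label: "RML-1 strict", notes: "view-only, no file creation" },
];

// True when the feature is labeled as staying inside RML-1.
function staysInRoom(f: Feature): boolean {
  return f.label !== "RML-2+";
}

// Features whose effects deserve an explicit review.
const needsEffectReview = features.filter((f) => !staysInRoom(f)).map((f) => f.name);
```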
6.2 Create an “RML-1-only” module boundary
Even a soft code convention helps:
```
app/
  world_rml1/
    scoring/
    validation/
    simulation/
  world_rml2/
    saga/
    effect_dispatcher/
  world_rml3/
    ledger/
    incident/
```
You don’t need perfection. You need a shared boundary.
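A soft convention becomes firmer with even a small automated check. The sketch below assumes the directory names from the layout above and takes a precomputed list of imports per file as input. How that list is produced (a hypothetical import scanner, a linter rule, or a build tool) is outside the sketch, and `findBoundaryViolations` is an illustrative name.

```typescript
// A file and the module paths it imports. This sketch does not parse
// source files itself; the list is assumed to come from elsewhere.
type FileImports = { file: string; imports: string[] };

// Directories that RML-1 code must not depend on.
const FORBIDDEN_FOR_RML1 = ["world_rml2/", "world_rml3/"];

// Reports every import that reaches from world_rml1 into a later world.
function findBoundaryViolations(files: FileImports[]): string[] {
  const violations: string[] = [];
  for (const { file, imports } of files) {
    if (!file.includes("world_rml1/")) continue;
    for (const imported of imports) {
      if (FORBIDDEN_FOR_RML1.some((dir) => imported.includes(dir))) {
        violations.push(`${file} -> ${imported}`);
      }
    }
  }
  return violations;
}
```

The check is one-directional on purpose: RML-2/3 code may call into RML-1 modules, but not the other way around.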
7) Summary: build the room first
- RML-1 = a temporary world not observable from outside
- a well-designed RML-1 enables safe PoCs on real data and makes exits visible
- key patterns:
  - read-only + dry-run
  - consolidate exits (effect dispatcher)
  - embed RML-1 modes inside production (preview/simulation/shadow)
- key pitfalls:
  - logs that are actually history
  - staging ≠ RML-1
  - mixing notification/writes into simulations
- a practical rule:
  - RML-1 allows only unobservable side effects
RML-1 isn’t just “testing convenience.”
It’s the foundational work of agreeing:
Where does the world start to change?
In the next chapter, we’ll step outside the room:
Chapter 3 — RML-2 (Dialog World): rollback as conversation between services and humans.