kanaria007

Posted on • Originally published at zenn.dev

Chapter 2 — RML-1 (Closed World): Build a Room Where Failure Is Safe

The Worlds of Distributed Systems — Chapter 2

“As long as nothing has left the room, you can retry forever.”

RML-1 (Closed World) is the least flashy of the three worlds, but it’s the one that quietly determines whether everything else becomes painful.

In this chapter, we’ll reframe RML-1 as:

  • a room where you can fail safely, and
  • a design discipline for deciding:

    • what must stay inside RML-1, and
    • where you must “graduate” into RML-2 / RML-3.

1) What is the Closed World?

If we summarize RML-1 in one line:

A temporary world that is not yet observable from the outside.

1.1 Conditions for RML-1

A process belongs to RML-1 when it satisfies (roughly) these conditions:

  • No writes to external databases
  • No calls to external APIs that cause real effects
  • No notifications to humans (email / Slack / push / etc.)
  • No logs that later become inputs to business decisions or audits

In other words:

If this process fails, it leaves no external trace of having happened.
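The four conditions above can be written down as a small predicate. This is a minimal sketch, not a definitive classifier: the `ProcessProfile` type and its field names are hypothetical, and in a real system someone still has to fill in those flags honestly.

```typescript
// Hypothetical description of what a process does; field names are illustrative.
type ProcessProfile = {
  writesExternalDb: boolean;
  callsEffectfulApi: boolean;
  notifiesHumans: boolean;
  writesBusinessOrAuditLogs: boolean;
};

// Returns the conditions the process violates; an empty list means "still in RML-1".
function rml1Violations(p: ProcessProfile): string[] {
  const violations: string[] = [];
  if (p.writesExternalDb) violations.push("writes to an external database");
  if (p.callsEffectfulApi) violations.push("calls an external API with real effects");
  if (p.notifiesHumans) violations.push("notifies humans");
  if (p.writesBusinessOrAuditLogs) {
    violations.push("writes logs used for business decisions or audits");
  }
  return violations;
}

function isRml1(p: ProcessProfile): boolean {
  return rml1Violations(p).length === 0;
}

// Example: a dry-run scoring job that reads data and discards its results.
const dryRunScoring: ProcessProfile = {
  writesExternalDb: false,
  callsEffectfulApi: false,
  notifiesHumans: false,
  writesBusinessOrAuditLogs: false,
};
```

Returning the list of violations rather than a bare boolean keeps the answer discussable in a review: you see which condition broke, not only that one did.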

1.2 Typical examples

  • Form validation in the browser
  • Building a query from search conditions (not executed yet)
  • Scoring/inference in a dry run where results are discarded
  • Simulation that reads production data but never writes
flowchart LR
  subgraph ClosedWorld[RML-1: Closed World]
    A[Memory] --> B[Temporary files]
    B --> A
  end
  ClosedWorld -->|If OK| OUT["External world (RML-2/3)"]

You can experiment inside the room forever without changing the outside world.

1.3 “Sandbox” is not the same as RML-1

A common confusion:

“Isn’t this just a sandbox environment?”

The key distinction:

  • Sandbox / staging = isolated environment

    • separate cluster/account/URL
    • can still perform DB writes and external integrations inside that environment
  • RML-1 = isolated state / behavior

    • even in the same environment (sometimes even in production), you can run in a mode where outcomes are not externally observable

A useful phrase:

Sandbox is about where. RML-1 is about what leaks out.

This is why:

  • staging can still contain RML-2/RML-3 behavior, and
  • production can still host an RML-1 “mode” (dry-run preview, simulation).
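One way to keep the "where" and the "what leaks out" apart is to model them as two independent fields. The sketch below assumes a hypothetical `RunContext` type; the point is only that the leak check never looks at the environment.

```typescript
type Environment = "staging" | "production";
type World = "RML1" | "RML2" | "RML3";

// Hypothetical context object: environment and world are separate axes.
type RunContext = { environment: Environment; world: World };

// Whether effects may leak depends only on the world, never on the environment.
function mayLeakEffects(ctx: RunContext): boolean {
  return ctx.world !== "RML1";
}

// A dry-run preview served from production is still inside the room.
const productionPreview: RunContext = { environment: "production", world: "RML1" };

// A staging job that sends real email has already left it.
const stagingWithRealEmail: RunContext = { environment: "staging", world: "RML2" };
```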

2) Why you should deliberately carve out RML-1

RML-1 isn’t academic—it creates concrete operational advantages.

2.1 A place where you can “try messy things safely”

If you can keep work inside RML-1, you can do:

  • experiments on real data
  • algorithm comparisons
  • the offline side of A/B testing

…with a healthy loop:

“Try quickly.”
“If it’s bad, erase it completely.”

If you don’t design RML-1 explicitly, even a small PoC tends to leak into RML-2/3 right away. Burned by that, teams retreat into a familiar trap:

  • “Production data is scary, so we’ll only test with toy data” → models and logic never mature.

2.2 You make the boundary visible

When RML-1 is explicit, the boundary becomes discussable:

  • transaction begin
  • queue publish
  • sending HTTP requests

You start to talk about exits as:

“This is the doorway from RML-1 → RML-2.”

In design reviews, it enables a simple and powerful question:

“Are we still inside the room here?”
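To make that review question mechanical, a flow can be described as a list of steps in which every doorway is marked explicitly. This is a rough sketch under that assumption; the `Step` and `Doorway` types and the `checkout` flow are hypothetical.

```typescript
// The three exits named above, as a closed set.
type Doorway = "TransactionBegin" | "QueuePublish" | "HttpRequest";

type Step =
  | { kind: "compute"; label: string }
  | { kind: "exit"; label: string; doorway: Doorway };

// Index of the first step that leaves the room, or -1 if the flow never leaves RML-1.
function firstExitIndex(steps: Step[]): number {
  return steps.findIndex((s) => s.kind === "exit");
}

// "Are we still inside the room here?" for the step at position `index`.
function stillInsideRoom(steps: Step[], index: number): boolean {
  const exit = firstExitIndex(steps);
  return exit === -1 || index < exit;
}

// Hypothetical flow: everything before the publish is retryable without consequence.
const checkout: Step[] = [
  { kind: "compute", label: "validate cart" },
  { kind: "compute", label: "price order" },
  { kind: "exit", label: "publish OrderPlaced", doorway: "QueuePublish" },
  { kind: "compute", label: "format response" },
];
```

Note that a compute step after the first exit is no longer "inside the room", even though it has no effects of its own: a retry at that point would replay the exit.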

2.3 It changes how you think about staging and testing

Many “staging” setups are fake safety:

  • they read production DBs directly
  • they connect to real external services
  • they send real emails/notifications

That’s not RML-1. That’s already RML-2/3 behavior.

Once you think in worlds, you can do better:

  • embed RML-1 inside production (dry-run APIs)
  • forbid RML-2/3 effects even in staging
  • enforce “world boundaries” in code, not only in infrastructure.

3) RML-1 design patterns

Now let’s get practical: patterns for keeping computation inside the room.

3.1 Read-only + Dry Run

Idea

Read production-like data, but never write and never notify.

Use cases

  • evaluate a new algorithm on real data
  • “If we apply this rule, what changes?” preview
  • dry-run a batch job (compute + discard; optional non-business debug logs)

Implementation sketch

  • use a DB role with read-only privileges
  • external clients support no-op mode
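// Minimal placeholders to keep the example self-contained (illustrative, not a real API).
type Input = { customerId: string };
type SimulationResult = { score: number };
type DbClient = { fetchProductionData(input: Input): Promise<number[]> };
type Notifier = { send(message: string): Promise<void> };

// Stand-in for the algorithm under evaluation.
function runNewAlgorithm(raw: number[]): SimulationResult {
  return { score: raw.reduce((sum, x) => sum + x, 0) };
}

type World = "RML1" | "RML2" | "RML3";

type Env = {
  world: World;
  db: DbClient;       // may be read-only
  notifier: Notifier; // no-op in RML-1
};

async function simulate(env: Env, input: Input): Promise<SimulationResult> {
  // Refuse to run anywhere but the closed world.
  if (env.world !== "RML1") {
    throw new Error("simulate() is RML-1 only");
  }

  const raw = await env.db.fetchProductionData(input);
  const result = runNewAlgorithm(raw);

  // In RML-1 we return results but do not persist or notify.
  return result;
}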

Key habit

  • make “world” explicit in code
  • in RML-1, db.update(...) / notifier.send(...) should either throw or be no-op.
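Both options from the last bullet can be sketched as stand-in clients that are swapped in whenever the world is RML-1. Everything here is an assumption for illustration: the `Notifier` and `DbWriter` interfaces are simplified (synchronous, hypothetical signatures), and `pickForWorld` is a made-up helper name.

```typescript
type World = "RML1" | "RML2" | "RML3";

// Simplified, hypothetical client interfaces.
interface Notifier {
  send(to: string, message: string): void;
}
interface DbWriter {
  update(table: string, id: string, payload: unknown): void;
}

// No-op variant: nothing leaves the process; the drop is only remembered in memory.
class NoopNotifier implements Notifier {
  readonly dropped: string[] = [];
  send(to: string, message: string): void {
    this.dropped.push(`${to}: ${message}`);
  }
}

// Throwing variant: an attempted write is treated as a bug, not silently ignored.
class ForbiddenDbWriter implements DbWriter {
  update(table: string, id: string, _payload: unknown): void {
    throw new Error(`db.update(${table}, ${id}) is not allowed in RML-1`);
  }
}

// Chooses the closed-world stand-in whenever the world is RML-1.
function pickForWorld<T>(world: World, real: T, closed: T): T {
  return world === "RML1" ? closed : real;
}
```

Which variant to use is a design choice: a no-op keeps previews running when legacy code still calls `send`, while throwing surfaces the leak during development. A common compromise is no-op for notifications and throw for writes.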

3.2 Consolidate exits into one place (Effect Dispatcher)

If effects can happen anywhere, you can’t reason about RML-1.

Anti-patterns

  • random HTTP calls scattered across the codebase
  • “a library casually sends an email”
  • business logic directly publishes to queues

Pattern

Collect all external effects as values, and execute them only at a single commit point.

type Effect =
  | { type: "SendEmail"; to: string; subject: string; body: string }
  | { type: "UpdateDb"; table: string; id: string; payload: unknown }
  | { type: "EmitEvent"; topic: string; payload: unknown };

type World = "RML1" | "RML2" | "RML3";

// Minimal placeholders to keep the example self-contained.
type Input = { userEmail: string };
type Result = { ok: true };

type CommitResult =
  | { mode: "DRY_RUN"; planned: Effect[] }
  | { mode: "EXECUTED" };

async function runBusinessLogic(input: Input): Promise<{ result: Result; effects: Effect[] }> {
  const effects: Effect[] = [];

  // compute/validate (keep this RML-1-like)
  const result: Result = { ok: true };

  // Propose effects as data (do not execute here).
  effects.push({
    type: "SendEmail",
    to: input.userEmail,
    subject: "Notice",
    body: "...",
  });

  return { result, effects };
}

async function commit(world: World, effects: Effect[]): Promise<CommitResult> {
  if (world === "RML1") {
    // Dry-run: do not execute. Return the plan for preview/UI.
    return { mode: "DRY_RUN", planned: effects };
  }

  // In RML-2/3, effects are executed here and only here.
  for (const e of effects) {
    // dispatch by type
  }

  return { mode: "EXECUTED" };
}

Why this matters:

  • runBusinessLogic() stays inside the room
  • commit() becomes the world boundary and the responsibility boundary

In reviews you can ask:

“Inside this function, we only accumulate effects, right?”
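The "dispatch by type" step inside commit() is left open above. One possible shape, as a sketch only: an exhaustive switch over the same `Effect` union, with the actual clients injected as handlers. The `EffectHandlers` type and both function names are hypothetical, and the handlers are synchronous here for brevity where real ones would likely be async.

```typescript
type Effect =
  | { type: "SendEmail"; to: string; subject: string; body: string }
  | { type: "UpdateDb"; table: string; id: string; payload: unknown }
  | { type: "EmitEvent"; topic: string; payload: unknown };

// Hypothetical handler set, injected so the dispatcher itself has no hidden clients.
type EffectHandlers = {
  sendEmail(e: Extract<Effect, { type: "SendEmail" }>): void;
  updateDb(e: Extract<Effect, { type: "UpdateDb" }>): void;
  emitEvent(e: Extract<Effect, { type: "EmitEvent" }>): void;
};

function dispatchEffect(effect: Effect, handlers: EffectHandlers): void {
  switch (effect.type) {
    case "SendEmail":
      return handlers.sendEmail(effect);
    case "UpdateDb":
      return handlers.updateDb(effect);
    case "EmitEvent":
      return handlers.emitEvent(effect);
    default: {
      // Compile-time exhaustiveness check: a new Effect variant fails to type-check here.
      const unreachable: never = effect;
      throw new Error(`Unhandled effect: ${JSON.stringify(unreachable)}`);
    }
  }
}

// Executes effects in order and returns how many were dispatched.
function executeAll(effects: Effect[], handlers: EffectHandlers): number {
  for (const e of effects) {
    dispatchEffect(e, handlers);
  }
  return effects.length;
}
```

Because the handlers are passed in, a test can supply recording handlers and assert on the exact sequence of effects without touching a mail server, database, or queue.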


3.3 RML-1 inside production: safe “cheat mode”

A powerful step is:

Embed RML-1 features inside production.

Examples:

  • users see a preview that isn’t committed yet
  • admins simulate “what would happen” before applying changes

UI-wise it’s often:

  • “Save” button = exit to RML-2/3
  • “Simulate” button = stay in RML-1 and return

Another example is safe shadow validation:

  • run new payment logic on real traffic
  • keep results in RML-1
  • production behavior remains old logic until explicitly promoted

You can explain “shadow deployments” cleanly as a world concept.
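Shadow validation can be sketched as a wrapper that always serves the current logic and only compares it with the candidate. This is an illustrative sketch: `runWithShadow`, `ShadowReport`, and the two fee functions are hypothetical, and it assumes the candidate logic is itself free of external effects.

```typescript
type ShadowReport<T> = {
  served: T;              // what the caller actually receives (old logic)
  shadow: T | undefined;  // what the candidate would have produced
  matches: boolean;
  shadowError?: string;
};

function runWithShadow<I, O>(
  input: I,
  current: (i: I) => O,
  candidate: (i: I) => O,
  equal: (a: O, b: O) => boolean,
): ShadowReport<O> {
  const served = current(input);
  try {
    const shadow = candidate(input);
    return { served, shadow, matches: equal(served, shadow) };
  } catch (err) {
    // A failing candidate must never affect what production serves.
    return { served, shadow: undefined, matches: false, shadowError: String(err) };
  }
}

// Hypothetical fee rules in integer cents: 3% today, 2.5% above 100.00 in the candidate.
const currentFee = (amountCents: number): number => Math.floor((amountCents * 300) / 10000);
const candidateFee = (amountCents: number): number =>
  amountCents > 10000
    ? Math.floor((amountCents * 250) / 10000)
    : Math.floor((amountCents * 300) / 10000);

const report = runWithShadow(20000, currentFee, candidateFee, (a, b) => a === b);
// report.served is 600 (old logic); report.shadow is 500; report.matches is false.
```

Where the report goes matters: kept in process memory or in purely technical metrics it stays in RML-1, but once it feeds a business decision or an audit trail, the rules from the logging pitfall below apply.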


3.4 Sidebar: idempotency and side-effect gradients

Advanced readers may ask:

“If writes are perfectly idempotent, can they be treated like RML-1?”

For this series, we intentionally draw a strict line:

RML-1 assumes zero externally observable side effects.

Because in practice:

  • proving idempotency is hard
  • “we thought it was safe” often means “we missed how it’s observed”

However, there is a practical gradient:

  • in-process memory cache updates
  • temporary files nobody else can observe
  • single-process internal state

These can be treated as “internal implementation detail” inside RML-1.

But the moment you touch something observable by others:

  • shared caches (Redis)
  • shared temp tables referenced by other services
  • logs used for operations / audits

…you should treat it as RML-2/3, even if the write is idempotent.

A simple rule:

  • RML-1: only unobservable side effects
  • RML-2+: any side effect observable by another service or human
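The simple rule can be expressed as a lookup on observability alone. In this sketch the `SideEffectTarget` names mirror the two lists above, and the idempotency flag is accepted but deliberately ignored, which is the strict line this series draws; the type and function names are hypothetical.

```typescript
type SideEffectTarget =
  | "InProcessCache"
  | "PrivateTempFile"
  | "InternalState"
  | "SharedCache"
  | "SharedTempTable"
  | "OpsOrAuditLog";

type WorldClass = "RML-1" | "RML-2+";

// Can another service or a human observe a write to this target?
const OBSERVABLE_BY_OTHERS: Record<SideEffectTarget, boolean> = {
  InProcessCache: false,
  PrivateTempFile: false,
  InternalState: false,
  SharedCache: true,
  SharedTempTable: true,
  OpsOrAuditLog: true,
};

// Idempotency is intentionally not consulted: observability alone decides the world.
function classifySideEffect(target: SideEffectTarget, _isIdempotent: boolean): WorldClass {
  return OBSERVABLE_BY_OTHERS[target] ? "RML-2+" : "RML-1";
}
```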

4) Common pitfalls that break RML-1

4.1 “Logging is harmless” (it often isn’t)

A classic trap:

“It’s just logs, so it’s still RML-1.”

But logs come in two categories:

  • purely technical logs (latency, internal state)

    • often fine in RML-1
  • business-meaning / audit logs

    • if they will be referenced later, you’re effectively writing history

If “writing logs” becomes “writing history,” that’s RML-3 behavior.

Guideline:

  • keep RML-1 logs to those not used for business decisions
  • treat audit/transaction logs as RML-3 artifacts
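A logger can enforce this guideline by exposing two separate methods and rejecting the audit one inside the room. The following is a minimal sketch with a hypothetical `WorldAwareLogger` that keeps entries in memory; a real implementation would write to actual sinks.

```typescript
type World = "RML1" | "RML2" | "RML3";
type LogKind = "technical" | "audit";
type LogEntry = { kind: LogKind; message: string };

class WorldAwareLogger {
  readonly entries: LogEntry[] = [];
  private readonly world: World;

  constructor(world: World) {
    this.world = world;
  }

  // Latency, internal state, debugging: acceptable inside RML-1.
  technical(message: string): void {
    this.entries.push({ kind: "technical", message });
  }

  // Business-meaning / audit records are history, so RML-1 code may not write them.
  audit(message: string): void {
    if (this.world === "RML1") {
      throw new Error("audit logs are not allowed in RML-1");
    }
    this.entries.push({ kind: "audit", message });
  }
}
```

Splitting the API this way forces the author of each log line to decide, at the call site, whether they are writing diagnostics or writing history.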

4.2 “It’s staging, so it’s safe” (worlds don’t change)

Staging can still be RML-2/3 if:

  • it connects to real payment gateways
  • it sends real emails
  • it hits real partners

Staging is an environment; the world is defined by observability and effect leakage.

4.3 “Let’s add a notification to our simulation feature”

This is the temptation:

“Since we computed it, let’s email the result to someone.”

The moment you do that:

  • the world graduates to RML-2/3
  • humans act on it, and you’ve created irreversible reality

Countermeasure:

  • decide up front: RML-1 features must not send external notifications
  • if sharing is needed, make it a two-step flow:

    • copy results out
    • share via an explicit RML-2/3 UI path

5) Detecting “we thought it was RML-1, but it wasn’t”

The scariest state is:

everyone believes it’s RML-1
but it actually behaves like RML-2/3.

5.1 Checklist

If any of these are “yes,” be suspicious:

  • Does the result leave logs that change someone’s decision-making later?
  • Does it write to a DB referenced by other services?
  • Have you ever received a question about data produced by this flow?
  • Do you need to explain this behavior via ToS/SLA/compliance?

If you get even one yes:

“Are we sure this isn’t at least RML-2?”
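The checklist is easy to encode so that it can live next to a feature definition or in a review template. A small sketch, with hypothetical field names that map one-to-one to the four questions:

```typescript
// One boolean per checklist question; the field names are illustrative.
type ChecklistAnswers = {
  leavesDecisionChangingLogs: boolean;
  writesToSharedDb: boolean;
  receivedQuestionsAboutItsData: boolean;
  needsTosSlaOrComplianceExplanation: boolean;
};

// Returns the questions answered "yes", in the order they appear on the object.
function suspiciousItems(a: ChecklistAnswers): (keyof ChecklistAnswers)[] {
  return (Object.keys(a) as (keyof ChecklistAnswers)[]).filter((k) => a[k]);
}

// A single "yes" is enough to ask whether this is really still RML-1.
function needsWorldReview(a: ChecklistAnswers): boolean {
  return suspiciousItems(a).length > 0;
}
```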

5.2 A pragmatic approach

Perfect boundaries are hard. Start from extremes:

  • expand “clearly RML-1-safe” areas
  • label “clearly not RML-1” areas

Then fill the middle over time.


6) How to use this at work: label RML-1 in your project

6.1 Label features

Try labels like:

  • RML-1 strict (no observable effects)
  • RML-1 read-only (production reads allowed, no writes)
  • RML-2+ (effects exist)

Example:

| Feature | RML label | Notes |
| --- | --- | --- |
| Scoring simulation | RML-1 read-only | production reads, no writes |
| Auto assignment for A/B | RML-2+ | writes to user attributes |
| CSV export preview | RML-1 strict | view-only, no file creation |

This makes a new kind of proposal easy:

“Let’s keep this RML-1 strict and move effects into a separate flow.”
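If the labels live in code rather than only in a wiki table, they can be queried and checked. The sketch below mirrors the example table; the `FeatureEntry` type and the helper function are hypothetical.

```typescript
type RmlLabel = "RML-1 strict" | "RML-1 read-only" | "RML-2+";

type FeatureEntry = { feature: string; label: RmlLabel; notes: string };

// Same rows as the example table.
const featureLabels: FeatureEntry[] = [
  { feature: "Scoring simulation", label: "RML-1 read-only", notes: "production reads, no writes" },
  { feature: "Auto assignment for A/B", label: "RML-2+", notes: "writes to user attributes" },
  { feature: "CSV export preview", label: "RML-1 strict", notes: "view-only, no file creation" },
];

// Names of the features carrying a given label.
function featuresWithLabel(entries: FeatureEntry[], label: RmlLabel): string[] {
  return entries.filter((e) => e.label === label).map((e) => e.feature);
}

// Features that already have effects and therefore need an explicit exit design.
const effectful = featuresWithLabel(featureLabels, "RML-2+");
```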

6.2 Create an “RML-1-only” module boundary

Even a soft code convention helps:

app/
  world_rml1/
    scoring/
    validation/
    simulation/
  world_rml2/
    saga/
    effect_dispatcher/
  world_rml3/
    ledger/
    incident/

You don’t need perfection. You need a shared boundary.
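A soft convention gets firmer with even a tiny check. The sketch below assumes one possible rule, that code under `world_rml1/` must not import from `world_rml2/` or `world_rml3/`; the rule and the function names are assumptions, and the check would need to be wired into whatever lint or CI step your project already has.

```typescript
type WorldDir = "world_rml1" | "world_rml2" | "world_rml3";

// Extracts the world directory from a path such as "app/world_rml1/scoring/score.ts".
function worldOf(path: string): WorldDir | undefined {
  const match = /(?:^|\/)(world_rml[123])\//.exec(path);
  return match ? (match[1] as WorldDir) : undefined;
}

// Assumed rule: RML-1 modules may not depend on RML-2/3 modules.
function violatesBoundary(importerPath: string, importedPath: string): boolean {
  if (worldOf(importerPath) !== "world_rml1") return false;
  const target = worldOf(importedPath);
  return target === "world_rml2" || target === "world_rml3";
}
```

For example, `violatesBoundary("app/world_rml1/scoring/score.ts", "app/world_rml2/effect_dispatcher/index.ts")` flags the import, while the reverse direction is allowed.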


7) Summary: build the room first

  • RML-1 = a temporary world not observable from outside
  • a well-designed RML-1 enables safe PoCs on real data and makes exits visible
  • key patterns:

    • read-only + dry-run
    • consolidate exits (effect dispatcher)
    • embed RML-1 modes inside production (preview/simulation/shadow)
  • key pitfalls:

    • logs that are actually history
    • staging ≠ RML-1
    • mixing notification/writes into simulations
  • a practical rule:

    • RML-1 allows only unobservable side effects

RML-1 isn’t just “testing convenience.”

It’s the foundational work of agreeing:
Where does the world start to change?

In the next chapter, we’ll step outside the room:

Chapter 3 — RML-2 (Dialog World): rollback as conversation between services and humans.
