The Worlds of Distributed Systems — Chapter 11 (Practical Adoption Guide)
“Nice theory… but what do we do tomorrow?”
After Chapter 10, the worldview may feel right—but you might still be unsure about:
- where to start,
- how far is “too much,”
- and how to talk about this across teams.
This chapter is not “more theory.”
It’s a recipe / adoption guide for bringing RML into real work:
- You can’t do everything at once.
- But you can introduce it gradually—without overhauling the org.
1) Five tiny steps you can start tomorrow
These are intentionally small. You can do them as an individual—even without “organizational permission.”
1.1 Say the word “world” once in a conversation
In a design review or incident retrospective, ask this one question:
“Which world are we talking about—RML-1, RML-2, or RML-3?”
It’s OK if nobody has an answer yet.
The point is to flip the mental switch:
Make “world awareness” part of the discussion.
1.2 Add an RML column to a single backlog list
In Jira, Notion, a spreadsheet—anything.
- Pick one epic or feature list.
- Add a column called RML.
- Write 1 / 2 / 3 for a few items, using your best guess.
Later, when you review it with the team, it becomes a productive “discussion starter”:
- “Why is this RML-2?”
- “Is this actually RML-3 once it touches money/contracts?”
1.3 Add world to a new exception class (just once)
Any language is fine. Keep it lightweight.
class RmlError extends Error {
  world: "RML1" | "RML2" | "RML3";
  constructor(args: { world: "RML1" | "RML2" | "RML3"; message: string }) {
    super(args.message);
    this.world = args.world;
  }
}
You don’t need perfect handling on day one.
Even just writing down:
“This throw assumes which world?”
…changes how engineers reason about failures.
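As a rough sketch of where this can lead, a caller can branch on the world tag to decide what happens next. The `nextStepFor` function and its step names are hypothetical and not part of RML itself. The RML-2 and RML-3 branches follow the Runbook/Playbook split from Chapter 9, while the RML-1 branch is only an assumption about what "proposal-only" failures might allow.

```typescript
type World = "RML1" | "RML2" | "RML3";

class RmlError extends Error {
  readonly world: World;
  constructor(args: { world: World; message: string }) {
    super(args.message);
    this.name = "RmlError";
    this.world = args.world;
    // Keeps `instanceof` working when compiled to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Hypothetical routing policy: step names are illustrative only.
type NextStep = "retry-locally" | "follow-runbook" | "escalate-playbook" | "triage";

function nextStepFor(err: unknown): NextStep {
  // An untagged error means nobody has decided which world it belongs to.
  if (!(err instanceof RmlError)) return "triage";
  switch (err.world) {
    case "RML1":
      return "retry-locally"; // assumption: nothing outside the process was touched
    case "RML2":
      return "follow-runbook"; // technical operations
    case "RML3":
      return "escalate-playbook"; // organizational decision-making
  }
}
```

Even if the routing stays this crude, the untagged branch is useful: it shows how many failures still have no declared world.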
1.4 Retrospect one recent incident with an RML tag
Don’t try to rewrite your entire incident process.
Take a single recent incident and add world markers to the timeline:
- “Up to here was RML-2…”
- “This point is where it first became RML-3.”
Just drawing that boundary makes the “real weight” visible.
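If your incident timeline already lives in a tool or a file, the same exercise can be sketched in a few lines. The `TimelineEntry` shape and the `firstHistoryBoundary` helper below are assumptions made for illustration, not a prescribed format.

```typescript
type World = "RML1" | "RML2" | "RML3";

// Hypothetical timeline entry; field names are illustrative.
interface TimelineEntry {
  at: string; // ISO-8601 timestamp
  note: string;
  world: World; // best-guess tag added during the retrospective
}

// Returns the earliest entry tagged RML3, i.e. the point where the incident
// first became RML-3, or undefined if it never did.
function firstHistoryBoundary(timeline: TimelineEntry[]): TimelineEntry | undefined {
  const sorted = [...timeline].sort((a, b) => Date.parse(a.at) - Date.parse(b.at));
  return sorted.find((entry) => entry.world === "RML3");
}
```

Everything before the returned entry was still RML-2 territory. Everything after it needs a Playbook, not just a Runbook.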
1.5 Write one Runbook and one Playbook (one page each)
From Chapter 9:
- Runbook = RML-2 (technical operations)
- Playbook = RML-3 (organizational decision-making)
Pick one representative scenario for each:
- Runbook: “Payment gateway outage (containment + retries + compensation)”
- Playbook: “Incorrect billing occurred (who decides refunds + comms + legal steps)”
Two pages can align the team surprisingly fast.
2) Role-based “do only this” adoption guide
RML becomes easier when each role contributes a small part.
2.1 Application engineers
- Add world/action to error types (even minimally)
- Add idempotency keys to critical external calls
- Ask once per feature: “What’s the RML of this?”
Goal:
Make RML appear in code and design reviews.
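A minimal sketch of the idempotency-key idea, assuming a key derived from the business identity of the operation. `ChargeRequest`, `idempotencyKeyFor`, and `FakeGateway` are hypothetical names. The gateway here is an in-memory stand-in, not any real provider's API.

```typescript
// Hypothetical request shape; field names are illustrative.
interface ChargeRequest {
  orderId: string;
  amountCents: number;
}

// Derive the key from business identity so that a retry of the same logical
// operation reuses the same key.
function idempotencyKeyFor(req: ChargeRequest): string {
  return `charge:${req.orderId}`;
}

// In-memory stand-in for an external gateway that honours idempotency keys.
class FakeGateway {
  private results = new Map<string, string>();
  private counter = 0;
  totalChargedCents = 0;

  charge(key: string, req: ChargeRequest): string {
    const prior = this.results.get(key);
    if (prior !== undefined) return prior; // replay: return the original result
    this.counter += 1;
    const chargeId = `ch_${this.counter}`;
    this.totalChargedCents += req.amountCents;
    this.results.set(key, chargeId);
    return chargeId;
  }
}
```

With a key like this, a retry after a timeout cannot charge the customer twice, which keeps an RML-2 retry from quietly becoming an RML-3 billing incident.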
2.2 SRE / Platform engineers
- Add rml.world/rml.action tags to logs/metrics/traces
- Create one alert based on RML-3 signals
- Prepare one Runbook (RML-2) + one Playbook (RML-3)
Goal:
Make RML flow into dashboards and incident response.
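One hedged way to picture the tagging: emit structured log lines that carry `rml.world` and `rml.action`, then key a single alert off the RML-3 value. The helper names below and the example action values ("retry", "escalate") are assumptions. Adapt them to whatever logging and alerting stack you already run.

```typescript
type World = "RML1" | "RML2" | "RML3";

// Builds one structured log line carrying the RML tags.
// The `action` vocabulary (e.g. "retry", "escalate") is an illustrative assumption.
function rmlLogLine(
  world: World,
  action: string,
  message: string,
  fields: Record<string, unknown> = {},
): string {
  return JSON.stringify({ ...fields, message, "rml.world": world, "rml.action": action });
}

// Minimal alert predicate: flag any line tagged RML3.
function isRml3Signal(line: string): boolean {
  const record = JSON.parse(line) as Record<string, unknown>;
  return record["rml.world"] === "RML3";
}
```

In practice the predicate would live in your alerting rules rather than in application code. The point is only that the tag makes the rule trivial to write.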
2.3 Product managers / POs
- Allow (or encourage) adding an RML column to the backlog
- Put “promotion” into roadmap discussion:
  - “Do we promote this to RML-3 next quarter?”
- Roughly track RML-3 costs (refunds, coupons, support time)
Goal:
Connect RML to product strategy and P&L reality.
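"Roughly" can be taken literally. A spreadsheet is enough, but as a sketch, the tally is one small function. The `Rml3Cost` fields and the per-minute support rate are assumptions chosen for illustration.

```typescript
// Hypothetical cost record for one RML-3 incident; all fields are assumptions.
interface Rml3Cost {
  incident: string;
  refundsCents: number;
  couponsCents: number;
  supportMinutes: number;
}

// Rough total in cents: direct payouts plus support time at an assumed rate.
function roughTotalCents(costs: Rml3Cost[], supportCentsPerMinute: number): number {
  return costs.reduce(
    (sum, c) =>
      sum + c.refundsCents + c.couponsCents + c.supportMinutes * supportCentsPerMinute,
    0,
  );
}
```

A per-quarter number from something this simple is usually enough to start the "do we invest in this boundary?" conversation.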
2.4 Legal / Compliance
- Define (with engineers + business) what counts as RML-3
- Align ToS/contract compensation terms with actual RML-3 operations
- Join at least one postmortem for a history-grade incident
Goal:
Align language for how the org enters the History World.
3) A sample RML kickoff meeting agenda (90 minutes)
If you want a “lightweight organizational start,” do one kickoff meeting.
Agenda (90 minutes)
- Intro (10 min)
  - what RML is, in one slide
  - what you’ll call it internally (RML is fine)
- Case sharing (20 min)
  - pick 1–2 recent incidents or near-misses
  - draw a timeline
  - mark where it shifts: “RML-2 → RML-3”
- Role perspectives (20 min)
  - engineers: rollback/retry/compensation reality
  - business: customer experience + brand impact
  - legal: contracts/regulation + disclosure obligations
- Draft “our org’s RML rules” (30 min)
  - list default RML-3 cases
  - define a “when unsure, do X” principle
  - pick the Playbook owner for RML-3 incidents
- Next step (10 min)
  - choose one service/feature to start labeling
  - choose one incident type to apply the case-file template
The trick is not perfection.
Treat it as a meeting to decide the minimum that lets you move tomorrow.
4) FAQ (the questions teams actually ask)
Q1) “Our domain is basically all RML-3… doesn’t that make RML useless?”
In finance/health/public infrastructure, it can feel like “everything is History World.”
Even then, RML still helps if you distinguish two senses of “world”:
- World as a processing layer
- World as a responsibility boundary
Example:
- “Showing appointment candidates” → RML-1 (proposal-only)
- “Reserving a slot” → RML-2 (adjustable via operations)
- “Medical records / billing / official notes” → RML-3 (legal history)
Even in an RML-3 domain, you usually still have RML-1/2 layers inside it.
Q2) “Isn’t mislabeling RML risky in itself?”
Early on, you will mislabel. That’s normal.
Treat labels as:
Best current hypotheses, updateable over time.
What’s riskier is the default state:
“We don’t know which world we’re in, so we guess mid-incident.”
It’s safer to label, learn, and refine than to operate without a shared map.
Q3) “We’re a small startup—we can’t do all this.”
Small orgs often need RML more, not less—because you can’t afford repeated history-grade mistakes.
Start with minimal scoping:
- “Payments are RML-3 and we treat them seriously.”
- “Everything else stays RML-2 unless proven otherwise.”
Early boundaries make later growth easier.
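That minimal scoping can be written down as a default rule plus a short list of agreed exceptions. The sketch below is one possible encoding. `worldFor`, the area names, and the override entry are hypothetical, and only "payments" comes from the example above.

```typescript
type World = "RML1" | "RML2" | "RML3";

// Areas treated as RML-3 from day one. "payments" comes from the text;
// extend the list as your own boundaries become clear.
const DEFAULT_RML3_AREAS: ReadonlySet<string> = new Set(["payments"]);

// Explicit reclassifications agreed by the team ("proven otherwise").
// The entry below is a hypothetical example.
const OVERRIDES: ReadonlyMap<string, World> = new Map<string, World>([
  ["contract-signing", "RML3"],
]);

function worldFor(area: string): World {
  const override = OVERRIDES.get(area);
  if (override !== undefined) return override;
  // Default rule: everything outside the RML-3 list stays RML-2.
  return DEFAULT_RML3_AREAS.has(area) ? "RML3" : "RML2";
}
```

Keeping the overrides in one place gives the team a visible record of each boundary it has moved.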
5) A small homework assignment
One small exercise to make this “your team’s book”:
Write down three RML-3-ish events in your environment.
Past incidents or imagined “worst days” are both fine.
For each, write:
- what happened,
- who was affected,
- who owned responsibility,
- what “history residue” remains.
Those three items become page one of:
Your team’s version of The Worlds of Distributed Systems.
That’s the end of the main content of this book—for now.
Next:
Epilogue — Toward Engineering with a Worldview
From here, you can:
- remove chapters,
- rename terms,
- or refactor the entire worldview into your own.
If it helps you build systems that don’t just “work most of the time,” but are designed for trust, then the map did its job.