DEV Community

Vikas Kumar
Vikas Kumar

Posted on

Operational Transformation (OT)

Background

Distributed systems that allow concurrent updates face a difficult problem:

How do we keep replicas consistent when operations arrive late, out of order, or at the same time?

Operational Transformation (OT) is a technique that solves this by modifying operations instead of merging full state. It is best known from collaborative editors, but the idea applies to any system where replicas exchange operations.


The Core Problem

In distributed systems:

  • Multiple replicas hold the same logical data
  • Each replica can update independently
  • Network delays are unavoidable
  • Messages may arrive late or out of order

When a user creates an operation, it is based on the current local state. By the time the operation reaches another replica, that state may have changed.

If we apply the operation without adjustment, inconsistencies can occur.


Why Operations Break

Consider a replicated list:

[A B C D]
Indexes: 0 1 2 3
Enter fullscreen mode Exit fullscreen mode

Two users edit at the same time:

User 1 → Delete(1)      // removes B
User 2 → Insert(2, X)
Enter fullscreen mode Exit fullscreen mode

Both operations are valid locally.

Replica 1:

Delete(1) → [A C D]
Enter fullscreen mode Exit fullscreen mode

Replica 2:

Insert(2, X) → [A B X C D]
Enter fullscreen mode Exit fullscreen mode

Now Replica 1 receives Insert(2, X).

But index 2 no longer points to the same position as before. The operation’s assumption about the structure is now wrong.


What Operational Transformation Does

Operational Transformation prevents this issue by rewriting incoming operations so they match the current state.

Instead of merging state, OT systems:

  • Detect concurrent operations
  • Transform late operations
  • Adjust parameters (like indexes)
  • Apply safely

No updates are discarded, and replicas remain consistent.


The Key Idea Behind OT

At the center of OT is the transformation function:

T(op_incoming, op_existing) → transformed_op
Enter fullscreen mode Exit fullscreen mode

Meaning:

Before applying a remote operation, transform it relative to operations already applied.

Goal:
Ensure the operation still represents the user’s original intent.


Example — Insert vs Insert

Initial state

[A B C D]
Enter fullscreen mode Exit fullscreen mode

Concurrent operations

op1 = Insert(1, X)
op2 = Insert(1, Y)
Enter fullscreen mode Exit fullscreen mode

Both operations target the same position.

If applied blindly, replicas may diverge depending on arrival order.

Replica 1

Apply op1 -> [A X B C D]
Apply op2 -> [A Y X B C D]
Enter fullscreen mode Exit fullscreen mode

Replica 2

Apply op2 -> [A Y B C D]
Apply op1 -> [A X Y B C D]
Enter fullscreen mode Exit fullscreen mode

Final states differ -> divergence.

OT Resolution Requires Deterministic Ordering

Assume a deterministic rule:

op1 < op2   (example: lower site ID or earlier timestamp)
Enter fullscreen mode Exit fullscreen mode

Transform op2 against op1:

Transform Insert(1, Y) against Insert(1, X)
-> Insert(1 + length(X), Y)
-> Insert(2, Y)
Enter fullscreen mode Exit fullscreen mode

Apply Operations Safely

Apply op1 -> [A X B C D]
Apply transformed op2 -> [A X Y B C D]
Enter fullscreen mode Exit fullscreen mode

If Ordering Were Reversed

If the rule says:

op2 < op1
Enter fullscreen mode Exit fullscreen mode

Transform op1 against op2:

Insert(1, X) -> Insert(2, X)
Final -> [A Y X B C D]
Enter fullscreen mode Exit fullscreen mode

Example — Insert vs Delete

Initial state

[A B C D]
Enter fullscreen mode Exit fullscreen mode

Concurrent operations

op1 = Delete(1)
op2 = Insert(2, X)
Enter fullscreen mode Exit fullscreen mode

If deletion happens first:

Delete(1) -> [A C D]
Enter fullscreen mode Exit fullscreen mode

One element before the insert position is gone. The insert must shift left:

Insert(2 - 1, X) -> Insert(1, X)
Enter fullscreen mode Exit fullscreen mode

Result:

[A X C D]
Enter fullscreen mode Exit fullscreen mode

If insertion happens first:

Insert(2, X) -> [A B X C D]
Enter fullscreen mode Exit fullscreen mode

The deletion still works because its target is unchanged.


Example — Concurrent Deletes

Initial state

[A B C D]
Enter fullscreen mode Exit fullscreen mode

Concurrent operations

op1 = Delete(1)
op2 = Delete(2)
Enter fullscreen mode Exit fullscreen mode

Without transformation:

Delete(1) -> [A C D]
Delete(2) -> removes D (wrong)
Enter fullscreen mode Exit fullscreen mode

With OT:

Transform Delete(2) against Delete(1)
-> Delete(2 - 1)
-> Delete(1)
Enter fullscreen mode Exit fullscreen mode

Final state:

[A D]
Enter fullscreen mode Exit fullscreen mode

Both deletions behave as intended.


Final State Property

  • All operations are preserved or correctly adjusted
  • Deterministic rules guarantee convergence
  • Replicas reach identical states despite different arrival orders

What Correctness Means in OT

A correct OT system must preserve three important properties:

  • Convergence – All replicas reach the same state
  • Intention Preservation – User actions keep their meaning
  • Causality Preservation – Dependencies are respected

These guarantees prevent subtle divergence bugs.

High‑Level Execution Flow

Typical OT workflow:

  1. User generates a local operation
  2. Apply it locally immediately (low latency)
  3. Send it to other replicas
  4. On receiving a remote operation:
  • Transform it against concurrent operations
  • Apply the transformed version

Transformation ensures safety under concurrency.

Why Transformation Is Necessary

Operations depend on structure. When the structure changes, operations may become invalid due to:

  • Index shifts
  • Element movement
  • Structural changes

OT continuously adjusts operations so they remain correct.


Advantages of OT

Operational Transformation enables:

  • Immediate local updates
  • Smooth concurrent editing
  • Low metadata overhead
  • Natural user experience

These benefits made OT popular in early collaborative systems.

Challenges & Limitations

OT is conceptually simple but hard to implement correctly.

Common difficulties include:

  • Complex transformation rules
  • Many edge cases
  • Subtle correctness bugs
  • Difficult testing and verification

Incorrect logic can cause silent divergence between replicas.


OT vs Naive Conflict Handling

Simpler systems often:

  • Overwrite updates
  • Reject concurrent changes
  • Use last‑write‑wins rules

OT avoids these issues by transforming operations instead of discarding them.

OT vs CRDT (Conceptual Difference)

Both OT and CRDTs aim for replica convergence but follow different strategies.

Operational Transformation

  • Focuses on rewriting operations
  • Sensitive to ordering
  • Lower metadata overhead

CRDTs

  • Focus on specially designed data structures
  • Ordering‑independent merges
  • Higher metadata overhead

Both approaches are valid depending on system requirements.


Final Perspective

A simple way to understand OT:

Before applying a remote operation, rewrite it so it makes sense for the current state.

Or even more intuitively:

Fix the coordinates before executing the command.

Operational Transformation is a good fit when:

  • Systems exchange operations rather than full state
  • Updates depend on positional context
  • Low metadata overhead is desired
  • Some central coordination is acceptable

Most commonly seen in collaborative editing systems.

Central Insight of OT:

Operational Transformation does not pick a winning update, Instead, it keeps all valid operations and modifies them only when needed. Conflicts are handled by making operations compatible rather than rejecting them, Operational Transformation is fundamentally about keeping operations valid in a constantly changing replicated system.

Instead of rejecting conflicts or merging full state, OT reshapes operations to guarantee:

  • Consistency
  • Convergence
  • Preservation of user intent

Top comments (0)