Ever wondered how Google Docs or Figma allow multiple people to edit the same document at the same time without destroying each other's work?
The magic behind this is called Operational Transformation (OT) .
OT sounds like complex academic jargon (and it can be), but the core concept is beautiful in its simplicity. Today, we're going to demystify OT by building a tiny, working example in Go.
The Problem: The "Late Comer" Conflict
Imagine two users, Alice and Bob, editing the same text document: "Hello".
- Alice decides to insert a space after "Hello". The document becomes
"Hello ". - Bob, at the exact same time, decides to insert an exclamation mark at the end. The document becomes
"Hello!".
If we just apply these changes in the order they arrive at the server, we have a problem. If the server applies Alice's change first, then Bob's, we get "Hello !" (which is fine). But if it applies Bob's first, then Alice's, we might get "Hello !" as well? Actually, let's look closer:
-
Server State:
"Hello" -
Bob's Action: Insert
'!'at position 5. -
Alice's Action: Insert
' 'at position 5.
If the server applies Bob's change first:
- Server applies Bob's insert:
"Hello!"(length 6,'!'at index 5). - Now it needs to apply Alice's insert (space at position 5). But position 5 is now occupied by Bob's
'!'. The space gets inserted before the'!', resulting in"Hello !".
If the server applies Alice's change first, we get "Hello !" as well. So in this case, we are fine because the operations are commutative (order doesn't matter). But let's look at a case where it does matter.
The Real Conflict: Insert vs. Delete
Let's start with "Hello".
- Alice highlights
"ell"and deletes it. Desired result:"Ho". Her operation:Delete(start: 1, end: 4). - Bob decides to insert an
'X'at position 1 (between'H'and'e'). Desired result:"HXello". His operation:Insert(position: 1, char: 'X').
What happens if the server applies Bob's insert first?
- Server:
"Hello" - Apply Bob (Insert 'X' at 1):
"HXello"(Indices: H=0, X=1, e=2, l=3, l=4, o=5) - Apply Alice (Delete from 1 to 4): We look at the current document. Delete from index 1 to 4. That means deleting indices 1, 2, 3, and 4. That's
"Xell". The result is"Ho". This is correct! ("H"+"o").
Now, what if the server applies Alice's delete first?
- Server:
"Hello" - Apply Alice (Delete from 1 to 4):
"Ho"(Indices: H=0, o=1) - Apply Bob (Insert 'X' at 1): Insert
'X'at position 1 of"Ho". This yields"HXo".
Uh oh! The final document is "HXo", but it should be "HXo"? Wait, let's check the desired result. If Bob wanted "HXello" and Alice wanted "Ho", the final result if they were truly collaborating would be "HXo". Actually, in this case, "HXo" is the correct, converged state!
So why do we need OT? Because in a real distributed system, the server might not be the one deciding the order. The clients need to be able to apply changes optimistically and then reconcile them. The key is that every client must end up with the same final state, regardless of the order they saw the operations.
OT solves this by transforming operations against each other so they can be applied in any order and still converge.
The Core Idea of OT
OT works by treating every edit as an operation (Op). When two operations happen concurrently, instead of just applying them in a raw order, we transform one operation against the other.
We need a function: transform(a, b) -> (a', b')
This function takes two operations that were applied to the same document state and returns two new operations. The guarantees are:
- Inclusion Property: Applying
athenb'results in the same document state as applyingbthena'. - Context:
a'isaadjusted to make sense afterbhas been applied, and vice-versa.
Building OT in Go
Let's implement a simplified text OT system in Go. We'll handle two operations: Insert and Delete.
1. Defining Our Operations
First, we define what an operation looks like.
package main
import "fmt"
// OpType defines the kind of operation
type OpType int
const (
Insert OpType = iota
Delete
)
// Operation represents a single change
type Operation struct {
Type OpType
Pos int // Position in the document (0-indexed)
Data string // The text to insert or delete
}
func (op Operation) String() string {
if op.Type == Insert {
return fmt.Sprintf("Insert('%s' at %d)", op.Data, op.Pos)
}
return fmt.Sprintf("Delete('%s' from %d)", op.Data, op.Pos)
}
2. Applying an Operation to a Document
We need a way to apply an operation to a string to see the result.
// Apply performs an operation on a given document state
func Apply(doc string, op Operation) string {
switch op.Type {
case Insert:
return doc[:op.Pos] + op.Data + doc[op.Pos:]
case Delete:
return doc[:op.Pos] + doc[op.Pos+len(op.Data):]
default:
return doc
}
}
3. The Heart of OT: The Transform Function
This is where the magic happens. We need to handle all combinations of operations (Insert vs Insert, Insert vs Delete, Delete vs Insert, Delete vs Delete). For simplicity, we'll assume deletions always delete the exact string present (so we store the Data in the delete op).
// Transform takes two operations that were applied to the same state
// and returns two operations that can be applied in reverse order
// to achieve the same final state.
func Transform(a, b Operation) (aPrime, bPrime Operation) {
// Case 1: Both are Inserts
if a.Type == Insert && b.Type == Insert {
if a.Pos < b.Pos {
// a happens before b in the document.
// b's position needs to shift right by the length of a's insertion.
aPrime = a
bPrime = b
bPrime.Pos += len(a.Data)
} else if a.Pos > b.Pos {
// b happens before a in the document.
// a's position needs to shift right by the length of b's insertion.
aPrime = a
bPrime = b
aPrime.Pos += len(b.Data)
} else {
// Same position. We need a rule. A common strategy is to use a tie-breaker
// like UserID or in this case, we'll just say the second one (b) goes after.
aPrime = a
bPrime = b
bPrime.Pos += len(a.Data)
}
return aPrime, bPrime
}
// Case 2: Both are Deletes
if a.Type == Delete && b.Type == Delete {
// This gets complex quickly with overlapping ranges.
// For this simple example, we'll assume non-overlapping deletes for now.
// A real implementation must handle overlapping ranges carefully.
// Simplified: If a deletes something entirely before b's delete,
// b's position needs to shift left by the length of a's deletion.
if a.Pos+len(a.Data) <= b.Pos {
aPrime = a
bPrime = b
bPrime.Pos -= len(a.Data)
} else if b.Pos+len(b.Data) <= a.Pos {
aPrime = a
bPrime = b
aPrime.Pos -= len(b.Data)
} else {
// Overlap! In a real system, you'd merge these.
// We'll panic for this demo.
panic("overlapping deletes not handled in this simple example")
}
return aPrime, bPrime
}
// Case 3: a is Insert, b is Delete
if a.Type == Insert && b.Type == Delete {
if a.Pos <= b.Pos {
// Insert happens before the delete.
// The delete's position needs to shift right.
aPrime = a
bPrime = b
bPrime.Pos += len(a.Data)
} else if a.Pos > b.Pos+len(b.Data) {
// Insert happens after the deleted region.
// The insert's position needs to shift left.
aPrime = a
bPrime = b
aPrime.Pos -= len(b.Data)
} else {
// Insert happens inside the deleted region.
// The insert is deleted! So aPrime becomes a no-op? Or we shift it.
// In a real system, this is handled by the delete including the inserted text.
// We'll simplify: move insert to the delete position.
aPrime = a
bPrime = b
aPrime.Pos = b.Pos
}
return aPrime, bPrime
}
// Case 4: a is Delete, b is Insert (symmetrical to case 3)
if a.Type == Delete && b.Type == Insert {
// Transform (Delete, Insert) by swapping and using case 3, then swapping back.
bPrimeSwapped, aPrimeSwapped := Transform(b, a)
// aPrimeSwapped is the transformed Delete, bPrimeSwapped is the transformed Insert.
return aPrimeSwapped, bPrimeSwapped
}
// Should never reach here if all cases covered
panic("unknown operation combination")
}
Note: This Transform function is simplified for educational purposes. Production OT systems (like in ShareDB or Google Wave) handle hundreds of edge cases, especially around cursors, selections, and overlapping deletes.
Putting It All Together: A Test
Let's simulate a real concurrent edit scenario to see if our transformation works.
func main() {
// Start state
doc := "Hello"
fmt.Printf("Start: '%s'\n", doc)
// Two clients start from this state
aliceOp := Operation{Type: Delete, Pos: 1, Data: "ell"} // Wants "Ho"
bobOp := Operation{Type: Insert, Pos: 1, Data: "X"} // Wants "HXello"
fmt.Printf("Alice wants: %v\n", aliceOp)
fmt.Printf("Bob wants: %v\n", bobOp)
// Server receives Alice's op first, then Bob's (or vice versa).
// To make them work concurrently, we transform them.
aPrime, bPrime := Transform(aliceOp, bobOp)
fmt.Printf("\nTransformed Alice: %v\n", aPrime)
fmt.Printf("Transformed Bob: %v\n", bPrime)
// Scenario 1: Apply Alice, then Transformed Bob
state1 := Apply(doc, aliceOp)
state1 = Apply(state1, bPrime)
fmt.Printf("\nAlice first, then Bob': '%s'\n", state1)
// Scenario 2: Apply Bob, then Transformed Alice
state2 := Apply(doc, bobOp)
state2 = Apply(state2, aPrime)
fmt.Printf("Bob first, then Alice': '%s'\n", state2)
// Check if they converge
if state1 == state2 {
fmt.Printf("\nβ
Convergence! Final state: '%s'\n", state1)
} else {
fmt.Printf("\nβ Divergence! '%s' vs '%s'\n", state1, state2)
}
}
Expected Output
When you run this, you should see:
Start: 'Hello'
Alice wants: Delete('ell' from 1)
Bob wants: Insert('X' at 1)
Transformed Alice: Delete('ell' from 2)
Transformed Bob: Insert('X' at 1)
Alice first, then Bob': 'HXo'
Bob first, then Alice': 'HXo'
β
Convergence! Final state: 'HXo'
It works! We started with "Hello". The final document is "HXo". This makes sense: Alice deleted "ell" from "Hello" leaving "Ho", and Bob inserted an "X" right after the "H", resulting in "HXo".
By transforming Alice's operation from Delete at 1 to Delete at 2, we accounted for Bob's "X" being inserted before the deleted region.
What's Next?
This is just the tip of the iceberg. A real-world OT system needs:
- Cursors & Selections: Transforming cursor positions so they don't jump around.
- Undo/Redo: A robust stack system that works with transformed operations.
- Networking: A way to send operations between clients and a server (WebSockets in Go are great for this).
- Durability: Saving the operation log to a database.
If you're interested in exploring further, check out these excellent resources:
- ShareDB: A real-time database backend based on OT.
- The original OT paper by Ellis and Gibbs (1990).
OT is a fascinating field that bridges the gap between distributed systems and UX. Go's concurrency model and performance make it a fantastic language to build these real-time collaboration features.
Happy coding! π
Top comments (0)