Modern distributed databases such as Amazon DynamoDB and Apache Cassandra replicate data across multiple servers to improve scalability, fault tolerance, and availability.
However, replication introduces a major challenge: data inconsistency during concurrent updates.
To address this problem, distributed systems use a technique called versioning.
In this article, we’ll explore why versioning is needed, how it works, and how it helps resolve conflicts in eventually consistent systems.
Why Versioning Is Needed
In an eventually consistent system, updates do not reach all replicas at the same time due to network delays or partitions.
Consider a system with three replicas:
N = 3
Replicas: s0, s1, s2
Two clients update the same key at the same time:
Client A → put(key1 = 10)
Client B → put(key1 = 20)
Because of network delays, the replicas may temporarily store different values:
s0 = 10
s1 = 20
s2 = 10
Now the system contains conflicting values for the same key.
This situation is called a write conflict.
What Versioning Does
Versioning allows the system to track the history of writes.
Each update is assigned a version identifier.
Example:
(key1, value = 10, version = v1)
(key1, value = 20, version = v2)
Instead of immediately overwriting data, the system may temporarily store multiple versions of the same value.
This helps the system determine:
- Which write happened earlier
- Whether two writes occurred concurrently
Two Possible Scenarios
1. One Version Is Newer
v1 → value = 10
v2 → value = 20
If the system detects that v2 happened after v1, then the newer version replaces the older one.
v2 replaces v1
In this case, no conflict occurs because the system knows the correct order of writes.
2. Concurrent Versions
Sometimes the system cannot determine the order of writes.
Example:
v1 → value = 10
v2 → value = 20
These updates are considered concurrent.
Instead of choosing one automatically, the database may return both versions:
key1:
value1 = 10
value2 = 20
Now the client application must resolve the conflict.
Client Reconciliation
The application decides how to resolve conflicting values.
Common strategies include:
1. Last Write Wins
The system selects the value with the latest timestamp.
Latest timestamp → chosen value
2. Merging Values
Sometimes values can be combined instead of overwritten.
Example in a shopping cart system:
cart1 = {apple}
cart2 = {banana}
Merged result:
cart = {apple, banana}
This approach ensures that no data is lost.
3. Application Logic
The application can apply custom business rules to determine the correct value.
For example:
- Priority-based updates
- User-based conflict resolution
- Domain-specific merge logic
Versioning with Vector Clocks
Some distributed databases use vector clocks to track the history of updates.
A vector clock stores a list of node counters representing update history.
Example:
v1 = {s0:1}
v2 = {s1:1}
If two versions have different histories, the system treats them as concurrent updates.
Vector clocks help databases automatically detect conflicts without relying only on timestamps.
Key Takeaways
Versioning allows distributed databases to:
- Detect conflicting updates
- Temporarily store multiple versions of data
- Resolve conflicts through reconciliation strategies
This approach enables eventual consistency while keeping the system highly available and scalable.
Final Thoughts
Versioning is a fundamental technique in distributed systems. It allows modern databases to maintain consistency without blocking writes, which is critical for systems operating at large scale.
By tracking the history of updates and allowing conflict resolution later, versioning ensures that distributed systems remain reliable, flexible, and highly performant.

Top comments (1)
Always a great read Rajat! This one hits different after a prod incident 😅