saurabh.v

Posted on Jul 31, 2017

Eventual vs Strong Consistency in Distributed Databases

#distributeddatabases #technology #database #consistency

This explanation starts with a real-life analogy to make the concept easier to grasp.

I have a habit of writing what I call Tech Notes on my laptop every day to summarize the technical concepts I learn. They make it easier to recall those concepts whenever I need to.

But I used to worry: what if my laptop is stolen or crashes? I would lose all my Tech Notes. So I started backing them up on an external hard disk. To further reduce the risk of losing them, I also bought a Dropbox subscription.

Every fortnight, I update my external hard disk with revised and newly written Tech Notes, while Dropbox is updated as soon as I connect my laptop to the internet.

Here, the hard disk and Dropbox serve only as sources for reading Tech Notes, while the laptop is used for both reading and writing them (the Master-Slave model).

Redundancy introduces Reliability.

Now let’s get to the point.

Case 1: Eventual Consistency
Suppose we use multiple replicas of a database to store data, and a write request arrives at one of them. The database then needs a strategy for propagating that write to the other replicas so that they all apply it and become consistent.

Consistency here means that a read request for an entity made to any of the nodes of the database should return the same data.

Eventual consistency guarantees that, once writes stop arriving, the data on every node of the database converges to the same value. The time the nodes take to become consistent may or may not be bounded.

Data becoming consistent eventually means that updates take time to reach the other replicas. So what?
It means that a read served by a replica that has not been updated yet may return stale data.

My hard disk can likewise hold stale data for up to 15 days, since it is updated only once a fortnight. Suppose my friend John comes by a few days after the last update and asks for my hard disk.
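The stale read described above can be sketched in a few lines of Python. This is only a toy model under stated assumptions: the `Primary` and `Replica` classes and the `sync` method are hypothetical names invented for illustration, and replication is triggered by hand rather than by a background process as a real database would do.

```python
class Replica:
    """A read-only copy that stores key/value pairs (hypothetical model)."""

    def __init__(self):
        self.data = {}

    def read(self, key):
        return self.data.get(key)


class Primary:
    """Accepts writes and ships them to the replicas later."""

    def __init__(self, replicas):
        self.data = {}
        self.replicas = replicas
        self.pending = []  # writes not yet shipped to the replicas

    def write(self, key, value):
        # Acknowledge right after the local write; replication happens later.
        self.data[key] = value
        self.pending.append((key, value))

    def read(self, key):
        return self.data.get(key)

    def sync(self):
        # Ship queued writes to every replica, in the order they arrived.
        for key, value in self.pending:
            for replica in self.replicas:
                replica.data[key] = value
        self.pending.clear()


hard_disk = Replica()
laptop = Primary([hard_disk])

laptop.write("note", "v1")
laptop.sync()                      # a replication round
laptop.write("note", "v2")         # new revision, not replicated yet

stale = hard_disk.read("note")     # read from the lagging replica
laptop.sync()
fresh = hard_disk.read("note")     # the replica has caught up
print(stale, fresh)                # prints: v1 v2
```

The write returns immediately, which is where the low latency comes from, and the replica only agrees with the primary after the next `sync`.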

John: I want your hard disk to read your Tech Notes.

I: Sure, why not. But it has not been updated for the last few days.

John: I am fine with it.

John got the hard disk immediately (low latency), at the risk of it holding stale data. But I know it will be brought up to date when the next fortnight starts.

Eventual consistency offers low latency at the risk of returning stale data

On the other hand, we have something known as Strong Consistency.

Case 2: Strong Consistency
Strong consistency means that a write arriving at one replica is passed on to all the other replicas right away.
While the replicas are being updated with the new data, responses to any subsequent read/write requests are delayed, because the replicas are busy bringing each other up to date.

As soon as they are consistent, they start serving the requests that have been waiting at their door.
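A minimal sketch of this behaviour, assuming a single process where a lock stands in for the coordination between replicas. The `SyncReplicatedStore` class is a hypothetical name for illustration; real databases typically rely on protocols such as two-phase commit, quorums, or consensus rather than one in-process lock.

```python
import threading


class SyncReplicatedStore:
    """Hypothetical store that updates every replica before acknowledging."""

    def __init__(self, replica_count):
        self.replicas = [{} for _ in range(replica_count)]
        self._lock = threading.Lock()

    def write(self, key, value):
        # Hold the lock until every replica has the new value, so no
        # reader can observe a half-replicated write.
        with self._lock:
            for replica in self.replicas:
                replica[key] = value

    def read(self, key, replica_index):
        # A read that arrives during a write waits here until it finishes.
        with self._lock:
            return self.replicas[replica_index].get(key)


store = SyncReplicatedStore(replica_count=3)
store.write("note", "v1")
store.write("note", "v2")

# Whichever replica serves the read, the answer is the latest write.
values = [store.read("note", i) for i in range(3)]
print(values)  # prints: ['v2', 'v2', 'v2']
```

The price is visible in `write`: it cannot return until the slowest replica has been updated, and readers queue behind it in the meantime.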

This time my friend Veronica comes and asks for my Tech Notes.

Veronica: I want your latest Tech Notes.

I: Sure, why not. I will share a Dropbox link with you.
   But Veronica, access it after a few minutes, as I have written a
   new Tech Note on the laptop that still has to sync with my
   Dropbox account.

Veronica was able to access up-to-date Tech Notes, but only after a delay of a few minutes.

Conclusion

Strong consistency offers up-to-date data, but at the cost of high latency.
Eventual consistency offers low latency, but may answer read requests with stale data, since not all nodes of the database may have the updated data yet.
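To put rough numbers on this trade-off, here is a back-of-the-envelope model. It assumes that replicas are contacted in parallel and that a synchronous write waits for the slowest one; the function name `write_latency_ms` and all the delay values are made up for illustration, not measurements.

```python
def write_latency_ms(local_ms, replica_ms, synchronous):
    """Time until a write is acknowledged, in this simplified model."""
    if synchronous and replica_ms:
        # Replicas are contacted in parallel; wait for the slowest one.
        return local_ms + max(replica_ms)
    # Asynchronous: acknowledge after the local write only.
    return local_ms


replica_delays = [20, 45, 120]  # assumed round-trip times in milliseconds
strong = write_latency_ms(5, replica_delays, synchronous=True)
eventual = write_latency_ms(5, replica_delays, synchronous=False)
print(strong, eventual)  # prints: 125 5
```

In this model a single slow replica dominates the latency of every strongly consistent write, while the eventually consistent write is unaffected by it.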

If you liked the article, please share it with others.
This article was first published on Medium. You can take a look at it here.

Oldest comments (3)

Ben Halpern • Jul 31 '17

Really well explained, Saurabh. These are the issues at the core of our drive to provide a low latency experience on dev.to for all our readers globally. The concept of eventual consistency is strongly at play in:

How our users interact with our CDN and how the CDN interacts with the origin server. I also want to distribute the origin server that could make reads from eventually consistent follower DBs.

We already make use of an eventually consistent distributed search via Algolia.

I find the verbiage in this domain hard to wrap my head around even if the topic is reasonably straightforward, so posts like this are super useful for people I think.

saurabh.v • Aug 1 '17

Thanks for appreciating the article, Ben.

Puria Kordrostami • Aug 17 '19

I think this is a bad explanation. In your example eventual consistency is actually inconsistent, but it really is not. I think strong consistency is when your architecture does not provide you with consistency and you should take extreme measures to apply it. But in eventual consistency, your architecture works in a way that your data becomes eventually consistent without using complicated d-locks.