We use messenger applications such as WhatsApp and Telegram every day to chat with our friends. They all rely on a few design decisions behind the scenes that directly affect the user experience. In this article, I describe some features that are visible from a user's perspective and explain the technical details under the hood. This is not a complete system design, though, and there are many topics where you could dive much deeper.
Message delivery statuses
All messengers indicate the state of a message. The application tells the user whether the message was sent, delivered, and read by the recipient. This is easy to see in WhatsApp.
It shows a few stages the message goes through:
- When the user taps the Send button, the message is sent from the client device to a web server (clock icon).
- When the message reaches the server, the server sends an acknowledgement back to the client. At that moment the icon changes to a single check.
- The web server looks up the connection established between it and the recipient and sends the message. The recipient's device sends an acknowledgement back to the server, and the server then notifies the sender that the message was delivered. The icon changes to a double check.
- Eventually, the recipient opens the application and reads the message. Their device sends one more notification to the server, and the server tells the sender that the message was read. The icon changes to a blue double check.
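The four stages above behave like a small state machine in which every acknowledgement moves the message one step forward. Here is a minimal Python sketch of that idea. The names `Status`, `NEXT_STATUS`, and `Message` are made up for illustration and are not taken from any real messenger.

```python
from enum import Enum


class Status(Enum):
    PENDING = "clock"            # sent from the device, not yet acknowledged
    SENT = "single check"        # the server acknowledged the message
    DELIVERED = "double check"   # the recipient's device acknowledged it
    READ = "blue double check"   # the recipient opened the message


# Each acknowledgement moves the message exactly one step forward.
NEXT_STATUS = {
    Status.PENDING: Status.SENT,
    Status.SENT: Status.DELIVERED,
    Status.DELIVERED: Status.READ,
}


class Message:
    def __init__(self, text):
        self.text = text
        self.status = Status.PENDING

    def acknowledge(self):
        """Advance to the next status; a message that is already READ stays READ."""
        self.status = NEXT_STATUS.get(self.status, self.status)
        return self.status


msg = Message("Hello!")
icons = [msg.status.value]
for _ in range(3):  # server ack, delivery ack, read ack
    icons.append(msg.acknowledge().value)
print(" -> ".join(icons))
```

A real client would receive these acknowledgements over the network and might get them late or more than once, which is why the sketch never moves a status backwards.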
The diagram below illustrates this process and introduces a few more building blocks that we need for the rest of the discussion.
A messenger app has to handle more than 100 billion messages a day, or roughly a million per second (100 billion / 86,400 seconds ≈ 1.16 million). To take such a load the app follows a distributed design, and an important part of that design is the load balancer (LB). The load balancer applies a particular strategy to distribute the load evenly between servers.
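To make the numbers and the balancing idea concrete, here is a small Python sketch. It assumes round robin as the strategy, which is only one of many options and not necessarily what any real messenger uses. The server names and the `RoundRobinBalancer` class are hypothetical.

```python
from itertools import cycle

MESSAGES_PER_DAY = 100_000_000_000
SECONDS_PER_DAY = 24 * 60 * 60
messages_per_second = MESSAGES_PER_DAY / SECONDS_PER_DAY  # about 1.16 million


class RoundRobinBalancer:
    """Hands out servers in a fixed rotation so each gets an equal share."""

    def __init__(self, servers):
        self._servers = cycle(servers)

    def pick(self):
        return next(self._servers)


balancer = RoundRobinBalancer(["web-1", "web-2", "web-3"])
assignments = [balancer.pick() for _ in range(6)]
print(round(messages_per_second))
print(assignments)
```

Round robin ignores how busy each server actually is. Strategies such as least connections take that into account at the cost of tracking more state.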
A messenger aims to deliver messages between users and servers quickly. WebSockets are used to achieve this. WebSocket is a protocol that provides full-duplex communication over a single TCP connection. Keeping one long-lived WebSocket connection open is cheaper than repeatedly polling over HTTP, because each new message does not need a new request with its own headers. The client and server subscribe to each other and wait for new messages. That's why all arrows on the diagram are bi-directional.
Different users might be connected to different web servers, so we need a Web Socket Manager. This service keeps track of which server holds each user's connection and routes messages to the right web server.
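At its core, such a manager is a lookup table from users to the servers that hold their connections. The sketch below is a simplified in-memory version. A real system would keep this mapping in a shared store, and the method names `register`, `unregister`, and `route` are assumptions made for illustration.

```python
class WebSocketManager:
    """Tracks which web server currently holds each user's connection."""

    def __init__(self):
        self._server_by_user = {}

    def register(self, user_id, server_id):
        # Called when a user opens a connection to a web server.
        self._server_by_user[user_id] = server_id

    def unregister(self, user_id):
        # Called when the connection is closed; unknown users are ignored.
        self._server_by_user.pop(user_id, None)

    def route(self, recipient_id):
        """Return the server holding the recipient's connection, or None if offline."""
        return self._server_by_user.get(recipient_id)


manager = WebSocketManager()
manager.register("alice", "web-1")
manager.register("bob", "web-2")
target = manager.route("bob")         # the server that should push the message
manager.unregister("bob")
after_logout = manager.route("bob")   # no connection is left for bob
print(target, after_logout)
```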
If the recipient is offline, the message has to be stored. The database should be optimized for frequent write and delete operations: a message is deleted once the client is back online and all pending messages have been delivered to it. A NoSQL database is a common choice here; for example, some messengers use HBase or Cassandra for this task.
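The write-then-delete pattern can be sketched with an in-memory stand-in for the database. This is only an illustration of the access pattern, not of HBase or Cassandra themselves. `OfflineMessageStore` and its methods are hypothetical names.

```python
from collections import defaultdict, deque


class OfflineMessageStore:
    """In-memory stand-in for the database that holds undelivered messages."""

    def __init__(self):
        self._pending = defaultdict(deque)

    def store(self, recipient_id, message):
        # Write path: used while the recipient has no open connection.
        self._pending[recipient_id].append(message)

    def deliver_all(self, recipient_id, send):
        """Send pending messages oldest first, deleting each one after it is sent."""
        queue = self._pending.get(recipient_id)
        delivered = 0
        while queue:
            send(queue[0])   # if send raises, the message stays stored
            queue.popleft()
            delivered += 1
        self._pending.pop(recipient_id, None)
        return delivered


offline_store = OfflineMessageStore()
offline_store.store("bob", "Hi!")
offline_store.store("bob", "Are you there?")

inbox = []  # pretend this is bob's device once it reconnects
count = offline_store.deliver_all("bob", inbox.append)
print(count, inbox)
```

Deleting a message only after `send` succeeds means that a failure in the middle of delivery leaves the remaining messages in the store for the next attempt.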
Background data fetching and strong consistency
The WebSocket protocol allows messenger apps to keep connections alive for a long time and load messages even when the user is not actively using the app. Mobile apps often stay in the background and can load new messages as long as the device is connected to the network. You may have noticed that you usually see new messages immediately after you unlock the phone and open the app. Now let's look at a desktop client that has been closed for a long time and has no background processes.
In this recording, the application opens a WebSocket connection only after it is launched. Once the connection is established, new messages are received from the server. They are loaded not in one batch but one by one. This brings us to another important requirement of messenger system design: consistency. The sequence of messages is critical. It would be strange if a recipient received messages in an order different from the original, and messages must not be lost either. For this application, consistency means delivering messages in strict order. The CAP theorem says that a distributed data store can provide only two of the following three guarantees: Consistency, Availability, and Partition tolerance. Network partitions cannot be ruled out, so applications must provide Partition tolerance and, when a partition happens, have to choose between Consistency and Availability. Messenger services choose Consistency over Availability. That's why, in the figure above, the user has to wait a few seconds for chats to load.
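One common way to guarantee strict ordering is to number the messages and hold back any message that arrives before its predecessors. The article's sources do not say how a particular messenger does it, so the per-chat sequence numbers and the `OrderedInbox` class below are assumptions made for illustration.

```python
class OrderedInbox:
    """Releases messages strictly in sequence order, buffering early arrivals."""

    def __init__(self):
        self._next_seq = 1   # sequence number we are waiting for
        self._waiting = {}   # seq -> text, for messages that arrived too early

    def receive(self, seq, text):
        """Accept one message and return the messages that are now ready to show."""
        if seq < self._next_seq:
            return []  # duplicate of a message that was already shown
        self._waiting[seq] = text
        ready = []
        while self._next_seq in self._waiting:
            ready.append(self._waiting.pop(self._next_seq))
            self._next_seq += 1
        return ready


ordered_inbox = OrderedInbox()
shown = []
# Message 2 arrives before message 1, so it is held back until 1 shows up.
for seq, text in [(2, "How are you?"), (1, "Hi!"), (3, "See you soon")]:
    shown.extend(ordered_inbox.receive(seq, text))
print(shown)
```

The cost of this approach is exactly the waiting described above: nothing after a gap is displayed until the missing message arrives.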
Downloaded files caching
We discussed a few aspects of chatting, but messengers also allow users to share media files. The last bit is about optimizing network bandwidth. Files are obviously larger than messages, so it is a good idea to minimize how often a user has to upload and download them. When the user uploads a file, the server stores it and calculates a checksum (a hash of the file's content) that is, for practical purposes, unique to that content. If a user uploads a file that is already stored on the server, the application compares checksums and avoids storing a duplicate. From the user's perspective, we can try the following steps:
- Upload the file. I use Telegram Saved messages to check it.
- Remove it and clear the device cache.
- Upload the file again.
There will be a noticeable difference between the first and second upload attempts: the second one is almost immediate.
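The deduplication behind this experiment can be sketched as follows. The sketch makes two assumptions that are not confirmed for any particular messenger: SHA-256 is used as the checksum, and the client sends the checksum before the file so that a duplicate never has to be transferred. `FileStore`, `has`, `put`, and `upload` are hypothetical names.

```python
import hashlib


def checksum_of(content: bytes) -> str:
    # Assumption: SHA-256 of the file content serves as the checksum.
    return hashlib.sha256(content).hexdigest()


class FileStore:
    """Server-side storage keyed by checksum, so identical files are kept once."""

    def __init__(self):
        self._blobs = {}
        self.bytes_received = 0  # counts how much data was actually transferred

    def has(self, checksum):
        return checksum in self._blobs

    def put(self, content):
        checksum = checksum_of(content)
        self.bytes_received += len(content)
        self._blobs.setdefault(checksum, content)
        return checksum


def upload(store, content):
    """Client side: ask about the checksum first and transfer only unknown files."""
    checksum = checksum_of(content)
    if not store.has(checksum):
        store.put(content)
    return checksum


server = FileStore()
photo = b"x" * 1000  # stands in for the bytes of a real file
first = upload(server, photo)
second = upload(server, photo)  # same content, so nothing is transferred
print(first == second, server.bytes_received)
```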
Note: the speed of the duplicate check can depend on the user's geolocation. Distributed systems have many servers around the world, so the response time (in distributed system design we usually call it latency) can actually be higher. If a user uploads the duplicate from a location different from the one where the file was first uploaded, it might take more time because the closest server doesn't have information about the file yet.
Another thing to mention is the Content Delivery Network (CDN). CDNs are used to store static content such as images and videos. They are built on a network of servers in many locations and make it possible to place files geographically closer to users. As a result, the content is delivered faster and latency is lower.
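The core idea of a CDN node can be shown with a tiny cache in front of an origin server. This is a heavily simplified sketch: real CDNs also handle expiry, invalidation, and routing users to the nearest node. `Origin` and `EdgeCache` are hypothetical names.

```python
class Origin:
    """The far-away server that holds the original files."""

    def __init__(self, files):
        self._files = files
        self.requests = 0  # how many times the slow origin was contacted

    def fetch(self, name):
        self.requests += 1
        return self._files[name]


class EdgeCache:
    """A CDN node close to the user that keeps copies of requested files."""

    def __init__(self, origin):
        self._origin = origin
        self._cache = {}

    def get(self, name):
        if name not in self._cache:
            # Cache miss: one slow trip to the origin, then keep a local copy.
            self._cache[name] = self._origin.fetch(name)
        return self._cache[name]


origin = Origin({"cat.jpg": b"image bytes"})
edge = EdgeCache(origin)
first_view = edge.get("cat.jpg")    # fetched from the origin
second_view = edge.get("cat.jpg")   # served from the nearby copy
print(origin.requests)
```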
Conclusion
We talked about a few features of messenger applications that you see every day when you chat with your friends. Now you should have a better understanding of the system blocks and design decisions that give us this user experience.
Thank you for taking the time to read this article. I would love to see your feedback in the comments, along with your favorite app features whose internals you would like to learn about!
Top comments (3)
Hi
Very good topic
I recently built a message sender with Google Sheets and App Inventor using HTTP requests. As you described, fetching new messages quickly enough to keep the app refreshed means transferring a lot of data, and it forced me to generate high traffic to the Google server.
I am looking for a new strategy.
Great, this gave me the basic concepts of a messenger app, thanks. Can we say that there are different servers to serve different locations?
Yes, definitely. Different server locations solve two problems of a distributed system. First, they decrease the latency of user requests by physically placing servers closer to users. Second, they increase the availability of the system, so that even if some region is unavailable due to network or data center issues, requests can still be served from another server location. For this, data is replicated across different locations.