We all use social media messaging platforms like WhatsApp, Messenger, and Telegram. But have you ever wondered how they manage to handle so many things in real-time, like showing typing indicators, authenticating users, processing massive amounts of data concurrently and in parallel, while also ensuring messages don't get sent to the wrong users?
This is fundamentally event-driven architecture in action. Let's dive into this today.
How Real-Time Connections Work
First, we need to talk about how real-time connections work in the first place. Real-time communication relies on a technology called WebSockets, which is fundamentally different from traditional HTTP connections used for REST APIs.
Traditional HTTP works like this:
Client sends a request --- Server processes --- Server sends response --- Connection closes
For every new piece of data, you need a new request, and the connection closes after each response.
WebSocket, on the other hand, works differently:
User A ---- WebSocket ---- Server ---- WebSocket ---- User B
With WebSockets, the connection doesn't close. Instead, a persistent bidirectional TCP connection is established between users and the server, allowing data to flow in both directions at any time.
How Messages Flow
Let's say there are two users: User A and User B.
When User A sends User B a message saying "Hi", here's what happens:
- A and B are connected through a unique identifier (eventId/roomId/channelId)
- Messages themselves are events. When A sends "Hi", the client triggers an event that says: "I pushed 'Hi' to room X, notify User B"
- The server receives this event and immediately pushes it to all users subscribed to that room (in this case, User B)
- User B's client receives the event instantly through their open WebSocket connection
This is the fundamental mechanism behind real-time messaging.
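Here's a minimal sketch of this flow using Socket.IO on Node.js. The event names (`join_room`, `send_message`, `new_message`) are illustrative choices for this sketch, not a fixed API:

```typescript
import { Server } from "socket.io";

const io = new Server(3000);

io.on("connection", (socket) => {
  // User A and User B both join the room that identifies their conversation
  socket.on("join_room", (roomId: string) => {
    socket.join(roomId);
  });

  // When A emits "send_message", push it to everyone else in the room (User B)
  socket.on("send_message", ({ roomId, text }) => {
    socket.to(roomId).emit("new_message", { from: socket.id, text });
  });
});
```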
Now that we understand how real-time systems work, let's see how various features are implemented.
Typing Indicators
- When User A types, their client emits a typing_start event to the room
- User B's client receives this event and displays "User A is typing..."
- When A stops typing, a typing_stop event is emitted
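Continuing the hypothetical Socket.IO setup from earlier, this could look roughly like:

```typescript
import { Server } from "socket.io";

const io = new Server(3000);

io.on("connection", (socket) => {
  // Relay typing events to everyone else in the room; `socket.to(...)`
  // excludes the sender, so User A never sees their own indicator
  socket.on("typing_start", (roomId: string) => {
    socket.to(roomId).emit("typing_start", { userId: socket.id });
  });

  socket.on("typing_stop", (roomId: string) => {
    socket.to(roomId).emit("typing_stop", { userId: socket.id });
  });
});
```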
Message Status (Sent/Delivered/Read)
- Sent: Confirmed when the server receives the message
- Delivered: Triggered when recipient's client acknowledges receipt via WebSocket
- Read: Triggered when recipient opens the chat and views the message (another event)
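A hedged sketch of how those three statuses might be wired up with Socket.IO acknowledgements and events (the event names are again made up for illustration):

```typescript
import { Server } from "socket.io";

const io = new Server(3000);

io.on("connection", (socket) => {
  // "Sent": the server acknowledges back to the sender's client on receipt
  socket.on("send_message", ({ roomId, messageId, text }, ack) => {
    socket.to(roomId).emit("new_message", { messageId, text });
    ack({ status: "sent" });
  });

  // "Delivered": the recipient's client confirms it received the event
  socket.on("message_delivered", ({ roomId, messageId }) => {
    socket.to(roomId).emit("status_update", { messageId, status: "delivered" });
  });

  // "Read": the recipient opened the chat and viewed the message
  socket.on("message_read", ({ roomId, messageId }) => {
    socket.to(roomId).emit("status_update", { messageId, status: "read" });
  });
});
```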
Security: Ensuring Messages Go to the Right User
Messages don't get sent to the wrong users because:
- Each conversation has a unique, secure roomId generated by the server
- Users can only join rooms they're authorized for
- The server validates every event against the roomId and user permissions
- WebSocket connections are authenticated and tied to specific user sessions
Authentication and Authorization
Authentication happens first, when the user logs in via username/password or OAuth.
Authorization happens later, every time User A sends a message:
- Before User A can send messages to User B, the server checks if A has permission to access the specific roomId
- Every event (message, typing indicator, etc.) is validated against room membership
- This prevents unauthorized access and ensures privacy
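A sketch of how a server might enforce this with Socket.IO middleware and JWTs. The `isMember` function is a hypothetical stand-in for your real room-membership lookup:

```typescript
import { Server } from "socket.io";
import jwt from "jsonwebtoken";

const io = new Server(3000);

// Hypothetical membership check against your datastore
async function isMember(userId: string, roomId: string): Promise<boolean> {
  /* ... query room membership ... */
  return true;
}

// Authentication: verify the token once, when the WebSocket connects
io.use((socket, next) => {
  try {
    const payload = jwt.verify(socket.handshake.auth.token, process.env.JWT_SECRET!);
    socket.data.userId = (payload as { sub: string }).sub;
    next();
  } catch {
    next(new Error("unauthorized"));
  }
});

// Authorization: re-check room membership on every event
io.on("connection", (socket) => {
  socket.on("send_message", async ({ roomId, text }) => {
    if (!(await isMember(socket.data.userId, roomId))) return; // not a member: drop the event
    socket.to(roomId).emit("new_message", { from: socket.data.userId, text });
  });
});
```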
Now that we understand how real-time communication works, let's dive into how these platforms handle massive amounts of data with ease.
For this, they use message brokers like RabbitMQ, Apache Kafka, or Redis Pub/Sub.
Why Message Brokers?
How, and more importantly why, do they use them? To answer that, let's identify the problems first:
- A single server can only handle a limited number of WebSocket connections (~10,000-100,000 depending on resources), so one server is never enough
- With millions of users, you need multiple servers
- When User A (connected to Server 1) sends a message to User B (connected to Server 2), how does it work?
What is a Message Broker?
Before diving into the solution, let's look at what a message broker is and what it's made of.
Basically, a message broker is a central communication system that lets different servers and services talk to each other without needing to know where the others are.
A message broker has several internal components:
1. Exchange (The Sorting Center)
- The entry point where all messages arrive
- Decides how to route messages based on rules
- Different types: Direct, Topic, Fanout, Headers
2. Queue (The Mailbox)
- Temporary storage where messages wait
- Each queue holds messages for specific subscribers
- Messages stay here until consumed by a server
3. Binding (The Routing Rule)
- Connects an exchange to a queue
- Defines which messages go to which queue
- Example: "All messages for room_123 go to Queue A"
4. Routing Key (The Address Label)
- Metadata attached to each message
- Helps the exchange decide where to send the message
- Example: "room:room_123" or "user:user_456"
How Message Brokers Solve the Problem
Now that we know what a message broker is, let's see how it solves our problem step by step:
Step 1: When User A sends a message from Server 1:
- Server 1 sends the message to the message broker
- The message includes a routing key (like "room:room_123")
- The message arrives at the exchange (the broker's entry point)
Step 2: The exchange looks at the routing key and decides:
- "This message is for room_123"
- "Which queues are interested in room_123 messages?"
- "Queue_ABC is bound to room_123 messages"
- The exchange forwards the message to Queue_ABC
Step 3: Queue_ABC now holds the message:
- It waits for subscribers to pick it up
- If no subscriber is ready, it stores the message temporarily
- Multiple servers can read from the same queue (or different queues)
Step 4: Server 2, which is subscribed to Queue_ABC:
- Continuously listens to Queue_ABC
- When a new message arrives, Server 2 receives it
- Server 2 then delivers it to User B via WebSocket
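And on Server 2's side, a hedged sketch of the consuming half, assuming the exchange, queue, and binding from the earlier snippet plus a Socket.IO server in scope:

```typescript
import amqp from "amqplib";
import { Server } from "socket.io";

const io = new Server(3001);

async function consume() {
  const connection = await amqp.connect("amqp://localhost");
  const channel = await connection.createChannel();

  // Server 2 continuously listens to queue_abc
  await channel.consume("queue_abc", (msg) => {
    if (!msg) return;
    const { roomId, from, text } = JSON.parse(msg.content.toString());

    // Deliver to User B over their open WebSocket connection
    io.to(roomId).emit("new_message", { from, text });
    channel.ack(msg); // tell the broker the message was handled
  });
}

consume().catch(console.error);
```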
You can compare this to a post office. Suppose you're sending a birthday card:
- You (Server 1) write a card and give it to the post office (Message Broker)
- You write the address on the envelope ("123 Main Street" = Routing Key)
- The sorting facility (Exchange) reads the address
- It puts the card in the correct delivery truck route (Queue) for that neighborhood
- The mail carrier (Server 2) picks up cards from their assigned queue
- The recipient (User B) receives the card at their mailbox
This is how these platforms scale to massive amounts of data. On top of this, they use several more techniques for extra scalability and fault tolerance. They use:
Load Balancers
Load balancers distribute incoming WebSocket connections across multiple servers to ensure no single server gets overwhelmed.
When a user tries to connect:
- The load balancer checks which servers have capacity
- It routes the user to the least busy server
- If one server fails, the load balancer redirects traffic to healthy servers
- This ensures high availability and optimal resource utilization
Database Sharding
With millions of users, a single database becomes a bottleneck. Sharding solves this by splitting data across multiple databases.
How it works:
- User data is divided based on userId or roomId
- Example: Users with IDs 1-1,000,000 go to Database 1, IDs 1,000,001-2,000,000 go to Database 2
- Each database handles a portion of the load
- Queries become faster since each database has less data to search through
- This allows horizontal scaling as user base grows
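A toy sketch of the range-based routing from the example above; real systems often use consistent hashing instead, and the connection strings here are hypothetical:

```typescript
// Each shard owns a contiguous range of user IDs, as in the example above
const SHARD_SIZE = 1_000_000;
const shards = ["postgres://db1", "postgres://db2", "postgres://db3"];

function shardFor(userId: number): string {
  const index = Math.floor((userId - 1) / SHARD_SIZE);
  return shards[Math.min(index, shards.length - 1)];
}

console.log(shardFor(42));        // -> postgres://db1
console.log(shardFor(1_500_000)); // -> postgres://db2
```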
Caching with Redis
Not every piece of data needs to be fetched from the database every time. Redis provides in-memory caching for ultra-fast access.
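For example, a cache-aside pattern with node-redis (v4 API); `fetchUserFromDb` is a stand-in for your real database query:

```typescript
import { createClient } from "redis";

const redis = createClient();
await redis.connect();

// Hypothetical database query
async function fetchUserFromDb(userId: string): Promise<string> {
  return JSON.stringify({ id: userId, name: "..." });
}

async function getUser(userId: string): Promise<string> {
  const key = `user:${userId}`;

  // Serve from memory when possible: no database round-trip
  const cached = await redis.get(key);
  if (cached) return cached;

  // Cache miss: load from the database, then cache for 60 seconds
  const user = await fetchUserFromDb(userId);
  await redis.set(key, user, { EX: 60 });
  return user;
}
```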
Horizontal Scaling
Unlike vertical scaling (adding more power to one server), horizontal scaling means adding more servers as demand increases.
Why it matters:
- Need more capacity? Add more servers
- Each new server can handle additional WebSocket connections
- Servers subscribe to relevant message queues
- No single point of failure
- Can scale up during peak hours, scale down during low traffic
Handling Concurrency
With millions of operations happening simultaneously, preventing conflicts and data corruption is critical. Let's see how messaging platforms handle concurrency:
Distributed Locks
When multiple servers need to perform operations that shouldn't happen simultaneously:
- Redis provides distributed locking mechanisms
- Before a critical operation (like creating a group), a server acquires a lock
- Other servers wait until the lock is released
- Prevents duplicate group creation or conflicting updates
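A minimal sketch of a Redis lock using `SET` with `NX` and `EX`; this is the simple single-instance pattern, not the full Redlock algorithm:

```typescript
import { randomUUID } from "crypto";
import { createClient } from "redis";

const redis = createClient();
await redis.connect();

async function withLock(lockKey: string, work: () => Promise<void>) {
  const token = randomUUID(); // unique token so we only release our own lock

  // NX: only set if the key doesn't exist; EX: auto-expire after 10s if we crash
  const acquired = await redis.set(lockKey, token, { NX: true, EX: 10 });
  if (acquired !== "OK") return; // another server holds the lock; back off

  try {
    await work(); // e.g. create the group exactly once
  } finally {
    // Release only if we still own the lock (a real implementation would
    // do this check-and-delete atomically with a Lua script)
    if ((await redis.get(lockKey)) === token) await redis.del(lockKey);
  }
}
```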
Database Transactions
Ensures data consistency when multiple operations must succeed or fail together:
- If saving a message involves updating multiple tables
- All operations complete successfully, or all are rolled back
- No partial data corruption
- ACID properties maintain data integrity
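For instance, saving a message and updating the conversation inside one transaction with node-postgres; the table and column names are illustrative:

```typescript
import { Pool } from "pg";

const pool = new Pool();

async function saveMessage(roomId: string, senderId: string, text: string) {
  const client = await pool.connect();
  try {
    await client.query("BEGIN");

    // Both writes succeed together, or neither is applied
    await client.query(
      "INSERT INTO messages (room_id, sender_id, body) VALUES ($1, $2, $3)",
      [roomId, senderId, text]
    );
    await client.query(
      "UPDATE rooms SET last_message_at = NOW() WHERE id = $1",
      [roomId]
    );

    await client.query("COMMIT");
  } catch (err) {
    await client.query("ROLLBACK"); // no partial data corruption
    throw err;
  } finally {
    client.release();
  }
}
```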
Idempotency Keys
Prevents duplicate messages when network issues cause retries:
- Each message gets a unique ID generated on the client
- If the same message is sent twice (due to network retry)
- Server checks: "Have I seen this ID before?"
- Duplicate requests are ignored, original response is returned
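One way to sketch this with Redis: `SET` with `NX` marks a message ID as seen exactly once. The 24-hour TTL is an arbitrary choice for this sketch:

```typescript
import { createClient } from "redis";

const redis = createClient();
await redis.connect();

async function handleMessage(messageId: string, deliver: () => Promise<void>) {
  // NX makes this a one-time claim: it fails if the ID was already recorded
  const firstTime = await redis.set(`idem:${messageId}`, "1", {
    NX: true,
    EX: 60 * 60 * 24, // remember IDs for 24 hours
  });

  if (firstTime !== "OK") return; // duplicate from a network retry: ignore it
  await deliver(); // process the message exactly once
}
```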
Message Ordering in Kafka
Ensures messages appear in the correct sequence:
- Messages for the same room go to the same partition
- Kafka guarantees order within a partition
- User B sees messages in the order User A sent them
- Critical for maintaining conversation flow
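With kafkajs, keying each message by its `roomId` makes Kafka hash every message for that room to the same partition, which is what preserves order (the topic name is illustrative):

```typescript
import { Kafka } from "kafkajs";

const kafka = new Kafka({ clientId: "chat", brokers: ["localhost:9092"] });
const producer = kafka.producer();
await producer.connect();

async function publishMessage(roomId: string, text: string) {
  // Same key -> same partition, and Kafka guarantees order within a partition,
  // so User B sees the room's messages in the order User A sent them
  await producer.send({
    topic: "chat-messages",
    messages: [{ key: roomId, value: JSON.stringify({ roomId, text }) }],
  });
}
```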
Rate Limiting
Prevents abuse and ensures fair resource distribution:
- Each user has a limit on messages per second
- Implemented using token bucket algorithm in Redis
- If a user exceeds the limit, requests are throttled
- Protects system from spam and malicious actors
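Here's a simplified fixed-window counter in Redis; a real token bucket is a bit more involved (often a Lua script), but the idea is the same, and the limit value is an arbitrary example:

```typescript
import { createClient } from "redis";

const redis = createClient();
await redis.connect();

const LIMIT = 10; // max messages per user per second

async function allowMessage(userId: string): Promise<boolean> {
  const key = `rate:${userId}:${Math.floor(Date.now() / 1000)}`; // one key per second

  const count = await redis.incr(key); // atomic even under high concurrency
  if (count === 1) await redis.expire(key, 2); // clean the window up afterwards

  return count <= LIMIT; // false -> throttle this request
}
```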
Atomic Operations in Redis
For operations that must complete without interruption:
- Incrementing message counters
- Updating user presence status
- Managing typing indicators
- Redis ensures these happen atomically even under high concurrency
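For example, `INCR` is a single atomic command, so two servers incrementing the same counter can never lose an update (the key names here are made up):

```typescript
import { createClient } from "redis";

const redis = createClient();
await redis.connect();

// Atomic increment: safe even if many servers run this concurrently
const total = await redis.incr("room:room_123:message_count");

// Presence with a TTL: the key disappears if the heartbeat stops
await redis.set("presence:user_456", "online", { EX: 30 });
```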
Now let's connect all the dots:
- User A's client sends a message via WebSocket to Server 1
- A rate limiter checks that User A hasn't exceeded message limits
- Server 1 validates authentication and authorization
- Idempotency check ensures this isn't a duplicate message
- Database transaction saves the message atomically
- Server 1 publishes event to message broker with routing key
- Exchange routes message to appropriate queue
- Queue holds message temporarily
- Server 2 (subscribed to queue) receives the message
- Server 2 pushes message to User B via WebSocket
- User B sees the message instantly
This event-driven architecture, combined with WebSockets, message brokers, and distributed systems, is what powers the seamless real-time experiences we enjoy in modern messaging apps every day.