<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Pujon Das Auvi</title>
    <description>The latest articles on DEV Community by Pujon Das Auvi (@thepujon).</description>
    <link>https://dev.to/thepujon</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1262090%2Fc4a5ff55-a99a-457f-b140-b9416844cfc3.jpeg</url>
      <title>DEV Community: Pujon Das Auvi</title>
      <link>https://dev.to/thepujon</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/thepujon"/>
    <language>en</language>
    <item>
      <title>How Real-Time Messaging Apps Actually Work and Handle Massive Amounts of Data</title>
      <dc:creator>Pujon Das Auvi</dc:creator>
      <pubDate>Sun, 01 Feb 2026 13:41:22 +0000</pubDate>
      <link>https://dev.to/thepujon/how-real-time-messaging-apps-actually-work-and-handle-massive-amount-of-data-3p9j</link>
      <guid>https://dev.to/thepujon/how-real-time-messaging-apps-actually-work-and-handle-massive-amount-of-data-3p9j</guid>
      <description>&lt;p&gt;We all use social media messaging platforms like WhatsApp, Messenger, and Telegram. But have you ever wondered how they manage to handle so many things in real-time, like showing typing indicators, authenticating users, processing massive amounts of data concurrently and in parallel, while also ensuring messages don't get sent to the wrong users?&lt;/p&gt;

&lt;p&gt;This is fundamentally &lt;strong&gt;event-driven architecture&lt;/strong&gt; in action. Let's dive into this today.&lt;/p&gt;

&lt;h2&gt;
  
  
  How Real-Time Connections Work
&lt;/h2&gt;

&lt;p&gt;First, we need to talk about how real-time connections work in the first place. Real-time communication relies on a technology called &lt;strong&gt;WebSockets&lt;/strong&gt;, which is fundamentally different from traditional HTTP connections used for REST APIs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Traditional HTTP&lt;/strong&gt; works like this:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Client sends a request --- Server processes --- Server sends response --- Connection closes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For every new piece of data, the client must open a new request, and once the server sends its response, the connection closes.&lt;/p&gt;

&lt;p&gt;A &lt;strong&gt;WebSocket&lt;/strong&gt;, on the other hand, works differently:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User A ---- WebSocket ---- Server ---- WebSocket ---- User
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With WebSockets, the connection doesn't close. Instead, a persistent bidirectional TCP connection is established between users and the server, allowing data to flow in both directions at any time.&lt;/p&gt;

&lt;h3&gt;
  
  
  How Messages Flow
&lt;/h3&gt;

&lt;p&gt;Let's say there are two users: User A and User B.&lt;br&gt;
When User A sends User B a message saying "Hi", here's what happens:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A and B are connected through a unique identifier (eventId/roomId/channelId)&lt;/li&gt;
&lt;li&gt;Messages are events. When A sends "Hi", their client emits an event that says: "I pushed 'Hi' to room X, notify User B"&lt;/li&gt;
&lt;li&gt;The server receives this event and immediately pushes it to all users subscribed to that room (in this case, User B)&lt;/li&gt;
&lt;li&gt;User B's client receives the event instantly through their open WebSocket connection&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is the fundamental mechanism behind real-time messaging.&lt;/p&gt;
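
&lt;p&gt;The mechanism above can be sketched in a few lines of Python. This is an illustrative toy, not any platform's real API: an in-memory dict stands in for WebSocket connections, and the names (ChatServer, room_X) are made up:&lt;/p&gt;

```python
# Toy sketch of room-based event dispatch. In a real system, each inbox
# would be an open WebSocket connection instead of a list.
class ChatServer:
    def __init__(self):
        self.rooms = {}    # roomId -> set of subscribed user ids
        self.inboxes = {}  # userId -> list of received events

    def join(self, room_id, user_id):
        self.rooms.setdefault(room_id, set()).add(user_id)
        self.inboxes.setdefault(user_id, [])

    def send(self, room_id, sender, text):
        # The message is an event fanned out to every other room subscriber.
        event = {"room": room_id, "from": sender, "text": text}
        for user in self.rooms.get(room_id, set()):
            if user != sender:
                self.inboxes[user].append(event)

server = ChatServer()
server.join("room_X", "user_A")
server.join("room_X", "user_B")
server.send("room_X", "user_A", "Hi")
print(server.inboxes["user_B"])  # [{'room': 'room_X', 'from': 'user_A', 'text': 'Hi'}]
```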

&lt;p&gt;Now that we understand how real-time systems work, let's see how various features are implemented.&lt;/p&gt;

&lt;h3&gt;
  
  
  Typing Indicators
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;When User A types, their client emits a &lt;strong&gt;typing_start&lt;/strong&gt; event to the room&lt;/li&gt;
&lt;li&gt;User B's client receives this event and displays "User A is typing..."&lt;/li&gt;
&lt;li&gt;When A stops typing, a &lt;strong&gt;typing_stop&lt;/strong&gt; event is emitted&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Message Status (Sent/Delivered/Read)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Sent:&lt;/strong&gt; Confirmed when server receives the message&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Delivered:&lt;/strong&gt; Triggered when recipient's client acknowledges receipt via WebSocket&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Read:&lt;/strong&gt; Triggered when recipient opens the chat and views the message (another event)&lt;/li&gt;
&lt;/ul&gt;
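
&lt;p&gt;These three statuses behave like a tiny event-driven state machine. Here is a hedged sketch; the event names (server_ack, client_ack, chat_opened) are invented for illustration:&lt;/p&gt;

```python
# Each acknowledgement event advances the status one step; repeated or
# out-of-order events are ignored.
ORDER = ["sent", "delivered", "read"]

class Message:
    def __init__(self):
        self.status = None

    def on_event(self, event):
        target = {"server_ack": "sent", "client_ack": "delivered",
                  "chat_opened": "read"}[event]
        current = -1 if self.status is None else ORDER.index(self.status)
        if ORDER.index(target) == current + 1:
            self.status = target

msg = Message()
for ev in ["server_ack", "client_ack", "client_ack", "chat_opened"]:
    msg.on_event(ev)
print(msg.status)  # read
```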

&lt;h3&gt;
  
  
  Security: Ensuring Messages Go to the Right User
&lt;/h3&gt;

&lt;p&gt;Messages don't get sent to the wrong users because:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each conversation has a unique, secure roomId generated by the server&lt;/li&gt;
&lt;li&gt;Users can only join rooms they're authorized for&lt;/li&gt;
&lt;li&gt;The server validates every event against the roomId and user permissions&lt;/li&gt;
&lt;li&gt;WebSocket connections are authenticated and tied to specific user sessions&lt;/li&gt;
&lt;/ul&gt;
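
&lt;p&gt;A minimal sketch of that server-side validation, assuming a hypothetical room_members map (a real system would consult a session store or database):&lt;/p&gt;

```python
# Every incoming event is checked against room membership before fan-out.
room_members = {"room_123": {"user_A", "user_B"}}

def handle_event(room_id, sender, text):
    # Reject events whose sender is not a member of the target room.
    if sender not in room_members.get(room_id, set()):
        return ("rejected", [])
    return ("delivered", sorted(room_members[room_id] - {sender}))

print(handle_event("room_123", "user_A", "Hi"))    # ('delivered', ['user_B'])
print(handle_event("room_123", "intruder", "Hi"))  # ('rejected', [])
```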

&lt;h3&gt;
  
  
  Authentication and Authorization
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Authentication&lt;/strong&gt; happens first, when the user logs in via username/password or OAuth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authorization&lt;/strong&gt;, on the other hand, happens every time User A sends a message.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Before User A can send messages to User B, the server checks if A has permission to access the specific roomId&lt;/li&gt;
&lt;li&gt;Every event (message, typing indicator, etc.) is validated against room membership&lt;/li&gt;
&lt;li&gt;This prevents unauthorized access and ensures privacy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now that we understand how real-time communication works, let's dive into how these platforms handle massive amounts of data with ease.&lt;br&gt;
For this, they use message brokers like &lt;strong&gt;RabbitMQ&lt;/strong&gt;, &lt;strong&gt;Apache Kafka&lt;/strong&gt;, or &lt;strong&gt;Redis Pub/Sub&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Message Brokers?
&lt;/h3&gt;

&lt;p&gt;How, and more importantly why, do they use them? Let's identify the problems first:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;A single server can only handle a limited number of WebSocket connections (~10,000-100,000 depending on resources), so one server alone can't serve everyone&lt;/li&gt;
&lt;li&gt;With millions of users, you need multiple servers&lt;/li&gt;
&lt;li&gt;When User A (connected to Server 1) sends a message to User B (connected to Server 2), how does it work?&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What is a Message Broker?
&lt;/h3&gt;

&lt;p&gt;Before getting into the solution, let's look at what a message broker is and what it's made of.&lt;/p&gt;

&lt;p&gt;Basically, a message broker is a central communication system that lets different servers and services talk to each other without needing to know where the others are.&lt;/p&gt;

&lt;p&gt;A message broker has several internal components:&lt;/p&gt;

&lt;h4&gt;
  
  
  1. Exchange (The Sorting Center)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;The entry point where all messages arrive&lt;/li&gt;
&lt;li&gt;Decides how to route messages based on rules&lt;/li&gt;
&lt;li&gt;Different types: Direct, Topic, Fanout, Headers&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  2. Queue (The Mailbox)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Temporary storage where messages wait&lt;/li&gt;
&lt;li&gt;Each queue holds messages for specific subscribers&lt;/li&gt;
&lt;li&gt;Messages stay here until consumed by a server&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  3. Binding (The Routing Rule)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Connects an exchange to a queue&lt;/li&gt;
&lt;li&gt;Defines which messages go to which queue&lt;/li&gt;
&lt;li&gt;Example: "All messages for room_123 go to Queue A"&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  4. Routing Key (The Address Label)
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Metadata attached to each message&lt;/li&gt;
&lt;li&gt;Helps the exchange decide where to send the message&lt;/li&gt;
&lt;li&gt;Example: "room:room_123" or "user:user_456"&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  How Message Brokers Solve the Problem
&lt;/h3&gt;

&lt;p&gt;Now that we know what a message broker is, let's see how it solves our problem step by step:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1: When User A sends a message from Server 1:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Server 1 sends the message to the &lt;strong&gt;message broker&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;The message includes a &lt;strong&gt;routing key&lt;/strong&gt; (like "room:room_123")&lt;/li&gt;
&lt;li&gt;The message arrives at the &lt;strong&gt;exchange&lt;/strong&gt; (the broker's entry point)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Step 2: The exchange looks at the routing key and decides:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;"This message is for room_123"&lt;/li&gt;
&lt;li&gt;"Which queues are interested in room_123 messages?"&lt;/li&gt;
&lt;li&gt;"Queue_ABC is bound to room_123 messages"&lt;/li&gt;
&lt;li&gt;The exchange forwards the message to Queue_ABC&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Queue_ABC now holds the message:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;It waits for subscribers to pick it up&lt;/li&gt;
&lt;li&gt;If no subscriber is ready, it stores the message temporarily&lt;/li&gt;
&lt;li&gt;Multiple servers can read from the same queue (or different queues)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Step 4: Server 2, which is subscribed to Queue_ABC:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Continuously listens to Queue_ABC&lt;/li&gt;
&lt;li&gt;When a new message arrives, Server 2 receives it&lt;/li&gt;
&lt;li&gt;Server 2 then delivers it to User B via WebSocket&lt;/li&gt;
&lt;/ul&gt;
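
&lt;p&gt;The four steps above can be sketched with a toy direct exchange in Python. This is illustrative, not RabbitMQ's actual API; Broker, Queue_ABC, and the routing-key format are assumed names:&lt;/p&gt;

```python
from collections import defaultdict, deque

class Broker:
    def __init__(self):
        self.bindings = defaultdict(list)  # routing key -> bound queue names
        self.queues = defaultdict(deque)   # queue name -> buffered messages

    def bind(self, routing_key, queue):
        # Binding: "messages with this key go to this queue"
        self.bindings[routing_key].append(queue)

    def publish(self, routing_key, message):
        # Steps 1-3: the exchange routes by key into every bound queue.
        for queue in self.bindings[routing_key]:
            self.queues[queue].append(message)

    def consume(self, queue):
        # Step 4: a subscribed server pops the next message, if any.
        return self.queues[queue].popleft() if self.queues[queue] else None

broker = Broker()
broker.bind("room:room_123", "Queue_ABC")     # Server 2 subscribes
broker.publish("room:room_123", "Hi from A")  # Server 1 publishes
print(broker.consume("Queue_ABC"))            # Hi from A
```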

&lt;p&gt;You can compare this to a post office.&lt;br&gt;
Suppose you're sending a birthday card:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;You (Server 1)&lt;/strong&gt; write a card and give it to the &lt;strong&gt;post office (Message Broker)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;You write the address on the envelope &lt;strong&gt;("123 Main Street" = Routing Key)&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;sorting facility (Exchange)&lt;/strong&gt; reads the address&lt;/li&gt;
&lt;li&gt;It puts the card in the &lt;strong&gt;correct delivery truck route (Queue)&lt;/strong&gt; for that neighborhood&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;mail carrier (Server 2)&lt;/strong&gt; picks up cards from their assigned queue&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;recipient (User B)&lt;/strong&gt; receives the card at their mailbox&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This is how these platforms scale to massive amounts of data. On top of this, they use several other techniques so the system keeps working without failure:&lt;/p&gt;

&lt;h3&gt;
  
  
  Load Balancers
&lt;/h3&gt;

&lt;p&gt;Load balancers distribute incoming WebSocket connections across multiple servers to ensure no single server gets overwhelmed.&lt;/p&gt;

&lt;p&gt;When a user tries to connect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The load balancer checks which servers have capacity&lt;/li&gt;
&lt;li&gt;It routes the user to the least busy server&lt;/li&gt;
&lt;li&gt;If one server fails, the load balancer redirects traffic to healthy servers&lt;/li&gt;
&lt;li&gt;This ensures high availability and optimal resource utilization&lt;/li&gt;
&lt;/ul&gt;
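
&lt;p&gt;One common strategy is least-connections routing, sketched below; the server table is made up for illustration:&lt;/p&gt;

```python
# Pick the healthy server with the fewest active WebSocket connections.
servers = {"server_1": {"healthy": True, "connections": 4200},
           "server_2": {"healthy": True, "connections": 1800},
           "server_3": {"healthy": False, "connections": 0}}  # failed node

def pick_server():
    healthy = {name: s for name, s in servers.items() if s["healthy"]}
    return min(healthy, key=lambda name: healthy[name]["connections"])

name = pick_server()
servers[name]["connections"] += 1  # the new user connects here
print(name)  # server_2
```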

&lt;h3&gt;
  
  
  Database Sharding
&lt;/h3&gt;

&lt;p&gt;With millions of users, a single database becomes a bottleneck. Sharding solves this by splitting data across multiple databases.&lt;/p&gt;

&lt;p&gt;How it works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;User data is divided based on userId or roomId&lt;/li&gt;
&lt;li&gt;Example: Users with IDs 1-1,000,000 go to Database 1, IDs 1,000,001-2,000,000 go to Database 2&lt;/li&gt;
&lt;li&gt;Each database handles a portion of the load&lt;/li&gt;
&lt;li&gt;Queries become faster since each database has less data to search through&lt;/li&gt;
&lt;li&gt;This allows horizontal scaling as user base grows&lt;/li&gt;
&lt;/ul&gt;
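
&lt;p&gt;The range-based scheme from the example reduces to a one-line shard lookup. A sketch (many real systems use consistent hashing instead, so data doesn't have to move when shards are added):&lt;/p&gt;

```python
# Each database owns a contiguous band of one million user ids.
SHARD_SIZE = 1_000_000

def shard_for(user_id):
    return (user_id - 1) // SHARD_SIZE + 1  # ids 1..1,000,000 -> shard 1

print(shard_for(1))          # 1
print(shard_for(1_000_000))  # 1
print(shard_for(1_000_001))  # 2
```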

&lt;h3&gt;
  
  
  Caching with Redis
&lt;/h3&gt;

&lt;p&gt;Not every piece of data needs to be fetched from the database every time. Redis provides in-memory caching for ultra-fast access.&lt;/p&gt;

&lt;h3&gt;
  
  
  Horizontal Scaling
&lt;/h3&gt;

&lt;p&gt;Unlike vertical scaling (adding more power to one server), horizontal scaling means adding more servers as demand increases.&lt;/p&gt;

&lt;p&gt;Why this works:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Need more capacity? Add more servers&lt;/li&gt;
&lt;li&gt;Each new server can handle additional WebSocket connections&lt;/li&gt;
&lt;li&gt;Servers subscribe to relevant message queues&lt;/li&gt;
&lt;li&gt;No single point of failure&lt;/li&gt;
&lt;li&gt;Can scale up during peak hours, scale down during low traffic&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Handling Concurrency
&lt;/h2&gt;

&lt;p&gt;With millions of operations happening simultaneously, preventing conflicts and data corruption is critical. Let's see how messaging platforms handle concurrency:&lt;/p&gt;

&lt;h3&gt;
  
  
  Distributed Locks
&lt;/h3&gt;

&lt;p&gt;When multiple servers need to perform operations that shouldn't happen simultaneously:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Redis provides distributed locking mechanisms&lt;/li&gt;
&lt;li&gt;Before a critical operation (like creating a group), a server acquires a lock&lt;/li&gt;
&lt;li&gt;Other servers wait until the lock is released&lt;/li&gt;
&lt;li&gt;Prevents duplicate group creation or conflicting updates&lt;/li&gt;
&lt;/ul&gt;
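
&lt;p&gt;A sketch of the classic Redis locking pattern (SET key value NX), simulated here with a dict so the example is self-contained. A real deployment would use a Redis client, a TTL on the lock, and care around expiry:&lt;/p&gt;

```python
import uuid

store = {}  # stands in for Redis

def acquire(lock_name):
    # NX semantics: only succeeds if the key does not already exist.
    if lock_name in store:
        return None
    token = str(uuid.uuid4())  # unique token identifies the lock holder
    store[lock_name] = token
    return token

def release(lock_name, token):
    # Only the holder (matching token) may release the lock.
    if store.get(lock_name) == token:
        del store[lock_name]

t1 = acquire("create_group:42")  # Server 1 wins the lock
t2 = acquire("create_group:42")  # Server 2 is refused: no duplicate group
print(t1 is not None, t2)        # True None
release("create_group:42", t1)
```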

&lt;h3&gt;
  
  
  Database Transactions
&lt;/h3&gt;

&lt;p&gt;Ensures data consistency when multiple operations must succeed or fail together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;If saving a message involves updating multiple tables&lt;/li&gt;
&lt;li&gt;All operations complete successfully, or all are rolled back&lt;/li&gt;
&lt;li&gt;No partial data corruption&lt;/li&gt;
&lt;li&gt;ACID properties maintain data integrity&lt;/li&gt;
&lt;/ul&gt;
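
&lt;p&gt;A runnable sketch of the all-or-nothing behaviour, using sqlite3 as a stand-in database: saving a message touches two tables, and a simulated crash rolls both writes back together:&lt;/p&gt;

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE messages (id INTEGER PRIMARY KEY, body TEXT)")
conn.execute("CREATE TABLE counters (room TEXT PRIMARY KEY, n INTEGER)")
conn.execute("INSERT INTO counters VALUES ('room_123', 0)")
conn.commit()

try:
    with conn:  # commits on success, rolls back on any exception
        conn.execute("INSERT INTO messages (body) VALUES ('Hi')")
        conn.execute("UPDATE counters SET n = n + 1 WHERE room = 'room_123'")
        raise RuntimeError("simulated crash mid-transaction")
except RuntimeError:
    pass

# Both writes were rolled back together: no partial data.
print(conn.execute("SELECT COUNT(*) FROM messages").fetchone()[0])  # 0
print(conn.execute("SELECT n FROM counters").fetchone()[0])         # 0
```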

&lt;h3&gt;
  
  
  Idempotency Keys
&lt;/h3&gt;

&lt;p&gt;Prevents duplicate messages when network issues cause retries:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each message gets a unique ID generated on the client&lt;/li&gt;
&lt;li&gt;If the same message is sent twice (due to network retry)&lt;/li&gt;
&lt;li&gt;Server checks: "Have I seen this ID before?"&lt;/li&gt;
&lt;li&gt;Duplicate requests are ignored, original response is returned&lt;/li&gt;
&lt;/ul&gt;
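
&lt;p&gt;A sketch of that deduplication check (handle_send and the response shape are illustrative names, not a real API):&lt;/p&gt;

```python
import uuid

seen = {}  # idempotency key -> stored response

def handle_send(key, text):
    if key in seen:
        return seen[key]  # duplicate retry: replay the original response
    response = {"stored": text, "msg_no": len(seen) + 1}
    seen[key] = response
    return response

key = str(uuid.uuid4())         # unique id generated on the client
first = handle_send(key, "Hi")
retry = handle_send(key, "Hi")  # network retry reuses the same key
print(first is retry)           # True - only one message was stored
```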

&lt;h3&gt;
  
  
  Message Ordering in Kafka
&lt;/h3&gt;

&lt;p&gt;Ensures messages appear in the correct sequence:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Messages for the same room go to the same partition&lt;/li&gt;
&lt;li&gt;Kafka guarantees order within a partition&lt;/li&gt;
&lt;li&gt;User B sees messages in the order User A sent them&lt;/li&gt;
&lt;li&gt;Critical for maintaining conversation flow&lt;/li&gt;
&lt;/ul&gt;
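
&lt;p&gt;The partition-assignment idea can be sketched as hashing the room id, so every message for a room lands on the same partition. NUM_PARTITIONS and crc32 are illustrative choices here, not Kafka's exact default partitioner:&lt;/p&gt;

```python
import zlib

NUM_PARTITIONS = 12

def partition_for(room_id):
    # Same key -> same hash -> same partition -> guaranteed ordering.
    return zlib.crc32(room_id.encode()) % NUM_PARTITIONS

p1 = partition_for("room_123")
p2 = partition_for("room_123")
print(p1 == p2)  # True - a room's messages never split across partitions
```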

&lt;h3&gt;
  
  
  Rate Limiting
&lt;/h3&gt;

&lt;p&gt;Prevents abuse and ensures fair resource distribution:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each user has a limit on messages per second&lt;/li&gt;
&lt;li&gt;Implemented using token bucket algorithm in Redis&lt;/li&gt;
&lt;li&gt;If a user exceeds the limit, requests are throttled&lt;/li&gt;
&lt;li&gt;Protects system from spam and malicious actors&lt;/li&gt;
&lt;/ul&gt;
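
&lt;p&gt;A sketch of the token bucket algorithm. It is kept in process memory here to stay self-contained; as noted above, production systems keep the bucket in Redis so every server shares the same count:&lt;/p&gt;

```python
import time

class TokenBucket:
    def __init__(self, rate, capacity):
        self.rate = rate          # tokens refilled per second
        self.capacity = capacity  # maximum burst size
        self.tokens = capacity
        self.last = time.monotonic()

    def allow(self):
        now = time.monotonic()
        # Refill in proportion to elapsed time, capped at capacity.
        self.tokens = min(self.capacity,
                          self.tokens + (now - self.last) * self.rate)
        self.last = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False              # bucket empty: request is throttled

bucket = TokenBucket(rate=1, capacity=5)
results = [bucket.allow() for _ in range(6)]
print(results)  # first five allowed, the sixth is throttled
```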

&lt;h3&gt;
  
  
  Atomic Operations in Redis
&lt;/h3&gt;

&lt;p&gt;For operations that must complete without interruption:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Incrementing message counters&lt;/li&gt;
&lt;li&gt;Updating user presence status&lt;/li&gt;
&lt;li&gt;Managing typing indicators&lt;/li&gt;
&lt;li&gt;Redis ensures these happen atomically even under high concurrency&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now let's connect all dots together:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User A's client sends message via WebSocket to Server 1&lt;/li&gt;
&lt;li&gt;The rate limiter checks that User A hasn't exceeded message limits&lt;/li&gt;
&lt;li&gt;Server 1 validates authentication and authorization&lt;/li&gt;
&lt;li&gt;Idempotency check ensures this isn't a duplicate message&lt;/li&gt;
&lt;li&gt;Database transaction saves the message atomically&lt;/li&gt;
&lt;li&gt;Server 1 publishes event to message broker with routing key&lt;/li&gt;
&lt;li&gt;Exchange routes message to appropriate queue&lt;/li&gt;
&lt;li&gt;Queue holds message temporarily&lt;/li&gt;
&lt;li&gt;Server 2 (subscribed to queue) receives the message&lt;/li&gt;
&lt;li&gt;Server 2 pushes message to User B via WebSocket&lt;/li&gt;
&lt;li&gt;User B sees the message instantly&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This event-driven architecture, combined with WebSockets, message brokers, and distributed systems, is what powers the seamless real-time experiences we enjoy in modern messaging apps every day.&lt;/p&gt;

</description>
      <category>systemdesign</category>
    </item>
  </channel>
</rss>
