<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Kader Khan</title>
    <description>The latest articles on DEV Community by Kader Khan (@abirk).</description>
    <link>https://dev.to/abirk</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F2883625%2F92e2a1d4-15e6-45de-8914-c7d43590966b.png</url>
      <title>DEV Community: Kader Khan</title>
      <link>https://dev.to/abirk</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/abirk"/>
    <language>en</language>
    <item>
      <title>WebRTC P2P vs MCU vs SFU</title>
      <dc:creator>Kader Khan</dc:creator>
      <pubDate>Tue, 06 Jan 2026 09:07:39 +0000</pubDate>
      <link>https://dev.to/abirk/webrtc-p2p-vs-mcu-vs-sfu-1b89</link>
      <guid>https://dev.to/abirk/webrtc-p2p-vs-mcu-vs-sfu-1b89</guid>
      <description>&lt;h2&gt;
  
  
  &lt;strong&gt;1. What Is WebRTC (Quick Overview)?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;WebRTC stands for &lt;strong&gt;Web Real-Time Communication&lt;/strong&gt; — an open standard that enables &lt;strong&gt;audio and video streaming directly between browsers and apps&lt;/strong&gt; without plugins. It’s the foundation of modern video calling on the web because it:&lt;/p&gt;

&lt;p&gt;📌 Works in most browsers&lt;br&gt;
📌 Uses real-time protocols (RTP/UDP) for low delay&lt;br&gt;
📌 Secures streams with encryption&lt;br&gt;
📌 Doesn’t require installation of special plugins&lt;/p&gt;

&lt;p&gt;But at its core, WebRTC was originally designed for &lt;strong&gt;peer-to-peer connections&lt;/strong&gt; — meaning &lt;em&gt;one peer connects directly to another&lt;/em&gt;. This is great &lt;em&gt;for 1-to-1 calls&lt;/em&gt;, but becomes complicated with more participants.&lt;/p&gt;


&lt;h2&gt;
  
  
  &lt;strong&gt;2. Peer-to-Peer (P2P) ➝ Mesh Architecture&lt;/strong&gt;
&lt;/h2&gt;
&lt;h3&gt;
  
  
  🌐 How P2P works
&lt;/h3&gt;

&lt;p&gt;Imagine you and one other person want a video call. WebRTC makes a &lt;strong&gt;direct connection&lt;/strong&gt; between your device and theirs. Both devices send and receive streams &lt;em&gt;directly&lt;/em&gt; — no server in the middle.&lt;/p&gt;

&lt;p&gt;This is ideal for &lt;strong&gt;one-to-one video calls&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;✔ Low latency&lt;br&gt;
✔ No central server required&lt;br&gt;
✔ No additional cost&lt;/p&gt;
&lt;h3&gt;
  
  
  🧠 But what if more people join?
&lt;/h3&gt;

&lt;p&gt;If you add a &lt;strong&gt;third person&lt;/strong&gt;, each participant must connect with &lt;em&gt;each other&lt;/em&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;A ↔ B
A ↔ C
B ↔ C
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s 3 connections. If you add a fourth, it becomes more tangled:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;6 total connections:
A↔B, A↔C, A↔D,
B↔C, B↔D,
C↔D
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This pattern is called a &lt;strong&gt;mesh&lt;/strong&gt; — each peer connects to all others directly.&lt;/p&gt;

&lt;h3&gt;
  
  
  📉 Problems with Mesh
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;🔄 &lt;strong&gt;Bandwidth explosion:&lt;/strong&gt; Each peer must send its video stream to every other peer — quickly saturating upload bandwidth.&lt;/li&gt;
&lt;li&gt;🖥 &lt;strong&gt;CPU &amp;amp; encoding cost:&lt;/strong&gt; Each peer must encode its video separately for every other participant.&lt;/li&gt;
&lt;li&gt;🧪 &lt;strong&gt;Not reliable when peers &amp;gt; ~4–6&lt;/strong&gt;, especially over mobile or slow networks.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Thus, &lt;strong&gt;mesh works only for very small groups&lt;/strong&gt; (usually up to ~5 participants).&lt;/p&gt;
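&lt;p&gt;The growth above is just "n choose 2." A tiny Python sketch (illustrative only, not WebRTC API code) shows how quickly mesh link counts climb:&lt;/p&gt;

```python
def mesh_connections(n):
    # Every pair of peers needs its own direct link: n choose 2.
    return n * (n - 1) // 2

for n in (2, 3, 4, 5, 8):
    # 3 peers need 3 links, 4 need 6, and 8 already need 28.
    print(n, "participants:", mesh_connections(n), "connections")
```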




&lt;h2&gt;
  
  
  &lt;strong&gt;3. Beyond Mesh — Server-Mediated Architectures&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;To build scalable multi-party calling, we introduce a &lt;strong&gt;central media server&lt;/strong&gt;. This server can relieve peers from uploading to every other peer. There are &lt;em&gt;two major ways&lt;/em&gt; to do this:&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;A. SFU — Selective Forwarding Unit&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  🧠 What SFU does
&lt;/h4&gt;

&lt;p&gt;With SFU:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Every peer sends &lt;strong&gt;their stream once&lt;/strong&gt; to the server.&lt;/li&gt;
&lt;li&gt;The SFU &lt;strong&gt;forwards streams&lt;/strong&gt; to all other participants — but it &lt;em&gt;doesn’t decode or re-encode&lt;/em&gt; them.&lt;/li&gt;
&lt;li&gt;Each peer receives the streams it wants and renders them.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;SFU acts like a &lt;strong&gt;traffic hub&lt;/strong&gt;: one upload from each user, and multiple forwards.&lt;/p&gt;

&lt;h4&gt;
  
  
  📊 Example
&lt;/h4&gt;

&lt;p&gt;Imagine 5 participants:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You send your stream _once_ → SFU  
SFU sends out your video to Bl, B2, B3, B4 → each gets the streams they subscribed to
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each participant still receives (N-1) streams, but they &lt;em&gt;only upload once&lt;/em&gt;.&lt;/p&gt;

&lt;h4&gt;
  
  
  ⭐ Advantages of SFU
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;📈 Scales better than mesh — because upload cost on the user side doesn’t explode.&lt;/li&gt;
&lt;li&gt;⚡ Lower server load — the server only &lt;strong&gt;forwards&lt;/strong&gt; packets; it never decodes or re-encodes media.&lt;/li&gt;
&lt;li&gt;🎛 Clients can choose which streams to show (e.g., pin a speaker).&lt;/li&gt;
&lt;li&gt;📱 Supports &lt;em&gt;simulcast&lt;/em&gt; (multiple quality layers) — better adapts to bandwidth.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  ⚠ Limitations
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Still sends multiple streams to each client (could be heavy on download).&lt;/li&gt;
&lt;li&gt;Server introduces another hop — slightly more latency than direct mesh.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;B. MCU — Multipoint Control Unit&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  💡 What MCU does
&lt;/h4&gt;

&lt;p&gt;MCU also receives streams from all peers. But unlike SFU, it &lt;strong&gt;decodes and mixes them&lt;/strong&gt; into a &lt;em&gt;single combined stream&lt;/em&gt;:&lt;/p&gt;

&lt;p&gt;✔ Every participant receives &lt;strong&gt;just one stream&lt;/strong&gt; — no matter how many others are in the call.&lt;br&gt;
✔ MCU handles mixing, layout, encoding, and then sends that one stream to all clients.&lt;/p&gt;

&lt;h4&gt;
  
  
  🎨 Example
&lt;/h4&gt;

&lt;p&gt;In a call with 5 users:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Each user sends their stream to the MCU.&lt;/li&gt;
&lt;li&gt;MCU combines all 5 videos into a tiled layout (e.g., a 2×2 grid plus one larger tile).&lt;/li&gt;
&lt;li&gt;That single mixed video is sent back to each participant.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  💎 Advantages of MCU
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;📉 Clients receive only one video stream — minimal CPU &amp;amp; bandwidth.&lt;/li&gt;
&lt;li&gt;📺 Easy consistent layout for all participants.&lt;/li&gt;
&lt;li&gt;📼 Good for legacy devices that can’t handle many streams.&lt;/li&gt;
&lt;/ul&gt;

&lt;h4&gt;
  
  
  🔥 Downsides
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;🧠 Very heavy server processing — mixing + encoding is CPU intensive.&lt;/li&gt;
&lt;li&gt;💰 Expensive to scale — server resources grow with participants.&lt;/li&gt;
&lt;li&gt;😴 Less flexible — clients get one view determined by server (can’t rearrange locally).&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;4. SFU vs MCU — A Quick Comparison&lt;/strong&gt;
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Aspect&lt;/th&gt;
&lt;th&gt;Mesh (P2P)&lt;/th&gt;
&lt;th&gt;SFU&lt;/th&gt;
&lt;th&gt;MCU&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Server Required&lt;/td&gt;
&lt;td&gt;❌ No&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;✅ Yes&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Upload per peer&lt;/td&gt;
&lt;td&gt;N-1 streams&lt;/td&gt;
&lt;td&gt;1 stream&lt;/td&gt;
&lt;td&gt;1 stream&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Download per peer&lt;/td&gt;
&lt;td&gt;N-1 streams&lt;/td&gt;
&lt;td&gt;N-1 streams&lt;/td&gt;
&lt;td&gt;1 stream&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Server CPU Load&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;Very High&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Client CPU Load&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Moderate&lt;/td&gt;
&lt;td&gt;Very Low&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scalability&lt;/td&gt;
&lt;td&gt;Poor&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Moderate-High&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Layout Flexibility&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;High&lt;/td&gt;
&lt;td&gt;Low&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
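&lt;p&gt;The upload/download rows can be sanity-checked with a small helper (a sketch; the architecture labels are just illustration values):&lt;/p&gt;

```python
def streams_per_peer(n, architecture):
    # Stream counts per participant in an n-person call.
    if architecture == "mesh":
        # Mesh: every peer both sends to and receives from all others.
        return {"upload": n - 1, "download": n - 1}
    if architecture == "sfu":
        # One upload to the SFU; it forwards everyone else's streams back.
        return {"upload": 1, "download": n - 1}
    if architecture == "mcu":
        # One upload; the MCU mixes everything into a single stream.
        return {"upload": 1, "download": 1}
    raise ValueError(architecture)

for arch in ("mesh", "sfu", "mcu"):
    print(arch, streams_per_peer(5, arch))
```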




&lt;h2&gt;
  
  
  &lt;strong&gt;5. Why SFU Is Dominating Modern Video Apps&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Today, services like &lt;strong&gt;Zoom, Google Meet, Jitsi, and many WebRTC SaaS platforms&lt;/strong&gt; rely on SFU for group calls because it:&lt;/p&gt;

&lt;p&gt;✔ Offers the best balance between scalability and performance&lt;br&gt;
✔ Allows custom layouts and controls&lt;br&gt;
✔ Supports simulcast adaptation to network conditions&lt;br&gt;
✔ Doesn’t overwhelm the server like a classic MCU does&lt;/p&gt;

&lt;p&gt;MCU is still used for special cases like &lt;strong&gt;webinar broadcasting or legacy device support&lt;/strong&gt;, but SFU is the most widely deployed.&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;6. Signaling, STUN &amp;amp; TURN — The Supporting Cast&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Real-world WebRTC calls don’t magically connect peers:&lt;/p&gt;

&lt;h3&gt;
  
  
  ✔ &lt;strong&gt;Signaling&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;WebRTC uses &lt;strong&gt;signaling servers&lt;/strong&gt; (your app’s backend) to exchange metadata so peers can &lt;em&gt;discover&lt;/em&gt; each other and initiate connections.&lt;/p&gt;

&lt;h3&gt;
  
  
  ✔ &lt;strong&gt;STUN&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Helps discover each peer’s public IP address through NAT.&lt;/p&gt;

&lt;h3&gt;
  
  
  ✔ &lt;strong&gt;TURN&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Acts as a relay &lt;strong&gt;when direct connection isn’t possible&lt;/strong&gt; (e.g., firewalls).&lt;/p&gt;

&lt;p&gt;All of these &lt;em&gt;help establish&lt;/em&gt; WebRTC connections before any media is sent.&lt;/p&gt;
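&lt;p&gt;A toy sketch of the signaling step (pure Python, no real WebRTC; the peer names and message shapes are made up for illustration). The backend only ferries opaque session descriptions between peers; media never touches it:&lt;/p&gt;

```python
# Mailboxes stand in for the signaling server's per-client channels
# (in practice these are usually WebSocket connections).
mailboxes = {"alice": [], "bob": []}

def signal(sender, recipient, message):
    # The server just relays; it does not inspect the SDP payload.
    mailboxes[recipient].append({"from": sender, **message})

# Offer/answer exchange before any media flows:
signal("alice", "bob", {"type": "offer", "sdp": "..."})
signal("bob", "alice", {"type": "answer", "sdp": "..."})

print(mailboxes["bob"][0]["type"], mailboxes["alice"][0]["type"])  # offer answer
```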




&lt;h2&gt;
  
  
  &lt;strong&gt;7. Practical Examples to Visualize&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  🧑‍🤝‍🧑 1-to-1 Call
&lt;/h3&gt;

&lt;p&gt;✔ Mesh / P2P&lt;br&gt;
✔ Direct connection — minimal cost&lt;br&gt;
✔ Best for simple calls&lt;/p&gt;

&lt;h3&gt;
  
  
  👩‍👩‍👦 Small Group (3–6 users)
&lt;/h3&gt;

&lt;p&gt;✔ Mesh still kinda works&lt;br&gt;
✔ But upload &amp;amp; CPU start suffering&lt;/p&gt;

&lt;h3&gt;
  
  
  🧑‍💻 Large Group (8–50+ users)
&lt;/h3&gt;

&lt;p&gt;✔ &lt;strong&gt;Best with SFU&lt;/strong&gt;&lt;br&gt;
✔ Each user uploads once, downloads only what they want&lt;br&gt;
✔ Clients can choose video layout&lt;/p&gt;

&lt;h3&gt;
  
  
  📺 Webinar / Broadcast
&lt;/h3&gt;

&lt;p&gt;✔ &lt;strong&gt;MCU or Hybrid&lt;/strong&gt;&lt;br&gt;
✔ Mixed stream broadcast to many viewers&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;8. Summary — How WebRTC Makes Video Conferencing Work&lt;/strong&gt;
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;WebRTC enables real-time audio/video streaming&lt;/strong&gt; in browsers and apps.&lt;/li&gt;
&lt;li&gt;For &lt;strong&gt;two peers&lt;/strong&gt;, direct P2P works fine.&lt;/li&gt;
&lt;li&gt;As participants grow, P2P becomes inefficient (mesh).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SFU&lt;/strong&gt; solves this by forwarding streams through a central server with minimal processing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;MCU&lt;/strong&gt; mixes all media into one stream but at high server cost.&lt;/li&gt;
&lt;li&gt;Real apps often use hybrid models — e.g., P2P when only 2 users, SFU for groups, and even MCU for broadcasting large sessions.&lt;/li&gt;
&lt;/ol&gt;

</description>
      <category>systemdesign</category>
      <category>devops</category>
      <category>webrtc</category>
      <category>webdev</category>
    </item>
    <item>
      <title>WebSocket VS Polling VS SSE</title>
      <dc:creator>Kader Khan</dc:creator>
      <pubDate>Sat, 03 Jan 2026 20:40:21 +0000</pubDate>
      <link>https://dev.to/abirk/websocket-vs-polling-vs-sse-17ii</link>
      <guid>https://dev.to/abirk/websocket-vs-polling-vs-sse-17ii</guid>
      <description>&lt;h2&gt;
  
  
  📌 The Classic Request-Response Model (and Its Limitations)
&lt;/h2&gt;

&lt;h3&gt;
  
  
  How Standard Web Apps Work
&lt;/h3&gt;

&lt;p&gt;In a typical web app:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;A &lt;strong&gt;client&lt;/strong&gt; (browser/app) sends a request to the server.&lt;/li&gt;
&lt;li&gt;The &lt;strong&gt;server&lt;/strong&gt; processes it (DB access, computation, etc.).&lt;/li&gt;
&lt;li&gt;The server sends back a &lt;strong&gt;response&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;The connection closes.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;This cycle is simple and efficient for most applications.&lt;/p&gt;

&lt;p&gt;👉 But here’s the key problem:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Once the response is done, the server &lt;em&gt;cannot&lt;/em&gt; send fresh data to the client unless the client asks again.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Example: A Stock Market App
&lt;/h3&gt;

&lt;p&gt;Suppose you have a simple stock application:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🧑‍💻 Clients A, B, C connect and request current stock prices.&lt;/li&gt;
&lt;li&gt;📡 The server responds — and bam! connection closes.&lt;/li&gt;
&lt;li&gt;📉 Later, prices change on the server.&lt;/li&gt;
&lt;li&gt;But clients A, B, C still only have &lt;em&gt;old&lt;/em&gt; (stale) data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This becomes a real-time problem:&lt;br&gt;
👉 How does the server tell clients that data has changed?&lt;/p&gt;


&lt;h2&gt;
  
  
  🚀 Solution 1: WebSockets
&lt;/h2&gt;

&lt;p&gt;WebSockets let you keep a &lt;strong&gt;persistent full-duplex connection&lt;/strong&gt; open between clients and servers.&lt;/p&gt;
&lt;h3&gt;
  
  
  What Does This Mean?
&lt;/h3&gt;

&lt;p&gt;Instead of:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Client → Server → Response → Connection closes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;WebSockets keep the connection open:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Client ↔ Server ↔ Client ↔ Server
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This allows:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The server to push updates anytime.&lt;/li&gt;
&lt;li&gt;The client to send data anytime.&lt;/li&gt;
&lt;li&gt;Both sides talk without closing the connection.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  How It Works (Simple Diagram)
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Client                         Server
  | — WebSocket handshake →     |
  |                             |
  | ← Accept &amp;amp; open channel —   |
  |                             |
  | — Updates can flow both →   |
  |                             |
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Once the connection is open, either side can send data.&lt;/p&gt;
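&lt;p&gt;The key property is that either side can send at any time over one open channel. A small asyncio sketch models this (two queues stand in for the two directions of a WebSocket; no real networking involved):&lt;/p&gt;

```python
import asyncio

async def demo():
    # Two queues model one persistent, full-duplex connection:
    # one direction per queue, both open for the life of the "socket".
    to_server, to_client = asyncio.Queue(), asyncio.Queue()

    async def server():
        msg = await to_server.get()           # client-to-server message
        await to_client.put(f"echo:{msg}")    # reply pushed to the client
        await to_client.put("price-update")   # ...and another unprompted push

    async def client():
        await to_server.put("subscribe")
        return [await to_client.get(), await to_client.get()]

    _, received = await asyncio.gather(server(), client())
    return received

received = asyncio.run(demo())
print(received)  # ['echo:subscribe', 'price-update']
```

Note the server pushes "price-update" without being asked, which is exactly what plain request-response cannot do.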

&lt;h3&gt;
  
  
  Pros of WebSockets
&lt;/h3&gt;

&lt;p&gt;✅ True real-time updates&lt;br&gt;
✅ Low latency&lt;br&gt;
✅ Full duplex (two-way communication)&lt;/p&gt;
&lt;h3&gt;
  
  
  Cons of WebSockets
&lt;/h3&gt;

&lt;p&gt;❌ Hard to scale — it’s &lt;strong&gt;stateful&lt;/strong&gt; (server must remember every connected client)&lt;br&gt;
❌ If you have millions of connections, scaling horizontally becomes expensive&lt;br&gt;
❌ Servers must synchronize updates among themselves in clustered systems&lt;/p&gt;


&lt;h2&gt;
  
  
  🚀 Solution 2: Polling
&lt;/h2&gt;

&lt;p&gt;Polling is the simplest alternative to WebSockets.&lt;/p&gt;
&lt;h3&gt;
  
  
  What Is Polling?
&lt;/h3&gt;

&lt;p&gt;Instead of keeping a connection alive, the client asks the server again and again:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Client: “Any new updates?”
Server: “Nope.”
Client: “Any new updates?”
Server: “Yes — here you go!”
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Simple Polling Example
&lt;/h3&gt;

&lt;p&gt;Let’s say the client checks every &lt;strong&gt;2 seconds&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0s → “Give me new data”
2s → “Give me new data”
4s → “Give me new data”
…
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If new data appears at 3.5s, the client will only get it at the next poll (4s).&lt;/p&gt;

&lt;p&gt;👉 That means the &lt;em&gt;maximum delay&lt;/em&gt; is equal to your poll interval — 2 seconds in this example.&lt;/p&gt;
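&lt;p&gt;That worst-case delay is easy to verify with a one-liner (polls assumed to happen at 0, interval, 2×interval, and so on):&lt;/p&gt;

```python
import math

def delivery_time(data_ready_at, poll_interval):
    # First poll at or after the moment the data became available.
    return math.ceil(data_ready_at / poll_interval) * poll_interval

print(delivery_time(3.5, 2))  # 4: data from 3.5s waits until the 4s poll
```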

&lt;h3&gt;
  
  
  Pros of Polling
&lt;/h3&gt;

&lt;p&gt;✅ Easy to implement&lt;br&gt;
✅ Works with load balancers and many servers&lt;br&gt;
✅ Stateless — each request is independent&lt;/p&gt;

&lt;h3&gt;
  
  
  Cons of Polling
&lt;/h3&gt;

&lt;p&gt;❌ Not truly real-time&lt;br&gt;
❌ Can waste requests if no new data&lt;br&gt;
❌ Frequent polling may still add network load&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 Solution 3: Long Polling
&lt;/h2&gt;

&lt;p&gt;Long polling is an optimized form of polling.&lt;/p&gt;

&lt;h3&gt;
  
  
  What Is Long Polling?
&lt;/h3&gt;

&lt;p&gt;Instead of responding immediately, the server &lt;strong&gt;holds the request open&lt;/strong&gt; until:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;New data arrives, or&lt;/li&gt;
&lt;li&gt;A timeout expires&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then it responds with data in one shot.&lt;/p&gt;

&lt;h3&gt;
  
  
  Example: Long Polling for 5 Seconds
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Client → Server: “Any updates?”  
Server: Hold request for 5 seconds

If updates come within 5s:
  Server → Client: Latest updates
Then client immediately re-requests.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
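&lt;p&gt;The hold-until-data-or-timeout rule can be modeled as a tiny pure function (a sketch of the timing logic only; the parameter names are invented for illustration):&lt;/p&gt;

```python
def long_poll(request_start, timeout, data_ready_at=None):
    # The server holds the request open: it answers as soon as data
    # arrives, or sends an empty response when the timeout expires.
    deadline = request_start + timeout
    if data_ready_at is not None and min(data_ready_at, deadline) == data_ready_at:
        return max(data_ready_at, request_start), "data"
    return deadline, "empty"

print(long_poll(0, 5, data_ready_at=3.5))  # (3.5, 'data')
print(long_poll(0, 5))                     # (5, 'empty')
```

Compare with short polling: here the update at 3.5s is delivered at 3.5s, not at the next fixed poll tick.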



&lt;h3&gt;
  
  
  Pros of Long Polling
&lt;/h3&gt;

&lt;p&gt;✅ Fewer requests than short polling&lt;br&gt;
✅ More “real-time” feel than simple polling&lt;br&gt;
✅ Still stateless&lt;/p&gt;

&lt;h3&gt;
  
  
  Cons of Long Polling
&lt;/h3&gt;

&lt;p&gt;❌ Can still hold server resources&lt;br&gt;
❌ Not as instant as WebSockets&lt;br&gt;
❌ Server must manage held requests&lt;/p&gt;




&lt;h2&gt;
  
  
  📊 Comparing the Approaches
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Technique&lt;/th&gt;
&lt;th&gt;Real-Time&lt;/th&gt;
&lt;th&gt;Scalability&lt;/th&gt;
&lt;th&gt;Server Load&lt;/th&gt;
&lt;th&gt;Complexity&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Polling&lt;/td&gt;
&lt;td&gt;Moderate (delayed)&lt;/td&gt;
&lt;td&gt;🔥 High&lt;/td&gt;
&lt;td&gt;🔥 Medium&lt;/td&gt;
&lt;td&gt;🟢 Easy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Long Polling&lt;/td&gt;
&lt;td&gt;Good&lt;/td&gt;
&lt;td&gt;🔥 Good&lt;/td&gt;
&lt;td&gt;🔥 Medium&lt;/td&gt;
&lt;td&gt;🟡 Moderate&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WebSockets&lt;/td&gt;
&lt;td&gt;Excellent&lt;/td&gt;
&lt;td&gt;🔻 Low&lt;/td&gt;
&lt;td&gt;🔻 High&lt;/td&gt;
&lt;td&gt;🟡 Moderate&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  🧠 Real-World Considerations
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Do You Always Need Full Real-Time?
&lt;/h3&gt;

&lt;p&gt;Not always.&lt;/p&gt;

&lt;p&gt;For example, in a stock chart app:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;You might only need fresh price &lt;em&gt;updates&lt;/em&gt;, not two-way communication.&lt;/li&gt;
&lt;li&gt;Buying/selling can still happen via regular POST API routes.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;That means:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;WebSockets might be &lt;em&gt;overkill&lt;/em&gt;.&lt;/li&gt;
&lt;li&gt;Polling or long polling might be perfectly fine.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Why Polling Works Well with Load Balancers
&lt;/h3&gt;

&lt;p&gt;When you scale with many backend servers and a load balancer:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Polling requests get distributed across servers,&lt;/li&gt;
&lt;li&gt;You avoid being tied to one server connection,&lt;/li&gt;
&lt;li&gt;If a server goes down, your next poll goes to another healthy server.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🏁 My Final Thoughts
&lt;/h2&gt;

&lt;p&gt;Real-time systems aren’t magic — they’re about choosing the right tool for the job:&lt;/p&gt;

&lt;p&gt;🔹 Need instant push updates? → &lt;strong&gt;WebSockets&lt;/strong&gt;&lt;br&gt;
🔹 Need lightweight, scalable updates? → &lt;strong&gt;Polling / Long Polling&lt;/strong&gt;&lt;br&gt;
🔹 Want a mix of both? → Start with polling, evolve as needed&lt;/p&gt;

&lt;p&gt;Every choice has trade-offs. Understanding the fundamental communication patterns helps you make the best architectural decision — and prevents unnecessary complexity early on.&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>networking</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Consistent Hashing - System Design</title>
      <dc:creator>Kader Khan</dc:creator>
      <pubDate>Wed, 31 Dec 2025 22:10:07 +0000</pubDate>
      <link>https://dev.to/abirk/consistent-hashing-system-design-4167</link>
      <guid>https://dev.to/abirk/consistent-hashing-system-design-4167</guid>
      <description>&lt;h2&gt;
  
  
  📌 1) 💥 The Core Problem: Traditional Hashing Breaks in Distributed Systems
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ❓ The Scenario
&lt;/h3&gt;

&lt;p&gt;In a distributed system (lots of servers handling data), we must decide &lt;strong&gt;which server stores what data&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;A naive approach might be:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;serverIndex = hash(key) % N
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Where &lt;code&gt;N&lt;/code&gt; = number of servers.&lt;/p&gt;

&lt;h3&gt;
  
  
  🚨 What Goes Wrong with This?
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Data Reassignment on Scale Changes:&lt;/strong&gt;&lt;br&gt;
Suppose you start with 3 servers and store data using &lt;code&gt;hash(key) % 3&lt;/code&gt;. If you add a 4th server, &lt;code&gt;hash(key) % N&lt;/code&gt; changes for &lt;em&gt;almost all keys&lt;/em&gt;, not just the ones that belong on the new server, because &lt;code&gt;N&lt;/code&gt; changed. This forces &lt;strong&gt;huge data reshuffling&lt;/strong&gt; across every server — terrible at scale.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Server Failures Reassign All Keys:&lt;/strong&gt;&lt;br&gt;
If one server dies, now &lt;code&gt;N&lt;/code&gt; changes again, so most keys will get recomputed to new locations — even if the data &lt;em&gt;itself&lt;/em&gt; didn’t move — causing many cache or lookup failures.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;➡ That means &lt;strong&gt;every server change leads to data migrations proportional to the size of the dataset&lt;/strong&gt; — extremely expensive for millions of keys.&lt;/p&gt;
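&lt;p&gt;You can see the damage directly. The sketch below hashes 10,000 made-up keys with &lt;code&gt;hash(key) % N&lt;/code&gt; and counts how many land on a different server when N goes from 3 to 4 (md5 is used only to get a stable, uniformly distributed hash):&lt;/p&gt;

```python
import hashlib

def bucket(key, n):
    # Stable integer hash of the key, reduced modulo the server count.
    digest = hashlib.md5(key.encode()).hexdigest()
    return int(digest, 16) % n

keys = [f"key-{i}" for i in range(10_000)]
moved = sum(1 for k in keys if bucket(k, 3) != bucket(k, 4))
print(f"{moved / len(keys):.0%} of keys moved")  # typically about 75%
```

A key stays put only when its hash gives the same remainder mod 3 and mod 4, which happens for roughly a quarter of uniformly distributed hashes.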




&lt;h2&gt;
  
  
  📌 2) 🧠 The Core Idea of Consistent Hashing
&lt;/h2&gt;

&lt;p&gt;Consistent hashing solves exactly the above problems by reshaping the hashing strategy:&lt;/p&gt;

&lt;h3&gt;
  
  
  ✔ Both &lt;em&gt;servers&lt;/em&gt; and &lt;em&gt;keys&lt;/em&gt; are placed onto the same &lt;strong&gt;circular hash space&lt;/strong&gt; (“hash ring”).
&lt;/h3&gt;

&lt;p&gt;Each server and each data key gets a &lt;strong&gt;hash value&lt;/strong&gt; that represents a position on this circle.&lt;/p&gt;

&lt;p&gt;Imagine the hash output as degrees on a clock:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 ——————————————— 359
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It wraps around like a circle — meaning address &lt;code&gt;359&lt;/code&gt; is next to &lt;code&gt;0&lt;/code&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  ✔ The rule for placing data:
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;To decide where a piece of data belongs, hash the key and then move &lt;em&gt;clockwise around the circle&lt;/em&gt; until you find the first server.&lt;/strong&gt;&lt;br&gt;
That server becomes the owner of that piece of data.&lt;/p&gt;

&lt;p&gt;This &lt;em&gt;clockwise traversal&lt;/em&gt; is the fundamental idea — and here’s why it matters.&lt;/p&gt;


&lt;h2&gt;
  
  
  📌 3) 🌀 How Clockwise Traversal Works — Step by Step
&lt;/h2&gt;
&lt;h3&gt;
  
  
  📍 Step A — Place Servers on a Ring
&lt;/h3&gt;

&lt;p&gt;When the system starts, each server’s identity (e.g., IP address) is hashed to a position:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Server A -&amp;gt; hash = 50  
Server B -&amp;gt; hash = 150  
Server C -&amp;gt; hash = 300  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;On the hash ring, that might look like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 — A(50) — B(150) — C(300) — (wraps to 0)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This division implicitly creates &lt;em&gt;ranges&lt;/em&gt; of the ring managed by each server:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;From after C back to A covers one region&lt;/li&gt;
&lt;li&gt;From after A to B covers another&lt;/li&gt;
&lt;li&gt;And so on&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  📍 Step B — Assign Data Keys
&lt;/h3&gt;

&lt;p&gt;Now if you receive a data key:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Key1 hashed -&amp;gt; 100
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You traverse &lt;em&gt;clockwise&lt;/em&gt; from position &lt;code&gt;100&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;100 -&amp;gt; next server clockwise = B(150)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So &lt;strong&gt;Key1 is stored on server B&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Another example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Key2 hashed -&amp;gt; 320  
320 -&amp;gt; next server clockwise = A(50, after wraparound)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key2 is stored on A — because after you go past the highest server hash, you wrap to the lowest one.&lt;/p&gt;

&lt;p&gt;This &lt;strong&gt;clockwise rule&lt;/strong&gt; ensures:&lt;/p&gt;

&lt;p&gt;👉 Every key maps to exactly &lt;em&gt;one&lt;/em&gt; server&lt;br&gt;
👉 You never have gaps — because the ring loops indefinitely&lt;/p&gt;
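&lt;p&gt;The clockwise lookup maps neatly onto a sorted list plus binary search. A minimal sketch using the example positions above (the 360-position ring is just for illustration):&lt;/p&gt;

```python
import bisect

RING_SIZE = 360
servers = {50: "A", 150: "B", 300: "C"}
positions = sorted(servers)

def owner(key_hash):
    # First server position at or after the key, wrapping past the
    # top of the ring back to the lowest position.
    i = bisect.bisect_left(positions, key_hash % RING_SIZE)
    return servers[positions[i % len(positions)]]

print(owner(100))  # B: next clockwise from 100 is 150
print(owner(320))  # A: wraps past 359 back to 50
```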


&lt;h2&gt;
  
  
  📌 4) 🧩 What Happens When a Server Is Added?
&lt;/h2&gt;
&lt;h3&gt;
  
  
  📌 The Problem Before Consistent Hashing
&lt;/h3&gt;

&lt;p&gt;Adding a new server normally forces remapping of &lt;em&gt;all keys&lt;/em&gt;. That means huge data movement.&lt;/p&gt;
&lt;h3&gt;
  
  
  📌 What Consistent Hashing Does Instead
&lt;/h3&gt;

&lt;p&gt;Suppose we add:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Server D -&amp;gt; hash = 200
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the ring looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;0 — A(50) — B(150) — D(200) — C(300)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before D existed, keys whose hashes fell between &lt;strong&gt;B(150) and D(200)&lt;/strong&gt; belonged to C, the next server clockwise.&lt;/p&gt;

&lt;p&gt;Now when you insert D, data whose hashes lie between &lt;code&gt;B(150)&lt;/code&gt; and &lt;code&gt;D(200)&lt;/code&gt; will be transferred to D — but &lt;strong&gt;all other keys stay exactly where they are&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is the critical benefit:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;🧠 &lt;strong&gt;Only the keys in the range that D takes over change their assignment. Everything else stays the same.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;And that’s &lt;em&gt;exactly&lt;/em&gt; what “consistent” means — only a small, &lt;em&gt;predictable&lt;/em&gt; subset is redistributed.&lt;/p&gt;
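&lt;p&gt;A short self-contained check (again on an illustrative 360-position ring) confirms that adding D(200) reassigns only the hashes between B(150) and D(200):&lt;/p&gt;

```python
import bisect

def owner(ring, key_hash):
    # Clockwise rule: first server position at or after the key,
    # wrapping around to the lowest position past the top of the ring.
    positions = sorted(ring)
    i = bisect.bisect_left(positions, key_hash)
    return ring[positions[i % len(positions)]]

before = {50: "A", 150: "B", 300: "C"}
after = {50: "A", 150: "B", 200: "D", 300: "C"}

moved = [h for h in range(360) if owner(before, h) != owner(after, h)]
print(moved == list(range(151, 201)))  # True: only hashes 151..200 move, all to D
```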




&lt;h2&gt;
  
  
  📌 5) 🧠 What Happens When a Server Is Removed or Fails?
&lt;/h2&gt;

&lt;p&gt;Let’s say server B (at hash 150) fails.&lt;/p&gt;

&lt;p&gt;Then:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;All keys that were assigned to B go to the &lt;em&gt;next server clockwise&lt;/em&gt; — which now is D (at 200).&lt;/li&gt;
&lt;li&gt;Keys originally mapped to A and C remain untouched.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This means most keys stay where they were, &lt;strong&gt;only the ones belonging to the removed server migrate&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  📌 6) Why This Minimizes Disruption
&lt;/h2&gt;

&lt;p&gt;Traditional &lt;code&gt;% N&lt;/code&gt; hashing redistributes almost &lt;strong&gt;all keys&lt;/strong&gt; when &lt;code&gt;N&lt;/code&gt; changes.&lt;/p&gt;

&lt;p&gt;Consistent hashing redistributes only the keys that were mapped to:&lt;/p&gt;

&lt;p&gt;✔ the area between the new server’s predecessor and itself (on addition)&lt;/p&gt;

&lt;p&gt;✔ the removed server’s range (on removal)&lt;/p&gt;

&lt;p&gt;That’s only ~&lt;strong&gt;1/N&lt;/strong&gt; of the total keys — meaning only a &lt;strong&gt;small portion&lt;/strong&gt; moves.&lt;/p&gt;

&lt;p&gt;This is why consistent hashing scales beautifully.&lt;/p&gt;




&lt;h2&gt;
  
  
  📌 7) 🧠 Load Balancing &amp;amp; Virtual Nodes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ⚠ Uneven Load Problem
&lt;/h3&gt;

&lt;p&gt;Without extra care, a server could accidentally be placed such that it covers a large arc of the ring — leading to &lt;em&gt;uneven load&lt;/em&gt;: one server gets many keys, others get few.&lt;/p&gt;

&lt;h3&gt;
  
  
  🎯 Solution: Virtual Nodes
&lt;/h3&gt;

&lt;p&gt;Instead of mapping each server &lt;em&gt;once&lt;/em&gt; on the ring, each server gets &lt;em&gt;many virtual points&lt;/em&gt; (replicas) scattered around the circle.&lt;/p&gt;

&lt;p&gt;For example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Server A -&amp;gt; spots at 10, 110, 210  
Server B -&amp;gt; spots at 40, 140, 240  
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This spreads the data load more evenly, because each server participates in many regions of the hash space — smoothing out uneven gaps.&lt;/p&gt;
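&lt;p&gt;A minimal sketch of virtual nodes (the replica count and ring size are arbitrary illustration values): each physical server is hashed onto the ring several times, once per replica label:&lt;/p&gt;

```python
import hashlib

def build_ring(server_names, replicas=4, ring_size=1000):
    ring = {}
    for name in server_names:
        for r in range(replicas):
            # Hash "A#0", "A#1", ... so one server lands in many places.
            pos = int(hashlib.md5(f"{name}#{r}".encode()).hexdigest(), 16) % ring_size
            ring[pos] = name
    return dict(sorted(ring.items()))

ring = build_ring(["A", "B", "C"])
print(len(ring), sorted(set(ring.values())))
```

With more replicas per server, the arcs each server owns get smaller and more numerous, so the per-server key counts even out.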




&lt;h2&gt;
  
  
  📌 8) 🔎 Practical Uses &amp;amp; Why It Matters
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Consistent hashing is widely used in real production systems&lt;/strong&gt; to enable:&lt;/p&gt;

&lt;p&gt;✅ Distributed caching (e.g., Memcached, Redis) — so cache nodes can scale without evictions everywhere.&lt;br&gt;
✅ Distributed databases (e.g., Cassandra, Dynamo) — to shard data efficiently.&lt;br&gt;
✅ Content Delivery Networks (CDNs) — to cache content close to clients with minimal reshuffle.&lt;br&gt;
✅ Load Balancing in microservices — to route requests consistently by user/session.&lt;/p&gt;




&lt;h2&gt;
  
  
  📌 9) Summary: Why It Matters in Real Systems
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Problem&lt;/th&gt;
&lt;th&gt;Traditional Hashing&lt;/th&gt;
&lt;th&gt;Consistent Hashing&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Key mapping&lt;/td&gt;
&lt;td&gt;Simple&lt;/td&gt;
&lt;td&gt;Circular traversal&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Node addition&lt;/td&gt;
&lt;td&gt;Redistributes &lt;em&gt;almost all keys&lt;/em&gt;
&lt;/td&gt;
&lt;td&gt;Only ~1/N keys move&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Node removal&lt;/td&gt;
&lt;td&gt;Redistributes &lt;em&gt;almost all keys&lt;/em&gt;
&lt;/td&gt;
&lt;td&gt;Only keys from removed node move&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Load balance&lt;/td&gt;
&lt;td&gt;Can be uneven&lt;/td&gt;
&lt;td&gt;Virtual nodes smooth it&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Consistent hashing turns what would be a chaotic, system-wide reshuffle into a &lt;em&gt;local, predictable relocation&lt;/em&gt; — ideal for high-scale, dynamic infrastructure.&lt;/p&gt;




</description>
      <category>systemdesign</category>
      <category>algorithms</category>
      <category>architecture</category>
      <category>computerscience</category>
    </item>
    <item>
      <title>Event Sourcing - System Design Pattern</title>
      <dc:creator>Kader Khan</dc:creator>
      <pubDate>Tue, 30 Dec 2025 13:02:03 +0000</pubDate>
      <link>https://dev.to/abirk/event-sourcing-system-design-pattern-10k7</link>
      <guid>https://dev.to/abirk/event-sourcing-system-design-pattern-10k7</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;“Imagine every action in your system writes to a timeline. This timeline can be read later to rebuild any version of the system — like time travel.”&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  ✅ &lt;strong&gt;The Problem with Traditional CRUD Systems&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;In traditional systems (like most apps we’ve built):&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;We update the database to change state (e.g., set &lt;code&gt;status = "processed"&lt;/code&gt;)&lt;/li&gt;
&lt;li&gt;We &lt;em&gt;overwrite&lt;/em&gt; old values&lt;/li&gt;
&lt;li&gt;We lose history — we only store the &lt;em&gt;latest&lt;/em&gt; state&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;📌 This leads to real problems such as:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;No audit trail&lt;/strong&gt;&lt;br&gt;
We often can’t answer questions like: &lt;em&gt;“What exactly happened to this order between 10:01 and 10:03?”&lt;/em&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Inconsistencies due to partial failures&lt;/strong&gt;&lt;br&gt;
If part of a workflow fails (e.g., processing succeeds, but updating state fails), the system goes into an &lt;em&gt;inconsistent state&lt;/em&gt; with no clear way to fix it.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Hard to debug or replay history&lt;/strong&gt;&lt;br&gt;
We cannot rewind to a &lt;em&gt;point in time&lt;/em&gt; and reconstruct what state should have been.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;👉 As systems scale with heavy workloads, these problems get worse. We need a better way to track changes than just “update this value now.”&lt;/p&gt;




&lt;h2&gt;
  
  
  ✅ &lt;strong&gt;Event Sourcing — The Core Idea (Solved Problem)&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Event Sourcing says:&lt;/strong&gt;&lt;br&gt;
👉 Instead of saving &lt;em&gt;only the current state&lt;/em&gt; in the database, save &lt;em&gt;every change as an event&lt;/em&gt; in order.&lt;/p&gt;

&lt;p&gt;These events are:&lt;/p&gt;

&lt;p&gt;✔ Immutable (never changed after they’re written)&lt;br&gt;
✔ Ordered (every event has a timestamp or sequence)&lt;br&gt;
✔ Replayed to reconstruct the current state&lt;/p&gt;

&lt;p&gt;So instead of doing:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Product Price&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;We store events like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;PriceChanged from 90 ➝ 100 at 10:01AM&lt;/li&gt;
&lt;li&gt;PriceChanged from 100 ➝ 110 at 10:10AM&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To &lt;em&gt;compute&lt;/em&gt; the current state, we simply &lt;strong&gt;replay&lt;/strong&gt; those events.&lt;/p&gt;
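&lt;p&gt;A tiny illustration of replaying those price events (the event shape is a hypothetical choice for this sketch):&lt;/p&gt;

```python
# Hypothetical event shape for the price-change log above.
events = [
    {"type": "PriceChanged", "old": 90, "new": 100, "at": "10:01"},
    {"type": "PriceChanged", "old": 100, "new": 110, "at": "10:10"},
]

def replay(event_log, initial=None):
    # Fold over the log in order; the latest change wins.
    price = initial
    for event in event_log:
        price = event["new"]
    return price

print(replay(events))  # 110: the current price, derived from history
```

Replaying a prefix of the log (e.g. only the first event) gives the state at that earlier point in time, which is what makes "time travel" possible.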


&lt;h2&gt;
  
  
  💡 &lt;strong&gt;What Event Sourcing Solves (In Simple Terms)&lt;/strong&gt;
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Traditional CRUD&lt;/th&gt;
&lt;th&gt;Event Sourcing&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Only current state&lt;/td&gt;
&lt;td&gt;Full history of all changes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hard to track why something happened&lt;/td&gt;
&lt;td&gt;We can replay to see &lt;em&gt;why&lt;/em&gt; something happened&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Race conditions can corrupt data&lt;/td&gt;
&lt;td&gt;We always record events in a safe, append-only log&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Hard to debug&lt;/td&gt;
&lt;td&gt;We get a complete audit trail&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;So the &lt;strong&gt;problem&lt;/strong&gt; being solved is not just scaling — it’s:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;“How do we store &lt;em&gt;every&lt;/em&gt; change in a way we can trace, debug, and rebuild the system state reliably?”&lt;/strong&gt;&lt;/p&gt;


&lt;h2&gt;
  
  
  📦 &lt;strong&gt;Event Sourcing Architecture (AWS)&lt;/strong&gt;
&lt;/h2&gt;


&lt;h2&gt;
  
  
  🧱 &lt;strong&gt;AWS Architecture Example — Ride Booking (From AWS Guidance)&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;AWS provides a real architecture pattern for event sourcing:&lt;/p&gt;
&lt;h3&gt;
  
  
  &lt;strong&gt;1. User Action — Client Calls API Gateway&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;A user does something, e.g., &lt;em&gt;Book a Ride&lt;/em&gt;.&lt;br&gt;
This request first hits &lt;strong&gt;Amazon API Gateway&lt;/strong&gt;, which exposes a public API endpoint.&lt;/p&gt;


&lt;h3&gt;
  
  
  &lt;strong&gt;2. Lambda Writes an Event to Amazon Kinesis (AWS’s Kafka Counterpart)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;A Lambda function, invoked by API Gateway, acts as a &lt;em&gt;command handler&lt;/em&gt;:&lt;/p&gt;

&lt;p&gt;✔ It checks business logic&lt;br&gt;
✔ It creates an event like &lt;code&gt;RideBooked&lt;/code&gt;&lt;br&gt;
✔ It sends this event to &lt;strong&gt;Amazon Kinesis Data Streams&lt;/strong&gt; — an append-only event storage and streaming service&lt;/p&gt;

&lt;p&gt;📌 &lt;strong&gt;Why Kinesis?&lt;/strong&gt;&lt;br&gt;
Because it can handle very high write throughput and acts as an &lt;strong&gt;event log&lt;/strong&gt; we can replay.&lt;/p&gt;


&lt;h3&gt;
  
  
  &lt;strong&gt;3. Events Are Stored &amp;amp; Archived&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Kinesis doesn’t just stream — we can also:&lt;/p&gt;

&lt;p&gt;✔ Archive events in &lt;strong&gt;Amazon S3&lt;/strong&gt; for long-term retention (for compliance &amp;amp; audits)&lt;br&gt;
✔ Retain events for replay or future analysis&lt;/p&gt;

&lt;p&gt;This means our system generates a complete history of every change, backed up indefinitely.&lt;/p&gt;


&lt;h3&gt;
  
  
  &lt;strong&gt;4. Event Processor Lambda Builds Materialized Views&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Another Lambda function consumes events from Kinesis to build &lt;strong&gt;read models&lt;/strong&gt; (optimized tables that are easy to query). Typical read stores are:&lt;/p&gt;

&lt;p&gt;✔ Amazon Aurora (MySQL/PostgreSQL)&lt;br&gt;
✔ Amazon DynamoDB&lt;/p&gt;

&lt;p&gt;This process creates &lt;em&gt;current state views&lt;/em&gt; for read-heavy use cases.&lt;/p&gt;


&lt;h3&gt;
  
  
  &lt;strong&gt;5. Replay to Rebuild State (Hydration Model)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;If something goes wrong, or we want to compute state at any point in time, we simply &lt;strong&gt;replay the events&lt;/strong&gt; stored in Kinesis and archived in S3.&lt;/p&gt;

&lt;p&gt;This is called &lt;strong&gt;Hydration&lt;/strong&gt; — re-deriving the current or historical state of the system from the event log.&lt;/p&gt;


&lt;h2&gt;
  
  
  🧠 &lt;strong&gt;Hydration Model Explained (Simple)&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Think of hydration as:&lt;/p&gt;

&lt;p&gt;🎬 &lt;strong&gt;Re-running the entire timeline of events&lt;/strong&gt;&lt;br&gt;
so that our system always ends up in the correct state.&lt;/p&gt;

&lt;p&gt;For example, in a &lt;strong&gt;video streaming platform&lt;/strong&gt;:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Event 1: VideoUploaded&lt;/li&gt;
&lt;li&gt;Event 2: VideoProcessingStarted&lt;/li&gt;
&lt;li&gt;Event 3: VideoProcessingSucceeded&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To know current state:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;state = "initial"
apply VideoUploaded → state="uploaded"
apply VideoProcessingStarted → state="processing"
apply VideoProcessingSucceeded → state="success"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That’s &lt;strong&gt;Hydration&lt;/strong&gt; — it rebuilds state by replaying events in order, not by reading a single “status” value.&lt;/p&gt;
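&lt;p&gt;The replay above can be written as a left fold over the event log (the event names match the example; the transition table itself is an assumption for illustration):&lt;/p&gt;

```python
# Transition table for the video events above (an illustrative assumption).
TRANSITIONS = {
    "VideoUploaded": "uploaded",
    "VideoProcessingStarted": "processing",
    "VideoProcessingSucceeded": "success",
}

def hydrate(event_log, initial="initial"):
    state = initial
    for event in event_log:
        # Unknown events leave the state unchanged.
        state = TRANSITIONS.get(event, state)
    return state

print(hydrate(["VideoUploaded", "VideoProcessingStarted",
               "VideoProcessingSucceeded"]))  # success
```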




&lt;h2&gt;
  
  
  🐘 &lt;strong&gt;Why Kafka or Kinesis Are Used&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Both Kafka (used in the transcript example) and Kinesis (the AWS alternative) are &lt;strong&gt;event streaming platforms&lt;/strong&gt; — essentially massive, durable, ordered logs of events. In particular, their &lt;strong&gt;consumer group and topic partition&lt;/strong&gt; concepts ensure that processors receive events in sequence and apply them sequentially as well.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Why this matters&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;✔ We can &lt;em&gt;replay events&lt;/em&gt; — essential for event sourcing&lt;br&gt;
✔ We can scale horizontally (many consumers)&lt;br&gt;
✔ We guarantee event order within partitions — crucial for replay and consistent state reconstruction&lt;/p&gt;


&lt;h2&gt;
  
  
  📌 &lt;strong&gt;Consumer Groups &amp;amp; Topic Partitions (Why They Matter)&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;When the event volume is large, we cannot have &lt;em&gt;one&lt;/em&gt; server read everything.&lt;/p&gt;

&lt;p&gt;So we use:&lt;/p&gt;
&lt;h3&gt;
  
  
  🔹 Kafka Consumer Group
&lt;/h3&gt;

&lt;p&gt;Multiple workers that form a group and share work.&lt;br&gt;
Each worker gets assigned &lt;em&gt;partitions&lt;/em&gt; so no duplicates occur.&lt;/p&gt;
&lt;h3&gt;
  
  
  🔹 Topic Partitions
&lt;/h3&gt;

&lt;p&gt;A topic (event category) is split into partitions — think of partitions as &lt;em&gt;divided lanes of the event log&lt;/em&gt;. This allows:&lt;/p&gt;

&lt;p&gt;✔ Parallel processing&lt;br&gt;
✔ Ordered event consumption &lt;em&gt;per partition&lt;/em&gt;&lt;br&gt;
✔ Scale without losing order for each entity&lt;/p&gt;

&lt;p&gt;For example, in the video streaming pipeline, &lt;strong&gt;video A’s events always land in partition 0&lt;/strong&gt; and &lt;strong&gt;video B’s in partition 1&lt;/strong&gt;, so events for each video are always processed in order, even across many workers.&lt;/p&gt;
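&lt;p&gt;A sketch of key-based partitioning (CRC32 stands in for Kafka’s key hash here, and the partition count and event names are illustrative assumptions):&lt;/p&gt;

```python
import zlib

NUM_PARTITIONS = 4  # assumed partition count for this sketch

def partition_for(entity_id):
    # A stable hash of the record key picks the partition; Kafka's default
    # partitioner applies the same idea (murmur2 on the key bytes).
    return zlib.crc32(entity_id.encode()) % NUM_PARTITIONS

partitions = {}
for event in ["VideoUploaded", "ProcessingStarted", "ProcessingSucceeded"]:
    p = partition_for("video-A")  # same key, same partition, every time
    partitions.setdefault(p, []).append(event)

# All of video A's events land in one partition, preserving their order.
print(partitions)
```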


&lt;h3&gt;
  
  
  &lt;strong&gt;Problem Being Solved&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Traditional system:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Database:
video_id | status
------------------
123      | "processing"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Problems:&lt;br&gt;
✔ What if the update failed?&lt;br&gt;
✔ What do you show to the user?&lt;br&gt;
✔ What if you need to know the exact steps the video went through?&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Event Sourcing Pattern&lt;/strong&gt; solves it:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Event Log:
1. VideoUploaded(videoID=123)
2. VideoProcessingStarted(videoID=123)
3. VideoProcessingProgress(videoID=123, percent=50)
4. VideoProcessingFailed(videoID=123, error="timeout")
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;To get state:&lt;/p&gt;

&lt;p&gt;Hydration Model reads:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apply VideoUploaded → status="uploaded"
apply VideoProcessingStarted → status="processing"
apply VideoProcessingProgress → status="processing:50%"
apply VideoProcessingFailed → status="failed"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;We can even show why the failure happened&lt;/strong&gt; — something impossible with simple CRUD.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧩 &lt;strong&gt;AWS Services we can use&lt;/strong&gt;
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;th&gt;AWS Service&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;API entrypoint&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;API Gateway&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Command processor&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;AWS Lambda&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Event storage&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Kinesis Data Streams&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Archive &amp;amp; audit log&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Amazon S3&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Event distribution&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;EventBridge / DynamoDB Streams&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Read-optimized views&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Aurora / DynamoDB&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Async processing&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Lambda consumers&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

</description>
      <category>devops</category>
      <category>aws</category>
      <category>cloudnative</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>CQRS Pattern and Event Sourcing System Design</title>
      <dc:creator>Kader Khan</dc:creator>
      <pubDate>Mon, 29 Dec 2025 16:13:35 +0000</pubDate>
      <link>https://dev.to/abirk/cqrs-pattern-and-event-sourcing-system-design-leb</link>
      <guid>https://dev.to/abirk/cqrs-pattern-and-event-sourcing-system-design-leb</guid>
      <description>&lt;p&gt;&lt;strong&gt;Core Concepts and Overview&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CQRS (Command Query Responsibility Segregation) separates the operations that modify data (commands) from those that read data (queries).&lt;/li&gt;
&lt;li&gt;Traditional applications handle CRUD (Create, Read, Update, Delete) operations in a single database and layer, potentially causing bottlenecks during heavy read/write loads.&lt;/li&gt;
&lt;li&gt;CQRS addresses this by splitting the system into two parts:&lt;/li&gt;
&lt;li&gt;Command side: handles all data mutation (create, update, delete).&lt;/li&gt;
&lt;li&gt;Query side: handles all read operations.&lt;/li&gt;
&lt;li&gt;This separation helps optimize system performance, scalability, and maintainability, especially in high-complexity systems.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Traditional Application Architecture and Its Limitations&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Users interact with a server layer exposing REST API endpoints (GET, POST, PATCH, DELETE).&lt;/li&gt;
&lt;li&gt;The server processes requests via controllers and service layers, directly performing CRUD operations on a single database.&lt;/li&gt;
&lt;li&gt;Scaling traditional apps involves vertical scaling (adding CPU/RAM) or horizontal scaling (adding more server instances).&lt;/li&gt;
&lt;li&gt;Bottleneck: When reads and writes compete on the same database, locks during updates cause delays and slow queries, especially under high load (example: Amazon product price updates vs reads).&lt;/li&gt;
&lt;li&gt;This leads to database contention and performance degradation.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;CQRS Pattern Architecture&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Component&lt;/th&gt;
&lt;th&gt;Description&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Presentation Layer&lt;/td&gt;
&lt;td&gt;User Interface and REST API endpoints that act as the entry point for all requests&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;API Gateway&lt;/td&gt;
&lt;td&gt;Routes read (query) requests to the query side and mutation (command) requests to the command side&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Command Side&lt;/td&gt;
&lt;td&gt;Handles commands (create, update, delete) and writes to a dedicated write database&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Query Side&lt;/td&gt;
&lt;td&gt;Handles queries (read operations) from a separate read database&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Event System&lt;/td&gt;
&lt;td&gt;Synchronizes changes from the write database to the read database using events and queues&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;Separate Databases for reads and writes: read database is optimized for queries (often denormalized), write database is normalized and optimized for transactions.&lt;/li&gt;
&lt;li&gt;The write model processes commands validating and authorizing them before updating the write database.&lt;/li&gt;
&lt;li&gt;The read model processes queries against the read database, which is updated asynchronously via events emitted after writes.&lt;/li&gt;
&lt;li&gt;This results in eventual consistency between read and write databases, acceptable in many business scenarios but unsuitable for highly real-time systems (e.g., stock markets).&lt;/li&gt;
&lt;/ul&gt;
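&lt;p&gt;A toy sketch of the command/query split with asynchronous projection (all names are hypothetical, and an in-memory list stands in for the real event queue/broker):&lt;/p&gt;

```python
write_db = {}     # normalized source of truth (command side)
read_db = {}      # denormalized view (query side)
event_queue = []  # stands in for Kafka/SQS between the two sides

def handle_command(product_id, price):
    # Command side: validate and authorize, write, then emit an event.
    write_db[product_id] = price
    event_queue.append(("PriceChanged", product_id, price))

def project_events():
    # In production this runs asynchronously (queue consumer / Lambda).
    while event_queue:
        _, product_id, price = event_queue.pop(0)
        read_db[product_id] = {"id": product_id, "price": price}

def handle_query(product_id):
    return read_db.get(product_id)  # may briefly lag behind writes

handle_command("p1", 100)
print(handle_query("p1"))  # None: the projection hasn't run yet
project_events()
print(handle_query("p1"))  # {'id': 'p1', 'price': 100}
```

The window between the two queries is exactly the eventual-consistency lag described above.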

&lt;p&gt;&lt;strong&gt;Event Sourcing Integration&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CQRS can be combined with Event Sourcing, where every change is stored as an append-only event log rather than directly updating the state.&lt;/li&gt;
&lt;li&gt;The system stores immutable logs of all commands/events, which can be replayed to rebuild the current state of the database (hydration).&lt;/li&gt;
&lt;li&gt;This provides fault tolerance; if the read database becomes corrupt or stale, it can be regenerated from the event log.&lt;/li&gt;
&lt;li&gt;Event logs can also trigger side effects such as sending promotional or notification emails.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Practical AWS-Based System Design Example&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;AWS Component&lt;/th&gt;
&lt;th&gt;Role&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;API Gateway&lt;/td&gt;
&lt;td&gt;Routes requests based on HTTP method to command or query services&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Elastic Load Balancer (ELB)&lt;/td&gt;
&lt;td&gt;Distributes requests among multiple horizontally scaled EC2 instances for command/query services&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;EC2 Instances (Command Handlers)&lt;/td&gt;
&lt;td&gt;Execute commands, perform validation and authorization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Kafka (or AWS Kinesis)&lt;/td&gt;
&lt;td&gt;Event/message broker for append-only event logs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;SQS Queues&lt;/td&gt;
&lt;td&gt;Handle asynchronous event processing and fan-out to services like email notifications&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Lambda Functions&lt;/td&gt;
&lt;td&gt;Process events to update read database and trigger other actions&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DynamoDB (Read DB)&lt;/td&gt;
&lt;td&gt;Stores denormalized data optimized for fast queries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ClickHouse or similar&lt;/td&gt;
&lt;td&gt;Example write database storing append-only logs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;CloudFront CDN&lt;/td&gt;
&lt;td&gt;Caches GET requests for faster read performance with cache invalidation upon updates&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;ul&gt;
&lt;li&gt;The architecture enables horizontal scalability, fault tolerance, and efficient separation of concerns.&lt;/li&gt;
&lt;li&gt;Read and write paths can be independently optimized with different database technologies (SQL for writes, NoSQL for reads).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Benefits and Trade-offs&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Benefits:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Improved scalability by separating reads and writes.&lt;/li&gt;
&lt;li&gt;Reduced contention and locking issues on databases.&lt;/li&gt;
&lt;li&gt;Flexibility to use different databases optimized for different workloads.&lt;/li&gt;
&lt;li&gt;Fault tolerance and recoverability via event sourcing.&lt;/li&gt;
&lt;li&gt;Ability to implement complex business logic and authorization in command handlers.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Trade-offs:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Eventual consistency model means read data may lag slightly behind writes.&lt;/li&gt;
&lt;li&gt;Added architectural complexity unsuitable for small or simple applications.&lt;/li&gt;
&lt;li&gt;Complexity in keeping read and write databases synchronized.&lt;/li&gt;
&lt;li&gt;Not ideal for systems requiring strong real-time consistency guarantees.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;When to Use CQRS&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Suitable for complex, large-scale, distributed systems.&lt;/li&gt;
&lt;li&gt;When read and write workloads have different performance, scaling, or consistency requirements.&lt;/li&gt;
&lt;li&gt;When multiple microservices and databases are involved, requiring data segregation.&lt;/li&gt;
&lt;li&gt;When eventual consistency is acceptable for the business domain.&lt;/li&gt;
&lt;li&gt;Not recommended for small/simple applications or those needing immediate strong consistency.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Key Insights&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;CQRS is a powerful pattern for scaling complex applications by segregating commands and queries.&lt;/li&gt;
&lt;li&gt;The use of different databases for read and write sides is central to the pattern.&lt;/li&gt;
&lt;li&gt;Event sourcing complements CQRS by maintaining a reliable audit log and enabling system state reconstruction.&lt;/li&gt;
&lt;li&gt;AWS ecosystem components like API Gateway, ELB, EC2, Lambda, DynamoDB, Kafka/Kinesis, and SQS can effectively implement CQRS with event sourcing.&lt;/li&gt;
&lt;li&gt;Eventual consistency is a core characteristic and must be carefully evaluated against application needs.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffzxlesqmmppcla631th5.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffzxlesqmmppcla631th5.png" alt=" " width="800" height="659"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>performance</category>
      <category>systemdesign</category>
    </item>
    <item>
      <title>Meet Pulsimo - Monitor Your Systems with Precision &amp; Power</title>
      <dc:creator>Kader Khan</dc:creator>
      <pubDate>Sun, 16 Nov 2025 13:58:07 +0000</pubDate>
      <link>https://dev.to/abirk/meet-pulsimo-monitor-your-systems-with-precision-power-32ji</link>
      <guid>https://dev.to/abirk/meet-pulsimo-monitor-your-systems-with-precision-power-32ji</guid>
      <description>&lt;h3&gt;
  
  
  Have you ever wondered—
&lt;/h3&gt;

&lt;p&gt;If your production backend or database service crashes, &lt;strong&gt;how fast&lt;/strong&gt; do you actually get notified, and &lt;strong&gt;how quickly&lt;/strong&gt; can you jump into troubleshooting?&lt;/p&gt;

&lt;h3&gt;
  
  
  My Personal Research:
&lt;/h3&gt;

&lt;h3&gt;
  
  
  🏭 Typical Industrial Use Case
&lt;/h3&gt;

&lt;p&gt;If a Prometheus + Alertmanager setup is properly tuned, you usually get notified within &lt;strong&gt;1–1.5 minutes&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  ⏱️ As-Fast-As-Possible Estimated Timeline Theory
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Scrape Interval&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Let’s assume Prometheus scrapes metrics every &lt;strong&gt;15–30 seconds&lt;/strong&gt;, which is common in well-optimized setups.&lt;br&gt;
If we take &lt;strong&gt;15 seconds&lt;/strong&gt; as the fastest scenario, the earliest delay starts here.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Rule Evaluation Interval&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;After scraping, alerting rules are evaluated every &lt;strong&gt;15 seconds&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Rules Manifest (for: 1m or reduced)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Assume you've configured the rule such that if the service is down for &lt;strong&gt;10 seconds&lt;/strong&gt;, Prometheus should fire an alert.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Alertmanager buffering (minimal assumptions)&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Ignoring &lt;code&gt;group_wait&lt;/code&gt;, &lt;code&gt;group_interval&lt;/code&gt;, and &lt;code&gt;repeat_interval&lt;/code&gt; to keep it raw—&lt;br&gt;
Let’s assume Alertmanager needs ~&lt;strong&gt;10 seconds&lt;/strong&gt; to process and send the first notification.&lt;/p&gt;




&lt;h2&gt;
  
  
  📌 Combined Timeline
&lt;/h2&gt;

&lt;p&gt;Putting it all together:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Scrape delay → &lt;strong&gt;~15s&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Rule evaluation delay → &lt;strong&gt;~15s&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Down detection threshold → &lt;strong&gt;~10s&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Alertmanager handling → &lt;strong&gt;~10s&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Network jitter → (Optional small fluctuation)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;👉 &lt;strong&gt;Total: ~50 seconds – ~1 minute&lt;/strong&gt;&lt;br&gt;
In real-world noisy networks → &lt;strong&gt;up to 1.5 minutes&lt;/strong&gt;&lt;br&gt;
This means you start taking action &lt;strong&gt;1–1.5 minutes &lt;em&gt;after&lt;/em&gt; the actual outage&lt;/strong&gt;.&lt;/p&gt;
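&lt;p&gt;The estimate is simply the sum of the assumed stage delays:&lt;/p&gt;

```python
# Summing the assumed stage delays from the timeline above (seconds).
delays = {"scrape": 15, "rule_evaluation": 15,
          "down_threshold": 10, "alertmanager": 10}
total = sum(delays.values())
print(f"~{total}s before the first notification fires")  # ~50s
```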

&lt;p&gt;During this time, your data loss may be small or large—depending on how critical the endpoint is.&lt;br&gt;
But for mission-critical endpoints, &lt;strong&gt;data loss &lt;em&gt;will&lt;/em&gt; happen&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  🚀 But what if you could know &lt;em&gt;within just 10 seconds&lt;/em&gt;?
&lt;/h2&gt;

&lt;p&gt;Imagine receiving outage alerts &lt;strong&gt;~50 seconds earlier&lt;/strong&gt; than Prometheus.&lt;/p&gt;

&lt;p&gt;Not just faster alerts—you could:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Closely monitor application behavior in real-time&lt;/li&gt;
&lt;li&gt;Understand performance patterns&lt;/li&gt;
&lt;li&gt;Visualize dependency graphs&lt;/li&gt;
&lt;li&gt;Analyze blast radius&lt;/li&gt;
&lt;li&gt;Improve MTTR, SLA, SPOF detection&lt;/li&gt;
&lt;li&gt;Perform critical path analysis&lt;/li&gt;
&lt;li&gt;And much more...&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Introducing &lt;strong&gt;Pulsimo&lt;/strong&gt; 🎉
&lt;/h2&gt;

&lt;p&gt;An &lt;strong&gt;on-premise focused endpoint monitoring platform&lt;/strong&gt; designed to give ultra-fast detection and deep observability.&lt;/p&gt;

&lt;p&gt;Currently in &lt;strong&gt;public beta&lt;/strong&gt;.&lt;br&gt;
Any kind of feedback is truly appreciated.&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://pulsimo.github.io" rel="noopener noreferrer"&gt;https://pulsimo.github.io&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If anyone is interested in contributing — feel free to reach out!&lt;/p&gt;




</description>
      <category>devops</category>
      <category>pulsimo</category>
      <category>monitoring</category>
      <category>prometheus</category>
    </item>
    <item>
      <title>Calico Node Readiness Probe Failed Issues</title>
      <dc:creator>Kader Khan</dc:creator>
      <pubDate>Wed, 22 Oct 2025 18:11:04 +0000</pubDate>
      <link>https://dev.to/abirk/calico-node-readiness-probe-failed-issues-42i8</link>
      <guid>https://dev.to/abirk/calico-node-readiness-probe-failed-issues-42i8</guid>
      <description>&lt;h1&gt;
  
  
  🛠️ Resolving Calico Node Readiness Issues: A Practical Guide
&lt;/h1&gt;

&lt;h2&gt;
  
  
  🧩 Problem Overview
&lt;/h2&gt;

&lt;p&gt;In Kubernetes clusters utilizing Calico as the networking solution, nodes may occasionally report a "not ready" status because BIRD (the BGP daemon Calico uses) fails to initialize properly. This issue often stems from Calico's IP autodetection mechanism selecting an unintended network interface, leading to misconfigured BGP sessions that impact node-to-node communication.&lt;/p&gt;

&lt;h3&gt;
  
  
  🔍 Symptoms
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Pods on the affected node cannot communicate with pods on other nodes.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;The node's status is "not ready" in the Kubernetes cluster.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;BIRD logs indicate errors like:&lt;br&gt;
&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  bird: Unable to open configuration file /etc/calico/confd/config/bird.cfg: No such file or directory
  bird: Unable to open configuration file /etc/calico/confd/config/bird6.cfg: No such file or directory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;These errors suggest that BIRD cannot find its configuration files, often due to incorrect IP autodetection.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧭 Root Cause
&lt;/h2&gt;

&lt;p&gt;Calico's default IP autodetection method (&lt;code&gt;first-found&lt;/code&gt;) may select an unintended interface, especially in nodes with multiple network interfaces. This misconfiguration can lead to BIRD being unable to establish proper BGP sessions, resulting in the node being marked as "not ready".&lt;/p&gt;

&lt;h2&gt;
  
  
  ✅ Solution Approach
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. &lt;strong&gt;Identify the Correct Network Interface&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Determine the appropriate network interface for Calico's BGP peering. Typically, this would be the primary network interface used for inter-node communication.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. &lt;strong&gt;Set IP Autodetection Method Temporarily&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;To test the new configuration, set the &lt;code&gt;IP_AUTODETECTION_METHOD&lt;/code&gt; environment variable on the Calico node DaemonSet:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl &lt;span class="nb"&gt;set env &lt;/span&gt;daemonset/calico-node &lt;span class="nt"&gt;-n&lt;/span&gt; calico-system &lt;span class="nv"&gt;IP_AUTODETECTION_METHOD&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nv"&gt;interface&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;eth0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Replace &lt;code&gt;eth0&lt;/code&gt; with the interface name identified in the previous step. If interface names vary across nodes, Calico also supports other autodetection methods, such as &lt;code&gt;can-reach=&amp;lt;destination-IP&amp;gt;&lt;/code&gt;, which selects the interface that can route to the given address.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. &lt;strong&gt;Verify the Configuration&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Check the status of the Calico node pods to ensure they are running correctly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get pods &lt;span class="nt"&gt;-n&lt;/span&gt; calico-system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Additionally, inspect the logs of the Calico node pods to confirm that BIRD has initialized without errors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl logs &lt;span class="nt"&gt;-n&lt;/span&gt; calico-system calico-node-&amp;lt;pod-id&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  4. &lt;strong&gt;Set IP Autodetection Method Permanently&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;To make the change permanent, update the Calico Installation resource:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight yaml"&gt;&lt;code&gt;&lt;span class="na"&gt;apiVersion&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;operator.tigera.io/v1&lt;/span&gt;
&lt;span class="na"&gt;kind&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Installation&lt;/span&gt;
&lt;span class="na"&gt;metadata&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;name&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;default&lt;/span&gt;
  &lt;span class="na"&gt;namespace&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;tigera-operator&lt;/span&gt;
&lt;span class="na"&gt;spec&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
  &lt;span class="na"&gt;calicoNetwork&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
    &lt;span class="na"&gt;nodeAddressAutodetectionV4&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt;
      &lt;span class="na"&gt;interface&lt;/span&gt;&lt;span class="pi"&gt;:&lt;/span&gt; &lt;span class="s"&gt;eth0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Apply the updated configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl apply &lt;span class="nt"&gt;-f&lt;/span&gt; &amp;lt;your-installation-file&amp;gt;.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This ensures that the specified interface is used for IP autodetection across all nodes in the cluster.&lt;/p&gt;

&lt;h3&gt;
  
  
  5. &lt;strong&gt;Restart Calico Node Pods&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;After applying the changes, restart the Calico node pods to apply the new configuration:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl rollout restart daemonset/calico-node &lt;span class="nt"&gt;-n&lt;/span&gt; calico-system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This command restarts the Calico node DaemonSet, ensuring that all pods pick up the new configuration.&lt;/p&gt;

&lt;h2&gt;
  
  
  🧪 Verification
&lt;/h2&gt;

&lt;p&gt;After completing the steps above, verify that the node has transitioned to a "ready" state:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl get nodes
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Ensure that the node in question is listed as "Ready".&lt;/p&gt;

&lt;p&gt;Also, confirm that BIRD is running without errors:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;kubectl &lt;span class="nb"&gt;exec&lt;/span&gt; &lt;span class="nt"&gt;-n&lt;/span&gt; calico-system calico-node-&amp;lt;pod-id&amp;gt; &lt;span class="nt"&gt;--&lt;/span&gt; birdcl show status
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The output should indicate that BIRD is initialized and ready.&lt;/p&gt;

&lt;h2&gt;
  
  
  💡 Best Practices
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Consistent Configuration&lt;/strong&gt;: Ensure that the IP autodetection method is consistently configured across all nodes to avoid network inconsistencies.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Regular Monitoring&lt;/strong&gt;: Regularly monitor the status of Calico node pods and BIRD to detect and resolve issues promptly.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Documentation&lt;/strong&gt;: Document the network interfaces and configurations used for IP autodetection to facilitate troubleshooting and future configurations.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;By following this approach, you can resolve Calico node readiness issues related to IP autodetection and ensure stable networking within your Kubernetes cluster.&lt;/p&gt;

</description>
      <category>tutorial</category>
      <category>networking</category>
      <category>kubernetes</category>
      <category>devops</category>
    </item>
    <item>
      <title>Local Docker Registry Setup Guide</title>
      <dc:creator>Kader Khan</dc:creator>
      <pubDate>Fri, 14 Mar 2025 16:16:14 +0000</pubDate>
      <link>https://dev.to/abirk/local-docker-registry-setup-guide-1cno</link>
      <guid>https://dev.to/abirk/local-docker-registry-setup-guide-1cno</guid>
      <description>&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Make sure your machine has a public IP address associated with it.&lt;/li&gt;
&lt;li&gt;Ensure you have &lt;code&gt;sudo&lt;/code&gt; privileges on your system.&lt;/li&gt;
&lt;li&gt;Update your system's package list and upgrade existing packages.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Step 1: Install Docker
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Update Your System:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;sudo &lt;/span&gt;apt update &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nb"&gt;sudo &lt;/span&gt;apt upgrade &lt;span class="nt"&gt;-y&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Install Docker:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; docker.io
   &lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable&lt;/span&gt; &lt;span class="nt"&gt;--now&lt;/span&gt; docker
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Add User to Docker Group:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;sudo &lt;/span&gt;usermod &lt;span class="nt"&gt;-aG&lt;/span&gt; docker &lt;span class="nv"&gt;$USER&lt;/span&gt;
   newgrp docker
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Verify Docker Installation:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   docker &lt;span class="nt"&gt;--version&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 2: Run a Local Docker Registry
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Run the Registry:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; 5000:5000 &lt;span class="nt"&gt;--name&lt;/span&gt; registry &lt;span class="nt"&gt;--restart&lt;/span&gt; always registry:2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Verify the Registry is Running:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   curl http://localhost:5000/v2/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Check Available Registry Images:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   curl http://localhost:5000/v2/_catalog
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
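&lt;p&gt;For reference, both endpoints return JSON. Against a freshly started, empty registry the responses look like the following (shapes defined by the Docker Registry HTTP API v2; shown here as a sketch rather than live output):&lt;/p&gt;

```shell
# Expected bodies from an empty registry (sketch, not live output):
v2_response='{}'                            # GET /v2/        API version check
catalog_response='{"repositories":[]}'      # GET /v2/_catalog  no images yet
echo "$catalog_response"
# After pushing an image named "myapp", the catalog would instead return:
#   {"repositories":["myapp"]}
```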



&lt;h2&gt;
  
  
  Step 3: Secure the Registry with Authentication
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Create Authentication Credentials:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;sudo mkdir&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; /etc/docker/registry
   &lt;span class="nb"&gt;sudo chmod &lt;/span&gt;777 /etc/docker/registry
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Install Apache Utilities (htpasswd):&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;sudo &lt;/span&gt;apt update
   &lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; apache2-utils
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Generate Credentials:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   htpasswd &lt;span class="nt"&gt;-Bbn&lt;/span&gt; &amp;lt;username&amp;gt; &amp;lt;password&amp;gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; /etc/docker/registry/htpasswd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Login to the Private Registry:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   docker login localhost:5000
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Step 4: Secure the Registry with SSL/TLS
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Install Certbot for SSL Certificates:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;sudo &lt;/span&gt;apt &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; certbot
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Generate an SSL Certificate:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;sudo &lt;/span&gt;certbot certonly &lt;span class="nt"&gt;--standalone&lt;/span&gt; &lt;span class="nt"&gt;-d-&lt;/span&gt;&amp;lt;your_domain_name&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;&lt;strong&gt;Run the Registry with SSL &amp;amp; Authentication:&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;First, stop and remove the running registry:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   docker stop registry &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; docker &lt;span class="nb"&gt;rm &lt;/span&gt;registry
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then run the registry again with authentication and TLS enabled:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   docker run &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; 5000:5000 &lt;span class="nt"&gt;--name&lt;/span&gt; registry &lt;span class="nt"&gt;--restart&lt;/span&gt; always &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-v&lt;/span&gt; /etc/docker/registry:/auth &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-v&lt;/span&gt; /etc/letsencrypt:/certs &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;"REGISTRY_AUTH=htpasswd"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;"REGISTRY_AUTH_HTPASSWD_REALM=&amp;lt;your_realm&amp;gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;"REGISTRY_AUTH_HTPASSWD_PATH=/auth/htpasswd"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;"REGISTRY_HTTP_TLS_CERTIFICATE=/certs/live/&amp;lt;domain&amp;gt;/fullchain.pem"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="s2"&gt;"REGISTRY_HTTP_TLS_KEY=/certs/live/&amp;lt;domain&amp;gt;/privkey.pem"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
   registry:2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Test Secure Connection:&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   curl &lt;span class="nt"&gt;-k&lt;/span&gt; &lt;span class="nt"&gt;-u&lt;/span&gt; &amp;lt;user&amp;gt;:&lt;span class="s1"&gt;'&amp;lt;password&amp;gt;'&lt;/span&gt; https://&amp;lt;domain&amp;gt;:5000/v2/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Troubleshooting
&lt;/h2&gt;

&lt;p&gt;If the registry fails to read the certificates or the &lt;code&gt;htpasswd&lt;/code&gt; file, run the following commands to adjust permissions:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo chmod&lt;/span&gt; &lt;span class="nt"&gt;-R&lt;/span&gt; 755 /etc/letsencrypt/
&lt;span class="nb"&gt;sudo chmod&lt;/span&gt; &lt;span class="nt"&gt;-R&lt;/span&gt; 755 /etc/letsencrypt/live/
&lt;span class="nb"&gt;sudo chmod&lt;/span&gt; &lt;span class="nt"&gt;-R&lt;/span&gt; 644 /etc/letsencrypt/live/&amp;lt;domain&amp;gt;/&lt;span class="k"&gt;*&lt;/span&gt;
&lt;span class="nb"&gt;sudo chmod&lt;/span&gt; &lt;span class="nt"&gt;-R&lt;/span&gt; 644 /etc/letsencrypt/archive/&amp;lt;domain&amp;gt;/&lt;span class="k"&gt;*&lt;/span&gt;
&lt;span class="nb"&gt;sudo chmod &lt;/span&gt;640 /etc/docker/registry/htpasswd
&lt;span class="nb"&gt;sudo chown &lt;/span&gt;root:docker /etc/docker/registry/htpasswd
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>docker</category>
      <category>kubernetes</category>
      <category>programming</category>
      <category>webdev</category>
    </item>
    <item>
      <title>Longhorn CSI pvc attachment issues fixing with multipath</title>
      <dc:creator>Kader Khan</dc:creator>
      <pubDate>Wed, 19 Feb 2025 18:15:56 +0000</pubDate>
      <link>https://dev.to/abirk/longhorn-pvc-attachment-issues-fixing-with-multipath-5cd6</link>
      <guid>https://dev.to/abirk/longhorn-pvc-attachment-issues-fixing-with-multipath-5cd6</guid>
      <description>&lt;h1&gt;
  
  
  🚀 Longhorn CSI Mount Issue Fix
&lt;/h1&gt;

&lt;h2&gt;
  
  
  ❗ Issue
&lt;/h2&gt;

&lt;p&gt;Pods using Longhorn volumes may fail to start due to errors in &lt;code&gt;longhorn-csi-plugin&lt;/code&gt;, specifically related to &lt;strong&gt;mount failures&lt;/strong&gt; caused by &lt;code&gt;multipathd&lt;/code&gt;.  &lt;/p&gt;

&lt;h3&gt;
  
  
  🔍 Error Message
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Mounting command: mount
Mounting arguments: -t ext4 -o defaults /dev/longhorn/pvc-xxxx /var/lib/kubelet/pods/xxx/mount
Output: mount: /var/lib/kubelet/pods/xxx/mount: /dev/longhorn/pvc-xxxx already mounted or mount point busy.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  🎯 Root Cause
&lt;/h2&gt;

&lt;p&gt;The &lt;strong&gt;multipath daemon (&lt;code&gt;multipathd&lt;/code&gt;)&lt;/strong&gt; automatically creates multipath devices for block devices, including Longhorn volumes. This results in &lt;strong&gt;conflicts when mounting Longhorn volumes&lt;/strong&gt;, preventing pods from starting.&lt;/p&gt;




&lt;h2&gt;
  
  
  ✅ Solution
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1️⃣ Check Longhorn Devices
&lt;/h3&gt;

&lt;p&gt;Run the following command to &lt;strong&gt;list devices created by Longhorn&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;lsblk
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;🔹 Longhorn devices typically have names like &lt;code&gt;/dev/sd[x]&lt;/code&gt;.&lt;/p&gt;
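&lt;p&gt;A sketch of what to look for, using hypothetical &lt;code&gt;lsblk&lt;/code&gt; output rather than a live node (device names and sizes will differ on your cluster):&lt;/p&gt;

```shell
# Hypothetical `lsblk -d -o NAME,SIZE,TYPE` sample; Longhorn-backed volumes
# appear as plain sdX disks alongside the node's root disk.
sample='NAME    SIZE TYPE
nvme0n1 80G  disk
sda     10G  disk
sdb     20G  disk'

# Filter out the sdX devices, which are the Longhorn candidates here.
candidates=$(echo "$sample" | awk '/^sd/ {print $1}')
echo "$candidates"
```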




&lt;h3&gt;
  
  
  2️⃣ Modify &lt;code&gt;multipath.conf&lt;/code&gt;
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Create the configuration file&lt;/strong&gt; (if it doesn’t exist):
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;   &lt;span class="nb"&gt;sudo touch&lt;/span&gt; /etc/multipath.conf
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Add the following blacklist rule&lt;/strong&gt;:
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight conf"&gt;&lt;code&gt;   &lt;span class="n"&gt;blacklist&lt;/span&gt; {
       &lt;span class="n"&gt;devnode&lt;/span&gt; &lt;span class="s2"&gt;"^sd[a-z0-9]+"&lt;/span&gt;
   }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
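&lt;p&gt;Note that this rule blacklists &lt;em&gt;every&lt;/em&gt; &lt;code&gt;sd*&lt;/code&gt; device, which is the intended behaviour here but worth double-checking if the node also has legitimate multipath SCSI disks. A small sanity check of the regex (a sketch using sample device names, and &lt;code&gt;grep -E&lt;/code&gt; as an approximation of multipathd's regex matching):&lt;/p&gt;

```shell
# Check which device names the blacklist regex would match.
regex='^sd[a-z0-9]+'
check() {
  # Prints "blacklisted" if the name matches the devnode rule, else "kept".
  if printf '%s' "$1" | grep -Eq "$regex"; then
    echo "blacklisted"
  else
    echo "kept"
  fi
}
check sda        # Longhorn-style device
check nvme0n1    # typical root disk, untouched by the rule
```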






&lt;h3&gt;
  
  
  3️⃣ Restart Multipath Service
&lt;/h3&gt;

&lt;p&gt;Apply the changes by restarting the multipath daemon:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl restart multipathd.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  4️⃣ Verify Configuration
&lt;/h3&gt;

&lt;p&gt;Check if the new configuration is applied:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;multipath &lt;span class="nt"&gt;-t&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;🎉 &lt;strong&gt;Your pods should now be able to mount Longhorn volumes correctly!&lt;/strong&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  📌 Additional Tips
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Ensure that &lt;code&gt;longhorn-csi-plugin&lt;/code&gt; logs are clear of mount errors.&lt;/li&gt;
&lt;li&gt;If the issue persists, consider rebooting the node after applying the fix.&lt;/li&gt;
&lt;li&gt;Check the status of multipath with:
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;  systemctl status multipathd.service
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  🛠️ Need More Help?
&lt;/h3&gt;

&lt;p&gt;🔹 Visit the &lt;a href="https://longhorn.io/docs/" rel="noopener noreferrer"&gt;Longhorn Documentation&lt;/a&gt;&lt;br&gt;&lt;br&gt;
🔹 Join the &lt;a href="https://github.com/longhorn/longhorn" rel="noopener noreferrer"&gt;Longhorn Community&lt;/a&gt;  &lt;/p&gt;

&lt;p&gt;🚀 &lt;strong&gt;Happy Deploying!&lt;/strong&gt;  &lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

</description>
      <category>webdev</category>
      <category>tutorial</category>
      <category>opensource</category>
      <category>devops</category>
    </item>
    <item>
      <title>AWS EBS Multi-attach Clustered Storage System with GlusterFS</title>
      <dc:creator>Kader Khan</dc:creator>
      <pubDate>Wed, 19 Feb 2025 18:04:40 +0000</pubDate>
      <link>https://dev.to/abirk/aws-ebs-multi-attach-clustered-storage-system-with-glusterfs-37l7</link>
      <guid>https://dev.to/abirk/aws-ebs-multi-attach-clustered-storage-system-with-glusterfs-37l7</guid>
      <description>&lt;h1&gt;
  
  
  Clustered Storage System with GlusterFS on AWS EC2 Instances
&lt;/h1&gt;

&lt;p&gt;This guide describes how to set up a clustered storage system using GlusterFS on two AWS EC2 instances, utilizing EBS Multi-Attach for shared storage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Prerequisites
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;  At least 2 EC2 instances (c5.large or larger; EBS Multi-Attach requires Nitro-based instances)&lt;/li&gt;
&lt;li&gt;  Both instances in the same AWS region and availability zone&lt;/li&gt;
&lt;li&gt;  Tested on AWS Provided Ubuntu 24.04 LTS&lt;/li&gt;
&lt;li&gt;  EBS Volume Type: Provisioned IOPS SSD (io2) to support Multi-Attach&lt;/li&gt;
&lt;li&gt;  SSH access to EC2 instances (ensure both instances have public IPs)&lt;/li&gt;
&lt;li&gt;  EBS Multi-Attach enabled on the volumes&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Solution Overview
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Steps
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt; &lt;strong&gt;Create EC2 Instances and EBS Volumes&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Attach EBS Volumes to Both Instances&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Format and Mount EBS Volumes&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Install GlusterFS and Set Up Cluster&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Create GlusterFS Volume and Mount Shared Storage&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt; &lt;strong&gt;Auto-mount EBS Volume on Instance Reboot&lt;/strong&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  Architecture Diagram
&lt;/h3&gt;

&lt;h2&gt;
  
  
  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6xxxiacorp4mmdcpqtar.png" alt="Image description" width="800" height="428"&gt;
&lt;/h2&gt;

&lt;h2&gt;
  
  
  Step 1: Create EC2 Instances and EBS Volumes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1.1 Create EC2 Instances
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  Create 2 EC2 instances in the same region and availability zone in the AWS console.&lt;/li&gt;
&lt;li&gt;  Assign public IPs to the instances so that they can be accessed.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  1.2 Create EBS Volumes (io2)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  Create EBS volumes with the type &lt;strong&gt;Provisioned IOPS SSD (io2)&lt;/strong&gt;.&lt;/li&gt;
&lt;li&gt;  Ensure that these volumes are in the same availability zone as your EC2 instances.&lt;/li&gt;
&lt;li&gt;  Enable &lt;strong&gt;Multi-Attach&lt;/strong&gt; for both volumes so they can be attached to multiple instances.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  1.3 Attach EBS Volumes to EC2 Instances
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;  Attach the created EBS volumes to both EC2 instances.&lt;/li&gt;
&lt;li&gt;  Go to Volumes in the EC2 console, select the io2 volume, then choose Actions, then Attach Volume in the upper right corner.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step 2: Format and Mount EBS Volumes
&lt;/h2&gt;

&lt;h3&gt;
  
  
  2.1 Verify EBS Volume Connection
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;SSH into both EC2 instances and verify the connection of the EBS volume by running:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;lsblk

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Ensure that the volume is attached to the instance.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2.2 Format the EBS Volume
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Format the EBS volume once on either of the instances (only format it once for the shared file system):&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;mkfs.xfs /dev/nvme1n1  &lt;span class="c"&gt;# Replace 'nvme1n1' with your disk name&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  2.3 Mount the EBS Volume
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Create a directory to mount the EBS volume:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo mkdir&lt;/span&gt; /home/ubuntu/data

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Mount the EBS volume to the newly created directory:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;mount /dev/nvme1n1 /home/ubuntu/data

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Verify the mount:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;df&lt;/span&gt; &lt;span class="nt"&gt;-h&lt;/span&gt; /home/ubuntu/data

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step 3: Install GlusterFS
&lt;/h2&gt;

&lt;h3&gt;
  
  
  3.1 Install GlusterFS on Both Instances
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Add the GlusterFS repository and install the GlusterFS server package:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;add-apt-repository ppa:gluster/glusterfs-10
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get update
&lt;span class="nb"&gt;sudo &lt;/span&gt;apt-get &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-y&lt;/span&gt; glusterfs-server

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  3.2 Start GlusterFS Service
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Start the GlusterFS service on both instances:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl start glusterd
&lt;span class="nb"&gt;sudo &lt;/span&gt;systemctl &lt;span class="nb"&gt;enable &lt;/span&gt;glusterd

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step 4: Set Up GlusterFS Cluster
&lt;/h2&gt;

&lt;h3&gt;
  
  
  4.1 Peer Probe
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Choose one instance as the primary and the other as the secondary.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;On the &lt;strong&gt;primary instance&lt;/strong&gt;, run the following command to add the secondary instance to the cluster:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;gluster peer probe &amp;lt;secondary_instance_privateIP&amp;gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If successful, the output will display &lt;code&gt;peer probe: success&lt;/code&gt;. You can confirm the peering afterwards with &lt;code&gt;sudo gluster peer status&lt;/code&gt; on either instance.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step 5: Create GlusterFS Volume
&lt;/h2&gt;

&lt;h3&gt;
  
  
  5.1 Create Shared Volume
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;On the &lt;strong&gt;primary instance&lt;/strong&gt;, create a GlusterFS volume called &lt;code&gt;shared-volume&lt;/code&gt; using the mounted EBS volumes. This volume will be replicated across both instances (note that two-way replication is prone to split-brain; the Gluster documentation recommends &lt;code&gt;replica 3&lt;/code&gt; or an arbiter brick for production use):&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;gluster volume create shared-volume replica 2 transport tcp &amp;lt;primary_instance_privateIP&amp;gt;:/home/ubuntu/data &amp;lt;secondary_instance_privateIP&amp;gt;:/home/ubuntu/data force

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  5.2 Start the GlusterFS Volume
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Start &lt;code&gt;shared-volume&lt;/code&gt; from either instance (this only needs to be done once):&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;gluster volume start shared-volume

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step 6: Mount GlusterFS Volume
&lt;/h2&gt;

&lt;h3&gt;
  
  
  6.1 Mount the GlusterFS Volume on Both Instances
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Create a mount point directory (e.g., &lt;code&gt;/mnt/shared&lt;/code&gt;):&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo mkdir&lt;/span&gt; /mnt/shared

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;Mount the &lt;code&gt;shared-volume&lt;/code&gt; GlusterFS volume to this directory:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;sudo &lt;/span&gt;mount &lt;span class="nt"&gt;-t&lt;/span&gt; glusterfs &amp;lt;primary_instance_privateIP&amp;gt;:/shared-volume /mnt/shared

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Verify that the GlusterFS volume is mounted correctly by creating or updating a file in the &lt;code&gt;/mnt/shared&lt;/code&gt; directory on either instance.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Step 7: Auto-Mount EBS Volume on Reboot
&lt;/h2&gt;

&lt;h3&gt;
  
  
  7.1 Add to &lt;code&gt;/etc/fstab&lt;/code&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;Edit &lt;code&gt;/etc/fstab&lt;/code&gt; to automatically mount the EBS volume on reboot:&lt;br&gt;
&lt;/p&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;echo&lt;/span&gt; &lt;span class="s1"&gt;'/dev/nvme1n1 /home/ubuntu/data xfs defaults 0 0'&lt;/span&gt; | &lt;span class="nb"&gt;sudo tee&lt;/span&gt; &lt;span class="nt"&gt;-a&lt;/span&gt; /etc/fstab

&lt;/code&gt;&lt;/pre&gt;

&lt;/li&gt;
&lt;/ul&gt;
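&lt;p&gt;Two caveats worth noting as a hedged sketch: NVMe device names like &lt;code&gt;/dev/nvme1n1&lt;/code&gt; can change between reboots, so a UUID-based entry is safer, and the GlusterFS client mount from Step 6 also needs its own &lt;code&gt;fstab&lt;/code&gt; line (with &lt;code&gt;_netdev&lt;/code&gt; so mounting waits for the network). &lt;code&gt;REPLACE_WITH_UUID&lt;/code&gt; and &lt;code&gt;PRIMARY_IP&lt;/code&gt; below are placeholders:&lt;/p&gt;

```shell
# Find the filesystem UUID first (run on the instance):
#   sudo blkid /dev/nvme1n1
# REPLACE_WITH_UUID and PRIMARY_IP are placeholders for the UUID that blkid
# prints and the primary instance's private IP.
ebs_entry='UUID=REPLACE_WITH_UUID /home/ubuntu/data xfs defaults,nofail 0 0'
gluster_entry='PRIMARY_IP:/shared-volume /mnt/shared glusterfs defaults,_netdev 0 0'
# Append both lines with: printf the entries piped through `sudo tee -a /etc/fstab`
printf '%s\n' "$ebs_entry" "$gluster_entry"
```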




&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;You have successfully set up a clustered storage system using GlusterFS with shared EBS volumes in AWS. The shared storage is now accessible from both EC2 instances, and the GlusterFS volume ensures that data is synchronized between the two instances.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>aws</category>
      <category>learning</category>
      <category>tutorial</category>
    </item>
  </channel>
</rss>
