DEV Community: Piyush Gupta

Master-Class: Caching — What Every Software Engineer Actually Needs to Know

Piyush Gupta — Tue, 05 May 2026 08:01:18 +0000

Skip the theory rabbit holes. This is the caching knowledge that shows up in system design interviews, code reviews, and the 2 AM production incidents nobody warned you about.

Why Caching — The 30-Second Version
Where Do You Actually Cache?
Cache-Aside — The Pattern You'll Use 80% of the Time
Write Strategies — The Other Side of the Coin
Eviction Policies — LRU, LFU, and When It Matters
TTL — Getting Expiry Right
Cache Invalidation — The Hard Problem
Cache Stampede — The Failure Mode That Kills Systems
What NOT to Cache
Practical Decision Framework
Summary Cheat Sheet

1. Why Caching

Your database query takes 50ms. Your Redis lookup takes 0.5ms. If that same data is read 10,000 times before it changes, you're doing 10,000 × 50ms of database work — or you could do 1 × 50ms + 9,999 × 0.5ms. That's the entire case for caching.

Caching works because of read-heavy, write-light data patterns. Most applications read the same data far more than they change it. A product page might be read 100,000 times a day and updated once. A user profile is fetched on every page load and changed when the user updates their settings.

The fundamental tradeoff is always speed vs. freshness. A cache is a copy of data. That copy can be stale. Every caching decision is asking: how stale is acceptable, and for how long?

2. Where Do You Actually Cache?

In real applications, you'll encounter caches at multiple layers. Know them, because they interact:

In-process cache — data stored in your application's memory (a dict, a functools.lru_cache, Guava Cache in Java). Sub-millisecond. Doesn't survive restarts. Not shared across instances. Great for configuration, rarely-changing reference data, or memoization.

from functools import lru_cache

@lru_cache(maxsize=1000)
def get_country_code(country_name: str) -> str:
    return db.query("SELECT code FROM countries WHERE name = ?", country_name)

Distributed cache — Redis or Memcached. Your whole fleet of servers shares the same cache. 0.5–5ms per call. This is what people mean when they say "add a cache layer." It's the workhorse.

CDN — Cloudflare, CloudFront, Fastly. Caches HTTP responses at edge servers close to users. Milliseconds instead of hundreds of milliseconds. Invaluable for static assets, public API responses, and rendered pages.

Database query cache — Most databases (PostgreSQL, MySQL) cache query results internally. You get this for free and generally don't think about it directly, but it's why the second run of a cold query is faster than the first.

In a system design interview, when the interviewer asks "how do you handle scale?", the answer often involves: add Redis between your app and your database. Know when that's appropriate and when it isn't.

3. Cache-Aside

This is the pattern you will use most. Learn it cold.

The read flow:

Check the cache
If HIT → return the cached value
If MISS → query the database, store the result in cache with a TTL, return it

The write flow:

Write to the database
Delete (invalidate) the cache entry

def get_user(user_id: int) -> User:
    cache_key = f"user:{user_id}"

    # 1. Check cache
    cached = redis.get(cache_key)
    if cached:
        return User.from_json(cached)

    # 2. Cache miss — hit the DB
    user = db.query("SELECT * FROM users WHERE id = %s", user_id)

    # 3. Store in cache for next time (5 minute TTL)
    redis.setex(cache_key, 300, user.to_json())
    return user

def update_user(user_id: int, data: dict):
    db.execute("UPDATE users SET ... WHERE id = %s", user_id)
    redis.delete(f"user:{user_id}")  # Invalidate — don't update, just delete

Why delete instead of update on writes?

Updating the cache on a write sounds sensible but creates race conditions. Two concurrent writes can race each other, leaving the cache in the wrong state. Deleting is safe — the next read will fetch fresh data from the database and re-populate the cache correctly.

Why cache-aside is the safe default:

If Redis goes down, your application still works (just slower — it falls back to the DB)
Only data that's actually requested gets cached (no wasted memory)
Simple to reason about and debug

The main downside: the first request after a cache miss always pays the full DB cost. For a freshly deployed service, all traffic misses until the cache warms up.

4. Write Strategies

Cache-aside handles reads well. For writes, you have three meaningful choices depending on your consistency requirements.

Write-Through — Always Consistent, Always Slower

On every write: update the DB and update the cache in the same operation. The cache is never stale.

def update_product_price(product_id: int, new_price: float):
    db.execute("UPDATE products SET price = %s WHERE id = %s", new_price, product_id)
    # Also update cache immediately
    product = db.get_product(product_id)
    redis.setex(f"product:{product_id}", 300, product.to_json())

Use when: you have strict read-after-write requirements. A user updates their profile — they expect to immediately see the new version on their next page load.

The cost: every write is now double the latency (DB write + cache write). In write-heavy systems, this adds up.

Write-Behind (Write-Back) — Fast Writes, Risk of Data Loss

Write to the cache immediately, acknowledge success to the user, flush to the DB asynchronously in the background.

Write → Cache (instant ACK to user)
              ↓ (background, batched)
         Database

Use when: you're writing high-frequency counters, view counts, analytics events — data where you can tolerate eventual consistency and need sub-millisecond write latency.

The real risk: if Redis crashes before flushing to the DB, that data is gone. Only use this when data loss is survivable.

Write-Around — Skip the Cache on Write

Writes go straight to the DB. The cache is only populated on reads. Use this when data is written once and rarely (or never) re-read soon after writing.

Classic example: log entries, uploaded files, audit records. You write them, they go to storage, and they get cached only if someone actually requests them.

In Practice

Most applications use cache-aside for reads + delete-on-write for invalidation. Write-through is added for critical paths where consistency matters. Write-behind is reserved for high-frequency write scenarios like analytics. You're rarely choosing just one pattern — you're mixing them based on data type.

5. Eviction Policies

When your cache is full and a new entry needs to go in, something has to be evicted. Redis gives you several policies. Here's what actually matters:

LRU (Least Recently Used): evict the entry that hasn't been accessed for the longest time. The default and the right choice for most applications. Rooted in a simple truth: if you haven't touched it recently, you probably don't need it.

LFU (Least Frequently Used): evict the entry with the fewest total accesses. Better when your popular data is stable and long-lived (a top-10 product list that's been hot for months). Worse for new content — it starts with zero frequency and gets evicted immediately even if it's about to go viral.

FIFO: evict the oldest inserted entry. Ignores access patterns. Rarely the right choice.

No eviction: Redis returns errors when full instead of evicting. Useful when your cache is a source of truth and data loss is unacceptable.

Redis Configuration

# In redis.conf or via CONFIG SET:

# Evict the least recently used keys across all keys
maxmemory-policy allkeys-lru

# Evict the least frequently used keys across all keys
maxmemory-policy allkeys-lfu

# Only evict keys that have a TTL set (leave persistent keys alone)
maxmemory-policy volatile-lru

# Set max memory (Redis won't use more than this)
maxmemory 2gb

Interview answer: "I'd use allkeys-lru for a general-purpose cache and allkeys-lfu if the dataset has clear long-term hot items like a top products list."

6. TTL

TTL (Time-To-Live) is how long a cache entry lives before Redis automatically deletes it. Getting this right matters more than most engineers think.

Too short: high miss rate, unnecessary database pressure, cache barely helps.

Too long: users see stale data, which is a product problem.

Practical TTL by Data Type

# Configuration / reference data — changes almost never
redis.setex("country_codes", 86400, data)           # 24 hours

# User profiles — changes occasionally
redis.setex(f"user:{id}", 300, data)                # 5 minutes

# Product catalog — changes regularly
redis.setex(f"product:{id}", 60, data)              # 1 minute

# Live inventory / pricing — changes frequently  
redis.setex(f"inventory:{id}", 10, data)            # 10 seconds

# Trending content — freshness matters
redis.setex("trending_posts", 30, data)             # 30 seconds

# Sessions — live as long as the session
redis.setex(f"session:{token}", 3600, data)         # 1 hour

The Jitter Rule

If you deploy a service and populate 10,000 cache entries all with TTL=300, they will all expire at exactly T+300 simultaneously. Every single one misses at once. This is a thundering herd problem — more on that in the next section.

Fix it by adding randomness to your TTL:

import random

def cache_set(key: str, value, base_ttl: int):
    # Add ±10% random jitter
    jitter = random.randint(-base_ttl // 10, base_ttl // 10)
    redis.setex(key, base_ttl + jitter, value)

This one habit prevents an entire class of production incidents.

7. Cache Invalidation

Phil Karlton famously said: "There are only two hard things in computer science: cache invalidation and naming things."

It's hard for a real reason: the data in your cache can go out of sync with your database, and getting it back in sync across distributed systems without race conditions is genuinely tricky.

Strategy 1: Let TTL Handle It

The simplest approach. Set a TTL, serve slightly stale data, and accept that users might see data that's up to TTL seconds old.

Works for: product descriptions, blog posts, user profiles, catalog data — anything where "slightly stale" is a product-acceptable answer.

# Just use TTL and don't overthink it
redis.setex(f"product:{product_id}", 60, product.to_json())
# Worst case: 60 seconds of stale data

This is underused. Many engineers reach for complex invalidation schemes when a TTL is perfectly fine.

Strategy 2: Explicit Delete on Write

When data changes, delete the cache entry immediately. The next read re-populates it fresh.

def update_product(product_id: int, data: dict):
    db.update(product_id, data)
    redis.delete(f"product:{product_id}")  # Done — next read will be fresh

Works for: any data where you own both the write path and the cache. Simple and reliable when the same service writes and caches.

The problem it doesn't solve: in microservices, Service A writes the data and Service B has it cached. Service B doesn't know about Service A's write.

Strategy 3: Event-Driven Invalidation

Service A writes to the DB, publishes an event. Service B subscribes and invalidates its cache.

# Service A (Order service) — after a status change:
def update_order_status(order_id: int, new_status: str):
    db.update_order(order_id, new_status)
    kafka.publish("order.updated", {"order_id": order_id, "status": new_status})

# Service B (Dashboard service) — subscribes to events:
@kafka.subscribe("order.updated")
def handle_order_updated(event):
    redis.delete(f"order:{event['order_id']}")

Works for: microservice architectures where multiple services cache the same data. More complex, but necessary at scale.

One important nuance: after deleting from cache, don't immediately re-populate it. Let the next natural read repopulate it. If you delete and re-populate immediately inside the event handler, you might re-populate with the old data if the DB replica hasn't caught up yet.

The Race Condition to Know

This comes up in interviews. The "double write" race:

Time 1: Request A reads product:123 → cache miss → queries DB → gets price $10
Time 2: Request B updates price to $20, deletes cache:product:123
Time 3: Request A writes the old $10 value back to cache
         Cache now has stale $10 data

The standard fix: set a short TTL even when doing explicit invalidation. The window for stale data is bounded by the TTL, not infinite.

8. Cache Stampede

This is the failure mode that takes down production systems. You need to understand it, and you need to know one way to prevent it.

What Happens

A popular cache key expires. Instead of one request rebuilding it, thousands of simultaneous requests all see a cache miss and all independently query the database at the same time. Your database, which was handling 100 queries/second because the cache was absorbing the load, suddenly receives 5,000 simultaneous identical queries.

Normal:   10,000 req/s → cache (99% hit) → ~100 req/s to DB   ✅
Stampede: popular key expires → all 10,000 req/s hit DB at once 💥

This is real. In 2019, Shopify's recommendation cache expired during Black Friday. Tens of thousands of concurrent requests slammed their database. Checkout pages timed out for 8 minutes, costing millions.

Prevention 1: TTL Jitter (Mandatory)

Covered in the TTL section — always add randomness. This prevents mass synchronized expiry but doesn't protect a single hot key from expiring under high traffic.

Prevention 2: Distributed Lock (The Reliable Fix)

Only let one request rebuild the cache. Everyone else either waits or gets served slightly stale data.

def get_trending_products():
    cache_key = "trending:products"
    lock_key = "lock:trending:products"

    # Fast path — cache hit
    cached = redis.get(cache_key)
    if cached:
        return json.loads(cached)

    # Cache miss — try to acquire a rebuild lock
    acquired = redis.set(lock_key, "1", nx=True, ex=10)  # 10s lock TTL

    if acquired:
        try:
            # This process rebuilds the cache
            products = db.get_trending_products()
            redis.setex(cache_key, 300, json.dumps(products))
            return products
        finally:
            redis.delete(lock_key)
    else:
        # Another process is rebuilding — wait briefly and try cache again
        time.sleep(0.1)
        cached = redis.get(cache_key)
        return json.loads(cached) if cached else db.get_trending_products()

Key rules for the lock:

Lock TTL must exceed the maximum time the DB query takes (if it takes 2s, your lock can't be 1s)
Use atomic SET key value NX EX ttl — never two separate commands
Always release the lock in a finally block

Prevention 3: Stale-While-Revalidate

Serve the expired (stale) data immediately, trigger a background refresh. Users never see a slow response.

def get_with_stale(key: str, fetch_fn, ttl: int):
    raw = redis.get(key)
    if raw:
        payload = json.loads(raw)
        # If soft TTL expired, refresh in background but still return stale data
        if time.time() > payload["refresh_at"]:
            threading.Thread(target=lambda: _refresh(key, fetch_fn, ttl)).start()
        return payload["value"]

    # True miss — must fetch synchronously
    value = fetch_fn()
    redis.setex(key, ttl + 60, json.dumps({
        "value": value,
        "refresh_at": time.time() + ttl
    }))
    return value

This is also built into HTTP via Cache-Control: stale-while-revalidate=N — CDNs and browsers serve the stale response for up to N seconds while refreshing in the background.

Interview answer for "how do you handle cache stampede?": "I'd add TTL jitter to prevent synchronized expiry, and for high-traffic keys I'd use a distributed lock so only one process rebuilds the cache while others serve stale data or wait briefly."

9. What NOT to Cache

Knowing what to skip is as important as knowing what to add.

Unique or rarely repeated queries. If every user has a unique, personalized query that's never repeated, you'll have a 0% hit rate. You're paying cache overhead for no benefit. Search queries are a classic trap — "nike red shoes size 10" is unlikely to be queried identically again anytime soon.

Highly mutable data. If a value changes every few seconds, the TTL has to be so short that it barely helps. Invalidating it on every write negates the performance gain.

Large objects that take up disproportionate cache memory. Caching a 5MB JSON blob means that single entry uses the memory of 5,000 small entries. Cache the small things — IDs, scalars, lightweight structs — not entire serialized graphs.

User-specific data at scale. Caching each user's dashboard state for 50 million users means 50 million cache entries, most of which will be evicted before anyone reads them again. Profile data for frequently active users: cache it. Personalized dashboard content for casual users: recalculate on demand.

Data with complex invalidation dependencies. If invalidating one piece of data requires you to figure out and invalidate 20 related cache keys, you'll eventually miss one. The cache becomes inconsistently stale in ways that are very hard to debug. Sometimes the right answer is: don't cache this, compute it fresh.

Sensitive data without careful thought. PII, payment info, auth tokens — caching these in a shared Redis instance requires careful access control and encryption. The risk surface is real.

10. Practical Decision Framework

Before adding a cache, run through this:

1. Is the data read frequently and written infrequently?
   → No: caching probably won't help. Don't bother.
   → Yes: continue.

2. Is the data expensive to fetch or compute?
   → No (< ~5ms DB query, simple lookup): don't cache.
   → Yes (complex joins, external API calls, heavy computation): good candidate.

3. How stale can this data be?
   → Never stale (balances, inventory): use write-through or invalidate on every write.
   → Slightly stale OK (profiles, catalog): cache-aside + TTL (seconds to minutes).
   → Very stale OK (blog content, reference data): long TTL (hours to days).

4. Who writes the data?
   → Same service: delete from cache in the write path. Simple.
   → Different service: event-driven invalidation via Pub/Sub or message queue.
   → Multiple writers: use a short TTL as a safety net. Accept bounded staleness.

5. What's the failure mode if cache goes down?
   → Application falls back to DB (slower but works): cache-aside, safe to proceed.
   → Application dies: reconsider the architecture before adding more caching.

6. What's the expected hit rate?
   → < 50%: the cache is barely helping. Rethink the key design or don't cache.
   → > 80%: worth it.
   → > 95%: excellent.

11. Summary Cheat Sheet

Caching Patterns at a Glance

Pattern	Reads from	Writes to	Consistency	Cache fails →	Use when
Cache-Aside	Cache, then DB	DB (delete cache)	Eventual	App still works	Default choice — most scenarios
Write-Through	Cache	Cache + DB (sync)	Strong	App still works	Read-after-write consistency required
Write-Behind	Cache	Cache → DB (async)	Eventual	Data loss risk	High write throughput, counters
Write-Around	Cache, then DB	DB only	Eventual	App still works	Write-once, rarely re-read data

Eviction Policies at a Glance

Policy	Evicts	Best For
`allkeys-lru`	Least recently used	General-purpose cache (safe default)
`allkeys-lfu`	Least frequently used	Stable hot datasets (top products, popular content)
`volatile-lru`	LRU among TTL-bearing keys	When some keys must never be evicted
`noeviction`	Nothing (returns error when full)	When data loss is unacceptable

TTL Quick Reference

Data Type	Suggested TTL
Static config / country codes	24 hours
User profiles	5 minutes
Product catalog	1–5 minutes
Session tokens	Matches session lifetime
Trending / real-time feeds	30–60 seconds
Live pricing / inventory	5–15 seconds
Auth rate limit counters	1–60 seconds (by design)

Invalidation Strategies at a Glance

Strategy	When to Use	Tradeoff
TTL only	Staleness is acceptable	Simplest; stale window = TTL
Delete on write	You own both read and write paths	Zero stale data; extra step on writes
Event-driven	Multiple services cache the same data	Decoupled; slight propagation delay

Stampede Prevention

Technique	Best For
TTL jitter	Preventing synchronized mass expiry (always do this)
Distributed lock	Single hot key under high concurrent traffic
Stale-while-revalidate	When serving slightly stale data during refresh is acceptable
Cache warming	Preventing cold cache after deploy or restart

Closing Thoughts

Caching is one of the highest-ROI changes you can make to a system — but only when applied to the right problems. A well-placed Redis cache in front of an expensive query can slash p99 latency by 100x. A poorly placed cache adds complexity, stale data bugs, and operational overhead for no gain.

The rules to commit to memory:

Cache-aside is the default. Don't start with anything else unless you have a specific reason.
Delete, don't update, on writes. Prevents race conditions.
Always add TTL jitter. It's two lines of code that prevent a class of production incidents.
Know the stampede problem. It's the single most common cache-related outage pattern and it always hits at the worst moment (high traffic).
Measure hit rate. If it's below 80%, the cache is doing less than you think.
Design for cache failure. The system should degrade gracefully, not collapse, when Redis goes down.

Tags: #database #systemdesign #backend #performance #redis

Master-Class: Sending Real-Time Updates from Server to Clients: Server to Server, Android, iOS

Piyush Gupta — Sat, 25 Apr 2026 14:28:47 +0000

Real-time communication is now a cornerstone of modern software. Whether you're showing live scores, streaming AI responses, pushing a payment confirmation to a phone, or propagating an event between two microservices — the underlying question is the same: how does the server reach the client before the client asks?

The answer looks very different depending on who the client is. A backend server, an Android device, and an iPhone each live in fundamentally different environments, face different constraints, and have completely different ecosystems waiting to solve this problem. This article walks through each scenario in depth — the concepts, the practical tooling, and real code.

Part 1: The Mental Model — Push vs. Poll

Before jumping into techniques, it's worth understanding why this is even hard.

HTTP was originally designed around a simple request-response cycle: the client asks, the server answers, connection closes. This works perfectly for loading a webpage but is deeply mismatched for "tell me when something changes." Developers have historically compensated in two ways:

Polling — the client repeatedly asks the server "anything new?" on a timer. Simple to implement, universally supported, but wasteful. You're spending bandwidth and compute on mostly-empty responses.

Long polling — a refinement where the server holds the request open until it actually has something to say, then responds, and the client immediately reconnects. Better latency than polling, but it still creates a new HTTP connection per event, adding overhead and complicating server logic. It was common in the 2000s and early 2010s and is now largely regarded as a legacy fallback.

Push — the server maintains an open channel and sends data down it whenever necessary. This is the modern standard, and the rest of this article is about how it's done in practice.

Part 2: Server to Server

When the "client" is another server or backend service, you have the most flexibility. There's no battery to drain, no mobile OS gating the connection, and no user permission dialogs. The ecosystem here splits broadly into two families: persistent streaming connections and asynchronous messaging.

WebSockets

WebSockets establish a full-duplex, persistent TCP connection via an HTTP upgrade handshake. Once the handshake completes, either side can send frames at any time with minimal overhead — no HTTP headers are re-sent per message.

GET /chat HTTP/1.1
Host: server.example.com
Upgrade: websocket
Connection: Upgrade
Sec-WebSocket-Key: dGhlIHNhbXBsZSBub25jZQ==

HTTP/1.1 101 Switching Protocols
Upgrade: websocket
Connection: Upgrade

After this exchange, the connection is a raw duplex channel. A Node.js server using the popular ws library looks like this:

// Server (Node.js)
const WebSocket = require('ws');
const wss = new WebSocket.Server({ port: 8080 });

wss.on('connection', (ws) => {
  console.log('Client connected');

  // Push data to the client at any time
  const interval = setInterval(() => {
    if (ws.readyState === WebSocket.OPEN) {
      ws.send(JSON.stringify({ type: 'update', data: getLatestMetrics() }));
    }
  }, 1000);

  ws.on('close', () => clearInterval(interval));
});

# Client (Python)
import asyncio
import websockets
import json

async def listen():
    async with websockets.connect("ws://server.example.com:8080") as ws:
        async for message in ws:
            data = json.loads(message)
            print(f"Received: {data}")

asyncio.run(listen())

WebSockets are the right choice when you need low latency, bidirectional messaging, or binary data. The tradeoff is that they require stateful servers — you can't just put a standard load balancer in front without sticky sessions or a shared pub/sub layer (Redis is commonly used for this).

Server-Sent Events (SSE)

SSE is a simpler, HTTP-based protocol for one-directional streaming from server to client. The server responds with Content-Type: text/event-stream and never closes the connection, sending data in a simple text format:

data: {"type": "price_update", "value": 142.50}\n\n
data: {"type": "price_update", "value": 143.10}\n\n

Each event is separated by a blank line. Events can also carry an id field for resumption after reconnects, and a named event field for routing on the client.

// Server (Node.js / Express)
app.get('/events', (req, res) => {
  res.setHeader('Content-Type', 'text/event-stream');
  res.setHeader('Cache-Control', 'no-cache');
  res.setHeader('Connection', 'keep-alive');
  res.setHeader('X-Accel-Buffering', 'no'); // Critical for Nginx

  const sendEvent = (data) => {
    res.write(`id: ${Date.now()}\n`);
    res.write(`data: ${JSON.stringify(data)}\n\n`);
  };

  const interval = setInterval(() => sendEvent({ metrics: getMetrics() }), 2000);
  req.on('close', () => clearInterval(interval));
});

# Client consuming SSE (Python)
import sseclient
import requests

response = requests.get('http://server.example.com/events', stream=True)
client = sseclient.SSEClient(response)

for event in client.events():
    print(f"Received event: {event.data}")

SSE has built-in automatic reconnection — if the connection drops, the client will reconnect and send the last received event ID, allowing the server to resume from where it left off. However, SSE only transmits UTF-8 text and is strictly unidirectional. For server-to-server communication where the receiving service only needs to consume a stream of events, SSE is a lighter-weight and simpler choice than WebSockets.

Important gotcha: corporate proxies and some CDN/reverse proxy configurations buffer SSE streams silently, making events arrive in batches rather than in real time. The X-Accel-Buffering: no header fixes this for Nginx, but intermediaries you don't control remain a problem. For server-to-server communication within a private network, this is rarely an issue.

gRPC Streaming

gRPC is Google's open-source RPC framework, built on HTTP/2 and Protocol Buffers. It supports four communication patterns, including server streaming — where the client makes one request and the server streams back a sequence of responses.

// service.proto
syntax = "proto3";

service MetricsService {
  // Client sends one request, server streams many responses
  rpc StreamMetrics (MetricsRequest) returns (stream MetricsResponse);
}

message MetricsRequest { string service_name = 1; }
message MetricsResponse {
  double cpu_usage = 1;
  double memory_usage = 2;
  int64 timestamp = 3;
}

# Server (Python)
class MetricsServicer(MetricsService):
    def StreamMetrics(self, request, context):
        while context.is_active():
            yield MetricsResponse(
                cpu_usage=get_cpu(),
                memory_usage=get_memory(),
                timestamp=int(time.time())
            )
            time.sleep(1)

// Client (Go)
stream, err := client.StreamMetrics(ctx, &MetricsRequest{ServiceName: "api"})
for {
    resp, err := stream.Recv()
    if err == io.EOF { break }
    fmt.Printf("CPU: %.2f%%, Memory: %.2f%%\n", resp.CpuUsage, resp.MemoryUsage)
}

gRPC is particularly well-suited for internal microservice communication. Binary serialization via Protocol Buffers makes it fast and compact. HTTP/2 multiplexing allows many streams over one connection. The strongly typed contract enforced by .proto files also provides excellent developer ergonomics in polyglot architectures.

Message Queues and Pub/Sub Brokers

For asynchronous delivery — where the receiving server doesn't need to be online at the moment the event is produced — message brokers are the standard approach. They decouple the producer from the consumer and provide durability, retry logic, and fan-out.

Apache Kafka is the dominant choice for high-throughput event streaming. It persists events to disk in ordered, replayable logs called topics. Consumers can catch up from any offset, making it excellent for architectures that need auditability or event sourcing.

# Producer (Python)
from kafka import KafkaProducer
import json

producer = KafkaProducer(
    bootstrap_servers=['kafka:9092'],
    value_serializer=lambda v: json.dumps(v).encode('utf-8')
)
producer.send('user-events', {'event': 'checkout', 'user_id': 'u123', 'amount': 49.99})
producer.flush()

# Consumer
from kafka import KafkaConsumer
consumer = KafkaConsumer(
    'user-events',
    bootstrap_servers=['kafka:9092'],
    value_deserializer=lambda m: json.loads(m.decode('utf-8'))
)
for message in consumer:
    process_event(message.value)

RabbitMQ is better suited for task queues and flexible routing. It supports exchanges with different routing modes — direct, fanout, topic, and headers — and is widely used for work queues where each message should be processed by exactly one consumer.

# Publisher (Python / pika)
import pika, json

connection = pika.BlockingConnection(pika.ConnectionParameters('rabbitmq'))
channel = connection.channel()
channel.exchange_declare(exchange='notifications', exchange_type='fanout')

channel.basic_publish(
    exchange='notifications',
    routing_key='',
    body=json.dumps({'type': 'payment_confirmed', 'order_id': 'ord_789'})
)

Webhooks deserve a mention here as well. Rather than the consumer server maintaining a persistent connection, the producer simply makes an HTTP POST to a pre-registered URL whenever an event occurs. This is the dominant pattern for third-party integrations (payment processors, GitHub events, Stripe, Twilio, etc.) because it requires no persistent infrastructure on either side. The tradeoff is that the receiving endpoint must be publicly reachable and the producer must handle retries.

Part 3: Server to Android

Android introduces meaningful constraints. Maintaining a persistent TCP connection in the background drains the battery, and Android's OS actively restricts background work. The ecosystem has converged on a clear stack.

Firebase Cloud Messaging (FCM)

FCM is the de facto standard for reaching Android devices from a server. It is Google's cloud-based messaging infrastructure that handles the persistent connection to every Android device so your server doesn't have to.

The architecture involves three parties: your application server, the FCM infrastructure, and the device. Your server sends a payload to FCM's API; FCM routes it to the target device using a persistent, battery-optimized connection maintained by Google Play Services.

The flow:

The Android app registers with FCM on first launch and receives a unique registration token.
The app sends this token to your backend and stores it (typically in a database keyed to the user).
When your server needs to push an update, it posts to FCM's HTTP v1 API with the target token and message payload.
FCM delivers the message to the device, waking the app if necessary.

// Android — FirebaseMessagingService
class MyFirebaseMessagingService : FirebaseMessagingService() {

    // Called when FCM generates a new token (on first install or token refresh)
    override fun onNewToken(token: String) {
        super.onNewToken(token)
        // Send token to your backend so it can reach this device
        sendTokenToServer(token)
    }

    // Called when a message arrives while app is in foreground
    override fun onMessageReceived(remoteMessage: RemoteMessage) {
        remoteMessage.data.isNotEmpty().let {
            val updateType = remoteMessage.data["type"]
            val payload = remoteMessage.data["payload"]
            handleUpdate(updateType, payload)
        }
    }
}

// Server — sending via FCM HTTP v1 API (Node.js)
const admin = require('firebase-admin');
admin.initializeApp({ credential: admin.credential.cert(serviceAccount) });

async function sendUpdate(deviceToken, data) {
    await admin.messaging().send({
        token: deviceToken,
        data: {
            type: 'order_update',
            orderId: 'ord_789',
            status: 'shipped'
        },
        android: {
            priority: 'high',
            ttl: 3600 * 1000 // 1 hour
        }
    });
}

FCM supports two message types. Notification messages are handled automatically by the system — Android displays them in the notification tray without any app code running, which is what most apps use for alerts. Data messages deliver a custom key-value payload to your app's onMessageReceived handler, giving you full control over how to process and display the update. You can combine both in a single message.

Token management is a detail that trips up many implementations. Tokens change when the user reinstalls the app, clears data, or on certain OS events. Your server must handle the UNREGISTERED error from FCM and remove stale tokens, and your app must call onNewToken to push refreshed tokens to the backend.

WebSockets on Android (Foreground Only)

For apps that need real-time streaming while the user is actively using them — a live chat, a trading terminal, a GPS tracker — WebSockets work well on Android. The standard library is OkHttp, which is already a transitive dependency in most Android projects.

// Android WebSocket with OkHttp
val client = OkHttpClient.Builder()
    .pingInterval(30, TimeUnit.SECONDS) // Keep-alive pings
    .build()

val request = Request.Builder().url("wss://api.example.com/stream").build()

val listener = object : WebSocketListener() {
    override fun onOpen(webSocket: WebSocket, response: Response) {
        webSocket.send("""{"action": "subscribe", "channel": "updates"}""")
    }

    override fun onMessage(webSocket: WebSocket, text: String) {
        val update = parseUpdate(text)
        runOnUiThread { updateUI(update) }
    }

    override fun onFailure(webSocket: WebSocket, t: Throwable, response: Response?) {
        // Implement exponential backoff reconnection here
        scheduleReconnect()
    }
}

val webSocket = client.newWebSocket(request, listener)

The critical caveat is lifecycle. Android will kill background services, and a WebSocket in a paused or stopped app is unreliable. For anything that needs to work when the app is not in the foreground, FCM is the right tool.

MQTT — For IoT and Low-Bandwidth Scenarios

MQTT is a lightweight publish/subscribe protocol designed for constrained devices. It runs over TCP and uses a small binary format, making it efficient on poor networks. Apps using Eclipse Paho or HiveMQ client libraries subscribe to topics on a broker; any server can publish to those topics.

// Android MQTT with Paho
val client = MqttAndroidClient(context, "tcp://broker.example.com:1883", clientId)
client.connect()
client.subscribe("sensors/temperature/#", 1) { topic, message ->
    val payload = String(message.payload)
    updateSensorDisplay(topic, payload)
}

MQTT is particularly common in IoT applications where devices are battery-powered or on metered connections, and where the update cadence is high (sensor data every second, for example). For mainstream consumer apps, FCM is a simpler path.

Part 4: Server to iOS

iOS has the most controlled environment of the three. Apple enforces strict rules on background execution and network usage, all in service of battery life and user privacy. The practical ecosystem is narrow but well-designed.

Apple Push Notification service (APNs)

APNs is the only sanctioned way to send server-initiated updates to an iOS device when the app is not in the foreground. There is no alternative. Just like FCM on Android, APNs maintains a persistent, encrypted connection to every Apple device so third-party servers don't have to.

The flow mirrors FCM in structure but differs in the details:

The iOS app requests permission from the user to receive notifications.
On approval, it calls UIApplication.shared.registerForRemoteNotifications().
iOS registers with APNs and returns a device token to application(_:didRegisterForRemoteNotificationsWithDeviceToken:).
The app sends this token to your backend.
Your server constructs a JSON payload and sends it to APNs over HTTP/2, authenticated with a JWT signed by your APNs key.
APNs delivers the notification to the device.

// AppDelegate.swift
func application(_ application: UIApplication,
                 didFinishLaunchingWithOptions launchOptions: [UIApplication.LaunchOptionsKey: Any]?) -> Bool {
    let center = UNUserNotificationCenter.current()
    center.requestAuthorization(options: [.alert, .sound, .badge]) { granted, error in
        guard granted else { return }
        DispatchQueue.main.async {
            UIApplication.shared.registerForRemoteNotifications()
        }
    }
    return true
}

func application(_ application: UIApplication,
                 didRegisterForRemoteNotificationsWithDeviceToken deviceToken: Data) {
    let token = deviceToken.map { String(format: "%02.2hhx", $0) }.joined()
    sendTokenToBackend(token)
}

func application(_ application: UIApplication,
                 didReceiveRemoteNotification userInfo: [AnyHashable: Any],
                 fetchCompletionHandler completionHandler: @escaping (UIBackgroundFetchResult) -> CompletionHandler) {
    // Handle silent background update
    if let type = userInfo["type"] as? String {
        handleSilentUpdate(type: type) { result in
            completionHandler(result)
        }
    }
}

APNs payload structure:

{
  "aps": {
    "alert": {
      "title": "Order Shipped",
      "body": "Your order #789 has been dispatched."
    },
    "sound": "default",
    "badge": 1
  },
  "order_id": "ord_789",
  "status": "shipped"
}

For silent background updates — where you want to update the app's data without showing a visible notification — use content-available: 1 with no alert key and set the priority to 5 (low priority):

{
  "aps": {
    "content-available": 1
  },
  "type": "cache_refresh",
  "resource": "product_catalog"
}

Authentication: Apple strongly recommends using APNs authentication keys (.p8 files) over the older certificate-based approach. Keys never expire and one key works across all your apps in a team.

# Server — sending to APNs (Python with httpx + PyJWT)
import jwt, time, httpx

def create_apns_jwt(key_id: str, team_id: str, private_key: str) -> str:
    return jwt.encode(
        {"iss": team_id, "iat": time.time()},
        private_key,
        algorithm="ES256",
        headers={"alg": "ES256", "kid": key_id}
    )

async def send_apns_notification(device_token: str, payload: dict):
    token = create_apns_jwt(KEY_ID, TEAM_ID, PRIVATE_KEY)
    url = f"https://api.push.apple.com/3/device/{device_token}"

    async with httpx.AsyncClient(http2=True) as client:
        response = await client.post(
            url,
            headers={
                "authorization": f"bearer {token}",
                "apns-topic": "com.example.myapp",
                "apns-push-type": "alert",
                "apns-priority": "10",
            },
            json=payload
        )
    return response.status_code == 200

Note that APNs requires HTTP/2. The httpx library with [http2] extras, or Apple's own apns2 libraries in various languages, handle this correctly.

WebSockets on iOS (Foreground)

For in-app real-time features, iOS 13+ ships URLSessionWebSocketTask natively, eliminating the need for third-party libraries for basic use cases.

// Native WebSocket (iOS 13+)
class RealtimeManager {
    private var webSocketTask: URLSessionWebSocketTask?
    private let session = URLSession(configuration: .default)

    func connect() {
        let url = URL(string: "wss://api.example.com/stream")!
        webSocketTask = session.webSocketTask(with: url)
        webSocketTask?.resume()
        receiveMessage()
    }

    private func receiveMessage() {
        webSocketTask?.receive { [weak self] result in
            switch result {
            case .success(let message):
                switch message {
                case .string(let text):
                    self?.handleUpdate(text)
                case .data(let data):
                    self?.handleBinaryUpdate(data)
                @unknown default: break
                }
                self?.receiveMessage() // Continue listening
            case .failure(let error):
                self?.scheduleReconnect(after: 3.0)
            }
        }
    }

    func disconnect() {
        webSocketTask?.cancel(with: .goingAway, reason: nil)
    }
}

For older iOS versions or more complex scenarios (automatic reconnection, heartbeats, channel-based pub/sub), libraries like Starscream are commonly used.

Combining APNs with In-App Streaming

A well-architected iOS app typically uses both: APNs for background/offline delivery, and WebSockets for in-app streaming. The logic at the app level checks whether a WebSocket connection is active; if so, the update arrives via that channel. If not, APNs wakes the app or delivers a visible notification. Firebase SDKs handle this abstraction automatically when you use the FCM SDK on iOS, routing through APNs under the hood.

Part 5: Choosing the Right Approach

Here's a practical decision guide based on what the software community actually uses:

Scenario	Recommended Approach
Server → Server (low latency, bidirectional)	WebSockets or gRPC streaming
Server → Server (one-way stream)	SSE or gRPC server streaming
Server → Server (async, durable)	Kafka (high throughput) or RabbitMQ (task routing)
Server → Third-party service	Webhooks (HTTP POST)
Server → Android (background)	FCM
Server → Android (foreground, in-app)	WebSocket (OkHttp)
Server → iOS (background/offline)	APNs (directly or via FCM SDK)
Server → iOS (foreground, in-app)	URLSessionWebSocketTask or Starscream
Server → IoT devices	MQTT

Part 6: Production Considerations

Regardless of which technology you pick, several cross-cutting concerns determine whether a real-time system actually holds up in production.

Reconnection and resilience. Networks fail. Connections drop. Every client implementation needs exponential backoff reconnection logic. SSE and FCM handle this automatically; WebSocket implementations on mobile must do it manually. On the server side, design your event delivery to be idempotent — clients may receive the same event more than once after a reconnect.

Missed events. Persistent connections mean that events produced while a client was offline can be missed. SSE's Last-Event-ID header helps for short outages. For mobile, FCM stores up to 100 messages per device and delivers them when the device comes back online (subject to TTL). For critical business events, use a separate REST endpoint the client can call after reconnecting to fetch the state it missed.

Scalability. A single server can maintain tens of thousands of WebSocket connections, but as you scale horizontally, you need a shared pub/sub layer so that a message produced on Server A can reach a client connected to Server B. Redis Pub/Sub and Redis Streams are the most common solutions for this. Kafka is used when you need durability and replay.

Security. WebSocket connections should always use wss:// (TLS). APNs and FCM connections are encrypted by their respective platforms. JWT or session tokens should be validated during the WebSocket/SSE handshake, not just at connection time. For APNs, rotate your .p8 key if it is ever exposed.

Token hygiene for mobile. Device tokens (both FCM and APNs) change and expire. Build logic to handle registration errors returned by FCM (UNREGISTERED) and APNs (BadDeviceToken, Unregistered) and remove stale tokens from your database immediately. Sending to dead tokens wastes quota and can trigger rate limiting.

Conclusion

Real-time server-to-client communication is not a single technology but a landscape of tools, each optimized for a specific environment and use case. WebSockets and SSE dominate server-to-server streaming; gRPC streaming is the preferred choice in microservice architectures; message brokers like Kafka and RabbitMQ handle asynchronous event propagation at scale. On Android, FCM is the unambiguous standard for background delivery, augmented by WebSockets for in-app streaming. On iOS, everything routes through APNs for background delivery, while URLSessionWebSocketTask handles the foreground case.

The key insight is that mobile operating systems impose constraints that make "just keep a connection open" untenable for background use — which is precisely why platform-managed push infrastructure (FCM, APNs) exists. For server-to-server communication where those constraints don't apply, you have far more freedom, and the choice comes down to latency requirements, directionality, durability needs, and operational complexity you're willing to take on.

Pick the smallest, most appropriate tool for each client type, build reconnection and missed-event recovery into every path, and your real-time system will be robust regardless of what the network throws at it.

LinkedIn for Developers: Stop Ghosting Your Own Profile 🚀

Piyush Gupta — Fri, 24 Apr 2026 08:00:00 +0000

We often spend hundreds of hours grinding LeetCode, but only five minutes updating our LinkedIn. In the current job market, that’s a mistake. Your LinkedIn profile is often the first thing a recruiter sees before they even open your PDF resume.

Based on the latest guide from Engineer Talks, here is a systematic way to turn your LinkedIn profile into an interview-generating machine.

1. First Impressions: The "Header" Strategy 🖼️

Recruiters scan profiles in seconds. Your top section needs to tell them exactly who you are.

The Photo: You don't need a tuxedo, but skip the party photos. A clean, semi-formal headshot works best [00:01:08].
The Headline: Don't just put "Student." Be specific: "Looking for SDE Internships | Java, Python, Spring Boot" or "Software Engineer at [Company] | Backend Specialist." This makes you searchable.

2. The "Featured" Section is Your Portfolio 🌟

Most people leave this blank. This is a massive missed opportunity.

What to add: Upload your latest Resume (PDF), a link to your GitHub/Portfolio website, and your best-performing project [00:02:00].
The Benefit: It allows recruiters to see your work immediately without having to message you for a CV first.

3. Experience: Show, Don't Just Tell 🛠️

Don't just list "Software Intern" and the dates.

The Bullet Point Method: For every role, list exactly what you did, which tech stack you used, and what you achieved [00:03:26].
Example: "Developed a web scraper using Python and BeautifulSoup that reduced data entry time by 40%." ## 4. Skills & Social Proof ✅
The Skills section isn't just a list; it’s a ranking system.
Endorsements: Reach out to colleagues or college friends. Endorse them for their top skills, and they’ll likely do the same for you. This "Social Proof" tells recruiters that other people vouch for your expertise [00:05:17].
Strategic Listing: Prioritize technical skills (like React, AWS, or SQL) over generic ones like "Microsoft Word" [00:05:43].

5. The Power of Recommendations 💬

A recommendation is like a mini-review of your work ethic. If you’ve finished a project or internship, ask your manager or teammate for a 2-3 sentence recommendation [00:06:09]. It adds a level of trust that a standard resume simply can't provide.

6. Activity: Be Visible, Not Just Present 📈

LinkedIn’s algorithm rewards activity.

Don't just scroll: Like posts, comment on tech news, or share what you’re currently learning.
Referrals over Applications: Instead of just clicking "Easy Apply," find someone at the company and reach out for a conversation. An active profile makes these cold messages much more successful [00:08:18].

Final Thoughts

Your LinkedIn profile should be a living document, not something you update once a year. By treating it as a dynamic portfolio, you significantly increase your "surface area for luck."

Is your LinkedIn profile up to date? Drop your profile link in the comments if you want a peer review! 👇

Inspired by the Engineer Talks guide on LinkedIn optimization. Watch the full video here.

Top 5 FREE Resources to Master DSA (Stop Paying for Coding Bootcamps! 💸)

Piyush Gupta — Fri, 17 Apr 2026 08:00:00 +0000

One of the biggest myths in tech is that you need a $500+ "Elite" course to pass Big Tech interviews. In reality, the best engineers often use the exact same free resources available to everyone.

Based on the latest guide from Engineer Talks, here are the top 5 free resources and a proven strategy to master DSA from scratch.

The Master Strategy: Don't "Resource Hop" 🚫

Before we dive in, remember this: Mastering one resource is better than finishing 10% of five different ones. Stick to a plan and finish the curriculum before moving on.

1. Pepcoding (Best for Learning Concepts) 🎓

If you are a beginner or find certain topics (like Dynamic Programming) confusing, start here.

Why it works: The explanations break down complex logic into understandable segments.
The Starting Point: Check out the "Level 1" playlist for a complete scratch-to-pro journey.
Deep Dives: They have specialized playlists for Linked Lists, Binary Trees, and DP that are goldmines for interview prep.

2. LeetCode (The Practice Standard) 💻

This is the industry standard for coding practice. Use it to transition from "understanding" a concept to "implementing" it under constraints.

Filtering: Use tags to solve topic-wise problems and sort by difficulty (Start with Easy, move to Medium).
Discussion Forum: If your code is slow, the discussion tab is where you learn how to optimize time and space complexity from the community.

3. GeeksforGeeks (The Company Specific Cheat Code) 🏢

GFG is unbeatable when it comes to Company-Specific preparation.

The Strategy: Two weeks before your interview with a specific company, go to GFG and read their "Interview Experiences".
Patterns: This helps you identify the specific types of questions a company likes to ask (e.g., Amazon loves Trees; Google loves Graphs).

4. InterviewBit (The Time-Management Tool) ⏱️

In a real interview, you usually have 45 minutes to solve 2 problems. Speed and accuracy matter.

Timed Practice: InterviewBit forces you to solve problems against a clock, simulating the pressure of a real interview.
Pro Tip: Find a question on GFG to understand it, then find the same question on InterviewBit to solve it under time pressure.

5. Techie Delight (For "Off-Road" Problems) 🛠️

Once you have exhausted LeetCode or want a fresh challenge, Techie Delight offers unique problems that don't always appear on mainstream platforms. It's a great "final boss" resource before you start your actual interview loop.

Honorable Mentions 🎖️

MyCodeSchool: One of the best places for absolute beginners to learn about Pointers, Recursion, and basic Sorting [00:07:09].

My Recommended Roadmap:

Learn core concepts from Pepcoding.
Practice implementation on LeetCode.
Refine your speed on InterviewBit.
Target specific companies using GeeksforGeeks.

Which resource helped you the most in your last interview? Let’s share the knowledge in the comments! 👇

Inspired by the Engineer Talks guide on free DSA resources. Watch the full video here.

Master-Class: Scaling Databases with Sharding, Partitioning, and Consistent Hashing

Piyush Gupta — Sun, 12 Apr 2026 08:00:00 +0000

Scaling a database is often the "final boss" of system design. While scaling your application layer is as simple as spinning up more containers, databases are stateful, making them significantly harder to distribute.

When vertical scaling hits a physical limit, you have no choice but to scale horizontally. In this guide, we break down the three pillars of horizontal scaling and how they intersect with replication strategies to build truly resilient systems.

Sharding vs. Partitioning: The Core Difference
Partitioning Strategies: Horizontal vs. Vertical
The "Modulo" Problem: Why Simple Hashing Fails
Consistent Hashing: The Secret to Rebalancing
The Hybrid Model: Relating Replication to Sharding
The Complexity Trade-offs

Sharding vs. Partitioning: The Core Difference

While often used interchangeably, they operate at different layers of the stack:

Partitioning: This is a logical split of your data. It’s about taking a giant dataset and breaking it into smaller, manageable chunks (e.g., splitting a 1TB table into ten 100GB chunks).
Sharding: This is a physical split across machines. It involves distributing those partitions across different database instances (shards) to distribute the CPU, RAM, and I/O load.

The Golden Rule: You shard the database and you partition the data.

Partitioning Strategies

How do you decide which data goes into which partition? It must be deterministic.

A. Horizontal Partitioning (Sharding)

You split the table by rows. For example, users with IDs 1–1000 go to Partition A, and 1001–2000 go to Partition B.

Best for: Distributing massive datasets across nodes for write/read throughput.

B. Vertical Partitioning

You split the table by columns. A User table might be split into User_Profile (Name, Email) and User_Metadata (Bio, Settings, Blob data).

Best for: Reducing I/O for queries that only need specific, frequently accessed columns.

The "Modulo" Problem: Why Simple Hashing Fails

When sharding, we need to map a key (like user_id) to a specific server. The simplest way is Modulo Hashing:

server_index = hash(key) \pmod n

Where $n$ is the number of servers. This works until you need to scale. If you have 3 servers ( $n = 3$ ) and you add a 4th ( $n = 4$ ), the result of the modulo operation changes for almost every single key. This forces a massive data migration that can bring your system down.

Consistent Hashing: The Secret to Rebalancing

Consistent Hashing solves the "Modulo Problem" by decoupling the keys from the number of servers using a Hash Ring.

How it Works:

The Ring: Both servers and data keys are hashed onto the same ring (range $0$ to $2^{32}-1$ ).
Clockwise Assignment: To find a key's server, you move clockwise on the ring until you hit the first server.
Minimal Migration: When you add a new server, it only takes over a small portion of the ring. Only the keys between the new server and its counter-clockwise neighbor need to move.

The Hybrid Model: Relating Replication to Sharding

Sharding handles capacity, but Replication handles availability. In a production system, every "Shard" is actually a "Replication Group."

A. Sharded + Single-Leader Replication

Each shard is a replica set with one Leader and multiple Followers.

The Workflow: Consistent hashing sends you to Shard 4. Within Shard 4, all writes go to the Leader, and reads can be distributed to Followers.
Example: Vitess (MySQL) or MongoDB.

B. Sharded + Multi-Leader Replication

Common in multi-region deployments. You shard by geography (e.g., EU users vs. US users), but within each shard, you have multiple leaders (e.g., London and Paris) to reduce local latency.

Example: Global Spanner configurations.

C. Sharded + Leaderless Replication (The "Dynamo" Way)

In systems like Cassandra, sharding and replication are unified on the ring.

The Workflow: A key is hashed to the ring. Instead of going to one node, it is replicated to the next N nodes clockwise.
The Quorum: You use $W + R > N$ to ensure consistency across those sharded replicas.

The Complexity Trade-offs

Sharding is powerful, but it comes with a high "architectural tax":

Cross-Shard Joins: Joining data from Shard A and Shard B is incredibly expensive. You often have to perform the join in the application layer.
Hotspots: If one user (e.g., a celebrity) gets 100x more traffic than others, the shard they reside on will become a bottleneck.
Operational Overhead: Managing backups, monitoring, and failover for 100 shards is significantly more complex than a single instance.

Summary Comparison

Feature	Modulo Hashing	Consistent Hashing
Scaling Impact	~100% data migration	~$1/n$ data migration
Fault Tolerance	Low	High
Replication Choice	Usually Single-Leader	Often Leaderless / Multi-Leader

Final Thought

Sharding is how we find the "house" for our data; Replication is how we ensure that house doesn't burn down. Don't shard until you hit the physical limits of your hardware, but when you do, use Consistent Hashing to ensure your growth is sustainable.

Piyush Gupta

Apr 5

Master-Class: Understanding Database Replication (Single, Multi, and Leaderless)

#systemdesign #backend #database #distributedsystems

Comments

3 min read

How to Use LeetCode Effectively for Interview Prep (The Strategic Way)

Piyush Gupta — Tue, 07 Apr 2026 15:12:32 +0000

Preparing for coding interviews at top-tier companies can be overwhelming. While everyone knows about LeetCode, most people use it inefficiently—simply solving random problems without a clear roadmap.

In this guide, based on insights from the Engineer Talks deep dive, we break down the most effective way to master Data Structures and Algorithms (DSA) without burning out.

1. Quality Over Quantity 📈

One of the biggest mistakes is focusing on the "Total Solved" count. Solving 500 problems without understanding the core patterns won't help you when an interviewer throws a curveball.

The Strategy: Focus on understanding the logic behind every solution.
The "Struggle" Rule: If you are stuck, give it at least 30–45 minutes of pure thought before looking at the hints. If you must look at the solution, don't just copy-paste it. Close the tab and try to implement it from scratch.

2. Topic-Wise Mastery (The "Tag" Strategy) 🏷️

Randomizing problems is for the final stages of prep. When you're starting, you need to build a foundation.

Step-by-Step: Pick one topic (e.g., Linked Lists) and solve 15–20 problems of varying difficulty before moving to the next.
The Benefit: This builds pattern recognition. When you see a new problem in an interview, you’ll immediately recognize it as a "Two Pointer" or "Sliding Window" problem because you've trained your brain to see that specific structure.

3. The "Difficulty Ladder" 🪜

Don't jump straight into "Hard" problems. It’s the fastest way to lose motivation.

Easy: Start here to get familiar with the syntax and basic operations.
Medium: This is the "Sweet Spot." Most interview questions from Big Tech companies fall into this category. You should spend 70% of your time here.
Hard: Tackle these only once you are consistently solving Mediums in under 30 minutes.

4. Leverage the Community 💡

The real gold on LeetCode isn't just the problem statement; it's the Discussion and Solutions tabs.

Optimization: Even if your code passes, check the top-voted solutions. There is almost always a more elegant or optimized way (better Time or Space Complexity) that you hadn't considered.
Alternative Languages: Seeing how a problem is solved in Python vs. C++ can give you a better perspective on language-specific optimizations.

5. Consistency Over Intensity ⏱️

It is better to solve 1 problem a day for 30 days than 30 problems in one weekend and then quitting for a month.

The Daily Challenge: Use the "LeetCode Daily Challenge" to keep your streak alive and force yourself to touch different topics every day.
Mock Contests: Once you're comfortable, participate in Weekly Contests to simulate the time pressure of a real interview.

Final Thoughts

LeetCode is a tool, not a goal. Your objective is to become a better problem solver, not a better "LeetCoder." Focus on the why behind the code, and the offers will follow.

"Don't count the problems, make the problems count."

Have you started your LeetCode journey yet? What’s the hardest topic you’ve encountered so far? Let’s discuss in the comments! 👇

Based on the guide by Engineer Talks. Watch the original video here.

Mastering Schema Evolution: Why Apache Avro is the King of Big Data (Part 2)

Piyush Gupta — Sun, 05 Apr 2026 14:03:38 +0000

The Evolution Nightmare

In Part 1, we saw how Thrift and Protocol Buffers use numeric tags to shrink data. But in a real-world distributed system, you can’t upgrade every microservice at the same time. You will always have Old Code talking to New Code.

This creates a massive problem: Schema Evolution. If Service A adds a new field to its database, will Service B (running the old code) crash when it tries to read that data?

Forward vs. Backward Compatibility

To build a resilient system, you must understand two concepts:

Backward Compatibility: New code can read data written by old code. (Essential when you update your "Readers" first).
Forward Compatibility: Old code can read data written by new code. (Essential when you update your "Writers" first).

In Thrift and Protobuf, this is managed by Tags. If a reader sees a tag it doesn't recognize, it simply ignores it. But what if you want to avoid tags entirely?

3. Apache Avro: The "No Tag" Evolution

Apache Avro was created within the Hadoop ecosystem because Thrift wasn't a perfect fit for massive data files. Unlike Protobuf, Avro does not store tag numbers or field types in the binary data. It only stores the raw values.

How it works: The Pairwise Resolution

How does the reader know what the data is if there are no tags?

Avro uses two schemas:

Writer’s Schema: The schema the application used when it sent the data.
Reader’s Schema: The schema the receiving application expects.

When data is read, the Avro library looks at both schemas side-by-side. If the field order changed or a field was renamed, Avro "resolves" the difference by looking at the field names.

![Diagram: Avro Reader and Writer Schema Resolution Logic]

Implementation Example (Avro JSON Schema)

Avro schemas are written in simple JSON, making them much easier to generate dynamically than the IDLs we saw in Part 1.

user_schema.avsc

{
  "type": "record",
  "name": "User",
  "namespace": "com.piyush.devto",
  "fields": [
    {"name": "username", "type": "string"},
    {"name": "age", "type": ["int", "null"], "default": 0},
    {"name": "email", "type": ["string", "null"]}
  ]
}

Python Implementation:

import avro.schema
from avro.datafile import DataFileWriter
from avro.io import DatumWriter

# 1. Load the schema from a file
schema = avro.schema.parse(open("user_schema.avsc", "rb").read())

# 2. Write binary data to an Avro Object Container File
with DataFileWriter(open("users.avro", "wb"), DatumWriter(), schema) as writer:
    writer.append({
        "username": "Piyush", 
        "age": 25, 
        "email": "piyush@example.com"
    })

print("Data serialized to users.avro")

The Killer Feature: Database Integration

One reason Avro is the "gold standard" for Kafka and Big Data is its relationship with relational databases.

Because Avro schemas are JSON, you can write a script to automatically convert a SQL table into an Avro schema.

SQL Column → Avro Field Name
SQL Data Type → Avro Type
Nullable Column → Avro Union ["type", "null"]

This makes Avro perfect for Change Data Capture (CDC), where you stream every single update from your Postgres or MySQL database into a Data Lake like S3 or Snowflake.

Summary: Choosing Your Weapon

Which encoding should you use for your next project?

Feature	JSON	Protobuf / Thrift	Apache Avro
Best For	Public APIs	Internal Microservices	Big Data / Pipelines
Speed	Slowest	Fast	Fastest (no tags)
Schema	Optional	Required (IDL)	Required (JSON)
Logic	Human-readable	Tag-based	Resolution-based

Final Thoughts

Encoding isn't just about saving bytes; it's about defining the contract between your services.

Use JSON when you need ease of use.
Use Protobuf for gRPC and internal speed.
Use Avro when your data is massive and your schemas are constantly evolving.

What are you using in production? Let's discuss in the comments!



---

### Why this works for dev.to:
* **Series Link:** The `series` tag in the metadata automatically links Part 1 and Part 2 on the platform.
* **Liquid Tags:** I used blockquotes and bolded text to highlight "Forward/Backward" compatibility—a common interview question for senior devs.
* **Conclusion Table:** Dev.to readers love a quick "Cheat Sheet" or comparison table to wrap up a long-form post.

Beyond JSON: A High-Performance Guide to Thrift & Protocol Buffers (Part 1)

Piyush Gupta — Sun, 05 Apr 2026 14:02:03 +0000

The "Text-Based" Performance Tax

Most of us start our careers in the land of JSON. It’s human-readable, it’s the language of the web, and it’s incredibly easy to debug. But as your system scales from a few hundred requests to hundreds of thousands per second, JSON starts to reveal its "tax."

Why JSON/XML is Killing Your Throughput:

Redundancy: Every single message repeats the keys. If you send a list of 1,000 users, you are sending the string "username" 1,000 times. That’s pure overhead.
Parsing Overhead: Converting a string like "12345.67" into a 64-bit float is a CPU-intensive operation. In high-performance systems, the time spent parsing JSON often exceeds the time spent processing the actual business logic.
Binary Inefficiency: If you need to send binary data (like a profile picture or a byte array), you must use Base64 encoding, which increases the data size by approximately 33%.

The Solution: Binary Encoding. By using a schema-based binary format, we can strip away the field names and focus entirely on the data.

1. Apache Thrift: The Multi-Protocol Powerhouse

Originally developed at Facebook to solve cross-language service communication, Apache Thrift is both an encoding format and a full RPC (Remote Procedure Call) framework.

The Magic of the IDL

Thrift uses an Interface Definition Language (IDL). Instead of defining your data in code, you define it in a .thrift file. This acts as the single source of truth for all your services, whether they are written in Python, Go, or Java.

Binary Protocol vs. Compact Protocol

Thrift offers different ways to "pack" your data:

Binary Protocol: A simple, fast approach that encodes data in a straightforward binary format without much compression.
Compact Protocol: This is where the efficiency shines. It uses Variable-length integers (Varints). For example, the number 7 only takes 1 byte, while 7,000,000 takes significantly more. It also packs field IDs and data types into a single byte whenever possible.

Implementation Example (Thrift IDL)

// user_profile.thrift
namespace java com.piyush.devto
namespace py devto.piyush

struct UserProfile {
  1: required string username,
  2: optional i32 age,
  3: optional bool is_active = true,
  4: optional list<string> tags
}

Python Serialization Code:

from thrift.protocol import TCompactProtocol
from thrift.transport import TTransport
from devto.piyush.ttypes import UserProfile

# Creating a sample object
user = UserProfile(username="Piyush", age=25, tags=["distributed-systems", "backend"])

# Step 1: Initialize a memory buffer
transport = TTransport.TMemoryBuffer()

# Step 2: Use the Compact Protocol
protocol = TCompactProtocol.TCompactProtocol(transport)

# Step 3: Write (Serialize) the object to the buffer
user.write(protocol)

# Get the raw bytes
payload = transport.getvalue()
print(f"Total encoded size: {len(payload)} bytes")

2. Protocol Buffers (Protobuf): The Google Standard

If you’ve ever looked into gRPC, you’ve encountered Protocol Buffers. Developed by Google, it is arguably the most popular binary encoding format in the industry today.

The "Tag" System: Why Order Matters

In Protobuf, the name of the field (username) is never sent over the wire. Instead, Protobuf uses Tags (unique numbers assigned to each field).

When the encoder sees string username = 1;, it simply writes: [Field Tag 1] [Data Length] [Value].

Critical Rule: Once you assign a tag number (like 1), you can never change it. If you change a tag, you break the ability for your services to understand each other.

Visualizing the Binary Layout

While JSON looks like a mess of curly braces and quotes, a binary message looks like a streamlined stream of bits.

Tag & Type	Length/Value	Meaning
`0x12`	`0x06`	Field #1 is a String of 6 bytes
`0x50 0x69 0x79...`	"Piyush"	The actual data
`0x18`	`0x19`	Field #2 is an Integer (25)

Implementation Example (Protobuf)

syntax = "proto3";

package devto.piyush;

message UserProfile {
  string username = 1;
  int32 age = 2;
  bool is_active = 3;
  repeated string tags = 4;
}

Java Implementation:

// Building the object
UserProfile user = UserProfile.newBuilder()
    .setUsername("Piyush")
    .setAge(25)
    .setIsActive(true)
    .addTags("java")
    .addTags("grpc")
    .build();

// Serialization to byte array
byte[] binaryData = user.toByteArray();

// Deserialization on the other end
UserProfile receivedUser = UserProfile.parseFrom(binaryData);
System.out.println("User: " + receivedUser.getUsername());

Summary of Part 1

By moving from JSON to a format like Thrift or Protobuf, you aren't just saving a few bytes. You are:

Lowering Latency: Binary data is parsed up to 10x faster than text.
Reducing Infrastructure Costs: Less bandwidth means lower cloud egress bills.
Enforcing Type Safety: Your API becomes a contract that cannot be easily broken.

Coming up in Part 2: We will dive into Schema Evolution (how to update your data without crashing your app) and why Apache Avro is the undisputed king of Big Data and Kafka pipelines.

B-Trees, Clustered Indexes, and the OLAP Revolution (Part 2) 📊

Piyush Gupta — Sun, 05 Apr 2026 13:52:02 +0000

In Part 1, we looked at LSM Trees—the write-heavy champions found in NoSQL databases. But if you’re using PostgreSQL, MySQL, or Oracle, you’re likely interacting with a different beast: the B-Tree.

Today, we’ll explore why B-Trees still dominate the relational world and how the "Big Data" era forced us to rethink how we store rows entirely.

1. The King of RDBMS: B-Trees
2. Clustered vs. Non-Clustered Indexes
3. OLTP vs. OLAP: The Great Divide
4. Why Column-Oriented Storage Wins at Scale

1. The King of RDBMS: B-Trees

B-Trees are the most widely used indexing structure in history. Unlike the variable-size segments in LSM Trees, B-Trees break the database down into fixed-size pages (usually 4KB to 16KB).

How it Works:

A B-Tree is a balanced tree where each node contains multiple keys and pointers to child pages.

The Root: The entry point for every query.
Leaf Nodes: These contain the actual data or a reference to where the data lives on disk.

Because the tree is always balanced, a B-Tree with a branching factor of 500 and just 4 levels can store 256 billion rows! This makes lookups incredibly consistent at $O(\log n)$.

B-Trees vs. LSM Trees: The Trade-off

Feature	B-Trees	LSM Trees
Best for	Read-heavy workloads	Write-heavy workloads
Fragmentation	High (due to empty page space)	Low (sequential background merges)
Throughput	Lower (multiple disk seeks)	Higher (sequential appends)

2. Clustered vs. Non-Clustered Indexes

Where does the actual row live? This is a common interview question that boils down to index design:

Clustered Index: The index is the data. The leaf nodes of the B-Tree contain the actual row values. You can only have one clustered index per table (usually the Primary Key).
Non-Clustered (Secondary): The index contains a "pointer" (like a Row ID) to the data's location. This allows for multiple indexes but requires an extra "hop" to fetch the full row.

3. OLTP vs. OLAP: The Great Divide

Most web developers spend their time in OLTP (Online Transaction Processing). You handle thousands of small queries: "Update this user’s bio" or "Add this item to the cart."

However, businesses eventually need OLAP (Online Analytical Processing). This involves massive aggregate queries like: "What was the total revenue in Q4 across all Asian markets?"

Running these on your production database will cause it to crawl. Instead, we move data to a Data Warehouse using ETL (Extract, Transform, Load).

4. Why Column-Oriented Storage Wins at Scale

Traditional databases store data Row-by-Row. To calculate an average age, the DB loads the entire row (Name, Email, Bio, Password) just to access one tiny "Age" integer.

Column-Oriented Storage (BigQuery, Snowflake, ClickHouse) stores each column in its own file.

The Code Reality:

Imagine a table with 100 columns and 1 billion rows.

-- Row-oriented: Reads 100 columns from disk per row.
SELECT AVG(age) FROM users;

-- Column-oriented: Reads ONLY the 'age' file. 
-- It ignores the other 99 columns completely.

By only reading the necessary bytes, analytical queries that used to take hours now take seconds. Furthermore, because column data is often repetitive (e.g., many users living in the same "City"), these files compress significantly better than row-based data.

💬 Engineering Challenge

Most modern architectures use Polyglot Persistence—an RDBMS (B-Trees) for user transactions and a Data Warehouse (Columnar) for analytics.

Join the conversation:

Have you ever crashed a production DB by running a heavy "Analytical" query during peak hours?
What’s your "Big Data" tool of choice—Snowflake, BigQuery, or maybe self-hosted ClickHouse?

Drop a comment below with your horror stories or favorite setups! 🛠️✨

How Databases Actually Work: From Log Files to LSM Trees (Part 1) 🚀

Piyush Gupta — Sun, 05 Apr 2026 13:49:52 +0000

We often treat databases like PostgreSQL, MySQL, or MongoDB as magic black boxes. We send a query, and data comes back. But what is actually happening on the disk?

If you've ever wondered why some databases are "fast for writes" while others are "fast for reads," the answer lies in the Storage Engine.

In this post, we’re going to build a database from scratch and evolve it into a production-grade LSM Tree.

1. The World’s Simplest Database
2. Adding an Index (The Hash Map)
3. Solving the Space Crisis: Compaction
4. The Power of SSTables and LSM Trees

1. The World’s Simplest Database

The simplest way to store data is to just append it to a text file. No complex schemas, just raw speed.

# Simple Key-Value Store in Bash
db_set() {
  echo "$1,$2" >> database.db
}

db_get() {
  # We use 'tail -n 1' to get the most recent update for that key
  grep "^$1," database.db | sed "s/^$1,//" | tail -n 1
}

The Verdict:

Writes: O(1) — Extremely fast. You just append to the end of the file.
Reads: O(n) — Terrible. To find one key, you have to scan the entire file from start to finish.

2. Adding an Index (The Hash Map)

To fix the $O(n)$ read problem, we use a Hash Index. Think of this as a "Table of Contents" kept in your server's RAM.

# Conceptual In-Memory Index
index = {
  "user_123": 0,    # Byte offset in the file
  "user_456": 64,
  "user_789": 128
}

def get_data(key):
    offset = index[key]
    file.seek(offset)
    return file.read_line()

This is how Bitcask (the default storage engine for Riak) works. It’s incredibly fast, but there’s a catch: All your keys must fit in RAM. If you have billions of keys, your server will crash.

3. Solving the Space Crisis: Compaction

Since we only append to our log, the file grows forever even if we're just updating the same key. Databases solve this via Compaction.

We break the log into segments. Once a segment reaches a certain size, we close it and start a new one. A background process then merges these segments, throwing away old, overwritten values.

4. The Power of SSTables and LSM Trees

What if we store our data files sorted by key? This is a Sorted String Table (SSTable). Sorting allows us to merge segments efficiently (like Merge Sort) and perform range queries.

The Modern Architecture:

Memtable: All writes go to a balanced tree in memory (AVL or Red-Black Tree).
SSTable: When the Memtable gets too big, we flush it to disk as a sorted file.
WAL (Write-Ahead Log): Before writing to the Memtable, we append the operation to a "crash-recovery" log on disk.

class StorageEngine:
    def write(self, key, value):
        # 1. Append to WAL (for crash recovery)
        self.wal.append(key, value)

        # 2. Add to Memtable
        self.memtable.add(key, value)

        if self.memtable.is_full():
            self.memtable.flush_to_sstable_on_disk()

LSM Trees (Log-Structured Merge Trees) are used by Cassandra, RocksDB, and LevelDB. They are the kings of write-heavy workloads!

💬 Let's Discuss!

Have you ever run into a situation where your database writes were lagging? Did you realize your storage engine might be the bottleneck?

Question for the comments: If you were building a high-frequency trading app with millions of updates per second, would you choose an LSM-based store or a traditional RDBMS? Why?

Master-Class: Understanding Database Replication (Single, Multi, and Leaderless)

Piyush Gupta — Sun, 05 Apr 2026 13:39:36 +0000

Database Replication is the process of keeping a copy of the same data on multiple nodes. Whether you are aiming for high availability, reduced latency, or horizontal scalability, choosing the right replication algorithm is critical.

In this guide, we will explore the three primary algorithms used in modern distributed systems: Single Leader, Multi-Leader, and Leaderless.

Single Leader Replication
Multi-Leader Replication
Leaderless Replication
The Replication Lag Problem
Summary Comparison

1. Single Leader Replication

This is the most common approach (used by MySQL, PostgreSQL, and MongoDB). One node is designated as the leader (master), and all other nodes are followers (read replicas).

How it Works

Writes: All write requests must be sent to the leader. The leader writes the data locally and sends the change to all followers.
Reads: Clients can read from the leader or any follower.

Synchronous vs. Asynchronous

Synchronous: The leader waits for followers to confirm the write.
- Pros: Guaranteed consistency.
- Cons: High latency; if one node fails, the whole write pipeline blocks.
Asynchronous: The leader confirms the write immediately.
- Pros: High performance.
- Cons: Risk of data loss if the leader fails before followers sync.

Handling Failures

Follower Failure: A follower "catches up" by using its local log to request missing data from the leader.
Leader Failure (Failover): Requires detecting failure via timeouts, electing a new leader, and reconfiguring the system.

2. Multi-Leader Replication

In this setup, more than one node can accept writes. This is typically used for applications spread across multiple geographic data centers.

Use Cases

Multi-Data Center Operation: Users write to the nearest data center to reduce latency.
Offline Operation: Apps like calendars or note-taking tools act as local "leaders" that sync with a server later.

The Challenge: Conflict Resolution

If two users edit the same data in different data centers simultaneously, a conflict occurs.

Conflict Avoidance: Routing all writes for a specific record to the same leader.
Convergence: Using Last Write Wins (LWW) or Conflict-free Replicated Data Types (CRDTs) to merge changes.

3. Leaderless Replication

Popularized by Amazon’s Dynamo, this approach allows any node to accept writes and reads. Systems like Cassandra and Riak use this model.

Quorums ($n, w, r$)

To maintain consistency without a leader, these systems use quorums:

$n$: Total number of replicas.
$w$: Nodes that must confirm a write.
$r$: Nodes that must be queried for a read.
The Rule: For a successful read of the latest data, $w + r > n$.

Fixing Stale Data

Since nodes can go down, systems fix stale data via:

Read Repair: When a client detects an old version during a read, it pushes the newer value back to that node.
Anti-Entropy: A background process that constantly syncs data between replicas.

The Replication Lag Problem

Regardless of the algorithm, asynchronous replication often results in "replication lag." To maintain a good user experience, developers should implement:

Read-Your-Own-Writes: Ensures a user always sees the updates they just made.
Monotonic Reads: Ensures a user doesn't see data "disappear" when querying different replicas.
Consistent Prefix Reads: Guarantees that if writes happen in a specific order, they are read in that same order.

Summary Comparison

Algorithm	Best For	Main Downside
Single Leader	Read-heavy apps, general simplicity	Leader is a single point of failure for writes
Multi-Leader	Multi-region apps, offline capabilities	Extremely complex conflict resolution
Leaderless	High write throughput, high availability	Complexities in eventual consistency