<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Vivek Upadhyay</title>
    <description>The latest articles on DEV Community by Vivek Upadhyay (@creator79).</description>
    <link>https://dev.to/creator79</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F530437%2F074c9b9f-abc9-429b-8770-7d5bcd27c57b.png</url>
      <title>DEV Community: Vivek Upadhyay</title>
      <link>https://dev.to/creator79</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/creator79"/>
    <language>en</language>
    <item>
      <title>How Databases Store Data: B+ Tree Explained Simply for Beginners (With Real-World Examples)</title>
      <dc:creator>Vivek Upadhyay</dc:creator>
      <pubDate>Fri, 03 Apr 2026 05:38:17 +0000</pubDate>
      <link>https://dev.to/creator79/how-databases-store-data-b-tree-explained-simply-for-beginners-with-real-world-examples-2jmf</link>
      <guid>https://dev.to/creator79/how-databases-store-data-b-tree-explained-simply-for-beginners-with-real-world-examples-2jmf</guid>
      <description>&lt;p&gt;&lt;strong&gt;Description:&lt;/strong&gt; &lt;em&gt;Learn how B+ Trees work in databases like MySQL, PostgreSQL, and MongoDB. This beginner-friendly guide explains database indexing, disk I/O, leaf nodes, range queries, and why B+ Trees are the industry standard — with relatable real-world examples.&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  🎯 What Will You Learn From This Blog?
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;✅ &lt;strong&gt;Understand why&lt;/strong&gt; databases can't simply store data in a plain file — and what breaks when they try.&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Learn what a B+ Tree is&lt;/strong&gt;, how it works step by step, and why it powers almost every database you'll ever touch — MySQL, PostgreSQL, MongoDB, SQLite, and more.&lt;/li&gt;
&lt;li&gt;✅ &lt;strong&gt;Master core database concepts&lt;/strong&gt; like disk I/O, leaf nodes, non-leaf nodes, range queries, and logarithmic search — explained so simply you could teach them to your non-tech friend over chai.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  🔤 SECTION 1 — Jargon Buster (Glossary for Beginners)
&lt;/h2&gt;

&lt;p&gt;Before we jump into how databases store data, let's kill every confusing term upfront. Read this section like a mini-dictionary.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; Database&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; A digital cupboard where your app stores and finds information.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; Think of your &lt;strong&gt;phone's contact list&lt;/strong&gt;. All names, numbers, and photos are stored in one organized place. That organized place is a database.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; SQL (Structured Query Language)&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; A language you use to talk to a database — ask questions, add data, or delete data.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; When you type a name in your phone's &lt;strong&gt;contact search bar&lt;/strong&gt;, you're basically running a query. SQL is how developers write that search instruction for a database.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; NoSQL&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; A different style of database that doesn't require fixed rows and columns like a spreadsheet.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; SQL is like a &lt;strong&gt;school attendance register&lt;/strong&gt; — fixed columns: Roll No, Name, Present/Absent. NoSQL is like a &lt;strong&gt;personal diary&lt;/strong&gt; — each page can look different. One page has a list, another has a paragraph, another has a drawing.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; MongoDB&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; A popular NoSQL database that stores data in flexible documents (like JSON files).&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; Imagine each student in a school has a &lt;strong&gt;flexible profile folder&lt;/strong&gt;. One student's folder has 5 pages, another has 8. Some have photos, some don't. MongoDB lets data be flexible like that.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; Storage Engine (e.g., WiredTiger)&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; The behind-the-scenes system inside a database that decides HOW data is physically saved on your hard disk and HOW it's found.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; Think of a &lt;strong&gt;grocery store manager&lt;/strong&gt;. Customers don't care how products are arranged in the warehouse. The manager (storage engine) decides which shelf gets which item and knows the fastest way to find anything.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; B+ Tree&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; A smart tree-shaped structure that databases use to organize data so finding anything takes just a few steps, even among billions of records.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; Think of the &lt;strong&gt;table of contents in a textbook&lt;/strong&gt;. Instead of flipping every page to find "Chapter 7: Photosynthesis," you go to the table of contents → Unit 3 → Chapter 7 → Page 142. A B+ Tree works like a multi-level table of contents.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; O(n) Complexity (Big O Notation)&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; A way to describe speed. O(n) means "if data doubles, the work doubles too." It's slow for big data.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; Imagine you lost your keys somewhere in your house. You check &lt;strong&gt;every room, every drawer, every pocket&lt;/strong&gt; one by one. If your house is twice as big, it takes twice as long. That's O(n).&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; Logarithmic Complexity — O(log n)&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; Even if data grows massively, the work barely increases. Super fast.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; You're playing the &lt;strong&gt;"guess my number between 1 and 100"&lt;/strong&gt; game. You say 50, friend says "higher." You say 75, "lower." You say 62, "higher." Each guess cuts possibilities in HALF. Even for 1 to 1 billion, you only need ~30 guesses. That's O(log n).&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; Disk I/O (Input/Output)&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; Reading data from or writing data to a hard drive. It's one of the SLOWEST things a computer does.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; Your &lt;strong&gt;RAM&lt;/strong&gt; is like the kitchen counter — everything you need is right there, instantly. Your &lt;strong&gt;hard drive&lt;/strong&gt; is like the storage room in the basement. Every time you need an ingredient from the basement, you walk down, grab it, walk back up. Each trip = one disk I/O. Trips are slow. You want fewer trips.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; 4 KB Disk Block (Page)&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; The smallest chunk of data a hard drive reads at once — usually 4,096 bytes.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; Imagine a &lt;strong&gt;vending machine that only drops items in packs of 10&lt;/strong&gt;. Even if you need just 1 candy, you get a pack of 10. Similarly, even if you need 1 byte of data, the disk gives you a whole 4 KB block.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; Leaf Node&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; The bottom-level boxes in a B+ Tree where the actual data (real database rows) lives.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; In a &lt;strong&gt;mall directory&lt;/strong&gt;, the leaf node is the &lt;strong&gt;actual shop&lt;/strong&gt; where you buy things. The signs just pointed you there.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; Non-Leaf Node (Internal Node)&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; The upper-level boxes in a B+ Tree that give directions but DON'T hold any real data.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; The &lt;strong&gt;floor directory in a mall&lt;/strong&gt;: "Clothes → Floor 2, Electronics → Floor 3." These signs don't sell anything. They just point you the right way.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; Indexing&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; Creating a shortcut map so the database doesn't scan every row to find what you need.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; The &lt;strong&gt;index at the back of a textbook.&lt;/strong&gt; Instead of reading 500 pages to find "Newton's Laws," you check the index: "Newton's Laws — Page 87." Jump there directly.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; Linear Scan&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; Checking every single record from start to finish. Slow and painful with big data.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; Looking for your friend in a &lt;strong&gt;crowd of 10,000 people&lt;/strong&gt; by walking up to each person and asking "Are you Rahul?" One. By. One.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; Range Query&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; Asking the database for all records between two values. Example: "All orders from January to March."&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; Telling a shopkeeper: &lt;strong&gt;"Show me all T-shirts between ₹500 and ₹1000."&lt;/strong&gt; The shopkeeper finds the ₹500 section, then grabs everything up to ₹1000.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; Point Lookup&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; Asking the database for ONE specific record.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; Telling a shopkeeper: &lt;strong&gt;"Give me the blue Nike T-shirt, size M, product code NKE-2847."&lt;/strong&gt; One exact item.&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Term:&lt;/strong&gt; Rebalancing&lt;br&gt;
&lt;strong&gt;Simple Definition:&lt;/strong&gt; When you add or remove data, the B+ Tree automatically reorganizes itself to stay evenly structured.&lt;br&gt;
&lt;strong&gt;Real-World Example:&lt;/strong&gt; Imagine a classroom where one row has 15 students and another has 3. The teacher &lt;strong&gt;redistributes students&lt;/strong&gt; so every row has roughly the same number of students. That's rebalancing.&lt;/p&gt;



&lt;p&gt;Now you speak the language. Let's dive in! 🚀&lt;/p&gt;


&lt;h2&gt;
  
  
  📖 SECTION 2 — Main Content: Understanding How Databases Store Data
&lt;/h2&gt;


&lt;h3&gt;
  
  
  Topic 1: Why Storing Data in a Simple File Fails (The Naive Approach)
&lt;/h3&gt;
&lt;h4&gt;
  
  
  Explanation
&lt;/h4&gt;

&lt;p&gt;When you're new to coding, the first idea for storing data sounds perfectly logical: &lt;strong&gt;just write everything into a file, line by line.&lt;/strong&gt; A CSV file, a text file, a JSON file — just append records sequentially.&lt;/p&gt;

&lt;p&gt;For 10 records? Works great. For 10 million records? Complete disaster. Here's why:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Searching is painfully slow.&lt;/strong&gt; There's no shortcut. You scan from line 1 until you find what you need.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Inserting in sorted order means rewriting the file.&lt;/strong&gt; Adding a record in the middle forces you to shift everything after it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Deleting leaves gaps&lt;/strong&gt; that must be closed by rewriting.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Updating is risky.&lt;/strong&gt; If the new data is bigger than the old data, it won't fit in the same space.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All these operations are &lt;strong&gt;O(n)&lt;/strong&gt; — meaning if your data doubles, the time doubles too. That's a death sentence for performance at scale.&lt;/p&gt;
&lt;h4&gt;
  
  
  📱 Real-World Example: Your Phone's Contact List (Without Search)
&lt;/h4&gt;

&lt;p&gt;Imagine your phone stored contacts in a simple notepad file — just a list:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Amit - 9876543210
Priya - 9123456789
Rahul - 9988776655
... (50,000 contacts)
Zara - 9111222333
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now you want to call &lt;strong&gt;Rahul&lt;/strong&gt;. Without a search feature, you'd scroll from &lt;strong&gt;Amit&lt;/strong&gt; all the way down, checking each name, until you find Rahul. With 50,000 contacts, this could take minutes.&lt;/p&gt;

&lt;p&gt;Now imagine &lt;strong&gt;adding a new contact "Mohit"&lt;/strong&gt; in alphabetical order. You'd have to move every contact from "N" to "Z" one position down to make space. That's rewriting thousands of entries.&lt;/p&gt;

&lt;p&gt;This is exactly the problem databases face when they store data in a simple file.&lt;/p&gt;

&lt;h4&gt;
  
  
  Technical Example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;File: users.dat

Offset 0:      {id: 1, name: "Alice", city: "NYC"}
Offset 100:    {id: 2, name: "Bob", city: "LA"}
Offset 200:    {id: 3, name: "Charlie", city: "Chicago"}
...
Offset 9999900: {id: 100000, name: "Zara", city: "Miami"}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Search for &lt;code&gt;id = 50000&lt;/code&gt;:&lt;/strong&gt;&lt;br&gt;
No index exists. The database reads from offset 0 and checks each record: id=1? No. id=2? No. id=3? No... all the way to id=50000. That's &lt;strong&gt;50,000 sequential record reads&lt;/strong&gt; for one lookup; finding the very last record would take 100,000.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Insert &lt;code&gt;id = 1.5&lt;/code&gt; (between Alice and Bob):&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Find the insertion point.&lt;/li&gt;
&lt;li&gt;Shift &lt;strong&gt;every record&lt;/strong&gt; after offset 100 forward by 100 bytes.&lt;/li&gt;
&lt;li&gt;Write the new record.&lt;/li&gt;
&lt;li&gt;Essentially &lt;strong&gt;rewrite 99,999 records&lt;/strong&gt;. O(n).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Update &lt;code&gt;name: "Bob"&lt;/code&gt; → &lt;code&gt;name: "Alexander"&lt;/code&gt;:&lt;/strong&gt;&lt;br&gt;
"Bob" = 3 characters. "Alexander" = 9 characters. The new name doesn't fit in the same space. Every record after Bob must be shifted by 6 bytes. Again, O(n).&lt;/p&gt;
&lt;h4&gt;
  
  
  📊 Image
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faaifu6tb1ivqitalkcfw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faaifu6tb1ivqitalkcfw.png" alt=" " width="800" height="880"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  Why Does This Matter for Developers?
&lt;/h4&gt;

&lt;p&gt;If you've ever built a small project that reads/writes to a JSON file, you've used this approach. It works fine for a to-do app with 50 tasks. But at a company — an e-commerce platform, a banking app, a food delivery service — you're dealing with &lt;strong&gt;millions of records&lt;/strong&gt;. A linear scan that takes 30 seconds per query will make your app unusable. That's why every production database uses something smarter.&lt;/p&gt;


&lt;h3&gt;
  
  
  Topic 2: What Is a B+ Tree? (The Core Data Structure Behind Every Database)
&lt;/h3&gt;
&lt;h4&gt;
  
  
  Plain English Explanation
&lt;/h4&gt;

&lt;p&gt;A &lt;strong&gt;B+ Tree&lt;/strong&gt; is the data structure that solves all the problems we just discussed. It's a &lt;strong&gt;tree-shaped index&lt;/strong&gt; that organizes your data into levels, so the database can jump straight to the right location in just 3-4 steps — even with billions of records.&lt;/p&gt;

&lt;p&gt;Here's the key idea:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;A B+ Tree is like a multi-level navigation system. At each level, you eliminate a MASSIVE chunk of irrelevant data. Within 3-4 levels, you've found your exact record.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The tree has three types of nodes:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Root node&lt;/strong&gt; (the top — your starting point)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Internal nodes&lt;/strong&gt; (the middle — signposts that guide you)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leaf nodes&lt;/strong&gt; (the bottom — where the actual data lives)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This structure gives &lt;strong&gt;O(log n) search performance&lt;/strong&gt; — meaning even with 1 billion records, you find anything in 3-4 steps.&lt;/p&gt;
&lt;h4&gt;
  
  
  🛒 Real-World Example: Finding a Product in a Supermarket
&lt;/h4&gt;

&lt;p&gt;You walk into &lt;strong&gt;a massive supermarket&lt;/strong&gt; with 100,000 products. You need to find &lt;strong&gt;Maggi noodles&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without a B+ Tree (naive approach):&lt;/strong&gt; You start at Aisle 1, Shelf 1 and walk through every aisle, checking every product. "Rice? No. Bread? No. Shampoo? No." You'd check thousands of products before finding Maggi. This could take an hour.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With a B+ Tree (smart approach):&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Level 1 — Store entrance sign (Root):&lt;/strong&gt; "Groceries → Left side. Electronics → Right side. Clothing → Upstairs."

&lt;ul&gt;
&lt;li&gt;You go left. ✅&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Level 2 — Groceries section sign (Internal node):&lt;/strong&gt; "Snacks &amp;amp; Instant Food → Aisle 7. Dairy → Aisle 12. Beverages → Aisle 15."

&lt;ul&gt;
&lt;li&gt;You go to Aisle 7. ✅&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Level 3 — Aisle 7 shelf label (Internal node):&lt;/strong&gt; "Chips → Top shelf. Noodles → Middle shelf. Biscuits → Bottom shelf."

&lt;ul&gt;
&lt;li&gt;You look at the middle shelf. ✅&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Level 4 — Middle shelf (Leaf node):&lt;/strong&gt; There's Maggi! Right there, between Yippee and Top Ramen. 🎉&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Four steps. That's it.&lt;/strong&gt; In a store with 100,000 products. That's the power of a B+ Tree.&lt;/p&gt;
&lt;h4&gt;
  
  
  Technical Example
&lt;/h4&gt;

&lt;p&gt;A simplified B+ Tree storing employee IDs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;                      [50]                    ← Root Node
                     /    \
              [20, 35]    [65, 80]           ← Internal Nodes
             /   |   \    /   |   \
     [10,15,20] [25,30,35] [40,45,50] [55,60,65] [70,75,80] [85,90,95]
                                                                  ↑
                                                           Leaf Nodes
                                                    (actual data lives here)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Searching for Employee ID = 75:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Root [50]:&lt;/strong&gt; Is 75 &amp;gt; 50? Yes → go right.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Internal [65, 80]:&lt;/strong&gt; Is 75 &amp;gt; 65 and ≤ 80? Yes → go to the middle child (65 itself sits in the left leaf, so the test is strictly greater than 65).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leaf [70, 75, 80]:&lt;/strong&gt; Scan this small node → Found ID 75! ✅&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Only 3 steps&lt;/strong&gt; to find one record among potentially millions. Compare that to scanning every record one by one.&lt;/p&gt;
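&lt;p&gt;The 3-step walk above can be sketched in Python. This is a toy model of the exact tree in the diagram, assuming each key is the inclusive upper bound of its child subtree (the &lt;code&gt;search&lt;/code&gt; helper is illustrative):&lt;/p&gt;

```python
# A sketch of the 3-step search above, using the exact tree from the
# diagram. Keys act as inclusive upper bounds for each child subtree.
tree = {
    "keys": [50],
    "children": [
        {"keys": [20, 35],
         "children": [[10, 15, 20], [25, 30, 35], [40, 45, 50]]},
        {"keys": [65, 80],
         "children": [[55, 60, 65], [70, 75, 80], [85, 90, 95]]},
    ],
}

def search(node, target):
    """Walk down the tree; count how many nodes we touch."""
    steps = 1
    while isinstance(node, dict):  # still on an internal node
        idx = 0
        while idx != len(node["keys"]) and target > node["keys"][idx]:
            idx += 1               # pick the child whose range covers target
        node = node["children"][idx]
        steps += 1
    return (target in node), steps  # node is now a leaf list

found, steps = search(tree, 75)
print(found, steps)  # prints: True 3  (root, internal, leaf)
```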

&lt;h4&gt;
  
  
  📊 Image
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbxtl8msh5dp83zrhzs3e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbxtl8msh5dp83zrhzs3e.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Why Does This Matter for Developers?
&lt;/h4&gt;

&lt;p&gt;Every time you write this SQL:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_user_email&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The database builds a B+ Tree behind the scenes. The keys in the tree are email addresses, and the leaf nodes point to (or contain) the actual rows. When you later search:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'rahul@gmail.com'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;MySQL doesn't scan 10 million rows. It traverses the B+ Tree in 3-4 steps and returns Rahul's record in &lt;strong&gt;milliseconds.&lt;/strong&gt; Now you know what's happening under the hood!&lt;/p&gt;




&lt;h3&gt;
  
  
  Topic 3: Why B+ Trees Use 4 KB Disk Blocks (Database Page Size Explained)
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Plain English Explanation
&lt;/h4&gt;

&lt;p&gt;Here's a fact most beginners don't learn until much later: &lt;strong&gt;your hard drive can't read just 1 byte of data.&lt;/strong&gt; Every time you ask the disk for anything — even a single character — it reads a whole &lt;strong&gt;block&lt;/strong&gt; of data, typically &lt;strong&gt;4 KB (4,096 bytes).&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is because physically moving the disk read-head is expensive and slow. Since you're making the trip anyway, might as well bring back a full chunk of data.&lt;/p&gt;

&lt;p&gt;B+ Trees are designed to exploit this. &lt;strong&gt;Each node in a B+ Tree is exactly the size of one disk block (4 KB or more).&lt;/strong&gt; So every time the database reads one node, it uses exactly one disk I/O — no wasted reads, no partial reads.&lt;/p&gt;

&lt;p&gt;This is why B+ Trees are so efficient with disk-based storage.&lt;/p&gt;

&lt;h4&gt;
  
  
  🍕 Real-World Example: Ordering Pizza for a Group
&lt;/h4&gt;

&lt;p&gt;Imagine you're ordering pizza for a party. The pizza place charges a &lt;strong&gt;flat ₹50 delivery fee per trip&lt;/strong&gt;, regardless of how many pizzas you order.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Bad approach:&lt;/strong&gt; Order 1 pizza, pay ₹50 delivery. Need another? Order again, pay ₹50 again. 10 friends = 10 trips = ₹500 in delivery alone. 😩&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Smart approach:&lt;/strong&gt; Figure out how many pizzas fit in the delivery bag (let's say 5) and order 5 at once. 10 friends = 2 trips = ₹100 delivery. ✅&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The &lt;strong&gt;delivery trip&lt;/strong&gt; is like a disk I/O (slow and expensive). The &lt;strong&gt;delivery bag capacity&lt;/strong&gt; is like the 4 KB disk block. B+ Trees &lt;strong&gt;stuff each node with as much useful data as possible&lt;/strong&gt; so every "trip" (disk read) brings back maximum value.&lt;/p&gt;

&lt;h4&gt;
  
  
  Technical Example
&lt;/h4&gt;

&lt;p&gt;Let's calculate how powerful this is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each key in our B+ Tree = 8 bytes (an integer like a user ID).&lt;/li&gt;
&lt;li&gt;Each pointer to a child = 8 bytes.&lt;/li&gt;
&lt;li&gt;One entry = key + pointer = 16 bytes.&lt;/li&gt;
&lt;li&gt;One disk block = 4,096 bytes.&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Entries per node = 4,096 ÷ 16 = 256.&lt;/strong&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So each non-leaf node can have &lt;strong&gt;256 children.&lt;/strong&gt; Let's see how the tree scales:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Tree Level&lt;/th&gt;
&lt;th&gt;Nodes at This Level&lt;/th&gt;
&lt;th&gt;Total Records Reachable&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Level 1 (Root)&lt;/td&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;256 paths&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Level 2&lt;/td&gt;
&lt;td&gt;256&lt;/td&gt;
&lt;td&gt;65,536 paths&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Level 3&lt;/td&gt;
&lt;td&gt;65,536&lt;/td&gt;
&lt;td&gt;16,777,216 paths&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Level 4 (Leaves)&lt;/td&gt;
&lt;td&gt;16,777,216&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;~16.7 million records&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Just 4 disk reads can search through 16.7 MILLION records.&lt;/strong&gt; And since each leaf page actually holds many rows, the real reach is even higher.&lt;/p&gt;

&lt;p&gt;And here's the bonus: the root node is almost always &lt;strong&gt;cached in RAM&lt;/strong&gt; (because it's accessed so frequently). So it's really &lt;strong&gt;3 disk reads&lt;/strong&gt; for 16.7 million records.&lt;/p&gt;

&lt;p&gt;With InnoDB (MySQL's default engine), the default page size is 16 KB, which means even higher branching factors and even fewer levels needed.&lt;/p&gt;
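&lt;p&gt;You can verify the branching math above with a few lines of Python (the helper names are ours, for illustration only):&lt;/p&gt;

```python
# Back-of-the-envelope math from the bullet list above: how many
# leaf paths a B+ Tree can reach for a given page size and key width.
def fanout(page_bytes, key_bytes=8, pointer_bytes=8):
    """How many key+pointer entries fit in one disk page."""
    return page_bytes // (key_bytes + pointer_bytes)

def reachable(page_bytes, levels):
    """Leaf slots addressable after `levels` levels of branching."""
    return fanout(page_bytes) ** levels

print(fanout(4096))        # 256 entries per 4 KB node
print(reachable(4096, 3))  # 16777216 paths after three 256-way levels
print(fanout(16384))       # 1024 entries with InnoDB's 16 KB pages
```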

&lt;h4&gt;
  
  
  📊 Image
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjd7ow5soodwet8lclw1x.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjd7ow5soodwet8lclw1x.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Why Does This Matter for Developers?
&lt;/h4&gt;

&lt;p&gt;When you see MySQL settings like &lt;code&gt;innodb_page_size = 16384&lt;/code&gt; or PostgreSQL's &lt;code&gt;block_size = 8192&lt;/code&gt;, now you understand what they mean. These settings control the node size of the B+ Tree. Larger page sizes = more keys per node = fewer levels = fewer disk reads = faster queries.&lt;/p&gt;

&lt;p&gt;This also explains why &lt;strong&gt;adding too many indexes can slow down writes.&lt;/strong&gt; Each index is a separate B+ Tree. Every insert into the table means updating multiple B+ Trees. More trees = more disk writes.&lt;/p&gt;




&lt;h3&gt;
  
  
  Topic 4: Leaf Nodes in B+ Trees — Where Your Actual Data Lives
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Plain English Explanation
&lt;/h4&gt;

&lt;p&gt;In a B+ Tree, there's a strict rule:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;ALL actual data rows are stored ONLY in the leaf nodes — the very bottom level of the tree.&lt;/strong&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The nodes above (root, internal nodes) only contain &lt;strong&gt;keys and pointers&lt;/strong&gt; — navigation information. They're like road signs. The actual destination is always at the bottom.&lt;/p&gt;

&lt;p&gt;And here's the most important feature: &lt;strong&gt;leaf nodes are linked together like a chain.&lt;/strong&gt; Each leaf node has a pointer to the next leaf node. This creates a &lt;strong&gt;sorted linked list&lt;/strong&gt; at the bottom of the tree.&lt;/p&gt;

&lt;p&gt;This chain is what makes range queries incredibly fast — but we'll get to that soon.&lt;/p&gt;

&lt;h4&gt;
  
  
  🏬 Real-World Example: A Shopping Mall
&lt;/h4&gt;

&lt;p&gt;Think of a &lt;strong&gt;3-floor shopping mall:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Entrance board (Root):&lt;/strong&gt; A big board at the entrance says "Food → Floor 1, Clothes → Floor 2, Electronics → Floor 3." This board doesn't sell anything. It just gives directions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Floor signs (Internal):&lt;/strong&gt; On Floor 2, signs say "Men's Clothing → Left Wing, Women's Clothing → Right Wing." Again, no products here. Just directions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Shops (Leaf):&lt;/strong&gt; &lt;strong&gt;This is where the actual products are.&lt;/strong&gt; You walk in, pick up a shirt, and buy it.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now imagine all the shops on a floor are &lt;strong&gt;connected by a corridor.&lt;/strong&gt; You can walk from Shop 1 → Shop 2 → Shop 3 → Shop 4 without going back to the signs. That corridor is the &lt;strong&gt;linked list between leaf nodes.&lt;/strong&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Technical Example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Leaf 1              Leaf 2              Leaf 3              Leaf 4
┌──────────────┐   ┌──────────────┐   ┌──────────────┐   ┌──────────────┐
│ ID:1  Amit    │──→│ ID:4  Deepa  │──→│ ID:7  Gaurav │──→│ ID:10 Jaya   │
│ ID:2  Bhavna  │   │ ID:5  Esha   │   │ ID:8  Harsh  │   │ ID:11 Kiran  │
│ ID:3  Chirag  │   │ ID:6  Farhan │   │ ID:9  Isha   │   │ ID:12 Laksh  │
└──────────────┘   └──────────────┘   └──────────────┘   └──────────────┘
       ↑                  ↑                  ↑                   ↑
   Actual rows!      Actual rows!       Actual rows!        Actual rows!

   ──→ = "Next" pointer (linked list connecting all leaves)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Key points:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Every leaf contains real data&lt;/strong&gt; — names, IDs, everything.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Arrows (→) connect each leaf to the next&lt;/strong&gt; — sorted, sequential.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Non-leaf nodes above only have keys like [4], [7], [10]&lt;/strong&gt; — just enough to guide you down to the right leaf.&lt;/li&gt;
&lt;/ul&gt;
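&lt;p&gt;A minimal Python sketch of the leaf chain above (the &lt;code&gt;Leaf&lt;/code&gt; class is illustrative, not a real engine's structure) shows how a range scan simply walks the linked list without climbing back up the tree:&lt;/p&gt;

```python
# The linked leaves above, as a sketch: each leaf holds sorted rows
# plus a pointer to the next leaf, so a range scan never climbs back up.
class Leaf:
    def __init__(self, rows):
        self.rows = rows   # sorted (id, name) pairs
        self.next = None   # pointer to the next leaf in key order

leaf1 = Leaf([(1, "Amit"), (2, "Bhavna"), (3, "Chirag")])
leaf2 = Leaf([(4, "Deepa"), (5, "Esha"), (6, "Farhan")])
leaf3 = Leaf([(7, "Gaurav"), (8, "Harsh"), (9, "Isha")])
leaf1.next, leaf2.next = leaf2, leaf3

def range_query(start_leaf, lo, hi):
    """Collect names with lo ≤ id ≤ hi by walking the leaf chain."""
    out, leaf = [], start_leaf
    while leaf is not None:
        for rec_id, name in leaf.rows:
            if rec_id > hi:
                return out  # past the range: stop, since rows are sorted
            if rec_id >= lo:
                out.append(name)
        leaf = leaf.next
    return out

print(range_query(leaf1, 2, 7))
# ['Bhavna', 'Chirag', 'Deepa', 'Esha', 'Farhan', 'Gaurav']
```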

&lt;h4&gt;
  
  
  📊 Image
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fykl3l2jyktkkckw7nhgq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fykl3l2jyktkkckw7nhgq.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Why Does This Matter for Developers?
&lt;/h4&gt;

&lt;p&gt;In MySQL InnoDB, the &lt;strong&gt;primary key index is a clustered index&lt;/strong&gt; — meaning the leaf nodes of the primary key's B+ Tree contain the &lt;strong&gt;entire row&lt;/strong&gt; of data. When you run:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;42&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The database traverses the B+ Tree, reaches the leaf node, and finds the &lt;strong&gt;complete row&lt;/strong&gt; (id, name, email, everything) right there. No extra lookup needed.&lt;/p&gt;

&lt;p&gt;This is also why &lt;strong&gt;choosing a good primary key matters.&lt;/strong&gt; An auto-incrementing integer ID keeps insertions sequential (always at the end of the leaf chain), which avoids expensive rebalancing. A random UUID as a primary key causes random inserts throughout the tree, leading to more splits and slower writes.&lt;/p&gt;




&lt;h3&gt;
  
  
  Topic 5: Non-Leaf Nodes — The Signpost System That Guides Every Search
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Plain English Explanation
&lt;/h4&gt;

&lt;p&gt;Non-leaf nodes are the &lt;strong&gt;upper levels&lt;/strong&gt; of the B+ Tree — the root and internal nodes. Their ONLY job is to &lt;strong&gt;point you in the right direction.&lt;/strong&gt; They say, "The record you're looking for is somewhere in THAT subtree."&lt;/p&gt;

&lt;p&gt;They contain:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Keys&lt;/strong&gt; — boundary values that divide the data range.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Pointers&lt;/strong&gt; — addresses pointing to child nodes.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;They do NOT contain actual data rows. Think of them as the table of contents of a book — helpful for navigation, but you can't read the actual chapter content from the table of contents.&lt;/p&gt;

&lt;h4&gt;
  
  
  🗺️ Real-World Example: Google Maps Navigation
&lt;/h4&gt;

&lt;p&gt;Imagine you're using &lt;strong&gt;Google Maps&lt;/strong&gt; to drive from Delhi to a specific restaurant in Mumbai.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Level 1 (Root — Broad direction):&lt;/strong&gt; "Head South on NH48 toward Maharashtra."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Level 2 (Internal — Getting closer):&lt;/strong&gt; "Take the Mumbai exit. Enter Western Express Highway."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Level 3 (Internal — Almost there):&lt;/strong&gt; "Turn left on Linking Road, Bandra."&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Level 4 (Leaf — Destination):&lt;/strong&gt; "You've arrived! The restaurant is on your right." 🎉&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each navigation instruction (turn-by-turn direction) is a &lt;strong&gt;non-leaf node.&lt;/strong&gt; It doesn't have your food — it just tells you where to go. The &lt;strong&gt;restaurant&lt;/strong&gt; (where you actually eat) is the leaf node.&lt;/p&gt;

&lt;h4&gt;
  
  
  Technical Example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Non-leaf node: [30 | 60]

            /         |         \
           ↓          ↓          ↓
       Child A     Child B     Child C
    (keys &amp;lt; 30)  (30 ≤ keys &amp;lt; 60)  (keys ≥ 60)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Searching for key = 45:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Compare 45 with 30 → 45 ≥ 30, so skip Child A.&lt;/li&gt;
&lt;li&gt;Compare 45 with 60 → 45 &amp;lt; 60, so go to Child B.&lt;/li&gt;
&lt;li&gt;Child B might be another non-leaf (continue navigating) or a leaf (data found!).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Here's the search logic in simple pseudocode:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;search_b_plus_tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_key&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;is_leaf&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="c1"&gt;# We're at the bottom! Scan this node for the key.
&lt;/span&gt;        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;target_key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt;  &lt;span class="c1"&gt;# Found it! 🎉
&lt;/span&gt;        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;  &lt;span class="c1"&gt;# Not here.
&lt;/span&gt;
    &lt;span class="c1"&gt;# Non-leaf node: find which child to visit
&lt;/span&gt;    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;keys&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;target_key&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;key&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;search_b_plus_tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;children&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;target_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="c1"&gt;# Target is bigger than all keys → go to the last child
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;search_b_plus_tree&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;node&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;children&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;target_key&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At each non-leaf node, a few quick in-memory key comparisons choose exactly one child to follow, eliminating a massive portion of the tree per disk read. That's what gives B+ Trees their O(log n) speed.&lt;/p&gt;

&lt;h4&gt;
  
  
  📊 Image
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkh45vyp1d5tvyh4n1oh8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkh45vyp1d5tvyh4n1oh8.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Why Does This Matter for Developers?
&lt;/h4&gt;

&lt;p&gt;When you run &lt;code&gt;EXPLAIN&lt;/code&gt; on a SQL query and see the query plan:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;EXPLAIN&lt;/span&gt; &lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;orders&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;order_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1234&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the output shows &lt;code&gt;type: ref&lt;/code&gt; or &lt;code&gt;type: const&lt;/code&gt; (instead of &lt;code&gt;type: ALL&lt;/code&gt;), it means MySQL is using the non-leaf nodes of a B+ Tree index to navigate directly to the right leaf node. &lt;code&gt;type: ALL&lt;/code&gt; means it's doing a &lt;strong&gt;full table scan&lt;/strong&gt; — ignoring the tree entirely. Understanding this helps you &lt;strong&gt;read query plans&lt;/strong&gt; and &lt;strong&gt;debug slow queries&lt;/strong&gt; at work.&lt;/p&gt;




&lt;h3&gt;
  
  
  Topic 6: How B+ Trees Speed Up Find Operations (Database Query Performance)
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Plain English Explanation
&lt;/h4&gt;

&lt;p&gt;Let's be very explicit about the performance difference between a naive file scan and a B+ Tree search.&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Naive file:&lt;/strong&gt; O(n) — check every record. 1 million records = up to 1 million reads.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;B+ Tree:&lt;/strong&gt; O(log n) — traverse 3-4 levels. 1 million records = 3-4 reads.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The "log" base here isn't 2 (like in a binary tree). It's typically &lt;strong&gt;256 or higher&lt;/strong&gt; (the branching factor of the tree). That means each step eliminates not half, but &lt;strong&gt;99.6%&lt;/strong&gt; of the remaining options.&lt;/p&gt;

&lt;h4&gt;
  
  
  📱 Real-World Example: Finding a Song on Your Phone
&lt;/h4&gt;

&lt;p&gt;You have &lt;strong&gt;10,000 songs&lt;/strong&gt; on your phone.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without search (linear scan):&lt;/strong&gt; You scroll through your entire music library, song by song, reading each title: "Aadat? No. Believer? No. Closer? No..." until you find "Shape of You." Could take 10 minutes of scrolling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With search (B+ Tree-like approach):&lt;/strong&gt; You type "Shape" in the search bar. Instantly, the phone narrows down to songs starting with "Sha..." then "Shape..." and shows you "Shape of You" in &lt;strong&gt;under 1 second.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;That's the difference between O(n) and O(log n). The search bar uses an index (similar to a B+ Tree) to skip straight to the answer.&lt;/p&gt;

&lt;h4&gt;
  
  
  Technical Example
&lt;/h4&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Number of Records&lt;/th&gt;
&lt;th&gt;Linear Scan (O(n))&lt;/th&gt;
&lt;th&gt;B+ Tree Search (O(log₂₅₆ n))&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1,000&lt;/td&gt;
&lt;td&gt;Up to 1,000 disk reads&lt;/td&gt;
&lt;td&gt;1-2 disk reads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;100,000&lt;/td&gt;
&lt;td&gt;Up to 100,000 disk reads&lt;/td&gt;
&lt;td&gt;2-3 disk reads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;10,000,000&lt;/td&gt;
&lt;td&gt;Up to 10,000,000 disk reads&lt;/td&gt;
&lt;td&gt;3 disk reads&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;1,000,000,000&lt;/td&gt;
&lt;td&gt;Up to 1,000,000,000 disk reads&lt;/td&gt;
&lt;td&gt;3-4 disk reads&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Look at the last row. &lt;strong&gt;1 billion records.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Linear scan: potentially &lt;strong&gt;1 billion disk reads.&lt;/strong&gt; At 10ms per read, that's &lt;strong&gt;115 days.&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;B+ Tree: &lt;strong&gt;4 disk reads.&lt;/strong&gt; At 10ms per read, that's &lt;strong&gt;40 milliseconds.&lt;/strong&gt;
&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;115 days vs 40 milliseconds.&lt;/strong&gt; That's not an improvement — it's a miracle.&lt;/p&gt;

&lt;h4&gt;
  
  
  📊 Image
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F52c6bpnin7deh5s0kdep.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F52c6bpnin7deh5s0kdep.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Why Does This Matter for Developers?
&lt;/h4&gt;

&lt;p&gt;This is why every experienced developer and DBA says: &lt;strong&gt;"Add an index on columns you filter by."&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Without an index:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- No index on "email" → full table scan → O(n) → SLOW 🐌&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'rahul@gmail.com'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With an index:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- Create a B+ Tree index on "email"&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_email&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;email&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="c1"&gt;-- Now the same query uses the B+ Tree → O(log n) → FAST ⚡&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;users&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;email&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s1"&gt;'rahul@gmail.com'&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The same query can drop from &lt;strong&gt;tens of seconds to a few milliseconds.&lt;/strong&gt; Same query, same data, just a B+ Tree index doing its magic.&lt;/p&gt;




&lt;h3&gt;
  
  
  Topic 7: How B+ Trees Handle Insert, Update, and Delete Efficiently
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Plain English Explanation
&lt;/h4&gt;

&lt;p&gt;In a naive file, any modification (insert, update, delete) potentially means &lt;strong&gt;rewriting the entire file.&lt;/strong&gt; That's O(n).&lt;/p&gt;

&lt;p&gt;In a B+ Tree, modifications are &lt;strong&gt;targeted&lt;/strong&gt; — you only touch the specific nodes involved:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Insert:&lt;/strong&gt; Navigate to the correct leaf node (O(log n)), add the record. If the leaf is full, it &lt;strong&gt;splits&lt;/strong&gt; into two. The parent gets a new key. Occasionally, splits can ripple upward (called &lt;strong&gt;rebalancing&lt;/strong&gt;), but this is rare.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Delete:&lt;/strong&gt; Navigate to the leaf (O(log n)), remove the record. If the leaf becomes too empty, it &lt;strong&gt;merges&lt;/strong&gt; with a neighbor or borrows records from an adjacent leaf.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Update:&lt;/strong&gt; Navigate to the leaf (O(log n)), change the data in place. If the key itself changes, it's a delete + insert.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key point: &lt;strong&gt;You modify 1-2 nodes out of potentially millions.&lt;/strong&gt; The rest of the tree stays untouched.&lt;/p&gt;

&lt;h4&gt;
  
  
  📝 Real-World Example: A Class Seating Chart
&lt;/h4&gt;

&lt;p&gt;Imagine a classroom with &lt;strong&gt;10 rows of benches, 5 students per row.&lt;/strong&gt; Students are seated alphabetically.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Naive approach (flat file):&lt;/strong&gt; All 50 students sit in ONE long row on the floor, alphabetically. A new student "Mohit" joins. Everyone from "N" to "Z" has to stand up and shift one seat to the right. That's 20+ students moving. 😩&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;B+ Tree approach (organized benches):&lt;/strong&gt; Students sit on benches (leaf nodes), 5 per bench, alphabetically. Mohit joins? Find the right bench (the one with "Kumar, Laksh, Meera, Nisha"). There's space! Mohit sits down between Meera and Nisha. Done. Only 2 students on that bench shuffled slightly. Nobody else in the class moved. ✅&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What if that bench is full?&lt;/strong&gt; The teacher splits the bench into two: 3 students on the old bench, 3 on a new bench. The class map (non-leaf node) is updated to show the new bench. That's it. The rest of the class doesn't notice.&lt;/p&gt;

&lt;h4&gt;
  
  
  Technical Example
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Inserting ID=37 into our B+ Tree:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;              [30 | 60]
             /    |    \
    [10,20,30]  [40,50]  [70,80,90]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Navigate. 37 ≥ 30 and 37 &amp;lt; 60 → middle child [40, 50].&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; Insert 37 → leaf becomes [37, 40, 50]. Still within capacity (max 3 entries). Done! ✅&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;              [30 | 60]
             /    |    \
    [10,20,30]  [37,40,50]  [70,80,90]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;What if the leaf was full?&lt;/strong&gt; Say max capacity = 3, and the leaf was [40, 45, 50]. Inserting 37 creates [37, 40, 45, 50] — too many!&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Split the leaf:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Left half: [37, 40]&lt;/li&gt;
&lt;li&gt;Right half: [45, 50]&lt;/li&gt;
&lt;li&gt;Middle key (45) is &lt;strong&gt;pushed up&lt;/strong&gt; to the parent.
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;              [30 | 45 | 60]
             /    |    |    \
    [10,20,30]  [37,40] [45,50]  [70,80,90]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Only 2 nodes were modified&lt;/strong&gt; — the split leaf and its parent. The rest of the tree (potentially millions of nodes) stayed untouched.&lt;/p&gt;
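&lt;p&gt;The insert-then-split behavior can be sketched in a few lines of Python (a toy model using the example's capacity of 3, not real InnoDB code):&lt;/p&gt;

```python
import bisect

CAPACITY = 3  # max keys per leaf, as in the example above

def insert_into_leaf(leaf, key):
    """Insert key into a sorted leaf. On overflow, split the leaf and return
    (left, right, copied_key); otherwise return (leaf, None, None)."""
    bisect.insort(leaf, key)
    if len(leaf) <= CAPACITY:
        return leaf, None, None
    mid = len(leaf) // 2
    left, right = leaf[:mid], leaf[mid:]
    return left, right, right[0]  # right's first key is copied up to the parent

# Room in the leaf: no split needed.
print(insert_into_leaf([40, 50], 37))      # ([37, 40, 50], None, None)

# Full leaf: split into two halves, and 45 goes up to the parent.
print(insert_into_leaf([40, 45, 50], 37))  # ([37, 40], [45, 50], 45)
```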

&lt;h4&gt;
  
  
  📊 Image
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fumkped3u4g5yww690une.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fumkped3u4g5yww690une.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Why Does This Matter for Developers?
&lt;/h4&gt;

&lt;p&gt;Understanding insert performance helps you make better design decisions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Auto-increment primary keys&lt;/strong&gt; (1, 2, 3, 4...) are great for B+ Trees because new records always go to the END of the leaf chain. No splits needed in the middle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Random UUIDs as primary keys&lt;/strong&gt; cause inserts to land all over the tree, triggering frequent page splits and rebalancing. This is why UUID primary keys are often measurably slower for write-heavy tables.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Bulk inserts&lt;/strong&gt; (loading millions of rows at once) often bypass the normal B+ Tree insert path and use a special "bulk load" process that builds the tree bottom-up. That's why &lt;code&gt;LOAD DATA INFILE&lt;/code&gt; in MySQL is way faster than millions of individual &lt;code&gt;INSERT&lt;/code&gt; statements.&lt;/li&gt;
&lt;/ul&gt;




&lt;h3&gt;
  
  
  Topic 8: Range Queries — The B+ Tree's Superpower (Why Linked Leaf Nodes Matter)
&lt;/h3&gt;

&lt;h4&gt;
  
  
  Plain English Explanation
&lt;/h4&gt;

&lt;p&gt;This is where B+ Trees truly dominate. A &lt;strong&gt;range query&lt;/strong&gt; asks: "Give me everything between X and Y."&lt;/p&gt;

&lt;p&gt;Here's how a B+ Tree handles it:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Step 1:&lt;/strong&gt; Use the tree to navigate to the &lt;strong&gt;first record&lt;/strong&gt; in the range. This takes O(log n) — a few hops down the tree.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Step 2:&lt;/strong&gt; Now, simply &lt;strong&gt;walk the linked list of leaf nodes&lt;/strong&gt; forward, reading record after record, until you pass the end of the range.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Step 3:&lt;/strong&gt; Stop.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You never need to go back up the tree. You never need to re-navigate. The horizontal chain of leaf nodes gives you a &lt;strong&gt;sorted highway&lt;/strong&gt; through the data.&lt;/p&gt;

&lt;p&gt;This is something a &lt;strong&gt;hash index&lt;/strong&gt; CANNOT do. Hash indexes are great for "find this exact value" (point lookups) but useless for "find everything between A and B" because hashing destroys sorted order.&lt;/p&gt;
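&lt;p&gt;A tiny Python comparison makes this concrete: a dict (hash-style) must touch every entry to answer a range, while sorted keys plus &lt;code&gt;bisect&lt;/code&gt; mimic the B+ Tree's descend-then-scan pattern (a sketch of the idea, not any engine's internals):&lt;/p&gt;

```python
import bisect

data = {17: "pen", 3: "book", 42: "lamp", 25: "mug", 9: "cup"}  # hash-style

# Hash index: answering "keys between 10 and 30" means checking EVERY entry.
hash_result = sorted(k for k in data if 10 <= k <= 30)

# Sorted keys (like linked B+ Tree leaves): binary-search to the start of the
# range, then read forward until past the end.
keys = sorted(data)                   # [3, 9, 17, 25, 42]
start = bisect.bisect_left(keys, 10)  # the O(log n) "tree descent"
range_result = []
for k in keys[start:]:                # forward scan along the "leaf chain"
    if k > 30:
        break
    range_result.append(k)

print(hash_result, range_result)      # same answer, very different costs
```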

&lt;h4&gt;
  
  
  🎬 Real-World Example: Netflix "Continue Watching"
&lt;/h4&gt;

&lt;p&gt;Imagine Netflix organizes all its shows in alphabetical order on a long digital shelf. You want to watch everything starting from "D" to "G" (Dark, Emily in Paris, Friends, Game of Thrones...).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Without linked leaf nodes:&lt;/strong&gt; You'd search for "Dark" (go through the navigation tree). Then go BACK to the start and search for the next show. Then go back AGAIN and search for the next one. For 50 shows between D and G, that's 50 full tree traversals. 😩&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;With linked leaf nodes (B+ Tree):&lt;/strong&gt; You search for "Dark" ONCE (tree traversal). Then you just &lt;strong&gt;slide right&lt;/strong&gt; along the connected shelf: Dark → Derry Girls → Emily in Paris → Friends → Game of Thrones → stop (we've passed "G"). One search + a simple forward scan. ⚡&lt;/p&gt;

&lt;h4&gt;
  
  
  Technical Example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;B+ Tree with linked leaves:

                    [50]
                   /    \
             [25]        [70 | 90]
            /    \      /    |    \
   [10,20] → [30,40] → [50,60] → [70,80] → [90,100]
     L1        L2         L3        L4         L5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;Query:&lt;/strong&gt; &lt;code&gt;SELECT * FROM products WHERE price BETWEEN 30 AND 70;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1 — Tree navigation to find &lt;code&gt;price = 30&lt;/code&gt;:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Root [50]: 30 &amp;lt; 50 → go left.&lt;/li&gt;
&lt;li&gt;Node [25]: 30 ≥ 25 → go right child.&lt;/li&gt;
&lt;li&gt;Arrive at Leaf L2 [30, 40]. Found the start! ✅&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Step 2 — Walk the linked list:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;L2: Read price=30 ✅, price=40 ✅&lt;/li&gt;
&lt;li&gt;Follow link → L3: Read price=50 ✅, price=60 ✅&lt;/li&gt;
&lt;li&gt;Follow link → L4: Read price=70 ✅, price=80 ❌ (80 &amp;gt; 70, stop!)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Result:&lt;/strong&gt; Records with prices 30, 40, 50, 60, 70.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Total disk I/Os:&lt;/strong&gt; 3 for the tree traversal (root → internal node → L2) + 2 for following leaf links to L3 and L4 = &lt;strong&gt;5 disk reads.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Without the linked list, finding these 5 records would require &lt;strong&gt;5 separate tree traversals&lt;/strong&gt; = 5 × 3 = 15 disk reads. With the linked list: &lt;strong&gt;5 reads.&lt;/strong&gt; That's two-thirds fewer disk I/Os.&lt;/p&gt;

&lt;p&gt;For large ranges (say, returning 10,000 records), the savings are even more dramatic.&lt;/p&gt;
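&lt;p&gt;The two-step walkthrough above can be sketched directly in Python (toy classes with illustrative names; real engines work on disk pages, not Python objects):&lt;/p&gt;

```python
class Leaf:
    """A leaf holding sorted keys plus a link to its right sibling."""
    def __init__(self, keys):
        self.keys = keys
        self.next = None

# The five leaves from the diagram, chained left to right.
leaves = [Leaf(k) for k in ([10, 20], [30, 40], [50, 60], [70, 80], [90, 100])]
for left, right in zip(leaves, leaves[1:]):
    left.next = right

def range_scan(start_leaf, lo, hi):
    """Step 2: walk the leaf chain, collecting keys until one exceeds hi."""
    results, leaf = [], start_leaf
    while leaf is not None:
        for k in leaf.keys:
            if k > hi:
                return results  # passed the end of the range: stop
            if k >= lo:
                results.append(k)
        leaf = leaf.next
    return results

# Step 1 (the tree descent) lands on L2; Step 2 walks L2 -> L3 -> L4.
print(range_scan(leaves[1], 30, 70))   # [30, 40, 50, 60, 70]
```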

&lt;h4&gt;
  
  
  📊 Image
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feex1kh3yhw4i1fkuut3f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feex1kh3yhw4i1fkuut3f.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Why Does This Matter for Developers?
&lt;/h4&gt;

&lt;p&gt;Range queries are &lt;strong&gt;the most common query type&lt;/strong&gt; in real applications:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Real Use Case&lt;/th&gt;
&lt;th&gt;SQL Query&lt;/th&gt;
&lt;th&gt;Range Type&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Show last month's orders&lt;/td&gt;
&lt;td&gt;&lt;code&gt;WHERE order_date BETWEEN '2024-11-01' AND '2024-11-30'&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Date range&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Products in a price range&lt;/td&gt;
&lt;td&gt;&lt;code&gt;WHERE price BETWEEN 500 AND 2000&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Numeric range&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Users who signed up recently&lt;/td&gt;
&lt;td&gt;&lt;code&gt;WHERE created_at &amp;gt;= '2024-12-01'&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Open-ended range&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Students with marks 60-80&lt;/td&gt;
&lt;td&gt;&lt;code&gt;WHERE marks BETWEEN 60 AND 80&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;Numeric range&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every &lt;code&gt;BETWEEN&lt;/code&gt;, &lt;code&gt;&amp;gt;&lt;/code&gt;, &lt;code&gt;&amp;lt;&lt;/code&gt;, &lt;code&gt;&amp;gt;=&lt;/code&gt;, &lt;code&gt;&amp;lt;=&lt;/code&gt; in your SQL relies on this linked leaf node structure. &lt;strong&gt;If there's no index on the column, the database can't use a B+ Tree, and it falls back to a full table scan.&lt;/strong&gt; Now you know exactly why indexes matter — and WHY they work!&lt;/p&gt;




&lt;h2&gt;
  
  
  🔗 SECTION 3 — How It All Connects (The Big Picture)
&lt;/h2&gt;

&lt;p&gt;Let's tie everything together with one complete story.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Story of FoodDash — A Food Delivery App Database
&lt;/h3&gt;

&lt;p&gt;You're a developer building &lt;strong&gt;FoodDash&lt;/strong&gt;, a food delivery app (like Swiggy or Zomato). Your app has &lt;strong&gt;20 million restaurants&lt;/strong&gt; across India. Each restaurant has: &lt;code&gt;id&lt;/code&gt;, &lt;code&gt;name&lt;/code&gt;, &lt;code&gt;cuisine&lt;/code&gt;, &lt;code&gt;rating&lt;/code&gt;, &lt;code&gt;city&lt;/code&gt;, &lt;code&gt;price_for_two&lt;/code&gt;.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;🏗️ Day 1: The Naive Start&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Your junior developer stores everything in a JSON file:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Sharma Ji Dhaba"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"cuisine"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"North Indian"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"rating"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"city"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Delhi"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"price_for_two"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;400&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Dosa Corner"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"cuisine"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"South Indian"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"rating"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;4.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"city"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Bangalore"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"price_for_two"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;300&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"id"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;20000000&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"name"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Pasta Palace"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"cuisine"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Italian"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"rating"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;3.8&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"city"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"Mumbai"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"price_for_two"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1200&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A user in Bangalore searches for "South Indian restaurants rated above 4.0." The server reads all 20 million records, one by one, checking each. &lt;strong&gt;Takes 45 seconds.&lt;/strong&gt; By then, the user has already switched to Swiggy. 😤&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;⚙️ Day 30: Migration to MySQL with B+ Trees&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You migrate to MySQL (InnoDB engine). The database automatically creates a B+ Tree on the primary key (&lt;code&gt;id&lt;/code&gt;). You also create indexes on &lt;code&gt;rating&lt;/code&gt; and &lt;code&gt;city&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_city&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;restaurants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;city&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;idx_rating&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;restaurants&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rating&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now there are &lt;strong&gt;3 B+ Trees:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Primary key tree (clustered — leaf nodes contain full rows).&lt;/li&gt;
&lt;li&gt;City index tree (leaf nodes contain city values + pointers to full rows).&lt;/li&gt;
&lt;li&gt;Rating index tree (leaf nodes contain ratings + pointers to full rows).&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Each tree is organized into &lt;strong&gt;16 KB pages&lt;/strong&gt; (InnoDB's default), with hundreds of keys per node.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;🔍 Day 31: Point Lookup (Finding One Restaurant)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A user clicks on restaurant ID #12345678.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;restaurants&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;12345678&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The primary key B+ Tree:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Root&lt;/strong&gt; (cached in RAM — free!): 12345678 &amp;lt; 15000000 → go left.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Internal node&lt;/strong&gt; (1 disk read): 12345678 is between 12000000 and 13000000 → go to this child.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Leaf node&lt;/strong&gt; (1 disk read): Found! Return &lt;code&gt;name="Biryani House", cuisine="Hyderabadi", rating=4.7&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;2 disk reads. ~20 milliseconds.&lt;/strong&gt; Not 45 seconds. 🚀&lt;/p&gt;
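
&lt;p&gt;You can sanity-check these numbers with a little arithmetic. The sketch below is a back-of-the-envelope model, not InnoDB's real page layout; the fanout of ~1,000 keys per 16 KB page is an illustrative assumption:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import math

def btree_levels(total_rows, fanout):
    """Levels needed when each node can point to `fanout` children."""
    return math.ceil(math.log(total_rows, fanout))

# ~20 million restaurants, assuming ~1,000 keys per 16 KB page
levels = btree_levels(20_000_000, 1000)
print(levels)        # 3 levels: root, internal, leaf

# With the root cached in RAM, a point lookup pays (levels - 1) disk reads
print(levels - 1)    # 2 disk reads
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The tree's height grows with the logarithm of the row count, which is why lookups stay fast even as the table keeps growing.&lt;/p&gt;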




&lt;p&gt;&lt;strong&gt;📊 Day 32: Range Query (Price-Based Search)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Marketing wants to know: "How many restaurants have a &lt;code&gt;price_for_two&lt;/code&gt; between ₹200 and ₹500?"&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;restaurants&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;price_for_two&lt;/span&gt; &lt;span class="k"&gt;BETWEEN&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="k"&gt;AND&lt;/span&gt; &lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You create an index: &lt;code&gt;CREATE INDEX idx_price ON restaurants(price_for_two);&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;The B+ Tree:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Tree traversal&lt;/strong&gt; to find the first restaurant at ₹200. (3 node reads)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Walk the linked leaf nodes&lt;/strong&gt; forward: ₹200, ₹210, ₹215... ₹490, ₹500. (Sequential reads)&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Stop at the first price above ₹500.&lt;/strong&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The linked leaf nodes made this possible without re-traversing the tree for each restaurant.&lt;/p&gt;
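
&lt;p&gt;The leaf-chain walk is easy to model. The following is a toy sketch with three hand-built leaves and hypothetical prices, not a real storage engine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Toy model of a B+ Tree leaf chain: each leaf holds sorted keys and
# a sibling pointer to the next leaf. All numbers are illustrative.

class Leaf:
    def __init__(self, prices, next_leaf=None):
        self.prices = prices     # sorted keys within this leaf
        self.next = next_leaf    # sibling pointer to the next leaf

# Three linked leaves covering prices 150..650
leaf3 = Leaf([520, 600, 650])
leaf2 = Leaf([300, 410, 500], leaf3)
leaf1 = Leaf([150, 200, 210], leaf2)

def range_count(start_leaf, low, high):
    """Walk the leaf chain forward; stop at the first key above `high`."""
    count, leaf = 0, start_leaf
    while leaf is not None:
        for price in leaf.prices:
            if price &gt; high:     # first key past the range: stop
                return count
            if price &gt;= low:
                count += 1
        leaf = leaf.next
    return count

print(range_count(leaf1, 200, 500))  # 200, 210, 300, 410, 500 -&gt; 5
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;The tree traversal is paid once, to find the first leaf; every further match is just a sibling-pointer hop. That is exactly why range queries on an indexed column are cheap.&lt;/p&gt;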




&lt;p&gt;&lt;strong&gt;➕ Day 45: Inserting a New Restaurant&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A new restaurant "Cloud Kitchen 99" registers. The database:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Traverses the primary key B+ Tree to find the right leaf. (3 reads)&lt;/li&gt;
&lt;li&gt;Inserts the record. Leaf is full? &lt;strong&gt;Split it.&lt;/strong&gt; (1-2 writes)&lt;/li&gt;
&lt;li&gt;Updates the parent node with the new key. (1 write)&lt;/li&gt;
&lt;li&gt;Also inserts into the city, rating, and price indexes. (3 more B+ Tree inserts)&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Total: ~15-20 disk I/Os.&lt;/strong&gt; Not 20 million. Just a handful of targeted operations.&lt;/p&gt;
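
&lt;p&gt;Step 2, the leaf split, is the only part that looks scary, and it is small enough to sketch. This toy version uses a capacity of 4 keys per leaf (a real page holds hundreds) and plain Python lists:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import bisect

LEAF_CAPACITY = 4   # illustrative; a real 16 KB page holds hundreds of keys

def insert_into_leaf(leaf, key):
    """Insert `key` in sorted order; split the leaf if it overflows.

    Returns (left, right, separator). `separator` is the key that gets
    pushed up into the parent node; both are None when no split happens.
    """
    bisect.insort(leaf, key)
    if len(leaf) &gt; LEAF_CAPACITY:
        mid = len(leaf) // 2
        right = leaf[mid:]       # new sibling leaf
        del leaf[mid:]
        return leaf, right, right[0]
    return leaf, None, None

full_leaf = [10, 20, 30, 40]     # this leaf is already full
left, right, sep = insert_into_leaf(full_leaf, 25)
print(left, right, sep)          # [10, 20] [25, 30, 40] 25
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Only the split leaf, its new sibling, and the parent are touched; the millions of other nodes are never even read.&lt;/p&gt;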




&lt;p&gt;&lt;strong&gt;🎯 The Complete Flow:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User opens FoodDash → Searches "Biryani in Hyderabad under ₹600"
        ↓
    SQL Query: WHERE city='Hyderabad' AND cuisine='Biryani' AND price &amp;lt; 600
        ↓
    MySQL picks the best B+ Tree index (city index)
        ↓
    Root node (cached in RAM) → "Hyderabad starts at this subtree"
        ↓
    Internal node (1 disk read) → "This range of Hyderabad restaurants"
        ↓
    Leaf node (1 disk read) → Finds matching restaurants
        ↓
    Walks linked leaf nodes → Grabs all Hyderabad restaurants
        ↓
    Filters by cuisine and price → Returns results
        ↓
    User sees results in 50ms → Orders biryani 🍛 → Happy customer!
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  📊 Master Diagram
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftk2sqr2xsop1zarnyzie.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftk2sqr2xsop1zarnyzie.png" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  ✅ SECTION 4 — Conclusion (What Did We Learn?)
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Naive file storage (sequential files) doesn't scale.&lt;/strong&gt; Searching, inserting, updating, and deleting all require O(n) operations — practically unusable for production databases.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;B+ Trees solve this&lt;/strong&gt; by organizing data into a multi-level tree structure with O(log n) search performance — even a billion records can be found in 3-4 disk reads.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Each B+ Tree node is sized to fit one disk page (typically 4-16 KB).&lt;/strong&gt; This minimizes disk I/O — the slowest operation in computing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Non-leaf nodes are navigation signs.&lt;/strong&gt; They contain only keys and pointers to guide searches. No actual data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Leaf nodes store the real data.&lt;/strong&gt; In a clustered index, the full rows live at the bottom of the tree; in a secondary index, the leaves hold keys plus pointers to those rows.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Leaf nodes are linked together&lt;/strong&gt; in a sorted chain. This makes range queries blazing fast — find the start, walk forward, stop at the end.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Inserts, updates, and deletes are targeted.&lt;/strong&gt; Only 1-2 nodes are modified (with occasional rebalancing). The rest of the tree stays untouched.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;B+ Trees excel at BOTH point lookups AND range queries.&lt;/strong&gt; This dual capability is why they dominate over hash indexes for general-purpose database indexing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Almost every database you'll ever use is powered by B+ Trees&lt;/strong&gt; — MySQL, PostgreSQL, SQLite, MongoDB (WiredTiger), Oracle, SQL Server — all of them.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Understanding B+ Trees makes you a better developer.&lt;/strong&gt; You'll write better queries, design better schemas, create smarter indexes, and debug performance issues like a pro.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;🎉 &lt;strong&gt;You just learned one of the most fundamental concepts in database engineering — something many developers don't understand even after years of experience. You're ahead of the curve. Keep building, keep learning, and keep asking "WHY does this work?" You've got this!&lt;/strong&gt; 💪&lt;/p&gt;




&lt;h2&gt;
  
  
  📋 SECTION 5 — Quick Revision Cheat Sheet
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Concept&lt;/th&gt;
&lt;th&gt;One-Line Summary&lt;/th&gt;
&lt;th&gt;Real-Life Analogy&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Naive File Storage&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Data stored line by line — every operation scans the whole file. O(n).&lt;/td&gt;
&lt;td&gt;Scrolling through 10,000 phone contacts one by one to find "Rahul."&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;B+ Tree&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Multi-level tree index — finds any record in 3-4 steps. O(log n).&lt;/td&gt;
&lt;td&gt;Supermarket signs: Section → Aisle → Shelf → Product.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;4 KB Disk Block&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The minimum chunk a hard drive reads at once; B+ Tree nodes match this size.&lt;/td&gt;
&lt;td&gt;A pizza delivery bag — always carries a full load per trip.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Non-Leaf Node&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Upper nodes with only keys and pointers — no actual data. Just directions.&lt;/td&gt;
&lt;td&gt;Google Maps turn-by-turn directions — they guide you but don't feed you.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Leaf Node&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Bottom nodes storing ALL actual data rows.&lt;/td&gt;
&lt;td&gt;The actual shops in a mall — where you buy things.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Linked Leaf Nodes&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Leaf nodes connected in a sorted chain for fast sequential reading.&lt;/td&gt;
&lt;td&gt;A corridor connecting all shops on one floor — walk through without going upstairs.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;O(n) Linear Scan&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Checking every record one by one. Slow and doesn't scale.&lt;/td&gt;
&lt;td&gt;Finding your friend by asking every person in a 10,000-person crowd.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;O(log n) Search&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Each step eliminates 99%+ of the data. Fast even with billions of records.&lt;/td&gt;
&lt;td&gt;Guessing a number 1-1 billion in ~30 tries by halving each time.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Range Query&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Finding all records between value X and Y using the linked leaf chain.&lt;/td&gt;
&lt;td&gt;"Show me all T-shirts between ₹500 and ₹1000" — find the start, grab until the end.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Point Lookup&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Finding one exact record by its key.&lt;/td&gt;
&lt;td&gt;"Give me order #12345" — one direct lookup.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Rebalancing&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Tree auto-adjusts after inserts/deletes to stay evenly organized.&lt;/td&gt;
&lt;td&gt;A teacher redistributing students across benches when one gets too crowded.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Storage Engine&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Internal machinery deciding how data is physically stored and retrieved.&lt;/td&gt;
&lt;td&gt;A grocery store manager who organizes shelves and knows where everything is.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  🔍 Frequently Asked Questions (FAQ) About B+ Trees
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Q: Why do databases use B+ Trees instead of Binary Search Trees (BST)?&lt;/strong&gt;&lt;br&gt;
A: A BST has only 2 children per node, so it's very tall (many levels). A B+ Tree has hundreds of children per node, so it's very short (3-4 levels). Fewer levels = fewer disk reads = faster queries.&lt;/p&gt;
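
&lt;p&gt;The height difference is easy to verify. Assuming a fanout of 500 children per B+ Tree node (a plausible figure for a 16 KB page; the exact number depends on key size):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;import math

records = 1_000_000_000          # one billion keys

# BST: 2 children per node, so roughly log2(n) levels
bst_levels = math.ceil(math.log2(records))

# B+ Tree: ~500 children per node (illustrative fanout)
bplus_levels = math.ceil(math.log(records, 500))

print(bst_levels)                # 30
print(bplus_levels)              # 4
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Thirty reads versus four: on a spinning disk at ~10 ms per read, that is the difference between ~300 ms and ~40 ms for a single lookup.&lt;/p&gt;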

&lt;p&gt;&lt;strong&gt;Q: Why B+ Tree instead of B-Tree?&lt;/strong&gt;&lt;br&gt;
A: In a regular B-Tree, data can live in ANY node (leaf or internal). In a B+ Tree, ALL data lives in leaf nodes only, and leaf nodes are linked. This makes range queries much faster because you can walk the leaf chain without going up and down the tree.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: Does MongoDB also use B+ Trees?&lt;/strong&gt;&lt;br&gt;
A: Yes! MongoDB's default storage engine (WiredTiger) uses B+ Trees for its indexes.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: Should I create an index on every column?&lt;/strong&gt;&lt;br&gt;
A: No! Each index is a separate B+ Tree that must be updated on every insert/update/delete. Too many indexes slow down writes. Only index columns that you frequently search, filter, sort, or join on.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Q: How do I check if my query is using a B+ Tree index?&lt;/strong&gt;&lt;br&gt;
A: Use &lt;code&gt;EXPLAIN&lt;/code&gt; before your SQL query: &lt;code&gt;EXPLAIN SELECT * FROM users WHERE email = 'test@gmail.com';&lt;/code&gt; Look for &lt;code&gt;type: ref&lt;/code&gt; or &lt;code&gt;type: range&lt;/code&gt; (using index) vs &lt;code&gt;type: ALL&lt;/code&gt; (full scan — no index used).&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Happy learning! Now go run &lt;code&gt;EXPLAIN&lt;/code&gt; on your SQL queries and see those B+ Trees in action.&lt;/em&gt; 🌳⚡&lt;/p&gt;

</description>
      <category>database</category>
      <category>beginners</category>
      <category>computerscience</category>
      <category>tutorial</category>
    </item>
    <item>
      <title>How GitHub Broke Apart Its Massive Database — Without Anyone Noticing</title>
      <dc:creator>Vivek Upadhyay</dc:creator>
      <pubDate>Sat, 07 Feb 2026 18:51:54 +0000</pubDate>
      <link>https://dev.to/creator79/how-github-broke-apart-its-massive-database-without-anyone-noticing-47o1</link>
      <guid>https://dev.to/creator79/how-github-broke-apart-its-massive-database-without-anyone-noticing-47o1</guid>
      <description>&lt;h2&gt;
  
  
  1. Context — Why Should You Care?
&lt;/h2&gt;

&lt;p&gt;Imagine you live in a city with &lt;strong&gt;one single hospital&lt;/strong&gt;. Every person — from a child with a cold, to someone needing heart surgery, to someone getting an eye test — goes to &lt;strong&gt;the same building, the same reception desk, the same set of doctors&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;When the city had 10,000 people, this was fine. But now the city has &lt;strong&gt;10 million people&lt;/strong&gt;. The waiting room is overflowing. The reception computer is crashing. A patient getting a routine blood test is accidentally blocking the queue for someone who needs emergency surgery.&lt;/p&gt;

&lt;p&gt;What do you do?&lt;/p&gt;

&lt;p&gt;You &lt;strong&gt;don't shut down the hospital&lt;/strong&gt; and build new ones. People are already inside, being treated. You need to &lt;strong&gt;split the hospital into specialized clinics&lt;/strong&gt; — a heart clinic, an eye clinic, a general clinic — &lt;strong&gt;while patients are still being treated, without anyone noticing the transition&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;That is &lt;em&gt;exactly&lt;/em&gt; what GitHub did with its database.&lt;/p&gt;

&lt;p&gt;GitHub — the platform where &lt;strong&gt;100+ million developers&lt;/strong&gt; store their code — was running on a &lt;strong&gt;single, massive MySQL database&lt;/strong&gt;. As it grew, the database started groaning under the weight. Queries were slow. One team's heavy workload was ruining performance for everyone else. Something had to change.&lt;/p&gt;

&lt;p&gt;But GitHub is a 24/7 platform. They couldn't just "turn it off for maintenance." They had to &lt;strong&gt;split their giant database into smaller, independent databases — while the platform was live and millions of developers were pushing code&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  What You Will Learn in This Blog
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;What a monolithic database is and why it becomes a problem&lt;/li&gt;
&lt;li&gt;What "sharding" means and why companies do it&lt;/li&gt;
&lt;li&gt;GitHub's exact two-phase strategy: &lt;strong&gt;Virtual Partitioning&lt;/strong&gt; → &lt;strong&gt;Physical Partitioning&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;How they achieved a cut-over in &lt;strong&gt;under 100 milliseconds&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;Every technical term explained from scratch with analogies&lt;/li&gt;
&lt;li&gt;Step-by-step walkthrough with diagrams&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  What Problem This Knowledge Solves
&lt;/h3&gt;

&lt;p&gt;If you ever work at a company that is growing fast, you &lt;strong&gt;will&lt;/strong&gt; face database scaling problems. Understanding how GitHub solved this gives you a mental framework for one of the hardest problems in backend engineering — &lt;strong&gt;scaling databases without downtime&lt;/strong&gt;.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Jargon &amp;amp; Terminology Breakdown
&lt;/h2&gt;




&lt;p&gt;Before we touch any concept, let's make sure every single term is crystal clear. Read this section like a mini-dictionary. Come back to it anytime you get confused later.&lt;/p&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Monolithic Database&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;One single database that stores &lt;em&gt;everything&lt;/em&gt; for your entire application.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;One giant notebook where you write your work notes, grocery lists, personal diary, and meeting minutes — all mixed together.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Almost every application starts with a monolithic database. It's simple and works fine when you're small.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Sharding&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Splitting one big database into multiple smaller databases, each responsible for a specific portion of the data.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Instead of one giant notebook, you now have separate notebooks: one for work, one for groceries, one for personal diary. Each notebook is independent.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Large-scale systems like GitHub, Instagram, Uber, Pinterest — any platform that outgrows a single database.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Schema&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The &lt;em&gt;structure&lt;/em&gt; or &lt;em&gt;blueprint&lt;/em&gt; of your database — what tables exist, what columns each table has, and how tables relate to each other.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Think of a schema like the layout plan of a library: "Fiction books go on floor 1, Science on floor 2, History on floor 3." The schema doesn't hold the actual books — it defines &lt;em&gt;where things go&lt;/em&gt;.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Every database has a schema. When developers say "schema domain," they mean a logical group of related tables.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Schema Domain&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A logical grouping of related tables within a database. For example, all tables related to "Repositories" form one domain, all tables related to "Users" form another.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;In a big hospital, "Cardiology" is one domain (heart patients, heart tests, heart doctors), "Ophthalmology" is another domain (eye patients, eye tests, eye doctors). They're in the same building but logically separate.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;GitHub created this concept internally to organize their monolithic database before splitting it.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Cluster&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A group of database servers working together. Usually there's one "primary" server (handles writes) and several "replica" servers (handle reads).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A team at work: one team lead (primary) makes decisions, and several team members (replicas) execute and share the workload.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Any production database setup. You rarely run just one server — you run a cluster for reliability and performance.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Primary (Server)&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The main database server that handles all &lt;em&gt;write&lt;/em&gt; operations (INSERT, UPDATE, DELETE). It is the "source of truth."&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The original document that everyone copies from. If you want to make a change, you change &lt;em&gt;this&lt;/em&gt; document.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A typical database cluster has exactly one primary at a time; multi-primary setups exist but are rare.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Replica (Server)&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A copy of the primary server. It receives all changes from the primary and handles &lt;em&gt;read&lt;/em&gt; operations to reduce load on the primary.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Photocopies of the original document distributed to different offices so people can read without crowding around the original.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Read-heavy applications (like GitHub, where millions read code but fewer write code at any given moment).&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Replication&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The automatic process of copying data changes from the primary server to replica servers.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A live Google Doc where one person types (primary) and everyone else sees changes in real time (replicas).&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Every database cluster uses replication to keep replicas in sync with the primary.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Replication Lag&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The delay between when a change happens on the primary and when replicas receive that change.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;When someone speaks in a video call and there's a 2-second delay before you hear it. That delay is "lag."&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Critical during migrations — you need lag to be near zero before you switch traffic to a new server.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;ProxySQL&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A middleman software that sits between your application and the database. It decides &lt;em&gt;which&lt;/em&gt; database server to send each query to.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A receptionist at a hospital who listens to your problem and directs you to the right department. You don't need to know which room — the receptionist handles it.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Large-scale MySQL deployments. It's especially useful during migrations because you can redirect traffic without changing application code.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;
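
&lt;p&gt;At its core, this routing decision is just a lookup. The sketch below is a deliberately simplified model of the idea; real ProxySQL matches queries against configurable rules, and the table and cluster names here are made up:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;# Toy sketch of what a proxy layer like ProxySQL does: inspect a query,
# decide which backend cluster owns the data, and forward it there.
# The table-to-cluster map below is entirely hypothetical.

ROUTING_TABLE = {
    "repositories": "cluster_repos",
    "issues":       "cluster_repos",
    "users":        "cluster_users",
    "gists":        "cluster_gists",
}

def route(table_name):
    """Return the backend cluster that owns `table_name`."""
    return ROUTING_TABLE.get(table_name, "cluster_main")

print(route("users"))   # cluster_users
print(route("stars"))   # cluster_main (not migrated yet, stays on the old cluster)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Because the application only ever talks to the proxy, the map can be changed without touching application code — which is what makes a fast cut-over possible.&lt;/p&gt;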




&lt;h3&gt;
  
  
  &lt;strong&gt;Cut-Over&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The final moment when you switch traffic from the old database to the new database.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The moment when a new highway opens and traffic is redirected from the old road to the new road.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Any database migration. The cut-over is the most critical and risky moment.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Snapshot&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A point-in-time copy of data. Like taking a photograph of your database at a specific moment.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Photocopying all pages of a book at once. The photocopy represents the book's state at that exact moment.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Used as a starting point when setting up a new database — you load the snapshot, then apply any changes that happened after the snapshot was taken.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Noisy Neighbor Problem&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;When one workload in a shared system uses so many resources that it degrades performance for other workloads in the same system.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;You're in a shared office. One person starts playing loud music and making phone calls. Your concentration is ruined — not because of your work, but because of &lt;em&gt;their&lt;/em&gt; behavior.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Any shared resource: shared databases, shared cloud servers, shared networks.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Query Linter&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A tool that automatically checks your database queries for "rule violations" — like a spell-checker, but for database queries.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A grammar checker that underlines mistakes in your essay before you submit it.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Development and testing environments. GitHub used it to catch queries that broke domain boundaries.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Cross-Domain Join&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A database query that pulls data from tables belonging to &lt;em&gt;different&lt;/em&gt; schema domains in a single operation.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A hospital receptionist trying to book one appointment that requires &lt;em&gt;both&lt;/em&gt; a heart doctor and an eye doctor in the same room at the same time. It couples two independent departments together.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Common in monolithic databases. But if you want to split the database, these cross-domain joins &lt;em&gt;must be eliminated&lt;/em&gt; first.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h3&gt;
  
  
  &lt;strong&gt;Downtime&lt;/strong&gt;
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Definition&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A period when the system is unavailable to users.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Real-Life Analogy&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;A shop putting up a "Closed for Renovation" sign. Customers can't enter.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Where It's Used&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Every online platform fears downtime. GitHub's goal was to achieve &lt;em&gt;zero visible downtime&lt;/em&gt; during their migration.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;




&lt;h2&gt;
  
  
  3. The Big Picture — High-Level Mental Model
&lt;/h2&gt;




&lt;p&gt;Before diving into the details, let's understand the &lt;strong&gt;overall story&lt;/strong&gt; in simple terms.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problem
&lt;/h3&gt;

&lt;p&gt;GitHub's entire platform — repositories, pull requests, issues, users, notifications, everything — was stored in &lt;strong&gt;one MySQL database cluster&lt;/strong&gt;. Think of it as one massive warehouse where every department stores their goods.&lt;/p&gt;

&lt;p&gt;As GitHub grew to serve &lt;strong&gt;100+ million developers&lt;/strong&gt;, two problems became unbearable:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;High query volume&lt;/strong&gt;: Too many people asking for too many things from the same warehouse at the same time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Noisy neighbor problem&lt;/strong&gt;: The "Repositories" team's heavy operations slowed down the "Notifications" team, even though their data had nothing to do with each other.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  The Solution (at 30,000 feet)
&lt;/h3&gt;

&lt;p&gt;GitHub decided to &lt;strong&gt;shard&lt;/strong&gt; — break the one giant database into many smaller, independent databases, each responsible for one "domain" of data.&lt;/p&gt;

&lt;p&gt;But here's the brilliance: they didn't do it all at once. They did it in &lt;strong&gt;two careful phases&lt;/strong&gt;:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 1 — Virtual Partitioning&lt;/strong&gt; (Logical separation)&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Before we physically move anything, let's first &lt;em&gt;draw boundaries&lt;/em&gt; inside the existing database and make sure no one crosses them."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Phase 2 — Physical Partitioning&lt;/strong&gt; (Actual migration)&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;"Now that boundaries are clean, let's physically move each domain to its own database cluster — without downtime."&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  The Analogy
&lt;/h3&gt;

&lt;p&gt;Imagine you have a &lt;strong&gt;huge shared apartment&lt;/strong&gt; with 6 roommates. Everyone's stuff is everywhere — someone's books are on your shelf, your clothes are in their closet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 1 (Virtual Partitioning):&lt;/strong&gt; Before anyone moves out, you first &lt;strong&gt;sort everything&lt;/strong&gt;. Each person labels their stuff. You make sure no one is using someone else's belongings. You draw invisible boundaries.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 2 (Physical Partitioning):&lt;/strong&gt; Now that everything is sorted, each person &lt;strong&gt;moves into their own apartment&lt;/strong&gt; — taking only their labeled stuff, without any mix-ups.&lt;/p&gt;






&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────┐
│           GITHUB'S SHARDING STRATEGY            │
│                                                 │
│   ┌──────────────────────────────────────────┐  │
│   │  PHASE 1: VIRTUAL PARTITIONING           │  │
│   │  ─────────────────────────────────        │  │
│   │  • Define schema domains                 │  │
│   │  • Enforce boundaries (no cross-domain   │  │
│   │    queries or transactions)              │  │
│   │  • Use linters + alerts to catch         │  │
│   │    violations                            │  │
│   │  • All still in ONE physical database    │  │
│   └──────────────┬───────────────────────────┘  │
│                  │                               │
│                  ▼                               │
│   ┌──────────────────────────────────────────┐  │
│   │  PHASE 2: PHYSICAL PARTITIONING          │  │
│   │  ─────────────────────────────────        │  │
│   │  • Snapshot domain tables                │  │
│   │  • Set up new cluster                    │  │
│   │  • Replicate data                        │  │
│   │  • Redirect traffic via ProxySQL         │  │
│   │  • Cut-over in &amp;lt; 100ms                   │  │
│   └──────────────────────────────────────────┘  │
│                                                 │
│   Result: Independent databases per domain,     │
│           zero visible downtime                  │
└─────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Diagram reference&lt;/strong&gt;&lt;br&gt;
&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F54ppu1yeqqrbbhd0y72y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F54ppu1yeqqrbbhd0y72y.png" alt="No Image" width="800" height="735"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  4. Concept-by-Concept Deep Dive
&lt;/h2&gt;




&lt;h3&gt;
  
  
  4.1 — The Starting Point: GitHub's Monolithic MySQL Database
&lt;/h3&gt;




&lt;h3&gt;
  
  
  Simple Definition
&lt;/h3&gt;

&lt;p&gt;GitHub stored all its data — for repositories, users, issues, pull requests, notifications, actions, and more — in a &lt;strong&gt;single MySQL database cluster&lt;/strong&gt;.&lt;/p&gt;

&lt;h3&gt;
  
  
  Why This Existed
&lt;/h3&gt;

&lt;p&gt;Every startup and growing company starts here. A monolithic database is:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Simple to set up&lt;/li&gt;
&lt;li&gt;Simple to query (you can JOIN any table with any other table)&lt;/li&gt;
&lt;li&gt;Simple to maintain (one place, one backup)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;It's the &lt;strong&gt;natural starting point&lt;/strong&gt;. The problem isn't that GitHub chose this — it's that they &lt;em&gt;outgrew&lt;/em&gt; it.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Problems It Caused
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Problem 1: Query Volume&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Millions of developers, millions of repositories, billions of commits. The single database was receiving a staggering volume of queries — reads and writes, all funneled into the same cluster.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Problem 2: Noisy Neighbor&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the critical one. Let's say the "Notifications" system runs a heavy batch query to send email digests every morning. While this query runs, it hogs database resources (CPU, memory, I/O). At the same time, a developer is trying to push code to a repository. The push is slow or fails — not because of anything wrong with the "Repositories" tables, but because the "Notifications" workload is monopolizing the shared resources.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────┐
│              SINGLE DATABASE CLUSTER                     │
│                                                         │
│  ┌──────────┐  ┌──────────┐  ┌──────────┐  ┌────────┐  │
│  │ Repos    │  │ Users    │  │ Notifs   │  │ Issues │  │
│  │ Tables   │  │ Tables   │  │ Tables   │  │ Tables │  │
│  └────┬─────┘  └────┬─────┘  └────┬─────┘  └───┬────┘  │
│       │              │              │             │      │
│       └──────────────┴──────┬───────┴─────────────┘      │
│                             │                            │
│                    SHARED RESOURCES                       │
│                  (CPU, Memory, Disk I/O)                  │
│                             │                            │
│              ┌──────────────┴──────────────┐             │
│              │  🔴 NOISY NEIGHBOR EFFECT   │             │
│              │  Heavy Notifs query slows   │             │
│              │  down Repos, Users, Issues  │             │
│              └─────────────────────────────┘             │
└─────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs8nu8csd6oou2bpqn6fi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs8nu8csd6oou2bpqn6fi.png" alt="No Image" width="800" height="1200"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h3&gt;
  
  
  Common Mistakes &amp;amp; Misunderstandings
&lt;/h3&gt;

&lt;p&gt;❌ &lt;strong&gt;"Monolithic databases are bad."&lt;/strong&gt;&lt;br&gt;
No. They're &lt;em&gt;fine&lt;/em&gt; when you're small or medium-sized. The problem is when you scale to GitHub-level traffic. Don't prematurely shard — it adds enormous complexity.&lt;/p&gt;

&lt;p&gt;❌ &lt;strong&gt;"Just buy a bigger server."&lt;/strong&gt;&lt;br&gt;
This is called "vertical scaling" and it has a ceiling. There's only so much CPU and RAM you can add to a single machine. And it doesn't solve the noisy neighbor problem — all domains still share the same resources.&lt;/p&gt;


&lt;h3&gt;
  
  
  4.2 — Phase 1: Virtual Partitioning (Logical Separation)
&lt;/h3&gt;


&lt;h3&gt;
  
  
  Simple Definition
&lt;/h3&gt;

&lt;p&gt;Virtual partitioning means &lt;strong&gt;drawing invisible boundaries&lt;/strong&gt; inside the existing database so that different groups of tables (schema domains) &lt;strong&gt;stop interacting with each other&lt;/strong&gt; — even though they still physically live in the same database.&lt;/p&gt;
&lt;h3&gt;
  
  
  Why It Exists
&lt;/h3&gt;

&lt;p&gt;You can't just rip tables out of a database and move them elsewhere if your application code is constantly running queries that JOIN tables from different domains, or transactions that span multiple domains.&lt;/p&gt;

&lt;p&gt;If "Repository" code is doing a JOIN with "Notifications" tables, and you move the Notifications tables to a separate database, that JOIN &lt;strong&gt;will break&lt;/strong&gt;. The application will crash.&lt;/p&gt;

&lt;p&gt;So before physically moving anything, you need to &lt;strong&gt;guarantee that each domain is self-contained&lt;/strong&gt; — it doesn't depend on tables from other domains.&lt;/p&gt;
&lt;h3&gt;
  
  
  How It Works — Step by Step
&lt;/h3&gt;



&lt;p&gt;&lt;strong&gt;Step 1: Define Schema Domains&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;GitHub categorized their tables into logical groups:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────┐
│          SCHEMA DOMAIN EXAMPLES             │
│                                             │
│  Domain: "Repositories"                     │
│  ├── repositories table                     │
│  ├── commits table                          │
│  ├── branches table                         │
│  └── pull_requests table                    │
│                                             │
│  Domain: "Users"                            │
│  ├── users table                            │
│  ├── profiles table                         │
│  └── sessions table                         │
│                                             │
│  Domain: "Notifications"                    │
│  ├── notifications table                    │
│  ├── email_preferences table                │
│  └── notification_logs table                │
│                                             │
│  Domain: "Issues"                           │
│  ├── issues table                           │
│  ├── comments table                         │
│  └── labels table                           │
└─────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each domain contains only the tables that are tightly related to each other.&lt;/p&gt;
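In code, a schema domain definition can start out as nothing more than a lookup table from table name to domain. A minimal sketch — the table and domain names below mirror the diagram above and are illustrative, not GitHub's real schema:

```python
# Illustrative table-to-domain mapping (example names, not GitHub's actual schema).
SCHEMA_DOMAINS = {
    "repositories": "repositories",
    "commits": "repositories",
    "branches": "repositories",
    "pull_requests": "repositories",
    "users": "users",
    "profiles": "users",
    "sessions": "users",
    "notifications": "notifications",
    "email_preferences": "notifications",
    "notification_logs": "notifications",
    "issues": "issues",
    "comments": "issues",
    "labels": "issues",
}

def domain_of(table: str) -> str:
    """Return the schema domain a table belongs to."""
    return SCHEMA_DOMAINS[table]

print(domain_of("commits"))  # repositories
print(domain_of("labels"))   # issues
```

Once every table has exactly one domain, tools like linters and monitors (covered below) can reuse this same map to decide whether a query stays inside its boundary.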

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbuz3m4n8wljel0e490b3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbuz3m4n8wljel0e490b3.png" alt="No Image" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;Step 2: Enforce Boundaries — Eliminate Cross-Domain Queries&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Once domains were defined, GitHub's goal was: &lt;strong&gt;No query or transaction should touch tables from more than one domain.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Example of a &lt;strong&gt;violation (cross-domain join)&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- ❌ BAD: This query joins Repositories tables with Notifications tables&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;repositories&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;notifications&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;user_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;12345&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query reaches across two domains. If we later move Notifications to a separate database, this query &lt;strong&gt;will break&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Example of a &lt;strong&gt;clean query (single domain)&lt;/strong&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- ✅ GOOD: This query only touches Repositories tables&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;branch_name&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;repositories&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;
&lt;span class="k"&gt;JOIN&lt;/span&gt; &lt;span class="n"&gt;branches&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;repository_id&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;12345&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
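Once Notifications lives in its own database, the bad query above has to become two single-domain queries stitched together in application code. Here's a sketch of that refactor, using two in-memory SQLite databases to stand in for the two independent clusters (the schema and sample data are made up for illustration):

```python
import sqlite3

# Two separate connections stand in for two independent clusters.
repos_db = sqlite3.connect(":memory:")   # "Repositories" domain
notifs_db = sqlite3.connect(":memory:")  # "Notifications" domain

repos_db.execute("CREATE TABLE repositories (id INTEGER, user_id INTEGER, name TEXT)")
repos_db.execute("INSERT INTO repositories VALUES (12345, 7, 'linguist')")

notifs_db.execute("CREATE TABLE notifications (user_id INTEGER, message TEXT)")
notifs_db.execute("INSERT INTO notifications VALUES (7, 'CI passed')")

# Query 1 (Repositories domain only): find the repo and its owner.
name, user_id = repos_db.execute(
    "SELECT name, user_id FROM repositories WHERE id = ?", (12345,)
).fetchone()

# Query 2 (Notifications domain only): fetch that user's notifications.
messages = [row[0] for row in notifs_db.execute(
    "SELECT message FROM notifications WHERE user_id = ?", (user_id,)
)]

print(name, messages)  # linguist ['CI passed']
```

The join now happens in the application, not the database — which is exactly what makes it safe to put the two domains on different clusters later.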






&lt;p&gt;&lt;strong&gt;Step 3: Use Query Linters in Dev/Test Environments&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;GitHub built a tool — a &lt;strong&gt;query linter&lt;/strong&gt; — that automatically scanned every database query in the codebase during development and testing.&lt;/p&gt;

&lt;p&gt;If a developer wrote a query that crossed domain boundaries, the linter would &lt;strong&gt;flag it&lt;/strong&gt; with an error — just like a spell-checker flags a misspelled word.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────┐
│           QUERY LINTER IN ACTION                │
│                                                 │
│  Developer writes query ──► Linter checks it    │
│                                                 │
│  ┌──────────────────────────────┐               │
│  │ Query: SELECT * FROM repos  │               │
│  │   JOIN notifications ON ... │               │
│  └──────────┬───────────────────┘               │
│             │                                   │
│             ▼                                   │
│  ┌──────────────────────────────┐               │
│  │  🔴 LINTER ERROR:           │               │
│  │  "Cross-domain join          │               │
│  │   detected! repos domain     │               │
│  │   cannot join with           │               │
│  │   notifications domain."     │               │
│  └──────────────────────────────┘               │
│                                                 │
│  Developer must refactor the query              │
│  into two separate domain-specific queries.     │
└─────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
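A toy version of such a linter can be surprisingly small: extract the table names a query touches, map each to its domain, and flag the query if more than one domain appears. This naive regex only handles simple FROM/JOIN clauses and an invented domain map — GitHub's real linter was far more thorough:

```python
import re

# Illustrative table-to-domain map (not GitHub's actual schema).
DOMAINS = {
    "repositories": "repos", "branches": "repos",
    "notifications": "notifs", "email_preferences": "notifs",
}

def lint_query(sql: str) -> list:
    """Return the sorted list of domains a query touches; more than one is a violation."""
    tables = re.findall(r"\b(?:FROM|JOIN)\s+(\w+)", sql, re.IGNORECASE)
    return sorted({DOMAINS[t] for t in tables if t in DOMAINS})

bad = ("SELECT r.name, n.message FROM repositories r "
       "JOIN notifications n ON r.user_id = n.user_id")
good = ("SELECT r.name, b.branch_name FROM repositories r "
        "JOIN branches b ON r.id = b.repository_id")

assert lint_query(bad) == ["notifs", "repos"]  # cross-domain join — linter error
assert lint_query(good) == ["repos"]           # single domain — OK
```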



&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff9x85s8gytog9xhpgb9k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ff9x85s8gytog9xhpgb9k.png" alt="No Image" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;Step 4: Production Alerts for Cross-Domain Transactions&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Even after linting, some cross-domain transactions might slip through to production (maybe from older code, or edge cases).&lt;/p&gt;

&lt;p&gt;GitHub added &lt;strong&gt;monitoring in production&lt;/strong&gt; that raised alerts when a transaction spanned multiple schema domains.&lt;/p&gt;

&lt;p&gt;This didn't &lt;em&gt;block&lt;/em&gt; the transaction (that would cause downtime), but it &lt;strong&gt;notified the team&lt;/strong&gt; so they could fix it.&lt;/p&gt;

&lt;p&gt;Think of it like a security camera: it doesn't physically stop the thief, but it records and alerts so you can respond.&lt;/p&gt;
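In spirit, the production check is the same domain lookup applied at runtime: watch the tables a transaction touches, and raise an alert — not an exception — when they span domains. A hedged sketch (GitHub's actual implementation lived in their monitoring stack, not in application code like this):

```python
import logging

# Illustrative table-to-domain map (not GitHub's actual schema).
DOMAINS = {"repositories": "repos", "notifications": "notifs"}
log = logging.getLogger("cross-domain-monitor")

def check_transaction(tables_touched: list) -> bool:
    """Alert (but don't block) if a transaction spans schema domains."""
    domains = {DOMAINS[t] for t in tables_touched if t in DOMAINS}
    if len(domains) > 1:
        # Like the security camera: record and notify, let the transaction proceed.
        log.warning("cross-domain transaction touched domains: %s", sorted(domains))
        return False  # violation observed
    return True

assert check_transaction(["repositories"]) is True
assert check_transaction(["repositories", "notifications"]) is False
```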




&lt;h3&gt;
  
  
  Real-World Analogy
&lt;/h3&gt;

&lt;p&gt;Imagine a company with one giant open-plan office. Marketing, Engineering, Sales, and HR all share the same space, same printer, same coffee machine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Virtual Partitioning&lt;/strong&gt; is like:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Defining zones: "Marketing sits in the east wing, Engineering in the west wing."&lt;/li&gt;
&lt;li&gt;Making sure Marketing documents don't reference Engineering's internal files.&lt;/li&gt;
&lt;li&gt;Installing a system that alerts if someone from Marketing accidentally uses Engineering's private printer.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Nobody has physically moved yet. Everyone is still in the same building. But the &lt;strong&gt;boundaries are now clear and enforced&lt;/strong&gt;.&lt;/p&gt;




&lt;h3&gt;
  
  
  Common Mistakes &amp;amp; Misunderstandings
&lt;/h3&gt;

&lt;p&gt;❌ &lt;strong&gt;"Virtual partitioning alone solves the scaling problem."&lt;/strong&gt;&lt;br&gt;
No. The data is still on the same physical server. You still have shared resources and the noisy neighbor problem. Virtual partitioning is a &lt;em&gt;prerequisite&lt;/em&gt; for physical partitioning, not a substitute.&lt;/p&gt;

&lt;p&gt;❌ &lt;strong&gt;"We can skip virtual partitioning and directly move tables."&lt;/strong&gt;&lt;br&gt;
Dangerous. If your code has cross-domain dependencies, moving tables will break your application. Virtual partitioning ensures you're &lt;em&gt;safe&lt;/em&gt; to move.&lt;/p&gt;


&lt;h3&gt;
  
  
  4.3 — Phase 2: Physical Partitioning (The Actual Migration)
&lt;/h3&gt;


&lt;h3&gt;
  
  
  Simple Definition
&lt;/h3&gt;

&lt;p&gt;Physical partitioning means &lt;strong&gt;actually moving a schema domain's tables from the original database cluster to a brand-new, independent database cluster&lt;/strong&gt; — while the application is live and serving users.&lt;/p&gt;
&lt;h3&gt;
  
  
  Why It Exists
&lt;/h3&gt;

&lt;p&gt;Virtual partitioning drew boundaries, but everything is still on the same server. The noisy neighbor problem persists. Physical partitioning gives each domain its own hardware resources — its own CPU, memory, and disk I/O.&lt;/p&gt;

&lt;p&gt;After physical partitioning, the "Notifications" domain running a heavy query will &lt;strong&gt;only affect its own cluster&lt;/strong&gt;, not the "Repositories" cluster.&lt;/p&gt;


&lt;h3&gt;
  
  
  How It Works — Step by Step (The 5-Step Process)
&lt;/h3&gt;

&lt;p&gt;This is the core of GitHub's engineering achievement. Let's go step by step.&lt;/p&gt;

&lt;p&gt;For clarity, let's say we're migrating the &lt;strong&gt;"Notifications" domain&lt;/strong&gt; from the old cluster (Cluster A) to a new cluster (Cluster B).&lt;/p&gt;



&lt;p&gt;&lt;strong&gt;Step 1: Take a Snapshot of the Domain's Tables from Cluster A&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────────────────────┐
│         CLUSTER A              │
│    (Original Database)         │
│                                │
│  ┌──────────┐ ┌──────────────┐ │
│  │ Repos    │ │ Notifications│ │  ──── Snapshot taken
│  │ Tables   │ │ Tables       │ │       of Notifications
│  └──────────┘ └──────┬───────┘ │       tables only
│                      │         │
└──────────────────────┼─────────┘
                       │
                       ▼
              ┌────────────────┐
              │   SNAPSHOT     │
              │ (Point-in-time │
              │  copy of       │
              │  Notifications │
              │  data)         │
              └────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A snapshot is like taking a photograph of the Notifications tables at a specific moment in time. It captures all the rows, all the data — as it exists &lt;em&gt;right now&lt;/em&gt;.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;Important:&lt;/strong&gt; While the snapshot is being taken, Cluster A continues serving traffic normally. No downtime.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;Step 2: Load the Snapshot into Cluster B&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The snapshot is loaded into the new Cluster B — including its primary and replica servers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;              ┌────────────────┐
              │   SNAPSHOT     │
              └───────┬────────┘
                      │
                      ▼
         ┌────────────────────────┐
         │       CLUSTER B        │
         │   (New Database)       │
         │                        │
         │  ┌──────────────────┐  │
         │  │  Notifications   │  │
         │  │  Tables          │  │
         │  │  (from snapshot) │  │
         │  └──────────────────┘  │
         │                        │
         │  Primary + Replicas    │
         └────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At this point, Cluster B has the Notifications data, but it's &lt;strong&gt;frozen at the time the snapshot was taken&lt;/strong&gt;. Any changes that happened on Cluster A &lt;em&gt;after&lt;/em&gt; the snapshot are not yet in Cluster B.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Step 3: Set Up Replication from Cluster A → Cluster B&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now GitHub sets up &lt;strong&gt;live replication&lt;/strong&gt; from Cluster A's primary to Cluster B's primary. This means any new changes to Notifications data on Cluster A are automatically streamed to Cluster B.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌────────────────────┐         ┌────────────────────┐
│     CLUSTER A      │         │     CLUSTER B      │
│    (Original)      │         │      (New)         │
│                    │         │                    │
│  ┌──────────────┐  │ ─────── │  ┌──────────────┐  │
│  │   Primary    │──┼─Replication──│   Primary    │  │
│  └──────────────┘  │  ─────► │  └──────────────┘  │
│                    │         │                    │
│  All live changes  │         │  Receiving changes │
│  happen here       │         │  in real-time      │
└────────────────────┘         └────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cluster B is essentially a &lt;strong&gt;replica of Cluster A&lt;/strong&gt; — but only for the Notifications tables. It's catching up on all the changes that happened since the snapshot.&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv1fzcmuw32arupegjhvj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv1fzcmuw32arupegjhvj.png" alt="No Image" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;




&lt;p&gt;&lt;strong&gt;Step 4: Redirect Traffic via ProxySQL (But Still to Cluster A)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Here's the clever part.&lt;/p&gt;

&lt;p&gt;GitHub now updates the application to route all Notifications-related queries through &lt;strong&gt;Cluster B's ProxySQL&lt;/strong&gt;. But — and this is key — that ProxySQL is configured to &lt;strong&gt;forward all queries back to Cluster A&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Why? Because Cluster B might not be fully caught up yet. You can't serve queries from Cluster B if it's behind Cluster A.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────┐     ┌──────────────────┐     ┌────────────┐
│          │     │   CLUSTER B's    │     │            │
│  App     │────►│   ProxySQL       │────►│  CLUSTER A │
│  Server  │     │                  │     │  (still    │
│          │     │  (Middleman -    │     │   serving  │
│          │     │   routes to A    │     │   data)    │
│          │     │   for now)       │     │            │
└──────────┘     └──────────────────┘     └────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is like changing the receptionist — the new receptionist (Cluster B's ProxySQL) is sitting at the front desk, but for now, she's forwarding all patients to the old hospital. The patients (application) don't notice any difference.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why do this intermediate step?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Because when it's time for the final cut-over, the application is &lt;em&gt;already&lt;/em&gt; talking to Cluster B's ProxySQL. You only need to change ProxySQL's routing from "forward to A" → "serve from B directly." No application code changes needed.&lt;/p&gt;
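The routing trick can be sketched as a tiny stand-in for ProxySQL: the application always calls the same proxy object, and only the proxy's target changes. ProxySQL itself does this with query-routing rules in its configuration; this Python object just illustrates the idea:

```python
class Proxy:
    """Stand-in for Cluster B's ProxySQL: one stable endpoint, swappable backend."""

    def __init__(self, backend: str):
        self.backend = backend           # starts out pointing at Cluster A

    def query(self, sql: str) -> str:
        return f"{self.backend}: {sql}"  # forward to whichever cluster is live

    def cut_over(self, new_backend: str) -> None:
        self.backend = new_backend       # flip routing; the app never changes

proxy = Proxy("cluster_a")
assert proxy.query("SELECT 1") == "cluster_a: SELECT 1"  # app talks to proxy, data still from A
proxy.cut_over("cluster_b")
assert proxy.query("SELECT 1") == "cluster_b: SELECT 1"  # same app code, new cluster
```

Notice that the application code calling `proxy.query(...)` is identical before and after the flip — that is the whole point of inserting the middleman early.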




&lt;p&gt;&lt;strong&gt;Step 5: The Cut-Over (The Big Moment)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the most critical step. Let's break it down into micro-steps:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────────────────────────────────────────┐
│                   CUT-OVER PROCESS                       │
│                                                          │
│  ① Check replication lag between Cluster A &amp;amp; B           │
│     └── Must be &amp;lt; 1 second                               │
│                                                          │
│  ② Temporarily BLOCK all requests                        │
│     └── ProxySQL holds all incoming queries briefly      │
│     └── This ensures no new data is written to A         │
│                                                          │
│  ③ Wait for Cluster B to FULLY synchronize               │
│     └── Cluster B processes the last remaining           │
│         replicated changes from A                        │
│                                                          │
│  ④ STOP replication from Cluster A                       │
│     └── Cluster B is now independent                     │
│                                                          │
│  ⑤ Update ProxySQL routing                               │
│     └── Route traffic DIRECTLY to Cluster B's primary    │
│                                                          │
│  ⑥ UNBLOCK all requests                                  │
│     └── Traffic now flows to Cluster B                   │
│                                                          │
│  Total time: &amp;lt; 100 milliseconds                          │
└──────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Let me walk through each micro-step:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;① Check Replication Lag:&lt;/strong&gt;&lt;br&gt;
Before initiating the cut-over, GitHub monitors the replication lag. The cut-over only begins when the lag is less than 1 second — meaning Cluster B is almost perfectly in sync with Cluster A.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;② Temporarily Block All Requests:&lt;/strong&gt;&lt;br&gt;
ProxySQL briefly &lt;strong&gt;holds&lt;/strong&gt; all incoming queries. No queries reach either Cluster A or Cluster B. This creates a tiny window where no new writes happen.&lt;/p&gt;

&lt;p&gt;Think of it like a traffic cop stopping all cars at an intersection for 2 seconds to let an ambulance pass.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;③ Wait for Full Synchronization:&lt;/strong&gt;&lt;br&gt;
In this blocked state (which lasts milliseconds), Cluster B processes the final remaining replication data from Cluster A. After this, Cluster B has 100% of the Notifications data — exactly matching Cluster A.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;④ Stop Replication:&lt;/strong&gt;&lt;br&gt;
The replication link from A to B is cut. Cluster B is now a &lt;strong&gt;standalone, independent cluster&lt;/strong&gt;. It no longer needs Cluster A.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⑤ Update ProxySQL Routing:&lt;/strong&gt;&lt;br&gt;
ProxySQL's configuration is updated: instead of forwarding queries to Cluster A, it now sends them &lt;strong&gt;directly to Cluster B's primary&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;⑥ Unblock Requests:&lt;/strong&gt;&lt;br&gt;
All the held queries are released. They flow to Cluster B, which is now the authoritative database for Notifications.&lt;/p&gt;



&lt;p&gt;The entire process — from blocking requests to unblocking — takes &lt;strong&gt;less than 100 milliseconds&lt;/strong&gt;. That's 0.1 seconds. A human blink takes 300-400 milliseconds. The "downtime" is &lt;strong&gt;shorter than a blink&lt;/strong&gt;.&lt;/p&gt;
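&lt;p&gt;The six micro-steps can be sketched as one orchestration function. Everything below is hypothetical: the mock objects just record the order of calls, whereas GitHub's real tooling talks to MySQL and ProxySQL.&lt;/p&gt;

```javascript
// A hypothetical sketch of the cut-over sequence. The mock objects below
// just record the order of calls; real tooling talks to MySQL and ProxySQL.
const steps = [];

const clusterB = {
  replicationLagSeconds: async () => 0.4,               // pretend lag is 400ms
  waitUntilSynced: async () => steps.push('sync'),      // apply final changes from A
  stopReplication: async () => steps.push('stop-repl'), // cut the link to Cluster A
};

const proxy = {
  blockTraffic: async () => steps.push('block'),        // hold all incoming queries
  routeTo: async (cluster) => steps.push('route-to-B'), // switch routing target
  unblockTraffic: async () => steps.push('unblock'),    // release the held queries
};

async function cutOver(proxy, clusterB) {
  // Step 1: only begin when replication lag is under 1 second
  const lag = await clusterB.replicationLagSeconds();
  if (lag >= 1) throw new Error(`Replication lag too high: ${lag}s`);

  await proxy.blockTraffic();        // Step 2: briefly hold all queries
  await clusterB.waitUntilSynced();  // Step 3: drain the last replicated changes
  await clusterB.stopReplication();  // Step 4: Cluster B becomes independent
  await proxy.routeTo(clusterB);     // Step 5: route directly to Cluster B
  await proxy.unblockTraffic();      // Step 6: traffic flows to Cluster B
}

cutOver(proxy, clusterB).then(() => console.log(steps.join(' -> ')));
// block -> sync -> stop-repl -> route-to-B -> unblock
```

&lt;p&gt;Because all the heavy lifting happened before this function runs, each step here resolves almost instantly, which is what keeps the whole sequence under 100 milliseconds.&lt;/p&gt;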

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8vq5q84h2dbeffaqgka9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8vq5q84h2dbeffaqgka9.png" alt="No Image" width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;


&lt;h3&gt;
  
  
  The Complete Physical Partitioning Flow (Combined Diagram)
&lt;/h3&gt;


&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌─────────────────────────────────────────────────────────────────┐
│             PHYSICAL PARTITIONING - COMPLETE FLOW               │
│                                                                 │
│  Step 1: Snapshot                                               │
│  ┌──────────┐                                                   │
│  │Cluster A │──── Take snapshot of ────► ┌──────────┐           │
│  │(Original)│     Notifications tables   │ Snapshot │           │
│  └──────────┘                            └────┬─────┘           │
│                                               │                 │
│  Step 2: Load snapshot into Cluster B         │                 │
│                                               ▼                 │
│                                          ┌──────────┐           │
│                                          │Cluster B │           │
│                                          │  (New)   │           │
│                                          └────┬─────┘           │
│                                               │                 │
│  Step 3: Set up replication A → B             │                 │
│  ┌──────────┐                            ┌────┴─────┐           │
│  │Cluster A │════ Replication Stream ═══►│Cluster B │           │
│  │ Primary  │                            │ Primary  │           │
│  └──────────┘                            └────┬─────┘           │
│                                               │                 │
│  Step 4: Redirect app → Cluster B's ProxySQL  │                 │
│  ┌──────┐    ┌──────────────┐            ┌────┴─────┐           │
│  │ App  │───►│ B's ProxySQL │───────────►│Cluster A │           │
│  └──────┘    └──────────────┘  (still    └──────────┘           │
│                                 routes                          │
│                                 to A)                           │
│                                                                 │
│  Step 5: Cut-over (&amp;lt; 100ms)                                    │
│  ┌──────┐    ┌──────────────┐            ┌──────────┐           │
│  │ App  │───►│ B's ProxySQL │───────────►│Cluster B │  ✅       │
│  └──────┘    └──────────────┘  (now      └──────────┘           │
│                                 routes                          │
│                                 to B!)                          │
│                                                                 │
│              Cluster A no longer handles Notifications.         │
│              Cluster B is fully independent.                    │
└─────────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;


&lt;blockquote&gt;
&lt;h2&gt;
  
  
  &lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4u78220spe4abwiii0t2.png" alt="No Image" width="800" height="436"&gt;
&lt;/h2&gt;
&lt;/blockquote&gt;
&lt;h3&gt;
  
  
  Real-World Analogy (Complete)
&lt;/h3&gt;

&lt;p&gt;Let's use the hospital analogy one final time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The Hospital Analogy:&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Snapshot:&lt;/strong&gt; You photocopy all Cardiology patient records from the main hospital's filing cabinet.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Load into new clinic:&lt;/strong&gt; You bring the photocopies to the new Heart Clinic across the street and set up their filing cabinet.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Replication:&lt;/strong&gt; You set up a live fax machine between the main hospital and the Heart Clinic. Any new Cardiology records added to the main hospital are automatically faxed to the Heart Clinic.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Redirect reception:&lt;/strong&gt; You tell the main hospital's receptionist: "When a Cardiology patient comes in, send them to the Heart Clinic's receptionist." But the Heart Clinic's receptionist, for now, sends them back to the main hospital (because the Heart Clinic isn't fully set up yet).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;Cut-over:&lt;/strong&gt; The Heart Clinic is fully caught up. In one swift moment:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The receptionist stops accepting patients for 0.1 seconds&lt;/li&gt;
&lt;li&gt;The fax machine confirms all records are synced&lt;/li&gt;
&lt;li&gt;The fax line is disconnected&lt;/li&gt;
&lt;li&gt;The Heart Clinic's receptionist starts directing patients to the Heart Clinic's own doctors&lt;/li&gt;
&lt;li&gt;Patients resume flowing — now to the Heart Clinic directly&lt;/li&gt;
&lt;/ul&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Nobody in the waiting room even noticed the switch.&lt;/p&gt;


&lt;h3&gt;
  
  
  Common Mistakes &amp;amp; Misunderstandings
&lt;/h3&gt;

&lt;p&gt;❌ &lt;strong&gt;"The cut-over requires the app to be shut down."&lt;/strong&gt;&lt;br&gt;
No. ProxySQL acts as a middleman. The app doesn't know or care whether queries go to Cluster A or B. ProxySQL handles the routing transparently.&lt;/p&gt;

&lt;p&gt;❌ &lt;strong&gt;"100 milliseconds sounds too good to be true."&lt;/strong&gt;&lt;br&gt;
It's achievable because most of the work is done &lt;em&gt;before&lt;/em&gt; the cut-over. The snapshot, loading, replication, and catch-up happen over hours or days. The cut-over itself is just: block → sync the last few transactions → switch routing → unblock.&lt;/p&gt;

&lt;p&gt;❌ &lt;strong&gt;"What about data loss?"&lt;/strong&gt;&lt;br&gt;
The blocking step ensures no new writes happen during the switch. The synchronization step ensures Cluster B has 100% of the data. There is no window where data can be lost.&lt;/p&gt;

&lt;p&gt;❌ &lt;strong&gt;"Can you do this for all domains at once?"&lt;/strong&gt;&lt;br&gt;
GitHub did it &lt;strong&gt;one domain at a time&lt;/strong&gt;. Migrating everything at once would be too risky. Each domain was virtually partitioned, validated, and then physically migrated independently.&lt;/p&gt;


&lt;h2&gt;
  
  
  5. Visual Explanation Section — Summary of All Diagrams
&lt;/h2&gt;



&lt;p&gt;Here's a consolidated list of all the diagrams used in this article:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;#&lt;/th&gt;
&lt;th&gt;Diagram Type&lt;/th&gt;
&lt;th&gt;What It Shows&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;1&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Two-tier vertical flowchart&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Phase 1 (Virtual) → Phase 2 (Physical), overall strategy&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Shared resource diagram&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;All domains in one database, noisy neighbor effect&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Colored regions within one box&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Schema domains drawn inside the monolithic database&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;CI/CD flowchart&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Query linter catching cross-domain joins in dev/test&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Two-box replication diagram&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Cluster A primary → Cluster B primary, replication arrow&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Timeline diagram&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;The cut-over micro-steps, with the &amp;lt;100ms window highlighted&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;7&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;5-panel sequential diagram&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Complete physical migration flow (most important visual)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;Before/After architecture&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Monolith → multiple independent clusters (final state)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffhrplwj7y5mbwho8l4tc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffhrplwj7y5mbwho8l4tc.png" alt="No Image" width="800" height="436"&gt;&lt;/a&gt;&lt;br&gt;
&lt;/p&gt;


&lt;/blockquote&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;BEFORE                                    AFTER
──────                                    ─────

┌──────────┐                    ┌──────────┐   ┌──────────┐
│   App    │                    │   App    │   │   App    │
└────┬─────┘                    └──┬───┬───┘   └──┬───┬───┘
     │                             │   │          │   │
     ▼                             ▼   │          │   ▼
┌──────────────┐          ┌────────┐   │   ┌──────┴──┐
│  ONE GIANT   │          │ProxySQL│   │   │ProxySQL │
│  DATABASE    │          │  (A)   │   │   │  (B)    │
│              │          └───┬────┘   │   └───┬─────┘
│ ┌────┐┌────┐│               ▼       │       ▼
│ │Repo││User││          ┌────────┐   │  ┌────────┐
│ └────┘└────┘│          │Cluster │   │  │Cluster │
│ ┌────┐┌────┐│          │  for   │   │  │  for   │
│ │Noti││Issu││          │ Repos  │   │  │ Notifs │
│ └────┘└────┘│          └────────┘   │  └────────┘
└──────────────┘                      │
                               ┌──────┴──┐
                               │ProxySQL │
                               │  (C)    │
                               └───┬─────┘
                                   ▼
                              ┌────────┐
                              │Cluster │
                              │  for   │
                              │ Users  │
                              └────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h2&gt;
  
  
  6. Realistic Use Cases
&lt;/h2&gt;




&lt;h3&gt;
  
  
  Where This Strategy Is Actually Used
&lt;/h3&gt;

&lt;p&gt;GitHub's approach isn't unique to GitHub. The pattern of &lt;strong&gt;virtual partitioning → physical partitioning&lt;/strong&gt; is used widely in the industry:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Company&lt;/th&gt;
&lt;th&gt;Use Case&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;GitHub&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Sharded monolithic MySQL to isolate repos, users, notifications, etc.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Shopify&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Sharded their monolithic MySQL database to handle millions of merchants. They built a similar tool called "Ghostferry" for live data migration.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Pinterest&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Moved from a monolithic MySQL to sharded clusters as they scaled to billions of pins.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Instagram&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;Migrated from a monolithic PostgreSQL database to sharded PostgreSQL as they grew beyond 1 billion users.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;strong&gt;Any growing startup&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;When you hit the scaling ceiling of a single database, this is the playbook.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Why Companies Care
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Reliability:&lt;/strong&gt; One domain's failure doesn't bring down the entire platform.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance:&lt;/strong&gt; Each domain gets dedicated resources — no more noisy neighbors.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Independent scaling:&lt;/strong&gt; The "Repositories" domain needs more powerful hardware? Scale &lt;em&gt;just&lt;/em&gt; that cluster, not the entire database.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Team independence:&lt;/strong&gt; The Notifications team can deploy changes to their database without coordinating with the Repositories team.&lt;/li&gt;
&lt;/ol&gt;




&lt;h2&gt;
  
  
  7. Connecting All Concepts Together
&lt;/h2&gt;




&lt;p&gt;Let's zoom out and see how everything connects into one coherent system.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────────────────────────────────────────────────┐
│                   THE COMPLETE PICTURE                        │
│                                                              │
│  1. GitHub starts with ONE big MySQL database (Monolith)     │
│                          │                                   │
│                          ▼                                   │
│  2. Problems emerge: Query overload + Noisy neighbors        │
│                          │                                   │
│                          ▼                                   │
│  3. Decision: SHARD the database                             │
│                          │                                   │
│                ┌─────────┴──────────┐                        │
│                ▼                    ▼                         │
│  4. PHASE 1: Virtual         5. PHASE 2: Physical            │
│     Partitioning                Partitioning                 │
│     ┌─────────────────┐    ┌──────────────────────┐          │
│     │• Define domains │    │• Snapshot tables     │          │
│     │• Kill cross-    │    │• Load into new       │          │
│     │  domain queries │    │  cluster             │          │
│     │• Add linters    │    │• Set up replication  │          │
│     │• Add production │    │• Redirect via        │          │
│     │  alerts         │    │  ProxySQL            │          │
│     │                 │    │• Cut-over (&amp;lt;100ms)   │          │
│     │ (Prerequisite   │    │                      │          │
│     │  for Phase 2)   │    │ (Repeat per domain)  │          │
│     └─────────────────┘    └──────────────────────┘          │
│                                      │                       │
│                                      ▼                       │
│  6. Result: Multiple independent database clusters,          │
│     each serving one domain, zero visible downtime,          │
│     no noisy neighbors, independently scalable.              │
└──────────────────────────────────────────────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;strong&gt;In plain words:&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;GitHub had one giant database that was struggling. They couldn't just split it randomly — their code had queries reaching across different parts of the database. So first, they drew invisible boundaries (virtual partitioning) and made sure nothing crossed them. Once a domain was cleanly isolated, they physically moved it to its own dedicated server using a snapshot-replicate-cutover strategy that caused less than 100ms of disruption. They repeated this process for each domain, one at a time, until the monolith was fully broken up.&lt;/p&gt;




&lt;h2&gt;
  
  
  8. Final Summary — Professor Style
&lt;/h2&gt;




&lt;p&gt;Let me recap what we've learned today. If you remember nothing else, remember these five things:&lt;/p&gt;

&lt;h3&gt;
  
  
  🔑 Key Takeaways
&lt;/h3&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Monolithic databases are a fine starting point&lt;/strong&gt;, but they eventually become bottlenecks as you scale — both in performance (query volume) and in isolation (noisy neighbor problem).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Sharding is the solution&lt;/strong&gt;, but you can't just rip tables apart. You need a disciplined approach.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Virtual Partitioning comes first.&lt;/strong&gt; Before you physically move data, you must logically isolate each schema domain. Kill cross-domain queries. Kill cross-domain transactions. Use linters and monitoring to enforce this.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Physical Partitioning is the actual migration.&lt;/strong&gt; It follows a careful 5-step process: Snapshot → Load → Replicate → Redirect → Cut-over. The magic is in ProxySQL (the middleman that makes the switch invisible to the application).&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;The cut-over takes less than 100ms.&lt;/strong&gt; This is possible because all the heavy lifting (snapshot, replication, catch-up) happens &lt;em&gt;before&lt;/em&gt; the cut-over. The cut-over itself is just: block briefly → sync → switch → unblock.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  How This Knowledge Helps You
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;If you're a &lt;strong&gt;backend developer&lt;/strong&gt;, you now understand one of the most complex database operations companies face. This is senior-level knowledge.&lt;/li&gt;
&lt;li&gt;If you're preparing for &lt;strong&gt;system design interviews&lt;/strong&gt;, database sharding and zero-downtime migrations are frequently asked topics. You now have a concrete, real-world example to reference.&lt;/li&gt;
&lt;li&gt;If you're a &lt;strong&gt;startup developer&lt;/strong&gt;, you know the roadmap: start with a monolith, and when you outgrow it, follow this phased approach.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  9. When NOT to Use This Approach &amp;amp; Trade-Offs
&lt;/h2&gt;




&lt;h3&gt;
  
  
  When NOT to Shard
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Situation&lt;/th&gt;
&lt;th&gt;Why Sharding is Overkill&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Your database is under 100GB&lt;/td&gt;
&lt;td&gt;A single well-tuned server can handle this easily.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;You have fewer than 1,000 queries/second&lt;/td&gt;
&lt;td&gt;You're not at the scale where this matters.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Your team is small (&amp;lt; 5 engineers)&lt;/td&gt;
&lt;td&gt;Sharding adds massive operational complexity. Managing multiple clusters requires dedicated infrastructure engineers.&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;You haven't tried simpler solutions first&lt;/td&gt;
&lt;td&gt;Read replicas, query optimization, caching (Redis/Memcached), and connection pooling can often delay the need for sharding by years.&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Trade-Offs of Sharding
&lt;/h3&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Benefit&lt;/th&gt;
&lt;th&gt;Cost&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Better performance per domain&lt;/td&gt;
&lt;td&gt;More clusters to manage, monitor, and back up&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;No noisy neighbors&lt;/td&gt;
&lt;td&gt;Cross-domain queries become impossible — you must use application-level joins or event-driven architecture&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Independent scaling&lt;/td&gt;
&lt;td&gt;Operational complexity increases significantly&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Team independence&lt;/td&gt;
&lt;td&gt;Need robust tooling (ProxySQL, linters, monitoring)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Higher availability&lt;/td&gt;
&lt;td&gt;More potential points of failure (more clusters = more servers that can go down)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;h3&gt;
  
  
  Beginner Learning Roadmap
&lt;/h3&gt;

&lt;p&gt;If this topic excites you, here's what to learn next:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;1. SQL fundamentals (JOINs, transactions, indexes)
        │
        ▼
2. Database replication (primary-replica architecture)
        │
        ▼
3. Read replicas and load balancing
        │
        ▼
4. Database sharding strategies (horizontal vs. vertical)
        │
        ▼
5. Proxy layers (ProxySQL, PgBouncer, Vitess)
        │
        ▼
6. Zero-downtime migration patterns
        │
        ▼
7. Distributed systems fundamentals (CAP theorem, consistency)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Wrapping Up
&lt;/h2&gt;

&lt;p&gt;What GitHub achieved is genuinely impressive engineering. They didn't shut down a platform used by 100+ million developers. They didn't lose a single byte of data. They didn't introduce hours of downtime.&lt;/p&gt;

&lt;p&gt;They drew boundaries. They enforced discipline. They moved carefully, one domain at a time. And the result? A database architecture that can scale with them for the next decade.&lt;/p&gt;

&lt;p&gt;The next time someone tells you "just shard the database," you'll know it's not that simple — and you'll know exactly what it takes.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Thanks for reading. If this blog helped you understand database sharding, consider bookmarking it and sharing it with someone who's preparing for system design interviews. This is the kind of knowledge that separates good engineers from great ones.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>systemdesign</category>
      <category>distributedsystems</category>
      <category>webdev</category>
      <category>programming</category>
    </item>
    <item>
      <title>Message Queues in Node.js</title>
      <dc:creator>Vivek Upadhyay</dc:creator>
      <pubDate>Wed, 01 May 2024 21:02:45 +0000</pubDate>
      <link>https://dev.to/creator79/message-queues-in-nodejs-440p</link>
      <guid>https://dev.to/creator79/message-queues-in-nodejs-440p</guid>
      <description>&lt;p&gt;&lt;strong&gt;Introduction&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;In today's world of apps, we often need to handle lots of data and do tricky tasks. But one big challenge is making different parts of the app talk to each other without slowing down. That's where Message Queues come in super handy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What's a Queue?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgwcd1fi25bttwp312xbw.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgwcd1fi25bttwp312xbw.jpg" alt="Image description" width="443" height="180"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Okay, think of a queue like a line at the movies. You know, first come, first served. In tech talk, it's called FIFO. Now, in the world of Message Queues, it's like a middleman. It holds messages until they're ready to be used by the right person. People put messages at the back, and others take them from the front. Simple, right?&lt;/p&gt;
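&lt;p&gt;Here's that first-come-first-served idea as a toy JavaScript class. Real message queues add persistence, brokers, and delivery guarantees on top of this, but the core is just this:&lt;/p&gt;

```javascript
// A toy FIFO queue. Real message queues add persistence, brokers, and
// delivery guarantees on top of this same first-in-first-out idea.
class MessageQueue {
  constructor() {
    this.messages = [];
  }
  // Producers add messages at the back of the line
  enqueue(message) {
    this.messages.push(message);
  }
  // Consumers take messages from the front (first come, first served)
  dequeue() {
    return this.messages.shift();
  }
}

const queue = new MessageQueue();
queue.enqueue('first');
queue.enqueue('second');
console.log(queue.dequeue()); // first -- the earliest message comes out first
console.log(queue.dequeue()); // second
```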

&lt;p&gt;&lt;strong&gt;Important Message Queue Stuff&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. Sender:&lt;/strong&gt; This is like the person who sends a text. They're called the producer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Message:&lt;/strong&gt; It's just the stuff you're sending, like a WhatsApp message or an email.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. Queue:&lt;/strong&gt; This is where messages hang out until someone reads them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Receiver:&lt;/strong&gt; Think of them like the person getting your message. They're called the consumer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Broker:&lt;/strong&gt; This is like the manager of the whole message system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cool Ways We Use Queues&lt;/strong&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Help with Busy Times:&lt;/strong&gt; Queues are great when too many messages come at once. They keep things organized so nobody gets overwhelmed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Doing Big Tasks Quietly:&lt;/strong&gt; They're perfect for jobs that take a long time or need lots of computer power. They quietly work in the background, so you can keep using your app without any interruptions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Remembering Important Stuff:&lt;/strong&gt; In apps that react to events, like a change in your account balance, queues store those events until they're needed.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Taking Care of Orders:&lt;/strong&gt; If you're running an online shop, queues make sure orders get processed in the right order, without any mix-ups.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Retry Mechanism:&lt;/strong&gt; If processing a message fails, queues can hold it and try again later, so a temporary error doesn't mean the message is lost.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;What's a Dead Letter Queue (DLQ)?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Okay, imagine you're sending a letter, but the address is wrong, or the mailman can't deliver it. That's where the Dead Letter Queue comes in. It's like a lost and found for messages that couldn't get where they were supposed to go. It helps us figure out what went wrong and sometimes gives messages a second chance to get delivered.&lt;/p&gt;
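&lt;p&gt;Here's a toy in-memory sketch of that retry-then-dead-letter idea. This is illustrative only; real brokers (and BullMQ, which we'll use below) handle retries and failed-message storage for you.&lt;/p&gt;

```javascript
// Toy retry plus dead-letter logic (illustrative only, not a real broker).
const MAX_ATTEMPTS = 3;
const deadLetterQueue = []; // the "lost and found" for undeliverable messages

async function deliver(message, handler) {
  for (let attempt = 1; attempt <= MAX_ATTEMPTS; attempt++) {
    try {
      await handler(message);
      return true; // delivered successfully
    } catch (err) {
      // delivery failed; loop around and try again
    }
  }
  // All retries exhausted: park the message for later inspection
  deadLetterQueue.push(message);
  return false;
}

// A handler that always fails, to exercise the dead-letter path
deliver({ to: 'wrong-address' }, async () => { throw new Error('undeliverable'); })
  .then((ok) => console.log('delivered:', ok)); // delivered: false
```

&lt;p&gt;After this runs, the failed message sits in &lt;code&gt;deadLetterQueue&lt;/code&gt;, where you can inspect it and decide whether to fix and resend it.&lt;/p&gt;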

&lt;p&gt;&lt;strong&gt;How We Do Message Queues with BullMQ&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 0: Connect to Redis (here I'm using Docker)&lt;/strong&gt;&lt;br&gt;
Install Docker: Make sure Docker is installed on your system. You can download it from the official Docker website: &lt;a href="https://www.docker.com/get-started"&gt;https://www.docker.com/get-started&lt;/a&gt;.&lt;br&gt;
Run a Redis container: Start a Redis container by running the following command in your terminal or command prompt:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;docker run -d --name redis-container -p 6379:6379 redis&lt;/code&gt;&lt;br&gt;
This command will download the Redis image if it's not already available locally and start a Redis container named redis-container on port 6379.&lt;/p&gt;

&lt;p&gt;Connect BullMQ to Redis: In your Node.js application, connect BullMQ to the Redis container by passing its host and port. Here's an example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const { Queue } = require('bullmq');

// Create a new queue instance connected to Redis
const myQueue = new Queue('myQueue', {
  connection: {
    host: 'localhost',
    port: 6379 // Default Redis port
  }
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You can now use &lt;code&gt;myQueue&lt;/code&gt; to send and process messages. In the code above:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;We import the Queue class from BullMQ.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We create a new queue instance named myQueue and pass an object &lt;br&gt;
with connection details, specifying the host as localhost and port as 6379, which is the default port for Redis.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;strong&gt;Step 1: Send a Message&lt;/strong&gt;&lt;br&gt;
You start by sending a message to the queue. It's like dropping a letter in a mailbox.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const { Queue } = require('bullmq');

// Create a new queue instance
const myQueue = new Queue('myQueue');

// Send a message to the queue
async function sendMessage(message) {
  try {
    await myQueue.add(message);
    console.log('Message sent successfully:', message);
  } catch (error) {
    console.error('Error sending message:', error);
  }
}

// Call the sendMessage function with your message
sendMessage({ text: 'Hello, world!' });

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;We import the Queue class from the 'bullmq' package.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We create a new instance of the Queue class named myQueue.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We define an asynchronous function sendMessage that takes a message as input.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Inside the sendMessage function, we use the add method of the queue instance to add the message to the queue.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If the message is sent successfully, we log a success message to the console. Otherwise, we log an error message.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Step 2: Process the Message&lt;/strong&gt;&lt;br&gt;
Then, you've got a worker that takes the message from the queue and does something with it, like delivering a letter.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const { Worker, Queue } = require('bullmq');

// Create a new queue instance
const myQueue = new Queue('myQueue');

// Define a function to process messages
async function processMessage(job) {
  console.log('Processing message:', job.data);
  // Do something with the message, like delivering it
  // For demonstration purposes, let's just log the message data
  console.log('Message delivered successfully:', job.data);
}

// Create a new worker to process messages from the queue
const worker = new Worker('myQueue', processMessage);

// Start the worker
worker.on('completed', (job) =&amp;gt; {
  console.log('Message processing completed:', job.data);
});

worker.on('failed', (job, err) =&amp;gt; {
  console.error('Message processing failed:', err);
});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;&lt;p&gt;We import both the Worker and Queue classes from the 'bullmq' package.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We define an asynchronous function processMessage that takes a job (message) as input.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Inside the processMessage function, we log the message data to the console to simulate message processing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We create a new worker instance named worker that listens to the myQueue queue and processes messages using the processMessage function.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We attach event listeners to the worker to handle completion and failure events.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Wrapping Up&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Using Message Queues in Node.js isn't rocket science. It's like having a smart messenger that helps apps talk to each other without any fuss. By understanding how queues work and where they come in handy, we can make our apps run smoother and handle tasks like champs!&lt;/p&gt;
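&lt;p&gt;To see the produce/consume pattern without standing up Redis at all, here is a toy in-memory queue. This is a minimal sketch of the idea only, not BullMQ itself; the &lt;code&gt;TinyQueue&lt;/code&gt; class and the job names are invented for this example.&lt;/p&gt;

```javascript
// Toy in-memory queue: the same produce/consume idea as BullMQ,
// minus Redis. For illustration only, not the BullMQ API.
class TinyQueue {
  constructor() {
    this.pending = [];
    this.handler = null;
  }

  // Producer side: enqueue a named job with a data payload.
  add(name, data) {
    const job = { name, data };
    if (this.handler) {
      this.handler(job); // a worker is attached: deliver immediately
    } else {
      this.pending.push(job); // otherwise hold it, like Redis would
    }
  }

  // Consumer side: attach a worker and drain anything already queued.
  process(handler) {
    this.handler = handler;
    while (this.pending.length) {
      handler(this.pending.shift());
    }
  }
}

const queue = new TinyQueue();
const delivered = [];

queue.add('greeting', { text: 'Hello, world!' }); // sent before any worker exists
queue.process((job) => delivered.push(job.data.text)); // worker drains the backlog
queue.add('greeting', { text: 'Hello again!' }); // delivered immediately

console.log(delivered); // ['Hello, world!', 'Hello again!']
```

&lt;p&gt;Notice the decoupling: the producer never waits for the worker, and messages sent before the worker attaches are simply held until it does. That is exactly what Redis does for BullMQ, just persistently and across processes.&lt;/p&gt;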

</description>
      <category>webdev</category>
      <category>node</category>
      <category>javascript</category>
      <category>beginners</category>
    </item>
    <item>
      <title>Demystifying Proxy Servers: Understanding Forward, Reverse, and Proxy Servers</title>
      <dc:creator>Vivek Upadhyay</dc:creator>
      <pubDate>Wed, 03 Apr 2024 09:33:48 +0000</pubDate>
      <link>https://dev.to/creator79/demystifying-proxy-servers-understanding-forward-reverse-and-proxy-servers-5971</link>
      <guid>https://dev.to/creator79/demystifying-proxy-servers-understanding-forward-reverse-and-proxy-servers-5971</guid>
      <description>&lt;p&gt;&lt;strong&gt;Introduction:&lt;/strong&gt;&lt;br&gt;
Hey there! Have you ever come across the term "proxy" while browsing the internet? It's like having a trusty companion that enhances your online experience. But what exactly is a proxy, and how does it function? Today, let's delve into this topic in a laid-back manner, just like two pals having a chat. 😊 We'll explore forward, reverse, and proxy servers using relatable examples from our daily lives. So, grab a cup of chai ☕, and let's embark on a journey into the realm of proxy servers!&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://dev-to-uploads.s3.amazonaws.com/uploads/articles/vd94le1oc406k842t3xp.jpg"&gt;Forward and reverse proxy Image&lt;/a&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Understanding Proxy Servers&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What is a Proxy Server?&lt;/strong&gt;&lt;br&gt;
Imagine you're at a grand wedding feast 🎉, and you're feeling a bit hesitant to head directly to the food counter. Instead, you ask your friend to fetch some food for you. Well, a proxy server operates in a similar fashion on the internet. &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Whenever you wish to access a website, your request first goes to the proxy server. It then retrieves the website on your behalf and brings it back to you, ensuring a safe and swift internet browsing experience.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Real-world Examples of Proxy Servers&lt;/strong&gt;&lt;br&gt;
In settings like schools or offices, there are times when access to certain websites is restricted. This restriction is usually enforced by a proxy server, acting as a guardian to ensure focus on tasks at hand. Additionally, suppose you're keen on watching a movie or playing a game that's not available in India. In that case, you can utilize a proxy server to pretend you're accessing the content from another country, granting you access. These instances demonstrate how proxy servers aid us in navigating the vastness of the internet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Exploring Forward Proxies&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Definition and Role of Forward Proxies&lt;/strong&gt;&lt;br&gt;
Now, let's zone in on a specific type of proxy server known as a forward proxy. Think of it as a buddy who assists you in communicating with others without revealing your identity. When you intend to visit a website, your request passes through the forward proxy. This proxy conceals your identity and proceeds to make the request on your behalf, safeguarding your privacy in the online realm.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;How Forward Proxies Work&lt;/strong&gt;&lt;br&gt;
Curious about the inner workings of a forward proxy? Well, when you initiate a request to visit a website, it first reaches the forward proxy. This proxy then conducts safety checks to ensure your request is legitimate. If everything checks out, the forward proxy forwards your request to the desired website on your behalf. The website, in turn, perceives the request as originating from the proxy server, thereby shielding your identity and fostering a secure internet browsing experience.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Diving into Reverse Proxies&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Define Reverse Proxies and Their Functions&lt;/strong&gt;&lt;br&gt;
Reverse proxies operate in a distinct manner compared to their forward counterparts. Instead of facilitating communication for users, they serve as intermediaries between websites and users. Essentially, they stand as gatekeepers for websites, managing incoming requests and directing them to the appropriate destination. This function enhances the safety and efficiency of websites by ensuring incoming traffic is managed effectively.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Practical Examples of Reverse Proxies&lt;/strong&gt;&lt;br&gt;
Imagine a popular e-commerce website hosting a massive sale event. With a surge in traffic, the website could easily become overwhelmed. However, with the assistance of reverse proxies, the load is distributed among various servers, preventing crashes and ensuring a seamless shopping experience for users. Additionally, reverse proxies contribute to security measures by encrypting data, safeguarding it during transmission.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Comparing Forward and Reverse Proxies&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Differentiating Between Forward and Reverse Proxies&lt;/strong&gt;&lt;br&gt;
Forward proxies primarily cater to users, shielding their identities and facilitating safe website access. On the other hand, reverse proxies operate on behalf of websites, ensuring their safety and efficiency for users. Both types of proxies play crucial roles in maintaining the smooth operation of the internet.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Use Cases for Forward vs. Reverse Proxies&lt;/strong&gt;&lt;br&gt;
When it comes to maintaining anonymity or accessing geo-restricted content, forward proxies are your go-to solution. Conversely, if you're managing a website and aiming to enhance its safety and speed, reverse proxies step in to assist. Both serve as indispensable tools, each contributing to making the internet a safer and more accessible space.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Under the Hood: How Proxies Operate&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Mechanisms Behind Proxy Server Functionality&lt;/strong&gt;&lt;br&gt;
Proxy servers function as intermediaries between users and websites, ensuring safe and efficient data transmission. To illustrate their operation, we can utilize Unified Modeling Language (UML) diagrams showcasing the request flow through a forward proxy and the architecture of a reverse proxy setup. Additionally, mechanisms such as caching, compression, and encryption enhance the performance and security of proxy servers, ensuring a seamless internet experience for users.&lt;/p&gt;




&lt;p&gt;Hope you enjoy the journey into the realm of proxy servers! 😄🚀 If you found this article helpful and informative, don't forget to give it a thumbs up 👍 and share it on your social media accounts to spread the knowledge! Let's make the internet a safer and more accessible space together! 🌐&lt;/p&gt;




</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>learning</category>
      <category>proxy</category>
    </item>
    <item>
      <title>Revolutionize Your React Debugging with Locator.js: A Developer's Guide</title>
      <dc:creator>Vivek Upadhyay</dc:creator>
      <pubDate>Sat, 02 Mar 2024 20:59:02 +0000</pubDate>
      <link>https://dev.to/creator79/revolutionize-your-react-debugging-with-locatorjs-a-developers-guide-4m7n</link>
      <guid>https://dev.to/creator79/revolutionize-your-react-debugging-with-locatorjs-a-developers-guide-4m7n</guid>
      <description>&lt;p&gt;Hey there, React devs! Are you tired of the same old routine of debugging your React apps? Do you find yourself constantly hopping between the dev console and your code, trying to figure out what's going wrong? Well, fret no more! I've got something super cool to share with you that will totally change the way you debug your React apps. Say hello to Locator.js! 🎉&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What's the Deal with Locator.js? 🕵️‍♂️&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;So, what exactly is Locator.js, you ask? It's a nifty little browser extension that jumps you from a rendered React component straight to its source in your editor, with Visual Studio Code (VS Code) supported out of the box. No more squinting at console logs or endlessly scrolling through your code trying to find that pesky bug. With Locator.js, debugging becomes a breeze! 💨&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Why Say Goodbye to React Dev Tools? 🤔&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Now, you might be wondering, "But isn't React Dev Tools good enough?" Sure, it's a classic tool that we all know and love. But let's face it, sometimes you need a little extra firepower, especially when dealing with those massive codebases. That's where Locator.js comes in handy. 🛠️&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Getting Started with Locator.js 🚀&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Ready to give Locator.js a spin? Here's how you can get started in just a few simple steps:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Install Locator.js Extension&lt;/strong&gt;: Head over to the Chrome Web Store and search for "Locator.js" or simply follow this &lt;a href="https://chromewebstore.google.com/detail/locatorjs/npbfdllefekhdplbkdigpncggmojpefi"&gt;link&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Custom Link Setup&lt;/strong&gt;: Click on the custom link and paste this URL into it: &lt;code&gt;vscode://file/${projectPath}${filePath}:${line}:${column}&lt;/code&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;You're All Set!&lt;/strong&gt;: Voila! You're now ready to supercharge your debugging workflow with Locator.js.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
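&lt;p&gt;To make the custom link from step 2 concrete, here is what that template expands to for a single click. The project path, file, line, and column below are made-up example values.&lt;/p&gt;

```javascript
// What the vscode://file/${projectPath}${filePath}:${line}:${column}
// template from step 2 expands to. All values are made-up examples.
function buildVsCodeLink(projectPath, filePath, line, column) {
  return 'vscode://file/' + projectPath + filePath + ':' + line + ':' + column;
}

const link = buildVsCodeLink('/home/me/my-app', '/src/Button.jsx', 12, 5);
console.log(link);
// 'vscode://file//home/me/my-app/src/Button.jsx:12:5'
```

&lt;p&gt;When you Alt-click a component, the extension fills in the file and position for you and your OS hands the resulting URL to VS Code, which opens the file at exactly that spot.&lt;/p&gt;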

&lt;h3&gt;
  
  
  &lt;strong&gt;One Click Away from Magic 🪄&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;But wait, there's more! Hold down "Alt" and click, and you'll be whisked away to VS Code, right to the specific component and line number where the magic happens. Talk about efficiency! ⚡&lt;/p&gt;

&lt;p&gt;💡 &lt;strong&gt;Extra Features of Locator.js:&lt;/strong&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;What it Offers? 💫&lt;/strong&gt;
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Click on a Component to Go to Its Code&lt;/strong&gt;: Easily navigate to the code of any component by simply clicking on it in the browser.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Use as a Browser Extension or Library&lt;/strong&gt;: Whether you prefer using it as a browser extension or integrating it as a library, Locator.js has you covered.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Speed Up Your Web Development&lt;/strong&gt;: With Locator.js, speed up your daily workflow and find anything faster than ever before.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Find Any Component Faster&lt;/strong&gt;: Don't know every corner of your codebase? Locator.js helps you find any component faster than ever before.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;How Locator.js Works Behind the Scenes? 🕵️‍♂️&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Locator.js works with any editor that supports protocol URL handlers. It utilizes predefined link templates for VSCode, Webstorm, and Atom, with customization options for other editors. It leverages the same API as React Developer Tools to gather information about the component's original position in the codebase.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Purpose of Locator.js? 🎯&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Locator.js solves the simple problem of quickly finding components within a React web app's codebase. It speeds up the development process by eliminating the need for manual searching or copying and pasting. Whether you're new to the codebase or just need a faster way to navigate, Locator.js has got you covered!&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Why Wait? Try Locator.js Today! 🌟&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;So, what are you waiting for? Give Locator.js a spin and turbocharge your debugging workflow today! Say goodbye to the old-school methods and embrace the future of React debugging. Happy coding, folks! ✨&lt;/p&gt;




&lt;h2&gt;
  
  
  &lt;strong&gt;FAQs (Frequently Asked Questions) 🤔&lt;/strong&gt;
&lt;/h2&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;1. Is Locator.js free to use?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Yes, Locator.js is completely free to use. Simply install the extension from the Chrome Web Store, and you're good to go!&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;2. Does Locator.js work with all React applications?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Yes, Locator.js is compatible with all React applications, regardless of size or complexity.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;3. Is Locator.js beginner-friendly?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Absolutely! Locator.js is designed to be user-friendly and intuitive, making it suitable for developers of all skill levels.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;4. Can I contribute to Locator.js?&lt;/strong&gt;
&lt;/h3&gt;

&lt;p&gt;Certainly! Locator.js is an open-source project, and contributions are always welcome. Check out the GitHub repository for more information on how you can get involved.&lt;/p&gt;




</description>
      <category>reactjsdevelopment</category>
      <category>react</category>
      <category>extensions</category>
      <category>debugging</category>
    </item>
    <item>
      <title>Integrating Google Drive Images/Videos into Your React App</title>
      <dc:creator>Vivek Upadhyay</dc:creator>
      <pubDate>Sun, 25 Feb 2024 10:02:45 +0000</pubDate>
      <link>https://dev.to/creator79/integrating-google-drive-imagevedios-into-your-react-app-3c2n</link>
      <guid>https://dev.to/creator79/integrating-google-drive-imagevedios-into-your-react-app-3c2n</guid>
      <description>&lt;h2&gt;
  
  
  Step 1: Uploading a File/Image to Google Drive
&lt;/h2&gt;

&lt;p&gt;The first step in integrating Google Drive into your React app is uploading a file or image to your Google Drive. To do this, follow these simple instructions:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Log in to your Google Drive account.&lt;/strong&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;Click the "+ New" button and select "File upload" or "Folder upload" to upload your file or image.&lt;/p&gt;

&lt;p&gt;Once the upload is complete, your file will be available in your Google Drive.&lt;/p&gt;

&lt;h2&gt;
  
  
  Step 2: Making a Public URL/Shareable Link
&lt;/h2&gt;

&lt;p&gt;To make the uploaded file or image accessible via a public URL, you'll need to generate a shareable link:&lt;/p&gt;

&lt;p&gt;Right-click on the uploaded file in Google Drive.&lt;/p&gt;

&lt;p&gt;Select "Get shareable link."&lt;/p&gt;

&lt;p&gt;In the sharing settings, set the link sharing to "Anyone with the link" or "Public."&lt;/p&gt;

&lt;p&gt;Copy the generated shareable link.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3: Copy the Image ID&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The shareable link you copied in Step 2 contains the file's unique ID, which you'll need to extract. Here's how:&lt;/p&gt;

&lt;p&gt;Examine the copied link; it will look something like: &lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;a href="https://drive.google.com/file/d/"&gt;https://drive.google.com/file/d/&lt;/a&gt;1BE-WrnRJGeXzDSHQuGVIO7d6Xw3dz1Wq/view&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;The file ID is the long string between &lt;code&gt;/d/&lt;/code&gt; and &lt;code&gt;/view&lt;/code&gt;; in this case, &lt;code&gt;1BE-WrnRJGeXzDSHQuGVIO7d6Xw3dz1Wq&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Copy this file ID as you'll use it in the next step.&lt;/p&gt;
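&lt;p&gt;If you'd rather not copy the ID by hand, the extraction is easy to automate. This sketch pulls the ID out of a share link with a simple regular expression and builds the thumbnail URL used in the next steps; the share link mirrors the example above, and the helper names are our own.&lt;/p&gt;

```javascript
// Pull the file ID out of a Google Drive share link and build the
// thumbnail URL used later in this post. Helper names are our own;
// the regex is one simple way to match the /d/<id>/view pattern.
function extractDriveFileId(shareUrl) {
  const match = shareUrl.match(/\/d\/([^\/]+)\/view/);
  return match ? match[1] : null;
}

function driveThumbnailUrl(fileId, widthPx) {
  return 'https://drive.google.com/thumbnail?id=' + fileId + '&sz=w' + widthPx;
}

const share = 'https://drive.google.com/file/d/1BE-WrnRJGeXzDSHQuGVIO7d6Xw3dz1Wq/view';
const id = extractDriveFileId(share);
console.log(id); // '1BE-WrnRJGeXzDSHQuGVIO7d6Xw3dz1Wq'
console.log(driveThumbnailUrl(id, 1000));
// 'https://drive.google.com/thumbnail?id=1BE-WrnRJGeXzDSHQuGVIO7d6Xw3dz1Wq&sz=w1000'
```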

&lt;p&gt;&lt;strong&gt;Step 4: Embed the File/Image in Your React App&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now that you have the file ID, you can embed the file or image into your React application. Here's the syntax to use. This example uses an image; the same steps work for videos too:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;&amp;lt;img src="https://drive.google.com/thumbnail?id={Enter Your ID}&amp;amp;sz=w1000" alt=""/&amp;gt;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Replace &lt;code&gt;{Enter Your ID}&lt;/code&gt; with the file ID you copied in Step 3.&lt;/p&gt;

&lt;p&gt;Remove the curly brackets as well, so the final URL looks like the example in Step 5.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 5: Using the Embedded File in React (2024 Update)&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;You can now use the provided syntax to display the file or image in your React application. Simply insert the code wherever you want the file or image to appear in your app:&lt;/p&gt;

&lt;p&gt;&lt;code&gt;&amp;lt;img src="https://drive.google.com/thumbnail?id=1BE-WrnRJGeXzDSQuGVIO7d6Xw3dz1Wq&amp;amp;sz=w1000" alt="None"/&amp;gt;&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;Here the placeholder has been replaced with the actual file ID obtained in Step 3.&lt;/p&gt;

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;By following these five steps, you can seamlessly integrate Google Drive into your React app. This allows you to upload, share, and display files and images from your Google Drive in your application, enhancing its functionality and user experience. Enjoy the benefits of cloud-based file management within your React app!&lt;/p&gt;

</description>
    </item>
    <item>
      <title>My first blog</title>
      <dc:creator>Vivek Upadhyay</dc:creator>
      <pubDate>Wed, 20 Sep 2023 19:16:40 +0000</pubDate>
      <link>https://dev.to/creator79/my-first-blog-4p1b</link>
      <guid>https://dev.to/creator79/my-first-blog-4p1b</guid>
      <description>&lt;p&gt;Kuch bhin &lt;br&gt;
&lt;a href="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6mjyd66cn9yvil1chmae.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/cdn-cgi/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6mjyd66cn9yvil1chmae.png" alt="Image description" width="800" height="565"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>javascript</category>
      <category>beginners</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
