Piyush Gupta
B-Trees, Clustered Indexes, and the OLAP Revolution (Part 2) 📊

In Part 1, we looked at LSM Trees—the write-heavy champions found in NoSQL databases. But if you’re using PostgreSQL, MySQL, or Oracle, you’re likely interacting with a different beast: the B-Tree.

Today, we’ll explore why B-Trees still dominate the relational world and how the "Big Data" era forced us to rethink how we store rows entirely.

Table of Contents

  1. The King of RDBMS: B-Trees
  2. Clustered vs. Non-Clustered Indexes
  3. OLTP vs. OLAP: The Great Divide
  4. Why Column-Oriented Storage Wins at Scale

1. The King of RDBMS: B-Trees

B-Trees are the most widely used indexing structure in history. Unlike the variable-size segments in LSM Trees, B-Trees break the database down into fixed-size pages (usually 4KB to 16KB).

How it Works:

A B-Tree is a balanced tree where each node contains multiple keys and pointers to child pages.

  • The Root: The entry point for every query.
  • Internal Nodes: Hold keys and child pointers that route the search downward.
  • Leaf Nodes: These contain the actual data or a reference to where the data lives on disk.

Because the tree is always balanced, a B-Tree with a branching factor of 500 and just 4 levels can address 500⁴ ≈ 62.5 billion pages—about 256 TB of data at 4 KB per page! This makes lookups incredibly consistent at $O(\log n)$.
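To see where those numbers come from, here's the back-of-the-envelope math as a quick sketch (using the same assumptions as above: 4 KB pages and a fan-out of 500):

```python
# Back-of-the-envelope B-Tree capacity.
PAGE_SIZE = 4 * 1024  # bytes per leaf page (assumption: 4 KB)
FAN_OUT = 500         # child pointers per internal node
LEVELS = 4            # depth of the tree

leaf_pages = FAN_OUT ** LEVELS           # pages reachable in 4 hops
capacity_bytes = leaf_pages * PAGE_SIZE  # total addressable data

print(f"Leaf pages: {leaf_pages:,}")                      # 62,500,000,000
print(f"Capacity: {capacity_bytes / 10**12:.0f} TB")      # 256 TB
```

Four page reads from disk are enough to reach any byte in a quarter-petabyte table—that consistency is why B-Trees have survived for 50+ years.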

B-Trees vs. LSM Trees: The Trade-off

| Feature          | B-Trees                         | LSM Trees                           |
|------------------|---------------------------------|-------------------------------------|
| Best for         | Read-heavy workloads            | Write-heavy workloads               |
| Fragmentation    | High (empty space in pages)     | Low (sequential background merges)  |
| Write throughput | Lower (random disk seeks)       | Higher (sequential appends)         |

2. Clustered vs. Non-Clustered Indexes

Where does the actual row live? This is a common interview question that boils down to index design:

  • Clustered Index: The index is the data. The leaf nodes of the B-Tree contain the actual row values. You can only have one clustered index per table (usually the Primary Key).
  • Non-Clustered (Secondary): The index contains a "pointer" (like a Row ID) to the data's location. This allows for multiple indexes but requires an extra "hop" to fetch the full row.
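A toy model makes the extra "hop" concrete (a sketch using plain dicts; real engines store both structures as B-Tree pages, and all names here are invented):

```python
# Clustered index: primary key -> the full row itself (the index IS the data).
clustered = {
    101: {"id": 101, "email": "a@example.com", "city": "Delhi"},
    102: {"id": 102, "email": "b@example.com", "city": "Mumbai"},
}

# Secondary index: email -> primary key. It stores a "pointer", not the row.
by_email = {"a@example.com": 101, "b@example.com": 102}

# Lookup by primary key: one traversal, the row is right there.
row = clustered[101]

# Lookup by email: two traversals -- the extra "hop".
pk = by_email["b@example.com"]   # hop 1: secondary index -> row id
row = clustered[pk]              # hop 2: clustered index -> actual row
print(row["city"])               # Mumbai
```

This is also why adding secondary indexes is never free: every write must now update two structures instead of one.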

3. OLTP vs. OLAP: The Great Divide

Most web developers spend their time in OLTP (Online Transaction Processing). You handle thousands of small queries: "Update this user’s bio" or "Add this item to the cart."

However, businesses eventually need OLAP (Online Analytical Processing). This involves massive aggregate queries like: "What was the total revenue in Q4 across all Asian markets?"

Running these on your production database will cause it to crawl. Instead, we move data to a Data Warehouse using ETL (Extract, Transform, Load).
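Stripped to its essence, an ETL job is just three steps (a purely illustrative sketch; the table shape and field names are invented):

```python
# Extract: pull raw rows from the OLTP side (hard-coded sample here).
oltp_orders = [
    {"id": 1, "region": "APAC", "quarter": "Q4", "amount": 120.0},
    {"id": 2, "region": "EMEA", "quarter": "Q4", "amount": 80.0},
    {"id": 3, "region": "APAC", "quarter": "Q4", "amount": 50.0},
]

# Transform: reshape rows into the aggregates analysts actually query.
revenue = {}
for order in oltp_orders:
    key = (order["region"], order["quarter"])
    revenue[key] = revenue.get(key, 0.0) + order["amount"]

# Load: write the summarized facts into the warehouse (a dict stand-in).
warehouse = {"revenue_by_region_quarter": revenue}
print(warehouse["revenue_by_region_quarter"][("APAC", "Q4")])  # 170.0
```

Real pipelines run this on a schedule (Airflow, dbt, etc.), so the heavy aggregation happens off the production database entirely.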


4. Why Column-Oriented Storage Wins at Scale

Traditional databases store data Row-by-Row. To calculate an average age, the DB loads the entire row (Name, Email, Bio, Password) just to access one tiny "Age" integer.

Column-Oriented Storage (BigQuery, Snowflake, ClickHouse) stores each column in its own file.

The Code Reality:

Imagine a table with 100 columns and 1 billion rows.

```sql
-- Row-oriented: reads all 100 columns from disk for every row.
SELECT AVG(age) FROM users;

-- Column-oriented: reads ONLY the 'age' file,
-- ignoring the other 99 columns completely.
```
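You can feel the difference in a tiny simulation (a sketch; the `users` table and its columns are invented, and real column stores work on compressed disk files, not Python lists):

```python
# Row-oriented layout: one record per user, all fields stored together.
rows = [
    {"name": f"user{i}", "email": f"u{i}@x.com", "bio": "x" * 200, "age": 20 + i % 50}
    for i in range(1_000)
]

# Column-oriented layout: each column lives in its own array ("file").
columns = {
    "name":  [r["name"] for r in rows],
    "email": [r["email"] for r in rows],
    "bio":   [r["bio"] for r in rows],
    "age":   [r["age"] for r in rows],
}

# The row store must touch every field of every row to average one column...
avg_row = sum(r["age"] for r in rows) / len(rows)

# ...while the column store touches only the 'age' array.
ages = columns["age"]
avg_col = sum(ages) / len(ages)

assert avg_row == avg_col  # same answer, a fraction of the bytes scanned
```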

By only reading the necessary bytes, analytical queries that used to take hours now take seconds. Furthermore, because column data is often repetitive (e.g., many users living in the same "City"), these files compress significantly better than row-based data.
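The compression win is easy to demonstrate with the standard library's zlib (a sketch; the city distribution is made up):

```python
import zlib

# A repetitive column: many users share the same city.
cities = ["Delhi"] * 400 + ["Mumbai"] * 350 + ["Pune"] * 250
column_bytes = ",".join(cities).encode()

# The same values interleaved with unique fields, row-store style.
row_bytes = ",".join(
    f"user{i}|u{i}@x.com|{city}" for i, city in enumerate(cities)
).encode()

col_ratio = len(column_bytes) / len(zlib.compress(column_bytes))
row_ratio = len(row_bytes) / len(zlib.compress(row_bytes))

print(f"column: {col_ratio:.0f}x, rows: {row_ratio:.1f}x")
```

The long runs of identical values in the column file compress dramatically better than the same data interleaved with unique names and emails—which is exactly why column stores lean on encodings like run-length encoding.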


💬 Engineering Challenge

Most modern architectures use Polyglot Persistence—an RDBMS (B-Trees) for user transactions and a Data Warehouse (Columnar) for analytics.

Join the conversation:

  1. Have you ever crashed a production DB by running a heavy "Analytical" query during peak hours?
  2. What’s your "Big Data" tool of choice—Snowflake, BigQuery, or maybe self-hosted ClickHouse?

Drop a comment below with your horror stories or favorite setups! 🛠️✨

