When it comes to efficient data storage and retrieval, B-Trees and B+Trees are the backbone of databases, file systems, and indexing engines. In this article, we will explore these tree structures in full detail, including all functionalities, C implementation, and practical usage. No shortcuts, no skipped steps—everything explained from the ground up.
📌 What is a B-Tree?
A B-Tree is a self-balancing search tree in which nodes can have multiple keys and children. It maintains sorted data and allows logarithmic time search, insertion, and deletion.
Key properties of a B-Tree of minimum degree T:
- Every node contains at most
2*T - 1keys. - Every non-leaf node has at most
2*Tchildren. - Every node (except root) has at least
T - 1keys. - All leaves appear at the same level.
- Keys in a node are always sorted in ascending order.
B-Trees are widely used in databases like MySQL, PostgreSQL, and filesystems like NTFS and FAT.
📌 What is a B+Tree?
A B+Tree is an enhanced variant of a B-Tree. The main differences are:
- Internal nodes store only keys, not actual data (pointers to children only).
- Leaf nodes contain all data and are linked as a linked list for sequential access.
- Search is faster for range queries because leaves are linked.
Key properties:
- Maintains sorted order of keys.
- Efficient for both point lookups and range queries.
- Widely used in database indexing and key-value storage systems.
🛠️ B-Tree Implementation in C
Here’s a step-by-step C implementation of a B-Tree with all functionalities.
1️⃣ Node Structure
#include <stdio.h>
#include <stdlib.h>
#define T 2 // Minimum degree
typedef struct BTreeNode {
int keys[2*T - 1]; // Max keys = 2*T - 1
struct BTreeNode *children[2*T]; // Max children = 2*T
int n; // Number of keys
int leaf; // 1 if leaf, 0 otherwise
} BTreeNode;
2️⃣ Creating a Node
BTreeNode* createNode(int leaf) {
BTreeNode* node = (BTreeNode*)malloc(sizeof(BTreeNode));
node->leaf = leaf;
node->n = 0;
for(int i = 0; i < 2*T; i++)
node->children[i] = NULL;
return node;
}
3️⃣ Traversal (In-Order)
void traverse(BTreeNode* root) {
if (root != NULL) {
for (int i = 0; i < root->n; i++) {
if (!root->leaf)
traverse(root->children[i]);
printf("%d ", root->keys[i]);
}
if (!root->leaf)
traverse(root->children[root->n]);
}
}
4️⃣ Searching for a Key
BTreeNode* search(BTreeNode* root, int key) {
int i = 0;
while (i < root->n && key > root->keys[i])
i++;
if (i < root->n && key == root->keys[i])
return root;
if (root->leaf)
return NULL;
return search(root->children[i], key);
}
5️⃣ Insertion
Step 1: Split a full child node
void splitChild(BTreeNode* parent, int i, BTreeNode* fullChild) {
BTreeNode* newChild = createNode(fullChild->leaf);
newChild->n = T - 1;
// Copy last T-1 keys to newChild
for (int j = 0; j < T-1; j++)
newChild->keys[j] = fullChild->keys[j + T];
// Copy child pointers if not leaf
if (!fullChild->leaf)
for (int j = 0; j < T; j++)
newChild->children[j] = fullChild->children[j + T];
fullChild->n = T - 1;
// Shift parent's children
for (int j = parent->n; j >= i+1; j--)
parent->children[j+1] = parent->children[j];
parent->children[i+1] = newChild;
// Shift parent's keys
for (int j = parent->n-1; j >= i; j--)
parent->keys[j+1] = parent->keys[j];
parent->keys[i] = fullChild->keys[T-1];
parent->n++;
}
Step 2: Insert non-full
void insertNonFull(BTreeNode* node, int key) {
int i = node->n - 1;
if (node->leaf) {
while (i >= 0 && node->keys[i] > key) {
node->keys[i+1] = node->keys[i];
i--;
}
node->keys[i+1] = key;
node->n++;
} else {
while (i >= 0 && node->keys[i] > key)
i--;
if (node->children[i+1]->n == 2*T - 1) {
splitChild(node, i+1, node->children[i+1]);
if (key > node->keys[i+1])
i++;
}
insertNonFull(node->children[i+1], key);
}
}
Step 3: Insert wrapper
BTreeNode* insert(BTreeNode* root, int key) {
if (root->n == 2*T - 1) {
BTreeNode* newRoot = createNode(0);
newRoot->children[0] = root;
splitChild(newRoot, 0, root);
insertNonFull(newRoot, key);
return newRoot;
} else {
insertNonFull(root, key);
return root;
}
}
6️⃣ B+Tree Differences
To implement a B+Tree in C:
- Internal nodes store only keys, not actual data.
- Leaf nodes store actual data and point to the next leaf.
- Traversal of leaves allows fast sequential access.
Leaf Node Structure Example:
typedef struct BPlusNode {
int keys[2*T - 1];
int data[2*T - 1]; // store actual values
struct BPlusNode *children[2*T];
struct BPlusNode *next; // link to next leaf
int n;
int leaf;
} BPlusNode;
The insertion, split, and search logic are similar, but all data moves to leaves, and internal nodes serve only as guides.
⚡ Practical Applications
- Database Indexing: MySQL, PostgreSQL, Oracle.
- File Systems: NTFS, FAT, and ext4.
- Key-Value Stores: RocksDB, LevelDB.
- Range Queries: B+Trees are preferred for fast sequential access.
🔑 Key Takeaways
- B-Trees: Balanced multi-way search trees; internal and leaf nodes store data.
- B+Trees: Leaf-linked trees; all data stored in leaves; ideal for sequential access.
- Time Complexity: Search, Insert, Delete = O(log n).
- C Implementation: Stepwise insertion, split, search, and traversal explained.
- Real-World Impact: Foundation of modern database and filesystem performance.
💻 Next Steps
- Implement deletion in B-Trees and B+Trees.
- Add range query support for B+Trees.
- Optimize memory usage for large-scale datasets.
With this, you now have a complete understanding and code reference for B-Trees and B+Trees in C. Whether you are building a database, file system, or an indexing engine, this knowledge is foundational.
Top comments (0)