Sharding is a process of splitting a larger database into smaller pieces called shards.
instead of having all user data in one db.
you could have
shard1 shard2 shard3
Data-A Data-B Data-C
Allows us to
- Handle more data
- process more request
- reduce load on individual machines
- scale horizontally
There are several ways to decide which data goes to which shard.
Sharding techniques
- Range Based
- Hash Based
- Directory Based
- Geographical
- Dynamic
- Hybrid
Range Based Sharding
In range-based sharding , data is divided into shards based on a specific range of values for a given partition key. Each shard would contain specific range of data.
For example you have 10,000 user records.
shard1 shard2 shard3
user-id user-id user-id
1-3000 3001-6000 6001-10000
Ok. the explanation is simple. You would have one thing in your mind. How would my server know which shard I should look for.
const mysql = require('mysql2/promise');
// Connections to different shards
const shard1 = mysql.createPool({
host: 'localhost',
user: 'root',
password: 'password',
database: 'user_shard_1'
});
const shard2 = mysql.createPool({
host: 'localhost',
user: 'root',
password: 'password',
database: 'user_shard_2'
});
const shard3 = mysql.createPool({
host: 'localhost',
user: 'root',
password: 'password',
database: 'user_shard_3'
});
// Shard Router
function getShard(userId) {
if (userId >= 1 && userId <= 3000) return shard1;
if (userId >= 3001 && userId <= 6000) return shard2;
if (userId >= 6001 && userId <= 10000) return shard3;
throw new Error('User ID out of range');
}
// Insert User
async function createUser(userId, name, email) {
const db = getShard(userId);
await db.execute(
'INSERT INTO users (user_id, name, email) VALUES (?, ?, ?)',
[userId, name, email]
);
}
In here getShard function would be helpful for deciding the shard.
Likewise you can manipulate for others.
Hash Based Sharding
Lets say I decide it to do an hash function for my user-id to store the data.
In the following example am using math. But you can decide on which manipulation you'd want.
For example I would insert user-id:1 into shard1, user-id:2 into shard2 and user-id:3 into shard3
function getShard(userId) {
const shardKey = userId % 3;
if (shardKey === 0) return shard1;
if (shardKey === 1) return shard2;
return shard3;
}
// Insert User
async function createUser(userId, name, email) {
const db = getShard(userId);
await db.execute(
'INSERT INTO users (user_id, name, email) VALUES (?, ?, ?)',
[userId, name, email]
);
}
Directory Based Sharding
Consider our file system where you would store a data into specific folder. Based on some scenario I decide it to store it in particular shard. It would have a lookup table to support this kinda architechture.
Imagine you run a software application responsible for sending notifications to various applications. In practical it'd be Firebase Cloud Messenger or Apple Notification Service.
An user enroll their application in that system, it could be Youtube, Instagram, Facebook and etc.
Based on appName. I would decide which shard to return.
Instagram -> shard1
Facebook -> shard2
Youtube -> shard3
But first I would look for the shard in a Lookup.
async function getShard(appName) {
const [rows] = await directoryDb.execute(
'SELECT shard_name FROM app_directory WHERE app_name = ?',
[appName]
);
const shard = rows[0].shard_name;
if (shard === 'shard_1') return shard1;
if (shard === 'shard_2') return shard2;
if (shard === 'shard_3') return shard3;
throw new Error('Invalid shard');
}
// Store notification
async function sendNotification(appName, userId, message) {
const db = await getShard(appName);
await db.execute(
'INSERT INTO notifications (user_id, message) VALUES (?, ?)',
[userId, message]
);
console.log(`Notification stored in ${appName}'s shard`);
}
I think you would have some better idea. Creating pool is common across all the app. Moving forward I'd give only examples.
Geographical Sharding
As the name suggests it would be purely based on geographical locations.
function getShard(country) {
if (country === 'India') return indiaShard;
if (country === 'USA') return usShard;
if (['Germany', 'France', 'Italy'].includes(country)) {
return euShard;
}
throw new Error('Unsupported region');
}
Dynamic Sharding
Dynamic Sharding means the shard is not decided by hardcoded ranges, hashes, or regions. Instead, the application checks a configuration table and decides where to store data at runtime.
This allows you to add new shards without changing application code. It similar to elasticity of a server to scale in real time.
Hybrid
Hybrid sharding is when you combine multiple sharding strategies together to get the benefits of each.
A very common real-world pattern is:
Directory-based (app → shard group) + Hash-based (within shard group)
This gives:
Flexibility (move apps easily)
Even distribution (avoid hotspots inside shards).
async function sendNotification(appName, userId, message) {
const group = await getShardGroup(appName); // directory step
const shard = getShardFromGroup(group, userId); // hash step
await shard.execute(
'INSERT INTO notifications (user_id, message) VALUES (?, ?)',
[userId, message]
);
console.log(`Stored in ${appName} → ${group}`);
}
In this example you can see I've used directory based and hash based.
Top comments (0)