DEV Community

Cover image for Database Sharding Explained With Real Examples: How Apps Scale Beyond a Single Database
Abdullah al Mubin
Abdullah al Mubin

Posted on

Database Sharding Explained With Real Examples: How Apps Scale Beyond a Single Database

Everything is going great.

Your application launches.

You have:

10,000 users
Enter fullscreen mode Exit fullscreen mode

Then:

100,000 users
Enter fullscreen mode Exit fullscreen mode

Then:

1,000,000 users
Enter fullscreen mode Exit fullscreen mode

Life is good.

Until one day your database becomes the bottleneck.

Queries slow down.

CPU usage spikes.

Storage fills up.

And your single database server starts crying for help.

At this point, many engineers discover a concept called:

Database Sharding

A technique used by some of the largest systems on the internet.

Index

  1. The Day One Database Stops Scaling
  2. What Is Database Sharding?
  3. A Real-World Analogy
  4. Why Bigger Servers Eventually Fail
  5. The Basic Idea Behind Sharding
  6. Horizontal vs Vertical Scaling
  7. Common Sharding Strategies
  8. User-Based Sharding Example
  9. How Instagram-Like Systems Use Sharding
  10. The Biggest Challenges of Sharding
  11. Rebalancing and Resharding
  12. When You Should NOT Shard
  13. Real Companies Using Sharding
  14. Final Thought

1. The Day One Database Stops Scaling

Most applications start with:

Application
     │
     ▼
PostgreSQL
Enter fullscreen mode Exit fullscreen mode
  • Simple.
  • Easy.
  • Reliable.

But eventually:

  • data grows
  • traffic grows
  • queries grow
  • users grow

And one machine becomes insufficient.


2. What Is Database Sharding?

Database sharding means:

Splitting data across multiple databases instead of storing everything in one database.

Instead of:

All Users
     │
     ▼
Database A
Enter fullscreen mode Exit fullscreen mode

You get:

Users 1-1M     → Database A
Users 1M-2M    → Database B
Users 2M-3M    → Database C
Enter fullscreen mode Exit fullscreen mode

Now the workload is distributed.


3. A Real-World Analogy

Imagine a library.

At first:

One room
All books
Enter fullscreen mode Exit fullscreen mode

Works fine.

Then the library grows to:

50 million books
Enter fullscreen mode Exit fullscreen mode

Finding books becomes painful.

So the library splits into:

Building A → A-F
Building B → G-M
Building C → N-Z
Enter fullscreen mode Exit fullscreen mode

Each building handles a subset.

That's essentially sharding.


4. Why Bigger Servers Eventually Fail

Many teams first try:

Just buy a bigger server.

This is called vertical scaling.

Example:

8 CPU → 16 CPU → 32 CPU → 64 CPU
Enter fullscreen mode Exit fullscreen mode

Eventually:

  • costs explode
  • hardware limits appear
  • upgrades become difficult

You can't scale infinitely upward.


5. The Basic Idea Behind Sharding

Instead of one huge database:

100 Million Users
        │
        ▼
Single Database
Enter fullscreen mode Exit fullscreen mode

You split the load:

Shard A → 25M Users
Shard B → 25M Users
Shard C → 25M Users
Shard D → 25M Users
Enter fullscreen mode Exit fullscreen mode

Now:

  • less data per database
  • fewer rows to scan
  • better performance
  • more scalability

6. Horizontal vs Vertical Scaling

Vertical Scaling

Bigger Server
Enter fullscreen mode Exit fullscreen mode

Example:

16 GB RAM → 64 GB RAM
Enter fullscreen mode Exit fullscreen mode

Horizontal Scaling

More Servers
Enter fullscreen mode Exit fullscreen mode

Example:

Database A
Database B
Database C
Database D
Enter fullscreen mode Exit fullscreen mode

Sharding is horizontal scaling.


7. Common Sharding Strategies

Several approaches exist.


Strategy 1: Range-Based Sharding

Example:

Users 1-1M     → Shard A
Users 1M-2M    → Shard B
Users 2M-3M    → Shard C
Enter fullscreen mode Exit fullscreen mode

Simple.

But can create uneven traffic.


Strategy 2: Geographic Sharding

Example:

US Users       → US Database
EU Users       → EU Database
Asia Users     → Asia Database
Enter fullscreen mode Exit fullscreen mode

Popular for global systems.


Strategy 3: Hash-Based Sharding

Example:

hash(userId) % 4
Enter fullscreen mode Exit fullscreen mode

Results:

0 → Shard A
1 → Shard B
2 → Shard C
3 → Shard D
Enter fullscreen mode Exit fullscreen mode

Provides better distribution.


8. User-Based Sharding Example

Suppose:

20 Million Users
Enter fullscreen mode Exit fullscreen mode

Sharding rule:

userId % 4
Enter fullscreen mode Exit fullscreen mode

Examples:

User 101 → Shard B
User 202 → Shard C
User 303 → Shard D
User 404 → Shard A
Enter fullscreen mode Exit fullscreen mode

Every request can quickly determine:

Which database owns this user?


9. How Instagram-Like Systems Use Sharding

Imagine:

500 Million Users
Enter fullscreen mode Exit fullscreen mode

Storing everything in one database becomes unrealistic.

Instead:

Users      → Multiple Shards
Posts      → Multiple Shards
Comments   → Multiple Shards
Messages   → Multiple Shards
Enter fullscreen mode Exit fullscreen mode

Each shard owns a subset of data.

This allows the platform to grow far beyond a single machine.


10. The Biggest Challenges of Sharding

Sharding sounds amazing.

But it creates new problems.


Cross-Shard Queries

Suppose:

User A → Shard A
User B → Shard C
Enter fullscreen mode Exit fullscreen mode

Now you need data from both.

The application must query multiple databases.


Joins Become Difficult

Traditional SQL joins work best within one database.

Across shards:

JOINs become expensive
Enter fullscreen mode Exit fullscreen mode

Many systems avoid them entirely.


Operational Complexity

Now instead of managing:

1 Database
Enter fullscreen mode Exit fullscreen mode

you manage:

10 Databases
Enter fullscreen mode Exit fullscreen mode

or

100 Databases
Enter fullscreen mode Exit fullscreen mode

11. Rebalancing and Resharding

What happens when:

Shard A = 90% full
Shard B = 20% full
Enter fullscreen mode Exit fullscreen mode

You need to move data.

This process is called:

Resharding

And it can be one of the hardest parts of operating large systems.


12. When You Should NOT Shard

Many developers discover sharding and immediately want it.

Don't.

Avoid sharding if:

  • database is still small
  • indexing solves performance issues
  • read replicas solve scaling
  • traffic is moderate

Sharding introduces significant complexity.


13. Real Companies Using Sharding

Large-scale systems often rely on sharding:

  • Instagram
  • Uber
  • Netflix
  • Pinterest
  • Discord

At massive scale, a single database rarely remains enough.


14. Final Thought

Database sharding is one of the most powerful scaling techniques in software engineering.

It allows systems to grow from:

Thousands of users
Enter fullscreen mode Exit fullscreen mode

to:

Millions or even billions of users
Enter fullscreen mode Exit fullscreen mode

But it comes with trade-offs:

✅ Better scalability
✅ Better distribution of load
✅ More storage capacity

❌ More complexity
❌ Harder queries
❌ Challenging maintenance

That's why experienced engineers usually follow this rule:

Exhaust simpler solutions first.

Use indexing.

Use caching.

Use read replicas.

And only when a single database truly becomes the bottleneck...

Reach for sharding.

Because once you shard, there's usually no going back.

Top comments (0)