JosephAkayesi

Posted on Jan 19

Scale from Zero to Millions of Users

#systemdesign

By Joseph Akayesi

Originally published on Medium: https://medium.com/@josephakayesi/scale-from-zero-to-millions-of-users-3e91daa771d9

Designing systems to support millions of users can be challenging. It requires refinement and continuous improvement. In this article, we’ll walk through how to design a system that scales from zero users to millions of users.

Single Server Setup

At the beginning, a single-server setup is enough to serve a small number of users. Every component required to service requests exists on this single server, including:

Web server
Database
Cache
Other supporting services

Request Flow in a Single Server Setup

A user requests your website by its domain name, typically by entering a URL in the browser.
DNS resolves the domain name in the URL to the IP address of the web server.
The browser sends an HTTP request to the web server at that IP address.
The web server processes the request and responds with an HTML page.
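
The first two steps of this flow can be sketched in a few lines of Python. This is only an illustration, assuming the standard library's system resolver is used; the helper names host_from_url and resolve are hypothetical and not part of any framework.

```python
import socket
from urllib.parse import urlsplit


def host_from_url(url: str) -> str:
    """Extract the host name that DNS will be asked to resolve."""
    host = urlsplit(url).hostname
    if host is None:
        raise ValueError(f"no host in URL: {url!r}")
    return host


def resolve(host: str, port: int = 80) -> list[str]:
    """Return the distinct IP addresses the system resolver reports for host."""
    infos = socket.getaddrinfo(host, port, type=socket.SOCK_STREAM)
    # Each entry is (family, type, proto, canonname, sockaddr);
    # sockaddr[0] holds the IP address as a string.
    return sorted({info[4][0] for info in infos})
```

Calling resolve(host_from_url(url)) gives the address or addresses the browser would then connect to. Note that DNS only sees the host name, never the path or query string of the URL.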

This setup works well initially but quickly becomes insufficient as traffic grows.

Separating the Database Layer

A single-server setup cannot handle increasing traffic efficiently. As the user base grows, the web application layer must be separated from the database layer so both can scale independently.

Which Databases Should You Use?

There are generally two main types of databases:

Relational Databases

Relational databases organize data into tables of rows and columns and can enforce relationships between tables through foreign-key constraints (referential integrity). They are queried with SQL and support joins across tables.

Popular relational databases include:

MySQL
PostgreSQL
Oracle Database

Non-Relational (NoSQL) Databases

Non-relational databases generally do not enforce relationships between records. They are designed to store unstructured or semi-structured data, often as documents or key-value pairs.

Popular non-relational databases include:

MongoDB
DynamoDB
CouchDB

Vertical Scaling vs Horizontal Scaling

As your user base grows, your system must handle increased traffic. There are two primary scaling strategies:

Vertical Scaling

Vertical scaling increases the capacity of a single server by adding more resources such as:

CPU
RAM
Storage

Vertical scaling is simple and works well for low to moderate traffic, but a single machine can only be upgraded so far, and it offers no redundancy if that machine fails.

Horizontal Scaling

Horizontal scaling adds more servers to form a cluster. This approach allows the system to scale beyond the limits of a single machine.

Load Balancer

When many users access your system concurrently, servers can become overloaded. A load balancer solves this problem.

A load balancer sits between clients and servers and distributes incoming requests to the next available server.
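
As a rough illustration of "the next available server", here is a minimal round-robin sketch. It assumes health is tracked by explicit mark_down and mark_up calls; the class name RoundRobinBalancer and the server names are hypothetical, and real load balancers use active health checks and richer algorithms.

```python
from itertools import cycle


class RoundRobinBalancer:
    """Hand out servers in rotation, skipping ones marked unhealthy."""

    def __init__(self, servers):
        self._servers = list(servers)
        if not self._servers:
            raise ValueError("at least one server is required")
        self._healthy = set(self._servers)
        self._ring = cycle(self._servers)

    def mark_down(self, server):
        self._healthy.discard(server)

    def mark_up(self, server):
        if server in self._servers:
            self._healthy.add(server)

    def next_server(self):
        # Inspect each server at most once per call so we never loop forever.
        for _ in range(len(self._servers)):
            candidate = next(self._ring)
            if candidate in self._healthy:
                return candidate
        raise RuntimeError("no healthy servers available")
```

With three servers the balancer cycles through them in order; once one is marked down, requests simply flow to the remaining two.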

Types of Load Balancers

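Application Load Balancer (ALB): operates at layer 7 and routes on HTTP details such as host or path
Network Load Balancer (NLB): operates at layer 4 and routes on TCP/UDP connection details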

With a load balancer, the web tier becomes highly available, as requests can be routed across multiple servers.

What About the Data Tier?

While the web tier may now be redundant, the data tier often still consists of a single database, making it a single point of failure.

This problem is addressed using database replication.

Database Replication

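Database replication keeps copies of the same data on multiple machines, typically one master that accepts writes and one or more replicas (slaves) that receive those changes and serve reads.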

Advantages of Database Replication

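Better performance: reads are spread across replicas and can run in parallel
High availability: the system keeps serving requests if one database server goes offline
Improved reliability: data survives the loss of a single machine because other copies exist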

Typical Request Flow with Replication

A user gets the load balancer IP from DNS.
The user connects to the load balancer.
The request is routed to one of the web servers.
Read operations go to replica (slave) databases.
Write, update, and delete operations go to the master database.
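
The read/write split in the last two steps could be sketched as follows. This is a simplified illustration, assuming statements are classified by their first SQL keyword; ReplicatedRouter and the connection names are hypothetical. Anything that is not a SELECT is sent to the master, which is the conservative choice.

```python
import itertools


class ReplicatedRouter:
    """Send SELECT statements to replicas and everything else to the master."""

    def __init__(self, master, replicas):
        self.master = master
        replicas = list(replicas)
        # Rotate through replicas so reads are spread evenly.
        self._replicas = itertools.cycle(replicas) if replicas else None

    def route(self, sql: str):
        words = sql.split(None, 1)
        if not words:
            raise ValueError("empty SQL statement")
        is_read = words[0].upper() == "SELECT"
        if is_read and self._replicas is not None:
            return next(self._replicas)
        # Writes, and reads when no replica exists, go to the master.
        return self.master
```

One caveat this sketch ignores: replicas may lag slightly behind the master, so a read issued immediately after a write can return stale data.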

Cache

A typical web application consists of a web tier and a data tier. To improve performance and reduce latency, we introduce a cache.

A cache stores frequently accessed or expensive-to-compute data in memory for fast retrieval.

Popular caching systems include:

Redis
Memcached
Valkey

Cache Tier

The cache tier sits between the web servers and the database.

Frequently accessed data is stored in the cache
Subsequent requests are served directly from the cache
Database load is significantly reduced

Common Caching Strategies

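Read-through: on a cache miss, the value is loaded from the database, stored in the cache, and returned
Write-around: writes go straight to the database and bypass the cache
Write-back: writes go to the cache first and are flushed to the database later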

Caches typically store data as key-value pairs and support expiration times.
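
To make the read-through idea and expiration times concrete, here is a minimal in-process sketch. It assumes a loader function that stands in for the database query; ReadThroughCache, loader, and ttl_seconds are hypothetical names, and a production system would use a cache server such as Redis or Memcached instead of a Python dict.

```python
import time


class ReadThroughCache:
    """Key-value cache that loads missing or expired entries via loader."""

    def __init__(self, loader, ttl_seconds, clock=time.monotonic):
        self._loader = loader      # called on a miss, e.g. a database query
        self._ttl = ttl_seconds
        self._clock = clock        # injectable so expiry can be tested
        self._entries = {}         # key -> (value, expires_at)

    def get(self, key):
        now = self._clock()
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]        # cache hit, still fresh
        value = self._loader(key)  # miss or expired: go to the source
        self._entries[key] = (value, now + self._ttl)
        return value

    def invalidate(self, key):
        """Drop a key, for example after the underlying row changes."""
        self._entries.pop(key, None)
```

The TTL is the main tuning knob: too short and the database is hit again too often, too long and readers may see stale values.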

Cache Considerations

Do not rely on the cache for data that must persist: cache memory is volatile, so a restart loses its contents
Maintain consistency between cache and database
Use expiration policies to avoid stale data
Avoid single points of failure by using cache clusters
Choose appropriate eviction policies (LRU, LFU, MRU)
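
As an example of an eviction policy, the following sketch implements least-recently-used (LRU) eviction with the standard library's OrderedDict. The class name LRUCache and its capacity parameter are illustrative; cache servers implement eviction internally, often with approximations of this behavior.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently used key when full."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self._capacity = capacity
        self._items = OrderedDict()  # oldest access first, newest last

    def get(self, key, default=None):
        if key not in self._items:
            return default
        self._items.move_to_end(key)  # reading counts as a recent use
        return self._items[key]

    def put(self, key, value):
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used
```

LFU would instead track how often each key is used, and MRU would evict the most recently used key; which one fits depends on the access pattern.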

Content Delivery Network (CDN)

A Content Delivery Network (CDN) is a globally distributed network of servers that serve static content, such as:

HTML files
Images and videos
JavaScript and CSS files

CDNs store copies of static content closer to users, reducing latency and improving performance.

CDN Considerations

CDNs cost money, typically billed by data transfer, so serving rarely used assets through them adds cost with little benefit
Cache expiration must be carefully configured
Always provide a fallback in case the CDN fails
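
One way to picture the fallback point is a fetch helper that tries the CDN first and falls back to the origin server on a network failure. This is a sketch under assumptions: fetch_from_cdn and fetch_from_origin are hypothetical callables supplied by the caller, and in practice the fallback often lives in the client or in DNS rather than in application code.

```python
def fetch_asset(path, fetch_from_cdn, fetch_from_origin):
    """Return the asset from the CDN, or from the origin if the CDN fails."""
    try:
        return fetch_from_cdn(path)
    except OSError:
        # Timeouts, refused connections, and urllib's URLError are all
        # OSError subclasses, so this covers typical network failures.
        return fetch_from_origin(path)
```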

Stateless Web Tier

Stateful Architecture

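In a stateful architecture, user session data is stored on the web server itself. Every request from a user must then be routed to the same server for the whole session (usually through sticky sessions on the load balancer), which makes it harder to add or remove servers and to recover from failures.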

Stateless Architecture

In a stateless architecture:

User session data is stored externally (database or cache)
Any server in the cluster can handle any request
Scalability and fault tolerance improve significantly
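
The difference is easy to see in a small sketch where two web servers share one external session store. The in-memory SessionStore below is a stand-in for a shared database or cache, and all class and method names are hypothetical.

```python
import secrets


class SessionStore:
    """Stand-in for a shared store such as a cache or database table."""

    def __init__(self):
        self._data = {}

    def save(self, session_id, data):
        self._data[session_id] = dict(data)

    def load(self, session_id):
        return self._data.get(session_id)


class WebServer:
    """Keeps no session state of its own; everything lives in the store."""

    def __init__(self, name, store):
        self.name = name
        self._store = store

    def login(self, user):
        session_id = secrets.token_hex(16)  # random, hard-to-guess ID
        self._store.save(session_id, {"user": user})
        return session_id

    def handle(self, session_id):
        session = self._store.load(session_id)
        if session is None:
            return f"{self.name}: 401 not logged in"
        return f"{self.name}: hello {session['user']}"
```

A user who logs in through one server can be served by any other server that shares the store, so the load balancer is free to route each request wherever capacity is available.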

Data Centers

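Large-scale systems often operate across multiple data centers to enable automatic failover and global availability. Users are normally routed to the closest data center, and traffic is redirected to a healthy one if a data center goes offline.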

Message Queues

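A message queue is a buffer that lets services communicate asynchronously: producers add messages without waiting for them to be processed, and consumers pick them up when they have capacity. Because the two sides are independent, each can be scaled separately.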

They support event-driven architectures, where:

Producers publish events
Consumers process events asynchronously

This decouples services and improves reliability, especially for long-running or heavy operations.
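
The producer/consumer pattern can be sketched in a single process with the standard library's queue and threading modules. This is only an illustration: run_pipeline and its parameters are hypothetical, and a real deployment would use a separate broker so producers and consumers can run on different machines.

```python
import queue
import threading


def run_pipeline(events, handler, workers=2):
    """Publish events to a queue and process them on worker threads."""
    q = queue.Queue()
    results = []
    lock = threading.Lock()

    def consume():
        while True:
            event = q.get()
            if event is None:        # sentinel: no more work for this worker
                return
            outcome = handler(event)
            with lock:               # results is shared between workers
                results.append(outcome)

    threads = [threading.Thread(target=consume) for _ in range(workers)]
    for thread in threads:
        thread.start()
    for event in events:             # the producer side: publish and move on
        q.put(event)
    for _ in threads:                # one sentinel per worker
        q.put(None)
    for thread in threads:
        thread.join()
    return results
```

Results arrive in whatever order the workers finish, which is typical of asynchronous processing; callers that need ordering have to add it themselves.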

Logs, Metrics, and Automation

As systems grow, observability becomes critical.

Logs help debug issues
Metrics provide insight into performance and capacity
Automation (such as continuous integration and automated deployments) keeps builds and releases reliable and consistent

These are essential in large-scale systems with many moving parts.

Database Scaling

As data volume grows, the database layer must also scale.

Vertical Database Scaling

Add more CPU, memory, and storage
Simple but limited by hardware constraints

Horizontal Database Scaling

Add more database instances
Distribute data across instances
Lets capacity grow by adding instances, well beyond the limits of a single machine

Sharding

Horizontal database scaling is commonly achieved through sharding.

Sharding breaks data into smaller partitions distributed across multiple database instances.

Requests are routed to the correct shard by applying a function to a sharding key (such as a user ID), using techniques such as consistent hashing.
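
A consistent hash ring can be sketched in a few dozen lines. This is one possible implementation, not the only one: the class name HashRing, the use of SHA-256, and the default of 100 virtual nodes per shard are all assumptions made for illustration.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a stable 64-bit position on the ring."""
    digest = hashlib.sha256(value.encode("utf-8")).digest()
    return int.from_bytes(digest[:8], "big")


class HashRing:
    """Consistent hash ring with virtual nodes for smoother distribution."""

    def __init__(self, shards, vnodes=100):
        self._ring = []  # sorted list of (position, shard)
        for shard in shards:
            self.add(shard, vnodes)

    def add(self, shard, vnodes=100):
        for i in range(vnodes):
            bisect.insort(self._ring, (_hash(f"{shard}#{i}"), shard))

    def remove(self, shard):
        self._ring = [(pos, s) for pos, s in self._ring if s != shard]

    def shard_for(self, key: str) -> str:
        if not self._ring:
            raise LookupError("ring has no shards")
        # Find the first ring position at or after the key's hash,
        # wrapping around to the start of the ring if necessary.
        idx = bisect.bisect_left(self._ring, (_hash(key), ""))
        if idx == len(self._ring):
            idx = 0
        return self._ring[idx][1]
```

The useful property is that adding a shard only moves keys onto the new shard; keys never shuffle between existing shards. With a plain hash(key) % N scheme, by contrast, changing N remaps most keys.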

Drawbacks of Sharding

While powerful, sharding introduces complexity:

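Resharding data is difficult: moving data when a shard fills up or shards grow unevenly is costly
Celebrity (hot key) problem: a few very popular keys can overload a single shard
Joins and normalization become harder: queries that span shards are expensive, so data is often denormalized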

Careful planning is required before adopting sharding.

Conclusion

Scaling from zero to millions of users is an incremental journey. By evolving your architecture step by step and applying the right techniques at the right time, you can build systems that are scalable, resilient, and performant.
