DEV Community

JosephAkayesi

Scale from Zero to Millions of Users

By Joseph Akayesi

Originally published on Medium: https://medium.com/@josephakayesi/scale-from-zero-to-millions-of-users-3e91daa771d9

Designing systems to support millions of users can be challenging. It requires refinement and continuous improvement. In this article, we’ll walk through how to design a system that scales from zero users to millions of users.


Single Server Setup

At the beginning, a single-server setup is enough to serve a small number of users. Every component required to service requests exists on this single server, including:

  • Web server
  • Database
  • Cache
  • Other supporting services

Request Flow in a Single Server Setup

  1. A user requests your website by entering a URL that contains your domain name.
  2. DNS resolves the domain name (not the full URL) to the IP address of your server.
  3. The user sends a request to the web server using that IP address.
  4. The web server processes the request and responds with an HTML page.
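
The flow above can be sketched in a few lines of Python. This is only an illustration: the `DNS_RECORDS` table, the `203.0.113.10` address, and the function names are assumptions made for the example, and a real browser would query actual DNS servers and open a network connection.

```python
from urllib.parse import urlsplit

# Hypothetical DNS table; a real resolver would query DNS servers instead.
DNS_RECORDS = {"www.example.com": "203.0.113.10"}


def resolve(hostname: str) -> str:
    """Step 2: translate a domain name into an IP address."""
    return DNS_RECORDS[hostname]


def handle_request(path: str) -> str:
    """Step 4: the single server builds and returns an HTML page."""
    return f"<html><body>You requested {path}</body></html>"


def visit(url: str) -> tuple[str, str]:
    parts = urlsplit(url)                      # Step 1: the user enters a URL
    ip_address = resolve(parts.hostname)       # Step 2: only the hostname goes to DNS
    page = handle_request(parts.path or "/")   # Steps 3-4: request sent, page returned
    return ip_address, page
```

Note that only the hostname part of the URL takes part in DNS resolution; the path is interpreted by the web server.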

This setup works well initially but quickly becomes insufficient as traffic grows.


Separating the Database Layer

A single-server setup cannot handle increasing traffic efficiently. As the user base grows, the web application layer must be separated from the database layer so both can scale independently.


Which Databases Should You Use?

There are generally two main types of databases:

Relational Databases

Relational databases maintain strict referential integrity. They enforce relationships between tables and organize data into rows and columns.

Popular relational databases include:

  • MySQL
  • PostgreSQL
  • Oracle Database
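
To make referential integrity concrete, here is a minimal sketch using SQLite through Python's built-in `sqlite3` module. SQLite is used only as a convenient stand-in for the databases listed above, and the `users` and `orders` tables are made up for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE users (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    "CREATE TABLE orders ("
    " id INTEGER PRIMARY KEY,"
    " user_id INTEGER NOT NULL REFERENCES users(id),"
    " item TEXT NOT NULL)"
)
conn.execute("INSERT INTO users (id, name) VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders (user_id, item) VALUES (1, 'keyboard')")  # valid reference


def insert_orphan_order() -> bool:
    """Return True if the database rejects an order for a user that does not exist."""
    try:
        conn.execute("INSERT INTO orders (user_id, item) VALUES (99, 'mouse')")
    except sqlite3.IntegrityError:
        return True
    return False
```

The second insert is refused by the database itself, so the application cannot accidentally create an order that points to a missing user.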

Non-Relational (NoSQL) Databases

Non-relational databases generally do not enforce referential integrity or a fixed schema. They are designed to store unstructured or semi-structured data, often as documents or key-value pairs.

Popular non-relational databases include:

  • MongoDB
  • DynamoDB
  • CouchDB

Vertical Scaling vs Horizontal Scaling

As your user base grows, your system must handle increased traffic. There are two primary scaling strategies:

Vertical Scaling

Vertical scaling increases the capacity of a single server by adding more resources such as:

  • CPU
  • RAM
  • Storage

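Vertical scaling is simple and works well for low to moderate traffic, but a single machine can only be upgraded so far, and it provides no redundancy: if that server goes down, the whole application goes down with it.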

Horizontal Scaling

Horizontal scaling adds more servers to form a cluster. This approach allows the system to scale beyond the limits of a single machine.


Load Balancer

When many users access your system concurrently, servers can become overloaded. A load balancer solves this problem.

A load balancer sits between clients and servers and distributes incoming requests to the next available server.
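
As a rough illustration, the sketch below picks servers in round-robin order and skips any that are marked as down. Round-robin is just one possible strategy assumed here, and the class, method, and server names are hypothetical.

```python
class RoundRobinBalancer:
    """Hands out servers in rotation, skipping ones marked unhealthy."""

    def __init__(self, servers: list[str]) -> None:
        if not servers:
            raise ValueError("at least one server is required")
        self._servers = list(servers)
        self._healthy = set(self._servers)
        self._next = 0

    def mark_down(self, server: str) -> None:
        self._healthy.discard(server)

    def mark_up(self, server: str) -> None:
        if server in self._servers:
            self._healthy.add(server)

    def pick(self) -> str:
        # Try each server at most once per call, starting from the rotation pointer.
        for _ in range(len(self._servers)):
            server = self._servers[self._next]
            self._next = (self._next + 1) % len(self._servers)
            if server in self._healthy:
                return server
        raise RuntimeError("no healthy servers available")
```

Production load balancers usually detect unhealthy servers through periodic health checks rather than manual `mark_down` calls.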

Types of Load Balancers

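  • Application Load Balancer (ALB): routes at the application layer (Layer 7) using HTTP details such as the path or headers
  • Network Load Balancer (NLB): routes at the transport layer (Layer 4) using IP addresses and TCP/UDP ports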

With a load balancer, the web tier becomes highly available, as requests can be routed across multiple servers.


What About the Data Tier?

While the web tier may now be redundant, the data tier often still consists of a single database, making it a single point of failure.

This problem is addressed using database replication.


Database Replication

Database replication distributes copies of data across multiple machines.

Advantages of Database Replication

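  • Better performance: read queries are spread across multiple replicas
  • High availability: the system keeps serving data if one database server goes offline
  • Improved reliability: data survives the loss of a single machine because copies exist elsewhere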

Typical Request Flow with Replication

  • A user gets the load balancer IP from DNS.
  • The user connects to the load balancer.
  • The request is routed to one of the web servers.
  • Read operations go to replica (slave) databases.
  • Write, update, and delete operations go to the master database.
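
The read/write split at the end of this flow might look like the sketch below. Plain dictionaries stand in for real database servers, the class name is hypothetical, and replication is applied immediately for simplicity, whereas real replicas often lag slightly behind.

```python
import random
from typing import Optional


class ReplicatedDatabase:
    """Routes writes to the master and reads to a randomly chosen replica."""

    def __init__(self, replica_count: int = 2) -> None:
        self.master: dict[str, str] = {}
        self.replicas: list[dict[str, str]] = [{} for _ in range(replica_count)]

    def write(self, key: str, value: str) -> None:
        self.master[key] = value
        # Simplification: copy to replicas right away instead of asynchronously.
        for replica in self.replicas:
            replica[key] = value

    def delete(self, key: str) -> None:
        self.master.pop(key, None)
        for replica in self.replicas:
            replica.pop(key, None)

    def read(self, key: str) -> Optional[str]:
        # Reads never touch the master, which keeps it free for writes.
        return random.choice(self.replicas).get(key)
```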

Cache

A typical web application consists of a web tier and a data tier. To improve performance and reduce latency, we introduce a cache.

A cache stores frequently accessed or expensive-to-compute data in memory for fast retrieval.

Popular caching systems include:

  • Redis
  • Memcached
  • Valkey

Cache Tier

The cache tier sits between the web servers and the database.

  • Frequently accessed data is stored in the cache
  • Subsequent requests are served directly from the cache
  • Database load is significantly reduced

Common Caching Strategies

  • Read-through
  • Write-around
  • Write-back

Caches typically store data as key-value pairs and support expiration times.
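
Here is a minimal sketch of a read-through cache with expiration. The `load_from_db` callback stands in for a real database query, and the class and parameter names are assumptions for this example; a production system would more likely rely on one of the caching systems listed earlier.

```python
import time
from typing import Callable, Optional


class ReadThroughCache:
    """Serves values from memory and loads them from the database on a miss."""

    def __init__(self, load_from_db: Callable[[str], str], ttl_seconds: float) -> None:
        self._load = load_from_db
        self._ttl = ttl_seconds
        self._entries: dict[str, tuple[str, float]] = {}  # key -> (value, expiry time)

    def get(self, key: str, now: Optional[float] = None) -> str:
        # `now` can be passed in to make expiry easy to test.
        now = time.monotonic() if now is None else now
        entry = self._entries.get(key)
        if entry is not None and entry[1] > now:
            return entry[0]                     # cache hit: no database call
        value = self._load(key)                 # miss or expired: go to the database
        self._entries[key] = (value, now + self._ttl)
        return value
```

With a 10-second TTL, repeated reads of the same key within that window reach the database only once.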


Cache Considerations

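  • Cache only data that can be lost and rebuilt; caches usually live in volatile memory, so data that must persist belongs in the database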
  • Maintain consistency between cache and database
  • Use expiration policies to avoid stale data
  • Avoid single points of failure by using cache clusters
  • Choose appropriate eviction policies (LRU, LFU, MRU)
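
To show what an eviction policy does, the sketch below implements LRU (least recently used) eviction with `collections.OrderedDict`. The class name and the tiny capacity are chosen only for illustration.

```python
from collections import OrderedDict
from typing import Optional


class LRUCache:
    """Evicts the least recently used entry once capacity is exceeded."""

    def __init__(self, capacity: int) -> None:
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._capacity = capacity
        self._items: "OrderedDict[str, str]" = OrderedDict()

    def get(self, key: str) -> Optional[str]:
        if key not in self._items:
            return None
        self._items.move_to_end(key)        # mark as most recently used
        return self._items[key]

    def put(self, key: str, value: str) -> None:
        self._items[key] = value
        self._items.move_to_end(key)
        if len(self._items) > self._capacity:
            self._items.popitem(last=False)  # drop the least recently used entry
```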

Content Delivery Network (CDN)

A Content Delivery Network (CDN) is a globally distributed network of servers that serve static content, such as:

  • HTML files
  • Images and videos
  • JavaScript and CSS files

CDNs store copies of static content closer to users, reducing latency and improving performance.


CDN Considerations

  • CDNs are typically billed by the amount of data transferred, so caching rarely requested assets adds cost with little benefit
  • Cache expiration must be carefully configured
  • Always provide a fallback in case the CDN fails
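
The fallback idea can be sketched as follows. The fetch functions are hypothetical stand-ins passed in by the caller, and the sketch assumes that a CDN outage surfaces as an `OSError` (which covers `ConnectionError` and `TimeoutError` in Python).

```python
from typing import Callable

Fetcher = Callable[[str], bytes]


def fetch_asset(path: str, fetch_from_cdn: Fetcher, fetch_from_origin: Fetcher) -> bytes:
    """Try the CDN first and fall back to the origin server if it is unreachable."""
    try:
        return fetch_from_cdn(path)
    except OSError:
        # Assumption: CDN failures show up as network-level errors.
        return fetch_from_origin(path)


# Stand-in fetchers for demonstration; real ones would make HTTP requests.
def broken_cdn(path: str) -> bytes:
    raise ConnectionError("CDN unreachable")


def origin(path: str) -> bytes:
    return b"origin:" + path.encode("utf-8")
```

In a browser, the equivalent fallback is usually handled client-side by retrying the asset from the origin URL.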

Stateless Web Tier

Stateful Architecture

In a stateful architecture, user session data is stored on the server. This forces users to connect to the same server throughout their session, which limits scalability.

Stateless Architecture

In a stateless architecture:

  • User session data is stored externally (database or cache)
  • Any server in the cluster can handle any request
  • Scalability and fault tolerance improve significantly
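
A small sketch of the stateless approach: two web servers share one external session store, so a session created on one server is readable from the other. The `SessionStore` class is an in-memory stand-in for a shared database or cache, and all names are hypothetical.

```python
import uuid


class SessionStore:
    """In-memory stand-in for an external store shared by all web servers."""

    def __init__(self) -> None:
        self._data: dict[str, dict[str, str]] = {}

    def save(self, session_id: str, data: dict[str, str]) -> None:
        self._data[session_id] = dict(data)

    def load(self, session_id: str) -> dict[str, str]:
        return dict(self._data.get(session_id, {}))


class WebServer:
    """Keeps no session data of its own; everything lives in the shared store."""

    def __init__(self, name: str, sessions: SessionStore) -> None:
        self.name = name
        self._sessions = sessions

    def login(self, username: str) -> str:
        session_id = uuid.uuid4().hex
        self._sessions.save(session_id, {"user": username})
        return session_id

    def whoami(self, session_id: str) -> str:
        session = self._sessions.load(session_id)
        return f"{self.name}: {session.get('user', 'anonymous')}"
```

Because neither server holds state, the load balancer is free to send each request to whichever server is available.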

Data Centers

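Large-scale systems often operate across multiple data centers. Users are normally routed to the closest data center, and if one goes offline, traffic fails over automatically to a healthy one, which keeps the system globally available.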


Message Queues

Message queues enable asynchronous processing and further scalability.

They support event-driven architectures, where:

  • Producers publish events
  • Consumers process events asynchronously

This decouples services and improves reliability, especially for long-running or heavy operations.
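
The producer/consumer pattern can be illustrated with Python's standard `queue` and `threading` modules. This is an in-process sketch only; a real deployment would use a dedicated message broker, and the event names are made up.

```python
import queue
import threading


def run_pipeline(events: list[str]) -> list[str]:
    """Publish events to a queue and let a background consumer process them."""
    tasks: queue.Queue = queue.Queue()
    processed: list[str] = []

    def consumer() -> None:
        while True:
            event = tasks.get()
            if event is None:          # sentinel: the producer is finished
                break
            processed.append(f"processed {event}")

    worker = threading.Thread(target=consumer)
    worker.start()

    for event in events:               # the producer does not wait for processing
        tasks.put(event)
    tasks.put(None)

    worker.join()
    return processed
```

The producer only needs to know about the queue, not about how or when the consumer does its work, which is what decouples the two sides.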


Logs, Metrics, and Automation

As systems grow, observability becomes critical.

  • Logs help debug issues
  • Metrics provide insight into performance and capacity
  • Automation (for example, continuous integration and automated deployments) reduces manual error and keeps releases consistent

These are essential in large-scale systems with many moving parts.
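
As a small illustration of logs and metrics working together, the decorator below records call counts and timing for any function it wraps. The `observed` decorator, the `request_counts` counter, and the `divide` function are all hypothetical; real systems usually ship this data to dedicated logging and monitoring tools.

```python
import logging
import time
from collections import Counter
from functools import wraps

logger = logging.getLogger("app")
request_counts: Counter = Counter()  # simple in-process metric store


def observed(func):
    """Log the duration of each call and count successes and errors."""

    @wraps(func)
    def wrapper(*args, **kwargs):
        start = time.perf_counter()
        try:
            result = func(*args, **kwargs)
            request_counts[f"{func.__name__}.success"] += 1
            return result
        except Exception:
            request_counts[f"{func.__name__}.error"] += 1
            logger.exception("%s failed", func.__name__)
            raise
        finally:
            elapsed_ms = (time.perf_counter() - start) * 1000
            logger.info("%s took %.2f ms", func.__name__, elapsed_ms)

    return wrapper


@observed
def divide(a: float, b: float) -> float:
    return a / b
```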


Database Scaling

As data volume grows, the database layer must also scale.

Vertical Database Scaling

  • Add more CPU, memory, and storage
  • Simple but limited by hardware constraints

Horizontal Database Scaling

  • Add more database instances
  • Distribute data across instances
  • Scales beyond the capacity limits of any single machine

Sharding

Horizontal database scaling is commonly achieved through sharding.

Sharding breaks data into smaller partitions distributed across multiple database instances.

Requests are routed to the correct shard using techniques such as consistent hashing.
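
Below is a minimal sketch of a consistent hash ring. The shard names, the use of SHA-256, and the choice of 100 virtual nodes per shard are assumptions made for the example, not a recommendation for any particular database.

```python
import bisect
import hashlib


def _hash(value: str) -> int:
    """Map a string to a position on the ring."""
    return int(hashlib.sha256(value.encode("utf-8")).hexdigest(), 16)


class ConsistentHashRing:
    """Maps keys to shards so that adding a shard moves only some of the keys."""

    def __init__(self, shards: list[str], virtual_nodes: int = 100) -> None:
        if not shards:
            raise ValueError("at least one shard is required")
        # Each shard is placed on the ring many times to spread keys more evenly.
        ring = [
            (_hash(f"{shard}#{i}"), shard)
            for shard in shards
            for i in range(virtual_nodes)
        ]
        ring.sort()
        self._points = [point for point, _ in ring]
        self._shards = [shard for _, shard in ring]

    def shard_for(self, key: str) -> str:
        # Walk clockwise to the first ring position after the key's hash.
        index = bisect.bisect_right(self._points, _hash(key)) % len(self._points)
        return self._shards[index]
```

When a new shard is added to the ring, a key either stays where it was or moves to the new shard; it never moves between two existing shards, which limits how much data has to be copied.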


Drawbacks of Sharding

While powerful, sharding introduces complexity:

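  • Resharding data is difficult: when a shard fills up or data is unevenly distributed, data must be moved between shards
  • Celebrity (hot key) problem: a few heavily accessed keys can overload a single shard
  • Joins across shards are hard to perform, so data often has to be denormalized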

Careful planning is required before adopting sharding.


Conclusion

Scaling from zero to millions of users is an incremental journey. By evolving your architecture step by step and applying the right techniques at the right time, you can build systems that are scalable, resilient, and performant.
