Applications can go from zero to millions of users overnight. When that happens, it's critical that your applications can quickly dial resources up and down to meet demand. In other words, if you are seeing this pattern in your applications, it's time to scale your database according to traffic. You can usually handle a flood of requests by adding more hardware resources, but doing so quickly drives up costs.
The need to scale a database is inevitable when an application grows, but how should you do it? In this guide, we look at how you can build a scalable database for your application.
Database scalability is the ability of a database to expand or contract its compute resources according to the changing demands of an application. When your database can't scale quickly enough, it can lead to serious application issues. In social gaming, for example, players quickly notice inadequate infrastructure scale as latency. Take the case of EA's The Simpsons: Tapped Out. After an incredibly successful launch, it was trending at #2 on the Apple App Store. Then, four days later, it was removed from the App Store because the database couldn't handle the growing number of users.
Let's look at two ways databases are normally scaled: vertical and horizontal scaling.
Vertical scaling (scaling up) increases the performance of individual database servers in the cluster. This can be achieved by adding beefier computing resources, such as CPU and memory, to the server. Relational and non-relational databases can be scaled vertically, but vertical scaling is ideal for relational databases.
The main advantage of vertical scaling is that you do not have to change your application code. Instead, you upgrade your server's resource capacity to meet scaling requirements. The drawback is that a machine can only be scaled vertically as far as its hardware allows. Beyond this limit, you typically need to take the machine offline to install new hardware. In addition, hardware can be expensive, and migrations can be difficult.
In horizontal database scaling (scaling out), more machines are added to spread or partition data. Each server is only responsible for a subset of the data (or data shard) that it stores. So whenever an app requests data, the request needs to be redirected to the server that hosts the corresponding data shard.
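The routing step described above can be sketched in a few lines. This is a minimal illustration, not a production design: the shard names, the `shard_for` function, and the in-memory `ShardedStore` class are all hypothetical, and it assumes simple hash-modulo placement, which is only one of several possible sharding schemes.

```python
import hashlib

# Hypothetical shard names; in a real deployment these would be database servers.
SHARDS = ["db-shard-0", "db-shard-1", "db-shard-2"]


def shard_for(key, shards=SHARDS):
    """Map a sharding key to the name of the server responsible for it."""
    # Use a stable hash rather than Python's built-in hash(), which is salted
    # per process, so every application instance routes a key the same way.
    digest = hashlib.sha256(key.encode("utf-8")).digest()
    index = int.from_bytes(digest[:8], "big") % len(shards)
    return shards[index]


class ShardedStore:
    """In-memory stand-in for a set of database servers, one dict per shard."""

    def __init__(self, shards=SHARDS):
        self.shards = list(shards)
        self.data = {name: {} for name in self.shards}

    def put(self, key, value):
        # Writes go only to the shard that owns the key.
        self.data[shard_for(key, self.shards)][key] = value

    def get(self, key):
        # Reads are redirected to the same shard the write went to.
        return self.data[shard_for(key, self.shards)].get(key)
```

With this scheme, a call such as `store.put("user:42", {...})` followed by `store.get("user:42")` touches exactly one shard. Note that plain modulo placement remaps most keys when the number of shards changes; techniques such as consistent hashing are commonly used to reduce that movement.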
Horizontal scaling offers two main advantages:
- No downtime - During scaling, you won't have to turn off the old machine because you are adding a new machine.
- Increased performance - With more servers, you can spread your data and queries across them. This design can better serve large volumes of data because each server can index and serve its portion of the data in parallel with the others.
It also comes with several drawbacks:
- Complexity of maintenance and operation - Multiple servers require more effort to manage than a single one. For the load to be distributed evenly, you will probably need to put in a load balancer. As with many distributed systems, you may also need coordination software such as Apache ZooKeeper to handle resiliency at scale and synchronization across servers. Managing and running ZooKeeper can take up valuable IT and engineering resources if you don't have the right expertise.
- Hard to choose the right sharding key - Without fully understanding your data patterns, picking the right sharding key can be challenging. If the wrong sharding scheme is chosen, query requests can become unevenly distributed across the servers, easily overburdening some data partitions.
- Joining data can become complex - Partitioning data across multiple servers means that joins have to be done across servers. In the absence of native database support, join logic must be implemented in the application code, which can be inefficient for large datasets.
Non-relational databases are better suited to horizontal scaling since they store data in self-contained objects like key-value pairs and JSON documents. In contrast, horizontal scaling can be difficult to implement for a relational database, because any query that relates rows stored on different servers has to join data across those servers.
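To make the cost of cross-server joins concrete, here is a minimal sketch of join logic implemented in application code. The data, the shard layout, and the `join_users_orders` function are hypothetical; each list element stands in for the result of querying one database server.

```python
# Hypothetical data: users and orders are partitioned across different shards.
user_shards = [
    {1: {"name": "Ada"}, 3: {"name": "Grace"}},
    {2: {"name": "Linus"}},
]
order_shards = [
    [{"order_id": 100, "user_id": 2, "total": 30}],
    [
        {"order_id": 101, "user_id": 1, "total": 15},
        {"order_id": 102, "user_id": 1, "total": 20},
    ],
]


def join_users_orders(user_shards, order_shards):
    """Inner join of orders to users, done in the application as a hash join."""
    # Build phase: one request per user shard, collecting rows into a lookup table.
    users = {}
    for shard in user_shards:
        users.update(shard)

    # Probe phase: one request per order shard, matching each order to its user.
    joined = []
    for shard in order_shards:
        for order in shard:
            user = users.get(order["user_id"])
            if user is not None:  # inner join: drop orders with no matching user
                joined.append({**order, "name": user["name"]})
    return joined
```

Even in this small sketch, the application has to contact every shard and hold all user rows in memory before it can produce a single joined row, which is where the inefficiency for large datasets comes from.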
Managing a database is ultimately about giving users a better experience and keeping business operations running smoothly. For that reason, a serverless approach to managing data is becoming increasingly popular. Serverless databases are delivered as a cloud API and can scale automatically behind the scenes as needed. Since the service provider manages the database service, you don't have to worry about provisioning, maintaining, and scaling it. Serverless databases provide many competitive advantages, which is why they are being used in a growing number of software development projects today. Are you looking for a serverless database that can scale to meet the needs of your application?
Fauna is a flexible, developer-friendly, transactional database delivered as a secure and scalable cloud API with native GraphQL. It offers traditional transactional ACID guarantees along with the performance and flexibility of NoSQL databases. Fauna is a serverless cloud database that dynamically adjusts capacity, so you never run out of storage and pay only for what you use.