Daniel Musembi

Implementing a distributed cache for high-performance web applications

This article takes an in-depth look at distributed caching. It answers the most commonly asked questions about it: What is a distributed cache? What is the point of it? How does it differ structurally from conventional caching? What are distributed caches used for? And which distributed caches are most often used in the industry?

Before diving into distributed caching, it will be helpful to have a quick refresher on caching in general. After all, why do we use a cache in our web applications at all?

Introduction

Distributed caching is a technique that stores data across numerous cache servers, allowing for quicker data access and retrieval. Rather than depending on a single cache server, a distributed cache system spreads data across many cache servers, often situated in different geographic areas. This reduces the latency and bottlenecks that can arise when all data is accessed from a single cache server.

Distributed caching works by storing frequently accessed data in cache servers, allowing future requests for that data to be served quickly from the cache rather than requiring a round trip to the origin server. This helps improve web application performance by lowering response time and reducing load on the origin server.

In a distributed caching system, cache servers are linked together into a network that can communicate and exchange data. When a cache server receives a request for data that is not in its cache, it can route the request to other cache servers in the network until the desired data is located and returned to the client. This keeps data accessible even if an individual cache server fails or goes down.

Overall, distributed caching is a useful strategy for increasing web application speed and scalability, and it is extensively utilized in high-traffic websites and apps.


1. What exactly is caching, and why do we use it in our web applications?

Caching means storing frequently requested data in RAM rather than on a hard disk. Data access from RAM is always faster than data access from disk.

In web applications, caching fulfils the following functions.

For starters, it significantly reduces application latency. Simply put, because the most commonly used data is already in RAM, the application does not need to go to the hard disk when the user requests that data. This improves application response times.


Second, it keeps many requests for data from ever reaching the database, which avoids the problem of a sluggish database. The overall number of database queries is reduced, which improves the application's performance.

Third, caching is typically quite useful in reducing the application's operating expenses.

The database, content delivery network (CDN), domain name system (DNS), and any other component of a web project may all benefit from caching.
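
To make this concrete, here is a minimal sketch of application-level caching in Python. The `fetch_user_from_db` function and the 60-second TTL are hypothetical stand-ins for a real database call and expiry policy.

```python
import time

# A minimal in-memory cache: maps keys to (value, expiry-timestamp) pairs.
_cache = {}
TTL_SECONDS = 60

def fetch_user_from_db(user_id):
    # Hypothetical stand-in for a slow database query.
    time.sleep(0.1)
    return {"id": user_id, "name": f"user-{user_id}"}

def get_user(user_id):
    key = f"user:{user_id}"
    entry = _cache.get(key)
    if entry and entry[1] > time.time():
        return entry[0]                      # cache hit: served from RAM
    value = fetch_user_from_db(user_id)      # cache miss: hit the database
    _cache[key] = (value, time.time() + TTL_SECONDS)
    return value

print(get_user(42))   # slow, populates the cache
print(get_user(42))   # fast, served from the cache
```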

2. How Does Caching Help Bring Application Compute Costs Down?

In-memory storage is inexpensive compared to persistent storage on a hard drive, and database read/write operations are costly. Anyone familiar with DBaaS (Database as a Service) pricing will understand what I mean.

Sometimes an application generates a very high volume of writes, for example an MMOG (massively multiplayer online game) with a large player base.

Low-priority data can be stored in the cache, with persistence to the database handled by scheduled batch processes. This yields substantial savings in database write costs.

This is how I ran my own game in production on Google Cloud; the NoSQL service I used was Datastore. In terms of budget, this strategy was a big win.
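
A rough sketch of that idea, assuming a simple in-memory buffer and a timer-based flush; the function names and the 30-second interval are illustrative, not part of any particular framework.

```python
import threading

# Hypothetical write buffer: low-priority updates accumulate here first.
_pending_writes = []
_lock = threading.Lock()

def record_score(player_id, score):
    # Low-priority data goes to the in-memory buffer, not straight to the database.
    with _lock:
        _pending_writes.append((player_id, score))

def flush_to_datastore():
    # Scheduled batch job: one bulk write instead of many small, costly ones.
    with _lock:
        batch = _pending_writes[:]
        _pending_writes.clear()
    if batch:
        print(f"persisting {len(batch)} rows in a single batch")  # replace with a real bulk insert

def schedule_flush(interval_seconds=30):
    # Re-arms itself so the flush runs periodically.
    flush_to_datastore()
    threading.Timer(interval_seconds, schedule_flush, args=(interval_seconds,)).start()
```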

Now that we have covered the basics of caching, we can go further and talk about distributed caching.

3. What is meant by the term "distributed caching"? What exactly is the point of it?

A distributed cache is a kind of cache in which the data is dispersed among several nodes within a cluster, and across multiple clusters housed in data centers situated in different parts of the world.

Distributed caching is widely used nowadays because it can grow dynamically and provides continuous service.

Large-scale online services rely heavily on scalability, high availability, and fault tolerance. Companies cannot risk having their services interrupted: think of healthcare systems, financial markets, and the armed forces, where downtime is simply not an option. Such services are therefore spread across a network of nodes with a good deal of redundancy.


Cloud computing platforms favour distributed caches and distributed systems for exactly these reasons: scalability and availability.

Google Cloud offers Memcache for caching its data, while internet giants use Redis for caching, as a NoSQL data store, and more.

4. How Does It Differ From Local or Conventional Caching?

Because of how distributed systems are built, everything from computing power to storage space can be added on the fly. The distributed cache is no different in this regard.

A conventional cache is hosted on a limited number of instances, so scaling it on the fly is a major challenge, and it is not as resilient to outages and errors as a distributed cache architecture.


Earlier in this article, I mentioned how cloud computing makes use of distributed systems to grow and keep its services available. The name "cloud" is simply a marketing term; technically speaking, it refers to a widely distributed system of services such as computing and storage running in the background.

5. When Would Distributed Caches Be Useful?

Database Caching

A cache layer in front of a database stores frequently accessed data in memory to reduce latency and unnecessary load on the database. With the cache in place, the database stops being a bottleneck.

Keeping Track of User Sessions

Sessions for users are saved in the cache so that the user state is not lost if any of the instances become unavailable.

If an instance fails, another one will start up, retrieve the user's state from the session cache, and carry on where the previous one left off.
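
Here is a minimal sketch of session caching, using a plain dictionary to stand in for a shared cache such as Redis or Memcached; the session ID, TTL, and field names are made up for illustration.

```python
import json, time

# Stand-in for a shared distributed cache; in production this would be
# something like Redis or Memcached reachable by every app instance.
session_cache = {}

def save_session(session_id, state, ttl=1800):
    session_cache[session_id] = (json.dumps(state), time.time() + ttl)

def load_session(session_id):
    entry = session_cache.get(session_id)
    if entry is None or entry[1] < time.time():
        return None                     # expired or never existed
    return json.loads(entry[0])

# Any instance can pick up where another left off:
save_session("abc123", {"user_id": 7, "cart": ["sku-1", "sku-2"]})
print(load_session("abc123"))
```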

Communication between Different Modules and Shared Storage

In-memory distributed caching is used not only for data storage but also for passing messages between the microservices that work in conjunction with one another.

It stores data that is shared across all of the services and can be accessed by each of them in the same way, acting as a backbone for microservice communication. In certain use cases, a distributed cache is also used as a NoSQL datastore.
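
As an example of message passing through a cache layer, here is a sketch using Redis pub/sub via the redis-py client. It assumes a Redis server is reachable on localhost:6379 and that the `redis` package is installed; the channel name and payload are invented.

```python
import redis  # pip install redis; assumes a Redis server on localhost:6379

r = redis.Redis(host="localhost", port=6379)

# Service B subscribes to a channel (normally this runs in a separate process).
p = r.pubsub()
p.subscribe("orders")

# Service A publishes an event for other services to consume.
r.publish("orders", "order-1001-created")

for message in p.listen():
    if message["type"] == "message":          # skip subscribe confirmations
        print("received:", message["data"])   # b'order-1001-created'
        break
```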

In-memory processing and analytics of data streams

Instead of collecting data intermittently and processing it in batches, in-memory data stream processing is about processing data and running analytics on it as it streams in, in real time.

Anomaly identification, fraud monitoring, real-time online gaming statistics, real-time recommendations, payment processing, and so on all benefit from this.

6. The Design and Architecture of a Distributed Cache and Its Functions

A dedicated, in-depth article on designing a distributed cache is warranted. I'll give you a high-level understanding of how a distributed cache works.

A distributed cache, when stripped down to its parts, is a distributed hash table that maps keys to object values across several nodes.

The distributed hash table allows the cache service to scale dynamically in response to changing demand for as long as the service is running. Distributed hash tables were originally popularized by peer-to-peer networks.
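
To illustrate how keys can be mapped to nodes, here is a small consistent-hashing sketch, one common way to build such a hash table (not necessarily what any particular product uses); the node names are placeholders.

```python
import hashlib
from bisect import bisect, insort

class ConsistentHashRing:
    """Maps cache keys to nodes; adding or removing a node only remaps a fraction of keys."""

    def __init__(self, nodes=(), replicas=100):
        self.replicas = replicas
        self._ring = []            # sorted list of (hash, node) pairs
        for node in nodes:
            self.add_node(node)

    def _hash(self, key):
        return int(hashlib.md5(key.encode()).hexdigest(), 16)

    def add_node(self, node):
        # Each node gets several virtual points on the ring for better balance.
        for i in range(self.replicas):
            insort(self._ring, (self._hash(f"{node}#{i}"), node))

    def get_node(self, key):
        if not self._ring:
            return None
        idx = bisect(self._ring, (self._hash(key),)) % len(self._ring)
        return self._ring[idx][1]

ring = ConsistentHashRing(["cache-1", "cache-2", "cache-3"])
print(ring.get_node("user:42"))   # the node responsible for this key
```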

Caches typically use the Least Recently Used (LRU) policy for data eviction. A linked list is an efficient way to manage the data references: the entries form a queue, and the most recently used data always sits at the front, indicated by the position of the pointer.

Data that is seldom accessed drifts to the back of the queue and is eventually evicted.
A linked list is the best choice of data structure when there will be plenty of additions, deletions, and reorderings, since it outperforms an ArrayList for those operations.
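
A compact LRU sketch: Python's OrderedDict is backed by a doubly linked list, so it mirrors the linked-list-plus-queue approach described above. The capacity and keys are illustrative.

```python
from collections import OrderedDict

class LRUCache:
    # OrderedDict is backed by a doubly linked list, so moving an entry to
    # the front and evicting from the back are both O(1) operations.
    def __init__(self, capacity):
        self.capacity = capacity
        self._data = OrderedDict()

    def get(self, key):
        if key not in self._data:
            return None                           # cache miss
        self._data.move_to_end(key, last=False)   # mark as most recently used
        return self._data[key]

    def put(self, key, value):
        self._data[key] = value
        self._data.move_to_end(key, last=False)   # newest entry goes to the front
        if len(self._data) > self.capacity:
            self._data.popitem(last=True)          # evict the least recently used entry

cache = LRUCache(2)
cache.put("a", 1)
cache.put("b", 2)
cache.get("a")          # "a" is now most recently used
cache.put("c", 3)       # evicts "b"
print(cache.get("b"))   # None
```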

Several ideas, such as eventual consistency, strong consistency, distributed locks, commit logs, and so on, are involved in the topic of concurrency. That's much too much information for this write-up. In the next piece, I will detail my ground-up approach to designing a distributed cache and handle any questions you may have.

7. What Are the Different Distributed Caching Strategies, and How Do They Work?

Caching strategies come in several forms, each optimized for a particular set of circumstances: Cache Aside, Read Through, Write Through, Write Back, Write Around, and Refresh-Ahead.

Let's investigate what these cache types are and why we need them.

Cache Aside

The most common caching strategy is a cache that works in tandem with a database to lessen the latter's workload. Data is lazy-loaded into the cache.

When a user requests a piece of data, the cache is checked first. If the data is already there, it is simply returned from the cache. If it is not, the application fetches the most up-to-date version from the database, returns it to the user, and typically writes it to the cache for next time.

This technique works best for read-intensive workloads on data that does not change very often, such as user profiles on a portal: a user's name, account number, and similar details.

With this strategy, writes go directly to the database without touching the cache, which can cause inconsistencies between the cache and the database.

Cache information is protected by a Time to Live (TTL) setting. The information is removed from the cache when the time limit has passed.

An application can emulate the functionality of read-through caching by implementing the cache-aside strategy. This strategy loads data into the cache on demand. The figure illustrates using the Cache-Aside pattern to store data in the cache.

(Figure: the Cache-Aside pattern)
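
A minimal cache-aside sketch, with a plain dictionary standing in for the distributed cache and hypothetical `query_database`/`write_database` helpers standing in for the real datastore.

```python
cache = {}   # stand-in for a distributed cache client such as Redis or Memcached

def query_database(key):
    return {"key": key, "value": "fresh from the database"}   # hypothetical DB read

def write_database(key, value):
    pass                                                       # hypothetical DB write

def get(key):
    value = cache.get(key)
    if value is not None:
        return value                 # cache hit: no trip to the database
    value = query_database(key)      # cache miss: the application reads the database...
    cache[key] = value               # ...and populates the cache itself
    return value

def update(key, value):
    write_database(key, value)       # writes go straight to the database
    cache.pop(key, None)             # invalidate the cached copy so the next read reloads it
```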

Read Through

The application reads data from the cache, and the cache itself lazy-loads missing data from the database.

(Figure 1: the Read-Through pattern)

In Fig. 1 — If the cache doesn't already have the data associated with a given key (operation-1), it will go and get it from the datastore (operation-2) and store it in the cache (operation-3), before finally returning the value to the caller (operation-4).

If the cache has the requested data for the requested key (operation-1), it will return that data to the requestor (operation-2).
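
A sketch of read-through behaviour: the cache owns the loader function, so callers never talk to the datastore directly. The loader shown here is a made-up placeholder.

```python
class ReadThroughCache:
    """The cache itself owns the loader; callers never query the datastore directly."""

    def __init__(self, loader):
        self._store = {}
        self._loader = loader         # function that fetches from the datastore

    def get(self, key):
        if key not in self._store:                 # operation-1: not in the cache
            self._store[key] = self._loader(key)   # operations 2-3: load and cache it
        return self._store[key]                    # operation-4: return to the caller

# Hypothetical datastore access function passed in as the loader.
cache = ReadThroughCache(loader=lambda key: f"value-for-{key}")
print(cache.get("user:7"))
```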

Write Through

In this strategy, data is written to the cache and then synchronously to the datastore (operations 1 and 2). Because every write reaches both places, it keeps the cache and the datastore consistent, at the cost of slightly slower writes.

(Figure: the Write-Through pattern)
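
A write-through sketch, with a dictionary standing in for the datastore: every write lands in both places before the call returns.

```python
class WriteThroughCache:
    def __init__(self, datastore):
        self._store = {}
        self._datastore = datastore     # dict standing in for the real datastore

    def put(self, key, value):
        self._store[key] = value        # operation-1: write to the cache...
        self._datastore[key] = value    # operation-2: ...and synchronously to the datastore

    def get(self, key):
        return self._store.get(key)

db = {}
cache = WriteThroughCache(db)
cache.put("price:sku-1", 19.99)
print(db["price:sku-1"])   # the datastore is always in step with the cache
```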

Write Back

In this strategy, data is written directly to the cache (operations 1 and 2) rather than to the datastore (operation 3), so writes complete quickly. Instead of inserting or updating the datastore immediately, the write is placed in a queue and replicated to the datastore asynchronously.

Data loss is a risk if the cache fails, since the most recent data is not immediately reflected in the database.

(Figure: the Write-Back pattern)
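
A write-back (write-behind) sketch: writes go only to the cache and are queued, then flushed to the datastore in batches. The batch size and flush trigger are illustrative; a real system would flush from a background worker.

```python
from collections import deque

class WriteBackCache:
    def __init__(self, datastore, batch_size=100):
        self._store = {}
        self._dirty = deque()          # queued writes awaiting persistence
        self._datastore = datastore
        self._batch_size = batch_size

    def put(self, key, value):
        self._store[key] = value       # fast path: write only to the cache
        self._dirty.append(key)
        if len(self._dirty) >= self._batch_size:
            self.flush()

    def flush(self):
        # Normally run by a background worker; anything still queued is lost if the cache dies.
        while self._dirty:
            key = self._dirty.popleft()
            self._datastore[key] = self._store[key]

db = {}
cache = WriteBackCache(db, batch_size=2)
cache.put("a", 1)
cache.put("b", 2)       # hitting the batch size triggers the flush
print(db)               # {'a': 1, 'b': 2}
```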

Write Around

In this pattern, written data bypasses the cache entirely (operation-1) and goes straight into the datastore. Data is only cached when it is read back from the database (operations 2 and 3).

Applications that frequently re-read recently written data from the datastore are not good candidates for this pattern.

(Figure: the Write-Around pattern)
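
A write-around sketch: writes bypass the cache, and the cache is populated only when the data is read back.

```python
db, cache = {}, {}

def write(key, value):
    db[key] = value              # operation-1: the write bypasses the cache entirely

def read(key):
    if key in cache:
        return cache[key]
    value = db.get(key)          # operation-2: read from the datastore
    cache[key] = value           # operation-3: cache it for subsequent reads
    return value

write("report:2023", "annual totals")   # written, but not cached yet
print(read("report:2023"))              # the first read populates the cache
```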

Refresh-Ahead

This pattern reduces latency because cached data is refreshed just before it is needed (operations 1 and 2), rather than after it has expired. The refresh itself fetches from the datastore using the same mechanism as a normal miss (operation-3).

(Figure: the Refresh-Ahead pattern)
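
A refresh-ahead sketch: entries nearing expiry are refreshed in a background thread while the caller is still served from the cache. The TTL and refresh margin are arbitrary example values, and `load` is a hypothetical datastore fetch.

```python
import threading, time

TTL = 60             # seconds an entry stays valid
REFRESH_MARGIN = 10  # start refreshing this many seconds before expiry

cache = {}           # key -> (value, expiry timestamp)

def load(key):
    return f"fresh-{key}"          # hypothetical datastore fetch (operation-3)

def _refresh(key):
    cache[key] = (load(key), time.time() + TTL)

def get(key):
    value, expiry = cache.get(key, (None, 0.0))
    now = time.time()
    if value is None or now >= expiry:
        _refresh(key)                              # cold miss: load synchronously
        return cache[key][0]
    if now >= expiry - REFRESH_MARGIN:
        threading.Thread(target=_refresh, args=(key,)).start()  # refresh in the background
    return value                                   # the caller never waits for the refresh
```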

Caching Eviction (clean-up) Techniques/Algorithms

Caching eviction, also known as cache cleanup, is the process of removing stale or unused data from the cache to make room for new data. Here are some popular caching eviction techniques and algorithms:

  • One of the most used cache eviction techniques is Least Recently Used (LRU). The items in the cache that have been used least recently are evicted, on the premise that they are the least likely to be needed again soon.

  • First-In, First-Out (FIFO) is a straightforward strategy for clearing caches in which the oldest entries are removed first, on the assumption that the data cached earliest is least likely to be needed again.

  • The Least Frequently Used (LFU) algorithm purges the cache of data that is seldom accessed, on the assumption that rarely used items will not be needed again (see the sketch after this list).

  • With the random replacement technique, cached objects are chosen at random for eviction. This method makes no predictions about which items will be more or less useful in the future.

  • In size-based eviction, cached items are removed based on their size in memory. This is helpful when the cache's memory is limited and the space available for objects must be used as efficiently as possible.
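
For comparison with the LRU example earlier, here is a minimal LFU sketch that evicts the entry with the lowest access count; the counter-based bookkeeping is a simplification of production LFU implementations.

```python
from collections import Counter

class LFUCache:
    def __init__(self, capacity):
        self.capacity = capacity
        self._data = {}
        self._hits = Counter()      # how often each key has been accessed

    def get(self, key):
        if key in self._data:
            self._hits[key] += 1
        return self._data.get(key)

    def put(self, key, value):
        if key not in self._data and len(self._data) >= self.capacity:
            coldest = min(self._data, key=lambda k: self._hits[k])
            del self._data[coldest]    # evict the least frequently used entry
            del self._hits[coldest]
        self._data[key] = value
        self._hits[key] += 1
```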

Use-Cases: Some of the common use-cases of distributed caching

When high speed, scalability, and availability are needed, distributed caching is a viable solution. Some typical applications of distributed caching are listed below.

  • The efficiency and scalability of online applications may be enhanced via the use of distributed caching by storing copies of frequently visited information such as user profiles, session data, and database queries.

  • Distributed caching may be used to store data like prices and stock levels for products in e-commerce systems. This has the potential to ease pressure on the database and speed up the app.

  • Data that is requested often, such as user behaviour and activity data, may be cached in real-time analytics systems with the help of distributed caching. This has the potential to boost application performance by facilitating faster data processing.

  • Distributed caching may be used to store frequently used information in mobile apps, such as user profiles, location data, and photos. The application's performance may be enhanced, and network traffic lightened, as a result.

  • Images, movies, and other static assets that are viewed often may be cached in advance using distributed caching in content delivery systems. This has the potential to lessen the strain placed on the application's servers, allowing it to run more smoothly.

  • Market data and price information for high-frequency trading may be cached with the help of distributed caching. Trading choices may be made more quickly and with more precision.

What Are the Most Common Distributed Caches in Use Today?

There are several well-known distributed caching solutions available right now, and they all have their advantages and disadvantages. Here are a few of the most typical:

  • Memcached is a popular distributed caching technology for websites since it is open-source and free. It is optimized for caching data items of varying sizes and is meant to scale quickly.

  • Another well-known open-source distributed caching solution that is both flexible and fast is Redis. It may be used for storing and retrieving information in a cache, sending and receiving messages, and processing data in real-time, among other things.

  • Hazelcast: Hazelcast is a simple and scalable distributed caching and computation platform. It may be used for caching, messaging, and distributed computing, and it supports several data structures.

  • Apache Ignite is a high-performance, scalable, distributed caching and computation platform. It may be used for caching, messaging, and distributed computing, and it is compatible with many different types of data structures.

  • Couchbase: Couchbase is a caching-enabled distributed NoSQL database. It supports several data formats, including JSON documents, key-value stores, and full-text search indexes, and is noted for its outstanding speed and scalability.
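
As a small usage example, here is how an application might cache a value in Redis using the redis-py client. It assumes a Redis server on localhost:6379 and the `redis` package; the key, value, and TTL are invented.

```python
import redis  # pip install redis; assumes a Redis server on localhost:6379

r = redis.Redis(host="localhost", port=6379, decode_responses=True)

# Cache a database result for five minutes.
r.set("product:42:price", "19.99", ex=300)

# Later (possibly from a different app server): read it back.
price = r.get("product:42:price")
print(price if price is not None else "cache miss - fall back to the database")
```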

An Architecture for Distributed Caching

Take a look at how InterSystems IRIS stores and retrieves data to get a feel for its distributed caching architecture:

InterSystems IRIS stores data in local operating-system files known as databases. Multiple databases may (and often do) coexist in the same InterSystems IRIS instance.

InterSystems IRIS applications access data stored in one or more databases through a namespace. Multiple namespaces may (and often do) coexist in a single InterSystems IRIS instance.

To improve the efficiency of queries that are executed often, InterSystems IRIS instances keep a database cache, a local shared memory buffer used to store data obtained from the databases.

Using these components, constructing a distributed cache cluster is conceptually straightforward.

When a second InterSystems IRIS instance is added as a remote server and one or more of its databases are also added as remote databases, the first instance becomes an application server. As a result, the second instance is functioning as a data server for the first.

The application server's local namespaces are mapped to the data server's remote databases in the same manner as local database names are mapped. When an application queries a namespace on the application server, it does not know if the database it is accessing is local or remote.

Just as it would when working with local databases alone, the application server keeps its own database cache. ECP (Enterprise Cache Protocol) synchronizes the application server caches with the data server and efficiently shares data, locks, and executable code among multiple InterSystems IRIS instances.

The following describes how a cluster of application servers and a data server implement a distributed cache:

  • Data is still being stored, updated, and served from the data server. To prevent users from receiving or retaining outdated data, the data server also manages locks throughout the cluster and keeps the caches of the application servers in sync and coherent.

  • The whole set of cached data is dispersed throughout multiple separate caches because each query against the data is done in a namespace on one of the many application servers, and each of these servers utilizes its unique database cache to store the results it gets. The application server will connect to the appropriate data server without any intervention from the user.

  • Each application server keeps tabs on its connections to the data servers, and if one is lost, it tries to reestablish it.

  • A load balancer may distribute user requests in round-robin fashion among the application servers, but the most efficient approach makes the most of distributed caching by sending users with similar requests to the same application server.

  • For a healthcare application, this may mean splitting off the queries performed by clinicians and front-desk employees onto separate application servers. Users of each application that the cluster supports may be routed to a dedicated application server. In the next examples, we will see how a cluster with this distribution of user connections compares to a single instance of InterSystems IRIS.

On a single instance of InterSystems IRIS, local databases are associated with local namespaces.


A distributed cache cluster that uses remote databases hosted on a data server and maps them to namespaces hosted on application servers.


The data server in a clustered distributed cache is accountable for the following tasks:

  • Storing data in its local databases.

  • Maintaining consistency between the databases and the application servers' database caches prevents the application servers from receiving outdated information.

  • Controlling how locks are distributed within a system.

  • Controlling what happens if a connection to an application server goes down for a certain length of time.

The following tasks fall within the purview of each application server in a distributed cache cluster:

  • Whenever an application needs access to data that is kept on a particular data server, the program must first establish a connection to that server.

  • Keeping data previously obtained over the network in its local cache.

  • Keeping track of every connection to the data server and intervening if one is lost for a certain period of time.

Clustered Cache Storage System Deployment

In an InterSystems IRIS distributed cache cluster, one or more application servers receive data from a data server instance and pass it on to the application. The InterSystems IRIS data platform provides several options for automated deployment of distributed cache clusters.

The following notes cover the prerequisites for a manual cluster deployment using the Management Portal.

Any version of InterSystems IRIS may be used in a distributed cache cluster, so long as no application server is newer than the data server. For important restrictions and prerequisites concerning version compatibility, see the InterSystems documentation.

All InterSystems IRIS instances in a distributed cache cluster must use the same locale, even if the data server and application server hosts use different operating systems and/or endianness. For help configuring locales, see the system administration documentation.

In a typical distributed cache cluster setup, each host runs a single instance of InterSystems IRIS, and each instance serves either as a data server or an application server. This is the only feasible setup when using one of the automatic deployment approaches covered in the next section. The instructions for logging in to the Management Portal also presume this setup.

Methods for the Automated Deployment of Clusters

In addition to the manual technique described in this section, the InterSystems IRIS Data platform offers many options for the automatic deployment of fully functional distributed cache clusters.

Use InterSystems Cloud Manager (ICM) to set up a cluster of distributed cache nodes.

Distributed cache clusters with a mirrored or unmirrored data server and an arbitrary number of application servers, along with web servers, a load balancer, and an arbiter, can be automatically deployed with InterSystems Cloud Manager (ICM). Provisioning cloud or virtual infrastructure and deploying the desired InterSystems IRIS architecture on that infrastructure, along with other services, is made easy with ICM thanks to the combination of plain text declarative configuration files, a simple command line interface, and InterSystems IRIS deployment in containers. An InterSystems IRIS installation kit may also be used to set up ICM.

The full set of ICM documentation, including information on setting up a distributed cache cluster in a cloud environment or on preexisting hardware, is available from InterSystems.

Use the InterSystems Kubernetes Operator (IKO) to set up a distributed cache cluster.

Kubernetes is a free and open-source tool for managing and orchestrating containerized applications and services. You specify the containerized services to be deployed and the rules that govern them, and Kubernetes transparently provisions those resources in the most effective way possible, repairing or restoring the deployment if it deviates from spec and scaling automatically or on demand. The IrisCluster custom resource added by the InterSystems Kubernetes Operator (IKO) lets any Kubernetes platform host a mirrored InterSystems IRIS sharded cluster, distributed cache cluster, or standalone instance.

If you want to deploy InterSystems IRIS on Kubernetes, you can do so without the IKO, but doing so will be much more difficult and time-consuming because Kubernetes will lack the InterSystems IRIS-specific cluster management capabilities necessary to perform common tasks like adding nodes to a cluster.

To learn more about deploying with the IKO, see the InterSystems Kubernetes Operator documentation.

To make the benefit concrete, here is a back-of-the-envelope estimate of how much a distributed cache can reduce database load.

Assumptions:

  • On average, the web app gets 10,000 queries per minute.

  • On average, 5 database queries are needed to fulfill a single request.

  • On average, database queries take 100 milliseconds.

  • Up to 500 queries per second are supported by the database before the response time becomes unacceptably slow.

  • 10,000 requests per second are within the cache server's capabilities.

  • Average cache hit percentage: 80%

Calculation:

  • 10,000 requests per minute × 5 database queries per request = 50,000 queries per minute.

  • Divide 50,000 by 60 seconds, and you get 833 database requests per second.

  • The database can only handle up to 500 queries per second, so we need to shed at least 333 queries per second.

  • With an 80% cache hit rate, the cache can answer 80% of those queries; the remaining 20% still go to the database.

  • The cache therefore serves about 666 queries per second (833 × 80%), which is well within the cache server's capacity of 10,000 requests per second.

  • 833 queries/second x 20% = 167 queries/second that need to be sent to the database

  • With the distributed cache in place, database load drops from roughly 833 to about 167 queries per second, comfortably below the 500-queries-per-second limit (the short script below reproduces this arithmetic).
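
The same arithmetic in a few lines of Python, using the assumptions listed above:

```python
requests_per_minute = 10_000
queries_per_request = 5
db_capacity_qps     = 500
cache_hit_rate      = 0.80

total_qps   = requests_per_minute * queries_per_request / 60   # ~833 queries/second
from_cache  = total_qps * cache_hit_rate                        # ~666 queries/second
to_database = total_qps * (1 - cache_hit_rate)                  # ~167 queries/second

print(f"without a cache: {total_qps:.0f} q/s hit the database (limit {db_capacity_qps})")
print(f"with the cache : {to_database:.0f} q/s hit the database, {from_cache:.0f} q/s served from cache")
```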

Conclusion:

Using a distributed cache can markedly improve web application performance by cutting database traffic from roughly 833 to about 167 queries per second, a reduction of about 666 queries per second. The same method can be used to estimate both the expected reduction in database load and the cache server capacity required.
