There are only two hard things in Computer Science: cache invalidation and naming things.
The above statement by Phil Karlton has acquired some sort of legendary status in software development circles. And not without reason.
Caching can be one of the most important pieces of your app development puzzle. It addresses several important concerns in modern app development, such as:
- performance
- scalability
- cost
- availability
But despite all the benefits, you must not decide to use it blindly.
Any approach is beneficial only when it provides some value over the current situation.
Before deciding to implement caching, ask yourself the questions below.
1 - Is the operation I’m trying to cache slow?
Caching the result of an operation is beneficial only if the operation is really slow or resource-intensive.
For example, let’s say you are trying to use an external Weather API to retrieve some information. The Weather API may be slow or it may be expensive to use due to usage limits (in other words, resource-intensive).
In this case, you would do well to implement a cache to store the results of the Weather API query. When a user makes a query, you can first check if the data is available in the cache and call the slow and costly Weather API only when needed.
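The check-the-cache-first flow described above can be sketched in a few lines. This is a minimal in-memory illustration, not a production design: `make_cached_lookup` and `fetch_weather` are hypothetical names, and `fetch_weather` is only a stand-in for the real Weather API call.

```python
from typing import Callable, Dict, List


def make_cached_lookup(fetch: Callable[[str], dict]) -> Callable[[str], dict]:
    """Wrap a slow lookup so repeated queries are served from memory."""
    cache: Dict[str, dict] = {}

    def lookup(city: str) -> dict:
        if city in cache:       # cache hit: skip the slow call
            return cache[city]
        result = fetch(city)    # cache miss: call the slow source
        cache[city] = result    # remember the result for next time
        return result

    return lookup


# Hypothetical stand-in for the slow, costly Weather API (an assumption).
calls: List[str] = []


def fetch_weather(city: str) -> dict:
    calls.append(city)          # record every "real" API call
    return {"city": city, "temp_c": 21}


get_weather = make_cached_lookup(fetch_weather)
get_weather("Paris")
get_weather("Paris")
print(len(calls))  # 1: the second request was served from the cache
```

Note that this sketch never expires entries, so it only suits data that stays valid for as long as the process runs.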
As a rule of thumb, consider caching only when the underlying operation is slow or costly, such as a call to a slow external API or a heavy database query. Otherwise, you end up adding complexity for no benefit.
2 - Is the cache actually faster?
Don’t cache just for the sake of caching!
The cache must be able to store and retrieve faster than the original source. Alternatively, it should consume fewer resources.
However, sometimes it is not immediately clear if caching will be advantageous. To make a decision, try to set up a test environment where you can simulate a high volume of traffic. Run tests with and without the cache and compare the results. If the performance improves due to caching, then and only then go for a caching-based solution.
Ultimately, there should be some quantitative advantage to using a cache.
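Before building a full test environment, the with-and-without comparison can be tried on a small scale. The sketch below is only a local micro-benchmark under assumed conditions: `slow_square` is a hypothetical function that simulates a slow operation with a short sleep, and the request mix (50 requests over 5 distinct inputs) is made up for illustration.

```python
import time
from functools import lru_cache


def slow_square(n: int) -> int:
    """Hypothetical slow operation, simulated with a short sleep."""
    time.sleep(0.005)
    return n * n


# Same function, wrapped in the standard library's LRU cache.
cached_square = lru_cache(maxsize=128)(slow_square)


def time_calls(func, inputs) -> float:
    """Return the wall-clock seconds needed to call func on every input."""
    start = time.perf_counter()
    for value in inputs:
        func(value)
    return time.perf_counter() - start


inputs = [i % 5 for i in range(50)]  # 50 requests over 5 distinct keys
uncached_seconds = time_calls(slow_square, inputs)
cached_seconds = time_calls(cached_square, inputs)
print(f"without cache: {uncached_seconds:.3f}s, with cache: {cached_seconds:.3f}s")
print(cached_square.cache_info())    # hit and miss counts for the cached run
```

The numbers from a sketch like this depend heavily on how often inputs repeat, so real traffic patterns should drive the final decision.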
3 - Is the data I’m trying to cache dynamic?
Suppose your cache stores the results of a search query. When a user makes a new search query, you retrieve the results and also store them in the cache. On subsequent requests for the same query by other users, you return the cached results.
While caching such data, ask yourself how long until the cached result becomes stale.
If the cached data becomes stale very frequently and you have to invalidate it, you might not get sufficient advantage from caching. Items that don’t change from request to request are better candidates for caching.
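One common way to bound staleness is to give each entry a time-to-live (TTL) and treat expired entries as misses. The sketch below assumes a fixed TTL is acceptable for the data; `TTLCache` is a hypothetical class written for this example, and the fake clock exists only so the example runs instantly.

```python
import time
from typing import Any, Callable, Dict, Optional, Tuple


class TTLCache:
    """Tiny in-memory cache whose entries expire after ttl_seconds."""

    def __init__(self, ttl_seconds: float,
                 clock: Callable[[], float] = time.monotonic) -> None:
        self._ttl = ttl_seconds
        self._clock = clock
        self._entries: Dict[str, Tuple[float, Any]] = {}

    def get(self, key: str) -> Optional[Any]:
        entry = self._entries.get(key)
        if entry is None:
            return None                      # never cached
        stored_at, value = entry
        if self._clock() - stored_at >= self._ttl:
            del self._entries[key]           # stale: drop so the caller refetches
            return None
        return value

    def set(self, key: str, value: Any) -> None:
        self._entries[key] = (self._clock(), value)


# Fake clock (an assumption for the demo) so no real waiting is needed.
now = [0.0]
cache = TTLCache(ttl_seconds=60, clock=lambda: now[0])
cache.set("query:red shoes", ["result-1", "result-2"])
fresh = cache.get("query:red shoes")   # within the TTL: served from cache
now[0] = 61.0
stale = cache.get("query:red shoes")   # past the TTL: treated as a miss
print(fresh, stale)
```

The shorter the TTL has to be to keep results acceptable, the fewer requests the cache can absorb, which is the trade-off this question asks you to weigh.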
4 - Is the data frequently accessed?
Consider the example of an e-commerce platform.
If there is a popular product in your catalog, its product page may receive a ton of requests. If the details are fetched from the database every single time, your application may perform poorly because the database is overloaded. This is an ideal scenario to explore the use of a cache.
As a rule, always ask yourself how frequently a piece of data is needed.
The more times it is needed, the more effective the use of cache will be.
To answer this question, you need to have a really good understanding of the statistical distribution of data access from your data source.
Caching is more likely to be effective for your use case when access is concentrated on a relatively small set of items (a peaked or skewed distribution) rather than spread evenly across all items (a flat distribution).
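To see why the shape of the access distribution matters, a small simulation can compare the hit rate of the same cache under flat access and under access concentrated on a few keys. Everything here is an assumption chosen for illustration: the 1,000-key catalog, the cache capacity of 50, the LRU eviction policy, and the 1/rank weighting used to model popular items.

```python
import random
from collections import OrderedDict
from typing import Iterable


def hit_rate(requests: Iterable[int], capacity: int) -> float:
    """Replay requests through an LRU cache and return the fraction of hits."""
    cache: "OrderedDict[int, bool]" = OrderedDict()
    hits = 0
    total = 0
    for key in requests:
        total += 1
        if key in cache:
            hits += 1
            cache.move_to_end(key)         # mark as most recently used
        else:
            cache[key] = True
            if len(cache) > capacity:
                cache.popitem(last=False)  # evict the least recently used key
    return hits / total


rng = random.Random(42)                    # fixed seed for repeatable runs
keys = list(range(1000))

# Flat access: every key is equally likely to be requested.
flat_requests = rng.choices(keys, k=20_000)

# Concentrated access: popularity falls off as 1/rank (a modeling assumption).
weights = [1 / (rank + 1) for rank in range(len(keys))]
skewed_requests = rng.choices(keys, weights=weights, k=20_000)

flat_rate = hit_rate(flat_requests, capacity=50)
skewed_rate = hit_rate(skewed_requests, capacity=50)
print(f"flat access: {flat_rate:.1%}, concentrated access: {skewed_rate:.1%}")
```

With flat access, a cache holding 5% of the keys can only serve roughly that share of requests, while concentrated access lets the same cache serve far more.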
5 - Does the original operation have side effects?
If you want to cache the result of an operation, it must not have side effects. For example, the operation should not store data, make changes to other systems, or control a piece of software or hardware.
If the result of such operations is cached, you will end up breaking your application when the requests are served from the cache and the side effects are ignored.
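The problem is easy to reproduce. In the sketch below, `place_order` and `audit_log` are hypothetical names: the function both returns a value and records an entry, and wrapping it in the standard library's `functools.lru_cache` silently drops the second record.

```python
from functools import lru_cache
from typing import List

audit_log: List[str] = []  # stands in for a database write or external call


@lru_cache(maxsize=None)
def place_order(product_id: str) -> str:
    audit_log.append(product_id)  # side effect: runs only on a cache miss
    return f"order placed for {product_id}"


first = place_order("sku-1")
second = place_order("sku-1")     # served from the cache, side effect skipped

print(first == second)  # True: the caller sees the same result both times
print(len(audit_log))   # 1: only one order was actually recorded
```

The caller cannot tell from the return value that the second order was never recorded, which is why such operations should stay outside the cache.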
That’s it
The decision to use a cache is not an emotional matter.
Though caching has a very important role in application development, you need to examine your use case through the lens of these hard questions.
If you find satisfactory answers that point to the use of a cache, you will be able to reap the benefits in the long run. Otherwise, caching systems can become more of a liability.