DEV Community

Cover image for Caching strategies to speed up your API
Brian Neville-O'Neill
Brian Neville-O'Neill

Posted on • Originally published at on

Caching strategies to speed up your API

Written by Paramanantham Harrison✏️


Caching is a way to serve content faster. Caching happens at different levels in a web application:

  • Edge caching or CDN
  • Database caching
  • Server caching (API caching)
  • Browser caching

CDN is used to cache static assets in geographically distributed servers. It sends the assets faster to the end user from the cache.

Database caching is the caching natively used by every database. Every database has smart algorithms to optimize reads and writes. This caching depends primarily on the structure of the database. You can optimize your database caching by tweaking the schema.

Indexing in a database is one way of optimizing database reads. The database also has a working set of data in-memory to handle frequent requests to the same data.

Server caching is the custom caching of data in a server application. Usually this caching heavily depends on the business need. Highly optional for small applications that don’t have enough concurrent users.

Browsers cache the static assets based on the cache expiry headers. Also, browsers cache the GET requests smartly to avoid unnecessary data calls.

In this article, we will see different caching strategies that happen in the API (i.e., server level caching).

LogRocket Free Trial Banner

Why do you need caching?

When you are creating an API, you want to keep it simple. But, once the concurrent requests increase, you’ll face a few issues.

  • Databases will take more time to respond
  • CPU spikes occur during peak traffic
  • Server response times become inconsistent depending on the number of concurrent requests

In most cases, horizontal scaling will work fine to resolve these issues. Horizontal scaling adds more resources and machine power to handle traffic. However, you will eventually reach a point where your database can’t handle the requests irrespective of the traffic.

On an API level, there are many optimizations you can do to solve such issues. Some of the solutions include paginating data fetches from the database, caching read data that are the same for many users or visitors, and database sharding.

You use caching when you have a highly concurrent need to read the same data, or for any application that has heavy read and write. You can also use caching for frequently accessed information.

For example, caching is useful for COVID APIs.

Lot of users around the world are reading about this topic, but its write frequency is not very high. Therefore, you can simply cache the response in memory and serve it very fast. Once there is new data, write it to the database, invalidate the cache, and update it asynchronously.

Caching is also useful for user meta information. When you’ve logged in to any site, the user data will be required on every page.

You don’t need to request it again and again — rather, you can keep that information in a cache to serve faster. When the information gets updated, update the cache as well. Server session is one type of user meta information.

It can also be helpful to cache live scorecards. Online scorecards are live data accessed by millions of users when any sports match is live.

You don’t need to fetch data from the database all the time to serve live scorecard information — you can simply cache the data for a few seconds to serve millions of the same requests and then recache with live data again in a set time interval.

Using this method, your database will be hit by only one user every second as opposed to millions of users directly.

In most cases, the cache lives in-memory. This is the reason it’s faster to fetch data from the cache rather than the database. Even databases cache their working set in-memory.

But those are not custom caches defined by the business requirement. Those caching mechanisms are based on database internal operations.

Let’s learn some of the caching strategies with real-world use cases.

Cache lifetime and expiry

The lifecycle of caches play a major role. Invalidating a cache is one of the most difficult computer science problems.

We’re going to talk about cache with time to live (TTL) and cache without TTL.

Cache with TTL

A cache with TTL is the most commonly used cache. When your data is frequently updated and you want to expire your cache in regular intervals, you can use cache with a time limit. The cache will get deleted automatically once the time interval has passed.

Server sessions and live sports scores are examples of caches with TTL.

Cache without TTL

A cache without TTL is used for caching needs that don’t need to be updated frequently.

Course content in course websites and heavy static content sites like multi-author blogs often use caches without TTL.

Real world example for cache without TTL are,

Content will be published and updated infrequently, so it’s easier to cache it. Cache invalidation is easy too.

There are several strategies in caching. Some of them include cache aside (or lazy loading), read through cache, and write through cache.

Cache aside (lazy loading)

Cache aside keeps the cache updated through the application asynchronously.

First, the application checks to see whether data exists in the cache. If it exists, you can read from cache. If it doesn’t, you read from the data source.

Then, it will write to the cache and return the data as the response. The subsequent requests will be served through the cache.

A chart showing how cache aside works.

Cache aside is very easy to implement but very difficult to manage. Cache invalidation is difficult. Whenever the data in the source is updated, we need to check the cache and invalidate it. It will be difficult and expensive if you use multiple sources to update the data in the store.

Pseudocode: Cache aside for COVID API:

app.get('/summary', async (req, res) => {
  // Check the cache
  if (cache.has('summary')) {
    const data = await cache.get('summary');
    return res.json({ data });
  // If no cache, then get data from datastore
  const data = await getSummary();
  // Set the cache for future request
  await cache.set('summary', data);
  // Send the response
  return res.json({ data });
Enter fullscreen mode Exit fullscreen mode

Read through cache

For read through cache, the data will be read through the cache every time.

First, check whether data exists in the cache. If it does, read from the cache and send the response. If it doesn’t, the cache will be updated from the datasource. Cache will then send the response back to the client.

A chart displaying how read through write through works.

Read through cache has the same problem as cache aside — if the datastore gets updated through many sources, the cache will be obsolete.

Cache aside and read through cache are mostly used on heavy read sites. The cache invalidation issue can be solved using write through cache.

Pseudocode: Read through cache for COVID API:

app.get('/summary', async (req, res) => {
  // If no cache exist
  if (!cache.has('summary')) {
    await getSummaryFromStoreAndSetCache();
  const data = await cache.get('summary'); // Always exist at this point
  // Send the response
  return res.json({ data });
Enter fullscreen mode Exit fullscreen mode

Read through cache is similar to cache aside, the only difference being that it always sends the result from the cache.

Write through cache

The data will first be written to the cache and then the cache will update the datastore.

First, write to the cache and then to the main database.

Write through cache only solves the write issue. It needs to be combined with read through cache to achieve proper results.

When both are combined, the data always gets read from cache and written to the database through the cache, so there won’t be any obsolete data in the cache.

However, this will make the resources expensive very easily if you want to keep everything in the cache. There are cloud databases that support read through and write through caches natively without writing custom cache layers.

Pseudocode: Write through cache for COVID API:

// Sample creation endpoint'/summary', async (req, res) => {
  const { payload } = req.body;
  // Create cache first
  await cache.set('summary', payload);
  // Send the response
  return res.json({ success: 'Content updated successfully' });

// Sample updation endpoint
app.put('/summary', async (req, res) => {
  const { payload } = req.body;
  // Get previous cache and update payload
  const previousCache = await cache.get('summary') || [];
  const data = [...previousCache, payload];
  // Update or overwite cache first
  await cache.set('summary', data);
  // Send the response
  return res.json({ success: 'Content modified successfully' });
Enter fullscreen mode Exit fullscreen mode

Write behind cache

Cache will send the response before writing to the datastore. It writes to the datastore asynchronously depending on the load.

It is just a special type of write through cache. It doesn’t wait for the datastore to acknowledge whether data is stored.

This technique is used in every NoSQL and SQL database internally. First, the data will be written in memory. The database acknowledges the response and writes it to the disk asynchronously. This way, the database can smartly handle heavy writes.

A chart of a write behind pattern.

Again, this strategy alone won’t be sufficient. It should be coupled with read through cache to solve cache invalidation issues. Most relational database have write behind and read through cache supported natively.

Refresh ahead cache

Refresh ahead cache is used to refresh the data before it expires. It happens asynchronously so the end user won’t see any issues.

This kind of cache is used heavily on real-time websites, such as live sports scoring sites and stock market financial dashboards. You can always read from cache. The cache will refresh with live data before it expires.

A chart showing the refresh ahead pattern.

How to choose a caching key

There are two problems everyone faces in caching: when to invalidate the cache and how to name the key for the cache.

For simple and straightforward caches, you can name it easily with a static string key. If you have to cache paginated data, you can use a key that contains the page number and limits information.

Example: tweets:${pageNumber}:${limit}

You can choose your own delimiter. This works if you already know the filtering information. If you have many more filters, then it becomes hard to handle the dynamic key.

Choosing a key is a hard problem once your cache is very deep and has many levels of variables. It is always a good idea to keep the cache data and the key simple enough to retrieve and load easily and faster to the end user.

Case study: How do you cache Twitter data?

Twitter is a complex and large-scale distributed app. It’s not easy to consider every use case.

Let’s consider a simple Twitter app with hundreds of users.

How do you serve the users as fast as you can through a reliable caching mechanism?

There are many levels of cache you can build.

One option is to cache with TTL for user-based views. The expiry duration will be short enough under a minute. This way, you can show the same paginated results to the minute to the user for consequent refreshes.

You can also cache infrequent hashtags. Keep the cache in a paginated set and update the last set based on new tweets for those hashtags.

Caching trending hashtag-based tweets is costly since a lot of real-time users are tweeting about it.

You can do the same approach as caching with TTL for the first few pages.

Caching a single tweet and its related threads is easy. An even simpler caching strategy like cache aside will work easily. Since the order of threads aren’t very important, you can invalidate and recache when new threads start coming in.


Caching is a vast topic. We scratched the surface with a few strategies.

Most applications don’t require custom caching solutions since databases and servers themselves handle requests smartly through their in-built caching mechanism.

Once the scale reaches certain levels, caching is the first solution to reach for. It improves the end user experience and helps to avoid unnecessary resource costs.

Hope you learned few strategies to implement on your web app!

Plug: LogRocket, a DVR for web apps

LogRocket Dashboard Free Trial Banner
LogRocket is a frontend logging tool that lets you replay problems as if they happened in your own browser. Instead of guessing why errors happen, or asking users for screenshots and log dumps, LogRocket lets you replay the session to quickly understand what went wrong. It works perfectly with any app, regardless of framework, and has plugins to log additional context from Redux, Vuex, and @ngrx/store.
In addition to logging Redux actions and state, LogRocket records console logs, JavaScript errors, stacktraces, network requests/responses with headers + bodies, browser metadata, and custom logs. It also instruments the DOM to record the HTML and CSS on the page, recreating pixel-perfect videos of even the most complex single-page apps.
Try it for free.

The post Caching strategies to speed up your API appeared first on LogRocket Blog.

Top comments (0)