
Caching with [Web Storage, Fetch, Redis, Nginx]

Victor Magarlamov ・5 min read

One of the main goals of caching is to avoid sending requests when we don't have to. If a request simply returns a resource and has no side effects (as is the case for most GET requests), nothing prevents us from reusing the previous response.

Proper caching can significantly improve your project's performance and make it faster for users. On the other hand, neglecting caching can bury your project. Speed matters: nobody likes waiting for data to load before a site finally comes to life.

Caching with Web Storage API

Caching data in localStorage allows us to skip repeated requests to the server. To be honest, I'm not a fan of this technique, as it has some disadvantages. For example, data stored in localStorage has no expiration time. But it is one of the easiest ways to cache, and it is a good solution if you do not have access to the server.
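
One common workaround for the missing expiration is to store a timestamp next to the value and check it on read. Here is a minimal sketch of that idea (the helper names are mine, and the storage object is passed in explicitly so the same code works outside the browser):

```javascript
// Store a value together with its expiration time (TTL in milliseconds).
const setWithExpiry = (storage, key, value, ttl) => {
  const entry = { value, expiresAt: Date.now() + ttl };
  storage.setItem(key, JSON.stringify(entry));
};

// Return the cached value, or null if it is missing or expired.
const getWithExpiry = (storage, key) => {
  const raw = storage.getItem(key);
  if (!raw) return null;

  const entry = JSON.parse(raw);
  if (Date.now() > entry.expiresAt) {
    storage.removeItem(key); // clean up the stale entry
    return null;
  }
  return entry.value;
};
```

In the browser you would simply pass `window.localStorage` as the `storage` argument.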

Let’s create a simple Redux middleware that will cache some data in a React app.

import { FETCH_ARTICLES_SUCCESS } from '../actions';

const isCached = actionType => {
  return [ FETCH_ARTICLES_SUCCESS ].includes(actionType);
};

const setToCache = action => {
  window.localStorage.setItem(
    action.key,
    JSON.stringify(action.articles)
  );
};

export const getFromCache = key => {
  const data = window.localStorage.getItem(key);

  if (!data) {
    return null;
  }

  return JSON.parse(data);
};

export const conservator = store => next => action => {
  if (isCached(action.type)) {
    setToCache(action);
  }

  return next(action);
};

Now we need to connect our conservator to the Redux store.

import { conservator } from './middleware/conservator';

const middleware = [thunk, conservator];

const store = createStore(
  rootReducer,
  initialState,
  compose(applyMiddleware(...middleware))
);

And add changes to the Article actions.

import { getFromCache } from '../middleware/conservator';
import { FETCH_ARTICLES_SUCCESS } from './';

const CACHE_KEY = 'articles';

const fetchArticlesSuccess = articles => ({
  type: FETCH_ARTICLES_SUCCESS,
  key: CACHE_KEY,
  articles,
});

export const fetchArticles = () => {
  return (dispatch) => {
    const cachedData = getFromCache(CACHE_KEY);

    if (cachedData) {
      dispatch(fetchArticlesSuccess(cachedData));
    } else {
      ArticleApi.index().then(res => {
        dispatch(fetchArticlesSuccess(res));
      });
    }
  };
};

The idea behind this solution is pretty simple. When we first get a response from the server, the data is cached in localStorage on its way to the Redux store. Before sending a request to the server, we check localStorage for data by key. If it's there, we return the data from the cache; if not, we send the request to the server.

Caching with Fetch API

By default, fetch uses standard HTTP caching, which is controlled by HTTP headers. We can influence it with the cache option in the request options. For example:

fetch(url, { cache: 'no-cache' });

You can see the full list of available values in the Fetch specification. I won't describe all of them; I'll focus only on a few interesting points.
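
A few of those values in action (the endpoint below is a placeholder, and `loadArticles` is my own helper, not part of the Fetch API):

```javascript
// A small helper that lets the caller pick a caching strategy per call.
// 'no-store'    — bypass the HTTP cache completely: never read it, never write it.
// 'no-cache'    — always revalidate with the server before using a cached copy.
// 'force-cache' — use any cached copy, even a stale one; hit the network only if nothing is cached.
const loadArticles = (cacheMode = 'default') =>
  fetch('/api/v1/articles', { cache: cacheMode });
```

So `loadArticles('no-store')` would be a reasonable choice for data that must always be fresh.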

The Cache-Control header lets us specify how, and for how long, a response should be cached. When this header is Cache-Control: private, it means that the resource should be cached by the browser only. This cache is used, for example, when the user clicks the back button in the browser.

An alternative is Cache-Control: public, which allows the resource to be cached by any intermediate cache (a proxy or CDN) and shared between many users.

We can also set the cache expiration time with Cache-Control: max-age. For example, Cache-Control: max-age=3600 makes your cache valid for one hour.

Another very important header is ETag.
An ETag (entity tag) is an identifier for a particular version of a resource. You can think of this identifier as a checksum (or “fingerprint”). Let’s see how it works.

The server sends a response with an ETag header containing some value, say, “ver1”.
When the browser requests this resource again, the request will contain the header If-None-Match: “ver1”. This header makes the request conditional. If the current version of the resource no longer matches “ver1”, the response will contain the new data and have a status of 200. If it still matches, the server returns a very short response with a status of 304 (Not Modified).

There is a method in Ruby on Rails, stale?, that allows us to handle the ETag manually.

def show
  @article = Article.find(params[:id])

  if stale?(etag: @article, last_modified: @article.updated_at)
    render json: @article
  end
end

Caching with Redis

Pay attention to one point in the previous example: to determine which status to respond with, we still have to fetch the resource from the database first. When many requests arrive at the same time, this can become a problem. At best, users will have to wait a bit.

But we can reduce the cost of reading from the database with an in-memory data structure store. I prefer to use Redis as such a store. Let’s modify the previous example.

@article = Rails.cache.fetch("articles/#{params[:id]}", expires_in: 12.hours) do
  Article.find(params[:id])
end

Here the cache key is built from the resource name and id. (Rails models also provide a cache_key_with_version helper, which combines the class name, id, and updated_at, but it requires an already-loaded record.) As you can see, this cache is valid for 12 hours from the moment of the first request. During that time the resource is served without reading from the database.

I often see this method used to cache the results of several database queries in a single object. For example, we can cache summary information about a user in a profile object: not only basic information about the user, but also the number of friends, the number of posts, the balance, and so on, gathered from several tables.

In my opinion, this practice is bad, especially if some of the data included in the single object is updated frequently. You will be forced to reduce the cache lifetime, and the time needed to rebuild the cache entry can grow significantly.

I prefer to normalize my cache according to the first normal form. Each cache entry is a separate entity. This gives me the ability to manage the cache more flexibly.
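
The difference is easy to see with a tiny fetch-or-compute cache (an in-memory stand-in for Redis; all the names and TTLs below are illustrative): each entity gets its own key and its own lifetime, so a frequently changing balance does not invalidate the rest of the profile.

```javascript
// A tiny in-memory stand-in for Redis-style "fetch or compute" caching.
const cache = new Map();

const fetchCached = (key, ttlMs, compute) => {
  const hit = cache.get(key);
  if (hit && hit.expiresAt > Date.now()) return hit.value;

  const value = compute();
  cache.set(key, { value, expiresAt: Date.now() + ttlMs });
  return value;
};

// Stubs standing in for real database queries.
const loadUser = (id) => ({ id, name: `user-${id}` });
const countFriends = (id) => 42;
const loadBalance = (id) => 100;

// Normalized: one cache entry per entity, each with its own TTL.
const getProfile = (userId) => ({
  user: fetchCached(`user:${userId}`, 12 * 3600 * 1000, () => loadUser(userId)),
  friends: fetchCached(`friends:${userId}`, 3600 * 1000, () => countFriends(userId)),
  balance: fetchCached(`balance:${userId}`, 60 * 1000, () => loadBalance(userId)),
});
```

When the balance entry expires, only the balance query runs again; the user and friends entries are still served from the cache.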

In short, mindless caching can have a completely different effect than you expected.

Caching with Nginx

And finally, I'll show you how to configure caching in Nginx. Under heavy load this can give incredible results: you can reduce the load many times over, even when a resource is cached only for a short time.

Here is an example of Nginx config.

proxy_cache_path /var/lib/nginx/cache levels=1:2 keys_zone=my_cache:10m max_size=10g inactive=60m use_temp_path=off;

server {
…

  location /api/v1/articles {
    # activate the my_cache caching zone
    proxy_cache my_cache;
    # cache responses with these status codes for 5 minutes
    proxy_cache_valid 200 301 302 304 5m;
    # set the caching key
    proxy_cache_key $scheme$proxy_host$uri$is_args$args;
    # ignore caching headers sent by the backend
    proxy_ignore_headers "Cache-Control" "Expires";
    # protect against serving one user's cookies in a cached response
    proxy_hide_header "Set-Cookie";
  }
}

In this case, we will receive a response without any request reaching the application server or the database. Instantly 🧚‍♀️
