Optimizing web performance is an ongoing task that we must continually work on, and there are multiple ways to improve our response times.
What is a reverse proxy cache?
In computing, a reverse proxy is a server that sits between the client and one or more web servers and retrieves resources on the client's behalf. These resources are returned to the client as if they came from the web server itself. In short, it is an intermediary that can play several roles.
We can use such a proxy as a firewall in front of an internal network, as a load balancer, as a tool for A/B testing or, the reason for this article, as a static cache server.
Image credit: Privacy Canada (https://privacycanada.net)
When used as a cache, this server consults the web server only if it does not already hold the requested resource: a MISS occurs and the resource is stored in the cache. The next request for the same resource is served by the proxy itself very efficiently, producing a HIT and avoiding the call to the web server for the duration of a given TTL. In this way we can set up a centralized static server from which we serve all our static content.
Several technologies cover this scenario. One of the best-known tools is Varnish, but Nginx, apart from being a web server, can also be configured to act as a reverse proxy cache. According to the comparison articles and benchmarks I have read, the difference between them is minimal, so today we are going to focus on the Nginx-Nginx scenario.
However, we could build the same scenario with different combinations: Varnish-Apache, Varnish-Nginx or Nginx-Apache, where the first of each pair is the cache server and the second is the web server.
Why Nginx?
Like Varnish, Nginx can act as a web cache, but not all system administrators are aware of this. Nginx can serve static content directly very efficiently and can act as a front-facing cache, just as Varnish would. While Varnish's only task is to act as a cache with advanced options, Nginx is versatile and offers several alternatives.
Honestly, I could not say which of the two options I would rule out. The decision to use only nginx comes down to not introducing another technology to learn and maintain in our infrastructure, and the configuration of nginx seems much simpler than that of Varnish.
Setting up the infrastructure
Install Nginx
To install nginx on our server we simply execute the following command:
sudo apt-get install nginx
By default, nginx is installed listening on port 80 and serving a static welcome HTML page. Let's see how to configure it.
Configure the web server
First of all, bear in mind that the web server that reads the resources from our filesystem should no longer listen on port 80, because the proxy will take that port instead.
We are going to create a folder where our resources will be hosted:
sudo mkdir -p /var/www/assets
Next we configure nginx to listen on port 81, to use our resource directory as root and to set a resource expiration time of 12 hours. The location block uses a regular expression to select the file extensions this applies to. To do so, we create the following file, /etc/nginx/sites-available/static-server:
server {
    listen 81;
    server_name {{your ip or subdomain}};
    access_log /var/log/nginx/static-server.log;
    location ~* \.(?:jpg|jpeg|gif|png|ico|cur|gz|svg|svgz|mp4|ogg|ogv|webm|htc)$ {
        expires 12h;
        root /var/www/assets;
        add_header Cache-Control "public";
    }
}
Finally, we enable the site by placing the file in /etc/nginx/sites-enabled/static-server and reloading nginx. If we now request a resource through port 81, we should receive it.
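On Debian-based systems a site is usually enabled by symlinking its file from sites-available into sites-enabled rather than moving it. The following is a minimal sketch of that step, assuming the default /etc/nginx layout; the `enable_site` helper and the `logo.png` file name are illustrative and not part of nginx.

```shell
# enable_site NAME [NGINX_DIR]
# Symlinks sites-available/NAME into sites-enabled/NAME.
# NGINX_DIR defaults to /etc/nginx (an assumption about the layout).
enable_site() {
    name="$1"
    nginx_dir="${2:-/etc/nginx}"
    ln -sfn "$nginx_dir/sites-available/$name" "$nginx_dir/sites-enabled/$name"
}

# Typical use, run as root (left commented out so that sourcing this file
# has no side effects):
#   enable_site static-server
#   nginx -t                                # validate the configuration
#   systemctl reload nginx                  # apply it
#   curl -I http://localhost:81/logo.png    # hypothetical file in /var/www/assets
```

Running `nginx -t` before the reload is a cheap way to catch a typo in the new file before it affects the running server.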
Configure the proxy cache
To configure nginx to act as a cache we will use the proxy_cache_path directive. This directive points to a directory on your machine where all the cached resources will be stored.
This directory should be owned by the www-data user and group (the user nginx runs as on Debian and Ubuntu) and have 700 permissions so that the proxy can write to it. The directive also takes a keys_zone parameter, which names the shared memory zone that holds the cache keys and sets its size; max_size, which caps the size of the cache on disk; and levels, which sets the depth of the directory hierarchy inside the cache. The configuration below does not set max_size; it uses inactive instead, which removes entries that have not been accessed within the given time.
We will also rely on another directive, proxy_cache_key, which defines the key under which cached resources are stored. Nginx computes an MD5 hash of this string, so by choosing which variables go into it we control the granularity of the cache, for example whether or not the query string of the URL is part of the key.
Once these two directives are clear, we make the proxy listen on port 80 and have every incoming request look in the cache zone we have defined first; if the resource is not there, the request is forwarded to our web server listening on port 81 with the proxy_pass directive.
In addition, our configuration adds a header called X-Proxy-Cache, filled from nginx's $upstream_cache_status variable, that tells the client whether the resource was returned from the cache (HIT) or the web server had to be consulted (MISS). This information is especially useful for checking that everything is working correctly.
Here is the complete configuration of the caching server:
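Preparing the cache directory could look like the sketch below. The `prepare_cache_dir` helper is hypothetical, and `www-data` is assumed to be the user the nginx workers run as (the Debian/Ubuntu default); adjust both to your system.

```shell
# prepare_cache_dir DIR [OWNER]
# Creates DIR, restricts it to its owner (mode 700) and, if OWNER is given,
# hands it over to that user and group (changing ownership requires root).
prepare_cache_dir() {
    dir="$1"
    owner="${2:-}"
    mkdir -p "$dir" || return 1
    chmod 700 "$dir" || return 1
    if [ -n "$owner" ]; then
        chown "$owner:$owner" "$dir" || return 1
    fi
}

# Example, run as root:
#   prepare_cache_dir /tmp/nginx www-data
```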
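To make the relationship between the key and the directory levels concrete, the sketch below reproduces how nginx derives a cache file path: the file name is the MD5 digest of the key, and with levels=1:2 the first directory is the last character of the digest and the second is the two characters before it. The `cache_file_path` helper, the `example.com` host and the `logo.png` file are illustrative assumptions; /tmp/nginx and levels=1:2 are the values used in the configuration of this article.

```shell
# cache_file_path KEY [CACHE_ROOT]
# Prints where an entry for KEY would be stored under levels=1:2:
# the file name is the MD5 hex digest of the key, the first directory is
# the digest's last character and the second is the two characters before it.
cache_file_path() {
    key="$1"
    root="${2:-/tmp/nginx}"
    hash=$(printf '%s' "$key" | md5sum | cut -d' ' -f1)
    level1=$(printf '%s' "$hash" | cut -c32)
    level2=$(printf '%s' "$hash" | cut -c30-31)
    printf '%s/%s/%s/%s\n' "$root" "$level1" "$level2" "$hash"
}

# Key for a GET of http://example.com/static/logo.png (hypothetical host and
# file), built as $scheme$request_method$host$request_uri:
cache_file_path "httpGETexample.com/static/logo.png"
```

Because $request_uri includes the query string, two URLs that differ only in their parameters produce different digests and are cached as separate entries.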
#/etc/nginx/sites-enabled/caching-server
proxy_cache_path /tmp/nginx levels=1:2 keys_zone=assets_zone:10m inactive=60m;
proxy_cache_key "$scheme$request_method$host$request_uri";
server {
    listen 80;
    server_name {{your ip or subdomain}};
    access_log /var/log/nginx/caching-server.log;
    location /static/ {
        proxy_cache assets_zone;
        add_header X-Proxy-Cache $upstream_cache_status;
        include proxy_params;
        proxy_pass http://localhost:81/;
    }
}
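One way to check the setup is to request the same resource twice through the proxy and compare the X-Proxy-Cache header. The sketch below assumes curl is installed and that a file named logo.png (hypothetical) exists in /var/www/assets; the two helper functions are illustrative.

```shell
# parse_cache_status: reads raw response headers on stdin and prints the
# value of the X-Proxy-Cache header (header names are case-insensitive).
parse_cache_status() {
    tr -d '\r' | awk -F': ' 'tolower($1) == "x-proxy-cache" { print $2 }'
}

# cache_status URL: sends a GET, discards the body and prints the cache status.
cache_status() {
    curl -s -o /dev/null -D - "$1" | parse_cache_status
}

# Example against the proxy:
#   cache_status http://localhost/static/logo.png   # should report MISS on a cold cache
#   cache_status http://localhost/static/logo.png   # should report HIT afterwards
```

The helper sends a GET rather than a HEAD request on purpose: the cache key in this configuration includes $request_method, so a HEAD request would be looked up under a different key than the GET requests browsers send.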
I hope this article has been interesting and useful. Tell us about your experiences optimizing your website, the problems you have faced and the solutions you have implemented.
Top comments (7)
Hi,
Thanks for the article. I have a question. If I cache all my assets (images, fonts, CSS), how does the nginx cache work? If I hard refresh the page, does it get the data from the folder or from the cache?
You are welcome @nimatullah and thanks for reading my articles.
Nginx's cache is powerful! It can cache proxied HTTP requests, however it can also cache the results of FastCGI, uWSGI proxied requests and even the results of load balanced requests (requests sent "upstream"). This means we can cache the results of requests to our dynamic applications.
By the way, I suggest you read this article: nginx.com/blog/nginx-caching-guide/ or do some research on Google.
Hi Shameen, very interesting article! I have a doubt about using Nginx with dynamic content. Could I store on cache some JSON responses from my Rails server? Many thanks
Yes you can. But probably you need to do some sockets configuration to cache JSON response on Rails server. You can read this guide for clear understanding: medium.com/@leshchuk/http-cache-on...
Thanks for the article @Shammeem Reza. Loves me some nginx. So many features, so little time.
You are welcome @david_j_eddy and happy to know that you like it.
Thanks for the article! But is there any way to tell nginx-proxy which content to cache? URL-masks, regex, wildcards or something - WITHOUT touching the origin server?