In the previous article on basic CDN components, we described what you need to build a CDN. Today we will focus on the software configuration of the servers and of the reverse proxy itself, which caches the content so that the data is always as close as possible to the end visitors.
The primary goal of this article is not to give you specific values for each setting (although we will recommend some), but to tell you what to look for and what to watch out for. We ourselves keep tuning and optimizing the specific values over time according to the traffic and the collected monitoring data. It is therefore essential to understand the individual settings and adjust them with respect to your hardware and expected traffic.
At SiteOne we have the vast majority of servers running on Linux — specifically Gentoo and Debian distributions. In the case of CDN, however, all our servers are running on Debian, so any detailed tips will include Debian paths/settings.
In the area of OS and kernel, we recommend focusing on the following parameters, which will significantly affect how much traffic each server can handle without rejecting TCP connections or hitting other limits:
- Configure /etc/security/limits.conf: set significantly higher soft and hard limits, especially nproc and nofile, for the nginx process (tens to hundreds of thousands).
- Ideally, configure the kernel via sysctl.conf and focus on the parameters you see in the recommended configuration below. It’s a good idea to study each parameter, understand how it affects your operation, and set it accordingly.
- If you have kernel 4.9+ you can enable the TCP BBR algorithm to reduce RTT and increase the speed of content delivery. Parameters: net.ipv4.tcp_congestion_control=bbr, net.core.default_qdisc=fq (more info in the article at Cloudflare).
- Check the RX-DRP value with netstat -i; if it reaches the millions after a couple of days and keeps increasing, increase the RX/TX ring buffers on the network card. To find the current and maximum values, use ethtool -g YOUR-IFACE, and set new values with ethtool -G, for example ethtool -G ens192 rx 2048 tx 2048. To make the setting survive a reboot, call the command from a post-up script in /etc/network/interfaces or from /etc/rc.local. If you are modifying the network interface that connects you to the server, be careful: the change will restart the interface.
- It is recommended to raise txqueuelen on the network cards from the default 1000, depending on your connectivity and network card.
- Set the IO scheduler on each disk/array depending on what storage you are using (/sys/block/*/queue/scheduler). If you are using SSD or NVMe drives, we recommend none.
- Iptables or router: it is recommended to set hard limits on the number of simultaneous connections from one IP address and on the number of connections per unit of time. In the event of a DoS attack, you can then filter out a large part of the traffic efficiently already at the network level. However, set the limits with possible visitors behind NAT in mind (multiple legitimate visitors behind one IP address is a typical situation, e.g., with mobile operators or smaller local ISPs).
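For illustration, a limits.conf fragment along the lines described above; the exact numbers are examples to tune against your hardware, not a recommendation:

```
# /etc/security/limits.conf -- illustrative values for the nginx user
nginx  soft  nofile  200000
nginx  hard  nofile  200000
nginx  soft  nproc   65536
nginx  hard  nproc   65536
```

Note that for a service started by systemd, the systemd unit limits (LimitNOFILE, discussed below) take precedence over limits.conf.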
When setting individual parameters, consider what the typical traffic of a visitor retrieving content from the CDN looks like. HTTP/2 is essential, as a visitor usually needs only one TCP connection to download all the content on a page. You can therefore afford shorter TCP connection timeouts, shorter keepalives, and smaller buffers. The metrics you collect, such as the number of TCP connections in each state, will tell you a lot about real traffic. If you want to handle tens of thousands of visitors within seconds or minutes, forget the default timeouts of several minutes and test values of single seconds to tens of seconds.
The values of each setting should be taken only as our recommendation, which has proven to work well for a server with 4–8 GB RAM, 4–8 vCPUs and Intel X540-AT2 or Intel I350 network cards. Some directives have values an order of magnitude higher or lower than the distribution defaults. These are usually modifications to increase the ability to handle heavy traffic efficiently and to minimize the impact of a DoS or DDoS attack. Note also that the configuration is for a server with IPv6 support disabled. If your situation allows it, use IPv6 too.
fs.aio-max-nr = 524288
fs.file-max = 611160
kernel.msgmax = 131072
kernel.msgmnb = 131072
kernel.panic = 15
kernel.pid_max = 65536
kernel.printk = 4 4 1 7
net.core.default_qdisc = fq
net.core.netdev_max_backlog = 262144
net.core.optmem_max = 16777216
net.core.rmem_max = 16777216
net.core.somaxconn = 65535
net.core.wmem_max = 16777216
net.ipv4.conf.all.accept_redirects = 0
net.ipv4.conf.all.log_martians = 1
net.ipv4.conf.all.rp_filter = 1
net.ipv4.conf.all.secure_redirects = 0
net.ipv4.conf.all.send_redirects = 0
net.ipv4.conf.default.accept_redirects = 0
net.ipv4.conf.default.accept_source_route = 0
net.ipv4.conf.default.rp_filter = 1
net.ipv4.conf.default.secure_redirects = 0
net.ipv4.conf.default.send_redirects = 0
net.ipv4.ip_forward = 0
net.ipv4.ip_local_port_range = 1024 65535
net.ipv4.tcp_congestion_control = bbr
net.ipv4.tcp_fin_timeout = 10
net.ipv4.tcp_keepalive_intvl = 10
net.ipv4.tcp_keepalive_probes = 5
net.ipv4.tcp_keepalive_time = 60
net.ipv4.tcp_low_latency = 1
net.ipv4.tcp_max_orphans = 10000
net.ipv4.tcp_max_syn_backlog = 65000
net.ipv4.tcp_max_tw_buckets = 1440000
net.ipv4.tcp_moderate_rcvbuf = 1
net.ipv4.tcp_no_metrics_save = 1
net.ipv4.tcp_notsent_lowat = 16384
net.ipv4.tcp_rfc1337 = 1
net.ipv4.tcp_rmem = 4096 87380 16777216
net.ipv4.tcp_sack = 0
net.ipv4.tcp_slow_start_after_idle = 0
net.ipv4.tcp_synack_retries = 2
net.ipv4.tcp_syncookies = 1
net.ipv4.tcp_syn_retries = 2
net.ipv4.tcp_timestamps = 0
net.ipv4.tcp_tw_reuse = 1
net.ipv4.tcp_window_scaling = 0
net.ipv4.tcp_wmem = 4096 65536 16777216
net.ipv6.conf.all.disable_ipv6 = 1
net.ipv6.conf.default.disable_ipv6 = 1
net.ipv6.conf.lo.disable_ipv6 = 1
vm.dirty_background_ratio = 2
vm.dirty_ratio = 60
vm.max_map_count = 262144
vm.overcommit_memory = 1
vm.swappiness = 1
On all PoP servers, you need a critical CDN component: a reverse proxy with robust caching support. The most popular are Varnish, Squid, Nginx, Traefik, H2O and, with limited functionality, e.g. HAProxy. Tengine, built on Nginx and adding a lot of interesting functionality, is also worth considering.
In the context of a CDN, the functionality of the reverse proxy is quite clear — based on the URL and request headers, find the content in the cache and if it is not there, or has expired, download it from the Origin server and store it in the cache so that the next visitor’s request is processed faster, from the cache on the PoP.
We eventually chose the Nginx web server because we have been using it successfully on most of our servers for many years. We have all the configurations and different vhost variants, as well as optimal functional, performance and security settings, in Ansible. As for the specific version, we recommend the latest 1.19.x, which already includes the improved HTTP/2 implementation, together with OpenSSL 1.1.1 for TLSv1.3 support.
Compared to our normal default values for application servers, we have significantly reduced various buffers, timeouts, and thresholds for CDNs, as well as for the kernel. Our CDN is optimized for static content and for handling only GET/HEAD/OPTIONS requests. Since we don’t have to support POST or uploads anymore, we could tighten the parameters significantly, both on the client side and on the backend (requests to source origin servers).
The following text assumes that you already have at least basic experience with Nginx — that’s why there are no specific configuration snippets, but rather various recommendations beyond basic usage that you won’t usually find in Nginx tutorials and have a significant impact on CDN operation.
Cache is a key functionality of a CDN, so we recommend:
- Check out the High-Performance Caching guide. For the proxy cache, carefully study and understand all proxy_cache_* directives and their parameters. Start with proxy_cache_path and its levels, keys_zone, inactive and max_size parameters. For remote secondary PoPs, you can set inactive to weeks or months, for example; the cache manager will then also keep content that has not been accessed for longer, which increases the accelerating effect of the CDN and the cache hit ratio even for PoPs from which the content of specific URLs is not downloaded as often.
- Optimally set the proxy_cache_valid directive, which determines how long responses with the individual HTTP codes are cached. If you decide to cache error codes, e.g. 400 Bad Request, cache them only for a very short time to minimize the effects of possible cache poisoning.
- If you don’t want the origin to control caching through its response headers, you can use proxy_ignore_headers to ignore typically the Cache-Control, Expires or Vary headers.
- Also pay attention to proxy_cache_use_stale, which determines how the cache behaves when the origin is unavailable. We decided that if the origin happens to be down and the cached content has expired, we still return the stale content to the visitor; this improves availability. Also enable serving of stale content while it is being refreshed (the updating parameter, together with proxy_cache_background_update) so that after expiration the visitor is served immediately from the cache (without waiting for the origin) while the content is updated from the origin in the background for future visitors. This eliminates the occasional slowdown where once in a while a visitor gets stuck waiting for the CDN to refresh the expired content of a given URL.
- Decide what to put in the proxy_cache_key. For example, do you want to include a possible query string in the cache key? It is often used to “version” files and bust the cache of the old version of a file.
- Activate proxy_cache_lock to keep cache filling efficient even under high parallelism, and decide how to set proxy_cache_min_uses.
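To tie the cache-related directives above together, here is a minimal sketch; the paths, zone name, sizes and times are illustrative examples, not our production values:

```nginx
# Illustrative values only -- tune levels/keys_zone/inactive/max_size to your HW.
proxy_cache_path /var/cache/nginx/cdn levels=1:2 keys_zone=cdn_cache:256m
                 inactive=30d max_size=50g;

server {
    location / {
        proxy_cache cdn_cache;
        proxy_cache_key $scheme$host$uri$is_args$args;   # decide: query string in or out?
        proxy_cache_valid 200 301 302 7d;
        proxy_cache_valid 400 404 10s;                   # cache errors only briefly
        proxy_cache_use_stale error timeout updating http_500 http_502 http_503;
        proxy_cache_background_update on;                # refresh expired items in the background
        proxy_cache_lock on;                             # only one request fills a missing key
        proxy_cache_min_uses 1;
        proxy_pass https://www.myorigin.com;
    }
}
```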
In addition, consider the following tips and settings that affect Nginx performance:
- If your platform allows it, set use epoll. If you have kernel 4.5+, it will use EPOLLEXCLUSIVE.
- For the listen directive of the main node of your CDN (cdn.company.com), use reuseport so that requests are distributed to the individual Nginx workers by the kernel, which is many times more efficient. For the listen directive, also study the backlog and fastopen parameters. You can also activate deferred so that a request reaches Nginx only when the client has actually sent the first data, which can better handle some types of DDoS attacks.
- Activate http2 on the listen directive and always keep a secure set of ssl_ciphers (with respect to the browser versions you want to support).
- If you can afford to do so given the browsers supported, only support TLSv1.2 and TLSv1.3.
- The CDN server’s CPU will be loaded mostly by gzip/brotli compression and SSL/TLS communication. Set ssl_session_cache to minimize full SSL/TLS handshakes. We recommend shared so that the cache is shared between all workers; a cache size of 50 MB, for example, fits about 200,000 sessions. To further reduce the number of SSL/TLS handshakes, you can increase ssl_session_timeout. If you don’t want to use a server-side session cache, enable ssl_session_tickets so that session resumption works at least via tickets kept by the browser.
- For SSL settings, activate 0-RTT on TLSv1.3 (ssl_early_data on) to substantially reduce latency, but understand and consider replay attacks.
- If you want to achieve minimal TTFB (at the expense of higher load when transferring large files), study and set reasonably low ssl_buffer_size and http2_chunk_size. Alternatively, deploy the Cloudflare patch for Nginx, which supports dynamic TLS record sizing; just google the ssl_dyn_rec_size_lo directive.
- Also focus on understanding and setting up keepalive, both on the client side and in the upstreams; this will help streamline communication with the origin servers. Keepalive for HTTP/2 is governed by the http2_idle_timeout directive (default: 3 min); also look at http2_recv_timeout. Keeping connections open unnecessarily long significantly reduces the number of visitors you are able to serve, and it also affects how large a DDoS attack you can withstand. It’s good to understand how connection tracking works (both on Linux and possibly on routers when the server is behind NAT), how it relates to the limit_conn setting, and how the whole thing behaves when hundreds of thousands of clients access your servers or you are under a DDoS attack on L7.
- If you need to detect a change in the IP address of the origin and you don’t have paid Nginx Plus with the resolve parameter on the upstream server, you can use proxy_pass https://www.myorigin.com; directly instead of defining an upstream. Note, however, that with a static hostname Nginx resolves the name only when the configuration is loaded; to make it honor the DNS TTL and pick up IP address changes at runtime, specify a resolver and put the hostname into a variable that you pass to proxy_pass.
- Also study the lingering_close, lingering_time, and lingering_timeout directives, which determine how quickly inactive connections are closed. For better resistance to attacks, it makes sense to reduce the default times. For HTTP/2 connections, however, the lingering_* directives are only applied since Nginx 1.19.1.
- Increase ULIMIT in /etc/default/nginx and also set a higher LimitNOFILE in /etc/systemd/system/nginx.service.d/nginx.conf.
- The sendfile, tcp_nopush and tcp_nodelay also help to handle files and requests quickly. To prevent clients with fast connections downloading large files from using up the entire worker process, set sendfile_max_chunk sensibly as well.
- If you are handling very large files and seeing slowdowns in other requests, consider using aio. Be sure to set the directio directive appropriately; it defines the maximum file size that will still be sent via sendfile, while larger files go through aio. We find 4 MB to be the optimal value, so all JS/CSS/fonts and most images are handled through sendfile, usually straight from the FS cache, generating almost no IO.
- Also look at the directives around open_file_cache. With optimal settings and enough RAM you will have almost zero IOPS, even if you are pushing hundreds of Mbps.
- To handle high numbers of concurrent visitors and protect yourself from attacks, reduce client_max_body_size, client_header_timeout, client_body_timeout, and send_timeout as a matter of principle.
- For the access log settings, study the buffer and flush parameters to minimize the IOPS associated with writing logs. Beware that the logs will then not be written in strictly chronological order. Access logs should ideally be stored on a different disk than the cached data.
- For upstreams, you can play with load balancing (if the origin is reachable via multiple IP addresses) and with the backup and weight attributes. In the current version, the useful max_conns attribute, which was long available only in the paid version, is now free.
- If you also want some form of auto-retry logic (for short outages of the origin), you can solve it, for example, with multiple upstream servers pointing to the same origin, with a vhost in between running a short piece of Lua code that sleeps between retry requests.
- Use a custom resolver setup and consider using the local dnsmasq as the primary resolver.
- Learn how the Cache Manager works in Nginx, which starts working especially when the cache gets full.
- Not everything can be covered here, but other directives also affect proxy and cache behavior, and we recommend studying and setting them as well: proxy_buffering, proxy_buffer_size, proxy_buffers, proxy_read_timeout, output_buffers, reset_timedout_connection.
- If you use dynamic modules with Nginx (in our case for brotli compression and WAF), you have to recompile all modules against the new Nginx version with every Nginx upgrade. If you don’t, Nginx won’t start after the upgrade due to signature mismatches with the *.so modules. It is therefore better to automate the whole Nginx upgrade process; otherwise you will end up with a broken Nginx when, e.g., apt upgrades packages globally. Part of this automation should be the on-the-fly upgrade, where Nginx keeps the old instance running (from memory) while starting (or at least trying to start) the new instance from the current binary and modules. This ensures you don’t lose a single request during the upgrade, even if the new Nginx fails to start for some reason. In most distributions this process is available in the init scripts under the upgrade action, i.e. service nginx upgrade. To prevent unwanted Nginx upgrades when upgrading packages globally, use apt-mark hold/unhold nginx.
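Several of the listen/TLS/performance tips above can be sketched in one server block; treat the values as illustrative examples to tune, not as our production configuration:

```nginx
server {
    # reuseport: kernel distributes accepted connections across workers;
    # deferred: hand the connection to nginx only once the client has sent data.
    listen 443 ssl http2 reuseport backlog=65535 fastopen=256 deferred;
    server_name cdn.company.com;
    # ssl_certificate / ssl_certificate_key omitted for brevity

    ssl_protocols TLSv1.2 TLSv1.3;
    ssl_session_cache shared:SSL:50m;   # ~200k sessions, shared by all workers
    ssl_session_timeout 8h;
    ssl_early_data on;                  # 0-RTT: weigh against replay attacks
    ssl_buffer_size 8k;                 # lower = better TTFB, higher = throughput

    sendfile on;
    sendfile_max_chunk 512k;            # stop fast clients from monopolizing a worker
    tcp_nopush on;
    tcp_nodelay on;

    aio threads;                        # requires nginx built --with-threads
    directio 4m;                        # files >= 4 MB go through aio, smaller via sendfile
    open_file_cache max=100000 inactive=30s;

    client_max_body_size 1m;            # CDN serves GET/HEAD/OPTIONS only
    client_header_timeout 10s;
    client_body_timeout 10s;
    send_timeout 20s;
}
```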
Depending on what content and origin behavior you want to support, you will need to study and possibly debug the behavior of the CDN cache with respect to the Cache-Control header or, quite fundamentally, the Vary header. For example, if the origin responds with Vary: User-Agent, the cache key should include the client’s user agent; otherwise you can easily end up returning cached HTML for the mobile version to someone on a desktop. But that depends on what scenarios and content types you want (or don’t want) to support. Supporting these scenarios often means a lot of work, and it also reduces the efficiency of the cache. Usually you won’t get by with native Nginx directives and will have to handle some scenarios with Lua scripts.
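One common compromise for the Vary: User-Agent case is to normalize the user agent into a small set of device classes and put that class into the cache key instead of the raw header, so the cache is not fragmented into one entry per browser string. A sketch, with an intentionally crude, hypothetical classification:

```nginx
# Illustrative example: collapse User-Agent into "mobile"/"desktop" for the cache key.
map $http_user_agent $device_class {
    default                            desktop;
    "~*(android|iphone|ipad|mobile)"   mobile;
}

server {
    location / {
        proxy_cache cdn_cache;
        # One cached copy per device class instead of per User-Agent string.
        proxy_cache_key $scheme$host$uri$is_args$args-$device_class;
        proxy_pass https://www.myorigin.com;
    }
}
```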
Finally, I’ll mention that in the case of Nginx you also have a paid version Nginx Plus which offers various useful functionalities, a live dashboard and extra modules. Important is for example the resolve directive of the upstream server, which in conjunction with the resolver directive can detect a change in the IP address of the origin. However, the cost per instance is in the thousands of dollars per year, so its use would only make sense for a large commercial solution. If you don’t have thousands of dollars and would still like to have a realtime view of Nginx traffic, we recommend buying the $49 Luameter (demo). It works well, but if you’ll be handling hundreds of requests per second and a lot of unique URLs, expect increased load and RAM requirements. We have it disabled by default and only activate it when debugging.
Below we have prepared a sample, averaged basic Nginx configuration. In this model example it does not act as a reverse proxy in front of the whole domain, but provides a CDN endpoint
https://cdn.company.com/myorigin.com/*.(css|js|jpg|jpeg|png|gif|ico) that retrieves content from the origin
https://www.myorigin.com/*. Averaged, because we further modify some directives according to the hardware of the individual PoP servers, and it also omits some additional security mechanisms that we don’t want to expose. On the servers this configuration is of course split into separate configuration files, which in our case we generate via Ansible.
The settings differ especially at the level of individual locations/origins, because you may want differently composed cache keys, cache validity, limits, cookie ignoring, WebP or AVIF support (or not), referer validation, active CORS-related settings, or perhaps the slice module, where you have to cache the 206 response code and the cache key must also contain $slice_range. Similarly, for some origins you may want to ignore Cache-Control headers entirely and cache everything for a fixed time, or apply other per-origin specialties.
The configuration also contains various per-origin directories or files — these must of course be set up by your automation, which you are using to introduce the new origin into your CDN. So really just take this as a guide on how to grab and set up the various functionalities.
We did a random test of two commercial CDNs that have servers in Prague, and apparently neither provider uses this great functionality. Such commercial CDNs have to compress content with brotli or gzip on every request, which drastically drains their CPUs and increases response times several times over, and the visitor pays for it.
So how to solve the compression? In Nginx, you can enable static compression for both gzip and brotli (gzip_static on; brotli_static on; ). This, if understood and implemented correctly, can reduce the CPU load quite substantially and at the same time speed up the visitor’s loading time.
The way it works: when static compression is active and the browser requests e.g. /js/file.js, Nginx checks whether a pre-compressed file /js/file.js.gz or /js/file.js.br already exists on disk. If it does, Nginx sends it straight away (without burning CPU on compression). The compression types the browser supports are sent in the Accept-Encoding header (br takes precedence over gzip if the browser supports it).
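Enabling this in Nginx is short (brotli_static assumes the ngx_brotli module is compiled in):

```nginx
# Serve pre-compressed variants if they exist next to the original on disk.
gzip_static on;      # looks for <file>.gz
brotli_static on;    # looks for <file>.br (requires the ngx_brotli module)
```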
Nginx does not create the .br or .gz files for you, nor does it try to download them from the origin. Frontend builds often create these *.br or *.gz files for their JS/CSS as part of the build, but they are simply not used here. You have to provide them yourself in your CDN. We made a background process that continuously parses the access logs and extracts “200 OK” requests for text files that don’t yet have their *.br or *.gz.
Because this is a background process, you can afford the highest, most efficient, but also slowest compression level. You load the CPU a bit once, and the reward is an additional 5–15% smaller transfer size. The decompression speed in browsers is affected only minimally (you can find benchmarks on this). Don’t forget to figure out how you will clean up the *.br and *.gz files once they expire, and how (and whether) you will handle URLs whose query string contains e.g. ?v=1.0.5 to force download of a new version of the file.
However you implement static compression, make sure the files appear atomically. In other words, first write the *.br or *.gz output to a temporary file in the same directory, and only when it is completely finished, rename it to the destination path where Nginx expects it. That way nobody downloads an invalid (partial) file if a visitor hits the URL at the moment you are compressing it.
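A minimal sketch of such an atomic pre-compression step in shell; the docroot and file names are made-up examples, and a real background job would pick its candidates from the access logs:

```shell
#!/bin/sh
# Sketch of atomic pre-compression so nginx (gzip_static on) never serves a
# half-written .gz file. Paths are illustrative examples.
set -eu
DOCROOT=/tmp/cdn-demo
mkdir -p "$DOCROOT/js"
echo 'console.log("hello");' > "$DOCROOT/js/app.js"

compress_atomically() {
    src="$1"
    # Compress into a temporary file in the same directory (same filesystem),
    # so the final rename below is a single atomic operation.
    tmp="$(mktemp "${src}.gz.XXXXXX")"
    gzip -9 -c "$src" > "$tmp"
    mv "$tmp" "${src}.gz"   # readers see either no file or a complete one
}

compress_atomically "$DOCROOT/js/app.js"
gzip -t "$DOCROOT/js/app.js.gz" && echo "OK"
```

The same pattern applies to brotli: compress to a temporary name first, then mv into place.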
Since we usually cache content in the browser for months, such a visitor would have downloaded e.g. broken JS/CSS until the cache is cleared, which is very annoying. We all know how unprofessional it is when developers tell a client to clear their browser cache.
Hint: if you don’t have a background process to handle static compression for you, leave static compression disabled. Otherwise you only increase IOPS unnecessarily, as Nginx keeps looking for non-existent *.gz or *.br variants.
If you want to reduce image sizes by 30% to 90% (depending on how optimized the source images already are), you can arrange smart image conversion to the modern WebP or AVIF formats.
Be careful with the AVIF format though: while it is fully supported and works well in Google Chrome, support in Firefox is still experimental and exhibits various bugs described in this ticket, which manifest, for example, as some images not being displayed. This experimental support is disabled by default, however, so Firefox does not send image/avif in the Accept request header.
For inspiration, this is how we implemented WebP/AVIF support:
- A background process analyzes the access logs and finds the most frequently downloaded images above a defined minimum data size.
- Using the cwebp and cavif converters, it converts the source image, e.g. /images/source.jpg, to /images/source.jpg.webp (atomically, as with static compression).
- In Nginx we have logic that, when image/avif or image/webp appears in the request’s Accept header, tries to send the requested file with the additional extension .avif or .webp, if it exists on disk. The solution can be based on a combination of map and try_files, or on composing the contents of a variable and using ifs.
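The map + try_files variant could look roughly like this; the file layout is illustrative, and a production version would tune the matching and caching details:

```nginx
# Pick a suffix based on what the browser advertises in the Accept header.
map $http_accept $img_suffix {
    default          "";
    "~image/avif"    ".avif";
    "~image/webp"    ".webp";
}

server {
    location ~* \.(jpe?g|png)$ {
        add_header Vary Accept;   # cached responses differ by Accept
        # Serve /images/foo.jpg.avif or .webp if a converter already produced it,
        # otherwise fall back to the original file.
        try_files $uri$img_suffix $uri =404;
    }
}
```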
If we have a real need for this, we may eventually centralise the process. That is, this process will not be done by each server separately, but will be managed by some central system that can select suitable images for optimization from the central logs, keeping statistics of real data savings by transfers, etc. This brings a certain degree of flexibility and the possibility to perform some operations in bulk. However, on the other hand, we like that the decentralization of these processes and the maximum autonomy of the individual PoPs minimizes the risk that some bug will reach the whole CDN. Another advantage is that each PoP optimizes its most loaded content according to the visitors there.
It’s important to note that if you deploy a CDN and the images in your HTML are suddenly loaded from another domain (unless you happen to use the CDN as a proxy for the entire site/domain), search engines will not index them as belonging to your domain, but to the CDN domain. You obviously don’t want that.
The solution is to provide canonicalization in Nginx using the HTTP Link header, which tells the search engine where the actual source (origin) is. This way it will not index the image under the CDN domain, but under the source domain specified in the Link header. For optimal image indexing, we recommend that you also generate sitemap for images.
Example: the URL
https://cdn.company.com/myorigin.com/image.jpg should return the HTTP header:
Link: <https://www.myorigin.com/image.jpg>; rel="canonical"
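In Nginx this can be a short location block; the capture-based URL-to-origin mapping below is a simplified, hypothetical example (it assumes every origin uses a “www.” host):

```nginx
# CDN URLs look like /myorigin.com/<path>; named captures rebuild the origin
# URL for the canonical Link header. Simplified illustration only.
location ~ ^/(?<origin_domain>[^/]+)/(?<origin_path>.+)$ {
    resolver 127.0.0.1;   # required because proxy_pass below uses variables
    add_header Link '<https://www.$origin_domain/$origin_path>; rel="canonical"';
    proxy_pass https://www.$origin_domain/$origin_path;
}
```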
The primary and preferred way of using our CDN is very simple and is also evident from the sample Nginx configuration.
If we want to deploy the CDN for content on e.g.
www.myorigin.com, the web developers just need to ensure that a file such as
/js/script.js is addressed as
https://cdn.company.com/myorigin.com/js/script.js. The base URL is our GeoCDN domain, followed by the domain of the origin (without the “www”), ending with the path to the file on the origin.
The CDN administrators control which origin domains our CDN supports through Ansible. In Ansible, administrators can also set some specific behavior for each origin. In addition, for each origin it is possible to specify what type of content is supported, restrict URL shapes, define custom WAF rules, etc.
Tip: if you want to deploy a CDN to your site without requiring a single intervention in the application code and you are using Nginx, you can very easily help yourself with the native Nginx sub module. This allows you to easily replace the paths to selected files so that they are addressed from the CDN (typically in HTML or CSS).
The example shows that it requires href/src to be the first attribute of the HTML tag; unfortunately, sub_filter does not support regular expressions. If this is not sufficient for you, you can do the substitution in the application code. You’re probably using a templating system that forces some form of base-path variable anyway, so this should be a piece of cake.
Note 1: for content substitution to work, you must also set proxy_set_header Accept-Encoding "";, so that the origin returns the text content uncompressed and the strings can be substituted.
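A sketch of such a sub_filter setup; the exact strings to replace depend on your HTML, so these are illustrative:

```nginx
location / {
    proxy_pass https://www.myorigin.com;
    # Origin must send plain text, otherwise sub_filter cannot match anything.
    proxy_set_header Accept-Encoding "";

    sub_filter_types text/html text/css;
    sub_filter_once off;   # replace every occurrence, not just the first
    # Rewrite asset paths to the CDN (href/src must come first in the tag).
    sub_filter 'src="/js/'   'src="https://cdn.company.com/myorigin.com/js/';
    sub_filter 'href="/css/' 'href="https://cdn.company.com/myorigin.com/css/';
}
```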
Note 2: since the CDN here is not deployed as a reverse proxy for the entire origin domain, content loads faster in the browser. The browser can parallelize more (HTML and assets are loaded from different IP addresses), so the resulting page build and render time is shorter. In full reverse-proxy mode in front of the origin, HTTP/2 multiplexing and prioritization help a lot, but when the browser can load content from multiple IP addresses, it is still a bit more efficient.
With the help of the previous article on CDN components and this article, you should be able to get your CDN up and running with all the basic functionality.
I hope that this article has helped you and that someone may have found some ideas or settings that will help them to improve their web or application server.
If anyone has additional tips when looking at the proposed settings, or sees any threats in our configuration, we would be happy to hear them in the discussion. We’ve been tweaking the settings ourselves for years, reflecting the different needs and attacks our projects have faced, so it’s an ongoing and never-ending process. Additionally, simulating real traffic to verify the effect of some settings is very difficult, so every real-world experience is welcome and we will be grateful if you share it.
In the next and last article of the How to build a CDN series, we will focus on various operational aspects of CDN operation — how to protect the originals, how to defend against DoS/DDoS attacks and how to have the whole CDN operation under control.
Thanks for reading, and if you like the article, I will be happy if you share it or leave a comment. If you are a crypto fan and find this article helpful, you can send “Thank you” to one of these addresses. Thank you and I wish you good health and success in 2022. :-)