DEV Community

Zachary Izdepski

Content Delivery Networks and Caching

So you've just launched your shiny new app and are ready for hordes of curious potential customers to start bombarding you with requests for your new services. You keep a close eye on the traffic and are pleased that things seem to be working fine, with no crashes or bugs reported. But within a few hours of going live you begin to notice a disturbing trend: people are leaving your site shortly after arrival. What's worse, they're leaving without delving into all the features you painstakingly came up with! So you hop on your own app to see what your quickly dwindling users are seeing, and are mortified when you have to wait 3 seconds (an eternity in the web development world) for a server request to complete. You have latency issues, and your users are moving on as a result.

Those of us who grew up in the early days of the web know how much faster web applications have become over the years. We've never had it so good, and you can't help but notice, or even be a little frustrated by, websites and apps that don't respond in the blink of an eye. We can actually measure this impatience with something called Bounce Rate: the percentage of users who exit, or "bounce", from a site or app before moving on to the next page or using any features. The average bounce rate for fast applications (loading times of less than 400 milliseconds) sits at around 7%. For loading times of 3 seconds or more the bounce rate jumps up to around 22%, which is quite significant.

So what can you do to speed things up? These days we have fiber optic cables that have boosted the speeds and bandwidth that many ISPs offer, but even fiber optics have limitations. Even though fiber optics use speedy light signals, latency can still build up over long distances. Light travels through glass at only about two-thirds of its speed in a vacuum, and it does not follow the shortest path from end to end: in the simplified picture, the light reflects along the glass core in a zigzag fashion, and the cables themselves rarely run in a straight line. Add to that the frequent need for several request/response cycles to your server or database, and all of a sudden the milliseconds it takes for your users in Sacramento to reach a server in Amsterdam are really adding up. This is one of the many issues that using a Content Delivery Network (CDN) can solve.
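
To get a feel for how those milliseconds add up, here is a rough back-of-the-envelope sketch. The distance (about 8,700 km between Sacramento and Amsterdam), the two-thirds speed factor for light in glass, and the function names are all assumptions chosen for illustration, and the result is a best case that ignores routing, queuing and server processing time.

```python
# Rough, illustrative numbers only; none of these are measurements.
SPEED_OF_LIGHT_KM_S = 299_792   # speed of light in a vacuum
FIBER_SPEED_FACTOR = 0.67       # assumed: light in glass moves at roughly 2/3 of that
DISTANCE_KM = 8_700             # assumed straight-line distance, Sacramento to Amsterdam


def round_trip_ms(distance_km: float, speed_factor: float = FIBER_SPEED_FACTOR) -> float:
    """Best-case round-trip propagation time in milliseconds over fiber."""
    one_way_seconds = distance_km / (SPEED_OF_LIGHT_KM_S * speed_factor)
    return 2 * one_way_seconds * 1000


def total_latency_ms(distance_km: float, round_trips: int) -> float:
    """Propagation delay when a page needs several sequential request/response cycles."""
    return round_trips * round_trip_ms(distance_km)


if __name__ == "__main__":
    print(f"one round trip to the far server:   {round_trip_ms(DISTANCE_KM):.1f} ms")
    print(f"five round trips to the far server: {total_latency_ms(DISTANCE_KM, 5):.1f} ms")
    print(f"five round trips to a server 50 km away: {total_latency_ms(50, 5):.1f} ms")
```

Under these assumptions a single round trip costs roughly 87 ms before the server has done any work at all, so five sequential round trips already eat up more than 400 ms, while the same five trips to a server 50 km away cost only a few milliseconds.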

CDNs use a geographically distributed network of servers and databases to hold copies of data, keeping it available closer to where the user is. They achieve this through the use of Reverse Proxy servers and data Caching. As the name suggests, a Reverse Proxy server is essentially the opposite of a regular (forward) proxy server: a forward proxy sits in front of clients, so the server does not know which client it is really talking to, while a reverse proxy sits in front of servers, so the user, or client, does not know which server is actually handling the request. This allows for the Load Balancing of server traffic, since the network decides which servers should fulfill a given request based on traffic and availability. This "deciding" of where to grab the requested data is generally called request routing, where requests are tracked and "steered" to the appropriate server to optimize efficiency and reduce latency. It should not be confused with Receive Packet Steering (RPS), a Linux kernel feature that spreads the processing of incoming packets across the CPU cores of a single machine.
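
As a minimal sketch of that idea, the toy reverse proxy below picks a server on the client's behalf. The `EdgeServer` and `ReverseProxy` classes, the server names, and the "least busy, then lowest latency" rule are all hypothetical and chosen only for illustration; a real CDN typically makes this decision with DNS, anycast routing and much richer health data.

```python
from dataclasses import dataclass


@dataclass
class EdgeServer:
    name: str
    latency_ms: float          # assumed latency from the client's region
    active_requests: int = 0   # how busy the server currently is
    healthy: bool = True


class ReverseProxy:
    """Toy reverse proxy: the client calls handle() and never chooses a server itself."""

    def __init__(self, servers):
        self.servers = list(servers)

    def pick_server(self) -> EdgeServer:
        candidates = [s for s in self.servers if s.healthy]
        if not candidates:
            raise RuntimeError("no healthy servers available")
        # Prefer the least busy server, breaking ties by lower latency.
        return min(candidates, key=lambda s: (s.active_requests, s.latency_ms))

    def handle(self, path: str) -> str:
        server = self.pick_server()
        server.active_requests += 1
        try:
            # A real proxy would forward the request here; we just report who served it.
            return f"{server.name} served {path}"
        finally:
            server.active_requests -= 1


if __name__ == "__main__":
    servers = [EdgeServer("sacramento-edge", 5.0), EdgeServer("amsterdam-origin", 150.0)]
    proxy = ReverseProxy(servers)
    print(proxy.handle("/logo.png"))   # the nearby edge server answers
    servers[0].healthy = False         # simulate an outage at the edge
    print(proxy.handle("/logo.png"))   # traffic falls back to the other server
```

The client code only ever talks to `proxy.handle(...)`, which is the whole point: servers can be added, removed or taken offline without the user noticing anything but the response time.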

Another integral aspect of a CDN is how data is cached on its servers. CDNs all have some kind of Cache Policy to determine what data gets cached and for how long. A common policy is Least Recently Used (LRU), where the data that has gone the longest without being requested is evicted first once the cache is full. This is often combined with an expiry time (a time to live, or TTL), after which a cached copy is treated as stale. These policies save the application owner on the cost of the server, because the cache does not need to be as large as the database behind it. This is an important consideration because CDN servers typically use Solid State Drives (SSDs), which are more expensive than the hard drives used in a typical database. With this in mind, it is important to get a server that is not so large as to be too expensive, but also not so small that evicted data frequently has to be re-cached. Frequent re-caching means unnecessary calls to the database, which defeats the purpose of a CDN in the first place. Another advantage of using a CDN is a sort of built-in side effect of the network structure itself: because traffic is load balanced across many servers, a DDoS attack is much less likely to overwhelm any one of them.
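
Here is a minimal sketch of least-recently-used eviction, where the entry that has gone longest without a request is dropped once the cache is full. The `LRUCache` class, its `origin_fetches` counter and the `fetch_from_origin` callback are hypothetical names for illustration, not how any particular CDN implements its cache.

```python
from collections import OrderedDict


class LRUCache:
    """Fixed-size cache that evicts the least recently requested entry when full."""

    def __init__(self, capacity: int):
        if capacity < 1:
            raise ValueError("capacity must be at least 1")
        self.capacity = capacity
        self._items = OrderedDict()   # oldest entry first, most recently used last
        self.origin_fetches = 0       # counts the slow trips back to the database

    def get(self, key, fetch_from_origin):
        if key in self._items:
            # Cache hit: mark the entry as the most recently used.
            self._items.move_to_end(key)
            return self._items[key]
        # Cache miss: go back to the origin, then store the result.
        value = fetch_from_origin(key)
        self.origin_fetches += 1
        self._items[key] = value
        if len(self._items) > self.capacity:
            # Over capacity: drop the least recently used entry.
            self._items.popitem(last=False)
        return value


if __name__ == "__main__":
    cache = LRUCache(capacity=2)
    load = lambda key: f"content of {key}"
    for key in ["a.png", "b.png", "a.png", "c.png", "a.png", "b.png"]:
        cache.get(key, load)
    print("trips to the origin:", cache.origin_fetches)
```

Playing with `capacity` shows the trade-off from the paragraph above: a cache that is too small keeps evicting files that are about to be requested again, and every one of those misses is another trip to the origin.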

In sum, CDNs offer speed, data availability, security and scalability, all of which are essential for developers who are bringing their innovations to a competitive market. However, not all data are equal, and not all data are cacheable. CDNs are mostly meant to cache non-essential, transient types of data such as user profiles, videos, images and other static data types. More critical data, such as medical records, banking information and passwords, should not be cached in a CDN and must be stored in a more permanent database.
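
One common way to tell a CDN what it may and may not cache is the HTTP `Cache-Control` response header. The sketch below is only an illustration: the `cache_control_for` helper, the list of "static" content types and the one-day `max-age` are assumptions, and the right values depend on your application.

```python
# Content types treated as static assets in this sketch (an assumed, incomplete list).
STATIC_PREFIXES = ("image/", "video/", "font/")
STATIC_TYPES = {"text/css", "text/javascript", "application/javascript"}

ONE_DAY_SECONDS = 86_400  # assumed cache lifetime for static assets


def cache_control_for(content_type: str, sensitive: bool = False) -> str:
    """Return a Cache-Control header value for a response."""
    if sensitive:
        # Medical records, banking data and the like: no cache may store a copy.
        return "no-store"
    if content_type.startswith(STATIC_PREFIXES) or content_type in STATIC_TYPES:
        # Static assets: shared caches such as a CDN may keep them for a day.
        return f"public, max-age={ONE_DAY_SECONDS}"
    # Everything else: only the user's own browser may keep it, and it must revalidate.
    return "private, no-cache"


if __name__ == "__main__":
    print(cache_control_for("image/png"))
    print(cache_control_for("application/json", sensitive=True))
    print(cache_control_for("text/html"))
```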

Top comments (1)

Dosa Nasyak

I use G-core's CDN and recommend it to everyone. CloudFront is for very large companies that do not care about the budget at all.

CDN from G-corelabs 5 TB/month
Price - €100, online chat, support available 24/7, L3, L4 Protection+WAF

CDN from AWS CloudFront 5 TB/month
Price - €350 and more, delayed notifications in CloudWatch.