Introduction - Content Delivery Network
- CDNs are important because they improve performance, scalability, availability, and reliability of services. Understanding what they are can help you not only work with them but also apply their ideas to solve similar problems in your own reverse proxies or infrastructure.
Short history:
- Before CDNs, websites depended on single origin servers, which caused high latency and outages under traffic spikes. In the late 1990s, Akamai introduced distributed caching and routed users to nearby servers, selling it as a premium acceleration service for large enterprises. With the rise of video platforms like YouTube and Netflix, CDNs became essential for handling massive global traffic and streaming. Over time, pricing dropped and delivery became commoditized, so today CDNs compete mainly on the services built on top, such as WAF, DDoS protection, TLS termination, bot management, API protection, analytics, and edge compute, rather than on content delivery alone.
Forward and Reverse Proxy
- Since a CDN is a reverse proxy, we need to first understand what a reverse proxy is.
Forward Proxy
- Normally called just “Proxy”, it is a server placed in front of client machines that intercepts their internet requests and communicates with external servers on their behalf, acting as a middleman.
- Image reference - Cloudflare Learning
What is a Forward Proxy used for?
- To protect the user's identity online: The destination server sees only the proxy's IP address, so the user's real IP is harder to identify (for example, to avoid censorship by an authoritarian government).
- To bypass firewall restrictions: For example, when a college firewall blocks specific websites, a user can still reach the proxy; the proxy fetches the blocked website and forwards the responses to the user.
- To block access to certain content: A forward proxy can also do the opposite. For example, a school network configured to reach the web through a forward proxy can refuse to forward responses from specific sites.
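The content-blocking case can be sketched as a simple decision function. This is only an illustration, not how any particular proxy product works: the names `is_blocked` and `forward_decision` and the example domains are hypothetical, and the sketch assumes the proxy already knows the requested host name.

```python
from typing import Set


def is_blocked(host: str, blocklist: Set[str]) -> bool:
    """Return True if `host` or any of its parent domains is on the blocklist."""
    labels = host.lower().rstrip(".").split(".")
    # Check "a.b.example.com", then "b.example.com", then "example.com", ...
    candidates = (".".join(labels[i:]) for i in range(len(labels)))
    return any(candidate in blocklist for candidate in candidates)


def forward_decision(host: str, blocklist: Set[str]) -> str:
    """Decide what a filtering forward proxy would do with a request."""
    return "403 Forbidden" if is_blocked(host, blocklist) else "forward to internet"
```

Matching on parent domains means that blocking `example.com` also blocks `video.example.com`, while an unrelated host such as `notexample.com` still passes.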
Reverse Proxy
- Where a forward proxy sits in front of clients, a reverse proxy sits in front of one or more servers. The client believes it is talking to the server directly, but the request first reaches the reverse proxy, which forwards it to one of the real origin servers.
- Image reference - Cloudflare Learning
What is a Reverse Proxy used for?
- Load balancing: When a single origin server can't handle all the requests, we run multiple origin servers and use a reverse proxy as a load balancer to distribute traffic across them.
- Protection from attacks: With a reverse proxy, the origin server doesn't need to reveal its IP, which makes direct attacks harder. The reverse proxy is also a single place to apply security measures.
- Caching: Suppose you have origin servers around the globe, for example one in Brazil and one in England, with reverse proxies that forward traffic to the closest one. Going all the way to the origin on every request still costs time, so the reverse proxy closest to the user can cache responses and serve them directly.
- SSL encryption: Instead of doing SSL or TLS encryption and decryption on every origin server, you can terminate it at the reverse proxy, freeing resources on your real origin servers.
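Two of these uses, load balancing and caching, can be modeled in a few lines. The sketch below is a toy model, not a real proxy: `TinyReverseProxy`, the `fetch(origin, path)` callback, and the TTL value are all assumptions made for illustration, and round robin is just one of several possible balancing strategies.

```python
import itertools
import time
from typing import Callable, Dict, List, Tuple


class TinyReverseProxy:
    """Toy reverse proxy: round-robin load balancing plus a TTL cache."""

    def __init__(
        self,
        origins: List[str],
        fetch: Callable[[str, str], str],  # hypothetical: fetch(origin, path) -> body
        ttl_seconds: float = 60.0,
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._origins = itertools.cycle(origins)  # endless round-robin iterator
        self._fetch = fetch
        self._ttl = ttl_seconds
        self._clock = clock
        self._cache: Dict[str, Tuple[float, str]] = {}  # path -> (expires_at, body)

    def handle(self, path: str) -> str:
        now = self._clock()
        cached = self._cache.get(path)
        if cached is not None and cached[0] > now:
            return cached[1]  # cache hit: no origin is contacted
        origin = next(self._origins)  # cache miss: pick the next origin in turn
        body = self._fetch(origin, path)
        self._cache[path] = (now + self._ttl, body)
        return body
```

Injecting `fetch` and `clock` keeps the sketch independent of any network library and makes the cache expiry easy to exercise in a test.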
TLDR: Forward Proxy vs Reverse Proxy
- The difference is subtle and important. It is not only about where the proxy sits, but also about its purpose.
- Forward Proxy: Sits in front of the client and prevents the origin server from communicating directly with the user. Client → Forward Proxy → Internet → Origin Server
  - In front of clients; forwards user requests to the internet.
- Reverse Proxy: Sits in front of the origin server and prevents the user from communicating directly with the origin server. Client → Internet → Reverse Proxy → Origin Server
  - In front of our servers; manages incoming traffic from the internet to our servers.
How to implement a Reverse Proxy?
- You can use existing software such as NGINX, Apache HTTP Server, and others.
- You can build your own (this is really hard, and you probably won't match the results of mature tools).
- You can use a CDN, and sometimes you will use both: a CDN for edge caching and a reverse proxy near the origin servers for caching, routing, security, and load balancing.
- This is useful in scenarios where you want to take advantage of both strategies.
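To make the "build your own" option concrete, here is a minimal sketch of a reverse proxy using only Python's standard library. It is meant for learning, not production: it handles only GET requests, forwards only the `Content-Type` header, and does no caching or TLS. The names `make_handler` and `serve`, the default port, and the origin URL are assumptions chosen for the example.

```python
import urllib.error
import urllib.request
from http.server import BaseHTTPRequestHandler, ThreadingHTTPServer

# Opener that ignores any system-wide forward proxy settings.
_OPENER = urllib.request.build_opener(urllib.request.ProxyHandler({}))


def make_handler(origin: str):
    """Build a handler class that forwards GET requests to `origin`."""

    class ReverseProxyHandler(BaseHTTPRequestHandler):
        def do_GET(self):
            upstream = urllib.request.Request(origin + self.path)
            # Tell the origin which client address the proxy saw.
            upstream.add_header("X-Forwarded-For", self.client_address[0])
            try:
                with _OPENER.open(upstream, timeout=5) as resp:
                    status, body = resp.status, resp.read()
                    ctype = resp.headers.get("Content-Type", "application/octet-stream")
            except urllib.error.HTTPError as err:
                # The origin answered with 4xx/5xx: relay its status and body.
                status, body = err.code, err.read()
                ctype = err.headers.get("Content-Type", "text/plain")
            except OSError:
                # The origin is unreachable or timed out.
                status, body, ctype = 502, b"Bad Gateway", "text/plain"
            self.send_response(status)
            self.send_header("Content-Type", ctype)
            self.send_header("Content-Length", str(len(body)))
            self.end_headers()
            self.wfile.write(body)

        def log_message(self, *args):
            pass  # keep the example quiet

    return ReverseProxyHandler


def serve(origin: str = "http://127.0.0.1:9000", port: int = 8080) -> None:
    """Run the proxy until interrupted (assumed addresses, adjust as needed)."""
    ThreadingHTTPServer(("127.0.0.1", port), make_handler(origin)).serve_forever()
```

Calling `serve()` with an origin listening on port 9000 would make `http://127.0.0.1:8080/` answer on the origin's behalf. Everything a tool like NGINX adds on top (connection pooling, streaming, caching, TLS, header handling) is exactly what makes a production-grade proxy hard to build.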
FINALLY: What is a CDN in practice?
- A reverse proxy… Eh, actually, multiple reverse proxies spread across the globe.
- Maybe you were expecting something bigger since CDNs solve huge problems, but that's it. The power of a CDN lies partly in its software, but mainly in the infrastructure the provider operates: redundant power, multiple data center sites, and reverse proxies with superpowers such as caching, routing, load balancing, and edge networks.
- “A content delivery network (CDN) is a geographically distributed group of servers that caches content close to end users. A CDN allows for the quick transfer of assets needed for loading Internet content, including HTML pages, JavaScript files, stylesheets, images, and videos.” - Cloudflare Learning
How does a CDN work?
- As mentioned, the main goal of a CDN is to deliver content faster, cheaper, and more reliably.
- To achieve this, a CDN is built as a network of interconnected servers placed in data centers, often at Internet exchange points (IXPs), the places where internet providers connect to exchange traffic between networks.
- In summary: A CDN is formed by multiple servers placed in strategic locations closer to users, enabling high-speed data delivery, and adding more optimizations on top of it to provide better security, reliability, redundancy, and more.
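The idea of serving each user from a nearby location can be sketched as picking the edge with the smallest great-circle distance. This is a simplification: real CDNs typically steer traffic with mechanisms such as anycast or DNS and account for network conditions, not just geography. The edge names and coordinates in `EDGES` are hypothetical.

```python
import math
from typing import Dict, Tuple

Coord = Tuple[float, float]  # (latitude, longitude) in degrees

# Hypothetical edge locations, roughly São Paulo, London, and Tokyo.
EDGES: Dict[str, Coord] = {
    "sao-paulo": (-23.43, -46.47),
    "london": (51.47, -0.45),
    "tokyo": (35.77, 140.39),
}


def haversine_km(a: Coord, b: Coord) -> float:
    """Great-circle distance between two points on Earth, in kilometers."""
    lat1, lon1, lat2, lon2 = map(math.radians, (a[0], a[1], b[0], b[1]))
    h = (
        math.sin((lat2 - lat1) / 2) ** 2
        + math.cos(lat1) * math.cos(lat2) * math.sin((lon2 - lon1) / 2) ** 2
    )
    return 2 * 6371.0 * math.asin(math.sqrt(h))


def nearest_edge(user: Coord, edges: Dict[str, Coord] = EDGES) -> str:
    """Return the name of the edge location closest to the user."""
    return min(edges, key=lambda name: haversine_km(user, edges[name]))
```

For example, a user in Rio de Janeiro at `(-22.9, -43.2)` would be mapped to the `sao-paulo` edge rather than crossing an ocean to reach the others.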
When to use a CDN?
- To reduce website load time: A CDN distributes content closer to users, caches it according to configured rules, and applies other optimizations such as compressing transferred data.
  - “Edge cache” derives from this → cache closer to users
- To reduce bandwidth costs: Without a CDN, your origin servers handle every request. With one, the CDN serves cached content itself, so you spend less on origin bandwidth and hosting.
- Availability, reliability, and redundancy: A CDN can also load balance traffic across several servers and fail over to an available origin server when one goes down, avoiding downtime.
- To improve security: DDoS mitigation, a Web Application Firewall (WAF), and other protections.
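Two of these benefits lend themselves to a small worked example: how much traffic still reaches the origin for a given cache hit ratio, and how failover across origins might look. Both functions are illustrative sketches. The names `origin_egress_gb` and `fetch_with_failover` and the `fetch(origin)` callback are assumptions, and the traffic model ignores details such as revalidation requests.

```python
from typing import Callable, List, Sequence


def origin_egress_gb(total_gb: float, cache_hit_ratio: float) -> float:
    """Traffic the origin still serves when the CDN absorbs cache hits."""
    if not 0.0 <= cache_hit_ratio <= 1.0:
        raise ValueError("cache_hit_ratio must be between 0 and 1")
    return total_gb * (1.0 - cache_hit_ratio)


def fetch_with_failover(origins: Sequence[str], fetch: Callable[[str], bytes]) -> bytes:
    """Try each origin in order and return the first successful response."""
    failures: List[str] = []
    for origin in origins:
        try:
            return fetch(origin)
        except ConnectionError as exc:
            failures.append(f"{origin}: {exc}")  # remember why this origin failed
    raise ConnectionError("all origins failed: " + "; ".join(failures))
```

With 1000 GB of monthly traffic and a 90% hit ratio, the origin serves roughly 100 GB under this simple model, which is where the bandwidth savings come from.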
Multi-CDN
- Multi-CDN is an approach where you use multiple CDN providers to improve reliability, performance, and global coverage. Instead of just failover, modern setups dynamically route traffic to the best-performing CDN in real time. There are trade-offs, since providers differ in APIs, caching behavior, and reporting, making it more complex to manage. Still, Multi-CDN is standard for large-scale platforms where availability and performance are critical.
- Only use Multi-CDN if you identify that a single CDN isn’t enough for reliability and availability: outages, geo constraints (one CDN doesn’t have strong presence close to your users), etc. Analyze all aspects before using Multi-CDN since it’s not easy to manage.
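Routing to the best-performing CDN can be pictured as the selection step below. It is a minimal sketch under stated assumptions: the `CdnStats` fields, the use of p95 latency as the only metric, and the provider names in the usage example are hypothetical, and real setups usually collect these measurements continuously and weigh cost and region as well.

```python
from dataclasses import dataclass
from typing import Iterable


@dataclass
class CdnStats:
    name: str              # provider label, hypothetical
    healthy: bool          # result of the latest health check
    p95_latency_ms: float  # recent 95th-percentile latency


def pick_cdn(stats: Iterable[CdnStats]) -> str:
    """Choose the healthy CDN with the lowest recent p95 latency."""
    healthy = [s for s in stats if s.healthy]
    if not healthy:
        raise RuntimeError("no healthy CDN available")
    return min(healthy, key=lambda s: s.p95_latency_ms).name
```

Given `CdnStats("cdn-a", True, 80.0)`, `CdnStats("cdn-b", True, 45.0)`, and `CdnStats("cdn-c", False, 20.0)`, the function picks `cdn-b`: the fastest provider is skipped because it is unhealthy, which covers failover and performance-based routing in one rule.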
References and Suggested Reading
- https://www.cloudflare.com/learning/cdn/glossary/reverse-proxy/
- https://www.cloudflare.com/learning/cdn/what-is-a-cdn/
- https://www.cloudflare.com/learning/cdn/glossary/internet-exchange-point-ixp/
- https://www.cdnhandbook.com/cdn/history/
- https://www.cdnhandbook.com/multicdn/history/
- https://aws.amazon.com/blogs/networking-and-content-delivery/using-multiple-content-delivery-networks-for-video-streaming-part-1/

