From Metrics to Mitigation: Protecting my Application with NGINX Rate Limiting

In my previous article, I integrated Spring Boot Actuator with Prometheus and Grafana to gain visibility into application behavior. The goal was simple: expose metrics, scrape them with Prometheus, and visualize them in Grafana.

What I didn't expect was that the metrics would immediately influence architectural decisions.

After enabling metrics and exploring Prometheus, I started looking at request patterns using PromQL queries.

One metric in particular stood out:

sum by(uri)( http_server_requests_seconds_count )

This showed the total number of requests received by each endpoint.

The results were interesting (looking at them as an admin, not as the person who built this lab and sent the traffic on purpose).

I asked myself (hypothetically): what if the authentication and API endpoints were receiving significantly more traffic than expected? What if someone was intentionally flooding them? (It could be my wife, for not buying her gold last weekend 😂)

A login endpoint could be one of the first targets during a brute-force attack, credential stuffing attempt, or denial-of-service event. (Only a hacker with very low standards would be interested in my app 😂, but in production environments this is a real issue.)

So the problem is that every request, legitimate or not, reaches my app, and the app spends time processing all of them, because the flow is:

Client -> Application -> MariaDB

So I thought of introducing a new member to the family. Tada, it's NGINX (acting as a reverse proxy). The flow is now:

Client -> NGINX -> Application -> MariaDB

This architecture gave me several advantages:

- A single entry point for all my services:
  - Spring app
  - Prometheus
  - Grafana
- Centralized routing: traffic for all of the above reaches the reverse proxy, which forwards it to the correct backend.
- Reverse proxy capabilities.
- Rate limiting (in Gen Z lingo, this is our main character).
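To make the single entry point idea concrete, here is a rough sketch of how such routing could look. It is not my exact config: the hostnames are hypothetical, host-based routing is just one way to do it, and the ports are simply the defaults each tool ships with (8080 for Spring Boot, 9090 for Prometheus, 3000 for Grafana).

```nginx
# Sketch only: hostnames and upstream addresses are assumptions.
# These server blocks belong inside the http context.

server {
    listen 80;
    server_name app.lab.local;              # hypothetical name for the Spring app

    location / {
        proxy_pass http://127.0.0.1:8080;   # assumed Spring Boot address
        proxy_set_header Host $host;
        proxy_set_header X-Real-IP $remote_addr;
        proxy_set_header X-Forwarded-For $proxy_add_x_forwarded_for;
    }
}

server {
    listen 80;
    server_name prometheus.lab.local;       # hypothetical name

    location / {
        proxy_pass http://127.0.0.1:9090;   # assumed Prometheus address
    }
}

server {
    listen 80;
    server_name grafana.lab.local;          # hypothetical name

    location / {
        proxy_pass http://127.0.0.1:3000;   # assumed Grafana address
        proxy_set_header Host $host;
    }
}
```

Routing by hostname keeps each tool at its own root path, so none of them needs to be told it lives under a sub-path.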

This is how I configured NGINX for rate limiting.

In simple words, this config:

- Allows 5 requests per minute.
- Allows a burst of 2 additional requests (because people make mistakes, and a little tolerance only helps).
- Shouts a 429 (Too Many Requests) when the limit is exceeded (like a typical parent).
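A minimal sketch of a config with this behavior could look like the one below. The zone name `auth_limit`, the 10m zone size, keying on the client IP, and the upstream address are assumptions for illustration. I am also assuming `nodelay`: without it, NGINX queues burst requests and releases them at the configured rate instead of serving them right away, which would not match the quick rejection seen in the test.

```nginx
# http context: one shared-memory zone, keyed by client IP (assumed key).
# "auth_limit" and 10m are illustrative values.
limit_req_zone $binary_remote_addr zone=auth_limit:10m rate=5r/m;

server {
    listen 80;

    location /auth/register {
        # burst=2 tolerates two requests above the steady rate;
        # nodelay (assumed) serves them immediately instead of queueing them.
        limit_req zone=auth_limit burst=2 nodelay;

        # NGINX answers rejected requests with 503 by default; switch to 429.
        limit_req_status 429;

        proxy_pass http://127.0.0.1:8080;   # assumed Spring Boot address
    }
}
```

The `rate=5r/m` value is what produces the "one request every 12 seconds" pacing, because NGINX tracks the rate at a much finer granularity than whole minutes.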

Let's test this out. Here I'm using a for loop, a system admin's better half.

This started shouting at the 4th request:

But we configured NGINX to tolerate 5 requests per minute, right? Here's the math:

5 requests per minute = 1 request every 12 seconds.
Here, the first 3 requests probably arrived within about 3 seconds. NGINX allowed the first one at the normal rate and let the next two through using the 2 burst tokens. When the 4th request arrived, there were no burst tokens left and not enough time had passed to earn a new one, so NGINX rejected it. It keeps rejecting until a new token arrives, which happens once every 12 seconds.
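The same reasoning can be written as a small worked calculation. This is a sketch based on my understanding of how the limit_req module keeps an "excess" counter per client. It assumes rate=5r/m, burst=2 with nodelay, and requests spaced about one second apart (the spacing is a guess). The symbols $e_n$, $r$, $b$ and $\Delta t_n$ are my own notation.

```latex
% e_n: excess counter after request n, r: refill rate, b: burst,
% \Delta t_n: seconds since the previous accepted request (assumed to be 1 s)
\begin{aligned}
r &= \tfrac{5}{60\ \text{s}} = \tfrac{1}{12}\ \text{requests/s}, \qquad b = 2 \\
e_n &= \max\bigl(0,\; e_{n-1} - r\,\Delta t_n + 1\bigr), \qquad \text{reject if } e_n > b \\
e_1 &= 0 && \text{(first request, allowed)} \\
e_2 &= 0 - \tfrac{1}{12} + 1 \approx 0.92 \le 2 && \text{(allowed)} \\
e_3 &\approx 0.92 - \tfrac{1}{12} + 1 \approx 1.83 \le 2 && \text{(allowed)} \\
e_4 &\approx 1.83 - \tfrac{1}{12} + 1 = 2.75 > 2 && \text{(rejected with 429)}
\end{aligned}
```

Under the same assumptions, the next request would pass once $e_3 - r\,\Delta t + 1 \le 2$, which is about 10 seconds after the third request. From then on, one more request is admitted roughly every 12 seconds.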

3 cheers to Nginx's Leaky Bucket Algorithm!! And 2 beers for me for breaking this down!!

The most important part is that NGINX allowed the initial requests and began rejecting the excess traffic before it ever reached the application.

With this setup, the biggest learning is not the configuration itself but what enabling metrics lets us improve. Those dashboards have a meaning: in this case they forced an architectural change, even if it was just adding one small component.

It's also important to know which endpoints not to limit. In my case, I only wanted to limit the /auth/register endpoint, since only authenticated requests reach my /students endpoint. Limiting /students would be like shooting myself in the foot. I also didn't want to limit my actuator endpoints, because Prometheus scrapes them every 15s.
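One way to keep the limit scoped like this is to put `limit_req` only inside the location that needs it and leave it out at the server level. The sketch below assumes a zone named `auth_limit` defined elsewhere with `limit_req_zone`, an `/actuator/` path prefix, and an upstream address. All three are illustrative.

```nginx
server {
    listen 80;

    # The only rate-limited location. The zone "auth_limit" is assumed to be
    # defined in the http context with limit_req_zone.
    location /auth/register {
        limit_req zone=auth_limit burst=2 nodelay;
        limit_req_status 429;
        proxy_pass http://127.0.0.1:8080;   # assumed Spring Boot address
    }

    # No limit_req here and none at server level, so nothing is inherited:
    # authenticated traffic to /students passes through unthrottled.
    location /students {
        proxy_pass http://127.0.0.1:8080;
    }

    # Prometheus scrapes this on a fixed interval; leaving it unlimited
    # avoids rejected scrapes and gaps in the metrics.
    location /actuator/ {
        proxy_pass http://127.0.0.1:8080;
    }
}
```

Because `limit_req` is inherited from the enclosing level only when the current level defines none, keeping it out of the `server` block is what guarantees the other two locations stay unlimited.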

You can configure dashboards in Grafana by connecting it to your Prometheus instance to make your life easier.
