DEV Community

Milan Mandal

Building a Rate Limiter in Java & Spring Boot for Microservices

Modern distributed systems and APIs often face a common challenge: handling too many requests from clients.

Without protection, a sudden surge in traffic can overload servers, cause downtime, or degrade performance.

This is where Rate Limiting becomes essential.

In this article, I will explain how I built a lightweight and extensible Rate Limiter using Java and Spring Boot that supports multiple rate-limiting strategies for microservices and APIs.

🔗 Project Repository

Source Code:

https://github.com/milanmandal-1/Rate-Limiter

What is Rate Limiting?

Rate limiting is a technique used to control the number of requests a client can send to a server within a specific time period.

It helps to:

Prevent API abuse

Protect backend services

Improve system stability

Ensure fair resource usage

Many large platforms like Google, Amazon, and Netflix rely heavily on rate limiting to maintain reliability.

Technologies Used

This project uses modern backend technologies:

Java

Spring Boot

Microservices Architecture

API Gateway

Service Registry

Config Server

The system is designed to integrate easily into distributed microservice environments.

🧠 Rate Limiting Algorithms Implemented

This project supports multiple rate limiting strategies.

1️⃣ Token Bucket Algorithm

The Token Bucket algorithm allows a burst of traffic while maintaining an overall rate limit.

How it works:

Tokens are added to a bucket at a fixed rate, up to the bucket's maximum capacity

Each request consumes a token

If the bucket is empty, the request is rejected

Benefits:

✔ Allows traffic bursts
✔ Smooth request flow
✔ Widely used in APIs
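
The steps above can be sketched in plain Java. This is a minimal illustration, not the code from the repository: the class name `TokenBucket`, its constructor parameters, and the injectable `LongSupplier` clock are all assumptions made for the example. The bucket starts full so that an initial burst is allowed.

```java
import java.util.function.LongSupplier;

/** Illustrative token bucket; names and parameters are hypothetical, not the repository's API. */
final class TokenBucket {
    private final long capacity;         // maximum tokens the bucket can hold (burst size)
    private final double refillPerNano;  // tokens added per nanosecond
    private final LongSupplier clock;    // nanosecond time source, injectable for tests
    private double tokens;               // tokens currently available
    private long lastRefill;             // clock reading at the last refill

    TokenBucket(long capacity, double refillPerSecond, LongSupplier clock) {
        this.capacity = capacity;
        this.refillPerNano = refillPerSecond / 1_000_000_000.0;
        this.clock = clock;
        this.tokens = capacity;          // start full so an initial burst is allowed
        this.lastRefill = clock.getAsLong();
    }

    /** Consumes one token and returns true if one is available; otherwise returns false. */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        // Add the tokens earned since the last call, never exceeding capacity.
        tokens = Math.min(capacity, tokens + (now - lastRefill) * refillPerNano);
        lastRefill = now;
        if (tokens >= 1.0) {
            tokens -= 1.0;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // Burst of 5, refilled at 1 token per second.
        TokenBucket bucket = new TokenBucket(5, 1.0, System::nanoTime);
        for (int i = 1; i <= 7; i++) {
            System.out.println("Request " + i + " allowed: " + bucket.tryAcquire());
        }
    }
}
```

Refilling lazily inside `tryAcquire()` avoids a background thread: the number of tokens is recomputed from the elapsed time whenever a request arrives.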

2️⃣ Fixed Window Algorithm

The Fixed Window strategy counts requests within a fixed time window.

Example:

Limit = 100 requests

Time window = 1 minute

If a client sends more than 100 requests within that minute, further requests are rejected until the next window begins and the counter resets.

Advantages:

✔ Simple to implement
✔ Efficient for small systems
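
A fixed window counter fits in a few lines of Java. The sketch below is illustrative only: `FixedWindowLimiter`, its parameters, and the injectable millisecond clock are hypothetical names chosen for this example, not the repository's implementation.

```java
import java.util.function.LongSupplier;

/** Illustrative fixed window counter; names and parameters are hypothetical. */
final class FixedWindowLimiter {
    private final int limit;           // maximum requests per window
    private final long windowMillis;   // window length in milliseconds
    private final LongSupplier clock;  // millisecond time source, injectable for tests
    private long windowStart;          // start of the current window
    private int count;                 // requests counted in the current window

    FixedWindowLimiter(int limit, long windowMillis, LongSupplier clock) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.clock = clock;
        this.windowStart = clock.getAsLong();
    }

    /** Returns true if the request fits in the current window; otherwise returns false. */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        long elapsed = now - windowStart;
        if (elapsed >= windowMillis) {
            // Jump to the window that contains "now" and reset the counter.
            windowStart += (elapsed / windowMillis) * windowMillis;
            count = 0;
        }
        if (count < limit) {
            count++;
            return true;
        }
        return false;
    }

    public static void main(String[] args) {
        // 100 requests per minute, as in the example above.
        FixedWindowLimiter limiter = new FixedWindowLimiter(100, 60_000, System::currentTimeMillis);
        int allowed = 0;
        for (int i = 0; i < 101; i++) {
            if (limiter.tryAcquire()) {
                allowed++;
            }
        }
        System.out.println("Allowed " + allowed + " of 101 requests");
    }
}
```

Because the counter resets all at once, a client can send up to the full limit just before a window ends and again just after the next one starts.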

3️⃣ Sliding Window Algorithm

The Sliding Window algorithm improves accuracy compared to fixed windows.

Instead of resetting a counter abruptly at each window boundary, it counts the requests made during the window that ends at the current moment, so the window moves continuously with time.

Benefits:

✔ More precise rate limiting
✔ Prevents traffic spikes at window boundaries
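
One common way to implement a moving window is the "sliding window log" variant, which stores the timestamp of every accepted request. The source does not say which variant the project uses, so treat the sketch below as an assumption. `SlidingWindowLimiter` and its parameters are hypothetical names.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.function.LongSupplier;

/** Illustrative sliding window log; names and parameters are hypothetical. */
final class SlidingWindowLimiter {
    private final int limit;           // maximum requests in any window
    private final long windowMillis;   // window length in milliseconds
    private final LongSupplier clock;  // millisecond time source, injectable for tests
    private final Deque<Long> timestamps = new ArrayDeque<>(); // accepted requests, oldest first

    SlidingWindowLimiter(int limit, long windowMillis, LongSupplier clock) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    /** Returns true if fewer than "limit" requests were accepted in the last window. */
    synchronized boolean tryAcquire() {
        long now = clock.getAsLong();
        // Evict timestamps that are no longer inside (now - windowMillis, now].
        while (!timestamps.isEmpty() && timestamps.peekFirst() <= now - windowMillis) {
            timestamps.pollFirst();
        }
        if (timestamps.size() < limit) {
            timestamps.addLast(now);
            return true;
        }
        return false; // rejected requests are not recorded
    }

    public static void main(String[] args) {
        SlidingWindowLimiter limiter = new SlidingWindowLimiter(3, 1_000, System::currentTimeMillis);
        for (int i = 1; i <= 5; i++) {
            System.out.println("Request " + i + " allowed: " + limiter.tryAcquire());
        }
    }
}
```

The log is exact but uses memory proportional to the limit for each client. A weighted counter that blends the current and previous windows trades some accuracy for constant memory.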

🏗 Microservices Architecture

This project demonstrates rate limiting inside a Spring Boot microservices ecosystem.

Main components include:

API Gateway

Service Registry

Config Server

Hotel Service

Rating Service

User Service

The rate limiter can be integrated at the API Gateway level, ensuring all incoming requests are validated before reaching downstream services.
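
To make the gateway idea concrete, here is a framework-free sketch of the decision a gateway filter would make for each request. It is not the project's actual filter and does not use Spring APIs. The class `GatewayRateLimitCheck`, the use of a client ID (for example an API key or IP address) as the key, and the choice of HTTP 429 for rejected requests are all assumptions made for illustration.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.LongSupplier;

/** Illustrative per-client check a gateway filter could call; all names are hypothetical. */
final class GatewayRateLimitCheck {
    static final int OK = 200;                 // forward the request downstream
    static final int TOO_MANY_REQUESTS = 429;  // assumed status for rejected requests

    /** Fixed window state for one client. */
    private static final class Window {
        final long start;
        int count;
        Window(long start) { this.start = start; }
    }

    private final int limit;
    private final long windowMillis;
    private final LongSupplier clock; // millisecond time source, injectable for tests
    private final Map<String, Window> windows = new ConcurrentHashMap<>();

    GatewayRateLimitCheck(int limit, long windowMillis, LongSupplier clock) {
        this.limit = limit;
        this.windowMillis = windowMillis;
        this.clock = clock;
    }

    /** Returns the HTTP status the gateway would answer with for this client. */
    int check(String clientId) {
        long now = clock.getAsLong();
        boolean[] allowed = {false};
        // compute() runs atomically per key, so each client's counter is updated safely.
        windows.compute(clientId, (id, w) -> {
            if (w == null || now - w.start >= windowMillis) {
                w = new Window(now); // first request, or the previous window has expired
            }
            if (w.count < limit) {
                w.count++;
                allowed[0] = true;
            }
            return w;
        });
        return allowed[0] ? OK : TOO_MANY_REQUESTS;
    }

    public static void main(String[] args) {
        GatewayRateLimitCheck gateway = new GatewayRateLimitCheck(2, 60_000, System::currentTimeMillis);
        for (int i = 1; i <= 3; i++) {
            System.out.println("client-a request " + i + " -> " + gateway.check("client-a"));
        }
        System.out.println("client-b request 1 -> " + gateway.check("client-b"));
    }
}
```

Keeping one counter per client means a noisy client only exhausts its own quota. This in-memory map works for a single gateway instance; several instances would need shared storage for the counters.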

📂 Project Structure

Rate-Limiter
├── ApiGateway
├── ConfigServer
├── ServiceRegistry
├── HotelService
├── RatingService
├── UserService
├── README.md
└── LICENSE

This structure demonstrates a production-style microservices setup.

Example Rate Limit Scenario

Example configuration:

Limit: 10 requests
Time Window: 1 minute

If a client sends:

Request 1 → Allowed
Request 2 → Allowed
...
Request 10 → Allowed
Request 11 → Blocked

Blocking the excess requests caps the load each client can place on backend services, which helps them stay stable under heavy traffic.

Why Rate Limiting is Critical for APIs

Rate limiting protects APIs from:

🚫 DDoS attacks
🚫 API abuse
🚫 Resource exhaustion

It also helps maintain fair usage among multiple clients.

Real-World Use Cases

Rate limiters are widely used in:

Public APIs

Payment systems

Authentication services

Cloud platforms

SaaS platforms

Almost every major API provider uses some form of rate limiting.

Future Improvements

Some potential enhancements for this project include:

Redis-based distributed rate limiting

Kubernetes deployment

Dynamic configuration updates

Advanced monitoring with Grafana

Integration with CI/CD pipelines
