When building APIs, chat services, or real-time systems, one of the biggest challenges is preventing clients from overwhelming your server with too many requests. Without protection, a flood of traffic can slow down performance or even crash the system.
This is where rate limiting comes in. Among the many techniques available, the Token Bucket Algorithm is widely used because it is simple, efficient, and allows for bursts of traffic without losing overall control.
What is Rate Limiting?
Rate limiting is the process of controlling how many requests a client can send to a server in a given period of time.
Example:
- A client may be allowed 10 requests per second.
- If they exceed that, their additional requests are rejected until the next second begins.
Rate limiting ensures:
- Fair resource usage across users
- Protection against abuse, brute-force attacks, or spam
- Improved server stability and reliability
The Token Bucket Algorithm
The Token Bucket Algorithm works like this:
- Each client is assigned a bucket.
- The bucket has a capacity (for example, 10 tokens).
- Tokens are refilled at a fixed rate (for example, 1 token per second).
- Each request consumes one token.
- If the bucket is empty, the request is rejected.
This approach allows short bursts of requests when tokens are available, while still enforcing a long-term average request rate.
Token Bucket Flow
The steps are as follows:
1. Check if the client is new. If yes, create a bucket for them with full capacity.
2. Refill tokens based on the time elapsed since the last refill.
3. Check if tokens are available:
- If yes, consume one token and accept the request.
- If no, reject the request.
This balance ensures that clients can make quick bursts of requests but cannot exceed the average allowed rate.
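The flow above can be sketched in plain JavaScript before wiring it to a server. Passing the current time in explicitly is an assumption of this sketch (it keeps the example deterministic); the server code later uses Date.now() directly:

```javascript
// Minimal token-bucket sketch of the flow above.
const CAPACITY = 10; // maximum tokens
const RATE = 1;      // tokens refilled per second

function createBucket(now) {
  return { tokens: CAPACITY, lastRefillTime: now };
}

function tryConsume(bucket, now) {
  // Step 2: refill based on elapsed time, capped at capacity
  const elapsed = (now - bucket.lastRefillTime) / 1000;
  const refill = Math.floor(elapsed * RATE);
  if (refill > 0) {
    bucket.tokens = Math.min(CAPACITY, bucket.tokens + refill);
    bucket.lastRefillTime = now;
  }
  // Step 3: consume one token if available
  if (bucket.tokens > 0) {
    bucket.tokens -= 1;
    return true;  // request accepted
  }
  return false;   // request rejected
}

// A burst of 12 requests at t=0: the first 10 pass, the last 2 are rejected
const bucket = createBucket(0);
const results = [];
for (let i = 0; i < 12; i++) results.push(tryConsume(bucket, 0));
console.log(results.filter(Boolean).length); // 10

// Two seconds later, two tokens have been refilled, so a request succeeds
console.log(tryConsume(bucket, 2000)); // true
```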
Implementing Token Bucket in Node.js
Let’s build a simple HTTP server with Token Bucket Rate Limiting.
Step 1: Setup
const http = require('http');
// Configuration
const bucketCapacity = 10; // maximum tokens per user
const refillRate = 1; // tokens per second
const ipBuckets = new Map(); // store buckets for each IP
We define the bucket size, refill rate, and a map to store each user’s token bucket.
Step 2: Refill Function
function refillTokens(bucket) {
  const now = Date.now();
  const elapsed = (now - bucket.lastRefillTime) / 1000; // seconds
  const refill = Math.floor(elapsed * refillRate);
  if (refill > 0) {
    bucket.tokens = Math.min(bucketCapacity, bucket.tokens + refill);
    bucket.lastRefillTime = now;
  }
}
This function calculates how many tokens should be added based on the elapsed time since the last refill, and updates the bucket without exceeding its capacity.
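To see the refill logic in action, you can backdate lastRefillTime to simulate elapsed time. The configuration constants are repeated here so the snippet runs on its own:

```javascript
// Self-contained check of the refill logic.
const bucketCapacity = 10;
const refillRate = 1; // tokens per second

function refillTokens(bucket) {
  const now = Date.now();
  const elapsed = (now - bucket.lastRefillTime) / 1000; // seconds
  const refill = Math.floor(elapsed * refillRate);
  if (refill > 0) {
    bucket.tokens = Math.min(bucketCapacity, bucket.tokens + refill);
    bucket.lastRefillTime = now;
  }
}

// A bucket last refilled 3 seconds ago with 2 tokens left gains 3 tokens
const bucket = { tokens: 2, lastRefillTime: Date.now() - 3000 };
refillTokens(bucket);
console.log(bucket.tokens); // 5

// A long-idle bucket never exceeds capacity
const idle = { tokens: 0, lastRefillTime: Date.now() - 60000 };
refillTokens(idle);
console.log(idle.tokens); // 10
```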
Step 3: Rate Limiting Middleware
function rateLimitMiddleware(req, res) {
  const ip = req.socket.remoteAddress;
  // If user is new, create a bucket
  if (!ipBuckets.has(ip)) {
    ipBuckets.set(ip, { tokens: bucketCapacity, lastRefillTime: Date.now() });
  }
  const bucket = ipBuckets.get(ip);
  refillTokens(bucket);
  if (bucket.tokens > 0) {
    bucket.tokens -= 1; // consume one token
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('Request accepted\n');
  } else {
    res.writeHead(429, { 'Content-Type': 'text/plain' });
    res.end('Too Many Requests\n');
  }
}
This function looks up (or creates) the bucket for each IP, refills it, consumes a token when one is available, and rejects the request with HTTP 429 when the bucket is empty.
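The per-IP bookkeeping can be exercised without the HTTP layer. This sketch replays the same lookup-and-consume sequence for two hypothetical addresses (refill is omitted here, on the assumption that all requests arrive within the same second):

```javascript
// Sketch of the per-IP bucket bookkeeping, without the HTTP layer.
const bucketCapacity = 10;
const ipBuckets = new Map();

function handleRequest(ip) {
  // If user is new, create a bucket with full capacity
  if (!ipBuckets.has(ip)) {
    ipBuckets.set(ip, { tokens: bucketCapacity, lastRefillTime: Date.now() });
  }
  const bucket = ipBuckets.get(ip);
  if (bucket.tokens > 0) {
    bucket.tokens -= 1;
    return 200; // accepted
  }
  return 429;   // Too Many Requests
}

// One client exhausting its bucket does not affect another
for (let i = 0; i < 10; i++) handleRequest('10.0.0.1');
console.log(handleRequest('10.0.0.1')); // 429: bucket empty
console.log(handleRequest('10.0.0.2')); // 200: separate bucket
```

Because each IP gets its own bucket, a noisy client only throttles itself.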
Step 4: Start the Server
const server = http.createServer(rateLimitMiddleware);
server.listen(3000, () => {
  console.log('Server running at http://localhost:3000/');
});
This starts the server on port 3000, applying the rate limiting logic to every request.
Full Code
const http = require('http');

// Configuration
const bucketCapacity = 10; // maximum tokens per user
const refillRate = 1; // tokens per second
const ipBuckets = new Map(); // store buckets for each IP

// Refill function
function refillTokens(bucket) {
  const now = Date.now();
  const elapsed = (now - bucket.lastRefillTime) / 1000; // seconds
  const refill = Math.floor(elapsed * refillRate);
  if (refill > 0) {
    bucket.tokens = Math.min(bucketCapacity, bucket.tokens + refill);
    bucket.lastRefillTime = now;
  }
}

// Middleware
function rateLimitMiddleware(req, res) {
  const ip = req.socket.remoteAddress;
  // If user is new, create a bucket
  if (!ipBuckets.has(ip)) {
    ipBuckets.set(ip, { tokens: bucketCapacity, lastRefillTime: Date.now() });
  }
  const bucket = ipBuckets.get(ip);
  refillTokens(bucket);
  if (bucket.tokens > 0) {
    bucket.tokens -= 1; // consume one token
    res.writeHead(200, { 'Content-Type': 'text/plain' });
    res.end('Request accepted\n');
  } else {
    res.writeHead(429, { 'Content-Type': 'text/plain' });
    res.end('Too Many Requests\n');
  }
}

// Start server
const server = http.createServer(rateLimitMiddleware);
server.listen(3000, () => {
  console.log('Server running at http://localhost:3000/');
});
Use Cases
- API Gateways: prevent abuse by limiting requests per client
- Chat Applications: stop spamming by controlling message frequency
- Authentication Systems: slow down brute-force login attempts
- IoT Devices: manage bursts of data from sensors and devices
Benefits of Token Bucket
- Allows bursts of requests up to the bucket capacity
- Maintains a steady long-term request rate
- Simple and efficient implementation
- Predictable refill behavior
Conclusion
The Token Bucket Algorithm is a practical and effective way to implement rate limiting. It combines flexibility and control by allowing temporary bursts while maintaining a predictable average request rate.
If you are building APIs, chat systems, or real-time applications, Token Bucket rate limiting can help you protect your server, ensure fairness, and improve system reliability.