Colin McDermott
API Rate Limiting Cheat Sheet

Jump to a section:

  • Gateway-level rate limiting
  • Token bucket algorithm
  • Leaky bucket algorithm
  • Sliding window algorithm
  • Distributed rate limiting
  • User-based rate limiting
  • API key rate limiting
  • Custom rate limiting

Gateway-level rate limiting

  • Gateway-level rate limiting enforces request limits at the API gateway, in front of your backend services, so each individual service doesn't need its own limiting logic.
  • It is typically configured in API gateways such as Kong, Google's Apigee, or Amazon API Gateway (a sketch using Amazon API Gateway usage plans follows below).
  • Gateway-level rate limiting is simple and effective to set up, but may not offer as much fine-grained control as the application-level approaches covered below.
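
For example, Amazon API Gateway exposes throttling through usage plans. A rough sketch with the AWS CLI (the name, IDs, and limits here are placeholder values, not from the original post):

aws apigateway create-usage-plan \
  --name "standard-tier" \
  --throttle burstLimit=20,rateLimit=10 \
  --quota limit=10000,period=DAY \
  --api-stages apiId=abc123,stage=prod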

Token bucket algorithm


  • The token bucket algorithm is a popular rate limiting algorithm in which each client is allocated a bucket of tokens.
  • Tokens are refilled at a fixed rate, and every API request must consume a token.
  • If no tokens are available, the request is rejected.
  • The token bucket algorithm is used by many rate limiting libraries and tools, such as rate-limiter, redis-rate-limiter, and Google Cloud Endpoints. A minimal sketch follows the link below.

More: Token Bucket vs Bursty Rate Limiter by @animir
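
As a rough, library-agnostic illustration of the idea, here is a minimal in-memory token bucket in JavaScript (the capacity and refill rate below are arbitrary example values):

// Minimal in-memory token bucket: `capacity` tokens, refilled at `refillRate` tokens per second
class TokenBucket {
  constructor(capacity, refillRate) {
    this.capacity = capacity;
    this.refillRate = refillRate;
    this.tokens = capacity;
    this.lastRefill = Date.now();
  }

  tryConsume(cost = 1) {
    // Top up the bucket based on the time elapsed since the last check
    const now = Date.now();
    const elapsedSeconds = (now - this.lastRefill) / 1000;
    this.tokens = Math.min(this.capacity, this.tokens + elapsedSeconds * this.refillRate);
    this.lastRefill = now;

    if (this.tokens >= cost) {
      this.tokens -= cost; // consume a token and allow the request
      return true;
    }
    return false; // bucket is empty: reject the request
  }
}

// Allow bursts of up to 10 requests, refilling 5 tokens per second
const bucket = new TokenBucket(10, 5);
if (!bucket.tryConsume()) {
  // respond with HTTP 429 Too Many Requests
}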

Leaky bucket algorithm


  • The leaky bucket algorithm is similar to the token bucket algorithm, but instead of consuming tokens, incoming API requests are added to a "bucket" (a queue) that drains at a set rate.
  • If the bucket overflows, new requests are rejected.
  • The leaky bucket algorithm is useful for smoothing out request bursts and ensuring that requests are processed at a consistent rate, as in the sketch below.
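
Here is a minimal in-memory sketch of the idea (a production implementation would back the queue with Redis or a message queue rather than process memory):

// Minimal in-memory leaky bucket: requests queue up and "leak" out at a fixed rate
class LeakyBucket {
  constructor(capacity, leakRatePerSecond, processRequest) {
    this.capacity = capacity;
    this.queue = [];
    // Drain the bucket at a constant rate, smoothing out bursts
    setInterval(() => {
      const request = this.queue.shift();
      if (request) processRequest(request);
    }, 1000 / leakRatePerSecond);
  }

  add(request) {
    if (this.queue.length >= this.capacity) {
      return false; // bucket overflow: reject the request
    }
    this.queue.push(request);
    return true;
  }
}

// Hold at most 20 queued requests, processing 5 per second
const bucket = new LeakyBucket(20, 5, (request) => console.log('processing', request));
if (!bucket.add({ id: 1 })) {
  // respond with HTTP 429 Too Many Requests
}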

Sliding window algorithm


  • The sliding window algorithm tracks the number of requests made within a window of time that slides forward as new requests arrive.
  • If the number of requests in the current window exceeds a set limit, further requests are rejected.
  • The sliding window algorithm is used by many rate limiting libraries and tools, such as Django Ratelimit, Express Rate Limit, and Kubernetes rate limiting. A minimal sketch follows the link below.

More: Rate limiting using the Sliding Window algorithm by @satrobit
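
As an illustration, here is a minimal in-memory sliding window log (real implementations often use Redis sorted sets, or a weighted two-window counter, to keep memory bounded):

// Minimal sliding window log: store a timestamp per request and count those still inside the window
class SlidingWindowLimiter {
  constructor(limit, windowMs) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.timestamps = new Map(); // key (e.g. an IP address) -> array of request times
  }

  allow(key) {
    const now = Date.now();
    const windowStart = now - this.windowMs;

    // Drop timestamps that have slid out of the window
    const recent = (this.timestamps.get(key) || []).filter((t) => t > windowStart);

    if (recent.length >= this.limit) {
      this.timestamps.set(key, recent);
      return false; // over the limit: reject
    }

    recent.push(now);
    this.timestamps.set(key, recent);
    return true;
  }
}

// Allow 100 requests per client per minute
const limiter = new SlidingWindowLimiter(100, 60 * 1000);
if (!limiter.allow('203.0.113.7')) {
  // respond with HTTP 429 Too Many Requests
}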

Distributed rate limiting

  • For high-traffic APIs, it may be necessary to implement rate limiting across multiple servers.
  • Distributed rate limiting algorithms such as Redis-based rate limiting or Consistent Hashing-based rate limiting can be used to implement rate limiting across multiple servers.
  • Distributed rate limiting can help to ensure that rate limiting is consistent across multiple servers, and can help to reduce the impact of traffic spikes.

As a worked example of Redis-based distributed rate limiting, we'll create a simple Next.js application with a rate-limited API endpoint using Redis and Upstash. Upstash is a serverless Redis provider that is easy and cost-effective to use, and it keeps the rate limit counters in one shared place no matter which server or serverless instance handles a given request.

First, let's create a new Next.js project:

npx create-next-app redis-rate-limit-example
cd redis-rate-limit-example

Install the required dependencies:

npm install ioredis@4.27.6 express-rate-limit@5.3.0

Create a .env.local file in the project root to store your Upstash Redis credentials:

UPSTASH_REDIS_URL=your_upstash_redis_url_here

Replace your_upstash_redis_url_here with your actual Upstash Redis connection string (the rediss:// URL from the Upstash console; ioredis talks to Upstash over the standard Redis protocol).

Create a new API route in pages/api/limited.js:

import rateLimit from 'express-rate-limit';
import { connectRedis } from '../../lib/redis';
import { RedisStore } from '../../lib/redis-store';

const WINDOW_MS = 60 * 1000; // 1 minute

const redisClient = connectRedis();

const rateLimiter = rateLimit({
  store: new RedisStore({
    client: redisClient,
    windowMs: WINDOW_MS,
  }),
  windowMs: WINDOW_MS,
  max: 5, // limit each client to 5 requests per minute
  // Next.js API routes don't populate req.ip, so derive the client IP ourselves
  keyGenerator: (req) =>
    (req.headers['x-forwarded-for'] || '').split(',')[0].trim() ||
    req.socket.remoteAddress,
  handler: (req, res, next) => {
    res.status(429).json({ message: 'Too many requests, please try again later.' });
    next(new Error('Rate limit exceeded'));
  },
});

// Adapt the Express-style middleware to a promise we can await in the API route
function runMiddleware(req, res, fn) {
  return new Promise((resolve, reject) => {
    fn(req, res, (result) =>
      result instanceof Error ? reject(result) : resolve(result)
    );
  });
}

export default async function handler(req, res) {
  try {
    await runMiddleware(req, res, rateLimiter);
  } catch (error) {
    // The rate limit handler has already sent the 429 response
    if (res.headersSent) return;
    return res.status(500).json({ message: 'Internal server error' });
  }

  res.status(200).json({ message: 'Success! Your request was not rate-limited.' });
}

export const config = {
  api: {
    bodyParser: false,
  },
};

Create a lib/redis.js file to handle Redis connections:

import Redis from 'ioredis';

// Cache the connection at module level so hot reloads and repeated
// invocations reuse a single Redis connection instead of opening new ones.
let cachedRedis = null;

export function connectRedis() {
  if (cachedRedis) {
    return cachedRedis;
  }

  const redis = new Redis(process.env.UPSTASH_REDIS_URL);
  cachedRedis = redis;
  return redis;
}

Create a new RedisStore class in lib/redis-store.js. It implements the store interface express-rate-limit expects from a custom store: incr, decrement, and resetKey:

import { connectRedis } from './redis';

// A minimal Redis-backed store implementing the interface express-rate-limit
// expects: incr(key, callback), decrement(key), and resetKey(key).
export class RedisStore {
  constructor({ client, windowMs = 60 * 1000 } = {}) {
    this.redis = client || connectRedis();
    this.windowMs = windowMs;
  }

  incr(key, callback) {
    this.redis
      .multi()
      .set(key, 0, 'PX', this.windowMs, 'NX') // start a new window if none exists
      .incr(key)                              // count this request
      .pttl(key)                              // time left in the current window
      .exec()
      .then((results) => {
        const hits = results[1][1];
        const ttl = results[2][1];
        callback(null, hits, new Date(Date.now() + ttl));
      })
      .catch((err) => callback(err));
  }

  decrement(key) {
    this.redis.decr(key);
  }

  resetKey(key) {
    this.redis.del(key);
  }
}

Now you can test your rate-limited API endpoint by starting the development server:

npm run dev

Visit http://localhost:3000/api/limited in your browser, or use a tool like Postman or curl to make requests. You should see the "Success! Your request was not rate-limited." message. If you make more than 5 requests within a minute, you'll receive the rate limit response instead:

Too many requests, please try again later.
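
For example, you can trigger the limit from the command line (assuming the dev server is still running on port 3000):

for i in $(seq 1 6); do curl -s http://localhost:3000/api/limited; echo; done

The first five requests return the success message, and the sixth returns the 429 response.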

User-based rate limiting

  • Some APIs may require rate limiting at the user level, rather than the IP address or client ID level.
  • User-based rate limiting involves tracking the number of requests made by a particular user account, and limiting requests if the user exceeds a set limit.
  • User-based rate limiting is built into many API frameworks, such as Django REST Framework, and can be layered on top of session-based or token-based authentication. A sketch using express-rate-limit follows below.
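
With express-rate-limit, for example, this usually comes down to replacing the default IP-based key with the authenticated user's ID via keyGenerator. In this sketch, req.user is assumed to be populated by your own auth middleware:

import rateLimit from 'express-rate-limit';

// Rate limit per authenticated user rather than per IP address
const userRateLimiter = rateLimit({
  windowMs: 60 * 1000, // 1 minute
  max: 100, // 100 requests per user per minute
  keyGenerator: (req) => (req.user ? `user:${req.user.id}` : req.ip),
  handler: (req, res) => {
    res.status(429).json({ message: 'Too many requests for this account.' });
  },
});

// Usage with Express, after your auth middleware has set req.user:
// app.use('/api/', userRateLimiter);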

API key rate limiting

  • For APIs that require authentication with an API key, rate limiting can be implemented at the API key level.
  • API key rate limiting involves tracking the number of requests made with a particular API key, and limiting requests if the key exceeds a set limit.
  • API key rate limiting is supported by many rate limiting libraries, such as Flask-Limiter, and fits naturally wherever requests are already authenticated with an API key. A sketch using express-rate-limit follows below.
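
The same keyGenerator approach works for API keys. This sketch assumes clients send their key in an X-API-Key header (adjust to however your API actually transmits keys):

import rateLimit from 'express-rate-limit';

// Rate limit per API key, falling back to the client IP when no key is sent
const apiKeyRateLimiter = rateLimit({
  windowMs: 60 * 60 * 1000, // 1 hour
  max: 1000, // 1,000 requests per key per hour
  keyGenerator: (req) => req.headers['x-api-key'] || req.ip,
  handler: (req, res) => {
    res.status(429).json({ message: 'API key rate limit exceeded.' });
  },
});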

Custom rate limiting

  • Finally, it's worth noting that there are many other rate limiting approaches that can be customized to suit the needs of a particular API.
  • Some examples include adaptive rate limiting, which adjusts the rate limit based on the current traffic load, and request complexity-based rate limiting, which takes into account the complexity of individual requests when enforcing rate limits.
  • Custom rate limiting approaches can be useful for optimizing the rate limiting strategy for a specific API use case; a small cost-based sketch follows below.
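
As one illustration, request complexity-based limiting can be implemented by giving each client a per-window cost budget and charging more for expensive endpoints. The costs and budget below are made-up example values:

// Weight requests by cost so expensive endpoints use up the budget faster (costs are illustrative)
const requestCosts = {
  '/api/search': 5,  // heavier: full-text search
  '/api/report': 10, // heaviest: aggregation
};

const WINDOW_MS = 60 * 1000; // 1 minute window
const BUDGET = 60;           // cost units allowed per client per window
const budgets = new Map();   // client key -> { used, windowStart }

function allow(key, cost) {
  const now = Date.now();
  const entry = budgets.get(key) || { used: 0, windowStart: now };
  if (now - entry.windowStart >= WINDOW_MS) {
    // Start a fresh window
    entry.used = 0;
    entry.windowStart = now;
  }
  if (entry.used + cost > BUDGET) return false;
  entry.used += cost;
  budgets.set(key, entry);
  return true;
}

// Express-style middleware: cheap endpoints cost 1 unit, heavy ones cost more
function complexityLimiter(req, res, next) {
  const cost = requestCosts[req.url] || 1;
  if (!allow(req.ip, cost)) {
    return res.status(429).json({ message: 'Request budget exceeded, please slow down.' });
  }
  next();
}

// Usage with Express: app.use(complexityLimiter);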

For my latest project, Pub Index API, I'm using an API gateway for rate limiting.

More: RESTful API Design Cheatsheet

Top comments (3)

Flimtix

Super interesting and helpful post! 👍
Could you make a continuation where you show the pros and cons?

Colin McDermott

Honestly I think the best advice is simply: use an API gateway and configure it as needed. All the stuff about leaky buckets is good theory but really you are probably just going to set a limit eg x requests per y period.

vineetjadav

Thanks for extraordinary insights on API, for more information do check out Cloud Computing And DevOps Courses.

