Frontend rate limiting can save you $10,000

#javascript #webdev #programming #coding

The Fastest Requests Are the Ones You Never Make

While backend rate limiting is absolutely crucial, it comes at a cost.

For backend rate limiting, we usually store a user's IP and request timestamps in memory or a Redis database. If your app has millions of users, you end up storing a lot of data in memory or a database — and that can get expensive fast.
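
To make that storage cost concrete, here is a minimal sketch of what an in-memory backend limiter might look like. The names `requestsByIp`, `isAllowed`, `BACKEND_MAX_REQUESTS`, and `BACKEND_TIME_WINDOW` are hypothetical and only for illustration; a production setup would more likely keep this state in Redis.

```typescript
// Hypothetical in-memory backend limiter: one timestamp array per client IP.
const BACKEND_MAX_REQUESTS = 50;        // assumed limit, mirrors the frontend examples
const BACKEND_TIME_WINDOW = 60 * 1000;  // assumed window in milliseconds (1 minute)

// Every active IP gets its own entry, so memory grows with the number of clients.
const requestsByIp = new Map<string, number[]>();

function isAllowed(ip: string, now: number = Date.now()): boolean {
    // Keep only the timestamps that are still inside the window
    const recent = (requestsByIp.get(ip) ?? []).filter(
        timestamp => now - timestamp < BACKEND_TIME_WINDOW
    );

    if (recent.length >= BACKEND_MAX_REQUESTS) {
        requestsByIp.set(ip, recent);
        return false;
    }

    recent.push(now);
    requestsByIp.set(ip, recent);
    return true;
}
```

Each client can hold up to `BACKEND_MAX_REQUESTS` timestamps, and every request costs at least one read and one write against this store.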

But what if we block requests before they ever leave the browser? By handling rate limiting on the client side, we can avoid a significant number of unnecessary requests and read/write operations on the backend. Depending on how many users your app has, this can translate to anywhere from hundreds to thousands of dollars saved per year.

1. Basic frontend rate limiter example

// utils/rate-limiter.ts

const MAX_REQUESTS = 50;           // Maximum number of requests allowed
const TIME_WINDOW = 60 * 1000;    // Time window in milliseconds (1 minute)

// Store timestamps of recent requests
let requestTimestamps: number[] = [];

// Check if a new request is within the allowed limit
function canMakeRequest(): boolean {
    const currentTime = Date.now();

    // Remove timestamps that have fallen outside the time window
    requestTimestamps = requestTimestamps.filter(
        timestamp => currentTime - timestamp < TIME_WINDOW
    );

    return requestTimestamps.length < MAX_REQUESTS;
}

// Drop-in fetch wrapper with rate limiting
function $fetch(url: string, options?: RequestInit): Promise<Response> {
    if (!canMakeRequest()) {
        return Promise.reject(new Error('Rate limit exceeded. Please try again later.'));
    }

    requestTimestamps.push(Date.now());
    return fetch(url, options);
}

export default $fetch;

2. Better rate limiter with a cooldown period and localStorage persistence, so users can't reset the limit just by refreshing the page

// utils/rate-limiter.ts

const MAX_REQUESTS = 50;          // Maximum number of requests allowed per time window
const TIME_WINDOW = 60 * 1000;    // Time window in milliseconds (1 minute)
const COOLDOWN_TIME = 30 * 1000;  // Time to wait (ms) after hitting the limit before requests are allowed again

// localStorage keys for persisting state across page refreshes
const STORAGE_KEY_TIMESTAMPS = 'rate_limiter_timestamps';
const STORAGE_KEY_COOLDOWN = 'rate_limiter_cooldown_until';

// --- localStorage helpers ---

// Load request timestamps from localStorage, falling back to an empty array
function loadTimestamps(): number[] {
    try {
        const stored = localStorage.getItem(STORAGE_KEY_TIMESTAMPS);
        return stored ? JSON.parse(stored) : [];
    } catch {
        return [];
    }
}

// Persist the current timestamps array to localStorage
function saveTimestamps(timestamps: number[]): void {
    try {
        localStorage.setItem(STORAGE_KEY_TIMESTAMPS, JSON.stringify(timestamps));
    } catch {
        // Silently fail if localStorage is unavailable (e.g. private browsing restrictions)
    }
}

// Load the cooldown expiry timestamp (ms since epoch), or 0 if none is set
function loadCooldownUntil(): number {
    try {
        return Number(localStorage.getItem(STORAGE_KEY_COOLDOWN) ?? 0);
    } catch {
        return 0;
    }
}

// Persist the cooldown expiry timestamp to localStorage
function saveCooldownUntil(until: number): void {
    try {
        localStorage.setItem(STORAGE_KEY_COOLDOWN, String(until));
    } catch {
        // Silently fail if localStorage is unavailable
    }
}

// --- Core rate-limit logic ---

// Function to check if a request can be made
function canMakeRequest(): boolean {
    const currentTime = Date.now();

    // If an active cooldown exists, block all requests until it expires
    const cooldownUntil = loadCooldownUntil();
    if (cooldownUntil && currentTime < cooldownUntil) {
        return false;
    }

    // Load persisted timestamps so a page refresh doesn't reset the counter
    let requestTimestamps = loadTimestamps();

    // Remove timestamps that are outside the time window
    requestTimestamps = requestTimestamps.filter(
        timestamp => currentTime - timestamp < TIME_WINDOW
    );

    // Persist the cleaned-up timestamps
    saveTimestamps(requestTimestamps);

    // Check if the number of requests in the time window is less than the maximum allowed
    return requestTimestamps.length < MAX_REQUESTS;
}

// Rate-limited drop-in replacement for fetch
function $fetch(url: string, options?: RequestInit): Promise<Response> {
    if (!canMakeRequest()) {
        // Tell the caller how long (in seconds) they need to wait. The block comes from
        // an active cooldown, a still-full sliding window, or both; use whichever ends later.
        const now = Date.now();
        const timestamps = loadTimestamps();
        const windowFreeAt = timestamps.length >= MAX_REQUESTS
            ? timestamps[timestamps.length - MAX_REQUESTS] + TIME_WINDOW
            : now;
        const retryAt = Math.max(loadCooldownUntil(), windowFreeAt);
        const remaining = Math.max(1, Math.ceil((retryAt - now) / 1000));
        return Promise.reject(
            new Error(`Rate limit exceeded. Please try again in ${remaining}s.`)
        );
    }

    const currentTime = Date.now();

    // Load, update, and persist timestamps before making the request
    const requestTimestamps = loadTimestamps();
    requestTimestamps.push(currentTime);
    saveTimestamps(requestTimestamps);

    // If this request reaches the limit, start a cooldown period.
    // The cooldown begins NOW and lasts COOLDOWN_TIME ms. Because COOLDOWN_TIME
    // is shorter than TIME_WINDOW here, requests can stay blocked after the
    // cooldown ends, until the oldest timestamps leave the sliding window.
    if (requestTimestamps.length >= MAX_REQUESTS) {
        saveCooldownUntil(currentTime + COOLDOWN_TIME);
    }

    // Make the actual fetch request
    return fetch(url, options);
}

export default $fetch;

Whenever you need to make a request to the backend, just use $fetch instead of the native fetch:

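// example usage
import $fetch from 'utils/rate-limiter';

try {
    const response = await $fetch('/api/get-data', {
        method: 'POST',
        headers: { 'Content-Type': 'application/json' },
        body: JSON.stringify({ key: 'value' })
    });
    // Assuming the endpoint responds with JSON
    const data = await response.json();
    console.log(data);
} catch (error) {
    // Rate-limit rejections and network failures both end up here
    console.error(error instanceof Error ? error.message : error);
}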
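
The catch block above receives limiter rejections and ordinary network errors alike. If the UI should treat them differently, one option is a dedicated error type. This is only a sketch: `RateLimitError` and `describeError` are hypothetical names, and it assumes `$fetch` is changed to reject with `RateLimitError` instead of a plain `Error`.

```typescript
// Hypothetical error type so callers can tell limiter rejections from network failures
class RateLimitError extends Error {
    readonly retryAfterMs: number;

    constructor(retryAfterMs: number) {
        super(`Rate limit exceeded. Please try again in ${Math.ceil(retryAfterMs / 1000)}s.`);
        this.name = 'RateLimitError';
        this.retryAfterMs = retryAfterMs;
        // Keeps instanceof working when compiling to ES5 targets
        Object.setPrototypeOf(this, RateLimitError.prototype);
    }
}

// Turn any caught value into a message suitable for showing to the user
function describeError(error: unknown): string {
    if (error instanceof RateLimitError) {
        return `Slow down: retry in ${Math.ceil(error.retryAfterMs / 1000)}s`;
    }
    if (error instanceof Error) {
        return `Request failed: ${error.message}`;
    }
    return 'Request failed';
}
```

A caller could then pass whatever it catches to `describeError` and show a countdown for rate-limit rejections while keeping a generic message for everything else.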

⚠️ Warning: This will prevent the vast majority of users from flooding your backend, but a determined minority and AI bots may attempt to bypass client-side checks entirely. Backend rate limiting must still be in place as your true line of defense. Think of frontend rate limiting as an optimization layer, not a security layer.
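
Since the backend stays the source of truth, the client limiter can also learn from it. The sketch below assumes the backend answers with HTTP 429 and a standard `Retry-After` header (either a number of seconds or an HTTP date); whether your backend actually sends that header is an assumption. `parseRetryAfter` and `cooldownFromResponse` are hypothetical helper names.

```typescript
// Convert a Retry-After header value into an absolute "blocked until" timestamp
// (ms since epoch). Returns 0 when the header is missing or cannot be parsed.
function parseRetryAfter(headerValue: string | null, now: number = Date.now()): number {
    if (!headerValue) {
        return 0;
    }
    const trimmed = headerValue.trim();

    // Form 1: delay in whole seconds, e.g. "30"
    if (/^\d+$/.test(trimmed)) {
        return now + Number(trimmed) * 1000;
    }

    // Form 2: HTTP date, e.g. "Wed, 21 Oct 2015 07:28:00 GMT"
    const date = Date.parse(trimmed);
    return Number.isNaN(date) ? 0 : date;
}

// Minimal shape of a fetch Response that this helper needs
interface ResponseLike {
    status: number;
    headers: { get(name: string): string | null };
}

// Returns the timestamp until which the client should stay quiet, or 0 if the
// response is not a 429 (hypothetical helper, not part of the limiter above)
function cooldownFromResponse(response: ResponseLike, now: number = Date.now()): number {
    if (response.status !== 429) {
        return 0;
    }
    return parseRetryAfter(response.headers.get('Retry-After'), now);
}
```

In the localStorage version, a non-zero result could be passed to `saveCooldownUntil` so the client stops sending requests the backend would reject anyway.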

When to Use Frontend Rate Limiting

Your app has a high volume of users and you want to reduce unnecessary backend load
The endpoint being called is expensive (e.g., hits a third-party API, runs a heavy DB query)
You want to improve UX by giving users instant feedback instead of waiting for a backend 429 response
You want to reduce cloud costs on services that charge per request or per DB operation
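
Expensive endpoints usually deserve tighter limits than cheap ones, and instant feedback works best when the UI knows how long to wait. One way to sketch both is a small factory that creates independent in-memory limiters. `createRateLimiter`, `tryAcquire`, `searchLimiter`, and `reportLimiter` are hypothetical names, the limits are made-up values, and the sketch assumes `maxRequests` is at least 1.

```typescript
interface LimiterOptions {
    maxRequests: number;  // maximum requests per window (assumed >= 1)
    timeWindow: number;   // window length in milliseconds
}

interface LimiterResult {
    allowed: boolean;
    retryInMs: number;    // 0 when allowed, otherwise time until a slot frees up
}

// Each call returns an independent sliding-window limiter
function createRateLimiter({ maxRequests, timeWindow }: LimiterOptions) {
    let timestamps: number[] = [];

    return function tryAcquire(now: number = Date.now()): LimiterResult {
        // Drop timestamps that have fallen outside the window
        timestamps = timestamps.filter(timestamp => now - timestamp < timeWindow);

        if (timestamps.length >= maxRequests) {
            // The oldest timestamp leaves the window first and frees a slot
            const oldest = timestamps[0] ?? now;
            return { allowed: false, retryInMs: oldest + timeWindow - now };
        }

        timestamps.push(now);
        return { allowed: true, retryInMs: 0 };
    };
}

// Hypothetical limits: generous for a cheap search call, strict for a heavy report
const searchLimiter = createRateLimiter({ maxRequests: 50, timeWindow: 60 * 1000 });
const reportLimiter = createRateLimiter({ maxRequests: 3, timeWindow: 60 * 1000 });
```

A button handler could call `reportLimiter()` before sending the request and, when `allowed` is false, show `retryInMs` as a countdown instead of waiting for a 429 from the backend.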

When Not to Use Frontend Rate Limiting

As a replacement for backend rate limiting — it is never a substitute
For protecting sensitive or security-critical endpoints (authentication, payments, etc.)
When your users are primarily API consumers or developers who interact directly with your backend
When the client environment is untrusted (e.g., mobile apps that can be reverse-engineered or patched)
When accurate rate limiting is a hard requirement: client-side state can be reset by refreshing the page (basic version) or by clearing localStorage (persistent version)