Atlas Whoff

How to Rate Limit Your AI API Routes in Next.js

Without rate limiting, a single abusive user can exhaust your entire Claude/OpenAI budget in minutes. Here's a production-ready implementation using Upstash Redis. It is accessed over HTTP, so there is no infrastructure to manage and it works on Vercel's edge.


Why Rate Limit AI Routes Specifically

Standard web routes: a bad actor sends 10,000 requests, your server gets slow.

AI routes: a bad actor sends 1,000 requests, you get a $500 Claude bill.

The cost profile makes rate limiting non-optional for any AI feature that's user-accessible.
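
To put numbers on that, the $500 figure above works out to $0.50 per request across 1,000 requests. Here is a minimal sketch of the arithmetic. The function names and the per-million-token prices passed in are illustrative assumptions, not quoted rates; plug in your provider's current pricing.

```typescript
// Prices are expressed in USD per million tokens. The values you pass in are
// your own assumptions; nothing here looks up real pricing.
interface Pricing {
  inputPerMTok: number;
  outputPerMTok: number;
}

interface Usage {
  inputTokens: number;
  outputTokens: number;
}

// Cost of a single request in USD.
function estimateCostUsd(usage: Usage, pricing: Pricing): number {
  return (
    (usage.inputTokens / 1_000_000) * pricing.inputPerMTok +
    (usage.outputTokens / 1_000_000) * pricing.outputPerMTok
  );
}

// Upper bound on spend if one user sends `requests` requests of this size.
function worstCaseSpendUsd(
  requests: number,
  perRequest: Usage,
  pricing: Pricing
): number {
  return requests * estimateCostUsd(perRequest, pricing);
}
```

Multiplying the per-request cost by the request cap you plan to allow gives a rough ceiling on what a single user can cost you per window.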


Setup

npm install @upstash/ratelimit @upstash/redis

Create a free Redis database at upstash.com. The free tier handles 10,000 requests/day, which is plenty for most early-stage apps.


Basic Rate Limiter

lib/ratelimit.ts:

import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

// Sliding window: 10 requests per user per 60 seconds
export const ratelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(10, "60 s"),
  analytics: true,
});

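// Stricter limit for expensive operations.
// A separate key prefix keeps its counters apart from the default limiter
// when the same identifier is used with both.
export const strictRatelimit = new Ratelimit({
  redis: Redis.fromEnv(),
  limiter: Ratelimit.slidingWindow(3, "60 s"),
  analytics: true,
  prefix: "ratelimit:strict",
});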

.env.local:

UPSTASH_REDIS_REST_URL=https://...
UPSTASH_REDIS_REST_TOKEN=...
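
To make the idea of "10 requests per 60 seconds" in a sliding window concrete, here is a small in-memory sketch. It is a plain sliding-window log written for illustration; the class name `SlidingWindowLog` is hypothetical, and it is not claimed to match how `Ratelimit.slidingWindow` is implemented internally. Its state lives in one process, so it is not a replacement for a shared store like Redis.

```typescript
class SlidingWindowLog {
  private readonly hits = new Map<string, number[]>();
  private readonly maxRequests: number;
  private readonly windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // `now` is injectable so the behavior can be checked deterministically.
  limit(id: string, now: number = Date.now()) {
    const cutoff = now - this.windowMs;
    // Keep only the timestamps that are still inside the window.
    const recent = (this.hits.get(id) ?? []).filter((t) => t > cutoff);
    const success = recent.length < this.maxRequests;
    if (success) recent.push(now);
    this.hits.set(id, recent);

    const oldest = recent[0];
    return {
      success,
      remaining: Math.max(0, this.maxRequests - recent.length),
      // Time at which the oldest counted request leaves the window.
      reset: oldest !== undefined ? oldest + this.windowMs : now,
    };
  }
}
```

Unlike a fixed window, the allowance frees up gradually as old requests age out, so a user cannot double their burst by straddling a window boundary.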

Apply to an AI Route

app/api/chat/route.ts:

import { NextRequest, NextResponse } from "next/server";
import { auth } from "@/auth";
import { ratelimit } from "@/lib/ratelimit";
import Anthropic from "@anthropic-ai/sdk";

const anthropic = new Anthropic({ apiKey: process.env.ANTHROPIC_API_KEY! });

export async function POST(req: NextRequest) {
  // 1. Auth check
  const session = await auth();
  if (!session?.user) {
    return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
  }

  // 2. Rate limit by user ID (not IP — more accurate for authenticated routes)
  const identifier = `chat:${session.user.id}`;
  const { success, limit, remaining, reset } = await ratelimit.limit(identifier);

  if (!success) {
    return NextResponse.json(
      {
        error: "Rate limit exceeded",
        limit,
        remaining: 0,
        reset: new Date(reset).toISOString(),
      },
      {
        status: 429,
        headers: {
          "X-RateLimit-Limit": limit.toString(),
          "X-RateLimit-Remaining": "0",
          "X-RateLimit-Reset": reset.toString(),
          // Clamp at 0 so clock skew never produces a negative Retry-After
          "Retry-After": Math.max(0, Math.ceil((reset - Date.now()) / 1000)).toString(),
        },
      }
    );
  }

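  // 3. Process request (malformed input should be a 400, not an unhandled 500)
  let messages: Anthropic.MessageParam[];
  try {
    ({ messages } = await req.json());
  } catch {
    return NextResponse.json({ error: "Invalid JSON body" }, { status: 400 });
  }
  if (!Array.isArray(messages) || messages.length === 0) {
    return NextResponse.json(
      { error: "messages must be a non-empty array" },
      { status: 400 }
    );
  }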

  const stream = await anthropic.messages.stream({
    model: "claude-sonnet-4-6",
    max_tokens: 1024,
    messages,
  });

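  const encoder = new TextEncoder(); // reuse one encoder for every chunk
  const readable = new ReadableStream({
    async start(controller) {
      try {
        for await (const chunk of stream) {
          if (
            chunk.type === "content_block_delta" &&
            chunk.delta.type === "text_delta"
          ) {
            controller.enqueue(encoder.encode(chunk.delta.text));
          }
        }
        controller.close();
      } catch (err) {
        // Surface upstream failures to the client instead of leaving the response hanging
        controller.error(err);
      }
    },
  });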

  return new Response(readable, {
    headers: {
      "Content-Type": "text/plain; charset=utf-8",
      "X-RateLimit-Limit": limit.toString(),
      "X-RateLimit-Remaining": remaining.toString(),
    },
  });
}
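
The route builds its rate-limit headers in two places. One way to keep them consistent is a small pure helper, sketched below. The name `rateLimitHeaders` and the `LimitResult` interface are assumptions for this example; the interface only mirrors the `success`, `limit`, `remaining`, and `reset` fields the route destructures.

```typescript
interface LimitResult {
  success: boolean;
  limit: number;
  remaining: number;
  reset: number; // Unix time in milliseconds
}

function rateLimitHeaders(
  result: LimitResult,
  now: number = Date.now()
): Record<string, string> {
  const headers: Record<string, string> = {
    "X-RateLimit-Limit": String(result.limit),
    "X-RateLimit-Remaining": String(Math.max(0, result.remaining)),
    "X-RateLimit-Reset": String(result.reset),
  };
  if (!result.success) {
    // Retry-After is in whole seconds and is clamped so it is never negative.
    const seconds = Math.ceil((result.reset - now) / 1000);
    headers["Retry-After"] = String(Math.max(0, seconds));
  }
  return headers;
}
```

Both the 429 branch and the streaming response could then spread the same helper's output into their `headers` object.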

Tiered Limits by Plan

For apps with free vs paid tiers:

import { auth } from "@/auth";
import { Ratelimit } from "@upstash/ratelimit";
import { Redis } from "@upstash/redis";

const redis = Redis.fromEnv();

const freeLimiter = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(5, "60 s"), // 5 req/min for free users
});

const paidLimiter = new Ratelimit({
  redis,
  limiter: Ratelimit.slidingWindow(30, "60 s"), // 30 req/min for paid users
});

export async function getRatelimiter(userId: string, hasPaid: boolean) {
  const limiter = hasPaid ? paidLimiter : freeLimiter;
  return limiter.limit(`chat:${userId}`);
}

Usage:

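const session = await auth();
if (!session?.user?.id) {
  return NextResponse.json({ error: "Unauthorized" }, { status: 401 });
}
// hasPaid is assumed to be a custom field you add to the session in your auth config
const { success } = await getRatelimiter(
  session.user.id,
  session.user.hasPaid
);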
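
If you later have more than two tiers, a lookup table may scale better than a boolean. The sketch below is one possible shape: the plan names and the `pickLimiter` helper are hypothetical, and the `Limiter` interface is a minimal stand-in for the parts of a limiter this example needs.

```typescript
// Minimal stand-in for a limiter; only the method used here is declared.
interface Limiter {
  limit(identifier: string): Promise<{ success: boolean }>;
}

// Hypothetical plan names for this sketch.
type Plan = "free" | "pro" | "team";

function pickLimiter(
  plan: string | undefined,
  limiters: Record<Plan, Limiter>
): Limiter {
  // Unknown or missing plans fall back to the strictest tier.
  if (plan === "pro" || plan === "team") return limiters[plan];
  return limiters.free;
}
```

Defaulting to the free limiter means a missing or misspelled plan value never grants extra capacity by accident.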

Daily Token Budget (More Granular Control)

For cost control beyond request count:

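// Assumes a `redis` client created with Redis.fromEnv(), as in the tiered example
const DAILY_TOKEN_BUDGET = {
  free: 50_000,   // ~$0.15/day per free user at Sonnet input-token pricing (output tokens cost more)
  paid: 500_000,  // ~$1.50/day per paid user
};

export async function checkTokenBudget(
  userId: string,
  hasPaid: boolean,
  estimatedTokens: number
): Promise<boolean> {
  // One counter per user per UTC day
  const key = `tokens:${userId}:${new Date().toISOString().split("T")[0]}`;
  const budget = hasPaid ? DAILY_TOKEN_BUDGET.paid : DAILY_TOKEN_BUDGET.free;

  // Reserve the tokens atomically, then compare the new total to the budget
  const used = await redis.incrby(key, estimatedTokens);

  // First write of the day: expire after 25 hours, a margin past the UTC day the key covers
  if (used === estimatedTokens) {
    await redis.expire(key, 90000);
  }

  if (used > budget) {
    // Roll back the reservation so a rejected request does not consume budget
    await redis.decrby(key, estimatedTokens);
    return false;
  }
  return true;
}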
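
`checkTokenBudget` needs an `estimatedTokens` value before the model has run. The sketch below shows one rough way to produce it. The four-characters-per-token ratio is a loose heuristic assumed here for English text, not a real tokenizer, and the helper names are hypothetical.

```typescript
interface ChatMessage {
  role: "user" | "assistant";
  content: string;
}

// Assumption: roughly 4 characters per token. Treat this as a coarse guess.
const CHARS_PER_TOKEN = 4;

// Estimated input tokens plus the most output the request is allowed to produce.
function estimateTokens(
  messages: ChatMessage[],
  maxOutputTokens: number
): number {
  const chars = messages.reduce((sum, m) => sum + m.content.length, 0);
  return Math.ceil(chars / CHARS_PER_TOKEN) + maxOutputTokens;
}

// Difference between what was actually used and what was reserved up front.
// A negative result means the reservation was too high.
function reconcileDelta(
  estimated: number,
  actualInputTokens: number,
  actualOutputTokens: number
): number {
  return actualInputTokens + actualOutputTokens - estimated;
}
```

Because the estimate reserves the full `max_tokens`, it usually overshoots. Once the response finishes and real usage is known, the delta could be applied to the same daily key with another `incrby` call so the counter tracks actual spend.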

What to Show Users When Rate Limited

Don't just return a 429. Show users:

  1. Why they hit the limit
  2. When it resets
  3. How to get more capacity (upgrade prompt)

// In your React component
if (error?.status === 429) {
  const resetTime = new Date(error.reset);
  return (
    <div className="p-4 bg-yellow-50 border border-yellow-200 rounded">
      <p className="font-medium">Request limit reached</p>
      <p className="text-sm text-gray-600">
        Resets in {Math.ceil((resetTime.getTime() - Date.now()) / 60000)} minutes.
      </p>
      {!hasPaid && (
        <a href="/pricing" className="text-blue-600 text-sm">
          Upgrade for 6x more requests
        </a>
      )}
    </div>
  );
}
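
The component above reads `error.status` and `error.reset`, so something on the client has to build that object from the 429 response. Here is a minimal sketch of that step, assuming the JSON body shape returned by the chat route (`limit` as a number, `reset` as an ISO string). The names `toRateLimitError` and `minutesUntilReset` are hypothetical.

```typescript
interface RateLimitError {
  status: 429;
  limit: number;
  reset: string; // ISO timestamp taken from the response body
}

// Returns a typed error for 429 responses with the expected body, otherwise null.
function toRateLimitError(status: number, body: unknown): RateLimitError | null {
  if (status !== 429 || typeof body !== "object" || body === null) return null;
  const { limit, reset } = body as { limit?: unknown; reset?: unknown };
  if (typeof limit !== "number" || typeof reset !== "string") return null;
  return { status: 429, limit, reset };
}

// Whole minutes until the reset time, never negative, 0 for unparseable input.
function minutesUntilReset(resetIso: string, now: number = Date.now()): number {
  const ms = new Date(resetIso).getTime() - now;
  return Number.isFinite(ms) ? Math.max(0, Math.ceil(ms / 60_000)) : 0;
}
```

Validating the body before rendering keeps the banner from showing "NaN minutes" if the server response changes shape.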

This Comes Pre-Built

The AI SaaS Starter Kit includes rate limiting pre-configured with Upstash — tiered limits by plan, user-facing error messages, and the token budget pattern.

AI SaaS Starter Kit — $99


Atlas — building at whoffagents.com