Young Gao

Build a Custom API Gateway in Node.js: Routing, JWT Auth, and Rate Limits (2026)

Most teams reach for Kong or AWS API Gateway before they understand what an API gateway actually does. Then they spend weeks configuring YAML files for something they could've built in an afternoon.

Let's build one from scratch. You'll understand every layer, and you'll know exactly when to stop building and start buying.

API Gateway vs Reverse Proxy

A reverse proxy forwards requests. An API gateway transforms them.

Nginx can route /api/users to your users service. But it won't validate JWTs, enforce per-client rate limits, reshape payloads, or collect business-level metrics. That's gateway territory.

Use a reverse proxy when you just need load balancing and TLS termination.
Use an API gateway when you need auth, rate limiting, transformation, or aggregation across multiple backend services.

The Architecture

Our gateway is a middleware pipeline. Every request flows through:

Request → Router → Auth → Rate Limiter → Transform → Proxy → Transform → Response

Let's build each piece.

Request Routing

The router matches incoming paths to backend services and rewrites URLs.

interface Route {
  prefix: string;
  target: string;
  rewrite?: (path: string) => string;
  auth: 'jwt' | 'apikey' | 'none';
  rateLimit?: { window: number; max: number };
}

const routes: Route[] = [
  {
    prefix: '/api/users',
    target: 'http://users-service:3001',
    rewrite: (p) => p.replace(/^\/api\/users/, '/v2/users'),
    auth: 'jwt',
    rateLimit: { window: 60_000, max: 100 },
  },
  {
    prefix: '/api/payments',
    target: 'http://payments-service:3002',
    auth: 'jwt',
    rateLimit: { window: 60_000, max: 20 },
  },
  {
    prefix: '/webhooks',
    target: 'http://webhook-handler:3003',
    auth: 'apikey',
  },
];

function matchRoute(path: string): Route | undefined {
  return routes.find((r) => path.startsWith(r.prefix));
}

Routes are checked in order. First match wins. The rewrite function lets you decouple your public API paths from internal service paths — your users see /api/users, your service sees /v2/users.
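
One caveat with startsWith: it compares raw characters, so the prefix /api/users also matches /api/users-export. Here is a minimal sketch of a stricter matcher, assuming you want matches only on path-segment boundaries and want the longest prefix to win regardless of array order. The names matchesPrefix, matchRouteStrict, RouteLike, and the http://fallback:3000 target are illustrative and not part of the gateway above.

```typescript
// Trimmed-down route shape for illustration; the real Route has more fields.
interface RouteLike {
  prefix: string;
  target: string;
}

// True when `path` equals `prefix` or continues it at a "/" boundary.
function matchesPrefix(path: string, prefix: string): boolean {
  if (!path.startsWith(prefix)) return false;
  return (
    path.length === prefix.length ||
    prefix.endsWith('/') ||
    path[prefix.length] === '/'
  );
}

// Longest matching prefix wins, so array order no longer matters.
function matchRouteStrict<T extends RouteLike>(path: string, table: T[]): T | undefined {
  let best: T | undefined;
  for (const route of table) {
    const longer = !best || route.prefix.length > best.prefix.length;
    if (longer && matchesPrefix(path, route.prefix)) best = route;
  }
  return best;
}

// Hypothetical table: a catch-all /api route listed before a more specific one.
const demoRoutes: RouteLike[] = [
  { prefix: '/api', target: 'http://fallback:3000' },
  { prefix: '/api/users', target: 'http://users-service:3001' },
];

console.log(matchRouteStrict('/api/users/42', demoRoutes)?.target);     // http://users-service:3001
console.log(matchRouteStrict('/api/users-export', demoRoutes)?.target); // http://fallback:3000
```

With plain startsWith and first-match-wins, both requests would have gone to the /api entry because it is listed first. Whether you want order-based or longest-prefix matching is a design choice; the point is to make the rule explicit.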

Authentication Middleware

Two strategies: JWT for user-facing endpoints, API keys for service-to-service.

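import { jwtVerify, type JWTPayload } from 'jose';

// Thrown for any authentication failure; handleRequest maps it to a 401
class AuthError extends Error {}

const JWT_SECRET = new TextEncoder().encode(process.env.JWT_SECRET!);
const API_KEYS = new Set(process.env.API_KEYS!.split(','));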

async function authenticate(
  req: Request,
  mode: Route['auth']
): Promise<JWTPayload | null> {
  if (mode === 'none') return null;

  if (mode === 'apikey') {
    const key = req.headers.get('x-api-key');
    if (!key || !API_KEYS.has(key)) throw new AuthError('Invalid API key');
    return null;
  }

  const header = req.headers.get('authorization');
  if (!header?.startsWith('Bearer ')) throw new AuthError('Missing token');

  try {
    const { payload } = await jwtVerify(header.slice(7), JWT_SECRET);
    return payload;
  } catch {
    throw new AuthError('Invalid token');
  }
}

The gateway validates tokens so your downstream services don't have to. They receive a trusted x-user-id header instead of repeating JWT verification.

Rate Limiting

Fixed-window counter using a Map: each client gets max requests per window, and the count resets when the window expires. For production, swap the Map for Redis.

interface RateState {
  count: number;
  resetAt: number;
}

const limits = new Map<string, RateState>();

function checkRateLimit(
  clientId: string,
  config: { window: number; max: number }
): { allowed: boolean; remaining: number; resetAt: number } {
  const key = clientId;
  const now = Date.now();
  let state = limits.get(key);

  if (!state || now > state.resetAt) {
    state = { count: 0, resetAt: now + config.window };
    limits.set(key, state);
  }

  state.count++;
  const allowed = state.count <= config.max;

  return {
    allowed,
    remaining: Math.max(0, config.max - state.count),
    resetAt: state.resetAt,
  };
}

Key the rate limit on client ID (from JWT sub claim) plus the route prefix. This gives you per-client, per-endpoint limits. A chatty user hitting /api/users won't eat into their /api/payments budget.
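
The counter above resets at a fixed boundary, so a client can send max requests just before the reset and max more just after it. If that burst matters, a sliding window counter weights the previous window's count by how much of it still overlaps the trailing window. Below is a minimal sketch, assuming the same { window, max } config. The names slidingWindowCheck, SlidingState, slidingLimits, and the injectable now parameter are illustrative, not part of the gateway.

```typescript
interface SlidingState {
  windowStart: number; // start of the current fixed window (ms)
  current: number;     // requests admitted in the current window
  previous: number;    // requests admitted in the window before it
}

const slidingLimits = new Map<string, SlidingState>();

function slidingWindowCheck(
  key: string,
  config: { window: number; max: number },
  now: number = Date.now() // injectable so the behavior can be tested
): { allowed: boolean; remaining: number } {
  let state = slidingLimits.get(key);
  if (!state) {
    state = { windowStart: now, current: 0, previous: 0 };
    slidingLimits.set(key, state);
  }

  const elapsedWindows = Math.floor((now - state.windowStart) / config.window);
  if (elapsedWindows >= 1) {
    // Roll forward; counts older than one full window no longer matter.
    state.previous = elapsedWindows === 1 ? state.current : 0;
    state.current = 0;
    state.windowStart += elapsedWindows * config.window;
  }

  // Fraction of the previous window that still falls inside the trailing window.
  const overlap = 1 - (now - state.windowStart) / config.window;
  const estimated = state.previous * overlap + state.current;

  // Unlike the fixed-window version, rejected requests are not counted here.
  const allowed = estimated + 1 <= config.max;
  if (allowed) state.current++;

  const used = estimated + (allowed ? 1 : 0);
  return { allowed, remaining: Math.max(0, Math.floor(config.max - used)) };
}
```

With { window: 1000, max: 2 }, two requests at t=0 and t=100 are admitted, and a third at t=1100 is rejected because 90% of the previous window still counts. The fixed-window version would have admitted it.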

Request/Response Transformation

Strip internal headers from responses going out. Inject context headers into requests going in.

function transformRequest(
  req: Request,
  route: Route,
  claims: JWTPayload | null
): Request {
  const headers = new Headers(req.headers);

  // Strip client-supplied internal headers (prevent spoofing)
  headers.delete('x-user-id');
  headers.delete('x-request-id');

  // Inject trusted context
  headers.set('x-request-id', crypto.randomUUID());
  if (claims?.sub) headers.set('x-user-id', claims.sub);
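  // cf-connecting-ip is set by Cloudflare; use your own edge's client-IP header if different
  headers.set('x-forwarded-for', req.headers.get('cf-connecting-ip') || 'unknown');

  const url = new URL(req.url);
  const targetPath = route.rewrite ? route.rewrite(url.pathname) : url.pathname;

  // Keep the query string; the rewrite only touches the path
  return new Request(`${route.target}${targetPath}${url.search}`, {
    method: req.method,
    headers,
    // Streamed through as-is; Node's fetch also requires `duplex: 'half'` for stream bodies
    body: req.body,
  });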
}

function transformResponse(res: Response, rateInfo: ReturnType<typeof checkRateLimit>): Response {
  const headers = new Headers(res.headers);

  // Strip internal headers from response
  headers.delete('x-powered-by');
  headers.delete('server');

  // Add rate limit headers (resetAt stays 0 when the route has no limit configured)
  if (rateInfo.resetAt > 0) {
    headers.set('x-ratelimit-remaining', String(rateInfo.remaining));
    headers.set('x-ratelimit-reset', String(rateInfo.resetAt));
  }

  return new Response(res.body, { status: res.status, headers });
}
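
As the number of injected headers grows, it is easy to add a new one and forget to strip the client-supplied copy. One way to guard against that is to keep the internal header names in a single list and route every injection through it. This is a minimal sketch under that assumption: withTrustedContext and INTERNAL_HEADERS are illustrative names, and x-user-roles is a hypothetical extra header that the gateway above does not set.

```typescript
// Headers only the gateway may set. x-user-id and x-request-id come from the
// code above; x-user-roles is a hypothetical addition for illustration.
const INTERNAL_HEADERS = ['x-user-id', 'x-request-id', 'x-user-roles'];

function withTrustedContext(
  incoming: Headers,
  context: Record<string, string | undefined>
): Headers {
  const headers = new Headers(incoming);

  // 1. Drop every client-supplied copy of an internal header.
  for (const name of INTERNAL_HEADERS) headers.delete(name);

  // 2. Set only values the gateway derived itself; skip undefined ones.
  for (const [name, value] of Object.entries(context)) {
    if (!INTERNAL_HEADERS.includes(name.toLowerCase())) {
      // Forces new context headers to be declared (and therefore stripped).
      throw new Error(`${name} is not declared as an internal header`);
    }
    if (value !== undefined) headers.set(name, value);
  }
  return headers;
}

// A client tries to impersonate "admin" on a route where no JWT subject exists.
const spoofed = new Headers({ 'x-user-id': 'admin', accept: 'application/json' });
const safe = withTrustedContext(spoofed, { 'x-user-id': undefined, 'x-request-id': 'req-1' });

console.log(safe.get('x-user-id'));    // null
console.log(safe.get('x-request-id')); // req-1
console.log(safe.get('accept'));       // application/json
```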

Logging and Metrics

Every request gets logged with timing and outcome. Structured JSON, not string concatenation.

function logRequest(entry: {
  requestId: string;
  method: string;
  path: string;
  route: string;
  clientId: string;
  status: number;
  durationMs: number;
}) {
  // Ship to stdout — let your log collector handle the rest
  process.stdout.write(JSON.stringify(entry) + '\n');
}

Emit to stdout. Let Fluentd or Vector handle shipping. Your gateway shouldn't know about Elasticsearch.

Putting It Together

async function handleRequest(req: Request): Promise<Response> {
  const start = performance.now();
  const url = new URL(req.url);
  const route = matchRoute(url.pathname);

  if (!route) return new Response('Not Found', { status: 404 });

  try {
    const claims = await authenticate(req, route.auth);
    const clientId = claims?.sub as string || req.headers.get('x-api-key') || 'anon';

    let rateInfo = { allowed: true, remaining: 0, resetAt: 0 };
    if (route.rateLimit) {
      rateInfo = checkRateLimit(`${clientId}:${route.prefix}`, route.rateLimit);
      if (!rateInfo.allowed) {
        return new Response('Rate limit exceeded', { status: 429 });
      }
    }

    const proxyReq = transformRequest(req, route, claims);
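    // Bound the upstream call so one hung service cannot stall the gateway
    const proxyRes = await fetch(proxyReq, { signal: AbortSignal.timeout(5000) });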
    const response = transformResponse(proxyRes, rateInfo);

    logRequest({
      requestId: proxyReq.headers.get('x-request-id')!,
      method: req.method,
      path: url.pathname,
      route: route.prefix,
      clientId,
      status: proxyRes.status,
      durationMs: Math.round(performance.now() - start),
    });

    return response;
  } catch (err) {
    if (err instanceof AuthError) {
      return new Response(err.message, { status: 401 });
    }
    return new Response('Internal Gateway Error', { status: 502 });
  }
}

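// Served here with Bun. handleRequest only uses web-standard Request/Response,
// so running it on Node.js needs an adapter over node:http.
Bun.serve({ port: 8080, fetch: handleRequest });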

That's a functional API gateway in under 200 lines. No YAML. No plugins. Every behavior is explicit.

When to Stop Building

This covers the common cases: routing, authentication, rate limiting, header transformation, and request logging. Stop here and buy/adopt when you need:

  • Service discovery — dynamic backend registration (use Consul + existing gateway)
  • Circuit breaking — cascading failure protection (add Opossum, or use Envoy)
  • WebSocket proxying — persistent connection management gets complex fast
  • Multi-region routing — geographic load balancing belongs in infrastructure, not app code

Common Mistakes

Validating JWTs in every service. Validate once at the gateway. Pass a trusted header downstream. If you can't trust your internal network, you have a bigger problem.

Rate limiting by IP only. Behind NAT or a corporate proxy, thousands of users can share one IP, and behind a load balancer every request may arrive from the balancer's address. Always rate limit by authenticated identity when possible.
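
A minimal sketch of that preference order: JWT subject first, then API key, and IP only as a last resort. The function name rateLimitKey and the key format are assumptions for illustration. The API key is fingerprinted with SHA-256 so the raw secret never ends up in the limiter's keys or in logs that record the client ID.

```typescript
import { createHash } from 'node:crypto';

function rateLimitKey(
  routePrefix: string,
  id: { sub?: string; apiKey?: string | null; ip?: string | null }
): string {
  let client: string;
  if (id.sub) {
    // Best case: an authenticated user identity from the JWT.
    client = `user:${id.sub}`;
  } else if (id.apiKey) {
    // Fingerprint the key instead of storing or logging the secret itself.
    const digest = createHash('sha256').update(id.apiKey).digest('hex');
    client = `key:${digest.slice(0, 16)}`;
  } else {
    // Last resort: many users may share this bucket.
    client = `ip:${id.ip ?? 'unknown'}`;
  }
  return `${client}:${routePrefix}`;
}

console.log(rateLimitKey('/api/users', { sub: 'user-123', ip: '203.0.113.7' }));
// user:user-123:/api/users
console.log(rateLimitKey('/public', { ip: '203.0.113.7' }));
// ip:203.0.113.7:/public
```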

Synchronous log writes. Writing logs inline adds latency to every request. Buffer writes or use stdout with an async log shipper.

No timeout on upstream requests. If your users service hangs, your gateway hangs. Always set AbortSignal.timeout(5000) on fetch calls. Without it, one slow service takes down your entire API.
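
A minimal sketch of bounding the upstream call and telling a timeout apart from other failures. It assumes that fetch rejects with the signal's abort reason, a DOMException named TimeoutError, when AbortSignal.timeout fires; verify that against your runtime. proxyWithTimeout, upstreamErrorStatus, and the injectable fetchImpl parameter are illustrative names.

```typescript
const UPSTREAM_TIMEOUT_MS = 5_000;

// Map a failed upstream call to a gateway status code.
// Assumption: timeouts surface as an error whose name is "TimeoutError".
function upstreamErrorStatus(err: unknown): 502 | 504 {
  const name =
    typeof err === 'object' && err !== null && 'name' in err
      ? (err as { name: unknown }).name
      : undefined;
  return name === 'TimeoutError' ? 504 : 502;
}

async function proxyWithTimeout(
  req: Request,
  fetchImpl: typeof fetch = fetch // injectable so tests can avoid the network
): Promise<Response> {
  try {
    return await fetchImpl(req, { signal: AbortSignal.timeout(UPSTREAM_TIMEOUT_MS) });
  } catch (err) {
    const status = upstreamErrorStatus(err);
    const message = status === 504 ? 'Upstream timeout' : 'Bad gateway';
    return new Response(message, { status });
  }
}
```

Returning 504 for timeouts and 502 for everything else makes hung services visible in status-code dashboards instead of blending into generic gateway errors.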

Rewriting paths without tests. Path rewriting bugs are silent. The gateway returns a 404 from the upstream service, and you'll spend an hour debugging the wrong thing. Test your rewrite functions in isolation.
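
A table of input and expected output pairs is usually enough. The sketch below copies the rewrite from the /api/users route and checks it with plain throws; in a real project you would likely move the cases into your test runner of choice. The specific cases are illustrative.

```typescript
// Same rewrite as the /api/users route above.
const rewriteUsers = (p: string): string => p.replace(/^\/api\/users/, '/v2/users');

const cases: Array<[string, string]> = [
  ['/api/users', '/v2/users'],
  ['/api/users/42', '/v2/users/42'],
  ['/api/users/42/orders', '/v2/users/42/orders'],
  ['/health', '/health'], // non-matching paths pass through unchanged
];

for (const [input, expected] of cases) {
  const actual = rewriteUsers(input);
  if (actual !== expected) {
    throw new Error(`rewrite(${input}) returned ${actual}, expected ${expected}`);
  }
}
console.log(`${cases.length} rewrite cases passed`); // 4 rewrite cases passed
```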

Sharing rate limit state in memory. Our in-memory Map works for a single instance. The moment you scale horizontally to N instances, each keeps its own counters, so a client can get up to N times its quota. Use Redis with INCR and PEXPIRE for distributed rate limiting.
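
A minimal sketch of the INCR plus PEXPIRE pattern. To stay independent of any particular Redis client, it is written against a hypothetical CounterStore interface exposing just those two commands; checkSharedRateLimit and FakeStore are likewise illustrative names. You would back the interface with whichever client library you use.

```typescript
// The two Redis commands the limiter needs, behind a hypothetical interface.
interface CounterStore {
  incr(key: string): Promise<number>;                 // Redis INCR
  pexpire(key: string, ms: number): Promise<unknown>; // Redis PEXPIRE
}

async function checkSharedRateLimit(
  store: CounterStore,
  key: string,
  config: { window: number; max: number }
): Promise<{ allowed: boolean; remaining: number }> {
  // INCR creates the key at 1 if it is missing, atomically across gateway instances.
  const count = await store.incr(key);

  // The first hit in a window starts the expiry clock.
  if (count === 1) await store.pexpire(key, config.window);

  return { allowed: count <= config.max, remaining: Math.max(0, config.max - count) };
}

// In-memory stand-in so the sketch runs without a Redis server.
// It records expirations but does not actually expire keys.
class FakeStore implements CounterStore {
  private counts = new Map<string, number>();
  readonly expirations = new Map<string, number>();

  async incr(key: string): Promise<number> {
    const next = (this.counts.get(key) ?? 0) + 1;
    this.counts.set(key, next);
    return next;
  }

  async pexpire(key: string, ms: number): Promise<number> {
    this.expirations.set(key, ms);
    return 1;
  }
}
```

Note that INCR and PEXPIRE are two separate commands here. If the gateway crashes between them, the key is left without an expiry and that client stays limited until the key is removed. Running both commands atomically, for example in a Lua script, is one common way to close that gap.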


Part of my Production Backend Patterns series. Follow for more practical backend engineering.




---

If this was useful, consider:
- [Sponsoring on GitHub](https://github.com/sponsors/NoPKT) to support more open-source tools
- [Buying me a coffee on Ko-fi](https://ko-fi.com/gps949)