Bharath Kumar

I Built a Rate Limiter SDK from Scratch — Here's Every Decision I Made and Why

I'm a final-year CS student who contributes to open source — Formbricks, Trigger.dev. While doing that I kept running into the same class of problems: rate limiting, retry logic, SDK reliability.
So I built a rate limiter SDK from scratch. Not to follow a tutorial. To actually understand every decision.
This post is about those decisions — why Redis over PostgreSQL, why sliding window over fixed window, why fail-open over fail-closed, and a few others. Each one taught me something that no tutorial ever explained.
Live demo: https://rate-limiter-sdk.vercel.app
GitHub: https://github.com/bharathkumar39293/Rate-Limiter-SDK

What I built
A rate limiter that any Node.js developer can drop into their app with one npm install:
```typescript
import { RateLimiterClient } from 'rate-limiter-sdk'

const limiter = new RateLimiterClient({
  apiKey: 'your-api-key',
  serverUrl: 'https://your-server.com'
})

const result = await limiter.check({ userId: 'user_123', limit: 100, window: 60 })

if (!result.allowed) {
  return res.status(429).json({ retryAfter: result.retryAfter })
}
```
One install, a handful of lines, everything handled. That's the goal of an SDK — hide the complexity so the developer never has to think about it.
The stack: TypeScript, Node.js, Express, Redis, PostgreSQL, Docker. Let me walk through the decisions.

Decision 1: Redis over PostgreSQL for the rate limiting logic
This was the first question I had to answer. I already know PostgreSQL. Why bring in Redis at all?
The answer is simple once you think about it.
Rate limiting happens on every single request, before anything else runs. At scale that's thousands of checks per second. PostgreSQL persists to disk; even with its caches warm, a query still goes through parsing, planning, and often disk I/O. That's fine for storing user data. It's not fine for something that needs to answer in well under a millisecond.
Redis keeps its entire dataset in RAM. As a rough point of reference, a main-memory access is on the order of 100 nanoseconds while a disk seek is on the order of 10 milliseconds, about a 100,000x gap. Redis won't literally answer in 100 nanoseconds once you add the network hop, but the in-memory design is what keeps its operations in the microsecond range.
So the rule became clear: Redis for real-time decisions, PostgreSQL for permanent history. Different jobs, different tools.

Decision 2: Sliding window over fixed window
This is the one I get asked about most. Both algorithms count requests over a time window — but they behave very differently under pressure.
Fixed window divides time into rigid buckets: 0-60s, 60-120s, and so on. Limit is 100 requests per bucket. Sounds fine.
The problem: a user can send 100 requests at second 59 and another 100 at second 61. That's 200 requests in 2 seconds — double the limit — and both batches pass the check. The bucket boundary is a hole.
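To see the boundary hole concretely, here's a minimal in-memory fixed-window counter. This is an illustration only — the class name `FixedWindow` is mine, not the SDK's — but the reset-on-new-bucket behavior is exactly what lets the burst through:

```typescript
// Minimal fixed-window counter: buckets keyed by floor(now / windowMs).
class FixedWindow {
  private bucket = -1
  private count = 0
  constructor(private limit: number, private windowMs: number) {}

  allow(nowMs: number): boolean {
    const bucket = Math.floor(nowMs / this.windowMs)
    if (bucket !== this.bucket) {
      this.bucket = bucket
      this.count = 0 // new bucket: counter resets, history forgotten
    }
    if (this.count >= this.limit) return false
    this.count++
    return true
  }
}

// 100 requests at t=59s and 100 more at t=61s: all 200 pass,
// because the second batch lands in a fresh bucket.
const fw = new FixedWindow(100, 60_000)
let passed = 0
for (let i = 0; i < 100; i++) if (fw.allow(59_000)) passed++
for (let i = 0; i < 100; i++) if (fw.allow(61_000)) passed++
console.log(passed) // 200
```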
Sliding window doesn't use buckets. The window always looks back exactly N seconds from right now. If you sent 100 requests in the last 60 seconds, you're blocked. Doesn't matter when the clock ticks over.
The implementation uses a Redis sorted set. Each request is stored as an entry with its timestamp as the score. To check the limit:
```typescript
// Remove entries older than the window
await redis.zremrangebyscore(key, 0, now - windowMs)

// Count what's left — these are all within the window
const count = await redis.zcard(key)

// Make the decision
if (count >= limit) return { allowed: false, retryAfter: ... }

// Allow — add this request
await redis.zadd(key, now, requestId)
```
Four lines of logic. The sliding window moves automatically because we always remove old entries before counting.
Stripe uses sliding window. Cloudflare uses sliding window. There's a reason.
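The same four steps can be sketched in-memory with a plain timestamp array, which makes the "window slides automatically" behavior easy to verify. This is an illustration (`SlidingWindow` is my name, not the SDK's), not the production code:

```typescript
// Sliding window over an array of timestamps. Each step mirrors the
// Redis version: prune old entries, count, decide, record.
class SlidingWindow {
  private timestamps: number[] = []
  constructor(private limit: number, private windowMs: number) {}

  allow(nowMs: number): boolean {
    // Prune entries older than the window (zremrangebyscore)
    this.timestamps = this.timestamps.filter(t => t > nowMs - this.windowMs)
    // Count what's left (zcard) and decide
    if (this.timestamps.length >= this.limit) return false
    // Record this request (zadd)
    this.timestamps.push(nowMs)
    return true
  }
}

// The 59s/61s burst that fooled the fixed window is caught here:
const sw = new SlidingWindow(100, 60_000)
let passed = 0
for (let i = 0; i < 100; i++) if (sw.allow(59_000)) passed++
for (let i = 0; i < 100; i++) if (sw.allow(61_000)) passed++
console.log(passed) // 100 — the second batch is still inside the window
```

One note on the real Redis version: the prune/count/add sequence is three separate commands, so two concurrent checks for the same key can race. In production this is typically closed by wrapping the commands in a `MULTI` transaction or a Lua script so they execute atomically.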

Decision 3: Fail-open over fail-closed
This was the most important design decision in the SDK client — and the one that took the longest to think through.
When the rate limiter server is unreachable (network down, timeout, crash), the SDK has two options:

Fail closed → block all requests. Safe, strict.
Fail open → allow all requests. Risky, but resilient.

I chose fail-open. Here's why.
My rate limiter is a secondary service. It exists to protect the developer's app — not to be the app itself. If my server goes down and I fail closed, I just blocked every user of every app that's using my SDK. The developer's product is now broken because of my infrastructure problem.
That's a worse outcome than allowing a few extra requests temporarily.
```typescript
} catch (error: any) {
  // Server unreachable — fail open
  if (!error.response) {
    console.warn('[RateLimiter] Server unreachable — failing open')
    return { allowed: true, remaining: -1 }
  }
  return error.response.data
}
```
The remaining: -1 is a deliberate signal. Negative remaining means "we allowed this but couldn't actually check." Developers who want to monitor fail-open events can watch for it.
The principle: never let your secondary service take down someone's primary app.
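Watching for that signal takes only a small wrapper. This helper is hypothetical (not part of the SDK), sketching how a developer might count fail-open events and feed them into whatever metrics system they already use:

```typescript
interface CheckResult { allowed: boolean; remaining: number; retryAfter?: number }

let failOpenCount = 0 // wire this into your metrics system of choice

// Hypothetical wrapper: passes results through, counting fail-open events.
function trackFailOpen(result: CheckResult): CheckResult {
  if (result.allowed && result.remaining === -1) {
    failOpenCount++
    console.warn(`[RateLimiter] fail-open events so far: ${failOpenCount}`)
  }
  return result
}

trackFailOpen({ allowed: true, remaining: 42 }) // normal result: not counted
trackFailOpen({ allowed: true, remaining: -1 }) // fail-open: counted
console.log(failOpenCount) // 1
```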

Decision 4: Fire-and-forget for PostgreSQL logging
Every request — allowed or rejected — gets logged to PostgreSQL for analytics. But I don't await the log call.
```typescript
const result = await checkRateLimit(apiKey, userId, limit, window)

// No await — fire and forget. The .catch keeps a failed write from
// becoming an unhandled promise rejection, which can crash the process.
logRequest({ apiKey, userId, allowed: result.allowed, remaining: result.remaining })
  .catch(err => console.error('[RateLimiter] log write failed', err))

// Response goes out immediately
return res.status(result.allowed ? 200 : 429).json(result)
```
Why? Because the client doesn't care about logging. The decision is already made. If I await the PostgreSQL write, I'm adding ~5ms of latency to every single request — for something the client gets zero value from.
Fire-and-forget: start the operation, send the response immediately, let the log finish in the background.
The tradeoff: if the server crashes in that 5ms window, the log is lost. That's acceptable for analytics data.
The rule: never make clients wait for things they don't care about.

Decision 5: In-memory cache for API key validation
Every request needs to validate the API key against PostgreSQL. But if I hit the database on every single request, I'm adding a DB round-trip to every rate limit check — defeating the purpose of using Redis for speed.
The solution is an in-memory Set:
```typescript
const validKeys = new Set()

export async function authMiddleware(req, res, next) {
  const apiKey = req.headers['x-api-key']

  // Fast path — already verified
  if (validKeys.has(apiKey)) return next()

  // Slow path — first time seeing this key
  const result = await db.query('SELECT id FROM api_keys WHERE key = $1', [apiKey])
  if (result.rows.length === 0) return res.status(401).json({ error: 'Invalid API key' })

  // Cache it for next time
  validKeys.add(apiKey)
  next()
}
```
First request from a key: hits PostgreSQL (~5ms). Every subsequent request: hits the Set (~0.001ms). At scale that's thousands of database queries saved per second.
The Set resets on server restart — which is fine. The DB is the source of truth. This is just a speed layer.
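One caveat worth knowing: a plain Set never forgets, so a key revoked in PostgreSQL stays valid in the cache until the next restart. A TTL cache closes that gap at the cost of an occasional re-check. Here's a sketch (the class `TtlKeyCache` and the injectable clock are my additions, not the SDK's):

```typescript
// Cache entries expire after ttlMs, so a revoked key is re-checked
// against the database within one TTL instead of living until restart.
class TtlKeyCache {
  private entries = new Map<string, number>() // key -> expiry timestamp

  constructor(private ttlMs: number, private now: () => number = Date.now) {}

  has(key: string): boolean {
    const expiry = this.entries.get(key)
    if (expiry === undefined) return false
    if (this.now() >= expiry) {
      this.entries.delete(key) // expired: force a DB re-check
      return false
    }
    return true
  }

  add(key: string): void {
    this.entries.set(key, this.now() + this.ttlMs)
  }
}

// With a fake clock: valid inside the TTL, re-checked after it.
let t = 0
const cache = new TtlKeyCache(60_000, () => t)
cache.add('key_abc')
console.log(cache.has('key_abc')) // true
t = 61_000
console.log(cache.has('key_abc')) // false — must hit PostgreSQL again
```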

Decision 6: Plain React over Next.js for the dashboard
This one is simple but I get asked about it.
The dashboard is an internal analytics tool. It shows request counts, blocked percentages, per-user breakdowns. Nobody is Googling for it. There are no public pages to index.
Next.js is great for server-side rendering and SEO. Neither of those things matter for an internal dashboard that only authenticated users see.
Adding Next.js for this use case is overengineering. Plain React, talking to the Express API, is exactly the right tool.
The principle: use the simplest tool that solves the problem correctly.

Decision 7: 2-second timeout on every SDK call
The SDK calls my server on every limiter.check() call. If my server is slow — maybe it's under load, maybe it's in the middle of a deploy — the SDK should not hang the developer's app indefinitely.
```typescript
const response = await axios.post(serverUrl, options, {
  headers: { 'x-api-key': this.apiKey },
  timeout: 2000 // give up after 2 seconds
})
```
Two seconds is the threshold. After that, the request times out, the catch block runs, and the SDK fails open. The developer's app never hangs waiting for my server.

What I learned
Building this taught me something I didn't expect: the interesting part of backend engineering is almost never the happy path.
Anyone can write the code that works when everything is fine. The decisions that matter are:

What happens when Redis goes down?
What happens when the DB is slow?
What happens when two requests arrive at the same millisecond?
How do you make it fast without making it fragile?

These are the questions that show up in production. Building this project — and contributing to Formbricks and Trigger.dev — forced me to think about all of them.
That's why I built it. Not to add a line to a resume. To actually understand the problems.

Links

Live demo: https://rate-limiter-sdk.vercel.app
GitHub: https://github.com/bharathkumar39293/Rate-Limiter-SDK
My other project (webhook delivery engine): https://web-hook-drop-t4k6.vercel.app

If you're building something similar or have questions about any of these decisions — drop a comment. Happy to dig into it.
