Most idempotency implementations in NestJS apps look the same: hash the key, check Redis, return cached response or proceed. 40 lines, done.
I'm building a booking platform with NestJS - concurrent mutations against shared resources, money involved. While writing the idempotency layer I kept finding edge cases that the simple version doesn't handle. There are at least five, and most of them are exploitable.
## The version everyone ships
```typescript
import { CallHandler, ExecutionContext, Injectable, NestInterceptor } from '@nestjs/common';
import { lastValueFrom, of } from 'rxjs';
import Redis from 'ioredis';

@Injectable()
export class IdempotencyInterceptor implements NestInterceptor {
  constructor(private readonly redis: Redis) {}

  async intercept(context: ExecutionContext, next: CallHandler) {
    const req = context.switchToHttp().getRequest();
    const key = req.headers['idempotency-key'];
    if (!key) return next.handle();

    const cached = await this.redis.get(`idem:${key}`);
    if (cached) return of(JSON.parse(cached));

    const response = await lastValueFrom(next.handle());
    await this.redis.set(`idem:${key}`, JSON.stringify(response), 'EX', 86400);
    return of(response);
  }
}
```
It has at least five holes.
## Keys aren't scoped
The raw idempotency key goes straight into Redis. So:
- User A sends `Idempotency-Key: abc123` to `POST /orders`
- User B sends `Idempotency-Key: abc123` to `POST /payments`
- User B gets User A's cached order response
Unlikely? Sure. But "unlikely" in security means "will happen at scale, probably during an incident when you're already stressed."
Keys need to be scoped to the actor and the endpoint:
```typescript
function buildRedisKey(actorId: string, method: string, path: string, idempotencyKey: string): string {
  const keyHash = createHash('sha256').update(idempotencyKey).digest('hex');
  return `idempotency:${actorId}:${method}:${path}:${keyHash}`;
}
```
The idempotency key gets hashed before going into the Redis key. Not for secrecy - it prevents injection of the Redis key separator (`:`) via user-supplied values and normalizes key length.
## Payload switching
User sends `Idempotency-Key: order-1` with `{ item: "book", quantity: 1 }`. Gets a 201. Sends the same key with `{ item: "laptop", quantity: 100 }`. The naive version replays the cached 201. The user sees "order confirmed" for a laptop that was never actually ordered.
The scarier version: an attacker probes with a throwaway payload, waits for the cache to expire, then replays the key with an expensive one.
Hash the body, reject mismatches.
```typescript
// Recursively sort object keys so the hash is insensitive to key order.
// (A JSON.stringify replacer array looks tempting, but it silently drops
// nested keys that don't also appear at the top level.)
function stableStringify(value: unknown): string {
  if (value === null || typeof value !== 'object') return JSON.stringify(value ?? null);
  if (Array.isArray(value)) return `[${value.map(stableStringify).join(',')}]`;
  const obj = value as Record<string, unknown>;
  return `{${Object.keys(obj)
    .sort()
    .map((k) => `${JSON.stringify(k)}:${stableStringify(obj[k])}`)
    .join(',')}}`;
}

function hashBody(body: unknown): string {
  return createHash('sha256').update(stableStringify(body)).digest('hex');
}

// On cache hit:
const cached = JSON.parse(await redis.get(redisKey));
if (cached.bodyHash !== hashBody(req.body)) {
  // Extend the TTL - don't let them wait it out
  await redis.expire(redisKey, 86400);
  throw new ConflictException({
    code: 'IDEMPOTENCY_CONFLICT',
    message: 'This idempotency key was already used with a different request body.',
  });
}
```
The TTL extension on conflict matters. Without it, the attacker just waits for the entry to expire and tries again.
## The concurrent first-use race
Two identical requests arrive within the same millisecond. Both check Redis. Both miss. Both execute the mutation. Double booking.
Your database constraints might save you (unique indexes, optimistic locking), but "might" isn't a word I want near payment processing.
```typescript
const lockKey = `${redisKey}:lock`;
const acquired = await redis.set(lockKey, '1', 'EX', 5, 'NX');

if (!acquired) {
  await sleep(100);
  const cached = await redis.get(redisKey);
  if (cached) return replay(cached);
  // First request still running or crashed. Let this one through -
  // the DB-level constraints are the final safety net.
}
```
`SET ... NX` is atomic. Only one request wins. The loser waits 100ms, checks whether the winner cached a result, and either replays it or proceeds (because the winner might have crashed, and the 5-second lock TTL ensures we don't deadlock).
This is defense in depth: the idempotency layer catches the common case, and the database catches the rest. Neither alone is enough.
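The single 100ms sleep above is a simplification. A short polling loop gives the winner more time to publish its result before the loser falls through. This is a sketch, not the article's implementation - `waitForWinner` and the structurally-typed Redis client are assumptions:

```typescript
const sleep = (ms: number) => new Promise<void>((resolve) => setTimeout(resolve, ms));

// Poll the cache a few times before giving up. Returns the winner's cached
// entry, or null if the winner is still running (or crashed) - in which case
// the caller lets the request through and relies on DB constraints.
async function waitForWinner(
  redis: { get(key: string): Promise<string | null> },
  redisKey: string,
  attempts = 5,
  intervalMs = 100,
): Promise<string | null> {
  for (let i = 0; i < attempts; i++) {
    await sleep(intervalMs);
    const cached = await redis.get(redisKey);
    if (cached) return cached;
  }
  return null;
}
```

Total wait stays bounded (500ms here), so a crashed winner can't stall the loser past the lock TTL.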
## Not all errors should be cached
The naive version caches every response for 24 hours. Including 503s, 429s, and 408s.
So your server has a brief Redis hiccup, returns a 503 to one request, and now every retry of that idempotency key for the next 24 hours replays "Service Unavailable." The user literally cannot complete their action.
Different status codes need different treatment:
```typescript
function getTtlForStatus(statusCode: number): number | null {
  if (statusCode >= 200 && statusCode < 300) return 14400; // 4h - it worked, cache it
  if (statusCode >= 400 && statusCode < 500) {
    if (statusCode === 409) return 2; // Brief dampener, not a wall
    if ([429, 408, 423].includes(statusCode)) return null; // Retryable, don't cache
    return 14400; // 400, 422 etc - retrying won't fix bad input
  }
  if (statusCode >= 500) {
    if (statusCode === 503) return null; // "Try again" means let them try again
    return 10; // Brief thundering-herd protection
  }
  return null;
}
```
The 409 with a 2-second TTL is worth explaining: when two requests race and the loser gets a DB conflict, you want a tiny dampener to absorb the immediate retry storm, but nothing longer. The client needs to be able to recover.
## Success preservation
This one is subtle.
Request A and Request B arrive simultaneously, same key, same body. A gets the lock, runs the mutation, gets a 201, and starts writing to Redis. B's lock attempt timed out, so it ran anyway, got a 409 from the database (unique constraint), and wrote its result to Redis.
If B's write lands after A's, the cache now holds a 409 for a mutation that succeeded. Every future replay tells the client their operation failed. They retry with a new idempotency key, and now you have a duplicate.
The rule is simple: never let an error overwrite a success.
```typescript
async function cacheResponse(redisKey: string, statusCode: number, body: unknown, bodyHash: string) {
  const ttl = getTtlForStatus(statusCode);
  if (ttl === null) return;

  const isSuccess = statusCode >= 200 && statusCode < 300;
  if (!isSuccess) {
    const existing = await redis.get(redisKey);
    if (existing) {
      const parsed = JSON.parse(existing);
      if (parsed.statusCode >= 200 && parsed.statusCode < 300) {
        return; // A 201 is already cached. Don't touch it.
      }
    }
  }

  await redis.set(redisKey, JSON.stringify({ statusCode, body, bodyHash }), 'EX', ttl);
}
```
## Cardinality attacks
Someone writes a script that sends a million requests with unique idempotency keys. The mutations all fail validation, but each one creates a Redis cache entry that sits there for 4 hours. Your Redis instance fills up.
Per-actor quota, bucketed by time window:
```typescript
async function checkQuota(actorId: string): Promise<boolean> {
  const bucket = Math.floor(Date.now() / 60000);
  const quotaKey = `idem-quota:${actorId}:${bucket}`;
  const count = await redis.incr(quotaKey);
  if (count === 1) {
    await redis.expire(quotaKey, 120); // 2x window for clock skew
  }
  return count <= 30;
}
```
`INCR` + `EXPIRE` instead of `SETEX` because `INCR` atomically creates the key if it doesn't exist and increments it. `SETEX` would reset the counter on every call. The 120-second TTL (double the 60-second window) ensures the key outlives its bucket.
30 unique keys per minute per user is generous for legitimate use and devastating for an attacker trying to fill your Redis.
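One caveat with the two-step `INCR` + `EXPIRE`: if the process dies between the two calls, the counter key never expires. A small Lua script makes the pair atomic. This is a sketch - `checkQuotaAtomic` is a made-up name, not part of the article's implementation:

```typescript
// INCR and EXPIRE run as one atomic script on the Redis server,
// so a client crash can't leave an immortal counter key behind.
const QUOTA_SCRIPT = `
local count = redis.call('INCR', KEYS[1])
if count == 1 then
  redis.call('EXPIRE', KEYS[1], ARGV[1])
end
return count
`;

async function checkQuotaAtomic(
  redis: { eval(script: string, numKeys: number, ...args: (string | number)[]): Promise<unknown> },
  actorId: string,
): Promise<boolean> {
  const bucket = Math.floor(Date.now() / 60000);
  const count = (await redis.eval(QUOTA_SCRIPT, 1, `idem-quota:${actorId}:${bucket}`, 120)) as number;
  return count <= 30;
}
```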
## A few more things worth getting right
**Skip unauthenticated requests.** Without an actor ID you can't scope keys. An "anonymous" bucket keyed by IP sounds reasonable until you remember that anyone behind the same NAT shares an IP - one user's cached response replays for another. Just skip idempotency for unauthenticated endpoints.
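In interceptor terms, that's an early return. This sketch assumes an upstream auth guard populates `req.user` - the shape is an assumption, not the article's code:

```typescript
// No authenticated actor means no safe key scoping, so bypass idempotency.
function resolveActorId(req: { user?: { id?: string } }): string | null {
  return req.user?.id ?? null;
}

// In the interceptor, before touching Redis:
//   const actorId = resolveActorId(req);
//   if (!actorId) return next.handle();
```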
**Hash multipart uploads carefully.** A 50MB file upload with the same idempotency key is almost certainly a retry. But hashing 50MB on every request is expensive. Hash only the non-file fields.
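A sketch of that idea, assuming multer-style file objects with `originalname` and `size` (the names and shapes are assumptions). Cheap file metadata still goes into the hash, so swapping the file for a different one changes the fingerprint without reading a single byte of content:

```typescript
import { createHash } from 'crypto';

// Hash scalar form fields plus file metadata (name + size), never file bytes.
function hashMultipart(
  fields: Record<string, unknown>,
  files: Array<{ originalname: string; size: number }>,
): string {
  const sortedFields = Object.keys(fields)
    .sort()
    .map((k) => [k, fields[k]]);
  const fileMeta = files.map((f) => ({ name: f.originalname, size: f.size }));
  return createHash('sha256')
    .update(JSON.stringify({ sortedFields, fileMeta }))
    .digest('hex');
}
```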
**Rate limiter integration.** If a request is a cache hit (idempotent replay), don't count it against rate limits. The user isn't making a new request. They're recovering from a dropped connection. Penalizing retries is punishing correct client behavior.
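One way to wire this up - a sketch with a made-up name, since the real check depends on your limiter. A replay is identified by an existing cache entry for the scoped key:

```typescript
// Replays recover a dropped response; they shouldn't spend a rate-limit token.
async function shouldCountAgainstRateLimit(
  redis: { exists(key: string): Promise<number> },
  redisKey: string,
): Promise<boolean> {
  const isReplay = (await redis.exists(redisKey)) === 1;
  return !isReplay;
}
```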
**Redis going down.** If your idempotency layer throws 500s when Redis is unavailable, a Redis blip takes down the entire API. Skip idempotency and let requests through instead. You lose duplicate protection for a few seconds. You don't lose your entire service.
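Concretely, wrap every Redis read so a failure degrades to a cache miss. A minimal sketch (`safeGet` is a hypothetical helper):

```typescript
// Fail open: if Redis is down, treat it as a cache miss and let the request
// proceed without idempotency protection, rather than returning a 500.
async function safeGet(
  redis: { get(key: string): Promise<string | null> },
  key: string,
): Promise<string | null> {
  try {
    return await redis.get(key);
  } catch {
    // log the error for visibility, but availability matters more
    return null;
  }
}
```

The write path gets the same treatment: a failed cache write is logged and swallowed, never surfaced to the client.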
The full implementation is around 500 lines. Most of it is boring, defensive code - which is probably why so many production systems are still running the 40-line version.