Rate Limiting Wasn't Enough — So I Built an API Gateway with Behavioral Abuse Detection

Macaulay Praise on April 09, 2026

Real rate limiting, Bloom filters, credential stuffing detection, and the bugs that almost broke everything. Live demo included. GitHub: macaula...
 
ArkForge

The entropy-based scraping detection is a solid approach, but Poisson jitter is a known countermeasure - modern credential stuffing frameworks already add randomized delays specifically to defeat low-entropy detection. A more resilient signal: correlate entropy with the 4xx ratio for the same client_id. Your two-dimensional auth failure tracking already collects that data. A legitimate API poller will have low inter-request entropy but near-zero auth failures; a credential stuffer with Poisson jitter will still surface as anomalous once you cross-reference against its failure rate. That combination is significantly harder to defeat without slowing the attack to the point of impracticality.
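
A rough sketch of what that correlation could look like (function names, bucket sizes, and weights below are placeholders, assuming per-client inter-request gaps and auth failure counts are already being collected):

```python
import math
from collections import Counter

def timing_entropy(gaps_ms: list[float], bucket_ms: float = 50.0) -> float:
    """Shannon entropy of inter-request gaps, bucketed to bucket_ms."""
    if len(gaps_ms) < 2:
        return float("inf")  # not enough history to judge; treat as human
    buckets = Counter(round(g / bucket_ms) for g in gaps_ms)
    total = sum(buckets.values())
    return -sum((n / total) * math.log2(n / total) for n in buckets.values())

def abuse_score(gaps_ms: list[float], auth_failures: int, total_requests: int) -> float:
    """Combine the two signals: low entropy alone is weak evidence,
    low entropy plus a high auth failure ratio is strong evidence."""
    entropy = timing_entropy(gaps_ms)
    failure_ratio = auth_failures / max(total_requests, 1)
    entropy_component = max(0.0, 1.0 - entropy / 3.0)  # 3.0 bits as a "human-ish" floor, placeholder
    return 0.4 * entropy_component + 0.6 * failure_ratio  # weights are illustrative

# A legitimate poller: very regular timing but ~0 failures -> low score.
# A jittered credential stuffer: higher entropy but lots of 401s -> still flagged.
```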

Rahul S

Worth noting there's an even nastier evasion than Poisson jitter — replaying captured human session timings. If an attacker records real user inter-request gaps and uses those as their delay distribution, entropy-based detection is basically blind to it. At that point you're down to correlating across signals the attacker can't easily spoof: does the TLS fingerprint match the claimed user-agent, is the source IP geographically consistent across the session, does the request path diversity look like an actual user navigating vs. a script hitting the same endpoint. Single-signal detection is always going to be an arms race; the real leverage is making them defeat multiple orthogonal checks simultaneously.
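
Sketching the shape of that multi-signal check (every field and threshold here is a stand-in for a real signal source, the point is just that the checks stay independent):

```python
from dataclasses import dataclass

@dataclass
class SessionSignals:
    # All fields assumed to be collected elsewhere in the gateway.
    timing_entropy: float        # bits, from inter-request gaps
    auth_failure_ratio: float    # failed auth responses / total requests
    tls_ua_mismatch: bool        # JA3-style fingerprint disagrees with claimed user-agent
    geo_jumps: int               # implausible location changes within one session
    distinct_paths: int          # how many different endpoints the session touched

def suspicion_score(s: SessionSignals) -> float:
    """Each orthogonal signal contributes independently; thresholds and
    weights are illustrative, not tuned values."""
    score = 0.0
    if s.timing_entropy < 1.0:
        score += 1.0
    if s.auth_failure_ratio > 0.3:
        score += 2.0
    if s.tls_ua_mismatch:
        score += 2.0
    if s.geo_jumps > 0:
        score += 1.5
    if s.distinct_paths <= 1:    # script hammering one endpoint
        score += 1.0
    return score  # e.g. challenge above 2.0, block above 3.5
```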

Macaulay Praise

That's the honest ceiling of pure behavioral heuristics — once an attacker is replaying real session distributions, you've lost the timing signal entirely. The multi-signal point is exactly right though; the value isn't any single check but making them defeat orthogonal ones simultaneously. TLS fingerprinting and path diversity are hard to spoof consistently at scale while also staying under auth failure thresholds and passing the Bloom checks. Each layer you add raises the attacker's cost. Appreciate you extending ArkForge's point — this thread has mapped out a real extension roadmap.

Macaulay Praise

Really sharp catch — Poisson jitter is a known blind spot for pure entropy detection and I didn't address it. The cross-correlation idea is clean though, and the data's already there: auth failure counts per client_id exist in Redis from the 2D tracking, so it'd be an additional scoring step rather than new infrastructure. A jittered stuffer still fails logins — combining that failure rate against the entropy score makes it significantly harder to defeat both signals at once without slowing the attack to uselessness. Flagging this as a documented extension, appreciate the depth here.
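
For anyone following along, that extra scoring step could be as small as this (key names are hypothetical, not the gateway's actual Redis schema):

```python
import redis

r = redis.Redis(decode_responses=True)

def jitter_resistant_score(client_id: str, entropy_score: float) -> float:
    """Fold the existing auth-failure counters into the entropy score.
    Key names below are made up for illustration."""
    failures = int(r.get(f"authfail:client:{client_id}") or 0)
    requests = int(r.get(f"requests:client:{client_id}") or 1)
    failure_ratio = failures / max(requests, 1)
    # Entropy alone can be defeated by jitter; the failure ratio can't be
    # hidden without slowing the attack, so weight it higher.
    return 0.3 * entropy_score + 0.7 * min(failure_ratio * 5, 1.0)
```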

BridgeXAPI

Really interesting 😁😁 I’ve seen rate limiting fall apart pretty fast once you deal with more distributed or “low and slow” abuse. Ran into something similar with OTP flows, retries can look totally legit but still mess up delivery at scale. Behavior based detection makes a lot of sense here. What kind of signals worked best for you?

Macaulay Praise

The timing entropy signal handles "low and slow" pretty well — bots that throttle themselves still get caught if their inter-request gaps are too regular (std dev below threshold). The two-dimensional auth failure tracking ended up being the most reliable signal overall though — tracking by IP and by username independently avoids the corporate NAT false positive trap.
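
The low-and-slow check itself is roughly this simple — a minimal sketch of the idea, not the gateway's actual code, with illustrative thresholds:

```python
import statistics

def looks_like_throttled_bot(gap_seconds: list[float],
                             min_samples: int = 10,
                             stddev_threshold: float = 0.25) -> bool:
    """A client that spaces requests almost identically (tiny std dev) gets
    flagged even if its average rate is well under any rate limit."""
    if len(gap_seconds) < min_samples:
        return False  # not enough history to judge
    return statistics.stdev(gap_seconds) < stddev_threshold

# e.g. one request every 30s +/- 50ms for several minutes -> flagged,
# while a human clicking around produces gaps all over the place.
```
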
Your OTP case is a good edge case — retries that look legit individually but create delivery pressure at scale would probably need a third axis tracking failure rate per action type. Hadn't thought about it that way before.

BridgeXAPI

ah that’s actually really interesting, especially the timing entropy part. makes sense that even “slow” bots still look too consistent over time. the IP + username split is smart too, that NAT issue is always annoying 😅

yeah with OTP it gets weird because retries aren’t always failures, but they still put pressure on delivery. tracking per action type sounds like a solid way to catch that without breaking legit flows

Macaulay Praise

Yeah exactly — retries that aren't failures are the tricky part, it's pressure without a clear signal. Tracking delivery rate or time-to-success per action type might be the cleaner handle there rather than failure counts alone. Might revisit that as an extension!
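
If I do pick it up, I'm imagining something roughly like this — an entirely hypothetical shape, just to pin the idea down:

```python
import time
from collections import defaultdict, deque

# Rolling record of (sent_at, delivered_at or None) per action type, e.g. "otp_send".
events: dict[str, deque] = defaultdict(lambda: deque(maxlen=500))

def record(action: str, sent_at: float, delivered_at: float | None) -> None:
    events[action].append((sent_at, delivered_at))

def delivery_health(action: str, window_s: float = 300.0) -> tuple[float, float]:
    """Delivery rate and median time-to-success over the last window_s seconds."""
    now = time.time()
    recent = [(s, d) for s, d in events[action] if now - s <= window_s]
    if not recent:
        return 1.0, 0.0
    latencies = sorted(d - s for s, d in recent if d is not None)
    rate = len(latencies) / len(recent)
    median_latency = latencies[len(latencies) // 2] if latencies else 0.0
    return rate, median_latency
```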

Mykola Kondratiuk

good for learning internals - bloom filters + behavioral detection is useful to understand deeply. but in production: you probably shouldn’t own this surface. Cloudflare/AWS WAF already handle this. building it custom means maintaining a security-critical component indefinitely.

Macaulay Praise

Totally fair — and I'd agree in most production contexts. The goal here was never to compete with WAF products but to understand what they're actually doing under the hood. Cloudflare/AWS WAF are the right answer operationally; being able to reason about why — sliding windows, probabilistic filtering, behavioral signals — is what makes you useful when those tools need tuning, custom rules, or you're working somewhere that can't just throw a WAF in front.
The maintenance burden point is real though, it's a genuine cost most teams shouldn't take on.

Mykola Kondratiuk

yeah fair point - honestly the real payoff is when you're debugging why cloudflare's rules are misfiring in prod. knowing the internals means you're not just clicking through the WAF console hoping something helps

Macaulay Praise

Exactly that — knowing what's underneath means you're debugging with a model, not just guessing at knobs. Appreciate you engaging on it.

Mykola Kondratiuk

Right — that intuition compounds fast. When the decision tree is actually in your head, you stop chasing symptoms and start eliminating causes. The "just adjust the knobs" approach works fine until you hit a production edge case at 2am — then the difference between a 20-minute fix and a 3-hour postmortem is exactly whether you know why the gateway decided what it did.