Mubarak Alhazan for AWS Community Builders

Posted on May 29 • Originally published at poly4.hashnode.dev

Why Rate Limiting Alone Won't Stop OTP Abuse — A Real Incident Breakdown

#waf #captcha

The Attack That Looked Like Normal Traffic

On a perfectly normal Tuesday morning, while the team was settling into its usual flow, we noticed that our SendGrid email delivery was behaving strangely. Emails were bouncing more than usual and open rates were dropping. Then came the signal that made it serious: legitimate users were reporting that our OTP emails weren't showing up in their inboxes. They were landing in spam.

We dug into our SendGrid dashboard, expecting to find something obvious: a misconfigured domain, a broken DKIM record, maybe a sudden spike in bounces from a bad email list. What we found instead was that thousands of OTP emails had gone out to email addresses we didn't recognise. Addresses nobody on our platform had ever signed up with.

Someone was abusing our OTP endpoints.

The OTP endpoints in question are public by design. They exist to let new users sign up via email OTP without needing an existing account. No authentication required. That openness, which is intentional and necessary for the signup flow, is exactly what made them a target.

The attacker had written a script that continuously hit these endpoints with a rotating pool of random email addresses, triggering our system to send OTP emails on their behalf. Our infrastructure was being used as a free email delivery machine, and it was burning our SendGrid reputation in the process.

Most of those emails landed on test carrier gateways and were dropped or bounced. But some hit real inboxes. One recipient received over 15 unsolicited OTP emails from us within two days before finally marking us as spam. Another got two within a single day and did the same. When real people mark you as spam, inbox providers like Gmail and Yahoo take notice, and that's exactly what started pushing our legitimate emails into spam folders.

Why The Obvious Defences Failed

The first instinct when you see an endpoint getting hammered is to reach for rate limiting. We already had it in place: a burst limit of 1 request per minute and a cap of 3 requests per hour, per IP. Reasonable numbers that would stop any single bad actor cold.

Except this wasn't a single bad actor.

When each of 200+ proxy IPs sends exactly one request and then steps aside, rate limiting becomes completely blind to the attack. From the rate limiter's perspective, it's just seeing 200 different users each making one perfectly normal request. No threshold crossed. No alarm triggered. Just 200 OTP emails going out the door, one after another, all looking entirely legitimate.
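
To make the blind spot concrete, here is a minimal sketch of a per-IP sliding-window limiter, written for illustration only. The class name, the `simulate` helper, and the documentation-range IP addresses are all assumptions and not our production limiter. It compares one noisy IP against 200 rotating IPs under the 3-requests-per-hour limit described above.

```python
from collections import defaultdict, deque


class PerIpRateLimiter:
    """Sliding-window limiter keyed by client IP (illustrative only)."""

    def __init__(self, max_requests: int, window_seconds: float) -> None:
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self._hits = defaultdict(deque)  # ip -> timestamps of allowed requests

    def allow(self, ip: str, now: float) -> bool:
        hits = self._hits[ip]
        # Drop timestamps that have fallen out of the window.
        while hits and now - hits[0] >= self.window_seconds:
            hits.popleft()
        if len(hits) >= self.max_requests:
            return False
        hits.append(now)
        return True


def simulate() -> tuple[int, int]:
    """Return (requests allowed from one noisy IP, requests allowed from 200 rotating IPs)."""
    # One IP sending 200 requests, one per second.
    single = PerIpRateLimiter(max_requests=3, window_seconds=3600)
    single_allowed = sum(single.allow("203.0.113.7", float(t)) for t in range(200))

    # 200 different IPs (placeholder addresses), one request each.
    rotating = PerIpRateLimiter(max_requests=3, window_seconds=3600)
    proxy_ips = [f"198.51.100.{i + 1}" for i in range(200)]
    rotating_allowed = sum(
        rotating.allow(ip, float(t)) for t, ip in enumerate(proxy_ips)
    )
    return single_allowed, rotating_allowed
```

Under these assumptions `simulate()` returns `(3, 200)`: the single IP is cut off after three requests, while every one of the 200 rotating requests passes, because each IP has its own empty counter.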

That's the thing about residential proxy networks that makes them so effective against traditional defences: they don't look like bots. These aren't datacenter IPs that show up on blocklists. They're real consumer devices on real ISP connections. When you look at those IPs in your logs, they look like your actual users.

We also tried blocking known test-carrier gateways, the kind of infrastructure attackers typically use to absorb bulk emails. That helped reduce some of the noise, but it didn't stop the attack. The attacker's email pool was wide enough that plenty of addresses on real domains were still getting through, and those were the ones reaching actual inboxes and generating spam reports.

Blocking the individual IPs wasn't a real option either. By the time you identify and block one, the proxy network has already rotated to the next. You'd need to block the entire residential IP ranges of major ISPs, which would mean blocking your actual users in the process.

The root problem with all of these approaches is that they're built around a different threat model. They assume abuse looks like volume from a single source. This attack was the opposite: low-volume and distributed across hundreds of sources. No single data point in our logs was alarming. Only the aggregate told the story, and by the time the aggregate was obvious, the damage was already done. The question now was how to stop it, and stop it fast.
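
One way to watch the aggregate instead of individual sources is to count OTP sends across all IPs in a single sliding window. The sketch below is a hypothetical illustration, not something we had running at the time; the class name and both the window size and the threshold are assumed values that would need tuning against real signup traffic.

```python
from collections import deque


class AggregateSendMonitor:
    """Counts OTP sends across all sources in one sliding window."""

    def __init__(self, window_seconds: float, alert_threshold: int) -> None:
        self.window_seconds = window_seconds
        self.alert_threshold = alert_threshold
        self._sends = deque()  # timestamps of recent sends, oldest first

    def record_send(self, now: float) -> bool:
        """Record one OTP send; return True if the window total exceeds the threshold."""
        self._sends.append(now)
        # Evict sends that are older than the window.
        while self._sends and now - self._sends[0] >= self.window_seconds:
            self._sends.popleft()
        return len(self._sends) > self.alert_threshold
```

With a 10-minute window and a threshold of 50, a run of sends one second apart trips the alert on the 51st send, no matter how many different IPs they came from.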

We needed something that could answer a simpler question: Is there a real human on the other end of this request? Rate limiting, IP blocking, and gateway filtering can't answer that question. Only one thing can reliably answer that.

Stopping the Bleeding

Before we could build a proper fix, we needed to stop the bleeding first, because every spam report that landed while we were still investigating made our reputation harder to recover. So we made an uncomfortable but necessary call: fully block both OTP signup endpoints immediately, knowing it would lock out legitimate users trying to sign up via OTP alongside the attacker.

The blast radius was manageable. Other signup methods were still available, and anyone hitting the blocked endpoints got a generic message directing them to support. Not ideal, but acceptable for the short window we needed.

The block bought us the breathing room to focus on the real solution that could tell the difference between a real user and an attack script.

The Permanent Fix: AWS WAF CAPTCHA

The core question we needed to answer was "Is there a real human on the other end of this request?" Our rate limiting, IP blocking, and gateway filtering couldn't answer that. A well-established industry solution that can is CAPTCHA.

A script can rotate IPs, generate random emails, and fire requests all day, but it can't solve a CAPTCHA challenge. That was exactly what we needed.

We implemented the fix at the AWS WAF level, sitting in front of the load balancer. Requests get challenged before they ever reach the backend, which means no OTP logic runs, no email gets triggered, and SendGrid never sees the request at all.

We set up two rules working together:

The first was a rate-based block rule: any single IP exceeding a set threshold within a 10-minute window gets blocked outright. This handles the unsophisticated case where someone is just hammering the endpoint from one place.

The second, and more important one, was an always-CAPTCHA rule: every single request to the OTP endpoints gets challenged with a CAPTCHA, regardless of where it's coming from or how many requests that IP has made. This is the rule that actually kills this specific attack. It doesn't matter that the attacker is rotating through 200+ IPs, sending one request each. None of them can solve a CAPTCHA.

The two rules complement each other: the rate-based rule stops high-volume abuse from a single IP, and the CAPTCHA rule stops low-volume abuse spread across many IPs.

One thing worth thinking about when implementing CAPTCHA on a high-traffic endpoint is user experience. You don't want legitimate users solving a CAPTCHA every time they request an OTP. We handled this with a 300-second immunity window: once a user solves the CAPTCHA, they're not challenged again for the next 5 minutes. This immunity time is tracked using WAF Token cookies.
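
For readers who want to see roughly what such a pair of rules looks like, here is a sketch of the two rule definitions as Python dictionaries, shaped like the `Rules` entries that the WAFv2 API accepts (for example via boto3's `wafv2` client). The path prefix, rule names, metric names, and the rate limit value are all placeholders we made up for illustration; only the 10-minute window and the 300-second immunity time come from the setup described above.

```python
OTP_PATH_PREFIX = "/auth/otp"  # hypothetical path, not the real endpoint
RATE_LIMIT_PER_IP = 100  # placeholder value; pick your own threshold


def otp_path_statement() -> dict:
    """Match any request whose URI path starts with the OTP prefix."""
    return {
        "ByteMatchStatement": {
            "SearchString": OTP_PATH_PREFIX.encode(),
            "FieldToMatch": {"UriPath": {}},
            "TextTransformations": [{"Priority": 0, "Type": "NONE"}],
            "PositionalConstraint": "STARTS_WITH",
        }
    }


def visibility(metric_name: str) -> dict:
    """Standard metrics and sampling settings for a rule."""
    return {
        "SampledRequestsEnabled": True,
        "CloudWatchMetricsEnabled": True,
        "MetricName": metric_name,
    }


def build_otp_rules() -> list[dict]:
    # Rule 1: block any single IP that exceeds the limit in a 10-minute window.
    rate_rule = {
        "Name": "otp-rate-block",
        "Priority": 0,
        "Statement": {
            "RateBasedStatement": {
                "Limit": RATE_LIMIT_PER_IP,
                "EvaluationWindowSec": 600,
                "AggregateKeyType": "IP",
                "ScopeDownStatement": otp_path_statement(),
            }
        },
        "Action": {"Block": {}},
        "VisibilityConfig": visibility("otp-rate-block"),
    }
    # Rule 2: challenge every OTP request with a CAPTCHA; a solved
    # challenge is honoured for 300 seconds via the WAF token.
    captcha_rule = {
        "Name": "otp-always-captcha",
        "Priority": 1,
        "Statement": otp_path_statement(),
        "Action": {"Captcha": {}},
        "CaptchaConfig": {"ImmunityTimeProperty": {"ImmunityTime": 300}},
        "VisibilityConfig": visibility("otp-always-captcha"),
    }
    return [rate_rule, captcha_rule]
```

The rate rule gets the lower priority number so it is evaluated first: an IP that is already over the limit is blocked outright rather than being offered a challenge. How the challenge is presented to the client (an interstitial page versus a frontend integration) is outside the scope of this sketch.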

After deploying the changes to production, automated requests stopped reaching the backend. We could also see from the WAF dashboard that the automated requests were failing CAPTCHA challenges.

There was one wrinkle, though: the AWS WAF CAPTCHA challenge itself. It uses the classic image-based approach: select all the traffic lights, pick the bicycles, identify the buses. It works, but it's not a great experience for users who are just trying to sign up. That friction pushed us to evaluate alternatives, and we ended up switching to Cloudflare Turnstile.

Why We Switched to Cloudflare Turnstile

AWS WAF CAPTCHA solved the security problem, but the user experience it delivered wasn't something we were comfortable shipping long-term. Image-based challenges add real friction to what should be a simple signup flow. First impressions matter, and asking a new user to solve a puzzle before they can even get their OTP is not the experience we wanted.

So we switched to Cloudflare Turnstile.

Turnstile takes a fundamentally different approach. Instead of asking users to prove they're human by identifying objects in blurry images, it runs a set of browser-based signals in the background: things like how the page was loaded, JavaScript execution patterns, and other non-invasive checks. In most cases, the user sees nothing at all. They click the signup button, the check happens invisibly, and the OTP request goes through. Only when Turnstile is genuinely uncertain does it surface a checkbox challenge, and even then, it's far less disruptive than a grid of traffic light images.

The implementation sits in the same place as before, in front of the OTP endpoints, intercepting requests before they reach the OTP logic. The main difference is that instead of relying on the WAF to serve the challenge, the Turnstile widget lives on the frontend and generates a token that gets verified server-side before any OTP logic runs.
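
The server-side half of that flow can be sketched as follows. This is a simplified illustration rather than our actual backend: `handle_otp_request`, `send_otp`, and the returned status codes are hypothetical names and choices, and the HTTP call is injectable so the logic can be exercised without network access. The verification URL and the `secret`, `response`, `remoteip`, and `success` fields follow Cloudflare's siteverify API.

```python
import json
import urllib.parse
import urllib.request
from typing import Callable, Optional

SITEVERIFY_URL = "https://challenges.cloudflare.com/turnstile/v0/siteverify"


def post_form(url: str, fields: dict) -> dict:
    """POST form-encoded fields and decode the JSON response."""
    data = urllib.parse.urlencode(fields).encode()
    request = urllib.request.Request(url, data=data, method="POST")
    with urllib.request.urlopen(request, timeout=5) as response:
        return json.load(response)


def turnstile_token_is_valid(
    token: str,
    secret_key: str,
    remote_ip: Optional[str] = None,
    http_post: Callable[[str, dict], dict] = post_form,
) -> bool:
    """Ask Turnstile whether the token from the frontend widget is valid."""
    if not token:
        return False
    fields = {"secret": secret_key, "response": token}
    if remote_ip:
        fields["remoteip"] = remote_ip
    try:
        result = http_post(SITEVERIFY_URL, fields)
    except (OSError, ValueError):
        return False  # fail closed: no verification, no OTP
    return result.get("success") is True


def handle_otp_request(
    email: str,
    token: str,
    secret_key: str,
    send_otp: Callable[[str], None],
    http_post: Callable[[str, dict], dict] = post_form,
) -> int:
    """Return an HTTP status; the OTP is sent only after the token verifies."""
    if not turnstile_token_is_valid(token, secret_key, http_post=http_post):
        return 403
    send_otp(email)
    return 202
```

The important property is ordering: the token check runs before `send_otp`, so a request without a valid token never triggers an email. Failing closed on network or parsing errors is a design choice in this sketch; it trades availability during a Turnstile outage for never sending unverified OTPs.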

If you're evaluating CAPTCHA options for a similar setup, the decision mostly comes down to this: AWS WAF CAPTCHA is convenient if you're already in the AWS ecosystem and want everything managed in one place, but Turnstile is the better choice if user experience is a priority, and for a signup flow, it almost always should be.

What We'd Do Differently From Day One

The honest answer is that this incident was preventable. CAPTCHA solutions were available before the incident, but we didn't think enough about how these endpoints could be abused, only about how they were meant to be used.

That's the mindset shift worth taking away from this.

Any public endpoint that triggers an external action needs abuse protection from day one. OTP emails, password reset emails, SMS codes, notification triggers. If an unauthenticated request can cause your infrastructure to do something on behalf of an attacker, that endpoint is a target. For us, the consequence was a degraded sender reputation. But depending on your setup, the same attack pattern could quietly rack up significant bills. AWS SES, Twilio, and similar pay-per-use services will charge you for every single message an attacker tricks your system into sending. The vulnerability is the same regardless of the provider.
Rate limiting is necessary but not sufficient. It should be the floor, not the ceiling. If rate limiting is your only line of defence on a public endpoint that sends emails, you're one residential proxy network away from this same situation. Layer it with CAPTCHA, and choose a CAPTCHA solution that balances security with user experience.
Monitor your email reputation proactively. We only noticed the problem when users started complaining. By then, the damage was already accumulating. Setting up daily delivery reports and reputation alerts from your email provider takes little time and gives you early warning before a bad situation becomes a crisis.
Think about blast radius when you design public flows. Because we had alternative signup methods available, blocking the affected endpoints bought us time without completely breaking the product. If those endpoints had been the only way into the platform, the trade-off would have been much harder. Designing with that kind of redundancy gives you options when something goes wrong.
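
As an illustration of the proactive monitoring point above, a daily check over delivery statistics can be very small. The sketch below assumes you can already pull daily counts from your email provider's reporting; the `DailyEmailStats` type, the function name, and both default thresholds are hypothetical and should be set from your own baseline.

```python
from typing import NamedTuple


class DailyEmailStats(NamedTuple):
    delivered: int
    bounced: int
    spam_reports: int


def reputation_alerts(
    stats: DailyEmailStats,
    max_bounce_rate: float = 0.05,  # assumed threshold, tune to your baseline
    max_spam_rate: float = 0.001,  # assumed threshold, tune to your baseline
) -> list[str]:
    """Return human-readable alerts for a day's delivery statistics."""
    attempted = stats.delivered + stats.bounced
    if attempted == 0:
        return []
    alerts = []
    bounce_rate = stats.bounced / attempted
    # Spam reports can only come from mail that was actually delivered.
    spam_rate = stats.spam_reports / stats.delivered if stats.delivered else 0.0
    if bounce_rate > max_bounce_rate:
        alerts.append(f"bounce rate {bounce_rate:.2%} exceeds {max_bounce_rate:.2%}")
    if spam_rate > max_spam_rate:
        alerts.append(f"spam rate {spam_rate:.2%} exceeds {max_spam_rate:.2%}")
    return alerts
```

Run once a day against the previous day's numbers, a check like this surfaces a rising bounce or complaint rate before users have to report missing emails.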

Finally, this incident prompted us to audit every other public endpoint that triggers an email. That audit should have happened at build time. It's now on the roadmap as a recurring security review, not a one-time fix.

Closing Thoughts

Incidents like this one don't make the highlight reel. There's no dramatic zero-day, no sophisticated exploit, no novel attack vector. It's a script, a proxy network, and an endpoint that was never designed with abuse in mind. And that's what makes it so easy to overlook.

The technical fix here wasn't complex: a couple of WAF rules and a CAPTCHA integration. What required experience was recognising why the obvious defences were failing, making the uncomfortable call to block legitimate users to stop the bleeding, and thinking through the layered approach that would hold up beyond this specific attack.

Security exploits are only getting more sophisticated and more frequent. As you build, make it a habit to ask not just how the feature is meant to be used, but how it can be abused.

If you're building anything with public-facing endpoints that trigger emails or external services, treat abuse protection as a first-class requirement, not an afterthought. Your sender reputation, your cloud bill, and your users will thank you for it.
