Kishan Savaliya

Posted on • Originally published at hire-best-magento-hyva-developer.hashnode.dev

How 77% of a Magento Store's Traffic Turned Out to Be Bots — and the nginx Fix That Stopped It

A store owner pinged me with a worrying screenshot: Google Analytics showed a sudden spike of "active users," almost all from a single country. Their
first guess was a viral moment. It wasn't. It was a bot flood, and once I pulled the server logs, the numbers were brutal: 77% of all incoming
requests were automated traffic hammering the site.

Here's exactly how I diagnosed it and shut it down, with the config you can reuse.

## Step 1: Read the actual logs, not the dashboard

Analytics dashboards lie to you about bots, because many bots execute JavaScript and show up as real "users." The truth is in the web server logs. I
pulled the last 15 minutes of nginx access logs and aggregated by client IP and user-agent.
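
A minimal sketch of that aggregation, assuming nginx's default "combined" log format and that the first field already holds the real client IP (for example because the log format records it). The function name `top_clients` is my own; adjust the pattern if your `log_format` differs.

```python
import re
from collections import Counter

# Assumption: default "combined" log format, client IP in the first field.
LINE_RE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) \S+ '
    r'"[^"]*" "(?P<ua>[^"]*)"'
)

def top_clients(lines, n=10):
    """Return the n busiest (ip, user_agent, count, share) tuples."""
    counts = Counter()
    total = 0
    for line in lines:
        m = LINE_RE.match(line)
        if not m:
            continue  # skip lines that do not fit the assumed format
        counts[(m["ip"], m["ua"])] += 1
        total += 1
    return [(ip, ua, c, c / total) for (ip, ua), c in counts.most_common(n)]
```

Feed it any iterable of log lines, such as an open file handle on a recent slice of the access log. A single IP or range holding a large share with constantly changing user-agents is the signal to look for.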

The pattern was instantly obvious:

  • ~77% of requests came from one cloud-hosting IP range (a data-center, not real humans), rotating fake Chrome user-agents — including Chrome version numbers that don't exist yet. That's a dead giveaway.
  • The rest were crawlers stuck in infinite pagination crawl-traps like /blog/tag/x/page/11410 and ?p=6095 — URLs that should never have been generated.
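
The "Chrome versions that don't exist yet" check is easy to automate. This is only a sketch: `latest_major` is a value you supply yourself (the newest Chrome major release you know of), and the helper name is hypothetical.

```python
import re

CHROME_RE = re.compile(r"Chrome/(\d+)\.")

def implausible_chrome(user_agent, latest_major):
    """True if the UA claims a Chrome major version newer than latest_major."""
    m = CHROME_RE.search(user_agent)
    return bool(m) and int(m.group(1)) > latest_major
```

A user-agent that claims a release far ahead of the real one is almost certainly generated by a script rather than sent by a browser.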

## Step 2: Understand why a firewall won't help (the part people get wrong)

The owner's instinct was "just block the IP with a firewall." But the site sits behind Cloudflare, and that changes everything:

When you're behind a CDN/proxy, your origin server only sees the CDN's IP addresses at the network layer. The real visitor IP lives in the
X-Forwarded-For HTTP header — which iptables (a layer-3/4 firewall) cannot read.

So iptables -A INPUT -s <botIP> -j DROP blocks nothing: every packet that reaches the origin carries a Cloudflare source address, so the rule never
matches. The block has to happen where the real IP is visible: at the CDN, or in nginx, which can parse the header. nginx is also a good place
because a 403 there is served immediately, before the heavy application (Magento/PHP) ever boots.
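
To make the layer difference concrete, here is a small sketch of what "parsing the header" means. The function names and the blocked range (a reserved documentation range, not the real attacker's) are placeholders of mine.

```python
import ipaddress

def client_ip_from_xff(header_value):
    """Return the first (left-most) address in X-Forwarded-For, or None."""
    first = header_value.split(",")[0].strip()
    try:
        return ipaddress.ip_address(first)
    except ValueError:
        return None  # empty or malformed header

def is_blocked(header_value, blocked_networks):
    """True if the left-most X-Forwarded-For address is in a blocked network."""
    ip = client_ip_from_xff(header_value)
    return ip is not None and any(ip in net for net in blocked_networks)

# Placeholder range reserved for documentation, not a real offender.
BLOCKED = [ipaddress.ip_network("203.0.113.0/24")]
```

One caveat: X-Forwarded-For is an ordinary request header, so a client can send its own value. It works well for blunt flood control like this, but I would not rely on it alone for anything security-critical.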

## Step 3: The nginx rules

Two map blocks (HTTP context) read the real client IP from X-Forwarded-For and flag bad traffic:

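  # Real client IP behind a CDN is the FIRST token of X-Forwarded-For.
  # Add one entry per abusive address or range, anchored with ^ so it
  # matches that first token, e.g. (placeholder documentation range):
  #   "~^203\.0\.113\."           1;
  map $http_x_forwarded_for $bad_ip {
      default                       0;
  }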

  # Non-compliant / spoofed crawlers
  map $http_user_agent $bad_ua {
      default                       0;
      "~*(Bytespider|PetalBot|MJ12bot|DotBot)"  1;
  }   

Then, inside the server block, drop them before PHP and kill the pagination crawl-trap:

  if ($bad_ip) { return 403; }
  if ($bad_ua) { return 403; }

  # No listing has thousands of pages — anything past page 9 is junk.
  location ~ "/page/[1-9][0-9]+/?$" { return 410; }
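
Before shipping a regex like the pagination one, it helps to try it against real paths. A quick sketch in Python; for an expression this simple, Python's `re` behaves the same as the regex engine nginx uses. The helper name is mine.

```python
import re

# Same expression as the location rule: /page/ followed by two or more digits.
PAGE_TRAP = re.compile(r"/page/[1-9][0-9]+/?$")

def is_crawl_trap(path):
    """True if the path would be caught by the pagination rule."""
    return PAGE_TRAP.search(path) is not None

for path in ["/blog/page/2", "/blog/page/9/", "/blog/page/10", "/blog/tag/x/page/11410"]:
    print(path, is_crawl_trap(path))
```

Pages 1 to 9 pass through and everything from page 10 up is caught. Note that a location only matches the URI path, so the ?p=6095 style of trap, which lives in the query string, is not covered by this rule and needs its own check.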

A few things worth calling out:

  • 410 Gone beats 404 for crawl-traps — it tells well-behaved crawlers (Googlebot, GPTBot) to permanently drop the URL.
  • Don't block the good bots. I left Googlebot and ClaudeBot untouched (the site's robots.txt allows them) and only blocked the spoofed/abusive ones.
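  • Validate before reloading. Always run nginx -t first, then nginx -s reload (zero downtime). nginx reads an unquoted { in a location regex such as {2,} as the start of a block, so the config fails to parse, and a restart with a broken config takes your whole site down.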
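
If you deploy config changes from a script, the "validate, then reload" rule can be enforced in code. A minimal sketch, assuming `nginx` is on the PATH and the script has permission to signal it; `safe_reload` and its `run` parameter are my own names, with `run` injectable so the logic can be tested without a real nginx.

```python
import subprocess

def safe_reload(run=subprocess.run):
    """Reload nginx only if `nginx -t` passes. Returns True if reloaded."""
    check = run(["nginx", "-t"], capture_output=True, text=True)
    if check.returncode != 0:
        # nginx -t reports the failing file and line on stderr
        print(check.stderr)
        return False
    reload_result = run(["nginx", "-s", "reload"], capture_output=True, text=True)
    return reload_result.returncode == 0
```

The point of the design is that the reload command is never issued when the config test fails, so the running workers keep serving with the last good config.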

## Step 4: The result

After the reload, I sampled live traffic across a 10-minute window:

  • ~95% of requests were now served as cheap 403/410 responses by nginx, never reaching Magento.
  • PHP/database load from the flood dropped to zero — the server breathed again.
  • Real users, Googlebot, and legitimate crawlers were completely unaffected.
  • Within minutes the attacker's volume began to taper (bots back off once they keep hitting walls).
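
One way to take that kind of sample is to count status codes over a recent slice of the access log. This sketch assumes the status code directly follows the quoted request line, as in nginx's default "combined" format; `blocked_share` is a hypothetical helper.

```python
import re
from collections import Counter

# Assumption: the status code directly follows the quoted request line.
STATUS_RE = re.compile(r'" (\d{3}) ')

def blocked_share(lines, blocked=("403", "410")):
    """Fraction of parsed requests answered with one of the blocked statuses."""
    statuses = Counter()
    for line in lines:
        m = STATUS_RE.search(line)
        if m:
            statuses[m.group(1)] += 1
    total = sum(statuses.values())
    if total == 0:
        return 0.0
    return sum(statuses[s] for s in blocked) / total
```

Running it before and after a rule change gives a quick sanity check that the blocks are firing and that ordinary 200 traffic is still getting through.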

The durable follow-up is to push the same block up to Cloudflare's WAF (a rule on the hosting ASN + Bot Fight Mode), so the junk is dropped at the
edge and never even reaches the origin as a 403.

## The takeaway

If your store's analytics suddenly balloon with traffic from one country or one network and your bounce rate looks weird, check your server logs
before you celebrate. A surprising amount of "traffic" is bots that inflate your numbers, burn your server resources, and, on Magento, can flood
your customer table with fake registrations. The fix is usually cheap and fast if you block at the right layer.


I'm Kishan Savaliya, an Adobe-Certified Magento & Hyvä developer. I help store owners with exactly this kind of thing — performance, security, and
clean code. If your store feels slow or you're seeing strange traffic, you can find me and what I do at
kishansavaliya.com.
