Solved: All that fake traffic from China – Why? What’s the endgame?

#devops #programming #tutorial #cloud

🚀 Executive Summary

TL;DR: Automated scanning from Chinese IPs often generates log noise and consumes resources by probing for common web vulnerabilities. Solutions range from quick WAF geo-blocking to precise, behavior-based IP banning with Fail2Ban, or highly restrictive allow-listing for internal services.

🎯 Key Takeaways

Large-scale automated scanning, often from botnets, sweeps the entire internet for easy wins: logins that accept default or leaked credentials, unpatched software (e.g., Jenkins, WordPress), and information disclosure (e.g., exposed .env files).
WAF geo-blocking offers a rapid, edge-level defense to stop high-volume traffic from specific countries, best suited for applications with a clearly defined, limited geographic audience.
Fail2Ban provides a precise, behavior-based intrusion prevention method by monitoring server logs for malicious patterns (e.g., repeated 404s on /phpmyadmin) and automatically creating temporary firewall rules to ban offending IPs.

Tired of seeing your server logs spammed with nonsensical requests from Chinese IPs? We’ll break down the ‘why’ behind these automated scans and provide three real-world, no-nonsense solutions to clean up your logs and secure your perimeter for good.

That ‘Fake’ Traffic From China? Here’s Why It Exists and How to Stop It.

I remember it clear as day. 3:17 AM. The piercing shriek of a PagerDuty alert yanks me out of a dead sleep. The alert? “Disk space critical on prod-web-02.” My mind immediately goes to the worst places: a runaway process, a botched deployment spewing log files, maybe even a breach. I scramble for my laptop, VPN in, and start tailing the Nginx access logs. What I find isn’t a critical application error. It’s an endless, mind-numbing stream of 404s. GET requests for /wp-admin, /phpmyadmin, /.env, /solr/admin/info/system… all from a massive block of IPs originating from China. We don’t even use PHP or WordPress. This wasn’t an attack; it was just… noise. Deafening, disk-filling noise. And it had just cost me an hour of sleep for nothing.

First Off, Why Is This Happening? The Endgame Is Simple.

If you’re seeing this, take a deep breath. It’s almost certainly not personal. You haven’t been “targeted” by a sophisticated APT group. What you’re experiencing is the digital equivalent of someone walking down a street and rattling every single doorknob on every house.

This is large-scale, automated scanning. Botnets are constantly sweeping the entire internet, looking for low-hanging fruit. They’re not looking for you; they’re looking for a specific vulnerability, and they’re checking millions of hosts to find it. The endgame is usually one of a few things:

Credential Stuffing: Trying default or leaked passwords on common login pages.
Vulnerability Exploitation: Looking for unpatched versions of Jenkins, WordPress, Apache Struts, etc.
Information Disclosure: Hunting for exposed .git directories or publicly accessible .env files.
Botnet Recruitment: Finding a vulnerable machine they can infect and add to their army for DDoS attacks or more scanning.

It’s a numbers game. They scan the world, and if your server happens to have a rusty lock, they’ll find it. Our job is to make sure our locks are solid and, more importantly, to stop them from even getting to our front door.
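
Before picking a fix, it helps to see which doorknobs are actually being rattled. The following is a minimal sketch, assuming your access log uses Nginx's default "combined" format; the sample lines and the function name summarize_404s are made up for illustration, and the IPs come from the reserved documentation ranges.

```python
import re
from collections import Counter

# Start of an Nginx "combined" log line: client IP, request line, status code.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] '
    r'"(?P<method>[A-Z]+) (?P<path>\S+) [^"]*" (?P<status>\d{3}) '
)

def summarize_404s(lines, top=5):
    """Return (most-requested 404 paths, noisiest client IPs) as count lists."""
    paths, ips = Counter(), Counter()
    for line in lines:
        m = LOG_LINE.match(line)
        if m and m.group("status") == "404":
            paths[m.group("path")] += 1
            ips[m.group("ip")] += 1
    return paths.most_common(top), ips.most_common(top)

# Hypothetical sample lines, not real traffic.
SAMPLE = [
    '203.0.113.7 - - [10/Oct/2024:03:17:01 +0000] "GET /wp-admin HTTP/1.1" 404 153 "-" "curl/8.0"',
    '203.0.113.7 - - [10/Oct/2024:03:17:02 +0000] "GET /.env HTTP/1.1" 404 153 "-" "curl/8.0"',
    '198.51.100.23 - - [10/Oct/2024:03:17:05 +0000] "GET /.env HTTP/1.1" 404 153 "-" "curl/8.0"',
    '192.0.2.10 - - [10/Oct/2024:03:17:09 +0000] "GET / HTTP/1.1" 200 612 "-" "Mozilla/5.0"',
]

if __name__ == "__main__":
    top_paths, top_ips = summarize_404s(SAMPLE)
    print("Top probed paths:", top_paths)
    print("Noisiest IPs:", top_ips)
```

To run it against a real log, pass an open file object instead of SAMPLE. If the top paths are things you don't even run (PHP admin panels, WordPress logins), you're looking at background scanning, not a targeted attack.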

The Fixes: From a Band-Aid to Fort Knox

Okay, enough theory. Let’s talk about solutions. I’ve used all three of these in different situations, and each has its place. We’ll go from the quickest fix to the most permanent.

Solution 1: The Quick Fix (WAF Geo-Blocking)

This is the fastest way to stop the bleeding, especially if that noise is triggering alerts. The idea is simple: if you don’t do business in a specific country, just block all traffic from it at the edge.

If you’re using a modern cloud provider or a CDN like Cloudflare, you almost certainly have access to a Web Application Firewall (WAF). In AWS WAF, Azure WAF, or Cloudflare, you can create a rule in about five minutes that says “If the source IP is from Country CN, then Block.”
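
As a rough illustration of what such a rule looks like, here is a sketch that builds a geo-block rule in the JSON shape AWS WAFv2 uses for rules, to the best of my knowledge. The rule name, priority, and helper function build_geo_block_rule are hypothetical; check the field names against the current AWS WAF documentation before pasting the output into a web ACL.

```python
import json

def build_geo_block_rule(country_codes, name="block-selected-countries", priority=0):
    """Build a WAFv2-style rule dict that blocks requests by source country.

    `name` and `priority` are placeholder values for this sketch.
    """
    return {
        "Name": name,
        "Priority": priority,
        # Match on the two-letter ISO country code of the source IP.
        "Statement": {"GeoMatchStatement": {"CountryCodes": sorted(set(country_codes))}},
        # Block the request outright instead of forwarding it to the origin.
        "Action": {"Block": {}},
        "VisibilityConfig": {
            "SampledRequestsEnabled": True,
            "CloudWatchMetricsEnabled": True,
            "MetricName": name,
        },
    }

if __name__ == "__main__":
    print(json.dumps(build_geo_block_rule(["CN"]), indent=2))
```

Keeping the rule as data makes it easy to review in a pull request and to extend the country list later without clicking through a console.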

When to use it: Your application serves a specific geographic region (e.g., North America only) and you have no legitimate customers in the countries you want to block.

The Catch: This is a blunt instrument. If you suddenly get a legitimate customer from a blocked country, their access will be denied, and you’ll be scrambling to figure out why. It also doesn’t stop a determined attacker using a VPN.

Pro Tip: Don’t just block; block and respond with a 403 Forbidden. An immediate rejection is faster and uses fewer resources than leaving the connection open until the request times out. Your logs will be cleaner, and the scanners will move on more quickly.

Solution 2: The Permanent Fix (Fail2Ban & Intrusion Prevention)

This is my preferred method. Instead of blocking a whole country, you block specific malicious behaviors. This is where a tool like Fail2Ban becomes your best friend. It actively monitors your log files and, when it sees an IP address exhibiting bad behavior (like trying to access /phpmyadmin 10 times in a minute), it automatically adds a temporary firewall rule to block that IP.

It’s more precise. It catches bad actors regardless of their origin, and it only affects those who are actively trying to break in.

Here’s a sample of what a basic Nginx 404-scan filter might look like in /etc/fail2ban/filter.d/nginx-badbots.conf:

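[Definition]
# Match 404s on GET requests for commonly probed paths (literal dots are escaped)
failregex = ^<HOST> - \S+ \[[^\]]*\] "GET [^"]*(/phpmyadmin|/wp-login|/pma|/\.env|/wordpress|/wp-admin)[^"]* HTTP/[0-9.]+" 404
ignoreregex =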
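
A filter like this is easy to get subtly wrong, so it's worth checking it against sample log lines before trusting it. Fail2Ban ships a fail2ban-regex command for exactly that (run it with your log file and filter file as arguments). For a quick look at the logic without a server, here is a rough Python sketch. It assumes the default Nginx "combined" log format, uses a dot-escaped variant of the pattern, and swaps the <HOST> tag for a simplified capture group, so treat it as an approximation of how Fail2Ban matches, not a replacement for fail2ban-regex.

```python
import re

# Simplified stand-in for Fail2Ban's <HOST> tag (the real tag expands to a
# stricter IPv4/IPv6/hostname pattern). This is an assumption of the sketch.
HOST_GROUP = r"(?P<host>\S+)"

# Dot-escaped variant of the filter pattern, anchored to the request line.
FAILREGEX = (
    r'^<HOST> - \S+ \[[^\]]*\] "GET [^"]*'
    r'(/phpmyadmin|/wp-login|/pma|/\.env|/wordpress|/wp-admin)'
    r'[^"]* HTTP/[0-9.]+" 404'
)

def compile_failregex(failregex):
    """Turn a Fail2Ban-style failregex into a plain Python pattern."""
    return re.compile(failregex.replace("<HOST>", HOST_GROUP))

def offending_host(pattern, line):
    """Return the client address if the line matches the filter, else None."""
    m = pattern.search(line)
    return m.group("host") if m else None

# Hypothetical log lines using documentation-range IPs.
PROBE = '203.0.113.7 - - [10/Oct/2024:03:17:01 +0000] "GET /.env HTTP/1.1" 404 153 "-" "curl/8.0"'
LOOKALIKE = '203.0.113.7 - - [10/Oct/2024:03:17:02 +0000] "GET /xenv HTTP/1.1" 404 153 "-" "curl/8.0"'
NORMAL = '192.0.2.10 - - [10/Oct/2024:03:17:09 +0000] "GET /wp-admin HTTP/1.1" 200 612 "-" "Mozilla/5.0"'

if __name__ == "__main__":
    pattern = compile_failregex(FAILREGEX)
    for line in (PROBE, LOOKALIKE, NORMAL):
        print(offending_host(pattern, line))
```

The lookalike line shows why the escaped dot matters: with an unescaped dot, /xenv would count as a hit on /.env.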

And you enable it in your /etc/fail2ban/jail.local:

[nginx-badbots]
enabled = true
port = http,https
filter = nginx-badbots
logpath = /var/log/nginx/access.log
# Count matches inside a 10-minute window (600 seconds)
findtime = 600
maxretry = 2
bantime = 86400

In this example, if an IP hits one of those forbidden paths twice within Fail2Ban’s findtime window (10 minutes by default), it’s banned for a full day (86400 seconds). It’s beautiful, and it’s automated.
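
If the interplay between maxretry, findtime, and bantime feels abstract, this toy model may help. It is a simplified sketch of the counting logic only, not how Fail2Ban is implemented internally; the class name BanTracker and its methods are invented for this example, and timestamps are plain seconds.

```python
from collections import defaultdict, deque

class BanTracker:
    """Toy jail: ban an IP after `maxretry` matches within `findtime` seconds."""

    def __init__(self, maxretry=2, findtime=600, bantime=86400):
        self.maxretry = maxretry
        self.findtime = findtime
        self.bantime = bantime
        self._hits = defaultdict(deque)  # ip -> timestamps of recent matches
        self._banned_until = {}          # ip -> time at which the ban expires

    def is_banned(self, ip, now):
        return self._banned_until.get(ip, 0) > now

    def record_match(self, ip, now):
        """Record one filter match; return True if it triggers a new ban."""
        if self.is_banned(ip, now):
            return False
        hits = self._hits[ip]
        hits.append(now)
        # Forget matches that have fallen out of the findtime window.
        while hits and now - hits[0] > self.findtime:
            hits.popleft()
        if len(hits) >= self.maxretry:
            self._banned_until[ip] = now + self.bantime
            hits.clear()
            return True
        return False

if __name__ == "__main__":
    jail = BanTracker()
    print(jail.record_match("203.0.113.7", now=0))   # first probe, no ban yet
    print(jail.record_match("203.0.113.7", now=30))  # second probe inside the window
    print(jail.is_banned("203.0.113.7", now=3600))   # still banned an hour later
```

Note that two probes spaced further apart than findtime never add up to a ban, which is why slow scanners can slip under a short window.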

Solution 3: The ‘Nuclear’ Option (Default Deny & Allow-listing)

Sometimes, you’re managing a system that should never be open to the whole public internet. Think a corporate VPN endpoint, a server’s SSH port, or an internal admin tool that only your team should access.

In this case, you flip the logic. You don’t block bad IPs; you block everything by default and only explicitly allow known, safe IP ranges. This is an allow-list, not a block-list.

You can do this at multiple levels:

Security Groups / Network ACLs: In your cloud console, set the inbound rule for SSH (Port 22) to only allow traffic from your office IP range. Everything else is implicitly denied.
Web Server Config: For an internal tool, you can configure Nginx or Apache to only serve content to a specific IP list and return a 403 Forbidden to everyone else.
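
Whatever layer you enforce it at, the logic is the same: deny unless the source address falls inside a known range. Here is a minimal sketch of that check using Python's standard ipaddress module. The networks listed are placeholders (a documentation range standing in for an office and a private range standing in for a VPN), and is_allowed is a name made up for this example.

```python
import ipaddress

# Hypothetical allow-list; replace with your own office/VPN ranges.
ALLOWED_NETWORKS = [
    ipaddress.ip_network("203.0.113.0/24"),  # stand-in for an office range
    ipaddress.ip_network("10.8.0.0/16"),     # stand-in for a VPN range
]

def is_allowed(client_ip, allowed=ALLOWED_NETWORKS):
    """Default deny: return True only if client_ip is inside an allowed network."""
    try:
        addr = ipaddress.ip_address(client_ip)
    except ValueError:
        # Anything that is not a valid IP address is rejected.
        return False
    return any(addr in net for net in allowed)

if __name__ == "__main__":
    for ip in ("203.0.113.42", "198.51.100.1", "not-an-ip"):
        status = "allow" if is_allowed(ip) else "deny (403)"
        print(ip, "->", status)
```

The important design choice is the fall-through: unknown, malformed, or unexpected input ends in a deny, so a mistake in the list locks people out instead of letting strangers in.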

Warning: This is the most secure option, but also the most restrictive. Only use this for services with a well-defined and static user base. Applying this to a public-facing e-commerce site would be a resume-generating event, and not in a good way.

Comparison at a Glance

Method	Pros	Cons	Best For
WAF Geo-Blocking	Fast, easy, stops massive volume.	Blunt, can block legitimate users.	Apps with a clear, limited geographic audience.
Fail2Ban	Precise, behavior-based, targets actual attackers.	Requires server-level setup; log parsing can be CPU-intensive on high-traffic servers.	Almost any public-facing server.
Allow-listing	Most secure, closes attack surface almost completely.	Highly restrictive, breaks access for anyone not on the list.	Internal tools, management ports (SSH/RDP), VPNs.

At the end of the day, this constant scanning is just part of the background radiation of the internet. You’ll never stop it completely, but by using these layered strategies, you can make your systems a much harder target. The bots will fail, they’ll move on, and you can get back to sleeping through the night.