I don't really like bad bots, and by that I mean crawlers that ignore robots.txt. The reason is simple: I don't want my data fed into obscure systems, and also, on principle, if we give you rules, follow them.
Credit where it's due: the idea came from Caolan's website.
The idea is simple: make the bad bots click a link they aren't supposed to, then ban them. To do that, I added a robots.txt at the root of my site, explicitly disallowing robots from a specific page (I went with /roboty/, because why not):
User-agent: *
Disallow: /roboty/
Then I slipped a link to that page somewhere on the root page.
Since I don't want curious humans getting instantly banned, the page itself just explains what's going on and links to article.php, the actual dangerous script. I named it that way to dodge possible keyword blacklists like ban or ban-ip. ¯\_(ツ)_/¯
Speaking of the script, here it is:
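<?php
// Cloudflare credentials (elided here on purpose).
$cf_api_token = '...';
$zone_id = '...';

$note = 'Auto banned by dtech/roboty at ' . date("H:i d/m/y");
// Behind Cloudflare's proxy, REMOTE_ADDR is only the visitor's address if the
// web server restores it; otherwise Cloudflare passes the original address in
// the CF-Connecting-IP header.
$ip = $_SERVER['REMOTE_ADDR'];

// Body of the access rule: block this single address.
$payload = json_encode([
    'mode' => 'block',
    'configuration' => [
        'target' => 'ip',
        'value' => $ip,
    ],
    'notes' => $note,
]);

// Create the rule on the zone through the Cloudflare API.
$ch = curl_init("https://api.cloudflare.com/client/v4/zones/{$zone_id}/firewall/access_rules/rules");
curl_setopt_array($ch, [
    CURLOPT_RETURNTRANSFER => true,
    CURLOPT_POST => true,
    CURLOPT_POSTFIELDS => $payload,
    CURLOPT_IPRESOLVE => CURL_IPRESOLVE_V4,
    CURLOPT_HTTPHEADER => [
        "Authorization: Bearer {$cf_api_token}",
        "Content-Type: application/json",
    ],
]);
$response = json_decode(curl_exec($ch), true);
curl_close($ch);

// Log failures instead of silently discarding the API response.
if (empty($response['success'])) {
    error_log('roboty: Cloudflare ban failed for ' . $ip);
}

header("Location: /?blehhhhh"); // redirect to '/', should be blocked
echo "Bye ;)";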
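One detail the script doesn't handle: as far as I know, Cloudflare's IP access rules use a separate target, `ip6`, for IPv6 addresses, and on a proxied site the visitor's address arrives in the `CF-Connecting-IP` header unless the web server restores it. Here's a minimal sketch of how the target could be picked. `resolve_ban_target()` is a hypothetical helper, not part of the script above, and it assumes the origin only accepts traffic from Cloudflare (otherwise the header can be spoofed).

```php
<?php
// Hypothetical helper: pick the visitor address and the matching
// Cloudflare access-rule target ('ip' for IPv4, 'ip6' for IPv6).
function resolve_ban_target(array $server): ?array
{
    // Prefer the header Cloudflare sets on proxied requests,
    // then fall back to REMOTE_ADDR.
    $ip = $server['HTTP_CF_CONNECTING_IP'] ?? $server['REMOTE_ADDR'] ?? '';

    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false) {
        return ['target' => 'ip', 'value' => $ip];
    }
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) !== false) {
        return ['target' => 'ip6', 'value' => $ip];
    }

    return null; // not a valid address: skip the API call entirely
}
```

The result could then replace the hard-coded array, e.g. `'configuration' => resolve_ban_target($_SERVER)`, after checking that it isn't `null`.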
Right now it only bans the bot's IP on douxx.tech (proxied through Cloudflare), but I plan to eventually integrate it into an internal API to block across every domain I own, and maybe throw in some iptables rules too.
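For the iptables part, something like the sketch below could be a starting point. It only builds the command string and doesn't run it; `build_drop_command()` is a made-up name, and actually applying the rule would need root. Keep in mind that on a Cloudflare-proxied domain the origin sees Cloudflare's addresses at the network level, so a rule like this mostly makes sense for hosts that are reached directly.

```php
<?php
// Hypothetical helper: build (but do not run) a firewall command that drops
// all traffic from one address. Returns null for anything that is not an IP.
function build_drop_command(string $ip): ?string
{
    if (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV4) !== false) {
        $binary = 'iptables';
    } elseif (filter_var($ip, FILTER_VALIDATE_IP, FILTER_FLAG_IPV6) !== false) {
        $binary = 'ip6tables';
    } else {
        return null;
    }

    // -I INPUT inserts the rule at the top of the INPUT chain;
    // escapeshellarg() quotes the address before it reaches a shell.
    return $binary . ' -I INPUT -s ' . escapeshellarg($ip) . ' -j DROP';
}

// 203.0.113.7 is a documentation-only example address.
echo build_drop_command('203.0.113.7'), PHP_EOL;
```

Validating with `filter_var()` before building the command matters here, since the address ultimately comes from the request.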
So yeah, I'll keep it running for a bit and see how many IPs we get.
For the record, the first one to be banned is an IP from Tencent datacenters 🤡