YEB

How I Stopped Fake “OpenAI” & “Googlebot” Crawlers from Flooding My Site

If your site is even a little bit successful, you’re probably being hammered by traffic from “OpenAI,” “Googlebot,” or “Bingbot” — or so your logs claim. Spoiler: most of it is fake. Here’s how I learned that the hard way, what I did about it, and the exact steps to fix it.

🧨 The Problem: The Bot Flood No One Talks About

[Image: Fake vs. Real Website Hits]

I run a site that gets about 20,000 legit visits per day. But my server logs?

500,000+ “bot” requests every 24 hours — mostly claiming to be “OpenAI” or “Googlebot.”

My analytics showed almost none of that traffic, most likely because these clients never run the JavaScript tracking snippet.

But my server load was spiking, and then my AdSense dashboard dropped this warning:

Crawler – Unknown Error

Restricted Ad Serving

[Image: AdSense crawler errors]

In plain terms: “Your site is blocking our ad validation bots, so we’re cutting your revenue.”
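To get a rough feel for how much of your own log traffic claims to be a crawler, a small tally script helps. This is only a sketch: `countClaimedCrawlers` is a hypothetical helper, the token list is an assumption, the sample lines are made up, and it assumes the User-Agent is the last quoted field on each line (as in the common "combined" access log format).

```php
<?php
// Hypothetical helper: tally requests per claimed crawler from access-log lines.
// Assumes the User-Agent is the last double-quoted field on each line.
function countClaimedCrawlers(iterable $lines): array
{
    // UA substrings to look for (assumed list; extend it for your own traffic).
    $tokens = ['Googlebot', 'bingbot', 'GPTBot'];
    $counts = array_fill_keys($tokens, 0);

    foreach ($lines as $line) {
        // Capture the last quoted field on the line; skip lines without one.
        if (!preg_match('/"([^"]*)"\s*$/', $line, $m)) {
            continue;
        }
        foreach ($tokens as $token) {
            // Case-insensitive substring match on the claimed User-Agent.
            if (stripos($m[1], $token) !== false) {
                $counts[$token]++;
                break;
            }
        }
    }
    return $counts;
}

// Made-up sample lines using documentation-only IP ranges.
$sample = [
    '203.0.113.7 - - [01/Jan/2025:00:00:01 +0000] "GET / HTTP/1.1" 200 512 "-" "Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)"',
    '198.51.100.9 - - [01/Jan/2025:00:00:02 +0000] "GET /a HTTP/1.1" 200 128 "-" "Mozilla/5.0 (compatible; GPTBot/1.0; +https://openai.com/gptbot)"',
    '192.0.2.44 - - [01/Jan/2025:00:00:03 +0000] "GET /b HTTP/1.1" 200 64 "-" "Mozilla/5.0 (Windows NT 10.0; Win64; x64)"',
];
print_r(countClaimedCrawlers($sample));
```

Keep in mind that this only counts what requests claim to be. It says nothing about whether the claim is true, which is exactly the gap the next steps close.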


✅ Step 1: Verifying Bots with BotDetect API

User-Agent headers are worthless for bot detection — anyone can spoof them.

[Image: A User-Agent claiming to be a legitimate crawler]

So I built an integration with the BotDetect API:

  • Every request gets checked against the API
  • You send the IP (and optionally the User-Agent)
  • It tells you whether the request comes from a real, verified bot or from an impostor on some VPS

[Image: Yeb API response]

Example (PHP pseudo-code):

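use Illuminate\Support\Facades\Http; // Laravel HTTP client (assumed from the Http:: call)

// Send the client IP and the claimed User-Agent to the bot-detection endpoint.
$response = Http::post('https://api.yeb.to/v1/bot/detect/detect', [
    'ip' => $request->ip(),
    'ua' => $request->header('User-Agent'),
    'api_key' => '...',
]);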
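What to do with the verdict is a separate question. Here is a minimal sketch of the decision, assuming the API answer can be reduced to a single boolean. `gateRequest`, the token list, and the `$verify` callback are all hypothetical: the callback stands in for the HTTP call above, so the logic can be tried without touching the network.

```php
<?php
// Hypothetical gate: combine the verification result with the claimed identity.
// $verify stands in for the API call and must return true only for verified crawlers.
function gateRequest(string $userAgent, string $ip, callable $verify): string
{
    // Assumed list of UA substrings that count as "claims to be a crawler".
    $crawlerTokens = ['Googlebot', 'bingbot', 'GPTBot'];

    $claimsCrawler = false;
    foreach ($crawlerTokens as $token) {
        if (stripos($userAgent, $token) !== false) {
            $claimsCrawler = true;
            break;
        }
    }

    // Every request is verified, as described in the list above.
    if ($verify($ip, $userAgent)) {
        return 'allow';                       // verified crawler
    }
    // Unverified: an impostor if it claims to be a crawler, otherwise a normal visitor.
    return $claimsCrawler ? 'block' : 'pass';
}

// Stub verifier for illustration; a real one would wrap the HTTP request.
$stub = fn (string $ip, string $ua): bool => $ip === '192.0.2.10';

echo gateRequest('Mozilla/5.0 (compatible; Googlebot/2.1)', '192.0.2.10', $stub), "\n";
echo gateRequest('Mozilla/5.0 (compatible; Googlebot/2.1)', '203.0.113.5', $stub), "\n";
echo gateRequest('Mozilla/5.0 (Windows NT 10.0)', '203.0.113.5', $stub), "\n";
```

Passing the verifier in as a callback is just a convenience for testing. It also leaves room to add a per-IP cache later without changing the decision logic.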


Result:

99%+ of “OpenAI” and “Googlebot” traffic was fake.

Verified bot requests were a tiny fraction of what the logs claimed.


🛑 Step 2: Don’t Block Yourself — Enter GeoIP ASN Lookup

Even after blocking the fakes, AdSense was still unhappy.

Why?

Because Google's AdSense crawler (Mediapartners-Google) and some other ad validation bots don’t always use the standard Googlebot IPs. If you block them? Revenue tanks.

So I added a second layer: GeoIP ASN Lookup.

  • For every suspicious IP, check its ASN (Autonomous System Number)
  • If it's from Google, let it through
  • If not, block with extreme prejudice

Example (PHP pseudo-code):

// Look up the ASN (the network that owns the address) for a suspicious IP.
$asnData = Http::get('https://api.yeb.to/v1/geoip/asn', [
    'ip' => $ip,
    'api_key' => '...',
]);
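The allow-or-block rule from the list above can be sketched as a pair of small functions. The article doesn't show which response field carries the ASN, so this sketch takes the raw value as a parameter. `parseAsn` and `decideByAsn` are hypothetical names, and the default allow-list contains only AS15169, Google's main ASN. Which other networks belong on that list is a judgment call for your own site.

```php
<?php
// Hypothetical helpers for the second layer: normalise an ASN value and
// compare it against an allow-list.

/** Turn "AS15169", "as15169" or 15169 into the integer 15169; null if unparseable. */
function parseAsn($raw): ?int
{
    if (is_int($raw)) {
        return $raw > 0 ? $raw : null;
    }
    if (is_string($raw) && preg_match('/^\s*(?:AS)?(\d+)\s*$/i', $raw, $m)) {
        return (int) $m[1];
    }
    return null;
}

/** Decision for an IP that failed bot verification: 'allow' or 'block'. */
function decideByAsn($rawAsn, array $allowedAsns = [15169]): string
{
    $asn = parseAsn($rawAsn);
    // A missing or unparseable ASN is treated as untrusted (an assumption of this sketch).
    return ($asn !== null && in_array($asn, $allowedAsns, true)) ? 'allow' : 'block';
}

echo decideByAsn('AS15169'), "\n"; // Google's main ASN
echo decideByAsn(64500), "\n";     // ASN reserved for documentation
```

Treating an unknown ASN as untrusted keeps the rule strict. If lookup failures are common in your setup, failing open for those cases may be the safer trade-off for ad revenue.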



📊 Results: Cleaner Traffic, Stable Revenue, No Guesswork

✅ Fake bots? Blocked.
✅ Real bots & ad crawlers? Let through.
✅ AdSense Policy Center? Clean.
✅ Revenue? Back to normal.


🧠 Lessons Learned

  • Never trust User-Agent headers for bot detection.
  • Don’t go nuclear — over-blocking kills revenue.
  • Double-check with ASN lookups before you block.
  • Automate everything.

🔧 Tools Used

  • BotDetect API
  • GeoIP ASN Lookup

Have your own war stories with fake bots or AdSense errors? Drop them in the comments. Let’s fix the internet together.


#api #openai #webdev #tutorial #bots
