DEV Community

Francesco Napoletano

A Facebook crawler was making 7M requests per day to my stupid website

I own a little website I use for some SEO experiments. Of course there's some content and a Facebook sharing button for every post. The website is so small that it runs on a "single controller" PHP app plus a 400 KB SQLite database, yet it can generate thousands of different pages.

Everything is hosted (together with a bunch of other websites) on a cheap DigitalOcean machine, with a free Cloudflare plan for some caching. One of those websites has alerting set up, and it started warning me that it was down.

After some investigation I found the problem: the Facebook crawler.

That crawler was making more than 7M requests per day to that website. That averages out to more than 80 requests per second, with peaks of 300 requests per second.

Their documentation did not help with blocking the bot. Everything I tried was ignored:

  • og:ttl -> ignored
  • robots.txt -> ignored
  • HTTP 429 -> ignored

I had to block the user agent with Cloudflare rules.
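For anyone without Cloudflare in front of their site, the same idea can be sketched at the application level. This is only an illustration, not the rule I actually used: the token list is an assumption (`facebookexternalhit` and `facebot` are substrings commonly seen in Facebook crawler user agents, and the list is not exhaustive), and `is_blocked_agent` and `status_for_request` are hypothetical helper names.

```python
# Substrings commonly seen in Facebook crawler User-Agent headers.
# Assumption: this list is illustrative, not exhaustive.
BLOCKED_AGENT_TOKENS = ("facebookexternalhit", "facebot")


def is_blocked_agent(user_agent):
    """Return True when the User-Agent header contains a blocked token."""
    if not user_agent:
        return False
    lowered = user_agent.lower()
    return any(token in lowered for token in BLOCKED_AGENT_TOKENS)


def status_for_request(headers):
    """Return 403 for blocked crawlers and 200 otherwise.

    `headers` is a plain dict of request headers (hypothetical shape).
    """
    # Header names are case-insensitive, so normalize before the lookup.
    normalized = {name.lower(): value for name, value in headers.items()}
    return 403 if is_blocked_agent(normalized.get("user-agent")) else 200
```

Keep in mind that a check like this still costs a request to your origin for every hit, which is why blocking at the CDN edge was the better fit on a cheap machine.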

If someone working on that crawler is here on dev.to, please stop ignoring basic Internet netiquette for crawlers.

Next time you could hit someone on AWS. And then they'll probably ask you to pay the bill ;)

Top comments (3)

adrobyazko-softheme

Try the following in robots.txt:

User-agent: Facebot/1.0
Crawl-delay: 1

User-agent: Facebot/1.1
Crawl-delay: 10

Sharad Raj (He/Him)

This is a serious issue.

Godge

Services like iprangesapi.com can help to block requests from crawlers like these - might be worth a look!
