How to Block AI Web Scrapers on Your WordPress Site with robots.txt (2025 Update)

AI web scrapers are increasingly crawling WordPress sites to gather content for training large language models and powering AI search results. As a site owner, you can use your robots.txt file to help protect your original content from being scraped and used without your permission.

This guide covers:

  • What the robots.txt file does
  • Why blocking AI bots matters
  • How to add up-to-date blocking rules
  • A current table of the main AI bots and the exact syntax you need

What Is robots.txt?

The robots.txt file is a simple text file in your website’s root directory (e.g., https://yourdomain.com/robots.txt). It tells web crawlers which parts of your site they can or cannot access. Most reputable AI and search engine bots will respect these rules.
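For example, a minimal robots.txt might look like this (ExampleBot is a placeholder name used purely for illustration):

User-agent: ExampleBot
Disallow: /private/

User-agent: *
Allow: /

Each group starts with a User-agent line naming the crawler it applies to, followed by one or more Disallow (or Allow) rules. The wildcard * group covers any crawler that does not have a more specific group of its own.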

Why Block AI Bots?

  • Protect your original content from being used to train AI models without your consent
  • Maintain control over your website’s data
  • Limit how your content appears in AI-powered search and chatbots

How to Edit robots.txt in WordPress

  1. Access your website’s root directory (via FTP, file manager, or a WordPress SEO plugin).
  2. Open or create the robots.txt file (a typical WordPress default is shown just after these steps).
  3. Add the blocking rules from the table below.
  4. Save and upload the file.
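
If your site has never had a physical robots.txt file, WordPress serves a virtual one, and creating a real file replaces it. On a typical install the existing rules look roughly like this (yours may differ, especially if an SEO plugin manages the file):

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Keep any rules like these in place and append the bot-blocking groups from the table below them.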

Table: AI Bots and robots.txt Syntax (2025)

| AI Bot / Model | Syntax to Block (add to robots.txt) |
| --- | --- |
| GPTBot (OpenAI) | User-agent: GPTBot<br>Disallow: / |
| Google-Extended | User-agent: Google-Extended<br>Disallow: / |
| ClaudeBot (Anthropic) | User-agent: ClaudeBot<br>Disallow: / |
| PerplexityBot | User-agent: PerplexityBot<br>Disallow: / |
| CCBot (Common Crawl) | User-agent: CCBot<br>Disallow: / |
| Bytespider (ByteDance) | User-agent: Bytespider<br>Disallow: / |
| Amazonbot | User-agent: Amazonbot<br>Disallow: / |
| Applebot | User-agent: Applebot<br>Disallow: / |

Copy and paste the lines for each bot you want to block.
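
One detail worth knowing before you paste: under the Robots Exclusion Protocol, a crawler obeys only the group that most specifically matches its user agent, and rules from different groups are not combined. In a sketch like this:

User-agent: *
Disallow: /wp-admin/

User-agent: GPTBot
Disallow: /

GPTBot follows only its own group and is blocked from the entire site, while crawlers without a dedicated group (Googlebot, Bingbot, and so on) follow the wildcard group and keep crawling normally. That is why each AI bot you want to block needs its own User-agent group.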

Example robots.txt

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Disallow: /
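
Putting it together, a finished file on a typical WordPress site might look something like this (an illustrative sketch; keep your own existing rules and add only the bots you want to block):

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

Append the remaining groups from the table in the same way.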


Important Notes

  • robots.txt is not foolproof. Only bots that respect the protocol will comply. Some scrapers may ignore it.

  • Update regularly. New bots appear all the time—keep your file current.

  • Don’t block search engines (like Googlebot or Bingbot) unless you want your site removed from search results.
