How to Block AI Web Scrapers on Your WordPress Site with robots.txt (2025 Update)

AI web scrapers are increasingly crawling WordPress sites to gather content for training large language models and powering AI search results. As a site owner, you can use your robots.txt file to help protect your original content from being scraped and used without your permission.

This guide covers:

  • What the robots.txt file does
  • Why blocking AI bots matters
  • How to add up-to-date blocking rules
  • A current table of the main AI bots and the exact syntax you need

What Is robots.txt?

The robots.txt file is a simple text file in your website’s root directory (e.g., https://yourdomain.com/robots.txt). It tells web crawlers which parts of your site they can or cannot access. Most reputable AI and search engine bots will respect these rules.
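For example, a minimal robots.txt might look like this (ExampleBot is a placeholder name used purely for illustration):

User-agent: ExampleBot
Disallow: /private/

User-agent: *
Allow: /

Each group starts with a User-agent line naming the crawler it applies to, followed by one or more Disallow (or Allow) rules. The wildcard * group covers any crawler that does not have a more specific group of its own.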

Why Block AI Bots?

  • Protect your original content from being used to train AI models without your consent
  • Maintain control over your website’s data
  • Limit how your content appears in AI-powered search and chatbots

How to Edit robots.txt in WordPress

  1. Access your website’s root directory (via FTP, file manager, or a WordPress SEO plugin).
  2. Open or create the robots.txt file (a typical WordPress default is shown just after these steps).
  3. Add the blocking rules from the table below.
  4. Save and upload the file.
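
If your site has never had a physical robots.txt file, WordPress serves a virtual one, and creating a real file replaces it. On a typical install the existing rules look roughly like this (yours may differ, especially if an SEO plugin manages the file):

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

Keep any rules like these in place and append the bot-blocking groups from the table below them.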

Table: AI Bots and robots.txt Syntax (2025)

| AI Bot / Model | Syntax to Block (add to robots.txt) |
| --- | --- |
| GPTBot (OpenAI) | User-agent: GPTBot<br>Disallow: / |
| Google-Extended | User-agent: Google-Extended<br>Disallow: / |
| ClaudeBot (Anthropic) | User-agent: ClaudeBot<br>Disallow: / |
| PerplexityBot | User-agent: PerplexityBot<br>Disallow: / |
| CCBot (Common Crawl) | User-agent: CCBot<br>Disallow: / |
| Bytespider (ByteDance) | User-agent: Bytespider<br>Disallow: / |
| Amazonbot | User-agent: Amazonbot<br>Disallow: / |
| Applebot | User-agent: Applebot<br>Disallow: / |

Copy and paste the lines for each bot you want to block.
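
One detail worth knowing before you paste: under the Robots Exclusion Protocol, a crawler obeys only the group that most specifically matches its user agent, and rules from different groups are not combined. In a sketch like this:

User-agent: *
Disallow: /wp-admin/

User-agent: GPTBot
Disallow: /

GPTBot follows only its own group and is blocked from the entire site, while crawlers without a dedicated group (Googlebot, Bingbot, and so on) follow the wildcard group and keep crawling normally. That is why each AI bot you want to block needs its own User-agent group.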

Example robots.txt

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Disallow: /
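
Putting it together, a finished file on a typical WordPress site might look something like this (an illustrative sketch; keep your own existing rules and add only the bots you want to block):

User-agent: *
Disallow: /wp-admin/
Allow: /wp-admin/admin-ajax.php

User-agent: GPTBot
Disallow: /

User-agent: Google-Extended
Disallow: /

Append the remaining groups from the table in the same way.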


Important Notes

  • robots.txt is not foolproof. Only bots that respect the protocol will comply. Some scrapers may ignore it.

  • Update regularly. New bots appear all the time—keep your file current.

  • Don’t block search engines (like Googlebot or Bingbot) unless you want your site removed from search results.
