
Алексей Спинов

How to Check if a Website Blocks AI Bots (GPTBot, ClaudeBot, CCBot)

With the rise of AI crawling, many websites now block AI bots in their robots.txt. Here's how to check.

The Quick Way

Fetch any site's robots.txt and look for these user-agents:

User-agent: GPTBot
Disallow: /

User-agent: ChatGPT-User
Disallow: /

User-agent: CCBot
Disallow: /

User-agent: anthropic-ai
Disallow: /

User-agent: ClaudeBot
Disallow: /

If you see Disallow: / under any of these user-agents, that bot is blocked from the entire site.
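If you'd rather script this check than eyeball the file, here is a minimal sketch using Python's standard-library robots.txt parser. The function name and the bot list are my own choices for illustration, not part of any official API:

```python
from urllib.robotparser import RobotFileParser

# Illustrative list of common AI crawler user-agents.
AI_BOTS = ["GPTBot", "ChatGPT-User", "CCBot", "anthropic-ai", "ClaudeBot"]

def blocked_ai_bots(robots_txt: str, site: str = "https://example.com/") -> list[str]:
    """Return the AI bots that may not fetch the site's root path."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    # can_fetch() applies the matching user-agent group's rules to the URL.
    return [bot for bot in AI_BOTS if not rp.can_fetch(bot, site)]
```

Feed it the text of a fetched robots.txt (e.g. via `urllib.request.urlopen`) and it returns the subset of listed bots with an effective block on `/`.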

Examples

  • NYTimes — blocks GPTBot, CCBot ✅
  • OpenAI.com — does NOT block AI bots (interesting!) ❌
  • Google.com — selective blocking

Automated Check

I built a Robots.txt Analyzer that automatically detects AI bot blocking:

{
  "domain": "nytimes.com",
  "aiBotsBlocked": true,
  "blockedBots": {
    "GPTBot": true,
    "ClaudeBot": false,
    "CCBot": true,
    "Googlebot": false
  }
}
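As a rough sketch of what an analyzer like this does under the hood (the function name and bot list here are illustrative, not the tool's actual code), the same stdlib parser can produce a report in that shape:

```python
from urllib.robotparser import RobotFileParser

# Bots mirrored from the example report above; extend as needed.
BOTS = ["GPTBot", "ClaudeBot", "CCBot", "Googlebot"]

def bot_report(domain: str, robots_txt: str) -> dict:
    """Build a per-bot block report for a domain's robots.txt text."""
    rp = RobotFileParser()
    rp.parse(robots_txt.splitlines())
    blocked = {bot: not rp.can_fetch(bot, f"https://{domain}/") for bot in BOTS}
    return {
        "domain": domain,
        "aiBotsBlocked": any(blocked.values()),
        "blockedBots": blocked,
    }
```

Serializing the result with `json.dumps(report, indent=2)` yields output in the same shape as the example above.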

Why This Matters

  • For AI companies: see which sites permit crawling for training data (robots.txt signals intent, though it isn't legally binding on its own)
  • For publishers: verify your robots.txt actually blocks unwanted crawlers
  • For researchers: study the AI crawling landscape

Free tool on Apify Store — search knotless_cadence robots-txt.
