Your sitemap is configured. Your Core Web Vitals score is green. Your product catalog is perfectly structured. And yet when a user asks ChatGPT for products you sell, your store doesn't appear.
Most of the time, the reason is a single 30-year-old text file: robots.txt.
Specifically — a robots.txt written for Google in 2019 and never updated for the seven AI crawlers that now determine your visibility in ChatGPT, Gemini, Claude, and Perplexity.
🔗 This article was originally published on https://angeo.dev/magento-2-robots-txt-chatgpt-gemini-ai-bots/
Why robots.txt Is AEO Signal #1
The angeo/module-aeo-audit checks robots.txt first and marks it Critical because it is a gate. Every other AEO signal — llms.txt, Product schema, AI product feed — is irrelevant if the AI crawler cannot enter your store.
OpenAI states this without ambiguity:
"Sites that are opted out of OAI-SearchBot will not be shown in ChatGPT search answers."
Not "may not appear." Will not appear. If OAI-SearchBot is blocked — by an explicit Disallow or caught in a wildcard rule — your store is excluded from ChatGPT search answers regardless of everything else.
The 7 AI Bots You Need to Allow
| Bot name | Platform | What it does |
|---|---|---|
| OAI-SearchBot | ChatGPT | Powers ChatGPT real-time search answers and product recommendations. The most critical bot to allow. |
| GPTBot | ChatGPT | Search quality and AI training; also used for product discovery in Shopping results. |
| ChatGPT-User | ChatGPT | Browse mode, used when ChatGPT retrieves live pages on behalf of a user. |
| ClaudeBot | Claude | Anthropic's general crawler for Claude's knowledge. |
| Claude-SearchBot | Claude | Powers claude.ai real-time web search. |
| PerplexityBot | Perplexity | Perplexity AI product and content recommendations. |
| Google-Extended | Gemini | Product token that controls whether your content can be used for Gemini training and grounding; honored via Googlebot rather than a separate crawler. |
You need all seven — not because each drives significant traffic today, but because indexing is cumulative. A store that starts allowing PerplexityBot now builds 12 months of crawl history while competitors are still figuring out the bot name.
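Whether these crawlers are already visiting your store is easy to see in your access logs. Below is a minimal sketch that counts hits per AI bot by scanning the User-Agent field; the log lines are illustrative samples, and the helper name `count_ai_bot_hits` is mine, not part of any module:

```python
# AI crawler names to look for in the User-Agent field.
# Google-Extended is omitted: it is a robots.txt token, not a
# crawler, so it never appears in access logs.
AI_BOTS = [
    "OAI-SearchBot", "GPTBot", "ChatGPT-User",
    "ClaudeBot", "Claude-SearchBot", "PerplexityBot",
]

def count_ai_bot_hits(log_lines):
    """Count hits per AI bot by case-insensitive substring match."""
    hits = {bot: 0 for bot in AI_BOTS}
    for line in log_lines:
        lowered = line.lower()
        for bot in AI_BOTS:
            if bot.lower() in lowered:
                hits[bot] += 1
    return hits

# Illustrative log lines, not real traffic
sample = [
    '1.2.3.4 - - [01/Jan/2025] "GET /p.html HTTP/1.1" 200 "-" "Mozilla/5.0; compatible; GPTBot/1.2"',
    '5.6.7.8 - - [01/Jan/2025] "GET /q.html HTTP/1.1" 200 "-" "Mozilla/5.0; OAI-SearchBot/1.0"',
]
print(count_ai_bot_hits(sample))
```

Against a real server you would feed it the log file, e.g. `count_ai_bot_hits(open("/var/log/nginx/access.log"))`.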
The Default Magento robots.txt Problem
Magento's default robots.txt starts with a wildcard:
```text
User-agent: *
Disallow: /index.php/
Disallow: /*?
Disallow: /checkout/
Disallow: /app/
...
```
This wildcard group applies to every bot that has no User-agent group of its own. If your deployment script, hosting provider, or a staging migration adds Disallow: / to it and the AI crawlers have no explicit groups, they are blocked silently, with no error anywhere.
Check this right now. Open yourstore.com/robots.txt and look for Disallow: / on its own line. If it exists without an explicit Allow: / override for each AI bot above it — those bots are blocked. This affects the majority of Magento stores checked by the AEO audit.
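If you'd rather script the check than eyeball the file, Python's standard-library robots.txt parser can report each bot's status. This is a minimal sketch: the `robots_txt` string here is a hardcoded worst case, and in practice you would paste in the live contents of yourstore.com/robots.txt:

```python
from urllib.robotparser import RobotFileParser

AI_BOTS = [
    "OAI-SearchBot", "GPTBot", "ChatGPT-User", "ClaudeBot",
    "Claude-SearchBot", "PerplexityBot", "Google-Extended",
]

# Illustrative robots.txt with a stray staging rule left in place.
robots_txt = """\
User-agent: *
Disallow: /
"""

rp = RobotFileParser()
rp.modified()   # mark the file as "read" so can_fetch() answers; read() would do this for us
rp.parse(robots_txt.splitlines())

for bot in AI_BOTS:
    verdict = "allowed" if rp.can_fetch(bot, "https://yourstore.com/") else "BLOCKED"
    print(f"{bot:18} {verdict}")
```

With only the wildcard Disallow present, all seven bots come back BLOCKED, which is exactly the silent failure described above.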
The Complete Fix — Exact robots.txt for Magento 2
In Magento admin go to Stores → Configuration → General → Design → Search Engine Robots → Edit Custom instruction of robots.txt. Paste this at the very top, before any other User-agent block:
```text
# ── AI Search Bots ── must be BEFORE the wildcard User-agent: * block
User-agent: OAI-SearchBot
Allow: /

User-agent: GPTBot
Allow: /

User-agent: ChatGPT-User
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: Claude-SearchBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

# ── Sitemaps — reference all structured files
Sitemap: https://yourstore.com/sitemap.xml
Sitemap: https://yourstore.com/llms.txt

# ── Default Magento rules below this line ──
User-agent: *
Disallow: /index.php/
Disallow: /*?
Disallow: /checkout/
...
```
⚠️ Order and grouping matter in practice. Under RFC 9309 a crawler uses the most specific User-agent group that matches its name and falls back to the wildcard only when no specific group exists, so a conforming parser honours the Allow: / groups wherever they appear. Not every AI crawler documents its parser, though; placing the explicit groups before the wildcard User-agent: * block costs nothing and guarantees the bots are never caught by a Disallow rule below.
Three Mistakes That Block AI Bots
Mistake 1 — "Disallow all" left on from staging
Many Magento stores use Disallow: / on staging environments to prevent Google indexing. This directive is often accidentally left in production after deployment or migration.
Common sources:

- Magento admin: Stores → Configuration → General → Design → Search Engine Robots
- Direct file edit: a manually edited `pub/robots.txt` that overrides Magento's generated version
- Deployment scripts: CI/CD pipelines that copy the staging robots.txt to production
- Hosting provider defaults: some managed Magento hosts (Hypernode, Nexcess) apply default restrictions

Quick check: run `curl -s https://yourstore.com/robots.txt | grep -n "Disallow: /"`. If it appears without corresponding `Allow: /` rules for the AI bots above it, you're blocking them.
Mistake 2 — Wildcard block placed before AI bot rules
```text
# ❌ RISKY — a naive first-match parser stops at the wildcard and
# never reaches the OAI-SearchBot group below it
User-agent: *
Disallow: /

User-agent: OAI-SearchBot
Allow: /   # a spec-compliant parser still honours this; a naive one never reads it

# ✅ SAFE — AI groups appear before the wildcard
User-agent: OAI-SearchBot
Allow: /

User-agent: *
Disallow: /checkout/
```

Per RFC 9309, a crawler selects the single most specific User-agent group that matches its name; the wildcard group applies only to bots with no specific group, so file order should not matter to a conforming parser. In practice the parsers behind the new AI crawlers are undocumented, and simplistic first-match implementations exist. Putting the specific Allow groups first removes the ambiguity for both kinds.
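The precedence behaviour is easy to observe with Python's standard-library parser, which implements the RFC's most-specific-group rule and resolves a bot's own group wherever it appears in the file. This is a sketch of that one parser's behaviour, not a guarantee about how every AI crawler parses:

```python
from urllib.robotparser import RobotFileParser

# Wildcard first, specific group second: a spec-style parser still
# prefers the specific group for OAI-SearchBot.
robots_txt = """\
User-agent: *
Disallow: /

User-agent: OAI-SearchBot
Allow: /
"""

rp = RobotFileParser()
rp.modified()   # mark the file as "read" so can_fetch() answers
rp.parse(robots_txt.splitlines())

print(rp.can_fetch("OAI-SearchBot", "https://yourstore.com/p"))  # specific group wins: True
print(rp.can_fetch("SomeOtherBot", "https://yourstore.com/p"))   # falls back to wildcard: False
```

Because you cannot verify that every AI crawler resolves groups this way, the explicit-groups-first layout remains the safe choice.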
Mistake 3 — robots.txt served from cache or static file, bypassing Magento
Some hosting configurations serve robots.txt directly from the filesystem or cache layer, completely bypassing Magento's admin configuration:

- Nginx/Varnish: a static file rule serves `pub/robots.txt` before Magento handles the request
- CDN caching: Cloudflare, Fastly, or other CDNs cache robots.txt with long TTLs
- Manual file overrides: a physical `pub/robots.txt` file that takes precedence over Magento's generated version

How to verify:

```shell
# Check response headers to see if Magento served the file
curl -I https://yourstore.com/robots.txt

# Look for X-Magento-Cache-Control or similar Magento headers
# If absent, the file is served statically
```
Fix: If Magento admin changes don't appear, check for `pub/robots.txt` and remove it, or update your web server config to route robots.txt through Magento. For Nginx, ensure this location block exists:

```nginx
location = /robots.txt {
    try_files $uri $uri/ /index.php$is_args$args;
}
```
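The header check can be scripted too. The sketch below classifies a set of response headers; `served_by_magento` and the exact header list are illustrative assumptions (the headers Magento emits vary by version, cache setup, and debug mode), not an official Magento contract:

```python
# Heuristic: headers that typically indicate the response passed through
# Magento's application layer rather than a static-file or CDN shortcut.
# Treat this list as an assumption and extend it for your own stack.
MAGENTO_HINTS = ("x-magento-cache-control", "x-magento-cache-debug", "x-magento-tags")

def served_by_magento(headers):
    """Return True if any Magento-specific header is present (case-insensitive)."""
    keys = {k.lower() for k in headers}
    return any(hint in keys for hint in MAGENTO_HINTS)

# Against the live store (commented out: requires network):
# import urllib.request
# resp = urllib.request.urlopen("https://yourstore.com/robots.txt")
# print(served_by_magento(resp.headers))

print(served_by_magento({"Content-Type": "text/plain", "X-Magento-Cache-Debug": "MISS"}))  # True
print(served_by_magento({"Content-Type": "text/plain", "Server": "nginx"}))                # False
```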
Verify the Fix
```shell
# Manual — check each bot is listed with Allow: /
curl -s https://yourstore.com/robots.txt | grep -A2 "OAI-SearchBot\|GPTBot\|ClaudeBot"

# Via AEO audit module — checks all 7 bots automatically
composer require angeo/module-aeo-audit
bin/magento setup:upgrade
bin/magento angeo:aeo:audit

# ✓ PASS  robots.txt — AI Bot Access
#   OAI-SearchBot ✓  GPTBot ✓  ChatGPT-User ✓
#   ClaudeBot ✓  Claude-SearchBot ✓
#   PerplexityBot ✓  Google-Extended ✓
```
The audit module not only checks if bots are allowed, but also validates the order of rules and warns if wildcards appear before AI bot directives.
After robots.txt — What's Next?
Fixing robots.txt is the gate. Once AI bots can enter, they need structured data to understand what you sell. The next critical signals:
1. llms.txt — A machine-readable index of your catalog that AI assistants can parse in seconds
2. Product schema — JSON-LD structured data that eliminates ambiguity in product attributes
3. AI-optimized product descriptions — Natural language content that answers "why", not just "what"
How It Fits the Angeo Suite
- angeo/module-aeo-audit — Diagnose: checks all 7 bots as signal #1, scores full AEO across 8 signals
- angeo/module-llms-txt — Signal #2: generates llms.txt once bots can enter your store
Fixing robots.txt is the fastest AEO win: 5 minutes, no Composer installs, immediate impact. Once the gate is open, every other AEO signal starts to count.
FAQ
Q: Does blocking GPTBot affect ChatGPT recommendations?
Yes, though the two bots matter in different ways. OpenAI states explicitly: "Sites that are opted out of OAI-SearchBot will not be shown in ChatGPT search answers." If OAI-SearchBot is blocked in robots.txt, by an explicit Disallow or by a wildcard rule, your store will not appear in ChatGPT search answers regardless of how well-optimised your content is. Blocking GPTBot additionally keeps your catalog out of the training and discovery data that feeds product recommendations over time.
Q: What is OAI-SearchBot and why does it matter?
OAI-SearchBot is OpenAI's crawler that powers ChatGPT real-time search results and product recommendations. It is different from GPTBot, which is used for training data. Both must be allowed in robots.txt. OAI-SearchBot determines whether your store appears in ChatGPT search answers; GPTBot affects long-term AI knowledge and product discovery.
Q: How do I check which AI bots are blocked in Magento 2?
Run `curl -s https://yourstore.com/robots.txt | grep -i 'searchbot\|gptbot\|claude\|perplexity'`. If nothing appears, the bots have no explicit groups and may be caught by a wildcard Disallow rule. The fastest check is angeo/module-aeo-audit: `composer require angeo/module-aeo-audit && bin/magento angeo:aeo:audit` checks all 7 AI bots and reports the exact status of each one.
Check all 7 AI bots and 7 other AEO signals in one command — free, MIT licensed:
- 🌐 Free Web Self-Assessment