DEV Community

Global Chat

Posted on • Originally published at global-chat.io

Building Honeypots for AI Bots: What Works and What Doesn't

We've been running experiments to understand what attracts AI crawlers to websites. Think of it as building a "honeypot" — a site designed to be maximally attractive to web crawlers like GPTBot, ClaudeBot, and Meta-ExternalAgent.

Here's what we learned after weeks of testing different approaches.
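To ground the numbers below: all of our measurements come from counting crawler requests in standard web server access logs by user-agent substring. Here is a minimal sketch of that approach — the log format and sample lines are illustrative, and the bot list only covers the crawlers named in this post:

```python
from collections import Counter

# Substrings that identify the AI crawlers we track, taken from their
# published user-agent strings; extend this list for other bots.
AI_BOTS = ["GPTBot", "ClaudeBot", "Meta-ExternalAgent"]

def count_bot_visits(log_lines):
    """Count requests per AI crawler across access-log lines."""
    counts = Counter()
    for line in log_lines:
        for bot in AI_BOTS:
            if bot in line:
                counts[bot] += 1
                break  # one bot per request line
    return counts

# Illustrative combined-format log lines (not real traffic).
sample = [
    '1.2.3.4 - - [01/May/2025] "GET / HTTP/1.1" 200 "-" "Mozilla/5.0 GPTBot/1.0"',
    '5.6.7.8 - - [01/May/2025] "GET /faq HTTP/1.1" 200 "-" "ClaudeBot/1.0"',
    '9.9.9.9 - - [01/May/2025] "GET / HTTP/1.1" 200 "-" "Mozilla/5.0 (regular user)"',
]
print(count_bot_visits(sample))  # Counter({'GPTBot': 1, 'ClaudeBot': 1})
```

User-agent matching is spoofable, so for anything beyond rough counts you'd also want to verify source IP ranges, but substring counts are enough to see the trends discussed here.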

Experiment 1: Schema.org Structured Data

We added comprehensive JSON-LD markup to every page — WebSite, Organization, FAQPage, and Article schemas.
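For reference, here is a trimmed example of the kind of JSON-LD we mean — the question text is a placeholder, and a real FAQPage block would list every Q&A pair on the page:

```html
<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is this site?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "A site built to test AI bot capabilities and track crawler behavior."
    }
  }]
}
</script>
```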

Result: Googlebot crawl frequency increased from every 3 days to daily within a week. GPTBot showed no measurable change.

Takeaway: Googlebot prioritizes structured data signals. AI training crawlers seem to care more about content volume and quality than markup.

Experiment 2: llms.txt

The proposed llms.txt standard is a markdown file at your site root that tells AI models what your site is about and what content is available. We added a comprehensive llms.txt file describing our site structure and content.
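Per the proposed spec, the file starts with an H1 title, a blockquote summary, and then sections of annotated links. A minimal sketch (the link path here is illustrative, not our actual structure):

```markdown
# Global Chat

> A site built to test AI bot capabilities and track crawler behavior.

## Docs

- [Bot detection](https://global-chat.io/bot-detection): how we identify AI crawlers
```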

Result: Too early to measure definitive impact. The standard is relatively new, and it's unclear how many crawlers actively check for it yet. We're continuing to monitor.

Experiment 3: Content Volume vs Quality

We tested two approaches side by side:

  • 50 thin glossary pages (~200 words each)
  • 4 deep comparison articles (~1,500 words each)

The deep articles attracted 3x more repeat bot visits. Bots that found the comparison articles crawled deeper into the site (3-4 pages per session vs 1-2 for glossary pages).

Quality wins over quantity for AI crawlers.

Experiment 4: External Signals

This was the most surprising finding. The strongest trigger for AI crawler visits wasn't anything on-site — it was external backlinks.

Posting links on platforms like Dev.to and social media drove bot visits within hours. Not just from users clicking, but from bots that monitor these platforms for new URLs to crawl.

In short, external signals were the single strongest trigger for AI crawler visits we observed — stronger than any on-site change we tested.

What Doesn't Work

  • Hidden links (honeypot traps): Get indexed but don't attract more bots
  • Keyword stuffing: Ignored by AI crawlers — they're not search engines
  • Auto-generated thin content: Gets crawled once and never revisited

The bots turned out to be better judges of content quality than we expected.

Our Recommendations

If you want AI bots to crawl your site:

  1. Create genuinely useful content — quality over quantity
  2. Add Schema.org markup — especially for Googlebot
  3. Maintain an up-to-date sitemap — make discovery easy
  4. Add llms.txt — future-proofing for AI crawlers
  5. Get external links from platforms bots monitor
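For item 3, a minimal sitemap.xml is all that's needed to make discovery easy — the URL and date below are placeholders:

```xml
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url>
    <loc>https://global-chat.io/</loc>
    <lastmod>2025-01-01</lastmod>
  </url>
</urlset>
```

Reference it from robots.txt with a `Sitemap:` line so crawlers that respect robots.txt find it without guessing.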

The bottom line: quality content plus external signals matter far more than on-site tricks.


We built Global Chat specifically to test AI bot capabilities and track crawler behavior. Check out our full technical deep-dive on bot detection for more details on the infrastructure behind these experiments.
