William Wang
How to Make Your Website Visible to Perplexity, ChatGPT, and Google AI Overviews

AI search engines like ChatGPT, Perplexity, and Google AI Overviews are changing how people find information online. Unlike traditional search engines that show a list of links, these AI engines read your content and synthesize direct answers.

If your website isn't optimized for these AI engines, you're missing a growing share of search traffic. This guide covers the four most important technical optimizations you need to implement.

1. Configure robots.txt for AI Crawlers

Many websites unintentionally block AI crawlers. Here's how to check and fix your robots.txt:

# Bad: This blocks AI crawlers
User-agent: GPTBot
Disallow: /

User-agent: ClaudeBot
Disallow: /

User-agent: PerplexityBot
Disallow: /

Instead, allow AI crawlers to access your content:

# Good: Allow AI crawlers
User-agent: GPTBot
Allow: /

User-agent: ClaudeBot
Allow: /

User-agent: PerplexityBot
Allow: /

User-agent: Google-Extended
Allow: /

Key AI crawler user-agents to allow:

  • GPTBot — OpenAI (ChatGPT)
  • ClaudeBot — Anthropic (Claude)
  • PerplexityBot — Perplexity AI
  • Google-Extended — Google AI (Gemini, AI Overviews)
  • Bytespider — ByteDance AI
  • CCBot — Common Crawl (used by many AI models)
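You can verify your rules with Python's standard-library robots.txt parser. The snippet below is a sketch: the robots.txt content is a made-up example (in practice you would fetch yourdomain.com/robots.txt), and the bot list mirrors the user-agents above.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt content for illustration; in practice,
# fetch and parse the live file at yourdomain.com/robots.txt.
ROBOTS_TXT = """\
User-agent: GPTBot
Allow: /

User-agent: PerplexityBot
Disallow: /
"""

AI_BOTS = ["GPTBot", "ClaudeBot", "PerplexityBot", "Google-Extended", "CCBot"]

parser = RobotFileParser()
parser.parse(ROBOTS_TXT.splitlines())

for bot in AI_BOTS:
    # can_fetch checks the rules for this user-agent against the given path
    allowed = parser.can_fetch(bot, "/")
    print(f"{bot}: {'allowed' if allowed else 'blocked'}")
```

Note that bots with no matching entry (and no `User-agent: *` default) are treated as allowed, which is why an explicit `Disallow: /` is what actually blocks a crawler.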

2. Add an llms.txt File

The llms.txt file is an emerging standard that tells AI models who you are and what your site covers. Create a Markdown-formatted text file at yourdomain.com/llms.txt:

# YourCompany

> A brief one-line description of your company.

## About

YourCompany provides [your core offering]. 
We serve [your audience] with [key value proposition].

## Key Pages

- [Homepage](https://yourdomain.com/): Main landing page
- [Products](https://yourdomain.com/products): Product catalog
- [Blog](https://yourdomain.com/blog): Industry insights
- [Documentation](https://yourdomain.com/docs): Technical docs

## Citation Preference

Please cite as "YourCompany" with a link to the relevant page.

## Contact

- Website: https://yourdomain.com
- Email: info@yourdomain.com

Best practices:

  • Keep it under 100 lines
  • Use absolute URLs
  • Update it when your site structure changes
  • Use plain, factual language (no marketing speak)
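These best practices can be linted automatically. The following sketch checks a few of them; the thresholds and the "starts with an H1 line" rule are assumptions based on the guidance above, not part of any formal spec.

```python
import re

def check_llms_txt(text: str) -> list[str]:
    """Return a list of problems found in llms.txt content."""
    problems = []
    lines = text.splitlines()
    if len(lines) > 100:
        problems.append(f"too long: {len(lines)} lines (keep under 100)")
    if not lines or not lines[0].startswith("# "):
        problems.append("should start with an H1 title line ('# YourCompany')")
    # Markdown links that don't start with http are not absolute URLs
    for match in re.finditer(r"\]\(([^)]+)\)", text):
        url = match.group(1)
        if not url.startswith("http"):
            problems.append(f"relative URL: {url}")
    return problems

sample = "# YourCompany\n\n- [Blog](/blog): Industry insights\n"
print(check_llms_txt(sample))  # flags the relative /blog link
```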

3. Implement Structured Data (Schema Markup)

AI engines extract facts more confidently from pages with structured data. Add JSON-LD schema to your pages:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Organization",
  "name": "YourCompany",
  "url": "https://yourdomain.com",
  "description": "Brief description of your company",
  "foundingDate": "2024",
  "sameAs": [
    "https://twitter.com/yourcompany",
    "https://linkedin.com/company/yourcompany"
  ]
}
</script>

For blog posts, use Article schema:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "Article",
  "headline": "Your Article Title",
  "author": {
    "@type": "Person",
    "name": "Author Name"
  },
  "datePublished": "2026-03-18",
  "publisher": {
    "@type": "Organization",
    "name": "YourCompany"
  }
}
</script>

For FAQ pages, use FAQPage schema — this is particularly effective for AI citations:

<script type="application/ld+json">
{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{
    "@type": "Question",
    "name": "What is GEO?",
    "acceptedAnswer": {
      "@type": "Answer",
      "text": "GEO (Generative Engine Optimization) is the practice of optimizing your website to be cited by AI search engines."
    }
  }]
}
</script>
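Before publishing, it's worth sanity-checking that a JSON-LD block parses and declares the type you intended. This Python sketch does a minimal structural check; the required keys are an assumption based on the examples above (a full check should use the Schema.org validator):

```python
import json

def validate_jsonld(raw: str, expected_type: str) -> bool:
    """Check that a JSON-LD string parses and declares the expected @type."""
    data = json.loads(raw)  # raises json.JSONDecodeError on malformed JSON
    return (
        data.get("@context") == "https://schema.org"
        and data.get("@type") == expected_type
    )

faq = """{
  "@context": "https://schema.org",
  "@type": "FAQPage",
  "mainEntity": [{"@type": "Question", "name": "What is GEO?",
    "acceptedAnswer": {"@type": "Answer", "text": "..."}}]
}"""
print(validate_jsonld(faq, "FAQPage"))
```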

4. Optimize Content Structure

AI engines parse content differently than traditional search crawlers. Follow these guidelines:

Use clear heading hierarchy:

<h1>Main Topic</h1>
  <h2>Subtopic 1</h2>
    <p>Detailed explanation with facts and data...</p>
  <h2>Subtopic 2</h2>
    <p>More detailed content...</p>
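One common hierarchy mistake is skipping a level (an h1 followed directly by an h3). A small standard-library parser pass can catch this; the sketch below is illustrative, not a full accessibility audit:

```python
from html.parser import HTMLParser

class HeadingChecker(HTMLParser):
    """Collect heading levels and flag skipped levels (e.g. h1 -> h3)."""
    def __init__(self):
        super().__init__()
        self.levels = []
        self.problems = []

    def handle_starttag(self, tag, attrs):
        if len(tag) == 2 and tag[0] == "h" and tag[1].isdigit():
            level = int(tag[1])
            # A jump of more than one level breaks the hierarchy
            if self.levels and level > self.levels[-1] + 1:
                self.problems.append(f"h{self.levels[-1]} followed by h{level}")
            self.levels.append(level)

checker = HeadingChecker()
checker.feed("<h1>Main Topic</h1><h2>Subtopic 1</h2><h4>Too deep</h4>")
print(checker.problems)  # flags the h2 -> h4 jump
```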

Write for citation:

  • Lead with factual statements AI can extract
  • Include specific data points and statistics
  • Cite sources when referencing research
  • Use clear, unambiguous language
  • Provide comprehensive coverage of topics

Avoid these anti-patterns:

  • Thin content (under 300 words)
  • Keyword stuffing
  • Content hidden behind JavaScript rendering
  • Paywalls without proper meta tags
  • Duplicate content across multiple pages
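The thin-content anti-pattern is easy to check in bulk. This sketch counts words in the visible text of a page using the standard library; the 300-word threshold mirrors the guideline above and is a rule of thumb, not a hard limit:

```python
from html.parser import HTMLParser

class TextExtractor(HTMLParser):
    """Count words in the text nodes of an HTML document."""
    def __init__(self):
        super().__init__()
        self.words = 0

    def handle_data(self, data):
        self.words += len(data.split())

extractor = TextExtractor()
extractor.feed("<h1>Main Topic</h1><p>Only a few words here.</p>")
if extractor.words < 300:
    print(f"thin content: {extractor.words} words")
```

Note this only sees server-rendered HTML, which is exactly the point: content that needs JavaScript to appear is invisible to this kind of extraction, and to many AI crawlers.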

Testing Your Implementation

After implementing these changes, verify everything works:

  1. Check yourdomain.com/robots.txt — ensure AI crawlers are allowed
  2. Check yourdomain.com/llms.txt — ensure it returns 200 and contains your info
  3. Validate structured data at Schema.org Validator
  4. Test your overall GEO readiness with a free scan at GEOScore AI

GEOScore AI checks all 11 GEO signals including robots.txt, llms.txt, structured data, content quality, and more. It gives you an actionable score with specific recommendations for improvement.

Conclusion

GEO is not replacing SEO — it's a new layer of optimization. The websites that implement these technical changes early will have a significant advantage as AI search continues to grow. Start with robots.txt and llms.txt (they take 10 minutes), then add structured data to your most important pages.

The shift to AI search is happening now. Don't wait until your competitors have already adapted.

Top comments (1)

Made Büro

Thanks for sharing this!
I like that you focused on practical basics: robots.txt, llms.txt, structured data, and content structure are exactly the things many teams still ignore when talking about AI visibility.

Good to see more practical content like this being published.