The rise of AI-assisted development has democratized coding like never before. Anyone can spin up a SaaS, build a landing page, or create a web app by prompting their way to a working product. But here's the uncomfortable truth: most of these projects are SEO disasters waiting to happen.
LLMs don't inherently understand SEO. They generate code that works, not code that ranks. And vibecoders (developers who ship by feel rather than fundamentals) often lack the technical SEO knowledge to catch these issues before they tank their organic traffic.
After analyzing countless AI-generated codebases and vibe-coded projects, these are the 17 SEO mistakes I see most often.
1. Client-side rendering without SSR or SSG
This is the big one. LLMs default to whatever framework is most popular in their training data, which often means React SPAs with client-side rendering.
// What the LLM generates
function BlogPost({ slug }) {
  const [post, setPost] = useState(null);

  useEffect(() => {
    fetch(`/api/posts/${slug}`)
      .then(res => res.json())
      .then(setPost);
  }, [slug]);

  return <article>{post?.content}</article>;
}
Googlebot will see an empty <article> tag. Your content doesn't exist until JavaScript executes, and while Google can render JavaScript, it's slow, unreliable, and puts you at a significant disadvantage.
The fix: Use Next.js (Server Components in the App Router, or getStaticProps/getServerSideProps in the Pages Router), Nuxt with SSR enabled, or Astro for content-heavy sites. If you must ship a SPA, implement pre-rendering or dynamic rendering for crawlers.
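For example, a minimal Pages Router sketch of the same blog post rendered at build time (getPost and getAllSlugs are hypothetical data helpers standing in for your CMS or database):
// pages/blog/[slug].js — the article is rendered to HTML at build time
import { getPost, getAllSlugs } from '../../lib/posts'; // hypothetical helpers

export async function getStaticPaths() {
  const slugs = await getAllSlugs();
  return {
    paths: slugs.map(slug => ({ params: { slug } })),
    fallback: 'blocking', // new posts render on first request, then get cached
  };
}

export async function getStaticProps({ params }) {
  const post = await getPost(params.slug);
  return { props: { post }, revalidate: 3600 }; // refresh at most hourly
}

export default function BlogPost({ post }) {
  // Crawlers receive the full article in the initial HTML response
  return <article>{post.content}</article>;
}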
2. Hash-based or query parameter routing
LLMs often generate routing patterns that are technically functional but SEO-hostile:
// Terrible for SEO
yoursite.com/#/blog/my-post
yoursite.com/?page=blog&id=123
// What you actually need
yoursite.com/blog/my-post
Hash fragments (#) are completely ignored by search engines. Query parameters create duplicate content issues and look spammy to users.
The fix: Always use clean, semantic URL paths. Configure your framework's router for history-based navigation, not hash-based.
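With React Router, for instance, the difference is just which router you mount; a minimal sketch assuming React Router v6:
import { BrowserRouter, Routes, Route } from 'react-router-dom';
import BlogPost from './BlogPost'; // hypothetical page component

// BrowserRouter uses the History API, so URLs look like /blog/my-post.
// HashRouter would produce /#/blog/my-post, which crawlers collapse into one page.
export default function App() {
  return (
    <BrowserRouter>
      <Routes>
        <Route path="/blog/:slug" element={<BlogPost />} />
      </Routes>
    </BrowserRouter>
  );
}
Remember that history-based routing needs the server to return your index.html (or a server-rendered page) for every path, not just for /.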
3. Auto-generated slugs without human review
When LLMs generate content management systems, they typically create slugs from titles automatically:
const slug = title.toLowerCase().replace(/\s+/g, '-');
// "10 Best Ways to Optimize Your Website!!!" becomes
// "10-best-ways-to-optimize-your-website!!!"
This produces slugs with special characters, excessive length, and no keyword optimization. Worse, if you change a title, many systems regenerate the slug, breaking existing links without redirects.
The fix: Generate slugs as suggestions, but require human approval. Strip special characters, limit length to 60 characters, and implement automatic 301 redirects when slugs change.
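A minimal slugify sketch along those lines (the 60-character cap mirrors the advice above; treat the output as a suggestion, not the final slug):
function slugify(title, maxLength = 60) {
  return title
    .toLowerCase()
    .normalize('NFKD')                // split accented characters into base + mark
    .replace(/[\u0300-\u036f]/g, '')  // drop the combining marks
    .replace(/[^a-z0-9\s-]/g, '')     // strip special characters
    .trim()
    .replace(/\s+/g, '-')             // spaces to hyphens
    .replace(/-+/g, '-')              // collapse repeated hyphens
    .slice(0, maxLength)
    .replace(/-+$/, '');              // no trailing hyphen after truncation
}

// slugify("10 Best Ways to Optimize Your Website!!!")
// => "10-best-ways-to-optimize-your-website"
Store the slug as its own editable field, and when it does change, write the 301 redirect from the old URL at the same time.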
4. Missing or duplicate meta tags
Ask an LLM to build you a blog and you'll often get pages with:
- No meta description at all
- The same title tag on every page
- Titles that exceed 60 characters and get truncated
- Meta descriptions that are either missing or auto-truncated content
<!-- What you get -->
<title>My Blog</title>
<!-- What you need -->
<title>How to fix Core Web Vitals issues in 2024 | Your Brand</title>
<meta name="description" content="Learn the 7 most effective techniques to improve LCP, CLS, and INP scores. Includes code examples and before/after case studies.">
The fix: Build meta tag management into your content model from day one. Every page needs a unique, optimized title (50-60 chars) and description (150-160 chars).
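In Next.js's App Router, for instance, titles and descriptions can come straight from the content model via generateMetadata; a sketch where getPost and the post fields are assumptions about your schema:
// app/blog/[slug]/page.js
import { getPost } from '../../../lib/posts'; // hypothetical data helper

export async function generateMetadata({ params }) {
  const post = await getPost(params.slug);
  return {
    title: `${post.title} | Your Brand`, // aim for 50-60 characters in total
    description: post.metaDescription,   // hand-written, 150-160 characters
  };
}

export default async function Page({ params }) {
  const post = await getPost(params.slug);
  return <article>{post.content}</article>;
}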
5. No canonical URLs
Duplicate content is the silent killer of SEO. LLMs rarely implement canonical tags, which means:
yoursite.com/blog/post
yoursite.com/blog/post/
yoursite.com/blog/post?utm_source=twitter
www.yoursite.com/blog/post
All compete against each other, diluting your ranking signals.
<link rel="canonical" href="https://yoursite.com/blog/post">
The fix: Implement canonical tags on every page. Pick one URL format (with or without trailing slash) and stick to it. Configure your server to redirect all variations to the canonical version.
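In Next.js the canonical can live in the same metadata object; a sketch where yoursite.com is a placeholder:
// app/blog/[slug]/page.js
export async function generateMetadata({ params }) {
  return {
    metadataBase: new URL('https://yoursite.com'),
    alternates: {
      canonical: `/blog/${params.slug}`, // resolved against metadataBase
    },
  };
}
In practice you'd merge this with the title and description metadata from mistake #4; the www and trailing-slash redirects still belong at the server or host level.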
6. Completely ignoring internal linking
LLMs generate isolated pages. They don't understand your content architecture or how pages should relate to each other. You end up with blog posts that link nowhere, category pages that don't link to their children, and pillar content that doesn't establish topical authority.
The fix: Design your internal linking architecture deliberately. Every piece of content should link to 3-5 related pieces. Important pages should receive more internal links. Use descriptive anchor text, not "click here."
7. Invalid or incorrect schema markup
When LLMs attempt structured data, they often produce schema that is:
- Syntactically invalid JSON-LD
- Using deprecated schema types
- Missing required properties
- Semantically incorrect (marking a blog post as a Product)
// LLM-generated mess
{
  "@type": "Article",                 // Missing "@context"
  "author": "John Doe"                // Wrong: should be a Person object (note the missing comma, too)
  "datePublished": "January 5, 2024"  // Wrong: needs ISO 8601 format
}
The fix: Validate all schema markup with Google's Rich Results Test. Use the correct types: BlogPosting for blog posts, NewsArticle for news, Product for products. Include all required and recommended properties.
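For comparison, a valid BlogPosting sketch (names, dates, and URLs below are placeholders):
// What valid JSON-LD looks like
{
  "@context": "https://schema.org",
  "@type": "BlogPosting",
  "headline": "How to fix Core Web Vitals issues in 2024",
  "author": {
    "@type": "Person",
    "name": "John Doe"
  },
  "datePublished": "2024-01-05T08:00:00+00:00",
  "dateModified": "2024-02-01T09:30:00+00:00",
  "image": "https://yoursite.com/images/core-web-vitals-guide.jpg",
  "publisher": {
    "@type": "Organization",
    "name": "Your Brand",
    "logo": { "@type": "ImageObject", "url": "https://yoursite.com/logo.png" }
  }
}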
8. Hallucinated facts and statistics
This is an LLM-specific problem that creates both credibility and potential legal issues. LLMs confidently generate statistics, quotes, and "studies" that don't exist:
"According to a 2023 Stanford study, 73% of websites with proper schema markup see a 45% increase in click-through rates."
That study doesn't exist. That statistic was invented. And when your content is full of hallucinated facts, it destroys E-E-A-T signals and can get you penalized for misinformation.
The fix: Fact-check every statistic, quote, and claim in AI-generated content. Link to primary sources. Remove anything you can't verify.
9. No robots.txt or sitemap.xml
LLMs build features, not infrastructure. They won't remind you that search engines need a roadmap to your site.
# The robots.txt you need
User-agent: *
Disallow: /admin/
Disallow: /api/
Sitemap: https://yoursite.com/sitemap.xml
Without a sitemap, Google has to discover pages through crawling alone, which may never find your deeper pages. Without robots.txt, you might be letting bots crawl your API endpoints and admin panels.
The fix: Generate a dynamic sitemap.xml that updates when content changes. Include lastmod dates. Create a robots.txt that guides crawlers to what matters.
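In Next.js's App Router this can be a single file that regenerates as content changes; a minimal sketch assuming a hypothetical getAllPosts helper (app/robots.js works the same way, or keep a static robots.txt in public/):
// app/sitemap.js — served at /sitemap.xml
import { getAllPosts } from '../lib/posts'; // hypothetical data helper

export default async function sitemap() {
  const posts = await getAllPosts();
  return [
    { url: 'https://yoursite.com', lastModified: new Date() },
    ...posts.map(post => ({
      url: `https://yoursite.com/blog/${post.slug}`,
      lastModified: post.updatedAt, // lastmod straight from the content record
    })),
  ];
}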
10. Images without alt text or optimization
AI-generated code typically handles images like this:
<img src={post.image} />
No alt text. No width/height (causing CLS). No lazy loading. Massive unoptimized files. No next-gen formats.
The fix: Every image needs descriptive alt text for accessibility and image search. Specify dimensions to prevent layout shift. Use WebP/AVIF formats. Implement lazy loading for below-the-fold images.
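A fixed version of the same tag (post.imageAlt and the dimensions are assumptions about your data; next/image or a similar tool can automate resizing and format conversion):
function PostImage({ post }) {
  return (
    <img
      src={post.image}
      alt={post.imageAlt}   // descriptive text stored alongside the image
      width={1200}          // intrinsic dimensions reserve space and prevent CLS
      height={630}
      loading="lazy"        // defer below-the-fold images; omit on the LCP hero
      decoding="async"
    />
  );
}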
11. Broken heading hierarchy
LLMs choose heading levels based on visual size, not document structure:
<h3>Main Page Title</h3> <!-- Should be h1 -->
<h1>A Section Header</h1> <!-- Should be h2 -->
<h4>Subsection</h4> <!-- Should be h3 -->
Or worse, multiple H1 tags on a single page because the developer wanted multiple "big text" elements.
The fix: Every page gets exactly one H1. Headings follow logical order: H1 → H2 → H3. Never skip levels. Use CSS for styling, not heading tags.
12. Ignoring Core Web Vitals
Vibecoders ship features. Core Web Vitals are an afterthought, if they're thought of at all. Common issues:
- LCP (Largest Contentful Paint): Hero images that take 8 seconds to load because nobody optimized them
- CLS (Cumulative Layout Shift): Ads, images, and fonts that shift content as they load
- INP (Interaction to Next Paint): JavaScript bundles so large that clicks take 500ms to register
The fix: Test with PageSpeed Insights before shipping. Lazy load below-the-fold content. Optimize your critical rendering path. Reserve space for dynamic content.
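For the LCP item specifically, the usual pattern is to load the hero image eagerly with a high fetch priority while still reserving its space; a sketch assuming next/image (a plain <img> with fetchpriority="high" and explicit dimensions achieves the same thing):
import Image from 'next/image';

export default function Hero() {
  // priority: eager load with high fetch priority, since this is the LCP element;
  // explicit width/height reserve the space so nothing shifts when it arrives
  return (
    <Image
      src="/hero.jpg"
      alt="Product dashboard overview"
      width={1280}
      height={640}
      priority
    />
  );
}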
13. JavaScript-dependent critical content
Beyond the CSR problem, LLMs often put critical content behind JavaScript interactions:
// Your important content is hidden until the user clicks
<Accordion title="Product Features">
  <p>All your keyword-rich content here</p>
</Accordion>
Content inside collapsed accordions, tabs, or "read more" sections may be deprioritized or ignored by search engines.
The fix: Important content should be visible in the initial HTML. If you must use interactive elements, ensure the content is in the DOM on page load, just visually hidden.
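A native <details> element is one way to keep the interaction while shipping the text in the initial HTML; a JSX sketch:
// The feature copy is in the server-rendered HTML; the browser handles
// expand/collapse natively, no JavaScript required.
function ProductFeatures() {
  return (
    <details>
      <summary>Product Features</summary>
      <p>All your keyword-rich content here</p>
    </details>
  );
}
If you keep a custom Accordion, render the children on the server and toggle their visibility with CSS instead of conditionally mounting them on click.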
14. No mobile optimization
LLMs are trained on desktop-centric code. Mobile is an afterthought:
- Fixed widths instead of responsive layouts
- Tiny tap targets
- Horizontal scrolling on mobile
- Text too small to read without zooming
Google uses mobile-first indexing. If your mobile experience is broken, your rankings suffer.
The fix: Design mobile-first. Test on real devices. Use responsive images. Ensure tap targets are at least 48x48px.
15. Missing or wrong hreflang tags
When LLMs build multilingual sites, they either ignore hreflang entirely or implement it incorrectly:
<!-- Common mistakes -->
<link rel="alternate" hreflang="english" href="..." /> <!-- Wrong: should be "en" -->
<link rel="alternate" hreflang="en" href="..." /> <!-- Missing: x-default -->
<!-- Also missing: the return links on the other language versions -->
The fix: Use proper language codes (en, en-US, de-DE). Always include x-default. Ensure every page in the hreflang set references all other pages, including itself.
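A complete set for a two-language page looks like this (URLs are placeholders); note that it references the current page itself and includes an x-default, and the German version carries the same three tags pointing back:
<!-- On https://yoursite.com/en/pricing -->
<link rel="alternate" hreflang="en" href="https://yoursite.com/en/pricing" />
<link rel="alternate" hreflang="de-DE" href="https://yoursite.com/de/pricing" />
<link rel="alternate" hreflang="x-default" href="https://yoursite.com/en/pricing" />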
16. Pagination done wrong
LLMs generate infinite scroll because it's trendy:
// Infinite scroll that search engines can't follow
<InfiniteScroll loadMore={fetchNextPage}>
  {posts.map(post => <PostCard {...post} />)}
</InfiniteScroll>
Or they create paginated content without crawlable links between pages. (Google no longer uses rel="next"/rel="prev" as an indexing signal, so what crawlers need are ordinary links they can follow.)
<!-- What's missing -->
<a href="/blog?page=2">Next page</a>
<a href="/blog">Previous page</a>
The fix: Provide crawlable pagination with static links. Consider a "load more" button that appends to existing content rather than replacing it. Ensure all pages are accessible via links, not just JavaScript.
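One pattern that satisfies both users and crawlers is a "load more" control that is a real link underneath; a sketch where fetchPage and PostCard are hypothetical:
import { useState } from 'react';
import PostCard from './PostCard';      // hypothetical card component
import { fetchPage } from '../lib/api'; // hypothetical API helper

export default function PostList({ initialPosts }) {
  const [posts, setPosts] = useState(initialPosts);
  const [page, setPage] = useState(1);

  async function loadMore(event) {
    event.preventDefault();              // with JS available: append in place
    const nextPosts = await fetchPage(page + 1);
    setPosts(prev => [...prev, ...nextPosts]);
    setPage(page + 1);
  }

  return (
    <>
      {posts.map(post => <PostCard key={post.slug} {...post} />)}
      {/* Without JS (and for crawlers) this is an ordinary, followable link,
          so /blog?page=2 must also exist as a server-rendered page */}
      <a href={`/blog?page=${page + 1}`} onClick={loadMore}>Load more</a>
    </>
  );
}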
17. Zero consideration for page speed
The default LLM stack is bloated:
- Import the entire Lodash library for one function
- Include three animation libraries
- Bundle fonts you're not using
- No code splitting
- No tree shaking
- Synchronous third-party scripts blocking render
// What LLMs generate
import _ from 'lodash';
const sorted = _.sortBy(items, 'date');
// What you need
import sortBy from 'lodash/sortBy';
// Or just: items.sort((a, b) => a.date - b.date);
The fix: Audit your bundle size regularly. Use dynamic imports for heavy components. Lazy load third-party scripts. Question every dependency.
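Code splitting is usually one dynamic import away; a sketch assuming a heavy, client-only chart component at a hypothetical path:
import dynamic from 'next/dynamic';

// The chart library gets its own chunk and is only downloaded when this
// component renders; it never blocks the initial HTML or hydration.
const SalesChart = dynamic(() => import('../components/SalesChart'), {
  ssr: false,
  loading: () => <p>Loading chart…</p>,
});

export default function Dashboard() {
  return <SalesChart />;
}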
The root cause
These mistakes share a common origin: LLMs optimize for "does it work?" not "will it rank?"
SEO isn't a feature you bolt on later. It's architectural. By the time you realize your React SPA isn't indexing properly, you're looking at a significant rewrite, not a quick fix.
The vibecoders shipping MVPs without SEO fundamentals are building on sand. They'll get traffic from Product Hunt and Hacker News, wonder why organic never materializes, and blame "SEO takes time" rather than examining their technical foundation.
The solution
If you're using AI to build web projects:
- Specify SEO requirements upfront. Tell the LLM you need SSR, semantic URLs, and proper meta tags before it generates code.
- Use SEO-first frameworks. Next.js, Nuxt, Astro, and SvelteKit have good defaults. Vanilla React SPAs don't.
- Audit before launch. Run Lighthouse, check your rendered HTML, validate your schema, test your mobile experience.
- Monitor continuously. Set up Google Search Console. Track your Core Web Vitals. Watch for indexing issues.
The bar for "working websites" is low. The bar for "websites that rank" or "websites that show up in LLMs" is much higher. Know the difference.