TEKI BHAVANI SHANKAR

I Built an SEO Tool — Google Ignored It Until I Fixed This

CrawlIQ Team · 12 min read · Technical SEO + JavaScript

You deployed your React app. Everything looks perfect in the browser. Then you check Google Search Console three months later and half your pages aren't indexed. You Google "Googlebot JavaScript" and get 40 Medium articles all saying the same wrong thing: "Googlebot can now render JavaScript." That statement is technically true but practically useless. Here's what actually happens.

The two-wave system nobody explains properly
Googlebot doesn't crawl and render simultaneously. It runs in two completely separate waves, and the gap between them can be anywhere from a few hours to several days.

Wave 1: Googlebot fetches your URL, gets the raw HTML response, and indexes whatever is in that HTML string. For a React/Vue/Angular SPA, that's usually a shell: an empty root <div> and a bunch of script tags. No content. Googlebot indexes nothing meaningful.

Wave 2: At some point later, the WRS (Web Rendering Service) — Google's headless Chromium layer — executes your JavaScript, renders the DOM, and re-indexes the result. This is the part that can be delayed, rate-limited, or skipped entirely on low-priority pages.

Key insight

Google's rendering queue is a shared resource across the entire web. Your JS-heavy app competes for rendering slots with billions of other pages. Low crawl budget → low rendering priority → JS content potentially never indexed.

What Googlebot's HTTP headers actually look like
One thing that immediately reveals how Googlebot behaves: compare what it sends with what a regular browser sends. Start by pulling Googlebot's hits out of your server logs:

nginx access log filter

# Filter Googlebot hits from nginx logs
# (combined log format: $1 = client IP, $7 = request path, $9 = HTTP status)
grep "Googlebot" /var/log/nginx/access.log | \
  awk '{print $1, $7, $9}' | \
  head -30

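One caveat before trusting those log lines: anything can put "Googlebot" in its User-Agent string. Google's documented verification is a reverse DNS lookup (the hostname should end in googlebot.com or google.com) followed by a forward lookup that must resolve back to the same IP. A minimal sketch using only Python's standard library; the sample IP is just an illustration:

python — verify a Googlebot IP via reverse DNS

import socket

def is_verified_googlebot(ip: str) -> bool:
    """Google's documented check: reverse DNS, then forward-confirm."""
    try:
        host, _, _ = socket.gethostbyaddr(ip)  # e.g. crawl-66-249-66-1.googlebot.com
        if not host.endswith((".googlebot.com", ".google.com")):
            return False
        # The hostname must resolve back to the original IP
        return ip in socket.gethostbyname_ex(host)[2]
    except (socket.herror, socket.gaierror):
        return False  # no DNS record: not Googlebot

print(is_verified_googlebot("66.249.66.1"))  # sample IP from Google's published ranges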

The five JavaScript patterns that silently kill indexing

  1. Content behind setTimeout/Promise chains
    The WRS has a finite rendering timeout. If your content loads inside a Promise that resolves after a network call, and that network call takes longer than the render budget, the content never makes it into the indexed DOM.

javascript — what kills indexing

// ❌ INVISIBLE TO WAVE 2 if API is slow
useEffect(() => {
  fetch('/api/content')
    .then(res => res.json())
    .then(data => setContent(data)) // renders after WRS timeout?
}, []);

// ✅ Server-side render the initial content
// Next.js getServerSideProps / getStaticProps
export async function getStaticProps() {
  const content = await fetchContent();
  return { props: { content } };
}
  2. Infinite scroll without static pagination
    Classic trap. Your product listing page loads 20 items, then fetches more on scroll. Googlebot doesn't scroll. It gets 20 items indexed. The other 980 products: invisible. The fix is discrete paginated URLs (?page=2, ?page=3) exposed as plain <a href> links Googlebot can crawl — note that Google stopped using rel="next"/"prev" as an indexing signal back in 2019, so the links themselves have to be crawlable. A quick way to audit this is sketched after this list.

  3. Client-side routing without SSR/SSG
    React Router, Vue Router — when you navigate between routes in a SPA, there's no actual HTTP request. The URL changes via the History API, the DOM updates. But Googlebot fetches each URL as a fresh HTTP request. If your server only returns index.html for every route (which most SPAs do), every page looks identical in Wave 1, and Googlebot may deduplicate them as the same content.

  4. Lazy-loaded images without a crawlable fallback
    Googlebot doesn't scroll, so images loaded by custom JavaScript that fires on scroll events never load during rendering. Native loading="lazy" and IntersectionObserver-based loaders fare better in the WRS, but below-fold content still gets missed in Wave 2 often enough that hero and product images belong in the initial HTML with a real src attribute. (The audit sketch after this list checks for this pattern too.)

  5. Conditional rendering based on browser APIs
    Code like if (typeof window !== 'undefined') gates content behind a browser check. The WRS is a headless browser, so the check passes in Wave 2. But anything gated this way is absent from your server-rendered HTML, which means Wave 1 never sees it — and if Wave 2 is delayed or skipped, you get partial pages in the index.

How to actually audit this — not just guess
The standard advice is "use Google's URL Inspection tool." That's Wave 2 on demand — it tells you what Google's renderer sees, which is useful. But it doesn't tell you: how many of your pages actually go through Wave 2, how long the delay is, or what the Wave 1 view looks like.

For a proper audit, you need a crawler that can show you both the raw HTML response AND the rendered DOM, then diff them. That's exactly what we built into CrawlIQ — it crawls your site with both raw-fetch and render passes, then surfaces exactly which pages have a significant delta between the two states. If your Wave 1 HTML and rendered DOM are different on 40% of your pages, you have an indexing problem.

CrawlIQ's audit engine flags it and explains the specific fix.

python — naive raw vs rendered diff check

import httpx
from playwright.async_api import async_playwright
from bs4 import BeautifulSoup

async def check_render_delta(url: str) -> dict:
    # Wave 1: raw HTML
    async with httpx.AsyncClient() as client:  # async client: don't block the event loop
        raw = await client.get(url, headers={'User-Agent': 'Googlebot/2.1'})
    raw_text_len = len(BeautifulSoup(raw.text, 'html.parser').get_text())

    # Wave 2: rendered DOM
    async with async_playwright() as pw:
        browser = await pw.chromium.launch()
        page = await browser.new_page()
        await page.goto(url, wait_until='networkidle')
        rendered_text_len = len(await page.inner_text('body'))
        await browser.close()

    delta = rendered_text_len - raw_text_len
    return {
        "url": url,
        "raw_chars": raw_text_len,
        "rendered_chars": rendered_text_len,
        "delta": delta,
        "js_dependent": delta > 500  # flag if render adds >500 chars
    }
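To try it on a single URL, assuming httpx, beautifulsoup4, and playwright are installed (plus playwright install chromium for the browser binary):

python — running the check

import asyncio

# Hypothetical target; substitute one of your own JS-heavy pages
result = asyncio.run(check_render_delta("https://example.com/products"))
print(result)  # dict with raw_chars, rendered_chars, delta, js_dependent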

The decision tree: should you SSR, SSG, or ISR?
There's no single answer. It depends on how often the content changes and how critical indexing is for each page type:

decision logic
Content changes rarely (blog, docs, marketing) →
SSG (Static Site Generation) ← best for SEO, fastest

Content is user-specific (dashboard, profile) →
CSR (Client-Side only) ← noindex these pages

Content changes frequently but needs indexing →
SSR (Server-Side Rendering) ← costlier, but Wave 1 sees content

Content changes occasionally (product listings) →
ISR (Incremental Static Regeneration) ← best of both worlds
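If you want to encode that tree in an audit script, it boils down to a tiny lookup. A sketch — the page-profile parameters are my own labels, not an official taxonomy:

python — the decision tree as code

def rendering_strategy(user_specific: bool, change_rate: str) -> str:
    """Mirror the tree above; change_rate is 'rarely', 'occasionally', or 'frequently'."""
    if user_specific:                    # dashboards, profiles
        return "CSR + noindex"
    if change_rate == "rarely":          # blog, docs, marketing
        return "SSG"
    if change_rate == "occasionally":    # product listings
        return "ISR"
    return "SSR"                         # changes frequently, must be indexed

print(rendering_strategy(False, "occasionally"))  # -> ISR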

If you want to audit your site's JavaScript indexing gap and see exactly what Google gets in Wave 1 vs. Wave 2 on every page, check out CrawlIQ.
