Mitu Das

Posted on May 18

I Built a Free JavaScript SEO Audit Tool After Google Ignored My React App for 3 Months

#seo #react #javascript #developer

I spent three months wondering why my React app was getting zero organic traffic. Google Search Console showed pages being "discovered" but not indexed. The Lighthouse score looked fine. Everything seemed okay.

Then I looked at my site through Googlebot's eyes, using the URL Inspection tool in Search Console (the successor to the old Fetch as Google feature), and saw a blank white page.

The culprit? Client-side rendering with no fallback, missing meta tags injected after hydration, and a canonical URL pointing to a development URL I forgot to update. None of these showed up in my usual workflow. In 2026, JavaScript SEO is still a trap most developers fall into at least once. Here's how to catch these issues before Google does.

Why JavaScript Apps Break SEO in Ways Static HTML Doesn't

The fundamental problem is timing. Googlebot is smart, but it crawls JavaScript apps differently than a browser would:

It fetches the initial HTML response
It queues the page for JavaScript rendering (which can take days)
It renders, then re-indexes

If your meta tags, canonical URLs, or structured data are injected after JavaScript runs, you're at the mercy of Google's render queue. For new pages, this can mean weeks of invisibility.

Static HTML sites don't have this problem. A <title> tag in a .html file is immediately readable. A <title> set via document.title = "..." in a React useEffect is not, or at least not reliably on the first crawl.

Here's a quick check you can run in Node.js to see what Googlebot actually sees:

// seo-check.js: fetch your page without JavaScript, like a basic crawler would
const https = require('https');

function crawlPage(url) {
  return new Promise((resolve, reject) => {
    const options = {
      headers: {
        'User-Agent': 'Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)'
      }
    };

    https.get(url, options, (res) => {
      let data = '';
      res.on('data', chunk => data += chunk);
      res.on('end', () => {
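        // Tolerate attributes on <title>, and single or double quotes around attribute values
        const hasTitle = /<title[^>]*>[^<]*[^<\s][^<]*<\/title>/i.test(data);
        const hasDescription = /<meta[^>]+name=["']description["']/i.test(data);
        const hasCanonical = /<link[^>]+rel=["']canonical["']/i.test(data);
        const hasH1 = /<h1[\s>]/i.test(data);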

        console.log('SEO Snapshot (pre-JS render):');
        console.log('  Title tag present:', hasTitle);
        console.log('  Meta description:', hasDescription);
        console.log('  Canonical URL:', hasCanonical);
        console.log('  H1 tag:', hasH1);

        resolve({ hasTitle, hasDescription, hasCanonical, hasH1 });
      });
    }).on('error', reject);
  });
}

crawlPage('https://yoursite.com').then(result => {
  const score = Object.values(result).filter(Boolean).length;
  console.log(`\nBasic SEO score: ${score}/4`);
});

Run it. If you get Title tag present: false on a React app, you have a problem.
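The four checks above look for specific tags, but the "blank white page" symptom can also be tested directly: does the raw HTML contain any readable text at all? Here is a minimal sketch, assuming a hypothetical looksLikeEmptyShell helper and an arbitrary 50-character threshold (both are illustrative, not part of any library):

```javascript
// Hypothetical helper: rough count of readable characters in raw HTML.
// It uses regexes rather than a real parser, so treat the result as an estimate.
function visibleTextLength(html) {
  return html
    .replace(/<(script|style)\b[^>]*>[\s\S]*?<\/\1>/gi, ' ') // drop script and style blocks
    .replace(/<!--[\s\S]*?-->/g, ' ')                         // drop HTML comments
    .replace(/<[^>]+>/g, ' ')                                 // drop the remaining tags
    .replace(/\s+/g, ' ')
    .trim().length;
}

// minChars is an assumed threshold; tune it for your own pages.
function looksLikeEmptyShell(html, minChars = 50) {
  const body = html.match(/<body\b[^>]*>([\s\S]*)<\/body>/i);
  return visibleTextLength(body ? body[1] : html) < minChars;
}

const spaShell =
  '<html><head><title>App</title></head><body><div id="root"></div>' +
  '<script src="/main.js"></script></body></html>';
const rendered =
  '<html><body><h1>Pricing</h1><p>Our plans start at ten dollars per month ' +
  'and include unlimited projects.</p></body></html>';

console.log(looksLikeEmptyShell(spaShell)); // true
console.log(looksLikeEmptyShell(rendered)); // false
```

You could call looksLikeEmptyShell(data) inside the res.on('end') handler of the script above to flag pages whose body is only a mount point.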

Auditing Structured Data and Open Graph Tags Programmatically

Missing meta tags are just the beginning. Structured data (JSON-LD) errors are invisible to the naked eye but actively hurt rich results eligibility. Open Graph tags determine how your links look when shared on Slack, X, or LinkedIn.

You can audit both with cheerio, a server-side HTML parser that doesn't execute JavaScript (which is exactly the point):

npm install cheerio node-fetch

// audit-metadata.js
import fetch from 'node-fetch';
import * as cheerio from 'cheerio';

async function auditPage(url) {
  const res = await fetch(url);
  const html = await res.text();
  const $ = cheerio.load(html);

  // Check Open Graph
  const og = {
    title: $('meta[property="og:title"]').attr('content'),
    description: $('meta[property="og:description"]').attr('content'),
    image: $('meta[property="og:image"]').attr('content'),
    url: $('meta[property="og:url"]').attr('content'),
  };

  // Check JSON-LD structured data
  const jsonLdBlocks = [];
  $('script[type="application/ld+json"]').each((_, el) => {
    try {
      jsonLdBlocks.push(JSON.parse($(el).html()));
    } catch (e) {
      jsonLdBlocks.push({ error: 'Invalid JSON', raw: $(el).html().slice(0, 100) });
    }
  });

  // Check canonical
  const canonical = $('link[rel="canonical"]').attr('href');

  console.log('\n=== Open Graph ===');
  Object.entries(og).forEach(([key, val]) => {
    console.log(`  og:${key}: ${val || '❌ MISSING'}`);
  });

  console.log('\n=== Canonical ===');
  console.log(' ', canonical || '❌ MISSING');

  console.log('\n=== JSON-LD Blocks Found ===');
  console.log(` ${jsonLdBlocks.length} block(s)`);
  jsonLdBlocks.forEach((block, i) => {
    if (block.error) {
      console.log(`  Block ${i + 1}: ❌ Parse error: ${block.error}`);
    } else {
      console.log(`  Block ${i + 1}: ✅ Type = ${block['@type'] || 'unknown'}`);
    }
  });
}

auditPage('https://yoursite.com');

This gives you a static HTML snapshot of exactly what a non-JS crawler sees. If your OG tags show ❌ MISSING but you know you set them in your React component, they're being injected after render. Fix them server-side, or use a proper SSR/SSG setup.
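As an illustration of what "server-side" can mean here, the head tags can be written into the HTML string before it is sent, so they are present in the first response. This is a minimal sketch under assumptions: the <!--seo-head--> placeholder and the renderHead and injectHead helpers are hypothetical names, and a real project would more likely use its framework's own metadata mechanism.

```javascript
// Escape text for safe use inside HTML text nodes and attribute values.
function escapeHtml(value) {
  return String(value)
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;');
}

// Hypothetical helper: builds the head tags for one route.
function renderHead({ title, description, canonical, image }) {
  const t = escapeHtml(title);
  const d = escapeHtml(description);
  const c = escapeHtml(canonical);
  const tags = [
    `<title>${t}</title>`,
    `<meta name="description" content="${d}">`,
    `<link rel="canonical" href="${c}">`,
    `<meta property="og:title" content="${t}">`,
    `<meta property="og:description" content="${d}">`,
    `<meta property="og:url" content="${c}">`,
  ];
  if (image) tags.push(`<meta property="og:image" content="${escapeHtml(image)}">`);
  return tags.join('\n    ');
}

// The placeholder comment is an assumption of this sketch, not a standard.
// A replacer function is used so "$" sequences in the metadata are not interpreted.
function injectHead(template, meta) {
  return template.replace('<!--seo-head-->', () => renderHead(meta));
}

const template = `<!doctype html>
<html>
  <head>
    <!--seo-head-->
  </head>
  <body><div id="root"></div></body>
</html>`;

const html = injectHead(template, {
  title: 'Pricing & Plans',
  description: 'Compare plans and pricing.',
  canonical: 'https://yoursite.com/pricing',
});
console.log(html);
```

Running the audit script against a page served this way should show the OG tags and canonical as present, because they no longer depend on the client bundle executing.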

Automating Multi-Page Audits with a Crawl Script

Checking one page is useful. Checking your entire sitemap is necessary. Here's a minimal crawler that pulls URLs from your sitemap and runs the audit across all of them:

// crawl-sitemap.js
import fetch from 'node-fetch';
import * as cheerio from 'cheerio';

async function getSitemapUrls(sitemapUrl) {
  const res = await fetch(sitemapUrl);
  const xml = await res.text();
  const $ = cheerio.load(xml, { xmlMode: true });
  return $('loc').map((_, el) => $(el).text()).get();
}

async function quickAudit(url) {
  try {
    // node-fetch v3 dropped the `timeout` option, so abort via a signal instead
    const res = await fetch(url, { signal: AbortSignal.timeout(8000) });
    const html = await res.text();
    const $ = cheerio.load(html);
    return {
      url,
      title: $('title').text() || null,
      canonical: $('link[rel="canonical"]').attr('href') || null,
      h1Count: $('h1').length,
      hasDescription: !!$('meta[name="description"]').attr('content'),
    };
  } catch (e) {
    return { url, error: e.message };
  }
}

async function auditSitemap(sitemapUrl) {
  console.log(`Fetching sitemap: ${sitemapUrl}\n`);
  const urls = await getSitemapUrls(sitemapUrl);
  console.log(`Found ${urls.length} URLs. Auditing...\n`);

  let clean = 0;
  for (const url of urls) {
    const result = await quickAudit(url);

    const issues = [];
    if (result.error) {
      // A failed fetch has no fields to inspect, so report the error itself
      issues.push(`fetch failed: ${result.error}`);
    } else {
      if (!result.title) issues.push('no title');
      if (!result.canonical) issues.push('no canonical');
      if (result.h1Count !== 1) issues.push(`${result.h1Count} H1 tags`);
      if (!result.hasDescription) issues.push('no meta description');
    }

    if (issues.length === 0) clean++;
    const status = issues.length === 0 ? '✅' : '⚠️ ';
    console.log(`${status} ${url}`);
    if (issues.length > 0) console.log(`   Issues: ${issues.join(', ')}`);
  }

  // Count a page as passing only if it had no issues at all
  console.log(`\nSummary: ${clean}/${urls.length} pages passed basic audit`);
}

auditSitemap('https://yoursite.com/sitemap.xml');

Run this before every major deploy. It catches the obvious stuff (missing titles, missing or duplicate H1s, missing canonicals), usually in under a minute for a small site.
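The crawler only checks that a canonical tag exists. The bug from the introduction, a canonical still pointing at a development URL, needs a check on where it points. Here is a minimal sketch, assuming a hypothetical checkCanonical helper and that you pass in your production hostname yourself:

```javascript
// Hypothetical helper: returns a list of problems with a page's canonical href.
// productionHost is an assumption: the hostname you expect every canonical to use.
function checkCanonical(pageUrl, canonicalHref, productionHost) {
  if (!canonicalHref) return ['missing canonical'];

  let canonical;
  try {
    canonical = new URL(canonicalHref, pageUrl); // also resolves relative hrefs
  } catch {
    return [`unparseable canonical: ${canonicalHref}`];
  }

  const page = new URL(pageUrl);
  const problems = [];

  if (canonical.hostname !== productionHost) {
    problems.push(`canonical host is ${canonical.hostname}, expected ${productionHost}`);
  }
  if (canonical.protocol !== 'https:') {
    problems.push('canonical is not https');
  }

  // Ignore trailing slashes when comparing paths
  const normalize = (p) => p.replace(/\/+$/, '') || '/';
  if (normalize(canonical.pathname) !== normalize(page.pathname)) {
    problems.push(`canonical path ${canonical.pathname} differs from page path ${page.pathname}`);
  }
  return problems;
}

// A canonical left over from development
console.log(checkCanonical('https://yoursite.com/pricing', 'http://localhost:3000/pricing', 'yoursite.com'));
// A static canonical pointing to / on every route
console.log(checkCanonical('https://yoursite.com/pricing', '/', 'yoursite.com'));
// A correct canonical returns an empty list
console.log(checkCanonical('https://yoursite.com/pricing/', 'https://yoursite.com/pricing', 'yoursite.com'));
```

Pages that deliberately canonicalize to a different URL (filtered or paginated views, for example) will trip the path check, so it is probably best treated as a warning rather than a hard failure.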

Going Deeper: When You Need More Than a Script

These scripts cover the fundamentals well, but they don't catch everything: Core Web Vitals, crawl depth, internal link structure, redirect chains, or hreflang mismatches. For a more complete picture without spinning up Screaming Frog or paying for an enterprise tool, I came across power-seo, a free, open-source npm package built for JavaScript-heavy sites.

What caught my attention: it runs a headless browser audit (so it sees your post-render HTML, not just the raw response), checks Core Web Vitals, and flags JavaScript-specific issues like render-blocking scripts and hydration mismatches. It's not magic, but it filled the gap between "I wrote a crawler script" and "I need a $400/month SEO platform."

You install it globally and point it at a URL:

npm install -g power-seo
power-seo audit https://yoursite.com --output report.json

The JSON output is structured enough to pipe into CI so you can fail a build if, say, a critical page loses its title tag after a component refactor.

For a full walkthrough of how it handles JavaScript rendering specifically, the team wrote it up here: Free JavaScript SEO Audit Tool Every Developer Needs in 2026.

What I Learned (After Losing 3 Months of Potential Traffic)

Never assume your meta tags are in the HTML. Use curl -A "Googlebot" or a raw fetch without JS execution to verify what crawlers actually see on first hit.
One H1 per page, always. It's the most common mistake in component-driven UIs: multiple components each thinking they "own" the heading.
Canonical URLs in SPAs are a footgun. If you're using client-side routing, double-check that your canonical tag updates correctly on navigation. A static canonical pointing to / on every route is an indexing disaster.
Build SEO checks into your CI pipeline. A five-minute audit script catching a missing title before deploy is infinitely better than discovering it two weeks later in Search Console.
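To make the CI point concrete, here is a minimal sketch of a gate built on the result shape returned by quickAudit() in the crawl script earlier (url, title, canonical, h1Count, hasDescription, or error). The collectFailures name and the sample data are hypothetical; in a real pipeline the results would come from the crawl itself.

```javascript
// Hypothetical helper: turns audit results into a list of failing pages.
function collectFailures(results) {
  const failures = [];
  for (const r of results) {
    const issues = [];
    if (r.error) {
      issues.push(`fetch failed: ${r.error}`);
    } else {
      if (!r.title) issues.push('no title');
      if (!r.canonical) issues.push('no canonical');
      if (r.h1Count !== 1) issues.push(`${r.h1Count} H1 tags`);
      if (!r.hasDescription) issues.push('no meta description');
    }
    if (issues.length > 0) failures.push({ url: r.url, issues });
  }
  return failures;
}

// Sample data standing in for real crawl output
const sample = [
  { url: 'https://yoursite.com/', title: 'Home', canonical: 'https://yoursite.com/', h1Count: 1, hasDescription: true },
  { url: 'https://yoursite.com/pricing', title: null, canonical: null, h1Count: 2, hasDescription: true },
  { url: 'https://yoursite.com/blog', error: 'request timed out' },
];

const failures = collectFailures(sample);
for (const f of failures) {
  console.error(`FAIL ${f.url}: ${f.issues.join(', ')}`);
}

// A non-zero exit code is what fails the build. In a real pipeline you would
// assign this value to process.exitCode; here it is only printed.
const exitCode = failures.length > 0 ? 1 : 0;
console.log(`exit code would be ${exitCode}`);
```

Keeping the pass/fail logic in a pure function makes it easy to unit test separately from the network crawl.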

What's your biggest JavaScript SEO headache?

I'm genuinely curious: is it the render timing issue, or something more obscure like hreflang in a Next.js app, or structured data that validates in the Rich Results Test but never shows up in SERPs?

Drop it in the comments. I've probably hit it too, and if I haven't, someone else in this thread has. These problems are a lot less frustrating when you're not debugging them alone at 11pm.

Top comments (2)

Bhavin Sheth • May 19

Been there 😅 One wrong canonical + client-side meta tags cost me weeks of indexing issues on a React project too. The “fetch as Googlebot” check is still one of the fastest ways to catch problems most devs completely miss.

Ken Morgan • Jul 11

Wow! I need things like this to catch the gaps in my own projects. Thanks so much! On my way to the full walkthrough. I want to incorporate this into my pre-deployment audit toolbox.