You push a "safe" refactor. A meta tag moves to a component. Someone renames a route. A redirect gets dropped. Two weeks later, traffic drops 40%.
This happens constantly — not because developers don't care about SEO, but because there's no SEO gate in the deployment pipeline. We lint for code style, run unit tests, check types — but nothing catches when <title> disappears from a page template or a canonical URL starts pointing at localhost.
This article fixes that. We'll build an automated SEO test suite that runs on every PR and blocks the deploy when a change breaks something search engines depend on.
What Kind of SEO Breaks in Deploys?
Before writing tests, it helps to know what actually regresses. Here are the most common culprits:
- Missing <title> or <meta name="description">: someone extracts a layout component and forgets to keep the meta tags
- Broken canonicals: canonical pointing to staging URLs, localhost, or the wrong page
- Noindex leaking to production: <meta name="robots" content="noindex"> left in from dev/staging config
- Sitemap URL changes: routes renamed but sitemap not updated
- Broken redirects: 301s removed or changed to 302s during refactors
- Missing structured data: JSON-LD dropped when templates are restructured
- robots.txt accidentally blocking everything: happened to Cloudflare's own site once
These aren't exotic edge cases. They're Tuesday afternoon bugs.
The Stack We'll Use
- Lighthouse CI: automated Core Web Vitals + SEO score audits
- A custom Node.js SEO assertion script: catches the specific things Lighthouse misses
- GitHub Actions: runs everything on every PR
- axe-core (optional): accessibility checks that overlap with SEO signals
Step 1: Custom SEO Assertions Script
Lighthouse gives you a score, but it won't tell you "this specific page is missing a canonical tag." Write your own assertions.
Install the dependencies:
npm install --save-dev node-fetch cheerio
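Create scripts/seo-check.js. The script uses ES module imports, so package.json needs "type": "module" (or name the file seo-check.mjs):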
import fetch from 'node-fetch';
import * as cheerio from 'cheerio';

const BASE_URL = process.env.CHECK_URL || 'http://localhost:3000';

// Pages to test: add all critical routes
const PAGES = [
  { path: '/', name: 'Homepage' },
  { path: '/blog', name: 'Blog index' },
  { path: '/about', name: 'About' },
  // Add your most critical pages here
];

const MAX_TITLE_LENGTH = 60;
const MAX_DESCRIPTION_LENGTH = 160;
const MIN_DESCRIPTION_LENGTH = 50;

async function checkPage(path, name) {
  const url = `${BASE_URL}${path}`;
  const errors = [];
  const warnings = [];
  let html;

  try {
    const res = await fetch(url, { redirect: 'manual' });
    // Check for accidental redirects
    if (res.status >= 300 && res.status < 400) {
      errors.push(`Unexpected redirect (${res.status}) to ${res.headers.get('location')}`);
      return { url, name, errors, warnings };
    }
    if (res.status !== 200) {
      errors.push(`Non-200 status: ${res.status}`);
      return { url, name, errors, warnings };
    }
    html = await res.text();
  } catch (err) {
    errors.push(`Fetch failed: ${err.message}`);
    return { url, name, errors, warnings };
  }

  const $ = cheerio.load(html);

  // --- Title ---
  const title = $('title').text().trim();
  if (!title) {
    errors.push('Missing <title> tag');
  } else if (title.length > MAX_TITLE_LENGTH) {
    warnings.push(`Title too long (${title.length} chars, max ${MAX_TITLE_LENGTH}): "${title}"`);
  }

  // --- Meta description ---
  const description = $('meta[name="description"]').attr('content')?.trim();
  if (!description) {
    errors.push('Missing <meta name="description">');
  } else if (description.length > MAX_DESCRIPTION_LENGTH) {
    warnings.push(`Description too long (${description.length} chars)`);
  } else if (description.length < MIN_DESCRIPTION_LENGTH) {
    warnings.push(`Description too short (${description.length} chars)`);
  }

  // --- Canonical ---
  const canonical = $('link[rel="canonical"]').attr('href');
  if (!canonical) {
    errors.push('Missing <link rel="canonical">');
  } else {
    // Check for localhost/staging URLs leaking to prod.
    // The dot before "dev" is escaped so hosts like "webdev.example.com" do not match.
    if (/localhost|127\.0\.0\.1|staging\.|\.dev\b/i.test(canonical)) {
      errors.push(`Canonical points to non-production URL: ${canonical}`);
    }
    const expectedCanonical = `${process.env.PROD_URL || ''}${path}`;
    if (process.env.PROD_URL && canonical !== expectedCanonical) {
      warnings.push(`Canonical mismatch. Expected: ${expectedCanonical}, Got: ${canonical}`);
    }
  }

  // --- Robots meta ---
  const robots = $('meta[name="robots"]').attr('content')?.toLowerCase();
  if (robots && (robots.includes('noindex') || robots.includes('nofollow'))) {
    errors.push(`Page has robots meta: "${robots}". Is this intentional?`);
  }

  // --- Open Graph ---
  const ogTitle = $('meta[property="og:title"]').attr('content');
  const ogDescription = $('meta[property="og:description"]').attr('content');
  const ogImage = $('meta[property="og:image"]').attr('content');
  if (!ogTitle) warnings.push('Missing og:title');
  if (!ogDescription) warnings.push('Missing og:description');
  if (!ogImage) warnings.push('Missing og:image');

  // --- H1 ---
  const h1Count = $('h1').length;
  if (h1Count === 0) {
    errors.push('Missing <h1> tag');
  } else if (h1Count > 1) {
    warnings.push(`Multiple <h1> tags found (${h1Count})`);
  }

  // --- JSON-LD structured data ---
  const jsonLd = $('script[type="application/ld+json"]');
  if (jsonLd.length === 0) {
    warnings.push('No JSON-LD structured data found');
  } else {
    jsonLd.each((_, el) => {
      try {
        JSON.parse($(el).html());
      } catch {
        errors.push('Invalid JSON in JSON-LD block (will be ignored by Google)');
      }
    });
  }

  // --- Images without alt ---
  const imagesWithoutAlt = [];
  $('img').each((_, el) => {
    const alt = $(el).attr('alt');
    const src = $(el).attr('src') || $(el).attr('data-src') || '[unknown src]';
    // An empty alt="" is valid for decorative images; only a missing attribute counts
    if (alt === undefined || alt === null) {
      imagesWithoutAlt.push(src);
    }
  });
  if (imagesWithoutAlt.length > 0) {
    warnings.push(`${imagesWithoutAlt.length} image(s) missing alt attribute`);
  }

  return { url, name, errors, warnings };
}

async function run() {
  console.log(`\n🔍 Running SEO checks against: ${BASE_URL}\n`);
  console.log('='.repeat(60));

  const results = await Promise.all(
    PAGES.map(({ path, name }) => checkPage(path, name))
  );

  let totalErrors = 0;
  let totalWarnings = 0;

  for (const result of results) {
    const hasErrors = result.errors.length > 0;
    const hasWarnings = result.warnings.length > 0;
    const status = hasErrors ? '❌' : hasWarnings ? '⚠️ ' : '✅';
    console.log(`\n${status} ${result.name} (${result.url})`);
    for (const err of result.errors) {
      console.log(`  ✗ [ERROR] ${err}`);
      totalErrors++;
    }
    for (const warn of result.warnings) {
      console.log(`  ⚠ [WARN] ${warn}`);
      totalWarnings++;
    }
    if (!hasErrors && !hasWarnings) {
      console.log('  All checks passed');
    }
  }

  console.log('\n' + '='.repeat(60));
  console.log(`\nSummary: ${results.length} pages checked`);
  console.log(`  Errors: ${totalErrors}`);
  console.log(`  Warnings: ${totalWarnings}`);

  if (totalErrors > 0) {
    console.log('\n💥 SEO check FAILED: fix errors before deploying\n');
    process.exit(1);
  } else {
    console.log('\n✅ SEO check PASSED\n');
  }
}

run();
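The robots check in the script above reads only the meta tag and relies on substring matching. As a rough sketch of a stricter variant, the helper below splits the directives into tokens and also accepts the value of an X-Robots-Tag response header, which can carry noindex without any markup on the page. The name findBlockingDirectives is hypothetical and not part of the original script, and treating the "none" shorthand as blocking is an assumption based on it meaning noindex plus nofollow.

```javascript
// Hypothetical helper: collect blocking directives from the robots meta
// content and from an X-Robots-Tag header value (either may be missing).
function findBlockingDirectives(metaContent, headerValue) {
  const BLOCKING = ['noindex', 'nofollow', 'none'];
  const tokens = [metaContent, headerValue]
    .filter(Boolean) // skip undefined, null, or empty values
    .flatMap((value) => value.toLowerCase().split(','))
    // Headers may prefix a crawler name ("googlebot: noindex"), so keep the
    // part after the last colon; "max-snippet:-1" becomes "-1" and is ignored.
    .map((token) => token.split(':').pop().trim());
  return BLOCKING.filter((directive) => tokens.includes(directive));
}
```

Inside checkPage this could be fed with the meta tag content and res.headers.get('x-robots-tag'), provided the header value is captured while the response object is still in scope. A non-empty result would then be pushed to errors in place of the current includes() test.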
Add the run scripts to package.json:
{
  "scripts": {
    "seo:check": "node scripts/seo-check.js",
    "seo:check:prod": "CHECK_URL=https://yoursite.com node scripts/seo-check.js"
  }
}
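The canonical check in the assertion script matches the raw href against a regular expression and compares strings, which is sensitive to details such as a trailing slash on the homepage. One possible alternative, sketched here under stated assumptions, parses the URL first and compares hostnames and normalized hrefs. The function name checkCanonical, the NON_PROD_HOSTS list, and the "staging." prefix rule are illustrative choices, not something the original script defines.

```javascript
// Assumed list of hostnames that should never appear in a production canonical
const NON_PROD_HOSTS = new Set(['localhost', '127.0.0.1', '0.0.0.0']);

// Hypothetical helper: returns { level: 'ok' | 'warning' | 'error', message? }
function checkCanonical(canonical, prodOrigin, path) {
  let parsed;
  try {
    parsed = new URL(canonical); // throws on relative or malformed URLs
  } catch {
    return { level: 'error', message: `Canonical is not an absolute URL: ${canonical}` };
  }

  if (NON_PROD_HOSTS.has(parsed.hostname) || parsed.hostname.startsWith('staging.')) {
    return { level: 'error', message: `Canonical points to non-production host: ${parsed.hostname}` };
  }

  // Build the expected URL the same way, so both sides are normalized alike
  const expected = new URL(path, prodOrigin).href;
  if (parsed.href !== expected) {
    return { level: 'warning', message: `Canonical mismatch. Expected: ${expected}, Got: ${parsed.href}` };
  }
  return { level: 'ok' };
}
```

With this shape, checkCanonical('https://yoursite.com', 'https://yoursite.com', '/') passes even though the href lacks the trailing slash, because both values go through the same URL normalization.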
Step 2: Add Lighthouse CI
Lighthouse CI runs a full Lighthouse audit on every PR and can fail the build when scores regress. It covers lab performance metrics such as LCP, CLS, and Total Blocking Time, plus a dedicated SEO category.
Install:
npm install --save-dev @lhci/cli
Create lighthouserc.js in your project root. Lighthouse CI loads this file as CommonJS, so name it lighthouserc.cjs instead if package.json sets "type": "module":
module.exports = {
  ci: {
    collect: {
      startServerCommand: 'npm run start',
      // Adjust to whatever your server prints once it is listening
      startServerReadyPattern: 'ready on',
      url: [
        'http://localhost:3000/',
        'http://localhost:3000/blog',
        'http://localhost:3000/about',
      ],
      numberOfRuns: 3,
    },
    assert: {
      assertions: {
        'categories:seo': ['error', { minScore: 0.9 }],
        'categories:performance': ['warn', { minScore: 0.8 }],
        'meta-description': 'error',
        'document-title': 'error',
        'html-has-lang': 'error',
        'canonical': 'error',
        'robots-txt': 'warn',
        'largest-contentful-paint': ['warn', { maxNumericValue: 2500 }],
        'total-blocking-time': ['warn', { maxNumericValue: 300 }],
        'cumulative-layout-shift': ['warn', { maxNumericValue: 0.1 }],
      },
    },
    upload: {
      target: 'temporary-public-storage',
    },
  },
};
Step 3: GitHub Actions Workflow
Create .github/workflows/seo.yml:
name: SEO Tests

on:
  pull_request:
    branches: [main, develop]
  push:
    branches: [main]

jobs:
  seo-assertions:
    name: SEO Assertions
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Build app
        run: npm run build

      - name: Start app
        run: npm run start &
        env:
          NODE_ENV: production
          PORT: 3000

      - name: Wait for app to be ready
        run: npx wait-on http://localhost:3000 --timeout 60000

      - name: Run SEO assertion checks
        run: npm run seo:check
        env:
          CHECK_URL: http://localhost:3000
          PROD_URL: https://yoursite.com

  lighthouse:
    name: Lighthouse CI
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup Node.js
        uses: actions/setup-node@v4
        with:
          node-version: '20'
          cache: 'npm'

      - name: Install dependencies
        run: npm ci

      - name: Build app
        run: npm run build

      - name: Run Lighthouse CI
        run: npx lhci autorun
        env:
          LHCI_GITHUB_APP_TOKEN: ${{ secrets.LHCI_GITHUB_APP_TOKEN }}
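Install the Lighthouse CI GitHub App to get a Lighthouse status check on each PR, with a link to the uploaded report. Comparing the base and PR reports then surfaces score changes like this:
| Category | Before | After | Delta |
|---|---|---|---|
| SEO | 95 | 72 | -23 ❌ |
| Performance | 88 | 91 | +3 ✅ |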
Step 4: Test Your Sitemap and Robots.txt
// scripts/seo-infrastructure.js
import fetch from 'node-fetch';

const BASE_URL = process.env.CHECK_URL || 'http://localhost:3000';

async function checkRobotsTxt() {
  const res = await fetch(`${BASE_URL}/robots.txt`);
  if (res.status !== 200) {
    return { error: `robots.txt returned ${res.status}` };
  }
  const text = await res.text();
  // The m flag makes ^ and $ match per line, so a "Disallow: /" rule is still
  // found when other lines (such as Sitemap:) follow it
  if (/^Disallow:\s*\/\s*$/im.test(text) && !/^Allow:/im.test(text)) {
    return { error: 'robots.txt is blocking ALL crawlers (Disallow: /). Check this immediately!' };
  }
  // Rough heuristic: a full-site Disallow somewhere after a Googlebot group header
  if (/^User-agent:\s*Googlebot\b[\s\S]*?^Disallow:\s*\/\s*$/im.test(text)) {
    return { warning: 'Googlebot appears to be blocked in robots.txt' };
  }
  return { ok: true };
}

async function checkSitemap() {
  const robotsRes = await fetch(`${BASE_URL}/robots.txt`);
  const robotsText = await robotsRes.text();
  const sitemapMatch = robotsText.match(/^Sitemap:\s*(\S+)/im);
  // robots.txt usually lists the production sitemap URL, so keep only its path
  // and request it from the server under test
  const sitemapPath = sitemapMatch ? new URL(sitemapMatch[1], BASE_URL).pathname : '/sitemap.xml';
  const sitemapUrl = `${BASE_URL}${sitemapPath}`;

  const res = await fetch(sitemapUrl);
  if (res.status !== 200) {
    return { error: `Sitemap not found at ${sitemapUrl} (${res.status})` };
  }
  const text = await res.text();
  if (!text.includes('<urlset') && !text.includes('<sitemapindex')) {
    return { error: 'Sitemap response does not look like valid XML' };
  }
  // Count <url> entries, or <sitemap> entries when this is a sitemap index
  const urlCount = (text.match(/<(url|sitemap)>/g) || []).length;
  if (urlCount === 0) {
    return { warning: 'Sitemap appears to have 0 URLs' };
  }
  return { ok: true, urlCount };
}

async function run() {
  console.log('\n🗺️ Checking SEO infrastructure...\n');
  const [robots, sitemap] = await Promise.all([checkRobotsTxt(), checkSitemap()]);
  let failed = false;

  if (robots.error) { console.log(`❌ robots.txt: ${robots.error}`); failed = true; }
  else if (robots.warning) { console.log(`⚠️ robots.txt: ${robots.warning}`); }
  else { console.log('✅ robots.txt looks good'); }

  if (sitemap.error) { console.log(`❌ sitemap: ${sitemap.error}`); failed = true; }
  else if (sitemap.warning) { console.log(`⚠️ sitemap: ${sitemap.warning}`); }
  else { console.log(`✅ Sitemap valid (${sitemap.urlCount} URLs)`); }

  if (failed) process.exit(1);
}

run();
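Regex checks over the whole robots.txt file cannot tell which user-agent group a rule belongs to, so a rule aimed at one bot can look like a site-wide block. A minimal sketch of a group-aware check follows. The names parseRobotsGroups and isSiteBlockedFor are hypothetical, and the matching is deliberately simplified: it only asks whether the applicable group disallows "/" with no Allow rule at all, and it does not implement path precedence or wildcards.

```javascript
// Hypothetical helper: split robots.txt into groups of { agents, rules }
function parseRobotsGroups(text) {
  const groups = [];
  let current = null;
  let lastWasAgent = false;
  for (const rawLine of text.split(/\r?\n/)) {
    const line = rawLine.replace(/#.*$/, '').trim(); // drop comments
    const sep = line.indexOf(':');
    if (sep === -1) continue;
    const field = line.slice(0, sep).trim().toLowerCase();
    const value = line.slice(sep + 1).trim();
    if (field === 'user-agent') {
      // Consecutive User-agent lines share one group
      if (!lastWasAgent) {
        current = { agents: [], rules: [] };
        groups.push(current);
      }
      current.agents.push(value.toLowerCase());
      lastWasAgent = true;
    } else if ((field === 'allow' || field === 'disallow') && current) {
      current.rules.push({ type: field, path: value });
      lastWasAgent = false;
    }
  }
  return groups;
}

// True when the group that applies to `agent` disallows "/" and allows nothing
function isSiteBlockedFor(text, agent) {
  const groups = parseRobotsGroups(text);
  const name = agent.toLowerCase();
  let applicable = groups.filter((g) => g.agents.includes(name));
  if (applicable.length === 0) {
    applicable = groups.filter((g) => g.agents.includes('*')); // fall back to the wildcard group
  }
  const rules = applicable.flatMap((g) => g.rules);
  const disallowsRoot = rules.some((r) => r.type === 'disallow' && r.path === '/');
  const allowsSomething = rules.some((r) => r.type === 'allow' && r.path !== '');
  return disallowsRoot && !allowsSomething;
}
```

In checkRobotsTxt, a call such as isSiteBlockedFor(text, 'googlebot') could replace the two regex tests, returning an error when it is true.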
Step 5: Redirect Testing (The Silent Killer)
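A 301 silently becoming a 302 tells crawlers the move is temporary, which can delay ranking signals consolidating on the new URL. A redirect chain added by accident wastes crawl budget and adds latency for every visitor.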
// scripts/seo-redirects.js
import fetch from 'node-fetch';

const BASE_URL = process.env.CHECK_URL || 'http://localhost:3000';

// Format: [from, to, expectedStatus]
const REDIRECT_MAP = [
  ['/old-blog', '/blog', 301],
  ['/home', '/', 301],
  ['/products/old-slug', '/products/new-slug', 301],
];

async function checkRedirect([from, to, expectedStatus]) {
  // redirect: 'manual' returns the redirect response itself instead of following it
  const res = await fetch(`${BASE_URL}${from}`, { redirect: 'manual' });
  const actualStatus = res.status;
  const location = res.headers.get('location');

  if (actualStatus !== expectedStatus) {
    return { from, error: `Expected ${expectedStatus}, got ${actualStatus}` };
  }
  // Strip the origin so absolute and relative Location headers compare equally
  const normalizedLocation = location?.replace(BASE_URL, '') || '';
  if (normalizedLocation !== to) {
    return { from, error: `Expected redirect to ${to}, got ${normalizedLocation}` };
  }
  return { from, ok: true };
}

async function run() {
  console.log('\n🔀 Checking redirects...\n');
  const results = await Promise.all(REDIRECT_MAP.map(checkRedirect));
  let failed = false;

  for (const result of results) {
    if (result.error) {
      console.log(`❌ ${result.from}: ${result.error}`);
      failed = true;
    } else {
      console.log(`✅ ${result.from} → OK`);
    }
  }

  if (failed) process.exit(1);
}

run();
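The redirect script inspects only the first hop, so it would not notice an accidental chain such as /old-blog → /blog-archive → /blog. One way to cover that, sketched here with hypothetical names (traceRedirects, evaluateRedirectChain), is to follow the Location headers manually and then judge the collected hops. The sketch assumes a fetch-compatible function is available (the global fetch in Node 18+, or node-fetch passed in) and treats both 301 and 308 as permanent.

```javascript
// Follow redirects one hop at a time and record each hop's status and target path
async function traceRedirects(baseUrl, path, fetchImpl = fetch, maxHops = 5) {
  const hops = [];
  let url = new URL(path, baseUrl);
  while (hops.length < maxHops) {
    const res = await fetchImpl(url, { redirect: 'manual' });
    const location = res.headers.get('location');
    if (res.status < 300 || res.status >= 400 || !location) break; // not a redirect
    url = new URL(location, url); // resolves relative Location headers
    hops.push({ status: res.status, location: url.pathname });
  }
  return hops;
}

// Pure check over the recorded hops; returns a list of problem descriptions
function evaluateRedirectChain(hops, expectedTarget) {
  const problems = [];
  if (hops.length === 0) {
    problems.push('No redirect returned');
    return problems;
  }
  if (hops.length > 1) {
    problems.push(`Redirect chain of ${hops.length} hops`);
  }
  for (const hop of hops) {
    if (hop.status !== 301 && hop.status !== 308) {
      problems.push(`Temporary redirect (${hop.status}) to ${hop.location}`);
    }
  }
  const finalLocation = hops[hops.length - 1].location;
  if (finalLocation !== expectedTarget) {
    problems.push(`Expected final target ${expectedTarget}, got ${finalLocation}`);
  }
  return problems;
}
```

Keeping the evaluation separate from the network calls means the rules can be unit-tested with plain arrays, while traceRedirects stays a thin wrapper that a check like checkRedirect could call for each REDIRECT_MAP entry.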
Putting It All Together
{
  "scripts": {
    "seo:check": "node scripts/seo-check.js",
    "seo:infra": "node scripts/seo-infrastructure.js",
    "seo:redirects": "node scripts/seo-redirects.js",
    "seo:all": "npm run seo:infra && npm run seo:redirects && npm run seo:check",
    "seo:lighthouse": "lhci autorun"
  }
}
Have your CI workflow run npm run seo:all, in place of the single seo:check step and with the app already started, before any deploy step.
What This Catches (That Your Current Pipeline Doesn't)
| Regression | Caught by |
|---|---|
| Missing title/description | Custom assertions |
| Noindex leaking to prod | Custom assertions |
| Canonical pointing to localhost | Custom assertions |
| robots.txt blocking all crawlers | Infrastructure check |
| Broken sitemap | Infrastructure check |
| 301 changed to 302 | Redirect tests |
| Missing redirect after route rename | Redirect tests |
| SEO score drop from JS change | Lighthouse CI |
| LCP regression from new image | Lighthouse CI |
| Missing JSON-LD | Custom assertions |
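One regression from the opening list that the table does not cover is a renamed route going missing from the sitemap: the infrastructure check confirms the sitemap exists and has entries, but not which ones. A possible extension, sketched below with hypothetical function names, extracts the <loc> paths and compares them against the same page list the assertion script uses. It assumes absolute URLs in <loc> (as the sitemap protocol requires) and treats /blog and /blog/ as the same path.

```javascript
// Treat "/blog/" and "/blog" as the same path; leave "/" untouched
function normalizePath(path) {
  return path.length > 1 ? path.replace(/\/+$/, '') : path;
}

// Hypothetical helper: collect the pathname of every <loc> entry
function extractSitemapPaths(xml) {
  const paths = new Set();
  for (const match of xml.matchAll(/<loc>\s*([^<\s]+)\s*<\/loc>/g)) {
    paths.add(normalizePath(new URL(match[1]).pathname));
  }
  return paths;
}

// Returns the critical paths that the sitemap does not list
function findPagesMissingFromSitemap(xml, pagePaths) {
  const listed = extractSitemapPaths(xml);
  return pagePaths.filter((path) => !listed.has(normalizePath(path)));
}
```

Called with the sitemap text and PAGES.map((p) => p.path), a non-empty result would be a reasonable warning or error in checkSitemap, depending on how strict you want the gate to be.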
Next Steps
Once this is running, consider:
- Adding your most critical pages to the assertion script — especially any page with structured data
- Setting up a LHCI server (it's open source) for a full historical score dashboard
- Running seo:check:prod after every production deploy as a smoke test
- Slack/Discord notifications when SEO tests fail in CI
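For the notification idea, a minimal sketch might turn the results array from the assertion script into a short message and post it to a Slack incoming webhook, which accepts a JSON body with a text field. The function names and the SLACK_WEBHOOK_URL variable mentioned afterwards are assumptions for illustration, and the sketch expects a fetch-compatible function (global fetch in Node 18+, or node-fetch passed in).

```javascript
// Build a plain-text summary from checkPage-style results; null when nothing failed
function buildFailureMessage(results) {
  const failing = results.filter((r) => r.errors.length > 0);
  if (failing.length === 0) return null;
  const lines = failing.map((r) => `- ${r.name} (${r.url}): ${r.errors.join('; ')}`);
  return `SEO check failed on ${failing.length} page(s)\n${lines.join('\n')}`;
}

// Post the message to a Slack incoming webhook
async function notifySlack(webhookUrl, text, fetchImpl = fetch) {
  const res = await fetchImpl(webhookUrl, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ text }),
  });
  if (!res.ok) {
    throw new Error(`Webhook returned ${res.status}`);
  }
}
```

In run(), this could be wired in just before process.exit(1), for example by awaiting notifySlack(process.env.SLACK_WEBHOOK_URL, message) only when that variable is set. A Discord webhook would need a content field in place of text.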
The goal isn't to replace an SEO audit tool — it's to stop shipping SEO regressions by accident. Once this pipeline is in place, the "my refactor broke our rankings" conversation stops happening.
What SEO regressions have you been bitten by in production? Drop them in the comments — some of the best edge cases I know came from war stories.