Originally published at thatdevpro.com. This framework reference is part of the 14-tier Engine Optimization stack from ThatDevPro, an SDVOSB-certified veteran-owned web + AI engineering studio. You are reading the dev.to mirror; the source-of-truth canonical version with embedded validation tools lives at the link above.
The Operational Deep Dive for Multi-Language and Multi-Region Sites, Covering Implementation, URL Structure, x-default, Pagination, Canonical Precedence, Validation, and the Audit Posture That Survives a Migration
A comprehensive installation and audit reference for hreflang, the most error-prone implementation in international SEO. While framework-international.md covers internationalization at the strategic and editorial level (which markets, how deeply to localize, content adaptation), this document is the operational technical reference: how the tags are formed, where they live, how they interact with canonical and pagination, how they fail, and how to validate that they did not. Dual-purpose: installation manual and audit document.
Cross-stack implementation note: the code samples in this framework are written in plain HTML for clarity. For React, Vue, Svelte, Next.js, Nuxt, SvelteKit, Astro, Hugo, 11ty, Remix, WordPress, Shopify, and Webflow equivalents of every pattern below, see framework-cross-stack-implementation.md. For pure client-rendered SPAs (no SSR/SSG) see framework-react.md. For Next.js specific patterns (App Router, generateMetadata, dynamic alternates) see framework-nextjs.md.
1. Document Purpose and How to Use This Document
1.1 What This Document Is
The canonical operational reference for hreflang, the HTML attribute (or HTTP header, or XML sitemap annotation) that tells Google and other consumer search engines which page version to serve to which language and country segment of the user base. Hreflang is the single most error-prone implementation surface in technical SEO. John Mueller has publicly described it as "one of the most complex aspects of SEO, if not the most complex one" (Google Search Central, multiple Office Hours, 2020-2025). Independent research consistently finds 60 to 75 percent of international sites have at least one hreflang implementation error severe enough to invalidate part of the cluster. Ahrefs analyzed 374,756 domains and found 67 percent had implementation issues; 31.02 percent had conflicting hreflang directives; 16.04 percent had no self-referencing tags; 8.91 percent used unknown language codes (Ahrefs Study, "Hreflang Implementation Study", 2024 sample, n=374,756 domains).
This framework specifies what correct hreflang looks like, how to deploy it across three implementation methods, how to resolve the canonical-versus-hreflang conflict deterministically, how to handle paginated series, when to use x-default and when to skip it, the validation tooling stack, and the ten most common anti-patterns with their fixes.
1.2 What This Document Is Not
Not a strategic framework for whether to internationalize, which markets to target, or how to localize content; that is framework-international.md. Not a CMS-agnostic schema reference; that is framework-schema.md. Not a substitute for the content-first architectural doctrine in framework-contentfirst.md, which establishes that the hreflang block must render in the first byte of the server response, not be injected by client-side JavaScript.
1.3 Three Operating Modes
Mode A, Install: build hreflang infrastructure on a new or rebuild engagement. Read Sections 2 through 12 in order.
Mode B, Audit: evaluate an existing implementation. Skip to Section 13.
Mode C, Hybrid: audit first, install for failing items. Most engagements run as Mode C because hreflang is rarely correct on legacy sites.
1.4 How Claude Code CLI Should Consume This Document
- Section 2: client variables (markets, languages, current URL structure, current hreflang status).
- Section 3: confirm operator understands what hreflang is and is not.
- Section 4: URL structure decision (ccTLD vs subdirectory vs subdomain). Decision is captured before tag generation.
- Section 5: pick implementation method (HTML head vs HTTP header vs XML sitemap).
- Section 6: decide whether x-default is required.
- Section 7: if site has paginated series, apply pagination interaction rules.
- Section 8: confirm canonical-hreflang relationship for every cluster.
- Section 9: decide regional sub-variant policy (en-US vs en-GB vs en-CA threshold).
- Section 10: run validation before deploy and after.
- Section 11: cross-check against the ten anti-patterns.
- Section 13: audit rubric.
1.5 Required Tools and Validators
- Google Search Console (search.google.com/search-console), coverage reports, URL inspection. Note: the International Targeting report was deprecated in Search Console on September 22, 2022 (Google Search Central blog and support.google.com/webmasters/answer/12474899). Country targeting and the hreflang error tab no longer exist. Hreflang itself is fully supported; only the report interface was removed. This is a frequent source of confusion in 2026 audits.
- Screaming Frog SEO Spider, desktop crawler with thirteen dedicated hreflang filters including missing self-reference, missing return links, incorrect language codes, non-200 hreflang URLs, hreflang to non-canonical, and inconsistent language confidence (Screaming Frog, "How To Audit Hreflang", 2025 documentation).
- Sitebulb, scheduled full-site international audits with hreflang cluster visualization that chases cross-domain alternates (Sitebulb, "International Hints" documentation, 2025).
- Aleyda Solis hreflang Tags Generator (aleydasolis.com/en/seo-resources-tools/hreflang-tags-generator/), free generator supporting up to 50 URL variants per session, produces HTML head, HTTP header, and XML sitemap output. The most widely used free tool in international SEO (cross-referenced by Google Search Central documentation, 2025).
- hreflang.org Testing Tool (app.hreflang.org), live tag tester for an individual URL with return-tag verification.
- Merkle hreflang Tag Tester (technicalseo.com/tools/hreflang/), alternative tester that also validates HTTP header implementation.
- curl + xmllint + grep, for command-line validation against deployment. Bash scripts in Section 10.7.
- Google Rich Results Test, does not validate hreflang directly but verifies the head block is reachable and parseable.
1.6 Scope and Boundaries
Covers: hreflang format (language and region codes), the three implementation methods, x-default rules, pagination interaction, canonical precedence, regional sub-variant policy, validation methodology, common anti-patterns, monitoring after deprecation of the GSC International Targeting report, and the audit rubric. Touches but does not exhaust: URL structure strategy (framework-international.md), canonical signal stack (framework-technicalseo.md), schema markup (framework-schema.md), Next.js metadata API patterns (framework-nextjs.md), and CMS specifics (framework-wordpress.md, framework-shopify.md).
2. Client Variables Intake
# HREFLANG CLIENT VARIABLES
# Markets
target_countries: [] # ["US","GB","CA","AU","DE","FR","ES","MX","BR","JP"]
target_languages: [] # ["en","de","fr","es","pt","ja"]
language_country_matrix: [] # ["en-US","en-GB","en-CA","en-AU","de-DE","de-AT","de-CH",
# "fr-FR","fr-CA","es-ES","es-MX","pt-BR","pt-PT","ja-JP"]
total_locales: 0
primary_market: "" # the locale that gets traffic if no other matches
fallback_market: "" # used as x-default candidate
# Content readiness
locales_with_full_translation: []
locales_with_partial_translation: []
locales_with_machine_translation_only: []
locales_with_localized_content: [] # currency, units, examples adapted
content_identity_us_vs_gb: "" # "identical" | "spelling_only" | "fully_localized"
content_identity_es_es_vs_es_mx: "" # same scale
# Current URL structure
url_structure: "" # "cctld" | "subdirectory" | "subdomain" | "gtld_with_param" | "mixed"
domain_apex: "" # primary domain
cctld_inventory: [] # ["example.com","example.co.uk","example.de","example.fr"]
subdirectory_pattern: "" # "/en-us/", "/us/en/", "/de/", "/fr-fr/"
subdomain_pattern: "" # "us.example.com", "de.example.com"
url_case_policy: "lowercase"
trailing_slash_policy: "" # with-slash | without-slash, must match technicalseo.md
# Current hreflang status
hreflang_present: false
hreflang_method: "" # "html_head" | "http_header" | "xml_sitemap" | "mixed" | "none"
hreflang_locale_count: 0
hreflang_self_referencing: false
hreflang_bidirectional_verified: false
x_default_present: false
x_default_target: "" # which URL is the x-default
canonical_self_referencing: false
canonical_points_to_alternate: false # the broken pattern; should be false
known_hreflang_errors: []
# Pagination
paginated_series_present: false
pagination_types: [] # ["blog_archive","category","ecommerce_catalog","search_results"]
pagination_uses_rel_next_prev: false
pagination_self_canonical: false
pagination_has_hreflang: false
# Non-HTML resources
pdfs_internationalized: false
pdfs_use_http_header_hreflang: false
# Monitoring
gsc_property_per_locale: false
gsc_property_inventory: []
last_hreflang_validation_date: ""
hreflang_validation_tool: "" # "screaming_frog" | "sitebulb" | "manual" | "none"
hreflang_change_management_process: "" # how new locales get added
# Migration context
recent_url_migration: false
recent_locale_addition: false
recent_locale_removal: false
A field left blank during intake is an audit item. The hreflang_self_referencing, hreflang_bidirectional_verified, and canonical_points_to_alternate flags are the three primary signals for whether a cluster is functional or broken.
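As a rough illustration of how those three flags combine, the sketch below classifies a cluster from an intake record. The function name `cluster_status` and the wording of the messages are assumptions made for this example, not part of the intake format; a missing flag is treated as a failure, in line with the rule that a blank field is an audit item.

```python
def cluster_status(intake: dict) -> str:
    """Classify a cluster from the three primary intake flags.

    A missing flag is treated as a failure (blank field = audit item).
    """
    problems = []
    if not intake.get("hreflang_self_referencing", False):
        problems.append("missing self-reference")
    if not intake.get("hreflang_bidirectional_verified", False):
        problems.append("return links unverified")
    # This flag is the broken pattern, so True is the failure case.
    if intake.get("canonical_points_to_alternate", False):
        problems.append("canonical points to an alternate")
    return "functional" if not problems else "broken: " + ", ".join(problems)


healthy = {
    "hreflang_self_referencing": True,
    "hreflang_bidirectional_verified": True,
    "canonical_points_to_alternate": False,
}
print(cluster_status(healthy))
print(cluster_status({"hreflang_self_referencing": True}))
```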
3. What Hreflang Is
3.1 The Working Definition
Hreflang is an HTML attribute (rel="alternate" hreflang="...") that declares, for a given URL, the alternate URLs that serve the same content for different language and region targets. It is consumed by Google web search, Yandex, and Naver. Bing publicly stated in 2016 that Bingbot ignores hreflang and uses the <html lang="..."> attribute and content-language HTTP header instead (Bing Webmaster Blog, "How Bing handles hreflang", 2016, still cited in 2025 Bing Webmaster documentation). Baidu does not consume hreflang. ChatGPT, Claude, Perplexity, and other AI search engines do not consume hreflang at this writing (May 2026); they consume the first-byte HTML and the visible content.
The narrow technical claim hreflang makes: "I, this URL, am one version of a content unit. Here are the other versions, and here is the language and optional region each targets." Hreflang does not declare canonicality. It does not declare which version Google should rank. It distributes a single ranking signal across a cluster so that the locale-matched version is the one shown in the SERP for a user in that locale.
3.2 What Hreflang Is Not
Not a canonical signal. Hreflang does not tell Google which version is the master. Canonical does that. Hreflang is orthogonal: it identifies a peer group, where every peer is self-canonical, and the cluster as a whole shares ranking signals while the locale-matched URL is shown in the SERP.
Not a ranking signal. Hreflang does not improve rankings. It improves the probability that the correct locale URL is shown when the cluster ranks. The cluster ranks based on the content, links, and authority of the URLs in the cluster.
Not a directive. Google has stated repeatedly through 2025 that hreflang is a "hint, not a directive" (John Mueller, Search Off the Record, March 2024; Google Search Central docs revision history). Google may serve a different URL than the hreflang-matched one if other signals are stronger, including content similarity, canonical signals, and user location.
Not the <html lang="..."> attribute. The lang attribute declares the language of the current document for accessibility and rendering (font fallbacks, spell-check, screen reader pronunciation). It does not declare alternate language versions. The two attributes are independent, but both should be present and consistent. Bing relies on <html lang> (and content-language HTTP header) instead of hreflang.
Not server-side content negotiation. Sites that serve different content for the same URL based on Accept-Language HTTP header are doing content negotiation, which is a separate technique. Content negotiation cannot be combined with hreflang because hreflang requires distinct URLs per locale. Mixing them is a common anti-pattern (Section 11).
3.3 The Relationship to Other Signals
signal_relationships:
  hreflang_vs_canonical:
    interaction: "peers, not conflict if implemented correctly"
    rule: "every URL in a cluster self-canonicals; hreflang declares the peer set"
    conflict_mode: "if canonical points to an alternate, hreflang is silently discarded by Google"
  hreflang_vs_html_lang:
    interaction: "both required; serve different consumers"
    rule: "html lang declares this document's language; hreflang declares alternates"
    consumer_split: "Google uses hreflang; Bing uses html lang and Content-Language"
  hreflang_vs_content_language_header:
    interaction: "both can coexist; Content-Language is hint for non-hreflang crawlers"
    rule: "Content-Language: en-US (or just en) in HTTP response"
    practical_use: "Bing, accessibility tools, browser language detection"
  hreflang_vs_country_in_gsc:
    interaction: "GSC country targeting deprecated September 22, 2022"
    rule: "for subdirectory and subdomain structures, GSC country targeting is no longer settable"
    consequence: "rely on ccTLD signal, hreflang, and on-page content cues only"
  hreflang_vs_geoip_redirect:
    interaction: "incompatible at the redirect layer"
    rule: "do not auto-redirect by IP; offer a banner with a link; let user choose"
    why: "Googlebot crawls from one IP region; auto-redirect blocks crawl coverage"
3.4 The Citation Surface for Hreflang in 2026
Google web search is the dominant consumer. Yandex consumes hreflang in much the same way. Naver consumes hreflang for Korean targeting. Bing ignores hreflang and uses html lang plus Content-Language. AI search engines (ChatGPT, Claude, Perplexity, Gemini, OpenAI Search) do not consume hreflang at all; they fetch the URL the user-facing rendering or referral surface points to, and they read the visible content. The 2026 implication: a site investing in hreflang is investing in Google SERP locale routing, not in AI citation routing. AI citation routing is decided by the visible content, the URL structure, and the linking topology of internal navigation, not by hreflang.
4. URL Structure Decision Tree
Before any hreflang tag is written, the URL structure question is settled. Hreflang annotates a URL structure; it does not create one. The four primary options, and the decision rule for each, follow.
4.1 The Four Options
Option A, ccTLDs (Country Code Top Level Domains).
example.com -> US or global
example.co.uk -> UK
example.de -> Germany
example.fr -> France
example.com.mx -> Mexico
example.com.br -> Brazil
Each country gets its own root domain. The country target is implied by the TLD itself: .de targets Germany, .fr targets France, with no further signal needed. Hreflang is still recommended because the same content can target multiple countries (.com for the US plus the UK plus AU), and because language can differ from country (German content on .ch for Swiss German).
Pros:
- Strongest country geo-signal Google recognizes (Google Search Central, "Managing Multi-Regional and Multilingual Sites", revision 2024).
- Country target is automatic; no GSC setting required (and GSC country targeting is gone anyway since 2022).
- Local users perceive a local brand presence.
- Easier server-side localization at the infrastructure layer (different hosting per country if needed).
- Each country can run its own marketing, ad spend, and PR independently.
Cons:
- Each ccTLD accumulates authority separately. A link to .de does not help .fr rank.
- Higher cost: domain acquisition (some country domains are expensive or restricted), hosting per region, SSL per region, monitoring per region.
- More operational overhead (multiple GSC properties, Analytics properties, sitemaps, robots.txt files).
- Migration to or from this structure is more disruptive than other patterns.
Best for: large multinationals with budget and dedicated per-country operations, brands serving regulated industries where per-country legal entity separation is required, established global brands with separate country marketing teams.
Option B, Subdirectories on a gTLD.
example.com/ -> US default
example.com/en-gb/ -> UK English
example.com/de/ -> Germany
example.com/fr/ -> France
example.com/es-mx/ -> Mexico Spanish
A single gTLD (.com, .org, .net, .io, etc.) with locale segments in the URL path. Authority accumulates to one domain. Hreflang is mandatory because the TLD gives no country signal.
Pros:
- All link authority accumulates to one root domain.
- Simpler operational footprint: one DNS, one SSL, one hosting environment, one GSC property as a baseline (sub-properties optional).
- Cost-effective.
- Migrations are simpler; new locales are added by deploying a new subdirectory.
- Google has indicated through 2024-2025 docs that subdirectories work effectively for most cases.
Cons:
- Weaker country geo-signal than ccTLDs; relies entirely on hreflang plus content cues.
- All locales share infrastructure; a server outage takes down every locale.
- Some markets perceive a .com as "American site," which can suppress conversion in privacy-sensitive markets (Germany, France, Japan).
Best for: the default modern recommendation. SaaS, B2B, mid-market consumer brands, agencies, content sites. Most sites should choose subdirectories unless a specific reason requires ccTLDs.
Option C, Subdomains on a gTLD.
www.example.com -> global
us.example.com -> US
uk.example.com -> UK
de.example.com -> Germany
fr.example.com -> France
Locale segments as subdomains. Authority distribution between subdomains is debated (Google has at various points said subdomains may be treated as separate sites for ranking purposes; current 2024-2025 guidance is "treated as part of the same site for crawling and indexing").
Pros:
- Per-subdomain hosting and infrastructure possible.
- Per-subdomain GSC property is conventional.
- Per-region cookie scope is cleaner (cookies do not bleed across subdomains by default).
- Suits organizations where per-region IT teams operate independently.
Cons:
- Authority signal aggregation is less reliable than subdirectories. Internal evidence and external research consistently find subdirectories outperform subdomains for international SEO when the rest is equal (Moz, "Subdomains vs Subfolders", recurring annual analysis 2018-2024; Ahrefs, similar conclusions).
- More complex than subdirectories with no clear gain for most cases.
- Setup, monitoring, and migration all more complex.
Best for: organizations with strong per-region IT teams that need infrastructure separation, sites where per-region brand identity is intentionally distinct enough to merit subdomain separation, sites where ccTLDs are out of reach but country separation is strategic.
Option D, gTLD with URL Parameter.
example.com/?lang=en
example.com/?country=us&lang=en
Locale carried as a query parameter rather than in the URL path.
Pros:
- None that justify the pattern in 2026.
Cons:
- Parameter-based URLs are harder for Google to crawl and consolidate (parameter handling in GSC was removed in 2022 alongside International Targeting).
- Sharing a URL strips parameters in many user-flow patterns.
- Internal linking is brittle.
- Hreflang must still be implemented; the parameter does not replace it.
Best for: legacy sites that cannot migrate. New builds should not use this pattern.
4.2 The Decision Tree
Does the business need per-country legal, billing, or operational separation?
  YES -> Use ccTLDs. Move to Section 5.
  NO -> Continue.
Does the business have separate per-country marketing teams with separate budgets?
  YES -> Consider ccTLDs if budget allows; subdirectories if not.
  NO -> Continue.
Does the business need per-region infrastructure separation (data residency, latency)?
  YES -> Consider subdomains or ccTLDs.
  NO -> Continue.
Default -> Use subdirectories with locale segment (/en-us/, /de/, etc.) and rely on hreflang.
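The tree above can be read as a short chain of early returns. The following is a minimal sketch of that reading; the function name, the parameter names, and the returned labels are assumptions chosen for illustration, and the "consider" branches are collapsed into a single suggested value rather than a final decision.

```python
def choose_url_structure(
    needs_legal_separation: bool,
    separate_marketing_teams: bool,
    budget_allows_cctlds: bool,
    needs_infra_separation: bool,
) -> str:
    """Walk the Section 4.2 questions in order and return a suggested structure."""
    if needs_legal_separation:
        return "cctld"
    if separate_marketing_teams:
        # "Consider ccTLDs if budget allows; subdirectories if not."
        return "cctld" if budget_allows_cctlds else "subdirectory"
    if needs_infra_separation:
        # The tree leaves this choice open between the two options.
        return "subdomain_or_cctld"
    return "subdirectory"  # the default recommendation


print(choose_url_structure(False, False, False, False))
print(choose_url_structure(False, True, False, False))
```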
4.3 The Locale Segment Pattern
For subdirectory implementations, the locale segment format choice:
/en-us/ -> language-region (recommended for multi-region same-language)
/us/en/ -> region-language (less common; two-level)
/en/ -> language only (single-region or language-agnostic)
/us/ -> region only (avoid; it cannot map to a valid hreflang value, which requires a language)
The recommended pattern is language-region as a single segment (/en-us/, /de-de/, /fr-ca/). It maps cleanly to the hreflang value. It is unambiguous to humans reading the URL. It allows easy aggregation in analytics. Hyphens, not underscores or slashes between language and region.
Lowercase throughout: /en-us/ not /en-US/. The hreflang attribute value is case-insensitive per Google docs, but lowercase in URLs is the trailing-slash-and-case-policy default from framework-technicalseo.md. The hreflang attribute value itself is conventionally written as en-US (uppercase region) in HTML even though the URL is lowercase. Both are valid.
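To show how a lowercase URL segment maps to the conventionally cased hreflang value, here is a small sketch. The helper name `hreflang_from_path` and the regular expression are assumptions for this example; the pattern only checks shape (two or three lowercase letters, optional two-letter region), so it does not validate codes against ISO 639-1 or ISO 3166-1 and would accept a non-locale segment such as /api/.

```python
import re
from typing import Optional

# Leading segment shaped like /en-us/ or /de/ (lowercase only, hyphen separator).
SEGMENT_RE = re.compile(r"^/([a-z]{2,3})(?:-([a-z]{2}))?/")


def hreflang_from_path(path: str) -> Optional[str]:
    """Derive the hreflang value (en-US style) from a lowercase locale segment."""
    match = SEGMENT_RE.match(path)
    if match is None:
        return None
    language, region = match.groups()
    # Region is uppercased by convention; the language stays lowercase.
    return f"{language}-{region.upper()}" if region else language


print(hreflang_from_path("/en-us/products/widget/"))
print(hreflang_from_path("/de/"))
print(hreflang_from_path("/en-US/products/widget/"))  # mixed-case URL is rejected
```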
4.4 The "Will We Add More Locales?" Future-Proofing
Plan the URL structure to accommodate locales that do not exist yet. A site that launches with /en/ and /es/ and later wants to add /en-gb/ faces an awkward transition because /en/ was implicitly US English. Better: launch with /en-us/ and /es-es/ even if there is only one of each, leaving room to add /en-gb/, /es-mx/ cleanly later.
If the site launches with language-only segments (/en/, /es/, /de/), use them only when sub-regions are explicitly out of scope and unlikely to be added. The migration cost from language-only to language-region segments is high (URL change for every page in every existing language, redirects, hreflang rewrites, sitemap rewrites).
4.5 The "Mixed Structure" Anti-Pattern
Sites that organically grew across regions sometimes end up with a mixed structure: .de as ccTLD for Germany, /fr/ as subdirectory for France, uk.example.com as subdomain for UK. Mixed structures are not invalid, but they are operationally harder, harder to audit, and create more places for hreflang to break. The migration cost to unify them is real, and the SEO benefit of unification is usually marginal compared to the cost. For an established mixed structure, the typical recommendation is: do not unify, but invest disproportionately in hreflang validation and per-property GSC monitoring.
5. Hreflang Implementation Methods
Three methods exist. Pick one (or for non-HTML resources, two). Mixing methods is permitted but adds maintenance surface.
5.1 Method 1: HTML Head Link Tags
The most common and most directly visible method. Place <link rel="alternate" hreflang="..."> tags in the <head> of every page that is part of an hreflang cluster.
<head>
<!-- self -->
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/products/widget/">
<!-- peers -->
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/products/widget/">
<link rel="alternate" hreflang="en-CA" href="https://example.com/en-ca/products/widget/">
<link rel="alternate" hreflang="es-ES" href="https://example.com/es-es/products/widget/">
<link rel="alternate" hreflang="es-MX" href="https://example.com/es-mx/products/widget/">
<link rel="alternate" hreflang="de-DE" href="https://example.com/de-de/products/widget/">
<link rel="alternate" hreflang="fr-FR" href="https://example.com/fr-fr/products/widget/">
<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/products/widget/">
<!-- canonical to self -->
<link rel="canonical" href="https://example.com/en-us/products/widget/">
</head>
Required properties for HTML head implementation:
- Tags must appear in <head>, not in <body>. Tags in <body> are silently ignored by Google.
- Tags must render in the first-byte HTML response, not be injected by client-side JavaScript. framework-contentfirst.md covers the architectural reason: AI crawlers and many SEO crawlers do not execute JavaScript, and Google's own JavaScript rendering can drop the tag in some cases.
- URLs must be absolute, including https://, the domain, and the full path.
- URLs must be the canonical version of each peer; the URL given for en-GB must be the URL that returns HTTP 200 with <link rel="canonical" href="...this same URL"> in its head.
- The current page's self-reference must be included.
- Bidirectional confirmation: if en-US lists en-GB, then en-GB must list en-US. Same for every pair in the cluster.
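Several of these properties can be checked mechanically against the raw server response. The sketch below uses only the Python standard library and covers three of them: tags inside <head>, absolute https URLs, and the self-reference plus self-canonical. The class and function names are assumptions for this example, and the check works on a single page, so it does not cover return links or HTTP status of the peers.

```python
from html.parser import HTMLParser
from urllib.parse import urlparse


class HeadAlternates(HTMLParser):
    """Collect hreflang alternates and the canonical that appear inside <head>."""

    def __init__(self):
        super().__init__()
        self.in_head = False
        self.alternates = {}  # lowercased hreflang value -> href
        self.canonical = None

    def handle_starttag(self, tag, attrs):
        if tag == "head":
            self.in_head = True
        elif tag == "link" and self.in_head:  # links in <body> are skipped
            attr = dict(attrs)
            rel = (attr.get("rel") or "").lower().split()
            if "alternate" in rel and attr.get("hreflang"):
                self.alternates[attr["hreflang"].lower()] = attr.get("href") or ""
            elif "canonical" in rel:
                self.canonical = attr.get("href")

    def handle_endtag(self, tag):
        if tag == "head":
            self.in_head = False


def check_head(html: str, page_url: str) -> list:
    """Return a list of problems found in one page's hreflang block."""
    parser = HeadAlternates()
    parser.feed(html)
    errors = []
    locale_urls = [u for code, u in parser.alternates.items() if code != "x-default"]
    if page_url not in locale_urls:
        errors.append("missing self-reference")
    for code, href in parser.alternates.items():
        parts = urlparse(href)
        if parts.scheme != "https" or not parts.netloc:
            errors.append(f"{code}: href is not an absolute https URL")
    if parser.canonical != page_url:
        errors.append("canonical is not self-referencing")
    return errors
```

Feeding it the first-byte HTML (for example the output of curl, not a rendered DOM) keeps the check aligned with the no-JavaScript requirement above.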
Pros:
- Most directly visible: a developer or SEO can view-source and see the tags.
- Easiest to audit with Screaming Frog.
- No sitemap dependency.
Cons:
- Every page must render the full cluster of alternates. For a site with 20 locales and 10,000 pages, every page has 21 link tags (20 locales plus x-default). Page weight grows linearly with locale count.
- Updates to the cluster (add a locale, remove a locale, change a URL) require redeploying every page.
- High visual head-noise for developers maintaining templates.
Best for: sites under five locales, sites where editorial control over per-page head is straightforward, sites without sitemap discipline.
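The tag-count arithmetic from the cons above can be made explicit. This is a back-of-envelope sketch; the function name is an assumption, and it reads "10,000 pages" as the total number of pages across all locales.

```python
def head_tag_count(locales: int, total_pages: int, x_default: bool = True):
    """Return (hreflang tags per page, hreflang tags across the whole site)."""
    per_page = locales + (1 if x_default else 0)
    return per_page, per_page * total_pages


per_page, site_wide = head_tag_count(locales=20, total_pages=10_000)
print(per_page)   # 21
print(site_wide)  # 210000
```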
5.2 Method 2: XML Sitemap
The XHTML link extension in XML sitemaps lets every URL declare its alternates inside the sitemap entry. The HTML head can omit hreflang entirely if the sitemap version is complete.
<?xml version="1.0" encoding="UTF-8"?>
<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9"
xmlns:xhtml="http://www.w3.org/1999/xhtml">
<url>
<loc>https://example.com/en-us/products/widget/</loc>
<xhtml:link rel="alternate" hreflang="en-US"
href="https://example.com/en-us/products/widget/"/>
<xhtml:link rel="alternate" hreflang="en-GB"
href="https://example.com/en-gb/products/widget/"/>
<xhtml:link rel="alternate" hreflang="en-CA"
href="https://example.com/en-ca/products/widget/"/>
<xhtml:link rel="alternate" hreflang="es-ES"
href="https://example.com/es-es/products/widget/"/>
<xhtml:link rel="alternate" hreflang="es-MX"
href="https://example.com/es-mx/products/widget/"/>
<xhtml:link rel="alternate" hreflang="de-DE"
href="https://example.com/de-de/products/widget/"/>
<xhtml:link rel="alternate" hreflang="fr-FR"
href="https://example.com/fr-fr/products/widget/"/>
<xhtml:link rel="alternate" hreflang="x-default"
href="https://example.com/en-us/products/widget/"/>
</url>
<url>
<loc>https://example.com/en-gb/products/widget/</loc>
<xhtml:link rel="alternate" hreflang="en-US"
href="https://example.com/en-us/products/widget/"/>
<xhtml:link rel="alternate" hreflang="en-GB"
href="https://example.com/en-gb/products/widget/"/>
<!-- ... same cluster repeated ... -->
</url>
</urlset>
Required properties for sitemap implementation:
- Every URL in the cluster must appear as a <loc> in the sitemap with its own <xhtml:link> block listing all alternates including itself.
- Alternates inside <xhtml:link> do not count toward the 50,000-URL limit per sitemap file; only the primary <loc> counts (Google Search Central, sitemap protocol documentation, 2024). They do add bytes, and the file-size limit is 50 MB uncompressed, so large clusters can reach the size cap before the URL cap.
- Use sitemap index files if more than 50,000 canonical URLs.
- The sitemap must be submitted to GSC for each property and referenced from robots.txt.
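Because every cluster member repeats the same alternate block, the sitemap is usually generated rather than hand-written. The following is a minimal sketch using the standard library's ElementTree; the function name `build_sitemap` and the input shape (one dict per cluster, mapping hreflang value to URL) are assumptions for this example. It also assumes x-default reuses the URL of a locale in the cluster, as in the sample above; a standalone selector page would need its own <url> entry.

```python
import xml.etree.ElementTree as ET

SM_NS = "http://www.sitemaps.org/schemas/sitemap/0.9"
XHTML_NS = "http://www.w3.org/1999/xhtml"


def build_sitemap(clusters) -> str:
    """Emit one <url> per locale URL, each listing the full cluster including itself."""
    ET.register_namespace("", SM_NS)
    ET.register_namespace("xhtml", XHTML_NS)
    urlset = ET.Element(f"{{{SM_NS}}}urlset")
    for cluster in clusters:
        for code, url in cluster.items():
            if code == "x-default":
                continue  # listed as an alternate only, not as its own entry
            entry = ET.SubElement(urlset, f"{{{SM_NS}}}url")
            ET.SubElement(entry, f"{{{SM_NS}}}loc").text = url
            for alt_code, alt_url in cluster.items():
                ET.SubElement(
                    entry,
                    f"{{{XHTML_NS}}}link",
                    {"rel": "alternate", "hreflang": alt_code, "href": alt_url},
                )
    # Returns the document body without the <?xml ...?> declaration.
    return ET.tostring(urlset, encoding="unicode")


cluster = {
    "en-US": "https://example.com/en-us/products/widget/",
    "en-GB": "https://example.com/en-gb/products/widget/",
    "x-default": "https://example.com/en-us/products/widget/",
}
print(build_sitemap([cluster]))
```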
Pros:
- The entire cluster is declared in one place per URL. Adding or removing a locale is one sitemap change, not a redeploy of every page.
- Page templates are clean; no per-page hreflang block to maintain.
- Scales well to large sites (1M+ URLs).
- Allows automated cluster generation from a single source of truth (typically a CMS database of URLs and locales).
Cons:
- Less directly visible; SEOs viewing source see only canonical, not hreflang.
- Requires sitemap discipline: every URL in the cluster must be in the sitemap, and the sitemap must be current.
- Some crawlers do not consume hreflang from sitemaps as reliably as from HTML head.
- If the sitemap is broken or out of date, the cluster silently degrades.
Best for: sites with five or more locales, large sites (10,000+ pages), sites with mature build pipelines that can generate sitemaps from a CMS or database, headless sites where the build step controls sitemaps cleanly.
5.3 Method 3: HTTP Link Header
For non-HTML resources (PDFs, images, video files), the HTML head method is not applicable. The HTTP Link header carries the hreflang declaration in the response headers.
HTTP/1.1 200 OK
Content-Type: application/pdf
Link: <https://example.com/en-us/whitepaper.pdf>; rel="alternate"; hreflang="en-US",
<https://example.com/en-gb/whitepaper.pdf>; rel="alternate"; hreflang="en-GB",
<https://example.com/de-de/whitepaper.pdf>; rel="alternate"; hreflang="de-DE",
<https://example.com/fr-fr/whitepaper.pdf>; rel="alternate"; hreflang="fr-FR",
<https://example.com/en-us/whitepaper.pdf>; rel="alternate"; hreflang="x-default"
nginx configuration to deliver the header for a PDF cluster:
# /var/www/sites/example/nginx-hreflang-pdf.conf
location = /en-us/whitepaper.pdf {
add_header Link '<https://example.com/en-us/whitepaper.pdf>; rel="alternate"; hreflang="en-US", <https://example.com/en-gb/whitepaper.pdf>; rel="alternate"; hreflang="en-GB", <https://example.com/de-de/whitepaper.pdf>; rel="alternate"; hreflang="de-DE", <https://example.com/fr-fr/whitepaper.pdf>; rel="alternate"; hreflang="fr-FR", <https://example.com/en-us/whitepaper.pdf>; rel="alternate"; hreflang="x-default"';
}
The header on every URL in the cluster must list every URL in the cluster including itself.
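Since every file in the cluster carries the same header value, it is convenient to build it once from the cluster map and to check a live response against that map. The sketch below assumes the parameter order used in the sample above (rel before hreflang); the function names are hypothetical, and the regular expression is not a general parser for every valid Link header.

```python
import re

LINK_RE = re.compile(r'<([^>]+)>\s*;\s*rel="alternate"\s*;\s*hreflang="([^"]+)"')


def build_link_header(cluster: dict) -> str:
    """Serialize {hreflang: url} into a single Link header value."""
    return ", ".join(
        f'<{url}>; rel="alternate"; hreflang="{code}"' for code, url in cluster.items()
    )


def parse_link_header(value: str) -> dict:
    """Parse a Link header value back into {hreflang: url}."""
    return {code: url for url, code in LINK_RE.findall(value)}


def missing_alternates(header_value: str, cluster: dict) -> list:
    """List hreflang values that are absent from the header or point elsewhere."""
    found = parse_link_header(header_value)
    return sorted(code for code, url in cluster.items() if found.get(code) != url)


pdf_cluster = {
    "en-US": "https://example.com/en-us/whitepaper.pdf",
    "en-GB": "https://example.com/en-gb/whitepaper.pdf",
    "x-default": "https://example.com/en-us/whitepaper.pdf",
}
header = build_link_header(pdf_cluster)
print(header)
print(missing_alternates(header, pdf_cluster))
```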
Bash script to print a PDF's Link header, one alternate per line, for inspection on Bubbles:
#!/bin/bash
# /var/www/sites/example/check-pdf-hreflang.sh
URL="https://example.com/en-us/whitepaper.pdf"
echo "Checking Link header for $URL"
curl -sI "$URL" | grep -i "^link:" | tr ',' '\n' | sed 's/^[[:space:]]*//'
Pros:
- Only method for non-HTML resources.
- Works for any file type the server delivers.
- Survives content-type variations.
Cons:
- Header configuration per file is verbose and brittle.
- Cannot be set from inside the document (the document is a binary).
- Must be configured at the server layer (nginx, Apache, or upstream proxy).
Best for: PDFs that exist in multiple locales (whitepapers, product specs, manuals), image assets that are localized (only when the image itself is translated, which is rare), localized video files.
5.4 Mixing Methods
Mixing is permitted by Google. The most common useful mix:
- HTML head for HTML pages.
- HTTP Link header for non-HTML resources (PDFs).
- XML sitemap as a backstop redundancy (optional but useful for very large sites).
What does not work: declaring different alternates in the HTML head and the XML sitemap for the same URL. The two sources must agree, or Google may consolidate inconsistently. The rule: pick one source of truth (head or sitemap) per resource type, and treat the others as redundant declarations of the same data, not as competing declarations.
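One way to enforce the agreement rule is to diff the two declarations for each URL. The sketch below assumes both sources have already been extracted into dicts mapping hreflang value to URL; the function name and the result keys are assumptions for this example. Hreflang values are compared case-insensitively, while URLs are compared exactly.

```python
def diff_sources(head: dict, sitemap: dict) -> dict:
    """Compare the alternates declared in the HTML head and in the sitemap for one URL."""
    h = {code.lower(): url for code, url in head.items()}
    s = {code.lower(): url for code, url in sitemap.items()}
    return {
        "only_in_head": sorted(h.keys() - s.keys()),
        "only_in_sitemap": sorted(s.keys() - h.keys()),
        "url_mismatch": sorted(code for code in h.keys() & s.keys() if h[code] != s[code]),
    }


head = {
    "en-US": "https://example.com/en-us/",
    "en-GB": "https://example.com/en-gb/",
}
sitemap = {
    "en-us": "https://example.com/en-us/",
    "en-gb": "https://example.com/uk/",
    "de-DE": "https://example.com/de-de/",
}
print(diff_sources(head, sitemap))
```

An empty list under all three keys means the redundant declaration matches the source of truth for that URL.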
5.5 Implementation Method Decision
implementation_method_decision:
  locales_under_5_and_pages_under_1000:
    method: "HTML head only"
    rationale: "simple, directly visible, audit-friendly"
  locales_5_to_20_or_pages_1000_to_50000:
    method: "XML sitemap primary; HTML head optional"
    rationale: "cluster maintenance scales"
  locales_over_20_or_pages_over_50000:
    method: "XML sitemap mandatory; HTML head redundant if templates allow"
    rationale: "head-only is too brittle at scale"
  pdfs_and_non_html_present:
    method: "HTTP Link header for those resources, in addition to chosen HTML method"
    rationale: "only option for non-HTML"
  legacy_site_already_using_one_method:
    method: "stay with current method; do not migrate without audit"
    rationale: "migration during ranking-sensitive periods risks dropping cluster signals"
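The thresholds above can be sketched as a function. The tiers overlap when locale count and page count fall in different bands, so this example assumes the larger tier wins; that precedence, the function name, and the returned labels are assumptions, and the legacy-site rule is left out because it depends on an audit rather than on counts.

```python
def choose_method(locales: int, pages: int, has_non_html: bool = False) -> list:
    """Suggest implementation methods from locale count and page count."""
    if locales > 20 or pages > 50_000:
        methods = ["xml_sitemap (mandatory)", "html_head (redundant, if templates allow)"]
    elif locales >= 5 or pages >= 1_000:
        methods = ["xml_sitemap (primary)", "html_head (optional)"]
    else:
        methods = ["html_head"]
    if has_non_html:
        # PDFs and other non-HTML files need the header regardless of tier.
        methods.append("http_link_header (non-HTML resources)")
    return methods


print(choose_method(locales=3, pages=500))
print(choose_method(locales=8, pages=500, has_non_html=True))
print(choose_method(locales=3, pages=60_000))
```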
6. The x-default Tag
6.1 What x-default Does
x-default is a special hreflang value that declares a fallback URL for users whose language and region do not match any of the specific locale tags in the cluster. It does not target a language. It does not target a country. It is the "if none of the above, serve this" pointer.
<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/products/widget/">
Google introduced x-default in April 2013 (Google Search Central Blog, "Introducing x-default hreflang for international landing pages", April 10, 2013). Its purpose: a Polish user reaching a site that has only English, Spanish, and German versions has no language match. Without x-default, Google guesses. With x-default, the site picks.
6.2 When to Use x-default
Use it when:
- The cluster contains more than one locale. A single-locale site has no x-default.
- The cluster has a clear "default for unmatched" choice. Typically the global English version, or the original language version, or the version with the most complete content.
- The site has a country or language selector page at the root (example.com/ with country selector); the root is often the x-default.
- The same-language same-content version covers multiple regions: a generic English page for users not in US, GB, AU, IE, IN.
Skip it when:
- The site has only one locale (no cluster exists).
- Every supported language is already covered with a wildcard or default in the cluster, and the site explicitly does not want to fall back to anything for unmatched users (rare).
- The "fallback" would actually be misleading content (a German user landing on a Korean fallback by accident is worse than no fallback).
6.3 What x-default Should Point To
Three valid patterns:
Pattern A: Country selector page.
<link rel="alternate" hreflang="x-default" href="https://example.com/">
The root URL is a country selector with no localized content of its own. Users see flags or language names and pick. Google serves this page to users who match no specific locale in the cluster.
Pattern B: Default locale URL.
<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/products/widget/">
The US English version (or whichever locale is the business's default) is the fallback. Users who do not match any other locale get this version. Same URL as one of the existing hreflang values. This is the most common pattern.
Pattern C: Generic language version.
<link rel="alternate" hreflang="en" href="https://example.com/en/products/widget/">
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/products/widget/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/products/widget/">
<link rel="alternate" hreflang="x-default" href="https://example.com/en/products/widget/">
A generic English version exists separate from any country variant. The generic English is x-default. The country-specific variants exist for users in those countries. The generic English catches everyone else.
6.4 Common x-default Misuses
Misuse 1: x-default to a 404 or redirected URL.
<!-- WRONG: x-default points to a 301 -->
<link rel="alternate" hreflang="x-default" href="https://example.com/global/">
<!-- ...but /global/ redirects to /en-us/ -->
Fix: point x-default directly to the destination URL. The same rule that applies to any hreflang URL applies to x-default: 200 status, canonical, indexable.
Misuse 2: x-default to a different language than any of the cluster URLs.
<!-- WRONG: cluster is English and Spanish, x-default is French -->
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/">
<link rel="alternate" hreflang="es-ES" href="https://example.com/es-es/">
<link rel="alternate" hreflang="x-default" href="https://example.com/fr-fr/">
x-default's URL should usually appear elsewhere in the cluster as a specific locale. Pointing to a French URL that is not otherwise in the cluster is technically allowed but confusing and rarely intentional.
Misuse 3: missing x-default on a cluster that needs one.
A site with five locales and no x-default falls back to Google's algorithmic choice for unmatched users. Usually the closest language match. Usually fine. Sometimes wrong. x-default is cheap insurance.
Misuse 4: x-default on a single-locale site.
<!-- WRONG: site only has one locale -->
<link rel="alternate" hreflang="en-US" href="https://example.com/">
<link rel="alternate" hreflang="x-default" href="https://example.com/">
A single-locale site does not have a cluster. No hreflang is needed at all, and x-default has nothing to fall back from. Remove both tags.
Misuse 5: x-default that disagrees with the canonical pattern of the target URL.
If x-default points to /en-us/widget/ and that URL's canonical points to /widget/ (a different URL), the x-default declaration is invalid. The target of x-default must be a self-canonical URL.
6.5 The x-default Decision Test
Does the site serve more than one locale?
  NO -> Skip hreflang and x-default entirely.
  YES -> Continue.
Is there a sensible fallback URL for users not matching any specific locale?
  NO -> Skip x-default. Let Google pick algorithmically.
  YES -> Use x-default pointing to that URL. Continue.
Is the x-default URL self-canonical, HTTP 200, and indexable?
  NO -> Fix the URL first, then add x-default.
  YES -> Implement. Validate. Move on.
7. Hreflang on Paginated Series
7.1 The Pagination + Hreflang Interaction
Paginated series (blog archives, category listings, ecommerce catalogs, search results) introduce a second axis of URL multiplication on top of the locale axis. A blog archive with five pages of posts across four locales is 5 * 4 = 20 URLs that are all related but not interchangeable.
The cardinal rule: page N in locale A points to page N in locale B. Not page 1, not the next page, not the canonical version of the series. Page 2 of /en-us/blog/ is the hreflang peer of page 2 of /de-de/blog/. Mismatched page numbers across locales is a common cluster-breaking error.
<!-- On https://example.com/en-us/blog/page/2/ -->
<head>
<link rel="canonical" href="https://example.com/en-us/blog/page/2/">
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/blog/page/2/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/blog/page/2/">
<link rel="alternate" hreflang="de-DE" href="https://example.com/de-de/blog/page/2/">
<link rel="alternate" hreflang="fr-FR" href="https://example.com/fr-fr/blog/page/2/">
<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/blog/page/2/">
</head>
7.2 Self-Canonical on Every Paginated Page
Each paginated page in each locale canonicals to itself, not to page 1 of its series. The historical pattern of canonicalizing page 2, 3, 4 to page 1 was discouraged by Google in 2019 when rel=next/prev was deprecated and is now actively incorrect (Google Search Central, "How to specify a canonical with rel=canonical", revision 2023; Lumar SEO, "Canonical Tags Dos and Donts", 2024).
<!-- On https://example.com/en-us/blog/page/2/ -->
<link rel="canonical" href="https://example.com/en-us/blog/page/2/">
<!-- NOT this: -->
<!-- <link rel="canonical" href="https://example.com/en-us/blog/"> -->
Self-canonical lets Google index each page as a distinct entry in the series and lets each page accumulate its own ranking signals if any inbound links land on it.
7.3 What About rel="next" and rel="prev"?
Google deprecated rel=next/prev for pagination signaling in March 2019. The official position since: Google can infer pagination from internal linking patterns; rel=next/prev tags are not consumed (Google Search Central, blog post and subsequent Mueller statements, 2019-2024).
Bing continues to support rel=next/prev "on a case-by-case basis" per Bing Webmaster documentation 2023. Some other consumers may still respect it. The cost of including rel=next/prev is negligible; the benefit is partial. The current pattern most large sites use:
- Include rel=next and rel=prev in the head as documentation of the series structure (Bing and some other consumers may use them).
- Use self-canonical on every paginated page.
- Use full hreflang cluster on every paginated page, each page pointing to its locale peer at the same page number.
- Do not canonical paginated pages to page 1.
<!-- On https://example.com/en-us/blog/page/2/ -->
<head>
<link rel="canonical" href="https://example.com/en-us/blog/page/2/">
<link rel="prev" href="https://example.com/en-us/blog/">
<link rel="next" href="https://example.com/en-us/blog/page/3/">
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/blog/page/2/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/blog/page/2/">
<link rel="alternate" hreflang="de-DE" href="https://example.com/de-de/blog/page/2/">
<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/blog/page/2/">
</head>
7.4 Asymmetric Pagination Across Locales
A common reality: the English blog has 50 posts across 5 pages of 10. The German blog has only 30 posts across 3 pages of 10. The Spanish blog has 80 posts across 8 pages.
The cluster for page 1 is straightforward: every locale has a page 1. The cluster for page 2 is fine: every locale has a page 2. The cluster for page 4 fails: en-US has page 4, es-ES has page 4, but de-DE does not.
Resolution rules:
Rule A: if a locale does not have the page number, leave that locale out of the hreflang cluster for that page. en-US page 4 lists en-GB page 4 and es-ES page 4 (where they exist) plus x-default. de-DE has no page 4 in this example, so it is simply omitted.
Rule B: do not point en-US page 4 to de-DE page 1 as a "best fallback." The cluster declares peer relationships, not fallback relationships. Mismatched page numbers fragment the cluster signals.
Rule C: if pagination is so asymmetric that most pages have only one locale, reconsider whether pagination should be in hreflang at all. The usual fallback pattern: hreflang only on page 1 of each series. Pages 2+ omit hreflang and rely on user navigation. Acceptable when paginated pages are not significant traffic targets.
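Rules A and B amount to a filter: for page N, keep only the locales whose series reaches page N, and always pair like with like. A minimal sketch follows, assuming the URL layout used in the examples above (https://example.com/<locale>/blog/ for page 1 and /blog/page/N/ after that); page_cluster and its LOCALE=PAGECOUNT argument format are hypothetical.

```shell
#!/bin/bash
# page_cluster PAGE LOCALE=PAGECOUNT...
# Hypothetical helper: prints "hreflang URL" for each locale that has page PAGE.
# Assumed URL layout: /<locale>/blog/ for page 1, /<locale>/blog/page/N/ otherwise.
page_cluster() {
  local page="$1"; shift
  local entry locale count path
  for entry in "$@"; do
    locale="${entry%%=*}"
    count="${entry##*=}"
    # Rule A: omit locales whose series is shorter than the requested page.
    [ "$count" -ge "$page" ] || continue
    path=$(printf '%s' "$locale" | tr '[:upper:]' '[:lower:]')
    # Rule B: the peer is always the same page number, never page 1 as a fallback.
    if [ "$page" -eq 1 ]; then
      echo "$locale https://example.com/$path/blog/"
    else
      echo "$locale https://example.com/$path/blog/page/$page/"
    fi
  done
}

# English has 5 pages, German 3, Spanish 8: page 4 lists en-US and es-ES only.
page_cluster 4 en-US=5 de-DE=3 es-ES=8
```

x-default is left out of the output on purpose; which locale serves as the fallback is a separate decision covered in Section 6.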
7.5 Infinite Scroll and View-All Patterns
Sites using infinite scroll without distinct paginated URLs do not have a pagination + hreflang interaction; there is only one URL per locale. Hreflang is straightforward: each locale's blog landing URL points to every other locale's blog landing URL.
Sites with a view-all option (/blog/all/ or ?view=all) treat the view-all URL as its own cluster member. The view-all in en-US is a peer of the view-all in de-DE.
7.6 Faceted Navigation and Filter Combinations
Faceted URLs like /en-us/shop/shirts/?color=red&size=large introduce massive URL explosion. Hreflang in these contexts is operationally impractical unless faceted URLs are deliberately limited. The typical pattern:
- Declare hreflang on canonical primary category URLs only (/en-us/shop/shirts/) and exclude faceted URLs from hreflang.
- Use noindex on faceted URLs or block them in robots.txt.
- Reserve hreflang for the canonical product, category, and content pages.
For ecommerce specifically, the canonical strategy for faceted navigation is covered in framework-ecommerceseo.md. The hreflang implication: only the canonical surface gets hreflang.
8. Canonicalization vs Hreflang Precedence
8.1 The Conflict Mode
The most ranking-disruptive failure pattern in international SEO: hreflang and canonical declare conflicting URLs for the same content.
The broken pattern:
<!-- On https://example.com/en-gb/widget/ -->
<link rel="canonical" href="https://example.com/en-us/widget/"> <!-- WRONG -->
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/widget/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">
The canonical says "I am a duplicate of en-us; index that one." The hreflang says "I am a peer of en-us; index both, show the locale-matched version." Google receives contradictory signals.
The resolution Google applies:
Per Google Search Central and reaffirmed by John Mueller in Office Hours through 2024-2025: when hreflang and canonical conflict, hreflang is silently discarded. The canonical signal wins (Google Search Central, "Localized versions of your pages", revision 2024; Mueller, multiple Office Hours, including the May 2025 statement that hreflang signals are "hints, not guarantees" relative to canonical and other indexability signals).
Practical consequence: the en-GB version is treated as a duplicate of en-US, dropped from the index, and never shown to UK users. UK users see the en-US version even when they search from the UK. The UK locale silently disappears from the SERP.
8.2 The Correct Pattern
Every URL in an hreflang cluster self-canonicals. Every URL in the cluster declares every other URL (including itself) as an hreflang alternate.
<!-- On https://example.com/en-gb/widget/ -->
<link rel="canonical" href="https://example.com/en-gb/widget/"> <!-- self -->
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/widget/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">
<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/widget/">
<!-- On https://example.com/en-us/widget/ -->
<link rel="canonical" href="https://example.com/en-us/widget/"> <!-- self -->
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/widget/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">
<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/widget/">
Both URLs canonical to themselves. Both list the cluster. The peer relationship is intact. Google indexes both. Each locale-matched URL appears in its locale SERP.
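Because every page in the cluster emits the same alternate list and differs only in its canonical, the block is easy to generate from one cluster definition. The sketch below is illustrative: hreflang_head and its HREFLANG=URL argument format are assumptions, not an API from any CMS or framework.

```shell
#!/bin/bash
# hreflang_head CURRENT_URL XDEFAULT_URL HREFLANG=URL...
# Hypothetical generator: prints the canonical plus the full cluster for one page.
hreflang_head() {
  local current="$1" xdefault="$2"; shift 2
  local entry
  # Self-canonical: always the URL being rendered, never another cluster member.
  printf '<link rel="canonical" href="%s">\n' "$current"
  # The same alternate list is printed on every page, including the self-reference.
  for entry in "$@"; do
    printf '<link rel="alternate" hreflang="%s" href="%s">\n' "${entry%%=*}" "${entry#*=}"
  done
  printf '<link rel="alternate" hreflang="x-default" href="%s">\n' "$xdefault"
}

# Renders the en-GB block from the example above.
hreflang_head https://example.com/en-gb/widget/ https://example.com/en-us/widget/ \
  en-US=https://example.com/en-us/widget/ \
  en-GB=https://example.com/en-gb/widget/
```

Calling it again with the en-US URL as the first argument yields the second block; only the canonical line changes.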
8.3 When Are Two URLs Actually Duplicates?
Sometimes en-US and en-GB content really is identical. Marketing might use the same English copy across all English-speaking markets to save on translation costs. Is this a duplicate-content problem?
Google's position: identical content across hreflang peers is acceptable. Hreflang exists precisely to allow same-language content to target different regions. Google will not treat the cluster as duplicate if hreflang is correctly declared. The cluster shares ranking signals; the locale-matched URL surfaces in the locale SERP.
What is not acceptable: identical content across the same language and same region (en-US and en-US on different URLs) without canonical resolution. That is a duplicate-content problem unrelated to hreflang.
8.4 The Specific Mistake: "Canonical to the Master, Hreflang to the Variants"
Some agencies historically taught a pattern where a "master" English URL was canonical, and all other locales canonicaled to it while declaring hreflang. The reasoning: "consolidate signal to one URL, distribute discoverability via hreflang."
This pattern is wrong for the reason in Section 8.1: the canonical signal wins, the hreflang is discarded, and only the master URL is ever indexed or shown. The other locales become functionally invisible to search.
This is the most common high-severity hreflang error in SaaS and ecommerce migrations from a single-locale site to multi-locale. The CMS adds locale duplicates with the canonical pointing back to the master. Hreflang is then bolted on. The cluster looks valid on paper but never functions. Symptom in GSC: locale URLs reported under "Alternate page with proper canonical tag" (Google honored the declared canonical) or "Duplicate, Google chose different canonical than user".
8.5 The Verification Test
For every URL in an hreflang cluster, three properties must hold:
- Self-canonical: <link rel="canonical" href="THIS_URL"> where THIS_URL is the current URL.
- HTTP 200: the URL must return HTTP 200, not 301, 302, 404, or 5xx.
- Indexable: no <meta name="robots" content="noindex">, no X-Robots-Tag: noindex HTTP header, not blocked by robots.txt.
A URL failing any of these three properties cannot be in an hreflang cluster. Its inclusion will silently break the cluster.
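The three properties reduce to a pure check once the facts about a URL have been collected. A minimal sketch, assuming the status code, declared canonical, and combined robots directives (meta robots plus X-Robots-Tag) were gathered beforehand; cluster_eligible is a hypothetical name, and the robots.txt check is deliberately out of scope here.

```shell
#!/bin/bash
# cluster_eligible URL STATUS CANONICAL ROBOTS
# Hypothetical helper. ROBOTS is the meta robots content and any X-Robots-Tag
# value joined into one string. robots.txt blocking is NOT checked by this sketch.
cluster_eligible() {
  local url="$1" status="$2" canonical="$3"
  local robots failed=0
  robots=$(printf '%s' "$4" | tr '[:upper:]' '[:lower:]')
  if [ "$status" != "200" ]; then
    echo "FAIL status is $status, expected 200"
    failed=1
  fi
  if [ "$canonical" != "$url" ]; then
    echo "FAIL canonical is '${canonical:-none}', expected self"
    failed=1
  fi
  case "$robots" in
    *noindex*)
      echo "FAIL noindex directive present"
      failed=1
      ;;
  esac
  if [ "$failed" -eq 0 ]; then
    echo "PASS $url"
  fi
  return "$failed"
}

cluster_eligible https://example.com/en-gb/widget/ 200 https://example.com/en-gb/widget/ ""
# "|| true" keeps the demo running after the expected failure.
cluster_eligible https://example.com/en-gb/widget/ 200 https://example.com/en-us/widget/ "" || true
```

The status input can come from curl -s -o /dev/null -w '%{http_code}' against the URL; the non-zero return makes the function usable as a gate in a larger script.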
8.6 The Conflict Resolution Decision
Does the page declare canonical?
  NO -> Add self-canonical. Continue.
  YES -> Check the canonical target.
Does canonical point to the same URL as the current page?
  YES -> Cluster is correct. Continue.
  NO -> Canonical points elsewhere. This URL is a duplicate.
     -> Remove hreflang from this URL.
     -> Or change canonical to self and accept the URL as a cluster peer.
     -> Decide based on whether the URL has distinct content per locale.
9. Regional Sub-Variants (en-US vs en-GB vs en-CA)
9.1 The Same-Language Multi-Region Question
When does English content warrant separate en-US, en-GB, en-CA, en-AU, en-IE, en-IN, en-NZ, en-SG, en-ZA URLs versus a single en URL serving all English-speaking countries?
The answer hinges on whether the content actually differs across regions. Hreflang is a structural signal; it cannot manufacture content differentiation that does not exist.
9.2 When to Differentiate
Differentiate (use en-US, en-GB, en-CA as separate URLs with separate content) when:
- Currency differs. Pricing in USD, GBP, CAD, or AUD requires the page to actually show the local currency. A page showing USD prices to a UK user adds friction at the point of conversion.
- Local compliance text differs. GDPR notices for EU, CCPA for California, Australian Consumer Law for AU, EU VAT vs US sales tax. Different per region.
- Spelling and vocabulary localization matters. "Color" vs "colour", "elevator" vs "lift", "gas" vs "petrol", "stroller" vs "pushchair". The audience perceives the wrong spelling as foreign and trust erodes.
- Examples and references are region-specific. Sports references (NFL vs Premier League vs AFL), local landmarks, regional weather examples.
- Product availability or shipping differs. A product page that lists US-only shipping cannot serve UK users without misleading them.
- Phone numbers, addresses, business hours differ. Customer support contact info per region.
- Reviews and testimonials are from local users. Trust signals localized.
9.3 When Not to Differentiate
Do not differentiate when:
- Content is genuinely identical and the only difference would be hreflang tags. Hreflang on identical content is permitted by Google but provides no user value. The CMS overhead is not worth it.
- The site does not have the translation or localization budget to maintain N separate English variants. Maintaining one English variant well beats maintaining four poorly.
- The target audience is global and language-agnostic. A developer documentation site serving English to a global technical audience does not need en-US vs en-GB; it needs en.
9.4 The Threshold Test
en_subvariant_threshold:
  required:
    currency_differs: true
    compliance_text_differs: true
    product_availability_differs: true
    physical_office_per_region: true
  recommended:
    spelling_localization_in_scope: true
    examples_localized_in_scope: true
    customer_support_per_region: true
  optional:
    minor_vocabulary_differences: true
  not_sufficient_alone:
    region_targeting_intent: true # cannot justify subvariants without content differentiation
A site with no item in "required" or "recommended" categories should use a single en URL with no sub-variants. A site with two or more "required" items should differentiate. A site with one "required" item should evaluate case-by-case.
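The threshold rule can be written as a function of two counts. This is a sketch under one stated assumption: the rule above does not say what to do when no "required" item applies but at least one "recommended" item does, so the hypothetical subvariant_decision helper returns case-by-case for that combination.

```shell
#!/bin/bash
# subvariant_decision REQUIRED_COUNT RECOMMENDED_COUNT
# Hypothetical helper: counts are how many items in each category apply to the site.
subvariant_decision() {
  local required="$1" recommended="$2"
  if [ "$required" -ge 2 ]; then
    echo "differentiate"
  elif [ "$required" -eq 1 ]; then
    echo "case-by-case"
  elif [ "$recommended" -eq 0 ]; then
    echo "single en URL"
  else
    # Assumption: recommended-only sites are not covered by the stated rule.
    echo "case-by-case"
  fi
}

subvariant_decision 2 1   # currency and compliance differ
subvariant_decision 0 0   # nothing differs per region
```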
9.5 The es-ES vs es-MX vs es-AR Case
Spanish has the same dynamic. The vocabulary, idiom, and cultural references differ substantially between Spain (Castilian), Mexico (Mexican), Argentina (Rioplatense), and other Latin American countries. The differentiation case is often stronger for Spanish than for English because the linguistic differences are larger.
The mistake to avoid: declaring es-419 for "Spanish, Latin America" (the UN M.49 code for Latin America). Google does not support es-419. Only ISO 639-1 language plus ISO 3166-1 Alpha 2 country codes are accepted. To target Latin America, declare individual country codes (es-MX, es-AR, es-CO, es-CL, es-PE) or use generic es with an x-default fallback.
9.6 The pt-BR vs pt-PT Case
Portuguese is the clearest case for differentiation. Brazilian Portuguese (pt-BR) and European Portuguese (pt-PT) differ substantially in vocabulary and grammar, and spelling differences persist even after the 1990 orthographic agreement. A Brazilian user served pt-PT content perceives it as a foreign dialect and vice versa. Both Brazil and Portugal warrant separate URLs whenever the site serves both markets.
9.7 The fr-FR vs fr-CA vs fr-BE vs fr-CH Case
French has narrower vocabulary differences than Spanish or Portuguese but stronger administrative differences (currency, compliance, formal address conventions). The differentiation case is mid-strength. France and Quebec usually warrant separation; Belgium and Switzerland are case-by-case based on whether the site has significant operations in those markets.
9.8 The de-DE vs de-AT vs de-CH Case
German has subtle vocabulary differences across Germany, Austria, and Switzerland. The differentiation case is usually administrative (currency: EUR in DE and AT, CHF in CH; tax regimes differ). Differentiate when administrative differences require it; do not differentiate solely on vocabulary unless Swiss German content is explicitly distinct (and the page is genuinely localized).
9.9 Region Without Language (Not Supported)
Hreflang does not support country-only declarations. hreflang="US" is not valid. The language code is required. To target a country with whatever language is appropriate, declare the country with the language explicitly: hreflang="en-US". This is sometimes a surprise to sites that want to target a country regardless of language; they must pick a language.
10. Hreflang Validation
Validation is mandatory before deploy and after deploy. Hreflang errors fail silently in production; the cluster looks fine on paper, and ranking degradation in the affected locale shows up weeks later as a flat or declining trend with no obvious cause.
10.1 Screaming Frog SEO Spider
The reference desktop crawler for hreflang audits. Thirteen dedicated hreflang filters in the Hreflang tab (Screaming Frog SEO Spider documentation, 2025):
- Contains Hreflang, pages that declare any hreflang at all.
- Missing Hreflang, pages that are part of a cluster but missing their hreflang block.
- Missing Self-Reference, pages whose hreflang declares peers but not the current URL.
- Missing Return Links, pages declaring peer X without peer X declaring this page.
- Incorrect Language Codes, invalid language or region codes: en-uk (use en-GB), gb (use en or en-GB), jp (use ja), cn (use zh). en-eu has no valid equivalent because EU is not an ISO 3166-1 country code.
- Inconsistent Language Codes, same locale declared with different code formats on different pages.
- Non-200 Hreflang URLs, hreflang pointing to URLs returning 301, 404, 5xx.
- Unlinked Hreflang URLs, hreflang pointing to URLs that have no internal link from the rest of the site.
- Missing x-default, clusters without an x-default declaration.
- Multiple Languages in Same Page, a page declaring hreflang for two same-language variants without country differentiation.
- Conflicting Canonical and Hreflang, pages where canonical points to a URL that is also an hreflang peer (the broken pattern).
- Hreflang HTTP Header Mismatch, Link header and HTML head declaring different cluster contents.
- Hreflang in Body, tags in the wrong location (body instead of head).
To run a full hreflang audit:
1. Configuration > Spider > Crawl Behavior > "Crawl Hreflang"
2. Configuration > Spider > Extraction > confirm "Hreflang" is checked
3. Crawl the site (or a sitemap)
4. Open the Hreflang tab; review every filter that has results
5. Export issues to CSV; correlate against the cluster definitions
For sites larger than the desktop crawler can handle in memory, switch to database storage mode (Screaming Frog SQLite backend) or to Sitebulb.
10.2 Sitebulb
Sitebulb's international audit tools are designed for scheduled cluster validation. The international hints surface:
- Canonicalized URL has incoming hreflang.
- Hreflang cluster broken at multiple levels.
- Cross-domain hreflang where return tags fail.
- Hreflang URL not in sitemap.
Sitebulb visualizes hreflang clusters as a directed graph, which is useful when the cluster has more than five locales and the relationship matrix is hard to verify by table.
10.3 hreflang.org Testing Tool
A free web-based tester for an individual URL (app.hreflang.org). Fetches the target URL, parses the hreflang block, follows each declared alternate, and verifies the return tag from each alternate. Single-URL scope; for site-wide audit, use Screaming Frog or Sitebulb.
10.4 Merkle / TechnicalSEO.com Hreflang Tester
Alternative single-URL tester (technicalseo.com/tools/hreflang/). Also supports HTTP header inspection, which the hreflang.org tester does not. Use this one for PDF hreflang validation.
10.5 Aleyda Solis Hreflang Tags Generator
Not a validator; a generator. Used during initial implementation. Input: a list of URLs and the language-country combination for each. Output: HTML head, HTTP header, or XML sitemap version of the cluster. Limits: up to 50 URL variants per session. Available at aleydasolis.com/en/seo-resources-tools/hreflang-tags-generator/. Referenced by Google Search Central documentation as a recommended tool.
10.6 Google Search Console
GSC no longer has the dedicated International Targeting report (deprecated September 22, 2022 per support.google.com/webmasters/answer/12474899). What remains in GSC that bears on hreflang:
- URL Inspection tool: enter a URL; the live test shows the rendered HTML head including any hreflang tags. Useful for spot-checking that the deployed page actually carries the hreflang block.
- Coverage / Pages report: filter by "Duplicate, Google chose different canonical" to find URLs where canonical-hreflang conflict has caused the cluster to break.
- Sitemaps report: confirm the sitemap with xhtml:link hreflang entries is being processed.
- Performance > Search results: filter by country to verify that locale URLs are receiving impressions from their target country.
10.7 Bash Validation Scripts
Command-line scripts let validation be part of a CI pipeline or a pre-deploy gate. Examples below assume nginx serves the site from /var/www/sites/example/ on Bubbles.
Script 1: Extract hreflang URLs from a sitemap and verify each returns HTTP 200.
#!/bin/bash
# /var/www/sites/example/scripts/hreflang-check-sitemap.sh
# Usage: ./hreflang-check-sitemap.sh https://example.com/sitemap-international.xml
SITEMAP_URL="$1"
if [ -z "$SITEMAP_URL" ]; then
  echo "Usage: $0 <sitemap_url>"
  exit 1
fi
# Create temp files only after the argument check; remove them on any exit.
TMP=$(mktemp)
URLS=$(mktemp)
trap 'rm -f "$TMP" "$URLS"' EXIT
echo "Fetching sitemap: $SITEMAP_URL"
curl -sL "$SITEMAP_URL" -o "$TMP"
echo "Extracting all hreflang URLs..."
# \K discards the tag prefix so only the URL is printed (avoids a
# variable-length lookbehind, which many PCRE builds reject).
grep -oP '<xhtml:link\b[^>]*\bhref="\K[^"]+' "$TMP" \
  | sort -u > "$URLS"
TOTAL=$(wc -l < "$URLS")
echo "Found $TOTAL unique hreflang URLs"
echo "Verifying each returns HTTP 200..."
FAILED=0
while IFS= read -r URL; do
  # HEAD request without following redirects: a 301 is reported as a failure.
  STATUS=$(curl -sI -o /dev/null -w "%{http_code}" "$URL")
  if [ "$STATUS" != "200" ]; then
    echo "FAIL [$STATUS] $URL"
    FAILED=$((FAILED + 1))
  fi
done < "$URLS"
echo "Done. $FAILED of $TOTAL URLs returned non-200."
# Non-zero exit status lets a CI gate fail the build.
[ "$FAILED" -eq 0 ]
Script 2: Verify self-reference and return-tag bidirectionality for a single URL.
#!/bin/bash
# /var/www/sites/example/scripts/hreflang-check-bidirectional.sh
# Usage: ./hreflang-check-bidirectional.sh https://example.com/en-us/widget/
URL="$1"
if [ -z "$URL" ]; then
  echo "Usage: $0 <url>"
  exit 1
fi
# Print the href of every hreflang alternate <link> read from stdin.
# Assumes each <link> sits on one line with rel before hreflang.
extract_alternates() {
  grep -oP '<link[^>]+rel="alternate"[^>]+hreflang="[^"]+"[^>]*>' \
    | grep -oP '\bhref="\K[^"]+' \
    | sort -u
}
echo "Fetching $URL"
HTML=$(curl -sL "$URL")
echo "Hreflang declarations on $URL:"
echo "$HTML" | grep -oP '<link[^>]+rel="alternate"[^>]+hreflang="[^"]+"[^>]*>'
echo ""
PEERS=$(echo "$HTML" | extract_alternates)
FAILED=0
# Check self-reference
if echo "$PEERS" | grep -qFx "$URL"; then
  echo "PASS: self-reference present"
else
  echo "FAIL: self-reference missing"
  FAILED=1
fi
# Check return tag from each peer
echo "Checking return tags from each peer..."
while IFS= read -r PEER; do
  if [ -z "$PEER" ] || [ "$PEER" = "$URL" ]; then continue; fi
  # Only hreflang alternates count; a canonical or <a> link to $URL is not a return tag.
  if curl -sL "$PEER" | extract_alternates | grep -qFx "$URL"; then
    echo "PASS: $PEER lists $URL"
  else
    echo "FAIL: $PEER does not list $URL"
    FAILED=1
  fi
done <<< "$PEERS"
# Non-zero exit status lets a CI gate fail the build.
exit "$FAILED"
Script 3: Check canonical-hreflang conflict for a URL.
#!/bin/bash
# /var/www/sites/example/scripts/hreflang-check-canonical.sh
# Usage: ./hreflang-check-canonical.sh https://example.com/en-us/widget/
URL="$1"
if [ -z "$URL" ]; then
  echo "Usage: $0 <url>"
  exit 1
fi
HTML=$(curl -sL "$URL")
# Assumes rel appears before href inside the canonical <link>.
CANONICAL=$(echo "$HTML" \
  | grep -oP '<link[^>]+rel="canonical"[^>]+href="\K[^"]+' \
  | head -n 1)
echo "Page URL: $URL"
echo "Canonical URL: ${CANONICAL:-none declared}"
if [ -z "$CANONICAL" ]; then
  echo "FAIL: no canonical declared; add a self-canonical"
  exit 1
fi
if [ "$CANONICAL" = "$URL" ]; then
  echo "PASS: self-canonical"
  exit 0
fi
echo "FAIL: canonical points to different URL"
echo " This URL is treated as a duplicate of $CANONICAL"
echo " Hreflang on this page will be silently discarded"
# Check whether the canonical target is also an hreflang peer.
# Works for either attribute order; -F treats the URL as a literal string.
if echo "$HTML" | grep -oP '<link[^>]+hreflang="[^"]+"[^>]*>' \
  | grep -qF "href=\"$CANONICAL\""; then
  echo "WARN: canonical target $CANONICAL is also an hreflang peer"
  echo " This is the canonical-hreflang conflict pattern"
fi
exit 1
Script 4: Validate ISO 639-1 language and ISO 3166-1 Alpha 2 country codes.
#!/bin/bash
# /var/www/sites/example/scripts/hreflang-check-codes.sh
# Usage: ./hreflang-check-codes.sh https://example.com/en-us/widget/
URL="$1"
if [ -z "$URL" ]; then
  echo "Usage: $0 <url>"
  exit 1
fi
HTML=$(curl -sL "$URL")
# Valid ISO 639-1 (subset, common languages)
VALID_LANG="ar|bg|bn|cs|da|de|el|en|es|et|fa|fi|fr|gu|he|hi|hr|hu|id|it|ja|kn|ko|lt|lv|ml|mr|ms|nl|no|pa|pl|pt|ro|ru|sk|sl|sr|sv|sw|ta|te|th|tl|tr|uk|ur|vi|zh"
# Valid ISO 3166-1 Alpha 2 (subset, common countries)
VALID_COUNTRY="AE|AR|AT|AU|BE|BR|CA|CH|CL|CN|CO|CZ|DE|DK|EG|ES|FI|FR|GB|GR|HK|HU|ID|IE|IL|IN|IT|JP|KR|MX|MY|NL|NO|NZ|PE|PH|PL|PT|RO|RU|SA|SE|SG|SK|TH|TR|TW|UA|US|VN|ZA"
echo "Extracting hreflang values from $URL"
VALUES=$(echo "$HTML" \
  | grep -oP 'hreflang="\K[^"]+' \
  | sort -u)
ERRORS=0
while IFS= read -r VALUE; do
  [ -z "$VALUE" ] && continue
  if [ "$VALUE" = "x-default" ]; then
    echo "OK x-default"
    continue
  fi
  # Parse lang[-COUNTRY]. Script subtags such as zh-Hant are not handled here.
  # Named LANG_CODE because assigning LANG would change the shell's locale.
  LANG_CODE=$(echo "$VALUE" | cut -d- -f1 | tr '[:upper:]' '[:lower:]')
  COUNTRY=$(echo "$VALUE" | awk -F- '{print toupper($2)}')
  # Check language
  if ! echo "$LANG_CODE" | grep -qE "^($VALID_LANG)$"; then
    echo "FAIL invalid language '$LANG_CODE' in '$VALUE'"
    ERRORS=$((ERRORS + 1))
    continue
  fi
  # Check country if present
  if [ -n "$COUNTRY" ]; then
    if ! echo "$COUNTRY" | grep -qE "^($VALID_COUNTRY)$"; then
      echo "FAIL invalid country '$COUNTRY' in '$VALUE'"
      ERRORS=$((ERRORS + 1))
      continue
    fi
  fi
  echo "OK $VALUE"
done <<< "$VALUES"
echo "Done. $ERRORS invalid codes found."
# Non-zero exit status lets a CI gate fail the build.
[ "$ERRORS" -eq 0 ]
For full ISO 639-1 and ISO 3166-1 Alpha 2 lookup tables, see the Hreflang.org valid-codes reference (hreflang.org/list-of-hreflang-codes/) or the IANA registry. The scripts above include common subsets; extend the regex lists per project scope.
10.8 Continuous Validation
Validation is not one-time. Hreflang clusters break over time through routine site operations:
- A page gets republished with a new URL slug; the old URL 301-redirects; hreflang on peer pages now points to a redirect. Cluster breaks.
- A page gets noindex added during a campaign; hreflang still lists it. Cluster breaks.
- A locale gets removed from the site; pages in other locales still reference it. Cluster has dangling peers.
- A new locale is added; existing pages have no return tag to the new locale. Cluster is partial.
The validation cadence:
hreflang_validation_cadence:
  pre_deploy:
    every_change: "validate the affected cluster before merge"
    automated: "CI runs hreflang-check-bidirectional and hreflang-check-canonical"
  weekly:
    sample_size: "10 percent of pages, rotating coverage"
    tool: "Screaming Frog scheduled crawl or Sitebulb scheduled audit"
    metric: "zero canonical-hreflang conflicts; zero non-200 hreflang URLs"
  monthly:
    sample_size: "full site"
    tool: "Sitebulb full international audit"
    metric: "comprehensive issue inventory; trend over time"
  on_locale_change:
    trigger: "adding or removing a locale"
    scope: "full site"
    tool: "Screaming Frog + manual cluster spot-check on representative pages"
11. Common Hreflang Mistakes
The ten anti-patterns most likely to be present in a 2026 site audit, ranked roughly by frequency in the Ahrefs and Semrush studies (Ahrefs n=374,756 domains 2024; Semrush n=20,000 multilingual sites 2023).
11.1 Missing Self-Reference
Frequency: 16 percent of multilingual sites (Ahrefs 2024).
Anti-pattern:
<!-- On https://example.com/en-us/widget/ -->
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">
<link rel="alternate" hreflang="de-DE" href="https://example.com/de-de/widget/">
<!-- en-US peer entry missing -->
Fix:
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/widget/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">
<link rel="alternate" hreflang="de-DE" href="https://example.com/de-de/widget/">
Why it breaks: Google's documentation and Mueller's statements treat self-reference as "good practice, technically optional" but in audited behavior, missing self-reference correlates with cluster discard. Always include it.
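A self-reference check can be sketched in a few lines of shell. This is an illustration under stated assumptions: the function name `has_self_reference` is hypothetical, attributes are double-quoted, and each link tag sits on its own line. A real HTML parser is more robust than grep for production use.

```shell
# has_self_reference PAGE_URL < page.html
# Hypothetical helper: exits 0 when some hreflang link's href equals PAGE_URL exactly.
has_self_reference() {
  local page_url="$1"
  grep -i 'hreflang=' \
    | grep -o 'href="[^"]*"' \
    | sed 's/^href="//; s/"$//' \
    | grep -xF -- "$page_url" > /dev/null
}
```

The comparison is exact, so a trailing-slash or protocol mismatch between the page URL and the declared href counts as a missing self-reference.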
11.2 Missing Return Tags (Broken Bidirectionality)
Frequency: 31 percent of multilingual sites have conflicting or broken bidirectional declarations (Ahrefs 2024).
Anti-pattern:
<!-- On https://example.com/en-us/widget/ -->
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/widget/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">
<!-- On https://example.com/en-gb/widget/ -->
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">
<!-- no en-US entry -->
Fix: every URL in the cluster lists every other URL.
Why it breaks: Google explicitly states: "If page X links to page Y, page Y must link back to page X. If this is not the case for all pages that use hreflang annotations, those annotations may be ignored or not interpreted correctly" (Google Search Central, "Localized versions of your pages", 2024 revision).
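The return-tag rule can be checked offline once the annotations are exported as an edge list. The sketch below assumes a hypothetical input format of one "SOURCE TARGET" pair per line (the page URL, then a URL it declares via hreflang); the function name `missing_return_tags` is also an assumption, not one of the scripts referenced elsewhere in this document.

```shell
# missing_return_tags < edges.txt
# Prints "TARGET -> SOURCE" for every declared peer that does not link back.
missing_return_tags() {
  awk '
    { declared[$1 " " $2] = 1; src[NR] = $1; dst[NR] = $2 }
    END {
      for (i = 1; i <= NR; i++) {
        if (src[i] == dst[i]) continue        # self-reference needs no return tag
        key = dst[i] " " src[i]
        if (!(key in declared)) print dst[i] " -> " src[i]
      }
    }'
}
```

Empty output means every declared link has a matching return link. Each printed line names the annotation that has to be added on the target page.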
11.3 Invalid Language or Country Codes
Frequency: 8.91 percent of multilingual sites (Ahrefs 2024).
Anti-patterns:
<link rel="alternate" hreflang="en-uk" href="..."> <!-- WRONG: uk is not ISO 3166-1 -->
<link rel="alternate" hreflang="en-eu" href="..."> <!-- WRONG: eu is not a country -->
<link rel="alternate" hreflang="jp" href="..."> <!-- WRONG: jp is not ISO 639-1 (it's ja) -->
<link rel="alternate" hreflang="es-419" href="..."> <!-- WRONG: 419 not supported -->
<link rel="alternate" hreflang="zh-CN" href="..."> <!-- OK (CN is valid) -->
<link rel="alternate" hreflang="zh-Hans" href="..."> <!-- OK: Hans is an ISO 15924 script code, which Google accepts after the language; it is not a country -->
<link rel="alternate" hreflang="en-419" href="..."> <!-- WRONG: 419 not supported -->
<link rel="alternate" hreflang="en_US" href="..."> <!-- WRONG: underscore, must be hyphen -->
Fixes:
<link rel="alternate" hreflang="en-GB" href="..."> <!-- UK is GB -->
<!-- there is no EU country code; use individual member states -->
<link rel="alternate" hreflang="ja" href="..."> <!-- Japanese is ja, not jp -->
<!-- for Latin American Spanish, use country codes per market: es-MX, es-AR, etc. -->
<link rel="alternate" hreflang="zh-CN" href="..."> <!-- China -->
<link rel="alternate" hreflang="zh-TW" href="..."> <!-- Taiwan, traditional script implied -->
<link rel="alternate" hreflang="en-US" href="..."> <!-- always hyphen -->
Common pitfalls in code mapping:
| Wrong | Right | Reason |
|---|---|---|
| en-uk | en-GB | UK is not in ISO 3166-1 Alpha 2; GB is |
| en-eu | (multiple) | EU is not a country; use per-country |
| jp | ja | Japanese ISO 639-1 is ja |
| cn | zh | Chinese ISO 639-1 is zh |
| kr | ko | Korean ISO 639-1 is ko |
| es-419 | (per-country) | 419 is UN code, not ISO 3166-1 |
| en_US | en-US | Hyphen, not underscore |
| en-GBR | en-GB | Alpha 2, not Alpha 3 |
| en-USA | en-US | Alpha 2, not Alpha 3 |
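The mapping table translates directly into a small lookup that suggests a corrected code. This is a sketch covering only the rows above; the function name `suggest_hreflang_code` is hypothetical, and it does not validate against the full ISO lists.

```shell
# suggest_hreflang_code RAW_VALUE
# Prints a corrected hreflang value for the common mistakes in the table above.
# Returns 1 when no single replacement exists (EU, 419).
suggest_hreflang_code() {
  local raw lang suffix
  raw="$(printf '%s' "$1" | tr '_' '-')"                 # underscore to hyphen
  if [ "$(printf '%s' "$raw" | tr '[:upper:]' '[:lower:]')" = "x-default" ]; then
    echo "x-default"; return 0
  fi
  lang="$(printf '%s' "${raw%%-*}" | tr '[:upper:]' '[:lower:]')"
  suffix=""
  case "$raw" in *-*) suffix="${raw#*-}" ;; esac
  case "$suffix" in                                      # uppercase 2- or 3-character suffixes only
    ??|???) suffix="$(printf '%s' "$suffix" | tr '[:lower:]' '[:upper:]')" ;;
  esac
  case "$lang" in                                        # country codes used as language codes
    jp) lang="ja" ;; cn) lang="zh" ;; kr) lang="ko" ;;
  esac
  case "$suffix" in
    UK|GBR) suffix="GB" ;;
    USA)    suffix="US" ;;
    EU|419) echo "no single replacement for '$1': use per-country codes" >&2; return 1 ;;
  esac
  if [ -n "$suffix" ]; then printf '%s-%s\n' "$lang" "$suffix"; else printf '%s\n' "$lang"; fi
}
```

For example, `suggest_hreflang_code en_uk` prints `en-GB`, while `suggest_hreflang_code es-419` fails with a message because the fix is a per-market decision rather than a string substitution.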
11.4 Hreflang to Non-200 or Non-Canonical URLs
Frequency: common; specific statistic varies by study.
Anti-pattern:
<!-- hreflang points to a URL that 301 redirects -->
<link rel="alternate" hreflang="en-GB" href="https://example.com/uk/widget">
<!-- but /uk/widget 301s to /en-gb/widget/ -->
Fix:
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/widget/">
Why it breaks: Google may still partially process hreflang that points to a redirect. Mueller stated in 2018, and reaffirmed in 2025, that hreflang to a 301 is "probably OK", but that the annotations should be automated to follow the redirect target. Best practice: point directly to the destination URL. An hreflang annotation that points to a 404 or 5xx URL is definitively broken.
11.5 Canonical-Hreflang Conflict
Frequency: common; severity high.
Covered in detail in Section 8. The pattern: a page declares a peer in hreflang, but its canonical names that same peer as the master, which marks the page itself as a duplicate. Google discards the hreflang, and the locale-matched URL never appears in its own market's SERP.
Fix: every URL in the cluster self-canonicals. Canonical never points to an hreflang peer.
11.6 Hreflang in Body Instead of Head
Anti-pattern:
<head>
<title>Widget</title>
</head>
<body>
<link rel="alternate" hreflang="en-GB" href="..."> <!-- WRONG -->
<h1>Widget</h1>
</body>
Why it breaks: Google explicitly states that <link rel="alternate"> tags only count when in <head>. Body-located tags are ignored.
Fix: move all hreflang tags to <head>.
11.7 Mixing HTML Head and Sitemap Declarations That Disagree
Anti-pattern: the HTML head on /en-us/widget/ lists six peers; the XML sitemap entry for /en-us/widget/ lists eight peers. The two sources disagree about which URLs are in the cluster.
Fix: pick one source of truth. Either remove the HTML head hreflang (use sitemap only) or remove the sitemap hreflang (use HTML head only) or keep both in sync. The maintenance burden of keeping both in sync is real; for sites with frequent locale changes, the sitemap-only pattern is operationally safer.
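When both sources are kept, the disagreement can be surfaced by comparing the two peer lists for a page. A minimal sketch, assuming the peers from the HTML head and from the sitemap entry have each been exported to a file with one URL per line; the function name `diff_peer_lists` and the file layout are hypothetical.

```shell
# diff_peer_lists HEAD_FILE SITEMAP_FILE
# Prints every URL that appears in only one of the two sources.
diff_peer_lists() {
  local tmp_head tmp_map
  tmp_head="$(mktemp)" || return 2
  tmp_map="$(mktemp)" || { rm -f "$tmp_head"; return 2; }
  sort -u "$1" > "$tmp_head"          # comm requires sorted input
  sort -u "$2" > "$tmp_map"
  comm -23 "$tmp_head" "$tmp_map" | sed 's/^/head-only: /'
  comm -13 "$tmp_head" "$tmp_map" | sed 's/^/sitemap-only: /'
  rm -f "$tmp_head" "$tmp_map"
}
```

Empty output means the two sources agree for that page.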
11.8 Auto-Redirect by IP Replacing Hreflang
Anti-pattern: the site auto-redirects users by IP geolocation. A user in Germany requesting /en-us/widget/ is server-redirected to /de-de/widget/. Googlebot, which crawls primarily from US IP addresses, follows the hreflang link on the UK page to /de-de/widget/ and is redirected back to /en-us/widget/, so it never reaches the German URL.
Why it breaks: Googlebot never sees the locale-specific content; the cluster is invisible to Google. Hreflang declarations point to URLs Googlebot cannot reach. Mueller has stated this pattern blocks effective indexing of non-default locales (Search Central Office Hours, multiple, 2019-2024).
Fix: do not auto-redirect by IP. Offer a banner ("It looks like you are in Germany; view the German site?") with an explicit link. Let the user choose. Googlebot indexes every locale because every URL is directly reachable.
11.9 Same-Language Without Region or x-default
Anti-pattern:
<link rel="alternate" hreflang="en-US" href="https://example.com/en-us/">
<link rel="alternate" hreflang="en-GB" href="https://example.com/en-gb/">
<link rel="alternate" hreflang="en-CA" href="https://example.com/en-ca/">
<!-- no x-default; no generic en -->
A user in India searching in English has no specific match (en-IN is not in the cluster). Google must pick algorithmically. The site loses control over which version appears.
Fix: add x-default pointing to the most appropriate fallback (typically the global English or en-US).
<link rel="alternate" hreflang="x-default" href="https://example.com/en-us/">
11.10 JavaScript-Injected Hreflang
Anti-pattern: the hreflang block is generated client-side by JavaScript, not present in the first-byte HTML.
<!-- First byte response -->
<head>
<title>Widget</title>
<!-- no hreflang -->
</head>
<!-- ...later, client JS injects: -->
<script>
document.head.appendChild(/* hreflang link tag */);
</script>
Why it breaks: AI crawlers do not execute JavaScript (framework-contentfirst.md, Section 4.1). Many SEO crawlers (Screaming Frog by default, Sitebulb without rendering enabled) do not execute JavaScript. Googlebot's JavaScript rendering is reliable but delayed; indexing decisions can be made on the unrendered HTML before the hreflang is injected. The cluster is invisible to a portion of the consumer surface.
Fix: server-render hreflang in the first byte. For Next.js, use generateMetadata in the App Router, or next/head populated from getStaticProps in the Pages Router (framework-nextjs.md). For React SPAs, prerender with a build step that emits per-locale HTML. For WordPress, ensure the multilingual plugin (Polylang, WPML) writes to the head template, not to a post-render JavaScript hook.
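Whether the tags are present in the first byte can be checked by counting hreflang link tags in the unrendered response. A rough sketch: the function name `count_hreflang` is hypothetical, and the pattern would also count a literal link tag embedded inside a script string, so treat a non-zero result as a prompt to look at the HTML rather than as proof.

```shell
# count_hreflang < raw.html
# Prints the number of <link ... hreflang=...> tags found in the input.
count_hreflang() {
  { grep -io '<link[^>]*hreflang=' || true; } | wc -l | tr -d ' '
}

# Example against a live page (curl does not execute JavaScript):
#   curl -s https://example.com/en-us/widget/ | count_hreflang
```

A result of 0 on a page that shows hreflang in browser dev tools indicates the block is being injected client-side.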
12. Monitoring and Maintenance
12.1 The Monitoring Stack After GSC Deprecation
The GSC International Targeting report was the centerpiece of hreflang monitoring before September 2022. Since deprecation, the monitoring stack is third-party plus GSC's remaining tools:
monitoring_stack_2026:
primary_crawl_validator:
tool: "Screaming Frog SEO Spider or Sitebulb"
cadence: "weekly scheduled crawl with hreflang export"
metric: "zero new errors compared to last week's baseline"
gsc_url_inspection:
purpose: "spot check that deployed pages carry the hreflang block"
cadence: "after every deploy that touches international templates"
gsc_coverage_pages_report:
purpose: "detect duplicate-canonical issues that indicate hreflang break"
cadence: "weekly review of new entries"
filter: "look for 'Duplicate, Google chose different canonical' for locale URLs"
gsc_performance_by_country:
purpose: "verify locale URLs receive impressions from target country"
cadence: "monthly; compare 28-day trends per locale"
per_locale_gsc_property:
purpose: "isolate metrics per locale segment"
recommendation: "set up property for each subdirectory or subdomain"
example_properties:
- "https://example.com/en-us/"
- "https://example.com/en-gb/"
- "https://example.com/de-de/"
- "https://example.com/fr-fr/"
third_party_rank_tracker:
tool: "Semrush, Ahrefs, AccuRanker; configured per target country"
purpose: "verify rankings are happening on the locale-matched URL"
cadence: "weekly"
log_analysis:
tool: "nginx access logs on Bubbles, parsed for Googlebot per locale"
purpose: "verify Googlebot crawls all locale URLs at expected rate"
cadence: "monthly review; quarterly deep dive"
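The log_analysis entry above can be approximated with a single awk pass. This sketch assumes nginx's default "combined" log format (request path in the seventh whitespace-separated field) and a subdirectory URL structure where the locale is the first path segment. The function name `googlebot_hits_per_locale` is hypothetical, and the match is on the user-agent string only, which can be spoofed.

```shell
# googlebot_hits_per_locale < access.log
# Prints "locale count" per line for requests whose user agent mentions Googlebot.
googlebot_hits_per_locale() {
  awk '
    /Googlebot/ {
      split($7, seg, "/")                       # "/en-us/widget/" gives seg[2] = "en-us"
      if (seg[2] ~ /^[a-z][a-z](-[a-z][a-z])?$/) hits[seg[2]]++
    }
    END { for (loc in hits) print loc, hits[loc] }
  ' | sort
}
```

Comparing these counts against the number of URLs per locale gives a first indication of which locales are under-crawled.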
12.2 Weekly Checks
weekly_hreflang_checks:
cluster_health:
- "screaming frog scheduled crawl with hreflang export"
- "compare current errors to last week"
- "investigate any new errors"
new_locale_coverage:
- "if a new locale was added in the past 7 days, full cluster spot-check"
- "verify return tags from every existing locale to the new locale"
- "verify the new locale's pages list every existing locale"
removed_url_audit:
- "any URL removed (deleted, 410'd, 301'd) in past 7 days"
- "search the rest of the site for residual hreflang references"
- "fix dangling peers"
gsc_inspection:
- "spot check 5 representative pages per locale"
- "verify URL Inspection shows hreflang in rendered HTML"
duplicate_canonical_alert:
- "GSC Coverage > Pages > 'Duplicate, Google chose different canonical'"
- "if locale URLs appear in this bucket, cluster has broken"
- "investigate immediately; rank loss within 2-4 weeks if not fixed"
12.3 Monthly Checks
monthly_hreflang_checks:
full_site_crawl:
- "Sitebulb or Screaming Frog full crawl"
- "compare to last month's baseline"
- "trend per error type"
performance_by_country:
- "GSC Performance > Search Results > filter by country"
- "verify each target country sees its locale URLs as primary landing pages"
- "investigate any locale where the wrong country URL is ranking"
log_review:
- "nginx access logs parsed for Googlebot per locale"
- "verify crawl rate is proportional to locale size"
- "investigate any locale with significantly under-rate crawling"
competitor_check:
- "spot check 2-3 competitors in each major market"
- "verify they are using hreflang correctly"
- "note any structural changes worth considering"
12.4 Change Management for New Markets
Adding a locale is the most error-prone routine hreflang operation. The pattern that consistently works:
new_locale_rollout:
step_1_pre_launch:
- "build all new locale URLs to spec"
- "self-canonical, HTTP 200, indexable, content complete"
- "build new locale's hreflang block listing all existing locales + self"
- "do NOT yet update existing locales' hreflang to include the new one"
step_2_internal_verification:
- "Screaming Frog crawl of new locale URLs only"
- "verify cluster is well-formed within the new locale"
- "fix any errors before exposure"
step_3_existing_locale_update:
- "deploy hreflang changes to existing locales' templates"
- "every existing locale's pages now list the new locale as a peer"
- "deploy in a single change set, not piecemeal"
step_4_sitemap_update:
- "regenerate XML sitemap with new locale URLs"
- "submit updated sitemap to GSC"
step_5_post_launch_validation:
- "Screaming Frog full crawl"
- "verify zero broken return tags across all locales"
- "verify new locale URLs appear in URL Inspection with correct hreflang block"
- "submit new locale URLs via IndexNow if Bing-targeted"
step_6_monitoring:
- "weekly cluster check for first month"
- "GSC Performance check for new locale country at 2 weeks and 4 weeks"
- "investigate any unexpected drop in existing locale impressions"
12.5 Change Management for Removed Markets
Removing a locale is less common but equally risky. The pattern:
removed_locale_rollout:
step_1_remove_from_other_locales:
- "update every other locale's hreflang to drop the dying locale"
- "deploy in a single change set"
step_2_410_or_redirect:
- "decide: 410 (gone permanently) or 301 (consolidate to another locale)"
- "410 if no replacement content"
- "301 to closest matching locale if user value preserved (e.g., en-CA -> en-US)"
step_3_sitemap_cleanup:
- "regenerate XML sitemap without the removed locale URLs"
- "submit updated sitemap to GSC"
step_4_validation:
- "Screaming Frog full crawl"
- "verify no residual hreflang references to dead locale"
step_5_log_review_4_weeks:
- "verify Googlebot stops requesting the dead URLs within 4 weeks"
- "verify no residual rankings on dead locale URLs"
13. Audit Rubric
The audit rubric is per-page (a sample) plus site-wide plus 90-day post-deploy verification.
13.1 Per-Page Audit (10 Items)
For a sample of representative pages, one per locale, plus the homepage, plus one paginated page if applicable.
| # | Criterion | Pass / Fail |
|---|---|---|
| HL1 | Hreflang block present in <head> (or XML sitemap entry, or HTTP header) | |
| HL2 | Self-reference present in cluster | |
| HL3 | Every declared peer URL returns HTTP 200 | |
| HL4 | Every declared peer URL is self-canonical | |
| HL5 | Current page is self-canonical (canonical points to current URL) | |
| HL6 | Every declared peer URL declares the current page in return | |
| HL7 | Language codes are valid ISO 639-1 | |
| HL8 | Country codes (where present) are valid ISO 3166-1 Alpha 2 | |
| HL9 | x-default present (if cluster has multiple locales) and points to a valid URL | |
| HL10 | Tags render in first-byte HTML, not injected by JavaScript | |
Score per page: 10. Threshold: 10 of 10 pass. A single failure means the cluster is partially broken for that page.
13.2 Site-Wide Audit (10 Items)
| # | Criterion | Pass / Fail |
|---|---|---|
| HS1 | URL structure decision documented and consistent (ccTLD, subdir, subdomain) | |
| HS2 | Implementation method decision documented (head, sitemap, header) | |
| HS3 | XML sitemap includes all locale URLs (if sitemap method) | |
| HS4 | No canonical-hreflang conflict across audited sample (1000+ URL crawl) | |
| HS5 | Screaming Frog or Sitebulb full crawl yields zero broken return tags | |
| HS6 | Zero hreflang to non-200 URLs across full crawl | |
| HS7 | Zero invalid language or country codes across full crawl | |
| HS8 | No JavaScript-injected hreflang (every page renders in first byte) | |
| HS9 | x-default policy applied consistently across cluster types | |
| HS10 | No IP-based auto-redirects between locales | |
Score: 10. World-class: 10 of 10. Threshold for "shipped": 8 of 10 with HS4, HS5, HS6 mandatory pass.
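The scoring rule above is mechanical enough to script. A minimal sketch, assuming the ten site-wide results are passed in order HS1 to HS10 as the strings "pass" or "fail"; the function name `sitewide_verdict` and the output labels are hypothetical.

```shell
# sitewide_verdict HS1 HS2 ... HS10   (each argument is "pass" or "fail")
# Applies the thresholds: 10/10 is world-class; 8/10 with HS4, HS5, HS6 passing is shipped.
sitewide_verdict() {
  [ "$#" -eq 10 ] || { echo "expected 10 results, got $#" >&2; return 2; }
  local passed=0 r
  for r in "$@"; do
    if [ "$r" = "pass" ]; then passed=$((passed + 1)); fi
  done
  if [ "$passed" -eq 10 ]; then
    echo "world-class ($passed/10)"
  elif [ "$passed" -ge 8 ] && [ "$4" = "pass" ] && [ "$5" = "pass" ] && [ "$6" = "pass" ]; then
    echo "shipped ($passed/10)"
  else
    echo "not shipped ($passed/10)"
  fi
}
```

Note that a site scoring 9 of 10 is still "not shipped" when the single failure is HS4, HS5, or HS6.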
13.3 First 90 Days Post-Deploy (8 Checkpoints)
| Day | Check |
|---|---|
| Day 1 | Screaming Frog crawl matches pre-deploy expectations; zero new errors |
| Day 3 | GSC URL Inspection on 5 representative pages per locale shows correct hreflang in rendered HTML |
| Day 7 | nginx access logs confirm Googlebot is crawling all locale URLs |
| Day 14 | GSC Performance > Country filter shows impressions in target countries on target URLs |
| Day 21 | Sitebulb scheduled audit confirms cluster is stable |
| Day 30 | Comparison of organic landing pages by country in GA4: locale-matched URLs are the primary landing pages per country |
| Day 60 | Rank tracker per country shows locale URLs ranking, not the wrong-locale URL |
| Day 90 | Full re-audit: any drift from baseline triggers cluster repair |
13.4 Audit Decision Tree
Is hreflang present anywhere on the site?
NO -> Decide: does site need hreflang?
YES -> Full install (Sections 4-11).
NO -> Skip framework.
YES -> Continue.
Does Screaming Frog or Sitebulb crawl show return-tag errors?
YES -> Cluster is broken. Repair to zero return-tag errors before any other work.
NO -> Continue.
Are any URLs in "Duplicate, Google chose different canonical" in GSC?
YES -> Canonical-hreflang conflict probable. Section 8 remediation.
NO -> Continue.
Do all locale URLs return HTTP 200?
YES -> Continue.
NO -> Fix non-200 URLs in cluster.
Are all language and country codes valid?
YES -> Continue.
NO -> Fix code errors. Common: en-uk -> en-GB.
Is x-default present where needed?
YES -> Continue.
NO -> Add x-default per Section 6.
Are tags in first-byte HTML?
YES -> Continue.
NO -> Server-render hreflang. See framework-contentfirst.md.
Site passes audit. Move to monitoring per Section 12.
13.5 Failure Severity Classification
failure_severity:
critical_blocks_indexing:
- canonical_hreflang_conflict
- hreflang_in_body_only
- javascript_injected_only
- all_peer_urls_404
impact: "cluster is invisible to Google for affected URLs"
timeline: "fix within 24 hours; rank loss within 2-4 weeks"
high_partial_function:
- missing_self_reference
- broken_return_tags
- invalid_iso_codes
- hreflang_to_redirect
impact: "cluster partially discarded; some locales not surfacing"
timeline: "fix within 1 week"
medium_suboptimal:
- missing_x_default
- inconsistent_method_across_pages
- sitemap_out_of_sync_with_head
impact: "edge cases not optimally handled"
timeline: "fix within 4 weeks"
low_cosmetic:
- uppercase_lowercase_inconsistency_in_codes
- extra_whitespace_in_tags
impact: "no functional impact; tidiness issue"
timeline: "fix during next template refactor"
14. Maintenance Schedule and Report Templates
14.1 Maintenance Schedule
hreflang_maintenance_schedule:
every_deploy:
- "CI pipeline runs hreflang-check-bidirectional.sh on changed URLs"
- "CI pipeline runs hreflang-check-canonical.sh on changed URLs"
- "CI pipeline runs hreflang-check-codes.sh on changed templates"
- "fail the deploy if any script returns errors"
weekly:
- "Screaming Frog scheduled crawl with hreflang export"
- "compare to previous week's baseline"
- "GSC Coverage review for duplicate-canonical entries"
- "URL Inspection spot check on 5 pages per locale"
monthly:
- "full Sitebulb or Screaming Frog crawl"
- "GSC Performance by country review"
- "nginx log review for crawl coverage per locale"
- "competitor hreflang spot check (2 to 3 competitors)"
- "report generation (Section 14.3)"
quarterly:
- "full audit per Section 13"
- "policy review: are URL structure decisions still right"
- "report generation (Section 14.4)"
annually:
- "review locale strategy: add markets, drop markets, deepen localization"
- "review URL structure: is ccTLD vs subdir vs subdomain still right"
- "verify ISO codes have not changed (rare but possible)"
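The every_deploy rule ("fail the deploy if any script returns errors") can be wired up with a small gate function. This is a sketch under assumptions: the function name `run_hreflang_gate` is hypothetical, each check is treated as a command whose exit status signals success or failure, and how the changed URLs are passed to the three scripts is not specified here.

```shell
# run_hreflang_gate CHECK...
# Runs every check (no short-circuit), reports each result, and returns
# non-zero when at least one check failed.
run_hreflang_gate() {
  local failed=0 check
  for check in "$@"; do
    if $check; then                      # unquoted on purpose: a check may carry arguments
      echo "PASS $check"
    else
      echo "FAIL $check"
      failed=$((failed + 1))
    fi
  done
  echo "$failed check(s) failed"
  [ "$failed" -eq 0 ]
}

# In a CI step, for example:
#   run_hreflang_gate ./hreflang-check-bidirectional.sh \
#                     ./hreflang-check-canonical.sh \
#                     ./hreflang-check-codes.sh || exit 1
```

Running all checks before failing gives one complete error report per deploy instead of one error at a time.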
14.2 New Locale Launch Schedule
new_locale_launch_schedule:
week_minus_4:
- "decide locale (language plus country)"
- "decide URL structure for new locale"
- "translation and localization begins"
week_minus_2:
- "all new locale URLs built and live in staging"
- "Screaming Frog crawl of staging confirms cluster well-formed"
- "existing locale templates updated in staging to include new peer"
week_minus_1:
- "staging full validation"
- "GSC URL-prefix property prepared for the new locale"
week_zero_launch:
- "deploy"
- "sitemap updated and submitted to GSC"
- "URL Inspection on representative pages"
- "IndexNow submission if Bing targeting matters"
week_plus_1:
- "Screaming Frog crawl of production"
- "GSC Coverage check for new URLs entering index"
- "rank tracker baseline established for new country"
week_plus_2:
- "Performance by country check"
- "verify Googlebot is crawling new locale URLs"
week_plus_4:
- "first month-end report"
- "investigation of any unexpected behavior"
week_plus_12:
- "first quarter retrospective"
- "decision: is the new locale tracking expected trajectory"
14.3 Monthly Report Template
# Hreflang Health Report for Month YYYY-MM
## Cluster Health
- Total locale URLs: N
- Total clusters: M
- Screaming Frog hreflang error count: X (previous month: Y, delta: Z)
## Error Breakdown
| Error Type | Count This Month | Count Previous | Delta | Severity |
|---|---|---|---|---|
| Missing self-reference | | | | High |
| Missing return links | | | | High |
| Non-200 hreflang URL | | | | High |
| Invalid language code | | | | High |
| Canonical-hreflang conflict | | | | Critical |
| Missing x-default | | | | Medium |
## Performance by Country
| Country | Impressions | Clicks | Locale-Matched URL Rate |
|---|---|---|---|
| US | | | percent |
| GB | | | percent |
| DE | | | percent |
| FR | | | percent |
## Coverage by Locale
| Locale | URLs in Index | Pages with Errors | Locale Health |
|---|---|---|---|
| en-US | | | |
| en-GB | | | |
| de-DE | | | |
| fr-FR | | | |
## Actions This Month
- Fixes deployed: list
- New errors discovered: list
- Open items for next month: list
## Recommendations
- short list
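The Error Breakdown rows in the monthly report can be generated from two crawl exports. A sketch under assumptions: each input is a tab-separated file of "error type, count" (a hypothetical format, not a native Screaming Frog or Sitebulb export), the function name `error_deltas` is made up for illustration, and the Severity column is left for manual entry.

```shell
# error_deltas PREVIOUS_TSV CURRENT_TSV
# Prints one table row per error type in the current file:
# | Error Type | Count This Month | Count Previous | Delta |
# Assumes the previous file is non-empty.
error_deltas() {
  awk -F '\t' '
    NR == FNR { prev[$1] = $2; next }    # first file: remember previous counts
    { printf "| %s | %d | %d | %+d |\n", $1, $2, prev[$1] + 0, $2 - prev[$1] }
  ' "$1" "$2"
}
```

An error type that did not exist last month is reported with a previous count of 0, so new issue types show up as a positive delta.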
14.4 Quarterly Audit Report Template
# Hreflang Quarterly Audit for QN YYYY
## Executive Summary
- One-paragraph summary of cluster health, key changes, and trajectory.
## Audit Rubric Scores
- Per-page audit: average X / 10 across sample of N pages
- Site-wide audit: X / 10
- 90-day post-deploy checkpoints: list pass/fail
## Strategic Questions
- Is the URL structure still right for the business in Quarter N?
- Are there markets to add or drop?
- Is the implementation method still right (head vs sitemap vs header)?
- Are there pagination changes that require hreflang updates?
## Issue Trend
- Critical errors: 13-week trend
- High errors: 13-week trend
- New issue types introduced this quarter: list
## Performance by Country
- Per-country impressions trend
- Per-country click trend
- Per-country locale-matched URL surface rate
## Recommendations
- Short list with owners and deadlines
14.5 Incident Response Template
When a cluster breaks unexpectedly (rank loss in a locale, sudden duplicate-canonical errors, etc.):
# Hreflang Incident YYYY-MM-DD
## Symptom
- What was observed (rank loss, GSC alert, third-party tool flag).
- When first noticed.
- What locale(s) affected.
## Diagnosis
- Screaming Frog or Sitebulb diagnosis result.
- Specific error type identified.
- Probable cause (recent deploy, template change, locale change).
## Impact Assessment
- Number of URLs affected.
- Estimated traffic loss (per locale).
- Estimated revenue loss (per locale).
## Remediation
- Fix applied.
- Deploy date.
- Validation result.
## Post-Mortem
- Root cause.
- Why was this not caught in pre-deploy validation.
- Process change to prevent recurrence.
End of Framework Document
Document version: 1.0
Companion documents:
- framework-international.md: Strategic internationalization, URL structure decisions at the business level, content localization
- framework-contentfirst.md: The doctrine that hreflang must render in the first byte
- framework-technicalseo.md: Canonical signal stack, URL structure conventions, robots.txt
- framework-schema.md: Structured data; the inLanguage property complements hreflang
- framework-internallinking.md: Internal linking topology that reinforces hreflang
- framework-gscanalysis.md: GSC reporting after International Targeting deprecation
- framework-ga4.md: Country-level analytics for verifying locale routing
- framework-migration.md: Adding locales as a migration event
- framework-security.md: HTTPS posture across ccTLDs and subdomains
- framework-cross-stack-implementation.md: Framework-specific patterns for hreflang
- framework-react.md: Client-rendered SPAs and hreflang prerender requirements
- framework-nextjs.md: generateMetadata, App Router alternates, locale routing
- framework-wordpress.md: Polylang, WPML hreflang patterns
- framework-shopify.md: Shopify Markets and hreflang interaction
- SEO-Search-Appearance.md: How locale-matched URLs surface in the SERP
- SERP-Optimization.md: Country-specific SERP feature targeting
Phase 2 siblings scheduled: framework-multilingual-content, framework-localization-process, framework-cross-border-ecommerce, framework-geo-targeting-without-hreflang.
Sources cited in this framework:
- Google Search Central, "Localized versions of your pages" (developers.google.com/search/docs/specialty/international/localized-versions), 2024 revision
- Google Search Central, "Managing Multi-Regional and Multilingual Sites" (developers.google.com/search/docs/specialty/international/managing-multi-regional-sites), 2024 revision
- Google Search Central, "Introducing x-default hreflang for international landing pages", April 10, 2013
- Google Search Central, "The International Targeting report is deprecated" (support.google.com/webmasters/answer/12474899), September 22, 2022 deprecation
- John Mueller, Google Search Off the Record and Search Central Office Hours, multiple statements 2018-2025 on hreflang as hint, return tag confirmation, canonical conflict, and complexity
- Ahrefs Hreflang Implementation Study, 374,756 domains sample, 2024, finding 67 percent implementation issues
- Semrush, "9 Common Hreflang Errors", site audit data from 20,000 multilingual sites, 2023-2025
- Screaming Frog SEO Spider, "How To Audit Hreflang" documentation, 2025
- Screaming Frog SEO Spider, "Issues > Hreflang > Non-200 Hreflang URLs", 2025
- Sitebulb, "International Hints" documentation, 2025
- Aleyda Solis, hreflang Tags Generator (aleydasolis.com/en/seo-resources-tools/hreflang-tags-generator/), referenced by Google Search Central
- hreflang.org Testing Tool (app.hreflang.org and hreflang.org/list-of-hreflang-codes/), valid code reference 2025
- Merkle Technical SEO, Hreflang Tag Tester (technicalseo.com/tools/hreflang/), 2025
- ISO 639-1 language code registry (Library of Congress maintainer), 2025 edition
- ISO 3166-1 Alpha 2 country code registry (ISO maintenance agency), 2025 edition
- Search Engine Land, "Study: 31 percent of international websites contain hreflang errors", 2022 study referenced in 2024 update
- Bing Webmaster Blog, "How Bing handles hreflang", 2016, cited in 2025 Bing documentation
- Lumar SEO, "Canonical Tags Easy Dos and Donts", 2024
- Prerender, "10 Common Hreflang Tag Issues and How to Fix Them", 2025
End of framework.
From the ThatDevPro Engine Optimization framework library. Studio: ThatDevPro (SDVOSB veteran-owned web + AI engineering). Sister property: ThatDeveloperGuy. Source: https://www.thatdevpro.com/insights/framework-hreflang/.