
Joseph Anady

Posted on • Originally published at thatdevpro.com

Internal linking: hub-and-spoke architecture

Originally published at thatdevpro.com. Part of ThatDevPro's open SEO + AI framework library. Open-source AI citation toolkit: github.com/Janady13/aio-surfaces.


Hub-and-Spoke Architecture, Anchor Text Discipline, Topical Clusters, Crawl Depth, Orphan Detection, and Link Equity Distribution

A comprehensive installation and audit reference for internal linking — the discipline of using on-site links to communicate site architecture to crawlers, distribute ranking signals across pages, and guide users through topical depth. Internal linking is one of the highest-ROI SEO levers because it costs nothing per link added and compounds across the entire site. Dual-purpose: installation manual and audit document.


1. Document Purpose

This is the canonical reference for internal linking. Most sites have content that ranks. Most sites have technical SEO that works. Few sites have internal linking that does what it should. The gap shows up as orphan pages, shallow content, ranking signals trapped on the homepage, and topical clusters that exist on paper but not in the link graph.

In 2026, internal linking has become more important, not less. AI search engines parse internal link patterns to understand topical authority. Google's crawl-budget allocation favors well-linked pages. Topical-cluster ranking, where a page ranks when its surrounding cluster ranks, is now a measurable phenomenon. Sites that use internal linking strategically can outrank sites with better content but flatter link graphs.

1.1 Required Tools

  • Screaming Frog SEO Spider — desktop crawler; reveals link graph, orphan pages, anchor text
  • Sitebulb — desktop crawler with stronger reporting on crawl depth and link counts
  • Ahrefs Site Explorer — Internal Backlinks — sitewide internal link audit
  • Ahrefs Site Audit — Internal Linking — detects orphan pages, broken links
  • Semrush Site Audit — Internal Linking — alternative
  • Google Search Console — Links report — Google's view of internal linking
  • Link Whisper (WordPress) — automated internal-link suggestion
  • Custom Python + BeautifulSoup — for programmatic link extraction at scale
  • Graphviz / yEd / Gephi — for visualizing link graphs
  • Spreadsheet (Sheets or Excel) — link inventory management
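
As a rough illustration of programmatic link extraction, here is a minimal sketch. It swaps in Python's standard-library `html.parser` for BeautifulSoup so that it runs without extra installs. The `LinkExtractor` class, the `internal_links` helper, and the example.com URLs are all hypothetical names chosen for this example.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect (absolute_url, anchor_text) pairs for every <a href> in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []      # finished (url, anchor_text) pairs
        self._href = None    # href of the <a> currently open, if any
        self._text = []      # text fragments seen inside that <a>

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative hrefs against the page URL.
                self._href = urljoin(self.base_url, href)
                self._text = []

    def handle_data(self, data):
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            anchor = " ".join("".join(self._text).split())
            self.links.append((self._href, anchor))
            self._href = None


def internal_links(html, page_url):
    """Return only the links whose host matches the page's own host."""
    parser = LinkExtractor(page_url)
    parser.feed(html)
    host = urlparse(page_url).netloc
    return [(url, text) for url, text in parser.links
            if urlparse(url).netloc == host]


sample = """
<p>Read <a href="/local-seo/">our guide to local SEO</a> or
<a href="https://other.example.org/">an external source</a>.</p>
"""
links = internal_links(sample, "https://example.com/blog/post/")
print(links)
```

Running this over every crawled page yields the (source, target, anchor) inventory that the audits later in this document rely on. Image-only links come back with an empty anchor string in this sketch, so alt text would need separate handling.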

1.2 Document Scope

Covers: site architecture patterns, hub-and-spoke / topical-cluster organization, anchor text discipline, crawl depth, orphan detection, link equity distribution, contextual vs navigational linking, breadcrumbs, faceted navigation handling, and pagination patterns. Touches but does not exhaust: keyword research and topic mapping (framework-keywordresearch.md), schema's BreadcrumbList (framework-schema.md), navigation UX (framework-uxseo.md).


2. Client Variables Intake

domain: ""
total_pages_indexed: 0
content_taxonomy_documented: false
hub_pages_identified: []
known_orphan_pages: []
existing_internal_link_strategy: ""    # describe or "none"
sitemap_url_count: 0
crawl_depth_max_observed: 0            # from Sitebulb / Screaming Frog
average_internal_links_per_page: 0
top_traffic_pages: []
top_conversion_pages: []
known_cannibalization: []              # cross-reference framework-keywordresearch.md

3. The Three Functions of Internal Links

Every internal link does three things at once:

  1. Architectural — tells crawlers and users that the linked page exists and matters.
  2. Topical — tells crawlers what the linked page is about (via anchor text and surrounding content).
  3. Equity — passes a portion of the source page's ranking signals to the target.

A weak link does one. A strong link does all three.


4. Site Architecture Patterns

4.1 The Hub-and-Spoke Model

The dominant architecture pattern for content sites and most service businesses.

                  Homepage (root hub)
                  /         |         \
              Hub A      Hub B       Hub C
              /  \         |         /  \
           Sub  Sub     Sub Sub    Sub  Sub
            \    \       |   /     /    /
             [back-links to hub from each spoke]

The pattern:

  • Pillar / hub pages cover a broad topic comprehensively. They link out to all relevant sub-pages.
  • Cluster / spoke pages cover narrow sub-topics in depth. They link back to their hub.
  • Cross-cluster links are sparing and used only when topically warranted.

Why it works:

  • Hubs accumulate link equity from spokes.
  • Spokes signal topical depth around the hub.
  • Crawlers see a clear topical structure.
  • Users navigate naturally between depth levels.

4.2 The Mesh Model

For wikis, knowledge bases, and densely interconnected content, a mesh works better than strict hubs.

  • Every page links to many topically related pages.
  • No strict hierarchy.
  • Depth comes from cross-references, not parent-child.

Wikipedia is the canonical mesh. It's appropriate for sites with hundreds of densely related entities. Most agency clients should use hub-and-spoke instead.

4.3 The Catalog Model (Ecommerce)

For ecommerce:

  • Homepage → Category → Subcategory → Product
  • Cross-links between related products
  • Cross-links from category to comparison guides (cluster overlay)

Cross-reference: framework-ecommerceseo.md.

4.4 The Editorial Model (Publishers)

For news and editorial sites:

  • Section → Article
  • Articles link forward to follow-up coverage
  • Articles link back to evergreen explainer pages
  • "Related articles" widgets serve as soft hubs

Cross-reference: framework-newsseo.md.


5. Topical Clusters

A topical cluster is a hub page plus all the spokes that cover the topic. Clusters are how modern sites compete for broad topics they couldn't dominate with a single page.

5.1 Cluster Anatomy

For a cluster on "Local SEO":

HUB: /topics/local-seo/  (pillar page, comprehensive overview)

SPOKES:
  /topics/local-seo/google-business-profile/
  /topics/local-seo/local-citations/
  /topics/local-seo/local-pack-ranking/
  /topics/local-seo/review-management/
  /topics/local-seo/local-link-building/
  /topics/local-seo/local-schema/
  /topics/local-seo/local-content-strategy/

Each spoke:

  • Links back to the hub in the opening paragraph
  • Links back to the hub in the conclusion
  • Links to 2-3 sister spokes in topically appropriate places
  • Does not link to spokes from unrelated clusters

The hub:

  • Links to every spoke at least once
  • Includes a structured "in this cluster" section listing spokes
  • Updates as new spokes are added
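
The spoke and hub rules above can be checked mechanically once a link inventory exists. The sketch below assumes a hypothetical `outlinks` mapping (page URL to the set of internal URLs it links to) and a hypothetical `audit_cluster` helper. Because it works on sets, it can confirm that a spoke links back to the hub but not that it does so twice (opening and conclusion).

```python
def audit_cluster(hub, spokes, outlinks):
    """Return a list of human-readable problems for one hub-and-spoke cluster.

    outlinks maps each page URL to the set of internal URLs it links to.
    """
    problems = []
    hub_targets = outlinks.get(hub, set())
    for spoke in spokes:
        spoke_targets = outlinks.get(spoke, set())
        if spoke not in hub_targets:
            problems.append(f"hub does not link to {spoke}")
        if hub not in spoke_targets:
            problems.append(f"{spoke} does not link back to hub")
        # Sister spokes are the other spokes in the same cluster.
        sisters = spoke_targets & (set(spokes) - {spoke})
        if len(sisters) < 2:
            problems.append(f"{spoke} links to only {len(sisters)} sister spoke(s)")
    return problems


hub = "/topics/local-seo/"
gbp = "/topics/local-seo/google-business-profile/"
citations = "/topics/local-seo/local-citations/"
reviews = "/topics/local-seo/review-management/"

outlinks = {
    hub: {gbp, citations},              # hub forgot the reviews spoke
    gbp: {hub, citations, reviews},
    citations: {hub, gbp, reviews},
    reviews: {gbp, citations},          # spoke never links back to the hub
}
problems = audit_cluster(hub, [gbp, citations, reviews], outlinks)
for line in problems:
    print(line)
```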

5.2 Cluster Sizing

| Cluster size | Strategy |
| --- | --- |
| 3-5 spokes | Minimum viable cluster. Common starting point. |
| 6-10 spokes | Healthy cluster. Most competitive topics need this depth. |
| 11-20 spokes | Authoritative cluster. Often the bar for ranking head terms. |
| 20+ spokes | Pillar treatment. Reserved for category-defining hubs. |

5.3 Cluster Identification

For existing sites:

  1. Inventory all content pages.
  2. Group by topical theme.
  3. Identify the strongest existing piece per theme as the candidate hub.
  4. Document the cluster gap (sub-topics not yet covered).
  5. Decide: build out the cluster, or consolidate into fewer pages?

For new sites:

  1. Start with keyword research (framework-keywordresearch.md).
  2. Identify head term per cluster.
  3. Decide on hub URL and 5-10 starting spokes.
  4. Build hub first; spokes can roll out over weeks.

6. Anchor Text Discipline

Anchor text — the visible link text — is one of the strongest topical signals to Google.

6.1 Anchor Text Types

| Type | Example | When to use |
| --- | --- | --- |
| Exact match | <a href="/local-seo/">local SEO</a> | Sparingly; risk of over-optimization |
| Partial match | <a href="/local-seo/">our guide to local SEO</a> | Most common; safest |
| Branded | <a href="/local-seo/">read more on ThatDeveloperGuy</a> | Brand-building; less topical signal |
| Generic | <a href="/local-seo/">click here</a> | Avoid. Wastes the topical signal. |
| Naked URL | <a href="/local-seo/">thatdeveloperguy.com/local-seo/</a> | Avoid for internal links; fine for citations |
| Image | image with alt="local SEO guide" | Alt text serves as anchor; use descriptive alt |

6.2 Anchor Text Patterns That Hurt

  • "Click here", "read more", "learn more" — wasted signal. Especially bad on important hub links.
  • The same exact-match anchor on every link to a page — over-optimization, looks unnatural.
  • Mismatched anchor and target — anchor says "plumbing" but page is about "electrical work."
  • Stuffed anchors: <a>local SEO services Cassville Missouri SDVOSB</a> reads as keyword stuffing.

6.3 Healthy Anchor Distribution

For a typical hub page with many internal pointers, anchor text should vary naturally:

  • ~30% exact or partial-match topic anchors
  • ~30% partial-match with surrounding context ("our guide to...", "more on...")
  • ~20% branded or contextual ("we've covered this on the blog", "see our process")
  • ~20% incidental (in-prose mentions where the anchor reflects the surrounding sentence)

This distribution emerges naturally from good editorial writing. It rarely emerges from mass-produced internal-link automation.
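
To spot-check how the anchors pointing at one page are distributed, a small tally is usually enough. The sketch below is an illustration only: `anchor_report`, the `GENERIC_ANCHORS` set, and the 50% exact-match threshold are assumptions made for this example, not figures from any search engine.

```python
from collections import Counter

# Generic anchors named in section 6.2.
GENERIC_ANCHORS = {"click here", "read more", "learn more"}


def anchor_report(anchors, exact_match, max_exact_share=0.5):
    """Summarize the anchors pointing at one target page.

    anchors: list of anchor-text strings from links to the same target.
    exact_match: the target's head term, e.g. "local seo".
    max_exact_share: assumed threshold above which exact match is flagged.
    """
    normalized = [" ".join(a.lower().split()) for a in anchors]
    counts = Counter(normalized)
    total = len(normalized)
    generic = sum(n for text, n in counts.items() if text in GENERIC_ANCHORS)
    exact = counts.get(exact_match.lower(), 0)
    return {
        "total": total,
        "distinct": len(counts),
        "generic_share": generic / total if total else 0.0,
        "exact_share": exact / total if total else 0.0,
        "exact_overused": total > 0 and exact / total > max_exact_share,
    }


report = anchor_report(
    ["local SEO", "our guide to local SEO", "click here",
     "Local SEO", "more on local search"],
    exact_match="local seo",
)
print(report)
```

A report with a high `generic_share` or a single dominant anchor is a prompt for editorial review, not an automatic failure.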


7. Crawl Depth

Crawl depth is the number of clicks from the homepage to a given page. It is one of the strongest predictors of which pages Google indexes and ranks.

7.1 Targets

| Site size | Maximum acceptable crawl depth |
| --- | --- |
| Small (under 100 pages) | 3 clicks |
| Medium (100-1,000 pages) | 4 clicks |
| Large (1,000-10,000 pages) | 5 clicks |
| Massive (10,000+ pages) | 6 clicks (with strong sitemap support) |

Pages buried deeper rarely accumulate link equity, rarely get crawled frequently, and rarely rank.

7.2 Crawl Depth vs URL Path Depth

These are different. URL path depth (/a/b/c/d/page/ = depth 5) is unrelated to crawl depth (clicks from homepage). A page at /a/b/c/d/page/ is at crawl depth 1 if the homepage links to it directly. A page at /page/ can be at crawl depth 6 if it is buried behind paginated archives.

Optimize for crawl depth. URL depth doesn't matter to crawlers.
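
Crawl depth is a shortest-path measurement, so a breadth-first search over the link graph reproduces it. The sketch below assumes a hypothetical `outlinks` mapping built from a crawl export, and the URLs are invented for illustration.

```python
from collections import deque


def crawl_depths(outlinks, start="/"):
    """Breadth-first search from the homepage.

    Returns {url: minimum number of clicks from start}.
    Pages that never appear in the result are unreachable by links.
    """
    depths = {start: 0}
    queue = deque([start])
    while queue:
        page = queue.popleft()
        for target in outlinks.get(page, ()):
            if target not in depths:
                depths[target] = depths[page] + 1
                queue.append(target)
    return depths


outlinks = {
    "/": ["/guides/", "/blog/"],
    "/guides/": ["/a/b/c/d/page/"],        # long URL path, shallow crawl depth
    "/blog/": ["/blog/page/2/"],
    "/blog/page/2/": ["/blog/page/3/"],
    "/blog/page/3/": ["/old-post/"],       # short URL path, deep crawl depth
}
depths = crawl_depths(outlinks)
for url, depth in sorted(depths.items(), key=lambda item: item[1]):
    print(depth, url)
```

Here the five-segment URL sits two clicks from the homepage, while the single-segment /old-post/ sits four clicks away behind pagination.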

7.3 Reducing Crawl Depth

Common pages buried too deep:

  • Old blog posts — accessible only via paginated archives at /page/2/, /page/3/, etc.
  • Product variants — buried under category → subcategory → product → variant
  • Deep category pages — buried under multi-level taxonomies
  • Tag archives — typically dead ends

Fixes:

  • Add hub pages that link directly to important deep content
  • Create "Best of" / "Most popular" sections on the homepage or category pages
  • Link from new content to high-value old content
  • Build cluster hubs that surface deep content
  • Use breadcrumbs (each breadcrumb-equipped page links directly to its ancestor pages)

8. Orphan Pages

An orphan page is one with zero internal links pointing to it. Orphans:

  • Do not get crawled (or are crawled rarely)
  • Do not accumulate ranking signals
  • Often do not rank at all

8.1 Detection

Screaming Frog method:

  1. Crawl the site
  2. Crawl Analysis → Configure → enable "Crawl Analysis"
  3. Run Crawl Analysis after main crawl completes
  4. Reports → Orphan Pages

Sitebulb method: Auto-detects in standard reports.

Ahrefs / Semrush: Site audit reports orphans.

XML sitemap cross-reference: Compare the sitemap's URLs against the URLs found by a link-following crawl. URLs that appear in the sitemap but not in the crawl are orphan candidates.
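
The sitemap cross-reference is a set difference. A minimal sketch follows, assuming a standard sitemaps.org `<urlset>` document and a set of crawled URLs exported from whichever crawler is in use. The helper names and sample URLs are hypothetical.

```python
import xml.etree.ElementTree as ET

SITEMAP_NS = {"sm": "http://www.sitemaps.org/schemas/sitemap/0.9"}


def sitemap_urls(xml_text):
    """Extract every <loc> value from a standard XML sitemap."""
    root = ET.fromstring(xml_text)
    return {loc.text.strip()
            for loc in root.findall(".//sm:loc", SITEMAP_NS)
            if loc.text}


def orphan_candidates(sitemap, crawled):
    """URLs listed in the sitemap that the link-following crawl never reached."""
    return sorted(set(sitemap) - set(crawled))


sample_sitemap = """<urlset xmlns="http://www.sitemaps.org/schemas/sitemap/0.9">
  <url><loc>https://example.com/</loc></url>
  <url><loc>https://example.com/local-seo/</loc></url>
  <url><loc>https://example.com/forgotten-page/</loc></url>
</urlset>"""

crawled = {"https://example.com/", "https://example.com/local-seo/"}
orphans = orphan_candidates(sitemap_urls(sample_sitemap), crawled)
print(orphans)
```

URL normalization (trailing slashes, http vs https, tracking parameters) should be applied to both sides before comparing, or the difference set will contain false positives.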

8.2 Resolution

For each orphan, decide:

  • Should this page exist? If not, 410 it.
  • If yes, where should it be linked from? Identify the natural parent in the site structure.
  • Add at least 3 inbound internal links before considering the orphan resolved.

8.3 Prevention

  • Editorial workflow: never publish a page without identifying at least 2-3 places to link to it from.
  • Hub-and-spoke discipline: every spoke is added to its hub when published.
  • Internal-linking review at publish: spend 5 minutes after every publish placing inbound links.

9. The Link Equity Lens

Link equity (informally "link juice") is the ranking-signal capital a page accumulates from internal and external links. Internal linking redistributes equity within the site.

9.1 PageRank Distribution Logic (Conceptual)

Every page has some equity, which it distributes across the pages it links to (split by the number of outgoing links). Pages that receive the most equity tend to rank best, other factors being equal.

Implications:

  • The homepage has the most equity (highest external inbound).
  • The homepage's equity is split across its outbound links, so every link added to the homepage reduces the share each target receives.
  • Hub pages accumulate equity by being linked from many sources within the site.
  • A page with many outbound links passes less equity per link.
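
The distribution logic can be made concrete with a toy iteration in the style of the textbook PageRank formulation. This is a conceptual sketch only: real ranking systems are far more complex, the damping value of 0.85 is the conventional textbook choice rather than anything taken from this document, and the graph and function name are invented for illustration.

```python
def distribute_equity(outlinks, damping=0.85, iterations=50):
    """Iterate a simplified PageRank over an internal link graph.

    outlinks maps each page to the list of pages it links to.
    Every link target must also appear as a key.
    """
    pages = list(outlinks)
    n = len(pages)
    score = {page: 1.0 / n for page in pages}
    for _ in range(iterations):
        new = {page: (1.0 - damping) / n for page in pages}
        for page, targets in outlinks.items():
            if targets:
                # Each outbound link receives an equal share of the page's score.
                share = damping * score[page] / len(targets)
                for target in targets:
                    new[target] += share
            else:
                # A page with no outbound links spreads its score evenly.
                for other in pages:
                    new[other] += damping * score[page] / n
        score = new
    return score


graph = {
    "/": ["/hub/", "/about/"],
    "/hub/": ["/", "/spoke-1/", "/spoke-2/"],
    "/spoke-1/": ["/hub/"],
    "/spoke-2/": ["/hub/"],
    "/about/": ["/"],
}
scores = distribute_equity(graph)
for page, value in sorted(scores.items(), key=lambda item: -item[1]):
    print(f"{value:.3f}  {page}")
```

In this toy graph the hub ends up with the highest score because both spokes and the homepage point at it, which mirrors the "hubs accumulate equity" claim above.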

9.2 The 100-Link Heuristic

Google indicated decades ago that 100 links per page was a soft maximum. The modern reality: there is no hard limit, but pages with 200+ outbound links pass diluted equity.

For a typical content page, target:

  • 3-7 contextual outbound links to other site pages
  • Plus navigation links (header, footer, breadcrumbs)

A page with 50+ outbound links should be questioned.

9.3 nofollow on Internal Links

Don't nofollow internal links unless there's a specific reason (e.g., blocking infinite-spawn faceted URLs from crawl). Old SEO advice to nofollow login or registration links is outdated — Google handles those fine without intervention.

9.4 Strategic Equity Concentration

Identify the 5-10 pages on the site that should rank highest (commercial pages, top-converting pages, hub pages). For these, intentionally:

  • Link from the homepage prominently
  • Link from every related blog post
  • Link from major content hubs
  • Use varied descriptive anchor text

Equity flow is editable through linking decisions.


10. Navigation Linking

The header, footer, and sidebar are the "navigation layer" — links that appear on every page (or every page in a section).

10.1 Header Navigation

Should contain links to:

  • Top-level commercial pages (Services, Pricing, Contact)
  • Top-level content hubs (Blog, Resources)
  • The single highest-priority CTA

Should NOT contain:

  • Every section the site has (mega-menus only when truly needed)
  • Promotional / temporary content
  • Deep pages that should earn their rankings organically (pointing a link at them from every page is a wasteful way to concentrate equity)

10.2 Footer Navigation

A natural place for:

  • Sitemap-style links to important pages
  • Trust signals (about, contact, privacy, terms)
  • Service pages organized logically
  • Brand and credential links

The footer "supplies" link equity to a wider page set than the header. Use this deliberately.

10.3 Mega-Menus

Mega-menus (multi-column dropdowns with dozens of links) work well for ecommerce and large content sites — but the link count from every page is significant. For smaller sites, mega-menus dilute rather than help.

10.4 Breadcrumbs

Breadcrumbs add internal links (parent and grandparent pages) to every non-homepage. They:

  • Reduce crawl depth (every breadcrumb-enabled page is closer to root)
  • Add BreadcrumbList schema (cross-reference: framework-schema.md)
  • Improve user navigation
  • Display in SERPs as a navigation aid

Implement on every non-homepage. Always.
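
For reference, BreadcrumbList markup follows the schema.org shape of an `itemListElement` array of `ListItem` entries with a 1-based `position`. The generator below is a minimal sketch. The `breadcrumb_jsonld` helper and the trail URLs are hypothetical, and the output would be embedded in a `<script type="application/ld+json">` element.

```python
import json


def breadcrumb_jsonld(trail):
    """Build schema.org BreadcrumbList markup from (name, url) pairs, root first."""
    return json.dumps(
        {
            "@context": "https://schema.org",
            "@type": "BreadcrumbList",
            "itemListElement": [
                {"@type": "ListItem", "position": i, "name": name, "item": url}
                for i, (name, url) in enumerate(trail, start=1)
            ],
        },
        indent=2,
    )


markup = breadcrumb_jsonld([
    ("Home", "https://example.com/"),
    ("Local SEO", "https://example.com/topics/local-seo/"),
    ("Local Citations", "https://example.com/topics/local-seo/local-citations/"),
])
print(markup)
```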


11. Contextual Linking

Contextual links are in-prose links within editorial content. They are the strongest internal links because:

  • Surrounding content reinforces the topical signal
  • Anchor text is naturally varied
  • They appear in the part of the page Google weighs most heavily

11.1 The Editorial Discipline

Every editorial publish should include 3-5 contextual links to other pages on the site. These should appear:

  • Once in the introduction (link forward to related topic)
  • 1-2 times in the body (link to specific related sub-topics)
  • Once in the conclusion (link to the next logical step)

This is not link manipulation. This is good editorial practice.

11.2 The "Updated Old Posts" Pattern

When publishing new content:

  • Identify 3-5 old posts that should now reference the new content.
  • Add contextual links from those posts to the new one.
  • Update the old posts' dateModified.

This is one of the highest-ROI SEO activities. Costs 15 minutes per publish; compounds across the site.

11.3 The "Linked Mentions" Audit

Spot-check old content quarterly. For every brand, product, or topic mention in old posts, verify it links to the appropriate destination page. Mentions without links are wasted internal-linking opportunities.


12. Faceted Navigation

Faceted navigation is the filter/sort UI common on ecommerce and listing sites. It generates URLs combinatorially:

/category/?color=red&size=m&brand=acme

12.1 The Crawl Budget Problem

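A faceted system with 5 facets and 5 options each generates 5^5 = 3,125 URL variations when every facet is set, and 6^5 - 1 = 7,775 once each facet can also be left unset. With 10 facets, the count runs into the millions (5^10 = 9,765,625 with every facet set). Crawlers can spend their entire budget on faceted URLs.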
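
The arithmetic is easy to reproduce. The sketch below counts single-select combinations in two ways: with every facet set (the 5^5 figure) and with each facet optionally left unset, which is closer to how filter UIs behave. The function name is hypothetical, and multi-select facets or parameter reordering would push real counts higher still.

```python
from math import prod


def facet_url_count(options_per_facet, allow_unset=True):
    """Count distinct single-select filter combinations.

    options_per_facet: list with the number of options in each facet.
    allow_unset: if True, a facet may be left unfiltered.
    """
    if allow_unset:
        # Each facet has its options plus "not set"; subtract the all-unset page.
        return prod(n + 1 for n in options_per_facet) - 1
    return prod(options_per_facet)


all_set = facet_url_count([5] * 5, allow_unset=False)
any_subset = facet_url_count([5] * 5, allow_unset=True)
ten_facets = facet_url_count([5] * 10, allow_unset=False)
print(all_set, any_subset, ten_facets)
```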

12.2 Strategies

Block facet parameters in robots.txt:

User-agent: *
# The second * matches the parameter anywhere in the query string, not only first
Disallow: /*?*color=
Disallow: /*?*size=
Disallow: /*?*sort=

Canonical to parameterless version:

Each faceted URL canonicals to the unfiltered category page.

noindex on facet combinations:

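<meta name="robots" content="noindex, follow"> on faceted URLs. They get crawled but not indexed. This only works if the same URLs are not blocked in robots.txt, because a crawler that cannot fetch the page never sees the meta tag.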

Whitelist facet combinations that should rank:

Some facet combinations are valuable landing pages ("red running shoes"). Whitelist these as canonical, indexable, and present in the sitemap. Block the rest.
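
One way to combine the whitelist with the canonical strategy is a small lookup that decides which canonical URL a faceted page should declare. The sketch below is an assumption about how such logic might look in a template layer. The `INDEXABLE` set, the `canonical_for` helper, and the /running-shoes/ URLs are invented for illustration.

```python
from urllib.parse import parse_qsl, urlparse

# Hypothetical whitelist: (path, facet combination) pairs that deserve
# their own indexable landing page.
INDEXABLE = {
    ("/running-shoes/", frozenset({("color", "red")})),
}


def canonical_for(url):
    """Return the canonical URL a category or faceted page should declare."""
    parts = urlparse(url)
    params = frozenset(parse_qsl(parts.query))
    if not params or (parts.path, params) in INDEXABLE:
        return url  # unfiltered or whitelisted: self-canonical
    # Everything else points at the unfiltered category page.
    return f"{parts.scheme}://{parts.netloc}{parts.path}"


whitelisted = canonical_for("https://example.com/running-shoes/?color=red")
filtered = canonical_for("https://example.com/running-shoes/?color=red&size=m")
plain = canonical_for("https://example.com/running-shoes/")
print(whitelisted, filtered, plain, sep="\n")
```

Using a `frozenset` of parameters makes the check independent of parameter order, so `?size=m&color=red` and `?color=red&size=m` are treated as the same combination.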

12.3 Internal Linking with Facets

  • Faceted URLs should not be linked from elsewhere on the site (don't pass equity into pages you don't index).
  • Whitelisted facet pages get treated as normal landing pages — link to them from category pages.

Cross-reference: framework-ecommerceseo.md.


13. Pagination

Paginated archives (/page/2/, /page/3/) need careful handling.

13.1 The rel=next/prev Pattern (Deprecated)

In 2019, Google announced that it no longer uses rel=next/prev as an indexing signal. The link tags can still be present, but Google ignores them.

13.2 Modern Pagination Patterns

Pattern 1: Self-canonical paginated pages

Every paginated page self-canonicals (page 2 canonicals to itself). All paginated pages may be indexed. Best when each page has substantively different content.

Pattern 2: Canonical to page 1

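Pages 2, 3, and 4 canonical to page 1. Only page 1 is indexed. Best when paginated pages are pure listing pagination with no unique value. Use with care: Google's pagination guidance recommends giving each page its own canonical URL rather than pointing to page 1, and items linked only from deeper pages may be discovered less reliably.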

Pattern 3: noindex paginated pages

Page 2+ are noindex,follow. They pass internal links but don't appear in search. Best when you want crawl access but no SERP presence.
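
The three patterns differ only in which canonical URL and robots value each paginated page emits. The sketch below makes that explicit. The function name, the pattern labels ("self", "first", "noindex"), and the /page/N/ URL scheme are assumptions for illustration.

```python
def pagination_head_tags(base_url, page, pattern):
    """Return (canonical_url, robots_value) for one page of a paginated archive.

    pattern: "self" (Pattern 1), "first" (Pattern 2), or "noindex" (Pattern 3).
    """
    page_url = base_url if page == 1 else f"{base_url}page/{page}/"
    if pattern == "self":
        return page_url, "index, follow"
    if pattern == "first":
        return base_url, "index, follow"
    if pattern == "noindex":
        robots = "index, follow" if page == 1 else "noindex, follow"
        return page_url, robots
    raise ValueError(f"unknown pattern: {pattern}")


archive = "https://example.com/blog/"
self_tags = pagination_head_tags(archive, 3, "self")
first_tags = pagination_head_tags(archive, 3, "first")
noindex_tags = pagination_head_tags(archive, 3, "noindex")
print(self_tags, first_tags, noindex_tags, sep="\n")
```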

13.3 Internal Linking with Pagination

  • Provide direct links to important deep content from elsewhere on the site rather than relying on pagination as the only access path.
  • For long archives, add "Browse by year" or "Browse by category" overlays that link directly to specific posts without pagination.
  • Don't rely on infinite-scroll-only patterns; ensure crawlers can reach all content via traditional links.

14. Linking Hygiene

14.1 Broken Links

Internal 4xx links waste crawl budget, frustrate users, and signal site neglect. Audit:

  • Screaming Frog → Internal → Status code filter for 4xx
  • Sitebulb → Internal Links report
  • Ahrefs Site Audit → Broken Internal Links report

Fix:

  • Update the link to point to the new canonical URL
  • 301-redirect the broken URL to its replacement if the content has moved
  • 410 the broken target if it shouldn't

14.2 Redirect Chains in Internal Links

When an internal link points to a URL that 301-redirects, update the link to point directly to the final destination. Why:

  • Saves the crawler an extra hop
  • Eliminates dependency on the 301 staying in place
  • Marginally improves user-perceived performance

Find with Screaming Frog → Redirect chain report.
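
Given a crawl export, the same cleanup list can be derived in a few lines. The sketch below assumes a hypothetical `redirects` map (redirecting URL to its Location target) and a list of (source, target) internal links. Both helper names and all URLs are invented for illustration.

```python
def resolve_redirect(url, redirects, max_hops=10):
    """Follow a URL through a redirect map and return (final_url, hops)."""
    seen = {url}
    hops = 0
    while url in redirects:
        url = redirects[url]
        hops += 1
        if url in seen or hops > max_hops:
            raise ValueError("redirect loop or chain too long")
        seen.add(url)
    return url, hops


def links_to_update(internal_links, redirects):
    """Yield (source, old_target, final_target) for links that hit a redirect."""
    for source, target in internal_links:
        final, hops = resolve_redirect(target, redirects)
        if hops:
            yield source, target, final


redirects = {
    "/old-guide/": "/guide/",
    "/guide/": "/topics/local-seo/",
}
internal_links = [
    ("/blog/post-1/", "/old-guide/"),          # two hops away from the final URL
    ("/blog/post-2/", "/topics/local-seo/"),   # already direct
]
fixes = list(links_to_update(internal_links, redirects))
print(fixes)
```

Each tuple in `fixes` is one edit: open the source page and replace the old target with the final one.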

14.3 Cross-Domain Internal Links

Sites that span multiple domains (multibrand portfolios, microsite networks) treat cross-domain links as external for SEO purposes. There is no internal link equity flow between separate domains, even if they're owned by the same entity.

For multibrand operations, decide deliberately:

  • Single domain with subfolders (/brand-a/, /brand-b/) — true internal linking, shared equity
  • Subdomains (brand-a.example.com, brand-b.example.com) — partially shared, weaker
  • Separate domains — no internal-link equity flow

Cross-reference: framework-multibrand.md (when built).


15. Audit Mode

| # | Criterion | Pass/Fail |
| --- | --- | --- |
| IL1 | Site has documented topical taxonomy / cluster strategy | |
| IL2 | Hub pages identified for each major topic | |
| IL3 | Each spoke page links back to its hub at least twice | |
| IL4 | Each hub page links to all its spokes | |
| IL5 | Cross-cluster links used sparingly and topically | |
| IL6 | Crawl depth report shows zero pages over depth 3 (small sites) / 5 (large sites) | |
| IL7 | Zero orphan pages detected (Screaming Frog Crawl Analysis) | |
| IL8 | Anchor text varies naturally; no exact-match overuse | |
| IL9 | Anchor text contains zero "click here" / "read more" without context | |
| IL10 | Breadcrumbs implemented on every non-homepage with BreadcrumbList schema | |
| IL11 | Header navigation focused on top commercial + content hubs only | |
| IL12 | Footer provides supplemental link layer to important pages | |
| IL13 | Average outbound internal links per content page between 5 and 30 | |
| IL14 | Zero broken internal links (4xx) | |
| IL15 | Zero internal links pointing to redirect chains | |
| IL16 | Faceted navigation strategy documented (block / canonical / whitelist) | |
| IL17 | Pagination strategy documented (self-canonical / canonical-to-1 / noindex) | |
| IL18 | Top 10 commercial pages each have 20+ inbound internal links | |
| IL19 | Top traffic pages link to conversion pages contextually | |
| IL20 | Editorial workflow includes inbound-link placement step | |
| IL21 | "Updated old posts" pattern practiced when new content publishes | |
| IL22 | nofollow not used on internal links (no exceptions, or documented exceptions) | |
| IL23 | Cluster gap analysis completed in last 90 days | |
| IL24 | Internal link audit run in last 30 days | |
| IL25 | Cross-domain links treated correctly per multibrand strategy | |

Maximum score: 25. World-class: 23+/25.


16. Common Mistakes

  1. No clear hub pages. Every page is at the same level; nothing accumulates topical authority.
  2. Spokes that don't link back to the hub. Cluster signal lost.
  3. "Click here" / "read more" anchors. Wastes the topical signal on every internal link.
  4. Orphan pages. Indexed but invisible to the link graph.
  5. Crawl depth over 5. Important pages buried.
  6. Broken internal links unfixed. Cumulative crawl budget waste.
  7. Internal links to redirect chains. Same — extra hop.
  8. Mega-menu on a small site. Diluting equity on every page.
  9. Faceted URLs without robots / canonical strategy. Crawl budget exploded by combinatorial parameters.
  10. No breadcrumbs. Free internal-linking layer ignored.
  11. Editorial publishes new content without inbound link placement. Page launches as orphan.
  12. Old content never updated. Anchors mention things that should now link to newer pages.
  13. Anchor text identical on every internal link to a page. Looks manipulated; sometimes triggers algorithmic suppression.
  14. Footer linking to every page on the site. Equity diluted; signals devalued.
  15. Treating cross-domain links as internal. Equity does not flow.
  16. Infinite-scroll archives with no traditional link path. Pages discoverable only by JS-rendered scroll events.
  17. JavaScript-only navigation. Some crawlers see it; others don't.
  18. Cross-cluster spoke-to-spoke linking everywhere. Weakens cluster boundaries.

17. Maintenance

Weekly:

  • Spot-check newly published content for inbound and outbound links
  • Verify new content added to relevant hub page

Monthly:

  • Sitewide broken-link scan (Screaming Frog or alternative)
  • Orphan-page report review
  • "Updated old posts" pass for that month's new content
  • Anchor-text spot audit on top 10 pages

Quarterly:

  • Comprehensive internal-link audit
  • Crawl depth review (Sitebulb)
  • Cluster gap analysis
  • Hub page link inventory refresh
  • Faceted navigation audit (if applicable)
  • Pagination strategy review

Annually:

  • Full link graph visualization (Gephi / yEd)
  • Cluster structure review against keyword research
  • Site architecture review (do hubs still match priorities?)
  • Multibrand cross-domain linking review (if applicable)

18. Companion Documents

  • framework-keywordresearch.md — Topic clusters depend on keyword research
  • framework-schema.md — BreadcrumbList implementation
  • framework-technicalseo.md — Crawl depth, faceted URL handling, pagination
  • framework-ecommerceseo.md — Faceted navigation deep dive
  • framework-newsseo.md — Editorial linking patterns
  • framework-uxseo.md — Navigation UX patterns supporting SEO
  • framework-hcs.md — Topical depth as a Helpful Content signal
  • framework-eeat.md — Topical authority as an Expertise signal
  • framework-pageexperience.md — Navigation tap-target sizing

Document version: 1.0
Last updated: 2026-05-05
Owner: Joseph W. Anady — ThatDeveloperGuy — SDVOSB


Source

Canonical: https://www.thatdevpro.com/insights/framework-internallinking/

ThatDevPro is an SDVOSB-certified veteran-owned web + AI engineering studio.
