DEV Community

Tugelbay Konabayev
Originally published at about-kazakhstan.com

Automated Internal Linking with Tag-Based Content Graphs

Why Internal Links Matter

Internal links distribute page authority, help Google discover content, and keep users engaged. On a site with 80+ articles, maintaining those links by hand is unsustainable.

Tag-Based Link Graph

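Every article declares tags in its frontmatter. For each article, the algorithm counts the tags it shares with every other article, drops candidates with no shared tags, and keeps the six with the highest overlap: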

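function buildLinkMap(articles) {
  const linkMap = {};
  for (const article of articles) {
    // Set lookup avoids rescanning article.tags for every candidate tag.
    const ownTags = new Set(article.tags);
    const related = articles
      // An article never links to itself.
      .filter(a => a.slug !== article.slug)
      // Score each candidate by the number of tags it shares with this article.
      .map(a => ({
        slug: a.slug,
        overlap: a.tags.filter(t => ownTags.has(t)).length
      }))
      // Drop unrelated candidates, rank by overlap, keep the top six.
      .filter(a => a.overlap > 0)
      .sort((a, b) => b.overlap - a.overlap)
      .slice(0, 6);
    linkMap[article.slug] = related;
  }
  return linkMap;
}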
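
As a worked example, the snippet below runs the function on four made-up articles. The slugs and tags are hypothetical sample data, and the function is repeated in compact form only so the snippet runs on its own.

```javascript
// Compact copy of buildLinkMap so this snippet is self-contained.
function buildLinkMap(articles) {
  const linkMap = {};
  for (const article of articles) {
    linkMap[article.slug] = articles
      .filter(a => a.slug !== article.slug)
      .map(a => ({
        slug: a.slug,
        overlap: a.tags.filter(t => article.tags.includes(t)).length
      }))
      .filter(a => a.overlap > 0)
      .sort((a, b) => b.overlap - a.overlap)
      .slice(0, 6);
  }
  return linkMap;
}

// Hypothetical sample data; real input would come from parsed frontmatter.
const articles = [
  { slug: 'almaty-guide', tags: ['almaty', 'travel', 'cities'] },
  { slug: 'astana-guide', tags: ['astana', 'travel', 'cities'] },
  { slug: 'kazakh-food', tags: ['food', 'culture'] },
  { slug: 'almaty-restaurants', tags: ['almaty', 'food', 'travel'] }
];

const linkMap = buildLinkMap(articles);

console.log(linkMap['kazakh-food']);
// → [ { slug: 'almaty-restaurants', overlap: 1 } ]

console.log(linkMap['almaty-guide'].map(r => r.slug));
// → [ 'astana-guide', 'almaty-restaurants' ]
```

Both candidates for almaty-guide share two tags with it, so they keep their input order. If ties should resolve differently, a secondary sort key (publish date, for instance) would be needed.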

Output Format

The script generates a TypeScript file with related articles that components import for rendering.
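
The article does not show the generated file, so the following is only a sketch of one way it could look. The helper name renderLinkModule, the RelatedArticle interface, and the relatedArticles export are all assumptions made for illustration.

```javascript
// Hypothetical helper: turns a link map into TypeScript source text.
// JSON is valid TypeScript object-literal syntax, so the map can be
// serialized with JSON.stringify and embedded directly.
function renderLinkModule(linkMap) {
  const body = JSON.stringify(linkMap, null, 2);
  return [
    '// Auto-generated file. Do not edit by hand.',
    'export interface RelatedArticle {',
    '  slug: string;',
    '  overlap: number;',
    '}',
    '',
    `export const relatedArticles: Record<string, RelatedArticle[]> = ${body};`,
    ''
  ].join('\n');
}

// Sample input in the shape returned by buildLinkMap.
const source = renderLinkModule({
  'kazakh-food': [{ slug: 'almaty-restaurants', overlap: 1 }]
});
console.log(source);
```

A component could then import relatedArticles and look up the current slug to render its related links.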

Dry Run Mode

Run with --dry-run to preview changes before committing.
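
How the flag is handled is not shown in the article. A minimal sketch, assuming the script writes a single generated file: the helper writeGenerated and the output filename are hypothetical, and the demo writes to the system temp directory so it is safe to run as-is.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

// Hypothetical helper: writes the generated file unless dryRun is set.
// Returns true when the contents differ from what is already on disk.
function writeGenerated(outPath, contents, { dryRun = false } = {}) {
  const previous = fs.existsSync(outPath) ? fs.readFileSync(outPath, 'utf8') : null;
  const changed = previous !== contents;
  if (!changed) {
    console.log(`${outPath}: up to date`);
  } else if (dryRun) {
    // Report what would happen without touching the file system.
    console.log(`[dry run] would write ${contents.length} characters to ${outPath}`);
  } else {
    fs.writeFileSync(outPath, contents);
    console.log(`wrote ${outPath}`);
  }
  return changed;
}

// The flag name comes from the article; the output path is an assumption.
const dryRun = process.argv.includes('--dry-run');
const outPath = path.join(os.tmpdir(), 'related-articles.generated.ts');
writeGenerated(outPath, 'export const relatedArticles = {};\n', { dryRun });
```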

Impact

After implementing automated internal linking:

  • Average internal links per article rose from 1.2 to 5.4
  • Pages with zero internal links fell from 23 to 0
  • Google discovered 17 more pages within two weeks

Top comments (2)

Bhavin Sheth

This is honestly the only scalable way to handle internal linking once content grows.

I tried manual linking before—missed so many pages. Tag-based linking like this keeps everything connected without extra effort 👍

Peter Hallander

The tag-overlap scoring is a solid foundation. One extension worth adding: a title/keyword similarity pass on top of tag matching. Tag overlap catches topically related content when tagging is consistent, but two articles can be closely related without sharing exact tags.

A simple approach is to tokenize titles, strip stopwords, and compute Jaccard similarity on the resulting token sets. You combine the two scores: finalScore = (tagOverlap * 0.7) + (titleSimilarity * 0.3). This catches cases where articles about the same subject use different tag vocabularies.
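
A minimal sketch of that blend, under stated assumptions: the stopword list is a tiny illustrative one, the function names are hypothetical, and tagOverlap is used as a raw count exactly as in the formula above. Since Jaccard similarity stays between 0 and 1, a real implementation might first normalize the tag count to the same range.

```javascript
// Small illustrative stopword list; a real one would be longer.
const STOPWORDS = new Set(['a', 'an', 'the', 'of', 'in', 'to', 'and', 'for', 'with', 'on']);

// Lowercase the title, split on anything that is not a letter or digit,
// and drop empty strings and stopwords.
function tokenize(title) {
  return new Set(
    title
      .toLowerCase()
      .split(/[^\p{L}\p{N}]+/u)
      .filter(word => word && !STOPWORDS.has(word))
  );
}

// Jaccard similarity: size of the intersection divided by size of the union.
function jaccard(a, b) {
  if (a.size === 0 && b.size === 0) return 0;
  let shared = 0;
  for (const token of a) {
    if (b.has(token)) shared++;
  }
  return shared / (a.size + b.size - shared);
}

// Weights 0.7 and 0.3 are the ones given in the formula above.
function finalScore(tagOverlap, titleA, titleB) {
  const titleSimilarity = jaccard(tokenize(titleA), tokenize(titleB));
  return tagOverlap * 0.7 + titleSimilarity * 0.3;
}

console.log(jaccard(tokenize('Best Restaurants in Almaty'), tokenize('Almaty Restaurants Guide')));
// → 0.5 (two shared tokens out of four distinct ones)
```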

Another consideration: recency weighting. Newer content often deserves a slight boost in the related set, especially for evergreen topics where you have both a 3-year-old foundational post and a recent update. A simple exponential decay function on publish date blended with the content score handles this.
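
One way this could look, as a sketch only: the one-year half-life, the 20% recency share, and both function names are assumptions chosen for illustration and are not values given in the comment.

```javascript
const MS_PER_DAY = 24 * 60 * 60 * 1000;

// Exponential decay on age: 1 for a post published now,
// 0.5 after one half-life, 0.25 after two, and so on.
function recencyWeight(publishedAt, now, halfLifeDays = 365) {
  const ageDays = Math.max(0, (now - publishedAt) / MS_PER_DAY);
  return Math.pow(0.5, ageDays / halfLifeDays);
}

// Blend: recencyShare is the fraction of the content score that depends
// on recency, so even a very old post keeps (1 - recencyShare) of its score.
function withRecency(contentScore, publishedAt, now, recencyShare = 0.2) {
  const weight = recencyWeight(publishedAt, now);
  return contentScore * (1 - recencyShare + recencyShare * weight);
}

const now = new Date(Date.UTC(2024, 0, 1));
const oneYearAgo = new Date(Date.UTC(2023, 0, 1));

console.log(recencyWeight(oneYearAgo, now));
// → 0.5 (exactly one half-life old)
```

Keeping the recency share small means a strong topical match on an older foundational post still outranks a weak match on a new one.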

For anyone working specifically on Shopify stores and wanting this pattern without building the tooling, I built Better Related Blog Posts (apps.shopify.com/better-related-blog-posts) that applies exactly this tag + keyword scoring to Shopify's blog articles. Disclosure: I'm the developer. It surfaces a configurable related posts widget at the end of each blog post, which has a measurable impact on pages per session for content-heavy stores.

The broader insight from your benchmarks - internal links going from 1.2 to 5.4 per article - is the number that should convince any content team this is worth automating. Manual linking at scale just doesn't happen consistently.