Why Internal Links Matter
Internal links distribute page authority, help Google discover content, and keep users engaged. For 80+ articles, manual linking is unsustainable.
Tag-Based Link Graph
Every article has tags in frontmatter. For each article, the algorithm scores every other article by the number of tags they share and keeps the top six:
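```javascript
// Build a map from each article's slug to its most closely related articles,
// ranked by the number of tags they share.
function buildLinkMap(articles) {
  const linkMap = {};
  for (const article of articles) {
    const related = articles
      .filter(a => a.slug !== article.slug) // skip the article itself
      .map(a => ({
        slug: a.slug,
        // number of tags the two articles have in common
        overlap: a.tags.filter(t => article.tags.includes(t)).length
      }))
      .filter(a => a.overlap > 0) // drop articles with no shared tags
      .sort((a, b) => b.overlap - a.overlap) // most shared tags first
      .slice(0, 6); // keep the top six
    linkMap[article.slug] = related;
  }
  return linkMap;
}
```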
Output Format
The script writes the link map to a TypeScript file, which components import to render each article's related links.
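The generation step might look roughly like the sketch below. The helper name `renderLinkMapModule`, the export name `relatedArticles`, and the `RelatedArticle` type are assumptions for illustration; the original script may shape its output differently.

```javascript
// Hypothetical helper: turns a link map into the source text of a TypeScript module.
// The export name and the type shape are illustrative assumptions.
function renderLinkMapModule(linkMap) {
  const header = [
    '// AUTO-GENERATED FILE. Do not edit by hand.',
    'export interface RelatedArticle { slug: string; overlap: number }',
    '',
  ].join('\n');
  // JSON output is also a valid TypeScript object literal, so no custom serializer is needed.
  const body = JSON.stringify(linkMap, null, 2);
  return `${header}\nexport const relatedArticles: Record<string, RelatedArticle[]> = ${body};\n`;
}
```

The returned string can then be written to disk (for example with `fs.writeFileSync`) at whatever path the components import from.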
Dry Run Mode
Run with --dry-run to preview changes before committing.
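One way such a flag could be wired up is sketched below. Only the `--dry-run` flag name comes from the article; the helper `writeOrPreview`, its return messages, and reading the flag from `process.argv` are assumptions.

```javascript
const fs = require('fs');

// Hypothetical helper: writes the generated file unless dry-run mode is on.
// Returns a short description of what was done, or what would have been done.
function writeOrPreview(filePath, contents, dryRun) {
  const previous = fs.existsSync(filePath) ? fs.readFileSync(filePath, 'utf8') : null;
  if (previous === contents) return `unchanged: ${filePath}`;
  if (dryRun) return `[dry-run] would write ${contents.length} characters to ${filePath}`;
  fs.writeFileSync(filePath, contents);
  return `wrote: ${filePath}`;
}

// Assumption: the flag is passed on the command line, e.g. `node script.js --dry-run`.
const dryRun = process.argv.includes('--dry-run');
```

Comparing against the existing file first means a dry run only reports files that would actually change.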
Impact
After implementing automated internal linking:
- Average internal links per article: up from 1.2 to 5.4
- Pages with zero internal links: down from 23 to 0
- Google discovered 17 more pages within two weeks
Top comments (2)
This is honestly the only scalable way to handle internal linking once content grows.
I tried manual linking before—missed so many pages. Tag-based linking like this keeps everything connected without extra effort 👍
The tag-overlap scoring is a solid foundation. One extension worth adding: a title/keyword similarity pass on top of tag matching. Tag overlap catches topically related content when tagging is consistent, but two articles can be closely related without sharing exact tags.
A simple approach is to tokenize titles, strip stopwords, and compute Jaccard similarity on the resulting token sets. You combine the two scores:
finalScore = (tagOverlap * 0.7) + (titleSimilarity * 0.3)
This catches cases where articles about the same subject use different tag vocabularies.
Another consideration: recency weighting. Newer content often deserves a slight boost in the related set, especially for evergreen topics where you have both a 3-year-old foundational post and a recent update. A simple exponential decay function on publish date, blended with the content score, handles this.
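A minimal sketch of the title similarity and recency ideas, under some assumptions: the tag score is normalized to the 0..1 range (Jaccard on tags rather than a raw count) so that it shares a scale with title similarity, the stopword list is a tiny placeholder, and the one-year half-life and 10% recency boost are arbitrary defaults. All function names and the `title` and `publishedAt` fields are hypothetical.

```javascript
// Small illustrative stopword list; a real one would be longer.
const STOPWORDS = new Set(['a', 'an', 'the', 'to', 'for', 'of', 'in', 'and', 'with', 'how']);

// Lowercase the title, split on non-alphanumeric characters, and drop stopwords.
function titleTokens(title) {
  return new Set(
    title.toLowerCase().split(/[^a-z0-9]+/).filter(t => t && !STOPWORDS.has(t))
  );
}

// Jaccard similarity: size of the intersection divided by size of the union (0..1).
function jaccard(setA, setB) {
  if (setA.size === 0 && setB.size === 0) return 0;
  let shared = 0;
  for (const item of setA) if (setB.has(item)) shared++;
  return shared / (setA.size + setB.size - shared);
}

// Exponential decay on age: 1 for a brand-new post, 0.5 after one half-life.
// The one-year half-life is an assumed default.
function recencyWeight(publishedAt, now = new Date(), halfLifeDays = 365) {
  const ageDays = Math.max(0, (now - new Date(publishedAt)) / 86400000);
  return Math.pow(0.5, ageDays / halfLifeDays);
}

// Blend: 0.7 * tag score + 0.3 * title score, then a small multiplicative recency boost.
// Assumption: both scores are in 0..1, and recency is blended multiplicatively.
function relatedScore(article, candidate, now = new Date(), recencyBoost = 0.1) {
  const tagScore = jaccard(new Set(article.tags), new Set(candidate.tags));
  const titleScore = jaccard(titleTokens(article.title), titleTokens(candidate.title));
  const content = tagScore * 0.7 + titleScore * 0.3;
  return content * (1 + recencyBoost * recencyWeight(candidate.publishedAt, now));
}
```

Sorting candidates by `relatedScore` instead of raw tag overlap would slot into the existing `.sort(...)` step. Note that the tokenizer as written only keeps ASCII letters and digits, so non-English titles would need a different split rule.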
For anyone working specifically on Shopify stores and wanting this pattern without building the tooling, I built Better Related Blog Posts (apps.shopify.com/better-related-blog-posts) that applies exactly this tag + keyword scoring to Shopify's blog articles. Disclosure: I'm the developer. It surfaces a configurable related posts widget at the end of each blog post, which has a measurable impact on pages per session for content-heavy stores.
The broader insight from your benchmarks - internal links going from 1.2 to 5.4 per article - is the number that should convince any content team this is worth automating. Manual linking at scale just doesn't happen consistently.