Building a Developer Knowledge Base That Scaled to 2,800+ Articles: Architecture & Lessons Learned
When we started building our developer knowledge base, we had a simple question: how do you serve thousands of technical articles with fast load times, great SEO, and maintainable architecture?
Nine months later, the answer covers Next.js, Supabase, smart caching, and some hard-won lessons about scale. Here's what actually worked.
The Numbers That Matter
- 2,849 published articles across 31 categories
- 1,815 pages indexed by Google (still growing)
- Sub-2-second page loads on all content pages
- Zero server costs for content delivery (static-first architecture)
These aren't vanity metrics — they're the result of deliberate architectural choices.
The Tech Stack
Next.js 15 — Static-First, Dynamic When Needed
We use Incremental Static Regeneration (ISR) as the default. Article pages are generated at build time, then regenerated in the background, either when the one-hour revalidate window expires or on demand when content changes.
/articles/[slug] → ISR (revalidate: 3600)
/kb → SSG with pagination
/api/* → Edge functions for search
The key insight: 95% of traffic hits content pages. Those should be static. Only search, newsletter signup, and checkout need dynamic rendering.
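As a rough sketch of what the route configuration for the article page could look like in the Next.js App Router: revalidate, dynamicParams, and generateStaticParams are standard route segment exports, while listPublishedSlugs and its hard-coded slugs are hypothetical stand-ins for whatever query the real site runs against its metadata table. The page component itself is omitted.

```typescript
// Sketch of the config portion of app/articles/[slug]/page.tsx.
// Assumption: listPublishedSlugs is a hypothetical helper; a real version
// would read slugs from the metadata table instead of a hard-coded list.

// Serve the cached page and regenerate it in the background at most once per hour.
export const revalidate = 3600;

// Slugs that were not prerendered at build time are rendered on first request.
export const dynamicParams = true;

async function listPublishedSlugs(): Promise<string[]> {
  return ["optimize-postgresql-queries", "isr-vs-ssr"]; // placeholder data
}

// Tells Next.js which article pages to prerender at build time.
export async function generateStaticParams(): Promise<{ slug: string }[]> {
  const slugs = await listPublishedSlugs();
  return slugs.map((slug) => ({ slug }));
}
```

With thousands of articles, one option is to return only the most-visited slugs here and let dynamicParams fill in the rest on first request, which keeps build times short.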
Supabase + Prisma — The Data Layer
Article metadata lives in PostgreSQL via Supabase, while the article bodies are stored as Markdown files in Git. The metadata table holds:
- Article slugs and categories
- SEO titles and descriptions (long-tail keyword optimized)
- Publication dates and canonical URLs
- Simhash fingerprints for duplicate detection
Why this split? Content in Git = version control, easy editing, free hosting. Metadata in DB = fast queries, faceted search, analytics.
Vercel — The Hosting
Edge network for static assets. Serverless functions for dynamic routes. Built-in ISR support. The combination is hard to beat for content-heavy sites.
Architecture Decisions That Paid Off
1. Pagination at the Database Level
With 2,849 articles, loading every title in a single request was impractical. We implemented:
- 12 articles per page on knowledge base index
- URL-based pagination (?page=2&category=ai-llm)
- Category filtering in the same query
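// Supabase query pattern: one round trip returns the page rows plus the total count.
// `page` and `selectedCategory` come from the URL query string.
const pageSize = 12;
const offset = (page - 1) * pageSize;
const { data, count, error } = await supabase
  .from('articles')
  .select('*', { count: 'exact' })
  .eq('published', true)
  .eq('category', selectedCategory)
  .order('publishedAt', { ascending: false })
  // .range() is inclusive on both ends, so this returns up to 12 rows.
  .range(offset, offset + pageSize - 1);
if (error) throw error;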
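The arithmetic that turns a ?page= value into the inclusive bounds expected by Supabase's .range() is easy to get off by one, so here is a minimal sketch of it. The helper names (parsePage, pageToRange, totalPages) are illustrative and not taken from the original codebase.

```typescript
const PAGE_SIZE = 12; // articles per page on the knowledge base index

interface PageRange {
  from: number;
  to: number;
}

// Parse the ?page= query value, falling back to page 1 for missing or invalid input.
function parsePage(raw: string | null | undefined): number {
  const n = parseInt(raw ?? "", 10);
  return !isNaN(n) && n >= 1 ? n : 1;
}

// .range(from, to) is inclusive on both ends, so a page of 12 spans from..from + 11.
function pageToRange(page: number, pageSize: number = PAGE_SIZE): PageRange {
  const from = (page - 1) * pageSize;
  return { from, to: from + pageSize - 1 };
}

// Number of pages needed for `count` rows (the exact count returned by the query).
function totalPages(count: number, pageSize: number = PAGE_SIZE): number {
  return Math.max(1, Math.ceil(count / pageSize));
}
```

For example, pageToRange(2) yields { from: 12, to: 23 }, and totalPages(2849) comes out to 238.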
2. SEO as a First-Class Feature
Every article has:
- Long-tail keyword optimized titles (e.g., "How to Optimize PostgreSQL Queries in Supabase for Large Datasets" instead of "PostgreSQL Tips")
- Meta descriptions generated per-article
- JSON-LD structured data for rich search results
- Canonical URLs to avoid duplicate content issues
Result: 1,815 pages indexed by Google with zero paid promotion.
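To make the structured-data point concrete, here is one way the JSON-LD for an article could be assembled from its metadata. This is a sketch under assumptions: the ArticleSeo shape and both function names are hypothetical, and TechArticle is one reasonable schema.org type for this kind of content rather than a confirmed choice.

```typescript
// Hypothetical shape of the per-article SEO metadata.
interface ArticleSeo {
  title: string;
  description: string;
  canonicalUrl: string;
  publishedAt: string; // ISO 8601 date string
}

// Build a schema.org TechArticle object from article metadata.
function buildArticleJsonLd(article: ArticleSeo): Record<string, unknown> {
  return {
    "@context": "https://schema.org",
    "@type": "TechArticle",
    headline: article.title,
    description: article.description,
    datePublished: article.publishedAt,
    mainEntityOfPage: { "@type": "WebPage", "@id": article.canonicalUrl },
  };
}

// Serialize for a <script type="application/ld+json"> tag. Escaping "<" keeps a
// title containing "</script>" from closing the script element early.
function toJsonLdScript(article: ArticleSeo): string {
  return JSON.stringify(buildArticleJsonLd(article)).replace(/</g, "\\u003c");
}
```

Pointing mainEntityOfPage at the canonical URL keeps the structured data consistent with the canonical tag on the same page.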
3. Newsletter Integration Without Complexity
A compact newsletter signup form on every article page, writing to a Supabase table. Simple, effective, growing organically.
What Didn't Work (And What We Changed)
Mistake 1: Dynamic Rendering Everything
Initially, every page was SSR. This meant every request hit the database. At 100 articles, fine. At 2,849, expensive and slow.
Fix: Moved to ISR for content pages, SSR only for search and user-specific pages.
Mistake 2: No Duplicate Detection
When we batch-generated content, duplicates slipped through — same topic, slightly different angle, indexed separately.
Fix: Implemented simhash-based deduplication in the batch pipeline. New articles are fingerprinted before publishing.
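For readers unfamiliar with simhash, the idea can be sketched in a few lines. This is an illustrative 32-bit version over single-word tokens, not the pipeline's actual implementation: production setups typically use 64-bit fingerprints over word shingles, and the distance threshold of 3 is an assumption chosen only for the example.

```typescript
// FNV-1a-style 32-bit hash over UTF-16 code units; any well-mixed hash would do.
function fnv1a32(token: string): number {
  let hash = 0x811c9dc5;
  for (let i = 0; i < token.length; i++) {
    hash ^= token.charCodeAt(i);
    hash = Math.imul(hash, 0x01000193);
  }
  return hash >>> 0;
}

// Lowercase and split on anything that is not a letter or digit.
function tokenize(text: string): string[] {
  return text.toLowerCase().split(/[^a-z0-9]+/).filter((t) => t.length > 0);
}

// Each token votes +1 or -1 on every bit position; the sign of the tally sets the bit.
function simhash32(text: string): number {
  const weights = new Array<number>(32).fill(0);
  for (const token of tokenize(text)) {
    const h = fnv1a32(token);
    for (let bit = 0; bit < 32; bit++) {
      weights[bit] += (h >>> bit) & 1 ? 1 : -1;
    }
  }
  let fingerprint = 0;
  for (let bit = 0; bit < 32; bit++) {
    if (weights[bit] > 0) fingerprint |= 1 << bit;
  }
  return fingerprint >>> 0;
}

// Number of differing bits between two fingerprints.
function hammingDistance(a: number, b: number): number {
  let x = (a ^ b) >>> 0;
  let count = 0;
  while (x !== 0) {
    x = (x & (x - 1)) >>> 0; // clear the lowest set bit
    count++;
  }
  return count;
}

// Assumed threshold: fingerprints within 3 bits are flagged for review.
function isNearDuplicate(a: string, b: string, maxDistance: number = 3): boolean {
  return hammingDistance(simhash32(a), simhash32(b)) <= maxDistance;
}
```

Because similar token sets produce fingerprints that differ in only a few bits, a new article can be compared against stored fingerprints without loading any article bodies.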
Mistake 3: Ignoring Category Imbalance
Some categories had 87 articles. Others had 5. This hurt both user experience and SEO.
Fix: Targeted content generation, with a goal of 40+ articles per category. We are not there yet, but all 31 categories now have at least 28 articles.
The SEO Results (Without Paid Promotion)
- 1,815 indexed pages on Google
- Sitemap auto-generated and submitted to Search Console
- AI crawlers (GPTBot, PerplexityBot, ClaudeBot) all allowed via robots.txt
- Long-tail keyword strategy: "How to [specific problem] in [specific technology]" titles outperform generic ones by 3-5x in click-through rate
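The crawler policy above amounts to a small robots.txt. Here is a hedged sketch of generating it; buildRobotsTxt is a hypothetical helper, and the /sitemap.xml path is an assumption about where the auto-generated sitemap lives.

```typescript
// AI crawlers named in the list above.
const AI_CRAWLERS = ["GPTBot", "PerplexityBot", "ClaudeBot"];

// Build a robots.txt body that allows all crawlers, names the AI crawlers
// explicitly, and advertises the sitemap location.
function buildRobotsTxt(siteUrl: string, aiCrawlers: string[] = AI_CRAWLERS): string {
  const base = siteUrl.replace(/\/+$/, ""); // drop trailing slashes
  const groups = ["*", ...aiCrawlers].map(
    (agent) => `User-agent: ${agent}\nAllow: /`
  );
  return [...groups, `Sitemap: ${base}/sitemap.xml`].join("\n\n") + "\n";
}
```

Listing the AI crawlers separately is redundant with the wildcard group, but it makes the intent explicit and gives one obvious place to change the policy for a single bot later.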
The Cost
Here's the honest breakdown:
- Hosting (Vercel): $0 (Pro tier covers our traffic)
- Database (Supabase): $0 (Free tier, 500MB, plenty for metadata)
- Domain: $12/year
- Total: Essentially free for serving 2,849 articles
Lessons for Anyone Building Content at Scale
- Static > Dynamic for content. ISR is your friend.
- Invest in SEO metadata early. Retrofitting 2,000 articles is painful.
- Deduplicate before you publish. Simhash is cheap and effective.
- Balance your categories. Both users and search engines notice.
- Keep the stack boring. Next.js + Supabase + Vercel is not sexy. It works.
What's Next
We're working on:
- AI-powered article recommendations based on reading patterns
- Community contributions with PR-based workflow
- Broader multi-language support (currently en/zh)
- Analytics dashboard for content performance
Built with Next.js 15, Supabase, and a lot of coffee. If you're building a content platform or knowledge base, happy to share more details — drop a comment.