When we set out to build startup-cost.com, we knew traditional WordPress wouldn't cut it. We needed to serve 79,000+ unique pages - one for every combination of 479 cities and 167 business types - with real cost data, real-time calculations, and solid performance.
Most people hear "80K pages on WordPress" and assume we're crazy. WordPress is a blogging platform, right? Well, yes - but under the hood it's a flexible PHP framework with a powerful rewrite engine. We just had to throw away the parts that don't scale and build our own.
Here's the story of how we did it without a single row in wp_posts.
The Problem with wp_posts at Scale
WordPress stores all content in a single table called wp_posts. For a blog or a small business site with a few hundred pages, this works fine. But when you start pushing tens of thousands of rows into that table, things fall apart quickly:
-
Query performance degrades - WordPress joins
wp_postswithwp_postmetafor almost every query. With 80K posts, each with 10+ meta fields, you're looking at 800K+ rows in postmeta alone. Queries that used to take 5ms now take 500ms. - The admin panel becomes unusable - Try loading the "All Posts" screen with 80K entries. WordPress paginates, sure, but even counting the total takes forever.
- Revision history eats disk - WordPress auto-saves revisions. With programmatic content that gets regenerated, you end up with 3-4x the actual content in revision rows.
- XML sitemaps choke - Popular sitemap plugins try to query all posts at once. With 80K rows, they either timeout or consume all available memory.
- Imports and exports break - WordPress export generates a single XML file. Good luck with an 80K-post WXR file.
We needed a fundamentally different approach.
Why Not Use a Static Site Generator or a Different CMS?
Fair question. We considered Hugo, Next.js, and even a custom Node.js app. But WordPress gave us specific advantages:
- Hosting is dirt cheap - Shared WordPress hosting costs a few dollars a month and handles our traffic fine
- The plugin ecosystem - We still use various utility plugins for SEO settings, caching, and other tasks
- Familiar deployment - Our team knows WordPress inside out. No learning curve, no new CI/CD pipeline needed
- PHP is actually fast enough - With opcache and a clean query pattern, PHP 8 serves pages in double-digit milliseconds
The key insight was: we don't have to use WordPress the way it was designed. We can use it as a routing and rendering framework while storing our data however we want.
The Architecture - High Level
We built a custom WordPress plugin that bypasses wp_posts entirely. The concept has three pillars:
1. Custom Database Tables
Instead of stuffing everything into posts and postmeta, we created dedicated tables with proper schemas, data types, and indexes. Think of it as designing a mini-application database that lives inside WordPress's MySQL instance.
The key principle: each entity type (cities, business types, cost metrics) gets its own table with columns that match the actual data model - not the generic key-value pairs that postmeta forces you into. This means proper indexing, proper normalization, and queries that hit exactly the data they need.
The performance difference is massive. A meta-based lookup on 80K posts might take hundreds of milliseconds. A direct indexed query on a purpose-built table returns in single-digit milliseconds.
2. Virtual URL Routing
WordPress has a powerful but underused rewrite API. Most developers only interact with it through the permalink settings screen. But under the hood, you can register completely custom URL patterns that map to your own rendering logic.
We define URL patterns that WordPress recognizes and routes to our plugin. When a request comes in, WordPress matches the URL against our patterns, extracts the relevant slugs, and hands control to our code. No post is created. No database row in wp_posts is touched. The URL exists purely because we told WordPress to recognize the pattern.
This approach gives us full control over URL structure while still benefiting from WordPress's request lifecycle, caching hooks, and plugin compatibility.
3. Dynamic Content Generation
This is what makes programmatic SEO actually work. We don't just slap the same text on 80K pages with city names swapped - each page has genuinely different data. Real cost figures, real calculations, real comparisons.
The rendering layer pulls data from our custom tables, runs computations specific to each city-business combination, and outputs a fully formed HTML page with all the SEO elements (unique title, meta description, schema markup, etc.).
Every page displays different numbers because the underlying data is different. Google is smart enough to detect thin, templated content. Genuine value on each page is what makes programmatic SEO work long-term.
Performance Results
We benchmarked both approaches on the same server (shared hosting, PHP 8.1, MariaDB 10.6):
| Metric | wp_posts approach | Our custom approach |
|---|---|---|
| Average page load (TTFB) | ~800ms | ~120ms |
| Database query time | ~200ms (with meta joins) | ~15ms (indexed lookup) |
| Admin dashboard load | 30+ seconds | Instant (no admin overhead) |
| Sitemap generation | Timeout at 60s | ~2 seconds |
| Memory usage per request | ~64MB | ~12MB |
The difference is dramatic. Custom tables with proper indexes make WordPress perform like a purpose-built application.
Sitemap Strategy
Google's sitemap limit is 50,000 URLs per file and 50MB uncompressed. With 80K pages, we need multiple sitemaps plus a sitemap index.
We generate sitemaps programmatically, chunking URLs into manageable files, and create a sitemap index that references all of them. The generation runs on a weekly cron. Google Search Console picks them up without issues, and we can track indexing progress per sitemap chunk.
Internal Linking
With 80K pages, internal linking is critical both for SEO and for helping users navigate:
- Each city page links to all business types available in that city
- Each business page links to top cities for that business type
- Each detail page links to related pages based on geographic and thematic proximity
- Dense internal link graph helps search engines discover and understand the site structure
Caching
We use a multi-tier caching strategy. Not all 80K pages get equal traffic - popular city/business combinations are pre-cached with longer TTLs, while long-tail pages are cached on demand with shorter TTLs. We also cache at the database query level for frequently accessed aggregations.
What We Learned
WordPress can handle massive scale - but only if you step outside its default content model. The platform is more flexible than people give it credit for.
Custom tables are not "hacky" - They're the right tool when your data doesn't fit the post/meta pattern. WordPress itself uses custom tables for comments, users, and options.
Virtual routing is powerful - WordPress rewrite rules can handle complex URL patterns. You don't need a custom framework just because your URLs don't map to posts.
Programmatic SEO needs real value - Google detects thin, templated content. Every page needs genuinely different, useful data. That's the difference between spam and a real product.
Start with the data model - We spent more time designing our database schema than writing rendering code. Good data modeling pays off at every layer.
Monitor, then optimize - We started without caching and added it only where monitoring showed bottlenecks. Premature optimization would have added complexity we didn't need.
Check out the result: startup-cost.com - startup cost estimates for 167 business types across 479 cities worldwide.
Built by Kavela Ltd - we build data-driven web tools at scale.
Top comments (0)