<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: kavela</title>
    <description>The latest articles on DEV Community by kavela (@kavelaltd).</description>
    <link>https://dev.to/kavelaltd</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3841559%2Fe6231d4d-9c33-45b2-a533-74abcfbc6048.png</url>
      <title>DEV Community: kavela</title>
      <link>https://dev.to/kavelaltd</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kavelaltd"/>
    <language>en</language>
    <item>
      <title>Pruning a Sitemap From 327,621 URLs to 2,483 (And Why That Was the Right Call)</title>
      <dc:creator>kavela</dc:creator>
      <pubDate>Thu, 07 May 2026 19:49:40 +0000</pubDate>
      <link>https://dev.to/kavelaltd/pruning-a-sitemap-from-327621-urls-to-2483-and-why-that-was-the-right-call-61o</link>
      <guid>https://dev.to/kavelaltd/pruning-a-sitemap-from-327621-urls-to-2483-and-why-that-was-the-right-call-61o</guid>
      <description>&lt;p&gt;&lt;a href="https://evrangeguide.com" rel="noopener noreferrer"&gt;evrangeguide.com&lt;/a&gt; is a programmatic SEO site for EV charging routes — 27 countries, 370 cities, 24 vehicles, and over two thousand canonical inter-city routes. When we totaled the URL space across every page-type combination, we got &lt;strong&gt;327,621 URLs&lt;/strong&gt;. We listed every one of them in the sitemap.&lt;/p&gt;

&lt;p&gt;Google's response was decisive: it indexed almost none of them, and the crawl-budget metrics in Search Console got worse week over week.&lt;/p&gt;

&lt;p&gt;This is the story of cutting that sitemap by 99.2% — down to 2,483 URLs — and why scarcity beat coverage.&lt;/p&gt;

&lt;h2&gt;
  
  
  The architecture
&lt;/h2&gt;

&lt;p&gt;ev-engine is the same family of virtual-routing plugin we use across &lt;a href="https://dev.to/kavelaltd/how-we-built-a-programmatic-seo-engine-serving-80k-pages-on-wordpress-without-using-wpposts-2kgn"&gt;the network&lt;/a&gt;. No &lt;code&gt;wp_posts&lt;/code&gt;. The data lives in custom tables:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Table&lt;/th&gt;
&lt;th&gt;Rows&lt;/th&gt;
&lt;th&gt;Holds&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_ev_routes&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;2,044&lt;/td&gt;
&lt;td&gt;inter-city routes (from-to pairs)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_ev_cities&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;370&lt;/td&gt;
&lt;td&gt;city metadata + lat/lng&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_ev_countries&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;27&lt;/td&gt;
&lt;td&gt;country hubs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_ev_vehicles&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;24&lt;/td&gt;
&lt;td&gt;EV models with battery + range specs&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_ev_route_stations&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;~2.36M&lt;/td&gt;
&lt;td&gt;charger stops along each route&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_ev_route_vehicles&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;147K&lt;/td&gt;
&lt;td&gt;per-vehicle stop plans (route × vehicle)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;From these, the router (&lt;code&gt;EV_Router&lt;/code&gt;) projects URL patterns:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/route/{from}-to-{to}/&lt;/code&gt; — 2,044 base routes&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/route/{from}-to-{to}/{vehicle}/&lt;/code&gt; — 2,044 × 24 = ~49K route-vehicle pages&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/route/{from}-to-{to}/{tier}/&lt;/code&gt; — long/mid/short-range tier views&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/city/{slug}/&lt;/code&gt;, &lt;code&gt;/country/{slug}/&lt;/code&gt;, &lt;code&gt;/ev/{vehicle}/&lt;/code&gt;, &lt;code&gt;/glamour/{slug}/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Index pages: &lt;code&gt;/routes/&lt;/code&gt;, &lt;code&gt;/cities/&lt;/code&gt;, &lt;code&gt;/countries/&lt;/code&gt;, &lt;code&gt;/vehicles/&lt;/code&gt;, &lt;code&gt;/glamour/&lt;/code&gt;
&lt;/li&gt;
&lt;li&gt;Multilingual prefixes for &lt;code&gt;tr/de/fr/es&lt;/code&gt; (UI-translated, body still EN)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Multiplied across language prefixes and tier slices, the total grew past 327K.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why we listed all of them
&lt;/h2&gt;

&lt;p&gt;The naive pSEO instinct: more URLs is more surface area. If a long-tail searcher types "Tesla Model 3 Berlin to Munich charging stops," we wanted a page tuned to &lt;em&gt;that exact query&lt;/em&gt;. So we generated &lt;code&gt;/route/berlin-to-munich/tesla-model-3/&lt;/code&gt; and 49K of its siblings, then submitted them all.&lt;/p&gt;

&lt;p&gt;Google's reality: for a new EV-niche domain with low brand authority, indexing budget is allocated as a function of trust × content uniqueness × discovery cost. We were maximizing the third variable while having close to zero of the first.&lt;/p&gt;

&lt;h2&gt;
  
  
  What the data showed
&lt;/h2&gt;

&lt;p&gt;Two months in, GSC had:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;~327K URLs &lt;strong&gt;discovered&lt;/strong&gt; but flagged "Crawled – currently not indexed" or "Discovered – currently not indexed"&lt;/li&gt;
&lt;li&gt;A handful of routes indexed; most of the indexed pages were the small set of &lt;em&gt;unique&lt;/em&gt; hubs (countries, vehicles)&lt;/li&gt;
&lt;li&gt;Crawl rate stuck low because Google was sampling thousands of near-duplicate route-vehicle and tier slices&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The route-vehicle pages were structurally near-duplicate: same route metadata, same map, same station list, with one paragraph of vehicle-specific stop math. The tier pages (long/mid/short-range) were even worse — they sliced the same route by an arbitrary battery-range threshold.&lt;/p&gt;

&lt;p&gt;We had built thousands of pages that, from Google's perspective, were dilutions of one good page.&lt;/p&gt;

&lt;h2&gt;
  
  
  The prune
&lt;/h2&gt;

&lt;p&gt;We kept what was structurally unique and dropped the rest from the sitemap:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Page type&lt;/th&gt;
&lt;th&gt;Kept in sitemap&lt;/th&gt;
&lt;th&gt;Behavior&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Routes&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;2,044&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;indexable, in sitemap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Cities&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;370&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;indexable, in sitemap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Countries&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;27&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;indexable, in sitemap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Vehicles&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;24&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;indexable, in sitemap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Glamour pages&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;18&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;curated highlights, in sitemap&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Route × tier&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;&lt;code&gt;noindex, follow&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Route × vehicle&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;&lt;code&gt;noindex, follow&lt;/code&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Multilingual (non-EN)&lt;/td&gt;
&lt;td&gt;0&lt;/td&gt;
&lt;td&gt;
&lt;code&gt;noindex&lt;/code&gt; (UI translation only, body EN)&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The templates that emit &lt;code&gt;noindex, follow&lt;/code&gt; (&lt;code&gt;route-range.php&lt;/code&gt;, &lt;code&gt;route-vehicle-weather.php&lt;/code&gt;) keep crawl flow going — internal links from these pages still pass discovery to the canonical route page — but the pages themselves do not compete for index slots. New sitemap total: &lt;strong&gt;2,483 URLs&lt;/strong&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The single-DB migration that made this possible
&lt;/h2&gt;

&lt;p&gt;A quick aside, because it's part of the same story. Until early May, the EV data tables (&lt;code&gt;wp_ev_*&lt;/code&gt;) lived on a separate Hostinger Business database, while the WP core tables lived on contabo. The plugin had a custom &lt;code&gt;ev_data_db()&lt;/code&gt; function that returned a second &lt;code&gt;wpdb&lt;/code&gt; connection.&lt;/p&gt;

&lt;p&gt;This worked for a year, then quietly broke after a Hostinger network change: cross-host queries started timing out under crawler load. The fix:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Whitelist contabo's IP on Hostinger Remote MySQL (&lt;code&gt;92.113.22.128&lt;/code&gt;, IPv4-forced — their host is IPv4-only).&lt;/li&gt;
&lt;li&gt;Dump the EV tables (88MB gzipped) and import into &lt;code&gt;wp_evrangeguide&lt;/code&gt; on contabo's local MariaDB.&lt;/li&gt;
&lt;li&gt;Patch &lt;code&gt;ev_data_db()&lt;/code&gt; to return &lt;code&gt;$wpdb&lt;/code&gt; directly. Single connection from then on.
&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;ev_data_db&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;global&lt;/span&gt; &lt;span class="nv"&gt;$wpdb&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$wpdb&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;While we were in there we caught one more bug: a related-routes query that was casting distance into a &lt;code&gt;BIGINT UNSIGNED&lt;/code&gt; and underflowing on "distance less than threshold" comparisons. Fix:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="k"&gt;CAST&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;distance_km&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="nb"&gt;SIGNED&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;threshold&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Without that cast, MariaDB was returning empty result sets for any negative comparison (which is half of all related-route queries).&lt;/p&gt;

&lt;h2&gt;
  
  
  IndexNow + GSC API
&lt;/h2&gt;

&lt;p&gt;We re-submit the pruned sitemap via the GSC Sitemaps API after each rebuild. The shared service account (&lt;code&gt;nererdenizlenir@dekorfiyat.iam.gserviceaccount.com&lt;/code&gt;) is &lt;code&gt;siteOwner&lt;/code&gt; on every property in our network, so a single key lets us PUT to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://www.googleapis.com/webmasters/v3/sites/{site}/sitemaps/{feedpath}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Returns 204; Google re-fetches the sitemap within 24h. We push fresh URL changes via IndexNow (key at the webroot, batched to &lt;code&gt;api.indexnow.org&lt;/code&gt;). Google's old &lt;code&gt;/ping?sitemap=&lt;/code&gt; endpoint is fully deprecated since 2023, so the GSC API path is the only programmatic re-submit channel.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we'd tell our March-self
&lt;/h2&gt;

&lt;p&gt;For a new pSEO domain, the right shape of the sitemap is:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Hubs first.&lt;/strong&gt; Cities, countries, vehicles, categories — these are the pages with strongest internal-link equity and the easiest to make uniquely useful.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One canonical layer per long-tail query.&lt;/strong&gt; Routes go in; route-vehicle and route-tier slices stay live but &lt;code&gt;noindex, follow&lt;/code&gt;. Internal navigation is the discovery path; the sitemap is the indexing intent.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cap multilingual to noindex&lt;/strong&gt; until you have the translation budget for genuine localization. Body-language mismatch (English body under a &lt;code&gt;de/&lt;/code&gt; URL) is worse than no localization at all.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Resubmit via the GSC API after every prune.&lt;/strong&gt; It's a 204 response. Don't wait for Googlebot to discover the change.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Watch "Discovered – currently not indexed" as a primary metric, not a footnote.&lt;/strong&gt; If it's growing faster than "Indexed," you're polluting the surface area.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;When you're sitting on 327K URLs and Google is indexing a few hundred, the fastest path to better indexing isn't more content. It's less surface area, sharper hubs, and patience.&lt;/p&gt;

&lt;p&gt;Live site: &lt;a href="https://evrangeguide.com" rel="noopener noreferrer"&gt;evrangeguide.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>wordpress</category>
      <category>seo</category>
      <category>webdev</category>
      <category>database</category>
    </item>
    <item>
      <title>Why Every Internal Link On Our Site Was a 404 (And the Smart-301 That Fixed It)</title>
      <dc:creator>kavela</dc:creator>
      <pubDate>Thu, 07 May 2026 19:46:04 +0000</pubDate>
      <link>https://dev.to/kavelaltd/why-every-internal-link-on-our-site-was-a-404-and-the-smart-301-that-fixed-it-3iko</link>
      <guid>https://dev.to/kavelaltd/why-every-internal-link-on-our-site-was-a-404-and-the-smart-301-that-fixed-it-3iko</guid>
      <description>&lt;p&gt;&lt;a href="https://beverageindex.com" rel="noopener noreferrer"&gt;BeverageIndex.com&lt;/a&gt; is a programmatic SEO site indexing drink prices across 98 cities and 30 drinks. 3,728 indexable URLs. We launched it in March, submitted the sitemap to Google Search Console, and watched indexing flatline at zero for two full months.&lt;/p&gt;

&lt;p&gt;When we finally crawled our own site with a friendly bot, the cause was embarrassingly simple: &lt;strong&gt;every drink and city link in our internal navigation pointed to a URL that returned 404.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  The setup
&lt;/h2&gt;

&lt;p&gt;bi-engine is a virtual-routing WordPress plugin (no &lt;code&gt;wp_posts&lt;/code&gt;) backed by these tables:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Table&lt;/th&gt;
&lt;th&gt;Rows&lt;/th&gt;
&lt;th&gt;What it holds&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_bi_drinks&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;30&lt;/td&gt;
&lt;td&gt;id=slug; category enum: coffee/beer/wine/spirits/tea/juice/specialty/soft_drinks&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_bi_cities&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;98&lt;/td&gt;
&lt;td&gt;id=slug, country, currency, lat/lng&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_bi_prices&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;2,940&lt;/td&gt;
&lt;td&gt;drink × city, USD-normalized&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_bi_routes&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;3,727&lt;/td&gt;
&lt;td&gt;pre-computed &lt;code&gt;/{drink}/{city}/&lt;/code&gt; paths&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The theme is unusual for a pSEO site: &lt;strong&gt;a single SPA terminal template&lt;/strong&gt; (&lt;code&gt;template-terminal.php&lt;/code&gt;) hydrates from a JSON payload + &lt;code&gt;/wp-json/bi/v1/enriched&lt;/code&gt;. Country hubs, city hubs, drink hubs, category hubs, and city × drink price grids all render through the same client-side template, distinguished by URL pattern.&lt;/p&gt;

&lt;h2&gt;
  
  
  The URL patterns
&lt;/h2&gt;

&lt;p&gt;The canonical paths are namespaced:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/category/{slug}/&lt;/code&gt; — coffee, beer, wine, etc.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/city/{slug}/&lt;/code&gt; — amsterdam, tokyo, istanbul…&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/drink/{slug}/&lt;/code&gt; — americano, cappuccino, coca-cola…&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/country/{slug}/&lt;/code&gt; — by name slug or ISO2&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/{drink}/{city}/&lt;/code&gt; — 3,727 price grids (e.g. &lt;code&gt;/americano/amsterdam/&lt;/code&gt;)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The last pattern is the workhorse. It's also where the sitemap gets most of its URLs.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bug
&lt;/h2&gt;

&lt;p&gt;Look at any internal hub page on the site. The HTML is full of links like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="nt"&gt;&amp;lt;a&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"/americano/"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Americano&lt;span class="nt"&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;a&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"/amsterdam/"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Amsterdam&lt;span class="nt"&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;a&lt;/span&gt; &lt;span class="na"&gt;href=&lt;/span&gt;&lt;span class="s"&gt;"/coffee/"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Coffee&lt;span class="nt"&gt;&amp;lt;/a&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Not &lt;code&gt;/drink/americano/&lt;/code&gt;. Not &lt;code&gt;/city/amsterdam/&lt;/code&gt;. &lt;strong&gt;Slug-only.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;And slug-only didn't have a route. Googlebot crawled the homepage, followed every drink link, and got 404. Crawled the city hub, followed every drink × city link → those worked because &lt;code&gt;/{drink}/{city}/&lt;/code&gt; was registered. But the hub anchors themselves were dead.&lt;/p&gt;

&lt;p&gt;We had two reasonable choices:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Rewrite all the internal links&lt;/strong&gt; to use the canonical &lt;code&gt;/category/&lt;/code&gt;, &lt;code&gt;/drink/&lt;/code&gt;, &lt;code&gt;/city/&lt;/code&gt;, &lt;code&gt;/country/&lt;/code&gt; prefixes. Hundreds of edits across the SPA template and JSON payload generators.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Make slug-only URLs work&lt;/strong&gt;, by 301-redirecting them to the canonical prefixed form.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;We picked (2) because it's also good externally — third parties linking to &lt;code&gt;beverageindex.com/cappuccino/&lt;/code&gt; will resolve correctly forever, even if our internal nav drifts.&lt;/p&gt;

&lt;h2&gt;
  
  
  The smart-301
&lt;/h2&gt;

&lt;p&gt;We added a single rewrite rule, &lt;code&gt;bi_slug_only&lt;/code&gt;, plus a handler that does a 4-way disambiguation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;maybe_slug_redirect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$slug&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// 1. Drink?&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;drink_exists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$slug&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;wp_redirect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"/drink/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$slug&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;301&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="k"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// 2. City?&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;city_exists&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$slug&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;wp_redirect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"/city/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$slug&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;301&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="k"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// 3. Category? (URL hyphen ↔ DB underscore)&lt;/span&gt;
    &lt;span class="nv"&gt;$cat_db&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;str_replace&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'-'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'_'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$slug&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;in_array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$cat_db&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;categories&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;wp_redirect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"/category/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$slug&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;301&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="k"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// 4. Country? (ISO2 or full-name slug)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$country&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;country_lookup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$slug&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;wp_redirect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s2"&gt;"/country/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nv"&gt;$country&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;slug&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s2"&gt;/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;301&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="k"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c1"&gt;// Unknown — explicit 404 with theme template&lt;/span&gt;
    &lt;span class="nf"&gt;status_header&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;404&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;include&lt;/span&gt; &lt;span class="nf"&gt;get_404_template&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;exit&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The ordering matters: drinks shadow categories shadow countries. We had two collisions to handle — &lt;code&gt;coffee&lt;/code&gt; and &lt;code&gt;soft-drinks&lt;/code&gt; are both potential drink slugs and category slugs. They're categories, so we resolved by reordering the lookups (drink table first, but neither &lt;code&gt;coffee&lt;/code&gt; nor &lt;code&gt;soft_drinks&lt;/code&gt; is in &lt;code&gt;wp_bi_drinks&lt;/code&gt;, so they fall through to the category check correctly).&lt;/p&gt;

&lt;h2&gt;
  
  
  The other 301 we didn't notice
&lt;/h2&gt;

&lt;p&gt;While fixing this, we caught a second crawl-budget leak. The sitemap index &lt;code&gt;bi-sitemap.xml&lt;/code&gt; and its five children (&lt;code&gt;bi-sitemap-cities.xml&lt;/code&gt;, &lt;code&gt;bi-sitemap-drinks.xml&lt;/code&gt;, &lt;code&gt;bi-sitemap-categories.xml&lt;/code&gt;, &lt;code&gt;bi-sitemap-countries.xml&lt;/code&gt;, &lt;code&gt;bi-sitemap-prices-1.xml&lt;/code&gt;) were all 301-redirecting to a trailing-slash variant before serving. Every Googlebot fetch was burning two requests instead of one.&lt;/p&gt;

&lt;p&gt;WordPress's &lt;code&gt;redirect_canonical&lt;/code&gt; filter was the culprit. The fix:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nf"&gt;add_filter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'redirect_canonical'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'skip_canonical_for_sitemap'&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;function&lt;/span&gt; &lt;span class="n"&gt;skip_canonical_for_sitemap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$redirect_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$requested_url&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;strpos&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$requested_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s1"&gt;'bi-sitemap'&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;!==&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;false&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nv"&gt;$redirect_url&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now all six sitemap files return 200 directly.&lt;/p&gt;

&lt;h2&gt;
  
  
  IndexNow: not all engines treat all domains equally
&lt;/h2&gt;

&lt;p&gt;We push fresh slugs to IndexNow on publish. beverageindex's IndexNow key (&lt;code&gt;0047e09268ff…ccb&lt;/code&gt;) is happily accepted by both Yandex (HTTP 202) and &lt;code&gt;api.indexnow.org&lt;/code&gt; / Bing (HTTP 200).&lt;/p&gt;

&lt;p&gt;Meanwhile, &lt;a href="https://assetvs.com" rel="noopener noreferrer"&gt;our sister site assetvs.com&lt;/a&gt; gets &lt;code&gt;403 UserForbiddedToAccessSite&lt;/code&gt; from &lt;code&gt;api.indexnow.org&lt;/code&gt;, while Yandex still accepts. Same key format, same payload, different domain reputation — assetvs probably tripped a Bing trust signal during its 2-month zero-indexing window. The lesson: if you operate a network, don't assume IndexNow status from one domain transfers to another. Test each.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where indexing actually stands
&lt;/h2&gt;

&lt;p&gt;Fix shipped. Sitemap re-submitted via the GSC Sitemaps API (PUT returns 204). 3,728 URLs are now linkable from anywhere on the site. Whether that converts into indexing in the next two weeks is the real test.&lt;/p&gt;

&lt;p&gt;The broader lesson is one we've now repeated across multiple network properties: &lt;strong&gt;for a new pSEO domain, the technical indexing tax shows up as silent 4xx noise, not as a single dramatic failure.&lt;/strong&gt; A trailing-slash 301 here, a 404'd internal link there, a missing FX table on another property. None of it is loud. All of it compounds.&lt;/p&gt;

&lt;p&gt;Live site: &lt;a href="https://beverageindex.com" rel="noopener noreferrer"&gt;beverageindex.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>wordpress</category>
      <category>seo</category>
      <category>webdev</category>
      <category>php</category>
    </item>
    <item>
      <title>Lighting Up 2,232 Dead URLs With One World Bank API Call</title>
      <dc:creator>kavela</dc:creator>
      <pubDate>Thu, 07 May 2026 19:46:03 +0000</pubDate>
      <link>https://dev.to/kavelaltd/lighting-up-2232-dead-urls-with-one-world-bank-api-call-1djf</link>
      <guid>https://dev.to/kavelaltd/lighting-up-2232-dead-urls-with-one-world-bank-api-call-1djf</guid>
      <description>&lt;p&gt;On &lt;a href="https://assetvs.com" rel="noopener noreferrer"&gt;assetvs.com&lt;/a&gt; we run a programmatic SEO site comparing investment returns across 16 countries and 5 asset classes (stocks, bonds, real estate, bills, gold). The site has 3,655 URLs in the sitemap. Last week we discovered that &lt;strong&gt;2,232 of them — every currency-pair page — were silently 404ing&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;This is the story of how one missing table turned a third of our site dark, and how a single World Bank endpoint brought it all back online in an afternoon.&lt;/p&gt;

&lt;h2&gt;
  
  
  The architecture (quick recap)
&lt;/h2&gt;

&lt;p&gt;assetvs uses the same virtual-routing pattern we've written about &lt;a href="https://dev.to/kavelaltd/how-we-built-a-programmatic-seo-engine-serving-80k-pages-on-wordpress-without-using-wpposts-2kgn"&gt;before on the network&lt;/a&gt;: no &lt;code&gt;wp_posts&lt;/code&gt;, just custom tables and rewrite rules. The relevant tables are:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Table&lt;/th&gt;
&lt;th&gt;Rows&lt;/th&gt;
&lt;th&gt;Purpose&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_ir_countries&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;16&lt;/td&gt;
&lt;td&gt;G7 + EU + ANZ + JP, ISO codes&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_ir_assets&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;5&lt;/td&gt;
&lt;td&gt;stocks/bonds/real-estate/bills/gold&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_ir_annual_returns&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;3,955&lt;/td&gt;
&lt;td&gt;nominal returns, country × asset × year&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_ir_inflation&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;976&lt;/td&gt;
&lt;td&gt;for real-return calculation&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_ir_comparisons&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;100&lt;/td&gt;
&lt;td&gt;full-period asset-vs-asset summaries&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;&lt;code&gt;wp_ir_exchange_rates&lt;/code&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;0 → 976&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;currency pair history&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;That last row is the punchline.&lt;/p&gt;

&lt;h2&gt;
  
  
  The URL patterns
&lt;/h2&gt;

&lt;p&gt;The router (&lt;code&gt;IR_Router&lt;/code&gt;) maps slugs onto eight page types:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;/{country}/&lt;/code&gt; and &lt;code&gt;/{asset}/&lt;/code&gt; — 21 hubs&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/{country}/{asset}/&lt;/code&gt; — 80 country-asset combos&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/{country}/{asset_a}-vs-{asset_b}/&lt;/code&gt; — 100 comparisons (alphabetical canonical)&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/{country}/{asset_a}-vs-{asset_b}/{decade}/&lt;/code&gt; — sliced by 1960s–2010s&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/{asset}/{country_a}-vs-{country_b}/&lt;/code&gt; — 600 cross-country comparisons&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/currencies/{a}-{b}/&lt;/code&gt; and &lt;code&gt;/currencies/{a}-{b}/{year}/&lt;/code&gt; — &lt;strong&gt;2,232 currency pages&lt;/strong&gt;
&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;/rankings/{asset}/&lt;/code&gt;, &lt;code&gt;/about/&lt;/code&gt;, &lt;code&gt;/data/&lt;/code&gt;, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Everything renders server-side from the custom tables. No posts, no taxonomies, no theme fight.&lt;/p&gt;

&lt;h2&gt;
  
  
  The bug, hiding in plain sight
&lt;/h2&gt;

&lt;p&gt;The currency builder (&lt;code&gt;build_currency_pair&lt;/code&gt;) had this shape:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="nv"&gt;$history&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nv"&gt;$this&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;get_fx_pair_history&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$a&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nv"&gt;$b&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;empty&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$history&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kc"&gt;null&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="c1"&gt;// ...render the page&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Returning &lt;code&gt;null&lt;/code&gt; from the page builder cascades up to a 404. Reasonable defense — but &lt;code&gt;wp_ir_exchange_rates&lt;/code&gt; had &lt;strong&gt;never been populated&lt;/strong&gt;. We'd built the schema, written the renderer, generated 2,232 rewrite rules, listed the URLs in the sitemap, and submitted them to Google Search Console for two months. All of them 404.&lt;/p&gt;

&lt;p&gt;GSC's coverage report had been flagging "Not found (404)" for weeks, but with 3,655 URLs and zero indexed pages overall (new domain trust issues), nobody had drilled into &lt;em&gt;which&lt;/em&gt; URLs were 404 vs which were just unindexed-but-live.&lt;/p&gt;

&lt;h2&gt;
  
  
  The fix: one API
&lt;/h2&gt;

&lt;p&gt;The World Bank publishes annual official exchange rates as indicator &lt;code&gt;PA.NUS.FCRF&lt;/code&gt;, free, no auth:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;https://api.worldbank.org/v2/country/{ISO}/indicator/PA.NUS.FCRF?date=1960:2020&amp;amp;format=json
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For each of our 16 countries, that's one HTTP call returning 61 annual rates against USD. Sixteen calls total. We wrote a one-shot fill script:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight php"&gt;&lt;code&gt;&lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$countries&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nv"&gt;$rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;wb_fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$c&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;iso2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;foreach&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nv"&gt;$rows&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="nv"&gt;$year&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$rate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nv"&gt;$wpdb&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="nf"&gt;insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s1"&gt;'wp_ir_exchange_rates'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;
            &lt;span class="s1"&gt;'country_id'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$c&lt;/span&gt;&lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'year'&lt;/span&gt;       &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$year&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'rate_per_usd'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$rate&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="s1"&gt;'yoy_change'&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nv"&gt;$yoy&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;]);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Result: 976 rows. The currency pair builder derives any &lt;code&gt;(A, B)&lt;/code&gt; pair by triangulating through USD — so 16 base currencies give us C(16,2) × 2 directions × 61 years × a few view modes = the 2,232 URLs we'd been promising Google for two months.&lt;/p&gt;

&lt;h2&gt;
  
  
  Lessons we keep relearning
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. "Sitemap submitted" ≠ "page renders."&lt;/strong&gt; GSC will happily accept a sitemap of 404s and quietly file them under "Discovered – currently not indexed" or "Not found." There is no alarm. Build a synthetic monitor that hits one URL per page-type weekly and asserts HTTP 200. We'd have caught this in March.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. Returning &lt;code&gt;null&lt;/code&gt; from a builder is a foot-gun.&lt;/strong&gt; The render path collapses three failure modes into one 404: missing data, missing slug, and genuinely-unknown URL. Now we log a distinct warning when a &lt;em&gt;known-routed&lt;/em&gt; slug returns null, so missing tables surface in error logs instead of just the GSC dashboard.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. New-domain GSC trust dwarfs technical issues.&lt;/strong&gt; Even with the FX table filled, indexing didn't snap back overnight. We're at zero indexed pages two months in, and lighting up the currency tier is a prerequisite for indexing — but not sufficient. Content depth is next: stock index names per country, country-level macro from World Bank, narrative blocks generated through our LLM pool.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. Cross-site editorial linking is cheap and free crawl signal.&lt;/strong&gt; We added a &lt;code&gt;historysaid_for_country($slug)&lt;/code&gt; linker that pulls the top 5 inquiries citing the country slug from &lt;code&gt;historysaid.com/wp-json/hs/v1/inquiries/by-country/{slug}&lt;/code&gt; (12-hour transient cache) and renders them as a "Further Reading" block. United States gets 8 inquiries, Germany 10, UK 10. Two sites, one network, free internal-link equity.&lt;/p&gt;

&lt;h2&gt;
  
  
  What's next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Fill &lt;code&gt;stock_index_name&lt;/code&gt; per country (S&amp;amp;P 500, FTSE 100, DAX, Nikkei…) so each &lt;code&gt;/{country}/stocks/&lt;/code&gt; page has a named anchor instead of generic "stocks."&lt;/li&gt;
&lt;li&gt;Pull World Bank macro (GDP per capita, inflation regime, currency volatility) into country hubs.&lt;/li&gt;
&lt;li&gt;Generate narrative paragraphs (~150 words) per country-asset page through our LLM pool — same enrichment pattern we use on &lt;a href="https://historysaid.com/" rel="noopener noreferrer"&gt;HistorySaid's 18,304 indicator pages&lt;/a&gt;.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The larger lesson: when 2,232 URLs go dark, it usually isn't an SEO problem. It's a missing INSERT.&lt;/p&gt;

&lt;p&gt;Live site: &lt;a href="https://assetvs.com" rel="noopener noreferrer"&gt;assetvs.com&lt;/a&gt;&lt;/p&gt;

</description>
      <category>wordpress</category>
      <category>seo</category>
      <category>webdev</category>
      <category>database</category>
    </item>
    <item>
      <title>Turning World Bank Data Into 50K+ Searchable Pages with WordPress</title>
      <dc:creator>kavela</dc:creator>
      <pubDate>Tue, 24 Mar 2026 11:28:45 +0000</pubDate>
      <link>https://dev.to/kavelaltd/turning-world-bank-data-into-50k-searchable-pages-with-wordpress-2ojp</link>
      <guid>https://dev.to/kavelaltd/turning-world-bank-data-into-50k-searchable-pages-with-wordpress-2ojp</guid>
      <description>&lt;p&gt;What if you could make decades of World Bank and IMF economic data actually accessible and browsable - not buried in spreadsheets and PDF reports that nobody reads?&lt;/p&gt;

&lt;p&gt;That's what we built with &lt;a href="https://historysaid.com" rel="noopener noreferrer"&gt;historysaid.com&lt;/a&gt;: a programmatic SEO site that transforms raw international development data into &lt;strong&gt;50,000+ structured, searchable pages&lt;/strong&gt;. Every country, every indicator, every year - all queryable, all browsable, all indexed by Google.&lt;/p&gt;

&lt;p&gt;This post covers the architectural thinking behind it and what we learned building it.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Data Problem
&lt;/h2&gt;

&lt;p&gt;The World Bank and IMF publish some of the richest economic datasets on the planet:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;GDP, inflation, trade balances, debt levels for 200+ countries&lt;/li&gt;
&lt;li&gt;Time series spanning 60+ years (some indicators go back to the 1960s)&lt;/li&gt;
&lt;li&gt;Hundreds of unique economic indicators covering everything from agricultural output to internet penetration rates&lt;/li&gt;
&lt;li&gt;Regular updates as new data gets published quarterly or annually&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;But the official portals are designed for researchers and economists who already know what they're looking for. You need to know specific indicator codes and use clunky query builders to extract data into spreadsheets.&lt;/p&gt;

&lt;p&gt;There's no way to just... explore. To browse. To stumble upon interesting economic stories by clicking around.&lt;/p&gt;

&lt;p&gt;We wanted to change that.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Not Just Build a Dashboard?
&lt;/h2&gt;

&lt;p&gt;We considered building a single-page dashboard app with interactive charts and filters. But dashboards have a fundamental SEO problem: they're one URL. Google can't index the state of your filters. If someone searches "Turkey GDP growth history", a dashboard app won't rank because that specific view doesn't have its own URL.&lt;/p&gt;

&lt;p&gt;Programmatic SEO solves this. Each unique combination of country + indicator gets its own page, its own URL, its own title, and its own meta description. Google can index all 50K of them.&lt;/p&gt;

&lt;p&gt;We chose WordPress for the same reasons we used it for startup-cost.com (see our &lt;a href="https://dev.to/kavelaltd/how-we-built-a-programmatic-seo-engine-serving-80k-pages-on-wordpress-without-using-wpposts-2kgn"&gt;previous post&lt;/a&gt;): cheap hosting, familiar ecosystem, and a powerful rewrite engine that nobody uses to its full potential.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture - Overview
&lt;/h2&gt;

&lt;p&gt;We built a custom WordPress plugin that handles everything from data ingestion to page rendering.&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Pipeline
&lt;/h3&gt;

&lt;p&gt;The data flows through several stages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;World Bank API --&amp;gt; Fetch &amp;amp; Parse --&amp;gt; Validate --&amp;gt; Normalize --&amp;gt; MySQL
IMF Data Portal --&amp;gt; Fetch &amp;amp; Parse --&amp;gt; Validate --&amp;gt; Normalize --&amp;gt; MySQL
                                                                  |
MySQL --&amp;gt; Virtual URL Routing --&amp;gt; Template Engine --&amp;gt; HTML Page
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each data source has its own ingestion logic because the API formats differ significantly. The World Bank provides a well-documented REST API with JSON responses, while IMF data comes in a different structure. We wrote adapters that normalize both into a common internal format.&lt;/p&gt;

&lt;p&gt;The pipeline runs on a scheduled basis. When new data is published by either source, our next run picks it up automatically and updates the relevant records.&lt;/p&gt;

&lt;h3&gt;
  
  
  Data Quality Challenges
&lt;/h3&gt;

&lt;p&gt;Working with international economic data is messier than you'd expect:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Missing values everywhere&lt;/strong&gt; - Some countries don't report certain indicators for certain years. We handle nulls gracefully rather than showing zeros.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Delayed reporting&lt;/strong&gt; - Some nations publish data 2-3 years late. Our pages show the most recent available data and clearly indicate the time period.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Unit inconsistency&lt;/strong&gt; - Some values are in current USD, some in constant USD, some in percentages. Each indicator carries its unit metadata.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Country code mismatches&lt;/strong&gt; - The World Bank uses ISO 3166-1 alpha-3 codes, the IMF sometimes uses its own codes. Our normalization layer handles the mapping.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Database Design Principles
&lt;/h3&gt;

&lt;p&gt;We use custom MySQL tables (not &lt;code&gt;wp_posts&lt;/code&gt;) following the same pattern from our startup-cost.com engine. The key design decisions:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Proper normalization&lt;/strong&gt; - Countries, indicators, and data points are separate tables with foreign key relationships&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Appropriate data types&lt;/strong&gt; - We use high-precision decimal types for economic values because the data ranges from tiny percentages to trillion-dollar GDP figures. Floating point would introduce precision errors that data-savvy users would notice.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Strategic indexing&lt;/strong&gt; - Our most common query patterns (all data for a country+indicator, all countries for an indicator+year) each have compound indexes that resolve in single-digit milliseconds&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Roughly 4 million data points&lt;/strong&gt; in the main table, all queryable in under 10ms thanks to proper indexing&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Page Types
&lt;/h3&gt;

&lt;p&gt;Our routing creates four types of pages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Country pages&lt;/strong&gt; - Overview of all available indicators for a country. Shows key stats, latest values, and links to explore each indicator in depth.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Country + Indicator pages&lt;/strong&gt; - The core of the site. Detailed time series data with charts, data tables, summary statistics, and trend analysis. This is the bulk of our 50K pages.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Indicator pages&lt;/strong&gt; - Global comparison view. Shows all countries ranked by a specific indicator, with the ability to see how they compare.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Comparison pages&lt;/strong&gt; - Side-by-side country comparisons for a given indicator. Perfect for searches like "Japan vs South Korea GDP."&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Zero rows in &lt;code&gt;wp_posts&lt;/code&gt;. Everything is computed from data tables on each request (with caching for popular pages).&lt;/p&gt;

&lt;h3&gt;
  
  
  Charts and Data Display
&lt;/h3&gt;

&lt;p&gt;Each data page includes an interactive chart (using a lightweight client-side charting library) and a full data table. The chart data is embedded as JSON in the page - fast, cacheable, and SEO-friendly since the actual values are also present in the HTML table.&lt;/p&gt;

&lt;p&gt;We also calculate and display summary statistics: latest value, historical min/max, average, and trend direction. These make each page genuinely informative rather than just a raw data dump.&lt;/p&gt;

&lt;h3&gt;
  
  
  SEO Strategy
&lt;/h3&gt;

&lt;p&gt;Every page gets unique, data-driven SEO elements:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Dynamic titles&lt;/strong&gt; that include the country name and indicator name&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Meta descriptions&lt;/strong&gt; that include actual data values ("Turkey's GDP was $X in 2024. Explore the full trend from 1960 to 2024...")&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Schema.org Dataset markup&lt;/strong&gt; so Google understands these are data pages with temporal and spatial coverage&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Breadcrumb navigation&lt;/strong&gt; for clear site hierarchy&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each meta description contains real numbers from the data, making every page genuinely unique in Google's eyes.&lt;/p&gt;

&lt;h3&gt;
  
  
  Caching
&lt;/h3&gt;

&lt;p&gt;With 50K+ pages, not everything can be pre-cached. We use a tiered approach:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Popular combinations&lt;/strong&gt; (major countries + major indicators) get pre-cached with longer TTLs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Medium-traffic pages&lt;/strong&gt; are cached on demand&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Long-tail pages&lt;/strong&gt; have shorter TTLs and are generated fresh when needed&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We also cache aggregated data (like country rankings and regional averages) at the query level since multiple pages reference the same aggregations.&lt;/p&gt;

&lt;h3&gt;
  
  
  Internal Linking
&lt;/h3&gt;

&lt;p&gt;Strong internal linking is essential for a site this large. Without it, search engines would never discover most of the pages:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each country page links to all available indicators for that country&lt;/li&gt;
&lt;li&gt;Each indicator page links to top countries for that indicator&lt;/li&gt;
&lt;li&gt;Breadcrumbs on every page create clear hierarchy&lt;/li&gt;
&lt;li&gt;Related content suggestions based on geographic and thematic proximity&lt;/li&gt;
&lt;li&gt;Comparison links suggest relevant country pairs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The internal link graph ensures that any page on the site is reachable within 3-4 clicks from the homepage.&lt;/p&gt;

&lt;h2&gt;
  
  
  Results After 12 Months
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;50,000+ pages indexed&lt;/strong&gt; in Google Search Console&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Average TTFB: ~150ms&lt;/strong&gt; on shared hosting&lt;/li&gt;
&lt;li&gt;&lt;strong&gt;Database queries consistently under 10ms&lt;/strong&gt;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Zero editorial work&lt;/strong&gt; - the site runs itself, updated automatically from source data&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Growing organic traffic&lt;/strong&gt; from long-tail searches like "Nigeria inflation rate 2015" or "Vietnam GDP per capita history"&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The Reusable Pattern
&lt;/h2&gt;

&lt;p&gt;This is the same architectural pattern we use across multiple sites at Kavela:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Find an interesting, structured dataset&lt;/li&gt;
&lt;li&gt;Design custom tables optimized for the specific data model&lt;/li&gt;
&lt;li&gt;Build a data pipeline that keeps the database fresh&lt;/li&gt;
&lt;li&gt;Use WordPress virtual routing to create SEO-friendly URLs&lt;/li&gt;
&lt;li&gt;Render pages dynamically from the data&lt;/li&gt;
&lt;li&gt;Generate chunked sitemaps and build strong internal links&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key is making sure every page offers genuine value - real data, real calculations, real insights. Template spam with swapped-out city names won't work. Search engines are smart enough to detect that. But if every page genuinely answers a different question with different data, you've built something valuable.&lt;/p&gt;

&lt;p&gt;Explore it yourself: &lt;a href="https://historysaid.com" rel="noopener noreferrer"&gt;historysaid.com&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built by Kavela Ltd - turning data into discoverable web experiences.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>wordpress</category>
      <category>data</category>
      <category>seo</category>
      <category>webdev</category>
    </item>
    <item>
      <title>How We Built a Programmatic SEO Engine Serving 80K+ Pages on WordPress (Without Using wp_posts)</title>
      <dc:creator>kavela</dc:creator>
      <pubDate>Tue, 24 Mar 2026 11:28:20 +0000</pubDate>
      <link>https://dev.to/kavelaltd/how-we-built-a-programmatic-seo-engine-serving-80k-pages-on-wordpress-without-using-wpposts-2kgn</link>
      <guid>https://dev.to/kavelaltd/how-we-built-a-programmatic-seo-engine-serving-80k-pages-on-wordpress-without-using-wpposts-2kgn</guid>
      <description>&lt;p&gt;When we set out to build &lt;a href="https://startup-cost.com" rel="noopener noreferrer"&gt;startup-cost.com&lt;/a&gt;, we knew traditional WordPress wouldn't cut it. We needed to serve &lt;strong&gt;79,000+ unique pages&lt;/strong&gt; - one for every combination of 479 cities and 167 business types - with real cost data, real-time calculations, and solid performance.&lt;/p&gt;

&lt;p&gt;Most people hear "80K pages on WordPress" and assume we're crazy. WordPress is a blogging platform, right? Well, yes - but under the hood it's a flexible PHP framework with a powerful rewrite engine. We just had to throw away the parts that don't scale and build our own.&lt;/p&gt;

&lt;p&gt;Here's the story of how we did it without a single row in &lt;code&gt;wp_posts&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Problem with wp_posts at Scale
&lt;/h2&gt;

&lt;p&gt;WordPress stores all content in a single table called &lt;code&gt;wp_posts&lt;/code&gt;. For a blog or a small business site with a few hundred pages, this works fine. But when you start pushing tens of thousands of rows into that table, things fall apart quickly:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Query performance degrades&lt;/strong&gt; - WordPress joins &lt;code&gt;wp_posts&lt;/code&gt; with &lt;code&gt;wp_postmeta&lt;/code&gt; for almost every query. With 80K posts, each with 10+ meta fields, you're looking at 800K+ rows in postmeta alone. Queries that used to take 5ms now take 500ms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The admin panel becomes unusable&lt;/strong&gt; - Try loading the "All Posts" screen with 80K entries. WordPress paginates, sure, but even counting the total takes forever.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Revision history eats disk&lt;/strong&gt; - WordPress auto-saves revisions. With programmatic content that gets regenerated, you end up with 3-4x the actual content in revision rows.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;XML sitemaps choke&lt;/strong&gt; - Popular sitemap plugins try to query all posts at once. With 80K rows, they either timeout or consume all available memory.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Imports and exports break&lt;/strong&gt; - WordPress export generates a single XML file. Good luck with an 80K-post WXR file.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We needed a fundamentally different approach.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Not Use a Static Site Generator or a Different CMS?
&lt;/h2&gt;

&lt;p&gt;Fair question. We considered Hugo, Next.js, and even a custom Node.js app. But WordPress gave us specific advantages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Hosting is dirt cheap&lt;/strong&gt; - Shared WordPress hosting costs a few dollars a month and handles our traffic fine&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The plugin ecosystem&lt;/strong&gt; - We still use various utility plugins for SEO settings, caching, and other tasks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Familiar deployment&lt;/strong&gt; - Our team knows WordPress inside out. No learning curve, no new CI/CD pipeline needed&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PHP is actually fast enough&lt;/strong&gt; - With opcache and a clean query pattern, PHP 8 serves pages in double-digit milliseconds&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The key insight was: we don't have to use WordPress the way it was designed. We can use it as a routing and rendering framework while storing our data however we want.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture - High Level
&lt;/h2&gt;

&lt;p&gt;We built a custom WordPress plugin that bypasses &lt;code&gt;wp_posts&lt;/code&gt; entirely. The concept has three pillars:&lt;/p&gt;

&lt;h3&gt;
  
  
  1. Custom Database Tables
&lt;/h3&gt;

&lt;p&gt;Instead of stuffing everything into posts and postmeta, we created dedicated tables with proper schemas, data types, and indexes. Think of it as designing a mini-application database that lives inside WordPress's MySQL instance.&lt;/p&gt;

&lt;p&gt;The key principle: each entity type (cities, business types, cost metrics) gets its own table with columns that match the actual data model - not the generic key-value pairs that postmeta forces you into. This means proper indexing, proper normalization, and queries that hit exactly the data they need.&lt;/p&gt;

&lt;p&gt;The performance difference is massive. A meta-based lookup on 80K posts might take hundreds of milliseconds. A direct indexed query on a purpose-built table returns in single-digit milliseconds.&lt;/p&gt;

&lt;h3&gt;
  
  
  2. Virtual URL Routing
&lt;/h3&gt;

&lt;p&gt;WordPress has a powerful but underused rewrite API. Most developers only interact with it through the permalink settings screen. But under the hood, you can register completely custom URL patterns that map to your own rendering logic.&lt;/p&gt;

&lt;p&gt;We define URL patterns that WordPress recognizes and routes to our plugin. When a request comes in, WordPress matches the URL against our patterns, extracts the relevant slugs, and hands control to our code. No post is created. No database row in &lt;code&gt;wp_posts&lt;/code&gt; is touched. The URL exists purely because we told WordPress to recognize the pattern.&lt;/p&gt;

&lt;p&gt;This approach gives us full control over URL structure while still benefiting from WordPress's request lifecycle, caching hooks, and plugin compatibility.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Dynamic Content Generation
&lt;/h3&gt;

&lt;p&gt;This is what makes programmatic SEO actually work. We don't just slap the same text on 80K pages with city names swapped - each page has genuinely different data. Real cost figures, real calculations, real comparisons.&lt;/p&gt;

&lt;p&gt;The rendering layer pulls data from our custom tables, runs computations specific to each city-business combination, and outputs a fully formed HTML page with all the SEO elements (unique title, meta description, schema markup, etc.).&lt;/p&gt;

&lt;p&gt;Every page displays different numbers because the underlying data is different. Google is smart enough to detect thin, templated content. Genuine value on each page is what makes programmatic SEO work long-term.&lt;/p&gt;

&lt;h2&gt;
  
  
  Performance Results
&lt;/h2&gt;

&lt;p&gt;We benchmarked both approaches on the same server (shared hosting, PHP 8.1, MariaDB 10.6):&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;wp_posts approach&lt;/th&gt;
&lt;th&gt;Our custom approach&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Average page load (TTFB)&lt;/td&gt;
&lt;td&gt;~800ms&lt;/td&gt;
&lt;td&gt;~120ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Database query time&lt;/td&gt;
&lt;td&gt;~200ms (with meta joins)&lt;/td&gt;
&lt;td&gt;~15ms (indexed lookup)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Admin dashboard load&lt;/td&gt;
&lt;td&gt;30+ seconds&lt;/td&gt;
&lt;td&gt;Instant (no admin overhead)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Sitemap generation&lt;/td&gt;
&lt;td&gt;Timeout at 60s&lt;/td&gt;
&lt;td&gt;~2 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Memory usage per request&lt;/td&gt;
&lt;td&gt;~64MB&lt;/td&gt;
&lt;td&gt;~12MB&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The difference is dramatic. Custom tables with proper indexes make WordPress perform like a purpose-built application.&lt;/p&gt;

&lt;h2&gt;
  
  
  Sitemap Strategy
&lt;/h2&gt;

&lt;p&gt;Google's sitemap limit is 50,000 URLs per file and 50MB uncompressed. With 80K pages, we need multiple sitemaps plus a sitemap index.&lt;/p&gt;

&lt;p&gt;We generate sitemaps programmatically, chunking URLs into manageable files, and create a sitemap index that references all of them. The generation runs on a weekly cron. Google Search Console picks them up without issues, and we can track indexing progress per sitemap chunk.&lt;/p&gt;

&lt;h2&gt;
  
  
  Internal Linking
&lt;/h2&gt;

&lt;p&gt;With 80K pages, internal linking is critical both for SEO and for helping users navigate:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Each city page links to all business types available in that city&lt;/li&gt;
&lt;li&gt;Each business page links to top cities for that business type&lt;/li&gt;
&lt;li&gt;Each detail page links to related pages based on geographic and thematic proximity&lt;/li&gt;
&lt;li&gt;Dense internal link graph helps search engines discover and understand the site structure&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Caching
&lt;/h2&gt;

&lt;p&gt;We use a multi-tier caching strategy. Not all 80K pages get equal traffic - popular city/business combinations are pre-cached with longer TTLs, while long-tail pages are cached on demand with shorter TTLs. We also cache at the database query level for frequently accessed aggregations.&lt;/p&gt;

&lt;h2&gt;
  
  
  What We Learned
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;WordPress can handle massive scale&lt;/strong&gt; - but only if you step outside its default content model. The platform is more flexible than people give it credit for.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Custom tables are not "hacky"&lt;/strong&gt; - They're the right tool when your data doesn't fit the post/meta pattern. WordPress itself uses custom tables for comments, users, and options.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Virtual routing is powerful&lt;/strong&gt; - WordPress rewrite rules can handle complex URL patterns. You don't need a custom framework just because your URLs don't map to posts.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Programmatic SEO needs real value&lt;/strong&gt; - Google detects thin, templated content. Every page needs genuinely different, useful data. That's the difference between spam and a real product.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Start with the data model&lt;/strong&gt; - We spent more time designing our database schema than writing rendering code. Good data modeling pays off at every layer.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Monitor, then optimize&lt;/strong&gt; - We started without caching and added it only where monitoring showed bottlenecks. Premature optimization would have added complexity we didn't need.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Check out the result: &lt;a href="https://startup-cost.com" rel="noopener noreferrer"&gt;startup-cost.com&lt;/a&gt; - startup cost estimates for 167 business types across 479 cities worldwide.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Built by Kavela Ltd - we build data-driven web tools at scale.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>wordpress</category>
      <category>seo</category>
      <category>database</category>
      <category>webdev</category>
    </item>
  </channel>
</rss>
