DEV Community

MORINAGA

How I kept 62 of 80 programmatic pages alive while hiding them from Google

After my second AdSense rejection for scaled content, I had two options for the thin pages on Open Alternative To: delete them and accept 404s on any inbound links, or keep them alive while hiding them from Google's quality evaluation. I chose the second.

The reasoning: some of these URLs have inbound links from earlier articles in this series, from social posts, and from internal site navigation. A 404 would break all of them. The pages aren't wrong; they're just thin. The correct signal to Google is "don't evaluate these" rather than "these don't exist."

The isCurated gate

The gate lives in apps/oss-alternatives/src/lib/curation.ts:

export const CURATION = {
  MIN_ALTERNATIVES: 4,
  MIN_TOP_STARS: 1000,
  MIN_INTRO_LEN: 80,
} as const;

export function isCurated(s: SaasEntry): boolean {
  if (!s.intro || s.intro.length < CURATION.MIN_INTRO_LEN) return false;
  const alts = s.alternatives ?? [];
  if (alts.length < CURATION.MIN_ALTERNATIVES) return false;
  const topStars = alts.reduce((m, a) => Math.max(m, a.stars ?? 0), 0);
  if (topStars < CURATION.MIN_TOP_STARS) return false;
  return true;
}

Three conditions, all required:

  • At least 4 open-source alternatives listed — a comparison page with fewer entries is barely a comparison
  • Top alternative has 1,000+ GitHub stars — filters out obscure or unmaintained projects that don't demonstrate the category's depth
  • Intro text is at least 80 characters — rules out the fallback-template content that the ETL quality ladder writes when Claude is unavailable

These are objective thresholds, not hand-picked entries. The gate runs automatically at every Astro build. Entries that gain another alternative or get a longer intro in the next ETL run will silently cross the threshold and become discoverable without any manual action.

Currently, 18 of 80 entries pass, which leaves the 62 hidden pages from the title. That's the real data state, not a target. The nightly ETL upgrades entries progressively; the curated count will grow as the content improves.
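To make the thresholds concrete, here is a minimal sketch that runs the same three checks against two made-up entries. The `SaasEntry` shape is an assumption inferred from the fields the gate reads (`intro`, `alternatives`, `stars`); the real type in saas.ts likely has more fields, and the sample data is invented for illustration.

```typescript
// Assumed minimal shape; the real SaasEntry in saas.ts likely has more fields.
type Alternative = { name: string; stars?: number };
type SaasEntry = { slug: string; intro?: string; alternatives?: Alternative[] };

const CURATION = {
  MIN_ALTERNATIVES: 4,
  MIN_TOP_STARS: 1000,
  MIN_INTRO_LEN: 80,
} as const;

// Same logic as the gate in curation.ts, repeated so the sketch is self-contained.
function isCurated(s: SaasEntry): boolean {
  if (!s.intro || s.intro.length < CURATION.MIN_INTRO_LEN) return false;
  const alts = s.alternatives ?? [];
  if (alts.length < CURATION.MIN_ALTERNATIVES) return false;
  const topStars = alts.reduce((m, a) => Math.max(m, a.stars ?? 0), 0);
  return topStars >= CURATION.MIN_TOP_STARS;
}

// Invented sample: the intro is long enough and one alternative is popular,
// but there are only three alternatives, so the entry fails on count alone.
const thin: SaasEntry = {
  slug: "example-thin",
  intro:
    "An invented intro that is comfortably longer than the eighty-character minimum the gate requires.",
  alternatives: [
    { name: "a", stars: 5000 },
    { name: "b", stars: 300 },
    { name: "c" }, // a missing star count is treated as 0
  ],
};

// The same entry after a hypothetical ETL run adds a fourth alternative.
const upgraded: SaasEntry = {
  ...thin,
  alternatives: [...(thin.alternatives ?? []), { name: "d", stars: 120 }],
};

const thinPasses = isCurated(thin);
const upgradedPasses = isCurated(upgraded);
console.log(thinPasses, upgradedPasses);
```

The first entry is rejected and the second accepted, with no change other than the extra alternative. That is the "silent crossing" behaviour: nothing in the page templates needs to know why an entry started passing.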

Why the gate lives in its own module

saas.ts — where the main data access code lives — imports @libsql/client to query Turso. Any module that imports saas.ts at the value level picks up that dependency. Astro's static page bundles can't include server-only DB dependencies, so they'd fail to build.

The solution: curation.ts imports only types from saas.ts:

import type { SaasEntry } from "./saas.ts";

TypeScript erases type imports at compile time. At runtime, curation.ts has no external dependencies — it's a pure computation module that Astro can safely include in static page bundles. saas.ts stays server-side-only, imported only in getStaticPaths where the DB dependency is expected.

This split-by-dependency-type pattern comes up regularly in Astro monorepos. Anything that touches a runtime external goes server-side; the pure logic you need in both places gets its own module.

Four discovery surfaces gated on the same function

A page being "hidden" means four things happen simultaneously:

1. noindex meta tag: Base.astro checks isCurated(entry) and adds <meta name="robots" content="noindex, nofollow"> for entries that don't pass.

2. Sitemap exclusion: astro.config.mjs has a sitemap filter applying the same threshold logic. This is the one awkward part: astro.config.mjs can't import from src/, so the threshold values are duplicated. I put // KEEP IN SYNC: curation.ts on both. Changing the thresholds in one place without updating the other would produce a sitemap that disagrees with the noindex tags: some pages would be submitted to Google while simultaneously declaring noindex.

3. RSS feed: the feed only includes curated entries. Non-curated pages won't surface in feed readers as new content.

4. Internal navigation: homepage category cards, footer category links, breadcrumb paths, and "related alternatives" widgets all filter through isCurated. A direct link from outside the site still reaches the page. But browsing the site organically won't surface non-curated entries.
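A rough sketch of how the four surfaces might hang off one predicate. This is not the site's actual code: the simplified `Entry` type carries a precomputed `curated` flag standing in for `isCurated(entry)`, and the helper names (`robotsContent`, `makeSitemapFilter`, `discoverable`), the URL layout, and the example.com domain are all hypothetical.

```typescript
// Hypothetical simplified entry; `curated` stands in for isCurated(entry).
type Entry = { slug: string; curated: boolean };

// 1. Value for <meta name="robots">; null means no tag is emitted.
function robotsContent(e: Entry): string | null {
  return e.curated ? null : "noindex, nofollow";
}

// 2. Sitemap: drop any page URL whose last path segment is a non-curated slug.
//    Assumes entry pages end in their slug, which may not match the real routes.
function makeSitemapFilter(entries: Entry[]): (pageUrl: string) => boolean {
  const hiddenSlugs = entries.filter((e) => !e.curated).map((e) => e.slug);
  return (pageUrl) => {
    const lastSegment = pageUrl.replace(/\/+$/, "").split("/").pop() ?? "";
    return hiddenSlugs.indexOf(lastSegment) === -1;
  };
}

// 3 and 4. RSS items and internal navigation list only curated entries.
function discoverable(entries: Entry[]): Entry[] {
  return entries.filter((e) => e.curated);
}

// Invented sample data.
const entries: Entry[] = [
  { slug: "foss-alternative-to-figma", curated: false },
  { slug: "example-curated", curated: true },
];

const keep = makeSitemapFilter(entries);
console.log(robotsContent(entries[0])); // "noindex, nofollow"
console.log(keep("https://example.com/foss-alternative-to-figma/")); // false
console.log(discoverable(entries).map((e) => e.slug)); // ["example-curated"]
```

Because every helper derives from the same flag, a page cannot end up in the sitemap while also carrying noindex, provided the flag itself is computed in one place.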

The category layer

Categories follow the same logic. A category is only indexable if it has at least two curated entries (CATEGORY_MIN_CURATED = 2). Categories below that threshold still generate pages — preserving any external links to category URLs — but they're noindex and excluded from the sitemap, homepage, and footer navigation.

Right now, only one category (customer-support) meets the threshold. That's the honest state of the data: the site has broad coverage but thin editorial depth across most categories. As the ETL runs and more entries cross the curation threshold, more categories will become indexable automatically.
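The category rule can be sketched as a count over curated entries. This is an illustration under assumptions: the `Entry` shape, the `curated` flag (standing in for `isCurated(entry)`), the helper names, and the sample data, including the "design" category, are invented. Only `CATEGORY_MIN_CURATED = 2` and the customer-support category name come from the site.

```typescript
const CATEGORY_MIN_CURATED = 2;

// Hypothetical simplified entry; `curated` stands in for isCurated(entry).
type Entry = { slug: string; category: string; curated: boolean };

// Count curated entries per category; non-curated entries still register
// the category (with a count of 0) so its page can be generated as noindex.
function curatedCountByCategory(entries: Entry[]): Record<string, number> {
  const counts: Record<string, number> = {};
  for (const e of entries) {
    counts[e.category] = (counts[e.category] ?? 0) + (e.curated ? 1 : 0);
  }
  return counts;
}

// Categories that meet the threshold and may be indexed.
function indexableCategories(entries: Entry[]): string[] {
  const counts = curatedCountByCategory(entries);
  return Object.keys(counts)
    .filter((name) => (counts[name] ?? 0) >= CATEGORY_MIN_CURATED)
    .sort();
}

// Invented sample data.
const sample: Entry[] = [
  { slug: "entry-a", category: "customer-support", curated: true },
  { slug: "entry-b", category: "customer-support", curated: true },
  { slug: "entry-c", category: "design", curated: true },
  { slug: "entry-d", category: "design", curated: false },
];

const indexable = indexableCategories(sample);
console.log(indexable); // ["customer-support"]
```

In this sample the "design" category has one curated entry, so its page would still be built but stay noindex until a second entry crosses the gate.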

What changes automatically

The gate is deterministic and evaluated at build time from live DB data. When foss-alternative-to-figma gains its fourth alternative and Claude Haiku generates a 90-character intro in the next nightly run, the following Astro build will automatically include it in the sitemap, remove its noindex tag, and add it to the relevant category card and footer link.

The only thing that doesn't update automatically is the duplicate threshold in astro.config.mjs. I'll eventually extract the constants to a shared JSON file that both curation.ts and astro.config.mjs read, eliminating the sync risk. For now the comment is the guard.
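One possible shape for that shared file, as a sketch only: the JSON below is inlined as a string to keep the example self-contained, whereas in practice both curation.ts and astro.config.mjs would read the same file from disk. The `parseThresholds` helper and the `Thresholds` type are hypothetical names, not existing code.

```typescript
// Hypothetical contents of a shared thresholds file.
const sharedJson = `{
  "MIN_ALTERNATIVES": 4,
  "MIN_TOP_STARS": 1000,
  "MIN_INTRO_LEN": 80
}`;

type Thresholds = {
  MIN_ALTERNATIVES: number;
  MIN_TOP_STARS: number;
  MIN_INTRO_LEN: number;
};

// Parse and validate, so a typo in the JSON throws at build time
// instead of silently changing which pages are hidden.
function parseThresholds(raw: string): Thresholds {
  const data = JSON.parse(raw) as Record<string, unknown>;
  const keys = ["MIN_ALTERNATIVES", "MIN_TOP_STARS", "MIN_INTRO_LEN"] as const;
  for (const k of keys) {
    const v = data[k];
    if (typeof v !== "number" || Math.floor(v) !== v || v < 0) {
      throw new Error(`Invalid or missing threshold: ${k}`);
    }
  }
  return data as unknown as Thresholds;
}

const sharedThresholds = parseThresholds(sharedJson);
console.log(sharedThresholds.MIN_ALTERNATIVES); // 4
```

With a single source of truth, the KEEP IN SYNC comment stops being the only guard: both consumers either see the same numbers or fail loudly.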


Part of an ongoing 6-month experiment running three AI-curated directory sites. The technical claims here are real; this article was AI-assisted.
