1 of 57 pages indexed. The other 56 are technically perfect. So why?

I caught instayolo.com on the drop in March 2025. 13 months later, 1 of 57 indexable pages on the site is in Google. The other 56 are sitting in two GSC buckets despite every technical signal being correct. The real bottleneck isn't crawlability or schema — it's the prior owner's reputation drag, and the only way out is one external high-trust link.

Caught a dropped domain in March 2025

I caught instayolo.com on the drop in March 2025. The previous owner had let the registration lapse. Short name, two real words, reads as a downloader. I bought it, sat on it for a couple of weeks, then started building.

1 of 57 indexed

13 months later, 1 of 57 indexable pages on the site is in Google. Just the homepage.

Of the 56 not indexed, 38 are in GSC's "Discovered – currently not indexed" bucket — Google saw the URL in the sitemap and decided not to bother crawling. 17 are in "Crawled – currently not indexed" — Google crawled the page, then quietly tossed the result. One straggler is "Page with redirect" (/story-viewer/story-downloader, a 308 shipped after merging two near-duplicate URLs into one).

Technical state, by category

- Sitemap: sitemap.xml gets served from a real app/sitemap.ts route with lastmod derived from git log instead of the lazy "now" value everybody ships (sketch after this list).
- Robots: robots.txt allows /, disallows only /api/.
- Headings: one H1 per page.
- Structured data: JSON-LD spans Organization, WebSite (with SearchAction, so the sitelinks search box is unlocked), SoftwareApplication, BreadcrumbList, Article, Person. Every block validates in the Rich Results Test.
- hreflang: x-default everywhere.
- Canonicals: absolute URLs.
- Headers: X-Robots-Tag is never set to noindex on an indexable host.
- Parity: the Googlebot UA gets the same 200 and the same byte size as the default UA. No cloaking, no soft 404s.
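For context, here's a minimal sketch of that sitemap route. The app/sitemap.ts convention is real Next.js App Router API; the getGitLastModified helper and the two-entry page list are illustrative stand-ins for the site's actual 57 URLs.

```typescript
// app/sitemap.ts — Next.js App Router sitemap route (sketch).
// getGitLastModified shells out to git at build time to get the last commit
// date for each page's source file, instead of stamping everything with "now".
import type { MetadataRoute } from 'next'
import { execSync } from 'node:child_process'

function getGitLastModified(filePath: string): Date {
  try {
    const iso = execSync(`git log -1 --format=%cI -- ${filePath}`).toString().trim()
    return iso ? new Date(iso) : new Date()
  } catch {
    // Fallback if git isn't available in the build environment.
    return new Date()
  }
}

export default function sitemap(): MetadataRoute.Sitemap {
  // Illustrative page list; the real site enumerates all indexable URLs.
  const pages = [
    { url: 'https://instayolo.com/', file: 'app/page.tsx' },
    { url: 'https://instayolo.com/blog', file: 'app/blog/page.tsx' },
  ]
  return pages.map((p) => ({
    url: p.url,
    lastModified: getGitLastModified(p.file),
    changeFrequency: 'weekly',
    priority: 0.8,
  }))
}
```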

Two different LLM agents audited the stack independently. Both came back clean.

So why.

The domain wasn't fresh

The previous owner had run a Shopify-flavored e-commerce site under instayolo.com — /products/..., /collections/..., /cart, /account/login — and every one of those was cached in Google's index when I took ownership. Search "instayolo" today and the brand entity in Google's knowledge graph is partially mine, partially the prior owner's. Google's quality classifier is still working out what this domain is, given the gap between its memory of the domain and what my actual content says it does.

This is the part SEO writeups skip. Fresh domains are predictable — sandbox, ride it out, build links, climb. Inherited domains with an active prior-owner footprint are different. The crawl scheduler treats you as worse than new: not unproven, but established-and-radically-changed. Authority signals from the prior site decay slowly, but they decay onto your pages now. The quality classifier eyes your content sideways for not matching the brand's history. Nobody warned me.

A week of compressed work follows.

Sweep the ghost URLs

GSC URL Removals tool, Directory mode (collapses 12 paths into 4 prefix patterns). Same patterns into Bing Webmaster's Block URLs. A removal buys roughly 6 months of suppression. Without it, plan on Google taking 12+ months to drop the prior owner's URLs, with ghost results haunting your brand SERP the whole time.

Eliminate every false signal

Tedious, but it earns its keep.

- The Article schema's publisher.logo pointed at /icon-512.png. That URL 404'd, because Next.js App Router generates /icon dynamically, without the -512.png suffix. Fixed it to point at the real route, with explicit 256×256 dimensions (sketch after this list).
- Nginx was returning <h1>301 Moved Permanently</h1> as the response body on www→apex redirects. Google ignores those bodies; Bing was treating it as page content and reporting the www host as "missing meta description." Added X-Robots-Tag: noindex, follow on the www server block, which tells engines never to index www separately and to consolidate everything into the apex.
- 38 meta_title strings rendered over 60 characters in real SERPs; trimmed. 12 meta_description strings ran over 160 characters; trimmed.
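A minimal sketch of the corrected publisher block, assuming the Article JSON-LD is built as a plain object before being serialized into a script tag. The headline is a placeholder; the logo URL and dimensions mirror the fix above.

```typescript
// Corrected publisher block inside the Article JSON-LD (sketch).
// /icon is the route the App Router actually serves; width/height are
// declared explicitly so the logo validates.
const articleSchema = {
  '@context': 'https://schema.org',
  '@type': 'Article',
  headline: 'Example post title', // placeholder
  publisher: {
    '@type': 'Organization',
    name: 'instayolo',
    logo: {
      '@type': 'ImageObject',
      url: 'https://instayolo.com/icon', // was /icon-512.png, which 404'd
      width: 256,
      height: 256,
    },
  },
};

// Serialized into the page head, e.g.:
// <script type="application/ld+json">{JSON.stringify(articleSchema)}</script>
```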

Performance

PSI mobile was Perf 78, TBT 530ms. The TBT was Google Tag Manager loading via afterInteractive. Switched it to lazyOnload. Perf 78 → 98, TBT 530 → 56ms. Then a Cloudflare Cache Rule for HTML — Cloudflare's default does not cache HTML, which I had assumed it did. (You have to opt in, via a Page Rule or Cache Rule.) TTFB 1.6s → 0.64s.
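The strategy change is a one-liner if GTM is loaded through next/script. A minimal sketch, assuming the container is wired up in the root layout; GTM-XXXXXXX is a placeholder container ID.

```typescript
// app/layout.tsx (excerpt) — Google Tag Manager via next/script.
// strategy="lazyOnload" defers the script to browser idle time instead of
// loading it right after hydration, which is what dropped TBT here.
import Script from 'next/script';
import type { ReactNode } from 'react';

export default function RootLayout({ children }: { children: ReactNode }) {
  return (
    <html lang="en">
      <body>
        {children}
        {/* was strategy="afterInteractive"; GTM-XXXXXXX is a placeholder ID */}
        <Script
          id="gtm"
          src="https://www.googletagmanager.com/gtm.js?id=GTM-XXXXXXX"
          strategy="lazyOnload"
        />
      </body>
    </html>
  );
}
```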

Author + entity hookup

Built /authors/torrance with a proper Person schema. Switched every Article schema's author.url from /about to /authors/{slug}. Added an "About the author" block at the foot of every blog post — template change, propagates to all 16 posts at once, no manual loop. Cross-linked old posts to newer ones. The "Crawled – not indexed" bucket was almost entirely older posts with no inbound internal links from fresh content, so the working theory is Google saw orphans and dropped them.
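A minimal sketch of the two schema pieces, assuming they're built as plain objects before serialization. The display name is an assumption; the URLs match the routes described above.

```typescript
// Person JSON-LD emitted on /authors/torrance (sketch).
const personSchema = {
  '@context': 'https://schema.org',
  '@type': 'Person',
  '@id': 'https://instayolo.com/authors/torrance#person',
  name: 'Torrance', // assumed display name
  url: 'https://instayolo.com/authors/torrance',
};

// Each Article schema's author now points at the author page instead of /about.
const articleAuthor = {
  '@type': 'Person',
  name: 'Torrance', // assumed display name
  url: 'https://instayolo.com/authors/torrance', // was https://instayolo.com/about
};
```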

Indexed count after all of that: still 1.

What nobody warned me

What nobody warned me about, and the reason this post exists, is that none of the technical work moves crawl budget on a reputation-drag domain. What moves it is one external high-trust link. Technical work matters for quality once Google decides to crawl. It doesn't move whether Google decides to crawl. Authority does that. Specifically, authority that is observably new and attached to current content. A single backlink from a domain Google trusts probably breaks the purgatory faster than two more weeks of internal optimization.

Hence this post.

If you've inherited a dropped domain and Google is still stuck on the prior owner, feel free to reach out. I'm happy to share receipts on the small fixes (commits, before/after PSI, GSC screenshots) and to compare notes.

Indexed pages: 1.
