Building an OG Previewer: Per-Platform Fallback Chains

#html #socialmedia #tooling #webdev

The first version of an OG previewer that anyone sketches reads the og: tags off a page and renders them. That design survives until the first real page hits it, because two assumptions in it are wrong. The first is that the tags are present. A large share of pages in the wild are missing at least one of og:title, og:description, or og:image. The second is that there is one right way to render them. There is not. Facebook, X, LinkedIn, and Slack each resolve the same set of tags through their own fallback chain, so the card a user sees depends on which platform built it.

So a previewer that wants to tell the truth cannot just print the tags. It has to model what each platform does when a tag is missing, then show you the four different rectangles those four chains produce from one page. This is a write-up of how we built that.

Fallback chains are data, not code

Each platform's resolution is an ordered preference list per field: try this tag, then that one, then the page-level fallback. The page-level fallbacks are just keys too. We write them as page:title and page:description so they sit in the same list as the tag keys. The chains we model look like this.

Facebook and LinkedIn share a chain:

Title: og:title, then page:title (the page's <title> text).
Description: og:description, then page:description (the page's meta description).
Image: og:image, with no fallback. If it is missing, there is no image.

X resolves its own layer first, then falls through to Open Graph, then to the page:

Title: twitter:title, then og:title, then page:title.
Description: twitter:description, then og:description, then page:description.
Image: twitter:image, then og:image.
Card type: twitter:card, else default to summary.

Slack follows the same Open Graph resolution as Facebook and LinkedIn for title and description. It also surfaces og:site_name above the card, and shows no image when og:image is missing.

The thing to notice is that these are the same shape: an ordered list of keys, walked until one produces a value, with the page-level fallbacks living in the list as page: keys. That shape is the whole design. Instead of a render function per platform with the fallbacks baked into branches, each chain is a list of keys, and one resolver walks any list. When a key starts with page: the resolver reads from the page dict instead of the tags:

def resolve(field_chain, tags, page):
    # field_chain: ordered keys to try, e.g.
    #   ["twitter:title", "og:title", "page:title"]
    # tags: extracted meta tags; page: {"title": ..., "description": ...}
    for key in field_chain:
        if key.startswith("page:"):
            value = page.get(key.split(":", 1)[1])
        else:
            value = tags.get(key)
        if value:
            return value
    return None

Adding a platform, or fixing a chain when a platform changes its behavior, is editing a list of keys, not rewriting render logic. All four previews come out of the same resolver fed four different chains.
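To make "four lists" concrete, here is a minimal sketch of the chains described above written out as a table. The names CHAINS, DEFAULTS, and build_card are illustrative assumptions, not the previewer's actual code, and resolve is repeated only so the block runs on its own.

```python
def resolve(field_chain, tags, page):
    # Same resolver as above: walk the keys until one produces a value.
    for key in field_chain:
        if key.startswith("page:"):
            value = page.get(key.split(":", 1)[1])
        else:
            value = tags.get(key)
        if value:
            return value
    return None


# Shared Open Graph chain used by Facebook and LinkedIn.
OG_CHAIN = {
    "title": ["og:title", "page:title"],
    "description": ["og:description", "page:description"],
    "image": ["og:image"],  # no fallback: missing means no image
}

# One row per platform (hypothetical layout of the table).
CHAINS = {
    "facebook": OG_CHAIN,
    "linkedin": OG_CHAIN,
    "x": {
        "title": ["twitter:title", "og:title", "page:title"],
        "description": ["twitter:description", "og:description", "page:description"],
        "image": ["twitter:image", "og:image"],
        "card": ["twitter:card"],
    },
    # Slack reuses the Open Graph chain and adds the site name.
    "slack": {**OG_CHAIN, "site_name": ["og:site_name"]},
}

# Values used when a whole chain comes up empty.
DEFAULTS = {"x": {"card": "summary"}}


def build_card(platform, tags, page):
    """Resolve every field of one platform's card from the same inputs."""
    card = {
        field: resolve(chain, tags, page)
        for field, chain in CHAINS[platform].items()
    }
    for field, default in DEFAULTS.get(platform, {}).items():
        if card.get(field) is None:
            card[field] = default
    return card
```

Under this layout, a new platform is one more entry in CHAINS, and a changed fallback order is a reordered list.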

og:image present is not og:image working

The single most common reason a card looks broken is the image, and the failure is sneaky because the tag is right there in the markup. og:image points at a URL, the validator that only reads tags says "image: present", and the card still renders empty. The tag being present and the image being reachable are two different facts, and only the second one matters to the platform.

So the previewer does not trust the markup. It makes a server-side HEAD request to the og:image URL, with a five-second timeout, following redirects, and flags any image that does not come back with a 2xx. That check catches the failures that the tag-presence check misses:

Relative URLs. og:image set to /img/card.png. It works wherever a browser resolves it against the page's host, and resolves to nothing when a platform crawler with no base context fetches it.
CDN hotlink blocks. The image exists, but the CDN refuses requests whose referer is not the origin site. Your browser passes, Facebook's fetcher gets a 403, and the card is grey.
Redirects to a login or interstitial. The image URL redirects to an HTML page, not an image. The HEAD request follows the redirect, and the response it ends up with is not the image the tag promised.
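A reachability probe along these lines could be sketched with the standard library alone. The function names and verdict strings here are hypothetical, and treating a non-image Content-Type as its own flag is an assumption layered on top of the 2xx rule described above, not a confirmed detail of the previewer.

```python
import urllib.error
import urllib.request
from urllib.parse import urlparse


def classify(status, content_type):
    """Turn a final status code and Content-Type header into a verdict."""
    if status is None or not 200 <= status < 300:
        return "unreachable"
    # Assumption: a 2xx that answers with HTML (e.g. a login page) is flagged.
    if not (content_type or "").lower().startswith("image/"):
        return "not_an_image"
    return "ok"


def check_image(url, timeout=5.0):
    """Probe an og:image URL with a HEAD request and return a verdict string."""
    parts = urlparse(url)
    # A crawler has no base URL, so a relative og:image cannot be fetched.
    if parts.scheme not in ("http", "https") or not parts.netloc:
        return "relative_or_invalid_url"

    request = urllib.request.Request(url, method="HEAD")
    try:
        # urlopen follows redirects by default. Depending on the Python
        # version, the follow-up request after a redirect may be sent as GET;
        # only the response headers are read here either way.
        with urllib.request.urlopen(request, timeout=timeout) as response:
            return classify(response.status, response.headers.get("Content-Type"))
    except urllib.error.HTTPError as err:
        # 4xx and 5xx responses arrive as exceptions, e.g. a CDN's 403.
        return classify(err.code, None)
    except OSError:
        # DNS failures, refused connections, and timeouts all land here.
        return "unreachable"
```

Keeping classify separate from the network call means the verdict logic can be exercised without touching a real server.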

What the check deliberately does not do is fully validate the image. A HEAD request confirms the URL answers and roughly what it answers with; it does not download the bytes, decode the image, and measure its dimensions. Confirming an image is actually 1200 by 630 means fetching and decoding it, which is a heavier operation than a header probe and a different kind of check. So we recommend 1200 by 630 (the 1.91 to 1 ratio platforms crop to) as guidance, and we flag what a header check can actually catch: the URL that does not resolve at all. That honesty matters more than pretending the tool measures something it does not.

A severity model with two real levels

Not every missing tag is equally bad, and the report should say so. The split we settled on maps to a real distinction: will the card look wrong, or will it render correctly but in a downgraded form?

Errors: missing og:title, og:description, og:image, or og:url. These are the load-bearing tags. Miss one and the card is either broken or built entirely from a guess.
Warnings: missing twitter:card. X still renders the card, but it falls back to the default summary card (the small-thumbnail layout) when you probably wanted summary_large_image. The card works; it is smaller and less prominent than intended. The warning text says exactly that: X will use a default card type.

The line we drew is between "the card will look wrong" and "the card will render in a less ideal form." A missing og:image is the first kind, an empty rectangle, so it is an error. A missing twitter:card is the second kind, a working but downgraded card, so it is a warning. Keeping those two states distinct is what lets someone reading the report triage: fix the errors before a share goes out, schedule the warnings.
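The two-level split is small enough to sketch in a few lines. The names REQUIRED_TAGS and audit and the shape of the report are illustrative assumptions; only the tag lists and the warning wording come from the description above.

```python
# Tags whose absence is an error, per the split described above.
REQUIRED_TAGS = ("og:title", "og:description", "og:image", "og:url")


def audit(tags):
    """Return a report with errors and warnings for one page's tags."""
    report = {"errors": [], "warnings": []}

    for key in REQUIRED_TAGS:
        # An empty content attribute is treated the same as a missing tag.
        if not tags.get(key):
            report["errors"].append(f"missing {key}")

    if not tags.get("twitter:card"):
        report["warnings"].append(
            "missing twitter:card: X will use a default card type"
        )

    return report
```

A page with every Open Graph tag but no twitter:card would come back with an empty errors list and a single warning, which is the "schedule it" case rather than the "fix before sharing" case.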

Parsing hostile HTML

The clean case, a well-formed head with tidy meta tags, is the case that never causes bugs. Real pages are messier, and a previewer that only handles the clean case is a demo, not a tool. The pages we have to survive include:

Meta tags placed in the body instead of the head, because a CMS or a tag manager injected them after the fact.
Pages with no <head> element at all, just tags floating in malformed markup.
The same property declared two or three times, an og:image from the theme and another from an SEO plugin, with no agreement on which wins.

The rule we follow is to parse leniently and resolve the way a platform would, not the way a spec says a document should be. We collect meta tags wherever they appear, not only inside the head, because the platform crawlers are lenient too and a tag in the body still gets used. For duplicated properties we apply one fixed rule for which declaration wins rather than raising an error, so the same page always produces the same preview. The goal is never to grade the HTML. It is to render the card the platform would render from that HTML, warts and all, so what you see in the previewer is what your audience will see.
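As a rough illustration of lenient collection, the sketch below uses Python's html.parser, which does not require a well-formed document. The class and function names are hypothetical, and "first declaration wins" is an assumption chosen for the example; the rule a real platform applies to duplicates may differ.

```python
from html.parser import HTMLParser


class MetaCollector(HTMLParser):
    """Collect meta tags and the page title wherever they appear."""

    def __init__(self):
        super().__init__(convert_charrefs=True)
        self.tags = {}
        self.page = {"title": None, "description": None}
        self._in_title = False
        self._title_parts = []

    def handle_starttag(self, tag, attrs):
        if tag == "title" and self.page["title"] is None:
            self._in_title = True
            return
        if tag != "meta":
            return
        attr = {name: (value or "") for name, value in attrs}
        # Open Graph uses property=, Twitter and description use name=.
        key = (attr.get("property") or attr.get("name") or "").strip().lower()
        content = attr.get("content", "").strip()
        if not key or not content:
            return
        if key == "description":
            if self.page["description"] is None:
                self.page["description"] = content
        else:
            # Assumed duplicate policy: keep the first declaration seen.
            self.tags.setdefault(key, content)

    def handle_data(self, data):
        if self._in_title:
            self._title_parts.append(data)

    def handle_endtag(self, tag):
        if tag == "title" and self._in_title:
            self._in_title = False
            self.page["title"] = "".join(self._title_parts).strip() or None


def extract(html):
    """Return (tags, page) in the shape the resolver expects."""
    collector = MetaCollector()
    collector.feed(html)
    collector.close()
    return collector.tags, collector.page
```

Because the collector never checks where a tag sits in the tree, a meta tag in the body or in a document with no head element is picked up the same way as one in a tidy head.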

Try it

The OG Previewer runs everything above on a single URL: four platform cards from four fallback chains, the server-side image reachability check, and the error and warning report. The tag audit underneath it pairs naturally with the Meta Tag Analyzer, which scores the full tag set rather than just the share-card subset, if you want the complete head graded in one pass.

If there is one idea here worth carrying into something you build, it is modeling per-consumer behavior as ordered data instead of branching code. The four platforms looked like four rendering problems. They turned out to be one resolver and four lists. Any time you find yourself writing a function per consumer with the differences baked into if-statements, check whether the differences are really just data the consumers disagree on. Usually they are, and pulling them out into a table makes the next consumer a row instead of a rewrite.

Mehul Jain is an AI entrepreneur and product builder. He works on Geology, a GEO platform.