You've seen it a thousand times: you paste a URL into Slack, Discord, or iMessage and it blooms into a tidy card with a title, an image, and a description. It's one of the highest-trust little UI elements you can add to a chat app, a comment box, a CMS, or a bookmarking tool. A bare URL looks like spam. An unfurled link looks legit — and gets clicked.
So you decide to build it. How hard can it be? You fetch the page, grab a few meta tags, done by lunch.
Then reality shows up.
## What a link preview actually is
The card is built from Open Graph tags in the page's `<head>`:
```html
<meta property="og:title" content="The Verge" />
<meta property="og:description" content="Technology, science and culture." />
<meta property="og:image" content="https://.../cover.png" />
```
Read those, render a card. Simple — in the demo. The pain is everything *between* "fetch the page" and "read the tags."
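To make the "read the tags" step concrete, here is a minimal sketch that pulls `<meta>` key/value pairs out of raw HTML. The function names `extractMetaTags` and `readAttribute` are made up for illustration, and the regex approach is an assumption that only holds for simple markup: it does not decode HTML entities or cope with a `>` inside an attribute value, so a real HTML parser is the safer choice in production.

```javascript
// Minimal sketch: collect <meta property|name="..." content="..."> pairs from raw HTML.
// Assumption: regex matching is good enough for a demo, not for arbitrary pages.
function extractMetaTags(html) {
  const tags = new Map();
  for (const [tag] of html.matchAll(/<meta\b[^>]*>/gi)) {
    // Open Graph uses `property`; Twitter Card and description tags use `name`.
    const key = (readAttribute(tag, "property") ?? readAttribute(tag, "name"))?.toLowerCase();
    const content = readAttribute(tag, "content");
    // Keep the first occurrence of each key and ignore later duplicates.
    if (key && content !== null && !tags.has(key)) {
      tags.set(key, content);
    }
  }
  return Object.fromEntries(tags);
}

// Reads one quoted attribute value from a single tag string, or null if absent.
function readAttribute(tag, name) {
  const match = tag.match(new RegExp(`\\s${name}\\s*=\\s*(["'])(.*?)\\1`, "i"));
  return match ? match[2] : null;
}
```

Calling `extractMetaTags(html)` on the snippet above would give an object keyed by `og:title`, `og:description`, and `og:image`.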
## The five things that break your weekend scraper
1. **Redirects.** `t.co`, `bit.ly`, and `http → https` mean the URL you were handed isn't the page you parse. You have to follow them.
2. **Missing tags.** Tons of sites have no `og:*` at all. You need a fallback chain: Open Graph → Twitter Card → `<title>`/`meta description` → first `<h1>`/`<img>`.
3. **Relative image URLs.** `og:image` is often `/img/cover.png`, not a full URL. Resolve it against the final page URL or the image just won't load.
4. **Timeouts and giant pages.** A slow or multi-megabyte page will hang your request or eat your memory. You need a hard timeout and a byte cap.
5. **SSRF — the dangerous one.** If users submit the URLs, an attacker can point you at `http://169.254.169.254` (cloud metadata) or `http://localhost` to reach internal services. You must block private/loopback/link-local IPs — on **every** redirect hop, not just the first.
That last one is why "just scrape it yourself" quietly becomes a security review. Link-preview features are a classic SSRF vector in real apps.
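The per-hop check from item 5 might look roughly like this. It is a minimal sketch covering IPv4 only: `isBlockedIPv4`, `assertSafeHop`, and the injected `resolveHost` function are hypothetical names, and IPv6 ranges (such as `::1` and `fc00::/7`) would need an equivalent check.

```javascript
// Returns true when an IPv4 address is in a range a preview fetcher should refuse.
function isBlockedIPv4(ip) {
  const parts = ip.split(".");
  const isDottedQuad =
    parts.length === 4 && parts.every((p) => /^\d{1,3}$/.test(p) && Number(p) <= 255);
  if (!isDottedQuad) return true; // Anything unexpected fails closed.

  const [a, b] = parts.map(Number);
  return (
    a === 0 ||                             // 0.0.0.0/8 "this network"
    a === 10 ||                            // 10.0.0.0/8 private
    a === 127 ||                           // 127.0.0.0/8 loopback
    (a === 100 && b >= 64 && b <= 127) ||  // 100.64.0.0/10 carrier-grade NAT
    (a === 169 && b === 254) ||            // 169.254.0.0/16 link-local (cloud metadata)
    (a === 172 && b >= 16 && b <= 31) ||   // 172.16.0.0/12 private
    (a === 192 && b === 168) ||            // 192.168.0.0/16 private
    a >= 224                               // multicast and reserved
  );
}

// Validates a single hop. `resolveHost` is a hypothetical injected function that
// returns an array of IPv4 strings for a hostname (for example, built on Node's dns module).
async function assertSafeHop(url, resolveHost) {
  const { protocol, hostname } = new URL(url);
  if (protocol !== "http:" && protocol !== "https:") {
    throw new Error(`Blocked scheme: ${protocol}`);
  }
  const addresses = await resolveHost(hostname);
  if (addresses.length === 0 || addresses.some(isBlockedIPv4)) {
    throw new Error(`Blocked host: ${hostname}`);
  }
}
```

The idea is to call `assertSafeHop` on the submitted URL and again on every `Location` target, which means following redirects manually (for example with `redirect: "manual"` in a server-side `fetch`) instead of letting the HTTP client chase them for you.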
## The shortcut: one request, clean JSON
If you'd rather not own all of that, hand the URL to a service that already handles it and get structured data back:
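```javascript
// targetUrl is the link you want to preview.
const res = await fetch(
  "https://link-preview14.p.rapidapi.com/preview?url=" +
    encodeURIComponent(targetUrl),
  {
    headers: {
      "X-RapidAPI-Key": "YOUR_KEY",
      "X-RapidAPI-Host": "link-preview14.p.rapidapi.com",
    },
  },
);
// fetch only rejects on network failure, so check the HTTP status explicitly.
if (!res.ok) throw new Error(`Preview request failed: ${res.status}`);
const preview = await res.json();
```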
You get back:
```json
{
"resolvedUrl": "https://www.theverge.com/",
"title": "The Verge",
"description": "The Verge is about technology and how it makes us feel.",
"image": "https://www.theverge.com/static-assets/og-image.png",
"favicon": "https://www.theverge.com/favicon.ico",
"siteName": "The Verge",
"themeColor": "#5200ff"
}
```
Redirects followed, relative URLs resolved, fallbacks applied, private IPs blocked, responses cached. Every missing field comes back as `null`, so the shape never changes and your render code stays boring (the good kind).
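As a rough illustration of what "boring" render code can look like with a fixed shape, here is a minimal sketch that builds card HTML and skips any field that is `null`. The field names come from the sample response above; `renderCard`, `escapeHtml`, and the `link-card` class are hypothetical names, and the sketch assumes `resolvedUrl` is always an `http(s)` URL.

```javascript
// Escapes text before it is interpolated into HTML.
function escapeHtml(text) {
  return String(text)
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

// Builds the card markup, omitting any optional field that came back as null.
function renderCard(preview) {
  const parts = [];
  if (preview.image) {
    parts.push(`<img src="${escapeHtml(preview.image)}" alt="">`);
  }
  // Fall back to the resolved URL when the page had no usable title.
  parts.push(`<strong>${escapeHtml(preview.title ?? preview.resolvedUrl)}</strong>`);
  if (preview.description) {
    parts.push(`<p>${escapeHtml(preview.description)}</p>`);
  }
  if (preview.siteName) {
    parts.push(`<small>${escapeHtml(preview.siteName)}</small>`);
  }
  return `<a class="link-card" href="${escapeHtml(preview.resolvedUrl)}">${parts.join("")}</a>`;
}
```

Because every key is always present, the function needs no existence checks beyond the `null` tests, and a page with no metadata still renders as a plain linked URL.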
## When to build vs. when to buy
Building it yourself is totally reasonable if you control the input — a fixed list of trusted URLs, internal pages, that kind of thing. The moment URLs are **user-submitted**, the edge cases and the SSRF surface make a purpose-built endpoint the faster, safer call.
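For that trusted-input case, the fallback chain and relative-URL resolution from the list above can be sketched as one pure function. This is an illustration under assumptions, not a reference implementation: `buildPreview` and `resolveAgainst` are hypothetical names, `tags` is assumed to be a flat object of lowercased meta keys, and `pageTitle`, `firstHeading`, and `firstImage` are made-up keys standing in for values you would pull from `<title>`, the first `<h1>`, and the first `<img>`.

```javascript
// Picks each field from the first source that has a non-empty value:
// Open Graph, then Twitter Card, then plain HTML fallbacks.
function buildPreview(tags, finalUrl) {
  const firstOf = (...keys) =>
    keys.map((key) => tags[key]).find((v) => typeof v === "string" && v.trim() !== "") ?? null;

  const image = firstOf("og:image", "twitter:image", "firstImage");
  return {
    resolvedUrl: finalUrl,
    title: firstOf("og:title", "twitter:title", "pageTitle", "firstHeading"),
    description: firstOf("og:description", "twitter:description", "description"),
    // Resolve against the final (post-redirect) URL so "/img/cover.png" becomes absolute.
    image: image === null ? null : resolveAgainst(image, finalUrl),
  };
}

// Returns an absolute URL string, or null when the value cannot be parsed.
function resolveAgainst(maybeRelative, baseUrl) {
  try {
    return new URL(maybeRelative, baseUrl).href;
  } catch {
    return null; // Unparseable value: treat it as missing.
  }
}
```

Passing the final URL (not the one the user pasted) as `finalUrl` matters here, since a redirect to a different host changes what a relative image path resolves to.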
If you want to skip the weekend, the API above is on RapidAPI with a free tier (no card): **[Link Preview API](https://rapidapi.com/bpmcginley/api/link-preview14)**. Full write-up with the field reference is [here](https://linkpreviewapi.onrender.com/blog/how-to-generate-link-previews.html).
What edge case bit you hardest building link previews? I'll add the bad ones to the article.