DEV Community

DarkPancakes

I built a Screenshot & Metadata API that extracts 50+ fields from any URL

I built a Screenshot/PDF API that does more than just screenshots. The metadata endpoint extracts 50+ fields from any URL — OG tags, Twitter Card, JSON-LD, content analysis, and more.

What it does

  • URL to Screenshot — capture any webpage as PNG, JPEG, or WebP
  • URL to PDF — generate PDFs with custom format, margins, orientation
  • Metadata Extraction — 50+ fields from any URL (see below)
  • HTML to Image — render custom HTML/CSS to PNG/JPEG/WebP

The Metadata Endpoint

This is what makes it different. A single GET request extracts:

Basic SEO:
title, description, keywords, author, language, charset, viewport, robots, canonical URL, generator

Open Graph:
og:title, og:description, og:image (+ dimensions), og:url, og:type, og:site_name, og:locale

Twitter Card:
card type, title, description, image, @site, @creator

Icons & Theme:
favicon, apple-touch-icon, manifest, theme-color, color-scheme

Content Analysis:
first h1 text, h2 count, internal links count, external links count, images count, images without alt text, forms count, scripts count, stylesheets count, word count

Structured Data:
JSON-LD (Schema.org) parsed and returned

Feeds:
RSS/Atom feeds auto-detected

Raw dump:
All meta tags as key-value pairs

Quick Start (Python)

import requests

# RapidAPI authentication headers (replace YOUR_KEY with your own key)
headers = {
    "X-RapidAPI-Key": "YOUR_KEY",
    "X-RapidAPI-Host": "screenshot-pdf-api.p.rapidapi.com"
}

# Screenshot a website
response = requests.get(
    "https://screenshot-pdf-api.p.rapidapi.com/v1/screenshot",
    headers=headers,
    params={"url": "https://github.com", "width": 1280, "format": "png"},
    timeout=60,
)
response.raise_for_status()  # fail loudly instead of saving an error body as a PNG

with open("screenshot.png", "wb") as f:
    f.write(response.content)

print(f"Saved {len(response.content)} bytes")
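
The metadata endpoint can be called from Python in much the same way. Here is a minimal sketch that uses only the standard library. The `{"data": {...}}` envelope and the `title`, `og_image`, and `word_count` keys are taken from the JavaScript quick start in this post; the helper names `build_metadata_request`, `pick_fields`, and `fetch_metadata` are made up for illustration and are not part of the API.

```python
import json
import urllib.parse
import urllib.request

API_HOST = "screenshot-pdf-api.p.rapidapi.com"


def build_metadata_request(target_url: str, api_key: str) -> urllib.request.Request:
    """Build (but do not send) a GET request for /v1/metadata."""
    # urlencode percent-encodes the target URL so it survives as a query value
    query = urllib.parse.urlencode({"url": target_url})
    return urllib.request.Request(
        f"https://{API_HOST}/v1/metadata?{query}",
        headers={"X-RapidAPI-Key": api_key, "X-RapidAPI-Host": API_HOST},
        method="GET",
    )


def pick_fields(payload: dict, names=("title", "og_image", "word_count")) -> dict:
    """Pull selected fields out of the {"data": {...}} envelope.

    Fields that are absent come back as None instead of raising.
    """
    data = payload.get("data") or {}
    return {name: data.get(name) for name in names}


def fetch_metadata(target_url: str, api_key: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON body (this performs a network call)."""
    request = build_metadata_request(target_url, api_key)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

With a valid key, `pick_fields(fetch_metadata("https://github.com", "YOUR_KEY"))` should return a small dict with those three fields. Keeping the request builder separate from the network call makes it easy to inspect the final URL before spending quota.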

Quick Start (JavaScript)

// Shared RapidAPI headers (replace YOUR_KEY with your own key)
const headers = {
  "X-RapidAPI-Key": "YOUR_KEY",
  "X-RapidAPI-Host": "screenshot-pdf-api.p.rapidapi.com"
};
const base = "https://screenshot-pdf-api.p.rapidapi.com/v1";
// Encode the target URL so its own ":" and "/" are safe inside the query string
const target = encodeURIComponent("https://github.com");

// Screenshot (top-level await requires an ES module or an async function)
const response = await fetch(`${base}/screenshot?url=${target}&format=png`, { headers });
if (!response.ok) throw new Error(`Screenshot failed: ${response.status}`);
const blob = await response.blob();

// Metadata
const meta = await fetch(`${base}/metadata?url=${target}`, { headers });
if (!meta.ok) throw new Error(`Metadata failed: ${meta.status}`);
const data = await meta.json();
console.log(data.data.title); // "GitHub · Build and ship software..."
console.log(data.data.og_image); // "https://..."
console.log(data.data.word_count); // 834

cURL

# Screenshot
curl -o screenshot.png \
  -H "X-RapidAPI-Key: YOUR_KEY" \
  -H "X-RapidAPI-Host: screenshot-pdf-api.p.rapidapi.com" \
  "https://screenshot-pdf-api.p.rapidapi.com/v1/screenshot?url=https://github.com"

# Full page capture
curl -o fullpage.png \
  -H "X-RapidAPI-Key: YOUR_KEY" \
  -H "X-RapidAPI-Host: screenshot-pdf-api.p.rapidapi.com" \
  "https://screenshot-pdf-api.p.rapidapi.com/v1/screenshot?url=https://en.wikipedia.org&full_page=true"

Endpoints

| Endpoint | Description | Tier |
|---|---|---|
| GET /v1/screenshot | Screenshot URL to PNG/JPEG/WebP | Free |
| GET /v1/health | API status & queue depth | Free |
| GET /v1/pdf | Generate PDF from URL | Basic |
| GET /v1/metadata | Extract 50+ metadata fields | Basic |
| POST /v1/screenshot/html | Render HTML/CSS to image | Pro |

Screenshot Parameters

| Param | Default | Description |
|---|---|---|
| url | required | URL to capture |
| width | 1280 | Viewport width (px) |
| height | 800 | Viewport height (px) |
| format | png | png, jpeg, webp |
| quality | 85 | JPEG/WebP quality (1-100) |
| full_page | false | Capture entire scrollable page |
| delay | 0 | Wait N seconds before capture (0-5) |
| selector | null | CSS selector to capture specific element |
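
The documented defaults and ranges can be checked on the client before a request is sent, which avoids spending quota on calls that are likely to be rejected. This is a rough sketch under some assumptions: `screenshot_query` is a hypothetical helper, booleans are sent as `true`/`false` as in the cURL example, and whether the API itself rejects or clamps out-of-range values is not something this post states.

```python
from urllib.parse import urlencode

ALLOWED_FORMATS = {"png", "jpeg", "webp"}


def screenshot_query(url, width=1280, height=800, format="png", quality=85,
                     full_page=False, delay=0, selector=None):
    """Validate against the documented ranges and return an encoded query string."""
    if not url:
        raise ValueError("url is required")
    if format not in ALLOWED_FORMATS:
        raise ValueError(f"format must be one of {sorted(ALLOWED_FORMATS)}")
    if not 1 <= quality <= 100:
        raise ValueError("quality must be between 1 and 100")
    if not 0 <= delay <= 5:
        raise ValueError("delay must be between 0 and 5 seconds")

    params = {
        "url": url,
        "width": width,
        "height": height,
        "format": format,
        "full_page": "true" if full_page else "false",
        "delay": delay,
    }
    if format != "png":
        params["quality"] = quality  # quality is documented for JPEG/WebP only
    if selector is not None:
        params["selector"] = selector
    return urlencode(params)
```

The result can be appended to `https://screenshot-pdf-api.p.rapidapi.com/v1/screenshot?`, for example `screenshot_query("https://github.com", format="jpeg", quality=70, selector="#main")`.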

Use Cases

  • Social media previews — generate Open Graph images
  • PDF reports — convert dashboards and pages to PDF
  • Web scraping — screenshot + metadata in one call
  • Thumbnails — generate website thumbnails at scale
  • SEO auditing — check OG tags, missing alt text, structured data
  • Link previews — build rich preview cards
  • Visual regression testing — automated screenshots for QA
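
To illustrate the SEO auditing use case, here is a small sketch that turns a metadata response into a list of findings. Only `title` and `og_image` appear as response keys in this post; the other key names (`description`, `og_title`, `og_description`, `images_without_alt`, `json_ld`) are guesses based on the field list above, so check them against a real response before relying on them. `audit_metadata` is a hypothetical helper, not part of the API.

```python
def audit_metadata(data: dict) -> list[str]:
    """Return human-readable findings for the inner "data" dict of a metadata response."""
    findings = []

    # Tags that link previews and search snippets usually depend on
    # (key names other than title and og_image are assumptions)
    for field in ("title", "description", "og_title", "og_description", "og_image"):
        if not data.get(field):
            findings.append(f"missing {field}")

    # Accessibility: images that lack alt text (assumed key name)
    missing_alt = data.get("images_without_alt") or 0
    if missing_alt > 0:
        findings.append(f"{missing_alt} image(s) without alt text")

    # Structured data: empty or absent JSON-LD (assumed key name)
    if not data.get("json_ld"):
        findings.append("no JSON-LD structured data")

    return findings
```

An empty list means none of these checks fired. Pass it the inner object, for example `audit_metadata(response_json["data"])`.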

Pricing vs Competitors

| Feature | This API | ScreenshotOne | URLBox |
|---|---|---|---|
| Free tier | 20/day | 100 one-time | None |
| Basic plan | $9/mo | $17/mo | $19/mo |
| Metadata extraction | 50+ fields | No | No |
| JSON-LD parsing | Yes | No | No |
| Content analysis | Yes | No | No |

Try it

Screenshot & PDF API on RapidAPI

Postman Collection

Built with FastAPI + Playwright (headless Chromium). Hosted on a Hetzner VPS.

What other metadata fields would be useful to extract? Let me know in the comments!
