How I Built a Screenshot API with Device Frames and AI Cleanup

#webdev #api #tutorial #javascript

The Problem

I needed website screenshots for social cards, docs, and marketing. Puppeteer works, but running it yourself means dealing with:

Server setup and maintenance
Handling timeouts, cookie banners, lazy loading
Device frame overlays (separate image processing pipeline)
Consistent quality across different sites

So I built GrabShot, a hosted API that handles all of this in one call.

How It Works

curl "https://grabshot.dev/v1/screenshot?url=https://example.com&device=iphone15&aiCleanup=true&apiKey=gs_your_key"

Returns a device-framed, cleaned-up screenshot in ~3 seconds.
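The same request can be made from Node. This is a minimal sketch rather than an official client: the endpoint and query parameters are the ones from the curl call above, while the helper names (buildScreenshotUrl, grabScreenshot) are made up for illustration, and it assumes the response body is the image itself.

```javascript
// Build the request URL. URLSearchParams percent-encodes the target URL,
// so query strings inside it cannot leak into the API call.
function buildScreenshotUrl({ url, device, aiCleanup, apiKey }) {
  const params = new URLSearchParams({
    url,
    device,
    aiCleanup: String(aiCleanup),
    apiKey,
  });
  return `https://grabshot.dev/v1/screenshot?${params}`;
}

// Fetch the screenshot bytes (Node 18+ ships a global fetch).
async function grabScreenshot(options) {
  const response = await fetch(buildScreenshotUrl(options));
  if (!response.ok) {
    throw new Error(`Screenshot request failed: HTTP ${response.status}`);
  }
  // Assumption: the body is the image, not a JSON envelope.
  return Buffer.from(await response.arrayBuffer());
}

const requestUrl = buildScreenshotUrl({
  url: 'https://example.com',
  device: 'iphone15',
  aiCleanup: true,
  apiKey: 'gs_your_key',
});
```

Building the query with URLSearchParams matters once the target URL has its own query string; pasted raw into the curl form, its `&` characters would be read as extra API parameters.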

The Stack

Express for the API server
Puppeteer for headless Chrome rendering
Sharp for image processing and device frame compositing
Gemini 2.0 Flash for AI cleanup (identifying and removing popups)
SQLite for user data and rate limiting
Stripe for billing
Caddy for reverse proxy and auto-SSL

Key Technical Decisions

Device Frames

I pre-render device frame templates at multiple resolutions. When a screenshot is captured, Sharp composites the screenshot into the frame at the correct position and scale. This is much faster than doing it client-side.
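The "correct position and scale" part is mostly geometry. Here is a rough sketch of how it could be computed, assuming each frame template records the pixel rectangle of its screen cutout. The template shape and the fitIntoScreen helper are hypothetical, not GrabShot's actual code.

```javascript
// Fit a screenshot into the frame's screen cutout, preserving aspect ratio
// and centering it on whichever axis has leftover space. Units are pixels.
function fitIntoScreen(shot, screen) {
  const scale = Math.min(screen.width / shot.width, screen.height / shot.height);
  const width = Math.round(shot.width * scale);
  const height = Math.round(shot.height * scale);
  return {
    width,
    height,
    left: screen.left + Math.floor((screen.width - width) / 2),
    top: screen.top + Math.floor((screen.height - height) / 2),
  };
}

// Hypothetical template: the screen cutout starts at (100, 200)
// and measures 800x1600 inside the frame image.
const iphoneTemplate = { screen: { left: 100, top: 200, width: 800, height: 1600 } };
const placement = fitIntoScreen({ width: 1200, height: 2400 }, iphoneTemplate.screen);
```

With Sharp, the resulting width and height would go to resize() on the screenshot, and left and top to the overlay options of composite() on the frame image.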

AI Cleanup

The AI cleanup feature sends the page to Gemini with instructions to identify overlay elements (cookie banners, chat widgets, notification popups), then injects CSS to hide them before capturing. It adds ~1-2 seconds, but the result is dramatically cleaner.
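The CSS injection step could look something like the sketch below. It assumes the model's answer has already been parsed into a list of CSS selectors; that format, the sample selectors, and the buildHideCss helper are illustrative assumptions, not the real prompt or response shape.

```javascript
// Turn a list of CSS selectors into one rule that hides all of them.
function buildHideCss(selectors) {
  const safe = selectors
    .map((s) => s.trim())
    // Drop empty entries and anything containing braces, which could
    // break out of the rule and inject arbitrary CSS.
    .filter((s) => s.length > 0 && !/[{}]/.test(s));
  if (safe.length === 0) return '';
  return `${safe.join(',\n')} { display: none !important; }`;
}

// Hypothetical model output for a page with a cookie banner and a chat widget,
// plus two junk entries that get filtered out.
const hideCss = buildHideCss(['#cookie-banner', '.chat-widget', '  ', 'a{}b']);
```

In Puppeteer, a string like this can be applied with page.addStyleTag({ content: hideCss }) before calling page.screenshot(). Filtering the selectors is a small safeguard, since model output should be treated as untrusted input.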

Rate Limiting

Each tier gets a monthly quota tracked in SQLite. The free tier gets 25 screenshots per month. The limiter is per-API-key, not per-IP, which is simpler and more predictable.
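To make the per-key monthly quota concrete, here is a small sketch of the counting logic. It swaps the SQLite table for an in-memory Map so it runs on its own. Only the free-tier limit of 25 comes from this post; the tryConsume helper and the UTC month bucketing are assumptions.

```javascript
// In-memory stand-in for the SQLite table: "apiKey|YYYY-MM" -> screenshots used.
const usage = new Map();
const MONTHLY_QUOTA = { free: 25 }; // other tiers omitted; their limits are not stated here

// Bucket usage by calendar month in UTC, e.g. "2025-01".
function monthKey(date) {
  return date.toISOString().slice(0, 7);
}

// Count one screenshot against the key's quota if there is room left.
function tryConsume(apiKey, tier, now = new Date()) {
  const limit = MONTHLY_QUOTA[tier];
  if (limit === undefined) throw new Error(`Unknown tier: ${tier}`);
  const key = `${apiKey}|${monthKey(now)}`;
  const used = usage.get(key) ?? 0;
  if (used >= limit) return { allowed: false, remaining: 0 };
  usage.set(key, used + 1);
  return { allowed: true, remaining: limit - used - 1 };
}
```

Because the month is part of the lookup key, quotas reset on their own when the month rolls over, with no cleanup job needed for correctness. In a real database the check and the increment would need to happen in one transaction so concurrent requests cannot both slip past the limit.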

Try It

Free tier (25 screenshots/month, no credit card): grabshot.dev
API docs: grabshot.dev/docs
Interactive demo: grabshot.dev/try.html

Would love to hear what you'd use this for!
