
Boehner
How I give my AI agents eyes with a single API call

My AI agent was blind.

It could read text, write code, call APIs — but the moment I asked it to work with a webpage, it hit a wall. "Go check if this landing page looks broken." "Tell me what the pricing page says now." "Monitor this competitor's homepage for changes." All blocked.

The obvious fix: give it a browser. The actual experience: install Puppeteer, debug the Chrome binary path, hit memory limits in Lambda, watch it break on every third-party CDN that detects headless browsers. An afternoon of yak-shaving every time.

I built SnapAPI to fix this.

What SnapAPI is

A REST API that wraps a headless browser. You send a URL, you get back a screenshot, PDF, or structured page data. No Puppeteer, no containers, no Chrome binary management.

A few lines of Python vs. a weekend of DevOps:

import requests

resp = requests.get(
    "https://snapapi.tech/v1/analyze",
    params={"url": "https://example.com"},
    headers={"X-API-Key": "YOUR_KEY"}
)
data = resp.json()
print(data["title"])       # "Example Domain"
print(data["text_summary"]) # "This domain is for use in illustrative examples..."

The three calls I use constantly

1. Analyze — structured page intelligence

This is the workhorse for AI pipelines. Instead of dumping raw HTML into a context window, I use /v1/analyze to get a clean, token-efficient JSON summary:

curl "https://snapapi.tech/v1/analyze?url=https://news.ycombinator.com" \
  -H "X-API-Key: YOUR_KEY"

Response:

{
  "url": "https://news.ycombinator.com",
  "title": "Hacker News",
  "description": "Links to stuff",
  "headings": [
    { "level": 1, "text": "Hacker News" }
  ],
  "links": [
    { "text": "new", "href": "https://news.ycombinator.com/newest" },
    { "text": "past", "href": "https://news.ycombinator.com/front" },
    { "text": "comments", "href": "https://news.ycombinator.com/newcomments" }
  ],
  "text_summary": "Ask HN: What are you working on? | 312 comments\nShow HN: ...",
  "load_time_ms": 847
}

Feed that text_summary to GPT-4 instead of the raw HTML. You go from 150k tokens of angle brackets to 2k tokens of actual content.
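One way to do that in practice is to flatten the analyze JSON into a compact prompt before handing it to the model. A minimal sketch, using the field names from the response above (the prompt wording itself is just one option):

```python
def analyze_to_prompt(page: dict, question: str) -> str:
    """Build a small, token-efficient prompt from /v1/analyze JSON."""
    headings = "\n".join(
        f"{'#' * h['level']} {h['text']}" for h in page.get("headings", [])
    )
    return (
        f"Page: {page['title']} ({page['url']})\n"
        f"Headings:\n{headings}\n"
        f"Content:\n{page['text_summary']}\n\n"
        f"Question: {question}"
    )

# Sample data shaped like the response shown above.
page = {
    "url": "https://news.ycombinator.com",
    "title": "Hacker News",
    "headings": [{"level": 1, "text": "Hacker News"}],
    "text_summary": "Ask HN: What are you working on? | 312 comments",
}
prompt = analyze_to_prompt(page, "What are the top stories about?")
```

The resulting string is a few hundred tokens regardless of how bloated the page's HTML was.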

2. Screenshot — visual verification

AI agents that can see screenshots can verify things that text parsing misses: broken layouts, missing images, visual regressions, forms that didn't render.

curl "https://snapapi.tech/v1/screenshot?url=https://snapapi.tech&width=1280&height=800&format=png" \
  -H "X-API-Key: YOUR_KEY" \
  --output page.png

Parameters worth knowing:

  • full_page=true — captures the entire scrollable page, not just the viewport
  • dark_mode=true — renders in dark mode (useful for testing)
  • block_ads=true — blocks ad scripts before capture
  • wait_for_selector=.main-content — waits for a specific element before shooting
  • delay=1000 — waits N milliseconds after load (for JS-heavy SPAs)

I pipe screenshots directly to GPT-4V: "Does this page look broken? What changed since yesterday?"
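The "piping" step is just base64-encoding the PNG bytes into a vision-model message. A sketch of that packaging, assuming an OpenAI-style content format (check your provider's docs for the exact shape):

```python
import base64

def to_image_message(png_bytes: bytes, question: str) -> dict:
    """Wrap screenshot bytes as a vision-model user message (OpenAI-style shape)."""
    b64 = base64.b64encode(png_bytes).decode("ascii")
    return {
        "role": "user",
        "content": [
            {"type": "text", "text": question},
            {
                "type": "image_url",
                "image_url": {"url": f"data:image/png;base64,{b64}"},
            },
        ],
    }

# Placeholder bytes; in practice this is the response body from /v1/screenshot.
msg = to_image_message(b"\x89PNG...", "Does this page look broken?")
```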

3. Batch — process multiple URLs in one call

When you're monitoring several competitors' pricing pages, checking 50 product pages for freshness, or building a dataset:

curl -X POST "https://snapapi.tech/v1/batch" \
  -H "X-API-Key: YOUR_KEY" \
  -H "Content-Type: application/json" \
  -d '{
    "urls": [
      "https://competitor-a.com/pricing",
      "https://competitor-b.com/pricing",
      "https://competitor-c.com/pricing"
    ],
    "endpoint": "analyze",
    "params": {}
  }'

Response:

{
  "total": 3,
  "succeeded": 3,
  "failed": 0,
  "duration_ms": 2841,
  "results": [
    { "url": "https://competitor-a.com/pricing", "title": "Pricing — CompA", "text_summary": "..." },
    { "url": "https://competitor-b.com/pricing", "title": "Plans — CompB", "text_summary": "..." },
    { "url": "https://competitor-c.com/pricing", "title": "Pricing — CompC", "text_summary": "..." }
  ]
}

One API call. Three pages. No rate limit juggling, no thread management.
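When `failed` is nonzero you'll want to separate the good results from the bad before feeding them downstream. A small sketch, assuming failed entries carry an `error` key (the response above only shows successes, so that field name is an assumption):

```python
def partition_results(batch: dict) -> tuple[list, list]:
    """Split a /v1/batch response into successes and failures.

    Assumes failed entries include an "error" key, which is not shown
    in the sample response above.
    """
    results = batch.get("results", [])
    ok = [r for r in results if "error" not in r]
    bad = [r for r in results if "error" in r]
    return ok, bad

batch = {
    "total": 3, "succeeded": 2, "failed": 1,
    "results": [
        {"url": "https://competitor-a.com/pricing", "title": "Pricing — CompA"},
        {"url": "https://competitor-b.com/pricing", "title": "Plans — CompB"},
        {"url": "https://competitor-c.com/pricing", "error": "timeout"},
    ],
}
ok, bad = partition_results(batch)
```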

Real use cases from my own pipelines

AI research assistant: Agent gets asked "what does Company X's product page say?" — calls /v1/analyze, feeds structured JSON to the LLM instead of raw HTML. Works reliably even on JS-heavy SPAs.

Automated visual regression: Cron job calls /v1/screenshot on a set of pages after every deploy. Screenshots stored in S3. If diff score exceeds threshold, Slack alert fires. Cost: ~$0.001/screenshot.
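The "diff score" in that pipeline can be as simple as the fraction of pixels that changed. A minimal stdlib-only sketch comparing two equal-sized raw pixel buffers (a real pipeline would decode the PNGs first, e.g. with Pillow):

```python
def diff_ratio(a: bytes, b: bytes) -> float:
    """Fraction of bytes that differ between two equal-length buffers."""
    if len(a) != len(b):
        return 1.0  # treat a size change as a full-page change
    if not a:
        return 0.0
    changed = sum(x != y for x, y in zip(a, b))
    return changed / len(a)

# Two tiny 4-pixel RGBA buffers; the last pixel turned red.
old = bytes([0, 0, 0, 255] * 4)
new = bytes([0, 0, 0, 255] * 3 + [255, 0, 0, 255])
score = diff_ratio(old, new)  # 1 byte of 16 differs -> 0.0625

ALERT_THRESHOLD = 0.02  # tune per page; hypothetical value
should_alert = score > ALERT_THRESHOLD
```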

Competitive monitoring: Weekly job batches competitor pricing and feature pages. LLM diffs the extracted text against last week's version. Email alert on any meaningful change.
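Before involving an LLM at all, a plain `difflib` pass can tell you whether anything changed and surface the changed lines. A sketch of that pre-filter:

```python
import difflib

def changed_lines(last_week: str, this_week: str) -> list[str]:
    """Lines added or removed between two text snapshots."""
    diff = difflib.unified_diff(
        last_week.splitlines(), this_week.splitlines(), lineterm=""
    )
    # Keep +/- lines, drop the "---"/"+++" file headers.
    return [d for d in diff if d[:1] in "+-" and d[:3] not in ("+++", "---")]

old = "Pro plan: $29/mo\nTeam plan: $99/mo"
new = "Pro plan: $39/mo\nTeam plan: $99/mo"
changes = changed_lines(old, new)
# -> ["-Pro plan: $29/mo", "+Pro plan: $39/mo"]
```

If `changes` is empty, skip the LLM call entirely; otherwise hand just those lines to the model to judge whether the change is meaningful.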

OG image generation: /v1/render takes raw HTML and returns a screenshot. Feed it a styled HTML template, get back a 1200×630 social share image. No canvas, no serverless Chrome, no font loading headaches.
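A sketch of the render flow: fill an HTML template, then POST it. The template and the request field names (`html`, `width`, `height`) are assumptions here, so check the docs for the exact payload shape:

```python
import string

# Hypothetical OG-card template at the standard 1200x630 share size.
OG_TEMPLATE = string.Template(
    '<html><body style="width:1200px;height:630px;font-family:sans-serif">'
    "<h1>$title</h1><p>$subtitle</p></body></html>"
)

def og_payload(title: str, subtitle: str) -> dict:
    """Request body for /v1/render. Field names are assumed; check the docs."""
    return {
        "html": OG_TEMPLATE.substitute(title=title, subtitle=subtitle),
        "width": 1200,
        "height": 630,
    }

payload = og_payload("SnapAPI", "Screenshots as a service")
# requests.post("https://snapapi.tech/v1/render", json=payload,
#               headers={"X-API-Key": "YOUR_KEY"})
```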

Other endpoints

/v1/pdf — generates a PDF from any URL. Useful for reports, invoices, archival. Supports custom margins, landscape mode, background printing.

/v1/metadata — lightweight, fast metadata pull (title, og:image, canonical, favicon) without a full render. Use when you just need basic page info without executing JavaScript.

Getting started

Free tier requires no credit card. API key in your dashboard within 30 seconds.

# 1. Sign up at https://snapapi.tech
# 2. Copy your API key from the dashboard
# 3. Try it:
curl "https://snapapi.tech/v1/analyze?url=https://example.com" \
  -H "X-API-Key: YOUR_KEY"

Full docs at snapapi.tech/docs.


If your AI pipeline currently handles URLs by fetching raw HTML and dumping it into the context window — this is a direct upgrade. Structured output, real rendering, lower token cost, higher reliability.
