DEV Community

smallhandsome

Give Your AI Agent Eyes: Building a Visual MCP Server for Web Screenshots

Launch Announcement: ShotAPI MCP Server is launching on Product Hunt on Wednesday June 10! Come support us!

Why AI Agents Need to See the Web

MCP (Model Context Protocol) agents can fetch and parse web content, but they're blind to how it looks. A broken layout, a misaligned element, a CSS rendering bug — these are visual problems that text alone can't diagnose.

That's why I built ShotAPI — an MCP server that gives Claude, Cursor, and other AI agents screenshot and HTML rendering capabilities via a single remote connection.

No Python, no Playwright, no local install. One command:

claude mcp add --transport streamable-http shotapi https://aiphotoshop.mynatapp.cc/mcp

Three Tools, One Connection

ShotAPI exposes three MCP tools through streamable-http (no stdio, no local setup):

  1. screenshot_one_liner(url) — Quick screenshot of any URL. One parameter, one result.

  2. screenshot(url, selector?, fullpage?, viewport?, format?) — Full control: CSS selectors, viewport sizing, full-page captures, PNG/JPEG/WebP output.

  3. render(html) — Render any HTML/CSS to an image. Write code, see the result, revise. This closes the visual feedback loop.
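
Under the hood, an MCP client invokes a tool by sending a JSON-RPC 2.0 `tools/call` request. The sketch below builds the request bodies a client might send for the three tools listed above. The tool and argument names come from that list; the argument values (for example `selector="#main"` or `format="png"`) and the helper name `tool_call` are illustrative assumptions, not ShotAPI's confirmed schema.

```python
import json
from itertools import count

_ids = count(1)  # JSON-RPC requests need a unique id per call


def tool_call(name: str, **arguments) -> dict:
    """Build a JSON-RPC 2.0 `tools/call` request in the shape MCP defines."""
    return {
        "jsonrpc": "2.0",
        "id": next(_ids),
        "method": "tools/call",
        "params": {"name": name, "arguments": arguments},
    }


# Hypothetical example calls; argument values are assumptions for illustration.
example_requests = [
    tool_call("screenshot_one_liner", url="https://example.com"),
    tool_call("screenshot", url="https://example.com",
              selector="#main", fullpage=True, format="png"),
    tool_call("render", html="<h1>Hello</h1>"),
]

for request in example_requests:
    print(json.dumps(request))
```

In practice Claude or Cursor builds these requests for you; the sketch only shows what travels over the streamable-http connection.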

Practical Use Cases

1. Visual Bug Detection

Tell your agent to check a webpage, and it can see the problem:

> Take a screenshot of https://myapp.com/dashboard and check if the sidebar overlaps the main content

The agent gets an actual image — not just the DOM text — and can identify layout issues, missing images, or broken responsive designs.

2. Design Review Automation

> Compare screenshots of https://mysite.com before and after the CSS change

The agent captures both states visually and describes the differences — something text-based tools can't do.

3. HTML Prototyping with render

The render tool is where it gets interesting. You can give your agent a prompt like:

> Write a landing page hero section with Tailwind CSS, render it, and show me how it looks

The agent writes HTML -> renders it -> sees the visual result -> identifies issues -> revises -> renders again. A real visual feedback loop, all within the conversation.
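
The loop described above can be sketched in a few lines. This is a minimal illustration, not how any particular agent implements it: `render`, `find_issues`, and `revise` are hypothetical callables standing in for the ShotAPI render tool, the model's visual inspection, and the model's code revision. The stubs at the bottom exist only so the sketch runs without a network connection.

```python
from typing import Callable, List


def refine_until_clean(
    html: str,
    render: Callable[[str], bytes],
    find_issues: Callable[[bytes], List[str]],
    revise: Callable[[str, List[str]], str],
    max_rounds: int = 3,
) -> str:
    """Render, inspect, and revise HTML until no issues remain or rounds run out."""
    for _ in range(max_rounds):
        image = render(html)          # HTML -> image bytes
        issues = find_issues(image)   # look at the image, list problems
        if not issues:
            break
        html = revise(html, issues)   # rewrite the HTML to address them
    return html


# Stand-in stubs (assumptions for the demo, not real ShotAPI calls).
def fake_render(html: str) -> bytes:
    return html.encode("utf-8")


def fake_find_issues(image: bytes) -> List[str]:
    return ["sidebar overlaps content"] if b"overlap" in image else []


def fake_revise(html: str, issues: List[str]) -> str:
    return html.replace("overlap", "grid")


final_html = refine_until_clean(
    '<div class="overlap">Hi</div>', fake_render, fake_find_issues, fake_revise
)
print(final_html)
```

The `max_rounds` cap matters: without it, an agent that never converges would keep spending render calls.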

# Quick test from your terminal
curl -s "https://aiphotoshop.mynatapp.cc/v1/render" \
  -X POST \
  -H "Content-Type: application/json" \
  -d '{"html": "<h1 style=\"color:#4f46e5;font-family:sans-serif\">ShotAPI Works!</h1>"}' \
  -o rendered.png
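
Quoting is the fragile part of a shell one-liner like this, because the HTML attribute quotes sit inside a JSON string. Here is a sketch of the same request in Python, where `json.dumps` handles the escaping. The endpoint and the `html` field come from the curl command; the function names are hypothetical, and treating the response body as raw PNG bytes is an assumption based on the `-o rendered.png` flag.

```python
import json
import urllib.request

RENDER_URL = "https://aiphotoshop.mynatapp.cc/v1/render"


def build_render_request(html: str) -> urllib.request.Request:
    """Build (but do not send) a POST request carrying the HTML as JSON."""
    body = json.dumps({"html": html}).encode("utf-8")
    return urllib.request.Request(
        RENDER_URL,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def save_render(html: str, path: str = "rendered.png") -> None:
    """Send the request and write the response body to disk.

    Assumes the response body is the image itself.
    """
    request = build_render_request(html)
    with urllib.request.urlopen(request, timeout=30) as response:
        with open(path, "wb") as out:
            out.write(response.read())
```

Calling `save_render('<h1 style="color:#4f46e5">ShotAPI Works!</h1>')` would then write the image next to your script.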

4. Batch Monitoring

Take periodic screenshots of production sites to catch visual regressions:

import httpx

sites = ["https://myapp.com", "https://docs.myapp.com", "https://admin.myapp.com"]
for site in sites:
    resp = httpx.post("https://aiphotoshop.mynatapp.cc/v1/screenshot",
                      json={"url": site, "fullpage": True},
                      headers={"Authorization": "Bearer YOUR_KEY"},
                      timeout=60)  # full-page captures can exceed httpx's 5 s default
    resp.raise_for_status()  # fail loudly instead of saving an error body as .png
    # "https://docs.myapp.com" -> "docs_myapp_com.png"
    with open(f"{site.split('//')[1].replace('.','_')}.png", "wb") as f:
        f.write(resp.content)

---
title: "How I Built ShotAPI: An MCP Server That Gives AI Agents Visual Context"
published: false
description: "A practical guide to building and deploying an MCP server for screenshot and HTML rendering, with real Claude integration examples"
tags: mcp, ai, screenshot, claude, tutorial
cover_image: https://aiphotoshop.mynatapp.cc/v1/screenshot?url=https://github.com/smallhandsome/shotapi-mcp-server
---

## The Problem: AI Agents Are Blind

When you're using Claude Code or Cursor to build web applications, there's a fundamental gap: **the AI can't see what it's building**. It writes HTML and CSS, but it has no way to verify the visual output. It can't tell if the layout is broken, if the colors look right, or if the responsive design actually works.

This isn't just a minor inconvenience. It leads to:
- Iterative "check it and tell me what you see" loops that waste time
- Deployments that look fine in code but broken in production
- No way for AI agents to autonomously verify UI changes

## The Solution: ShotAPI MCP Server

[ShotAPI](https://aiphotoshop.mynatapp.cc/docs) is an MCP (Model Context Protocol) server that provides two tools:

1. **`screenshot`**: Captures a full-page screenshot of any URL
2. **`render`**: Renders raw HTML/CSS and returns the visual result

It works as a remote MCP server via streamable-http transport. No browser install needed on your machine.

### Connect in One Command

```bash
claude mcp add --transport streamable-http shotapi https://aiphotoshop.mynatapp.cc/mcp
```

Or for Cursor:

```json
{
  "mcpServers": {
    "shotapi": {
      "url": "https://aiphotoshop.mynatapp.cc/mcp",
      "transport": "streamable-http"
    }
  }
}
```

Free tier: 100 screenshots + 100 renders per month, no signup needed.

## Real Use Cases

### 1. Claude Verifies Its Own Code

```plaintext
Human: Build a landing page with a hero section, testimonials, and pricing grid.

Claude: [writes HTML/CSS code]

Claude: Let me verify this looks correct. I'll use the render tool to check.

[Claude calls ShotAPI render tool with the HTML it just wrote]

Claude: I can see the hero section is working, but the pricing grid is overlapping on mobile. Let me fix the responsive breakpoints.
```

### 2. Deployment Verification

```plaintext
Human: I just deployed my app to production. Check if it looks right.

Claude: [calls ShotAPI screenshot tool on https://yourapp.com]

Claude: The page loaded successfully. I notice the footer text is cut off, which looks like a CSS overflow issue. Also the background image didn't load (404).
```

### 3. Competitor/Reference Site Analysis

```plaintext
Human: I want my site to look like stripe.com. What layout patterns do they use?

Claude: [calls ShotAPI screenshot tool on https://stripe.com]

Claude: Stripe uses a dark hero with gradient text, centered nav with minimal items, and a card-based feature grid. The hero has a large heading with animated gradient, and below it is a product screenshot.
```

### 4. Automated UI Regression Testing

In CI/CD pipelines, you can use the `/v1/screenshot` REST API to capture screenshots before and after changes:

```bash
# Before deploy
curl -s "https://aiphotoshop.mynatapp.cc/v1/screenshot?url=https://staging.myapp.com" -o before.png

# After deploy
curl -s "https://aiphotoshop.mynatapp.cc/v1/screenshot?url=https://staging.myapp.com" -o after.png

# Compare
# If visual regression detected, block the deploy
```

## Technical Architecture

ShotAPI is built with:
- **FastAPI** for the HTTP API and MCP server endpoints
- **Playwright** (Chromium) for headless browser rendering
- **Streamable-HTTP** MCP transport (no SSE, no WebSocket; simple POST-based)
- **NATAPP tunnel** for accessibility from Chinese networks

The MCP server follows the official MCP specification. It registers two tools with proper input schemas:

```python
# Tool definition in MCP server
tools = [
    {
        "name": "screenshot",
        "description": "Capture a screenshot of a webpage",
        "inputSchema": {
            "type": "object",
            "properties": {
                "url": {"type": "string", "description": "URL to screenshot"},
                "width": {"type": "integer", "description": "Viewport width (default 1280)"},
                "full_page": {"type": "boolean", "description": "Capture full page (default true)"},
            },
            "required": ["url"]
        }
    },
    {
        "name": "render",
        "description": "Render HTML/CSS and return the visual result",
        "inputSchema": {
            "type": "object",
            "properties": {
                "html": {"type": "string", "description": "HTML content to render"},
                "css": {"type": "string", "description": "Optional CSS to apply"},
            },
            "required": ["html"]
        }
    }
]
```

## REST API (for non-MCP usage)

If you're not using Claude/Cursor but still want screenshots:

```bash
# Screenshot any URL (free, no API key needed)
curl "https://aiphotoshop.mynatapp.cc/v1/screenshot?url=https://github.com"

# Render HTML
curl -X POST "https://aiphotoshop.mynatapp.cc/v1/render" -H "Content-Type: application/json" -d '{"html": "<h1>Hello World</h1>"}'
```

Response is a PNG image (or JSON with base64 data).

## What's Next

ShotAPI is listed in the [official MCP Registry](https://registry.modelcontextprotocol.io) and [awesome-mcp-servers](https://github.com/punkpeye/awesome-mcp-servers). The free tier gives 100+100 calls/month with no signup.

If you're building AI agent workflows that need visual context, give it a try:

```bash
claude mcp add --transport streamable-http shotapi https://aiphotoshop.mynatapp.cc/mcp
```

[GitHub Repo](https://github.com/smallhandsome/shotapi-mcp-server) | [Docs](https://aiphotoshop.mynatapp.cc/docs) | [MCP Registry](https://registry.modelcontextprotocol.io)

Free Tier — No Signup Required

ShotAPI has a free tier that works without any API key:

  • 30 screenshots + 30 renders per month per IP address
  • All 3 MCP tools available
  • Streamable-http remote connection (no VPN needed from China)

Just add the MCP connection and start using it. No registration, no email, no keys.

Getting Started

For Claude Code / Claude Desktop

claude mcp add --transport streamable-http shotapi https://aiphotoshop.mynatapp.cc/mcp

For Cursor

Add to your Cursor MCP config:

{
  "mcpServers": {
    "shotapi": {
      "url": "https://aiphotoshop.mynatapp.cc/mcp",
      "transport": "streamable-http"
    }
  }
}

Direct API (no MCP)

# Free screenshot (IP-based, no key)
curl -s "https://aiphotoshop.mynatapp.cc/v1/screenshot?url=https://news.ycombinator.com" -o hn.png

# With API key (dedicated quota)
curl -s "https://aiphotoshop.mynatapp.cc/v1/screenshot?url=https://github.com&selector=.Header" \
  -H "Authorization: Bearer YOUR_KEY" \
  -o header.png
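
The URLs above work because the targets contain no `&` or `?` of their own. If a target page has its own query string, passing it raw would let its parameters leak into ShotAPI's. A small sketch of building the GET URL with percent-encoding follows. The endpoint and the `url` and `selector` parameters come from the examples above; the helper name `screenshot_url` is hypothetical, and the sketch assumes the server decodes standard percent-encoded query values.

```python
from urllib.parse import urlencode

BASE = "https://aiphotoshop.mynatapp.cc/v1/screenshot"


def screenshot_url(target: str, **params: str) -> str:
    """Return a GET URL with the target and any extra params percent-encoded."""
    query = {"url": target, **params}
    return f"{BASE}?{urlencode(query)}"


# The target's own "?q=a&b=c" is encoded, so it stays part of the `url` value.
print(screenshot_url("https://example.com/search?q=a&b=c", selector=".Header"))
```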

Paid Tiers

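| Tier | Price | Calls/Month (screenshots + renders) | Features |
|------|-------|-------------------------------------|----------|
| Starter | $4.90/mo | 5,000 + 5,000 | API key, full params |
| Pro | $9.90/mo | 20,000 + 20,000 | Full params, usage tracking |
| Free | $0 | 30 + 30 | IP-based, no signup |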

Chinese users can pay via WeChat/Alipay on Afdian with instant activation.

What's Next

I'm actively working on:

  • PDF export support
  • Video/animation capture
  • Batch screenshot API
  • Webhook notifications for monitoring

Try it now — just add the MCP connection and ask your agent to take a screenshot. No signup, no install, no waiting.

Docs: https://aiphotoshop.mynatapp.cc/en/docs
Smithery: https://smithery.ai/server/@ljs/shotapi
GitHub: https://github.com/smallhandsome/shotapi-mcp-server


ShotAPI is an open MCP server for web screenshots and HTML rendering. Free tier available with no registration required.
