Custodia-Admin

Posted on Mar 25 • Originally published at pagebolt.dev

Screenshot API for Flask: Capture Web Pages from Your Python App in Minutes

#flask #python #api #screenshots

Flask developers face a familiar problem: your app needs to capture web pages as screenshots or PDFs, but you don't want to manage Puppeteer, Selenium, or wkhtmltopdf in production.

This is exactly the problem PageBolt solves. Instead of spinning up a headless browser, managing its dependencies, and debugging timeouts, you send a single REST request and get the image or PDF back in the response body. For most pages, the screenshot is ready in under a second.

The Problem: Why Not Just Use Selenium?

If you've tried Selenium or Puppeteer for Flask apps, you know the pain:

Dependency hell: Installing ChromeDriver, Chromium, or Firefox means system packages, version pinning, and extra Docker complexity
Resource heavy: Each screenshot request either spins up a new browser process or waits in a queue for a shared pool, which hurts high-traffic apps
Fragile in production: Screenshots time out, browsers crash, PDF rendering breaks on obscure CSS, and you end up debugging browser internals instead of your own app
Maintenance burden: Browser versions change, Selenium releases break compatibility, and CI/CD needs special flags like --no-sandbox

PageBolt removes all of this. Your code is just a few lines of requests.post().

The Solution: REST API Instead of Local Browsers

Here's the whole idea in one example:

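import os

import requests

# Read the API key from the environment instead of hard-coding it
PAGEBOLT_API_KEY = os.getenv('PAGEBOLT_API_KEY')

# Capture a website as a screenshot
response = requests.post(
    'https://api.pagebolt.dev/screenshot',
    headers={'x-api-key': PAGEBOLT_API_KEY},
    json={
        'url': 'https://example.com',
        'format': 'png'
    },
    timeout=30  # fail instead of hanging forever on a network problem
)
response.raise_for_status()  # stop here on a 4xx/5xx response

# response.content is your PNG file
# Save it, return it, email it, whatever
with open('screenshot.png', 'wb') as f:
    f.write(response.content)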

Done. No browsers to install, no system packages, no browser timeouts to debug. The only dependency is requests.
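The HTTP call itself can still fail in transit (a dropped connection, a rate limit), so it is worth wrapping. Here is a minimal sketch of a retry helper with exponential backoff. It assumes nothing about PageBolt's error semantics: `call_with_retries` and its parameters are hypothetical names, and `send` stands in for any function that performs the `requests.post(...)` call above and raises on failure.

```python
import time


def call_with_retries(send, attempts=3, base_delay=0.5, sleep=time.sleep):
    """Call send() until it succeeds or the attempts are used up.

    send is any zero-argument callable that raises on failure, for example
    a function wrapping requests.post(...) plus response.raise_for_status().
    """
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return send()
        except Exception:
            # Out of attempts: re-raise the last error to the caller.
            if attempt == attempts - 1:
                raise
            # Exponential backoff: base_delay, then 2x, then 4x, ...
            sleep(base_delay * (2 ** attempt))
```

In a real app you would probably narrow the `except` clause to `requests.RequestException` so that programming errors are not retried.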

Complete Flask Example 1: Synchronous Screenshot Route

Let's build a simple Flask app with a route that takes a screenshot and returns it:

from flask import Flask, request, send_file
import requests
import io
import os

app = Flask(__name__)

PAGEBOLT_API_KEY = os.getenv('PAGEBOLT_API_KEY')
PAGEBOLT_BASE_URL = 'https://api.pagebolt.dev'

@app.route('/capture-screenshot', methods=['POST'])
def capture_screenshot():
    """
    POST /capture-screenshot
    Body: { "url": "https://example.com", "format": "png" }
    Returns: PNG file
    """
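    # silent=True returns None instead of raising on a missing or invalid JSON body
    data = request.get_json(silent=True) or {}
    url = data.get('url')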

    if not url:
        return {'error': 'url is required'}, 400

    # Call PageBolt API
    response = requests.post(
        f'{PAGEBOLT_BASE_URL}/screenshot',
        headers={'x-api-key': PAGEBOLT_API_KEY},
        json={
            'url': url,
            'format': 'png',
            'width': 1280,
            'height': 720,
            'fullPage': True  # Capture full scrollable page
        },
        timeout=30  # don't tie up a Flask worker indefinitely
    )

    if response.status_code != 200:
        return {'error': f'PageBolt error: {response.status_code}'}, 500

    # Return the screenshot as a file download
    return send_file(
        io.BytesIO(response.content),
        mimetype='image/png',
        as_attachment=True,
        download_name='screenshot.png'
    )

@app.route('/capture-pdf', methods=['POST'])
def capture_pdf():
    """
    POST /capture-pdf
    Body: { "url": "https://example.com" }
    Returns: PDF file
    """
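    # silent=True returns None instead of raising on a missing or invalid JSON body
    data = request.get_json(silent=True) or {}
    url = data.get('url')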

    if not url:
        return {'error': 'url is required'}, 400

    response = requests.post(
        f'{PAGEBOLT_BASE_URL}/pdf',
        headers={'x-api-key': PAGEBOLT_API_KEY},
        json={
            'url': url,
            'format': 'A4',
            'margin': '1cm'
        },
        timeout=30  # don't tie up a Flask worker indefinitely
    )

    if response.status_code != 200:
        return {'error': f'PageBolt error: {response.status_code}'}, 500

    return send_file(
        io.BytesIO(response.content),
        mimetype='application/pdf',
        as_attachment=True,
        download_name='document.pdf'
    )

if __name__ == '__main__':
    app.run(debug=True)

Test it:

curl -X POST http://localhost:5000/capture-screenshot \
  -H 'Content-Type: application/json' \
  -d '{"url": "https://pagebolt.dev"}' \
  > screenshot.png

Done. Your Flask app now captures web pages.
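Both routes forward whatever URL the client sends, so it helps to reject obviously unusable values before spending an API request on them. The check below is a minimal sketch using only the standard library. `is_capturable_url` is a hypothetical helper, not part of Flask or PageBolt, and it only checks the URL's shape, not whether the target is reachable or safe.

```python
from urllib.parse import urlsplit


def is_capturable_url(value):
    """Return True for absolute http(s) URLs that include a hostname."""
    if not isinstance(value, str):
        return False
    try:
        parts = urlsplit(value.strip())
        hostname = parts.hostname
    except ValueError:
        # urlsplit raises ValueError on some malformed inputs (e.g. bad IPv6).
        return False
    return parts.scheme in ("http", "https") and bool(hostname)
```

In the routes above, this could sit right after the `url is required` check: if `is_capturable_url(url)` is false, return a 400 with an explanatory error.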

Complete Flask Example 2: Asynchronous Celery Task Pattern

For higher-traffic apps, you'll want async task processing. Here's a Celery pattern that queues screenshot requests and stores results:

from flask import Flask, request, jsonify
from celery import Celery
import requests
import os
from datetime import datetime

app = Flask(__name__)
app.config['CELERY_BROKER_URL'] = 'redis://localhost:6379/0'
app.config['CELERY_RESULT_BACKEND'] = 'redis://localhost:6379/0'

celery = Celery(app.name, broker=app.config['CELERY_BROKER_URL'])
celery.conf.update(app.config)

PAGEBOLT_API_KEY = os.getenv('PAGEBOLT_API_KEY')
PAGEBOLT_BASE_URL = 'https://api.pagebolt.dev'

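# In-memory storage for demo only. The Celery worker runs in a separate
# process, so it will not see or update this dict unless tasks run eagerly
# inside the web process. Use Redis or a real database in production.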
screenshot_jobs = {}

@celery.task(bind=True)
def capture_screenshot_async(self, job_id, url, format='png'):
    """
    Async Celery task to capture a screenshot
    """
    try:
        response = requests.post(
            f'{PAGEBOLT_BASE_URL}/screenshot',
            headers={'x-api-key': PAGEBOLT_API_KEY},
            json={
                'url': url,
                'format': format,
                'width': 1280,
                'height': 720,
                'fullPage': True
            },
            timeout=30
        )

        if response.status_code == 200:
            # Store the screenshot in S3, database, or filesystem
            screenshot_jobs[job_id] = {
                'status': 'completed',
                'url': url,
                'timestamp': datetime.utcnow().isoformat(),
                'data': response.content  # In production, save to S3
            }
        else:
            screenshot_jobs[job_id] = {
                'status': 'failed',
                'error': f'PageBolt returned {response.status_code}'
            }
    except Exception as e:
        screenshot_jobs[job_id] = {
            'status': 'failed',
            'error': str(e)
        }

@app.route('/screenshot-async', methods=['POST'])
def screenshot_async():
    """
    Queue a screenshot capture task
    Returns: job_id for polling
    """
    data = request.json
    url = data.get('url')

    if not url:
        return {'error': 'url is required'}, 400

    job_id = f"job_{datetime.utcnow().timestamp()}"
    screenshot_jobs[job_id] = {'status': 'pending'}

    # Queue the task
    capture_screenshot_async.delay(job_id, url)

    # 202 Accepted: the job is queued, not finished
    return {
        'job_id': job_id,
        'status': 'queued',
        'poll_url': f'/screenshot-status/{job_id}'
    }, 202

@app.route('/screenshot-status/<job_id>', methods=['GET'])
def screenshot_status(job_id):
    """
    Poll the status of a screenshot job
    """
    job = screenshot_jobs.get(job_id)

    if not job:
        return {'error': 'job not found'}, 404

    status = job.get('status')

    if status == 'pending' or status == 'in_progress':
        return {'status': status, 'job_id': job_id}
    elif status == 'completed':
        return {
            'status': 'completed',
            'job_id': job_id,
            'url': job.get('url'),
            'download_url': f'/screenshot-download/{job_id}'
        }
    else:
        return {'status': 'failed', 'error': job.get('error')}, 500

@app.route('/screenshot-download/<job_id>', methods=['GET'])
def screenshot_download(job_id):
    """
    Download the completed screenshot
    """
    job = screenshot_jobs.get(job_id)

    if not job or job.get('status') != 'completed':
        return {'error': 'screenshot not ready'}, 404

    from flask import send_file
    import io

    return send_file(
        io.BytesIO(job['data']),
        mimetype='image/png',
        as_attachment=True,
        download_name='screenshot.png'
    )

if __name__ == '__main__':
    app.run(debug=True)

Usage:

# Queue a screenshot
curl -X POST http://localhost:5000/screenshot-async \
  -H 'Content-Type: application/json' \
  -d '{"url": "https://pagebolt.dev"}'
# Returns: { "job_id": "job_1234567890", "status": "queued", "poll_url": "/screenshot-status/job_1234567890" }

# Poll status
curl http://localhost:5000/screenshot-status/job_1234567890

# Download when ready
curl http://localhost:5000/screenshot-download/job_1234567890 > screenshot.png
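For this polling flow to work with a real Celery worker, the web process and the worker need to share job state. One way to sketch that with only the standard library is a small SQLite-backed store. This is an illustration, not the article's implementation: `JobStore` and its methods are hypothetical names, and it assumes both processes run on the same host and can open the same database file. Across machines you would reach for Redis or a database server instead.

```python
import sqlite3
import uuid


class JobStore:
    """Tiny job table in a SQLite file that several processes can open."""

    def __init__(self, path):
        self.path = path
        self._execute(
            "CREATE TABLE IF NOT EXISTS jobs ("
            "id TEXT PRIMARY KEY, status TEXT NOT NULL, error TEXT, data BLOB)"
        )

    def _execute(self, sql, params=()):
        # Open a short-lived connection per call so processes never share one.
        conn = sqlite3.connect(self.path)
        try:
            with conn:  # commits on success, rolls back on error
                return conn.execute(sql, params).fetchone()
        finally:
            conn.close()

    def create(self):
        # uuid4 avoids the collisions a timestamp-based ID can produce.
        job_id = f"job_{uuid.uuid4().hex}"
        self._execute(
            "INSERT INTO jobs (id, status) VALUES (?, 'pending')", (job_id,)
        )
        return job_id

    def complete(self, job_id, data):
        self._execute(
            "UPDATE jobs SET status = 'completed', data = ? WHERE id = ?",
            (data, job_id),
        )

    def fail(self, job_id, error):
        self._execute(
            "UPDATE jobs SET status = 'failed', error = ? WHERE id = ?",
            (error, job_id),
        )

    def get(self, job_id):
        row = self._execute(
            "SELECT status, error, data FROM jobs WHERE id = ?", (job_id,)
        )
        if row is None:
            return None
        status, error, data = row
        return {"status": status, "error": error, "data": data}
```

The Flask routes would call `create()` and `get()`, while the Celery task would call `complete()` or `fail()`, each through its own `JobStore` pointed at the same file.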

Comparison: PageBolt vs Selenium vs wkhtmltopdf

| Feature | PageBolt | Selenium | wkhtmltopdf |
| --- | --- | --- | --- |
| Setup complexity | 1 API key | Install ChromeDriver + Chromium | Install system package |
| Dependency management | None (just `requests`) | Fragile versioning | Outdated, unmaintained |
| CPU/memory per request | Hosted (not your problem) | Spins up a new browser | Heavy process |
| JavaScript rendering | Full | Full | Basic |
| Cookie/auth support | Yes (headers) | Yes (Selenium WebDriver) | Yes (via --cookie and --custom-header flags) |
| Timeouts | Handled server-side | Your responsibility | Your responsibility |
| Cost at scale | $29/mo for 10k requests | Free software, but you pay for hosting | Free software, but you pay for hosting |
| Maintenance burden | None | High (browser updates) | Very high (unmaintained) |

Winner for Flask: PageBolt. Zero setup, no dependency hell, predictable pricing.

Cost Analysis

PageBolt pricing:

Free tier: 100 requests/month
Paid: $29/month for 10,000 requests (~$0.003 per screenshot)

Self-hosted Selenium:

Infrastructure: EC2 instance ($30–$100/month depending on load)
Developer time: 8+ hours to set up, debug, monitor
Maintenance: 4+ hours/month for browser updates, timeout fixes
Real cost: $30–$100/month in infrastructure, plus the setup and maintenance hours above

Even at just 1,000 requests/month, the $29 plan costs less than the cheapest instance in that range, before you count any developer time.
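The arithmetic behind these figures is easy to check. The sketch below uses only the numbers quoted in this section and assumes the $29 fee is flat regardless of how many of the 10,000 included requests you use. Overage pricing is not covered here, so the helper refuses volumes above the plan limit. The function and constant names are hypothetical.

```python
PLAN_FEE = 29.0                    # USD per month, paid plan quoted above
PLAN_REQUESTS = 10_000             # requests included in that plan
SELF_HOSTED_INFRA = (30.0, 100.0)  # USD per month, EC2 range quoted above


def effective_cost_per_request(requests_used):
    """Flat plan fee divided by the number of requests actually made."""
    if not 0 < requests_used <= PLAN_REQUESTS:
        raise ValueError("requests_used must be between 1 and PLAN_REQUESTS")
    return PLAN_FEE / requests_used


def infra_saving_range():
    """Monthly infrastructure saving versus self-hosting (low, high)."""
    low, high = SELF_HOSTED_INFRA
    return (low - PLAN_FEE, high - PLAN_FEE)


print(f"${effective_cost_per_request(10_000):.4f} per screenshot at full usage")
print(f"${effective_cost_per_request(1_000):.3f} per screenshot at 1,000/month")
print(infra_saving_range())
```

At full usage this gives the roughly $0.003 per screenshot quoted above. At 1,000 requests/month the effective price per screenshot is ten times higher, and the infrastructure saving alone ranges from $1 to $71 per month.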

Real-World Example: Report Generation

Here's a practical example — a Flask app that generates HTML reports and emails them as PDFs:

import os
import smtplib
from email.mime.application import MIMEApplication
from email.mime.multipart import MIMEMultipart
from email.mime.text import MIMEText

import requests
from celery import Celery
from flask import Flask, request

# `app`, `celery`, and PAGEBOLT_API_KEY are set up as in Example 2

@celery.task
def generate_report_pdf(report_id, email, report_html_url):
    """
    Generate a PDF from an HTML report and email it
    """
    # Capture the HTML as a PDF
    response = requests.post(
        'https://api.pagebolt.dev/pdf',
        headers={'x-api-key': PAGEBOLT_API_KEY},
        json={
            'url': report_html_url,
            'format': 'A4',
            'margin': '1cm'
        },
        timeout=30
    )

    if response.status_code != 200:
        print(f"PDF generation failed: {response.status_code}")
        return

    # Email the PDF
    msg = MIMEMultipart()
    msg['From'] = 'reports@yourapp.com'
    msg['To'] = email
    msg['Subject'] = f'Report #{report_id}'
    msg.attach(MIMEText(f'Report #{report_id} is attached.', 'plain'))

    # MIMEApplication base64-encodes the binary PDF payload
    attachment = MIMEApplication(response.content, _subtype='pdf')
    attachment.add_header('Content-Disposition', 'attachment', filename=f'report_{report_id}.pdf')
    msg.attach(attachment)

    # Send via your SMTP server. Credentials come from the environment
    # (SMTP_USER and SMTP_PASSWORD are example variable names).
    with smtplib.SMTP('smtp.gmail.com', 587) as server:
        server.starttls()
        server.login(os.environ['SMTP_USER'], os.environ['SMTP_PASSWORD'])
        server.send_message(msg)

@app.route('/reports/<report_id>/email', methods=['POST'])
def email_report(report_id):
    data = request.get_json(silent=True) or {}
    email = data.get('email')
    if not email:
        return {'error': 'email is required'}, 400

    report_url = f'https://yourapp.com/reports/{report_id}/view'

    generate_report_pdf.delay(report_id, email, report_url)

    return {'status': 'report queued for email'}
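If you want to unit-test the email part without an SMTP server, one option is to build the message in its own function. The sketch below uses the standard library's `EmailMessage` API, which handles the attachment encoding itself. `build_report_email` is a hypothetical helper, and the sender address is the placeholder from the example above.

```python
from email.message import EmailMessage


def build_report_email(report_id, recipient, pdf_bytes,
                       sender="reports@yourapp.com"):
    """Build a message with the PDF attached; sending is left to the caller."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = f"Report #{report_id}"
    msg.set_content(f"Report #{report_id} is attached as a PDF.")
    # add_attachment converts the message to multipart and encodes the bytes.
    msg.add_attachment(
        pdf_bytes,
        maintype="application",
        subtype="pdf",
        filename=f"report_{report_id}.pdf",
    )
    return msg
```

The Celery task could then call `server.send_message(build_report_email(report_id, email, response.content))`, and tests can inspect the returned message directly.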

Next Steps

Get a free API key: Visit pagebolt.dev and sign up — 100 requests/month, no credit card required
Install requests: pip install requests
Copy one of the examples above and run it in your Flask app
Scale as needed: If you exceed 100 requests/month, upgrade to a paid plan ($29/month, cancel anytime)
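The examples above read the key with `os.getenv('PAGEBOLT_API_KEY')`, which silently yields `None` when the variable is missing. A small fail-fast check at startup makes that mistake obvious. This is a sketch, and `load_api_key` is a hypothetical helper name.

```python
import os


def load_api_key(environ=os.environ, name="PAGEBOLT_API_KEY"):
    """Return the API key from the environment, or raise if it is unset."""
    key = environ.get(name, "").strip()
    if not key:
        raise RuntimeError(f"{name} is not set; export it before starting the app")
    return key
```

Calling `PAGEBOLT_API_KEY = load_api_key()` at import time stops the app from starting with a missing key.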

Flask developers shouldn't be managing headless browsers. With PageBolt, you get web capture in minutes, not weeks.

Try it free — 100 requests/month, no credit card. Start capturing screenshots now.
