
Custodia-Admin

Posted on • Originally published at pagebolt.dev

Screenshot API for Python Developers: Selenium vs Hosted API

You're building a Python app and need to take screenshots. Maybe you're:

  • Building a social media link preview service
  • Auto-generating OG images for Flask/Django apps
  • Testing web interfaces programmatically
  • Monitoring website changes
  • Archiving web content for compliance

You search "Python screenshot library" and find Selenium. It's been around for years. It's in PyPI. Thousands of projects use it.

Three days later, you're debugging WebDriver timeouts, wrestling with Firefox vs Chrome, and wondering why your screenshots look different on different machines.

There's a simpler way. Let me show you both approaches — Selenium and a hosted API — so you can decide which fits your project.

The Selenium Approach

Selenium is a browser automation framework. Here's the minimal example:

from selenium import webdriver
from selenium.webdriver.common.by import By

driver = webdriver.Chrome()
driver.get('https://example.com')
driver.save_screenshot('screenshot.png')
driver.quit()

That's 7 lines. Simple, right?

But in production, this becomes a nightmare. Here's what you'll actually need:

1. Browser Driver Management

from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service

service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service)

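Now you need webdriver-manager as a dependency. It works, but adds complexity. (Selenium 4.6 and later ship with Selenium Manager, which can resolve the driver for you, but many existing projects still carry this setup.)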

2. Headless Rendering & Options

from selenium.webdriver.chrome.options import Options

options = Options()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--disable-gpu')
options.add_argument('--window-size=1280,720')
options.add_argument('--user-agent=Mozilla/5.0...')

driver = webdriver.Chrome(options=options)

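Each flag is a gotcha. Chrome needs --no-sandbox when it runs as root, which is the default in many Linux containers; on a macOS desktop, you don't need it. In Docker, you also need --disable-dev-shm-usage, because the default /dev/shm is only 64MB. Get it wrong and Chrome can crash, or your screenshots fail with little explanation.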
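One way to keep these flags from spreading across your codebase is to build the list in a single place. The sketch below is only an illustration: chrome_args and detect_environment are hypothetical helper names, and checking for /.dockerenv is a common convention for detecting Docker, not a guarantee.

```python
import os


def chrome_args(in_container: bool, as_root: bool) -> list[str]:
    """Build Chrome command-line flags for the current environment."""
    args = ["--headless", "--window-size=1280,720"]
    if as_root:
        # Chrome refuses to start with its sandbox when running as root.
        args.append("--no-sandbox")
    if in_container:
        # Docker's default /dev/shm is small, so tell Chrome not to use it.
        args.append("--disable-dev-shm-usage")
    return args


def detect_environment() -> tuple[bool, bool]:
    """Best-effort guess (assumption): /.dockerenv marks a Docker container."""
    in_container = os.path.exists("/.dockerenv")
    as_root = hasattr(os, "geteuid") and os.geteuid() == 0
    return in_container, as_root
```

You would then loop over chrome_args(*detect_environment()) and pass each entry to options.add_argument.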

3. Waits and Timing

import time

from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

driver.get(url)

# Wait for the <body> element to exist in the DOM
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.TAG_NAME, 'body'))
)

# Wait for JavaScript to finish (how long? nobody knows)
time.sleep(2)

# Now take screenshot
driver.save_screenshot('screenshot.png')

How long should you wait? 1 second? 2? 5? Too short and your screenshot shows a blank page. Too long and your service times out.
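A partial improvement over a fixed sleep is to poll for a condition and stop as soon as it holds. The sketch below uses only the standard library so the idea is visible; wait_until and page_is_ready are hypothetical names. Note that document.readyState reaching "complete" does not prove that late JavaScript rendering has finished, so this narrows the guesswork without removing it.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.25):
    """Poll predicate() until it returns True or `timeout` seconds pass.

    Returns True if the condition was met, False if the deadline expired.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def page_is_ready(driver):
    """True once the browser reports the document as fully loaded."""
    return driver.execute_script("return document.readyState") == "complete"
```

With a real driver you would call wait_until(lambda: page_is_ready(driver)). Selenium's own WebDriverWait(driver, 10).until(page_is_ready) does the same polling for you, since until() passes the driver to the callable.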

4. Error Handling & Cleanup

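from selenium.common.exceptions import TimeoutException, WebDriverException

driver = None
try:
    driver = webdriver.Chrome(options=options)
    driver.get(url)
    driver.save_screenshot(filename)
except TimeoutException:
    print('Page took too long to load')
except WebDriverException as e:
    print(f'Selenium error: {e}')
except Exception as e:
    print(f'Unknown error: {e}')
finally:
    # driver stays None if Chrome itself failed to start
    if driver is not None:
        driver.quit()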

Now multiply this by every function that takes a screenshot. You're writing the same error handling 10+ times.
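If you do stay with Selenium, a context manager is one way to write the cleanup once instead of in every function. This is a minimal sketch: managed_driver is a hypothetical helper, and it takes a factory callable so it does not assume any particular browser or options.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver with factory() and always quit it afterwards."""
    driver = factory()  # if this raises, there is nothing to clean up yet
    try:
        yield driver
    finally:
        # Runs on success and on any exception raised inside the with-block.
        driver.quit()
```

Usage would look like `with managed_driver(lambda: webdriver.Chrome(options=options)) as driver:` followed by the get and save_screenshot calls. You still decide how to report errors, but the browser process no longer leaks when one occurs.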

5. Infrastructure & Scaling

Selenium runs Chrome locally. Chrome takes 100–200MB of RAM per instance. If you have 10 concurrent requests, you need 1–2GB of RAM just for browsers.

In production:

  • Deploy to a Docker container with Chrome installed (adds 400MB to your image)
  • Monitor memory usage
  • Handle browser crashes
  • Implement a queue for concurrent requests
  • Scale horizontally (add more servers = add more complexity)
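To give a sense of what "implement a queue" means in practice, here is a minimal sketch that caps how many captures run at once using a thread pool. The names capture_all and MAX_BROWSERS are hypothetical, and capture stands in for whatever function launches a browser and takes one screenshot; the cap of 4 is an arbitrary assumption you would tune to your RAM.

```python
from concurrent.futures import ThreadPoolExecutor

MAX_BROWSERS = 4  # assumed cap; tune to the memory you actually have


def capture_all(urls, capture, max_workers=MAX_BROWSERS):
    """Run capture(url) for every URL with at most max_workers in flight.

    Returns a dict mapping each URL to its result, or to the exception
    that capture raised for it.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        futures = {url: pool.submit(capture, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:
                # One failed page should not abort the whole batch.
                results[url] = exc
    return results
```

This only bounds concurrency inside one process. Crash recovery, timeouts per browser, and spreading work across servers are still on you.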

Real-world cost at 10,000 screenshots/month:

  • Server: $400/month (2GB RAM, 2 CPU)
  • DevOps/monitoring: $2,000/month (on-call, incident response)
  • Total: $2,400/month

And that's before you hit the limits of a single server.
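The arithmetic behind those figures is simple enough to check yourself. The sketch below only reuses the numbers quoted above (100–200MB per Chrome instance, $400 server, $2,000 DevOps, 10,000 screenshots); the function names are made up for illustration.

```python
def browser_memory_mb(concurrent, per_instance_mb=(100, 200)):
    """Return the (low, high) RAM estimate in MB for concurrent browsers."""
    low, high = per_instance_mb
    return concurrent * low, concurrent * high


def cost_per_screenshot(monthly_costs_usd, screenshots_per_month):
    """Total monthly spend divided by screenshots taken."""
    return sum(monthly_costs_usd) / screenshots_per_month


print(browser_memory_mb(10))                     # (1000, 2000)
print(cost_per_screenshot([400, 2000], 10_000))  # 0.24
```

So at this volume each self-hosted screenshot works out to about 24 cents, most of it people time rather than hardware.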

The API Approach

Here's the same task with a hosted screenshot API:

import requests

response = requests.post(
    'https://api.pagebolt.io/api/v1/screenshot',
    headers={'Authorization': 'Bearer YOUR_API_KEY'},
    json={'url': 'https://example.com'}
)

with open('screenshot.png', 'wb') as f:
    f.write(response.content)

That's it. 8 lines, including the file write. No browser management. No memory leaks. No infrastructure.

Here's a more realistic production-ready example:

import logging
import os

import requests

logger = logging.getLogger(__name__)

def take_screenshot(url, filename, timeout=10):
    """Take a screenshot of a URL and save to file."""
    try:
        response = requests.post(
            'https://api.pagebolt.io/api/v1/screenshot',
            headers={'Authorization': f'Bearer {os.environ["PAGEBOLT_API_KEY"]}'},
            json={
                'url': url,
                'width': 1280,
                'height': 720,
                'blockAds': True,
                'blockBanners': True
            },
            timeout=timeout
        )

        if response.status_code != 200:
            logger.error(f'Screenshot failed: {response.status_code} {response.text}')
            raise Exception(f'API returned {response.status_code}')

        with open(filename, 'wb') as f:
            f.write(response.content)

        logger.info(f'Screenshot saved: {filename}')
        return filename

    except requests.Timeout:
        logger.error(f'Screenshot request timed out: {url}')
        raise
    except requests.RequestException as e:
        logger.error(f'Screenshot request failed: {e}')
        raise

That's about 35 lines of real, production-ready code. Compare it to managing Selenium.
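One extra guard you may want before writing the file is a check that the body really is an image. The sketch below assumes the API returns raw PNG bytes, which the earlier example implies by saving response.content as screenshot.png but which this article does not otherwise confirm. looks_like_png and save_png are hypothetical helpers.

```python
PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"  # the first 8 bytes of every PNG file


def looks_like_png(data: bytes) -> bool:
    """Cheap sanity check on the start of the payload."""
    return data.startswith(PNG_SIGNATURE)


def save_png(data: bytes, filename: str) -> str:
    """Write data to filename, refusing anything that is not a PNG."""
    if not looks_like_png(data):
        raise ValueError("response body is not a PNG image")
    with open(filename, "wb") as f:
        f.write(data)
    return filename
```

In take_screenshot you would call save_png(response.content, filename) in place of the plain open/write, so an unexpected JSON or HTML body never ends up on disk with a .png extension.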

Feature Comparison

| Feature | Selenium | Hosted API |
|---|---|---|
| Setup time | 1 hour | 5 minutes |
| Lines of code | 150+ (with pools, error handling) | 30–50 |
| Infrastructure | You manage Chrome, memory, scaling | Handled for you |
| Cost (10k screenshots/month) | $2,400/month | $29/month |
| Device presets | Manual setup | Built-in (25+ presets) |
| PDF generation | Extra library, extra complexity | One parameter |
| Reliable waits | Guessing (sleep) | Built-in (networkidle) |
| Retry logic | You implement it | Included |
| Uptime | Depends on your infrastructure | 99.9% SLA |
| Monitoring | You do it | Included |

When to Use Selenium

Use Selenium if:

  • You're testing internal web applications (a hosted API can't reach anything behind your firewall)
  • You need to interact with the page heavily (click buttons, fill forms, verify state changes)
  • You're learning web automation (educational context)
  • You have existing Selenium test infrastructure and want to integrate screenshots
  • You can afford the infrastructure and maintenance cost

When to Use an API

Use an API if:

  • You want screenshots in production without infrastructure headaches
  • You need reliability and uptime guarantees
  • You're taking static screenshots (no complex interactions needed)
  • You want to scale without managing more servers
  • You want to focus on your app, not on browser management

Real-World Example: Django Link Preview Service

You're building a service that generates preview cards for shared links (like Discord does).

With Selenium:

  1. Install Chrome in Docker
  2. Set up WebDriver
  3. Handle timeouts and retries
  4. Monitor memory usage
  5. Deploy to a server with enough RAM
  6. Scale horizontally as traffic grows
  7. Cost: $2,400+/month in infrastructure

With an API:

  1. pip install requests
  2. Call the API endpoint
  3. Save the image
  4. Return to user
  5. Cost: $29/month

The API approach takes one day. The Selenium approach takes two weeks.

Code Example: Flask Link Preview

from flask import Flask, request, jsonify
import base64
import os
import requests

app = Flask(__name__)

@app.route('/api/preview', methods=['POST'])
def create_preview():
    """Generate a link preview card."""
    # silent=True returns None for a missing or malformed JSON body
    data = request.get_json(silent=True) or {}
    url = data.get('url')

    if not url:
        return jsonify({'error': 'URL required'}), 400

    try:
        # Step 1: Take screenshot
        response = requests.post(
            'https://api.pagebolt.io/api/v1/screenshot',
            headers={'Authorization': f'Bearer {os.environ["PAGEBOLT_API_KEY"]}'},
            json={
                'url': url,
                'width': 1200,
                'height': 630,
                'blockAds': True,
                'blockBanners': True
            },
            timeout=10
        )

        if response.status_code != 200:
            return jsonify({'error': 'Screenshot failed'}), 500

        # Step 2: Get page metadata (optional)
        meta_response = requests.post(
            'https://api.pagebolt.io/api/v1/inspect',
            headers={'Authorization': f'Bearer {os.environ["PAGEBOLT_API_KEY"]}'},
            json={'url': url},
            timeout=10
        )

        metadata = meta_response.json() if meta_response.ok else {}

        # Step 3: Return preview card
        # The API responds with raw image bytes, so inline them as a data URI
        # (assumes PNG output). response.url would only echo the API endpoint.
        image_b64 = base64.b64encode(response.content).decode('ascii')
        return jsonify({
            'title': metadata.get('title', 'Untitled'),
            'description': metadata.get('description', ''),
            'image_url': f'data:image/png;base64,{image_b64}',
            'url': url
        })

    except requests.Timeout:
        return jsonify({'error': 'Request timed out'}), 504
    except Exception as e:
        return jsonify({'error': str(e)}), 500

if __name__ == '__main__':
    app.run(debug=False)

That's everything. No Selenium. No browser management. No infrastructure.
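One thing worth adding before you expose an endpoint like this is a check on the submitted URL, so that values such as file:// paths or bare strings never get forwarded. The sketch below is a minimal, standard-library version; is_http_url is a hypothetical helper, and it only checks the shape of the URL, not whether the host is safe to fetch.

```python
from urllib.parse import urlparse


def is_http_url(url) -> bool:
    """True only for http(s) URLs that include a hostname."""
    if not isinstance(url, str):
        return False
    parsed = urlparse(url.strip())
    return parsed.scheme in ("http", "https") and bool(parsed.hostname)
```

In the Flask view you would return a 400 when is_http_url(url) is False, right after the existing "URL required" check.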

Hybrid Approach

Some teams use both:

  • Selenium for QA automation (testing interactions, verifying UI state)
  • API for customer-facing features (link previews, screenshot galleries)

Selenium shines when you need to interact with the page. APIs shine when you just need static screenshots.

The Bottom Line

Selenium is powerful if you need it. But most teams don't. They need screenshots, and they need them to work without becoming infrastructure engineers.

An API costs $29/month and takes 5 minutes to integrate. Selenium costs $2,400+/month and two weeks to get right.

The choice depends on your use case. But for most Python projects, the API wins on simplicity, cost, and reliability.


Try PageBolt Free

100 requests/month. No credit card. No infrastructure required.

Start your free trial and see how simple screenshots can be in Python.
