You're building a Python app and need to take screenshots. Maybe you're:
- Building a social media link preview service
- Auto-generating OG images for Flask/Django apps
- Testing web interfaces programmatically
- Monitoring website changes
- Archiving web content for compliance
You search "Python screenshot library" and find Selenium. It's been around for years. It's on PyPI. Thousands of projects use it.
Three days later, you're debugging WebDriver timeouts, wrestling with Firefox vs Chrome, and wondering why your screenshots look different on different machines.
There's a simpler way. Let me show you both approaches, Selenium and a hosted API, so you can decide which fits your project.
The Selenium Approach
Selenium is a browser automation framework. Here's the minimal example:
from selenium import webdriver
from selenium.webdriver.common.by import By
driver = webdriver.Chrome()
driver.get('https://example.com')
driver.save_screenshot('screenshot.png')
driver.quit()
That's six lines. Simple, right?
But in production, this becomes a nightmare. Here's what you'll actually need:
1. Browser Driver Management
from selenium import webdriver
from webdriver_manager.chrome import ChromeDriverManager
from selenium.webdriver.chrome.service import Service
service = Service(ChromeDriverManager().install())
driver = webdriver.Chrome(service=service)
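Now you need webdriver-manager as a dependency. It works, but adds complexity. (Selenium 4.6 and later bundle Selenium Manager, which resolves the driver for you, but you still have to keep a matching browser installed.)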
2. Headless Rendering & Options
from selenium import webdriver
from selenium.webdriver.chrome.options import Options
options = Options()
options.add_argument('--headless')
options.add_argument('--no-sandbox')
options.add_argument('--disable-dev-shm-usage')
options.add_argument('--disable-gpu')
options.add_argument('--window-size=1280,720')
options.add_argument('--user-agent=Mozilla/5.0...')
driver = webdriver.Chrome(options=options)
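Each flag is a gotcha. You need --no-sandbox when Chrome runs as root, which is the default in many Linux containers; on a macOS desktop, you don't. In Docker, you need --disable-dev-shm-usage because /dev/shm is only 64MB by default. Get it wrong and Chrome can crash or fail to start, with an error that says little about the real cause.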
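One way to keep these flags from drifting between machines is to compute them in a single place. Here's a minimal sketch. Note that `chrome_flags` and its parameters are hypothetical names for illustration, not part of Selenium, and the environment rules are assumptions: --no-sandbox only when running as root, --disable-dev-shm-usage only inside a container.

```python
def chrome_flags(in_container=False, as_root=False, width=1280, height=720):
    """Return Chrome command-line flags for a headless screenshot run.

    Hypothetical helper for illustration; the environment rules are assumptions.
    """
    flags = ['--headless', f'--window-size={width},{height}']
    if as_root:
        # Chrome will not run its sandbox as root (typical inside containers).
        flags.append('--no-sandbox')
    if in_container:
        # Containers often have a small /dev/shm; this makes Chrome use /tmp.
        flags.append('--disable-dev-shm-usage')
    return flags


# Usage with Selenium (not executed here):
#   options = Options()
#   for flag in chrome_flags(in_container=True, as_root=True):
#       options.add_argument(flag)
```

Because the helper is plain Python, you can unit-test the flag logic without launching a browser.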
3. Waits and Timing
import time
from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC
driver.get(url)
# Wait for the <body> element to be present in the DOM
WebDriverWait(driver, 10).until(
    EC.presence_of_element_located((By.TAG_NAME, 'body'))
)
# Wait for JavaScript to finish (how long? nobody knows)
time.sleep(2)
# Now take screenshot
driver.save_screenshot('screenshot.png')
How long should you wait? 1 second? 2? 5? Too short and your screenshot shows a blank page. Too long and your service times out.
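A common alternative to a fixed sleep is to poll for a page-specific "ready" signal and give up after a deadline. The sketch below is a generic helper, not Selenium's own API. The name `wait_until` and the `#content` selector in the usage comment are hypothetical, and you still have to decide what "ready" means for each page.

```python
import time


def wait_until(predicate, timeout=10.0, interval=0.25):
    """Poll predicate() until it returns a truthy value or the timeout elapses.

    Returns True if the condition was met, False on timeout.
    """
    deadline = time.monotonic() + timeout
    while True:
        if predicate():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


# Usage with Selenium (not executed here); '#content' is a hypothetical selector:
#   ready = wait_until(
#       lambda: driver.find_elements(By.CSS_SELECTOR, '#content'),
#       timeout=10,
#   )
#   if ready:
#       driver.save_screenshot('screenshot.png')
```

Selenium's WebDriverWait does the same kind of polling when you pass it a callable, so this mostly helps when the signal you're waiting on lives outside the browser.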
4. Error Handling & Cleanup
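from selenium.common.exceptions import TimeoutException, WebDriverException
driver = None
try:
    driver = webdriver.Chrome(options=options)
    driver.get(url)
    driver.save_screenshot(filename)
except TimeoutException:
    print('Page took too long to load')
except WebDriverException as e:
    print(f'Selenium error: {e}')
except Exception as e:
    print(f'Unknown error: {e}')
finally:
    # driver is still None if Chrome failed to start
    if driver is not None:
        driver.quit()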
Now multiply this by every function that takes a screenshot. You're writing the same error handling 10+ times.
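You can cut the repetition by putting the driver lifecycle in one context manager. This is a sketch under assumptions: `managed_driver` and `capture` are hypothetical names, and the driver is created by a `factory` callable you pass in, so the same wrapper works with Chrome, Firefox, or a fake driver in tests.

```python
from contextlib import contextmanager


@contextmanager
def managed_driver(factory):
    """Create a driver via factory() and always quit it afterwards."""
    driver = factory()
    try:
        yield driver
    finally:
        # Runs on success and on any exception raised inside the with-block.
        driver.quit()


def capture(factory, url, filename):
    """Take one screenshot; return True on success, False on any error."""
    try:
        with managed_driver(factory) as driver:
            driver.get(url)
            driver.save_screenshot(filename)
        return True
    except Exception as exc:  # narrow this to WebDriverException in real code
        print(f'Screenshot failed for {url}: {exc}')
        return False


# Usage with Selenium (not executed here):
#   capture(lambda: webdriver.Chrome(options=options),
#           'https://example.com', 'screenshot.png')
```

It's still code you own and maintain, but the cleanup logic now lives in exactly one place.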
5. Infrastructure & Scaling
Selenium runs Chrome locally. Chrome takes 100–200MB of RAM per instance. If you have 10 concurrent requests, you need 1–2GB of RAM just for browsers.
In production:
- Deploy to a Docker container with Chrome installed (adds 400MB to your image)
- Monitor memory usage
- Handle browser crashes
- Implement a queue for concurrent requests
- Scale horizontally (add more servers = add more complexity)
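The queue from the list above can start as a bounded worker pool, which caps how many browsers run at once. A minimal sketch using only the standard library follows. `capture_all` and `capture_one` are hypothetical names, and `capture_one` stands in for whatever function launches a browser and takes a single screenshot.

```python
from concurrent.futures import ThreadPoolExecutor


def capture_all(urls, capture_one, max_browsers=4):
    """Run capture_one(url) for every URL with at most max_browsers in flight.

    Returns a dict mapping each URL to its result, or to the exception raised.
    """
    results = {}
    with ThreadPoolExecutor(max_workers=max_browsers) as pool:
        # Extra submissions wait in the executor's internal queue.
        futures = {url: pool.submit(capture_one, url) for url in urls}
        for url, future in futures.items():
            try:
                results[url] = future.result()
            except Exception as exc:
                results[url] = exc
    return results
```

With `max_browsers=4` and 100–200MB per Chrome instance, peak browser memory stays around 400–800MB no matter how many URLs are queued. A single process like this does nothing for crashes or multi-server scaling.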
Real-world cost at 10,000 screenshots/month:
- Server: $400/month (2GB RAM, 2 CPU)
- DevOps/monitoring: $2,000/month (on-call, incident response)
- Total: $2,400/month
And that's before you hit the limits of a single server.
The API Approach
Here's the same task with a hosted screenshot API:
import requests
response = requests.post(
    'https://api.pagebolt.io/api/v1/screenshot',
    headers={'Authorization': 'Bearer YOUR_API_KEY'},
    json={'url': 'https://example.com'}
)
with open('screenshot.png', 'wb') as f:
    f.write(response.content)
That's it. 8 lines, including the file write. No browser management. No memory leaks. No infrastructure.
Here's a more realistic production-ready example:
import logging
import os
import requests
logger = logging.getLogger(__name__)
def take_screenshot(url, filename, timeout=10):
    """Take a screenshot of a URL and save to file."""
    try:
        response = requests.post(
            'https://api.pagebolt.io/api/v1/screenshot',
            # The API key is read from the environment, never hard-coded
            headers={'Authorization': f'Bearer {os.environ["PAGEBOLT_API_KEY"]}'},
            json={
                'url': url,
                'width': 1280,
                'height': 720,
                'blockAds': True,
                'blockBanners': True
            },
            timeout=timeout
        )
        if response.status_code != 200:
            logger.error(f'Screenshot failed: {response.status_code} {response.text}')
            raise RuntimeError(f'API returned {response.status_code}')
        with open(filename, 'wb') as f:
            f.write(response.content)
        logger.info(f'Screenshot saved: {filename}')
        return filename
    except requests.Timeout:
        logger.error(f'Screenshot request timed out: {url}')
        raise
    except requests.RequestException as e:
        logger.error(f'Screenshot request failed: {e}')
        raise
That's about 30 lines of real, production-ready code. Compare it to managing Selenium.
Feature Comparison
| Feature | Selenium | Hosted API |
|---|---|---|
| Setup time | 1 hour | 5 minutes |
| Lines of code | 150+ (with pools, error handling) | 30–50 |
| Infrastructure | You manage Chrome, memory, scaling | Handled for you |
| Cost (10k screenshots/month) | $2,400/month | $29/month |
| Device presets | Manual setup | Built-in (25+ presets) |
| PDF generation | Extra library, extra complexity | One parameter |
| Reliable waits | Guessing (sleep) | Built-in (networkidle) |
| Retry logic | You implement it | Included |
| Uptime | Depends on your infrastructure | 99.9% SLA |
| Monitoring | You do it | Included |
When to Use Selenium
Use Selenium if:
- You're testing internal web applications (a hosted API can't reach them)
- You need to interact with JavaScript heavily (click buttons, fill forms, verify state changes)
- You're learning web automation (educational context)
- You have existing Selenium test infrastructure and want to integrate screenshots
- You can afford the infrastructure and maintenance cost
When to Use an API
Use an API if:
- You want screenshots in production without infrastructure headaches
- You need reliability and uptime guarantees
- You're taking static screenshots (no complex interactions needed)
- You want to scale without managing more servers
- You want to focus on your app, not on browser management
Real-World Example: Django Link Preview Service
You're building a service that generates preview cards for shared links (like Discord does).
With Selenium:
- Install Chrome in Docker
- Set up WebDriver
- Handle timeouts and retries
- Monitor memory usage
- Deploy to a server with enough RAM
- Scale horizontally as traffic grows
- Cost: $2,400+/month in infrastructure
With an API:
- pip install requests
- Call the API endpoint
- Save the image
- Return to user
- Cost: $29/month
The API approach takes one day. The Selenium approach takes two weeks.
Code Example: Flask Link Preview
import base64
import os
import requests
from flask import Flask, request, jsonify
app = Flask(__name__)
@app.route('/api/preview', methods=['POST'])
def create_preview():
    """Generate a link preview card."""
    # silent=True returns None instead of raising on a missing or invalid JSON body
    data = request.get_json(silent=True) or {}
    url = data.get('url')
    if not url:
        return jsonify({'error': 'URL required'}), 400
    try:
        # Step 1: Take screenshot
        response = requests.post(
            'https://api.pagebolt.io/api/v1/screenshot',
            headers={'Authorization': f'Bearer {os.environ["PAGEBOLT_API_KEY"]}'},
            json={
                'url': url,
                'width': 1200,
                'height': 630,
                'blockAds': True,
                'blockBanners': True
            },
            timeout=10
        )
        if response.status_code != 200:
            return jsonify({'error': 'Screenshot failed'}), 500
        # Step 2: Get page metadata (optional)
        meta_response = requests.post(
            'https://api.pagebolt.io/api/v1/inspect',
            headers={'Authorization': f'Bearer {os.environ["PAGEBOLT_API_KEY"]}'},
            json={'url': url},
            timeout=10
        )
        metadata = meta_response.json() if meta_response.ok else {}
        # Step 3: Return preview card
        # response.url is only the API endpoint; the image bytes are in
        # response.content, so embed them as a data URI (assumes PNG output,
        # as in the earlier examples)
        image_b64 = base64.b64encode(response.content).decode('ascii')
        return jsonify({
            'title': metadata.get('title', 'Untitled'),
            'description': metadata.get('description', ''),
            'image_url': f'data:image/png;base64,{image_b64}',
            'url': url
        })
    except requests.Timeout:
        return jsonify({'error': 'Request timed out'}), 504
    except Exception as e:
        return jsonify({'error': str(e)}), 500
if __name__ == '__main__':
    app.run(debug=False)
That's everything. No Selenium. No browser management. No infrastructure.
Hybrid Approach
Some teams use both:
- Selenium for QA automation (testing interactions, verifying UI state)
- API for customer-facing features (link previews, screenshot galleries)
Selenium shines when you need to interact with the page. APIs shine when you just need static screenshots.
The Bottom Line
Selenium is powerful if you need it. But most teams don't. They need screenshots, and they need them to work without becoming infrastructure engineers.
An API costs $29/month and takes 5 minutes to integrate. Selenium costs $2,400+/month and two weeks to get right.
The choice depends on your use case. But for most Python projects, the API wins on simplicity, cost, and reliability.
Try PageBolt Free
100 requests/month. No credit card. No infrastructure required.
Start your free trial and see how simple screenshots can be in Python.