Selenium can take a screenshot in one line, but it captures only the visible viewport by default, and unlike Playwright or Puppeteer it has no built-in full-page flag. That gap is where most of the work hides. This guide covers the basic capture, the full-page workarounds Selenium actually needs, single-element shots, and the point where an API is less effort than driving a browser.
The basic Selenium screenshot
In Python, the driver captures the current window directly:
from selenium import webdriver
driver = webdriver.Chrome()
driver.get('https://example.com')
driver.save_screenshot('example.png')
driver.quit()
In Java, you cast the driver to TakesScreenshot and choose an output type:
File shot = ((TakesScreenshot) driver).getScreenshotAs(OutputType.FILE);
Both capture the visible viewport only. If the page is taller than the window, everything below the fold is missing. This is the single biggest difference from Puppeteer and Playwright, which both expose a one-line full-page option.
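Because the capture is the viewport, the window size decides what ends up in the image. Here is a minimal sketch of pinning that size before capturing. The helper name `capture_viewport` and the 1280x800 default are assumptions for illustration, and `set_window_size` sets the outer window, so the rendered viewport can come out slightly smaller in a non-headless browser.

```python
def capture_viewport(driver, url, path, width=1280, height=800):
    """Load url at a fixed window size and save the visible viewport to path.

    capture_viewport is a hypothetical helper; driver is any Selenium WebDriver.
    """
    # Sets the outer window size; the viewport itself may be a little smaller.
    driver.set_window_size(width, height)
    driver.get(url)
    # save_screenshot returns False when the file could not be written.
    if not driver.save_screenshot(path):
        raise OSError(f"could not write screenshot to {path}")
    return path
```

Raising on a failed write keeps a silent `False` from turning into a missing artifact later in the pipeline.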
Full-page screenshots (the part Selenium does not do for you)
Core Selenium has no fullPage flag. You have three real options.
1. Chrome DevTools Protocol. Selenium 4 can send CDP commands to Chromium-based browsers such as Chrome and Edge. Page.captureScreenshot with captureBeyondViewport: true renders the entire page:
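import base64

# Ask Chromium to render beyond the visible viewport
result = driver.execute_cdp_cmd('Page.captureScreenshot', {
    'captureBeyondViewport': True,
    'fromSurface': True,
})
# The image comes back base64-encoded in result['data']
with open('full.png', 'wb') as f:
    f.write(base64.b64decode(result['data']))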
2. Firefox's native method. GeckoDriver exposes a full-page call that Chrome does not:
driver.get_full_page_screenshot_as_file('full.png') # Firefox only
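3. Resize, or scroll and stitch. Either resize the window to the full scroll height and capture once, which usually needs headless mode because a visible window is typically capped at the screen size, or scroll in viewport-sized steps and stitch the captures together. Both are brittle on lazy-loaded pages, and position: fixed headers repeat or float in a stitched result.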
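The resize variant of option 3 can be sketched roughly as follows. The helper name `capture_by_resizing` and the `max_height` cap are assumptions for illustration, and the approach is only likely to behave in headless mode, where the window is free to grow past the screen.

```python
def capture_by_resizing(driver, path, width=1280, max_height=16000):
    """Grow the window to the document's scroll height, capture, then restore.

    Hypothetical helper; max_height is an arbitrary safety cap, not a Selenium limit.
    """
    original = driver.get_window_size()  # dict with 'width' and 'height'
    height = driver.execute_script(
        "return Math.max(document.body.scrollHeight,"
        " document.documentElement.scrollHeight);"
    )
    try:
        driver.set_window_size(width, min(int(height), max_height))
        return driver.save_screenshot(path)
    finally:
        # Put the window back even if the capture raised.
        driver.set_window_size(original["width"], original["height"])
```

Lazy-loaded content still will not appear unless something scrolls the page or waits for it first, so treat this as a fallback.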
The takeaway: full-page capture in Selenium is browser-specific and more code than the other tools. If full page is your main need, that is worth knowing before you build on it.
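To make the browser-specific split concrete, here is a sketch of a single helper that picks a route by checking which method the driver object offers. The name `capture_full_page` is hypothetical. The two calls it relies on, `get_full_page_screenshot_as_file` on Firefox drivers and `execute_cdp_cmd` on Chromium drivers, are the ones shown in options 1 and 2.

```python
import base64


def capture_full_page(driver, path):
    """Save a full-page PNG via whichever browser-specific route the driver has.

    Hypothetical helper: it duck-types the driver instead of checking its class.
    """
    if hasattr(driver, "get_full_page_screenshot_as_file"):
        # Firefox (GeckoDriver) route
        return driver.get_full_page_screenshot_as_file(path)
    if hasattr(driver, "execute_cdp_cmd"):
        # Chromium route through the DevTools Protocol
        result = driver.execute_cdp_cmd(
            "Page.captureScreenshot", {"captureBeyondViewport": True}
        )
        with open(path, "wb") as f:
            f.write(base64.b64decode(result["data"]))
        return True
    raise NotImplementedError("no full-page route for this driver")
```

Any driver that offers neither method fails loudly here, which is usually preferable to quietly falling back to a viewport-only image.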
Screenshotting a single element
Call the screenshot method on the WebElement instead of the driver. Selenium crops to the element's bounding box:
from selenium.webdriver.common.by import By
card = driver.find_element(By.ID, 'pricing-card')
card.screenshot('card.png')
If the element is below the fold, scroll it into view first so it actually renders before the capture:
driver.execute_script('arguments[0].scrollIntoView();', card)
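The two steps can be folded into one helper. This is a sketch under assumptions: `capture_element` is a hypothetical name, and centering the element with `block: 'center'` is a choice made here to keep it clear of fixed headers, not something Selenium requires.

```python
def capture_element(driver, element, path):
    """Scroll element into view, then save a screenshot cropped to it.

    Hypothetical helper; element is a Selenium WebElement.
    """
    # Center the element so sticky headers are less likely to overlap it.
    driver.execute_script(
        "arguments[0].scrollIntoView({block: 'center', inline: 'nearest'});",
        element,
    )
    # WebElement.screenshot returns False when the file could not be written.
    if not element.screenshot(path):
        raise OSError(f"could not write element screenshot to {path}")
    return path
```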
Waiting so the capture is not blank
The most common bug is capturing before the page is ready. Use an explicit wait keyed to something real on the page rather than a fixed sleep:
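from selenium.webdriver.common.by import By
from selenium.webdriver.support.ui import WebDriverWait
from selenium.webdriver.support import expected_conditions as EC

# Wait up to 10 seconds for the element to be visible, not just present in the DOM
WebDriverWait(driver, 10).until(
    EC.visibility_of_element_located((By.ID, 'pricing-card'))
)
driver.save_screenshot('ready.png')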
For web fonts specifically, wait on document.fonts.ready via execute_script, otherwise the screenshot can show a fallback font.
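One way to do that is with `execute_async_script`, which hands the page a callback and blocks until the page calls it. A minimal sketch, where the helper name `wait_for_fonts` and the 10-second default are assumptions:

```python
# The last argument Selenium passes to an async script is the "done" callback.
FONTS_READY_JS = """
const done = arguments[arguments.length - 1];
document.fonts.ready.then(() => done(true));
"""


def wait_for_fonts(driver, timeout_s=10):
    """Block until the page reports its web fonts have finished loading.

    Hypothetical helper; Selenium raises a timeout error if timeout_s elapses first.
    """
    driver.set_script_timeout(timeout_s)
    return driver.execute_async_script(FONTS_READY_JS)
```

Call it after the page-ready wait and before `save_screenshot`, so the capture happens once both the layout and the fonts have settled.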
Where running Selenium just for screenshots gets expensive
Selenium is a browser-testing framework. Using it only to produce images means you own everything that comes with driving a browser, none of which is the screenshot itself:
- Drivers and browsers. You provision ChromeDriver or GeckoDriver, keep it matched to the browser version, and install the browser plus its system libraries in every environment.
- Full-page workarounds. The CDP or scroll-and-stitch code above is yours to maintain across browser updates.
- Memory and concurrency. Each job needs its own browser, that browser has to be closed on every error path, and once you capture at volume you need a pool with back-pressure.
If screenshots are the goal and the test suite is not, that is a lot of moving parts for an image.
The same capture as an API call
A screenshot API runs the browser for you. The full-page capture that took browser-specific code above becomes one request, with a real full_page flag:
curl https://api.grabbit.live/v1/grabs \
-H "Authorization: Bearer sk_live_..." \
-H "Content-Type: application/json" \
-d '{
"url": "https://example.com",
"width": 1280,
"full_page": true,
"format": "webp"
}'
The response includes a hosted image_url you can use directly:
{
"id": "grb_01jx...",
"status": "done",
"image_url": "https://cdn.grabbit.live/grabs/grb_01jx....webp",
"width": 1280,
"format": "webp",
"bytes": 48210,
"execution_ms": 1180
}
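The same request can be made from Python with only the standard library. This is a sketch built from the curl example above: the endpoint, the fields, and `image_url` come from that example, while the function names `build_grab_request` and `grab` are hypothetical. It also assumes the response arrives already finished, as in the sample JSON.

```python
import json
import urllib.request

API_URL = "https://api.grabbit.live/v1/grabs"  # endpoint from the curl example


def build_grab_request(api_key, page_url, width=1280, full_page=True, fmt="webp"):
    """Build the POST request from the curl example (hypothetical helper)."""
    body = json.dumps(
        {"url": page_url, "width": width, "full_page": full_page, "format": fmt}
    ).encode("utf-8")
    return urllib.request.Request(
        API_URL,
        data=body,
        method="POST",
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )


def grab(api_key, page_url, **options):
    """Send the request and return image_url from the JSON response."""
    request = build_grab_request(api_key, page_url, **options)
    with urllib.request.urlopen(request, timeout=30) as response:
        return json.load(response)["image_url"]
```

Keeping request construction separate from sending makes the payload easy to inspect or unit-test without touching the network.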
The Selenium patterns map onto request parameters: viewport size is width (320 to 1920) and height (240 to 1080), full page is full_page, the element-screenshot pattern is a selector field, and the explicit wait becomes delay_ms (0 to 10000). format is png, jpeg, or webp.
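Those ranges are easy to check on the client before sending anything. A sketch using only the limits quoted above: the helper name `validate_grab_params` is hypothetical, and whether the API itself rejects or clamps out-of-range values is not stated here, so this simply raises.

```python
# Ranges and formats as listed in the paragraph above.
LIMITS = {"width": (320, 1920), "height": (240, 1080), "delay_ms": (0, 10000)}
FORMATS = {"png", "jpeg", "webp"}


def validate_grab_params(params):
    """Raise ValueError if a known parameter is outside its documented range.

    Hypothetical client-side check; unknown keys such as selector pass through.
    """
    for name, (low, high) in LIMITS.items():
        if name in params and not low <= params[name] <= high:
            raise ValueError(f"{name} must be between {low} and {high}")
    if "format" in params and params["format"] not in FORMATS:
        raise ValueError(f"format must be one of {sorted(FORMATS)}")
    return params
```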
Which to use
Reach for Selenium when you already run a Selenium test suite and a screenshot is one assertion or artifact among many. Reach for an API when screenshots are the actual product feature, especially if you need full-page captures and would rather not maintain CDP calls and driver versions to get them.
For the same comparison in other tools, see screenshots in Puppeteer and screenshots in Playwright. If you are weighing hosted options, the honest comparison of screenshot APIs covers the trade-offs without the marketing.
Originally published on the Grabbit blog.