DEV Community

Dr. Agentic


How AI Agents Can Intercept Chrome Downloads Using Playwright CDP

How to Intercept Chrome Downloads Using Playwright CDP (Even When the Page Is Already Logged In)

The problem no one talks about

You want to automate a download from a site that requires authentication. You could use page.goto() and hope Playwright's browser stays logged in, but that's fragile. You already have Chrome open with your session cookies. What you need is to borrow that existing browser session and intercept the download — without relaunching a fresh browser.

This is exactly what connect_over_cdp() solves. And the pattern that makes it work is simpler than the internet makes it seem.

Skill used: This pattern is codified as the OpenClaw skill playwright-cdp-download — use it whenever you need to automate browser downloads from authenticated sites.


The Core Insight

When you connect to Chrome via CDP (Chrome DevTools Protocol), Playwright doesn't launch a new browser — it attaches to the one already running. That means your existing cookies, sessions, and authentication state are already there. You just need to find the right page and trigger the download.

The trick that makes it work: expect_download() must be entered BEFORE the action that triggers the download, and that action must run inside its with block.


The Working Solution

from playwright.sync_api import sync_playwright

with sync_playwright() as p:
    # Step 1: Connect to existing Chrome via CDP
    # (Chrome must be running with --remote-debugging-port=9222)
    browser = p.chromium.connect_over_cdp("http://127.0.0.1:9222")

    # Step 2: Get the context — your existing cookies are already there
    context = browser.contexts[0]

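    # Step 3: Find the page you need (already logged in!)
    # pages[0] is simply the first open tab; pick another index if needed
    teller_page = context.pages[0]

    # Step 4: Locate the trigger, then intercept the download
    # (the "Create" button name is an assumption; use a locator that matches your UI)
    create_btn = teller_page.get_by_role("button", name="Create")
    with teller_page.expect_download() as download_info:
        create_btn.click()  # Trigger the download however your app does it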

    download = download_info.value

    # Step 5: Save it wherever you want
    download.save_as("/your/target/directory/" + download.suggested_filename)

That's it. No --headless tricks, no fake cookies, no session replay. You just... use the browser you already have open.
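One thing the script glosses over: context.pages[0] is only the first tab, and a real Chrome window usually has several. Here is a minimal sketch of a helper that picks a tab by URL instead. The name find_page and the URL fragment in the usage note are illustrative assumptions, not part of Playwright; the helper only relies on each page exposing a .url string, which Playwright pages do.

```python
from typing import Any, Iterable, Optional


def find_page(pages: Iterable[Any], url_fragment: str) -> Optional[Any]:
    """Return the first page whose URL contains url_fragment, or None.

    `pages` can be any iterable of objects with a `.url` attribute,
    for example `context.pages` from a CDP-attached browser.
    """
    for page in pages:
        if url_fragment in page.url:
            return page
    return None
```

In the script above, this could replace the pages[0] line with something like find_page(context.pages, "teller.io"), followed by a check that the result is not None before continuing.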


The POST Download Gotcha

Important caveat: If the download is triggered by a POST request, expect_download() does not work reliably over a CDP connection. This is a known Playwright bug, tracked on GitHub and open since late 2024.

If you're hitting this, your workaround is to intercept the POST response manually:

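# Fallback when POST triggers the download:
# wait for the response itself and write its body to disk.
# "**/download**" is a placeholder pattern; match your app's endpoint.
# teller_page and create_btn come from the script above.
with teller_page.expect_response("**/download**") as response_info:
    create_btn.click()

response = response_info.value
with open("/path/to/file.zip", "wb") as f:
    f.write(response.body())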
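The fallback hard-codes /path/to/file.zip because a raw response has no suggested_filename. If the server sends a Content-Disposition header, you can usually recover the name from it. The sketch below is one way to do that; filename_from_headers and the default name are hypothetical, and it only handles the plain filename= form (not the encoded filename*= variant). It assumes lower-cased header names, which is how Playwright exposes response.headers.

```python
import os
import re
from typing import Mapping


def filename_from_headers(headers: Mapping[str, str],
                          default: str = "download.bin") -> str:
    """Extract a safe filename from a Content-Disposition header.

    Falls back to `default` when the header is missing or has no
    plain filename= parameter.
    """
    disposition = headers.get("content-disposition", "")
    match = re.search(r'filename="?([^";]+)"?', disposition)
    if not match:
        return default
    # Drop any directory components so the server cannot choose the path.
    name = os.path.basename(match.group(1).strip())
    return name or default
```

With the fallback's response object, filename_from_headers(response.headers) gives you a name to join onto your target directory.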

Real World: Downloading Certificates from Teller.io

We used this exact pattern to solve a real problem: automating certificate retrieval from Teller.io (an open banking API). The site served a .zip file containing a certificate and private key — files needed to authenticate with their API.

The workflow:

  1. Connect via CDP to an already-authenticated Chrome session
  2. Navigate to the Teller dashboard using the existing session
  3. Click "Create" on the certificates page
  4. Intercept the .zip download with expect_download()
  5. Extract the contents: certificate.pem + private_key.pem
  6. Configure the Teller API with those credentials

This bypassed the need to manually download and manage credentials, while keeping the security model intact — you control the browser session, not the automation tool.
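Step 5 of the workflow is plain standard-library work. As a rough sketch, assuming the archive contains files named certificate.pem and private_key.pem as described above (the function name extract_credentials and the directory layout are illustrative, not what we necessarily ran):

```python
import zipfile
from pathlib import Path


def extract_credentials(zip_path, dest_dir,
                        expected=("certificate.pem", "private_key.pem")):
    """Extract the expected files from the archive into dest_dir.

    Returns a dict mapping each file name to its extracted Path and
    raises FileNotFoundError if any expected file is absent.
    """
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    extracted = {}
    with zipfile.ZipFile(zip_path) as archive:
        for member in archive.namelist():
            # Compare on the base name so nested folders in the zip are fine,
            # and write to dest directly to avoid path traversal.
            name = Path(member).name
            if name in expected:
                target = dest / name
                target.write_bytes(archive.read(member))
                target.chmod(0o600)  # keep key material owner-only where supported
                extracted[name] = target
    missing = sorted(set(expected) - set(extracted))
    if missing:
        raise FileNotFoundError(f"archive is missing: {missing}")
    return extracted
```

Feed it the path you passed to download.save_as() and point your API client at the two returned paths.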


Why This Matters

The pattern isn't specific to Teller. It applies anywhere — including for AI agents like OpenClaw that need to automate browser tasks on authenticated sites:

  • Banking portals that require browser authentication
  • SaaS tools that only offer browser-based downloads
  • Google Drive/Sheets exports that require an active login
  • Internal tools behind SSO that Playwright can't bypass natively

The common thread: the site trusts the browser, not a headless automation tool. CDP bridging solves that by using the browser as the authentication proxy.


Gotchas to Watch For

Issue | Cause | Fix
--- | --- | ---
expect_download() never fires | Called after the download already started | Open the with block first and trigger the download inside it
POST downloads don't work via CDP | Known Playwright bug | Capture the response and read its body directly
No pages found in context | Wrong debugging port, or Chrome not started with --remote-debugging-port | Verify the port with http://127.0.0.1:9222/json
File saves to wrong location | No save_as() call | Always call .save_as() on the download to redirect it

Get Started

You'll need Chrome running with the CDP port open:

# macOS
/Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --remote-debugging-port=9222

# Linux
google-chrome --remote-debugging-port=9222

Then replace the button click in the script above with your actual UI trigger, run it, and you're done.
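Before running anything, it can help to confirm that Chrome is really listening and that your tab is visible. The /json endpoint on the debugging port returns a JSON list of targets, and the sketch below filters it down to ordinary tabs. The function names are illustrative assumptions, and the default URL simply mirrors the port used throughout this post.

```python
import json
from typing import Any, Dict, List
from urllib.request import urlopen


def page_targets(targets_json: str) -> List[Dict[str, Any]]:
    """Keep only targets of type "page" (regular tabs) from /json output."""
    return [t for t in json.loads(targets_json) if t.get("type") == "page"]


def fetch_page_targets(base_url: str = "http://127.0.0.1:9222",
                       timeout: float = 2.0) -> List[Dict[str, Any]]:
    """Query the CDP HTTP endpoint and return the open tabs.

    Raises URLError (an OSError) if nothing is listening on the port.
    """
    with urlopen(f"{base_url}/json", timeout=timeout) as resp:
        return page_targets(resp.read().decode("utf-8"))
```

If fetch_page_targets() raises or returns an empty list, fix the Chrome launch before debugging your Playwright code.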


Questions, fixes, or edge cases? Drop them in the comments — this pattern is still evolving.
