DEV Community

Aleksei Aleinikov

Using Playwright + Bright Data Browser API in a Kubernetes Scraping Pipeline

A Playwright script that works on your laptop can still fail once it becomes a scheduled scraping worker in production.

In this post, I break down a practical setup using Playwright, Bright Data Browser API, and Kubernetes Jobs/CronJobs to run browser-based scraping more reliably for JavaScript-heavy targets.

It covers:

  1. remote browser execution over CDP (Chrome DevTools Protocol)
  2. lightweight worker containers
  3. Kubernetes Job/CronJob scheduling
  4. secret handling with Kubernetes Secrets
  5. avoiding overlapping runs in production
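To make the remote-execution part concrete, here is a minimal sketch of point 1: instead of launching Chromium inside the worker container, Playwright attaches to a remote browser over CDP with `connect_over_cdp`. The `BRIGHT_DATA_WSS` variable name and the placeholder endpoint are assumptions for illustration; in the pipeline described above, the real endpoint would come from a Kubernetes Secret.

```python
import os


def cdp_endpoint() -> str:
    """Read the remote browser WebSocket endpoint from the environment.

    In the Kubernetes setup this variable is mounted from a Secret;
    BRIGHT_DATA_WSS is a placeholder name, not an official one.
    """
    wss = os.environ.get("BRIGHT_DATA_WSS", "")
    if not wss.startswith("wss://"):
        raise RuntimeError("BRIGHT_DATA_WSS must be a wss:// CDP endpoint")
    return wss


def scrape(url: str) -> str:
    # Imported lazily so the module can be loaded without Playwright installed.
    from playwright.sync_api import sync_playwright

    with sync_playwright() as p:
        # connect_over_cdp attaches to an already-running remote Chromium,
        # so the worker image never needs to bundle a browser.
        browser = p.chromium.connect_over_cdp(cdp_endpoint())
        page = browser.new_page()
        page.goto(url, wait_until="networkidle")
        html = page.content()
        browser.close()
        return html


if __name__ == "__main__":
    print(scrape("https://example.com")[:200])
```

Because the browser runs remotely, the worker container only ships the Playwright client library, which keeps the image small (point 2 in the list).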

If you are building real scraping pipelines rather than one-off demos, this walkthrough may be useful.
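For the scheduling, secret-handling, and non-overlap points, a CronJob along these lines is one way to wire them together. All names (the Secret, key, and image) are placeholders; the key detail is `concurrencyPolicy: Forbid`, which tells Kubernetes to skip a scheduled run while the previous one is still active.

```yaml
apiVersion: batch/v1
kind: CronJob
metadata:
  name: scrape-worker
spec:
  schedule: "0 * * * *"           # hourly
  concurrencyPolicy: Forbid       # no overlapping runs
  startingDeadlineSeconds: 300    # skip a run that cannot start in time
  jobTemplate:
    spec:
      backoffLimit: 2             # retry a failed pod at most twice
      activeDeadlineSeconds: 1800 # kill runs stuck longer than 30 min
      template:
        spec:
          restartPolicy: Never
          containers:
            - name: worker
              image: registry.example.com/scrape-worker:latest  # placeholder
              env:
                - name: BRIGHT_DATA_WSS
                  valueFrom:
                    secretKeyRef:
                      name: bright-data-credentials  # placeholder Secret
                      key: wss-endpoint
```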

https://levelup.gitconnected.com/using-playwright-bright-datas-browser-api-in-a-kubernetes-deployed-scraping-pipeline-e914b4e1800e?sk=19e9162cbbf9e7cb6e7b8f0c21510c3b
