Logging into websites every single run is the fastest way to:
trigger anti-bot systems
face endless CAPTCHAs
get accounts locked
slow down your scraper
The correct pattern is simple:
Log in once → persist the browser profile → reuse it forever (until it expires).
In this article I will show how this works in a real project:
https://github.com/AmaLS367/parts_info_collector
This project automates data extraction from Gemini's web UI and keeps authentication between runs using a persistent browser profile.
The Core Idea
Instead of exporting cookies manually, Playwright can launch Chromium with a persistent user profile directory.
That directory stores:
cookies
localStorage
IndexedDB
login tokens
session metadata
Once created, it becomes your reusable authenticated identity.
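To see the persistence in action, here is a minimal sketch of my own (not code from the project): a value written to localStorage in one run is still there in the next run, because it lives on disk in user_data_dir.

from playwright.sync_api import sync_playwright

# Run 1: write something into localStorage inside the persistent profile.
with sync_playwright() as p:
    ctx = p.chromium.launch_persistent_context(user_data_dir="user-data", headless=True)
    page = ctx.new_page()
    page.goto("https://example.com")
    page.evaluate("localStorage.setItem('demo', 'still here')")
    ctx.close()

# Run 2: a brand-new launch of the same profile sees the same value.
with sync_playwright() as p:
    ctx = p.chromium.launch_persistent_context(user_data_dir="user-data", headless=True)
    page = ctx.new_page()
    page.goto("https://example.com")
    print(page.evaluate("localStorage.getItem('demo')"))  # -> "still here"
    ctx.close()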
How the Project Does It
In parts_info_collector, authentication is handled via:
a user-data/ directory
first interactive run via first_start.bat
manual login to Gemini
all subsequent runs reuse the same browser profile
So the flow is:
1. Run once with the UI
2. Log in manually
3. Browser profile is saved
4. Subsequent runs are headless and already authenticated
No repeated login.
No constant CAPTCHA hell.
Persistent Context in Playwright
This is the key API:
browser_context = playwright.chromium.launch_persistent_context(
    user_data_dir="user-data",
    headless=True
)
That single folder is everything.
If it exists, Playwright loads it.
If not, you run in visible mode and authenticate.
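That check is not automatic. Here is one way to wire it up, a sketch that assumes the profile lives in a user-data/ folder next to the script (the project itself drives this via first_start.bat instead):

import os
from playwright.sync_api import sync_playwright

USER_DATA_DIR = "user-data"
first_run = not os.path.isdir(USER_DATA_DIR)  # no saved profile yet -> we still need to log in

with sync_playwright() as playwright:
    browser_context = playwright.chromium.launch_persistent_context(
        user_data_dir=USER_DATA_DIR,
        headless=not first_run,  # visible only while we still have to authenticate
    )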
First Run: Create the Session
In the project, the first launch happens with UI enabled so you can log in:
browser_context = playwright.chromium.launch_persistent_context(
    user_data_dir="user-data",
    headless=False
)
You open Gemini, authenticate manually, and close the browser.
From that moment on, user-data/ contains your session.
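A hedged sketch of what that first interactive run can look like end to end (the pause-on-Enter is my own convenience, not necessarily how first_start.bat does it):

from playwright.sync_api import sync_playwright

with sync_playwright() as playwright:
    browser_context = playwright.chromium.launch_persistent_context(
        user_data_dir="user-data",
        headless=False,
    )
    page = browser_context.new_page()
    page.goto("https://gemini.google.com")
    input("Log in to Gemini in the browser window, then press Enter here...")
    browser_context.close()  # flushes cookies, tokens and storage to user-data/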
All Subsequent Runs: Fully Headless
Subsequent executions simply reuse the same folder:
browser_context = playwright.chromium.launch_persistent_context(
    user_data_dir="user-data",
    headless=True
)
You are already logged in.
No forms.
No credentials.
No redirects.
What Happens When the Session Expires?
Eventually cookies die.
The project handles this operationally:
delete user-data/ (or rerun first_start.bat)
log in again
the profile is recreated
Simple, manual, and reliable.
You can extend this with (one possible detection sketch follows this list):
detecting a redirect to the login page
checking DOM markers
auto-relogin logic
alerting
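For the redirect check, something like this would work; the accounts.google.com condition is an assumption about how an expired Gemini session behaves, not something the project ships:

from playwright.sync_api import sync_playwright

with sync_playwright() as playwright:
    browser_context = playwright.chromium.launch_persistent_context(
        user_data_dir="user-data",
        headless=True,
    )
    page = browser_context.new_page()
    page.goto("https://gemini.google.com")
    page.wait_for_load_state("networkidle")

    # Bounced to the Google sign-in page? The saved session is dead.
    if "accounts.google.com" in page.url:
        raise RuntimeError(
            "Session expired: delete user-data/ or rerun first_start.bat, then log in again."
        )

    browser_context.close()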
Why Persistent Profiles Beat Cookie Dumps
Using a persistent profile directory is stronger than just exporting cookies:
keeps IndexedDB tokens
survives browser restarts
mimics a real user's Chrome profile
less suspicious than scripted logins
perfect for long-running monitors
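For comparison, the "cookie dump" route in Playwright is storage_state: you export cookies and local storage to a JSON file and feed it into a fresh context. The API is real, but it captures much less than a full profile, which is the gap the list above is about; the Gemini URL here is just an example.

from playwright.sync_api import sync_playwright

# Export: log in once in a regular context, then dump cookies + localStorage.
with sync_playwright() as p:
    browser = p.chromium.launch(headless=False)
    context = browser.new_context()
    page = context.new_page()
    page.goto("https://gemini.google.com")
    input("Log in, then press Enter...")
    context.storage_state(path="state.json")
    browser.close()

# Reuse: feed the JSON into a fresh, throwaway context on later runs.
with sync_playwright() as p:
    browser = p.chromium.launch(headless=True)
    context = browser.new_context(storage_state="state.json")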
When You Should Use This Pattern
This approach is ideal for:
price monitors
automation bots
AI web scrapers
background workers
periodic collectors
Anything that runs for weeks.
Real Repository
Full project:
https://github.com/AmaLS367/parts_info_collector
TL;DR
use launch_persistent_context
store the profile in user-data/
log in once manually
reuse it forever
refresh only when the session expires
Comments
Solid pattern, and the launch_persistent_context approach is definitely cleaner than dumping cookies manually.
One thing worth knowing for anyone running this at scale or on multiple machines: the persistent profile solves the "login again every run" problem, but a saved user-data dir still produces a browser that looks like a headless Chromium to fingerprinting systems. Canvas hash, WebGL renderer, screen resolution, those come from the machine running it, not from the profile.
So if you're running this on a cloud VM or several parallel instances, sites that use Akamai/Kasada/Cloudflare Bot Management will often still catch you even with a perfectly preserved session. The session keeps you authenticated, it doesn't make the browser look human.
Great writeup though - the "login once, reuse forever" flow is exactly the right mental model for long-running scrapers.
Thanks for sharing your thoughts! Much appreciated.
Cheers Ama, glad it was useful. It's one of those things that only bites you after you've already built the session management piece and think you're done!