Every automation engineer has hit this wall. Your headless browser can scrape 10,000 pages, but it can't solve a CAPTCHA.
You build the perfect scraper. It handles pagination, retries, rate limiting — everything. Then you hit a login page with a CAPTCHA, and your entire pipeline falls apart.
I got tired of this, so I built SessionKeeper.
## The Problem Nobody Talks About
Modern websites have layered defenses:
- CAPTCHAs that block automated logins
- Bot detection (Cloudflare, DataDome, PerimeterX) that fingerprints headless browsers
- Session expiry that forces re-authentication every few hours
- MFA flows that require human interaction
The usual workarounds all have drawbacks:
| Approach | Problem |
|---|---|
| CAPTCHA solving services | $2-3 per 1,000 solves, unreliable, ethically questionable |
| Stealing cookies from your real browser | Breaks when cookies expire, fragile |
| Keeping a browser open 24/7 | Resource hog, sessions still expire |
| Rotating proxies + new accounts | Expensive, against most ToS |
What if you could just log in once, by hand, and then automate everything until the session actually expires?
## Enter SessionKeeper
SessionKeeper is a Python tool that manages browser sessions for automation. The core idea is simple:
- Detect when a session is expired
- Open a visible browser so a human can log in (solve CAPTCHAs, do MFA, whatever)
- Save the authenticated session
- Return to headless automation using the saved session
- Only bother you again when the session actually expires
You solve the CAPTCHA once. SessionKeeper handles the rest.
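The loop above is small enough to sketch in plain Python. The helper names below are hypothetical stand-ins to show the control flow, not SessionKeeper's real internals:

```python
# Hypothetical sketch of the SessionKeeper decision loop.
# load_saved / session_valid / human_login / save are stand-in callables,
# not the library's actual API.
def get_session(site, load_saved, session_valid, human_login, save):
    """Return 'reused' when a saved session still works; otherwise have a
    human log in once (visible browser) and return 'refreshed'."""
    if load_saved(site) and session_valid(site):
        return "reused"  # headless automation continues, no human needed
    # Session missing or expired: a human completes login/CAPTCHA/MFA,
    # then the fresh state is saved for future headless runs.
    save(site, human_login(site))
    return "refreshed"
```

The point is that the expensive step (a human at a visible browser) only runs on the expired/missing branch; every other run stays fully headless.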
## Quick Start
```shell
pip install playwright && playwright install firefox
```
Use it in your automation:
```python
import asyncio

from sessionkeeper import SessionKeeper

async def main():
    async with SessionKeeper("reddit") as sk:
        page = await sk.get_authenticated_page("https://reddit.com")
        # You're logged in. Do your automation.
        await page.goto("https://reddit.com/r/blender/submit")

asyncio.run(main())
```
The first time you run this, a browser window pops up. You log into Reddit normally — solve the CAPTCHA, enter your credentials, do whatever the site asks. Once you're in, SessionKeeper saves the session and closes the visible browser.
Every subsequent run uses the saved session. No browser window. No CAPTCHA. Pure headless automation.
When the session eventually expires, SessionKeeper detects it and opens the browser again. One login, and you're good for another session cycle.
## CLI Usage
Pre-authenticate from the command line, then use sessions in your scripts:
```shell
# Authenticate with a site
python sessionkeeper.py auth reddit

# Check if a session is still valid
python sessionkeeper.py check reddit

# List all saved sessions
python sessionkeeper.py status

# Clear an expired session
python sessionkeeper.py clear reddit
```
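A CLI with this shape is straightforward to wire up with `argparse` subcommands. This is an illustrative sketch, not SessionKeeper's actual implementation:

```python
import argparse

def build_parser() -> argparse.ArgumentParser:
    """Build a parser matching the four commands shown above."""
    parser = argparse.ArgumentParser(prog="sessionkeeper.py")
    sub = parser.add_subparsers(dest="command", required=True)
    # auth/check/clear all operate on a single named site
    for cmd in ("auth", "check", "clear"):
        sub.add_parser(cmd).add_argument("site")
    sub.add_parser("status")  # lists all sessions; takes no site argument
    return parser

args = build_parser().parse_args(["check", "reddit"])
print(args.command, args.site)  # → check reddit
```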
## Built-in Site Configs
SessionKeeper ships with configurations for 5 sites out of the box:
- Reddit — detects login state via user menu elements
- Gumroad — handles reCAPTCHA on login
- DEV.to — dashboard detection
- Twitter/X — multi-step login flow
- note.com — Japanese blogging platform
Each config defines the login URL, a check URL to verify auth, and CSS selectors for success/failure states.
## Custom Site Configuration
Need to automate a site that isn't built in? Pass a config dict:
```python
config = {
    "login_url": "https://mysite.com/login",
    "check_url": "https://mysite.com/dashboard",
    "success_indicator": ".user-avatar, a[href*='settings']",
    "failure_indicator": "input[type='password']",
    "display_name": "My Site",
}

async with SessionKeeper("mysite", config=config) as sk:
    page = await sk.get_authenticated_page("https://mysite.com/dashboard")
```
The success_indicator and failure_indicator are CSS selectors that SessionKeeper checks after navigating to check_url. If the success selector matches, the session is valid. If the failure selector matches (or success doesn't), it's time to re-authenticate.
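Reduced to a pure decision rule, my reading of that logic looks like this (with the failure selector winning if both somehow match, since a visible password field means you aren't logged in):

```python
def session_state(success_matched: bool, failure_matched: bool) -> str:
    """Classify a session from the two indicator checks described above.
    A session is valid only if the success selector matched and the
    failure selector did not; anything else means re-authenticate."""
    if success_matched and not failure_matched:
        return "valid"
    return "reauthenticate"
```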
## How It Works Under the Hood
SessionKeeper is built on Playwright and uses its storage_state persistence:
1. Check for saved session file (~/.sessionkeeper/reddit_session.json)
2. If exists → load into headless browser → navigate to check_url → verify auth
3. If valid → return authenticated page (headless)
4. If expired/missing → launch VISIBLE browser → navigate to login_url
5. Wait for human to complete login + CAPTCHA
6. On success → save storage_state → close visible browser → return headless page
The saved state includes all cookies (including httpOnly ones) and localStorage; note that Playwright's storage_state does not capture sessionStorage. Because Playwright manages a real Firefox instance, sites see a normal browser.
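For reference, a storage_state file is plain JSON in the shape Playwright documents: a `cookies` array plus per-origin `localStorage` entries. The values below are made up, but the field names follow that documented format, and they show how a cheap expiry pre-check is possible without launching a browser at all:

```python
import json
import os
import tempfile
import time

# Made-up example data in Playwright's documented storage_state shape.
state = {
    "cookies": [{
        "name": "session_id",
        "value": "abc123",
        "domain": ".reddit.com",
        "path": "/",
        "expires": time.time() + 86400,  # a day from now
        "httpOnly": True,
        "secure": True,
        "sameSite": "Lax",
    }],
    "origins": [{
        "origin": "https://reddit.com",
        "localStorage": [{"name": "token", "value": "xyz"}],
    }],
}

path = os.path.join(tempfile.gettempdir(), "reddit_session.json")
with open(path, "w") as f:
    json.dump(state, f)

# Pre-check cookie expiry straight from the JSON file:
with open(path) as f:
    loaded = json.load(f)
expired = [c for c in loaded["cookies"] if c["expires"] < time.time()]
print(len(expired))  # → 0 (the cookie is still a day from expiring)
```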
## Why Not Just Use CAPTCHA Solving Services?
Cost adds up fast. At $2-3 per 1,000 solves, running daily automation across multiple sites costs $50-100/month. SessionKeeper costs you 30 seconds of manual login per session cycle (sessions typically last hours to days).
Reliability is inconsistent. CAPTCHA services have solve rates of 85-95%. SessionKeeper's solve rate is 100% because a human is doing it.
New CAPTCHA types break services. Every time Google updates reCAPTCHA or Cloudflare changes Turnstile, solving services lag behind. A human doesn't have this problem.
## Real-World Use Cases
- Social media automation — posting to Reddit, Twitter without re-authenticating every run
- E-commerce monitoring — price tracking on sites that require login
- Content management — automated publishing to platforms with CAPTCHA walls
- Internal tools — logging into dashboards for automated reporting
## Get Started
SessionKeeper is open source (MIT) and available now:
GitHub: github.com/vesper-astrena/sessionkeeper
Single Python file, zero dependencies beyond Playwright. Drop it into your project and never fight a CAPTCHA twice.
Star the repo if this solves a problem you've had. Issues and PRs are welcome.
What automation task has CAPTCHAs been blocking you from? Drop a comment — I'd love to hear your use case.