We got tired of broken chromedriver and built our own — with SOCKS5, multiprocessing, and captcha support

#programming #python #selenium #opensource

How it started

We needed web scraping. Simple enough, right? Just grab Selenium and go.

Except reality had other plans: Cloudflare blocked us within seconds, Chrome couldn't handle SOCKS5 proxies with authentication, and running multiple processes made them crash one another.

We found undetected-chromedriver — it bypassed bot detection, which was exactly what we needed. But the maintainer had abandoned it, and it was broken in all the places that mattered to us.

So we decided: let's fork it and fix it ourselves.

What we added

SOCKS5 with authentication

Chrome doesn't natively support SOCKS5 proxies that require login credentials. Our solution: spin up a local proxy server inside the library itself. Chrome thinks it's talking to localhost with no auth — our proxy quietly adds the credentials and forwards everything.

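import rtfox_browser as uc

# Upstream SOCKS5 proxy with credentials; the values below are placeholders
driver = uc.Chrome(proxy={
    "host": "1.2.3.4",
    "port": 1080,
    "user": "my_login",
    "pass": "my_password"
})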
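
For the curious, here is roughly what "adding the credentials" means on the wire. This is a minimal sketch of the messages a client sends to an upstream SOCKS5 server under RFC 1928 and RFC 1929, not rtfox-browser's actual code. The function names are made up for illustration, and the sketch assumes ASCII hostnames.

```python
import struct


def socks5_greeting() -> bytes:
    # VER=5, NMETHODS=1, METHOD=0x02 (username/password auth)
    return b"\x05\x01\x02"


def socks5_auth_request(user: str, password: str) -> bytes:
    # RFC 1929 sub-negotiation: VER=1, ULEN, UNAME, PLEN, PASSWD
    u = user.encode("utf-8")
    p = password.encode("utf-8")
    if not (1 <= len(u) <= 255 and 1 <= len(p) <= 255):
        raise ValueError("user and password must each be 1..255 bytes")
    return bytes([0x01, len(u)]) + u + bytes([len(p)]) + p


def socks5_connect_request(host: str, port: int) -> bytes:
    # VER=5, CMD=1 (CONNECT), RSV=0, ATYP=3 (domain name),
    # then a length-prefixed hostname and a big-endian port
    h = host.encode("ascii")  # assumption: plain ASCII hostname
    if not 1 <= len(h) <= 255:
        raise ValueError("hostname must be 1..255 bytes")
    return b"\x05\x01\x00\x03" + bytes([len(h)]) + h + struct.pack("!H", port)
```

A local forwarder would accept an unauthenticated connection from Chrome, send these three messages upstream (checking the server's reply after each), and then just copy bytes in both directions.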

Multiprocessing without headaches

Running the original library with multiple workers was like sharing one bathroom across an entire apartment building: everyone fighting for the same door. Each new process restarted chromedriver, which crashed every worker that was already running.

Our fix: one master (patched) copy of the driver exists, and each worker gets its own isolated copy. When the worker finishes, the copy is deleted. We also added inter-process locking — 100 workers won't trigger 100 driver downloads.

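from multiprocessing import Pool
import rtfox_browser as uc

def run_worker(worker_id):
    # worker_id identifies this worker to the library
    driver = uc.Chrome(worker_id=worker_id, proxy={...})
    try:
        driver.get("https://example.com")
    finally:
        # Always quit, even if the page load raises
        driver.quit()

# The guard is required on platforms that spawn workers (Windows, macOS)
if __name__ == "__main__":
    with Pool(4) as pool:
        pool.map(run_worker, ["w1", "w2", "w3", "w4"])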
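
To make the "one master, one copy per worker" idea concrete, here is a minimal sketch using only the standard library. It is an illustration under our own assumptions, not the code that ships in rtfox-browser: `file_lock`, `ensure_master`, and `worker_copy` are hypothetical names, and the lock is a simple create-exclusive lock file.

```python
import os
import shutil
import tempfile
import time
from contextlib import contextmanager
from pathlib import Path


@contextmanager
def file_lock(lock_path, poll=0.05, timeout=30.0):
    """Cross-process lock: only one process can create the lock file."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            fd = os.open(lock_path, os.O_CREAT | os.O_EXCL | os.O_WRONLY)
            break
        except FileExistsError:
            if time.monotonic() > deadline:
                raise TimeoutError(f"could not acquire {lock_path}")
            time.sleep(poll)
    try:
        yield
    finally:
        os.close(fd)
        os.unlink(lock_path)


def ensure_master(master: Path, prepare) -> Path:
    """Run prepare(master) once; later callers reuse the existing file."""
    with file_lock(str(master) + ".lock"):
        if not master.exists():
            prepare(master)  # e.g. download and patch the driver
    return master


@contextmanager
def worker_copy(master: Path, worker_id: str):
    """Give one worker a private copy of the master, deleted on exit."""
    workdir = Path(tempfile.mkdtemp(prefix=f"driver_{worker_id}_"))
    copy = workdir / master.name
    shutil.copy2(master, copy)
    try:
        yield copy
    finally:
        shutil.rmtree(workdir, ignore_errors=True)
```

One caveat with this simple approach: if a process dies while holding the lock, the lock file stays behind and other workers time out. A production version would need stale-lock handling or an OS-level lock.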

Captcha module — just drop a file in a folder

We wanted a system where adding new solvers required zero changes to the library itself. The result: write a class, drop the .py file into the solvers/ folder, and the method automatically appears in CaptchaService.

captcha = CaptchaService(api_key="YOUR_KEY", driver=driver)
print(captcha.available())  # ['ebay_hcaptcha', 'aws_image']
captcha.ebay_hcaptcha()
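
If you're wondering how "drop a file in a folder" can work at all, here is one way to do it in plain Python. Treat it as a sketch of the general plugin-discovery technique rather than the real CaptchaService: the class `SolverRegistry`, the `name` attribute, and the `solve()` method are assumptions made up for this example.

```python
import importlib.util
import inspect
from pathlib import Path


class SolverRegistry:
    """Loads every solver class found in the .py files of a folder."""

    def __init__(self, solvers_dir):
        self._solvers = {}
        for path in sorted(Path(solvers_dir).glob("*.py")):
            if path.name.startswith("_"):
                continue  # skip __init__.py and private helpers
            spec = importlib.util.spec_from_file_location(path.stem, path)
            module = importlib.util.module_from_spec(spec)
            spec.loader.exec_module(module)
            for _, cls in inspect.getmembers(module, inspect.isclass):
                # Keep only classes defined in this file that declare a name
                if cls.__module__ == module.__name__ and hasattr(cls, "name"):
                    self._solvers[cls.name] = cls

    def available(self):
        return sorted(self._solvers)

    def __getattr__(self, name):
        # Called only when normal attribute lookup fails,
        # so each solver name behaves like a method on the registry
        solvers = self.__dict__.get("_solvers", {})
        if name in solvers:
            return solvers[name]().solve
        raise AttributeError(name)
```

With something like this, a new file containing a class with `name = "my_solver"` and a `solve()` method becomes callable as `registry.my_solver()` without touching the registry code.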

Wrapping up

Three problems, three solutions. The library is called rtfox-browser and it's available on PyPI:

pip install rtfox-browser

If you're into scraping and have run into the same walls — give it a try. Feedback is very welcome!

DEV Community