DEV Community

agenthustler

ThorData Proxies Tutorial: Rotate Residential IPs for Web Scraping in 2026

Every web scraper hits the same wall: IP bans, CAPTCHAs, and geo-restricted content. You rotate user agents, add delays, retry failed requests — and still get blocked after a few hundred requests.

The fix? Residential proxies that route your traffic through real consumer IPs. In this tutorial, I'll show you how to use ThorData's residential proxy network to scrape reliably with Python — from basic requests to async rotation patterns.

Why Residential Proxies Matter for Scraping

Datacenter proxies are cheap but easy to detect. Websites fingerprint IP ranges owned by cloud providers (AWS, GCP, Hetzner) and block them aggressively.

Residential proxies use IPs assigned to real ISPs and households. To a target website, your request looks like a regular user browsing from their home. This matters for three scenarios:

  1. Anti-bot systems (Cloudflare, DataDome, PerimeterX) that block datacenter IPs on sight
  2. Geo-restricted content — you need an IP in Brazil to see Brazilian pricing, a UK IP for UK news archives
  3. Rate limits — rotating through thousands of IPs means no single IP gets flagged

Without residential proxies, you're fighting an arms race you'll lose. With them, your traffic blends in with ordinary home-user browsing.

ThorData: What You Get

ThorData runs a residential proxy pool of 60M+ IPs across 195+ countries. Here's what makes it practical for developers:

  • Pay-as-you-go pricing — no monthly commitment, no contracts. You pay per GB of traffic.
  • Sticky sessions — keep the same IP for up to 30 minutes when you need session persistence (login flows, paginated scraping).
  • City-level targeting — target specific countries, states, or cities via the proxy username string.
  • HTTP/HTTPS/SOCKS5 support — works with any HTTP client library.
  • Dashboard with real-time usage — track bandwidth, success rates, and costs.

For scraping projects, the combination of large IP pool + granular geo-targeting + no contracts is hard to beat.

Getting Started: Sign Up and Get Credentials

  1. Create an account at ThorData. You'll get a free trial with bandwidth to test.
  2. Once logged in, navigate to the Dashboard → Residential Proxies section.
  3. Copy your proxy credentials: you'll get a host, port, username, and password.

Your proxy endpoint will look like this:

```
Host: geo.thordata.net
Port: 9000
Username: your-username-res-any
Password: your-password
```

The -res-any suffix in the username tells ThorData to use any available residential IP. You can change any to a country code like us, gb, or br for geo-targeting.
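If you find yourself assembling these suffixes by hand, a small helper keeps them consistent. This is a sketch based on the suffix grammar shown in this post (`-res-<country>`, plus the optional `-city-` and `-session-` parts covered below); confirm the exact format against your dashboard:

```python
def build_proxy_user(base_user, country="any", city=None, session_id=None):
    """Assemble a ThorData-style proxy username from targeting options.

    The suffix grammar mirrors the examples in this post:
    <user>-res-<country>[-city-<city>][-session-<id>].
    Check your dashboard for the exact format your account uses.
    """
    parts = [base_user, "res", country]
    if city:
        parts += ["city", city]
    if session_id is not None:
        parts += ["session", str(session_id)]
    return "-".join(parts)

print(build_proxy_user("your-username"))
# your-username-res-any
print(build_proxy_user("your-username", country="br", city="saopaulo"))
# your-username-res-br-city-saopaulo
```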

Basic Python Tutorial: Requests with ThorData Proxy

Let's start with the simplest approach — using the requests library with ThorData as your proxy:

```python
import requests

PROXY_HOST = "geo.thordata.net"
PROXY_PORT = 9000
PROXY_USER = "your-username-res-any"
PROXY_PASS = "your-password"

proxy_url = f"http://{PROXY_USER}:{PROXY_PASS}@{PROXY_HOST}:{PROXY_PORT}"

proxies = {
    "http": proxy_url,
    "https": proxy_url,
}

response = requests.get(
    "https://httpbin.org/ip",
    proxies=proxies,
    timeout=30,
)

print(response.json())
# {"origin": "186.215.xx.xx"}  — a residential IP, not your server's IP
```

Each request gets a random IP from the pool. No configuration needed — ThorData handles rotation automatically.
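The same credentials also work over SOCKS5 if your tooling prefers it. A minimal sketch — I'm assuming here that the SOCKS5 endpoint uses the same host and port as the HTTP gateway, which you should verify in your dashboard, since SOCKS ports sometimes differ:

```python
# SOCKS5 via requests needs the PySocks extra:
#   pip install "requests[socks]"
# Assumption: same host/port as the HTTP endpoint — verify in your dashboard.
socks_url = "socks5://your-username-res-any:your-password@geo.thordata.net:9000"

proxies = {
    "http": socks_url,
    "https": socks_url,
}

# requests.get("https://httpbin.org/ip", proxies=proxies, timeout=30)
```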

Geo-Targeting

Target a specific country by changing the username suffix:

```python
# US IPs only
PROXY_USER = "your-username-res-us"

# UK IPs only
PROXY_USER = "your-username-res-gb"

# Brazil, São Paulo specifically
PROXY_USER = "your-username-res-br-city-saopaulo"
```

Sticky Sessions

When you need the same IP across multiple requests (e.g., maintaining a logged-in session), add a session ID:

```python
import random

session_id = random.randint(10000, 99999)
PROXY_USER = f"your-username-res-us-session-{session_id}"

# All requests with this proxy_user will use the same IP
# for up to 30 minutes
```
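Putting the pieces together, a `requests.Session` pinned to one sticky IP might look like this — a sketch assuming the `-session-<id>` username format above:

```python
import random

import requests


def make_sticky_session(base_user, password, country="us",
                        host="geo.thordata.net", port=9000):
    """Return a requests.Session pinned to one residential exit IP
    via a random session id (sticky for up to 30 minutes)."""
    session_id = random.randint(10000, 99999)
    user = f"{base_user}-res-{country}-session-{session_id}"
    proxy_url = f"http://{user}:{password}@{host}:{port}"
    s = requests.Session()
    s.proxies = {"http": proxy_url, "https": proxy_url}
    return s


# Every request on this session reuses the same exit IP:
# s = make_sticky_session("your-username", "your-password")
# s.post("https://example.com/login", data=creds)
# s.get("https://example.com/account")
```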

Error Handling

Production scrapers need to handle proxy errors gracefully:

```python
import requests
from requests.exceptions import ProxyError, Timeout

def fetch_with_proxy(url, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = requests.get(url, proxies=proxies, timeout=30)

            if response.status_code == 407:
                raise Exception("Proxy auth failed — check credentials")

            if response.status_code == 403:
                print(f"Blocked on attempt {attempt + 1}, rotating IP...")
                continue  # next attempt gets a new IP automatically

            response.raise_for_status()
            return response

        except (ProxyError, Timeout) as e:
            print(f"Proxy error on attempt {attempt + 1}: {e}")
            if attempt == max_retries - 1:
                raise

    return None

# Usage
result = fetch_with_proxy("https://example.com/data")
if result:
    print(result.text[:200])
```

Key points:

  • 407 means your proxy credentials are wrong. Don't retry — fix the username/password.
  • 403 often means the target site blocked that IP. A retry gets a fresh IP automatically.
  • Timeouts happen more often with residential proxies than with datacenter ones. Set a reasonable timeout (30s) and retry.
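One refinement to the retry loop above: retrying immediately after a timeout can hammer a struggling route. Exponential backoff with full jitter spreads retries out; `backoff_delay` here is a hypothetical helper of my own, not part of any library:

```python
import random


def backoff_delay(attempt, base=1.0, cap=30.0):
    """Exponential backoff with full jitter: a random delay drawn from
    [0, min(cap, base * 2**attempt)]. attempt is 0-indexed."""
    return random.uniform(0, min(cap, base * 2 ** attempt))


# In the retry loop, before the next attempt:
#     time.sleep(backoff_delay(attempt))
```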

Advanced: Async Scraping with aiohttp

For high-throughput scraping, use aiohttp to make concurrent requests through ThorData:

```python
import asyncio
import aiohttp

PROXY_URL = "http://user-res-any:pass@geo.thordata.net:9000"

async def fetch(session, url):
    for attempt in range(3):
        try:
            async with session.get(
                url, proxy=PROXY_URL, timeout=aiohttp.ClientTimeout(total=30)
            ) as response:
                if response.status == 403:
                    print(f"403 on {url}, retrying...")
                    await asyncio.sleep(1)
                    continue
                return await response.text()
        except (aiohttp.ClientProxyConnectionError, asyncio.TimeoutError) as e:
            print(f"Error on {url}: {e}")
            if attempt == 2:
                return None
    return None

async def scrape_batch(urls, concurrency=10):
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded_fetch(session, url):
        async with semaphore:
            return await fetch(session, url)

    async with aiohttp.ClientSession() as session:
        tasks = [bounded_fetch(session, url) for url in urls]
        return await asyncio.gather(*tasks)

# Usage
urls = [f"https://httpbin.org/ip?n={i}" for i in range(50)]
results = asyncio.run(scrape_batch(urls, concurrency=10))

successful = [r for r in results if r is not None]
print(f"Fetched {len(successful)}/{len(urls)} URLs successfully")
```

This pattern gives you:

  • 10 concurrent requests through different residential IPs
  • Automatic retries on 403s and timeouts
  • Backpressure via semaphore so you don't overwhelm the proxy or target

For larger jobs, bump concurrency to 20-50. ThorData handles the IP rotation — you just need to manage your own request rate to stay polite to target servers.
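IP rotation doesn't exempt you from politeness: fifty concurrent requests through fifty IPs can still overload a small site. A sketch of a per-host rate limiter you could await before each fetch — the class and its integration point are my own, not part of aiohttp:

```python
import asyncio
import time
from urllib.parse import urlsplit


class HostRateLimiter:
    """Enforce a minimum interval between requests to the same host,
    independent of the proxy-side IP rotation."""

    def __init__(self, min_interval=1.0):
        self.min_interval = min_interval
        self._locks = {}  # host -> asyncio.Lock
        self._last = {}   # host -> monotonic time of the last request

    async def wait(self, url):
        host = urlsplit(url).netloc
        lock = self._locks.setdefault(host, asyncio.Lock())
        async with lock:
            now = time.monotonic()
            delay = self._last.get(host, 0.0) + self.min_interval - now
            if delay > 0:
                await asyncio.sleep(delay)
            self._last[host] = time.monotonic()


# Inside bounded_fetch, before calling fetch:
#     await limiter.wait(url)
```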

Cost Comparison: ThorData vs Competitors

Here's how ThorData stacks up against the two biggest residential proxy providers as of early 2026:

| Feature | ThorData | Bright Data | Oxylabs |
| --- | --- | --- | --- |
| Pool size | 60M+ IPs | 72M+ IPs | 100M+ IPs |
| Entry price | ~$2/GB (pay-as-you-go) | ~$8/GB (min $500/mo) | ~$8/GB (min $300/mo) |
| Contract required | No | Yes (monthly) | Yes (monthly) |
| Geo-targeting | Country, city | Country, city, ASN | Country, city |
| Sticky sessions | Up to 30 min | Up to 10 min | Up to 30 min |
| SOCKS5 support | Yes | Yes | Yes |
| Free trial | Yes | Yes (limited) | Yes (limited) |

The big difference is the entry price and commitment. Bright Data and Oxylabs are enterprise-focused — their per-GB rates are competitive at scale, but you're locked into $300-500/month minimums.

ThorData lets you start at a few dollars and scale up. For indie developers, side projects, and early-stage startups, that pay-as-you-go model is significantly more practical. You're not burning $500/month while you figure out if your scraping project even works.
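A quick back-of-envelope using the ~$2/GB figure from the table above (illustrative only — check current pricing and measure your actual page sizes):

```python
def scrape_cost_usd(pages, avg_page_kb=200, price_per_gb=2.0):
    """Estimate bandwidth cost: pages * average page size, billed per GB.
    All figures are illustrative assumptions, not quoted prices."""
    gb = pages * avg_page_kb / (1024 * 1024)
    return gb * price_per_gb


# 100k pages at ~200 KB each ≈ 19 GB ≈ $38 on pay-as-you-go
print(f"${scrape_cost_usd(100_000):.2f}")
# $38.15
```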

Conclusion

Residential proxies are the difference between a scraper that works on your laptop and one that works in production. ThorData gives you the IP diversity and geo-targeting you need without enterprise pricing or contracts.

To recap what we covered:

  1. Basic proxy setup with Python requests — one line to add proxy support
  2. Geo-targeting and sticky sessions via the username string
  3. Error handling for 407, 403, and timeout scenarios
  4. Async scraping with aiohttp for high-throughput jobs
  5. Cost comparison showing ThorData's advantage for pay-as-you-go usage

Ready to try it? Sign up for ThorData and grab your proxy credentials. The free trial gives you enough bandwidth to test everything in this tutorial. Start with the basic requests example, verify your IP is rotating, then scale up to async when you need throughput.

Happy scraping.
