Every web scraper hits the same wall: IP bans, CAPTCHAs, and geo-restricted content. You rotate user agents, add delays, retry failed requests — and still get blocked after a few hundred requests.
The fix? Residential proxies that route your traffic through real consumer IPs. In this tutorial, I'll show you how to use ThorData's residential proxy network to scrape reliably with Python — from basic requests to async rotation patterns.
## Why Residential Proxies Matter for Scraping
Datacenter proxies are cheap but easy to detect. Websites fingerprint IP ranges owned by cloud providers (AWS, GCP, Hetzner) and block them aggressively.
Residential proxies use IPs assigned to real ISPs and households. To a target website, your request looks like a regular user browsing from their home. This matters for three scenarios:
- Anti-bot systems (Cloudflare, DataDome, PerimeterX) that block datacenter IPs on sight
- Geo-restricted content — you need an IP in Brazil to see Brazilian pricing, a UK IP for UK news archives
- Rate limits — rotating through thousands of IPs means no single IP gets flagged
Without residential proxies, you're fighting an arms race you'll lose. With them, your requests blend in with ordinary home traffic.
## ThorData: What You Get
ThorData runs a residential proxy pool of 60M+ IPs across 195+ countries. Here's what makes it practical for developers:
- Pay-as-you-go pricing — no monthly commitment, no contracts. You pay per GB of traffic.
- Sticky sessions — keep the same IP for up to 30 minutes when you need session persistence (login flows, paginated scraping).
- City-level targeting — target specific countries, states, or cities via the proxy username string.
- HTTP/HTTPS/SOCKS5 support — works with any HTTP client library.
- Dashboard with real-time usage — track bandwidth, success rates, and costs.
For scraping projects, the combination of large IP pool + granular geo-targeting + no contracts is hard to beat.
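Since SOCKS5 is supported, you can also point `requests` at the endpoint over SOCKS instead of HTTP. A minimal sketch, assuming the SOCKS5 service lives on the same host and port as the HTTP examples later in this post (check your dashboard first — providers often expose SOCKS on a separate port):

```python
def socks5_proxies(user: str, password: str,
                   host: str = "geo.thordata.net", port: int = 9000) -> dict:
    """Build a requests-style proxy mapping using SOCKS5.

    The socks5h scheme resolves DNS on the proxy side, so hostname
    lookups don't leak from your own machine.
    """
    url = f"socks5h://{user}:{password}@{host}:{port}"
    return {"http": url, "https": url}

# Needs the SOCKS extra: pip install requests[socks]
proxies = socks5_proxies("your-username-res-any", "your-password")
# requests.get("https://httpbin.org/ip", proxies=proxies, timeout=30)
```

The `socks5h://` scheme (rather than `socks5://`) is a deliberate choice: it pushes DNS resolution to the proxy, which matters when the target resolves differently by region.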
## Getting Started: Sign Up and Get Credentials
1. Create an account at ThorData. You'll get a free trial with bandwidth to test.
2. Once logged in, navigate to the Dashboard → Residential Proxies section.
3. Copy your proxy credentials: you'll get a host, port, username, and password.
Your proxy endpoint will look like this:

```
Host:     geo.thordata.net
Port:     9000
Username: your-username-res-any
Password: your-password
```

The `-res-any` suffix in the username tells ThorData to use any available residential IP. You can change `any` to a country code like `us`, `gb`, or `br` for geo-targeting.
## Basic Python Tutorial: Requests with ThorData Proxy
Let's start with the simplest approach — using the `requests` library with ThorData as your proxy:
```python
import requests

PROXY_HOST = "geo.thordata.net"
PROXY_PORT = 9000
PROXY_USER = "your-username-res-any"
PROXY_PASS = "your-password"

proxy_url = f"http://{PROXY_USER}:{PROXY_PASS}@{PROXY_HOST}:{PROXY_PORT}"

proxies = {
    "http": proxy_url,
    "https": proxy_url,
}

response = requests.get(
    "https://httpbin.org/ip",
    proxies=proxies,
    timeout=30,
)
print(response.json())
# {"origin": "186.215.xx.xx"} — a residential IP, not your server's IP
```
Each request gets a random IP from the pool. No configuration needed — ThorData handles rotation automatically.
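A quick sanity check is to hit an IP echo endpoint several times through the same `proxies` mapping and count distinct exit IPs. A sketch — the network call needs your real credentials, so it's left commented:

```python
def sample_exit_ips(proxies: dict, n: int = 5) -> list:
    """Hit httpbin's IP echo n times through the proxy and collect exit IPs."""
    import requests  # local import: the counting helper below has no network deps
    ips = []
    for _ in range(n):
        r = requests.get("https://httpbin.org/ip", proxies=proxies, timeout=30)
        ips.append(r.json()["origin"])
    return ips

def count_unique(ips: list) -> int:
    """How many distinct exit IPs the sample saw."""
    return len(set(ips))

# With rotation working, expect most of the sampled IPs to differ:
# ips = sample_exit_ips(proxies)
# print(ips, count_unique(ips))
```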
### Geo-Targeting
Target a specific country by changing the username suffix:
```python
# US IPs only
PROXY_USER = "your-username-res-us"

# UK IPs only
PROXY_USER = "your-username-res-gb"

# Brazil, São Paulo specifically
PROXY_USER = "your-username-res-br-city-saopaulo"
```
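Rather than hand-editing strings, you can wrap the suffix pattern in a small helper. The `-res-<country>[-city-<city>]` format is taken from the examples above; the helper itself is an illustrative convenience, not part of any ThorData SDK:

```python
def thordata_user(base: str, country: str = "any", city: str = None) -> str:
    """Build a ThorData proxy username with optional geo-targeting.

    Format assumed from the examples above:
      base-res-<country>               country-level targeting
      base-res-<country>-city-<city>   city-level targeting
    """
    user = f"{base}-res-{country}"
    if city:
        user += f"-city-{city}"
    return user

print(thordata_user("your-username"))                    # your-username-res-any
print(thordata_user("your-username", "gb"))              # your-username-res-gb
print(thordata_user("your-username", "br", "saopaulo"))  # your-username-res-br-city-saopaulo
```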
### Sticky Sessions
When you need the same IP across multiple requests (e.g., maintaining a logged-in session), add a session ID:
```python
import random

session_id = random.randint(10000, 99999)
PROXY_USER = f"your-username-res-us-session-{session_id}"

# All requests with this proxy user will use the same IP
# for up to 30 minutes
```
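The same idea as a reusable helper — the `-session-<id>` format follows the example above (treat it as an assumption and confirm against ThorData's docs). Generating a fresh session ID gets you a fresh IP; reusing one keeps the IP pinned:

```python
import random

def sticky_user(base: str, country: str = "us", session_id: int = None) -> str:
    """Username pinned to one residential IP via a session ID.

    Format assumed from the example above: base-res-<country>-session-<id>.
    """
    if session_id is None:
        session_id = random.randint(10000, 99999)
    return f"{base}-res-{country}-session-{session_id}"

# Reuse the same username for every request in a login flow:
user = sticky_user("your-username", "us", 12345)
print(user)  # your-username-res-us-session-12345
```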
### Error Handling
Production scrapers need to handle proxy errors gracefully:
```python
import requests
from requests.exceptions import ProxyError, Timeout

def fetch_with_proxy(url, max_retries=3):
    for attempt in range(max_retries):
        try:
            response = requests.get(url, proxies=proxies, timeout=30)
            if response.status_code == 407:
                raise Exception("Proxy auth failed — check credentials")
            if response.status_code == 403:
                print(f"Blocked on attempt {attempt + 1}, rotating IP...")
                continue  # next attempt gets a new IP automatically
            response.raise_for_status()
            return response
        except (ProxyError, Timeout) as e:
            print(f"Proxy error on attempt {attempt + 1}: {e}")
            if attempt == max_retries - 1:
                raise
    return None

# Usage
result = fetch_with_proxy("https://example.com/data")
if result:
    print(result.text[:200])
Key points:
- 407 means your proxy credentials are wrong. Don't retry — fix the username/password.
- 403 often means the target site blocked that IP. A retry gets a fresh IP automatically.
- Timeouts happen with residential proxies more than datacenter ones. Set a reasonable timeout (30s) and retry.
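One refinement worth layering onto the retry loop: exponential backoff with jitter between attempts, so repeated 403s or timeouts don't hammer the target at a fixed cadence. This is a common pattern, not anything ThorData-specific — a sketch:

```python
import time
import random

def backoff_delay(attempt: int, base: float = 1.0, cap: float = 30.0) -> float:
    """Exponential backoff with jitter: ~1s, ~2s, ~4s... capped at `cap` seconds."""
    delay = min(cap, base * (2 ** attempt))
    return delay * random.uniform(0.5, 1.0)  # jitter spreads retries apart

def fetch_with_backoff(url, proxies, max_retries=3):
    import requests  # local import: backoff_delay itself has no dependencies
    for attempt in range(max_retries):
        try:
            r = requests.get(url, proxies=proxies, timeout=30)
            if r.status_code == 403:
                time.sleep(backoff_delay(attempt))
                continue  # fresh IP on the next attempt
            r.raise_for_status()
            return r
        except requests.RequestException:
            if attempt == max_retries - 1:
                raise
            time.sleep(backoff_delay(attempt))
    return None
```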
## Advanced: Async Scraping with aiohttp
For high-throughput scraping, use `aiohttp` to make concurrent requests through ThorData:
```python
import asyncio
import aiohttp

PROXY_URL = "http://user-res-any:pass@geo.thordata.net:9000"

async def fetch(session, url):
    for attempt in range(3):
        try:
            async with session.get(
                url, proxy=PROXY_URL, timeout=aiohttp.ClientTimeout(total=30)
            ) as response:
                if response.status == 403:
                    print(f"403 on {url}, retrying...")
                    await asyncio.sleep(1)
                    continue
                return await response.text()
        except (aiohttp.ClientProxyConnectionError, asyncio.TimeoutError) as e:
            print(f"Error on {url}: {e}")
            if attempt == 2:
                return None
    return None

async def scrape_batch(urls, concurrency=10):
    semaphore = asyncio.Semaphore(concurrency)

    async def bounded_fetch(session, url):
        async with semaphore:
            return await fetch(session, url)

    async with aiohttp.ClientSession() as session:
        tasks = [bounded_fetch(session, url) for url in urls]
        return await asyncio.gather(*tasks)

# Usage
urls = [f"https://httpbin.org/ip?n={i}" for i in range(50)]
results = asyncio.run(scrape_batch(urls, concurrency=10))

successful = [r for r in results if r is not None]
print(f"Fetched {len(successful)}/{len(urls)} URLs successfully")
```
This pattern gives you:
- 10 concurrent requests through different residential IPs
- Automatic retries on 403s and timeouts
- Backpressure via semaphore so you don't overwhelm the proxy or target
For larger jobs, bump concurrency to 20-50. ThorData handles the IP rotation — you just need to manage your own request rate to stay polite to target servers.
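"Staying polite" can be made concrete with a tiny async rate limiter used alongside the semaphore: the semaphore caps how many requests are in flight, while the limiter caps requests per second. A minimal sketch using only the standard library (`RateLimiter` is illustrative, not a library class):

```python
import asyncio
import time

class RateLimiter:
    """Allow at most `rate_per_sec` acquisitions per second, spaced evenly."""

    def __init__(self, rate_per_sec: float):
        self.interval = 1.0 / rate_per_sec
        self._lock = asyncio.Lock()
        self._next_time = 0.0

    async def acquire(self):
        async with self._lock:
            now = time.monotonic()
            wait = self._next_time - now
            if wait > 0:
                await asyncio.sleep(wait)
            # schedule the next allowed slot one interval later
            self._next_time = max(now, self._next_time) + self.interval
```

Inside `bounded_fetch`, you'd call `await limiter.acquire()` before each request; with `RateLimiter(5)`, the batch never exceeds ~5 requests per second regardless of how high you push concurrency.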
## Cost Comparison: ThorData vs Competitors
Here's how ThorData stacks up against the two biggest residential proxy providers as of early 2026:
| Feature | ThorData | Bright Data | Oxylabs |
|---|---|---|---|
| Pool size | 60M+ IPs | 72M+ IPs | 100M+ IPs |
| Entry price | ~$2/GB (pay-as-you-go) | ~$8/GB (min $500/mo) | ~$8/GB (min $300/mo) |
| Contract required | No | Yes (monthly) | Yes (monthly) |
| Geo-targeting | Country, city | Country, city, ASN | Country, city |
| Sticky sessions | Up to 30 min | Up to 10 min | Up to 30 min |
| SOCKS5 support | Yes | Yes | Yes |
| Free trial | Yes | Yes (limited) | Yes (limited) |
The big difference is the entry price and commitment. Bright Data and Oxylabs are enterprise-focused — their per-GB rates are competitive at scale, but you're locked into $300-500/month minimums.
ThorData lets you start at a few dollars and scale up. For indie developers, side projects, and early-stage startups, that pay-as-you-go model is significantly more practical. You're not burning $500/month while you figure out if your scraping project even works.
## Conclusion
Residential proxies are the difference between a scraper that works on your laptop and one that works in production. ThorData gives you the IP diversity and geo-targeting you need without enterprise pricing or contracts.
To recap what we covered:
- Basic proxy setup with Python `requests` — one line to add proxy support
- Geo-targeting and sticky sessions via the username string
- Error handling for 407, 403, and timeout scenarios
- Async scraping with `aiohttp` for high-throughput jobs
- Cost comparison showing ThorData's advantage for pay-as-you-go usage
Ready to try it? Sign up for ThorData and grab your proxy credentials. The free trial gives you enough bandwidth to test everything in this tutorial. Start with the basic requests example, verify your IP is rotating, then scale up to async when you need throughput.
Happy scraping.