Anti-bot detection has gotten dramatically more sophisticated since 2022. TLS fingerprinting, behavioural analysis, and ML-based anomaly detection have made naive scrapers useless.
Here is what detection systems actually look at in 2026, and which bypasses are still effective.
Layer 1: Network-Level Detection
TLS/JA3 Fingerprinting
Every HTTPS client has a unique TLS handshake signature based on cipher suites, extensions, and ordering. Python's requests library has a distinctive JA3 fingerprint that is instantly identifiable.
Bypass: Use a library that randomises or mimics browser TLS signatures. curl_cffi (Python) mimics Chrome's TLS stack exactly. This alone bypasses a significant percentage of detection systems.
```python
import curl_cffi.requests as requests

# Mimics Chrome 120's TLS fingerprint (and HTTP/2 settings)
response = requests.get(url, impersonate="chrome120")
```
IP Reputation
Datacenter IPs (AWS, GCP, Hetzner) are immediately flagged. Most anti-bot systems maintain databases of datacenter CIDR ranges.
Bypass: Residential proxies. The ASN of a residential ISP passes this check. Mobile proxies (4G/5G exit nodes) are even cleaner.
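A minimal sketch of wiring a residential proxy into the curl_cffi requests shown above. The gateway URL and the `session-<id>` username convention are hypothetical placeholders — most providers encode sticky sessions in the proxy username like this, but check your provider's documentation:

```python
def proxy_config(session_id: str,
                 gateway: str = "http://USER-session-{sid}:PASS@gw.example.com:8000") -> dict:
    """Build a proxies dict with a sticky session id encoded in the username,
    so all requests in one logical session exit from the same residential IP."""
    url = gateway.format(sid=session_id)
    return {"http": url, "https": url}
```

Usage would then look like `requests.get(url, impersonate="chrome120", proxies=proxy_config("a1b2"))`.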
HTTP/2 Fingerprinting
HTTP/2 has settings frames that browsers configure differently from Python HTTP clients. The SETTINGS frame values, HEADERS frame ordering, and stream priorities all create a fingerprint.
Bypass: Same solution as TLS — curl_cffi with browser impersonation handles HTTP/2 correctly.
Layer 2: Browser Fingerprinting
For JavaScript-rendered sites (Cloudflare, Akamai), the detection runs in the browser:
Canvas Fingerprint
Browsers render a hidden canvas element slightly differently based on GPU, OS, and font rendering. Headless Chrome has a distinctive canvas fingerprint.
Bypass: Use puppeteer-extra-plugin-stealth or Playwright with stealth patches. These modify canvas rendering to produce a realistic output.
WebGL Renderer
The WebGL renderer string (the UNMASKED_RENDERER_WEBGL parameter, exposed via the WEBGL_debug_renderer_info extension) reveals the GPU. Headless browsers often report SwiftShader, Google's software renderer — an immediate flag.
Bypass: Inject a realistic GPU string:
```javascript
// Capture the original before overriding, so non-spoofed parameters pass through
const originalGetParameter = WebGLRenderingContext.prototype.getParameter;
Object.defineProperty(WebGLRenderingContext.prototype, 'getParameter', {
  value: function (parameter) {
    // 37446 === UNMASKED_RENDERER_WEBGL
    if (parameter === 37446) return 'ANGLE (NVIDIA GeForce RTX 3060)';
    return originalGetParameter.call(this, parameter);
  }
});
```
Navigator Properties
Headless Chrome leaks via:
- `navigator.webdriver === true`
- `navigator.plugins.length === 0`
- `window.chrome` being undefined
- Missing browser-specific APIs
Bypass: Stealth patches override the `navigator.webdriver` getter to return false, inject a realistic plugins array, and define `window.chrome`.
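A minimal hand-rolled version of what stealth plugins do for the three leaks above — real plugins patch dozens more properties, so treat this as a sketch. With Playwright, you would inject it using `page.add_init_script(STEALTH_PATCH)` before navigation, so it runs ahead of the site's detection scripts:

```python
# JS init script covering only the three navigator leaks listed above.
STEALTH_PATCH = """
Object.defineProperty(navigator, 'webdriver', { get: () => false });
Object.defineProperty(navigator, 'plugins', {
  get: () => [{ name: 'Chrome PDF Viewer' }, { name: 'Native Client' }],
});
window.chrome = window.chrome || { runtime: {} };
"""
```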
Layer 3: Behavioural Analysis
This is where modern detection is hardest to beat:
Request Timing
Humans have variable timing. Scrapers hit pages at machine-perfect intervals.
Bypass: Add Gaussian noise to delays:
```python
import random, time

def human_delay(min_sec=1.5, max_sec=4.0):
    base = random.uniform(min_sec, max_sec)
    noise = random.gauss(0, 0.3)
    time.sleep(max(0.5, base + noise))
```
Mouse Movement (for interactive sessions)
Bots move in straight lines or do not move at all. Real users have curved, slightly jittery paths.
Bypass: Pre-record real mouse movement traces and replay them with slight randomisation.
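A sketch of the replay-with-randomisation step, assuming traces are recorded as `(x, y, t)` tuples (pixel coordinates plus a timestamp). The jitter bounds are illustrative; the point is that no two replays should be pixel-identical, since exact repetition is itself a flag:

```python
import random

def jitter_trace(trace, pos_jitter=2.0, time_jitter=0.02):
    """Replay a recorded mouse trace of (x, y, t) points with slight
    randomisation in both position and timing."""
    jittered = []
    for x, y, t in trace:
        jittered.append((
            x + random.uniform(-pos_jitter, pos_jitter),
            y + random.uniform(-pos_jitter, pos_jitter),
            max(0.0, t + random.uniform(-time_jitter, time_jitter)),
        ))
    return jittered
```

The jittered points can then be fed to your browser driver's mouse API (e.g. Playwright's `page.mouse.move(x, y)`) at the recorded timestamps.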
Session Depth
Bots often hit one page and leave. Real users navigate, go back, follow links.
Bypass: Simulate a realistic session — visit homepage, navigate to category, browse a few items, then hit the target page.
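The warm-up flow can be sketched generically — `fetch` is any callable that performs a GET (for instance a curl_cffi session's `.get`), and the warm-up URLs are whatever realistic path exists on the target site:

```python
import random, time

def browse_session(fetch, target_url, warmup_urls, min_delay=1.5, max_delay=4.0):
    """Visit a few warm-up pages with human-like pauses before the target,
    so the session has realistic depth instead of a single-page hit."""
    for url in warmup_urls:
        fetch(url)
        time.sleep(random.uniform(min_delay, max_delay))
    return fetch(target_url)
```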
Layer 4: ML-Based Anomaly Detection
Cloudflare Bot Management and Akamai Bot Manager use ML models trained on billions of requests. They detect patterns that are not obvious rules:
- Unusual Accept-Language headers for the claimed geography
- User-Agent claiming Windows but HTTP/2 settings from Linux
- Cookie handling inconsistencies
- Too-perfect mouse movements (overcorrecting for jitter)
Bypass: No single patch works. The key is holistic consistency — every signal should tell the same story.
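One concrete piece of that consistency is deriving headers from the proxy exit's geography instead of hardcoding them. The mapping below is illustrative, not canonical — build yours from the countries your proxies actually exit from:

```python
# Illustrative geo -> header profiles; extend per proxy exit country.
GEO_HEADERS = {
    "US": {"Accept-Language": "en-US,en;q=0.9"},
    "DE": {"Accept-Language": "de-DE,de;q=0.9,en;q=0.8"},
    "FR": {"Accept-Language": "fr-FR,fr;q=0.9,en;q=0.8"},
}

def headers_for_exit(country_code: str) -> dict:
    """Pick headers that match the proxy exit's geography, so the
    Accept-Language story agrees with the IP's story."""
    try:
        return dict(GEO_HEADERS[country_code])
    except KeyError:
        raise ValueError(f"no header profile for exit country {country_code!r}")
```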
The 9 Bypasses That Still Work in 2026
| Bypass | Blocks What | Difficulty |
|---|---|---|
| curl_cffi with browser impersonation | TLS/JA3, HTTP/2 fingerprint | Easy |
| Residential proxies | IP reputation | Easy (costs $) |
| Playwright + stealth plugin | Navigator leaks, canvas, WebGL | Medium |
| Gaussian timing noise | Request interval detection | Easy |
| Session depth simulation | Single-page-hit patterns | Medium |
| Real browser rendering (Playwright) | JavaScript challenges | Medium |
| Cookie jar persistence | Session tracking | Easy |
| Consistent Accept-Language + Geo match | Header inconsistency | Easy |
| Human mouse traces | Interactive behaviour analysis | Hard |
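Cookie jar persistence from the table is straightforward with the standard library. The file path is a placeholder, and the assumption that your HTTP client can be pointed at (or synced with) an `http.cookiejar`-compatible jar should be checked against your client's API:

```python
from http.cookiejar import MozillaCookieJar

COOKIE_FILE = "session_cookies.txt"  # placeholder path

def load_jar(path=COOKIE_FILE):
    """Load a persisted cookie jar, or start fresh on the first run."""
    jar = MozillaCookieJar(path)
    try:
        jar.load(ignore_discard=True, ignore_expires=True)
    except FileNotFoundError:
        pass
    return jar

def save_jar(jar):
    """Persist cookies so the next run continues the same logical session."""
    jar.save(ignore_discard=True, ignore_expires=True)
```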
What Does Not Work Anymore
- Rotating User-Agent strings (JA3 fingerprint is not in UA)
- Simple datacenter proxies for Cloudflare-protected sites
- `headless=True` without stealth patches (detectable in under 1 second)
- Fixed `time.sleep(2)` between requests (machine-perfect timing is flagged)
Pre-Built Anti-Bot Stack
Buying a pre-built scraping toolkit with anti-bot handling already configured is faster than building the stealth stack yourself:
These typically include scrapers pre-configured with curl_cffi browser impersonation, Playwright stealth patches, residential proxy rotation, human timing simulation, and session management.
Which anti-bot system is giving you the most trouble right now? Drop it in comments and I will give a specific bypass recommendation.