# How to Bypass Akamai Bot Detection in 2026: curl-cffi + Residential Proxies
If you've hit an Akamai-protected site and watched your scraper go from 200 OK to a wall of CAPTCHAs and 403s in under 30 seconds, you already know: Akamai is not Cloudflare.
Cloudflare checks your TLS handshake and browser cookies. Akamai runs sensor.js, a 50KB+ JavaScript fingerprinting engine that inspects your browser's GPU rendering, audio context, WebRTC stack, and hundreds of passive signals to assign your session a bot score almost as soon as the first page loads.
Standard tools fail hard:
- Selenium with a vanilla Chrome profile: ~80% detection rate against Akamai in recent tests
- Python requests with a User-Agent header: ~100% detection within the first 5 requests
- Playwright default: Still gets flagged at high volumes
The combination that actually works: curl-cffi with Chrome impersonation plus fresh residential proxies. Here's the full picture.
## Why Akamai Is Harder Than Cloudflare
Cloudflare's primary detection vector is the TLS fingerprint (JA3/JA4 hash) and whether your browser completes a challenge correctly. Fix the TLS fingerprint, and you're largely through.
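As a refresher on what that fingerprint actually is: JA3 is just an MD5 over five comma-separated ClientHello fields (TLS version, cipher suites, extensions, elliptic curves, point formats), which is why a client that omits or reorders a single value hashes to something completely different. A minimal sketch (the numeric values below are illustrative, not Chrome's real ones):

```python
import hashlib

def ja3_hash(version, ciphers, extensions, curves, point_formats):
    """Compute a JA3 hash from decimal ClientHello field values."""
    parts = [str(version)]
    for field in (ciphers, extensions, curves, point_formats):
        parts.append('-'.join(str(v) for v in field))
    ja3_string = ','.join(parts)  # e.g. "771,4865-4866,0-11-10,29-23,0"
    return hashlib.md5(ja3_string.encode()).hexdigest()

# Two clients that differ by a single TLS extension get unrelated hashes
a = ja3_hash(771, [4865, 4866], [0, 11, 10], [29, 23], [0])
b = ja3_hash(771, [4865, 4866], [0, 11], [29, 23], [0])
print(a != b)  # True
```

This is exactly why curl-cffi's impersonation mode works: it reproduces Chrome's field values byte for byte, so the resulting hash matches a real browser's.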
Akamai layers multiple systems:
- sensor.js behavioral fingerprinting: Executes on page load, probes browser internals (canvas rendering speed, WebGL vendor, font enumeration, navigator.hardwareConcurrency, etc.)
- Passive client hints: Reads HTTP/2 and HTTP/3 connection characteristics your client sends
- IP velocity scoring: Tracks requests per IP across their network, not just the target site
- Session integrity checks: Validates that subsequent requests from the same session behave like a real browser
The key difference: Akamai detects who you are (browser fingerprint + IP history), not just what you sent (one request's headers).
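One practical way to observe the session-integrity layer is Akamai's `_abck` cookie. Community reverse-engineering commonly reports that it contains `~-1~` until your sensor payload is accepted, then flips to `~0~`. A small helper built on that assumption (the marker behavior is reported folklore, not documented API):

```python
def abck_validated(abck_cookie: str) -> bool:
    """Heuristic: has this session's sensor data been accepted by Akamai?

    Commonly reported behavior: '_abck' contains '~-1~' while the session
    is unvalidated, and '~0~' once sensor.js data has been accepted.
    """
    return '~0~' in abck_cookie and '~-1~' not in abck_cookie

# Truncated, illustrative cookie values
print(abck_validated('AAAA~-1~BBBB'))  # False: sensor not yet accepted
print(abck_validated('AAAA~0~BBBB'))   # True: session validated
```

Checking this after your first request tells you whether subsequent requests in the same session are likely to sail through or get challenged.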
## What Actually Works
### 1. curl-cffi with Chrome Impersonation
The curl-cffi library uses libcurl's impersonation mode to replicate Chrome's TLS fingerprint exactly — including HTTP/2 settings, cipher suites, and ALPN protocols. Combined with browser-like default headers, it slips past Akamai's TLS checks.
```python
from curl_cffi.requests import Session

def create_browser_session(proxy=None):
    """Create a curl-cffi session that impersonates Chrome 124."""
    session = Session(
        impersonate='chrome124',
        timeout=30,
        proxies={'http': proxy, 'https': proxy} if proxy else None,
        headers={
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/avif,image/webp,*/*;q=0.8',
            'Accept-Language': 'en-US,en;q=0.9',
            'Accept-Encoding': 'gzip, deflate, br',
            'Sec-Fetch-Dest': 'document',
            'Sec-Fetch-Mode': 'navigate',
            'Sec-Fetch-Site': 'none',
            'Sec-Fetch-User': '?1',
        },
    )
    return session
```
Test it against a site running Akamai:
```python
session = create_browser_session()
response = session.get('https://www.target-akamai-site.com/')
print(f'Status: {response.status_code}')
print(f'URL: {response.url}')

# A block page can still return 200, so check the body for bot indicators
if 'captcha' in response.text.lower() or 'blocked' in response.text.lower():
    print('DETECTED: try a different proxy')
else:
    print('Access OK')
```
On simple Akamai-protected sites (no sensor.js challenge), this alone gets you through 80-90% of the time.
### 2. Residential Proxy Rotation
For pages where sensor.js fires and scores you as a bot, you need:
- Residential proxies (not data center — Akamai flags DC IPs aggressively)
- Fresh IP per request or per session, not per page batch
- Geographic consistency — rotating through IPs in the same country as your target site's expected audience
```python
import random
import time

from curl_cffi.requests import Session

# Example proxy list (replace with your provider's credentials)
PROXIES = [
    'http://user1:pass1@residential-proxy-1.example.com:8080',
    'http://user2:pass2@residential-proxy-2.example.com:8080',
    'http://user3:pass3@residential-proxy-3.example.com:8080',
]

def get_fresh_session():
    """Build a new impersonated session on a randomly chosen proxy."""
    proxy = random.choice(PROXIES)
    return Session(
        impersonate='chrome124',
        proxies={'http': proxy, 'https': proxy},
        timeout=30,
    )

def scrape_with_retry(url, max_retries=3):
    for attempt in range(max_retries):
        session = get_fresh_session()
        try:
            resp = session.get(url)
        except Exception:
            time.sleep(1)  # network/proxy error: rotate and retry
            continue
        if resp.status_code == 200 and 'captcha' not in resp.text.lower():
            return resp
        time.sleep(1)  # detected: a fresh proxy is picked on the next attempt
    return None
```
Provider options (no affiliation): Bright Data, Oxylabs, SmartProxy. Expect to pay $15-30/GB for residential traffic. A single 5-page scrape run might use 10-50MB depending on the site.
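Those figures make bandwidth budgeting easy to sketch. The prices and volumes below are the rough numbers above, not provider quotes:

```python
def monthly_proxy_cost(mb_per_run, runs_per_day, usd_per_gb, days=30):
    """Estimate monthly residential-proxy spend from per-run bandwidth."""
    gb_per_month = mb_per_run * runs_per_day * days / 1024
    return gb_per_month * usd_per_gb

# 50 MB per run, 20 runs a day, at $20/GB:
print(round(monthly_proxy_cost(50, 20, 20), 2))  # 585.94
```

Running the arithmetic before you commit to a provider tier is worth the thirty seconds: residential bandwidth is usually the dominant cost of an Akamai scraping setup.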
### 3. The Selenium/Playwright Fallback (When curl-cffi Isn't Enough)
If curl-cffi keeps getting flagged even with fresh proxies, the site is running sensor.js with active behavioral checks. In that case, you need a real browser — but configured properly:
```python
import time

from selenium import webdriver
from selenium.webdriver.chrome.options import Options
from selenium_stealth import stealth

def create_stealth_driver(proxy=None):
    options = Options()
    options.add_argument('--headless=new')
    options.add_argument('--no-sandbox')
    options.add_argument('--disable-dev-shm-usage')
    options.add_argument('--disable-blink-features=AutomationControlled')
    options.add_argument('--disable-gpu')
    options.add_argument('--window-size=1920,1080')
    if proxy:
        options.add_argument(f'--proxy-server={proxy}')
    driver = webdriver.Chrome(options=options)
    stealth(
        driver,
        languages=['en-US', 'en'],
        vendor='Google Inc.',
        webgl_vendor='Intel Inc.',
        renderer='Intel Iris OpenGL Engine',
        fix_hairline=True,
    )
    return driver

# Usage
driver = create_stealth_driver('http://user:pass@proxy:8080')
driver.get('https://www.target-akamai-site.com/')
time.sleep(3)  # give sensor.js time to run and set its cookies
html = driver.page_source
driver.quit()
```
The selenium-stealth package patches the most common navigator.webdriver and automation flags, but detection rates vary by Akamai configuration. In practice: Selenium as a fallback gets you through maybe 60-70% of the time where curl-cffi fails, at 10-20x the latency cost.
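That trade-off suggests a simple escalation policy: exhaust the cheap curl-cffi path before paying Selenium's latency. A sketch with the two fetchers passed in as callables (the function and parameter names here are mine, not a library API):

```python
def fetch_with_fallback(url, fast_fetch, browser_fetch, max_fast_attempts=3):
    """Try the fast curl-cffi path first; escalate to a real browser.

    fast_fetch(url) should return HTML on success or None when detection
    is suspected; browser_fetch(url) is the expensive Selenium fallback.
    """
    for _ in range(max_fast_attempts):
        html = fast_fetch(url)
        if html is not None:
            return html, 'curl-cffi'
    return browser_fetch(url), 'selenium'

# Toy demo: the fast path always "fails", so we escalate
html, via = fetch_with_fallback(
    'https://example.com/',
    lambda u: None,
    lambda u: '<html>ok</html>',
)
print(via)  # prints selenium
```

In a real pipeline, `fast_fetch` would wrap `scrape_with_retry` from above and `browser_fetch` would wrap the stealth driver, so only the stubborn minority of pages ever touches a browser.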
## What Doesn't Work
| Method | Detection Rate | Notes |
|---|---|---|
| `requests` + User-Agent | ~100% | Dead in seconds |
| Selenium (vanilla) | ~80% | `navigator.webdriver=true` flags immediately |
| Playwright (default) | ~60-70% | Better, but still detectable at volume |
| curl-cffi + datacenter proxy | ~50-70% | TLS passes but IP is flagged |
| curl-cffi + residential proxy | ~10-20% | Works on most sites without JS challenges |
## Handling Dynamic Content and JavaScript Challenges
Many Akamai-protected pages render critical content after initial page load via JavaScript. For these:
- Check if the data is available in the initial HTML — look at the raw response before assuming JS is needed
- If JS is required, use Selenium with stealth + residential proxy
- Consider a headless browser API (e.g., Apify actors built for this) instead of running your own — they handle the browser infrastructure and proxy rotation at scale
```python
# Rough heuristic: does the initial HTML already contain the data,
# or is the page rendered behind Akamai's scripts?
url = 'https://www.target-akamai-site.com/'
session = create_browser_session()
resp = session.get(url)
if '<script' in resp.text and 'akamai' in resp.text.lower():
    print('Akamai scripts present, likely dynamic content: Selenium recommended')
else:
    print('Static page: curl-cffi sufficient')
```
## Complete Working Example
```python
import json
import random

from bs4 import BeautifulSoup
from curl_cffi.requests import Session

PROXIES = [
    'http://user:pass@residential-proxy-1.example.com:8080',
    'http://user:pass@residential-proxy-2.example.com:8080',
]
IMPERSONATE = 'chrome124'

def scrape_akamai_page(url, use_proxy=True):
    """Scrape an Akamai-protected page with automatic proxy rotation."""
    # Pick ONE proxy for both schemes; choosing separately for http and
    # https can route a single session through two different exit IPs.
    proxy = random.choice(PROXIES) if use_proxy else None
    session = Session(
        impersonate=IMPERSONATE,
        proxies={'http': proxy, 'https': proxy} if proxy else None,
        timeout=30,
        headers={
            'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,*/*;q=0.8',
            'Accept-Language': 'en-US,en;q=0.9',
            'Accept-Encoding': 'gzip, deflate, br',
        },
    )
    resp = session.get(url)
    # Check for bot detection: block pages often come back as 200
    if resp.status_code == 200:
        if any(kw in resp.text.lower() for kw in ['captcha', 'blocked', 'access denied', 'security check']):
            return {'success': False, 'error': 'bot_detected', 'status': resp.status_code}
        soup = BeautifulSoup(resp.text, 'html.parser')
        title = (soup.title.string or '') if soup.title else ''
        return {'success': True, 'status': resp.status_code, 'title': title, 'content': resp.text[:500]}
    return {'success': False, 'error': f'http_{resp.status_code}'}

# Test against a known Akamai site
result = scrape_akamai_page('https://www.example-akamai-site.com/')
print(json.dumps(result, indent=2))
```
## Key Takeaways
- Start with curl-cffi: it's 10-50x faster than Selenium and passes TLS checks that `requests` cannot
- Residential proxies are mandatory for serious Akamai scraping; datacenter IPs get flagged quickly
- Selenium is a fallback, not a primary tool; use it when curl-cffi gets blocked after multiple proxy rotations
- Monitor your bot score: if you're hitting CAPTCHAs on 2+ consecutive requests, rotate your proxy and add delays
- Consider managed solutions if the cat-and-mouse game costs more than your time is worth
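The "2+ consecutive CAPTCHAs" rule above is easy to encode as a tiny state machine; the threshold and delay values here are my illustrative defaults, not tuned numbers:

```python
import random
import time

class DetectionMonitor:
    """Track consecutive bot detections and signal when to rotate."""

    def __init__(self, threshold=2):
        self.threshold = threshold
        self.consecutive = 0

    def record(self, detected: bool) -> bool:
        """Return True when it's time to rotate the proxy and back off."""
        self.consecutive = self.consecutive + 1 if detected else 0
        return self.consecutive >= self.threshold

def humanized_delay(base=2.0, jitter=1.5):
    """Sleep a slightly randomized interval between requests."""
    time.sleep(base + random.uniform(0.0, jitter))

monitor = DetectionMonitor()
print(monitor.record(True))   # False: first CAPTCHA, keep going
print(monitor.record(True))   # True: second in a row, rotate now
print(monitor.record(False))  # False: counter resets on success
```

Wire `record()` into the response check of your scraping loop and call `humanized_delay()` between requests so your timing doesn't look machine-regular.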
The arms race is continuous. Akamai updates sensor.js regularly. Test against your specific target — what works on one site may need tuning on another.