1. What Cloudflare actually is
Cloudflare is not a reverse proxy running on one server somewhere. It is a globally distributed edge network with over 300 points of presence (PoPs). When you put your domain behind Cloudflare, you are routing all traffic through that network before it ever reaches your server.
The mechanism is anycast routing. Cloudflare announces the same IP addresses from every PoP simultaneously. When a user sends a request to your site, BGP routing directs it to the nearest PoP in network terms (usually, but not always, the geographically closest one), not to your origin server. From there, Cloudflare decides what to do with it.
User in Tokyo
|
| (anycast routes to nearest PoP)
v
Cloudflare Tokyo PoP
|-- cached? → serve from edge, origin never touched
|-- blocked? → return 403, origin never touched
|-- challenge? → run Turnstile, origin never touched
|-- clean? → forward to origin, return response
v
Your origin server
TLS termination happens at the edge PoP, not at your origin. Cloudflare holds the certificate, decrypts the request, inspects it, then re-encrypts it for the leg to your origin (assuming the SSL/TLS mode is Full or Full (strict), which it should be).
This is why Cloudflare can inspect HTTPS traffic for WAF rules. It does sit in the middle of the connection, but not as an attacker: you are explicitly delegating that decryption to them.
2. The layers between a request and your server
A request arriving at a Cloudflare PoP passes through several decision layers in order:
DDoS mitigation runs first. Volumetric floods are absorbed at the network layer. HTTP floods are identified by rate, pattern, and reputation
IP reputation and geofencing check the source IP against Cloudflare's threat database. IPs from known botnets, Tor exit nodes, or datacenter ranges are scored.
WAF inspects the HTTP layer: headers, path, query params, body. Cloudflare maintains a managed ruleset covering OWASP Top 10 plus known CVEs
Bot management (Turnstile is the visible part) assigns each request a bot score from 1 to 99. Score 1 is almost certainly a bot. Score 99 is almost certainly human
Cache is the last layer before origin. If the response is cacheable and a fresh copy exists at the PoP, Cloudflare serves it without touching your server
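The ordering above can be pictured as a short-circuiting pipeline. The sketch below is a toy model, not Cloudflare's implementation: the Request fields, the thresholds, and the layer functions are all hypothetical, and only illustrate that the first layer to reach a verdict stops the request before it gets any closer to the origin.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass
class Request:
    # Every field is a hypothetical stand-in for a signal the edge would compute.
    path: str
    requests_per_second: float = 1.0
    ip_is_known_bad: bool = False
    bot_score: int = 99  # 1 = almost certainly a bot, 99 = almost certainly human
    cached: bool = False


# A layer returns a verdict string, or None to pass the request to the next layer.
Layer = Callable[[Request], Optional[str]]


def ddos_layer(req: Request) -> Optional[str]:
    return "drop (DDoS)" if req.requests_per_second > 1000 else None  # arbitrary threshold


def reputation_layer(req: Request) -> Optional[str]:
    return "block (reputation)" if req.ip_is_known_bad else None


def waf_layer(req: Request) -> Optional[str]:
    return "block (WAF)" if "../" in req.path else None  # toy path-traversal rule


def bot_layer(req: Request) -> Optional[str]:
    return "challenge" if req.bot_score < 30 else None  # arbitrary threshold


def cache_layer(req: Request) -> Optional[str]:
    return "serve from cache" if req.cached else None


LAYERS: List[Layer] = [ddos_layer, reputation_layer, waf_layer, bot_layer, cache_layer]


def handle(req: Request) -> str:
    """Run the layers in order; the first verdict wins."""
    for layer in LAYERS:
        verdict = layer(req)
        if verdict is not None:
            return verdict
    return "forward to origin"
```

Because the cache sits last, a malicious request for a cached path is still blocked by the WAF layer instead of being served from the edge.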
3. How Turnstile works
Turnstile is Cloudflare's CAPTCHA replacement. Unlike reCAPTCHA v2, it has no image challenge: the goal is to verify a visitor is human without making them solve anything visible
1. The widget loads a JS challenge from Cloudflare's edge. The script is different per request, not a static file you can analyze once
2. The script collects passive signals:
- Timing: how long did each JS operation take? Headless browsers running at full CPU speed have suspiciously uniform timing.
- Interaction: did the mouse move before the form was submitted? Did keystrokes have natural delays?
- Browser fingerprint: canvas rendering, WebGL renderer, installed fonts, audio context output.
- Environment: is navigator.webdriver exposed? Are dev tools open?
3. Cloudflare runs those signals through a model trained on billions of requests and issues a signed token if the request looks human
4. Your backend verifies the token against Cloudflare's siteverify API:
POST https://challenges.cloudflare.com/turnstile/v0/siteverify
{
"secret": "your-secret-key",
"response": "token-from-widget"
}
If your backend does not make this call, the protection is entirely client-side and trivially bypassed: an attacker can skip the widget and post directly to your endpoint with a missing or made-up token.
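A minimal sketch of that server-side check, using only the Python standard library. The function name verify_turnstile and the injectable opener parameter are assumptions made for illustration and testability; the endpoint and the secret/response fields come from the request shown above, and remoteip is an optional field of the same API.

```python
import json
import urllib.parse
import urllib.request

SITEVERIFY_URL = "https://challenges.cloudflare.com/turnstile/v0/siteverify"


def verify_turnstile(secret, token, remote_ip=None, opener=urllib.request.urlopen):
    """Return True only if Cloudflare confirms that the token is valid."""
    if not token:
        return False  # no token submitted: treat as a failed challenge

    fields = {"secret": secret, "response": token}
    if remote_ip:
        fields["remoteip"] = remote_ip  # optional: the visitor's IP address

    # The endpoint also accepts form-encoded bodies, which need no extra headers.
    body = urllib.parse.urlencode(fields).encode("ascii")
    request = urllib.request.Request(SITEVERIFY_URL, data=body, method="POST")

    with opener(request, timeout=5) as response:
        result = json.load(response)

    # Anything other than an explicit success is a rejection.
    return result.get("success") is True
```

A form handler would call verify_turnstile(secret, token_from_form) before doing any work and reject the submission on False. Network errors raise an exception here; a caller should treat those as a failed verification rather than letting the request through.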
4. Finding the origin server behind Cloudflare
If an attacker finds your origin IP, they can bypass Cloudflare entirely by sending requests directly to that IP. Your WAF, DDoS protection, and Turnstile all disappear
Here are the techniques commonly used, in order of how often they succeed
SSL certificate history
Before you put a domain behind Cloudflare, it probably had a certificate issued directly for the origin. Certificate transparency logs are public and record every publicly trusted certificate issued for your domain:
https://crt.sh/?q=example.com
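The logs are append-only, so anything that ever appeared in a certificate is there forever. Certificates list hostnames rather than IP addresses, so what usually leaks is forgotten subdomains and certificate fingerprints, which lead to the origin IP once they are resolved or matched against scan data.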
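To audit your own domain, the log entries can be reduced to a list of hostnames worth checking. The sketch below assumes the JSON that crt.sh returns when output=json is appended to the query, where each entry carries a name_value field holding one or more newline-separated names; treat that field name as an assumption about crt.sh's current output. The function name and sample data are made up for illustration.

```python
def hostnames_from_ct(entries):
    """Collect the unique hostnames named in a list of CT log entries."""
    names = set()
    for entry in entries:
        # name_value may hold several names separated by newlines.
        for name in entry.get("name_value", "").splitlines():
            name = name.strip().lower()
            if name.startswith("*."):
                name = name[2:]  # a wildcard still reveals the parent name
            if name:
                names.add(name)
    return sorted(names)


# Made-up sample shaped like the assumed crt.sh JSON output.
sample = [
    {"name_value": "example.com\nwww.example.com"},
    {"name_value": "*.dev.example.com"},
    {"name_value": "WWW.example.com"},
]
print(hostnames_from_ct(sample))
```

Each hostname in the result is a candidate to resolve and compare against Cloudflare's published ranges.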
DNS history
Before Cloudflare, your A record pointed directly to your origin. Those records are archived by SecurityTrails, DNSDumpster, and ViewDNS.info, often with timestamps showing exactly when you switched
Subdomains not behind Cloudflare
Many teams proxy www and the apex but leave other subdomains with a grey cloud (not proxied) by accident:
- ftp.example.com: legacy, often points to origin
- dev.example.com, staging.example.com: forgotten
- api.example.com: sometimes bypasses the proxy for latency reasons
A subdomain enumeration pass reveals which subdomains resolve to a non-Cloudflare IP
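The same check is easy to run against your own records. This sketch assumes the hostnames have already been resolved into a dict; the three ranges listed are only an illustrative subset of Cloudflare's IPv4 space and may be out of date, so a real audit should load the current list from cloudflare.com/ips-v4. The function names are hypothetical.

```python
import ipaddress

# Illustrative subset only; fetch the current list from cloudflare.com/ips-v4.
CLOUDFLARE_V4_SUBSET = ["104.16.0.0/13", "172.64.0.0/13", "173.245.48.0/20"]


def is_proxied(ip, ranges=CLOUDFLARE_V4_SUBSET):
    """Return True if the address falls inside one of the given ranges."""
    addr = ipaddress.ip_address(ip)
    return any(addr in ipaddress.ip_network(cidr) for cidr in ranges)


def exposed_subdomains(resolved, ranges=CLOUDFLARE_V4_SUBSET):
    """Given {hostname: ip}, keep the hosts that resolve outside the ranges."""
    return {host: ip for host, ip in resolved.items() if not is_proxied(ip, ranges)}


# Made-up resolution results for illustration.
resolved = {
    "www.example.com": "104.16.132.229",
    "ftp.example.com": "203.0.113.42",
}
print(exposed_subdomains(resolved))
```

Any hostname this returns is resolving straight to a server that is not behind the proxy.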
MX records
Cloudflare's standard proxy handles HTTP and HTTPS only, so mail cannot go through it. Your MX record points directly to a mail server, often in the same IP block as your web server:
dig MX example.com # → mail.example.com
dig A mail.example.com # → 203.0.113.42
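Once the mail server's address is known, checking whether a suspected origin sits in the same block is a one-liner with the standard ipaddress module. The helper name and the /24 default are assumptions for illustration; hosting providers allocate blocks of many sizes.

```python
import ipaddress


def shares_block(ip_a, ip_b, prefix=24):
    """Return True if both addresses fall in the same /prefix network."""
    # strict=False lets us derive the network from a host address.
    network = ipaddress.ip_network(f"{ip_a}/{prefix}", strict=False)
    return ipaddress.ip_address(ip_b) in network


print(shares_block("203.0.113.42", "203.0.113.7"))   # same /24
print(shares_block("203.0.113.42", "198.51.100.7"))  # different /24
```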
SPF records
SPF records list every IP authorized to send email on your behalf. They often include your origin server or hosting provider's IP range:
dig TXT example.com
# v=spf1 ip4:203.0.113.0/24 include:sendgrid.net ~all
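To see what a record like this gives away, it can be split into its mechanisms. The parser below is a simplified sketch: it handles only the ip4, ip6, and include mechanisms and ignores the rest of the SPF grammar (a, mx, redirect, macros). The function name is hypothetical.

```python
def parse_spf(record):
    """Extract ip4, ip6, and include values from an SPF TXT record."""
    result = {"ip4": [], "ip6": [], "include": []}
    terms = record.split()
    if not terms or terms[0].lower() != "v=spf1":
        return result  # not an SPF record

    for term in terms[1:]:
        term = term.lstrip("+-~?")  # drop the optional qualifier
        # Split on the first colon only, so IPv6 values stay intact.
        key, sep, value = term.partition(":")
        if sep and key.lower() in result:
            result[key.lower()].append(value)
    return result


print(parse_spf("v=spf1 ip4:203.0.113.0/24 include:sendgrid.net ~all"))
```

Every ip4 or ip6 entry is a range an attacker will scan for your origin, so list only dedicated mail infrastructure there.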
Shodan + certificate fingerprint
If your origin uses a Cloudflare origin certificate, its fingerprint is the same regardless of how the server is reached, including by raw IP. Shodan and Censys index TLS certificates across the entire IPv4 space, so searching for your certificate's fingerprint reveals the raw IP.
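To run that search against your own origin, you first need the fingerprint. This sketch computes the SHA-256 fingerprint of a PEM-encoded certificate with the standard library; the function name is an assumption, and the exact search syntax for Shodan or Censys is left out because it varies by service.

```python
import hashlib
import ssl


def sha256_fingerprint(pem_cert):
    """Return the hex SHA-256 fingerprint of a PEM-encoded certificate."""
    # Fingerprints are computed over the DER bytes, not the PEM text.
    der = ssl.PEM_cert_to_DER_cert(pem_cert.strip())
    return hashlib.sha256(der).hexdigest()
```

Reading the certificate file your web server is configured with and passing its contents to sha256_fingerprint gives the value to search for. If a scan database returns your origin IP for it, that IP is discoverable.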
5. Bypassing Turnstile
Solving services
2captcha, Anti-Captcha, and CapSolver use human workers or automated solvers that run a real browser session and return the token. This works but is slow (seconds per token) and costs money per solve. Practical at low volume, expensive at scale.
Headless browser spoofing
Playwright and Puppeteer combined with stealth plugins patch the detectable properties:
- navigator.webdriver set to undefined
- Spoofed canvas fingerprint
- Realistic mouse movement and keystroke timing
- Full Chrome user agent
A well-configured headless browser can pass Turnstile at a reasonable rate. Cloudflare's model is continuously updated, so what works today may fail tomorrow: it is an ongoing arms race.
What actually stops most bots
The visible Turnstile widget is not the main defense. Cloudflare's bot score from network-level signals (IP reputation, ASN, request rate, TLS fingerprint) catches far more traffic than the JS challenge does. A request from AWS Lambda with a clean User-Agent still has a datacenter ASN: that alone raises the bot score before any JS runs
Turnstile alone, validated client-side only, is weak. The combination of network scoring plus behavioral analysis is what makes the system effective
6. How to actually protect your origin
Use Cloudflare Tunnel. This is the only approach that fully hides your origin IP. cloudflared opens an outbound connection from your server to Cloudflare's network. No open inbound ports, no IP to find.
cloudflared tunnel create my-tunnel
cloudflared tunnel route dns my-tunnel example.com
cloudflared tunnel run my-tunnel
If you cannot use Tunnel, firewall your origin to Cloudflare IPs only. Cloudflare publishes its IP ranges at cloudflare.com/ips-v4 and cloudflare.com/ips-v6. Allow only those ranges on ports 80 and 443. Drop everything else.
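As a rough illustration, the allowlist can be generated from the published ranges rather than typed by hand. The sketch below assumes a Linux origin using iptables and ip6tables and only builds the command strings; it does not run them. The function name is hypothetical, the two ranges in the example are samples, and the output should be reviewed before being applied to a real firewall.

```python
import ipaddress


def allowlist_rules(cidrs, ports=(80, 443)):
    """Build firewall commands that accept web traffic only from the given ranges."""
    port_list = ",".join(str(port) for port in ports)
    rules = []
    for cidr in cidrs:
        network = ipaddress.ip_network(cidr)  # raises ValueError on a malformed range
        tool = "iptables" if network.version == 4 else "ip6tables"
        rules.append(
            f"{tool} -A INPUT -p tcp -m multiport --dports {port_list} "
            f"-s {network} -j ACCEPT"
        )
    # Everything else reaching the web ports is dropped, for both IP versions.
    for tool in ("iptables", "ip6tables"):
        rules.append(f"{tool} -A INPUT -p tcp -m multiport --dports {port_list} -j DROP")
    return rules


# Sample ranges only; load the full current lists from the URLs above.
for rule in allowlist_rules(["173.245.48.0/20", "2400:cb00::/32"]):
    print(rule)
```

The ACCEPT rules must come before the DROP rules, which is why the function appends the drops last. Cloudflare's ranges change occasionally, so the list needs to be refreshed on a schedule.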
Proxy every subdomain. Audit your DNS records. Every subdomain that should be proxied must have the orange cloud enabled. Grey-cloud records pointing to your origin are a bypass by design
Keep mail on a separate IP. Your mail server should not share an IP or IP block with your web server
Validate Turnstile server-side, always. The token must be verified by your backend on every form submission
Check your certificate history now. Run your domain through crt.sh and SecurityTrails. If your old origin IP is visible, either move to a new IP (and use Tunnel going forward) or rely entirely on the firewall approach
Originally published on jguillaumesio.com