Why Your HTTPS Traffic Still Gets Blocked (and How DPI Evasion Works)

#networking #security #proxy #devops

You've set up your development environment, configured your proxy, everything is running over HTTPS, and your traffic still gets dropped. No useful error message, just a connection that hangs or gets reset. If you've ever worked behind a restrictive corporate firewall or tried to reach package registries from a locked-down network, you know this pain.

The culprit is usually Deep Packet Inspection (DPI), and understanding how it works will save you hours of debugging.

The Problem: HTTPS Isn't as Opaque as You Think

Here's the thing most developers don't realize: even though HTTPS encrypts your payload, there's metadata leaking everywhere. The biggest offender is the SNI (Server Name Indication) field in the TLS handshake. SNI is sent in plaintext before encryption is established, which means any middlebox sitting on the network can read exactly which domain you're connecting to.

# Simplified TLS handshake — notice SNI is plaintext
Client → Server: ClientHello
  - SNI: api.npmjs.org        ← visible to any network observer
  - Supported cipher suites
  - TLS version

Server → Client: ServerHello
  - Selected cipher suite
  - Certificate (plaintext in TLS 1.2, encrypted in TLS 1.3)
  # Encrypted tunnel established AFTER the ClientHello, so the SNI has already been sent in the clear

DPI appliances exploit this. They inspect the SNI field, match it against blocklists, and drop the connection before TLS even completes. Your payload is encrypted, sure — but the destination is announced in the clear.
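To see how little work this takes, here is a minimal sketch of the parsing step a middlebox might perform. It uses Python's ssl.MemoryBIO to generate a real ClientHello in memory (no network traffic is sent) and then walks the bytes to pull out the server name. The function names capture_client_hello and extract_sni are made up for illustration, and the parser assumes the whole ClientHello fits in a single TLS record, which a production DPI engine could not rely on.

```python
import ssl
import struct
from typing import Optional


def capture_client_hello(hostname: str) -> bytes:
    """Produce the ClientHello bytes Python's ssl module would send for hostname."""
    ctx = ssl.create_default_context()
    incoming, outgoing = ssl.MemoryBIO(), ssl.MemoryBIO()
    tls = ctx.wrap_bio(incoming, outgoing, server_hostname=hostname)
    try:
        tls.do_handshake()
    except ssl.SSLWantReadError:
        pass  # expected: the client is now waiting for a ServerHello
    return outgoing.read()


def extract_sni(record: bytes) -> Optional[str]:
    """Return the SNI host name from a ClientHello record, or None if absent."""
    try:
        if len(record) < 5 or record[0] != 0x16:  # 0x16 = handshake record
            return None
        pos = 5                                   # skip the record header
        if record[pos] != 0x01:                   # 0x01 = ClientHello
            return None
        pos += 4                                  # handshake type + 3-byte length
        pos += 2 + 32                             # client version + random
        pos += 1 + record[pos]                    # session id
        (cipher_len,) = struct.unpack_from("!H", record, pos)
        pos += 2 + cipher_len                     # cipher suites
        pos += 1 + record[pos]                    # compression methods
        (ext_total,) = struct.unpack_from("!H", record, pos)
        pos += 2
        end = pos + ext_total
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:                     # server_name extension
                # layout: list length (2), name type (1), name length (2), name
                (name_len,) = struct.unpack_from("!H", record, pos + 3)
                return record[pos + 5:pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        return None                               # truncated or malformed input
    return None


if __name__ == "__main__":
    print(extract_sni(capture_client_hello("api.npmjs.org")))
```

Nothing in this parser needs a key or a certificate. That is the whole point: the name is readable by anything that can see the first packet of the connection.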

Other metadata that DPI can use:

DNS queries (unless you're using DoH/DoT)
Packet sizes and timing patterns (traffic fingerprinting)
TLS client fingerprints (JA3/JA4 hashes of the ClientHello)
IP address reputation from threat intelligence feeds

Root Cause: How DPI Actually Inspects Your Connections

DPI doesn't just look at IP headers. Modern DPI engines reconstruct entire TCP streams and apply pattern matching at the application layer. Here's a rough breakdown of what happens:

Layer 3/4 inspection: Source/destination IP, port numbers, protocol detection
TLS fingerprinting: The ClientHello has a characteristic structure for each client library or application. The TLS version, cipher suites, extensions, and their ordering form a fingerprint (this is what JA3 hashing captures)
SNI matching: Domain-level blocking without needing to decrypt anything
Statistical analysis: Machine learning models that classify traffic patterns even when the content is encrypted
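The fingerprinting step is easier to reason about with the JA3 recipe in front of you. The sketch below assumes the ClientHello has already been parsed into lists of integers (the parsing itself is omitted) and follows the published JA3 format: five comma-separated fields, values joined with dashes, GREASE placeholders removed, then an MD5 digest. The function names are illustrative, not taken from any particular library.

```python
import hashlib
from typing import Iterable


def is_grease(value: int) -> bool:
    """GREASE values (RFC 8701) follow the pattern 0x0a0a, 0x1a1a, ..., 0xfafa."""
    return (value & 0x0F0F) == 0x0A0A and (value >> 8) == (value & 0xFF)


def _join(values: Iterable[int]) -> str:
    # Decimal values in the order the client sent them, with GREASE dropped.
    return "-".join(str(v) for v in values if not is_grease(v))


def ja3_string(version, ciphers, extensions, curves, point_formats) -> str:
    """Build the JA3 input: version,ciphers,extensions,curves,point_formats."""
    return ",".join([
        str(version),
        _join(ciphers),
        _join(extensions),
        _join(curves),
        _join(point_formats),
    ])


def ja3_hash(version, ciphers, extensions, curves, point_formats) -> str:
    """MD5 hex digest of the JA3 string."""
    text = ja3_string(version, ciphers, extensions, curves, point_formats)
    return hashlib.md5(text.encode("ascii")).hexdigest()


if __name__ == "__main__":
    # Hypothetical field values, chosen only to show the format.
    print(ja3_string(771, [0x0A0A, 4865, 4866], [0, 10, 11], [29, 23], [0]))
```

Because ordering is part of the input, two clients that offer the same cipher suites in a different order produce different hashes. This is why a script using a stock TLS library can be told apart from a browser even when both connect to the same host.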

The key insight is that encryption protects content, not metadata. And metadata is often enough to block you.

The Domain Fronting Technique

Domain fronting is a technique that exploits a gap between the SNI field and the HTTP Host header. It works like this:

import ssl
import socket

# The SNI shows a high-reputation domain (e.g., a CDN)
context = ssl.create_default_context()
conn = context.wrap_socket(
    socket.socket(),
    server_hostname="cdn.googleapis.com"  # SNI: looks legitimate to DPI
)
conn.connect(("cdn.googleapis.com", 443))

# But the HTTP Host header points to the actual destination
request = (
    "GET / HTTP/1.1\r\n"
    "Host: your-actual-backend.example.com\r\n"  # real destination, inside encrypted tunnel
    "\r\n"
)
conn.sendall(request.encode())  # sendall keeps writing until the whole request is sent

The DPI appliance sees traffic going to cdn.googleapis.com — a major CDN it can't afford to block. But inside the encrypted tunnel, the Host header routes the request to a completely different backend. The CDN (or shared infrastructure) forwards the request based on the Host header.

This technique gained attention when it was used as a censorship circumvention method. Google and Amazon CloudFront disabled it in 2018, and other major providers have since followed, typically by rejecting requests whose SNI and Host header don't match. The concept still illustrates something important about how layered protocols can create inspection gaps.
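The provider-side fix is conceptually simple. The following is a rough sketch of the kind of check a TLS-terminating front end could apply, not a description of how any specific CDN implements it. The helper names normalize_host and sni_matches_host are hypothetical.

```python
def normalize_host(value: str) -> str:
    """Lowercase a host name and strip an optional port and trailing dot."""
    host = value.strip().lower()
    if host.startswith("["):
        # Bracketed IPv6 literal such as "[::1]:443"
        host = host[1:host.index("]")]
    elif host.count(":") == 1:
        # "name:port" form used in Host headers
        host = host.split(":", 1)[0]
    return host.rstrip(".")


def sni_matches_host(sni: str, host_header: str) -> bool:
    """True when the TLS-layer name and the HTTP-layer name agree."""
    return normalize_host(sni) == normalize_host(host_header)


if __name__ == "__main__":
    # The fronted request from the example above fails this check.
    print(sni_matches_host("cdn.googleapis.com", "your-actual-backend.example.com"))
```

A front end that runs a check like this before routing can refuse mismatched requests, which closes the gap between the two layers.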

HTTP Tunneling Through Serverless Functions

A more modern approach uses serverless platforms as relay points. The idea: deploy a lightweight function on a trusted cloud platform, then route your traffic through it. The project MasterHttpRelayVPN on GitHub demonstrates this pattern using Google Apps Script as the relay layer.

The architecture looks roughly like this:

┌──────────┐    HTTPS     ┌──────────────────┐    HTTPS     ┌─────────────┐
│  Client  │ ──────────►  │  Google Apps      │ ──────────►  │ Destination │
│  (local  │  SNI:        │  Script (relay)   │              │  Server     │
│  proxy)  │  script.     │                   │              │             │
│          │  google.com  │  Forwards request │              │             │
└──────────┘              └──────────────────┘              └─────────────┘
     ▲                                                            │
     └────────────────── Response relayed back ───────────────────┘

From the network's perspective, all traffic goes to script.google.com — a Google domain that virtually no firewall blocks. The serverless function acts as a proxy, forwarding requests to the actual destination and relaying responses back.

This approach supports both HTTP and SOCKS5 proxy protocols, and can multiplex multiple streams over a single connection using HTTP/2 framing. The multiplexing is critical for performance — without it, each proxied request would need its own round-trip through the relay.

The MITM TLS Consideration

To proxy HTTPS traffic through this kind of relay, the local proxy component needs to terminate TLS locally. This means it generates certificates on the fly for each destination domain, signed by a local CA that the client trusts. This is the same technique that tools like mitmproxy, Charles Proxy, and corporate SSL inspection appliances use.

# Generate a local CA (same concept used by mitmproxy, Charles, etc.)
openssl genrsa -out ca-key.pem 2048
openssl req -new -x509 -key ca-key.pem -out ca-cert.pem -days 365 \
    -subj "/CN=Local Development CA"

# The local proxy uses this CA to sign per-domain certificates
# so it can decrypt, relay through the tunnel, and re-encrypt

This is standard practice for debugging tools but comes with obvious security implications. You're trusting the local proxy with all your decrypted traffic. Only use this pattern with tools you've audited, and never install a third-party root CA on a machine you use for anything sensitive.
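The same mechanism is what makes interception detectable from the client side: the certificate you receive is signed by the proxy's CA rather than a public one. Here is a small sketch that inspects the issuer of the certificate a server presents. The helper names and the idea of comparing against a set of expected issuer organizations are assumptions for illustration. Which issuers count as "expected" depends entirely on the site you are checking.

```python
import socket
import ssl


def issuer_fields(cert: dict) -> dict:
    """Flatten the nested 'issuer' tuples returned by SSLSocket.getpeercert()."""
    return {key: value for rdn in cert.get("issuer", ()) for key, value in rdn}


def looks_intercepted(cert: dict, expected_issuer_orgs: set) -> bool:
    """True if the issuer organization is not one you expect for this site."""
    return issuer_fields(cert).get("organizationName") not in expected_issuer_orgs


def fetch_peer_cert(host: str, port: int = 443, timeout: float = 5.0) -> dict:
    """Connect and return the validated peer certificate as a dict.

    If the intercepting CA is not in Python's trust store, this raises
    ssl.SSLCertVerificationError, which is itself a strong hint.
    """
    ctx = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as raw:
        with ctx.wrap_socket(raw, server_hostname=host) as tls:
            return tls.getpeercert()


if __name__ == "__main__":
    # Hypothetical certificate data in the shape getpeercert() returns.
    sample = {"issuer": ((("organizationName", "Local Development CA"),),)}
    print(looks_intercepted(sample, {"Example Public CA Inc"}))
```

Running fetch_peer_cert against the same host from the office network and from a network you trust, then comparing the issuers, is a quick way to confirm whether a TLS-inspecting middlebox is in the path.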

DPI Evasion Techniques Worth Understanding

Beyond domain fronting, there are several techniques that DPI evasion tools use. Understanding them helps you debug why traffic behaves differently across networks:

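TLS record fragmentation: Splitting the ClientHello across multiple TLS records so engines that only parse the first record never see a complete SNI
TCP segmentation tricks: Splitting the ClientHello across multiple TCP segments so the SNI field straddles a packet boundary and engines that don't reassemble the stream miss it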
Padding and timing manipulation: Altering packet sizes and inter-arrival times to defeat statistical classifiers
ECH (Encrypted Client Hello): The proper, standards-track solution. It encrypts the sensitive parts of the ClientHello, including the SNI, using a public key the server publishes in DNS (in an HTTPS resource record). This is the "right" way to solve the problem, and it's gaining browser support

What You Should Actually Do

If you're dealing with restrictive networks in a development context, here's the practical advice:

Start with ECH: If your target servers support it and your client stack is recent enough, Encrypted Client Hello solves the SNI leak problem at the protocol level. Check the Cloudflare ECH docs for the current state of support.
Use DoH/DoT for DNS: Tools like dnscrypt-proxy or systemd-resolved with DoT prevent DNS-level blocking. This is the lowest-hanging fruit.
WireGuard or SSH tunnels: For development access, a simple WireGuard tunnel to a cloud VM is far more reliable and secure than HTTP relay tricks. It's also easier to audit.
Understand your network's policies: If you're on a corporate network, talk to your IT team. Most of the time, they can allowlist the domains you need for development tools.
Reserve relay techniques for when you need them: HTTP relay approaches through serverless platforms are clever, but they add latency, complexity, and a trust dependency on the relay code. Use them as a last resort, not a first choice.
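For the DoH item above, you don't need a full resolver to check whether DNS is where you're being blocked. The sketch below sends a single lookup over HTTPS using only the standard library. It assumes a resolver that speaks the JSON DoH format (Cloudflare's public endpoint is used here as an example). The function names are illustrative.

```python
import json
import urllib.parse
import urllib.request

# Assumption: any DoH resolver that supports the application/dns-json format works here.
DOH_ENDPOINT = "https://cloudflare-dns.com/dns-query"


def build_doh_request(name: str, record_type: str = "A") -> urllib.request.Request:
    """Build a GET request asking the resolver for name's records."""
    query = urllib.parse.urlencode({"name": name, "type": record_type})
    return urllib.request.Request(
        f"{DOH_ENDPOINT}?{query}",
        headers={"Accept": "application/dns-json"},
    )


def resolve_over_https(name: str, record_type: str = "A", timeout: float = 5.0) -> list:
    """Return the 'data' field of each answer record (empty list if none)."""
    request = build_doh_request(name, record_type)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        answer = json.load(response)
    return [entry["data"] for entry in answer.get("Answer", [])]


if __name__ == "__main__":
    print(resolve_over_https("registry.npmjs.org"))
```

If this returns addresses while your system resolver fails or returns something different for the same name, the block is probably happening at the DNS layer, and the SNI-based techniques above are not your problem yet.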

Prevention: Building Applications That Handle Restrictive Networks

If you're building tools that developers use, design for hostile network conditions:

Support HTTP proxy configuration via environment variables (HTTP_PROXY, HTTPS_PROXY)
Implement connection fallback — try direct, then proxy, then alternative ports
Use standard ports (443) for all traffic. Non-standard ports are often blocked outright on managed networks
Support certificate pinning bypass for development environments (with clear warnings)
Log connection failures with enough detail to diagnose DPI interference
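As a starting point for the logging item, here is a minimal sketch of a connection probe that maps low-level exceptions to a coarse verdict before logging them. The labels are heuristics, not proof: a reset or a timeout can have many causes besides DPI. The logger name "netdiag" and both function names are made up for this example.

```python
import logging
import socket
import ssl

log = logging.getLogger("netdiag")  # hypothetical logger name


def classify_failure(exc: BaseException) -> str:
    """Map an exception from connect or handshake to a rough diagnosis."""
    if isinstance(exc, ssl.SSLCertVerificationError):
        return "cert-verification-failed (possible TLS interception)"
    if isinstance(exc, (socket.timeout, TimeoutError)):
        return "timeout (packets may be silently dropped)"
    if isinstance(exc, ConnectionResetError):
        return "connection-reset (possible injected RST)"
    if isinstance(exc, ssl.SSLError):
        return "tls-error"
    if isinstance(exc, socket.gaierror):
        return "dns-failure"
    return "other"


def probe(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Attempt a TLS handshake and log the outcome with a diagnosis."""
    ctx = ssl.create_default_context()
    try:
        with socket.create_connection((host, port), timeout=timeout) as raw:
            with ctx.wrap_socket(raw, server_hostname=host) as tls:
                log.info("ok host=%s port=%d tls=%s", host, port, tls.version())
                return "ok"
    except OSError as exc:  # ssl.SSLError and socket errors are OSError subclasses
        verdict = classify_failure(exc)
        log.warning("connect failed host=%s port=%d verdict=%s detail=%r",
                    host, port, verdict, exc)
        return verdict


if __name__ == "__main__":
    logging.basicConfig(level=logging.INFO)
    print(probe("registry.npmjs.org"))
```

A reset that arrives right after the ClientHello for one host name, while other hosts on the same IP range connect fine, is the pattern most consistent with SNI-based blocking. For the proxy item, urllib.request.getproxies() returns the proxy settings Python picks up from the environment, which is worth including in the same log line.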

The networking stack between your code and the internet is more complex than most developers realize. Understanding how DPI works, why HTTPS doesn't hide everything, and what techniques exist to work around network restrictions will save you from those mysterious "it works on my machine but not on the office network" debugging sessions.