Denis Rybakov

Posted on Feb 7 • Edited on Feb 15

Russia's Internet Filtering Infrastructure: Evolution and Architecture

#network #security #architecture

Part 1 of the series: "Internet Censorship in Russia: A Technical Deep Dive"

Disclaimer

Educational Purpose: This article examines the technical architecture of internet filtering systems for educational purposes. The author does not encourage violation of local laws. All information is based on publicly available sources and open-source research.

Introduction: Why This Matters for Engineers

What makes Russia's internet filtering infrastructure technically interesting for engineers:

Scale: Distributed real-time system operating at backbone network speeds
Complexity: Combining networking, cryptography, and ML/statistics in a single architecture
Evolution: A genuine "arms race" of algorithms, protocols and ideas happening in real-time
Architecture: Lessons applicable to any high-throughput, stateful inspection system

Here we examine the technical aspects from an engineering perspective, without political commentary, using Russia as the example. The other notorious system of this kind is China's Great Firewall, and we will compare the two along the way. We'll look at how these systems work, what problems they solve, and what trade-offs their engineers make. We'll also discuss how circumvention works and what tools exist to bypass these systems.

A recurring theme of this series is that no event or technology should be considered only in its present state: we need historical perspective to understand what is happening now and why certain decisions were made.

This isn't a political piece. Here we discuss architecture, engineering problems, trade-offs and solutions on both sides.

Like it or not, to understand the topic we need to look at it from both perspectives: the filtering side and the circumvention side. The goal is not to pick one, but to understand what tools exist and what decisions engineers make.

Whether you're interested in network security, distributed systems, or privacy-preserving technologies, understanding how modern internet filtering works provides valuable insights.

From IP Blocking to Deep Packet Inspection: The Evolution

The Early Stage: Simple IP Blocking

In the beginning, internet filtering in Russia was primitive—blocking by IP addresses. This worked against small websites with dedicated IPs, but quickly proved ineffective:

Problems with IP blocking:

Collateral damage: A single IP can host thousands of websites (shared hosting, CDN)
- Block one site → accidentally block hundreds of legitimate sites
- Example: Blocking a Cloudflare IP affects 10,000+ domains
Easy circumvention: Change the server's IP address
- Takes 5 minutes to update DNS
- Blocking is always playing catch-up
Cloud infrastructure: Services like Telegram use thousands of IP addresses
- 2018: Attempted Telegram block resulted in ~18 million IPs blocked
- Collateral damage: AWS, Google Cloud services affected
- Telegram remained operational via proxy networks
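The collateral-damage problem can be illustrated with a toy model. This is a minimal sketch, assuming a hypothetical hosting map (`HOSTING`) built from documentation-only addresses and `.example` domains; it does not describe any real blocklist.

```python
import ipaddress

# Hypothetical hosting map: which domains share which IP address.
# Addresses come from documentation ranges (RFC 5737), not real services.
HOSTING = {
    "203.0.113.10": ["target.example", "shop.example", "blog.example"],
    "203.0.113.11": ["news.example"],
    "198.51.100.7": ["unrelated.example"],
}

def collateral(blocked_prefixes, target):
    """Return (addresses covered, domains blocked besides the target)."""
    networks = [ipaddress.ip_network(p) for p in blocked_prefixes]
    # Assumes the prefixes do not overlap; overlapping ones would be double counted.
    covered = sum(n.num_addresses for n in networks)
    victims = set()
    for ip, domains in HOSTING.items():
        addr = ipaddress.ip_address(ip)
        if any(addr in n for n in networks):
            victims.update(d for d in domains if d != target)
    return covered, sorted(victims)

covered, victims = collateral(["203.0.113.0/24"], "target.example")
print(covered, victims)
```

Blocking one /24 to reach a single target takes 256 addresses offline, along with every unrelated domain hosted in that range.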

Example of IP-based blocking in Russia

The notorious example is the 2018 attempt to block Telegram. The Russian government blocked ~18 million IP addresses, causing heavy collateral damage (the Academy of Sciences filed a lawsuit against the network censorship agency) but without any real effect on Telegram.

What happened? AWS services went down for businesses, Google Cloud was affected, and random legitimate websites became unreachable. Sites and services of the Russian Academy of Sciences and of some government departments were blocked.
And as the icing on the cake: the network censorship agency's own services were also blocked in the process.

At the time this was a big problem for the Russian government, while for regular users the future looked bright and cloudless: it seemed that no real censorship was possible.

But it was not the end of the story.

The system needed to evolve.

Evolution Path

2012-2015: IP-based blocking
    ↓
2015-2017: DNS filtering
    ↓
2017-2019: SNI inspection (TLS handshake analysis)
    ↓
2019-present: Deep Packet Inspection + Behavioral analysis

What is Deep Packet Inspection (DPI)?

DPI is not just a filter—it's a real-time traffic classification system operating at network backbone speed.

Key characteristics:

Analyzes packet contents, not just headers (although TLS payloads cannot be decrypted once the hello exchange is complete)
Maintains state across connections (stateful inspection)
Makes decisions in microseconds
Must handle 10-100+ Gbps throughput
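Stateful inspection means keeping a record per connection rather than judging each packet in isolation. Here is a minimal sketch of such a flow table; the class and field names are illustrative assumptions, not taken from any real DPI product.

```python
from dataclasses import dataclass, field

@dataclass
class FlowState:
    packets: int = 0
    byte_count: int = 0
    verdict: str = "unknown"   # "unknown", "allow" or "block"

@dataclass
class FlowTable:
    flows: dict = field(default_factory=dict)

    @staticmethod
    def key(src, sport, dst, dport, proto):
        # Sort the endpoints so both directions map to the same flow.
        a, b = (src, sport), (dst, dport)
        return (proto,) + (a + b if a <= b else b + a)

    def update(self, src, sport, dst, dport, proto, size):
        """Account one packet and return the state of its flow."""
        state = self.flows.setdefault(
            self.key(src, sport, dst, dport, proto), FlowState()
        )
        state.packets += 1
        state.byte_count += size
        return state

table = FlowTable()
table.update("10.0.0.2", 50000, "203.0.113.10", 443, "tcp", 517)
state = table.update("203.0.113.10", 443, "10.0.0.2", 50000, "tcp", 1460)
print(state.packets, state.byte_count, len(table.flows))
```

Both directions of the connection land in the same entry, so a verdict reached on the client's first packet can be applied to everything that follows.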

This leads to both capabilities and limitations, which we'll explore next.

TSPU: Russia's DPI Infrastructure

What is TSPU?

TSPU (Технические Средства Противодействия Угрозам, "Technical Means of Counteracting Threats") is Russia's DPI-based filtering infrastructure, mandated by the "Sovereign Internet" law of 2019.

The details of TSPU are classified, but we can make educated guesses based on its observed behavior.
Filtering rules differ between regions of Russia. The differences look like either testing of new rules before a wider rollout or local restrictions, which points to key architectural principles of the system.
Another key fact is the legal requirement to install TSPU at every ISP in Russia.

Key differences from China's Great Firewall:

Aspect	China (GFW)	Russia (TSPU)
Architecture	Centralized: ~3 major backbone chokepoints	Distributed: Thousands of ISP nodes
Control	Direct government operation	Installed at ISPs, remotely managed
Deployment	Border filtering (international traffic)	Both domestic and international
Transparency	"Black box"	Legal framework with ISP obligations

It is often said that a distributed architecture means Russia can't simply "flip a switch" to filter all traffic everywhere at the same moment, as China can. For users this makes little practical difference: when the "switch" is flipped in Russia, filtering rules roll out gradually, but the end result is the same.

Deployment Architecture

Where TSPU is installed:

ISP nodes (mandatory for major operators)
- At Internet Service Provider facilities
- Both mobile and fixed-line networks
- Operators legally required to allow installation
Backbone channels
- Major inter-city links
- Internet Exchange Points (IX)
- International gateway connections
Two operational modes:

Active mode (inline filtering):

Internet traffic → TSPU (analyzes & filters) → Destination

All traffic passes through TSPU
Can block or modify packets in real-time
Adds latency (milliseconds)
More effective filtering

Passive mode (monitoring):

Internet traffic → TAP/SPAN port → TSPU (analyzes copy)
                ↓
            Direct flow (unaffected)

Receives copy of traffic for analysis
Can send RST packets to terminate connections
Cannot modify packets in-flight
Lower latency impact
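The difference between the two modes can be shown with a toy model. This is a sketch under simplifying assumptions: packets are plain dictionaries, `is_blocked` is a hypothetical verdict function, and the injected reset is represented only by the flow name.

```python
def inline_filter(packets, is_blocked):
    """Active mode: a packet is forwarded only if the verdict allows it."""
    return [p for p in packets if not is_blocked(p)]

def passive_monitor(packets, is_blocked):
    """Passive mode: every packet is forwarded; the monitor only sees a copy.

    The only available reaction is to inject a reset for the offending flow,
    which is sent after the original packet has already left.
    """
    forwarded = list(packets)
    resets = [p["flow"] for p in packets if is_blocked(p)]
    return forwarded, resets

packets = [
    {"flow": "a", "sni": "allowed.example"},
    {"flow": "b", "sni": "blocked-site.com"},
]

def is_blocked(packet):
    return packet["sni"] == "blocked-site.com"

print(inline_filter(packets, is_blocked))
print(passive_monitor(packets, is_blocked))
```

In the inline case the blocked packet never reaches its destination. In the passive case it does, and the connection is only torn down afterwards.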

Why two modes?

My understanding: most ISPs run active mode as the primary filter and use passive mode as a complement.

Active mode catches obvious violations in real time. Passive mode is for the non-obvious cases: suspicious patterns that need deeper analysis without blocking traffic immediately. TSPU has to pass a packet through if it fails to analyze it.

It's like having a bouncer at the door (active) and security cameras
watching for unusual behavior (passive).

Technical Capabilities

What TSPU Can See and Do

1. Unencrypted data visibility:

IP addresses and ports (always visible)
DNS queries
HTTP headers (if not HTTPS)
SNI in TLS Client Hello ← Primary blocking method for HTTPS

Why SNI matters:

TLS Handshake (simplified):

Client → Server: Client Hello
├─ TLS version
├─ Supported ciphers
└─ SNI: "blocked-site.com" ← VISIBLE IN PLAINTEXT

[In TLS 1.3, the rest of the handshake after Server Hello is encrypted]

The SNI (Server Name Indication) tells the server which domain the client wants to reach. It's transmitted before encryption begins, making it perfect for filtering.
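To show how little work SNI extraction takes, here is a minimal parser sketch. It assumes the whole Client Hello arrives in a single TLS record and a single buffer. `build_client_hello` is a hypothetical test helper that produces a stripped-down hello with only the SNI extension, not what a real browser sends.

```python
import struct

def extract_sni(record: bytes):
    """Return the SNI host name from a TLS ClientHello record, or None."""
    try:
        if record[0] != 0x16 or record[5] != 0x01:   # handshake record, ClientHello
            return None
        pos = 5 + 4                                   # record header + handshake header
        pos += 2 + 32                                 # client_version + random
        pos += 1 + record[pos]                        # session_id
        pos += 2 + struct.unpack_from("!H", record, pos)[0]   # cipher_suites
        pos += 1 + record[pos]                        # compression_methods
        end = pos + 2 + struct.unpack_from("!H", record, pos)[0]
        pos += 2
        while pos + 4 <= end:
            ext_type, ext_len = struct.unpack_from("!HH", record, pos)
            pos += 4
            if ext_type == 0:                         # server_name extension
                # Layout: list length (2), name type (1), name length (2), name
                name_len = struct.unpack_from("!H", record, pos + 3)[0]
                return record[pos + 5 : pos + 5 + name_len].decode("ascii")
            pos += ext_len
    except (IndexError, struct.error, UnicodeDecodeError):
        pass
    return None

def build_client_hello(host: str) -> bytes:
    """Build a minimal ClientHello carrying only an SNI extension (test helper)."""
    name = host.encode("ascii")
    entry = b"\x00" + struct.pack("!H", len(name)) + name
    sni = struct.pack("!H", len(entry)) + entry
    extensions = struct.pack("!HH", 0, len(sni)) + sni
    body = (
        b"\x03\x03" + bytes(32)                   # version + random
        + b"\x00"                                 # empty session_id
        + struct.pack("!H", 2) + b"\x13\x01"      # one cipher suite
        + b"\x01\x00"                             # null compression
        + struct.pack("!H", len(extensions)) + extensions
    )
    handshake = b"\x01" + len(body).to_bytes(3, "big") + body
    return b"\x16\x03\x01" + struct.pack("!H", len(handshake)) + handshake

print(extract_sni(build_client_hello("blocked-site.com")))
```

The parser only skips over length-prefixed fields and compares one extension type, which is why this check is cheap enough to run on every new connection.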

2. Metadata analysis:

Packet sizes and patterns
Intervals between packets
Connection duration
Volume of transferred data
Flow direction: upload vs download ratio
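None of these metadata features require decryption. A minimal sketch of extracting them from one flow follows. The input format, a list of `(timestamp, size, direction)` tuples, is an assumption made for illustration.

```python
from statistics import mean

def flow_features(packets):
    """packets: list of (timestamp_s, size_bytes, direction), direction 'up' or 'down'."""
    times = [t for t, _, _ in packets]
    sizes = [s for _, s, _ in packets]
    gaps = [b - a for a, b in zip(times, times[1:])]
    up = sum(s for _, s, d in packets if d == "up")
    down = sum(s for _, s, d in packets if d == "down")
    return {
        "duration_s": times[-1] - times[0],
        "total_bytes": up + down,
        "mean_size": mean(sizes),
        "mean_gap_s": mean(gaps) if gaps else 0.0,
        "up_down_ratio": up / down if down else float("inf"),
    }

sample = [
    (0.0, 500, "up"),
    (0.5, 1500, "down"),
    (1.0, 1500, "down"),
    (2.0, 500, "up"),
]
features = flow_features(sample)
print(features)
```

A classifier would consume a vector like this per flow, possibly with many more features than the five shown here.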

3. Machine Learning classification:

Protocol identification by traffic patterns
VPN service fingerprinting
Behavioral analysis:
- Normal web browsing: burst patterns (page load → pause)
- VPN: continuous bidirectional flow
- Streaming: large download, small upload

4. Active probing:
This seems to be what TSPU's passive mode is designed for. When TSPU detects suspicious traffic:

1. Client connects to suspicious IP:port
2. TSPU records: IP 1.2.3.4:8443
3. TSPU initiates its own connection to 1.2.3.4:8443
4. Sends typical VPN handshake packets
5. Analyzes server response
6. If server responds like VPN → add to blocklist

This is why running a VPN server on a public IP is increasingly difficult—TSPU will discover and block it automatically.

5. Government certificates:

Ministry of Digital Development root certificates
Enables MITM (Man-in-the-Middle) attacks on TLS
Currently limited deployment (pilot projects)
If widely adopted: could decrypt all HTTPS traffic

Why would someone install a government root certificate? It is required to access some official online services: taxes, social services, healthcare.

What TSPU Cannot Do

1. Decrypt TLS traffic (without MITM):

Modern TLS uses forward secrecy
Recorded past traffic cannot be decrypted, even if the server's long-term private key is later obtained

2. Inspect encrypted tunnels:

VPN protocols use their own encryption
Cannot see contents of WireGuard, OpenVPN, etc.
Can only analyze metadata and patterns

3. Process 100% of traffic:

The performance constraint: At 100 Gbps backbone speeds, there's
~0.1 microseconds per packet. Deep analysis is physically impossible
for all traffic.
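The per-packet budget is simple arithmetic. The sketch below ignores Ethernet framing overhead, which is a simplifying assumption. The ~0.1 microsecond figure corresponds to packets of roughly 1250 bytes; for small packets the budget is far tighter.

```python
def ns_per_packet(link_gbps: float, packet_bytes: int) -> float:
    """Time between back-to-back packets, ignoring Ethernet framing overhead."""
    return packet_bytes * 8 / link_gbps   # 1 Gbps equals 1 bit per nanosecond

for size in (64, 1250, 1500):
    print(size, "bytes ->", ns_per_packet(100, size), "ns")
```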

This is why circumvention exists.

TSPU architects know this. They made a conscious trade-off: optimize
for the 80% case (SNI blocking), use sampling for the rest.

This creates opportunities. When TSPU is under load (peak hours, DDoS,
major events), filtering quality degrades. Sophisticated circumvention
techniques slip through.

It's not a bug. It's an inherent limitation of real-time systems at scale.

Here's the fundamental trade-off:

Analysis Depth vs. Coverage:

┌──────────────────────────────────┐
│ Fast check (SNI)                 │ ← 100% of traffic
├──────────────────────────────────┤
│ Statistical analysis             │ ← ~50-70% (sampling)
├──────────────────────────────────┤
│ ML classification                │ ← ~10-20% (expensive)
├──────────────────────────────────┤
│ Deep reconstruction              │ ← <1% (suspicious only)
└──────────────────────────────────┘
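One way to picture the tiers is as nested sampling by flow. This is a sketch under assumptions: the rates are taken from the upper ends of the illustrative ranges in the diagram, and selection here is purely by hash, whereas the deepest tier in the diagram is driven by suspicion rather than by sampling.

```python
import hashlib

# Illustrative sampling rates (assumed, see the diagram above).
TIERS = [("sni_check", 1.00), ("statistical", 0.70), ("ml", 0.20), ("deep", 0.01)]

def flow_fraction(flow_id: str) -> float:
    """Map a flow identifier to a stable pseudo-random number in [0, 1)."""
    digest = hashlib.sha256(flow_id.encode()).digest()
    return int.from_bytes(digest[:6], "big") / 2**48

def tiers_for(flow_id: str):
    """Return the analysis tiers this flow is selected for."""
    x = flow_fraction(flow_id)
    return [name for name, rate in TIERS if x < rate]

flows = [f"flow-{i}" for i in range(10000)]
for name, _ in TIERS:
    selected = sum(name in tiers_for(f) for f in flows)
    print(name, selected)
```

Hashing the flow identifier keeps the decision consistent for every packet of a flow without storing extra state.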

4. Block without collateral damage:

Shared IP problem: One IP = hundreds of sites
CDN infrastructure: Cloudflare, AWS, etc.
Blocking Cloudflare IP → hundreds of innocent sites affected

Detection Methods: How TSPU Identifies VPNs

1. Statistical Fingerprinting

VPN traffic has characteristic patterns that differ from normal HTTPS:

Packet size distribution:

Normal HTTPS:
Size: [52, 1460, 1460, 52, 150, 1460, ...]
      Small requests, large responses

VPN:
Size: [148, 92, 1320, 1280, 1350, ...]
      More uniform distribution

Timing analysis:

Normal HTTPS:
Packets: [0ms, 50ms, 51ms, 2000ms, 2050ms, ...]
         Bursts (page load), then pauses

VPN:
Packets: [0ms, 25000ms, 50000ms, 75000ms, ...]
         Regular keepalive packets

Connection characteristics:

Long-lived connections (hours vs minutes)
Balanced bidirectional traffic
Continuous data flow (not a burst)
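The timing difference can be reduced to a single number. As a sketch, the function below computes the coefficient of variation of inter-packet gaps for the two example traces above. Any decision threshold on this value would be an assumption and is deliberately left out.

```python
from statistics import mean, pstdev

def gap_variation(timestamps_ms):
    """Coefficient of variation of inter-packet gaps (0 means perfectly regular).

    Needs at least two packets with distinct timestamps.
    """
    gaps = [b - a for a, b in zip(timestamps_ms, timestamps_ms[1:])]
    return pstdev(gaps) / mean(gaps)

https_trace = [0, 50, 51, 2000, 2050]      # bursts, then pauses
vpn_trace = [0, 25000, 50000, 75000]       # regular keepalives

print("HTTPS:", gap_variation(https_trace))
print("VPN:  ", gap_variation(vpn_trace))
```

The keepalive trace scores exactly zero, while the bursty browsing trace scores well above one.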

2. Protocol Signatures

Each VPN protocol has recognizable patterns:

WireGuard:

Handshake Initiation: exactly 148 bytes
Handshake Response: exactly 92 bytes
UDP protocol (not TCP)
Specific packet size alignment (multiples of 16)
Plain WireGuard is unusable in Russia because of its static handshake pattern
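The fixed sizes make a signature check trivial. The sketch below combines the two lengths listed above with the WireGuard message header, a one-byte type followed by three zero bytes, which comes from the protocol specification rather than from this article. Whether TSPU matches on exactly these fields is an assumption.

```python
def wireguard_message_type(udp_payload: bytes):
    """Return 'initiation', 'response' or None based on size and header bytes."""
    if len(udp_payload) < 4 or udp_payload[1:4] != b"\x00\x00\x00":
        return None                      # the three reserved bytes must be zero
    if udp_payload[0] == 1 and len(udp_payload) == 148:
        return "initiation"
    if udp_payload[0] == 2 and len(udp_payload) == 92:
        return "response"
    return None

def looks_like_wireguard(first_up: bytes, first_down: bytes) -> bool:
    """Check the first datagram in each direction of a UDP flow."""
    return (wireguard_message_type(first_up) == "initiation"
            and wireguard_message_type(first_down) == "response")

# Synthetic payloads with the right shape; the bodies are just zero bytes.
initiation = b"\x01\x00\x00\x00" + bytes(144)
response = b"\x02\x00\x00\x00" + bytes(88)
print(looks_like_wireguard(initiation, response))
```

Two length comparisons and a four-byte prefix are enough to flag the flow.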

OpenVPN:

Characteristic handshake sequence
Currently useless for censorship circumvention in Russia: the protocol is dated and its well-known patterns are easily recognized by DPI.

Shadowsocks:

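No plaintext handshake: the SOCKS5-style address header travels inside the encrypted stream, so detection relies on traffic that looks uniformly random and on active probing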
Currently useless for censorship circumvention in Russia: it is detected by DPI and blocked with ease.

3. Active Probing

TSPU doesn't just passively observe—it actively tests suspicious servers:

Probing workflow:

1. Detect: Suspicious traffic pattern to IP:port
2. Record: Store IP and port
3. Probe: TSPU connects to the same IP:port
4. Test: Send VPN-like handshake packets
5. Analyze: Check if server responds like VPN
6. Block: If confirmed VPN → add to blocklist
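The workflow above can be sketched as a loop with the network I/O and the classifier injected as functions. Everything here is simulated: `send_probe` and `looks_like_vpn` are hypothetical stand-ins, the addresses come from documentation ranges, and the "not HTTP means VPN" rule is a deliberately naive placeholder.

```python
def probe_cycle(suspects, send_probe, looks_like_vpn):
    """Return the endpoints confirmed by probing.

    suspects       -- iterable of (ip, port) recorded from suspicious flows
    send_probe     -- callable(ip, port) -> reply bytes, or None if no answer
    looks_like_vpn -- callable(reply) -> bool, the response classifier
    """
    blocklist = set()
    for ip, port in suspects:
        reply = send_probe(ip, port)
        if reply is not None and looks_like_vpn(reply):
            blocklist.add((ip, port))
    return blocklist

# Simulated servers instead of real network I/O.
fake_replies = {
    ("203.0.113.5", 8443): b"\x02\x00\x00\x00" + bytes(88),     # answers the probe
    ("203.0.113.6", 443): b"HTTP/1.1 400 Bad Request\r\n\r\n",  # ordinary web server
    ("203.0.113.7", 8443): None,                                # stays silent
}

blocked = probe_cycle(
    fake_replies,
    send_probe=lambda ip, port: fake_replies[(ip, port)],
    looks_like_vpn=lambda reply: not reply.startswith(b"HTTP/"),
)
print(blocked)
```

Only the endpoint that answers the probe in a protocol-specific way ends up on the blocklist. A silent server or one that behaves like a web server is not confirmed.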

This is one of the ways TSPU tries to deal with advanced approaches such as VLESS+Reality.

Why this works:

Most VPN servers respond predictably to handshakes
TSPU can test without being an actual client
Automated discovery of new VPN servers

How VPN protocols defend:

Authentication before responding (VLESS+Reality approach)
Whitelist of allowed client IPs
Respond differently to unauthorized connections
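The "authenticate before responding" idea can be shown in a few lines. This is a generic sketch, not the actual VLESS+Reality handshake: `SECRET`, the message layout, and the decoy behavior are all assumptions, and a real design would also need protection against replayed messages.

```python
import hashlib
import hmac

SECRET = b"shared-secret"   # hypothetical pre-shared key
TAG_LEN = 32                # HMAC-SHA256 output size

def make_hello(secret: bytes, nonce: bytes) -> bytes:
    """Client side: nonce followed by an HMAC tag over it."""
    return nonce + hmac.new(secret, nonce, hashlib.sha256).digest()

def handle_first_message(data: bytes) -> str:
    """Server side: open the tunnel only for a valid tag, otherwise act like a web server."""
    nonce, tag = data[:-TAG_LEN], data[-TAG_LEN:]
    expected = hmac.new(SECRET, nonce, hashlib.sha256).digest()
    if len(data) > TAG_LEN and hmac.compare_digest(tag, expected):
        return "tunnel"
    return "decoy-web-response"

print(handle_first_message(make_hello(SECRET, b"n" * 16)))      # authorized client
print(handle_first_message(b"GET / HTTP/1.1\r\n\r\n"))          # prober without the key
```

A prober that does not know the key gets the same answer an ordinary web server would give, so the probe reveals nothing protocol-specific.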

4. Replay Attacks

Concept: Record legitimate traffic and replay it to suspicious servers

However, there is little public information about this, and I have not observed such behavior myself; so far there are only rumors.

The Fundamental Trade-off

TSPU faces an inherent constraint:

Performance ←→ Accuracy

High throughput processing:
- Fast, simple checks (SNI lookup)
- Low false positives
- Easy to circumvent

Deep analysis:
- Expensive (CPU, memory, latency)
- Better accuracy
- Cannot apply to all traffic

This trade-off is why circumvention is possible. Under load, TSPU must choose between:

Dropping packets (network failure)
Letting suspicious traffic through
Heavy sampling (miss some VPNs)

In peak hours, filtering quality degrades—this is exploitable.

Architectural Challenges

Decentralized Network Problem

Unlike China's centralized Great Firewall, Russia has:

Hundreds of ISPs (not 3-4 major ones)
No single network chokepoint
International connections via multiple routes
Difficulty coordinating across all nodes

Result: Inconsistent filtering across ISPs and regions.

Economic Constraints

Aggressive filtering has costs:

Collateral damage → business complaints
AWS/Cloudflare blocking → economic impact
Performance degradation → problems for legitimate users and inter-government data exchange
Investment in TSPU infrastructure → budget constraints

The state must balance control vs. economic functionality.

Technical Debt

Modern internet architecture is inherently distributed:

CDN networks (Cloudflare, Akamai, Fastly)
Cloud providers (AWS, Google Cloud, Azure)
Peer-to-peer protocols (WebRTC, BitTorrent)
Dynamic IPs and load balancing

Old model (pre-2010s):

Website → Fixed IP → Easy to block

New model (2020s):

Service → CDN → 10,000 IPs → Shared with other services → Hard to block

Trying to block a single service often requires blocking entire CDN ranges, causing massive collateral damage.

What We've Learned

Key insights about TSPU architecture:

✅ Distributed system operating at massive scale (100+ Gbps)
✅ Real-time classification with microsecond decision windows
✅ Trade-off between depth and coverage (can't deeply analyze all traffic)
✅ Primary method: SNI inspection (simple, fast, effective)
✅ Advanced methods: ML, statistical analysis, active probing
❌ Cannot decrypt modern TLS without MITM
❌ Collateral damage problem with shared infrastructure
❌ Performance constraints limit deep inspection

The bottom line: TSPU is a sophisticated system, but it has fundamental limitations imposed by physics (processing speed), economics (cost vs. benefit), and architecture (distributed internet infrastructure).

What's Next

In Part 2, we'll explore:

How protocols exploit TSPU's weaknesses
The OSI model through a security lens
VPN protocols: WireGuard, VLESS+Reality, Shadowsocks
Packet manipulation techniques (TCP segmentation, TTL tricks)
Why nested encryption (SSH tunneling TLS) works

References & Further Reading

Next in series: Part 2 - "Circumventing Internet Censorship: Protocols, Techniques, and the Arms Race"
