DEV Community

Romeo Mihalcea

ISP proxies, AI crawlers, and the slow death of datacenter IPs: 2026 in numbers

TL;DR

Bots passed humans on the open web. IP reputation feeds stopped working for residential traffic. IPv4 prices collapsed. AI crawlers became a measurable tax on public sites. And Europe finally started issuing big GDPR fines, even though only 1.3% of complaints end in one. If you ship anything that touches the public web at scale, the IP infrastructure you set up in 2022 is doing more harm than good in 2026.

The headline numbers:

  • 51% of all web traffic in 2024 was automated. Bots beat humans for the first time in a decade. (Imperva 2025 Bad Bot Report)
  • 37% was bad bots specifically, up from 32% in 2023. Sixth straight yearly increase.
  • 2.8% of websites tested in 2025 were fully protected against bots, down from 8.4% the year before. (DataDome 2025 Global Bot Security Report)
  • 78% of residential-IP sessions in a 4-billion-session study evaded conventional IP reputation feeds. (GreyNoise / IPInfo, April 2026)
  • <$21/IP for large-block IPv4 transfers in May 2025. Roughly a 10-year low. (IPv4.Global)
  • +187% YoY growth in AI-driven traffic in 2025. (HUMAN Security)
  • 62% → 90% of investment firms using alternative data, in two years. (Lowenstein Sandler)
  • €1.15B in GDPR fines from EU DPAs in 2025; only 1.3% of complaints actually result in a fine. (EDPB; noyb)

Bots overtook humans in 2024 and the gap keeps widening

If you've shipped a scraper in the last two years you already feel this. The data behind the gut feeling: 51% of web traffic in 2024 was automated, per Imperva. Bad bots specifically were 37%, up from 32%. That 5-point jump is the largest single-year increase in Imperva's twelve-year time series.

The defense side moved the wrong way. DataDome tested over 16,900 sites across 22 industries in 2025 and found only 2.8% were fully protected, down from 8.4% in 2024. 61% of domains failed to detect a single test bot.

That's not a story about bot mitigation getting worse. It's a story about generative AI lowering the cost of writing request-level automation. People who couldn't afford a developer can now prompt one.

The target surface shifted too. 44% of advanced bot traffic now hits APIs instead of HTML pages. Verizon's 2025 DBIR puts the median rate of credential-stuffing activity across SSO providers at 19% of daily auth attempts. Roughly one in five logins at identity-provider scale is machine-driven. That's wild.

Why datacenter IPs stopped working

A joint GreyNoise / IPInfo study published in April 2026 examined 4 billion edge-attack sessions over three months. The findings:

  • 39% of those sessions came from residential IPs.
  • 78% of the residential-IP sessions evaded IP reputation feeds entirely.
  • 89.7% of the malicious residential IPs were active for under a month before rotating out.
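
To see what those percentages mean against the whole sample, here is a quick back-of-the-envelope calculation. It assumes the 39% and 78% figures compose directly (the second is a share of the first) and treats "4 billion" as an exact count, which the study summary does not guarantee.

```python
# Figures quoted from the GreyNoise / IPInfo study as summarized above.
TOTAL_SESSIONS = 4_000_000_000
RESIDENTIAL_SHARE = 0.39   # share of sessions that came from residential IPs
EVASION_RATE = 0.78        # share of residential sessions missed by reputation feeds

# Share of ALL sessions that were residential and also evaded the feeds.
evading_share = RESIDENTIAL_SHARE * EVASION_RATE
evading_sessions = TOTAL_SESSIONS * evading_share

print(f"{evading_share:.1%} of all sessions")              # 30.4% of all sessions
print(f"~{evading_sessions / 1e9:.2f} billion sessions")   # ~1.22 billion sessions
```

Under those assumptions, roughly three in ten attack sessions in the sample arrived on a residential IP that the reputation feeds did not flag.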

Static IP blocklists, the backbone of anti-bot defense for a decade, no longer carry the signal they used to.

Detection now has to come from somewhere else: behavior over time, browser fingerprint, session history, telemetry. The IP alone tells defenders very little.

That sets up a weird market. Residential-class IPs are the dominant workaround, and the underlying economics got cheap fast.

IPv4 prices collapsed

Large-block (/16+) transfer prices fell to under $21 per IP in May 2025, the lowest in roughly a decade per IPv4.Global. 8,062 IPv4 transfers were recorded globally in 2025, near an all-time high in transfer volume. Monthly lease rates sit at roughly $0.40 to $0.50 per IP.
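
A rough way to read those two prices together is the number of months of leasing that add up to the purchase price. The sketch below assumes a flat $21 purchase price (the source says "under $21", so this is an upper bound) and ignores broker fees, registry fees, and financing, none of which are given in the source.

```python
PURCHASE_PRICE_USD = 21.0           # upper bound per IP, large-block transfer
LEASE_LOW_USD, LEASE_HIGH_USD = 0.40, 0.50   # monthly lease range per IP

def payback_months(purchase_price: float, monthly_lease: float) -> float:
    """Months of lease payments that equal the one-off purchase price."""
    return purchase_price / monthly_lease

slowest = payback_months(PURCHASE_PRICE_USD, LEASE_LOW_USD)
fastest = payback_months(PURCHASE_PRICE_USD, LEASE_HIGH_USD)

print(f"Payback between {fastest:.1f} and {slowest:.1f} months")
```

On these numbers an owner leasing out an address recovers the purchase price in about three and a half to four and a half years, before any costs.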

The structural reason: IPv6 finally caught up. Google reports US IPv6 share crossed 50% in February 2025; France hit ~86% by February 2026. New enterprise workloads are migrating to v6, and incumbents are liquidating hoarded v4 blocks. An ISP proxy is, structurally, an IPv4 lease on a reputable consumer-ISP ASN with the right reverse DNS pointer, so those transfer and lease prices set the floor on unit economics across the sector.

The AI crawler problem nobody had two years ago

This one sneaks up on you if you run any public site with content.

Cloudflare measured AI crawlers at ~8.7% of all HTML request traffic in 2025. Googlebot was 4.5%; the other AI bots together were ~4.2%. User-driven AI crawls (someone hits "research this" in their assistant) grew 15× year over year.

HUMAN Security's 2026 State of AI Traffic benchmark reports AI-driven traffic up 187% YoY, with agentic-browser traffic up 7,851%. Akamai counted 25 billion AI-bot requests to commerce sites in July and August 2025 alone. DoubleVerify attributes 86% of the General Invalid Traffic increase in 2025 to AI crawlers, not classical fraud.

If you're an SRE, AI crawlers are now a meaningful share of your tail-latency budget, and they aren't all polite about robots.txt. If you're building anything that needs a clean read of the live web at scale (alt data, market intel, training corpora) you're competing for IP infrastructure with everyone running an LLM.
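
For the crawlers that do honor robots.txt, it is worth checking what your own file actually tells them. A minimal sketch using Python's standard-library parser follows. The robots.txt content and the `ExampleBot` agent are made up for illustration; `GPTBot` is used only as an example of a named AI crawler token.

```python
from urllib.robotparser import RobotFileParser

# Hypothetical robots.txt: block one named AI crawler everywhere,
# and keep every other agent out of /private/ only.
ROBOTS_TXT = """\
User-agent: GPTBot
Disallow: /

User-agent: *
Disallow: /private/
"""

def build_parser(robots_txt: str) -> RobotFileParser:
    """Parse robots.txt text directly, without fetching anything."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser

parser = build_parser(ROBOTS_TXT)

print(parser.can_fetch("GPTBot", "https://example.com/articles/1"))      # False
print(parser.can_fetch("ExampleBot", "https://example.com/articles/1"))  # True
print(parser.can_fetch("ExampleBot", "https://example.com/private/x"))   # False
```

This only tells you what a compliant crawler would do. It says nothing about the ones that ignore the file, which is where rate limiting and behavioral detection have to take over.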

Comparison: datacenter vs rotating residential vs ISP proxies

If you're picking infra in 2026, here's the practical shape of the tradeoff:

| Dimension | Datacenter | Rotating residential | ISP (static residential) |
| --- | --- | --- | --- |
| ASN type | Hosting (AWS, Hetzner, GCP) | Consumer ISPs via real devices | Consumer ISPs, hosted on servers |
| IP reputation pass rate | Low. Detected within hours on protected sites. | High. ~78% evade reputation feeds. | High. Same ASN trust as residential. |
| Session stability | High | Low. IP rotates on device reconnect. | High. Static IPs, server-grade uptime. |
| Speed | Fast (~ms) | Variable, often slow | Fast (~ms) |
| Best fit | Internal tools, low-protection targets | Throwaway high-volume scrapes | Long sessions, logged-in flows, ad verification, SERP |
| Worst fit | Anything with modern bot detection | Cart fills, auth flows, multi-step scrapes | Pure rotation needs |

Rotating residential pools shine on throwaway, high-volume scrapes. ISP proxies pay for themselves on anything that needs the same IP across a 20-minute logged-in session, a stable geolocation for a SERP scrape, or enough trust to be served real ads instead of cloaked decoys.
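
The tradeoff above can be read as a small decision rule. The sketch below is one possible encoding, not a definitive policy: the `Workload` fields and the `pick_proxy_type` function are hypothetical names, and real choices also depend on budget, geography, and target-specific behavior that this ignores.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Workload:
    target_has_bot_detection: bool  # modern bot protection on the target
    needs_stable_session: bool      # same IP across a logged-in or multi-step flow

def pick_proxy_type(workload: Workload) -> str:
    """Map a workload onto the three proxy classes compared above."""
    if not workload.target_has_bot_detection:
        # Internal tools and low-protection targets: cheapest option works.
        return "datacenter"
    if workload.needs_stable_session:
        # Logged-in flows, cart fills, ad verification, SERP tracking.
        return "isp"
    # Protected target, but each request can stand alone.
    return "rotating_residential"

print(pick_proxy_type(Workload(target_has_bot_detection=True, needs_stable_session=True)))
# isp
```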

What teams are actually buying this stuff for

Five workloads dominate the buyer mix in 2026, and the numbers behind each are big enough to matter:

  1. Retail price monitoring. Global e-commerce hit $6.42T in 2025, 20.5% of all retail (eMarketer). McKinsey's classic finding still holds: a 1% price improvement yields about 8.7% operating-profit lift. With that much leverage, a continuous competitor scan pays for itself in weeks. Datacenter scrapes against major retailers now get silently poisoned with bad prices instead of blocked, which is worse than blocked.
  2. Ad verification. US digital ad revenue hit $294.6B in 2025 (IAB/PwC). Programmatic was $162.4B of it. The ANA reports $26.8B of programmatic spend leaked to inefficiency in Q2 2025 alone, up 34% in two years. You need real residential IPs in the right geos to see what campaigns actually look like for end users.
  3. Travel fare aggregation. Imperva found 48% of travel-industry traffic in 2024 was bad bots, the highest share of any sector. Skift reports the top four OTAs control 96% of the sector's $58B in revenue. Metasearch teams need stable residential egress just to keep rates fresh.
  4. SEO / SERP rank tracking. SEO software is an $84.9B market in 2025 (Fortune Business Insights), forecast to $154.6B by 2030. Personalized SERPs make rank tracking from scraping farms unreliable; agencies need geo-distributed residential egress.
  5. Alternative data for hedge funds. This one's the sleeper hit. Investment firms using alt data jumped from 62% in 2023 to 90% in 2025 (Lowenstein Sandler). 89% plan to grow budgets. Two-thirds already spend $1M+ per year. Grand View projects the alt-data market at $135.7B by 2030 from $11.65B in 2024, a 63.4% CAGR.
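
The 1% to 8.7% figure in the price-monitoring item is a margin-leverage effect, and the arithmetic is short. The sketch assumes volume and costs stay fixed when the price moves, which is the simplification such headline figures typically rely on. The function names are illustrative.

```python
def profit_lift(price_change: float, operating_margin: float) -> float:
    """Relative change in operating profit from a price change.

    With revenue R and operating margin m, profit is m * R. Raising price by p
    at constant volume and cost adds p * R of profit, so the lift is p / m.
    """
    return price_change / operating_margin

def implied_margin(price_change: float, lift: float) -> float:
    """Operating margin at which a given price change produces a given lift."""
    return price_change / lift

margin = implied_margin(0.01, 0.087)
print(f"Implied operating margin: {margin:.1%}")   # Implied operating margin: 11.5%
print(f"Lift at a 5% margin: {profit_lift(0.01, 0.05):.0%}")
```

In other words, the 8.7% figure corresponds to a business running at roughly an 11.5% operating margin. Thinner-margin retailers see a proportionally larger lift from the same 1%.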

If you've wondered who writes the checks for residential IPs at industrial scale, it isn't marketing teams. It's quants and LLM labs.

The legal track moved more than most teams realize

Two years of case law and regulation worth knowing before you ship a commercial scraper:

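US: scraping public data is defensible. Van Buren v. United States (2021) read the CFAA's "exceeds authorized access" clause narrowly. The Ninth Circuit reaffirmed hiQ Labs v. LinkedIn in 2022, applying the same reasoning to "without authorization"; that case eventually settled with a $500K judgment against hiQ plus a permanent injunction and data destruction. Public-data scraping at the appellate level is, post-hiQ, on solid ground as far as the CFAA goes. Contract claims based on a site's terms of service are a separate exposure.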

EU: enforcement scaled, but base rate stays low. National DPAs issued €1,145,760,374 in GDPR fines during 2025 (EDPB). Cumulative fines since 2018 sit above €4.2B across 6,680+ decisions. The counter-signal worth pinning to your wall: only 1.3% of complaints brought to EU DPAs end in a fine, per noyb. Headline totals are real. The base rate per complaint is much lower than the totals imply.

EU Data Act and AI Act. The Data Act applies from 12 September 2025. The AI Act's Article 53 requires general-purpose AI providers to respect machine-readable opt-outs under Article 4(3) of the EU Copyright Directive (2019/790) and publish a "sufficiently detailed summary" of training data, including main scraped domains. Territorial scope follows EU market placement, so EU training-data compliance effectively exports anywhere a model is sold in Europe.

California woke up. The California Privacy Protection Agency's largest fine to date is $1.35M against Tractor Supply in September 2025, on Global Privacy Control non-compliance. The CPPA has telegraphed that GPC compliance is the priority enforcement vector through 2026.
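
For anyone who has not met Global Privacy Control in code: the signal arrives as a `Sec-GPC: 1` request header. A minimal server-side check might look like the sketch below. The `gpc_opt_out` function name is hypothetical, and what you must do once the signal is detected is a legal question this does not answer.

```python
from typing import Mapping

def gpc_opt_out(headers: Mapping[str, str]) -> bool:
    """Return True when the request carries a Global Privacy Control signal.

    HTTP header names are case-insensitive, so normalize before lookup.
    The GPC signal is the header value "1".
    """
    normalized = {name.lower(): value.strip() for name, value in headers.items()}
    return normalized.get("sec-gpc") == "1"

print(gpc_opt_out({"Sec-GPC": "1", "User-Agent": "ExampleBrowser"}))  # True
print(gpc_opt_out({"User-Agent": "ExampleBrowser"}))                  # False
```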

None of this is legal advice. Talk to a lawyer before scaling anything commercial.

FAQ

What's an ISP proxy in plain terms?
A server-hosted IP address that originates from a consumer ISP's ASN (Comcast, Verizon, BT, Deutsche Telekom, etc.) instead of a datacenter ASN (AWS, Hetzner, GCP). Sometimes called a static residential proxy. From a target site's perspective it looks like home broadband, but it runs on server hardware for speed and session stability.

Why not just use rotating residential IPs?
For workflows that need a stable session (logged-in scrapes, multi-step flows, cart fills, ad verification) IP rotation breaks things. ISP proxies give you residential-grade trust without the volatility.
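
As an illustration of what "stable session" means in practice, here is a minimal standard-library sketch that pins every request in a flow to one proxy and one cookie jar. The proxy URL and credentials are placeholders, not a real endpoint, and `build_sticky_opener` is a hypothetical helper name.

```python
import urllib.request
from http.cookiejar import CookieJar

# Placeholder endpoint; substitute whatever your provider issues.
PROXY_URL = "http://user:pass@proxy.example.net:8080"

def build_sticky_opener(proxy_url: str) -> urllib.request.OpenerDirector:
    """Build an opener that sends all traffic through one proxy and keeps cookies."""
    proxy = urllib.request.ProxyHandler({"http": proxy_url, "https": proxy_url})
    cookies = urllib.request.HTTPCookieProcessor(CookieJar())
    return urllib.request.build_opener(proxy, cookies)

opener = build_sticky_opener(PROXY_URL)

# Every call on this opener leaves from the same IP with the same cookies, e.g.:
# opener.open("https://example.com/login", data=b"...")
# opener.open("https://example.com/account")
```

With a rotating pool, the second request could leave from a different IP than the first, which is exactly what trips session checks on logged-in flows.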

Are datacenter proxies actually dead?
For modern bot-protected targets, in practice yes. They still work for low-protection internal tools, certain APIs, or staging. They will not survive a price-monitoring run against a Tier-1 retailer or a SERP scrape on a tracked term.

How much do ISP proxies cost in 2026?
Underlying IPv4 lease rates are about $0.40 to $0.50 per IP per month, after large-block transfer prices fell under $21 per IP in 2025. Retail pricing has tracked those costs down. Single-digit dollars per IP per month is the realistic range at scale.

Is scraping legal?
Public-data scraping in the US is defensible after Van Buren and the Ninth Circuit's reaffirmation of hiQ. EU collection requires GDPR, EU Data Act (effective 12 Sept 2025), and AI Act compliance for any training-data use. Talk to a lawyer before scaling anything commercial.

How big is the proxy market really?
Mordor Intelligence sizes the residential proxy server software market at $122M in 2025, growing to $148M by 2030 (3.98% CAGR). The downstream markets that consume the IP layer are where the growth is: web scraping at $1.03B → $2.23B by 2031, alternative data at $11.65B → $135.7B by 2030.
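
The growth rates in this post are easy to sanity-check from their endpoints. The sketch below recomputes them with the standard compound-growth formula. How many compounding periods each publisher used is an assumption on my part, which is why the alternative-data figure is shown both ways.

```python
def cagr(start: float, end: float, periods: int) -> float:
    """Compound annual growth rate between two values over a number of periods."""
    return (end / start) ** (1 / periods) - 1

# Residential proxy software: $122M (2025) to $148M (2030), five periods.
mordor = cagr(122, 148, 5)

# Alternative data: $11.65B to $135.7B, counted as five and as six periods.
alt_data_five = cagr(11.65, 135.7, 5)
alt_data_six = cagr(11.65, 135.7, 6)

print(f"Proxy software: {mordor:.2%}")
print(f"Alt data, 5 periods: {alt_data_five:.1%}")
print(f"Alt data, 6 periods: {alt_data_six:.1%}")
```

The proxy-software endpoints give roughly 3.9%, close to the quoted 3.98%; the difference is consistent with rounded endpoints. The alternative-data endpoints reproduce the quoted 63.4% only with five compounding periods. Counting 2024 to 2030 as six periods gives closer to 51%, so check which window a source uses before comparing growth rates.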

What changed in the last 12 months specifically?
Three things. AI crawler traffic became a measurable, named tax on the public web. IPv4 prices collapsed alongside IPv6 finally crossing 50% in major markets. And IP reputation feeds, the load-bearing component of bot defense for a decade, are now functionally defeated for residential traffic.

What I'd tell someone starting fresh in 2026

If you're building automation that touches the public web at scale, your IP layer is no longer set-it-and-forget-it. Datacenter IPs are the first thing modern defenses block, rotating residential pools break logged-in flows, and the IPs that hold up for serious work are residential-grade with stable sessions. That's the gap ISP proxies fill, and the data behind that gap is the rest of this post.

A few honest closing notes. The Mordor residential-proxy figure ($122M) is software-revenue scope only; broader sizings that bundle bandwidth resale and DaaS spend run an order of magnitude larger. The 51% bot share number from Imperva is measured against sites under their protection (so skewed toward enterprise targets attackers hunt). Cloudflare's ~30% bot share over the full anycast network is not the same denominator. Both numbers are correct at their stated scope. Don't conflate them.

This post was written by the team behind anonymous-proxies.net. We sell ISP proxies among other products. The full long-form analysis with all 50+ data points, every primary-source citation, the mega-table, and the methodology notes lives here:

https://anonymous-proxies.net/posts/isp-proxies-statistics-2026/
