Web agents are increasingly central to how AI systems interact with the web — automating research, extracting structured data, completing multi-step workflows. But in production, many of them fail. Not because the agent logic is wrong. Because the browser infrastructure underneath isn't built for the modern web.
This article explains why protected sites are hard for automated agents, what kinds of solutions exist, and what "infrastructure-level" actually means in practice.
What Makes a Site "Protected"
Most developers think of site protection in terms of CAPTCHAs — the visible challenge that asks you to identify traffic lights or type distorted text. But modern access management goes several layers deeper.
When a request arrives at a protected site, the system evaluates multiple signals simultaneously:
IP reputation. Where is this request coming from? Datacenter IP ranges (AWS, GCP, Azure) are associated with automated traffic by default. An agent running on a cloud VM gets flagged at this layer before anything else is checked. Residential IPs are associated with real users and treated differently.
TLS fingerprint. Before the HTTP request even arrives, the TLS handshake reveals what client is making it. A Python requests session or a Node fetch call carries a signature distinct from any real browser's, and protection systems identify it in milliseconds — before your agent has seen the first byte of the page.
HTTP protocol patterns. Real browsers use HTTP/2 with specific header ordering and frame sequencing. Many automation tools default to HTTP/1.1 or send headers in a different order, creating a mismatch protection systems detect before any page logic runs.
Browser environment. Headless browsers have detectable properties — missing plugins, inconsistent hardware attributes, non-standard rendering metrics. An agent using headless Chrome without additional configuration exposes dozens of these signals simultaneously. Protection systems check them against known browser profiles.
Behavioral signals. Mouse movement patterns, scroll behavior, timing between interactions — real users produce different patterns than automated tools. An agent that navigates directly to a button and clicks it in 40ms looks nothing like a human. Modern protection systems use statistical models trained on real user behavior to flag these patterns.
Challenge responses. When earlier signals are ambiguous, the system issues a challenge (CAPTCHA or equivalent) that requires human-verifiable interaction. Passing the first five layers reduces how often this happens, but doesn't eliminate it entirely.
These signals compound. An agent that passes IP checks but fails environment checks still gets flagged. The failure mode most teams don't anticipate isn't the initial block — it's silent degradation: some protection systems let requests through while returning subtly incomplete data. By the time you notice, weeks of collection may be compromised. Each signal independently might seem manageable, but all of them together require a coherent approach.
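To illustrate how these signals compound, here is a toy scoring model. Everything in it — the signal names, the weights, the threshold implied by the comment — is invented for demonstration; real protection systems use proprietary statistical models, not a weighted sum.

```python
# Toy model only: how a protection system MIGHT combine signals.
# Weights are invented for illustration.
SIGNAL_WEIGHTS = {
    "datacenter_ip": 0.35,     # IP reputation
    "non_browser_tls": 0.30,   # TLS fingerprint mismatch
    "http1_fallback": 0.10,    # HTTP protocol pattern
    "headless_markers": 0.15,  # browser environment
    "robotic_timing": 0.10,    # behavioral signals
}

def risk_score(signals: dict) -> float:
    """Sum the weights of every signal that fired."""
    return sum(w for name, w in SIGNAL_WEIGHTS.items() if signals.get(name))

# An agent on a cloud VM running unconfigured headless Chrome fires
# three signals at once, even though each seems manageable alone:
agent = {"datacenter_ip": True, "headless_markers": True, "robotic_timing": True}
print(round(risk_score(agent), 2))  # 0.6
```

The point of the sketch is the compounding: fixing any single signal lowers the score only a little, which is why piecemeal fixes tend not to survive contact with production.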
The Assembly Problem
For developers who encounter this, the natural response is to assemble solutions: add a proxy service, configure the browser environment, simulate realistic timing. The components exist — residential proxies, browser configuration libraries, behavioral timing modules.
The problem isn't finding the pieces. It's maintaining the stack.
Access management systems are not static. They update continuously as they learn new patterns. A browser configuration that works today may be detectable in six weeks. The proxy pool that performs well this month may have degraded reputation by next quarter. The behavioral patterns that pass current analysis may fail against a new detection model.
This creates an ongoing engineering problem: someone on your team is responsible for watching it, updating it, and debugging it when things break in production. For teams that are fundamentally building agents — not infrastructure — this is a high tax on engineering attention.
A realistic estimate for a production DIY access management stack: $500–5,000/month in services — residential proxies alone run $3–15/GB depending on provider and geography (based on published rates from major providers as of Q1 2026), plus cloud compute and any fallback solving services — plus ongoing engineering time to keep it current. The services are the smaller cost. The real cost is the engineer maintaining the stack as detection systems evolve.
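The kind of stack described above can be sketched in outline. Everything here is illustrative: the proxy endpoints, header values, and timing bounds are placeholders, and in production these pieces would sit in front of a browser automation library. Each one is a thing your team keeps current by hand as detection evolves.

```python
import itertools
import random

# Placeholder proxy endpoints. A real pool is sourced and paid for
# separately, and monitoring its reputation is on your team.
RESIDENTIAL_PROXIES = [
    "http://user:pass@proxy-a.example.com:8000",
    "http://user:pass@proxy-b.example.com:8000",
]
_proxy_cycle = itertools.cycle(RESIDENTIAL_PROXIES)

# Real browsers send headers in a specific order; protocol-pattern
# checks notice when that order drifts, so this list needs manual
# upkeep as browser releases ship.
BROWSER_HEADERS = [
    ("User-Agent", "Mozilla/5.0 (Windows NT 10.0; Win64; x64) ..."),
    ("Accept", "text/html,application/xhtml+xml"),
    ("Accept-Language", "en-US,en;q=0.9"),
]

def next_proxy() -> str:
    """Round-robin over the pool for each new session."""
    return next(_proxy_cycle)

def human_delay() -> float:
    """Hand-tuned pause meant to mimic human pacing between actions."""
    return random.uniform(0.8, 2.5)
```

None of this is hard to write. The cost is that every constant in it has an expiration date nobody tells you about.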
Infrastructure-Level vs. Application-Level Solutions
There's a meaningful architectural distinction in how access management can be handled:
Application-level: Your agent code handles access management. You configure proxies, harden the browser environment, add retry logic, and tune behavioral timing in your application. You control every layer and are responsible for maintaining each one as conditions change.
Infrastructure-level: A platform layer handles access management before your agent code runs. The agent receives a browser session already configured with appropriate access properties. Your application code doesn't need to know the details — and doesn't need to update when protection systems evolve.
The difference matters most for maintenance burden. When protection systems update, application-level solutions require you to update your code. Infrastructure-level solutions push that maintenance to the platform layer.
Neither model is universally better:
- Application-level gives you full control over every component. It's more cost-effective at very high volume if you have dedicated infrastructure engineering bandwidth and need compliance auditability at every layer.
- Infrastructure-level trades control for reduced maintenance. It makes sense for teams focused on agent capabilities who don't want access management to be a recurring engineering concern.
The right choice depends on your team's constraints — not on which approach sounds more sophisticated.
What Infrastructure-Level Looks Like in Practice
TinyFish is built around the infrastructure-level model. Rather than exposing access management components for developers to configure, it handles them as a platform service.
The developer-facing interface is minimal:
```json
{
  "goal": "Extract pricing data from this page",
  "url": "https://example.com/pricing",
  "browser_profile": "stealth"
}
```
browser_profile: "stealth" activates infrastructure-level access handling. The platform layer configures the session appropriately for the target site. Your agent code stays the same regardless of what protection system the site uses.
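As a minimal sketch, sending that payload from Python might look like the following. Only the payload shape comes from the example above — the endpoint URL, auth header, and response handling are hypothetical placeholders; consult the TinyFish documentation for the real interface.

```python
import json
import urllib.request

# Payload shape from the example above.
payload = {
    "goal": "Extract pricing data from this page",
    "url": "https://example.com/pricing",
    "browser_profile": "stealth",  # activates infrastructure-level handling
}

req = urllib.request.Request(
    "https://api.tinyfish.example/v1/runs",  # hypothetical endpoint
    data=json.dumps(payload).encode(),
    headers={
        "Content-Type": "application/json",
        "Authorization": "Bearer YOUR_API_KEY",  # placeholder
    },
)
# response = urllib.request.urlopen(req)  # run with real credentials
```

The notable property is what is absent: no proxy configuration, no header ordering, no timing logic. Those concerns live below the API surface.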
To make this concrete: an agent monitoring price changes on a heavily protected retail site sends its first request. Under managed mode, the infrastructure layer handles routing, session configuration, and request properties automatically. The agent receives the page content and proceeds to the next step. On the third request 90 seconds later, the infrastructure layer rotates session parameters silently. The agent doesn't see it.
The auto-reconfiguration behavior is relevant here. In the Mind2Web benchmark, one task initially encountered an access issue and completed successfully on a subsequent run after the infrastructure layer reconfigured automatically — without developer input. This is what infrastructure-level handling means in practice: adaptation happens at the platform layer, not in your code.
Honest Capabilities
Infrastructure-level solutions don't eliminate all access challenges. Some sites use systems aggressive enough to require different approaches.
What works reliably. Sites with standard access controls are handled automatically in managed mode. Across the Mind2Web benchmark, TinyFish achieved approximately 90% task success across 136 live websites — all 300 execution traces are published publicly for independent review.
What has limitations. Sites using enterprise-grade protection systems are handled in some configurations, but with lower consistency than standard protection. If your target uses one of these systems, test with the free tier before committing.
What no tool handles. Sites that implement hard IP-level blocks have made an architectural decision to reject automated access. No browser infrastructure, regardless of how it's configured, can address a hard block at the network level.
Access challenges that appear despite infrastructure handling. Even with infrastructure-level handling in place, agents on heavily protected sites sometimes encounter verification challenges. TinyFish runs real browser sessions, which keeps these rare; for sites where challenges still appear, third-party solving services can be integrated at the application layer as a fallback.
Two Browser Profiles
TinyFish offers two modes:
- stealth (managed mode) — Full infrastructure-level access handling. Use for any production workflow against external sites with access controls. Slightly slower due to platform-layer processing.
- lite — Minimal access handling, faster execution. Use for internal tools, public APIs, or sites with no access controls.
Default to managed for production agent workflows against external sites. Switch to lite only after confirming the target has no access controls.
The Right Question
The question of "which tool handles more protection systems" focuses on the wrong variable: it measures the tool instead of your team's operational burden. For teams building AI agents, the more durable question is architectural: where should access management live in your system?
If it belongs in your application — because you need full control, compliance auditability, or cost optimization at scale — build it there and staff accordingly.
If it belongs at the infrastructure layer — because your team's job is building agent capabilities, not maintaining access management — use a platform that handles it.
TinyFish gives you 500 free steps to test against the site that's actually giving you trouble. No credit card.
FAQ
What protection systems does TinyFish handle?
Managed mode works reliably on sites with standard access controls. Sites using enterprise-grade protection systems have lower consistency. Hard IP-level blocks cannot be handled by any tool. Test with the free tier against your specific target before committing to production volume.
Do I need to configure proxies separately?
No. Residential proxy routing is included in every TinyFish plan at no extra cost. Add proxy_config: { enabled: true, country_code: "US" } to route through a specific geography. Supported countries: US, GB, CA, DE, FR, JP, AU.
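As a sketch, the proxy_config field slots into the same request payload. The field name and country list come from the answer above; everything else in the payload shape is carried over from the earlier (partly hypothetical) example.

```python
payload = {
    "goal": "Extract pricing data from this page",
    "url": "https://example.com/pricing",
    "browser_profile": "stealth",
    # Route through US residential IPs (field documented in the FAQ).
    "proxy_config": {"enabled": True, "country_code": "US"},
}

# Supported geographies per the FAQ.
SUPPORTED_COUNTRIES = {"US", "GB", "CA", "DE", "FR", "JP", "AU"}
assert payload["proxy_config"]["country_code"] in SUPPORTED_COUNTRIES
```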
What happens when access fails?
The infrastructure layer detects failures and attempts reconfiguration automatically — without your input. If reconfiguration succeeds, the task continues. If it fails, the run completes with a failure status including screenshots and execution logs for every step, accessible via the streaming URL.
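A sketch of handling the two terminal outcomes described above. The response shape is assumed for illustration — the status, data, error, and execution_logs fields are invented, not the real API schema.

```python
def handle_result(result: dict) -> str:
    """Return extracted data on success; surface debug info on failure."""
    if result.get("status") == "completed":
        return result.get("data", "")
    # Failed runs ship screenshots and per-step logs for debugging.
    for step in result.get("execution_logs", []):
        print(step)
    raise RuntimeError(f"run failed: {result.get('error', 'unknown')}")
```

The division of labor matters: retries and reconfiguration happen below this function, so application code only decides what to do with a run that has genuinely finished.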
How does managed mode affect speed?
Managed mode is slightly slower than lite mode due to infrastructure-layer processing. Simple extractions typically take 10–30 seconds. Multi-step workflows take 30–90 seconds depending on complexity. For sites without access controls, lite mode is faster and lower cost.
How does TinyFish compare to Browserbase or Firecrawl?
Browserbase provides cloud browsers where you implement access management in your application code via Stagehand or your own scripts — an application-level model. Firecrawl handles basic rendering with a crawl-focused API — a different architectural model from infrastructure-level access handling. TinyFish handles access management at the infrastructure layer, activated with a single parameter. See TinyFish vs Browserbase and TinyFish vs Firecrawl for detailed breakdowns.