Ali Farhat

Posted on • Originally published at scalevise.com

Browser Simulation vs API Access: Why Amazon Is Challenging Perplexity Comet

The first real battle over AI web access has started, and it is not about scraping. It is about identity, control and the future architecture of autonomous AI agents.

Perplexity’s new desktop assistant Comet behaves like a human user with a browser. It can open websites, extract information in real time and act on behalf of the user. Amazon has responded by signalling that this type of access may violate its platform rules.

Developers have discussed scraping, headless browsers and bot detection for years, but Comet is the first mainstream AI tool to turn that into a user-facing product. That changes the stakes. This is no longer about bypassing rate limits. It is about whether an AI system is allowed to act as the user without using an API that the platform controls.

This is a technical conflict disguised as a legal one. To understand where this is going, we need to look at the layers: browser behaviour, headers, cookie identity, user intent, platform policy enforcement and the architectural differences between “acting as a user” and “acting as a bot.”

How Comet actually works

Comet does not use Amazon’s Product Advertising API. It does not request privileged data. It loads the same public pages a user would, but it does so programmatically.

The workflow looks something like this:

  1. Local client receives the user query
  2. Comet sends a headless browser request through its agent layer
  3. The agent behaves like a real browser: headers, cookies, viewport, timing
  4. DOM is parsed and extracted
  5. Results are returned to the user in structured form
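In pseudocode, that workflow might look like the sketch below. All names (`PageResult`, `parse_dom`, `handle_query`) are invented for illustration; Comet's internals are not public, and the fetcher is stubbed so the sketch stays offline.

```python
from dataclasses import dataclass

@dataclass
class PageResult:
    url: str
    title: str
    fields: dict

def parse_dom(html: str) -> dict:
    """Toy DOM extraction: pull the <title> tag.
    A real agent would walk the full DOM tree."""
    start = html.find("<title>")
    end = html.find("</title>")
    if start == -1 or end == -1:
        return {"title": ""}
    return {"title": html[start + len("<title>"):end]}

def handle_query(url: str, fetch) -> PageResult:
    # Steps 1-3: the agent layer fetches the page as a browser would
    html = fetch(url)  # fetcher injected so the example needs no network
    # Step 4: DOM is parsed and extracted
    fields = parse_dom(html)
    # Step 5: results are returned in structured form
    return PageResult(url=url, title=fields["title"], fields=fields)

# Usage with a stubbed fetcher:
result = handle_query(
    "https://example.com/item",
    fetch=lambda u: "<html><title>Example Item</title></html>",
)
print(result.title)  # Example Item
```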

This is not classic scraping. It more closely mirrors “assisted browsing.” The technical premise is simple: if a human user has the right to load a page, an AI acting on behalf of that user should have the same right.

Platforms do not agree.

How Amazon detects the difference

Amazon has multiple defensive layers designed to identify non-human access:

  • Request fingerprinting
  • Header and UA validation
  • Behavioural timing analysis
  • IP and ASN profiling
  • Cookie and session history
  • API-vs-browser access matching
  • Commercial intent detection based on velocity

Even if the agent imitates a normal browser, certain signals eventually give it away: latency patterns, no mouse movement, no asset rendering, zero scroll events, or the fact that the session never logs in but repeatedly fetches structured product data.
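A platform-side detector might combine those signals into a single risk score. The weights and thresholds below are invented for illustration; real anti-bot systems use far richer models.

```python
def bot_risk_score(session: dict) -> float:
    """Toy risk scorer combining the signals listed above.
    Weights and thresholds are invented, not Amazon's."""
    score = 0.0
    if session.get("mouse_events", 0) == 0:
        score += 0.3          # no mouse movement
    if session.get("scroll_events", 0) == 0:
        score += 0.2          # zero scroll events
    if not session.get("assets_rendered", True):
        score += 0.2          # page assets never fetched
    if session.get("requests_per_minute", 0) > 30:
        score += 0.3          # machine-scale velocity
    return min(score, 1.0)

human = {"mouse_events": 14, "scroll_events": 5,
         "assets_rendered": True, "requests_per_minute": 4}
agent = {"mouse_events": 0, "scroll_events": 0,
         "assets_rendered": False, "requests_per_minute": 120}

print(bot_risk_score(human))  # 0.0
print(bot_risk_score(agent))  # 1.0
```

No single signal is decisive; it is the combination, sustained over a session, that separates human browsing from an agent.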

For years this was a private cat-and-mouse game between scrapers and platform anti-bot systems. Comet moved it into the open because the end user now expects the agent to act as a browser. That forces platforms to respond publicly rather than silently rate-limit or block.

Why the API is not the solution

Every platform makes the same argument: if you want data, use the official API.

There are four problems with that position:

  1. APIs expose filtered data, not the real interface
  2. APIs enforce business rules that benefit the platform, not the user
  3. APIs can be revoked, rate-limited or paywalled at any moment
  4. APIs do not allow an AI agent to behave as a full substitute for browsing

An API gives access to data. A browser gives access to reality.

Comet is built on the idea that the web itself is the API. Amazon is built on the idea that the API is the only compliant interface.

That ideological gap is what makes this more than a technical issue.

Legal terms were not written for autonomous agents

Most Terms of Service assume only two actors exist:

  • A human user who browses normally
  • A bot or scraper that is not authorised

Comet breaks that binary. It is neither a bot nor a human. It is a delegated user agent. And no mainstream legal framework has a defined category for that.

The question now being tested:

If a user is allowed to view a page, is an AI assistant acting on that user’s device also allowed?

If the answer becomes no, the web is no longer open. It becomes a permissioned space where the browser must be approved the same way an API client is approved.

That would reshape the entire future of AI agents.

The architecture problem for developers

If you are building an AI agent today, you have three architectural choices:

1. API-only integration

Safe, reliable, limited, bound by the platform’s commercial incentives.

2. Browser simulation with headless clients

Flexible, real-time, but now legally exposed and detectable.

3. Hybrid model

Use APIs for policy-sensitive data, browser access for public-page logic and fallbacks when data is missing.

The Comet incident shows that option two will not scale without conflict. Option three is where most serious AI systems will land, because pure API dependence is a single point of business failure.
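The hybrid model reduces to a simple access policy: try the API first, fall back to page extraction only when the API fails or has no answer. Both clients below are hypothetical interfaces, stubbed for illustration.

```python
def fetch_product(product_id: str, api_client, browser_client) -> dict:
    """Hybrid access: prefer the official API, fall back to
    public-page extraction only when the API has no answer.
    Both clients are hypothetical callables."""
    try:
        data = api_client(product_id)
        if data is not None:
            return {"source": "api", "data": data}
    except Exception:
        pass  # API revoked, rate-limited, or paywalled
    return {"source": "browser", "data": browser_client(product_id)}

# Stubbed usage:
api_ok = lambda pid: {"price": 19.99}
api_gone = lambda pid: None
via_page = lambda pid: {"price": 19.99, "raw_html": "..."}

print(fetch_product("B000", api_ok, via_page)["source"])    # api
print(fetch_product("B000", api_gone, via_page)["source"])  # browser
```

Tagging every result with its `source` is what later makes compliance review possible: you can always say which path produced which data.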

Compliance risk for companies adopting agents

If your company uses a tool like Comet to gather product, pricing or research data, you are not shielded by the vendor. Terms of service often assign liability to the user making the request, not the software provider.

That means:

  • A blocked agent can break your workflow without warning
  • A platform can issue legal notice directly to your organisation
  • Your internal compliance team has no visibility into how the agent fetched the data

Until governance models mature, using autonomous browser-based agents in production environments is a policy risk, not just a technical one.
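One partial mitigation is request-level traceability: record, per request, how the data was fetched. A minimal audit entry might look like this; the field names are illustrative, not a standard.

```python
import json
import time

def audit_record(url: str, method: str, agent_id: str) -> str:
    """Minimal per-request audit entry so a compliance team can
    see how each piece of data was fetched. Field names are
    illustrative, not a standard schema."""
    entry = {
        "ts": int(time.time()),
        "url": url,
        "access_method": method,   # "api" or "browser"
        "agent_id": agent_id,
    }
    return json.dumps(entry, sort_keys=True)

line = audit_record("https://example.com/p/1", "browser", "agent-42")
print(line)
```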

Why platforms care more than ever

AI agents change the economics of web access. A single human browsing a website creates almost no load. A single AI assistant acting for thousands of users creates continuous machine-driven traffic.

Platforms do not want an ecosystem where they host the data, pay the bandwidth bill and lose the commercial surface to a third-party AI interface. The moment Comet became real, platform incentives changed.

This is not about scraping costs. It is about losing control of the customer journey.

What happens next

There are three plausible outcomes.

Scenario 1: Platform lock-in wins

AI agents must sign TOS-bound API agreements. The open web becomes a read-only surface for humans only.

Scenario 2: User rights extend to agents

If a user can browse, their agent can browse. Browser simulation becomes legally protected.

Scenario 3: Layered access

Public pages remain open, but automated extraction above a certain frequency is blocked unless whitelisted.

The most realistic outcome is scenario three. It preserves platform control but avoids full lockdown.
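Scenario three could be enforced with something as simple as a per-caller sliding window: requests under the threshold pass, bursts above it are blocked unless the caller is whitelisted. The limits below are invented for illustration.

```python
from collections import deque

class FrequencyGate:
    """Sketch of layered access: public pages stay open, but
    extraction above a per-window frequency is blocked unless
    the caller is whitelisted. Limits are invented."""

    def __init__(self, max_per_window, window_s, whitelist=None):
        self.max = max_per_window
        self.window = window_s
        self.whitelist = whitelist or set()
        self.hits = {}  # caller -> deque of request timestamps

    def allow(self, caller, now):
        if caller in self.whitelist:
            return True
        q = self.hits.setdefault(caller, deque())
        while q and now - q[0] > self.window:
            q.popleft()          # drop hits outside the window
        if len(q) >= self.max:
            return False         # over the frequency threshold
        q.append(now)
        return True

gate = FrequencyGate(max_per_window=3, window_s=60)
print([gate.allow("agent-1", t) for t in (0, 1, 2, 3)])
# [True, True, True, False]
```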

What developers should prepare for

  1. Expect stricter bot detection

    Headless browsing will require continuous adaptation.

  2. Build agent identity layers

    Agents will need signed tokens tied to user accounts, not anonymous requests.

  3. Keep API fallback paths ready

    If a platform blocks simulated browsing, the product cannot fail silently.

  4. Log everything

    Legal and compliance review will need full traceability per request.

  5. Avoid single-platform dependencies

    A shutdown from one provider must not break your core features.
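Point 2, the agent identity layer, could for example use HMAC-signed tokens that bind agent traffic to a user account instead of sending anonymous requests. This is a sketch of the idea, not an existing protocol; in practice the key would be issued by the platform.

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-key"  # assumption: a shared key issued by the platform

def sign_agent_token(user_id: str, agent_id: str) -> str:
    """Bind an agent request to a user account with an HMAC
    signature, instead of anonymous traffic."""
    payload = json.dumps({"user": user_id, "agent": agent_id},
                         sort_keys=True).encode()
    sig = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    return base64.urlsafe_b64encode(payload).decode() + "." + sig

def verify_agent_token(token: str):
    """Return the claims if the signature checks out, else None."""
    b64, _, sig = token.rpartition(".")
    payload = base64.urlsafe_b64decode(b64.encode())
    expected = hmac.new(SECRET, payload, hashlib.sha256).hexdigest()
    if not hmac.compare_digest(sig, expected):
        return None  # tampered or unsigned traffic
    return json.loads(payload)

token = sign_agent_token("user-7", "comet-like-agent")
print(verify_agent_token(token))  # {'agent': 'comet-like-agent', 'user': 'user-7'}
```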

This dispute is not an edge case. It is the first domino.

Why this matters more than scraping

The debate is not about whether Perplexity violated a ToS line item. It is about whether the next generation of interfaces will belong to platforms or users.

If AI agents cannot browse, they cannot replace the front end. If APIs control access, platforms control which features AI systems are allowed to deliver.

That would mean the web remains technically open but practically gated.

The Amazon vs Perplexity Comet situation is not a lawsuit or a shutdown. It is a signal. Platforms are preparing to enforce rules against autonomous web agents at scale. Anyone building AI-driven retrieval, research or product analysis tools will face the same constraints.

The companies that survive will not be the ones with the best model. They will be the ones with the most resilient access architecture.


Want a breakdown of how to design compliant agent architectures that avoid lock-in?

Read more on Scalevise about AI agent governance and workflow automation.

Top comments (8)

HubSpotTraining

You’re oversimplifying the “user delegation” argument. If an agent can impersonate a user and bypass rate limits, then it is a bot whether you call it delegation or not. Every fraud system in the world treats it that way.

Ali Farhat

Fair point. The term “delegation” won’t survive once risk engines classify the traffic as synthetic. What matters is not the intention, but the signal profile. If the traffic pattern is machine-scaled, it will get flagged regardless of the legal framing.

Alexander Ertli

Let's see where this is going...

I'm betting on stricter bot detection plus user rights extending to agents, but with more CAPTCHAs to solve.

That's the only thing that's logical to me personally.

Ali Farhat

The real question isn’t whether agents can solve CAPTCHAs.
It’s whether platforms will legally allow an agent to be treated as a first-class user.

BBeigth

Platforms don’t block headless browsers because of risk. They block them because they kill monetisation and offload costs onto the host.

Ali Farhat

That’s accurate. The enforcement layer is framed as security, but the root driver is economic asymmetry. Agents externalise compute, bandwidth and UI while extracting value. No platform will allow that indefinitely.

Rolf W

The whole “web is the API” argument collapses the moment the traffic becomes industrialised. Public HTML is not a rights contract. Platforms have zero obligation to support unapproved machine access.

Ali Farhat

Correct. Public visibility is not the same as programmatic entitlement. Once autonomous agents industrialise retrieval, platforms will enforce scarcity through policy or protocol. That is the economic layer people ignore.