Eli

Posted on May 15 • Originally published at blog.browserman.run

Trust-sensitive agents need visible friction

#ai #agents #browserautomation #security

Background workers are great until the agent is about to submit something under your name.

That is the line where automation changes character.

For low-risk work, the best interface is often no interface. Let the agent classify the lead, summarize the page, enrich the record, monitor the inbox, or draft the first pass. If the action is reversible, internal, and easy to inspect later, hiding the machinery is a feature.

But a different kind of workflow is arriving: agents that apply for jobs, publish content, issue refunds, update customer records, book travel, send replies, change dashboards, and operate logged-in web apps.

Those are not just browser tasks. They are trust-sensitive actions.

The product problem is not removing every bit of friction. It is deciding where friction belongs.

The browser is where authority lives

Agents want browsers because the browser is where work already happens.

APIs are incomplete. Internal tools are inconsistent. OAuth scopes are often too coarse. The real state of a workflow is usually scattered across tabs, forms, dashboards, inboxes, and account-specific UI.

A logged-in browser session can already do the thing. That is why it is useful. It is also why it is risky.

A useful browser bridge keeps auth and session state out of the prompt while still letting the agent work. That separation is the point: the agent gets a tool boundary, not a pasted credential bundle.

The boundary is not only “can it act?” It is also “what credentials does it silently inherit?”

Once an agent can use a real browser session, it is not just controlling pixels. It is borrowing authority: the user’s account, the company’s account, the brand’s account, the customer support account, the payment dashboard, the CMS, the CRM.

That is the part most demos skip.

The demo asks:

Can the agent click the button?

The production question is:

What should the agent have to prove before it clicks?

Background workers are not always the right shape

A useful signal came from a builder working on a job-application agent. They had built the normal backend shape: queue, workers, background tasks, submit pipeline. Then they ripped it out and kept the browser extension.

The reason was not that workers are bad. The reason was that job applications are trust-sensitive. Submitting under a person’s name needs a surface the user can see and interrupt.

That distinction generalizes.

A worker queue is great for:

collecting public information;
extracting structured data;
drafting content;
checking status;
classifying tickets;
preparing a recommendation.

A visible surface becomes important when the agent is about to:

submit a job application;
publish under a brand account;
send a customer reply;
process a refund;
update a CRM record;
pay, book, delete, or change account state.

The final mile matters because the action is no longer just computation. It changes the world under a real identity.

Ask what the agent can prove before it clicks

A smart pointer is not just a better cursor. It is a permission boundary.

Before an agent clicks, the product should be able to answer questions like:

Which account is this action using?
Which page, customer, order, or record is about to change?
Is this action reversible?
Is the dollar amount below the approval threshold?
Is this a draft, or is it about to be published?
Did the agent inspect the right evidence?
Is the user seeing the same final state the agent is acting on?

For browser agents, the click is often the last step in a chain of assumptions. The useful interface is not a giant approval modal for every action. It is a way to make the critical assumptions visible at the moment they matter.
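One way to make those assumptions checkable is to have the agent state them as data just before it acts. The sketch below is illustrative only: `PendingClick`, `unmet_assumptions`, and every field name are hypothetical, not part of any real browser-agent API. It covers only the questions about account, record, draft versus publish, and evidence.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class PendingClick:
    """What the agent believes just before it acts (all fields hypothetical)."""
    account: str                    # identity the browser session is logged in as
    planned_record: str             # record the agent decided to change
    visible_record: str             # record actually open on the page right now
    publishes: bool                 # True if the click publishes or sends
    evidence: tuple[str, ...] = ()  # pages or documents the agent inspected


def unmet_assumptions(click: PendingClick, expected_account: str,
                      publish_intended: bool) -> list[str]:
    """Return the assumptions that do not hold; an empty list means proceed."""
    problems = []
    if click.account != expected_account:
        problems.append(f"wrong account: {click.account}")
    if click.planned_record != click.visible_record:
        problems.append("page shows a different record than the one planned")
    if click.publishes and not publish_intended:
        problems.append("click would publish, but only a draft was requested")
    if not click.evidence:
        problems.append("no evidence was inspected")
    return problems
```

Anything the function returns is what the interface would surface to the user, instead of a generic "Approve?" modal.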

A refund example makes this concrete.

If a support agent processes a $4,500 refund with no policy, no human in the loop, and no rule saying “anything over $500 needs approval,” the failure is not just model quality. The failure is the permission boundary.

The product allowed a high-blast-radius action to look like an ordinary tool call.

Approval gates should be thresholded, not everywhere

The wrong lesson is: make humans approve everything.

That turns an agent into a slow workflow UI. It also teaches users to click through approvals without thinking.

The better lesson is: place friction according to blast radius.

A practical policy model might separate actions by:

read vs write;
reversible vs irreversible;
internal-only vs customer-facing;
low-dollar vs high-dollar;
draft vs publish/send/submit;
ordinary path vs exception;
one record vs bulk action.

Most agent workflows need lanes, not one giant permission switch.

Read-only browser work can be broad. Drafting can be relatively free. Low-risk routine actions can be automated. But actions that spend money, affect customers, publish publicly, delete data, or submit under someone’s name should slow down.
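As a rough sketch, the lanes could be expressed as a small policy function. Everything here is an assumption made for illustration: the `Action` fields mirror the dimensions listed above, the `Lane` names are invented, and the $500 default simply reuses the threshold from the refund example.

```python
from dataclasses import dataclass
from enum import Enum


class Lane(Enum):
    READ = "read"          # broad access, no approval needed
    AUTOMATE = "automate"  # routine write that runs without a pause
    PAUSE = "pause"        # visible stop where a human can interrupt


@dataclass(frozen=True)
class Action:
    writes: bool = False           # read vs write
    reversible: bool = True        # reversible vs irreversible
    customer_facing: bool = False  # internal-only vs customer-facing
    amount_usd: float = 0.0        # low-dollar vs high-dollar
    finalizes: bool = False        # draft vs publish/send/submit
    is_exception: bool = False     # ordinary path vs exception
    record_count: int = 1          # one record vs bulk action


def lane_for(action: Action, max_auto_usd: float = 500.0,
             max_auto_records: int = 1) -> Lane:
    """Pick a lane from blast radius rather than one global permission switch."""
    if not action.writes:
        return Lane.READ
    high_blast_radius = (
        not action.reversible
        or action.customer_facing
        or action.finalizes
        or action.is_exception
        or action.amount_usd > max_auto_usd
        or action.record_count > max_auto_records
    )
    return Lane.PAUSE if high_blast_radius else Lane.AUTOMATE
```

Under these assumptions a draft edit lands in `Lane.AUTOMATE`, while the $4,500 customer refund lands in `Lane.PAUSE`. A real policy would likely weigh the dimensions differently per workflow.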

The point is not to make agents timid. The point is to keep autonomy legible.

Completion is not the click

There is another quiet failure mode in business automation: false completion.

A refund clicked in Stripe is not necessarily finished. The customer may still need a note. Access may need to be removed. A duplicate refund check may still be pending. The CRM may need a record. The support thread may need a final reply.

The browser click can be successful while the workflow is incomplete.

That means receipts matter.

After the action, the agent should leave behind enough context for a human or another system to understand what happened:

what changed;
where it changed;
which evidence was used;
which policy allowed it;
whether follow-up tasks remain;
where to inspect or reverse the action if needed.

For trust-sensitive workflows, “the agent says it did it” is not a receipt.

A receipt is browser state, tool calls, before/after context, policy decisions, and artifacts that line up. It should be replayable enough that someone can reconstruct the action months later: which tool was called, on what input, under what authority, and what changed.
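A minimal sketch of such a receipt, assuming a flat record per action. The `Receipt` type, its field names, and the helper functions are hypothetical and not taken from any existing tool.

```python
import json
from dataclasses import asdict, dataclass, field


@dataclass(frozen=True)
class Receipt:
    tool: str         # which tool was called
    tool_input: dict  # on what input
    authority: str    # under which account or session
    policy: str       # which policy allowed the action
    before: dict      # relevant state before the action
    after: dict       # relevant state after the action
    evidence: list = field(default_factory=list)    # what the agent inspected
    follow_ups: list = field(default_factory=list)  # tasks still open
    inspect_at: str = ""                            # where to review or reverse


def changed_fields(receipt: Receipt) -> dict:
    """Map each field that differs to its (before, after) pair."""
    keys = sorted(set(receipt.before) | set(receipt.after))
    return {
        key: (receipt.before.get(key), receipt.after.get(key))
        for key in keys
        if receipt.before.get(key) != receipt.after.get(key)
    }


def workflow_complete(receipt: Receipt) -> bool:
    """A successful click is not completion while follow-ups remain."""
    return not receipt.follow_ups


def to_json(receipt: Receipt) -> str:
    """Serialize with stable key order so receipts can be stored and diffed."""
    return json.dumps(asdict(receipt), sort_keys=True)
```

Keeping `follow_ups` on the receipt itself is one way to stop a clicked refund from being reported as a finished workflow.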

The BrowserMan angle: delegated browser authority

BrowserMan should not be framed as “an agent can click websites.” That is becoming table stakes.

The stronger frame is delegated browser authority.

A real Chrome session already contains trust. It has the cookies, tabs, login state, and messy context that make work possible. BrowserMan gives agents controlled access to that session while keeping cookies local and letting access be scoped, audited, approved, and revoked.

That means the important design questions are not only:

Can the agent navigate?
Can it click?
Can it type?

They are:

Which browser session can this agent use?
Which job is it allowed to perform?
Which actions need a visible pause?
What proof is required before the click?
What receipt exists after the action?
How does the user take access back?
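Several of these questions fit in one small data structure. The sketch below is hypothetical and does not describe BrowserMan's actual API: `SessionGrant`, its fields, and the string results are invented to show scoping, visible pauses, and revocation side by side.

```python
from dataclasses import dataclass


@dataclass
class SessionGrant:
    """One agent's delegated access to one browser session (hypothetical)."""
    agent_id: str
    session_id: str
    allowed_jobs: frozenset   # jobs this agent may perform
    pause_actions: frozenset  # actions that require a visible pause
    revoked: bool = False

    def check(self, job: str, action: str) -> str:
        """Return 'denied', 'pause', or 'allowed' for a requested action."""
        if self.revoked or job not in self.allowed_jobs:
            return "denied"
        if action in self.pause_actions:
            return "pause"
        return "allowed"

    def revoke(self) -> None:
        """The user takes access back; every later check is denied."""
        self.revoked = True
```

For example, a grant scoped to a `"support"` job with `"refund"` in `pause_actions` would allow a page read, pause on a refund, and deny everything once `revoke()` is called.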

The browser session is authority.

The future of browser agents is not just better clicking. It is better delegation.
