David Loibner

Posted on Jun 10

Agent workflows need an impact boundary

#ai #agents #security #devtools

In part 1, I wrote about why coding agents should not hold write credentials.

GitHub was the example, because the problem is easy to see there. A coding agent can read a repository, reason about a change, and produce useful work. But if the same agent also owns the token that creates branches, commits, or pull requests, the proposal and the authority to create impact are too close together.

The problem is not only GitHub.
The problem is the moment where an agent request becomes an external effect.

Agents are getting more useful because they can use tools. They can read files, call APIs, update tickets, prepare emails, run commands, inspect systems, and sometimes change state. That is exactly why the boundary matters more, not less.

The question is not only:

Can the agent use this tool?

The more important question is:

Should this specific request become impact now?

That is the missing layer I keep coming back to.

Tool access is not impact permission

A tool can be visible to the agent. The call can be valid. The arguments can be well formed. The agent can even have a reasonable goal.

Still, this does not automatically mean the requested effect should happen.

A GitHub tool may create a pull request. A database tool may update or delete rows. A cloud tool may deploy a configuration. An email tool may send a message. In all of these cases, the tool is not only returning information. It can change a system that someone cares about.

This is where I think many agent workflows are still too flat. They often treat tool access as if it already contained the whole decision. If the agent can call the tool, the tool executes. If the call succeeds, the system moves on.

That may be acceptable for many read-only or low-risk operations. But for tools that create external effects, I think there should be another step between the agent request and the target system.

The agent should be able to propose work. But the fact that a tool exists should not mean that every valid tool call becomes impact.

The missing layer is admission

The distinction that helped me most is this:

Scope defines what is possible.
Admission decides what is allowed now.
Logs record what happened.

These are related, but they are not the same thing.

Scope is mostly about design-time limits. Which tools are visible? Which paths are available? Which actions are impossible from the beginning? This is useful and necessary, because an agent should not even see tools or data it does not need.

Admission is different. It is about the concrete request in the current situation. The question is not only whether the operation exists or whether the agent generally has access to it. The question is whether this requested effect is allowed now, under the current state, scope, and policy.

An event log comes after that. It helps reconstruct what happened, which is important for audit and debugging. But a good history of what happened is not the same as a decision before impact.
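To make the three roles concrete, here is a minimal Python sketch. The names (`Request`, `Boundary`, `frozen_targets`) are hypothetical and only illustrate the separation; they do not describe a specific implementation. Scope is a fixed set of visible tools, admission checks the current situation, and the log records the decision afterward.

```python
from dataclasses import dataclass, field


@dataclass(frozen=True)
class Request:
    tool: str    # hypothetical tool name, e.g. "create_pr"
    target: str  # hypothetical target identifier, e.g. "repo/main"


@dataclass
class Boundary:
    # Scope: which tools are possible at all (a design-time limit).
    scope: frozenset[str]
    # Current state used by admission; here just a set of frozen targets (an assumption).
    frozen_targets: set[str] = field(default_factory=set)
    # Log: what was requested and what was decided.
    log: list[tuple[str, str, str]] = field(default_factory=list)

    def admit(self, req: Request) -> bool:
        if req.tool not in self.scope:
            decision = "out_of_scope"   # never possible for this agent
        elif req.target in self.frozen_targets:
            decision = "denied_now"     # possible in general, not allowed right now
        else:
            decision = "admitted"
        # The log entry is written after the decision and does not replace it.
        self.log.append((req.tool, req.target, decision))
        return decision == "admitted"
```

The same in-scope request can be admitted at one moment and denied at another, which is the difference between scope and admission.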

In a normal system this may sound obvious. In agent workflows it is easy to miss, because the agent often sits directly in front of powerful tools. The tool call becomes the action. The action becomes the outcome. The boundary is only visible afterward, when something needs to be explained, reverted, closed, or cleaned up.

That is the part I think should move earlier.

State matters

A request cannot be judged only by its name.

The same operation may be fine in one state and wrong in another. A pull request against the expected branch head may be acceptable, while the same proposed change against stale repository state should be blocked or sent back. Updating test data may be harmless, while the same update against production may not be. Saving an email as a draft may be fine, while sending the same message to real users may require review.

The operation is the same in a rough sense, but the situation is not.

That is why an agent request should be tied to the state it was based on. It should not be enough that the agent says it looked at the system. The decision layer should know, at least for the relevant parts, what state the agent was allowed to observe.

If the state has changed, the right answer should not be to guess and continue. It should be a structured conflict. The agent can then re-read the state and submit a new request.

This is not only a defensive mechanism. It also gives the agent useful feedback. The system does not have to say that the goal is forbidden. It can say that the proposal is stale.
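A stale proposal could be answered with a structured conflict like the one in this small Python sketch. The field names (`expected_head`, `current_head`) and the reason code `stale_base` are assumptions made for the example, using the pull request case from above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PullRequestIntent:
    repo: str
    base_branch: str
    expected_head: str  # the commit the agent based its proposal on


@dataclass(frozen=True)
class Decision:
    status: str             # "admitted" or "conflict"
    reason: str = ""        # machine-readable reason code
    current_head: str = ""  # tells the agent which state to re-read


def check_freshness(intent: PullRequestIntent, current_head: str) -> Decision:
    """Compare the state the agent saw with the state that exists now."""
    if intent.expected_head != current_head:
        # The goal is not forbidden; the proposal is just based on old state.
        return Decision("conflict", "stale_base", current_head)
    return Decision("admitted")
```

The conflict carries enough information for the agent to re-read the branch and submit a new request, instead of guessing and continuing.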

That distinction matters if we want agents to work better inside boundaries instead of simply failing at them.

Per request, not per session

I also think the unit of permission matters.

A broad session permission is convenient. It can say that a certain agent session is allowed to write, deploy, send, or modify something for a limited time. For human workflows this kind of model is often acceptable, because the human user carries context and responsibility through the session.

For an agent, a session can become a temporary impact window.

The agent may retry. It may misunderstand a previous result. It may keep going from stale assumptions. It may call the same tool again with slightly changed arguments. If the session still has broad authority, the system may know which agent acted, but it has not decided each effect separately.

This is why I prefer the request or intent as the unit of decision.

Not:

this agent session may write for the next ten minutes

but rather:

this requested effect is allowed under this state, scope, and policy

That is a narrower form of authority. It does not prevent the agent from working. It only prevents broad access from silently turning into broad impact.
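As a rough illustration, the sketch below binds an admission to one exact request and lets it be used once. `AdmissionLedger` and `request_digest` are hypothetical names, and hashing a canonical JSON form is just one possible way to identify a request.

```python
import hashlib
import json


def request_digest(request: dict) -> str:
    """Stable fingerprint of a structured request (canonical JSON, then SHA-256)."""
    canonical = json.dumps(request, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


class AdmissionLedger:
    """Tracks admissions per request instead of per session."""

    def __init__(self) -> None:
        self._open: set[str] = set()

    def admit(self, request: dict) -> str:
        # Called only after the decision layer has allowed this exact request.
        digest = request_digest(request)
        self._open.add(digest)
        return digest

    def consume(self, request: dict) -> bool:
        # An admission covers one effect; a retry or a changed payload needs a new decision.
        digest = request_digest(request)
        if digest in self._open:
            self._open.remove(digest)
            return True
        return False
```

A second call with the same payload, or a call with slightly changed arguments, is not covered by the earlier admission and goes back through the decision.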

Agent retries are different

The same issue appears with idempotency.

In distributed systems, idempotency often protects against technical retries. A request times out, a response is lost, or a client sends the same request twice. The system should not create the same effect twice just because the transport was unreliable.

Agents retry for messier reasons.

They may reword the same goal. They may generate a slightly different payload. They may call another tool. They may try again because they did not understand that the previous attempt already created a pending result.

In that case, the prompt or payload may be different, while the intended effect is still the same.

This does not mean the boundary should magically guess meaning from free text. Guessing would be too unreliable to serve as a control. A better approach is to make the agent submit a more structured request before impact is possible. The system should decide on the target, operation class, expected state, and requested outcome, not only on the natural-language prompt that produced it.

Then it becomes possible to ask better questions.

Is this effect already pending? Was it already completed? Did a previous attempt partially create it? Should the existing outcome be reused? Should the agent re-read state first?

Tool idempotency protects the request path.

Intent-level idempotency protects the workflow from repeated attempts toward the same effect.
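A minimal sketch of intent-level idempotency, assuming the structured fields mentioned above (target, operation class, expected state, requested outcome) are used as the key. `Intent` and `IntentRegistry` are hypothetical names, and the registry is kept in memory only to keep the example short.

```python
from dataclasses import dataclass
from enum import Enum


class Status(Enum):
    PENDING = "pending"
    COMPLETED = "completed"


@dataclass(frozen=True)
class Intent:
    # Structured fields identify the effect; the prompt text is deliberately not part of the key.
    target: str
    operation_class: str
    expected_state: str
    requested_outcome: str


class IntentRegistry:
    def __init__(self) -> None:
        self._seen: dict[Intent, Status] = {}

    def submit(self, intent: Intent) -> str:
        status = self._seen.get(intent)
        if status is Status.PENDING:
            return "already_pending"         # an earlier attempt is still in flight
        if status is Status.COMPLETED:
            return "reuse_existing_outcome"  # the effect already exists
        self._seen[intent] = Status.PENDING
        return "accepted"

    def complete(self, intent: Intent) -> None:
        self._seen[intent] = Status.COMPLETED
```

Two attempts with different wording but the same structured fields map to the same entry, so the second attempt gets an answer about the existing effect instead of creating a new one.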

This is not a replacement for review

An impact boundary does not prove that the agent is right.

It does not prove that generated code is good. It does not prove that a database change is meaningful. It does not prove that a deployment is a good idea. Human review, tests, domain knowledge, and normal engineering judgement are still needed.

The claim is narrower.

The agent should not be the component that turns its own request into external state change.

It can reason. It can propose. It can use tools. But when the requested action changes something outside the model, there should be a separate decision before impact.

That decision may be simple in low-risk cases. It may only check scope and freshness. In higher-risk cases it may require approval, reuse an existing outcome, or block the request. The exact implementation can differ between systems.
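The range of outcomes can be sketched as a small decision function. The inputs are simplified to booleans and the ordering of the checks is an assumption; a real boundary would derive them from scope, state, and policy.

```python
from enum import Enum


class Outcome(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    REUSE_EXISTING = "reuse_existing"
    BLOCK = "block"


def decide(*, in_scope: bool, fresh: bool, high_risk: bool,
           already_done: bool, approved: bool) -> Outcome:
    """Hypothetical decision order: scope and freshness first, then reuse, then risk."""
    if not in_scope or not fresh:
        # Simplified: a stale request is blocked here; it could also be returned as a conflict.
        return Outcome.BLOCK
    if already_done:
        return Outcome.REUSE_EXISTING
    if high_risk and not approved:
        return Outcome.REQUIRE_APPROVAL
    return Outcome.ALLOW
```

In the low-risk path only scope and freshness are checked, which keeps the boundary cheap where little can go wrong.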

The architectural point stays the same.

Tool access should not become impact by default.

Why I care about this

I do not think production agent systems will become trustworthy only because the models get better or the tool interfaces get cleaner.

Better models help. Cleaner interfaces help. Sandboxes help. Logs help. Reviews help.

But when agents start acting on real systems, there is still one question that needs its own place:

Who decides what becomes impact?

My answer is not that agents should be kept away from tools. The opposite is probably true. Agents become useful because they can interact with systems and do real work.

But useful work needs boundaries.

For human work, we already know this. We use roles, reviews, limits, approvals, and audit trails. We do not treat every possible action as automatically allowed just because someone can technically perform it.

Agent workflows need the same idea in a machine-readable form.

Let agents request work.
Let them propose changes.
Let them use tools.

But before their work changes an external system, there should be an impact boundary.

That is the layer I think is missing in many agent workflows.

Project: Impact Boundary Labs