DEV Community: David Loibner

Six articles, 200 views. So I read the feed's source code.

David Loibner — Wed, 29 Jul 2026 13:15:04 +0000

Six articles in two months. About 200 views. Total.

Here is the full table, because a postmortem without numbers is just a mood:

#	Article	Views	Reactions	Comments
1	Coding agents should not hold write credentials	58	1	5
2	Agent workflows need an impact boundary	26	0	0
3	Blocked is not failed: agents need boundary feedback	31	1	0
4	A write is not just a write	38	0	1
5	The reasoning was right, but the world shifted	29	1	2
6	The write was safe. The context was not.	<25	1	0

A series is supposed to compound. Each part should bring readers to the next one. Mine did the opposite.

The posts have not been live for the same amount of time, so this is not a controlled comparison. But no later part shows a clear sign of inheriting an audience from the one before. Two months in, the most-read article is still the first.

The comfortable explanation is that the articles are bad.

Maybe. A few readers did leave substantive comments, and another author picked up one of the terms from the series. That was encouraging, but two hundred views is far too small a sample to separate writing quality from distribution.

The honest version is narrower:

The writing never got a real test.

Another explanation is that quiet, conceptual titles do not work in a feed. That is not enough either. "your agent can think. it can't remember." had 144 reactions when I checked, so this kind of title can work.

That does not mean my specific titles worked. So I stopped guessing and looked at the platform.

The feed has source code

dev.to runs on Forem, and Forem is open source. Part of the logic behind its personalized feed is public. The repository contains the query builder, the feed variants, and the checked-in experiment mix.

I write about systems that make decisions from rules, state, and context. Then I sent six articles into another rule-driven system without reading its rules once.

So I read them.

One caveat before the numbers. The checked-in experiment config assigns 70 percent to one variant and 10 percent to each of three others, but that does not prove which exact configuration every DEV reader saw when my articles were published.

The irony of quoting potentially stale config in a series about stale state is not lost on me. Treat these numbers as a public model of the feed, not as a live production trace.

What the 70-percent public variant says

The relevance score is a product, not a sum. The query builder multiplies the configured factors, which means that one value near zero can sharply pull the candidate score down even when the others are strong.

For signed-in readers, discovery is also time sensitive in this variant. The age factor is 1.0 on publication day, 0.2 on day seven, and 0.02 on day eight. Candidates can still be considered for about two weeks, but the first days carry most of the weight of this factor.

Discussion on the article is another factor. Zero comments score 0.15 on that component, while two comments score 0.66. That makes the comments component about 4.4 times larger, but it does not mean that the article becomes 4.4 times more visible. The final ordering also uses a feed-success score, a clickbait penalty, and a small random term.

The feed is personal too. In the 70-percent public variant, the same article gets a factor of 1.0 for a signed-in reader who follows the author and 0.01 for one who does not. User-specific factors are skipped for anonymous readers, so the same article can rank very differently for different people.

Reactions, tag matches, language, and recommendation signals also take part in the calculation. No single factor explains the feed, but several weak ones can combine.

My series, read against that shape

The code does not explain the table, but it makes one mistake in my process easier to see.

I had almost no existing audience. Several posts drew no discussion, and for the others I do not know whether the comments arrived early enough to matter. Most of the time, I had no repeatable distribution process. I published, waited, and started writing the next part.

The few times I joined related discussions were also the times the work reached the right readers.

The feed does not have to reject an article. In this public model, silence can simply become another multiplier.

The best discussions around my work did not happen under my own articles. They started under other people's posts, where I showed up as a reader with something relevant to add.

That is not a ranking insight. It is a simpler one: I got more from joining an existing conversation than from pressing publish and waiting. My process mostly stopped at publish.

The title lesson is mine too. For an unknown author, the title carries most of the first impression. One of my weakest entrances was "Agent workflows need an impact boundary."

It gives the conclusion before it gives a stranger the problem.

The uncomfortable part

Article five, "The reasoning was right, but the world shifted", argues that an agent must not act on a world it never actually checked.

Article six, "The write was safe. The context was not.", makes a related point about the context an agent reasons from.

Both are about AI agents. They also turned out to be about me on dev.to.

The writing may have been fine, but it never reached enough readers to know. I had not read the context I was publishing into, so the reasoning could have been right while the world it shipped into was still a guess.

What I actually learned

I still do not know exactly why these six articles stayed small. The writing may be part of it, the titles may be part of it, and starting with almost no audience certainly did not help.

Reading the code did not settle any of that. It made one mistake clear:

I had measured views. I had not measured distribution.

I wrote six connected ideas and expected the series label to distribute them.

I did not give each article its own entrance, and I also treated publish as the end of the work.

A series is supposed to compound.

Mine kept asking every article to find its first reader again.

I know now that publishing is not enough. I still do not know how to do the rest without taking time away from the writing.

The write was safe. The context was not.

David Loibner — Wed, 22 Jul 2026 14:38:05 +0000

A request can be perfectly bound to state and still be built on the wrong view of the world.

In part 5, I wrote about binding an intent to the state it was based on. If the branch moved after the agent read it, the request should stop.

That check works. But it runs late.

By the time a request arrives at the write path, the agent has already decided what to propose. That decision was made earlier, based on whatever the agent happened to read.

And nothing in the system described what it had been allowed to read. It opened files, followed imports, and pulled in whatever looked relevant. The intent was formed from that.

A request can be perfectly bound to state and still be built on the wrong view of the world.

The earlier parts of this series treated reading as a precondition. The agent has to see something before it can propose anything. Reading was never treated as an operation that needs its own control.

I think that was a gap.

A read is not as passive as it looks

Start with the obvious version. Reading does not change the target.

Except that it often does. Opening an email can mark it as read. Fetching a record can bump a counter, write an access log, or cost money. Something changed, just by looking at it.

But that is the small problem.

The bigger one is that every read changes the agent. What it sees becomes what it knows, and what it knows becomes what it proposes next.

Read access is also not one permission.

Reading a repository is not the same as reading deployment secrets. Reading an email subject is not the same as downloading every attachment. Reading one record is not the same as exporting a table.

These are all called "read". The word is too coarse.

Whoever writes the content gets a say in the plan

Prompt injection is usually described as bad instructions tricking a model. It is also a question of what the agent was allowed to read in the first place.

A ticket, an email, or a file is not passive text to a language model. Content can read like an instruction. Task data and control signals arrive through the same channel.

So if an agent freely decides what to pull into its context, anyone who can leave text where the agent will look has a say in what it does next. Whether it is a comment on an issue, a line in a config file or in a support ticket from a stranger.

The email can be fetched correctly, be the current version, and come from a known sender. It can still carry instructions meant to steer the agent.

An allowed read is not a safe read.

It is worth being careful about what any boundary can promise here. It cannot control how a model reads text. There is no reliable way to clean natural language so that manipulation is gone.

What it can control is smaller. Which sources are read. Which fields come back. How much is returned. Whether the content is marked as untrusted.

The goal is not to declare the content safe. It is to know where it came from, and how much of it got in.

The maze makes this easy to see

Imagine an agent in a maze.

Giving it the full map is one way to answer a read request. Giving it only its current position and the cells next to it is another. Both may be enough to decide the next move. They expose very different amounts of information.

The agent can decide where it wants to go. It should not be able to hand itself a bigger map.

That is the distinction I want to look at here.

State view: what the agent was actually shown, and under what authority.
State binding: how that view connects to what it proposes.

This is about state that arrives through tools. Not the whole prompt, not memory, not everything the model happens to have in context.

This is not just read access

A sandbox can limit which files, networks, or processes an agent can reach. That stays necessary.

But reachability is not observation. Knowing a file could be reached does not tell you what the agent actually got, which version it was, or under whose permission.

A state view is narrower. It says which source was read, which fields came back, which version it was, and who allowed it.

A sandbox limits where the agent can go. A state view records what it was actually handed.

A read can still leak

A write is meant to change the external system. A read usually is not. That still does not make it free.

When an agent pulls content into its context, that content travels further than the system it came from. It can end up with a model provider, a runtime, a log, a trace, a tool server, or another sub-agent.

No target state changed. The information boundary changed.

That is a reach question, applied to reading.

What a state view looks like

If a state view is a controlled observation, something has to issue it and be able to check it later. The one thing it cannot be is the agent itself.

It also has to be written after the read came back, not when the read was allowed. Otherwise it only records what the agent could have seen, not what it actually got.

{
  "state_view_id": "sv_01J...",
  "view_version": 3,
  "source": {
    "system": "github",
    "repository": "example/app",
    "path": "config/deployment.yaml"
  },
  "observation": {
    "source_version": "blob:8f31c2",
    "fields": ["content"],
    "observed_at": "2026-07-22T10:15:00Z"
  },
  "authority": {
    "scope": "repository_read",
    "granted_by": "policy:repo-read"
  },
  "issued_by": "boundary",
  "content_trust": "untrusted"
}

The exact fields do not matter. What matters is that the agent did not write this record, and that it says enough to be checked later.

A view also does not have to be built in one read. Agents explore. They read, think, follow a reference, and read again. The view grows.

But the record itself should not quietly change. Each addition needs its own version, or a new record that points back at the previous one. Otherwise a request could point at a view ID that means something different by the time anyone checks it.

This is the same stale-state problem, one level up. A reference that moves under you is not a reference.

So the request points at one exact version.

{
  "operation": "modify_file",
  "target": "config/deployment.yaml",
  "state_view_ref": {
    "id": "sv_01J...",
    "version": 3
  }
}

State binding ties a request to the view it was given. State view control says whether the agent should have had that view at all.

Not every read needs a decision

I do not think every read should trigger an approval or a heavy policy check. That would make agents unusable, and it would be the wrong lesson.

Most reads can be admitted automatically inside a scope that was set beforehand.

The claim is narrower.

The agent may ask for a view. It should not be able to give itself one.

Read and write are not the same boundary

A write needs a decision before it can create impact. A read needs a scope and a record before the agent reasons on it, and again every time the view grows.

There is an uncomfortable side to this. The better an agent gets, the more it explores. It searches, follows links, and calls more tools until it has enough to act. The capability that makes it useful is the same one that widens its view.

That is why gating the write alone is not enough. The request can look perfect when it arrives. Current state, correct format, properly bound. And still be built on something the system never meant to show.

Controlling the outcome is half the path. The other half is controlling the view it came from.

A write changes the external system.
A read changes the world the agent reasons about.

Both belong inside the boundary.

Project: Impact Boundary Labs

The reasoning was right, but the world shifted

David Loibner — Wed, 08 Jul 2026 06:35:26 +0000

While working on the GitHub adapter, a gateway that lets AI agents create pull requests on GitHub, the source_state field first looked like a small technical detail.

It was not the operation itself, or the target. It was only a reference to the state the agent had seen before proposing a change.

But after working through the write path, this field started to look less like metadata and more like part of the safety model. A proposed change is not only defined by what it wants to do. It is also defined by the state in which that proposal made sense.

This is easy to miss.

An agent can read a repository, produce a reasonable change, and submit a clean intent. Nothing about that has to be wrong. But while the agent is planning, the repository can move. A human can push a fix. Another workflow can update the same file. A branch can advance.

In that case, the agent may still be reasoning correctly over the state it saw.

The problem is that this state no longer exists.

The reasoning was right, but the world shifted.

That is the stale state problem in agent workflows. And it is why I think agent workflows need state-bound intent.

The illusion of a static world

From the outside, even from the boundary's point of view, a stale request can look just like any other: the operation has the same name, the target path is still allowed, the input is still well formed.

But it is not. The proposal belonged to an older state of the repository, formed before the branch moved, before the file changed, before another workflow created a related result.

This is why stale state is not only a data freshness problem. For agent workflows, it becomes an admission problem: a decision about whether a proposed change is allowed to become a real effect. We call that decision point an MCP Boundary: the same pattern behind the GitHub adapter and the wider work we do on MCP gateways. The boundary should not only ask whether the operation is allowed on the target. It should also know whether the target is still in the state that made the operation reasonable.

The write happens now.
The reasoning happened before.

That gap is where the problem appears.

State binding

The intent should not float freely between the agent and the system that later creates impact. It should carry the state reference it depends on, a practice we call state binding: attaching a proposed effect to the exact state it was reasoned about.

For a GitHub adapter, this can be the branch head or a file hash. In other systems, it may be a ticket status, a data version, or a configuration version. The exact field is less important than the relation it creates.

The request is no longer only:

apply this change to this target.

It becomes:

apply this change to this target, if the target is still in the state this change was based on.

That small addition changes the meaning of the request. The boundary can now reject the drift instead of guessing through it. If the state still matches, the request continues through the normal checks. If the state moved, the request should stop before impact.

This does not mean the agent gets to declare that the state is safe. The state reference has to be checked by the boundary, or by the trusted system behind it. Otherwise it would only be another claim from the agent.

The agent can say what it saw.

The boundary has to verify whether that is still true.

This is not a new idea

The underlying idea is old. Git has base commits. Databases have version checks. HTTP has ETags. Security people know the broader class of problems around time of check and time of use.

So the point is not that agent systems discovered stale state. The point is that agents make the read-to-write gap wider and less predictable.

An agent reads, plans, calls tools, revises, retries, and only later submits a proposed effect. During that gap, the target system keeps moving. A change that was safe five minutes ago may be wrong now, not because the goal changed, but because the state around the goal changed.

That makes state-bound intent more than a backend detail. It becomes part of the boundary. Agent requests should not only be checked by operation, target, and policy. They should also be checked against the state that produced them.

When the state moved

When the agent submits an intent, the boundary compares the submitted state reference with the current target state.

If they match, the request can continue. It may still need review, be outside the allowed scope or be blocked for other reasons. But at least it is still attached to the state the agent used.

If they do not match, the request should stop before impact.

A stale request is not a malformed request. It is not an unauthorized request. It is not necessarily a forbidden goal. It is a request whose basis expired.

So the safe next action is not to retry the same write with slightly different arguments. It is to read the target again and submit a new intent based on the current state.

This is where boundary feedback matters. The system should say, in a structured way, that no impact happened because the state reference was stale:

{
  "decision_status": "conflict",
  "outcome_status": "no_impact",
  "reason_code": "stale_state_reference",
  "source_state": "sha:abc123",
  "current_state": "sha:def456",
  "required_next_action": "re_read_target_state",
  "retryable": false
}

The important part is not the JSON format.

The important part is the next action.

The boundary is not saying that the goal is forbidden. It is saying that this request no longer belongs to the current state of the target.

Where the binding is less clear

Not every operation has a clean state reference. Some targets are easy to bind: a Git commit, a file hash, a database row version. Others are harder. A draft email may depend on context that is difficult to reduce to one stable token.

In those cases, the system should be honest. If the state cannot be bound well, the boundary may need to be stricter. That connects back to reach: the less precisely you can bind the state behind an effect, the more careful admission should become.

Why this matters for autonomous work

A human often notices when the world has moved. A developer sees that the branch changed. A support worker sees that the ticket was already closed.

Agents do not carry that situational judgement automatically. They can continue from an earlier observation with full confidence, even when it is already outdated, because the reasoning is still coherent relative to the old state.

That is exactly the issue.

Good reasoning over stale state can still produce bad impact.

State binding does not fix this by making the agent correct. It does not prove that the code is good, that an email is wise, or that a data update is meaningful. Tests, review, and engineering judgement remain necessary.

The claim is narrower. Without the state it was based on, the boundary cannot know whether it is admitting the request the agent actually reasoned about.

A request is not only an operation against a target.
It is an operation against a target, based on a state.

If that state changed, the request changed too.

No state binding, no safe admission.

Project: Impact Boundary Labs

A write is not just a write

David Loibner — Sat, 04 Jul 2026 22:00:18 +0000

The point became clear to me while working on the GitHub adapter (a more or less simple GitHub gateway for AI agents creating PRs).

At first, creating a pull request looks like one write. A branch is created, commits are pushed, and a PR appears. The API call is concrete, and the result is easy to name.

But after building around it for a while, the word "write" started to feel too imprecise. A pull request is not only a technical object. It changes the state of a repository and creates something that another person may have to review. If it duplicates an existing proposal, it can already be noise before anyone reviews it. If it touches a deployment file instead of a README, the same operation suddenly has a very different weight.

The operation name did not change.
The effect did.

In the previous parts, I argued that the agent should not hold the write credentials, that tool access is not the same as impact permission, and that blocked requests should return boundary feedback instead of generic tool failures.

This leads to the next point (or let's call it problem).

Even if write authority is moved behind a boundary, the word "write" is still too coarse.

A permission table can say that an action may write. That is useful, because it separates reading from changing. However, it does not say where the change lands, who sees it, which systems react to it, or whether another decision is still possible before the effect becomes larger.

A draft and a sent email are both writes. A pull request and a merge are both writes. Updating test data and changing production data are both writes. Technically, all of them change state. Operationally, they are not the same event.

This is the weakness of flat write access. A system can enforce the permission correctly and still allow the wrong kind of effect.

Reach is the missing property

The useful distinction is not only whether an action changes state.
The useful distinction is reach.

By reach, I mean how far the effect travels before another decision is required. A change that stays inside an isolated workspace has low reach. A change that creates work for another person, becomes visible to users, changes production, or sends something outside the system has higher reach.

This is why the same technical write may need different treatment. A draft email still stays inside a controlled workflow, while a sent email leaves it. A pull request is not a merge, but it already changes the repository workflow. A staging change can be acceptable with little friction, while the same change against production should pass a stronger gate.

The boundary should therefore not only ask whether the verb is allowed. It should ask how far the requested effect is allowed to go.

This is the point where "write" stops being a sufficient impact class.

Reversibility is not enough

A tempting classification is reversible vs. irreversible.

This helps, but it is not stable enough as the main boundary.

Real systems are rarely cleanly reversible. A draft may allocate an ID. A temporary object may leave logs, cache entries, metrics, notifications, or review noise. A staging change may start automatic work. A pull request may consume reviewer attention even if it is later closed.

The primary object may be deleted, but the system has already changed.

Therefore, reversibility should only be a secondary signal. A reversible action can still cross an important boundary. An action that cannot be fully undone can still be isolated and low impact. What matters for admission is not only whether the object can be removed, but what became visible, costly, dependent, or relevant while it existed.

Reach captures that better than reversibility.

The boundary needs structure, not guesswork

A boundary cannot reliably control reach if the agent only submits plain language.

A request such as "update the customer record" is not enough. It may mean a harmless internal note, a billing change, a correction in production data, or a change that notifies someone. These are different effects, even if the sentence looks similar.

A second model may help as an additional signal, but it should not be the main boundary. If the control layer has to read a paragraph and guess the effect, the hard part has only moved into another interpretation step.

The cleaner design is that the agent submits a structured intent before an external effect is possible. The request should make the relevant parts explicit: the kind of action, the target, the expected state, the environment, and the intended result.

This does not mean that the agent gets to declare its own safety.

The agent can propose what it wants to do. But the system around the boundary has to make the relevant effect visible. A path, repository, branch, account, environment, URL, or table is not only a string in the request. It has to be resolved by the system that owns the boundary.

This is important because a tool name does not tell us enough.

A generic tool may be able to do harmless and risky work. Treating the whole tool as high impact creates review noise. Treating it as low impact misses dangerous use. The useful classification is therefore not "this tool is safe" or "this tool is dangerous". The useful classification depends on the resolved target and on the effect that the interface exposes.

A write to an agent-private scratch folder is different from a write to a deployment file. A request to a status endpoint is different from a request that changes billing. A test data update is different from a production data update.

This only works when the interface exposes enough detail. If the target cannot be resolved, the request should be treated as broad. If a tool hides what it may change, the boundary cannot magically recover that information from the tool name. In that case, the safer answer is to wrap the tool with a narrower interface, require review, or block the request.

Narrow admission only works when narrow effects are visible.

The tool is not the boundary.
The resolved effect is.

Small actions can add up

Per-request admission does not mean memoryless admission.

A single small write may be acceptable, while many small writes can become a larger operational effect. This is especially relevant for agents because retries are not always simple technical retries. The agent may rephrase the same goal, split it into smaller steps, or move toward the same blocked outcome through another route.

The boundary should be able to notice that pattern.

However, this memory must not depend on an identifier the agent can freely choose. If the agent can reset a trace ID, rename the task, or create a new parent label to escape accumulated context, the control is not real.

Cross-request signals have to be tied to something the agent cannot invent on its own, such as a workflow record created by the boundary, the underlying user or service identity, a signed work order, or another trusted identity.

With that constraint, workflow memory does not contradict per-request admission. Each requested effect is still evaluated separately. There is no broad write window. But the decision is not blind to recent denials, repeated similar intents, unusual volume, duplicate effects, or ignored boundary feedback.

At some point, the next safe action is no longer another small admission.
It is review.

The boundary is not the whole target system

There is also a limit to what the boundary can claim.

The boundary sees the request at the interface. It does not automatically know every automatic effect that happens inside the target system afterward. A narrow-looking write may start internal rules, background work, notifications, audit records, or other follow-up actions.

The agent requested one write.
The system may produce many effects.

This does not make the boundary useless. It defines its responsibility. The boundary controls the handoff from agent request to admitted effect. The target system still has to own what happens internally through clear behavior, access control, data rules, and production safeguards.

If an endpoint can create a stronger effect behind the scenes, that effect should be part of what the endpoint exposes to the boundary. If the boundary cannot know it, the request should be treated as broader or sent to a stronger gate inside the target system.

The boundary is not a proof that everything behind the interface is harmless.

The boundary is a controlled handoff point.

The narrower claim

This does not make the agent correct.

It does not prove that generated code is good, that an email is wise, that a data update is meaningful, or that a deployment is safe. Tests, review, domain knowledge, and normal engineering judgement remain necessary.

The claim is narrower.

A boundary should control how far an agent request is allowed to reach before another decision is required.

That reach is not defined by the verb alone. It depends on the resolved target, the surrounding workflow, and the effects exposed by the target system.

Write access only says that state may change. It leaves open where the change lands, who sees it, and what other systems start doing because of it.

That is why write access is too flat for agent work.

A draft, a pull request, a staging update, a production write, an external message, and a payment may all be writes.

They do not reach the same world.

The verb is not the boundary.
The effect is.

Project: Impact Boundary Labs

Blocked is not failed: agents need boundary feedback

David Loibner — Thu, 18 Jun 2026 08:26:19 +0000

In part 2, I wrote about why tool access is not the same as impact permission.

The main point was that an agent request should not automatically become an external effect, only because a tool is visible and the call is technically valid. For tools that can change real systems, there should be an admission step before impact.

But this leads to a practical follow-up question.

What happens when the admission layer says no?

In many current agent setups, this situation is treated like a normal tool failure. The agent calls a tool, the request violates a rule, the runtime returns a generic error, and the tool call fails.

At first sight, this seems acceptable. The unsafe action was blocked, so the system did its job.

However, I think this is only half of the problem.

A blocked action should not change the external target state. That part is clear. But the response should also help the agent understand what kind of next step is still valid. Otherwise the boundary stops one action, but it does not help the agent work inside the boundary.

That is the distinction I want to look at here.

A blocked action is not just a failed tool call. It can also be a structured decision.

Generic errors are not enough

If a human developer hits a permission wall, the situation is usually manageable. The developer reads the error, understands the constraint, and changes the approach. Even if the error message is not perfect, the human can infer what probably happened.

An autonomous agent reacts differently.

If the agent receives a normal tool exception, such as 403 Forbidden, permission denied, or tool failed, it does not necessarily understand that a deliberate boundary was enforced. It may interpret the failure as a temporary tool problem, a formatting issue, or a prompt problem.

This matters because the reasoning loop is still active. The agent may try to repair the situation by guessing. It may reword the same request, call another tool, generate a slightly different payload, or repeat the previous attempt because it did not understand why the action was blocked.

In this case, the generic error did not really guide the workflow. It only converted a policy decision into noise inside the agent loop.

This is one of the reasons why I think agent boundaries should return more than ordinary execution errors.

Blocked is an outcome

An admission layer should treat a denial as a controlled outcome, not as an unexpected crash.

If a request is blocked, invalid, or conflicts with stale state, the external system should not change. But the response sent back to the agent should still be precise enough to explain the safe next step.

For example, imagine an agent proposes a write based on a file state that is no longer current. A human developer may have changed the file while the agent was planning. In a normal tool flow, this might appear as a failed write or a generic conflict.

A structured boundary response can express the situation more usefully:

{
  "decision_status": "conflict",
  "outcome_status": "no_impact",
  "reason_code": "stale_state_reference",
  "required_next_action": "re_read_target_state",
  "retryable": false
}

The important point is not the exact field names. The important point is that the agent does not have to guess.

The response says that the request did not fail because the goal is impossible. It failed because the state reference is stale. The next safe action is therefore not to retry the same request, but to read the target state again and submit a new request.

It should be noted that retryable: false does not mean that the task is impossible. It means that this exact request should not simply be repeated. The agent has to update its state first.

This distinction is useful because it separates different kinds of blocked work.

A path outside the allowed scope is not the same as stale state. A request that needs human approval is not the same as a duplicate attempt. A malformed input is not the same as a policy violation.

If all of these cases look like a generic tool failure, the agent has no good signal for what to do next.

Boundary feedback should guide the next step

The useful part of boundary feedback is not that it explains every internal rule. It should not do that.

The useful part is that it can turn a blocked request into a safe next action.

If the state is stale, the next action may be to re-read state. If the path is outside the allowed scope, the next action may be to choose an allowed path. If the requested effect already exists, the next action may be to reuse or poll the existing outcome. If the blast radius is too high, the next action may be to wait for review.

This does not make the boundary softer. The blocked action still stays blocked. The external system still remains unchanged.

But the agent is not left with a dead end. It receives a constrained signal about how to continue in a compliant way.

This is especially relevant for autonomous workflows. A human can often recover from unclear errors. An agent may instead create more retries, more tool calls, and more confusion. Structured feedback reduces that unnecessary search space.

The Policy Oracle problem

There is a real counterpoint here.

If every blocked request returns very precise feedback, the boundary can become a way to probe the policy. A compromised or misaligned agent could send many slightly different requests and use the returned reason codes to learn where the limits are.

So feedback has to be useful, but not unlimited.

For normal operational friction, clear reason codes are helpful. Stale state, malformed input, missing fields, or an already existing outcome are usually safe to explain. These are not secret policy details. They are workflow facts that help the agent recover.

But the situation changes if the system sees repeated violations, ignored next-action hints, or suspicious probing across many variants. In that case, the feedback should become less detailed. A generic rejection, a cooldown, or human review may be more appropriate than continuing to expose precise policy boundaries.

The general rule is simple:

Boundary feedback should guide compliant work.
It should not become a policy oracle.

Feedback is not scoring

There is another distinction that matters.

Feedback is agent-facing. It describes what the agent needs for the next step. It can say that state is stale, that a path is out of scope, or that approval is required.

Scoring or auditing is different. That layer can observe how the agent behaves over time. It can track whether the agent repeats the same blocked request, ignores required next actions, or keeps asking for broader access than the task seems to need.

I would not expose that full evaluation to the agent. If the agent sees the complete scoring logic, it may start optimizing for the score instead of solving the actual task.

Therefore, the agent-facing response should stay limited to what is needed for a compliant next action. The deeper behavioral evaluation can remain part of audit, monitoring, or later review.

This separation is important because the boundary has two jobs that should not be mixed. It should help the agent continue safely, but it should also allow the system to notice when the agent is not adapting.

Controlling the loop

We do not give agents boundaries because we assume they are useless or because they will always fail.

The opposite is closer to the point.

Agents need boundaries because they are becoming useful enough to act on real systems. Real work has limits, rules, current state, review paths, and consequences.

A boundary that only returns a generic failure is too rigid for useful agent workflows. It stops one action, but it does not tell the agent what a safe next step would be.

A better boundary should keep the external system unchanged while still giving the agent enough structured feedback to continue correctly.

That is the practical value I see here.

Blocked should not mean that the workflow disappears into a generic error.

Blocked should mean:

the requested impact did not happen,
the reason is known,
and the next safe action is constrained.

This is the difference between a wall and a working boundary.

Let agents explore solutions. Block them when the request crosses a boundary. But when something is blocked, make the next safe step explicit.

Project: Impact Boundary Labs

Agent workflows need an impact boundary

David Loibner — Wed, 10 Jun 2026 09:59:26 +0000

In part 1, I wrote about why coding agents should not hold write credentials.

GitHub was the example, because the problem is easy to see there. A coding agent can read a repository, reason about a change, and produce useful work. But if the same agent also owns the token that creates branches, commits, or pull requests, the proposal and the authority to create impact are too close together.

The problem is not only GitHub.
The problem is the moment where an agent request becomes an external effect.

Agents are getting more useful because they can use tools. They can read files, call APIs, update tickets, prepare emails, run commands, inspect systems, and sometimes change state. That is exactly why the boundary matters more, not less.

The question is not only:

Can the agent use this tool?

The more important question is:

Should this specific request become impact now?

That is the missing layer I keep coming back to.

Tool access is not impact permission

A tool can be visible to the agent. The call can be valid. The arguments can be well formed. The agent can even have a reasonable goal.

Still, this does not automatically mean the requested effect should happen.

A GitHub tool may create a pull request. A database tool may update or delete rows. A cloud tool may deploy a configuration. An email tool may send a message. In all of these cases, the tool is not only returning information. It can change a system that someone cares about.

This is where I think many agent workflows are still too flat. They often treat tool access as if it already contained the whole decision. If the agent can call the tool, the tool executes. If the call succeeds, the system moves on.

That may be acceptable for many read-only or low-risk operations. But for tools that create external effects, I think there should be another step between the agent request and the target system.

The agent should be able to propose work. But the fact that a tool exists should not mean that every valid tool call becomes impact.

The missing layer is admission

The distinction that helped me most is this:

Scope defines what is possible.
Admission decides what is allowed now.
Logs record what happened.

These are related, but they are not the same thing.

Scope is mostly about design-time limits. Which tools are visible? Which paths are available? Which actions are impossible from the beginning? This is useful and necessary, because an agent should not even see tools or data it does not need.

Admission is different. It is about the concrete request in the current situation. The question is not only whether the operation exists or whether the agent generally has access to it. The question is whether this requested effect is allowed now, under the current state, scope, and policy.

An event log comes after that. It helps reconstruct what happened, which is important for audit and debugging. But a good history of what happened is not the same as a decision before impact.

In a normal system this may sound obvious. In agent workflows it is easy to miss, because the agent often sits directly in front of powerful tools. The tool call becomes the action. The action becomes the outcome. The boundary is only visible afterward, when something needs to be explained, reverted, closed, or cleaned up.

That is the part I think should move earlier.

State matters

A request cannot be judged only by its name.

The same operation may be fine in one state and wrong in another. A pull request against the expected branch head may be acceptable, while the same proposed change against stale repository state should be blocked or sent back. Updating test data may be harmless, while the same update against production may not be. Sending an email draft may be fine, while sending the message to real users may require review.

The operation is the same in a rough sense, but the situation is not.

That is why an agent request should be tied to the state it was based on. It should not be enough that the agent says it looked at the system. The decision layer should know, at least for the relevant parts, what state the agent was allowed to observe.

If the state has changed, the right answer should not be to guess and continue. It should be a structured conflict. The agent can then re-read the state and submit a new request.

This is not only a defensive mechanism. It also gives the agent useful feedback. The system does not have to say that the goal is forbidden. It can say that the proposal is stale.

That distinction matters if we want agents to work better inside boundaries instead of simply failing at them.

Per request, not per session

I also think the unit of permission matters.

A broad session permission is convenient. It can say that a certain agent session is allowed to write, deploy, send, or modify something for a limited time. For human workflows this kind of model is often acceptable, because the human user carries context and responsibility through the session.

For an agent, a session can become a temporary impact window.

The agent may retry. It may misunderstand a previous result. It may keep going from stale assumptions. It may call the same tool again with slightly changed arguments. If the session still has broad authority, the system may know which agent acted, but it did not decide each effect separately.

This is why I prefer the request or intent as the unit of decision.

Not:

this agent session may write for the next ten minutes

but rather:

this requested effect is allowed under this state, scope, and policy

That is a narrower form of authority. It does not prevent the agent from working. It only prevents broad access from silently turning into broad impact.

Agent retries are different

The same issue appears with idempotency.

In distributed systems, idempotency often protects against technical retries. A request times out, a response is lost, or a client sends the same request twice. The system should not create the same effect twice just because the transport was unreliable.

Agents retry for messier reasons.

They may reword the same goal. They may generate a slightly different payload. They may call another tool. They may try again because they did not understand that the previous attempt already created a pending result.

In that case, the prompt or payload may be different, while the intended effect is still the same.

This does not mean the boundary should magically guess meaning from free text. That would be too weak. A better approach is to make the agent submit a more structured request before impact is possible. The system should decide on the target, operation class, expected state, and requested outcome, not only on the natural-language prompt that produced it.

Then it becomes possible to ask better questions.

Is this effect already pending? Was it already completed? Did a previous attempt partially create it? Should the existing outcome be reused? Should the agent re-read state first?

Tool idempotency protects the request path.

Intent-level idempotency protects the workflow from repeated attempts toward the same effect.

This is not a replacement for review

An impact boundary does not prove that the agent is right.

It does not prove that generated code is good. It does not prove that a database change is meaningful. It does not prove that a deployment is a good idea. Human review, tests, domain knowledge, and normal engineering judgement are still needed.

The claim is narrower.

The agent should not be the component that turns its own request into external state change.

It can reason. It can propose. It can use tools. But when the requested action changes something outside the model, there should be a separate decision before impact.

That decision may be simple in low-risk cases. It may only check scope and freshness. In higher-risk cases it may require approval, reuse an existing outcome, or block the request. The exact implementation can differ between systems.

The architectural point stays the same.

Tool access should not become impact by default.

Why I care about this

I do not think production agent systems will become trustworthy only because the models get better or the tool interfaces get cleaner.

Better models help. Cleaner interfaces help. Sandboxes help. Logs help. Reviews help.

But when agents start acting on real systems, there is still one question that needs its own place:

who decides what becomes impact?

My answer is not that agents should be kept away from tools. The opposite is probably true. Agents become useful because they can interact with systems and do real work.

But useful work needs boundaries.

For human work, we already know this. We use roles, reviews, limits, approvals, and audit trails. We do not treat every possible action as automatically allowed just because someone can technically perform it.

Agent workflows need the same idea in a machine-readable form.

Let agents request work.
Let them propose changes.
Let them use tools.

But before their work changes an external system, there should be an impact boundary.

That is the layer I think is missing in many agent workflows.

Project: Impact Boundary Labs

Coding agents should not hold write credentials.

David Loibner — Sat, 30 May 2026 12:50:22 +0000

I have been thinking a lot about coding agents lately.

Not really about whether they can write good code, because usually they can, sometimes they can't. That part is obvious. But the risk is shifting from wrong answers to wrong outcomes.

The part that feels more important to me is this:
should the agent actually own the write authority?

We already don't trust humans without roles, limits, reviews, and accountability. Developers use PRs, pilots use checklists, bank clerks have transfer limits. Capable agents need the same structure, but machine-readable.

Right now a lot of setups still look roughly like this: the agent reads the repo, decides what to change, holds a GitHub token, and then creates commits, branches, or PRs.

I don't think this is the right default.

The agent can reason.
The agent can inspect files.
The agent can propose changes.

But the moment it can directly create external impact, the problem changes.

It is no longer just:

did the agent say something wrong?

It becomes:

did the agent create the wrong outcome?

That is a much more expensive failure mode.

Intent is not authority

The pattern I like more is simple: the agent reads directly, proposes intent, and a boundary decides before an adapter materializes admitted work.

So the agent does not get the write credentials.
It submits a structured intent instead, which could look like:

{
  "operation": "write",
  "target": {
    "repo": "example/app",
    "branch": "main",
    "path": "docs/config/agent-policy.md"
  },
  "source_state": {
    "blob_sha": "8f31c2..."
  },
  "requested_effect_hash": "sha256:..."
}

This is then not a command anymore, it is a suggestion, or an intent.
The system still has to decide whether this proposed outcome should exist.

That decision layer can check whether the actor, repo, path, source state, operation, and requested effect are valid for this situation, and whether the result should become a reviewable PR.

Only after that should there be an outcome.

For example:

{
  "decision": "admitted",
  "checks": {
    "scope": "pass",
    "source_state": "pass",
    "policy": "pass",
    "idempotency": "pass"
  },
  "outcome": {
    "type": "pull_request",
    "status": "created",
    "reviewable": true
  }
}

The core rule is:

No impact without admission.

The flow would look like this:

This is not the same as a sandbox

A sandbox is useful.
But I think it solves a different problem.

A sandbox asks where the agent can run, whether it can use the network, whether it can execute commands, which files it can access, and whether it can escape the environment.

A gateway asks:

should this concrete proposed outcome exist?

That difference matters because a sandbox can stop escape, it does not decide whether a proposed outcome should exist.
If the agent has a valid GitHub token inside the allowed environment, it can still use allowed tools to create an unwanted result.
The action can be technically allowed and still be the wrong outcome.

That is why I think the boundary should sit between intent and impact, not only around execution.

Sandbox isolates execution.
Gateway isolates impact.

Why GitHub is a good first target

GitHub already has a good human pattern:

a change proposal is not a merge.

Pull Requests are familiar because they are reviewable and they fit how developers already work.

But with agents there is one step before the PR that also matters:

An agent proposal should not automatically become PR impact.
A PR is already a real side effect. It creates a branch, commits, review work, and changes the state of the repository.

So the agent should not directly create it with its own write token.

The flow I want is more like this: the agent reads the repository, submits structured intent, the gateway checks state, scope, policy, and idempotency, and the GitHub adapter creates a reviewable PR only after admission. The PR should contain evidence about the decision.

The adapter is not the authority, it only materializes admitted work.
And the agent never receives the GitHub write credentials.

This does not make the code correct

This is important:

A boundary like this does not prove that the generated code is good. It does not replace CI, human review, or semantic correctness. It only controls the transition from proposed work to external impact.

That narrower claim is the whole point. I think many agent systems mix reasoning, decision, and impact together.

But these should be separated. The agent owns reasoning. The boundary owns the decision. The adapter owns controlled materialization. The target system should only receive admitted impact.

Why I care about this

I don't think production agent systems will be trusted just because the models get smarter. They will be trusted when the path from agent work to external change becomes explicit.

For every real outcome, I want to know what the agent proposed, what state it read, which rules were checked, why it was admitted or blocked, what outcome was created, whether a human can review it, and whether we can audit it later.

That is the layer I have been working on with Impact Boundary Labs.
The first implementation is GitHub-first:

agents can read repositories directly, but write intents go through a deterministic gateway that creates reviewable Pull Requests with evidence.

GitHub is not the whole idea, it is just the first concrete place to prove the pattern, because repositories have clear state, branches, commits, PRs, and review.

The broader principle is:

Let agents reason.
Stop them at intent.
Control what becomes outcome.

Project: Impact Boundary Labs

This is my very first article here on dev.to! I’d love to hear your thoughts on this architecture. How are you currently securing your agent workflows?

Since I'm new here, I'm highly open to feedback - let me know in the comments what I can improve or what we should talk about in Part 2!