Alex Shev

Posted on May 27

Approval Gates for AI Agents: Draft Approval Is Not Publish Approval

#ai #devtools #automation #productivity

One of the easiest ways to make an AI agent dangerous is to give it a vague approval.

Not because the agent is malicious.

Because humans use words like "ok", "approved", and "ship it" casually.

In a normal conversation, that is fine. In an automated workflow, it can become a bug.

If an agent is drafting an article, writing a tweet, preparing a pull request, generating social posts, or touching an external platform, the question is not just:

Did the human approve this?

The better question is:

What exactly did the human approve?

That distinction matters more than most agent tooling discussions admit.

The small approval bug that becomes a big workflow problem

Imagine this simple flow:

An agent prepares 10 social posts.
The human says "looks good".
The agent publishes all 10 immediately.

Maybe that was correct.

Maybe it was not.

The human might have meant:

The copy looks good.

The agent interpreted it as:

Publish everything now.

That is not a model intelligence problem. It is a workflow design problem.

When an action leaves the local workspace, ambiguity gets expensive.

Publishing, emailing, commenting, deploying, charging a card, posting to social media, and modifying production systems should not depend on a loose interpretation of "ok."

They need explicit gates.

The approval types I separate now

For agent workflows, I like separating approval into at least four different meanings.

1. Approval to draft

This means:

Yes, prepare the thing.

The agent can research, outline, write, generate assets, create local files, and prepare a proposal.

But it cannot publish.

This is the safest default for content work.

For example:

Prepare a DEV.to draft about approval gates.

That should create a local Markdown file or an unpublished draft on the platform, depending on the workflow.

It should not post the article publicly.

2. Approval of the draft

This means:

The content direction is acceptable.

The human may be approving the argument, the structure, the tone, or the asset selection.

But that still does not automatically mean:

Publish it right now.

This is where many agent systems get sloppy.

Draft approval is not publish approval.

3. Approval to publish

This needs to be explicit.

The agent should know:

what platform
what asset or draft
whether to publish now or schedule
whether comments/replies/source attribution are included
whether the approval applies to one post or a batch

For example:

Publish this DEV.to article now.

or:

Post only the first X drafts, with the source as the first reply.

That is much safer than letting the agent infer a public action from a vague "ok."
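
One way to make the checklist above concrete is to treat a publish approval as a record that is only valid when every field is filled in. This is a minimal sketch, not a reference implementation: the `PublishApproval` class, its field names, and the `parse_publish_approval` helper are all hypothetical names chosen for illustration.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PublishApproval:
    """One explicit publish approval. All field names are illustrative."""
    platform: str               # e.g. "devto" or "x"
    artifact: str               # path or ID of the approved draft
    timing: str                 # "now" or "schedule"
    include_source_reply: bool  # whether a source reply/comment is included
    max_items: int              # 1 for a single post, N for a batch


REQUIRED_FIELDS = ("platform", "artifact", "timing", "include_source_reply", "max_items")


def parse_publish_approval(fields: dict) -> Optional[PublishApproval]:
    """Return an approval only if every required field is present.

    A missing field means the approval is too vague to act on, so the
    caller should ask the human instead of guessing.
    """
    if any(key not in fields for key in REQUIRED_FIELDS):
        return None
    if fields["timing"] not in ("now", "schedule"):
        return None
    return PublishApproval(**{key: fields[key] for key in REQUIRED_FIELDS})
```

With this shape, a bare "ok" maps to an empty dict and yields `None`, which the agent can treat as "ask again" rather than "publish".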

4. Approval for automation reminders

This one is subtle.

A reminder or cron job should often prepare work, not perform external actions.

For example:

Every morning, find 3 candidate topics and ask me which one to use.

That is different from:

Every morning, publish a post.

The first one keeps the human in the loop.

The second one creates a recurring external action, which is much riskier.

Most teams should start with reminder-based automation before action-based automation.

Why this matters for CLI and agent workflows

CLI workflows make this even more important because agents can act fast.

A coding agent can edit files, run scripts, create branches, call APIs, deploy apps, write comments, and open browser sessions.

That speed is useful only if the boundaries are clear.

The workflow should define what the agent may do locally without asking and what requires a human gate.

For example, I am comfortable letting an agent do this freely:

inspect a repo
write a local draft
run tests
generate local assets
prepare a post
create a report
summarize findings

I want a stronger gate before this:

publish to DEV.to
post on X
send email
comment on Reddit
deploy to production
spend money
modify customer data

The difference is not whether the agent is capable.

The difference is whether the action is reversible, private, and low-risk.

Local work is cheap to revise.

External work creates a public or operational footprint.
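
The split between the two lists can be sketched as an allowlist: local actions are named explicitly, and everything else falls through to the stricter gate. The action names and the `required_gate` function below are assumptions made up for this example, not part of any real agent framework.

```python
from enum import Enum


class Gate(Enum):
    FREE = "free"          # local, reversible work; no approval needed
    EXPLICIT = "explicit"  # needs an approval that names the action


# Hypothetical action names mirroring the "do this freely" list.
LOCAL_ACTIONS = {
    "inspect_repo",
    "write_local_draft",
    "run_tests",
    "generate_local_assets",
    "prepare_post",
    "create_report",
    "summarize_findings",
}


def required_gate(action: str) -> Gate:
    """Return the gate for an action.

    Anything not on the local allowlist, including actions nobody
    thought of in advance, gets the explicit gate.
    """
    if action in LOCAL_ACTIONS:
        return Gate.FREE
    return Gate.EXPLICIT
```

Defaulting unknown actions to the explicit gate is a deliberate choice here: a new capability should have to earn its way onto the free list rather than start there.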

A simple pattern: declare the gate in the task

One practical fix is to write the gate directly into the task.

Instead of:

Write an article about approval gates.

Use:

Prepare a local draft only. Do not publish. Return the file path and wait for explicit publish approval.

Instead of:

Make comments for Reddit.

Use:

Prepare comment drafts only. Do not post. Recommend the safest first set and wait for approval.

Instead of:

Deploy this.

Use:

Create a preview deployment first. Do not promote to production until I explicitly approve production.

This sounds boring, but boring is good here.

Good agent workflows are often just normal operational discipline written down clearly enough that the agent cannot guess wrong.

If the workflow supports metadata, make the gate machine-readable:

gate: draft_only
external_actions: false
next_approval_required: publish_to_devto
scope: single_article

That tiny block is not bureaucracy. It gives the agent a state it can report, verify, and refuse to exceed.
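
As a rough illustration of "refuse to exceed", the block above can be parsed and checked before any external call. This sketch assumes the block is plain `key: value` lines; `parse_gate_block` and `can_run_external` are hypothetical helpers, and the rule they encode (only the action named in `next_approval_required` can be unlocked, and only once the human grants it) is one possible reading of the metadata.

```python
from typing import Dict, Set

GATE_BLOCK = """\
gate: draft_only
external_actions: false
next_approval_required: publish_to_devto
scope: single_article
"""


def parse_gate_block(text: str) -> Dict[str, str]:
    """Parse simple `key: value` lines into a dict of strings."""
    state = {}
    for line in text.splitlines():
        if ":" not in line:
            continue  # skip blank or malformed lines
        key, value = line.split(":", 1)
        state[key.strip()] = value.strip()
    return state


def can_run_external(state: Dict[str, str], action: str, granted: Set[str]) -> bool:
    """Allow an external action only if it is the one the block names
    as the next approval AND the human has granted that exact action."""
    if action != state.get("next_approval_required"):
        return False
    return action in granted
```

In this sketch, approving a different action (say, posting on X) does nothing for a task whose block only names `publish_to_devto`.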

The agent should report its current gate

The agent should also say what state the work is in.

For example:

Status: local draft only.
No external publication happened.
Next gate: human approval to publish.

or:

Status: posted.
Verified final URL.
No additional replies published.

This makes the workflow auditable.

It also prevents the human from having to infer what happened.

When agents are doing real work, "done" is not enough.

The agent should report:

what changed
where the artifact is
what was verified
what did not happen
what approval is needed next

That last part is important.

Good agents should not only complete tasks. They should make the boundary of the next task clear.
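
The five report items can be captured in a small structure so that none of them is silently skipped. This is only a sketch; the `StatusReport` class and its field names are invented for the example.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class StatusReport:
    """The five things the agent reports. Field names are illustrative."""
    changed: List[str]
    artifact: str
    verified: List[str]
    did_not_happen: List[str]
    next_approval: str

    def render(self) -> str:
        """Render one line per item, writing 'nothing' for empty lists."""
        lines = [
            "Changed: " + ("; ".join(self.changed) or "nothing"),
            "Artifact: " + self.artifact,
            "Verified: " + ("; ".join(self.verified) or "nothing"),
            "Did not happen: " + ("; ".join(self.did_not_happen) or "nothing"),
            "Next approval needed: " + self.next_approval,
        ]
        return "\n".join(lines)
```

Because every field is required by the constructor, the agent has to state what did not happen instead of leaving it to inference.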

Batch work needs an even stronger gate

Batch publishing is where approval ambiguity gets especially risky.

If an agent prepares 25 comments, does approval mean:

publish all 25 now?
publish the safest 5?
publish one per day?
publish only after checking each target again?
publish drafts after human edits?

Those are very different actions.

For batch workflows, I like adding two fields:

Approval scope:
Cadence decision:

Example:

Approval scope: first 5 comments only
Cadence decision: publish today, one by one, stop on warning

or:

Approval scope: all 10 posts approved as drafts
Cadence decision: schedule one per day at 10 AM

That tiny bit of structure prevents a lot of mistakes.

It also gives the agent a concrete stop condition.
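
A scope like "first 5 comments only" with "stop on warning" can be sketched as a loop with two hard limits. The `publish_batch` function and its `publish` callback are hypothetical; the callback stands in for whatever platform call a real workflow would use and is assumed to return `False` when the platform shows a warning.

```python
from typing import Callable, List


def publish_batch(
    items: List[str],
    approved_count: int,
    publish: Callable[[str], bool],
) -> List[str]:
    """Publish at most `approved_count` items, one by one.

    Stops at the first warning (publish returns False) instead of
    pushing through the rest of the batch. Returns what was published.
    """
    limit = max(approved_count, 0)  # a negative scope approves nothing
    published = []
    for item in items[:limit]:
        if not publish(item):
            break
        published.append(item)
    return published
```

The return value doubles as the "what changed" part of the status report: it lists exactly which items went out before the scope or a warning ended the run.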

This belongs inside skills

This is one reason I care about reusable agent skills.

A skill should not only say:

Here is how to perform the task.

It should also say:

Here is what requires approval.
Here is what can be done locally.
Here is how to verify the result.
Here is when to stop.

That is the difference between a tool and a workflow.

A tool gives the agent power.

A skill gives the agent operating rules.

A skill can encode approval gates as reusable policy, not just task steps.

For example, a publishing skill should define:

draft-only mode
review mode
publish mode
source/comment behavior
verification steps
rollback or correction process
rate-limit and spam-warning stop conditions

Without that, the agent has to infer the process from the conversation.

That is exactly where mistakes happen.

The rule I use

My current rule is simple:

If the action is external, public, paid, destructive, or hard to undo, the approval must name the action.

"Looks good" is enough for a draft.

It is not enough for publication.

"Ok" is enough to continue local work.

It is not enough to spend money, post publicly, email someone, or modify production.

For those actions, the approval should be explicit:

Publish this now.
Send this email.
Deploy to production.
Post these 5 comments.
Charge this card.

The goal is not to slow the agent down.

The goal is to make speed safe.
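
To show what "the approval must name the action" could look like in practice, here is a deliberately strict sketch that accepts only exact phrases. The phrase table, the action names, and `approved_action` are assumptions for illustration; a real workflow might use a confirmation button or a structured form instead of text matching.

```python
from typing import Optional

# Hypothetical mapping from explicit approval phrases to action names.
EXPLICIT_APPROVALS = {
    "publish this now": "publish",
    "send this email": "send_email",
    "deploy to production": "deploy_production",
    "post these 5 comments": "post_comments",
    "charge this card": "charge_card",
}


def approved_action(reply: str) -> Optional[str]:
    """Return the action a reply explicitly approves, or None.

    Matching is exact on purpose: a substring check would treat
    "do not deploy to production" as an approval.
    """
    normalized = reply.strip().lower().rstrip(".!")
    return EXPLICIT_APPROVALS.get(normalized)
```

Under this rule, "ok" and "looks good" return `None`, so the agent keeps working locally and asks before doing anything external.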

Final thought

The fix is not complicated: write the gates down.

Make the agent report which gate it is at.

And never let draft approval silently become publish approval.

I am collecting and building practical examples of this kind of agent workflow discipline at Terminal Skills: reusable skills that teach agents not only which tools to use, but how to work safely and repeatably.

Disclosure: I used AI assistance while drafting this article, then reviewed and edited it manually.
