
Alex Shev


Approval Gates for AI Agents: Draft Approval Is Not Publish Approval

One of the easiest ways to make an AI agent dangerous is to give it a vague approval.

Not because the agent is malicious.

Because humans use words like "ok", "approved", and "ship it" casually.

In a normal conversation, that is fine. In an automated workflow, it can become a bug.

If an agent is drafting an article, writing a tweet, preparing a pull request, generating social posts, or touching an external platform, the question is not just:

Did the human approve this?

The better question is:

What exactly did the human approve?

That distinction matters more than most agent tooling discussions admit.


The small approval bug that becomes a big workflow problem

Imagine this simple flow:

  1. An agent prepares 10 social posts.
  2. The human says "looks good".
  3. The agent publishes all 10 immediately.

Maybe that was correct.

Maybe it was not.

The human might have meant:

The copy looks good.

The agent interpreted it as:

Publish everything now.

That is not a model intelligence problem. It is a workflow design problem.

When an action leaves the local workspace, ambiguity gets expensive.

Publishing, emailing, commenting, deploying, charging a card, posting to social media, and modifying production systems should not depend on a loose interpretation of "ok."

They need explicit gates.


The approval types I separate now

For agent workflows, I like separating approval into at least four different meanings.

1. Approval to draft

This means:

Yes, prepare the thing.

The agent can research, outline, write, generate assets, create local files, and prepare a proposal.

But it cannot publish.

This is the safest default for content work.

For example:

Prepare a DEV.to draft about approval gates.

That should create a local markdown draft or an unpublished draft, depending on the workflow.

It should not post the article publicly.

2. Approval of the draft

This means:

The content direction is acceptable.

The human may be approving the argument, the structure, the tone, or the asset selection.

But that still does not automatically mean:

Publish it right now.

This is where many agent systems get sloppy.

Draft approval is not publish approval.

3. Approval to publish

This needs to be explicit.

The agent should know:

  • what platform
  • what asset or draft
  • whether to publish now or schedule
  • whether comments/replies/source attribution are included
  • whether the approval applies to one post or a batch

For example:

Publish this DEV.to article now.

or:

Post only the first X drafts, with the source as the first reply.

That is much safer than letting the agent infer a public action from a vague "ok."
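
One way to make those details explicit is to record a publish approval as structured data instead of a free-text reply. The following is a minimal sketch in Python, not a definitive implementation: the `PublishApproval` class, its field names, and the `may_publish` helper are all hypothetical and do not come from any particular agent framework.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class PublishApproval:
    """Hypothetical record of what the human actually approved."""
    platform: str          # which platform, e.g. "devto"
    target: str            # which asset or draft
    when: str              # "now" or "scheduled"
    include_replies: bool  # comments/replies/source attribution included?
    max_items: int         # 1 for a single post, N for a batch


def may_publish(approval: Optional[PublishApproval], platform: str, target: str) -> bool:
    """Allow publishing only when an approval names this platform and target."""
    if approval is None:
        # A vague "ok" never produces an approval record, so nothing is published.
        return False
    return approval.platform == platform and approval.target == target
```

With this shape, a casual "ok" has nowhere to go: either the human's reply can be turned into a complete record, or the agent has to ask a follow-up question.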

4. Approval for automation reminders

This one is subtle.

A reminder or cron job should often prepare work, not perform external actions.

For example:

Every morning, find 3 candidate topics and ask me which one to use.

That is different from:

Every morning, publish a post.

The first one keeps the human in the loop.

The second one creates a recurring external action, which is much riskier.

Most teams should start with reminder-based automation before action-based automation.
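
A reminder-based job can be sketched as a function that only prepares options and a question, and never calls anything external. This is an illustrative Python sketch; the `morning_reminder` function and the shape of its return value are assumptions made for the example.

```python
from typing import Dict, List


def morning_reminder(candidate_topics: List[str], limit: int = 3) -> Dict[str, object]:
    """Prepare work only: pick candidates and return a question for the human.

    This function deliberately has no publish step. It returns data for the
    human to act on instead of performing an external action.
    """
    options = candidate_topics[:limit]
    return {
        "options": options,
        "question": "Which topic should I use?",
        "external_actions_taken": 0,
    }
```

Turning this into action-based automation would mean adding a publish call inside the scheduled job, which is exactly the step that deserves its own explicit approval.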


Why this matters for CLI and agent workflows

CLI workflows make this even more important because agents can act fast.

A coding agent can edit files, run scripts, create branches, call APIs, deploy apps, write comments, and open browser sessions.

That speed is useful only if the boundaries are clear.

The workflow should define what the agent may do locally without asking and what requires a human gate.

For example, I am comfortable letting an agent do this freely:

  • inspect a repo
  • write a local draft
  • run tests
  • generate local assets
  • prepare a post
  • create a report
  • summarize findings

I want a stronger gate before this:

  • publish to DEV.to
  • post on X
  • send email
  • comment on Reddit
  • deploy to production
  • spend money
  • modify customer data

The difference is not whether the agent is capable.

The difference is whether the action is reversible, private, and low-risk.

Local work is cheap to revise.

External work creates a public or operational footprint.
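
The split between the two lists above can be sketched as a default-deny check. This is a minimal Python illustration under stated assumptions: the action names are hypothetical labels for the items listed above, and a real workflow would define its own.

```python
# Hypothetical action names for the local, freely allowed work listed above.
LOCAL_ACTIONS = {
    "inspect_repo",
    "write_local_draft",
    "run_tests",
    "generate_local_assets",
    "prepare_post",
    "create_report",
    "summarize_findings",
}


def requires_human_gate(action: str) -> bool:
    """Return False only for actions known to be local and cheap to revise.

    Everything else, including actions the workflow has never seen,
    stops and waits for a human gate.
    """
    return action not in LOCAL_ACTIONS
```

The design choice here is that unknown actions are treated as gated. An allowlist of local work fails safe when a new tool is added; a blocklist of external actions does not.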


A simple pattern: declare the gate in the task

One practical fix is to write the gate directly into the task.

Instead of:

Write an article about approval gates.

Use:

Prepare a local draft only. Do not publish. Return the file path and wait for explicit publish approval.

Instead of:

Make comments for Reddit.

Use:

Prepare comment drafts only. Do not post. Recommend the safest first set and wait for approval.

Instead of:

Deploy this.

Use:

Create a preview deployment first. Do not promote to production until I explicitly approve production.

This sounds boring, but boring is good here.

Good agent workflows are often just normal operational discipline written down clearly enough that the agent cannot guess wrong.

If the workflow supports metadata, make the gate machine-readable:

gate: draft_only
external_actions: false
next_approval_required: publish_to_devto
scope: single_article

That tiny block is not bureaucracy. It gives the agent a state it can report, verify, and refuse to exceed.
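
As a rough illustration of how an agent could read that block and refuse to exceed it, here is a minimal Python sketch. It assumes the metadata is plain `key: value` lines, and the `parse_gate` and `check_action` helpers are hypothetical names introduced for this example. The parser is deliberately tiny and is not a full YAML parser.

```python
GATE_TEXT = """\
gate: draft_only
external_actions: false
next_approval_required: publish_to_devto
scope: single_article
"""


def parse_gate(text: str) -> dict:
    """Parse simple 'key: value' lines into a dict of strings."""
    gate = {}
    for line in text.splitlines():
        if ":" in line:
            key, value = line.split(":", 1)
            gate[key.strip()] = value.strip()
    return gate


def check_action(gate: dict, action: str, is_external: bool) -> str:
    """Return 'allowed', or a refusal that names the approval still needed."""
    if is_external and gate.get("external_actions", "false") != "true":
        needed = gate.get("next_approval_required", "explicit approval")
        return f"refused: {action} needs {needed}"
    return "allowed"


gate = parse_gate(GATE_TEXT)
print(check_action(gate, "write_local_draft", is_external=False))
print(check_action(gate, "publish", is_external=True))
```

A missing `external_actions` key is treated as `false`, so an incomplete gate block blocks external work instead of permitting it.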


The agent should report its current gate

The agent should also say what state the work is in.

For example:

Status: local draft only.
No external publication happened.
Next gate: human approval to publish.

or:

Status: posted.
Verified final URL.
No additional replies published.

This makes the workflow auditable.

It also prevents the human from having to infer what happened.

When agents are doing real work, "done" is not enough.

The agent should report:

  • what changed
  • where the artifact is
  • what was verified
  • what did not happen
  • what approval is needed next

That last part is important.

Good agents should not only complete tasks. They should make the boundary of the next task clear.
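
The five report items above map naturally onto a small record type. The sketch below is one possible shape in Python; the `GateReport` class, its field names, and the example file path are hypothetical and only meant to show the idea.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class GateReport:
    """Hypothetical end-of-task report; field names are illustrative."""
    changed: List[str]
    artifact: str
    verified: List[str]
    did_not_happen: List[str]
    next_gate: str

    def render(self) -> str:
        """Format the report as plain text, one field per line."""
        return "\n".join([
            "Changed: " + ", ".join(self.changed),
            "Artifact: " + self.artifact,
            "Verified: " + (", ".join(self.verified) or "nothing"),
            "Did not happen: " + ", ".join(self.did_not_happen),
            "Next gate: " + self.next_gate,
        ])


report = GateReport(
    changed=["wrote local draft"],
    artifact="drafts/approval-gates.md",  # hypothetical path
    verified=[],
    did_not_happen=["external publication"],
    next_gate="human approval to publish",
)
print(report.render())
```

Making `did_not_happen` and `next_gate` required fields means the agent cannot produce a report that says only "done".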


Batch work needs an even stronger gate

Batch publishing is where approval ambiguity gets especially risky.

If an agent prepares 25 comments, does approval mean:

  • publish all 25 now?
  • publish the safest 5?
  • publish one per day?
  • publish only after checking each target again?
  • publish drafts after human edits?

Those are very different actions.

For batch workflows, I like adding two fields:

Approval scope:
Cadence decision:

Example:

Approval scope: first 5 comments only
Cadence decision: publish today, one by one, stop on warning

or:

Approval scope: all 10 posts approved as drafts
Cadence decision: schedule one per day at 10 AM

That tiny bit of structure prevents a lot of mistakes.

It also gives the agent a concrete stop condition.
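
The "first 5 only, one by one, stop on warning" example can be sketched as a loop with an explicit scope and stop condition. This is a minimal Python illustration: `publish_batch` is a hypothetical helper, and the `post` callable, assumed to return `"ok"` or a warning string, stands in for whatever platform call a real workflow would use.

```python
from typing import Callable, List


def publish_batch(
    drafts: List[str],
    approved_count: int,
    post: Callable[[str], str],
) -> List[str]:
    """Post at most approved_count drafts one by one; stop on the first warning.

    `post` is a stand-in for the real platform call and is assumed to
    return "ok" on success and anything else on a warning.
    """
    published = []
    for draft in drafts[:max(0, approved_count)]:
        if post(draft) != "ok":
            break  # stop condition: do not continue past a warning
        published.append(draft)
    return published
```

The function returns the list of drafts that were actually posted, so the agent can report exactly how far it got and what remains unpublished.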


This belongs inside skills

This is one reason I care about reusable agent skills.

A skill should not only say:

Here is how to perform the task.

It should also say:

Here is what requires approval.
Here is what can be done locally.
Here is how to verify the result.
Here is when to stop.

That is the difference between a tool and a workflow.

A tool gives the agent power.

A skill gives the agent operating rules.

It can encode approval gates as reusable policy, not just task steps.

For example, a publishing skill should define:

  • draft-only mode
  • review mode
  • publish mode
  • source/comment behavior
  • verification steps
  • rollback or correction process
  • rate-limit and spam-warning stop conditions

Without that, the agent has to infer the process from the conversation.

That is exactly where mistakes happen.
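
To make the draft, review, and publish modes concrete, a publishing skill could carry a small policy table that says which steps each mode permits. The following Python sketch is illustrative only; the `Mode` enum, the step names, and the `ALLOWED_STEPS` table are assumptions, not a description of any existing skill format.

```python
from enum import Enum


class Mode(Enum):
    DRAFT_ONLY = "draft_only"
    REVIEW = "review"
    PUBLISH = "publish"


# Hypothetical policy: which steps each mode permits.
ALLOWED_STEPS = {
    Mode.DRAFT_ONLY: {"write_draft"},
    Mode.REVIEW: {"write_draft", "request_review"},
    Mode.PUBLISH: {"write_draft", "request_review", "publish", "verify_url"},
}


def step_allowed(mode: Mode, step: str) -> bool:
    """Return True only if the skill's policy lists this step for this mode."""
    return step in ALLOWED_STEPS[mode]
```

Because the policy lives in the skill and not in the conversation, the same rules apply every time the skill runs, regardless of how the request was phrased.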


The rule I use

My current rule is simple:

If the action is external, public, paid, destructive, or hard to undo, the approval must name the action.

"Looks good" is enough for a draft.

It is not enough for publication.

"Ok" is enough to continue local work.

It is not enough to spend money, post publicly, email someone, or modify production.

For those actions, the approval should be explicit:

Publish this now.
Send this email.
Deploy to production.
Post these 5 comments.
Charge this card.

The goal is not to slow the agent down.

The goal is to make speed safe.
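
As a rough sketch of the rule, an agent could require that the human's reply match a phrase that names the action before doing anything gated. The Python below is deliberately strict and simplistic: the `EXPLICIT_PHRASES` table and the `approval_names_action` helper are hypothetical, and exact matching is used so that a reply like "ok" or "looks good" can never pass.

```python
# Hypothetical mapping from gated actions to the phrase that names them.
EXPLICIT_PHRASES = {
    "publish": "publish this now",
    "send_email": "send this email",
    "deploy_production": "deploy to production",
    "charge_card": "charge this card",
}


def approval_names_action(reply: str, action: str) -> bool:
    """Return True only if the reply is exactly the phrase naming this action."""
    phrase = EXPLICIT_PHRASES.get(action)
    if phrase is None:
        return False  # unknown gated action: never approved by default
    normalized = reply.strip().rstrip(".!").strip().lower()
    return normalized == phrase
```

Exact matching will reject some replies a human would consider clear. In that case the agent should ask again, which is cheaper than publishing on a misread approval.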


Final thought

The fix is not complicated: write the gates down.

Make the agent report which gate it is at.

And never let draft approval silently become publish approval.


I am collecting and building practical examples of this kind of agent workflow discipline at Terminal Skills: reusable skills that teach agents not only which tools to use, but how to work safely and repeatably.

Disclosure: I used AI assistance while drafting this article, then reviewed and edited it manually.
