Adamma for Ota

Originally published at ota.run

AI Agent Safety Needs Stop Signs, Not Just Instructions

AI agents do not only need better instructions.

They need stop signs.

That is one of the clearest reasons Ota exists as software execution governance for humans and AI agents. A repo should not merely tell an agent what it can try. It should declare what the agent must not do, when it must stop, and what requires human approval.

Prompts and AGENTS.md files are useful. They give agents context: how the project is organized, what style to follow, how to summarize changes, and which areas need caution.

But advice is not a boundary.

An instruction says:

Be careful with database commands.

A stop sign says:

Do not run destructive database commands unless explicitly approved.

An instruction says:

Avoid editing generated files.

A stop sign says:

These paths are protected. Stop if the requested edit falls outside the writable boundary.

That difference matters because modern agents are no longer passive readers. They inspect repos, choose commands, edit files, run checks, interpret failures, and report completion.

If the repo gives them only guidance, they still have to infer the boundary.

Ota’s position is sharper: agent execution should not depend on inference. It should be governed by the repo.

Instructions tell agents what to attempt

Most agent guidance is written as advice.

It says:

  • follow the existing style
  • prefer small changes
  • run tests before finishing
  • avoid touching generated files
  • do not expose secrets
  • explain what changed

That helps. It makes agents less generic and more aware of the repo they are working inside.

But it still leaves the dangerous questions open.

Which tests should the agent run?
Which commands are allowed?
Which files are generated?
Which services require approval?
Which failures mean “fix the code” and which mean “stop and ask”?
Which paths are out of bounds?

A capable agent may make reasonable guesses.

But reasonable guesses are not governance.

For low-risk editing, guidance may be enough. For repo execution, CI, automation, and agentic development, the repo needs something stronger.

Stop signs define when not to continue

A stop sign is not a suggestion.

It is a boundary.

In a repo, stopping rules should cover at least five areas.

1. Secrets and credentials

An agent should not invent secrets, request private values indirectly, or edit sensitive environment files just to make a task pass.

If a command needs an API key, database password, cloud token, or private credential, the correct behavior is not improvisation.

The correct behavior is to stop and report the blocker.
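As a rough illustration of "stop and report" for credentials, here is a minimal Python sketch. It is not Ota's implementation; the `Blocker` exception, the `require_secrets` helper, and the `PAYMENTS_API_KEY` name are all hypothetical, chosen only to show the shape of the behavior.

```python
import os


class Blocker(Exception):
    """Hypothetical signal: the agent must stop and report, not improvise."""


def require_secrets(names, env=None):
    """Return the requested values, or raise Blocker naming what is missing.

    The function never invents a placeholder value and never edits an
    environment file; a missing secret is reported as a blocker.
    """
    env = os.environ if env is None else env
    missing = [name for name in names if not env.get(name)]
    if missing:
        raise Blocker("missing credentials: " + ", ".join(missing))
    return {name: env[name] for name in names}


if __name__ == "__main__":
    try:
        # PAYMENTS_API_KEY is an illustrative name, not a real requirement.
        require_secrets(["PAYMENTS_API_KEY"], env={})
    except Blocker as blocker:
        print(f"STOP: {blocker}")
```

The design choice worth noticing is that the failure path carries the names of the missing values and nothing else, so the report is useful to a human without leaking anything.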

2. External services

Some tasks depend on systems outside the repo: cloud infrastructure, managed databases, payment providers, queues, object storage, or production-like services.

If those services are unavailable, the agent should not rewrite code to work around the failure.

It should identify the missing dependency and stop.

3. Unsafe mutation

Some commands change state.

deploy
publish
db:reset
terraform apply

These are not cousins of test, lint, or build.

If a task can mutate external state, delete data, publish packages, or affect infrastructure, the repo should not outsource that decision to the agent’s confidence.

That boundary should be declared.
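One way to picture a declared boundary for mutating commands is a small deny-by-default classifier. The sketch below is an assumption-laden illustration in Python: the `SAFE_TASKS` and `MUTATING_COMMANDS` lists and the `classify` function are hypothetical, and a real repo would declare its own policy rather than hard-code one.

```python
import shlex

# Hypothetical policy, for illustration only.
SAFE_TASKS = {"test", "lint", "build"}
MUTATING_COMMANDS = (
    ("deploy",),
    ("publish",),
    ("db:reset",),
    ("terraform", "apply"),
)


def classify(command: str) -> str:
    """Return 'needs_approval', 'safe', or 'undeclared' for a command line."""
    tokens = tuple(shlex.split(command))
    # Mutating commands are checked first so they can never pass as safe.
    for prefix in MUTATING_COMMANDS:
        if tokens[: len(prefix)] == prefix:
            return "needs_approval"
    if tokens and tokens[0] in SAFE_TASKS:
        return "safe"
    # Anything the policy does not mention is treated as a reason to stop.
    return "undeclared"


if __name__ == "__main__":
    for cmd in ("test", "terraform apply -auto-approve", "terraform plan"):
        print(cmd, "->", classify(cmd))
```

Note that `terraform plan` comes back as `undeclared` rather than `safe`: the sketch only trusts what is explicitly listed, which is the point of declaring the boundary instead of inferring it.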

4. Protected paths

Agents need to know where they can work.

Source files and tests may be open. Generated files, migrations, lockfiles, production config, and environment files may need review or approval.

This is not about slowing the agent down.

It is about preventing quiet damage in files that carry operational weight.
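A writable-boundary check can be quite small. The following Python sketch assumes a hypothetical layout (`src` and `tests` writable; generated code, migrations, lockfiles, and env files protected); the directory names, patterns, and the `can_write` helper are illustrative and not taken from any Ota contract.

```python
import posixpath
from fnmatch import fnmatchcase

# Hypothetical boundaries, for illustration only.
WRITABLE_DIRS = ("src", "tests")
PROTECTED_PATTERNS = (
    "src/generated/*",
    "migrations/*",
    "*.lock",
    ".env*",
    "*/.env*",
)


def can_write(path: str) -> bool:
    """Return True only if a repo-relative path is writable and not protected."""
    normalized = posixpath.normpath(path)
    # Reject absolute paths and anything that escapes the repo root.
    if posixpath.isabs(normalized) or normalized == ".." or normalized.startswith("../"):
        return False
    # Protected patterns win over writable directories.
    # fnmatch's '*' also matches '/', so patterns apply at any depth.
    if any(fnmatchcase(normalized, pattern) for pattern in PROTECTED_PATTERNS):
        return False
    return any(
        normalized == d or normalized.startswith(d + "/") for d in WRITABLE_DIRS
    )


if __name__ == "__main__":
    for candidate in ("src/app.py", "src/generated/api.py", "src/../.env"):
        print(candidate, "->", can_write(candidate))
```

Normalizing before matching matters here: `src/../.env` looks like it lives under `src`, but it resolves to a protected file at the repo root.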

5. Verification limits

Agents also need to know which checks actually finish.

A long-running dev server is not a verification result.
A watch mode is not a handoff signal.
A task that never terminates is not the same as a bounded check.

Agent-safe tasks need finite verification paths: run, finish, report status.

Without that, the agent may wait indefinitely, stop too early, or report success without a meaningful result.
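The "run, finish, report status" shape can be sketched with Python's standard `subprocess` module. The `run_bounded` helper and its status strings are hypothetical names for this illustration; the only real API used is `subprocess.run` with its `timeout` parameter.

```python
import subprocess
import sys


def run_bounded(argv, timeout_seconds):
    """Run a check with a hard time limit and return a small status record.

    A command that outlives the limit is killed by subprocess.run and
    reported as 'timed_out', never as a pass.
    """
    try:
        completed = subprocess.run(
            argv, capture_output=True, text=True, timeout=timeout_seconds
        )
    except subprocess.TimeoutExpired:
        return {"status": "timed_out", "exit_code": None}
    status = "passed" if completed.returncode == 0 else "failed"
    return {"status": status, "exit_code": completed.returncode}


if __name__ == "__main__":
    # A bounded check that finishes, and a stand-in for a watch mode that does not.
    print(run_bounded([sys.executable, "-c", "pass"], timeout_seconds=30))
    print(run_bounded([sys.executable, "-c", "import time; time.sleep(60)"], timeout_seconds=1))
```

Every call returns one of three outcomes, so the agent never has to decide on its own how long is long enough.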

This is execution governance

This is bigger than prompt quality.

If an agent runs a risky command, edits a protected file, or treats missing credentials as a code problem, the issue is not only that the agent made a poor choice.

The repo failed to govern execution.

Software execution governance means the repo can declare:

  • what it needs
  • how it should be prepared
  • what can be executed
  • what requires approval
  • where agents can write
  • when verification is complete
  • when execution must stop

That is the frame Ota is built around.

Not “better setup docs.”

Not “another task runner.”

Ota is the contract-first way to make execution boundaries explicit for humans, CI, automation, and AI agents.

How Ota makes stop signs explicit

In an Ota-backed repo, stopping rules do not have to live only in prose.

The contract can declare safe tasks, verification tasks, writable paths, protected paths, setup requirements, and readiness blockers.

That gives agents a governed operating model:

If the task is declared safe, proceed.
If setup is required, prepare from the contract.
If the contract is invalid, stop.
If secrets or credentials are missing, stop.
If the requested edit is outside writable paths, stop.
If the task mutates external state without approval, stop.
If verification is complete, report the result.

That is stronger than telling an agent to “be careful.”
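The operating model above reads naturally as a decision function. This Python sketch is one possible encoding, not Ota's logic: the `Situation` fields, the `Decision` values, the precedence of the checks (stop conditions first), and the default of stopping when a task is not declared safe are all assumptions made for illustration.

```python
from dataclasses import dataclass
from enum import Enum
from typing import Tuple


class Decision(Enum):
    PROCEED = "proceed"
    PREPARE = "prepare"
    REPORT = "report"
    STOP = "stop"


@dataclass
class Situation:
    """Hypothetical snapshot of what the agent knows before acting."""
    contract_valid: bool = True
    secrets_present: bool = True
    edit_within_writable_paths: bool = True
    mutates_external_state: bool = False
    mutation_approved: bool = False
    setup_required: bool = False
    task_declared_safe: bool = False
    verification_complete: bool = False


def decide(s: Situation) -> Tuple[Decision, str]:
    """Apply the stop rules first, then the rules that allow progress."""
    if not s.contract_valid:
        return Decision.STOP, "contract is invalid"
    if not s.secrets_present:
        return Decision.STOP, "secrets or credentials are missing"
    if not s.edit_within_writable_paths:
        return Decision.STOP, "requested edit is outside writable paths"
    if s.mutates_external_state and not s.mutation_approved:
        return Decision.STOP, "task mutates external state without approval"
    if s.verification_complete:
        return Decision.REPORT, "verification is complete"
    if s.setup_required:
        return Decision.PREPARE, "prepare from the contract"
    if s.task_declared_safe:
        return Decision.PROCEED, "task is declared safe"
    # Assumption: anything not declared safe is a reason to stop, not to guess.
    return Decision.STOP, "task is not declared safe"


if __name__ == "__main__":
    print(decide(Situation(task_declared_safe=True)))
    print(decide(Situation(task_declared_safe=True, secrets_present=False)))
```

Checking the stop conditions before anything else means a task being declared safe can never override a missing credential or an out-of-bounds edit.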

Ota’s agent quickstart follows this same principle: agents should prefer repo-local contracts when they exist, execute declared safe tasks, parse JSON output instead of scraping terminal prose, and stop when blockers involve secrets, credentials, external services, unsafe mutation, or paths outside declared boundaries.

The command surface supports that model:

  • ota doctor checks readiness and surfaces blockers before work begins.
  • ota validate checks whether the contract itself is usable.
  • ota tasks shows what work the repo has declared.
  • ota up --dry-run previews setup before changing the environment.
  • ota run <task> --json runs declared work and returns stable status for automation.
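To show why structured output beats scraping terminal prose, here is a small Python sketch that interprets a JSON result. The output schema is not documented in this post, so the `status` field and its values (`passed`, `failed`, `blocked`) are purely hypothetical placeholders, as is the `interpret_run_output` helper; consult Ota's own documentation for the real shape.

```python
import json

# Hypothetical status values; the real output schema may differ.
KNOWN_STATUSES = {"passed", "failed", "blocked"}


def interpret_run_output(raw: str) -> str:
    """Map machine-readable output to 'report', 'fix', or 'stop'.

    Anything that cannot be parsed or recognized is treated as a reason
    to stop rather than a result to guess at.
    """
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError:
        return "stop"
    if not isinstance(payload, dict):
        return "stop"
    status = payload.get("status")
    if status not in KNOWN_STATUSES:
        return "stop"
    if status == "passed":
        return "report"
    if status == "failed":
        return "fix"
    return "stop"  # 'blocked' needs a human, not a code change


if __name__ == "__main__":
    print(interpret_run_output('{"status": "passed"}'))
    print(interpret_run_output("All tests passed!"))
```

The second call stops even though the prose says the tests passed, because there is no structured status to rely on.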

The point is not that every agent action needs ceremony.

The point is that dangerous ambiguity should be removed before execution happens.

AGENTS.md still matters

This does not make AGENTS.md useless.

It means AGENTS.md should do what prose does best: explain context.

Use it for style, conventions, architectural notes, review expectations, and collaboration preferences.

Use Ota for the execution boundary.

A clean split looks like this:

AGENTS.md:
How the agent should behave.

ota.yaml:
What the repo allows, requires, verifies, and refuses.

One gives the agent context.

The other governs the repo.

Together, they produce a better operator: one that understands the project and knows where the guardrails are.

Stop signs build trust

Teams do not trust agents because agents sound confident.

They trust agents when the repo constrains what the agent can do, makes the approved path obvious, and produces evidence for what happened.

A good stop sign does not make agents less useful.

It makes them dependable.

It tells the agent:

Move quickly here.
Slow down here.
Stop here.
Ask here.
Report this.
Do not guess.

That is the behavior serious teams need as AI agents move from code suggestion into repo execution.

Conclusion

AI agents need instructions.

But instructions alone are not enough.

A repo that only tells agents what to do still leaves too much room for unsafe interpretation. The next layer is stopping rules: clear boundaries for secrets, external services, unsafe mutation, protected paths, and finite verification.

That is why Ota’s contract-first model matters.

It turns agent safety from advice into execution governance.

The future of AI-assisted development will not be won by repos that merely prompt agents better.

It will be won by repos that know when agents should stop.


