DEV Community

Ross

AI Agents Don’t Need More Prompts. They Need Execution Boundaries.

AI agents are moving from chat into action.

They can call tools, send emails, update records, delete data, trigger workflows, deploy code, issue refunds, change IAM permissions, and interact with MCP servers.

That shift is powerful.

It is also where things start to get dangerous.

Most AI safety conversations still focus on the model:

  • Can we make the model follow instructions?
  • Can we stop prompt injection?
  • Can we make the agent reason better?
  • Can we stop it hallucinating?

Those questions matter.

But they miss the moment that matters most:

What happens when the agent is about to actually do something?

Because at that point, the prompt is no longer the control surface.

The execution boundary is.

The problem: the agent can be wrong, but the side effect still happens

Imagine an agent connected to a refund tool.

The user asks:

Refund order ord-123 for £25.

The agent correctly calls:

```python
issue_refund(order_id="ord-123", amount_cents=2500)
```

Fine.

But now imagine the agent is prompt-injected, confused, compromised, or just wrong.

It calls:

```python
issue_refund(order_id="ord-456", amount_cents=250000)
```

Or it repeats the same refund twice.

Or it uses a proof meant for one customer against another customer.

Or it calls a more dangerous tool than the one the user actually authorised.

At that point, another system has to decide:

Is this exact action allowed to happen?

Not “does the model seem trustworthy?”

Not “did the prompt say to be careful?”

Not “does this look roughly similar to the original request?”

The question is:

Is there valid proof for this exact action, with these exact parameters, for this exact service, right now?

If not, the side effect should not execute.

The idea: no valid proof, no execution

I’ve been working on an open-source project called Actenon Kernel.

The idea is simple:

AI agents can propose actions. Protected systems decide whether those actions are allowed to execute.

Actenon is not a prompt filter.

It is not an output moderator.

It does not try to make the model truthful.

It sits at the execution boundary and refuses consequential actions unless the caller presents a cryptographic proof bound to the exact action being attempted.

That proof can bind:

  • the action name
  • the capability
  • the exact parameters
  • the target resource
  • the intended audience/service
  • expiry time
  • replay protection
  • policy or approval evidence

If the proof is missing, expired, replayed, audience-mismatched, malformed, or bound to different parameters, the action is refused before the side effect.
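
To make the binding concrete, here is a minimal sketch using only the Python standard library. It is not Actenon's implementation or API: the shared secret, the in-memory nonce cache, the function names (`mint_proof`, `check_proof`), and the refusal labels are all assumptions made for illustration. It shows one way a proof could be tied to the action, parameters, audience, expiry, and a one-time nonce.

```python
import hashlib
import hmac
import json
import time

SECRET = b"demo-only-secret"  # assumption: key shared by issuer and service
_seen_nonces = set()          # assumption: in-memory replay cache


def _digest(claims):
    # Canonical JSON so identical claims always produce the same MAC.
    payload = json.dumps(claims, sort_keys=True, separators=(",", ":"))
    return hmac.new(SECRET, payload.encode(), hashlib.sha256).hexdigest()


def mint_proof(action, params, audience, nonce, ttl_s=60, now=None):
    issued = time.time() if now is None else now
    claims = {
        "action": action,
        "params": params,
        "aud": audience,
        "nonce": nonce,
        "exp": issued + ttl_s,
    }
    return {"claims": claims, "mac": _digest(claims)}


def check_proof(proof, action, params, audience, now=None):
    """Return "ok", or a label describing why the call is refused."""
    if proof is None:
        return "missing"
    try:
        claims, mac = proof["claims"], proof["mac"]
        expected = _digest(claims)
    except (KeyError, TypeError):
        return "malformed"
    if not isinstance(mac, str) or not hmac.compare_digest(mac, expected):
        return "invalid_mac"
    current = time.time() if now is None else now
    if claims["exp"] < current:
        return "expired"
    if claims["aud"] != audience:
        return "audience_mismatch"
    if claims["action"] != action or claims["params"] != params:
        return "parameter_mismatch"
    if claims["nonce"] in _seen_nonces:
        return "replayed"
    _seen_nonces.add(claims["nonce"])  # consume the nonce on first success
    return "ok"
```

A proof minted for `{"order_id": "ord-123", "amount_cents": 2500}` returns `"ok"` once for that exact call, `"replayed"` on the second attempt, and `"parameter_mismatch"` if it is presented with a different order or amount. A real deployment would likely use asymmetric signatures and a shared replay store rather than a module-level set.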

A tiny example

The mental model looks like this:

```python
from actenon import ActenonGate

gate = ActenonGate.local_dev(audience="service:refunds")

action = gate.build_action(
    "refund.issue",
    "payment.refund",
    {"order_id": "ord-123", "amount_cents": 2500},
    target_type="order",
    target_id="ord-123",
    tenant_id="demo",
    requester_id="support-agent",
)

# Local demo only.
# In production, this proof would be minted by your auth layer,
# policy engine, approval workflow, or control plane.
proof = gate.mint_proof(action)

outcome = gate.protect(
    action,
    proof,
    lambda: issue_refund("ord-123", 2500),
    audience="service:refunds",
)
```

The important part is the lambda.

If the proof does not validate, that function never runs.

The model can ask.

The boundary decides.
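
The "function never runs" property can be sketched without any library at all. The wrapper below is a hypothetical stand-in, not how Actenon's `protect` is written: `Outcome`, the `check` callable, and the stub `issue_refund` are names assumed for illustration. The point is only that the side effect is passed as an uncalled function, so a refusal means it is never invoked.

```python
from dataclasses import dataclass
from typing import Any, Callable, Optional


@dataclass
class Outcome:
    executed: bool
    result: Any = None
    reason: Optional[str] = None


def protect(check: Callable[[], str], side_effect: Callable[[], Any]) -> Outcome:
    """Call side_effect only if check() returns "ok"; otherwise refuse."""
    try:
        verdict = check()
    except Exception as exc:
        # Fail closed: an error while checking counts as a refusal.
        return Outcome(executed=False, reason=f"check_error:{type(exc).__name__}")
    if verdict != "ok":
        # Refused: side_effect is never called.
        return Outcome(executed=False, reason=verdict)
    return Outcome(executed=True, result=side_effect())


calls = []  # records every refund that actually executed


def issue_refund(order_id, amount_cents):  # hypothetical stand-in for the real tool
    calls.append((order_id, amount_cents))
    return "refunded"


refused = protect(lambda: "expired", lambda: issue_refund("ord-456", 250000))
allowed = protect(lambda: "ok", lambda: issue_refund("ord-123", 2500))
```

After both calls, `calls` holds only the approved refund. The refused lambda was constructed but never invoked, which is why the side effect has to be handed over as a callable rather than called inline.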

Why this matters for MCP and agent tools

MCP makes it easier for agents to reach tools.

That is useful.

But it also means a model-visible tool can become a bridge into real systems: filesystems, databases, CRMs, terminals, deployment pipelines, payment systems, and internal admin workflows.

So the question becomes:

How does the tool decide whether a specific call should execute?

Actenon’s answer is that the MCP tool should not rely on the model behaving correctly. It should require proof at the point of execution.

A prompt-injected agent might call the tool.

The tool still refuses unless the proof matches the exact action.
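
One way to picture this at the tool layer is a decorator around the tool handler. The sketch below does not use the MCP SDK or Actenon. It assumes a simplified scheme in which an approval step records, server-side, a one-time token mapped to a digest of the exact call it covers; `approve`, `requires_proof`, and `delete_record` are hypothetical names.

```python
import functools
import hashlib
import json

# Assumption: an approval step outside the model writes to this table,
# mapping a one-time token to the digest of the exact call it covers.
_approvals = {}


def call_digest(tool_name, args):
    payload = json.dumps({"tool": tool_name, "args": args}, sort_keys=True)
    return hashlib.sha256(payload.encode()).hexdigest()


def approve(token, tool_name, args):
    _approvals[token] = call_digest(tool_name, args)


def requires_proof(tool_name):
    def decorator(fn):
        @functools.wraps(fn)
        def handler(args, proof_token=None):
            if not isinstance(proof_token, str):
                return {"status": "refused"}
            # pop() makes the token single-use, even on a failed attempt.
            expected = _approvals.pop(proof_token, None)
            if expected is None or expected != call_digest(tool_name, args):
                return {"status": "refused"}
            return {"status": "executed", "result": fn(**args)}
        return handler
    return decorator


@requires_proof("delete_record")
def delete_record(record_id):
    return f"deleted {record_id}"
```

If `approve("t-1", "delete_record", {"record_id": "r-1"})` has been recorded, the handler executes that call once. The same token replayed, a token presented with different arguments, or a call with no token at all is refused, regardless of what the model was told in its prompt.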

Why this is different from IAM

IAM answers:

Who or what has access?

Actenon answers:

Is this exact agentic action authorised right now?

Those are different controls.

An agent may have access to a refund API.

That does not mean every refund amount, every customer, every retry, and every target should be allowed.

IAM is necessary.

But for autonomous or semi-autonomous agents, it is not always granular enough at execution time.
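
The gap can be shown in a few lines. This is a toy contrast, not a model of any real IAM product: the permission table, `iam_allows`, and `action_allows` are hypothetical. Both calls pass the coarse permission check, and only the per-action check tells them apart.

```python
# Hypothetical coarse-grained permission table, in the spirit of IAM.
ROLE_PERMISSIONS = {"support-agent": {"refunds:write"}}


def iam_allows(role, permission):
    # Answers: does this identity have access to the refund API at all?
    return permission in ROLE_PERMISSIONS.get(role, set())


# Hypothetical record of the single action that was actually approved.
APPROVED_ACTION = {"order_id": "ord-123", "amount_cents": 2500}


def action_allows(params):
    # Answers: is this exact call, with these exact parameters, approved?
    return params == APPROVED_ACTION


requested = {"order_id": "ord-123", "amount_cents": 2500}
injected = {"order_id": "ord-456", "amount_cents": 250000}

for params in (requested, injected):
    print(iam_allows("support-agent", "refunds:write"), action_allows(params))
```

The role-level check passes for both calls because it never looks at the parameters. The action-level check is the only one that separates the approved refund from the injected one.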

Local demo

The repo includes a tiny interactive demo:

```bash
python examples/interactive_execution_demo.py
```

It shows:

```text
✅ approved refund: ord-123 £25.00 -> executed
🛑 hallucinated refund: ord-456 £2,500.00 -> refused / INTENT_MISMATCH
🛑 replay approved refund -> refused / DUPLICATE_REPLAY
🛑 refund with no proof -> refused / PCCB_REQUIRED
```

Only the approved action reaches the side-effect function.

Everything else is refused before the side effect runs.

What I’m looking for

I’d love feedback from people building with:

  • MCP
  • LangChain / LangGraph
  • Claude tools
  • OpenAI tool calling
  • coding agents
  • internal workflow agents
  • agentic CI/CD
  • AI admin tools
  • finance, healthcare, IAM, or regulated workflows

The question I’m trying to sharpen is:

Where should the proof boundary sit in real-world agent architectures?

Repo here, if useful:

https://github.com/Actenon/actenon-kernel

The goal is not to make every agent safe.

The goal is to make consequential action surfaces deterministic.

No valid proof, no execution.
