Enforra

Posted on May 21

System prompts are not a security boundary for AI agents

#ai #security #opensource #agents

AI agents are moving from generating text to taking actions.

They can run commands, send emails, issue refunds, update records, call internal tools, and touch production workflows.

That changes the security model.

A system prompt can guide an agent, but it should not be the thing that enforces policy.

If an action has a real side effect, there should be a control point before that action happens.

The problem

When an agent calls a tool, the important event is not the text the model generated.

The important event is the tool call.

That is where something real can happen.

A model can be manipulated.
A model can hallucinate.
A user can ask for something risky.
A prompt can be ignored or misunderstood.

So the question should not only be:

Did the model intend to do this?

It should also be:

Should this action be allowed under company policy?

A simple example

Imagine a support agent that can issue refunds.

A prompt might say:

Only issue refunds when appropriate.

That is useful guidance, but it is not enforcement.

A better pattern is to check the action before the refund tool executes.

For example:

Refund under $100: allow
Refund from $100 to $500: require approval
Refund over $500: block

Now the rule is not just hidden inside the prompt.

It is enforced before the tool callback runs.
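Here is a minimal sketch of that pattern in Python. Every name in it (`Decision`, `check_refund`, `issue_refund`, `guarded_refund`) is hypothetical and chosen for illustration; none of it is Enforra's API. It assumes that $100 and $500 themselves fall into the approval tier, and that non-positive amounts are rejected.

```python
from enum import Enum


class Decision(Enum):
    ALLOW = "allow"
    REQUIRE_APPROVAL = "require_approval"
    BLOCK = "block"


def check_refund(amount_usd: float) -> Decision:
    """Hypothetical policy check mirroring the three refund tiers."""
    if amount_usd <= 0:
        # Assumption: a zero or negative refund is never valid.
        return Decision.BLOCK
    if amount_usd < 100:
        return Decision.ALLOW
    if amount_usd <= 500:
        return Decision.REQUIRE_APPROVAL
    return Decision.BLOCK


def issue_refund(order_id: str, amount_usd: float) -> str:
    """Stand-in for the real refund tool callback (the side effect)."""
    return f"refunded ${amount_usd:.2f} on order {order_id}"


def guarded_refund(order_id: str, amount_usd: float) -> str:
    """Run the policy check first; call the tool only on ALLOW."""
    decision = check_refund(amount_usd)
    if decision is Decision.ALLOW:
        return issue_refund(order_id, amount_usd)
    if decision is Decision.REQUIRE_APPROVAL:
        # A real app would hand this to a human review queue.
        return f"queued for approval: ${amount_usd:.2f} on order {order_id}"
    return f"blocked: ${amount_usd:.2f} on order {order_id}"
```

The point of the sketch is that `issue_refund` is unreachable unless the check returns `ALLOW`, no matter what the model wrote or what the prompt said.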

What Enforra does

I’m building Enforra as an open-source SDK for runtime control of AI agents.

It sits before your tool callbacks and returns a decision before anything executes:

allow
block
require_approval
log_only

The application still owns the actual tool execution.

Enforra does not run your tools remotely.

It gives your app a policy decision before the callback is called.
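To make the flow concrete, here is a rough sketch of how an application might act on those four decisions. It does not use the Enforra SDK: `run_guarded`, `GuardResult`, and `example_policy` are made-up names. The meaning given to `log_only` (execute the call but record it) is an assumption, as is the choice to treat any unknown decision as a block.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Tuple

DECISIONS = {"allow", "block", "require_approval", "log_only"}


@dataclass
class GuardResult:
    decision: str
    executed: bool
    value: Any = None


def run_guarded(
    tool_name: str,
    args: Dict[str, Any],
    policy: Callable[[str, Dict[str, Any]], str],
    callback: Callable[..., Any],
    audit_log: List[Tuple[str, Dict[str, Any]]],
) -> GuardResult:
    """Ask the policy for a decision, then run the callback locally if permitted."""
    decision = policy(tool_name, args)
    if decision not in DECISIONS:
        decision = "block"  # assumption: fail closed on unknown decisions
    if decision == "log_only":
        # Assumption: log_only means "execute, but record the call".
        audit_log.append((tool_name, args))
    if decision in ("allow", "log_only"):
        return GuardResult(decision, True, callback(**args))
    # block and require_approval both stop here; the callback never runs.
    return GuardResult(decision, False)


def example_policy(tool_name: str, args: Dict[str, Any]) -> str:
    """Hypothetical policy keyed on the tool name only."""
    if tool_name == "send_email":
        return "log_only"
    if tool_name == "delete_record":
        return "block"
    return "allow"
```

The callback is an ordinary local function that the app passes in, so execution stays inside the application; the policy only decides whether that function gets called.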

Why this matters

As agents move into production, teams need more than prompt instructions and after-the-fact logs.

They need clear policy checks around actions that matter.

These checks become important once agents can touch money, customer data, internal systems, production infrastructure, or business workflows.

The goal is not to make agents less useful.

The goal is to make sure useful agents have a clear control point before they do something risky.

Open source

The initial Enforra SDK is here:


Curious to hear how others are handling this in production agent systems.

Are you putting policy checks before tool execution, or mostly relying on prompts, framework guardrails, and logs after the fact?