Cristian Diaz Koziuk

Let your LLM take real-world actions — without giving it the last word

Most "AI agent" tutorials wire the model straight to execution:

user asks → model decides → system runs

That's fine for a demo. It's dangerous the moment an action can charge a card,
send over a paid channel, publish content, or breach a plan limit. "The model
decided" is not an acceptable audit trail.

I kept rebuilding the same guardrails across projects, so I extracted the
pattern: Safe Automation Control Plane (SACP).

The idea, in one line

The AI proposes.
Hard rules decide what's allowed.
Validators decide what may execute.
Executors only run validated decisions.

The model never has authority. It optimizes inside a box that deterministic
rules draw for it, and every decision is validated, cached, costed and audited
before anything runs.
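
Here's a minimal sketch of those four lines in TypeScript. Every type and function name in it (`Proposal`, `decide`, `execute`, the reason codes) is made up for illustration and is not the sacp-core API; the point is only the ordering, with the proposal standing in for whatever the model suggested.

```typescript
// Hypothetical types for illustration only; this is not the sacp-core API.
type Proposal = { action: string; amount: number };
type RuleResult = { allowed: true } | { allowed: false; reasonCode: string };
type Decision =
  | { status: "validated"; proposal: Proposal }
  | { status: "blocked"; reasonCode: string };
type ValidatedDecision = Extract<Decision, { status: "validated" }>;

type Rule = (p: Proposal) => RuleResult;
type Validator = (p: Proposal) => string | null; // null means "no objection"

function decide(proposal: Proposal, rules: Rule[], validators: Validator[]): Decision {
  // Hard rules decide what is allowed.
  for (const rule of rules) {
    const result = rule(proposal);
    if (result.allowed === false) {
      return { status: "blocked", reasonCode: result.reasonCode };
    }
  }
  // Validators decide what may execute.
  for (const validate of validators) {
    const problem = validate(proposal);
    if (problem !== null) return { status: "blocked", reasonCode: problem };
  }
  return { status: "validated", proposal };
}

// The executor's signature only accepts a validated decision.
function execute(decision: ValidatedDecision, auditLog: string[]): void {
  auditLog.push(`ran ${decision.proposal.action} (${decision.proposal.amount})`);
}

const rules: Rule[] = [
  (p) => (p.amount <= 500 ? { allowed: true } : { allowed: false, reasonCode: "LIMIT_EXCEEDED" }),
];
const validators: Validator[] = [
  (p) => (Number.isInteger(p.amount) && p.amount > 0 ? null : "INVALID_AMOUNT"),
];

// Pretend these two proposals came back from the model.
const auditLog: string[] = [];
const small = decide({ action: "campaign_send", amount: 200 }, rules, validators);
const large = decide({ action: "campaign_send", amount: 900 }, rules, validators);
if (small.status === "validated") execute(small, auditLog);
if (large.status === "validated") execute(large, auditLog);
```

Because `execute` only takes a `ValidatedDecision`, the compiler rejects any attempt to run a blocked or unchecked proposal.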

Three composable pieces

  • Decision Engine — turns any action into a validated, audited decision. Rules first, AI second, validators last.
  • AI Model Layer — the only place the LLM lives: model selection, caching, usage metering, circuit breaking, schema validation, prompt-injection defense.
  • Outbound Gateway — one controlled door for every external API call: tokens, idempotency, retries, breaker, rate limit, cost ledger.
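
To make the "one controlled door" idea concrete, here's a rough sketch of two of the gateway's jobs, idempotency and a cost ledger, plus a crude call budget standing in for rate limiting. The `OutboundGateway` class below is my own toy version with assumed names, not the one shipped in sacp-core, and it leaves out tokens, retries and the breaker.

```typescript
// Hypothetical sketch; the class and its methods are assumptions, not sacp-core's API.
type LedgerEntry = { key: string; costCents: number };

class OutboundGateway {
  private readonly results = new Map<string, unknown>();
  readonly ledger: LedgerEntry[] = [];
  private readonly maxCalls: number;

  constructor(maxCalls: number) {
    this.maxCalls = maxCalls; // crude call budget standing in for a real rate limiter
  }

  // Runs `send` at most once per idempotency key and records its cost.
  call<T>(idempotencyKey: string, costCents: number, send: () => T): T {
    if (this.results.has(idempotencyKey)) {
      return this.results.get(idempotencyKey) as T; // replay: no second side effect
    }
    if (this.ledger.length >= this.maxCalls) {
      throw new Error("RATE_LIMITED");
    }
    const result = send(); // a throw here records nothing, so the key can be retried
    this.results.set(idempotencyKey, result);
    this.ledger.push({ key: idempotencyKey, costCents });
    return result;
  }
}

let sends = 0;
const sendMessage = (): string => {
  sends += 1;
  return "sent";
};

const gateway = new OutboundGateway(2);
const first = gateway.call("msg-42", 5, sendMessage);
const retry = gateway.call("msg-42", 5, sendMessage); // same key: replayed, not re-sent
```

The second call returns the stored result without touching `sendMessage`, so a retried job cannot send (or bill) twice.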

The part worth reading first: what broke

This came out of a production system, so there's a lessons-learned doc
of real bugs, not theory:

  • The model returned expiresAt dates from its training cutoff — already in the past. Lesson: the AI doesn't know real time; normalize time fields server-side.
  • The policy engine silently allowed everything because a lazy registry was never initialized in tests. A "fail open" default is a loaded gun.
  • A refresh-token race: two workers refreshing in parallel, the second consuming a token the first already rotated, leaving the account dead. Fixed with an atomic lease.
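
The first two lessons are small enough to sketch. This is a minimal illustration, not the production fix: `normalizeExpiresAt`, `FailClosedRegistry` and the `NO_POLICY_REGISTERED` code are names I'm assuming for the example, and the TTL values are placeholders.

```typescript
// Hypothetical helpers for illustration; names are assumptions, not sacp-core's API.

// Lesson 1: treat a model-supplied timestamp as a hint and clamp it to the server clock.
function normalizeExpiresAt(
  modelValue: string | undefined,
  now: Date,
  defaultTtlMs: number,
  maxTtlMs: number,
): Date {
  const parsed = modelValue === undefined ? NaN : Date.parse(modelValue);
  const nowMs = now.getTime();
  if (Number.isNaN(parsed) || parsed <= nowMs) {
    return new Date(nowMs + defaultTtlMs); // missing, unparseable, or already in the past
  }
  return new Date(Math.min(parsed, nowMs + maxTtlMs)); // cap far-future values
}

// Lesson 2: a missing policy means "deny", never "allow".
type PolicyResult = { allowed: boolean; reasonCode?: string };
type PolicyFn = (context: unknown) => PolicyResult;

class FailClosedRegistry {
  private readonly policies = new Map<string, PolicyFn>();

  register(action: string, fn: PolicyFn): void {
    this.policies.set(action, fn);
  }

  evaluate(action: string, context: unknown): PolicyResult {
    const fn = this.policies.get(action);
    if (fn === undefined) {
      // Also covers a registry that nobody populated.
      return { allowed: false, reasonCode: "NO_POLICY_REGISTERED" };
    }
    return fn(context);
  }
}

const HOUR_MS = 60 * 60 * 1000;
const now = new Date("2025-01-01T00:00:00Z");
const stale = normalizeExpiresAt("2023-04-01T00:00:00Z", now, HOUR_MS, 24 * HOUR_MS);

const registry = new FailClosedRegistry();
const unregistered = registry.evaluate("campaign_send", {});
```

Passing `now` in explicitly keeps the clock under the server's control and makes the helper easy to test; the empty registry denies by default, so a forgotten initialization shows up as blocked actions instead of silent approvals.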

Most of these aren't AI bugs — they're the bugs of putting a non-deterministic
component inside a deterministic, audited system.
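
For the refresh-token race, the shape of the fix looks roughly like this. It's an in-memory sketch under loud assumptions: `LeaseStore` and `refreshWithLease` are hypothetical names, the rotate step is synchronous to keep it short, and a real deployment would do the check-and-set as one atomic operation in a shared store.

```typescript
// Hypothetical in-memory sketch, not the sacp-core implementation. In production the
// check-and-set in tryAcquire would be one atomic operation in a shared store.
type Lease = { owner: string; expiresAt: number };
type TokenPair = { accessToken: string; refreshToken: string };

class LeaseStore {
  private readonly leases = new Map<string, Lease>();

  // Returns true only for the caller that obtains (or already holds) the lease.
  tryAcquire(key: string, owner: string, ttlMs: number, nowMs: number): boolean {
    const current = this.leases.get(key);
    if (current !== undefined && current.expiresAt > nowMs && current.owner !== owner) {
      return false; // another worker holds a live lease
    }
    this.leases.set(key, { owner, expiresAt: nowMs + ttlMs });
    return true;
  }

  release(key: string, owner: string): void {
    if (this.leases.get(key)?.owner === owner) this.leases.delete(key);
  }
}

// Only the lease holder rotates the token; everyone else backs off.
// `rotate` is synchronous here to keep the sketch short; a real provider call is async.
function refreshWithLease(
  store: LeaseStore,
  tokens: Map<string, TokenPair>,
  accountId: string,
  worker: string,
  nowMs: number,
  rotate: (old: TokenPair) => TokenPair,
): "refreshed" | "skipped" {
  const key = `refresh:${accountId}`;
  if (!store.tryAcquire(key, worker, 30000, nowMs)) return "skipped";
  try {
    const current = tokens.get(accountId);
    if (current === undefined) throw new Error("UNKNOWN_ACCOUNT");
    tokens.set(accountId, rotate(current));
    return "refreshed";
  } finally {
    store.release(key, worker);
  }
}

const store = new LeaseStore();
const tokens = new Map<string, TokenPair>([
  ["acct_1", { accessToken: "a1", refreshToken: "r1" }],
]);
const events: string[] = [];
const rotate = (old: TokenPair): TokenPair => ({
  accessToken: "a2",
  refreshToken: `${old.refreshToken}-rotated`,
});

// Worker B shows up while worker A is still inside its refresh.
const outcomeA = refreshWithLease(store, tokens, "acct_1", "worker-a", 0, (old) => {
  events.push(`worker-b:${refreshWithLease(store, tokens, "acct_1", "worker-b", 1, rotate)}`);
  return rotate(old);
});
events.push(`worker-a:${outcomeA}`);
```

Worker B is turned away while A holds the lease, so the old refresh token is consumed exactly once. The lease TTL is a safety net for a crashed holder and should comfortably exceed the provider call's timeout.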

Try it


```bash
npm install sacp-core
```

```typescript
import { DecisionEngine, PolicyEngine } from 'sacp-core';

// Deterministic rule: allow the send only when the balance covers the cost.
const policy = new PolicyEngine();
policy.register('router_ai.campaign_send', (snap) => {
  const ctx = snap.context as { balance: number; cost: number };
  return ctx.balance >= ctx.cost
    ? { allowed: true }
    : { allowed: false, reasonCode: 'BALANCE_INSUFFICIENT' };
});

// No model wired yet → a conservative rule-only decision, never an exception.
const engine = new DecisionEngine({ policy });
const { output } = await engine.decide({
  tenantId: 't_123',
  action: { type: 'campaign_send' },
  risk: { riskLevel: 'low' },
  context: { balance: 1000, cost: 200 },
});
// output.decision → 'allow' | 'block' | 'require_approval' | 'split'
```

Zero runtime dependencies and a ports-and-adapters design: your database and
model provider stay yours. There's a runnable Claude adapter example
with structured outputs and refusal handling.

Honest caveat: the token-savings numbers in the docs are an illustrative
cost model, not a measured benchmark — they ship with a formula you plug your
own rates into.
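
To show what "plug your own rates in" can look like, here's a toy cost model. It is not the formula from the SACP docs, and every number in it is a placeholder I picked for the example; it simply assumes that requests settled by rules or served from cache never reach the model.

```typescript
// Illustrative only: NOT the formula from the SACP docs. Every rate and ratio
// below is a placeholder to replace with your own measurements and pricing.
type Rates = { inputUsdPerMTok: number; outputUsdPerMTok: number };
type Workload = {
  requests: number;
  inputTokensPerCall: number;
  outputTokensPerCall: number;
  ruleOnlyRate: number; // share of requests settled by rules, no model call
  cacheHitRate: number; // share of the remaining requests served from cache
};

function modelCostUsd(calls: number, w: Workload, r: Rates): number {
  const inputTokens = calls * w.inputTokensPerCall;
  const outputTokens = calls * w.outputTokensPerCall;
  return (inputTokens * r.inputUsdPerMTok + outputTokens * r.outputUsdPerMTok) / 1000000;
}

function estimate(
  w: Workload,
  r: Rates,
): { baselineUsd: number; controlledUsd: number; savedUsd: number } {
  const callsReachingModel = w.requests * (1 - w.ruleOnlyRate) * (1 - w.cacheHitRate);
  const baselineUsd = modelCostUsd(w.requests, w, r);
  const controlledUsd = modelCostUsd(callsReachingModel, w, r);
  return { baselineUsd, controlledUsd, savedUsd: baselineUsd - controlledUsd };
}

const example = estimate(
  {
    requests: 10000,
    inputTokensPerCall: 1000,
    outputTokensPerCall: 500,
    ruleOnlyRate: 0.5,
    cacheHitRate: 0.5,
  },
  { inputUsdPerMTok: 2, outputUsdPerMTok: 8 },
);
```

Whatever comes out is only as good as the `ruleOnlyRate` and `cacheHitRate` you feed in, so measure those on your own traffic before quoting a savings figure.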

Repo (MIT, EN/ES docs): https://github.com/cristiandkzk/SACP

I'd love to hear how others handle "AI proposes, rules dispose."