The Invoice Problem
Your agent just approved a $47,000 invoice to a vendor it has never seen before. At 2 AM. On a Saturday.
The model that powered the decision passed all safety checks — the output was not toxic, not biased, not hallucinated. The function call was syntactically correct. The tool executed successfully. By every standard metric in the AI safety ecosystem, nothing went wrong.
Except that the agent had a $5,000 financial limit. The vendor was not in the approved supplier list. The time-of-day risk profile was elevated. And the person who delegated authority to this agent explicitly excluded wire transfers from its scope.
None of these constraints exist in the model. They exist in the organization. And today, almost nobody is enforcing them.
The Gap Nobody Talks About
The AI safety conversation has been dominated by model-level concerns: alignment, jailbreaks, hallucination, content policy. These are real problems with real teams working on them. OpenAI, Anthropic, Google, and Meta all invest heavily in making model outputs safe.
But there is a different category of problem that emerges the moment an agent gets a credit card, a database connection, a deployment key, or an email account. The question shifts from "is this output harmful?" to a set of questions that no model provider can answer:
- Is this agent authorized to perform this action?
- Who delegated that authority, and do they have it to give?
- Does this action violate organizational policy?
- What is the quantified risk, given the agent's track record?
- Does this conflict with something another agent is already doing?
- Can we prove all of the above to an auditor?
These are not AI problems. These are institutional governance problems. They existed for human employees — hiring policies, delegation of authority, separation of duties, spending limits, audit trails. We built entire professions (compliance, internal audit, risk management) to handle them.
When agents become economic actors — entities that commit real resources on behalf of real principals — we need the same infrastructure, rebuilt for machines.
Why I Built This
I spent 15 years designing the financial and operational control systems institutions run on — Delegation of Authority structures, procurement controls, approval chains, separation-of-duty frameworks. I have worked across government entities, publicly listed corporate groups, and the world's first university dedicated to artificial intelligence. I project-managed a UAE Federal Government mandate reengineering 200+ institutional processes. I contributed to a $70M+ ERP deployment spanning 36 subsidiaries and 30,000 employees. I have worked across 500+ institutional policies and governance frameworks.
The pipeline inside AgentCTRL — authority graphs, policy evaluation, delegation chains, risk scoring — is not an invention. It is a digitization of the control systems I built my career on. Those systems were built for humans. AgentCTRL rebuilds them for AI agents.
When I looked at what the AI industry was building for agent governance, I saw prompt-level instructions, content filters, and simple guardrails. I did not see Delegation of Authority. I did not see separation of duty. I did not see the institutional infrastructure that every enterprise already requires for human employees. That gap is what AgentCTRL exists to close.
Why Model Providers Cannot Ship This
The argument against a standalone governance layer is: "Won't OpenAI / Anthropic just build this?" The short answer is no, for the same structural reasons that Nvidia does not build Stripe.
Multi-model governance. OpenAI cannot govern an agent running on Claude. Cross-model enforcement requires an independent layer.
Organization-specific authority. "VP of Finance approves up to $50K, CFO above that, agents inherit from their creating user" is not AI capability. It is organizational structure. Every customer's authority graph is different.
Regulatory audit trails. Auditors do not trust the system being governed to audit itself. The enforcement layer and the execution layer must be separate.
Cross-agent policy composition. When 15 agents across procurement, finance, and operations are executing concurrently, the governance question is not single-action safety. It is composite organizational risk.
AWS ships IAM. Enterprises still buy SailPoint. Oracle ships audit logging. Enterprises still buy Imperva. Cloud providers ship security groups. Enterprises still buy Wiz. Platform vendors always ship basic governance as a feature. Dedicated governance layers survive when compliance requirements exceed the built-in, customers are multi-vendor, and governance must be independent of the thing being governed.
What Structural Enforcement Looks Like
AgentCTRL is a Python library — zero dependencies, framework-agnostic — that implements a 5-stage sequential decision pipeline. Every action proposed by an agent is checked against autonomy constraints, policy rules, delegated authority limits, quantified risk, and cross-agent conflicts. Each stage can short-circuit the pipeline. The result is always one of three: ALLOW (execute it), ESCALATE (ask a human), or BLOCK (stop it). If any stage throws an error, the pipeline defaults to BLOCK. Fail-closed, always.
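The short-circuiting, fail-closed behavior can be sketched in a few lines of plain Python. This is an illustration of the pattern, not AgentCTRL's actual internals; the stage functions and names are hypothetical:

```python
from enum import Enum

class Decision(Enum):
    ALLOW = "allow"
    ESCALATE = "escalate"
    BLOCK = "block"

def run_pipeline(action, stages):
    """Run each stage in order; any non-ALLOW result short-circuits.

    Any exception inside a stage resolves to BLOCK (fail-closed):
    an error never results in the action executing.
    """
    for stage in stages:
        try:
            result = stage(action)
        except Exception:
            return Decision.BLOCK  # fail-closed on stage errors
        if result is not Decision.ALLOW:
            return result  # short-circuit on ESCALATE or BLOCK
    return Decision.ALLOW
```

The ordering matters: cheap structural checks (autonomy, policy) run before expensive ones (risk scoring, conflict detection), so most blocked actions never reach the later stages.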
Back to the opening scenario. That $47,000 invoice to an unknown vendor at 2 AM on a Saturday? Here is what happens when the action passes through the pipeline:
- Autonomy: The agent is level 2 — cleared for invoice.approve. Passes.
- Policy: The organization has a rule: invoices above $5,000 require escalation. The amount is $47,000. ESCALATE.
The pipeline short-circuits. The invoice never gets approved. A human reviews it. The audit trail records exactly what happened, why, and who delegated the authority that the agent was operating under.
But it goes deeper. If the policy had allowed amounts up to $50,000, the pipeline would have continued:
- Authority: The agent's delegation chain gives it a $10,000 financial limit from the VP of Finance. $47,000 exceeds this. ESCALATE.
- Risk scoring: Novel vendor (+20%), off-hours (+10%), high-value (+25%). Combined risk: 0.75 (CRITICAL). ESCALATE.
Three independent layers, each capable of catching the problem. This is not prompt engineering. Policies cannot be jailbroken. Authority limits cannot be talked around. Risk scores are deterministic, not probabilistic. The tool call does not happen unless the pipeline approves it.
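The authority and risk stages from that walkthrough can be sketched as follows. Everything here is illustrative: the additive risk model and the 0.20 base risk for invoice approval are assumptions chosen so the factors combine to the 0.75 in the scenario; AgentCTRL's real formula may differ.

```python
def check_authority(amount, chain_limits):
    """Effective authority is the tightest financial limit
    anywhere in the delegation chain."""
    return "ALLOW" if amount <= min(chain_limits) else "ESCALATE"

def score_risk(base, factors, threshold=0.70):
    """Additive model (an assumption): base action risk plus
    situational factors, capped at 1.0."""
    score = min(1.0, base + sum(factors))
    return score, ("ESCALATE" if score >= threshold else "ALLOW")

# The scenario: $47,000 against a $10,000 limit delegated by the VP of Finance.
authority_decision = check_authority(47_000, [10_000])

# Assumed base risk 0.20, plus novel vendor (0.20), off-hours (0.10),
# high-value (0.25) -> 0.75, crossing a CRITICAL threshold.
risk_score, risk_decision = score_risk(0.20, [0.20, 0.10, 0.25])
```

Either check alone is enough to stop the invoice; together they give the defense in depth the scenario describes.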
Both Directions
Most governance conversations focus on outbound: controlling what your agents do. But there is an equally important inbound question: when someone else's agent calls your API, accesses your MCP tools, or triggers your webhooks, who decides whether to let them in?
AgentCTRL handles both. The same pipeline, configured with different policies:
Outbound: Your finance agent wants to approve an invoice. Is the amount within its limit? Does policy allow it? Is the risk acceptable?
Inbound: An external agent calls your /v1/customers endpoint. Is it verified? Does it have the right credentials? Is it allowed to access PII? Same five stages. Same ALLOW / ESCALATE / BLOCK.
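One way the inbound case could look, sketched as a config-driven check. The endpoint rules, field names, and scope strings here are hypothetical, not AgentCTRL's API:

```python
INBOUND_POLICY = {
    # Hypothetical rules for a PII-bearing endpoint.
    "/v1/customers": {
        "requires_verified": True,
        "pii": True,
        "allowed_scopes": {"customers.read"},
    },
}

def evaluate_inbound(caller, endpoint):
    """Decide whether an external agent's request may proceed."""
    rules = INBOUND_POLICY.get(endpoint)
    if rules is None:
        return "BLOCK"  # fail-closed: ungoverned endpoints are never exposed
    if rules["requires_verified"] and not caller.get("verified"):
        return "BLOCK"  # unverified callers are rejected outright
    if not rules["allowed_scopes"] & set(caller.get("scopes", [])):
        return "BLOCK"  # no overlapping credential scope
    if rules["pii"] and not caller.get("pii_cleared"):
        return "ESCALATE"  # PII access gets a human decision
    return "ALLOW"
```

Note the same fail-closed stance as the outbound pipeline: an endpoint with no policy entry, or a caller missing credentials, resolves to BLOCK rather than ALLOW.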
This matters because the agent-to-agent economy is coming. When your agent negotiates with a vendor's agent, both sides need governance. Not just safety — institutional governance with audit trails.
Trust as Credit
Static permission systems assume you know what an agent will do before it does it. The real world does not work that way.
AgentCTRL's trust calibration system treats agent trust like credit. New agents start with no track record — everything escalates. As an agent accumulates governed actions (50+ actions with >90% success rate), it earns a risk discount. The pipeline becomes more permissive for agents that demonstrate reliability, and tightens for agents that do not.
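The credit analogy can be sketched like this. The 50-action and 90% thresholds are the figures above; the 0.15 discount size is an invented placeholder, not AgentCTRL's actual calibration:

```python
def trust_discount(actions, success_rate,
                   min_actions=50, min_success=0.90, discount=0.15):
    """Risk discount earned from track record.

    No track record means no credit: agents below either
    threshold get a discount of zero.
    """
    qualified = actions >= min_actions and success_rate > min_success
    return discount if qualified else 0.0

def calibrated_risk(raw_risk, actions, success_rate):
    """Apply the earned discount to a raw risk score, floored at zero."""
    return max(0.0, raw_risk - trust_discount(actions, success_rate))
```

The key property is asymmetry: a new agent pays full price on every risk score, while a proven agent's identical action scores lower and is less likely to escalate.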
This is the architecture for dynamic autonomy — not a feature we claim is complete, but the structural foundation. The pipeline already evaluates trust context. The risk engine already applies calibration discounts. What comes next is making those thresholds adaptive.
The Real Scarcity
Here is the question nobody in AI is asking yet: what is the cost of a human looking at something?
The cost of AI compute is approaching zero. A token costs fractions of a cent. An agent can evaluate a thousand invoices in the time it takes a human to review one. But human judgment — the ability to assess a novel situation, weigh organizational context, and make the call — is finite. There are 8 billion humans. Each has roughly 10 productive hours a day. That number does not change regardless of how many agents exist.
Today, governance systems ask: "Is this allowed?" That question produces a binary answer. A blanket $5,000 approval threshold wastes human attention on routine transactions that the agent handles perfectly, while missing novel $500 risks that actually need a human eye.
The better question is: "Is this worth consuming human attention?" That reframes governance from static rule enforcement to attention-cost optimization. The expected loss from an autonomous agent error, weighed against the cost of interrupting a human, is the calculation that matters.
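The comparison is simple to state. This is an illustrative decision rule only, with made-up numbers, describing a direction rather than a shipped feature:

```python
def worth_human_attention(p_error, loss_if_wrong, human_review_cost):
    """Route to a human only when the expected loss from letting the
    agent act autonomously exceeds the cost of interrupting someone."""
    expected_loss = p_error * loss_if_wrong
    return expected_loss > human_review_cost

# Routine $4,800 invoice the agent handles well: expected loss $4.80,
# below a (hypothetical) $50 review cost -> let the agent proceed.
# Novel $500 action with a 30% error rate: expected loss $150 -> escalate.
```

Under this rule the blanket $5,000 threshold inverts exactly as described: the routine high-value transaction runs autonomously while the small novel one gets a human eye.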
We are not building this yet. But it is the direction the architecture supports, and it is the question that will define the next generation of governance systems. Static rules were built for a world where humans did the work and needed to be checked. In a world where agents do the work, the governance system's job is to protect the scarcest resource: human judgment.
What This Is Not
AgentCTRL is not a prompt filter. It does not look at model outputs.
It is not an orchestration framework. It does not run agents or manage workflows.
It is not a model-level safety tool. It does not compete with RLHF, constitutional AI, or content classifiers.
It is the layer that answers: "Given that this agent wants to take this action, with these parameters, at this time — should the action actually execute?" That question is orthogonal to whether the model is aligned. A perfectly aligned model can still produce an action that violates organizational policy.
Try It
pip install agentctrl
python -m agentctrl # see the pipeline demo
agentctrl validate '{"agent_id": "analyst", "action_type": "invoice.approve", "action_params": {"amount": 6000}}'
agentctrl init # scaffold starter policies + authority config
74 tests. Zero dependencies. Apache 2.0.
The code is at github.com/moeintel/AgentCTRL. The PyPI package is agentctrl.
I built this because AI agents need the same institutional controls that human employees have had for decades. Not because agents are dangerous — because agents are economic actors, and economic actors need institutional infrastructure.
Those systems were built for humans. AgentCTRL rebuilds them for AI agents.
AgentCTRL is built by MoeIntel. Created by Mohammad Abu Jafar.
