"Won't AI just get better at this?"
As language models improve, won't they eventually become reliable enough to handle money without external guardrails?
The short answer: No. And understanding why reveals something fundamental about how we should think about AI safety.
The Improvement Trajectory
Let's steel-man the argument. AI capabilities are improving rapidly:
- Better alignment: Constitutional AI, RLHF, and new training techniques make models more reliable at following instructions
- Longer context: Models can now hold millions of tokens, reducing "forgotten" instructions
- Formal reasoning: Chain-of-thought and tool use make agents more predictable
- Agent frameworks: LangChain, CrewAI, and others add structure around LLM decision-making
Given this trajectory, why wouldn't AI agents eventually become trustworthy enough to handle financial transactions without external policy enforcement?
The Fundamental Distinction: Probabilistic vs Deterministic
Here's the core issue: LLMs are probabilistic by design. They predict the next token based on statistical patterns. Even a 99.99% reliable model fails 1 in 10,000 times.
For most applications, 99.99% is excellent. For financial transactions, it's not good enough.
Consider a trading agent making 1,000 transactions per day. At 99.99% reliability:
- Per day: 0.1 expected failures
- Per month: 3 expected failures
- Per year: 36 expected failures
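The arithmetic, spelled out as a quick sketch (assuming each transaction fails independently at a fixed rate):

```typescript
// Back-of-the-envelope expected failures, assuming each transaction
// fails independently at a fixed per-transaction rate.
const failureRate = 1 - 0.9999;                  // 0.01% per transaction
const transactionsPerDay = 1_000;

const perDay = failureRate * transactionsPerDay; // ≈ 0.1
const perMonth = perDay * 30;                    // ≈ 3
const perYear = perMonth * 12;                   // ≈ 36

console.log({ perDay, perMonth, perYear });
```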
Now imagine one of those failures is "send entire balance to wrong address" instead of "minor formatting error." The expected value of that tail risk is catastrophic.
A deterministic policy, such as `if (amount > dailyLimit) reject()`, has a 0% failure rate. Not 99.99%. Zero. The transaction either passes or it doesn't. There's no statistical distribution of outcomes.
This isn't about AI being bad. It's about the mathematical difference between probabilistic and deterministic systems.
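To make that concrete, here's a minimal sketch of a deterministic policy check in TypeScript. The field names and the `PolicyResult` shape are illustrative, not any particular SDK's API:

```typescript
interface TransactionRequest {
  amount: number;      // in the account's base currency
  recipient: string;
  dailySpent: number;  // amount already spent today
}

interface Policy {
  dailyLimit: number;
  perTransactionCap: number;
  allowedRecipients: Set<string>;
}

type PolicyResult = { allowed: true } | { allowed: false; reason: string };

// Deterministic: the same request against the same policy always produces
// the same result. No model, no sampling, no prompt.
function checkTransaction(req: TransactionRequest, policy: Policy): PolicyResult {
  if (req.amount > policy.perTransactionCap) {
    return { allowed: false, reason: "per-transaction cap exceeded" };
  }
  if (req.dailySpent + req.amount > policy.dailyLimit) {
    return { allowed: false, reason: "daily limit exceeded" };
  }
  if (!policy.allowedRecipients.has(req.recipient)) {
    return { allowed: false, reason: "recipient not on whitelist" };
  }
  return { allowed: true };
}
```

Given the same request and the same policy, this function returns the same answer every time. That's the whole point.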
The Seatbelt Analogy
Cars have gotten dramatically safer over the decades:
- Crumple zones
- Anti-lock brakes
- Electronic stability control
- Autonomous emergency braking
- Lane departure warnings
And yet we still have seatbelts. We still have airbags. We still have speed limits.
Why? Because safety systems are layered. Each layer handles different failure modes. The fact that cars rarely crash doesn't mean we remove the protection for when they do.
The same principle applies to AI agents:
| Layer | Purpose | Type |
|---|---|---|
| Training/RLHF | Make the model generally safe | Probabilistic |
| System prompts | Guide behaviour for this use case | Probabilistic |
| Agent framework | Add structure and validation | Mixed |
| Policy layer | Hard limits that cannot be exceeded | Deterministic |
Improving Layer 1 doesn't eliminate the need for Layer 4. They serve different purposes.
Separation of Concerns
There's an architectural principle at play here: the system making decisions shouldn't also control the guardrails.
If your AI agent both decides what to spend AND enforces spending limits, those limits exist only as long as the agent respects them. A sufficiently convincing prompt injection, a hallucination, or an edge case in the training data could bypass them.
External policy enforcement creates a separation:
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────┐
│    AI Agent     │────▶│  Policy Layer   │────▶│  Execution  │
│    (decides)    │     │   (enforces)    │     │   (acts)    │
└─────────────────┘     └─────────────────┘     └─────────────┘
         │                       │
   Probabilistic           Deterministic
 Can be influenced     Cannot be influenced
     by inputs             by the agent
```
The policy layer doesn't care how good the AI is. It doesn't care if the AI was jailbroken or made a mistake or had a brilliant insight. It just checks: does this transaction comply with the rules? Yes or no.
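In code, that separation might look something like the sketch below, reusing `checkTransaction` from the earlier example. The agent and executor interfaces here are placeholders, not a real framework:

```typescript
// The agent proposes; the policy layer decides; the executor acts.
// Enforcement never consults the agent's reasoning, only the proposed transaction.
interface ProposedTransaction {
  amount: number;
  recipient: string;
  rationale: string; // the agent's explanation: logged, but never trusted
}

async function executeWithPolicy(
  agent: { propose(): Promise<ProposedTransaction> },
  policy: Policy,                      // same Policy type as the earlier sketch
  state: { dailySpent: number },
  executor: { send(tx: ProposedTransaction): Promise<void> },
): Promise<void> {
  const tx = await agent.propose();    // probabilistic step

  // Deterministic step: the same check as before, outside the agent's control.
  const result = checkTransaction(
    { amount: tx.amount, recipient: tx.recipient, dailySpent: state.dailySpent },
    policy,
  );

  if (!result.allowed) {
    console.warn(`Transaction rejected: ${result.reason}`);
    return; // the agent cannot reach the executor through this branch
  }

  state.dailySpent += tx.amount;
  await executor.send(tx);             // only reached when the policy passes
}
```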
The Human Parallel
Here's a thought experiment: even trusted humans have spending limits.
A senior employee at a company might be brilliant, reliable, and have 20 years of tenure. They still can't wire $1M without approval. Not because they're untrustworthy, but because spending controls aren't about trust.
They're about:
- Risk management: Limiting blast radius of any single decision
- Compliance: Demonstrating controls to auditors and regulators
- Process: Creating checkpoints for high-stakes actions
- Recovery: Ensuring mistakes can be caught before they're irreversible
AI agents should have the same constraints. Not because they're bad at their jobs, but because that's how you manage financial risk in any system.
The Compliance Reality
For enterprise deployments, there's a practical consideration: regulators don't accept "the AI is really good now" as a control.
SOC 2, PCI-DSS, and financial regulations require demonstrable, auditable controls. You need to show:
- What limits exist
- How they're enforced
- That they cannot be bypassed
- An audit trail of decisions
A policy engine provides all of this. An AI agent's internal reasoning—no matter how sophisticated—doesn't satisfy these requirements.
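One way a policy engine provides that is by writing an append-only record for every decision it makes. A rough sketch of what such a record might contain (the shape is hypothetical, not tied to any specific standard):

```typescript
// A hypothetical append-only decision record: every allow/deny outcome is
// written down along with the policy version that produced it.
interface PolicyDecisionRecord {
  timestamp: string;       // ISO 8601
  requestId: string;
  policyVersion: string;   // which set of rules was in force
  amount: number;
  recipient: string;
  allowed: boolean;
  reason?: string;         // populated when the request was rejected
}

const auditLog: PolicyDecisionRecord[] = [];

function recordDecision(entry: Omit<PolicyDecisionRecord, "timestamp">): void {
  // Append-only: records are added, never edited or removed.
  auditLog.push({ timestamp: new Date().toISOString(), ...entry });
}

// Example: log a rejection produced by the policy layer.
recordDecision({
  requestId: "tx-0042",
  policyVersion: "policy-v7",
  amount: 1_000_000,
  recipient: "0xabc...",
  allowed: false,
  reason: "per-transaction cap exceeded",
});
```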
When AI Improves, Policy Layers Get Better Too
There's an implicit assumption in "won't AI make this obsolete?" that policy layers are static while AI advances.
In reality, as agents become more capable, policies become more sophisticated:
Today's policies:
- Daily spending limits
- Per-transaction caps
- Recipient whitelists
Future policies (as agents take on more complex tasks):
- Cross-agent coordination limits
- Portfolio allocation constraints
- Velocity detection across multiple assets
- Conditional approvals based on market conditions
The policy layer evolves alongside the agents it protects. Better AI means agents can do more—which means more sophisticated guardrails are needed, not fewer.
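As one example, a velocity rule is still a deterministic check, just evaluated over a window of recent activity instead of a single transaction. A rough sketch, with made-up window and cap parameters:

```typescript
interface LedgerEntry {
  amount: number;
  timestamp: number; // epoch milliseconds
}

type VelocityResult = { allowed: true } | { allowed: false; reason: string };

// Reject if total spend in the trailing window, plus the proposed amount,
// would exceed the window cap. Same inputs, same answer, every time.
function checkVelocity(
  history: LedgerEntry[],
  proposedAmount: number,
  windowMs: number,   // e.g. 60 * 60 * 1000 for a one-hour window
  windowCap: number,
  now: number = Date.now(),
): VelocityResult {
  const cutoff = now - windowMs;
  const recentSpend = history
    .filter((entry) => entry.timestamp >= cutoff)
    .reduce((sum, entry) => sum + entry.amount, 0);

  if (recentSpend + proposedAmount > windowCap) {
    return { allowed: false, reason: "velocity cap exceeded for window" };
  }
  return { allowed: true };
}
```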
The Bottom Line
The question isn't "will AI get good enough?" It's "good enough for what?"
For making decisions? AI is already good and getting better.
For eliminating the need for independent safety controls? Never. That's not how safety engineering works.
Probabilistic systems require deterministic guardrails. This is true whether the system is 90% reliable or 99.99% reliable. The guardrails aren't a commentary on the AI's capability—they're a recognition that financial systems require mathematical certainty, not statistical confidence.
Your AI agent can be brilliant. It should still have spending limits.
Ready to add deterministic guardrails to your AI agents?
- Quick Start Guide - Get running in 5 minutes
- GitHub - Open source SDK