"Won't AI just get better at this?"
As language models improve, won't they eventually become reliable enough to handle money without external guardrails?
The short answer: No. And understanding why reveals something fundamental about how we should think about AI safety.
The Improvement Trajectory
Let's steel-man the argument. AI capabilities are improving rapidly:
- Better alignment: Constitutional AI, RLHF, and new training techniques make models more reliable at following instructions
- Longer context: Models can now hold millions of tokens, reducing "forgotten" instructions
- Formal reasoning: Chain-of-thought and tool use make agents more predictable
- Agent frameworks: LangChain, CrewAI, and others add structure around LLM decision-making
Given this trajectory, why wouldn't AI agents eventually become trustworthy enough to handle financial transactions without external policy enforcement?
The Fundamental Distinction: Probabilistic vs Deterministic
Here's the core issue: LLMs are probabilistic by design. They predict the next token based on statistical patterns. Even a 99.99% reliable model fails 1 in 10,000 times.
For most applications, 99.99% is excellent. For financial transactions, it's not good enough.
Consider a trading agent making 1,000 transactions per day. At 99.99% reliability:
- Per day: 0.1 expected failures
- Per month: 3 expected failures
- Per year: 36 expected failures
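The arithmetic, spelled out as a quick sketch (assuming each transaction fails independently at a fixed rate):

```typescript
// Back-of-the-envelope expected failures, assuming each transaction
// fails independently at a fixed per-transaction rate.
const failureRate = 1 - 0.9999;                  // 0.01% per transaction
const transactionsPerDay = 1_000;

const perDay = failureRate * transactionsPerDay; // ≈ 0.1
const perMonth = perDay * 30;                    // ≈ 3
const perYear = perMonth * 12;                   // ≈ 36

console.log({ perDay, perMonth, perYear });
```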
Now imagine one of those failures is "send entire balance to wrong address" instead of "minor formatting error." The expected value of that tail risk is catastrophic.
A deterministic policy, such as `if (amount > dailyLimit) reject()`, has a 0% failure rate. Not 99.99%. Zero. The transaction either passes or it doesn't. There's no statistical distribution of outcomes.
This isn't about AI being bad. It's about the mathematical difference between probabilistic and deterministic systems.
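To make that concrete, here's a minimal sketch of a deterministic policy check in TypeScript. The field names and the `PolicyResult` shape are illustrative, not any particular SDK's API:

```typescript
interface TransactionRequest {
  amount: number;      // in the account's base currency
  recipient: string;
  dailySpent: number;  // amount already spent today
}

interface Policy {
  dailyLimit: number;
  perTransactionCap: number;
  allowedRecipients: Set<string>;
}

type PolicyResult = { allowed: true } | { allowed: false; reason: string };

// Deterministic: the same request against the same policy always produces
// the same result. No model, no sampling, no prompt.
function checkTransaction(req: TransactionRequest, policy: Policy): PolicyResult {
  if (req.amount > policy.perTransactionCap) {
    return { allowed: false, reason: "per-transaction cap exceeded" };
  }
  if (req.dailySpent + req.amount > policy.dailyLimit) {
    return { allowed: false, reason: "daily limit exceeded" };
  }
  if (!policy.allowedRecipients.has(req.recipient)) {
    return { allowed: false, reason: "recipient not on whitelist" };
  }
  return { allowed: true };
}
```

Given the same request and the same policy, this function returns the same answer every time. That's the whole point.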
The Seatbelt Analogy
Cars have gotten dramatically safer over the decades:
- Crumple zones
- Anti-lock brakes
- Electronic stability control
- Autonomous emergency braking
- Lane departure warnings
And yet we still have seatbelts. We still have airbags. We still have speed limits.
Why? Because safety systems are layered. Each layer handles different failure modes. The fact that cars rarely crash doesn't mean we remove the protection for when they do.
The same principle applies to AI agents:
| Layer | Purpose | Type |
|---|---|---|
| Training/RLHF | Make the model generally safe | Probabilistic |
| System prompts | Guide behaviour for this use case | Probabilistic |
| Agent framework | Add structure and validation | Mixed |
| Policy layer | Hard limits that cannot be exceeded | Deterministic |
Improving Layer 1 doesn't eliminate the need for Layer 4. They serve different purposes.
Separation of Concerns
There's an architectural principle at play here: the system making decisions shouldn't also control the guardrails.
If your AI agent both decides what to spend AND enforces spending limits, those limits exist only as long as the agent respects them. A sufficiently convincing prompt injection, a hallucination, or an edge case in the training data could bypass them.
External policy enforcement creates a separation:
```
┌─────────────────┐     ┌─────────────────┐     ┌─────────────┐
│    AI Agent     │────▶│  Policy Layer   │────▶│  Execution  │
│    (decides)    │     │   (enforces)    │     │   (acts)    │
└─────────────────┘     └─────────────────┘     └─────────────┘
         │                       │
   Probabilistic           Deterministic
 Can be influenced     Cannot be influenced
     by inputs             by the agent
```
The policy layer doesn't care how good the AI is. It doesn't care if the AI was jailbroken or made a mistake or had a brilliant insight. It just checks: does this transaction comply with the rules? Yes or no.
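In code, that separation might look something like the sketch below, reusing `checkTransaction` from the earlier example. The agent and executor interfaces here are placeholders, not a real framework:

```typescript
// The agent proposes; the policy layer decides; the executor acts.
// Enforcement never consults the agent's reasoning, only the proposed transaction.
interface ProposedTransaction {
  amount: number;
  recipient: string;
  rationale: string; // the agent's explanation: logged, but never trusted
}

async function executeWithPolicy(
  agent: { propose(): Promise<ProposedTransaction> },
  policy: Policy,                      // same Policy type as the earlier sketch
  state: { dailySpent: number },
  executor: { send(tx: ProposedTransaction): Promise<void> },
): Promise<void> {
  const tx = await agent.propose();    // probabilistic step

  // Deterministic step: the same check as before, outside the agent's control.
  const result = checkTransaction(
    { amount: tx.amount, recipient: tx.recipient, dailySpent: state.dailySpent },
    policy,
  );

  if (!result.allowed) {
    console.warn(`Transaction rejected: ${result.reason}`);
    return; // the agent cannot reach the executor through this branch
  }

  state.dailySpent += tx.amount;
  await executor.send(tx);             // only reached when the policy passes
}
```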
The Human Parallel
Here's a thought experiment: even trusted humans have spending limits.
A senior employee at a company might be brilliant, reliable, and have 20 years of tenure. They still can't wire $1M without approval. Not because they're untrustworthy, but because spending controls aren't about trust.
They're about:
- Risk management: Limiting blast radius of any single decision
- Compliance: Demonstrating controls to auditors and regulators
- Process: Creating checkpoints for high-stakes actions
- Recovery: Ensuring mistakes can be caught before they're irreversible
AI agents should have the same constraints. Not because they're bad at their jobs, but because that's how you manage financial risk in any system.
The Compliance Reality
For enterprise deployments, there's a practical consideration: regulators don't accept "the AI is really good now" as a control.
SOC 2, PCI-DSS, and financial regulations require demonstrable, auditable controls. You need to show:
- What limits exist
- How they're enforced
- That they cannot be bypassed
- An audit trail of decisions
A policy engine provides all of this. An AI agent's internal reasoning—no matter how sophisticated—doesn't satisfy these requirements.
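One way a policy engine provides that is by writing an append-only record for every decision it makes. A rough sketch of what such a record might contain (the shape is hypothetical, not tied to any specific standard):

```typescript
// A hypothetical append-only decision record: every allow/deny outcome is
// written down along with the policy version that produced it.
interface PolicyDecisionRecord {
  timestamp: string;       // ISO 8601
  requestId: string;
  policyVersion: string;   // which set of rules was in force
  amount: number;
  recipient: string;
  allowed: boolean;
  reason?: string;         // populated when the request was rejected
}

const auditLog: PolicyDecisionRecord[] = [];

function recordDecision(entry: Omit<PolicyDecisionRecord, "timestamp">): void {
  // Append-only: records are added, never edited or removed.
  auditLog.push({ timestamp: new Date().toISOString(), ...entry });
}

// Example: log a rejection produced by the policy layer.
recordDecision({
  requestId: "tx-0042",
  policyVersion: "policy-v7",
  amount: 1_000_000,
  recipient: "0xabc...",
  allowed: false,
  reason: "per-transaction cap exceeded",
});
```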
When AI Improves, Policy Layers Get Better Too
There's an implicit assumption in "won't AI make this obsolete?" that policy layers are static while AI advances.
In reality, as agents become more capable, policies become more sophisticated:
Today's policies:
- Daily spending limits
- Per-transaction caps
- Recipient whitelists
Future policies (as agents take on more complex tasks):
- Cross-agent coordination limits
- Portfolio allocation constraints
- Velocity detection across multiple assets
- Conditional approvals based on market conditions
The policy layer evolves alongside the agents it protects. Better AI means agents can do more—which means more sophisticated guardrails are needed, not fewer.
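As one example, a velocity rule is still a deterministic check, just evaluated over a window of recent activity instead of a single transaction. A rough sketch, with made-up window and cap parameters:

```typescript
interface LedgerEntry {
  amount: number;
  timestamp: number; // epoch milliseconds
}

type VelocityResult = { allowed: true } | { allowed: false; reason: string };

// Reject if total spend in the trailing window, plus the proposed amount,
// would exceed the window cap. Same inputs, same answer, every time.
function checkVelocity(
  history: LedgerEntry[],
  proposedAmount: number,
  windowMs: number,   // e.g. 60 * 60 * 1000 for a one-hour window
  windowCap: number,
  now: number = Date.now(),
): VelocityResult {
  const cutoff = now - windowMs;
  const recentSpend = history
    .filter((entry) => entry.timestamp >= cutoff)
    .reduce((sum, entry) => sum + entry.amount, 0);

  if (recentSpend + proposedAmount > windowCap) {
    return { allowed: false, reason: "velocity cap exceeded for window" };
  }
  return { allowed: true };
}
```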
The Bottom Line
The question isn't "will AI get good enough?" It's "good enough for what?"
For making decisions? AI is already good and getting better.
For eliminating the need for independent safety controls? Never. That's not how safety engineering works.
Probabilistic systems require deterministic guardrails. This is true whether the system is 90% reliable or 99.99% reliable. The guardrails aren't a commentary on the AI's capability—they're a recognition that financial systems require mathematical certainty, not statistical confidence.
Your AI agent can be brilliant. It should still have spending limits.
Ready to add deterministic guardrails to your AI agents?
- Quick Start Guide - Get running in 5 minutes
- GitHub - Open source SDK