DEV Community

Paulo Victor Gomes


Why Are LLM Predictions Dangerous for Finance?

That's right: predictions are not the same as calculations, and LLMs don't calculate, they predict the next word.

Stanford's HELM report shows GPT-4 hits 90%+ accuracy on math tasks. But 90% accuracy in financial calculations means 1 in 10 transactions could be wrong. Would you ship that to production?
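To make that arithmetic concrete, here is a minimal sketch. It assumes every calculation fails independently with the same probability, which is an idealization of mine and not something the HELM report states. The function name `prob_at_least_one_error` is hypothetical.

```rust
/// Probability that at least one of `n` calculations is wrong,
/// given a per-calculation accuracy.
/// Assumption: errors are independent and equally likely on every calculation.
fn prob_at_least_one_error(accuracy: f64, n: u32) -> f64 {
    1.0 - accuracy.powi(n as i32)
}

fn main() {
    for n in [1, 10, 50] {
        let p = prob_at_least_one_error(0.90, n);
        println!("{n:>3} calculations -> {:.1}% chance of at least one error", p * 100.0);
    }
}
```

Under that assumption, a batch of just 10 calculations already has roughly a 65% chance of containing at least one wrong result.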

Why FinTech Can't Ignore This

If you're building payment systems, trading algorithms, or accounting software, AI math errors aren't just bugs—they're customer and compliance nightmares:

Regulatory scrutiny: Wrong calculations in financial reports = SEC problems
Customer trust: One miscalculated portfolio balance destroys credibility forever. Who will trust a financial company that can't calculate correctly?
Audit trails: How do you explain "the AI got confused" to auditors?
Compound errors: Small mistakes in interest calculations become massive losses. If you know the power of compound interest, you know what I mean.

I tested the most popular AI assistants and vibe-coding tools on the market: VS Code with GitHub Copilot, Cursor, Claude Code, Replit, you name it. All of them do a really good job at kick-off. They also work very well for classic CRUD features and for some next-level coding, like improving the scalability of parts of the code. But when you let them loose, doing calculations and validating the results by themselves, the problems start to get serious.

Real Code, Real Problems

Let's see how AI math failures could break financial systems:

Problem 1: Interest Calculation Gone Wrong (Clojure by Claude Opus 4.1)

(defn compound-interest [principal rate time]
  ;; AI confidently writes this...
  (* principal (Math/pow (+ 1 rate) time)))

;; Looks right, but what about edge cases?
(compound-interest 100000 0.05 10) ;; => 162889.46

;; But AI might miss precision issues and unvalidated inputs:
(compound-interest 100000.01 0.05 10) ;; Doubles cannot represent 100000.01 exactly
(compound-interest 100000 -0.05 10)   ;; Negative rate: accepted silently, never validated
(compound-interest 100000 0.05 0)     ;; Edge case: time = 0

The Problem: AI generates plausible code but misses edge cases that financial systems must handle. Negative interest rates, floating-point precision, and boundary conditions aren't just theoretical—they're regulatory requirements.
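As a contrast, here is a minimal Rust sketch of how those edge cases could be made explicit. It is not the code any of the tools produced, and every name in it (`compound_interest_cents`, `InterestError`, the basis-point rate) is my own assumption. It keeps money in integer cents and rounds half up after every period; real products define their own rounding rules, so treat that policy as a placeholder.

```rust
/// Failures the caller must handle explicitly instead of receiving a silent wrong number.
#[derive(Debug, PartialEq)]
enum InterestError {
    NegativePrincipal,
    RateBelowMinus100Percent,
    Overflow,
}

/// Compound `principal_cents` over `periods` at `rate_bps` basis points per period
/// (500 bps = 5%). Uses integer arithmetic only.
/// Assumption: the balance is rounded half up to the cent after every period.
fn compound_interest_cents(
    principal_cents: i128,
    rate_bps: i64,
    periods: u32,
) -> Result<i128, InterestError> {
    if principal_cents < 0 {
        return Err(InterestError::NegativePrincipal);
    }
    // Negative rates are allowed, but a rate below -100% would make the balance negative.
    if rate_bps < -10_000 {
        return Err(InterestError::RateBelowMinus100Percent);
    }
    let factor = 10_000 + i128::from(rate_bps); // always >= 0 after the check above
    let mut balance = principal_cents;
    for _ in 0..periods {
        // `balance * factor` is non-negative, so adding half the divisor rounds half up.
        let scaled = balance
            .checked_mul(factor)
            .and_then(|s| s.checked_add(5_000))
            .ok_or(InterestError::Overflow)?;
        balance = scaled / 10_000;
    }
    Ok(balance)
}

fn main() {
    // 100000.01 is stored exactly as 10_000_001 cents.
    println!("{:?}", compound_interest_cents(10_000_001, 500, 10));
    // A negative rate is a valid input, not a crash.
    println!("{:?}", compound_interest_cents(10_000_000, -500, 1)); // Ok(9500000)
    // Zero periods returns the principal unchanged.
    println!("{:?}", compound_interest_cents(10_000_000, 500, 0)); // Ok(10000000)
}
```

Rounding every period can differ by a cent from rounding once at the end, which is exactly the kind of decision that has to be written down and reviewed rather than left to a code generator.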

And this only scratches the surface. We haven't even combined problems yet: take a customer with a late credit card bill who pays the minimum and then buys something new at a different interest rate, all of which has to account for the country's taxes. This is a perfect breeding ground for AI hallucinations.

Problem 2: Portfolio Rebalancing Logic (Rust by DeepSeek-R1)

// AI-suggested portfolio rebalancing
fn rebalance_portfolio(assets: &mut Vec<Asset>, target_weights: &[f64]) {
    let total_value: f64 = assets.iter().map(|a| a.value).sum();

    // AI logic seems reasonable...
    for (i, asset) in assets.iter_mut().enumerate() {
        let target_value = total_value * target_weights[i];
        asset.value = target_value; // Wait... this isn't how rebalancing works!
    }
}

struct Asset {
    symbol: String,
    value: f64,
    shares: f64,
    price: f64,
}

The Problem: AI confidently generates code that looks mathematically sound but completely misunderstands the business logic. Real rebalancing involves buying/selling shares at market prices, not magically changing asset values.

The AI pattern-matched "rebalancing" with "adjusting proportions" but missed the crucial financial mechanics.
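For comparison, here is a minimal sketch of what a rebalancing step could compute instead: a list of trades, derived from current shares and market prices. The names `plan_rebalance` and `Trade` are hypothetical, and the sketch deliberately ignores fees, taxes, lot sizes, and order execution. It also keeps `f64` to mirror the `Asset` struct above, which a production system would likely replace with a decimal or integer type.

```rust
#[derive(Debug)]
struct Asset {
    symbol: String,
    shares: f64,
    price: f64,
}

/// A proposed order: positive `shares_delta` means buy, negative means sell.
#[derive(Debug, PartialEq)]
struct Trade {
    symbol: String,
    shares_delta: f64,
}

/// Plan trades that move each asset to its target weight at current prices.
/// Nothing is mutated: the caller decides whether to execute the plan.
fn plan_rebalance(assets: &[Asset], target_weights: &[f64]) -> Result<Vec<Trade>, String> {
    if assets.len() != target_weights.len() {
        return Err("one target weight is required per asset".to_string());
    }
    let weight_sum: f64 = target_weights.iter().sum();
    if (weight_sum - 1.0).abs() > 1e-9 {
        return Err(format!("target weights sum to {weight_sum}, expected 1.0"));
    }
    if assets.iter().any(|a| a.price <= 0.0 || !a.price.is_finite()) {
        return Err("every asset needs a finite, positive price".to_string());
    }

    // Portfolio value comes from shares and prices, not from a stored `value` field.
    let total_value: f64 = assets.iter().map(|a| a.shares * a.price).sum();

    Ok(assets
        .iter()
        .zip(target_weights)
        .map(|(asset, &weight)| {
            let target_shares = total_value * weight / asset.price;
            Trade {
                symbol: asset.symbol.clone(),
                shares_delta: target_shares - asset.shares,
            }
        })
        .collect())
}

fn main() {
    let assets = vec![
        Asset { symbol: "AAA".to_string(), shares: 10.0, price: 100.0 },
        Asset { symbol: "BBB".to_string(), shares: 10.0, price: 50.0 },
    ];
    match plan_rebalance(&assets, &[0.5, 0.5]) {
        Ok(trades) => println!("{trades:?}"),
        Err(e) => eprintln!("rejected: {e}"),
    }
}
```

With a 1,500 portfolio split 50/50, the plan sells 2.5 shares of the first asset and buys 5 of the second. The total value is unchanged, which is the property the AI-suggested version silently breaks.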

What's Missing in AI Math

Real mathematical reasoning needs:

Symbolic reasoning: Apply rules systematically, not guess patterns
Persistent memory: Remember that principal = 100000 throughout the calculation
Domain knowledge: Understand that negative interest rates are possible in modern finance, along with every other applicable caveat and law
Formal verification: Prove that functions handle all valid inputs correctly
Meta-cognition: Know when it's uncertain about financial regulations
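Full formal verification is beyond a blog post, but a bounded exhaustive check gives a taste of the idea: instead of spot-checking a few inputs, try every input in a small range and assert the invariants that must always hold. The sketch below is only an illustration under my own assumptions. `split_cents` and `check_split_invariants` are hypothetical names, and checking a range is not a proof.

```rust
/// Split `total_cents` into `parts` installments that differ by at most one cent.
fn split_cents(total_cents: u64, parts: u64) -> Vec<u64> {
    assert!(parts > 0, "parts must be positive");
    let base = total_cents / parts;
    let remainder = total_cents % parts;
    // The first `remainder` installments carry one extra cent.
    (0..parts)
        .map(|i| if i < remainder { base + 1 } else { base })
        .collect()
}

/// Bounded exhaustive check: every (total, parts) pair in the range is tried.
/// Returns false as soon as any invariant is violated.
fn check_split_invariants(max_total: u64, max_parts: u64) -> bool {
    for total in 0..=max_total {
        for parts in 1..=max_parts {
            let installments = split_cents(total, parts);
            let sum: u64 = installments.iter().sum();
            let max = *installments.iter().max().unwrap();
            let min = *installments.iter().min().unwrap();
            // Invariants: no cent is lost or created, and the split is as even as possible.
            if sum != total || max - min > 1 || installments.len() as u64 != parts {
                return false;
            }
        }
    }
    true
}

fn main() {
    println!("{:?}", split_cents(100, 3)); // [34, 33, 33]
    println!("invariants hold: {}", check_split_invariants(2_000, 12));
}
```

The point is the habit, not the function: state what must always be true, then let the machine check it across the whole input space you care about.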

OK, so what now?
Should we avoid AI in all financial systems? Of course not. I'll never be the person saying AI is useless, nor the person saying AI is already replacing everything in sight. Like most things in life, the AI revolution lives somewhere in the middle: neither black nor white. We humans like polarization, but the world doesn't work like that. So my suggestion is to make full use of AI in the financial world while staying alert and cautious. Create your own guidelines; you can start from mine:

AI usage guidelines for Finance:

  • Double-check all AI-generated financial calculations
  • Use math-optimized models (GPT-4, Claude Opus) but never trust them blindly
  • Implement comprehensive validation layers
  • Never let AI directly touch critical financial paths without supervision
  • Maintain audit trails for all AI-assisted calculations
  • Identify errors fast, roll back, and learn from them
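As a rough illustration of the validation-layer and audit-trail guidelines, here is a minimal sketch, assuming amounts are held in integer cents and that you have an independently reviewed reference calculation to compare against. `AuditLog`, `AuditRecord`, `Verdict`, and `cross_check` are hypothetical names, not part of any library.

```rust
#[derive(Debug, Clone, PartialEq)]
enum Verdict {
    Accepted,
    Rejected { diff_cents: i128 },
}

#[derive(Debug, Clone, PartialEq)]
struct AuditRecord {
    operation: String,
    candidate_cents: i128,
    reference_cents: i128,
    verdict: Verdict,
}

/// In-memory audit trail; a real system would persist these records.
#[derive(Default)]
struct AuditLog {
    records: Vec<AuditRecord>,
}

impl AuditLog {
    /// Compare an AI-assisted result against an independently computed reference.
    /// Returns `Some(value)` only when the two agree within `tolerance_cents`.
    /// Every comparison is recorded, whether it passes or not.
    fn cross_check(
        &mut self,
        operation: &str,
        candidate_cents: i128,
        reference_cents: i128,
        tolerance_cents: i128,
    ) -> Option<i128> {
        let diff = (candidate_cents - reference_cents).abs();
        let verdict = if diff <= tolerance_cents {
            Verdict::Accepted
        } else {
            Verdict::Rejected { diff_cents: diff }
        };
        let accepted = verdict == Verdict::Accepted;
        self.records.push(AuditRecord {
            operation: operation.to_string(),
            candidate_cents,
            reference_cents,
            verdict,
        });
        if accepted { Some(candidate_cents) } else { None }
    }
}

fn main() {
    let mut log = AuditLog::default();
    // Hypothetical values: the candidate came from AI-assisted code,
    // the reference from a reviewed implementation.
    let result = log.cross_check("compound_interest", 16_288_946, 16_288_947, 0);
    println!("usable result: {result:?}");
    println!("audit trail: {:?}", log.records);
}
```

A one-cent mismatch is rejected here because the tolerance is zero. Whether any tolerance is acceptable at all is a business and compliance decision, not a coding one.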

The Future

Mathematics is the foundation of financial systems. An AI that truly understands math—not just mimics it—could revolutionize risk management, fraud detection, and algorithmic trading.

But we're not there yet. Current LLMs can bluff convincingly, but financial markets don't accept "close enough."

The future belongs to AI that knows the difference between prediction and calculation. Until then, trust but verify—especially with other people's money.

References

The Limitations of Language Models - Academic deep dive

Stanford HELM Report - Comprehensive AI model evaluations
SEC Guidance on AI in Finance - Regulatory perspective on systemic risk in capital markets
The Illusion of Thinking (Apple) - Understanding the Strengths and Limitations of Reasoning Models via the Lens of Problem Complexity
Formal Verification in Financial Systems - Why math matters in finance
Mathematical Reasoning in AI - Current capabilities and limitations
