DEV Community

quantoracledev
quantoracledev

Posted on

How to Give Your LangChain Agent Reliable Quant Finance Math (in 10 minutes)

TL;DR

Large language models are great at reasoning about finance and noticeably unreliable at doing finance math. Ask an LLM to price an option and the price it returns can drift across runs. Ask for the Greeks and the higher-order ones (vanna, charm, speed) frequently come back wrong or inconsistent.

This is a known failure mode — and it's not specific to any particular model. The fix is standard engineering: call a dedicated calculator. This post walks through how to give any LangChain agent access to 73 deterministic quantitative finance endpoints (options pricing, Greeks, risk metrics, portfolio optimization, Monte Carlo, backtests, etc.) via one line of code. First 1,000 calls/day are free — no signup.

The problem in 30 seconds

# What you hope happens
response = llm.invoke("Price a European call: spot=100, strike=105, 6 months, 20% vol, 5% rate")
# "$4.58" ✓
Enter fullscreen mode Exit fullscreen mode
# What actually happens in production
# - Price may land close, but drifts run-to-run
# - Delta, gamma, vega often reasonable; vanna, charm, speed, color frequently wrong
# - Numerics that depend on chained reasoning (IV solver, barrier options, path-dependent) degrade further
Enter fullscreen mode Exit fullscreen mode

The math is deterministic. The model isn't. For anything agent-driven — backtests, risk management, paper trading, analysis pipelines — you need same-input-same-output calculations.

The fix: QuantOracle

QuantOracle is a REST API with 63 pure quant calculators plus 10 "composite" workflows (strategy backtests, portfolio rebalance plans, options strategy optimizers, hedging recommendations, full risk tearsheets). All citation-verified against Hull, Wilmott, Bailey & Lopez de Prado.

  • 1,000 free calls/IP/day, no API key
  • Paid tier uses x402 micropayments in USDC on Base or Solana ($0.002–$0.10/call)
  • Deterministic: same inputs always produce the same outputs
  • MCP server, LangChain toolkit, OpenAI GPT, and plain REST all supported

Hook it into LangChain in one line

pip install langchain-quantoracle langchain-openai langchain
Enter fullscreen mode Exit fullscreen mode
from langchain_quantoracle import QuantOracleToolkit
from langchain_openai import ChatOpenAI
from langchain.agents import AgentExecutor, create_tool_calling_agent
from langchain_core.prompts import ChatPromptTemplate

# Load every QuantOracle tool — all 73 endpoints become LangChain tools
tools = QuantOracleToolkit().get_tools()

llm = ChatOpenAI(model="gpt-4o")
prompt = ChatPromptTemplate.from_messages([
    ("system", "You are a quant analyst. Use QuantOracle tools for all financial math — never compute in-context."),
    ("human", "{input}"),
    ("placeholder", "{agent_scratchpad}"),
])

agent = create_tool_calling_agent(llm, tools, prompt)
executor = AgentExecutor(agent=agent, tools=tools, verbose=True)
Enter fullscreen mode Exit fullscreen mode

That's it. Your agent now has Black-Scholes, 22 portfolio risk metrics, Kelly sizing, 13 technical indicators, Monte Carlo, strategy backtests, and 60+ others.

Example 1: Price an option with Greeks

result = executor.invoke({
    "input": "Price a European call with spot=100, strike=105, 6 months to expiry, "
             "20% annualized vol, 5% risk-free. I want the price, delta, gamma, vega, and theta."
})
Enter fullscreen mode Exit fullscreen mode

The agent picks the right tool (options_price), calls it, and returns:

Price: $4.58
Greeks:
  Delta: 0.4612
  Gamma: 0.0281
  Theta: -0.0211 (daily)
  Vega:  0.2808
Enter fullscreen mode Exit fullscreen mode

These are the exact Black-Scholes values. Reproducible across runs.

Example 2: Full risk analysis from a returns series

result = executor.invoke({
    "input": "Here are daily returns: [0.01, -0.02, 0.03, 0.005, -0.01, 0.02, -0.015, 0.025, "
             "0.01, -0.005, 0.015]. Give me a complete risk breakdown."
})
Enter fullscreen mode Exit fullscreen mode

The agent calls the risk_full-analysis composite (one API call that replaces 7 individual ones) and returns:

Risk Tearsheet (11 periods):
  Sharpe: 2.83
  Sortino: 4.59
  VaR (95%): -0.03
  Max Drawdown: -0.03
  Kelly leverage: 10.65x
  Hurst: 0.50 (neutral — random walk)
  CAGR: 122.98%
Enter fullscreen mode Exit fullscreen mode

Same inputs always produce the same output. No drift, no hallucinations, no flaky Sharpe calculations.

Example 3: Backtest a strategy

result = executor.invoke({
    "input": "Backtest a 20/50 SMA crossover on this price series: "
             "[100, 101, 102, ...]. Initial capital $10000, 5 bps commission."
})
Enter fullscreen mode Exit fullscreen mode

The agent calls backtest_strategy (a composite endpoint that replaces ~10 individual calls) and gets back: Sharpe ratio, Calmar, max drawdown, win rate, list of trades, equity curve, and a buy-and-hold benchmark comparison.

When to use composites vs individual calculators vs batch

The toolkit exposes three tiers of tools, each for a different situation:

  • Individual calculators (options_price, risk_portfolio, stats_hurst-exponent, ...) — fine-grained control, one concept per call. Free tier.
  • Composite workflows (backtest_strategy, portfolio_rebalance-plan, options_strategy-optimizer, hedging_recommend, risk_full-analysis, ...) — bundle 5–15 calculator calls into one round trip with a purpose-built output. Paid-only ($0.015–$0.10 each), but dramatically cheaper and faster than hand-chaining the pieces.
  • Batch endpoint (POST /v1/batch) — run up to 100 arbitrary calculator calls in a single HTTP request. Ideal for parameter sweeps, walk-forward backtests, or any workload where latency dominates cost. Price is the sum of the individual prices — no markup. First batch call per IP is free; subsequent batches are paid via x402.

Rule of thumb:

  • One calculation → individual calculator
  • Named workflow (risk analysis, backtest, hedge selection) → composite
  • Many small calculations at once → batch

A single backtest run that would be 200 HTTP calls one at a time becomes 2 batch calls. If your agent iterates, batch usually wins on both latency and cost.

Filter by category to keep tool lists small

A common LangChain pitfall: 73 tools in the prompt confuses smaller models. Filter by category:

# Options-only agent
tools = QuantOracleToolkit(categories=["options", "derivatives"]).get_tools()

# Risk/portfolio-only agent
tools = QuantOracleToolkit(categories=["risk", "portfolio", "stats"]).get_tools()

# Crypto-focused agent
tools = QuantOracleToolkit(categories=["crypto", "simulate"]).get_tools()
Enter fullscreen mode Exit fullscreen mode

Available categories: options, derivatives, risk, indicators, simulate, portfolio, fixed-income, fi, stats, crypto, fx, macro, tvm, trade, pairs, backtest, hedging.

Past the free tier

After 1,000 calls/day (per IP), the API returns HTTP 402 with an x402 payment requirements header. If you're using an x402-capable HTTP client (e.g. AgentCash, Coinbase AgentKit), payments are automatic — USDC on Base or Solana, $0.002–$0.10 per call. Otherwise the toolkit raises an exception and you can add a payment layer yourself.

Why this matters for agentic systems

When an agent makes 50 tool calls during a backtest, every calculation has to be right. An LLM that's 85% accurate on Black-Scholes doesn't produce a backtest — it produces noise. Moving all math to a deterministic calculator means:

  • Reproducible results (your next run produces the same Sharpe)
  • Cacheable (you can memoize by input hash)
  • Auditable (you can replay any step)
  • Fast (sub-millisecond per calculation on the server)
  • Cheap (orders of magnitude less than equivalent LLM tokens)

This pattern — LLM for reasoning + deterministic APIs for compute — is the one thing that actually works for production agent systems. Pick it up now and you don't have to rebuild once your agent starts taking real actions.

Links

Free tier is generous, no signup required, MIT licensed. If you're building an agent that touches financial math — options pricing, portfolio analytics, risk, backtests — try it before rolling your own.

Top comments (0)