Last week I watched GPT-4 spend 2,000 tokens, 3 seconds, and $0.04 to pick the wrong A/B test variant. Then I replaced it with a single API call that computed the answer in 0.01ms, cost $0.01, and returned the mathematically correct one.
This isn't a hot take. It's arithmetic.
## The Prompt That Costs $0.04 and Gets It Wrong
Here's what most agent builders do when they need to select the best variant from an A/B test:
```
System: You are a data-driven optimizer. Analyze the following A/B test
results and select the variant to show next.

User: I have three email subject lines being tested:
- Variant A: 500 sends, 175 opens (35% rate)
- Variant B: 300 sends, 126 opens (42% rate)
- Variant C: 12 sends, 8 opens (66.7% rate)

Which variant should I send to the next batch?
```
GPT-4 picks Variant B as the "balanced choice." Wrong. This is a multi-armed bandit problem. UCB1 selects Variant C because it's under-explored: with only 12 pulls, its exploration bonus lifts its score above the better-sampled variants, even though B's 42% rate looks safer.
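To see why C wins, here's a minimal UCB1 sketch using the textbook formula (mean reward plus √(2 ln N / n)). OraClaw's exact scoring constants may differ, so the absolute numbers won't match its output, but the ranking is the same:

```typescript
// UCB1: score = exploitation (observed mean) + exploration bonus.
interface Arm { id: string; pulls: number; totalReward: number; }

function ucb1(arms: Arm[]): { id: string; score: number }[] {
  const totalPulls = arms.reduce((sum, a) => sum + a.pulls, 0);
  return arms
    .map((a) => ({
      id: a.id,
      // mean reward + sqrt(2 ln N / n): the bonus grows as an arm is neglected
      score: a.totalReward / a.pulls +
             Math.sqrt((2 * Math.log(totalPulls)) / a.pulls),
    }))
    .sort((x, y) => y.score - x.score);
}

const ranked = ucb1([
  { id: "A", pulls: 500, totalReward: 175 },
  { id: "B", pulls: 300, totalReward: 126 },
  { id: "C", pulls: 12, totalReward: 8 },
]);
console.log(ranked[0].id); // "C" — the under-explored arm wins
```

Twelve pulls is too few to trust a 66.7% rate, and UCB1's bonus term encodes exactly that uncertainty.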
## The 0.01ms Alternative
```bash
curl -X POST https://oraclaw-api.onrender.com/api/v1/optimize/bandit \
  -H 'Content-Type: application/json' \
  -d '{"arms":[{"id":"A","pulls":500,"totalReward":175},{"id":"B","pulls":300,"totalReward":126},{"id":"C","pulls":12,"totalReward":8}],"algorithm":"ucb1"}'
```
Response: Variant C selected. Score 1.543. Exploitation 0.667 + Exploration 0.876. Mathematically provable.
| Task | GPT-4 | OraClaw |
|---|---|---|
| A/B test selection | ~2,000 tokens, 3s, $0.04, sometimes wrong | 0.01ms, $0.01, always correct |
| Schedule optimization | ~5,000 tokens, 8s, $0.10, approximate | 2ms, $0.01, provably optimal (HiGHS) |
| Risk assessment | ~3,000 tokens, 5s, $0.06, no confidence intervals | 5ms, $0.02, VaR + CVaR + CI |
| Anomaly detection | ~1,500 tokens, 2s, $0.03, threshold guessing | 0.01ms, $0.01, Z-score + IQR |
| Time series forecast | ~4,000 tokens, 6s, $0.08, no model | 0.08ms, $0.01, ARIMA + Holt-Winters |
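The anomaly-detection row combines two classic tests. A hedged sketch of how Z-score and IQR flagging work, using the textbook thresholds (|z| > 2, 1.5 × IQR fences); OraClaw's defaults may differ:

```typescript
// Z-score: distance from the mean in standard deviations.
function zScores(xs: number[]): number[] {
  const mean = xs.reduce((s, x) => s + x, 0) / xs.length;
  const std = Math.sqrt(xs.reduce((s, x) => s + (x - mean) ** 2, 0) / xs.length);
  return xs.map((x) => (x - mean) / std);
}

// IQR fences: anything outside [Q1 - 1.5*IQR, Q3 + 1.5*IQR] is an outlier.
function iqrBounds(xs: number[]): [number, number] {
  const sorted = [...xs].sort((a, b) => a - b);
  const q = (p: number) => sorted[Math.floor(p * (sorted.length - 1))];
  const iqr = q(0.75) - q(0.25);
  return [q(0.25) - 1.5 * iqr, q(0.75) + 1.5 * iqr];
}

const series = [10, 11, 9, 10, 12, 11, 48]; // 48 is the anomaly
const z = zScores(series);
const [lo, hi] = iqrBounds(series);
const anomalies = series.filter((x, i) => Math.abs(z[i]) > 2 || x < lo || x > hi);
console.log(anomalies); // [48]
```

No threshold guessing, no prompt: the same input always flags the same points.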
## 19 Algorithms, Zero LLM Tokens
OraClaw ships 19 deterministic algorithms: Multi-Armed Bandits (UCB1/Thompson/LinUCB), CMA-ES, Genetic Algorithm, LP/MIP solver (HiGHS), Monte Carlo simulation, Bayesian inference, ensemble models, time series forecasting, VaR/CVaR portfolio risk, anomaly detection, graph analysis (PageRank/Louvain), and A* pathfinding.
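For the risk metrics, here's a minimal historical-simulation sketch of VaR and CVaR; the function name and conventions are illustrative, not OraClaw's API:

```typescript
// Historical-simulation VaR/CVaR: sort observed returns, take the tail.
function varCvar(returns: number[], alpha = 0.95): { var_: number; cvar: number } {
  const sorted = [...returns].sort((a, b) => a - b); // worst returns first
  const idx = Math.floor((1 - alpha) * sorted.length);
  const tail = sorted.slice(0, idx + 1);             // the (1 - alpha) worst outcomes
  return {
    var_: -sorted[idx],                              // loss at the (1 - alpha) quantile
    cvar: -tail.reduce((s, x) => s + x, 0) / tail.length, // mean loss in the tail
  };
}

// Ten daily returns; 80% confidence keeps the tail non-trivial for a tiny sample.
const dailyReturns = [-0.10, -0.05, -0.02, 0.01, 0.01, 0.02, 0.02, 0.03, 0.04, 0.05];
const { var_, cvar } = varCvar(dailyReturns, 0.8);
console.log(var_, cvar); // VaR 0.05, CVaR ≈ 0.075
```

CVaR (expected shortfall) is always at least as large as VaR because it averages the losses beyond the quantile instead of stopping at it.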
14 of 17 endpoints respond in under 1ms. 1,072 tests passing.
## Three Ways to Integrate
- **REST API** -- curl any endpoint; no signup for the free tier (100 calls/day)
- **MCP Server** -- `npx @oraclaw/mcp-server` gives Claude/GPT 12 optimization tools
- **npm SDKs** -- `npm install @oraclaw/bandit @oraclaw/solver @oraclaw/risk`
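From Node 18+ (built-in `fetch`), the REST route looks like this. The endpoint and payload come straight from the curl example above; the response shape is an assumption, so parse it as `unknown` and inspect:

```typescript
// Payload mirrors the curl example: three arms plus the algorithm name.
const payload = {
  arms: [
    { id: "A", pulls: 500, totalReward: 175 },
    { id: "B", pulls: 300, totalReward: 126 },
    { id: "C", pulls: 12, totalReward: 8 },
  ],
  algorithm: "ucb1",
};

// POST the arms to the live bandit endpoint and return the parsed JSON.
async function selectArm(): Promise<unknown> {
  const res = await fetch("https://oraclaw-api.onrender.com/api/v1/optimize/bandit", {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify(payload),
  });
  return res.json();
}
```

Swap in your own arm counts; the free tier needs no API key.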
## Try It Now
Every curl example hits the live API. Try the interactive demo -- no signup.
- API: oraclaw-api.onrender.com
- Demo: web-olive-one-89.vercel.app/demo
- GitHub: github.com/Whatsonyourmind/oraclaw
- npm: @oraclaw
Free tier: 100 calls/day, no auth. Paid: $9/mo. AI agents pay with USDC via x402 protocol.
LLMs are extraordinary at language. They're terrible at math. Stop making your agents think about optimization. Give them a calculator.
OraClaw is MIT licensed. 1,072 tests. Star us on GitHub if this saved you some tokens.