
Vitor Calvi


How I Built an AI Trading Agent That Learns From Every Trade (And Why 90% of Strategies Fail)

#ai


Most trading bots are dumb.

They follow rules like "buy when RSI < 30 and volume spikes" — and never question whether the market has fundamentally changed.

But markets evolve. Regimes shift. Volatility clusters. Correlations break. A strategy that printed money last month might be dead today.

So I built something different: an AI trading agent that runs on your own machine, remembers every trade it's ever made, and evolves its strategies through brutal backtest validation.

Here's the architecture, the methodology, and the uncomfortable truth about what happens when you actually test your trading ideas.


The Core Idea

What if a trading system could:

  1. Watch markets autonomously — scanning for opportunities every 15 minutes
  2. Remember what happened before — encoding every trade outcome into a searchable memory
  3. Recall similar situations — "this setup looks like the one that lost 3% last Tuesday"
  4. Validate before committing — running walk-forward backtests before any real capital goes in
  5. Evolve winners automatically — mutating successful strategy parameters to stay ahead of regime changes

That's Cognitive Trader. And it's not a signal bot. It's a cognitive system.


How It Works

Step 1: Autonomous Market Analysis

Every 15 minutes (configurable), the system runs a cognitive cycle:

  • Polls prices for BTC, ETH, SOL, and any user-defined symbols
  • Aggregates signals from technical indicators, active trading goals, and memory recalls
  • Asks the LLM to pick one of three actions: STAY_SILENT, SPEAK (send a Telegram alert), or ACT (execute a trade)

The LLM layer supports GPT-4, Claude, Gemini, and even local models via Ollama. You choose the brain.
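
A minimal sketch of that cycle in Go, under stated assumptions: `DecideFunc` is a hypothetical stand-in for the LLM call, and `ParseAction`, `RunCycles`, and the fallback-to-silence behaviour are illustrative choices rather than the project's actual code. The point it shows is that free-form model output gets narrowed to exactly three actions before anything else sees it.

```go
package main

import (
	"context"
	"strings"
	"time"
)

// Action is one of the three outcomes of a cognitive cycle.
type Action string

const (
	StaySilent Action = "STAY_SILENT"
	Speak      Action = "SPEAK"
	Act        Action = "ACT"
)

// ParseAction maps raw model output to an Action. Anything unrecognised
// falls back to StaySilent, so a malformed reply cannot trigger a trade.
func ParseAction(raw string) Action {
	switch Action(strings.ToUpper(strings.TrimSpace(raw))) {
	case Speak:
		return Speak
	case Act:
		return Act
	default:
		return StaySilent
	}
}

// DecideFunc is a hypothetical stand-in for the LLM call. It would receive
// prices, indicator signals, and recalled memories in a real system.
type DecideFunc func(ctx context.Context) (string, error)

// RunCycles invokes decide once per interval until ctx is cancelled.
// interval must be positive (time.NewTicker panics otherwise).
func RunCycles(ctx context.Context, interval time.Duration, decide DecideFunc, handle func(Action)) {
	ticker := time.NewTicker(interval)
	defer ticker.Stop()
	for {
		select {
		case <-ctx.Done():
			return
		case <-ticker.C:
			raw, err := decide(ctx)
			if err != nil {
				// Assumption: a failed LLM call is treated as "do nothing".
				handle(StaySilent)
				continue
			}
			handle(ParseAction(raw))
		}
	}
}
```

Calling `RunCycles(ctx, 15*time.Minute, decide, handle)` would give the default cadence; swapping the model only means swapping the `decide` function.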

Step 2: Vector-Embedded Memory

This is where it gets interesting. Every closed trade gets processed through two "memory wires":

W1 — Manual Fact Extraction
When a trade closes, the system extracts factual statements:

"RSI divergence on BTCUSDT produced +2.3% in low volatility regime"
"Breakout failed on ETHUSDT during FOMC announcement"

W2 — Vector Similarity Retrieval
Using the nomic-embed-text model served through Ollama, every fact gets encoded into a vector embedding. When a new market situation arises, the system embeds it the same way and retrieves the most similar past experiences via cosine similarity.

So when the LLM is deciding whether to act, it doesn't just see current prices. It sees what happened in the 5 most similar situations it has on record.
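
The retrieval step can be sketched in a few lines of Go. This is an in-memory illustration only: the `Memory` type and the `TopK` helper are hypothetical names, and the real system presumably runs the search against PostgreSQL rather than a slice. The embeddings are assumed to come from the embedding model already.

```go
package main

import (
	"math"
	"sort"
)

// Memory pairs an extracted fact with its embedding (hypothetical type).
type Memory struct {
	Fact      string
	Embedding []float64
}

// CosineSimilarity returns dot(a,b) / (|a| * |b|).
// It returns 0 for mismatched lengths, empty vectors, or zero vectors.
func CosineSimilarity(a, b []float64) float64 {
	if len(a) != len(b) || len(a) == 0 {
		return 0
	}
	var dot, normA, normB float64
	for i := range a {
		dot += a[i] * b[i]
		normA += a[i] * a[i]
		normB += b[i] * b[i]
	}
	if normA == 0 || normB == 0 {
		return 0
	}
	return dot / (math.Sqrt(normA) * math.Sqrt(normB))
}

// TopK returns the k memories most similar to query, best match first.
func TopK(query []float64, memories []Memory, k int) []Memory {
	if k > len(memories) {
		k = len(memories)
	}
	if k <= 0 {
		return nil
	}
	type scored struct {
		mem   Memory
		score float64
	}
	all := make([]scored, len(memories))
	for i, m := range memories {
		all[i] = scored{mem: m, score: CosineSimilarity(query, m.Embedding)}
	}
	// Stable sort keeps insertion order for ties.
	sort.SliceStable(all, func(i, j int) bool { return all[i].score > all[j].score })
	out := make([]Memory, k)
	for i := range out {
		out[i] = all[i].mem
	}
	return out
}
```

With `k = 5`, the returned facts are what would be appended to the LLM prompt. Note that the result is "most similar", not "identical": a low similarity score on the best match is itself useful information.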

Step 3: The LLM Firebreak

Before any trade executes, it passes through a risk-gated decision layer I call "The LLM Firebreak":

  • Position sizing: min(notional_cap, risk_based_size) — the smaller wins
  • Stop losses: capped at 5% from entry, always
  • Daily drawdown killswitch: if 24h PnL drops below -$500, the autonomous cycle halts
  • Maximum 3 concurrent positions
  • OCA (One-Cancels-All) bracket orders: every entry gets paired SL/TP algo orders

The LLM can suggest a trade, but the risk engine has veto power. Always.
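
Here is a rough sketch of what such a deterministic veto layer could look like. The limits (5% stop cap, $500 daily loss, 3 positions) come from the list above; everything else is an assumption made for illustration, including the type names and the formula for risk-based size (dollars at risk divided by stop distance). Bracket order placement is left out.

```go
package main

import (
	"errors"
	"math"
)

// Limits holds the hard risk rules. Field names are hypothetical.
type Limits struct {
	NotionalCapUSD    float64 // hard ceiling on position notional
	RiskPerTradeUSD   float64 // assumed: dollars lost if the stop is hit
	MaxStopPct        float64 // 0.05 = stop at most 5% from entry
	DailyLossLimitUSD float64 // 500 = halt when 24h PnL < -$500
	MaxOpenPositions  int     // 3
}

// State is the account snapshot the engine checks against.
type State struct {
	PnL24hUSD     float64
	OpenPositions int
}

// Proposal is what the LLM suggests; Approved is what may be executed.
type Proposal struct{ Entry, Stop float64 }
type Approved struct{ NotionalUSD, Stop float64 }

var (
	ErrKillswitch       = errors.New("daily drawdown killswitch active")
	ErrTooManyPositions = errors.New("maximum concurrent positions reached")
	ErrInvalidProposal  = errors.New("entry and stop must be positive and distinct")
)

// Approve vetoes or resizes a proposal. The LLM never computes size itself.
func Approve(l Limits, s State, p Proposal) (Approved, error) {
	if s.PnL24hUSD < -l.DailyLossLimitUSD {
		return Approved{}, ErrKillswitch
	}
	if s.OpenPositions >= l.MaxOpenPositions {
		return Approved{}, ErrTooManyPositions
	}
	if p.Entry <= 0 || p.Stop <= 0 || p.Entry == p.Stop {
		return Approved{}, ErrInvalidProposal
	}
	stop := p.Stop
	stopPct := math.Abs(p.Entry-p.Stop) / p.Entry
	if stopPct > l.MaxStopPct {
		// Pull the stop in to the cap, on the same side of entry.
		stopPct = l.MaxStopPct
		if p.Stop < p.Entry {
			stop = p.Entry * (1 - l.MaxStopPct) // long
		} else {
			stop = p.Entry * (1 + l.MaxStopPct) // short
		}
	}
	// Assumed sizing rule: notional such that hitting the stop loses RiskPerTradeUSD.
	riskBased := l.RiskPerTradeUSD / stopPct
	return Approved{NotionalUSD: math.Min(l.NotionalCapUSD, riskBased), Stop: stop}, nil
}
```

For example, with a $1,000 notional cap and $20 risked per trade, a proposed stop 10% away is pulled in to 5% and sized at about $400, while a 1% stop would imply $2,000 and gets clipped to the $1,000 cap.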


The Brutal Truth About Backtesting

Here's what nobody tells you about trading strategies:

In my runs, ~90% of hypotheses die in walk-forward validation.

They look amazing in-sample. Beautiful equity curves. Sharpe ratios above 2. Then you run them on out-of-sample data and they bleed money.

Classic overfitting.

I built a backtest dashboard that runs your hypotheses through walk-forward validation and shows you the truth:

  • Which strategies are ALIVE (survived out-of-sample)
  • Which strategies are DEAD (failed validation)
  • Win rate, total P&L, Sharpe ratio per run
  • A cumulative ROI bar that compounds only from surviving strategies

No curve-fitting illusions. No cherry-picked time windows. Just brutal, honest validation.
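
For readers who haven't seen walk-forward validation before, here is a simplified sketch of the mechanics: slide a train window and a test window across the data, fit on the first, score only on the second. The names `Window`, `Strategy`, and `Validate` are hypothetical, and the survival rule used here (positive total out-of-sample PnL) is an assumption; the dashboard's actual criteria may differ.

```go
package main

// Window holds half-open index ranges [Start, End) into a price series.
type Window struct {
	TrainStart, TrainEnd int
	TestStart, TestEnd   int
}

// WalkForwardWindows rolls a train window followed by a test window across
// n data points, stepping forward by testLen so test ranges never overlap.
func WalkForwardWindows(n, trainLen, testLen int) []Window {
	var out []Window
	if trainLen <= 0 || testLen <= 0 {
		return out
	}
	for start := 0; start+trainLen+testLen <= n; start += testLen {
		trainEnd := start + trainLen
		out = append(out, Window{
			TrainStart: start, TrainEnd: trainEnd,
			TestStart: trainEnd, TestEnd: trainEnd + testLen,
		})
	}
	return out
}

// Strategy fits on train and returns its PnL on test (hypothetical shape).
type Strategy func(train, test []float64) float64

// Validate sums out-of-sample PnL over all windows. Assumed rule: a
// strategy is ALIVE only if that total is positive; no windows means DEAD.
func Validate(prices []float64, trainLen, testLen int, s Strategy) (string, float64) {
	windows := WalkForwardWindows(len(prices), trainLen, testLen)
	if len(windows) == 0 {
		return "DEAD", 0
	}
	var oosPnL float64
	for _, w := range windows {
		oosPnL += s(prices[w.TrainStart:w.TrainEnd], prices[w.TestStart:w.TestEnd])
	}
	if oosPnL > 0 {
		return "ALIVE", oosPnL
	}
	return "DEAD", oosPnL
}
```

The key property is that no test bar is ever visible during fitting, which is exactly what an in-sample equity curve hides.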


Strategy Evolution

The strategies that survive backtesting don't just sit there. They evolve.

The system automatically mutates their parameters:

  • Tighter or wider stop losses
  • Different entry thresholds
  • New symbol combinations
  • Adjusted position sizing

It's Darwinian. Only the fittest compound. The rest get pruned.
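
To make the mutation step concrete, here is a small sketch covering only the numeric parameters (symbol combinations are left out). It assumes a simple scheme: each parameter is nudged by a random relative amount and then clamped to a valid range. The `Params` fields, the jitter scheme, and the clamp bounds are all illustrative assumptions, except the 5% stop cap, which mirrors the risk rule described earlier.

```go
package main

import "math/rand"

// Params is a hypothetical set of tunable strategy parameters.
type Params struct {
	StopLossPct    float64 // fraction of entry price, e.g. 0.03
	EntryThreshold float64 // e.g. an RSI level in [0, 100]
	SizeFraction   float64 // fraction of the allowed size, in [0, 1]
}

// NewRNG returns a seeded generator so mutation runs are reproducible.
func NewRNG(seed int64) *rand.Rand {
	return rand.New(rand.NewSource(seed))
}

// jitter scales v by a random factor in [1-maxRel, 1+maxRel).
func jitter(rng *rand.Rand, v, maxRel float64) float64 {
	return v * (1 + (rng.Float64()*2-1)*maxRel)
}

// clamp restricts v to [lo, hi].
func clamp(v, lo, hi float64) float64 {
	if v < lo {
		return lo
	}
	if v > hi {
		return hi
	}
	return v
}

// Mutate returns a perturbed copy of p. The stop never exceeds the 5% cap.
func Mutate(rng *rand.Rand, p Params, maxRel float64) Params {
	return Params{
		StopLossPct:    clamp(jitter(rng, p.StopLossPct, maxRel), 0.001, 0.05),
		EntryThreshold: clamp(jitter(rng, p.EntryThreshold, maxRel), 0, 100),
		SizeFraction:   clamp(jitter(rng, p.SizeFraction, maxRel), 0, 1),
	}
}

// Offspring produces n mutated children from one surviving parent.
func Offspring(rng *rand.Rand, parent Params, n int, maxRel float64) []Params {
	children := make([]Params, 0, n)
	for i := 0; i < n; i++ {
		children = append(children, Mutate(rng, parent, maxRel))
	}
	return children
}
```

Each child would then go back through walk-forward validation before it is allowed to replace or join its parent, so mutation alone never promotes a strategy.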


Local-First Architecture

Everything runs on your machine:

  • Your API keys never leave your hardware
  • Your memory database is local (PostgreSQL)
  • Your LLM calls go through your own OpenRouter/Ollama setup
  • No required cloud dependency: with Ollama, nothing but exchange requests leaves your machine; with a cloud LLM, the prompt contents do

You can run it with cloud LLMs (GPT-4, Claude, Gemini) or fully locally with Ollama. Your choice.


The Tech Stack

| Layer         | Technology                                          |
|---------------|-----------------------------------------------------|
| Language      | Go (primary), JavaScript (dashboard)                |
| LLM           | OpenRouter (GPT-4, Claude, Gemini) + Ollama (local) |
| Memory        | PostgreSQL + vector embeddings (nomic-embed-text)   |
| Exchange      | Binance USDT-M futures (testnet or live)            |
| Dashboard     | Static HTML/CSS/JS, no frameworks                   |
| Notifications | Telegram bot                                        |

What I Learned

1. Most Trading Ideas Are Bad

The first time I ran my hypothesis generator, I was excited. Then the backtests came back. 9 out of 10 strategies died. Humbling, but honest.

2. Memory Changes Everything

A trading system that remembers its mistakes is fundamentally different from one that doesn't. The vector embedding recall is the closest thing to "experience" a machine can have.

3. Local-First Is Non-Negotiable

When it comes to trading, your keys are your sovereignty. Anything that requires sending API keys to a third party is a non-starter.

4. LLMs Are Good at Reasoning, Bad at Math

The LLM layer excels at pattern recognition and contextual reasoning. But it needs a deterministic risk engine underneath. Never let an LLM calculate position sizes directly.


What's Next

  • Multi-timeframe analysis (15m + 1h + 4h confluence)
  • Sentiment integration (news, social, on-chain)
  • Portfolio-level risk management (cross-position correlation)
  • More exchange support beyond Binance

Drop a comment — I read every one.


Built with Go, PostgreSQL, Ollama, and a healthy distrust of in-sample backtests.
