I didn't set out to build a trading bot.
I set out to answer a simple question: can a small algorithm, running on a $100 account, consistently beat gold?
After two sessions, one brutal backtest, and a complete strategy overhaul — here's exactly what we built, what failed, and what's now running live. No fluff, no paid course upsell. Just the raw journey.
🟡 The Asset: XAU/USD (Gold Spot)
Gold is one of the most-traded instruments in the world. It moves on inflation data, Fed decisions, geopolitical fear, and sometimes just pure momentum. For a bot, that's both exciting and terrifying.
Why gold?
- Trades nearly 24/5 — no gaps like individual stocks
- High volatility = bigger moves per signal
- Free price data available (no expensive data subscriptions)
- Retail-accessible via brokers like OANDA, FXCM, and IG
Starting capital: $100. Real-world risk. No pretend money.
🤖 The Architecture: What We Actually Built
The system has three layers:
Layer 1 — The Data Collector
A Python script (xauusd_scalper_scrape.py) runs every hour, fetching the live XAU/USD spot price from gold-api.com (completely free, no API key). It:
- Appends every tick to a local price_history.jsonl file
- Buckets spot ticks into hourly OHLC bars (open/high/low/close)
- Computes EMA50 and ATR14 on the rolling bar set
- Evaluates the trading rules and writes the signal decision to disk
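As a rough illustration of the bucketing and indicator steps above, here is a minimal sketch. It is not the code in xauusd_scalper_scrape.py: the function names (bucket_hourly, ema, atr), the (unix_timestamp, price) tick shape, the EMA seeding, and the simple-average ATR are all assumptions made for the example.

```python
from datetime import datetime, timezone


def bucket_hourly(ticks):
    """Group (unix_ts, price) ticks into hourly OHLC bars, oldest first."""
    bars = {}
    for ts, price in sorted(ticks):
        # Truncate the tick time to the start of its UTC hour.
        hour = datetime.fromtimestamp(ts, tz=timezone.utc).replace(
            minute=0, second=0, microsecond=0
        )
        bar = bars.get(hour)
        if bar is None:
            bars[hour] = {"time": hour, "open": price, "high": price,
                          "low": price, "close": price}
        else:
            bar["high"] = max(bar["high"], price)
            bar["low"] = min(bar["low"], price)
            bar["close"] = price  # last tick seen in the hour
    return [bars[h] for h in sorted(bars)]


def ema(values, period):
    """Exponential moving average, seeded with the first value (an assumption)."""
    alpha = 2 / (period + 1)
    out = values[0]
    for v in values[1:]:
        out = alpha * v + (1 - alpha) * out
    return out


def atr(bars, period=14):
    """Average of the last `period` true ranges (simple mean, not Wilder smoothing)."""
    true_ranges = [
        max(cur["high"] - cur["low"],
            abs(cur["high"] - prev["close"]),
            abs(cur["low"] - prev["close"]))
        for prev, cur in zip(bars, bars[1:])
    ]
    recent = true_ranges[-period:]
    return sum(recent) / len(recent) if recent else None
```

With bars in hand, the trend filter would be something like ema([b["close"] for b in bars], 50), and the stop distance would come from atr(bars, 14).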
Layer 2 — The Rule Engine
This is the brain — and crucially, it's 100% deterministic. No AI decides whether to buy or sell. A Python function checks three conditions at 16:00 UTC every weekday:
IF close > 3-hour NY range high
AND close > EMA50 (trend filter)
→ BUY | Stop: 1.5×ATR below entry | Target: 5R above entry
IF close < 3-hour NY range low
AND close < EMA50
→ SELL | Stop: 1.5×ATR above entry | Target: 5R below entry
OTHERWISE → HOLD
No discretion. No "feelings". The rule either fires or it doesn't.
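The rules above map naturally onto a single pure function. This is a minimal sketch, assuming the range, EMA50, and ATR14 values are computed upstream; the name evaluate_signal and the dictionary it returns are hypothetical, not taken from the live script.

```python
def evaluate_signal(close, range_high, range_low, ema50, atr14,
                    stop_mult=1.5, target_r=5.0):
    """Return a BUY, SELL, or HOLD decision from the breakout and trend rules."""
    stop_dist = stop_mult * atr14  # 1R expressed in price units
    if close > range_high and close > ema50:
        return {"action": "BUY", "entry": close,
                "stop": close - stop_dist,
                "target": close + target_r * stop_dist}
    if close < range_low and close < ema50:
        return {"action": "SELL", "entry": close,
                "stop": close + stop_dist,
                "target": close - target_r * stop_dist}
    return {"action": "HOLD"}
```

Because the function depends only on its inputs, the same bar always produces the same decision, which is what makes the backtest and the live bot comparable.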
Layer 3 — The Telegram Relay
An AI model (running free via OpenRouter) reads the brief and fires a formatted message to @HermesGoldBot on Telegram. This is the only thing the AI does — format text. It has no authority over the trade decision.
The whole thing runs as a Hermes cron job, scheduled at the top of every hour.
💀 The Strategy That Failed First
Before we found the winner, we tried the "obvious" approach: a multi-indicator confluence system using EMA crossovers, MACD histogram, RSI zones, and Bollinger Band touches.
It sounded sophisticated. The backtest results were sobering:
| Metric | Result |
|---|---|
| Starting Capital | $100 |
| Final Capital | $39.04 |
| Total Return | -61% |
| CAGR | -37.4% |
| Profit Factor | 0.72 |
| Max Drawdown | 61.6% |
| Win Rate | 46.2% |
| Total Trades | 1,274 |
A profit factor below 1.0 means the strategy loses money structurally. We ran 1,274 trades and came out with $39 from $100.
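For readers who want to reproduce these two metrics on their own trade logs, here is a small sketch. The function names and the list-of-P&L input are assumptions for the example, and the drawdown is measured on closed-trade equity only.

```python
def profit_factor(pnls):
    """Gross profit divided by gross loss; below 1.0 means a net loser."""
    gross_win = sum(p for p in pnls if p > 0)
    gross_loss = -sum(p for p in pnls if p < 0)
    return gross_win / gross_loss if gross_loss else float("inf")


def max_drawdown(start_capital, pnls):
    """Largest peak-to-trough drop in equity, as a fraction of the peak."""
    equity = peak = start_capital
    worst = 0.0
    for p in pnls:
        equity += p
        peak = max(peak, equity)
        worst = max(worst, (peak - equity) / peak)
    return worst
```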
Why did it fail?
The killer was the partial take-profit setup. The strategy took half the position off at 1R and let the rest run. On paper, "locking in profit" sounds prudent. In reality, it caps your winners while your losers still run to full stop. That destroys your reward-to-risk ratio and makes profitability almost impossible.
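To see how a partial take-profit reshapes the payoff, here is a simplified three-outcome model: a trade is either stopped before reaching 1R, reaches 1R and then reverses to the stop, or runs to the full target. The probabilities in the usage note are invented for illustration, and the model assumes the remaining half keeps the original stop; neither comes from the backtest.

```python
def expected_r(p_stop, p_scratch, p_target, target_r, partial):
    """Expected R per trade under a three-outcome model.

    p_stop:    stopped out before price ever reaches +1R
    p_scratch: reaches +1R, then reverses to the original stop
    p_target:  reaches the full target
    partial:   if True, half the position is closed at +1R
    """
    if abs(p_stop + p_scratch + p_target - 1.0) > 1e-9:
        raise ValueError("probabilities must sum to 1")
    if partial:
        win = 0.5 * 1.0 + 0.5 * target_r   # half banked at 1R, half at target
        scratch = 0.5 * 1.0 + 0.5 * -1.0   # half banked at 1R, half stopped
    else:
        win = target_r
        scratch = -1.0
    return p_target * win + p_scratch * scratch + p_stop * -1.0
```

With made-up inputs of 45% stopped, 20% scratch, 35% target, and a 3R target, the full-target version earns about 0.40R per trade and the partial version about 0.25R. In this model the partial exit only comes out ahead when p_scratch exceeds 0.5 × (target_r − 1) × p_target, so the larger the target, the more the cap on winners costs.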
Secondary killer: the EMA/MACD/BB confluence on 15-minute gold bars generates a lot of noise signals. The more conditions you add, the more you feel like you're being selective — but you're just adding correlated filters that all react to the same underlying price action.
🔬 The Search for a Real Edge
We ran a systematic grid search across 15 strategy archetypes on 2+ years of hourly GC=F data (13,735 bars, November 2023 → April 2026). Every strategy was tested with realistic friction ($1.20 round-trip per ounce for spread + slippage).
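The data was split 70/30 in-sample / out-of-sample: strategy parameters were tuned on the first 70%, then validated cold on the final 30% that was never touched during optimization. This guards against curve-fitting, because a strategy that has only memorized the tuning data tends to fall apart on the held-out portion.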
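The split itself is a one-liner as long as the bars stay in time order. A minimal sketch, where the function name is an assumption; the key point is that nothing is shuffled, so the out-of-sample slice is strictly later than the tuning data.

```python
def chronological_split(bars, in_sample_frac=0.7):
    """Split time-ordered bars into (in_sample, out_of_sample) without shuffling."""
    cut = int(len(bars) * in_sample_frac)
    return bars[:cut], bars[cut:]


# Example with the bar count quoted above (placeholder integers stand in for bars).
in_sample, out_of_sample = chronological_split(list(range(13_735)))
```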
The Graveyard
| Strategy | Profit Factor | Verdict |
|---|---|---|
| Multi-indicator confluence (original) | 0.72 | Dead |
| Pullback to EMA21 | 0.72 | Dead |
| MACD zero-cross + trend filter | 0.93 | Dead |
| Donchian 20 breakout | 0.89 | Dead |
| RSI 20/80 mean reversion | ~1.0 | Flat |
| Long-only trend follow 4R | 1.09 | Marginal |
The Survivor
| Strategy | Full PF | OOS PF | CAGR | Max DD | Trades |
|---|---|---|---|---|---|
| ORB NY 3h + EMA50 + 5R | 1.91 | 2.71 | +41% | 17.5% | 99 |
The 3-Hour NY Opening Range Breakout wasn't just the best strategy. It was the only one that actually improved on out-of-sample data — PF went from 1.34 in-sample to 2.71 out-of-sample. That's the opposite of overfitting. It suggests the strategy has a genuine edge, not just lucky parameters.
🏆 The Winning Strategy: How It Actually Works
Every weekday, the New York trading session opens at 13:00 UTC. For the first three hours (13:00, 14:00, 15:00 UTC bars), the bot silently observes — recording the highest high and lowest low of that window.
At 16:00 UTC, it makes its one decision for the day:
Long setup: If the current hourly close pushes above the 3-hour range high, and price is above EMA50 (confirming the broader uptrend), it enters a BUY.
Short setup: If the close drops below the 3-hour range low, and price is below EMA50, it enters a SELL.
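Extracting the opening range from hourly bars could look like the sketch below. It assumes each bar is a dict with a timezone-aware UTC "time" plus "high" and "low" keys, and that a day with a missing bar is skipped; both are assumptions for the example rather than details confirmed by the live script.

```python
from datetime import date, datetime, timezone

RANGE_HOURS = (13, 14, 15)  # the three observation bars, in UTC


def opening_range(bars, day):
    """Return (range_high, range_low) for `day`, or None if a bar is missing."""
    window = [
        b for b in bars
        if b["time"].date() == day and b["time"].hour in RANGE_HOURS
    ]
    if len(window) < len(RANGE_HOURS):
        return None  # incomplete window: no trade that day
    return max(b["high"] for b in window), min(b["low"] for b in window)
```

The two values returned here are what the 16:00 UTC close gets compared against.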
The sizing:
- Risk exactly 2% of account per trade
- Stop = 1.5 × ATR(14) from entry (volatility-adjusted, not arbitrary)
- Target = 5R (five times the stop distance)
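The three sizing rules above combine into a position size like this. It is a sketch: the function name, the ounces-based sizing, and the example ATR value are assumptions, and real brokers impose minimum lot sizes that a $100 account may not be able to meet exactly.

```python
def position_size(capital, atr14, risk_frac=0.02, stop_mult=1.5, target_r=5.0):
    """Translate account risk and ATR into stop distance, target distance, and size."""
    risk_usd = capital * risk_frac         # dollars lost if the stop is hit
    stop_dist = stop_mult * atr14          # 1R in price units ($ per ounce)
    return {
        "risk_usd": risk_usd,
        "stop_dist": stop_dist,
        "target_dist": target_r * stop_dist,
        "ounces": risk_usd / stop_dist,    # size that loses exactly risk_usd at the stop
    }


# Example: $100 account and a hypothetical ATR(14) of $8.
plan = position_size(100.0, 8.0)
```

In this example the bot would risk $2.00 with a $12 stop and a $60 target, which works out to roughly 0.17 oz.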
The math over 2 years:
- 44 wins, 55 losses (44.4% win rate)
- Average winner: +$4.74 | Average loser: -$1.99
- That's a 2.38 reward-to-risk ratio
At the realized 2.38 reward-to-risk ratio, you only need to be right about 30% of the time to break even (1 / (1 + 2.38) ≈ 29.6%). Hitting 44% is a genuine edge.
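The break-even win rate follows from the payoff ratio: 1 / (1 + reward-to-risk). The check below uses only the win/loss counts and average P&L figures reported above; the variable names are just for the example.

```python
wins, losses = 44, 55
avg_win, avg_loss = 4.74, 1.99  # dollars, as reported above

win_rate = wins / (wins + losses)
payoff = avg_win / avg_loss                            # realized reward-to-risk
profit_factor = (wins * avg_win) / (losses * avg_loss)

breakeven_realized = 1 / (1 + payoff)  # break-even at the realized payoff
breakeven_5r = 1 / (1 + 5)             # break-even if every winner paid a full 5R

print(f"win rate {win_rate:.1%}, payoff {payoff:.2f}, PF {profit_factor:.2f}")
print(f"break-even: {breakeven_realized:.1%} realized, {breakeven_5r:.1%} at 5R")
```

The realized payoff of 2.38 puts break-even near 29.6%, while a pure 5R payoff would need only about 16.7%. Either way, a 44.4% win rate clears the bar.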
📊 Full Backtest Results
Dataset: 13,735 hourly bars | Nov 2023 → Apr 2026 | 2.01 years
Universe: GC=F (Gold Futures, proxy for XAU/USD spot)
Friction: $1.20 per oz round-trip (spread + slippage)
Risk: 2% per trade | One position at a time
$100 → $199.40 (+99.4% total | +41.0% CAGR)
Profit Factor: 1.91 (full) | 2.71 (out-of-sample)
Win Rate: 44.4%
Avg R:R: 2.38
Max Drawdown: 17.5%
Trades: 99 (~1 every 8 days)
For context: buy-and-hold gold over the same period returned +142% (gold went from $2,003 → $4,841). The strategy returned +99%, which is less than simply holding, but with controlled drawdowns and defined risk per trade. A real edge, not a bull market narrative.
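The headline percentages can be reproduced from the reported start and end values. A quick sketch, taking the 2.01-year span from the results block at face value; the function names are assumptions for the example.

```python
def total_return(start, end):
    """Simple return over the whole period, as a fraction."""
    return end / start - 1


def cagr(start, end, years):
    """Compound annual growth rate, as a fraction."""
    return (end / start) ** (1 / years) - 1


strategy_total = total_return(100.0, 199.40)
strategy_cagr = cagr(100.0, 199.40, 2.01)
buy_hold_total = total_return(2003.0, 4841.0)

print(f"strategy: {strategy_total:+.1%} total, {strategy_cagr:+.1%} CAGR")
print(f"buy-and-hold: {buy_hold_total:+.0%} total")
```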
⚙️ Going Live: The Stack
Everything runs on a Linux server with Hermes Agent (open-source AI agent framework) as the orchestration layer.
Files on disk:
~/.hermes/trading-agent/
  config.json ← strategy parameters
  state.json ← capital, positions, drawdown
  price_history.jsonl ← every price tick
  signals_v2.jsonl ← every signal with full reasoning
  best_strategy.json ← deployed strategy config
~/.hermes/scripts/
  xauusd_scalper_scrape.py ← the rule engine (v2, deterministic)
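The two .jsonl files are append-only logs with one JSON object per line. Writing and reading them needs nothing beyond the standard library. This is a minimal sketch; the helper names and the record fields in the usage note are assumptions, not the actual schema.

```python
import json
from pathlib import Path


def append_jsonl(path, record):
    """Append one record as a single JSON line."""
    with Path(path).open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(record) + "\n")


def read_jsonl(path):
    """Load every non-empty line of a JSONL file as a dict."""
    with Path(path).open(encoding="utf-8") as fh:
        return [json.loads(line) for line in fh if line.strip()]
```

A call such as append_jsonl("signals_v2.jsonl", {"action": "HOLD", "reason": "no breakout"}) adds one line without rewriting the file, so a crash mid-run cannot corrupt earlier history.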
The cron job:
- Runs at the top of every hour (0 * * * *)
- Script runs first: fetches price, builds bars, evaluates rules, writes signal
- AI model (free tier) runs second: reads the brief, sends Telegram message
- Model used: openai/gpt-oss-20b:free via OpenRouter (completely free)
Telegram commands you can send to @HermesGoldBot:
- /status: current capital, P&L, drawdown
- /signals: last 5 trade signals
- /now: force an immediate scan
- /pause and /resume: pause or resume the bot
- /risk: current risk level and lot sizing
🔴 What We Learned the Hard Way
1. Partial TPs are profit killers.
Taking half off at 1R feels safe. It's not. It caps your winners and the math works against you unless your win rate is above 60% (rare in systematic trading).
2. More indicators ≠ more edge.
The original system had 6 indicators. The winning system has 3 (price, EMA50, ATR). Simplicity won.
3. Out-of-sample validation is non-negotiable.
Any strategy can be curve-fitted to look good on historical data. If it doesn't hold up on data it's never seen, it's not a strategy — it's a memory.
4. The LLM shouldn't be the decision-maker.
We started with an LLM making the buy/sell calls. It failed spectacularly (PF 0.72). Moving the LLM to "Telegram messenger" and making a deterministic Python function the decision-maker fixed everything. AI for formatting, rules for trading.
5. Free ≠ unreliable (if you pick right).
We tested 7 free-tier models. Most were heavily rate-limited and failed. gpt-oss-20b:free on OpenRouter worked reliably. The entire system costs $0/month to run.
🔭 What's Next
The bot is live and collecting data. The first real signal will fire at 16:00 UTC on the next clean breakout day. Meanwhile, there are several improvements queued:
- News filter — skip signals when high-impact USD/Gold events fall within ±2h of the trigger (FOMC, NFP, CPI). This is probably the single highest-value improvement.
- London session ORB: run the same logic on the 7:00 UTC London open. This could roughly double trade frequency.
- OANDA paper trading integration — close the loop with actual order placement so P&L is real fills, not simulated.
- Live vs backtest tracking — after 30 live trades, compare live PF vs backtest PF. If they diverge by >30%, pause and investigate.
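Two of the queued items above, the news filter and the live-vs-backtest check, are simple enough to sketch. Both functions are hypothetical: the event calendar is assumed to be a list of timezone-aware datetimes supplied from elsewhere, and divergence is measured relative to the backtest PF.

```python
from datetime import datetime, timedelta, timezone


def near_high_impact_event(trigger, events, window=timedelta(hours=2)):
    """True if any scheduled event falls within +/- `window` of the trigger time."""
    return any(abs(trigger - event) <= window for event in events)


def pf_diverged(live_pf, backtest_pf, tolerance=0.30):
    """True if live PF differs from backtest PF by more than `tolerance` (relative)."""
    return abs(live_pf - backtest_pf) / backtest_pf > tolerance
```

For example, a 16:00 UTC trigger with an event at 14:30 UTC the same day would be skipped, and a live PF of 1.2 against the backtest's 1.91 (about 37% lower) would trip the pause condition.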
💬 Final Thought
The market doesn't care how clever your strategy sounds. It only cares whether you're right more often than you're wrong — when it counts.
The ORB strategy isn't glamorous. It doesn't use machine learning. It doesn't read news sentiment. It just watches where price consolidates in the first three hours of New York trading, waits for a decisive break in one direction, confirms the trend, and takes the trade.
Simple. Systematic. Backtested. Live.
We'll post updates as the live signals come in. Follow along on
Built with: Python · Hermes Agent · OpenRouter (free tier) · gold-api.com · Telegram Bot API
Strategy validated on 13,735 hourly bars, Nov 2023 – Apr 2026
Past performance does not guarantee future results. This is not financial advice.