DEV Community

Ray

5 Backtesting Mistakes That Make Your Algo Look Like a Money Printer (Until It Isn't)

A 16-year-old just posted a forex algo on Reddit with a Sharpe ratio of 7.78 over 25 years. Zero losing years. 98% profitable months.

The comments were exactly what you'd expect: "what are you smoking?" and "show me the live results."

Here's the thing — the kid probably didn't do anything dishonest. These numbers show up constantly in backtests. They just don't survive first contact with live markets. After building TradeSight (an open-source strategy tournament system that evolves trading strategies overnight), I've watched hundreds of backtests go from god-tier to mediocre the moment you stress-test them properly.

Here are the five mistakes that keep producing lottery-ticket backtests.

1. OHLC Bar Resolution Lies About Fill Order

The mistake: Using M15 (or any timeframe) OHLC bars and assuming your limit order got filled at the price you wanted, in the order you wanted.

Why it kills you: A 15-minute bar tells you four numbers: open, high, low, close. It does not tell you whether price hit your take-profit or stop-loss first. If both prices exist within the same bar, your backtester has to guess — and most guess in your favor.

The fix: Use tick data for any strategy where entries or exits can happen mid-bar. If you can't get tick data, at minimum implement the "SL checked before TP" rule for same-bar conflicts.
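The "SL checked before TP" rule can be sketched in a few lines. This is a minimal illustration, not TradeSight's actual code; the function name and signature are made up for the example:

```python
# Conservative same-bar exit resolution for a long position: when a
# bar's range contains both the stop-loss and the take-profit, assume
# the stop was hit first. Names here are illustrative only.

def resolve_exit(bar_high, bar_low, stop_loss, take_profit):
    """Return the exit price for a long trade on this bar, or None."""
    hit_sl = bar_low <= stop_loss
    hit_tp = bar_high >= take_profit
    if hit_sl:           # SL checked before TP: the worst case wins
        return stop_loss
    if hit_tp:
        return take_profit
    return None

# A wide bar that spans both levels exits at the stop, not the target.
print(resolve_exit(bar_high=1.1080, bar_low=1.1020,
                   stop_loss=1.1030, take_profit=1.1070))  # 1.103
```

An optimistic backtester would have booked the take-profit here; the conservative rule books the loss. If that one assumption flips your equity curve, the edge was never there.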

For reference, IC Markets offers 30 months of tick data for free. It's not 25 years, but it'll tell you fast whether your edge is real or an artifact of bar resolution.

2. Spreads That Don't Exist in the Real World

The mistake: Backtesting with current spreads across historical data.

Why it kills you: EURUSD at 0.1 pip spread is realistic in 2026 during London session. In 2005? Try 1.5-2.0 pips. During the 2008 financial crisis? Some retail brokers were quoting 5+ pip spreads on majors.

The fix: Run your backtest with period-appropriate spreads. Pre-2010: 1.0-2.0 pips for major pairs. 2010-2015: 0.5-1.0 pips. Current: whatever your broker quotes. If your profit factor drops below 1.2 with realistic historical spreads, the edge was mostly the spread assumption.
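A simple way to wire this in is an era-based spread lookup keyed on trade date. The midpoint values and cutoff dates below are rough illustrations of the ranges above, not broker data:

```python
from datetime import date

# Era-based EURUSD spread schedule (pips). Cutoffs and midpoints are
# illustrative, matching the rough ranges in the text.
SPREAD_ERAS = [
    (date(2010, 1, 1), 1.5),   # pre-2010: 1.0-2.0 pips, midpoint
    (date(2015, 1, 1), 0.75),  # 2010-2015: 0.5-1.0 pips, midpoint
    (date.max,         0.1),   # current: use your broker's quote
]

def spread_pips(trade_date):
    """Return the assumed spread (pips) for a given historical date."""
    for cutoff, pips in SPREAD_ERAS:
        if trade_date < cutoff:
            return pips
    return SPREAD_ERAS[-1][1]

print(spread_pips(date(2005, 6, 1)))   # 1.5
print(spread_pips(date(2024, 6, 1)))   # 0.1
```

Charge every simulated trade the era's spread instead of today's, and watch what the profit factor does.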

3. Development Bias (The Survivorship Problem You Can't See)

The mistake: Building many strategy variations, picking the one with the best backtest, then claiming it's "unoptimized."

Why it kills you: Even if your final strategy only has 2 tunable parameters, the process of getting there involved hundreds of decisions. Which pairs to trade. Which timeframe. Which indicator. Each decision was implicitly informed by looking at the data.

The fix: True out-of-sample testing means setting aside a chunk of data you never look at during development.
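The mechanical part is trivial; the discipline is the hard part. A minimal sketch of a chronological holdout split (the function name and 20% fraction are arbitrary choices for illustration):

```python
# Chronological holdout: the last fraction of the data is locked away
# and touched exactly once, after development is frozen. Never shuffle
# time-series data before splitting.

def split_holdout(bars, holdout_frac=0.2):
    """Split time-ordered bars into (development, holdout) sets."""
    cut = int(len(bars) * (1 - holdout_frac))
    return bars[:cut], bars[cut:]

bars = list(range(100))            # stand-in for 100 time-ordered bars
dev, holdout = split_holdout(bars)
print(len(dev), len(holdout))      # 80 20
```

The code won't stop you from peeking at the holdout set during development. Nothing will, except you.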

In TradeSight's tournament system, we do this automatically — strategies compete on unseen data every night. The ones that only look good on training data get killed off naturally.

4. Ignoring Execution Reality

The mistake: Assuming your limit orders fill at the exact price you specify.

Why it kills you live: Queue position. If you place a limit order at 1.1050, you're behind every order already sitting at that level. In backtesting, you assume you're first in line. In reality, price might touch 1.1050 and reverse without filling you.

The fix: Add 1-tick slippage to every fill. Assume worst-case queue position. If your strategy is sensitive to these assumptions, the edge is in the execution model, not the market.
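A pessimistic fill model is one line of arithmetic: every fill pays a tick in the adverse direction. The helper below is a sketch under that assumption, with names invented for the example:

```python
# Worst-case queue position approximation: buys fill one tick above the
# limit price, sells one tick below. TICK here assumes a 4-decimal FX
# pair quoted to 5 decimals.

TICK = 0.0001  # one pip

def pessimistic_fill(limit_price, side, tick=TICK):
    """Apply one tick of adverse slippage to a limit-order fill."""
    price = limit_price + tick if side == "buy" else limit_price - tick
    return round(price, 5)  # FX prices quoted to 5 decimals

print(pessimistic_fill(1.1050, "buy"))   # 1.1051
print(pessimistic_fill(1.1050, "sell"))  # 1.1049
```

Run the backtest with and without this model. A robust edge barely notices; a queue-position artifact evaporates.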

5. No Stress Testing of Assumptions

The mistake: Running one backtest with one set of parameters and declaring victory.

Why it kills you: Markets aren't stationary. The volatility regime of 2005 is completely different from 2020.

The fix: Before trusting any backtest, perturb it. Widen spreads. Delay entries. Remove your best trades. Shift parameters. If the equity curve goes from god-tier to flat after minor friction, the edge is more fragile than it looks.
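Two of those perturbations, stripping the best trades and charging extra friction per trade, can be applied to a plain P&L series in a few lines. The trade numbers are made up for illustration:

```python
# Stress a trade P&L series: drop the N best trades, subtract an extra
# per-trade cost (e.g. wider spread), then recompute profit factor.

def profit_factor(pnls):
    """Gross gains divided by gross losses."""
    gains = sum(p for p in pnls if p > 0)
    losses = -sum(p for p in pnls if p < 0)
    return gains / losses if losses else float("inf")

def stress(pnls, drop_best=3, extra_cost=2.0):
    """Remove the top trades and charge extra friction per trade."""
    survivors = sorted(pnls)[:-drop_best] if drop_best else list(pnls)
    return [p - extra_cost for p in survivors]

trades = [120, -40, 80, -30, 60, 200, -50, 40, -20, 90]
print(round(profit_factor(trades), 2))          # 4.21
print(round(profit_factor(stress(trades)), 2))  # 1.18
```

A profit factor of 4.21 collapsing to 1.18 after losing three lucky trades and two units of friction is exactly the fragility this section is about. A real edge degrades gracefully; a backtest artifact falls off a cliff.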

The Uncomfortable Truth

Most retail algo traders are running backtests that look amazing because the backtest engine is doing them favors they don't realize. The path from "Sharpe 7 backtest" to "consistently profitable live" isn't tweaking parameters — it's eliminating every assumption that flatters your results.

Build your stress tests first. Then build your strategy.


TradeSight runs strategy tournaments overnight — hundreds of parameter variations compete on unseen data, losers get killed, winners evolve. MIT licensed, self-hosted, no API keys required for demo mode.
