DEV Community

Matthieu David

The hardest part of building a no-code backtester wasn't the backtest. It was the export.

Two backtest engines will never agree by default. I learned that the slow way.

I build a no-code backtester. You drag blocks (indicators, conditions, entries, exits), hit run, and get an equity curve in about 30 seconds on 10 years of minute data. The fun part was never the backtest itself. The part that ate months was the export: turning that visual strategy into Pine Script that, when you paste it into TradingView, produces the same trades my own engine did.

It almost never did at first. Same rules, same data, two completely different equity curves. If you have ever ported a strategy from one platform to another and watched the numbers fall apart, you already know the feeling.

Here is what actually causes the divergence, and how I got it under 2%.

Why two engines disagree

The naive assumption is that a strategy is a set of rules, and rules are rules. They are not. A strategy is rules plus an execution model, and the execution model is full of decisions nobody writes down.

1. Bar timing (this is the big one). When does a signal fire? On the close of the current, still-forming bar, or on the close of the last confirmed bar? If you evaluate close[0] (the current bar), your backtest looks amazing and your live results are garbage: in the backtest close[0] is the bar's final close, but in real time that bar is not closed yet and the value keeps moving until it does. This is repainting. The only honest answer is to evaluate on close[1], the previous confirmed bar. If your engine uses close[0] and your Pine code uses close[1] (or the reverse), every signal lands one bar apart and the two equity curves diverge.

2. Indicator warmup and seeding. An EMA needs a starting value. Some implementations seed it with the first price, some with an SMA of the first N bars, some with zero. RSI has the same issue with its first average gain and loss. Run the same EMA(200) through two libraries and the first few hundred bars will not match. On a 10-year backtest that tail is small, but if your entries cluster early it matters.

3. Order fill assumptions. A market order fired on a bar close: does it fill at that close, or at the next bar open? When stop loss and take profit sit inside the same candle, which one fills first? The candle only gives you O, H, L, C, so you cannot know the real path. You have to pick a rule (worst case: assume the stop hits first) and apply the exact same rule in both engines. Pick differently and your win rate shifts by several points.

4. Floating point and rounding. Prices get rounded to the tick size, and position sizes get rounded to lots. The difference is tiny per trade, but it compounds across thousands of trades.

5. Sessions and gaps. Where does a daily bar start in UTC? How do you handle the weekend gap in forex? A one-hour offset in session boundaries silently shifts every intraday signal.

None of these are bugs. They are unwritten choices, and two engines made them independently.
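To make the warmup point concrete, here is a minimal Python sketch of one EMA recurrence with two seeding rules. The function names, the two seed options, and the ramp price series are illustrative assumptions, not the code of any particular library or of TradingView. Both variants run the identical recurrence after the seed, so the gap between them shrinks by a factor of (1 - alpha) per bar: large right after warmup, negligible a few hundred bars later.

```python
def ema(prices, length, seed="first"):
    """EMA with an explicit seeding rule (both rules are illustrative).

    seed="first": start the recurrence from the first price.
    seed="sma":   start from the simple average of the first `length`
                  prices; earlier bars have no value (None).
    """
    alpha = 2.0 / (length + 1)
    if seed == "first":
        out = [prices[0]]
        start = 1
    elif seed == "sma":
        if len(prices) < length:
            raise ValueError("need at least `length` prices for an SMA seed")
        out = [None] * (length - 1) + [sum(prices[:length]) / length]
        start = length
    else:
        raise ValueError(f"unknown seed rule: {seed!r}")
    value = out[-1]
    for price in prices[start:]:
        # Same recurrence for both seeds; only the starting value differs.
        value = alpha * price + (1 - alpha) * value
        out.append(value)
    return out


def seed_gap(prices, length, bar):
    """Absolute difference between the two seedings at `bar`.

    `bar` must be >= length - 1, the first bar where both have a value.
    """
    first_seeded = ema(prices, length, "first")[bar]
    sma_seeded = ema(prices, length, "sma")[bar]
    return abs(first_seeded - sma_seeded)


# Made-up data: a simple upward ramp of 300 bars.
prices = [100.0 + i for i in range(300)]
for bar in (19, 50, 100, 250):
    print(bar, seed_gap(prices, 20, bar))
```

If most entries fall inside the warmup tail, the gap is large enough to flip a threshold comparison, which is one way the same rules end up producing different trades.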

The fix: stop trying to match Pine, and define one model both can express

My first instinct was to make my Python engine reproduce TradingView exactly. Wrong direction. Pine is a black box that changes, and you cannot unit test someone else's cloud.

What worked was the opposite. I defined a single execution model that is the lowest common denominator both engines can express without ambiguity, then forced both sides to obey it:

  • Signals evaluate on confirmed bars only. close[1], never close[0]. The engine enforces this at the system level, so a strategy literally cannot reference the forming bar. Repainting stops being a discipline problem and becomes impossible.
  • Market orders fill at the next bar open. No "fill at this close" shortcut.
  • Same-candle stop and target resolve worst case first.
  • One indicator implementation, with the warmup recurrence written to match Pine's documented behavior (not a generic library default).
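The fill rules in this list are small enough to sketch in Python. This is a minimal illustration under stated assumptions, not the engine itself: it handles long positions only, `Bar`, `resolve_long_exit`, and `simulate_long` are hypothetical names, `entry_signals[i]` is assumed to have been computed from confirmed bars up to and including bar `i`, and stops fill at the stop price even when a bar gaps through it (itself one more choice both engines would have to share).

```python
from dataclasses import dataclass
from typing import List, Optional, Tuple


@dataclass(frozen=True)
class Bar:
    open: float
    high: float
    low: float
    close: float


def resolve_long_exit(bar: Bar, stop: float, target: float) -> Optional[Tuple[str, float]]:
    """Decide how a long position exits inside one candle.

    OHLC data does not reveal the intrabar path, so when both levels are
    touched the worst case wins: the stop is assumed to fill first.
    """
    if bar.low <= stop:
        return ("stop", stop)      # also covers the "both touched" case
    if bar.high >= target:
        return ("target", target)
    return None


def simulate_long(bars: List[Bar], entry_signals: List[bool],
                  stop_pct: float, target_pct: float) -> List[Tuple[str, float, float]]:
    """Return (exit_reason, entry_price, exit_price) for each closed trade."""
    trades = []
    position = None  # (entry_price, stop, target) while a trade is open
    for i in range(1, len(bars)):
        bar = bars[i]
        # A signal from the previous, confirmed bar fills at this bar's open.
        if position is None and entry_signals[i - 1]:
            entry = bar.open
            position = (entry, entry * (1 - stop_pct), entry * (1 + target_pct))
        if position is not None:
            entry, stop, target = position
            exit_ = resolve_long_exit(bar, stop, target)
            if exit_ is not None:
                reason, price = exit_
                trades.append((reason, entry, price))
                position = None
    return trades


# Made-up bars: the second candle touches both the stop and the target.
bars = [Bar(100, 101, 99, 100), Bar(100, 106, 94, 102), Bar(102, 103, 101, 102)]
print(simulate_long(bars, [True, False, False], stop_pct=0.05, target_pct=0.05))
```

Note that the entry bar itself is checked for exits here. Whether a trade can stop out on the candle it was opened on is exactly the kind of decision that has to be written down once and mirrored in the exported Pine.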

Once the model is pinned down, the export becomes a compiler problem instead of a guessing game.

Deterministic codegen, not string templating

Each visual block maps to a small, pure Pine fragment. An RSI block is the same Pine every time. A "crosses below" condition is the same Pine every time. Strategy export is just composing those fragments in topological order and wiring the inputs. No branching on "what did the user probably mean," because the model already removed the ambiguity.
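As a rough sketch of what composing fragments in topological order can look like, the Python below uses the standard library's `graphlib` (Python 3.9+). Everything specific in it is an assumption made for illustration: the three block types, the dictionary layout of a strategy, and the variable naming are hypothetical, and the emitted lines use Pine v5 built-ins (`ta.rsi`, `ta.crossunder`) without the confirmed-bar offset or the strategy boilerplate a real export would need.

```python
from graphlib import TopologicalSorter

# Hypothetical block library: each block type maps to exactly one fragment.
FRAGMENTS = {
    "rsi": "{out} = ta.rsi(close, {length})",
    "const": "{out} = {value}",
    "crosses_below": "{out} = ta.crossunder({a}, {b})",
}


def export_pine(blocks):
    """Emit one Pine line per block, dependencies first.

    blocks: {block_id: {"type": str, "params": {...}, "inputs": {slot: block_id}}}
    Block ids double as Pine variable names in this sketch.
    """
    # Sort ids and predecessors so the output does not depend on dict or
    # set iteration order: same strategy in, same text out.
    graph = {
        bid: sorted(blocks[bid].get("inputs", {}).values())
        for bid in sorted(blocks)
    }
    lines = []
    # static_order() raises graphlib.CycleError if the wiring has a cycle.
    for bid in TopologicalSorter(graph).static_order():
        block = blocks[bid]
        fields = dict(block.get("params", {}))
        fields.update(block.get("inputs", {}))  # slots resolve to upstream names
        fields["out"] = bid
        lines.append(FRAGMENTS[block["type"]].format(**fields))
    return "\n".join(lines)


# Made-up strategy, deliberately listed out of dependency order.
strategy = {
    "sell_signal": {"type": "crosses_below",
                    "inputs": {"a": "rsi_14", "b": "level_70"}},
    "rsi_14": {"type": "rsi", "params": {"length": 14}},
    "level_70": {"type": "const", "params": {"value": 70}},
}
print(export_pine(strategy))
```

Because each fragment is a fixed template and the ordering is fully determined by the graph, two exports of the same strategy are byte-identical, which is what lets a test harness diff or replay them.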

That determinism is what makes the result testable, which leads to the only part that actually guarantees anything.

The parity harness is the whole product

A "2% divergence guarantee" means nothing if you measure it once by hand. So the real work was a test harness:

  1. Generate a batch of strategies covering the block library (trend, mean reversion, SMC patterns, multi-condition).
  2. Run each through my engine on a fixed dataset.
  3. Export each to Pine, run it on the same symbol and timeframe in TradingView, pull the results.
  4. Assert that trade count, win rate, and final equity diverge by less than 2%. If any strategy breaks the threshold, the build fails.
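The assertion in step 4 could be sketched as below. This is an assumption-laden illustration: the post does not say whether the 2% is relative or absolute, so the sketch measures relative divergence against the in-house engine's value, and `RunStats` and `parity_failures` are hypothetical names. How the TradingView results are collected is out of scope here; they are assumed to arrive as plain numbers.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RunStats:
    trade_count: int
    win_rate: float      # fraction between 0 and 1
    final_equity: float


def relative_divergence(reference, other):
    """|other - reference| / |reference|, with a guard for a zero reference."""
    if reference == 0:
        return 0.0 if other == 0 else float("inf")
    return abs(other - reference) / abs(reference)


def parity_failures(engine, pine, threshold=0.02):
    """Return {metric: divergence} for every metric at or above the threshold."""
    failures = {}
    for name in ("trade_count", "win_rate", "final_equity"):
        divergence = relative_divergence(getattr(engine, name), getattr(pine, name))
        if divergence >= threshold:
            failures[name] = divergence
    return failures


# Made-up numbers for one strategy on one dataset.
engine_run = RunStats(trade_count=200, win_rate=0.55, final_equity=15000.0)
pine_run = RunStats(trade_count=203, win_rate=0.551, final_equity=15120.0)

failures = parity_failures(engine_run, pine_run)
# In CI, a non-empty result would fail the build.
assert not failures, f"parity broken: {failures}"
```

Returning the failing metrics instead of a bare boolean matters in practice: knowing that trade count diverged while win rate held is usually the first clue about which execution assumption is still unpinned.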

Most of the early failures were not in the codegen. They were in the model assumptions above. Every time a strategy blew past 2%, it pointed at one more unwritten decision I had not pinned down. The harness was less a test and more a way to find the assumptions I did not know I was making.

What I would tell anyone building cross-engine anything

  • The rules are the easy part. The execution model is the product.
  • Repainting is not a feature you add. It is a default you have to actively forbid. Enforce close[1] at the system level or your users will footgun themselves.
  • Do not chase parity by reverse engineering the other engine. Define a model both can express and make both obey it.
  • If you claim a number, gate it in CI. A guarantee you cannot reproduce automatically is marketing, not engineering.

I am building this as a visual tool so traders who cannot code can still get an honest backtest and walk out with clean Pine Script. If you want to see the export side of it, it is in the visual backtester I work on. But the parity lessons above apply whether you use it or roll your own.

If you have shipped cross-platform strategy export and solved the same-candle stop/target problem differently, I would genuinely like to hear how. That one still keeps me up.
