Đỗ Hiệp

Posted on Apr 18 • Originally published at sleepyquant.rest

The Inverted Control: What 24 Hours of Running Our Own Bot Backwards Revealed

#ai #quant #mlx #buildinpublic

Executive Summary

After roughly 500 paper round-trips showed a persistent sub-35% win rate with average losses larger than average wins, we stopped scaling the live side and ran a cheap experiment: a second paper book that executes the exact opposite of every signal the bot produces, on the same universe, same cadence, same fee model.

Twenty-four hours in, the inverted book is winning 70.59% of round-trips versus 15.79% on the standard book. Both books are still losing in absolute terms because fees dominate at small sample. The important number is not the win rate gap. It is whether the inverted book's gross edge clears the fee floor by the time we hit the 100-round-trip decision point, roughly 8 to 12 days out.

This post walks through the setup, the data so far, where the reading could be wrong, and the specific decision that happens at 100 round-trips.

The Thesis

A bot that loses more than random is either extracting no signal, or extracting signal with the sign reversed. Those two hypotheses produce identical win-rate readings in a one-book world. They are only separable by running a second book with the signal flipped.

The second hypothesis is rarer but well-documented: overfit features trained on stale microstructure, labels that got reversed in a pipeline step, crowding where yesterday's "bullish" marker is now a faded trade. None of those are visible from inside a single losing book. All of them flip sign when you flip the signal.

Running the inverted control is the lowest-cost diagnostic that distinguishes the two hypotheses. In the first hypothesis (no signal), the inverted book converges to the same losing distribution, minus fee drag. In the second hypothesis (inverted signal), the inverted book diverges: higher win rate, smaller loss magnitude, possibly net-positive once sample grows past fee-drag territory.

The point of running the control is not to find a winning strategy. It is to stop guessing about which of those two worlds the bot is actually in.

The Setup

Two paper books, same engine, same universe, same fee schedule.

Book 1 — standard signal. Every decision from the scanner is executed as issued. LONG is LONG, BUY is BUY.
Book 2 — inverted mirror. Every decision is flipped programmatically before execution. LONG becomes SHORT, BUY becomes SELL (or hold, since the spot lane is accumulate-only during this window, making the flip mostly a futures test).

Both books start from identical simulated ~$1000 balances. Both pay realistic exchange-tier fees on open and close — no free-trade assumption, which is where most inversion backtests fail.

Universe: 30 USDT pairs on a major exchange, perps plus spot. Scan cadence 15 minutes. Leverage cap 3x. Drawdown hard stop 8% per book. Spot exit signals ignored in Book 2 for this window — the test isolates the futures direction bet.
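The Book 2 flip can be sketched in a few lines. This is a minimal illustration, not the engine's actual code: `Decision`, `route`, and `FUTURES_FLIP` are hypothetical names, and mapping every Book 2 spot signal to HOLD is our reading of the accumulate-only rule for this window.

```python
from dataclasses import dataclass, replace

# Hypothetical names for illustration; the real engine's types are not shown in this post.
FUTURES_FLIP = {"LONG": "SHORT", "SHORT": "LONG"}


@dataclass(frozen=True)
class Decision:
    symbol: str
    lane: str          # "futures" or "spot"
    side: str          # "LONG"/"SHORT" for futures, "BUY"/"SELL" for spot
    book_id: int = 1   # which paper book the order belongs to


def route(decision: Decision, book_id: int) -> Decision:
    """Return the order a given book should execute for one scanner decision."""
    if book_id == 1:
        # Book 1 executes the signal exactly as issued.
        return replace(decision, book_id=1)
    if decision.lane == "futures":
        # Book 2 mirrors the futures direction.
        return replace(decision, side=FUTURES_FLIP[decision.side], book_id=2)
    # Assumption: with the spot lane accumulate-only and spot exits ignored,
    # a flipped spot signal in Book 2 is treated as HOLD.
    return replace(decision, side="HOLD", book_id=2)


signal = Decision("ZEC/USDT", "futures", "LONG")
print(route(signal, 1).side, route(signal, 2).side)
```

Keeping the flip in one routing function means both books see the same scanner output and differ only at execution, which is what makes Book 2 a control rather than a second strategy.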

The test completes at 100 post-flip round-trips on Book 2. At that point one of three decisions is on the table.

Deep Dive: 24 Hours of Parallel Data

Windowed to the period since the flip went live:

Book 1 — standard. 38 round-trips closed. Win rate 15.79%. Net result negative on the order of tens of USD.
Book 2 — inverted. 17 round-trips closed. Win rate 70.59%. Net result also negative, but by a much smaller per-round-trip magnitude (roughly 25x better than standard).

The win-rate gap from 15.79% to 70.59% is the headline. At this sample it is suggestive rather than conclusive, but the shape is informative. A signal carrying no information (the first hypothesis) would produce roughly symmetric win rates on both books, clustering around 45-55%. What shows up instead, an asymmetric split heavily favoring the inverse, is the fingerprint of a signal that carries information with the wrong sign.

Per-symbol, the inversion's effect is not uniform:

Symbol	Book 1 win rate (round-trips)	Book 2 win rate (round-trips)	Effect of inversion
ZEC/USDT	12.5% (8 RT)	80.0% (5 RT)	Inversion strongly helps
ARB/USDT	25.0% (4 RT)	100% (3 RT)	Inversion helps
DOGE/USDT	0.0% (5 RT)	100% (2 RT)	Inversion helps
UNI/USDT	0.0% (4 RT)	100% (1 RT)	Inversion helps (micro sample)
BCH/USDT	0.0% (1 RT)	100% (1 RT)	Inversion helps (micro sample)
NEAR/USDT	28.6% (7 RT)	0.0% (2 RT)	Inversion hurts
ADA/USDT	50.0% (4 RT)	33.3% (3 RT)	Inversion hurts

Five of seven symbols with both-book data favor inversion. Two do not. The symbols where inversion fails are the ones where the standard book was already near or above 30%, which is consistent with an "invert only what's clearly broken, leave the rest" hybrid strategy that may emerge at higher sample.
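The per-symbol tally can be reproduced from the table. In this sketch the win counts are back-derived from the reported percentages and round-trip counts (for example, 12.5% of 8 round-trips is 1 win), so treat them as an assumption rather than raw trade data.

```python
# (symbol, Book 1 wins, Book 1 round-trips, Book 2 wins, Book 2 round-trips)
# Win counts are inferred from the table's percentages, not taken from trade logs.
rows = [
    ("ZEC/USDT", 1, 8, 4, 5),
    ("ARB/USDT", 1, 4, 3, 3),
    ("DOGE/USDT", 0, 5, 2, 2),
    ("UNI/USDT", 0, 4, 1, 1),
    ("BCH/USDT", 0, 1, 1, 1),
    ("NEAR/USDT", 2, 7, 0, 2),
    ("ADA/USDT", 2, 4, 1, 3),
]


def win_rate(wins: int, trips: int) -> float:
    return wins / trips


# A symbol "favors inversion" when Book 2's win rate beats Book 1's.
helps = [s for s, w1, n1, w2, n2 in rows if win_rate(w2, n2) > win_rate(w1, n1)]
hurts = [s for s, w1, n1, w2, n2 in rows if win_rate(w2, n2) <= win_rate(w1, n1)]

book2_wins = sum(r[3] for r in rows)
book2_trips = sum(r[4] for r in rows)

print(f"{len(helps)} of {len(rows)} symbols favor inversion; hurts: {hurts}")
print(f"Book 2 aggregate: {book2_wins}/{book2_trips} = {book2_wins / book2_trips:.2%}")
```

The Book 2 rows sum to 12 wins in 17 round-trips, which matches the 70.59% headline. The Book 1 rows cover 33 of its 38 round-trips; the remaining five were on symbols Book 2 has not traded yet.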

The Fee Floor

Every round-trip pair costs roughly the open-plus-close fee on a major exchange, applied to both books independently. With Book 2 running in parallel, fees double.

That doubles the bar. Book 2's improvement in gross profit-and-loss has to clear two fee stacks, not one. An inversion signal that wins on gross but gets eaten by the fee floor is a classic mean-reversion trap: backtests ignoring fees look clean, live books ignoring fees bleed out.

At 17 round-trips, Book 2's net-negative result is dominated by fee drag, not by losses on individual trades. The interesting question is whether that fee drag, as a percentage of gross result, shrinks as sample grows. If the gross per-round-trip edge holds at roughly current magnitude, net-positive becomes plausible around round-trip 50-70. If the gross edge compresses as the signal gets noisier at larger sample, net-positive never arrives.
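One way to track that question is a fee-drag ratio: total fees divided by the absolute gross result. The sketch below assumes a hypothetical `RoundTrip` record with separate open and close fees; the sample numbers are made up to show the shape of the calculation and are not from either book.

```python
from typing import Iterable, NamedTuple


class RoundTrip(NamedTuple):
    gross_pnl: float   # profit or loss before fees, in USD
    fee_open: float    # fee paid on entry
    fee_close: float   # fee paid on exit


def fee_drag(trips: Iterable[RoundTrip]) -> dict:
    """Summarise gross, fees, net, and fees as a fraction of |gross|."""
    trips = list(trips)
    gross = sum(t.gross_pnl for t in trips)
    fees = sum(t.fee_open + t.fee_close for t in trips)
    net = gross - fees
    # Ratio is undefined when gross is exactly zero; report infinity in that case.
    ratio = fees / abs(gross) if gross != 0 else float("inf")
    return {"gross": gross, "fees": fees, "net": net, "fee_drag_ratio": ratio}


# Made-up numbers, for shape only.
sample = [RoundTrip(4.0, 0.5, 0.5), RoundTrip(-1.0, 0.5, 0.5)]
print(fee_drag(sample))
```

A ratio above 1 with positive gross means fees are larger than the gross edge, which is the "wins on gross, loses on net" case. A ratio that stays below 1 as round-trips accumulate is what a real edge would look like.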

Counter-Argument: Why This Reading Could Be Wrong

Taking the opposite side of our own preliminary conclusion:

Sample is too small. Seventeen round-trips on Book 2 is the sample size a drunk person at a blackjack table has after twenty minutes. Win-rate distributions at n=17 are wide enough that a 70.59% result can reverse to 35% over the next 30 trips without surprising anyone. Any reading here is provisional.

Recent regime shift. The standard book's historical 34% win rate was compiled over weeks. The 15.79% since the flip is over 24 hours. A regime change (one market day of trend-heavy action on symbols the scanner dislikes, for example) could compress the standard book's rate artificially without the underlying signal being any more broken than it was a week ago. That would make the inversion's apparent edge a mirage of timing.

Asymmetric fee burn. Book 2's inverted futures positions may open and close in ways that pay funding rate differently than Book 1's. If the test period coincides with a funding regime that favors one side, some of the apparent gross edge is just "Book 2 happened to be on the right side of funding this week."

The symbols where inversion fails are the ones we actually trade most. The test might reveal that inversion works on low-activity symbols that produce little volume, while the symbols driving Book 1's meaningful losses (higher-sample names like BTC, ETH, SOL, which Book 2 has not yet traded in this window) are not in the inverted-signal camp. A strategy that only works on low-volume names is not a strategy worth running.

The signal might be improving organically. Book 1's live standard-signal win rate (across all history, not just this window) has been creeping toward 34% from the 27% it hit in the worst stretch earlier in April. If the signal is already self-correcting, the inversion's apparent edge evaporates before the test window closes.

Any one of those could be what is actually going on. We are not going to know until the sample grows.
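To put a rough number on the sample-size objection, a Wilson score interval shows how wide the plausible win-rate range is at these counts. This is a sketch under two assumptions: the win counts (6 of 38 and 12 of 17) are back-derived from the reported percentages, and round-trips are treated as independent, which correlated crypto positions probably are not.

```python
import math


def wilson_interval(wins: int, n: int, z: float = 1.96) -> tuple[float, float]:
    """Wilson score interval for a binomial proportion (z=1.96 is about 95%)."""
    if n <= 0:
        raise ValueError("n must be positive")
    p = wins / n
    denom = 1 + z * z / n
    center = (p + z * z / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n)) / denom
    return center - half, center + half


# Win counts inferred from the post's percentages: 6/38 = 15.79%, 12/17 = 70.59%.
for label, wins, n in [("Book 1", 6, 38), ("Book 2", 12, 17)]:
    lo, hi = wilson_interval(wins, n)
    print(f"{label}: {wins}/{n} = {wins / n:.2%}, interval {lo:.1%} to {hi:.1%}")
```

At n=17 the Book 2 interval spans roughly 47% to 87%, so the true win rate could plausibly sit anywhere from coin-flip territory to very strong. Because the independence assumption is optimistic, the real uncertainty is likely wider than this.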

The Verdict

The decision point is 100 round-trips on Book 2, expected 8 to 12 days out.

If Book 2 lands net-positive with win rate above 55%: the inversion locks in. The live signal gets flipped permanently, along with the take-profit and stop-loss asymmetry (swap from 3% TP / -2% SL to 2% TP / -3% SL to match the inverted payoff shape). Live trading remains paused until the paper side clears a 30-day rolling benchmark of Binance Simple Earn at roughly 0.42% per month — the honest passive bar.

If Book 2 lands net-negative or drawdown exceeds 8%: the futures lane is disabled entirely. Spot accumulation remains. The diagnosis shifts from "inverted signal" to "no signal," and the rebuild restarts on features, not direction.

If Book 2 lands mixed — gross positive but net-negative, or win rate high but below 55%: the hybrid path becomes the next experiment. Invert only the symbols where Book 1's rolling win rate sits below 40%. Leave the ones above 40% standard. Re-run the control on that subset.
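The three branches can be written down as a small decision function. This is one reading of the rules above, not the production config: `verdict` and `symbols_to_invert` are hypothetical names, and because "net-negative" appears in both the second and third branches, the sketch assumes gross-positive-but-net-negative goes to the hybrid path while gross-negative disables the lane.

```python
def verdict(gross_pnl: float, net_pnl: float, win_rate: float, max_drawdown: float) -> str:
    """Map Book 2's state at 100 round-trips to one of the three outcomes."""
    if max_drawdown > 0.08:
        return "disable_futures"       # hard stop breached
    if net_pnl > 0 and win_rate > 0.55:
        return "lock_in_inversion"
    if gross_pnl > 0:
        # Mixed: gross positive but net-negative, or net-positive with win rate at or below 55%.
        return "hybrid"
    return "disable_futures"           # no gross edge at all


def symbols_to_invert(book1_win_rates: dict[str, float], threshold: float = 0.40) -> list[str]:
    """Hybrid path: invert only symbols whose Book 1 rolling win rate is below the threshold."""
    return [s for s, wr in book1_win_rates.items() if wr < threshold]


# Book 1 win rates from the 24-hour table, used here as a stand-in for rolling rates.
table = {"ZEC/USDT": 0.125, "ARB/USDT": 0.25, "DOGE/USDT": 0.0, "UNI/USDT": 0.0,
         "BCH/USDT": 0.0, "NEAR/USDT": 0.286, "ADA/USDT": 0.50}
print(verdict(gross_pnl=5.0, net_pnl=-1.0, win_rate=0.70, max_drawdown=0.03))
print(symbols_to_invert(table))
```

On the 24-hour numbers, a 40% threshold would invert every listed symbol except ADA/USDT, including NEAR/USDT, where inversion hurt in this window. That is a reason to re-run the control on the subset rather than trust the threshold alone.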

What the reader should take from this

If you are running a paper book that loses more than random: run the inverted control before killing the strategy. The setup is one column in the trades table (book_id) and one branch in the execute function. Cost is near zero, answer is binary, information is much larger than the cost.

If you are watching SleepyQuant for the outcome: the result arrives at 100 round-trips. We publish either an "inversion locks in, here is the updated config" or a "futures lane disabled, here is why", whichever the numbers say, not whichever is more flattering.

If you are here for the general lesson: a losing signal is not automatically noise. Sometimes it is a working signal with the sign reversed. The diagnostic is cheap. The implication — that your model has been right about structure and wrong about direction — is unusual enough that most builders never check. The check itself is worth more than the result.

Follow the experiment

We publish one email per week with the round-trip count, the current win rates on both books, the fee-drag ratio, and whatever the honest read is at that point. No trading advice, no signals, no "buy at X." Just the numbers and what we are and are not willing to conclude from them.

Subscribe at sleepyquant.rest → the verdict lands in your inbox.
