Originally published at chudi.dev
A strategy that works on average might not work in all market conditions. Fixed position sizing ignores this; a self-tuner adapts.
This post covers the architecture of a self-tuning position sizing system for a prediction market bot: how it reads its own trade history, computes a performance score, and translates that score into a bet size multiplier — without overfitting to variance or blowing up during quiet periods.
TL;DR
- The tuner reads recent trade outcomes from SQLite and computes a performance score
- Performance score drives a multiplier (0.5x to 1.5x) applied to base bet size
- Lookback window should match your strategy's mean reversion speed — not be as long as possible
- Hard clamps prevent the tuner from ever sizing at 0% or above a safe ceiling
- A circuit breaker (separate from tuning) provides a hard stop on consecutive losses
Why Self-Tune?
A fixed position size of $15 treats a period where your strategy is firing at 75% win rate the same as a period where it is firing at 45% win rate.
The insight behind self-tuning: recent performance is predictive of near-future performance for some strategy types. When the strategy is aligned with current market conditions (momentum strategies during trending markets), it performs above baseline. When conditions shift, performance degrades before you manually notice.
If you can detect that degradation early and reduce size, you protect capital. If you detect outperformance and increase size, you compound gains faster.
The risk: reacting to variance as if it were a regime change. After any 10-trade sample, even a 65% win rate strategy will sometimes have 4-6 consecutive losses purely by chance. A tuner that reduces size aggressively on 6 losses in a row is chasing noise.
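To put a number on that risk, here is a standalone Monte Carlo sketch (not part of the bot; function names are mine) estimating how often a genuinely profitable strategy produces a scary loss streak by chance alone:

```python
import random

def longest_loss_streak(outcomes: list) -> int:
    """Length of the longest run of losses (False) in a trade sequence."""
    longest = current = 0
    for won in outcomes:
        current = 0 if won else current + 1
        longest = max(longest, current)
    return longest

def p_streak(win_rate: float, n_trades: int, streak_len: int,
             sims: int = 20_000, seed: int = 42) -> float:
    """Monte Carlo estimate of P(>= streak_len consecutive losses in n_trades)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        outcomes = [rng.random() < win_rate for _ in range(n_trades)]
        if longest_loss_streak(outcomes) >= streak_len:
            hits += 1
    return hits / sims

# Even a genuine 65% strategy hits 4-loss streaks by pure chance; the
# probability grows with the window, so longer lookbacks see more of them.
print(f"10 trades: {p_streak(0.65, 10, 4):.1%}")
print(f"30 trades: {p_streak(0.65, 30, 4):.1%}")
```

A tuner that cuts size on every 4-loss run is reacting to exactly this noise.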
Architecture Overview
```
TradeDB (SQLite)
      |
      v
PerformanceReader (queries recent N trades)
      |
      v
ScoreComputer (win rate, rolling Sharpe, or custom metric)
      |
      v
MultiplierMapper (score → bet multiplier, with clamps)
      |
      v
SizingOutput (base_bet × multiplier → final bet)
```
Each component is stateless except TradeDB. The tuner runs before each trade decision and recomputes fresh.
TradeDB: Persisting Outcomes
The tuner needs trade history. SQLite is sufficient for single-instance bots.
```python
import sqlite3
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TradeRecord:
    trade_id: str
    timestamp: float
    entry_price: float
    size_usdc: float
    pnl: float  # positive = profit, negative = loss
    resolved: bool


class TradeDB:
    def __init__(self, db_path: str):
        self._conn = sqlite3.connect(db_path, check_same_thread=False)
        self._create_table()

    def _create_table(self):
        self._conn.execute("""
            CREATE TABLE IF NOT EXISTS trades (
                trade_id TEXT PRIMARY KEY,
                timestamp REAL NOT NULL,
                entry_price REAL NOT NULL,
                size_usdc REAL NOT NULL,
                pnl REAL,
                resolved INTEGER DEFAULT 0
            )
        """)
        self._conn.commit()

    def record_trade(self, record: TradeRecord):
        self._conn.execute("""
            INSERT OR REPLACE INTO trades
            (trade_id, timestamp, entry_price, size_usdc, pnl, resolved)
            VALUES (?, ?, ?, ?, ?, ?)
        """, (
            record.trade_id, record.timestamp,
            record.entry_price, record.size_usdc,
            record.pnl, int(record.resolved)
        ))
        self._conn.commit()

    def get_recent_resolved(
        self, n: int, max_age_secs: Optional[float] = None
    ) -> List[TradeRecord]:
        query = """
            SELECT trade_id, timestamp, entry_price, size_usdc, pnl, resolved
            FROM trades
            WHERE resolved = 1
        """
        params: list = []
        if max_age_secs is not None:
            cutoff = time.time() - max_age_secs
            query += " AND timestamp >= ?"
            params.append(cutoff)
        query += " ORDER BY timestamp DESC LIMIT ?"
        params.append(n)
        rows = self._conn.execute(query, params).fetchall()
        return [TradeRecord(*row[:5], bool(row[5])) for row in rows]
```
The critical decision here is WHERE resolved = 1. Unresolved trades have no outcome yet and cannot inform performance scoring. Including open positions in win rate calculations produces garbage scores.
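A quick standalone demonstration of that filter, using raw sqlite3 against an in-memory database with the same schema as above (trade IDs here are made up):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE trades (
        trade_id TEXT PRIMARY KEY, timestamp REAL NOT NULL,
        entry_price REAL NOT NULL, size_usdc REAL NOT NULL,
        pnl REAL, resolved INTEGER DEFAULT 0
    )
""")
now = time.time()
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?, ?, ?, ?)",
    [("t1", now - 60, 0.52, 15.0, 3.1, 1),    # resolved win
     ("t2", now - 30, 0.48, 15.0, -15.0, 1),  # resolved loss
     ("t3", now,      0.50, 15.0, None, 0)],  # still open -> excluded
)
rows = conn.execute(
    "SELECT trade_id FROM trades WHERE resolved = 1 "
    "ORDER BY timestamp DESC LIMIT ?",
    (10,),
).fetchall()
print([r[0] for r in rows])  # -> ['t2', 't1']  (open trade t3 never appears)
```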
PerformanceReader: Computing a Score
The simplest score: win rate over the last N resolved trades.
```python
from dataclasses import dataclass
from typing import List


@dataclass
class PerformanceScore:
    n_trades: int
    win_rate: float
    avg_pnl: float
    confidence: str  # NO_DATA / ANECDOTAL / LOW / MODERATE


class PerformanceReader:
    LOOKBACK_TRADES = 30
    ANECDOTAL_THRESHOLD = 10
    LOW_CONFIDENCE_THRESHOLD = 30

    def __init__(self, db: TradeDB):
        self._db = db

    def compute(self) -> PerformanceScore:
        trades = self._db.get_recent_resolved(self.LOOKBACK_TRADES)
        if not trades:
            return PerformanceScore(0, 0.5, 0.0, "NO_DATA")
        n = len(trades)
        wins = sum(1 for t in trades if t.pnl > 0)
        win_rate = wins / n
        avg_pnl = sum(t.pnl for t in trades) / n
        if n < self.ANECDOTAL_THRESHOLD:
            confidence = "ANECDOTAL"
        elif n < self.LOW_CONFIDENCE_THRESHOLD:
            confidence = "LOW"
        else:
            confidence = "MODERATE"
        return PerformanceScore(n, win_rate, avg_pnl, confidence)
```
The confidence field matters. A tuner operating on 5 trades is reacting to pure noise. The multiplier mapping should discount heavily for ANECDOTAL confidence.
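The three-bucket label is deliberately crude. A more principled alternative (my suggestion, not part of the post's system) is to score off the lower bound of a Wilson score interval, which automatically discounts small samples:

```python
import math

def wilson_lower_bound(wins: int, n: int, z: float = 1.96) -> float:
    """Lower bound of the 95% Wilson score interval for a win rate.

    A conservative win-rate estimate that shrinks toward 0 at small n,
    so sample size and performance collapse into one number.
    """
    if n == 0:
        return 0.0
    p = wins / n
    denom = 1 + z * z / n
    center = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center - margin) / denom

# Same 60% observed win rate, very different evidence:
print(f"{wilson_lower_bound(3, 5):.2f}")    # 5-trade sample: bound far below 0.60
print(f"{wilson_lower_bound(30, 50):.2f}")  # 50-trade sample: bound much closer
```

Feeding the lower bound (instead of the raw win rate) into the multiplier mapper makes the ANECDOTAL discount emerge naturally rather than by special-casing.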
Rolling Sharpe: A More Stable Alternative
Win rate ignores the size of wins and losses. Rolling Sharpe accounts for both:
```python
import statistics


def compute_rolling_sharpe(
    trades: List[TradeRecord], risk_free: float = 0.0
) -> float:
    if len(trades) < 3:
        return 0.0  # not enough data
    returns = [t.pnl / t.size_usdc for t in trades]  # per-dollar return
    mean_r = statistics.mean(returns)
    std_r = statistics.stdev(returns)
    if std_r < 1e-9:
        return 0.0  # all trades identical, no variance info
    return (mean_r - risk_free) / std_r
```
Positive Sharpe = strategy is generating return above its variance. Negative Sharpe = return doesn't justify the variance. Sharpe of 1.0+ is a strong signal; Sharpe of 0.3 is borderline.
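To make that concrete, here is the same formula applied to raw per-dollar returns (so it runs without TradeRecord objects; the return sequences are invented for illustration):

```python
import statistics

def sharpe(returns, risk_free: float = 0.0) -> float:
    # Same computation as compute_rolling_sharpe, on bare per-dollar returns.
    if len(returns) < 3:
        return 0.0
    std_r = statistics.stdev(returns)
    if std_r < 1e-9:
        return 0.0
    return (statistics.mean(returns) - risk_free) / std_r

steady = [0.04, 0.05, 0.03, 0.06, 0.04]    # small, consistent wins
choppy = [0.40, -0.35, 0.38, -0.33, 0.12]  # same mean return, huge variance

print(f"steady: {sharpe(steady):.2f}")  # high Sharpe
print(f"choppy: {sharpe(choppy):.2f}")  # near zero
```

Both sequences average 4.4 cents per dollar, but win rate alone would also miss the difference: the choppy sequence is 60% winners too. Sharpe separates them.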
MultiplierMapper: Score to Bet Size
The multiplier maps a performance score to a scaling factor. Linear interpolation between confidence-weighted bounds:
```python
class MultiplierMapper:
    # Multiplier bounds
    FLOOR = 0.5    # never go below half base bet
    CEILING = 1.5  # never go above 1.5x base bet
    NEUTRAL = 1.0  # no adjustment when performance is baseline

    # Win rate thresholds
    BASELINE_WIN_RATE = 0.55  # expected win rate for this strategy
    STRONG_WIN_RATE = 0.70    # scale up significantly
    WEAK_WIN_RATE = 0.45      # scale down significantly

    def compute_multiplier(self, score: PerformanceScore) -> float:
        # Insufficient data: default to neutral or slight reduction
        if score.confidence == "NO_DATA":
            return self.FLOOR  # no history = minimum size
        if score.confidence == "ANECDOTAL":
            return self.NEUTRAL * 0.75  # reduce but don't stop
        win_rate = score.win_rate
        if win_rate >= self.STRONG_WIN_RATE:
            raw = self.CEILING
        elif win_rate <= self.WEAK_WIN_RATE:
            raw = self.FLOOR
        else:
            # Linear interpolation between floor and ceiling
            t = (win_rate - self.WEAK_WIN_RATE) / (
                self.STRONG_WIN_RATE - self.WEAK_WIN_RATE
            )
            raw = self.FLOOR + t * (self.CEILING - self.FLOOR)
        # Hard clamp regardless of edge cases
        return max(self.FLOOR, min(self.CEILING, raw))
```
The hard clamp at the end is not redundant. Floating-point edge cases or corrupted trade records can produce extreme scores. The clamp ensures the tuner never outputs a multiplier that could cause catastrophic position sizing.
Putting It Together: SelfTuner
```python
class SelfTuner:
    def __init__(self, db: TradeDB, base_bet_usdc: float):
        self._db = db
        self._base_bet = base_bet_usdc
        self._reader = PerformanceReader(db)
        self._mapper = MultiplierMapper()

    def get_bet_size(self) -> float:
        score = self._reader.compute()
        multiplier = self._mapper.compute_multiplier(score)
        sized = self._base_bet * multiplier
        # Log for auditing
        print(
            f"[SelfTuner] n={score.n_trades} wr={score.win_rate:.2f} "
            f"conf={score.confidence} mult={multiplier:.2f} "
            f"bet=${sized:.2f}"
        )
        return round(sized, 2)
```
Usage in the signal handler:
```python
async def on_signal(direction: Direction):
    bet_size = tuner.get_bet_size()
    await executor.execute(direction, size_usdc=bet_size)
```
The tuner runs fresh before each trade. It does not cache its result across the session — market conditions and performance data change throughout the day.
The Circuit Breaker: Hard Stop vs. Soft Tuning
The self-tuner adjusts gradually. The circuit breaker stops the bot entirely.
They are separate systems and serve different purposes:
- Self-tuner: adjusts to gradual regime changes. Soft. Continuous adjustment.
- Circuit breaker: prevents catastrophic loss from N consecutive losses. Hard. Binary stop.
```python
import json
import os


class CircuitBreaker:
    def __init__(self, state_path: str, max_consecutive: int = 4):
        self._path = state_path
        self._max = max_consecutive

    def _load(self) -> dict:
        if not os.path.exists(self._path):
            return {"consecutive_losses": 0}
        with open(self._path) as f:
            return json.load(f)

    def _save(self, state: dict):
        with open(self._path, "w") as f:
            json.dump(state, f)

    def record_outcome(self, win: bool):
        state = self._load()
        if win:
            state["consecutive_losses"] = 0
        else:
            state["consecutive_losses"] += 1
        self._save(state)

    def is_tripped(self) -> bool:
        state = self._load()
        return state.get("consecutive_losses", 0) >= self._max

    def reset(self):
        self._save({"consecutive_losses": 0})
```
The circuit breaker persists to disk. If the bot restarts after N consecutive losses, it remains tripped. You must manually reset it after investigating the losses.
The Lookback Window Problem
The hardest design decision: how many trades to look back?
Too short (5-10 trades): pure variance. Even a genuine 65% win rate strategy will show a run of 4+ consecutive losses in a 10-trade window a non-trivial fraction of the time (on the order of 7% per window, and roughly 25% over 30 trades) by pure chance. A 10-trade tuner reacts to this as a regime change and cuts size — then misses the recovery.
Too long (50-100 trades): too slow. If your strategy genuinely degrades (market conditions shift, new market makers enter, fees change), a 100-trade lookback takes weeks to detect the change and adjust.
The calibration approach: measure how quickly your strategy's performance autocorrelates. If today's win rate predicts tomorrow's win rate at r=0.7 over 5-trade windows, use 5-trade windows. If r=0.2 (essentially no autocorrelation at short windows), you need a longer window to find the signal.
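A minimal sketch of that calibration, assuming you measure the lag-1 correlation between win rates of consecutive non-overlapping windows (helper names are mine; the outcome sequence is a contrived hot/cold example):

```python
import statistics

def window_win_rates(outcomes, window: int):
    """Win rate per non-overlapping window of trade outcomes (True = win)."""
    return [
        sum(outcomes[i:i + window]) / window
        for i in range(0, len(outcomes) - window + 1, window)
    ]

def lag1_autocorr(series):
    """Pearson correlation between each window's win rate and the next's."""
    if len(series) < 3:
        return 0.0
    x, y = series[:-1], series[1:]
    sx, sy = statistics.stdev(x), statistics.stdev(y)
    if sx < 1e-9 or sy < 1e-9:
        return 0.0  # constant series carries no regime signal
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (sx * sy)

# Long hot/cold regimes: adjacent 10-trade windows strongly predict each other,
# so a ~10-trade lookback would catch these shifts.
outcomes = [True] * 40 + [False] * 40
rates = window_win_rates(outcomes, window=10)
print(rates)                          # -> [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(f"{lag1_autocorr(rates):.2f}")  # -> 0.75
```

Sweep the window size over your real trade history and pick the size where this correlation is strongest; if it is near zero at every window, recent performance is not predictive and self-tuning will not help.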
For the Polymarket 5-minute BTC strategy I use: 30-trade lookback, 72-hour max age. Trades older than 72 hours are excluded because BTC volatility regimes shift faster than that. A quiet Saturday is not predictive of an active Monday.
Backtesting the Tuner
You need to backtest the self-tuner separately from the underlying strategy, because tuning can underperform fixed sizing.
Scenarios where fixed sizing beats self-tuning:
- High-variance strategies where variance is not predictive (high noise floor)
- Short-lived edge periods where the tuner scales up just as performance reverts
- Strategies with long autocorrelation where the tuner reacts too fast
The backtest loop:
```python
def backtest_tuner(trades: List[TradeRecord], base_bet: float) -> dict:
    tuner_equity = 1000.0
    fixed_equity = 1000.0
    history = []
    for i, trade in enumerate(trades):
        # Fixed sizing
        fixed_pnl = (trade.pnl / trade.size_usdc) * base_bet
        fixed_equity += fixed_pnl
        # Tuner sizing
        if i >= 10:  # minimum history
            recent = trades[max(0, i - 30):i]
            wins = sum(1 for t in recent if t.pnl > 0)
            wr = wins / len(recent)
            mult = max(0.5, min(1.5, 0.5 + wr))  # simple linear map
            sized = base_bet * mult
        else:
            sized = base_bet
        tuner_pnl = (trade.pnl / trade.size_usdc) * sized
        tuner_equity += tuner_pnl
        history.append({
            "trade": i,
            "fixed_equity": fixed_equity,
            "tuner_equity": tuner_equity,
        })
    return {
        "fixed_final": fixed_equity,
        "tuner_final": tuner_equity,
        "outperformance": tuner_equity - fixed_equity,
        "history": history,
    }
```
If the tuner does not outperform fixed sizing in backtests, use fixed sizing. The tuner adds complexity and latency. It only makes sense if it demonstrably improves risk-adjusted returns.
What the Production System Looks Like
In production, the self-tuner runs as a lightweight module called once before each trade decision. It queries a local SQLite database, computes the score, and returns a bet size in under 10ms.
The full sequence for each signal:
- Signal fires (Binance momentum detected)
- Circuit breaker check — is it tripped? If yes, abort.
- SelfTuner.get_bet_size() — what is the current bet size?
- Execute at computed size
- On resolution, record the outcome to the DB
- CircuitBreaker.record_outcome(win=...): a loss increments the consecutive-loss counter, a win resets it
The tuner and circuit breaker are independent layers. You can disable either without affecting the other. This separation makes debugging straightforward: a tripped circuit breaker is visually obvious in the log; a tuner operating at 0.75x is logged with explicit score and multiplier.
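The sequence can be sketched end to end. The stub classes below are stand-ins for the real components (the real tuner reads the DB, the real breaker persists to disk); the point is the ordering of the hard stop and the soft tuner:

```python
# Minimal stand-ins for CircuitBreaker and SelfTuner, just to show the flow.
class StubBreaker:
    def __init__(self, max_consecutive: int = 4):
        self.losses = 0
        self.max = max_consecutive

    def is_tripped(self) -> bool:
        return self.losses >= self.max

    def record_outcome(self, win: bool):
        self.losses = 0 if win else self.losses + 1


class StubTuner:
    def get_bet_size(self) -> float:
        return 15.0 * 0.75  # e.g. ANECDOTAL confidence -> 0.75x base bet


def handle_signal(breaker, tuner, execute):
    if breaker.is_tripped():      # 1. hard stop checked first
        return None               #    tripped breaker overrides everything
    size = tuner.get_bet_size()   # 2. soft tuning sizes the trade
    return execute(size)          # 3. place the order at the tuned size


breaker, tuner = StubBreaker(), StubTuner()
print(handle_signal(breaker, tuner, lambda s: f"executed ${s:.2f}"))  # -> executed $11.25

for _ in range(4):                # four straight losses trip the breaker
    breaker.record_outcome(win=False)
print(handle_signal(breaker, tuner, lambda s: f"executed ${s:.2f}"))  # -> None
```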