Originally published at chudi.dev
A strategy that works on average might not work in all market conditions. Fixed position sizing ignores this; a self-tuner adapts.
This post covers the architecture of a self-tuning position sizing system for a prediction market bot: how it reads its own trade history, computes a performance score, and translates that score into a bet size multiplier — without overfitting to variance or blowing up during quiet periods.
TL;DR
- The tuner reads recent trade outcomes from SQLite and computes a performance score
- Performance score drives a multiplier (0.5x to 1.5x) applied to base bet size
- Lookback window should match your strategy's mean reversion speed — not be as long as possible
- Hard clamps prevent the tuner from ever sizing at 0% or above a safe ceiling
- A circuit breaker (separate from tuning) provides a hard stop on consecutive losses
Why Self-Tune?
A fixed position size of $15 treats a period where your strategy is firing at 75% win rate the same as a period where it is firing at 45% win rate.
The insight behind self-tuning: recent performance is predictive of near-future performance for some strategy types. When the strategy is aligned with current market conditions (momentum strategies during trending markets), it performs above baseline. When conditions shift, performance degrades before you manually notice.
If you can detect that degradation early and reduce size, you protect capital. If you detect outperformance and increase size, you compound gains faster.
The risk: reacting to variance as if it were a regime change. After any 10-trade sample, even a 65% win rate strategy will sometimes have 4-6 consecutive losses purely by chance. A tuner that reduces size aggressively on 6 losses in a row is chasing noise.
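To put a number on that risk, here is a standalone Monte Carlo sketch (not part of the bot; function names are mine) estimating how often a genuinely profitable strategy produces a scary loss streak by chance alone:

```python
import random

def longest_loss_streak(outcomes: list) -> int:
    """Length of the longest run of losses (False) in a trade sequence."""
    longest = current = 0
    for won in outcomes:
        current = 0 if won else current + 1
        longest = max(longest, current)
    return longest

def p_streak(win_rate: float, n_trades: int, streak_len: int,
             sims: int = 20_000, seed: int = 42) -> float:
    """Monte Carlo estimate of P(>= streak_len consecutive losses in n_trades)."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(sims):
        outcomes = [rng.random() < win_rate for _ in range(n_trades)]
        if longest_loss_streak(outcomes) >= streak_len:
            hits += 1
    return hits / sims

# Even a genuine 65% strategy hits 4-loss streaks by pure chance; the
# probability grows with the window, so longer lookbacks see more of them.
print(f"10 trades: {p_streak(0.65, 10, 4):.1%}")
print(f"30 trades: {p_streak(0.65, 30, 4):.1%}")
```

A tuner that cuts size on every 4-loss run is reacting to exactly this noise.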
Architecture Overview
```
TradeDB (SQLite)
      |
      v
PerformanceReader (queries recent N trades)
      |
      v
ScoreComputer (win rate, rolling Sharpe, or custom metric)
      |
      v
MultiplierMapper (score → bet multiplier, with clamps)
      |
      v
SizingOutput (base_bet × multiplier → final bet)
```
Each component is stateless except TradeDB. The tuner runs before each trade decision and recomputes fresh.
TradeDB: Persisting Outcomes
The tuner needs trade history. SQLite is sufficient for single-instance bots.
```python
import sqlite3
import time
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class TradeRecord:
    trade_id: str
    timestamp: float
    entry_price: float
    size_usdc: float
    pnl: float  # positive = profit, negative = loss
    resolved: bool


class TradeDB:
    def __init__(self, db_path: str):
        self._conn = sqlite3.connect(db_path, check_same_thread=False)
        self._create_table()

    def _create_table(self):
        self._conn.execute("""
            CREATE TABLE IF NOT EXISTS trades (
                trade_id TEXT PRIMARY KEY,
                timestamp REAL NOT NULL,
                entry_price REAL NOT NULL,
                size_usdc REAL NOT NULL,
                pnl REAL,
                resolved INTEGER DEFAULT 0
            )
        """)
        self._conn.commit()

    def record_trade(self, record: TradeRecord):
        self._conn.execute("""
            INSERT OR REPLACE INTO trades
            (trade_id, timestamp, entry_price, size_usdc, pnl, resolved)
            VALUES (?, ?, ?, ?, ?, ?)
        """, (
            record.trade_id, record.timestamp,
            record.entry_price, record.size_usdc,
            record.pnl, int(record.resolved)
        ))
        self._conn.commit()

    def get_recent_resolved(
        self, n: int, max_age_secs: Optional[float] = None
    ) -> List[TradeRecord]:
        query = """
            SELECT trade_id, timestamp, entry_price, size_usdc, pnl, resolved
            FROM trades
            WHERE resolved = 1
        """
        params: list = []
        if max_age_secs is not None:
            cutoff = time.time() - max_age_secs
            query += " AND timestamp >= ?"
            params.append(cutoff)
        query += " ORDER BY timestamp DESC LIMIT ?"
        params.append(n)
        rows = self._conn.execute(query, params).fetchall()
        return [TradeRecord(*row[:5], bool(row[5])) for row in rows]
```
The critical decision here is WHERE resolved = 1. Unresolved trades have no outcome yet and cannot inform performance scoring. Including open positions in win rate calculations produces garbage scores.
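A quick standalone demonstration of that filter, using raw sqlite3 against an in-memory database with the same schema as above (trade IDs here are made up):

```python
import sqlite3
import time

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE trades (
        trade_id TEXT PRIMARY KEY, timestamp REAL NOT NULL,
        entry_price REAL NOT NULL, size_usdc REAL NOT NULL,
        pnl REAL, resolved INTEGER DEFAULT 0
    )
""")
now = time.time()
conn.executemany(
    "INSERT INTO trades VALUES (?, ?, ?, ?, ?, ?)",
    [("t1", now - 60, 0.52, 15.0, 3.1, 1),    # resolved win
     ("t2", now - 30, 0.48, 15.0, -15.0, 1),  # resolved loss
     ("t3", now,      0.50, 15.0, None, 0)],  # still open -> excluded
)
rows = conn.execute(
    "SELECT trade_id FROM trades WHERE resolved = 1 "
    "ORDER BY timestamp DESC LIMIT ?",
    (10,),
).fetchall()
print([r[0] for r in rows])  # -> ['t2', 't1']  (open trade t3 never appears)
```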
PerformanceReader: Computing a Score
The simplest score: win rate over the last N resolved trades.
```python
from dataclasses import dataclass
from typing import List


@dataclass
class PerformanceScore:
    n_trades: int
    win_rate: float
    avg_pnl: float
    confidence: str  # NO_DATA / ANECDOTAL / LOW / MODERATE


class PerformanceReader:
    LOOKBACK_TRADES = 30
    ANECDOTAL_THRESHOLD = 10
    LOW_CONFIDENCE_THRESHOLD = 30

    def __init__(self, db: TradeDB):
        self._db = db

    def compute(self) -> PerformanceScore:
        trades = self._db.get_recent_resolved(self.LOOKBACK_TRADES)
        if not trades:
            return PerformanceScore(0, 0.5, 0.0, "NO_DATA")
        n = len(trades)
        wins = sum(1 for t in trades if t.pnl > 0)
        win_rate = wins / n
        avg_pnl = sum(t.pnl for t in trades) / n
        if n < self.ANECDOTAL_THRESHOLD:
            confidence = "ANECDOTAL"
        elif n < self.LOW_CONFIDENCE_THRESHOLD:
            confidence = "LOW"
        else:
            confidence = "MODERATE"
        return PerformanceScore(n, win_rate, avg_pnl, confidence)
```
The confidence field matters. A tuner operating on 5 trades is reacting to pure noise. The multiplier mapping should discount heavily for ANECDOTAL confidence.
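The three-bucket label is deliberately crude. A more principled alternative (my suggestion, not part of the post's system) is to score off the lower bound of a Wilson score interval, which automatically discounts small samples:

```python
import math

def wilson_lower_bound(wins: int, n: int, z: float = 1.96) -> float:
    """Lower bound of the 95% Wilson score interval for a win rate.

    A conservative win-rate estimate that shrinks toward 0 at small n,
    so sample size and performance collapse into one number.
    """
    if n == 0:
        return 0.0
    p = wins / n
    denom = 1 + z * z / n
    center = p + z * z / (2 * n)
    margin = z * math.sqrt(p * (1 - p) / n + z * z / (4 * n * n))
    return (center - margin) / denom

# Same 60% observed win rate, very different evidence:
print(f"{wilson_lower_bound(3, 5):.2f}")    # 5-trade sample: bound far below 0.60
print(f"{wilson_lower_bound(30, 50):.2f}")  # 50-trade sample: bound much closer
```

Feeding the lower bound (instead of the raw win rate) into the multiplier mapper makes the ANECDOTAL discount emerge naturally rather than by special-casing.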
Rolling Sharpe: A More Stable Alternative
Win rate ignores the size of wins and losses. Rolling Sharpe accounts for both:
```python
import statistics


def compute_rolling_sharpe(
    trades: List[TradeRecord], risk_free: float = 0.0
) -> float:
    if len(trades) < 3:
        return 0.0  # not enough data
    returns = [t.pnl / t.size_usdc for t in trades]  # per-dollar return
    mean_r = statistics.mean(returns)
    std_r = statistics.stdev(returns)
    if std_r < 1e-9:
        return 0.0  # all trades identical, no variance info
    return (mean_r - risk_free) / std_r
```
Positive Sharpe = strategy is generating return above its variance. Negative Sharpe = return doesn't justify the variance. Sharpe of 1.0+ is a strong signal; Sharpe of 0.3 is borderline.
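To make that concrete, here is the same formula applied to raw per-dollar returns (so it runs without TradeRecord objects; the return sequences are invented for illustration):

```python
import statistics

def sharpe(returns, risk_free: float = 0.0) -> float:
    # Same computation as compute_rolling_sharpe, on bare per-dollar returns.
    if len(returns) < 3:
        return 0.0
    std_r = statistics.stdev(returns)
    if std_r < 1e-9:
        return 0.0
    return (statistics.mean(returns) - risk_free) / std_r

steady = [0.04, 0.05, 0.03, 0.06, 0.04]    # small, consistent wins
choppy = [0.40, -0.35, 0.38, -0.33, 0.12]  # same mean return, huge variance

print(f"steady: {sharpe(steady):.2f}")  # high Sharpe
print(f"choppy: {sharpe(choppy):.2f}")  # near zero
```

Both sequences average 4.4 cents per dollar, but win rate alone would also miss the difference: the choppy sequence is 60% winners too. Sharpe separates them.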
MultiplierMapper: Score to Bet Size
The multiplier maps a performance score to a scaling factor. Linear interpolation between confidence-weighted bounds:
```python
class MultiplierMapper:
    # Multiplier bounds
    FLOOR = 0.5    # never go below half base bet
    CEILING = 1.5  # never go above 1.5x base bet
    NEUTRAL = 1.0  # no adjustment when performance is baseline

    # Win rate thresholds
    BASELINE_WIN_RATE = 0.55  # expected win rate for this strategy
    STRONG_WIN_RATE = 0.70    # scale up significantly
    WEAK_WIN_RATE = 0.45      # scale down significantly

    def compute_multiplier(self, score: PerformanceScore) -> float:
        # Insufficient data: default to neutral or slight reduction
        if score.confidence == "NO_DATA":
            return self.FLOOR  # no history = minimum size
        if score.confidence == "ANECDOTAL":
            return self.NEUTRAL * 0.75  # reduce but don't stop
        win_rate = score.win_rate
        if win_rate >= self.STRONG_WIN_RATE:
            raw = self.CEILING
        elif win_rate <= self.WEAK_WIN_RATE:
            raw = self.FLOOR
        else:
            # Linear interpolation between floor and ceiling
            t = (win_rate - self.WEAK_WIN_RATE) / (
                self.STRONG_WIN_RATE - self.WEAK_WIN_RATE
            )
            raw = self.FLOOR + t * (self.CEILING - self.FLOOR)
        # Hard clamp regardless of edge cases
        return max(self.FLOOR, min(self.CEILING, raw))
```
The hard clamp at the end is not redundant. Floating-point edge cases or corrupted trade records can produce extreme scores. The clamp ensures the tuner never outputs a multiplier that could cause catastrophic position sizing.
Putting It Together: SelfTuner
```python
class SelfTuner:
    def __init__(self, db: TradeDB, base_bet_usdc: float):
        self._db = db
        self._base_bet = base_bet_usdc
        self._reader = PerformanceReader(db)
        self._mapper = MultiplierMapper()

    def get_bet_size(self) -> float:
        score = self._reader.compute()
        multiplier = self._mapper.compute_multiplier(score)
        sized = self._base_bet * multiplier
        # Log for auditing
        print(
            f"[SelfTuner] n={score.n_trades} wr={score.win_rate:.2f} "
            f"conf={score.confidence} mult={multiplier:.2f} "
            f"bet=${sized:.2f}"
        )
        return round(sized, 2)
```
Usage in the signal handler:
```python
async def on_signal(direction: Direction):
    bet_size = tuner.get_bet_size()
    await executor.execute(direction, size_usdc=bet_size)
```
The tuner runs fresh before each trade. It does not cache its result across the session — market conditions and performance data change throughout the day.
The Circuit Breaker: Hard Stop vs. Soft Tuning
The self-tuner adjusts gradually. The circuit breaker stops the bot entirely.
They are separate systems and serve different purposes:
- Self-tuner: adjusts to gradual regime changes. Soft. Continuous adjustment.
- Circuit breaker: prevents catastrophic loss from N consecutive losses. Hard. Binary stop.
```python
import json
import os


class CircuitBreaker:
    def __init__(self, state_path: str, max_consecutive: int = 4):
        self._path = state_path
        self._max = max_consecutive

    def _load(self) -> dict:
        if not os.path.exists(self._path):
            return {"consecutive_losses": 0}
        with open(self._path) as f:
            return json.load(f)

    def _save(self, state: dict):
        with open(self._path, "w") as f:
            json.dump(state, f)

    def record_outcome(self, win: bool):
        state = self._load()
        if win:
            state["consecutive_losses"] = 0
        else:
            state["consecutive_losses"] += 1
        self._save(state)

    def is_tripped(self) -> bool:
        state = self._load()
        return state.get("consecutive_losses", 0) >= self._max

    def reset(self):
        self._save({"consecutive_losses": 0})
```
The circuit breaker persists to disk. If the bot restarts after N consecutive losses, it remains tripped. You must manually reset it after investigating the losses.
The Lookback Window Problem
The hardest design decision: how many trades to look back?
Too short (5-10 trades): pure variance. Even a genuine 65% win rate strategy will show a run of 4+ consecutive losses in a 10-trade window a non-trivial fraction of the time (on the order of 7% per window, and roughly 25% over 30 trades) by pure chance. A 10-trade tuner reacts to this as a regime change and cuts size — then misses the recovery.
Too long (50-100 trades): too slow. If your strategy genuinely degrades (market conditions shift, new market makers enter, fees change), a 100-trade lookback takes weeks to detect the change and adjust.
The calibration approach: measure how quickly your strategy's performance autocorrelates. If today's win rate predicts tomorrow's win rate at r=0.7 over 5-trade windows, use 5-trade windows. If r=0.2 (essentially no autocorrelation at short windows), you need a longer window to find the signal.
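A minimal sketch of that calibration, assuming you measure the lag-1 correlation between win rates of consecutive non-overlapping windows (helper names are mine; the outcome sequence is a contrived hot/cold example):

```python
import statistics

def window_win_rates(outcomes, window: int):
    """Win rate per non-overlapping window of trade outcomes (True = win)."""
    return [
        sum(outcomes[i:i + window]) / window
        for i in range(0, len(outcomes) - window + 1, window)
    ]

def lag1_autocorr(series):
    """Pearson correlation between each window's win rate and the next's."""
    if len(series) < 3:
        return 0.0
    x, y = series[:-1], series[1:]
    sx, sy = statistics.stdev(x), statistics.stdev(y)
    if sx < 1e-9 or sy < 1e-9:
        return 0.0  # constant series carries no regime signal
    mx, my = statistics.mean(x), statistics.mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y)) / (len(x) - 1)
    return cov / (sx * sy)

# Long hot/cold regimes: adjacent 10-trade windows strongly predict each other,
# so a ~10-trade lookback would catch these shifts.
outcomes = [True] * 40 + [False] * 40
rates = window_win_rates(outcomes, window=10)
print(rates)                          # -> [1.0, 1.0, 1.0, 1.0, 0.0, 0.0, 0.0, 0.0]
print(f"{lag1_autocorr(rates):.2f}")  # -> 0.75
```

Sweep the window size over your real trade history and pick the size where this correlation is strongest; if it is near zero at every window, recent performance is not predictive and self-tuning will not help.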
For the Polymarket 5-minute BTC strategy I use: 30-trade lookback, 72-hour max age. Trades older than 72 hours are excluded because BTC volatility regimes shift faster than that. A quiet Saturday is not predictive of an active Monday.
Backtesting the Tuner
You need to backtest the self-tuner separately from the underlying strategy, because tuning can underperform fixed sizing.
Scenarios where fixed sizing beats self-tuning:
- High-variance strategies where variance is not predictive (high noise floor)
- Short-lived edge periods where the tuner scales up just as performance reverts
- Strategies with long autocorrelation where the tuner reacts too fast
The backtest loop:
```python
def backtest_tuner(trades: List[TradeRecord], base_bet: float) -> dict:
    tuner_equity = 1000.0
    fixed_equity = 1000.0
    history = []
    for i, trade in enumerate(trades):
        # Fixed sizing
        fixed_pnl = (trade.pnl / trade.size_usdc) * base_bet
        fixed_equity += fixed_pnl
        # Tuner sizing
        if i >= 10:  # minimum history
            recent = trades[max(0, i - 30):i]
            wins = sum(1 for t in recent if t.pnl > 0)
            wr = wins / len(recent)
            mult = max(0.5, min(1.5, 0.5 + wr))  # simple linear map
            sized = base_bet * mult
        else:
            sized = base_bet
        tuner_pnl = (trade.pnl / trade.size_usdc) * sized
        tuner_equity += tuner_pnl
        history.append({
            "trade": i,
            "fixed_equity": fixed_equity,
            "tuner_equity": tuner_equity,
        })
    return {
        "fixed_final": fixed_equity,
        "tuner_final": tuner_equity,
        "outperformance": tuner_equity - fixed_equity,
        "history": history,
    }
```
If the tuner does not outperform fixed sizing in backtests, use fixed sizing. The tuner adds complexity and latency. It only makes sense if it demonstrably improves risk-adjusted returns.
What the Production System Looks Like
In production, the self-tuner runs as a lightweight module called once before each trade decision. It queries a local SQLite database, computes the score, and returns a bet size in under 10ms.
The full sequence for each signal:
- Signal fires (Binance momentum detected)
- Circuit breaker check — is it tripped? If yes, abort.
- SelfTuner.get_bet_size() — what is the current bet size?
- Execute at computed size
- On resolution, record the outcome to the DB
- CircuitBreaker.record_outcome(win=...): a loss increments the consecutive-loss counter, a win resets it
The tuner and circuit breaker are independent layers. You can disable either without affecting the other. This separation makes debugging straightforward: a tripped circuit breaker is visually obvious in the log; a tuner operating at 0.75x is logged with explicit score and multiplier.
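The sequence can be sketched end to end. The stub classes below are stand-ins for the real components (the real tuner reads the DB, the real breaker persists to disk); the point is the ordering of the hard stop and the soft tuner:

```python
# Minimal stand-ins for CircuitBreaker and SelfTuner, just to show the flow.
class StubBreaker:
    def __init__(self, max_consecutive: int = 4):
        self.losses = 0
        self.max = max_consecutive

    def is_tripped(self) -> bool:
        return self.losses >= self.max

    def record_outcome(self, win: bool):
        self.losses = 0 if win else self.losses + 1


class StubTuner:
    def get_bet_size(self) -> float:
        return 15.0 * 0.75  # e.g. ANECDOTAL confidence -> 0.75x base bet


def handle_signal(breaker, tuner, execute):
    if breaker.is_tripped():      # 1. hard stop checked first
        return None               #    tripped breaker overrides everything
    size = tuner.get_bet_size()   # 2. soft tuning sizes the trade
    return execute(size)          # 3. place the order at the tuned size


breaker, tuner = StubBreaker(), StubTuner()
print(handle_signal(breaker, tuner, lambda s: f"executed ${s:.2f}"))  # -> executed $11.25

for _ in range(4):                # four straight losses trip the breaker
    breaker.record_outcome(win=False)
print(handle_signal(breaker, tuner, lambda s: f"executed ${s:.2f}"))  # -> None
```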