Bill Tu

Posted on Apr 8

Lessons from Fixing 8 Bugs in a Quantitative Trading Framework

#codequality #opensource #performance #python

QuantFlow is an open-source quantitative trading system written in Python. After the initial release, a thorough code review uncovered 8 bugs — ranging from silent correctness errors to quadratic performance traps. This article walks through each bug, explains why it matters in the context of financial software, and shows the fix.

The bugs fall into three categories: correctness errors that produce wrong numbers, safety gaps that allow dangerous operations, and performance issues that make the system unusable at scale.

Bug 1: Python Properties That Silently Ignore Arguments

The PerformanceReport class defined sharpe_ratio and sortino_ratio as Python @property decorators, but the method signatures included parameters:

@property
def sharpe_ratio(self, risk_free_rate: float = 0.0, periods: int = 252) -> float:
    ...

This is a subtle Python gotcha. The @property decorator transforms a method into a descriptor that takes no arguments beyond self. The risk_free_rate and periods parameters exist in the signature but can never be passed by the caller. result.sharpe_ratio always uses the defaults. Attempting result.sharpe_ratio(risk_free_rate=0.02) raises TypeError: 'float' object is not callable.

In quantitative finance, the risk-free rate matters. A Sharpe ratio calculated with a 0% risk-free rate versus a 4% rate can tell very different stories about a strategy's risk-adjusted performance. Silently hardcoding this to zero is a correctness problem.

The fix removes @property and makes them regular methods:

def sharpe_ratio(self, risk_free_rate: float = 0.0, periods: int = 252) -> float:
    if len(self.equity_curve) < 2:
        return 0.0
    returns = np.diff(self.equity_curve) / self.equity_curve[:-1]
    excess = returns - risk_free_rate / periods
    if np.std(excess) == 0:
        return 0.0
    return float(np.mean(excess) / np.std(excess) * np.sqrt(periods))

This is a breaking change — callers must update from result.sharpe_ratio to result.sharpe_ratio() — but it's the right trade-off. A method that pretends to accept parameters but ignores them is worse than a clean API change.

Bug 2: Drawdown Calculated from Cash, Not Equity

The risk manager's drawdown circuit breaker compared cash against equity_peak:

# Before: uses cash only
signal = self.risk_manager.check(
    signal=signal,
    current_price=bar.close,
    capital=portfolio.cash,        # BUG: just cash, not total equity
    equity_peak=portfolio.equity_peak,
    ...
)

The equity_peak tracks the highest total equity (cash + positions). But the drawdown check only looked at cash. When a position is open, cash drops by the position cost — even if the position is profitable. This creates false drawdown triggers.

Example: you start with $100,000 and buy $80,000 worth of stock. Cash is now $20,000. The stock goes up 10%, so your equity is $108,000. But the drawdown check sees $20,000 vs $100,000 peak — an 80% "drawdown" — and halts trading on a profitable portfolio.

The fix passes total equity instead:

signal = self.risk_manager.check(
    signal=signal,
    current_price=bar.close,
    equity=portfolio.equity,       # FIXED: total equity
    equity_peak=portfolio.equity_peak,
    ...
)

The risk manager's parameter was renamed from capital to equity to make the semantics explicit.

Bug 3: Position Market Value Frozen at Entry Price

The Position.market_value property was defined as:

@property
def market_value(self) -> float:
    return self.quantity * self.entry_price

This returns the cost basis, not the current market value. Since Portfolio.equity sums market_value across all positions, the equity curve never reflects unrealized gains or losses. A position that doubled in value still reports its original cost. A position that lost 50% still reports full value.

This means the equity curve, the Sharpe ratio, the max drawdown — every metric derived from equity — is wrong whenever positions are open.

The fix adds a current_price field to Position that gets updated on every bar:

@dataclass
class Position:
    symbol: str
    side: OrderSide
    quantity: float
    entry_price: float
    entry_time: datetime
    current_price: float = 0.0    # updated each bar
    ...

    @property
    def market_value(self) -> float:
        return self.quantity * self.current_price

    @property
    def cost_basis(self) -> float:
        return self.quantity * self.entry_price

The engine calls portfolio.update_market_prices(symbol, bar.close) on every bar before computing equity. The old entry_price-based value is preserved as cost_basis for PnL calculations.

Bugs 2 and 3 are related — fixing the equity calculation (Bug 3) makes the drawdown fix (Bug 2) actually meaningful. Without accurate equity, even the correct drawdown formula produces wrong results.

Bug 4: Unlimited Short Selling Without Margin

The portfolio's execute_signal method checked cash sufficiency for buy orders but had no check for sell (short) orders:

# Only checked for buys
if side == OrderSide.BUY and cost > self.cash:
    return None
# Shorts? No check at all.

In real markets, short selling requires margin — you need collateral to borrow shares. Without any check, the system could open unlimited short positions regardless of available capital. A strategy bug that generates continuous sell signals would create unbounded short exposure.

The fix adds a configurable margin requirement:

if side == OrderSide.SELL:
    margin_required = fill_price * quantity * self.short_margin_rate + commission
    if margin_required > self.cash:
        logger.info("Insufficient margin for short: need %.2f, have %.2f",
                     margin_required, self.cash)
        return None

The short_margin_rate defaults to 1.0 (100% margin), meaning you need the full position value in cash to short. This is conservative but safe. Users can adjust it for their broker's actual margin requirements.

Bug 5: O(n²) Memory Allocation in Data Feed

The DataFeed.bars() method built the close price history by appending to a Python list and creating a new numpy array on every bar:

closes = []
for _, row in df.iterrows():
    closes.append(row["close"])
    yield Bar(..., close_prices=np.array(closes, dtype=np.float64))

For a backtest with n bars, this creates n numpy arrays of sizes 1, 2, 3, ..., n. Total memory allocated: O(n²). Total copy operations: O(n²). For 10,000 bars, that's ~50 million float copies just for the price history.

The fix pre-allocates the full array once and passes slices:

all_closes = df["close"].values.astype(np.float64)
for i, (_, row) in enumerate(df.iterrows()):
    yield Bar(..., close_prices=all_closes[:i + 1])

all_closes[:i + 1] is a numpy view, not a copy. Zero additional memory allocation. The total memory for close prices drops from O(n²) to O(n), and the allocation count drops from n to 1.

On a 10,000-bar backtest, this eliminates roughly 400MB of transient memory allocation.

Bug 6: EMA Recomputes Full Series on Every Bar

The EMA.calculate() method called self.series(prices)[-1]:

def calculate(self, prices: np.ndarray) -> Optional[float]:
    if len(prices) < self.period:
        return None
    return float(self.series(prices)[-1])

series() builds the complete EMA array for all prices. Since strategies call calculate() on every bar with a growing price array, the total work over a backtest is O(n²) — computing the full series of length 1, then 2, then 3, ..., then n.

The fix computes only the final value:

def calculate(self, prices: np.ndarray) -> Optional[float]:
    if len(prices) < self.period:
        return None
    alpha = 2.0 / (self.period + 1)
    ema = float(np.mean(prices[:self.period]))
    for i in range(self.period, len(prices)):
        ema = alpha * prices[i] + (1 - alpha) * ema
    return ema

This is still O(n) per call (it walks the full price array), but it avoids the numpy array allocation overhead. A further optimization would cache the previous EMA value and compute incrementally in O(1), but that requires the indicator to be stateful — a design change for a future version.

Bug 7: Stochastic %D Misaligned with %K

The Stochastic oscillator's %D line (a moving average of %K) was computed by:

Extracting non-NaN values from %K into a compact array
Computing SMA on the compact array
Mapping results back to original indices

d = SMA(self.d_period).series(k[~np.isnan(k)])
d_full = np.full(n, np.nan)
valid_idx = np.where(~np.isnan(k))[0]
d_full[valid_idx[:len(d)]] = d

The problem: SMA.series() returns an array with leading NaN values (the warm-up period). When mapped back via valid_idx[:len(d)], these NaN values land on the wrong indices. The %D line ends up shifted relative to %K.

The fix computes %D directly with proper alignment:

d = np.full(n, np.nan)
for i in range(self.k_period - 1 + self.d_period - 1, n):
    window = k[i - self.d_period + 1: i + 1]
    if not np.any(np.isnan(window)):
        d[i] = np.mean(window)

This is straightforward: for each position, take the last d_period values of %K, check they're all valid, and average them. The index arithmetic ensures %D starts exactly d_period - 1 bars after %K begins.

Bug 8: MACD Returns NaN Instead of None

The MACD.calculate() method checked for insufficient data and returned (None, None, None). But when there was enough data for the MACD line but not enough for the signal line's EMA to fully warm up, series() would produce NaN values. The calculate() method then returned float(NaN):

# Before: NaN leaks through
return float(macd_line[-1]), float(signal_line[-1]), float(histogram[-1])

Downstream strategies check if hist is None but don't check for NaN. A NaN histogram passes the None check, enters the crossover logic, and produces garbage signals — NaN comparisons in Python always return False, so prev_hist <= 0 < hist silently fails.

The fix adds a NaN guard:

m, s, h = macd_line[-1], signal_line[-1], histogram[-1]
if np.isnan(m) or np.isnan(s) or np.isnan(h):
    return None, None, None
return float(m), float(s), float(h)

This is a general principle for indicator APIs: the contract should be "returns a number or None", never NaN. NaN propagates silently through arithmetic and comparisons, making bugs extremely hard to trace. None forces explicit handling.

Patterns Across the Bugs

Looking at all 8 bugs together, a few patterns emerge:

Financial software has a low tolerance for "close enough." Bugs 2, 3, and 8 all produce numbers that look plausible but are wrong. A Sharpe ratio of 1.8 vs 1.7 might not raise eyebrows, but if the difference comes from using cost basis instead of market value, every decision based on that number is compromised. In most software, off-by-a-little is acceptable. In quantitative finance, it's the difference between a profitable strategy and a losing one.

Python's flexibility is a double-edged sword. Bug 1 (@property with parameters) is a language-level trap. Python doesn't warn you. The code runs fine. The parameters just don't do anything. Static type checkers like mypy would catch this, which is a strong argument for running them in CI.

Performance bugs compound. Bugs 5 and 6 are both O(n²) issues that individually might be tolerable on small datasets. But they multiply: the data feed creates O(n²) arrays, and then the EMA indicator does O(n²) work on each one. On a 10,000-bar backtest, the combined effect is severe.

Safety checks are not optional. Bug 4 (unlimited short selling) is the kind of bug that doesn't matter until it does — and then it matters a lot. A strategy bug that generates spurious sell signals would create unbounded short exposure. In a live trading system, this could be catastrophic.

Verification

After applying all fixes, the multi-strategy backtest produces more accurate results:

================================================================================
  MULTI-STRATEGY COMPARISON
================================================================================
  Strategy                      Return   Trades   Win Rate   Sharpe     Max DD
--------------------------------------------------------------------------------
  SMACrossover                 228.29%       17     17.65%     1.80     -1.24%
  RSIMeanReversion              71.01%        4     25.00%     1.02    -22.57%
  MACDTrend                    366.14%       28     42.86%     2.18     -1.11%
  BollingerBreakout            857.60%       55     30.91%     2.84    -11.26%
================================================================================

The max drawdown numbers changed (e.g., RSIMeanReversion went from -16.86% to -22.57%) because they now reflect true equity-based drawdown instead of the cash-only approximation. The previous numbers were optimistic — they understated risk.

All fixes are included in QuantFlow v0.2.0.

QuantFlow is available at github.com/iwtxokhtd83/QuantFlow. Issues and contributions are welcome.

DEV Community