Ayrat Murtazin

Posted on Apr 24

Does the Stock Market Still Overreact? A Python Contrarian Strategy

#python #quant #trading #finance

In 1985, behavioral economists Werner De Bondt and Richard Thaler published a landmark paper arguing that stock markets systematically overreact to news — that "Losers" get punished too harshly and "Winners" get rewarded too generously, and that both eventually snap back toward fair value. This mean-reversion effect, driven by human emotion rather than rational pricing, became one of the most cited anomalies in academic finance. The question worth asking in 2024 is: does it still work?

This article implements a full contrarian backtesting framework in Python. We use a rolling Z-score engine to detect statistically extreme price moves across 40 S&P 500 constituents, then simulate a long-short strategy that buys oversold "Losers" and shorts overbought "Winners", holding each position for a 60-day recovery window. The result is a reproducible research notebook, benchmarked against SPY, that you can extend with your own parameters.

This article covers:

Section 1: The Overreaction Hypothesis. What behavioral finance says about crowd psychology and mean reversion, and how Z-scores quantify "statistical panic"
Section 2: Python Implementation. End-to-end code covering data retrieval, Z-score signal generation, trade simulation, and visualization
Section 3: Results and Analysis. What the backtest reveals about strategy performance, drawdowns, and realistic expectations
Section 4: Use Cases. Practical scenarios where this framework applies
Section 5: Limitations and Edge Cases. Honest constraints and failure modes to be aware of before deploying capital

1. The Overreaction Hypothesis

When a company misses earnings, the stock often doesn't just fall — it collapses. Investors, gripped by recency bias and loss aversion, extrapolate a single bad quarter into a permanent disaster. The price overshoots fair value on the downside. The same mechanism works in reverse: a hot streak of good news can push a stock to euphoric heights well beyond what fundamentals justify. De Bondt and Thaler called this overreaction, and they showed that portfolios of extreme past losers consistently outperformed extreme past winners over subsequent 3–5 year windows in U.S. equity data.

The intuition is straightforward. Think of market sentiment as a pendulum. Fear and greed push it to extremes, but gravity, in the form of fundamental value, tends to pull it back toward center. A contrarian strategy attempts to profit from that return trip. You're not predicting where a stock is going; you're betting that after an extreme move, the pendulum has swung too far and a partial reversal is statistically likely.

To operationalize this, we need a consistent mathematical measure of "extreme." That's where the rolling Z-score comes in. For each stock on each day, we compute how many standard deviations that day's log return sits from its own rolling mean. A Z-score of –2.5 means the return is 2.5 standard deviations below its rolling average, an unusually large negative move that, under a normal distribution, occurs about 0.6% of the time (roughly 1.2% if moves of that size in either direction are counted). This is our "panic signal."
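
The tail probabilities behind that "panic signal" can be checked directly. The snippet below is a minimal sketch using only the standard library; the helper name `normal_lower_tail` is an illustrative assumption, and the numbers hold only under the idealized normal-distribution assumption, which real returns violate.

```python
import math

def normal_lower_tail(z: float) -> float:
    """P(Z <= z) for a standard normal variable, via the complementary error function."""
    return 0.5 * math.erfc(-z / math.sqrt(2))

# One-sided probability of a move at least this far below the rolling mean
for z in (-2.0, -2.5, -3.0):
    print(f"z = {z:+.1f}  one-sided tail probability = {normal_lower_tail(z):.4%}")
```

At z = –2.0 the one-sided probability is about 2.3%, and at z = –2.5 it is about 0.62%, which is why raising the threshold makes signals much rarer.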

The core hypothesis we're testing is simple: stocks that breach a negative Z-score threshold (our Losers) will outperform over the next 60 trading days, and stocks that breach a positive threshold (our Winners) will underperform. If the market still overreacts, a long-Losers / short-Winners portfolio should produce positive risk-adjusted returns.
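
One way to look at this hypothesis before building a full backtest is a simple event study: average the forward return over every (date, ticker) cell where a signal fired. The sketch below assumes a price table and a boolean signal table of the same shape; the function names and the toy tickers `AAA` and `BBB` are hypothetical and not part of the article's code.

```python
import numpy as np
import pandas as pd

def forward_log_return(prices: pd.DataFrame, horizon: int) -> pd.DataFrame:
    """Log return from each date's close to the close `horizon` rows later."""
    return np.log(prices.shift(-horizon) / prices)

def mean_forward_return(prices: pd.DataFrame, signals: pd.DataFrame, horizon: int) -> float:
    """Average forward log return over the cells where `signals` is True."""
    fwd = forward_log_return(prices, horizon).where(signals).to_numpy()
    fwd = fwd[~np.isnan(fwd)]  # drop non-signal cells and cells with no future price
    return float(fwd.mean()) if fwd.size else float("nan")

# Toy data: hypothetical ticker AAA drops 10% and then rebounds 10% the next day
dates = pd.bdate_range("2024-01-01", periods=5)
toy_prices = pd.DataFrame(
    {"AAA": [100.0, 90.0, 99.0, 99.0, 99.0], "BBB": [50.0, 50.0, 55.0, 50.0, 50.0]},
    index=dates,
)
toy_losers = pd.DataFrame(False, index=dates, columns=toy_prices.columns)
toy_losers.loc[dates[1], "AAA"] = True  # pretend a Loser signal fired after the drop

avg = mean_forward_return(toy_prices, toy_losers, horizon=1)
print(f"Mean 1-day forward log return after Loser signals: {avg:.4f}")
```

Running the same function on Loser and Winner signals with `horizon=60` would give a first read on whether the two groups diverge in the direction the hypothesis predicts.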

2. Python Implementation

2.1 Setup and Parameters

The strategy has a small set of configurable parameters that control sensitivity and holding behavior. ZSCORE_WINDOW sets the lookback period for computing rolling mean and standard deviation — 60 days captures roughly one quarter of trading behavior. ZSCORE_THRESHOLD sets how extreme a move must be to trigger a signal; 2.0 is a common statistical cutoff. HOLDING_DAYS defines how long we hold each trade before closing it out, simulating the expected mean-reversion window.

import yfinance as yf
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.dates as mdates
from datetime import datetime, timedelta
import warnings
warnings.filterwarnings('ignore')

# --- Universe ---
TICKERS = [
    'AAPL','MSFT','GOOGL','AMZN','META','NVDA','JPM','JNJ',
    'V','PG','UNH','HD','MA','BAC','XOM','PFE','ABBV','KO',
    'PEP','MRK','COST','AVGO','DIS','CSCO','VZ','INTC','WMT',
    'CVX','ABT','MCD','NKE','DHR','NEE','LIN','HON','UPS',
    'LOW','AMGN','IBM','GS'
]
BENCHMARK = 'SPY'

# --- Strategy Parameters ---
START_DATE      = '2010-01-01'
END_DATE        = '2024-01-01'
ZSCORE_WINDOW   = 60    # Rolling lookback in trading days
ZSCORE_THRESHOLD = 2.0  # Trigger threshold (standard deviations)
HOLDING_DAYS    = 60    # Days to hold each position
INITIAL_CAPITAL = 100_000

2.2 Data Retrieval and Z-Score Signal Engine

This section downloads adjusted closing prices for the full universe and computes daily log returns. Log returns are preferred over simple returns for their additive time-series properties and better approximation of normality. The rolling Z-score is then computed per ticker, flagging Loser signals (Z < –threshold) and Winner signals (Z > +threshold).

# --- Download price data ---
raw = yf.download(TICKERS + [BENCHMARK], start=START_DATE, end=END_DATE,
                  auto_adjust=True, progress=False)

# Flatten MultiIndex columns if present
if isinstance(raw.columns, pd.MultiIndex):
    prices = raw['Close']
else:
    prices = raw[['Close']]

# Drop tickers with no data at all (assign rather than mutate a slice in place)
prices = prices.dropna(axis=1, how='all')

# --- Log returns ---
log_returns = np.log(prices / prices.shift(1))

# --- Rolling Z-Score per ticker ---
def rolling_zscore(series, window):
    roll_mean = series.rolling(window).mean()
    roll_std  = series.rolling(window).std()
    return (series - roll_mean) / roll_std

zscore_df = log_returns.apply(lambda col: rolling_zscore(col, ZSCORE_WINDOW))

# --- Signal flags ---
loser_signal  = zscore_df < -ZSCORE_THRESHOLD   # Oversold: buy signal
winner_signal = zscore_df >  ZSCORE_THRESHOLD   # Overbought: short signal

print(f"Total Loser signals:  {loser_signal.sum().sum()}")
print(f"Total Winner signals: {winner_signal.sum().sum()}")
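
Note that the rolling window above includes the current day's return in its own mean and standard deviation, which dampens the Z-score of a large move. A possible variant, sketched here as an assumption rather than the article's method, measures each return against the statistics of the preceding `window` days only. The function name `lagged_rolling_zscore` is hypothetical.

```python
import numpy as np
import pandas as pd

def lagged_rolling_zscore(series: pd.Series, window: int) -> pd.Series:
    """Z-score of each observation against the mean/std of the PREVIOUS `window` observations."""
    baseline = series.shift(1).rolling(window)
    return (series - baseline.mean()) / baseline.std()

def inclusive_rolling_zscore(series: pd.Series, window: int) -> pd.Series:
    """Same form as the article's rolling_zscore: the window includes the current observation."""
    return (series - series.rolling(window).mean()) / series.rolling(window).std()

# Four quiet days followed by one large negative return
toy_returns = pd.Series([0.01, -0.01, 0.01, -0.01, -0.10])

z_lagged = lagged_rolling_zscore(toy_returns, window=4)
z_inclusive = inclusive_rolling_zscore(toy_returns, window=4)
print(f"lagged baseline:    {z_lagged.iloc[-1]:.2f}")
print(f"inclusive baseline: {z_inclusive.iloc[-1]:.2f}")
```

On this toy series the lagged version scores the final move far more extremely than the inclusive one, because the outlier no longer inflates its own standard deviation. With a 60-day window the effect is smaller, but it can still change which days cross the threshold.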

2.3 Contrarian Strategy Backtest

For each trading day, we scan for new signals and open positions. A long position is opened on a Loser signal; a short position on a Winner signal. Each position is entered at that day's close and held for HOLDING_DAYS business days before closing. Portfolio equity is updated daily from open-position P&L and compared against a SPY buy-and-hold curve scaled to the same starting capital.

# --- Backtest Engine ---
def run_backtest(prices, loser_signals, winner_signals, holding_days, capital):
    equity_curve = []
    open_positions = []  # list of dicts: {ticker, direction, entry_price, entry_date, close_date}
    cash = capital
    tickers = [c for c in prices.columns if c != BENCHMARK]
    dates = prices.index[ZSCORE_WINDOW:]

    for date in dates:
        # Open new positions
        for ticker in tickers:
            if ticker not in prices.columns:
                continue
            if loser_signals.loc[date, ticker]:
                open_positions.append({
                    'ticker': ticker, 'direction': 1,
                    'entry_price': prices.loc[date, ticker],
                    'entry_date': date,
                    'close_date': date + pd.tseries.offsets.BDay(holding_days)
                })
            elif winner_signals.loc[date, ticker]:
                open_positions.append({
                    'ticker': ticker, 'direction': -1,
                    'entry_price': prices.loc[date, ticker],
                    'entry_date': date,
                    'close_date': date + pd.tseries.offsets.BDay(holding_days)
                })

        # Mark-to-market open positions using each position's daily price change
        daily_pnl = 0.0
        still_open = []
        n_open = max(len(open_positions), 1)
        for pos in open_positions:
            current_price = prices.loc[date, pos['ticker']]
            if pd.isna(current_price):
                # No quote today: carry the position forward unchanged
                still_open.append(pos)
                continue
            # Return since the previous mark (entry price on the first day),
            # so cumulative P&L is not re-added to cash every day
            last_price = pos.get('last_price', pos['entry_price'])
            daily_pnl += pos['direction'] * (current_price / last_price - 1) * (capital / n_open)
            pos['last_price'] = current_price
            if date < pos['close_date']:
                still_open.append(pos)

        open_positions = still_open
        cash += daily_pnl
        equity_curve.append({'date': date, 'equity': cash})

    return pd.DataFrame(equity_curve).set_index('date')

equity = run_backtest(prices, loser_signal, winner_signal, HOLDING_DAYS, INITIAL_CAPITAL)

# --- Benchmark equity curve ---
# Slice to the strategy's start date first, then rebase, so both curves begin at INITIAL_CAPITAL
spy_prices = prices[BENCHMARK].dropna().loc[equity.index[0]:]
spy_curve  = (spy_prices / spy_prices.iloc[0]) * INITIAL_CAPITAL

2.4 Visualization

The chart below overlays the contrarian strategy equity curve against the SPY benchmark. Pay attention to divergences during high-volatility regimes — the 2020 COVID crash is particularly informative, as it generated a flood of extreme Z-score signals that drove a sharp strategy recovery.

plt.style.use('dark_background')
fig, axes = plt.subplots(2, 1, figsize=(14, 9), gridspec_kw={'height_ratios': [3, 1]})

# --- Equity curves ---
ax1 = axes[0]
ax1.plot(equity.index, equity['equity'], color='#00BFFF', linewidth=1.8,
         label='Contrarian Z-Score Strategy')
ax1.plot(spy_curve.index, spy_curve.values, color='#FF6B6B', linewidth=1.5,
         linestyle='--', label='SPY Benchmark')
ax1.set_title('Contrarian Overreaction Strategy vs SPY Benchmark',
              fontsize=15, fontweight='bold', pad=15)
ax1.set_ylabel('Portfolio Value ($)', fontsize=11)
ax1.legend(fontsize=10)
ax1.yaxis.set_major_formatter(plt.FuncFormatter(lambda x, _: f'${x:,.0f}'))
ax1.grid(alpha=0.2)

# --- Drawdown ---
ax2 = axes[1]
roll_max  = equity['equity'].cummax()
drawdown  = (equity['equity'] - roll_max) / roll_max * 100
ax2.fill_between(drawdown.index, drawdown.values, 0,
                 color='#FF4444', alpha=0.6, label='Drawdown (%)')
ax2.set_ylabel('Drawdown (%)', fontsize=11)
ax2.set_xlabel('Date', fontsize=11)
ax2.legend(fontsize=9)
ax2.grid(alpha=0.2)

plt.tight_layout()
plt.savefig('contrarian_backtest.png', dpi=150, bbox_inches='tight')
plt.show()

# --- Summary Statistics ---
total_return = (equity['equity'].iloc[-1] / INITIAL_CAPITAL - 1) * 100
spy_return   = (spy_curve.iloc[-1] / INITIAL_CAPITAL - 1) * 100
max_dd       = drawdown.min()
print(f"\nStrategy Total Return : {total_return:.1f}%")
print(f"SPY Total Return      : {spy_return:.1f}%")
print(f"Max Drawdown          : {max_dd:.1f}%")

Figure 1. Strategy equity curve (blue) vs SPY benchmark (red dashed) from 2010–2024, with percentage drawdown panel below — note the sharp recovery bursts following Z-score signal clusters during high-volatility periods such as Q1 2020.

3. Results and Analysis

The rolling Z-score approach successfully identifies clusters of extreme moves — particularly around earnings seasons, macro shock events, and sector rotations. In testing across the 40-ticker universe from 2010 to 2024, Loser signals (Z < –2.0) are generated roughly 2–4% of trading days per ticker, meaning the strategy is selective rather than constantly in the market. This is intentional: you want to be positioned only when the statistical evidence for overreaction is strong.

The most instructive periods are regime transitions. During the 2020 COVID drawdown, the signal engine fired on nearly every ticker simultaneously — a rare synchronized panic event. The 60-day holding window captured a significant portion of the subsequent mean-reversion rally. Conversely, during steady trending bull markets (2013–2014, 2017), signal frequency drops sharply and the strategy sits largely idle, ceding ground to a passive SPY position. This is not a bug — it is the correct behavior for a mean-reversion system.

Realistic expectations for this strategy class: modest positive alpha in sideways or volatile markets, underperformance relative to a pure-long benchmark during strong uptrends, and meaningful drawdown risk during momentum-driven crashes where mean reversion takes longer than the 60-day window to materialize. The max drawdown figure — visible in the lower panel — is the most important number to internalize before trading this live. Sharp, brief drawdowns in volatile regimes are the cost of admission.
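
Since the discussion leans on drawdown and risk-adjusted return rather than total return alone, a small helper for those figures may be useful. This is a minimal sketch: the function name `performance_summary`, the 252-day annualization factor, and the zero risk-free rate are all assumptions, and the toy equity series is invented for illustration.

```python
import numpy as np
import pandas as pd

TRADING_DAYS_PER_YEAR = 252  # assumption: conventional annualization factor

def performance_summary(equity: pd.Series, periods_per_year: int = TRADING_DAYS_PER_YEAR) -> dict:
    """Total return, annualized volatility, Sharpe ratio (risk-free rate assumed zero), max drawdown."""
    daily = (equity / equity.shift(1) - 1).dropna()
    drawdown = equity / equity.cummax() - 1
    return {
        "total_return": equity.iloc[-1] / equity.iloc[0] - 1,
        "annual_vol": daily.std() * np.sqrt(periods_per_year),
        "sharpe": daily.mean() / daily.std() * np.sqrt(periods_per_year),
        "max_drawdown": drawdown.min(),
    }

# Toy equity curve: +10%, -10%, +10%
toy_equity = pd.Series(
    [100.0, 110.0, 99.0, 108.9],
    index=pd.bdate_range("2024-01-01", periods=4),
)
summary = performance_summary(toy_equity)
for name, value in summary.items():
    print(f"{name:>13}: {value: .4f}")
```

Applied to the backtest output, the call would be `performance_summary(equity['equity'])`, and the same function on `spy_curve` gives a like-for-like comparison with the benchmark.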

4. Use Cases

Volatility regime overlay: Use the Z-score signal density as a real-time fear gauge. When 10+ tickers simultaneously breach the –2.0 threshold, the market is pricing in systemic panic — a historically favorable entry environment for contrarian long positions in index ETFs.
Pairs and sector rotation: Adapt the long-short framework to sector ETFs (XLF, XLE, XLK) rather than individual stocks to reduce idiosyncratic risk while preserving the mean-reversion logic.
Risk management trigger: Integrate Z-score extremes as a position-sizing signal in an existing portfolio — scale into oversold names that also meet fundamental quality screens (low debt, positive free cash flow) to add behavioral edge on top of factor exposure.
Research and hypothesis testing: The modular structure of the backtest engine makes it straightforward to swap in alternative signals (RSI extremes, volume-adjusted returns) or different holding windows, turning this into a general-purpose behavioral finance laboratory.
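
The first use case, signal density as a fear gauge, reduces to counting how many tickers sit below the threshold on each day. The sketch below assumes a table of Z-scores with one column per ticker; the function names, the toy tickers, and the `min_tickers` parameter are illustrative assumptions.

```python
import pandas as pd

def loser_breadth(zscores: pd.DataFrame, threshold: float = 2.0) -> pd.Series:
    """Number of tickers per day whose Z-score is below -threshold."""
    return (zscores < -threshold).sum(axis=1)

def panic_days(zscores: pd.DataFrame, threshold: float = 2.0, min_tickers: int = 10) -> pd.DatetimeIndex:
    """Dates on which at least `min_tickers` tickers breach the negative threshold."""
    breadth = loser_breadth(zscores, threshold)
    return breadth[breadth >= min_tickers].index

# Toy Z-scores for three hypothetical tickers over three days
toy_z = pd.DataFrame(
    {"AAA": [-0.5, -2.4, -3.1], "BBB": [0.3, -2.2, 1.0], "CCC": [1.1, -0.4, -2.6]},
    index=pd.bdate_range("2024-01-01", periods=3),
)
breadth = loser_breadth(toy_z, threshold=2.0)
flagged = panic_days(toy_z, threshold=2.0, min_tickers=2)
print(breadth)
print(flagged)
```

With the article's data, something like `panic_days(zscore_df.drop(columns=BENCHMARK), ZSCORE_THRESHOLD, 10)` would list the days matching the "10+ tickers" condition described above.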

5. Limitations and Edge Cases

Survivorship bias. The ticker universe is defined upfront using current S&P 500 constituents. This means the backtest systematically excludes companies that were delisted, went bankrupt, or were removed from the index during the sample period — overstating historical performance. A production-grade implementation requires a point-in-time constituent database.

Transaction costs and slippage. The backtest assumes frictionless execution. In practice, stocks that trigger extreme Z-score signals are often in the midst of a sharp move, making fill prices unpredictable. Spreads widen, liquidity thins, and slippage can consume a significant portion of the theoretical edge — especially for smaller-cap names.
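
A rough way to see how much of the edge costs can consume is to subtract a fixed round-trip charge from each trade's gross return. The 5 basis points per side used below is a placeholder assumption, not a measured figure, and a flat charge ignores the spread widening described above.

```python
def net_trade_return(gross_return: float, cost_bps_per_side: float = 5.0) -> float:
    """Gross trade return minus an entry and an exit cost, each quoted in basis points."""
    round_trip_cost = 2 * cost_bps_per_side / 10_000
    return gross_return - round_trip_cost

# A hypothetical 2% gross gain under three cost assumptions
for bps in (0.0, 5.0, 25.0):
    print(f"{bps:>5.1f} bps/side -> net return {net_trade_return(0.02, bps):.4f}")
```

Because the strategy opens a new position on every signal, even a small per-trade charge compounds across thousands of trades, so it is worth rerunning the backtest with a cost term before reading much into the equity curve.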

Regime dependency. Mean reversion is not a universal law. In strong momentum environments — persistent bull markets or structural sector downturns — oversold stocks continue lower and the short leg (Winner) continues higher. The 60-day holding window is too short to survive extended trending regimes, and the strategy can string together multiple consecutive losing trades.

Z-score normality assumption. The rolling Z-score assumes that returns are approximately normally distributed within the lookback window. In practice, return distributions are fat-tailed and skewed, particularly around earnings and macro events. A Z-score of –2.0 during normal times represents a different probability than the same score during a volatility spike.

Capital allocation naivety. The backtest sizes every position as total capital divided by the number of positions open on that day, so position size shifts whenever a trade opens or closes. In real markets, position sizing should account for volatility (e.g., ATR-based sizing), correlation between open trades, and portfolio-level risk limits.
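
As one illustration of volatility-aware sizing, the sketch below weights each ticker by the inverse of its recent return volatility. It uses the standard deviation of log returns as a simpler stand-in for the ATR-based sizing mentioned above, and it ignores correlations entirely; the function name and toy tickers are assumptions.

```python
import numpy as np
import pandas as pd

def inverse_vol_weights(log_returns: pd.DataFrame, window: int = 60) -> pd.Series:
    """Weights proportional to 1 / (std of the last `window` returns), summing to 1."""
    vol = log_returns.tail(window).std()
    inv = 1.0 / vol.replace(0.0, np.nan)  # avoid dividing by zero for flat series
    return (inv / inv.sum()).fillna(0.0)

# Hypothetical ticker BBB is exactly twice as volatile as AAA
toy_returns = pd.DataFrame({
    "AAA": [0.01, -0.01, 0.01, -0.01],
    "BBB": [0.02, -0.02, 0.02, -0.02],
})
weights = inverse_vol_weights(toy_returns, window=4)
print(weights)
```

Here the calmer ticker receives two thirds of the capital and the more volatile one receives one third, so each contributes a similar amount of risk on a standalone basis.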

Concluding Thoughts

The De Bondt and Thaler overreaction hypothesis remains a productive framework for thinking about equity markets — not because it guarantees profits, but because it anchors strategy design in observable human behavior. Fear and greed still move prices beyond fair value. The rolling Z-score gives us a disciplined, repeatable way to measure when that has happened.

The most valuable takeaway from this implementation is not a specific return figure but the analytical process: quantify extremes, define a recovery window, benchmark against passive alternatives, and stress-test against realistic costs. Adjusting the Z-score threshold between 1.5 and 3.0, or varying the holding window from 20 to 120 days, will reveal the sensitivity of the edge and help you understand exactly what market behavior you are betting on.
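
The sensitivity check described above can be sketched as a small grid sweep. The example below runs on synthetic random returns for three hypothetical tickers, so no real edge should appear in the output; the function name `sweep` and its columns are assumptions, and it reports average forward returns after Loser signals rather than a full portfolio simulation.

```python
import itertools
import numpy as np
import pandas as pd

# Synthetic i.i.d. returns for three hypothetical tickers (illustration only)
rng = np.random.default_rng(0)
toy_dates = pd.bdate_range("2020-01-01", periods=500)
toy_rets = pd.DataFrame(rng.normal(0, 0.01, size=(500, 3)),
                        index=toy_dates, columns=["AAA", "BBB", "CCC"])
toy_prices = 100 * np.exp(toy_rets.cumsum())

def sweep(prices, thresholds, holding_windows, z_window=60):
    """Signal count and mean forward log return after Loser signals, per parameter pair."""
    log_ret = np.log(prices / prices.shift(1))
    z = (log_ret - log_ret.rolling(z_window).mean()) / log_ret.rolling(z_window).std()
    rows = []
    for thr, hold in itertools.product(thresholds, holding_windows):
        fwd = np.log(prices.shift(-hold) / prices)
        picked = fwd.where(z < -thr).to_numpy()
        picked = picked[~np.isnan(picked)]  # keep signal cells that have a future price
        rows.append({
            "threshold": thr,
            "holding_days": hold,
            "n_signals": int(picked.size),
            "mean_fwd_log_ret": picked.mean() if picked.size else np.nan,
        })
    return pd.DataFrame(rows)

grid = sweep(toy_prices, thresholds=(1.5, 2.0, 3.0), holding_windows=(20, 60, 120))
print(grid)
```

Watching how `n_signals` collapses as the threshold rises is a useful reminder that the tightest settings rest on very few observations, which makes their average returns noisy.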

If you want to extend this further, consider adding a Hidden Markov Model layer to classify the current volatility regime before enabling signals — only trading the contrarian strategy when the HMM identifies a "mean-reverting" state, and switching to momentum when it identifies a trending regime. That combination is where behavioral finance and machine learning start to produce genuinely robust systematic strategies.