Ayrat Murtazin
Monte Carlo Stock Price Simulation with Historical and Implied Volatility in Python

Forecasting future stock prices is not about prediction — it is about quantifying the distribution of possible outcomes. Monte Carlo simulation, powered by Geometric Brownian Motion (GBM), gives you exactly that: a probabilistic map of where a stock could trade over a given horizon, expressed through thousands of simulated paths. This technique sits at the foundation of modern options pricing, risk management, and portfolio stress-testing.

In this article, you will build a full Monte Carlo simulation engine in Python using real market data from Yahoo Finance. The implementation covers two parallel models — one driven by historical volatility (HV) and one by Black-Scholes implied volatility (IV) — so you can compare how different volatility inputs reshape the simulation output. You will generate confidence cones, compute tail probabilities, and visualize the full final-price distribution across 10,000 simulated scenarios.


Most algo trading content gives you theory.
This gives you the code.

3 Python strategies. Fully backtested. Colab notebook included.
Plus a free ebook with 5 more strategies the moment you subscribe.

5,000 quant traders already run these:

Subscribe | AlgoEdge Insights


This article covers:

  • Section 1 — Core Concepts: What Geometric Brownian Motion is, how volatility drives simulation, and why HV and IV differ
  • Section 2 — Python Implementation: Full code pipeline from data download through simulation, comparison, and visualization
  • 2.1 Setup and Parameters
  • 2.2 Fetching Data and Computing Historical Volatility
  • 2.3 Running the Monte Carlo Simulation
  • 2.4 Visualization — Confidence Cones and Final Price Distribution
  • Section 3 — Interpreting the Results: What the simulation output actually tells you and how to read the confidence intervals
  • Section 4 — Use Cases: Practical applications across options trading, risk management, and portfolio analysis
  • Section 5 — Limitations and Edge Cases: Where GBM-based simulation breaks down and what to watch for

1. Geometric Brownian Motion and the Role of Volatility

Geometric Brownian Motion is the mathematical engine behind the most widely used models in quantitative finance, including the Black-Scholes options pricing framework. At its core, GBM describes how an asset price evolves through two competing forces: a deterministic drift term that represents the expected return, and a stochastic diffusion term that represents randomness scaled by volatility. Every tick of the clock introduces a new random shock, and the cumulative effect of those shocks over time produces the wide fan of possible price paths you see in a Monte Carlo simulation.

The discrete-time version of GBM that we use in simulation is written as:

S(t+1) = S(t) * exp((μ - σ²/2) * dt + σ * √dt * Z)

Where μ is the annualized drift (typically estimated from historical returns), σ is the annualized volatility, dt is the time step, and Z is a draw from a standard normal distribution. The σ²/2 correction term — known as the Itô correction — prevents the expected value of the log price from drifting upward faster than it should when volatility is high.
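The Itô correction is easy to verify numerically. A minimal sketch (with illustrative parameters, not the article's configuration): because E[exp(σ√T·Z)] = exp(σ²T/2) for standard normal Z, subtracting σ²/2 from the drift inside the exponent is exactly what keeps the expected terminal price at S0·exp(μT):

```python
import numpy as np

rng = np.random.default_rng(0)
S0, mu, sigma, T, n = 100.0, 0.08, 0.40, 1.0, 500_000

# One-step GBM to the horizon: log S_T = log S0 + (mu - sigma^2/2)T + sigma*sqrt(T)*Z
Z = rng.standard_normal(n)
S_T = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

print(S_T.mean())           # close to the analytic expectation below
print(S0 * np.exp(mu * T))  # E[S_T] = S0 * exp(mu * T), ~108.33
```

Drop the `- 0.5 * sigma**2` term and the sample mean overshoots the analytic expectation by a factor of roughly exp(σ²T/2).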

Now, the critical design decision in any Monte Carlo simulation is: which volatility estimate do you use? Historical volatility (HV) measures how much a stock has moved in the past, typically computed as the annualized standard deviation of daily log returns over a rolling window. Implied volatility (IV) is extracted from options market prices and reflects what the market collectively expects future volatility to be. HV is backward-looking. IV is forward-looking. They often diverge — especially around earnings events, macro releases, or periods of market stress — and that divergence has meaningful consequences for the shape and spread of your simulation output. Running both side by side lets you observe exactly how sensitive your probability estimates are to the volatility input.

2. Python Implementation

2.1 Setup and Parameters

The simulation is fully parameterized so you can adapt it to any ticker, time horizon, or volatility assumption. The key inputs are the ticker symbol, the lookback window for computing historical volatility, the number of simulated paths, the forecast horizon in trading days, and the implied volatility override used in the second model.

import numpy as np
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt
from scipy.stats import norm

# ── Configuration ──────────────────────────────────────────────────
TICKER          = "ASML"       # Stock to simulate
LOOKBACK_DAYS   = 252          # Trading days used to compute HV
N_SIMULATIONS   = 10_000       # Number of Monte Carlo paths
HORIZON_DAYS    = 63           # Forecast horizon (~1 quarter)
IMPLIED_VOL     = 0.32         # Annualized IV override (from options chain)
RISK_FREE_RATE  = 0.05         # Annualized risk-free rate used as drift proxy
CONFIDENCE_LOW  = 0.16         # Lower bound of 68% confidence band
CONFIDENCE_HIGH = 0.84         # Upper bound of 68% confidence band
TAIL_LOW        = 0.025        # Lower bound of 95% confidence band
TAIL_HIGH       = 0.975        # Upper bound of 95% confidence band
RANDOM_SEED     = 42


2.2 Fetching Data and Computing Historical Volatility

This section downloads adjusted closing prices, computes daily log returns, and derives the annualized historical volatility. The drift estimate uses the mean daily log return scaled to annual frequency. Both HV and the drift are computed from the same return series to keep the two models comparable.

# ── Download price data ────────────────────────────────────────────
raw = yf.download(TICKER, period="2y", auto_adjust=True, progress=False)
prices = raw["Close"].dropna().squeeze()   # squeeze() guards against newer yfinance returning a one-column DataFrame

# ── Log returns ────────────────────────────────────────────────────
log_returns = np.log(prices / prices.shift(1)).dropna()

# ── Historical volatility (annualized) ────────────────────────────
hv = log_returns.tail(LOOKBACK_DAYS).std() * np.sqrt(252)
mu = log_returns.tail(LOOKBACK_DAYS).mean() * 252   # Annualized drift

S0 = float(prices.iloc[-1])   # Starting price = last close
dt = 1 / 252                  # Daily time step

print(f"Ticker        : {TICKER}")
print(f"Last Close    : ${S0:,.2f}")
print(f"Hist. Vol     : {hv:.2%}")
print(f"Implied Vol   : {IMPLIED_VOL:.2%}")
print(f"Annual Drift  : {mu:.2%}")
print(f"Horizon       : {HORIZON_DAYS} trading days")

2.3 Running the Monte Carlo Simulation

The simulation loop advances the price forward one day at a time using the GBM update equation. Both the HV model and the IV model use the same random shocks — controlled by the fixed seed — so any difference in the output is attributable entirely to the volatility input, not sampling noise.

def run_gbm_simulation(S0, mu, sigma, dt, horizon, n_sims, seed=RANDOM_SEED):
    """
    Simulate n_sims price paths of length `horizon` days using GBM.
    Returns array of shape (horizon + 1, n_sims).
    """
    np.random.seed(seed)
    paths = np.zeros((horizon + 1, n_sims))
    paths[0] = S0

    shocks = np.random.standard_normal((horizon, n_sims))

    for t in range(1, horizon + 1):
        paths[t] = paths[t - 1] * np.exp(
            (mu - 0.5 * sigma ** 2) * dt
            + sigma * np.sqrt(dt) * shocks[t - 1]
        )
    return paths

# ── Run both models ────────────────────────────────────────────────
paths_hv = run_gbm_simulation(S0, mu, float(hv), dt, HORIZON_DAYS, N_SIMULATIONS)
paths_iv = run_gbm_simulation(S0, mu, IMPLIED_VOL,  dt, HORIZON_DAYS, N_SIMULATIONS)

# ── Summary statistics at horizon end ─────────────────────────────
for label, paths in [("Historical Vol", paths_hv), ("Implied Vol", paths_iv)]:
    finals = paths[-1]
    print(f"\n── {label} ──")
    print(f"  Mean final price : ${finals.mean():,.2f}")
    print(f"  Median           : ${np.median(finals):,.2f}")
    print(f"  5th percentile   : ${np.percentile(finals, 5):,.2f}")
    print(f"  95th percentile  : ${np.percentile(finals, 95):,.2f}")
    print(f"  P(above ${S0:.0f})   : {(finals > S0).mean():.2%}")

2.4 Visualization

The chart below plots the confidence cone for both models side by side — the shaded bands show where 68% and 95% of simulated paths fall at each point in time, with the median path drawn as a solid line. A second panel shows the final-price histogram across all 10,000 paths for each model, making the difference in spread immediately visible.

plt.style.use("dark_background")
time_axis = np.arange(HORIZON_DAYS + 1)

fig, axes = plt.subplots(1, 2, figsize=(16, 6))
fig.suptitle(f"{TICKER} — Monte Carlo Simulation: HV vs IV  ({N_SIMULATIONS:,} paths)",
             fontsize=14, fontweight="bold", y=1.02)

model_configs = [
    ("Historical Volatility", paths_hv, "#00bfff", axes[0]),
    ("Implied Volatility",    paths_iv, "#ff6b6b", axes[1]),
]

for title, paths, color, ax in model_configs:
    median  = np.percentile(paths, 50, axis=1)
    mean    = paths.mean(axis=1)
    lo_68   = np.percentile(paths, CONFIDENCE_LOW  * 100, axis=1)
    hi_68   = np.percentile(paths, CONFIDENCE_HIGH * 100, axis=1)
    lo_95   = np.percentile(paths, TAIL_LOW        * 100, axis=1)
    hi_95   = np.percentile(paths, TAIL_HIGH       * 100, axis=1)

    ax.fill_between(time_axis, lo_95, hi_95, alpha=0.15, color=color, label="95% CI")
    ax.fill_between(time_axis, lo_68, hi_68, alpha=0.30, color=color, label="68% CI")
    ax.plot(time_axis, median, color=color,   lw=2,   label="Median")
    ax.plot(time_axis, mean,   color="white", lw=1.2, ls="--", label="Mean")
    ax.axhline(S0, color="grey", lw=0.8, ls=":")

    ax.set_title(title, fontsize=12)
    ax.set_xlabel("Trading Days Forward")
    ax.set_ylabel("Simulated Price ($)")
    ax.legend(fontsize=9)
    ax.grid(alpha=0.15)

plt.tight_layout()
plt.savefig("confidence_cone_hv_vs_iv.png", dpi=150, bbox_inches="tight")
plt.show()

# ── Final price distribution ───────────────────────────────────────
fig2, ax = plt.subplots(figsize=(12, 5))
ax.hist(paths_hv[-1], bins=120, alpha=0.5, color="#00bfff", label=f"HV ({float(hv):.1%})")
ax.hist(paths_iv[-1], bins=120, alpha=0.5, color="#ff6b6b", label=f"IV ({IMPLIED_VOL:.1%})")
ax.axvline(S0, color="white", lw=1.5, ls="--", label=f"Current Price ${S0:.0f}")
ax.set_title(f"{TICKER} — Final Price Distribution at Day {HORIZON_DAYS}")
ax.set_xlabel("Price ($)")
ax.set_ylabel("Frequency")
ax.legend()
ax.grid(alpha=0.15)
plt.tight_layout()
plt.savefig("final_price_distribution.png", dpi=150, bbox_inches="tight")
plt.show()

Figure 1. Left and right panels show the 68% and 95% confidence cones for the HV and IV models respectively; wider bands in the IV panel indicate that options markets are pricing in greater forward uncertainty than realized history suggests.



3. Interpreting the Simulation Output

The most important thing to understand about Monte Carlo output is that it does not give you a forecast — it gives you a probability distribution. When the HV model produces a 95th-percentile price of $980 and the IV model produces $1,040 for the same horizon, the difference is telling you something concrete: the options market is pricing in a wider range of outcomes than the trailing 252-day return history would suggest. That gap is a signal worth investigating before entering a position.

The confidence cone chart makes this divergence visible at every point in time, not just at expiry. Notice how the bands widen nonlinearly — this is the square-root-of-time effect baked into GBM. Uncertainty compounds. A 30-day horizon is only about twice as wide as a 7-day horizon (√(30/7) ≈ 2.1), not the four-plus times a linear extrapolation would give, which has practical implications for how far out you should use these simulations with any confidence.
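The square-root scaling can be checked directly: in GBM the confidence band half-width in log-price space is σ√t, so the ratio of band widths depends only on the ratio of horizons. A quick check with an illustrative σ:

```python
import numpy as np

# In GBM the log-price band half-width scales with sigma * sqrt(t),
# so widening is sub-linear in the horizon length.
sigma = 0.32
w_7  = sigma * np.sqrt(7 / 252)    # 68% half-width (log space) at 7 days
w_30 = sigma * np.sqrt(30 / 252)   # same at 30 days
print(w_30 / w_7)                  # sqrt(30/7) ~ 2.07, not 30/7 ~ 4.3
```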

The final-price histogram is arguably the most actionable output. You can read directly from it: what fraction of the 10,000 paths finished above a target price, below a stop level, or within a defined range. If you are long a call option struck at 110% of the current price, the histogram immediately shows you the simulated probability of that option finishing in-the-money under each volatility regime — no closed-form formula required.
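That read-off reduces to a single vectorized comparison. A standalone sketch with illustrative parameters (not live ASML data) — for a European-style payoff you do not even need full paths, since the terminal price can be drawn in one step:

```python
import numpy as np

rng = np.random.default_rng(42)
S0, mu, sigma = 100.0, 0.05, 0.32
T = 63 / 252                                   # ~1 quarter in years

# Draw terminal prices directly (valid for payoffs that depend only on S_T)
Z = rng.standard_normal(100_000)
finals = S0 * np.exp((mu - 0.5 * sigma**2) * T + sigma * np.sqrt(T) * Z)

strike = 1.10 * S0
p_itm = (finals > strike).mean()               # simulated P(call finishes ITM)
print(f"P(S_T > {strike:.0f}) = {p_itm:.2%}")
```

For path-dependent payoffs (barriers, Asians), you would of course keep the full day-by-day paths from `run_gbm_simulation` instead.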

4. Use Cases

  • Options strategy evaluation. Overlay strike prices on the final distribution histogram to read off simulated probabilities of finishing in- or out-of-the-money. Compare HV and IV versions to see whether the options market is cheap or rich relative to realized history.

  • Risk management and position sizing. Use the 5th-percentile path as a stress scenario. If the simulated worst-case outcome at a 63-day horizon represents a loss larger than your risk budget, you have a quantitative basis to reduce position size before entry.

  • Earnings event modeling. Swap in the at-the-money implied vol from the options chain immediately before an earnings announcement and observe how the confidence cone expands relative to the quiet-period HV baseline. This is a standard technique for visualizing event-driven risk.

  • Portfolio-level simulation. Extend the engine to simulate multiple correlated assets simultaneously using a Cholesky decomposition of the return covariance matrix. The same GBM logic applies; only the shock generation step changes.
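The shock-generation change mentioned in the last bullet is small. A minimal sketch with a hypothetical three-asset correlation matrix (values are illustrative): multiplying i.i.d. normal draws by the Cholesky factor L gives shocks with covariance L·Lᵀ = corr, which then feed into the GBM update in place of the independent draws:

```python
import numpy as np

rng = np.random.default_rng(7)

# Hypothetical correlation matrix for three assets (illustrative values)
corr = np.array([[1.0, 0.6, 0.3],
                 [0.6, 1.0, 0.4],
                 [0.3, 0.4, 1.0]])
L = np.linalg.cholesky(corr)

# Each row of iid @ L.T has covariance L @ L.T == corr
iid = rng.standard_normal((100_000, 3))
shocks = iid @ L.T

# Empirical correlation of the simulated shocks recovers the target
print(np.corrcoef(shocks, rowvar=False).round(2))
```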

5. Limitations and Edge Cases

GBM assumes log-normally distributed returns. Real equity returns have fat tails — large moves occur far more frequently than the normal distribution predicts. This means your 95% confidence bands will be too narrow during market stress, and tail-risk probabilities will be systematically underestimated.
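A quick illustration of how badly a normal tail understates large moves, using a Student-t with three degrees of freedom as a common stand-in for fat-tailed equity returns (the df=3 choice is illustrative, not calibrated):

```python
from scipy.stats import norm, t

# Two-sided probability of a move beyond 3 "sigmas" under each distribution
p_normal = 2 * norm.sf(3)          # normal tail beyond |z| = 3
p_fat    = 2 * t.sf(3, df=3)       # fat-tailed Student-t, df = 3

print(f"Normal : {p_normal:.4%}")
print(f"t(df=3): {p_fat:.4%}  ({p_fat / p_normal:.0f}x more likely)")
```

The order-of-magnitude gap is the reason GBM-based 95% bands look fine in calm markets and fail precisely when you need them.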

Volatility is not constant. GBM uses a single fixed sigma for the entire simulation horizon. In reality, volatility clusters and mean-reverts. Stochastic volatility models such as Heston address this, but add significant implementation complexity. If implied vol is elevated today, GBM will carry that level unchanged for the full 63-day horizon — which is rarely realistic.

The drift estimate is noisy. Estimating μ from 252 days of returns introduces enormous sampling error. A reasonable alternative is to set drift equal to the risk-free rate (risk-neutral measure), which is standard in options pricing contexts and removes the unreliable return estimation problem entirely.
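The scale of that sampling error is easy to quantify: the standard error of an annualized drift estimated from n daily returns is σ_annual / √(n/252). A quick check with an illustrative σ shows why one year of history tells you almost nothing about μ:

```python
import numpy as np

# SE(mu_hat) = sigma_annual / sqrt(years of data). With one year of history
# the drift uncertainty is as large as the volatility itself.
sigma_annual = 0.32
for n_days in (252, 252 * 5, 252 * 20):
    se = sigma_annual / np.sqrt(n_days / 252)
    print(f"{n_days:>5} days of history -> SE(drift) = {se:.1%}")
```

A ±32% standard error around an estimated drift of a few percent is why risk-neutral drift (μ = r) is the standard choice in pricing contexts.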

The IV input is a single point, not a surface. Using the at-the-money 30-day implied vol as a flat sigma ignores the volatility skew and term structure embedded in the full options surface. For more accurate strike-specific probabilities, you would need to sample from a vol surface calibrated to market prices.

Survivor bias in backtesting. If you calibrate your simulation parameters on a stock that has already performed well, your drift estimate will be upward-biased. Always consider out-of-sample periods when validating the model's probability calibration.

Concluding Thoughts

Monte Carlo simulation with GBM is one of the most practical tools in a quant's toolkit precisely because it converts volatility — an abstract statistical concept — into a concrete probability map. By running both historical and implied volatility variants side by side, you gain immediate intuition about where realized history and market expectations diverge, which is often where the most interesting risk-reward opportunities live.

The implementation presented here is a working foundation, not a finished product. The natural next steps are to replace the constant-vol assumption with a mean-reverting volatility process, incorporate correlated multi-asset paths via Cholesky decomposition, or calibrate the simulation to the full implied volatility surface rather than a single ATM point. Each extension brings the model closer to production-grade risk tooling.

If you found this useful, the next articles in this series cover the Heston stochastic volatility model, volatility surface construction from live options chains, and Monte Carlo pricing of path-dependent exotic options — all built from scratch in Python. Follow along to keep building.


