Have you ever wanted to use real stock options data to estimate a stock's expected move and find the "max pain" point for option sellers? In this post, we'll walk through a Python script that does just that β using basic options CSV data and some neat tricks with pandas
and numpy
.
π§ What's the Idea?
Options trading isn't just speculation β it also contains valuable information about market expectations. Using open interest and implied volatility, we can estimate:
- The expected price range of a stock by expiration.
- The Max Pain strike β the price where most option holders (especially retail) lose money.
π§Ύ What You'll Need
- Two CSV files:
calls.csv
andputs.csv
, each space-delimited. - Python 3 with
pandas
andnumpy
installed.
𧬠Sample Input Data
π¦ calls.csv
(truncated)
Contract Name Last Trade Date (EDT) Strike Last Price Bid Ask Change % Change Volume Open Interest Implied Volatility
LLY250815C00350000 ... 350 410.00 388.40 395.85 ... 1 1 98.14%
LLY250815C00400000 ... 400 343.46 351.85 356.90 ... 112 58 114.91%
π₯ puts.csv
(truncated)
Contract Name Last Trade Date (EDT) Strike Last Price Bid Ask Change % Change Volume Open Interest Implied Volatility
LLY250815P00350000 ... 350 1.51 0.03 1.80 ... 1 31 75.27%
LLY250815P00400000 ... 400 1.02 0.32 2.00 ... 1 68 65.16%
π§βπ» Python Code Walkthrough
import pandas as pd
import numpy as np
def clean_options_data(filepath):
# Automatically infer whitespace columns and clean headers
df = pd.read_csv(filepath, sep=None, engine='python')
df.columns = df.columns.str.strip()
df.replace('-', np.nan, inplace=True)
# Process Implied Volatility
if 'Implied Volatility' not in df.columns:
print("β 'Implied Volatility' column not found. Check headers.")
return None
df['Implied Volatility'] = df['Implied Volatility'].str.replace('%', '', regex=False)
df['Implied Volatility'] = pd.to_numeric(df['Implied Volatility'], errors='coerce') / 100
df['Strike'] = pd.to_numeric(df['Strike'], errors='coerce')
df['Open Interest'] = pd.to_numeric(df['Open Interest'], errors='coerce')
df.dropna(subset=['Strike', 'Open Interest', 'Implied Volatility'], inplace=True)
return df
This function loads and sanitizes the options data, ensuring percentages and missing values are properly handled.
π Estimate Expected Move
calls = clean_options_data("calls.csv")
puts = clean_options_data("puts.csv")
# Exit on error
if calls is None or puts is None:
print("Fix your CSV headers and retry.")
exit()
# ATM (At-the-Money) strike based on highest call open interest
atm_strike = calls.loc[calls['Open Interest'].idxmax(), 'Strike']
atm_iv = calls.loc[calls['Strike'] == atm_strike, 'Implied Volatility'].mean()
days_to_expiry = 75 # Use actual days to expiry
expected_move = atm_strike * atm_iv * np.sqrt(days_to_expiry / 365)
print(f"\nπ§ Estimated Stock Price: ${atm_strike:.2f}")
print(f"π Expected Β± Move in {days_to_expiry} days: ${expected_move:.2f}")
print(f"π Price Range: ${atm_strike - expected_move:.2f} to ${atm_strike + expected_move:.2f}")
π― Calculate Max Pain
The Max Pain price is where the total losses (for put and call holders) are minimized β this is often where market makers benefit most.
def max_pain(calls_df, puts_df):
strikes = sorted(set(calls_df['Strike']).union(set(puts_df['Strike'])))
total_pain = []
for strike in strikes:
call_pain = ((calls_df['Strike'] - strike).clip(lower=0) * calls_df['Open Interest']).sum()
put_pain = ((strike - puts_df['Strike']).clip(lower=0) * puts_df['Open Interest']).sum()
total_pain.append((strike, call_pain + put_pain))
pain_df = pd.DataFrame(total_pain, columns=['Strike', 'Total Pain'])
return pain_df.loc[pain_df['Total Pain'].idxmin(), 'Strike']
max_pain_strike = max_pain(calls, puts)
print(f"\nπ― Max Pain Strike: ${max_pain_strike:.2f}")
π§Ύ Output Example
β
Parsed headers: ['Contract Name', ..., 'Implied Volatility']
π§ Estimated Stock Price: $400.00
π Expected Β± Move in 75 days: $49.73
π Price Range: $350.27 to $449.73
π― Max Pain Strike: $390.00
π Takeaways
- You can estimate market expectations using just open interest and implied volatility.
- Max Pain is a powerful idea and sometimes a magnet for stock price behavior near expiry.
-
pandas
makes it incredibly easy to clean and analyze tabular data β even when messy.
π Whatβs Next?
You could extend this to:
- Plot open interest by strike.
- Animate price move cones.
- Combine with real-time price feeds using
yfinance
oralphavantage
.
π Repo Starter
Create a folder like:
/stocks_b
ββ calls.csv
ββ puts.csv
ββ analyze_options.py
Then run:
python analyze_options.py
Full Code
import pandas as pd
import numpy as np
def clean_options_data(filepath):
# Use Python engine to preserve multi-word headers like "Implied Volatility"
df = pd.read_csv(filepath, sep=None, engine='python')
# Strip any extra whitespace in headers
df.columns = df.columns.str.strip()
# Optional: print to confirm correct headers
print("β
Parsed headers:", df.columns.tolist())
# Replace '-' with NaN
df.replace('-', np.nan, inplace=True)
# Clean Implied Volatility
if 'Implied Volatility' not in df.columns:
print("β 'Implied Volatility' column not found. Check headers again.")
return None
df['Implied Volatility'] = df['Implied Volatility'].str.replace('%', '', regex=False)
df['Implied Volatility'] = pd.to_numeric(df['Implied Volatility'], errors='coerce') / 100
# Convert other numeric columns
df['Strike'] = pd.to_numeric(df['Strike'], errors='coerce')
df['Open Interest'] = pd.to_numeric(df['Open Interest'], errors='coerce')
# Drop rows missing key values
df.dropna(subset=['Strike', 'Open Interest', 'Implied Volatility'], inplace=True)
return df
# Load data
calls = clean_options_data("calls.csv")
puts = clean_options_data("puts.csv")
# Stop if headers are invalid
if calls is None or puts is None:
print("Fix your CSV headers and retry.")
exit()
# Estimate ATM strike and IV
atm_strike = calls.loc[calls['Open Interest'].idxmax(), 'Strike']
atm_iv = calls.loc[calls['Strike'] == atm_strike, 'Implied Volatility'].mean()
# Estimate move
days_to_expiry = 75
expected_move = atm_strike * atm_iv * np.sqrt(days_to_expiry / 365)
print(f"\nπ§ Estimated Stock Price: ${atm_strike:.2f}")
print(f"π Expected Β± Move in {days_to_expiry} days: ${expected_move:.2f}")
print(f"π Price Range: ${atm_strike - expected_move:.2f} to ${atm_strike + expected_move:.2f}")
# Max Pain
def max_pain(calls_df, puts_df):
strikes = sorted(set(calls_df['Strike']).union(set(puts_df['Strike'])))
total_pain = []
for strike in strikes:
call_pain = ((calls_df['Strike'] - strike).clip(lower=0) * calls_df['Open Interest']).sum()
put_pain = ((strike - puts_df['Strike']).clip(lower=0) * puts_df['Open Interest']).sum()
total_pain.append((strike, call_pain + put_pain))
pain_df = pd.DataFrame(total_pain, columns=['Strike', 'Total Pain'])
return pain_df.loc[pain_df['Total Pain'].idxmin(), 'Strike']
max_pain_strike = max_pain(calls, puts)
print(f"\nπ― Max Pain Strike: ${max_pain_strike:.2f}")
Top comments (0)