If you're fascinated by the options market or want to dive into financial data analysis using Python, this post is for you. We'll walk through a practical example that:
- Loads and cleans options data (calls and puts) from CSV files,
- Calculates key metrics like the At-The-Money (ATM) strike, expected price move, and Max Pain strike,
- Visualizes open interest and implied volatility across strike prices with clear, insightful charts.
Why This Matters
Options traders look at implied volatility and open interest to gauge market sentiment, liquidity, and price expectations. The Max Pain theory suggests that, near expiration, the stock price tends to gravitate toward the strike at which option holders collectively lose the most (equivalently, where option writers pay out the least). Some traders use this as one input when timing trades around expiry.
The Code Breakdown
Let's start by looking at the full code that performs all these steps, then we’ll break down what’s happening:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import seaborn as sns
def clean_options_data(filepath):
    # sep=None makes the Python engine detect the delimiter automatically
    df = pd.read_csv(filepath, sep=None, engine='python')
    df.columns = df.columns.str.strip()
    df.replace('-', np.nan, inplace=True)

    # Check for 'Implied Volatility' and clean it up
    if 'Implied Volatility' not in df.columns:
        print("❌ 'Implied Volatility' column not found. Check headers.")
        return None

    # Convert implied volatility from percent string to decimal float
    # (astype(str) keeps .str working even if the column was parsed as numeric)
    df['Implied Volatility'] = df['Implied Volatility'].astype(str).str.replace('%', '', regex=False)
    df['Implied Volatility'] = pd.to_numeric(df['Implied Volatility'], errors='coerce') / 100

    # Convert numeric columns, coercing errors to NaN
    df['Strike'] = pd.to_numeric(df['Strike'], errors='coerce')
    df['Open Interest'] = pd.to_numeric(df['Open Interest'], errors='coerce')

    # Drop rows missing critical numeric data
    df.dropna(subset=['Strike', 'Open Interest', 'Implied Volatility'], inplace=True)
    return df

# Load calls and puts data
calls = clean_options_data("calls.csv")
puts = clean_options_data("puts.csv")
if calls is None or puts is None:
    # exit() is meant for interactive sessions; SystemExit works in scripts too
    raise SystemExit("Fix your CSV headers and retry.")

# Calculate At-The-Money (ATM) strike and implied volatility
atm_strike = calls.loc[calls['Open Interest'].idxmax(), 'Strike']
atm_iv = calls.loc[calls['Strike'] == atm_strike, 'Implied Volatility'].mean()
# Define days to expiration and estimate expected move
days_to_expiry = 75
expected_move = atm_strike * atm_iv * np.sqrt(days_to_expiry / 365)
# Function to calculate Max Pain strike price
def max_pain(calls_df, puts_df):
    strikes = sorted(set(calls_df['Strike']).union(set(puts_df['Strike'])))
    total_pain = []
    for strike in strikes:
        # Treat each strike as a candidate settlement price: at expiry a call
        # is worth max(price - K, 0) and a put is worth max(K - price, 0)
        call_pain = ((strike - calls_df['Strike']).clip(lower=0) * calls_df['Open Interest']).sum()
        put_pain = ((puts_df['Strike'] - strike).clip(lower=0) * puts_df['Open Interest']).sum()
        total_pain.append((strike, call_pain + put_pain))
    pain_df = pd.DataFrame(total_pain, columns=['Strike', 'Total Pain'])
    # The Max Pain strike is where the total payout to option holders is smallest
    return pain_df.loc[pain_df['Total Pain'].idxmin(), 'Strike']

max_pain_strike = max_pain(calls, puts)
print(f"ATM Strike: {atm_strike}")
print(f"Expected Move: ±{expected_move:.2f}")
print(f"Max Pain Strike: {max_pain_strike}")
# Visualization
sns.set_theme(style="whitegrid")
fig, axes = plt.subplots(2, 1, figsize=(14, 10), sharex=True)
# Plot open interest for calls and puts
axes[0].bar(calls['Strike'] - 1, calls['Open Interest'], width=1.8, label='Calls OI', color='blue', alpha=0.6)
axes[0].bar(puts['Strike'] + 1, puts['Open Interest'], width=1.8, label='Puts OI', color='red', alpha=0.6)
axes[0].axvline(atm_strike, color='green', linestyle='--', label='ATM Strike')
axes[0].axvline(max_pain_strike, color='purple', linestyle='--', label='Max Pain Strike')
axes[0].set_ylabel('Open Interest')
axes[0].set_title('Open Interest by Strike Price')
axes[0].legend()
# Plot implied volatility for calls and puts
axes[1].plot(calls['Strike'], calls['Implied Volatility'], label='Calls IV', marker='o', color='blue')
axes[1].plot(puts['Strike'], puts['Implied Volatility'], label='Puts IV', marker='o', color='red')
axes[1].axvline(atm_strike, color='green', linestyle='--', label='ATM Strike')
axes[1].axvline(max_pain_strike, color='purple', linestyle='--', label='Max Pain Strike')
axes[1].axvspan(atm_strike - expected_move, atm_strike + expected_move, color='gray', alpha=0.2, label='Expected Move Range')
axes[1].set_xlabel('Strike Price')
axes[1].set_ylabel('Implied Volatility')
axes[1].set_title('Implied Volatility by Strike Price')
axes[1].legend()
plt.tight_layout()
plt.show()
What’s Happening?
Data Cleaning
The clean_options_data function reads the CSV flexibly: passing sep=None together with engine='python' lets pandas detect the delimiter on its own. It then strips whitespace from the column names, replaces lone dashes ('-') with NaN, and converts percent strings in the implied volatility column into decimal floats. Finally, it drops rows that are missing a strike, open interest, or implied volatility value.
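To see the cleaning steps in isolation, here is a minimal sketch that runs them on a small in-memory CSV. The semicolon delimiter and the three rows are made-up illustration data, not a real export; the column names are the ones the script expects.

```python
import io

import numpy as np
import pandas as pd

# Made-up sample rows (not real market data); '-' marks a missing value.
raw = io.StringIO(
    "Strike;Open Interest;Implied Volatility\n"
    "95;120;32.50%\n"
    "100;-;30.00%\n"
    "105;80;28.75%\n"
)

# sep=None asks the Python engine to detect the delimiter (';' here).
df = pd.read_csv(raw, sep=None, engine="python")
df.columns = df.columns.str.strip()
df = df.replace("-", np.nan)

# Percent strings become decimals, e.g. "32.50%" -> 0.325.
iv = df["Implied Volatility"].astype(str).str.replace("%", "", regex=False)
df["Implied Volatility"] = pd.to_numeric(iv, errors="coerce") / 100
df["Strike"] = pd.to_numeric(df["Strike"], errors="coerce")
df["Open Interest"] = pd.to_numeric(df["Open Interest"], errors="coerce")

# The row whose open interest was '-' is dropped here.
df = df.dropna(subset=["Strike", "Open Interest", "Implied Volatility"])
print(df)
```

Only the 95 and 105 strikes should survive, since the 100 strike has no open interest value.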
Key Metrics Calculation
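- ATM Strike: The script uses the call strike with the highest open interest as a stand-in for the ATM strike. Strictly speaking, the ATM strike is the one closest to the current stock price, so treat this as a rough proxy that only works when open interest clusters near the spot price.
- Expected Move: Using ATM implied volatility and days to expiration, we estimate the one-standard-deviation price move implied by option prices: reference price × IV × √(days / 365), with the ATM strike standing in for the stock price.
- Max Pain: We find the strike at which the total intrinsic value paid out to option holders at expiry is smallest, which is where holders collectively lose the most. Some traders use it to anticipate stock price behavior near expiry.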
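The two calculations above can be checked by hand on a toy option chain. This is a sketch under stated assumptions: the open interest figures, `reference_price`, and `atm_iv` are invented for illustration, and `payout_at` is a hypothetical helper name rather than part of the script.

```python
import numpy as np
import pandas as pd

# Made-up open interest, for illustration only.
calls = pd.DataFrame({"Strike": [95, 100, 105], "Open Interest": [100, 300, 200]})
puts = pd.DataFrame({"Strike": [95, 100, 105], "Open Interest": [250, 150, 50]})


def payout_at(price, calls_df, puts_df):
    """Total intrinsic value owed to option holders if the stock settles at `price`."""
    call_value = ((price - calls_df["Strike"]).clip(lower=0) * calls_df["Open Interest"]).sum()
    put_value = ((puts_df["Strike"] - price).clip(lower=0) * puts_df["Open Interest"]).sum()
    return call_value + put_value


# Evaluate every listed strike as a candidate settlement price.
candidates = sorted(set(calls["Strike"]) | set(puts["Strike"]))
payouts = {k: payout_at(k, calls, puts) for k in candidates}
max_pain_strike = min(payouts, key=payouts.get)

# One-standard-deviation move: price * IV * sqrt(days / 365).
# reference_price and atm_iv are assumed values, not market data.
reference_price, atm_iv, days_to_expiry = 100.0, 0.30, 75
expected_move = reference_price * atm_iv * np.sqrt(days_to_expiry / 365)

print(payouts)
print(max_pain_strike, round(expected_move, 2))
```

With these numbers the payouts are 1250 at 95, 750 at 100, and 2500 at 105, so the Max Pain strike is 100. The contract multiplier is left out because it scales every candidate equally and does not change which strike wins.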
Visualization
We plot two charts stacked vertically:
- Open Interest shows how many contracts are open at each strike price — split between calls and puts.
- Implied Volatility tracks market expectations of volatility by strike price for calls and puts.
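Both charts mark the ATM strike and the Max Pain strike with dashed vertical lines, and the implied volatility chart also shades the expected move range. Together they show where open interest is concentrated and how implied volatility varies across strikes.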
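One practical caveat: the open interest chart hardcodes a bar offset of 1 and a width of 1.8, which suits strikes spaced about 5 apart but causes overlap when strikes are closer together. A possible workaround, sketched below, is to derive both values from the smallest gap between strikes. The helper name `bar_layout` and its `fill` parameter are hypothetical, not part of the original script.

```python
import numpy as np


def bar_layout(strikes, fill=0.9):
    """Return (offset, width) for two side-by-side bars per strike.

    `fill` is the fraction of the smallest strike gap that the pair occupies.
    """
    unique = np.unique(np.asarray(strikes, dtype=float))
    # Fall back to a gap of 1.0 when there is only one strike.
    gap = np.diff(unique).min() if unique.size > 1 else 1.0
    width = gap * fill / 2
    # Shifting each bar by half its width puts the pair edge to edge.
    return width / 2, width


offset, width = bar_layout([95, 100, 105, 110])
print(offset, width)
```

In the plotting code, the calls bars would then use `calls['Strike'] - offset` and the puts bars `puts['Strike'] + offset`, both with `width=width`.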
How to Use This
- Export your options data into CSV files with headers including Implied Volatility, Strike, and Open Interest.
- Run this Python script to clean, analyze, and visualize.
- Interpret the charts to understand where traders are most active, how volatility changes across strikes, and where the market might be “priced” to move.
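If you do not have an export handy, a dry run with synthetic files is a quick way to check that the headers line up. The sketch below writes two small CSVs into a temporary directory; every value in them is invented for illustration and should not be read as market data.

```python
import tempfile
from pathlib import Path

import pandas as pd

# Made-up rows for a dry run; not real market data.
sample_calls = pd.DataFrame({
    "Strike": [95, 100, 105],
    "Open Interest": [100, 300, 200],
    "Implied Volatility": ["32.50%", "30.00%", "28.75%"],
})
sample_puts = pd.DataFrame({
    "Strike": [95, 100, 105],
    "Open Interest": [250, 150, "-"],  # '-' mimics a missing value in an export
    "Implied Volatility": ["33.10%", "30.40%", "29.00%"],
})

out_dir = Path(tempfile.mkdtemp())
sample_calls.to_csv(out_dir / "calls.csv", index=False)
sample_puts.to_csv(out_dir / "puts.csv", index=False)

# Confirm the headers the script expects are present.
required = {"Strike", "Open Interest", "Implied Volatility"}
for name in ("calls.csv", "puts.csv"):
    header = pd.read_csv(out_dir / name, nrows=0).columns
    missing = required - set(header.str.strip())
    print(name, "missing columns:", sorted(missing))
```

Pointing the script at the files in `out_dir` should then run end to end, with the puts row containing '-' dropped during cleaning.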
Final Thoughts
This approach offers a neat window into the options market using just Python and some standard libraries. Whether you're a trader, data scientist, or financial analyst, this can serve as a strong foundation for deeper options analytics or automated strategies.