Claude Code for Data Science: Jupyter and Notebook Workflows

#jupyter #python #pandas #matplotlib

Originally published at claudeguide.io/claude-code-jupyter

Claude Code works with Jupyter notebooks through two paths: the CLI, which can read and run notebook cells, and direct .ipynb file editing, which treats notebook JSON as structured data. For most data science workflows, the most effective pattern is running Claude Code in the terminal alongside an open notebook: Claude generates code, you paste and run it, then iterate. This guide covers the patterns that work for EDA, visualization, and model development.
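To make "notebook JSON as structured data" concrete, here is a minimal sketch that appends a code cell to an .ipynb file using only the standard library. It assumes the nbformat 4 layout (a top-level `cells` list whose entries carry `cell_type`, `source`, `outputs`, and so on). The helper names `make_code_cell` and `append_code_cell` are hypothetical, and this illustrates the file format only, not how Claude Code edits notebooks internally.

```python
import json
import tempfile
from pathlib import Path


def make_code_cell(source: str) -> dict:
    """Build a code cell dict in the nbformat 4 layout (hypothetical helper)."""
    return {
        "cell_type": "code",
        "execution_count": None,
        "metadata": {},
        "outputs": [],
        # Notebooks store source as a list of lines, newlines included.
        "source": source.splitlines(keepends=True),
    }


def append_code_cell(notebook_path: Path, source: str) -> int:
    """Append a code cell to an .ipynb file and return the new cell count."""
    notebook = json.loads(notebook_path.read_text(encoding="utf-8"))
    notebook["cells"].append(make_code_cell(source))
    notebook_path.write_text(json.dumps(notebook, indent=1), encoding="utf-8")
    return len(notebook["cells"])


# Demo on a throwaway notebook so no real file is touched.
EMPTY_NOTEBOOK = {"cells": [], "metadata": {}, "nbformat": 4, "nbformat_minor": 4}

with tempfile.TemporaryDirectory() as tmp_dir:
    demo_path = Path(tmp_dir) / "demo.ipynb"
    demo_path.write_text(json.dumps(EMPTY_NOTEBOOK), encoding="utf-8")
    cell_count = append_code_cell(demo_path, "import pandas as pd\ndf = pd.DataFrame()\n")
    saved = json.loads(demo_path.read_text(encoding="utf-8"))
```

Because a notebook is plain JSON, an edit like this shows up as an ordinary text diff, which is what makes direct file editing practical alongside the paste-and-run loop.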

Setup: CLAUDE.md for Data Science Projects

# data-project CLAUDE.md

## Environment
- Python 3.12, Jupyter Lab
- Package manager: uv or pip
- Key packages: pandas, numpy, matplotlib, seaborn, scikit-learn, polars

## Data Conventions
- Raw data: data/raw/ (read-only, never modified)
- Processed: data/processed/
- Outputs: data/outputs/ (figures, reports)
- Column naming: snake_case
- Date columns: always parse as datetime, store UTC

## Code Style
- Type hints on functions
- Docstrings on public functions
- No magic numbers — name your constants
- DataFrames: prefer method chaining over intermediate variables

## Notebook Conventions
- First cell: imports only
- Second cell: configuration/constants
- Each analysis section: one markdown cell explaining what/why, then code cell(s)
- Save figures: always save to data/outputs/ in addition to displaying

## Testing
- Tests for data transformation functions: tests/
- Use pytest
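As a rough illustration of code that follows the conventions in this CLAUDE.md, the sketch below shows one transformation function with type hints, a docstring, a named constant, method chaining, and UTC date parsing, plus a pytest-style test. The function `clean_sales`, the constant `MIN_QUANTITY`, and the column names are assumptions made for the example, not part of any real project.

```python
import pandas as pd

# Named constant instead of a magic number (hypothetical threshold).
MIN_QUANTITY = 1


def clean_sales(raw: pd.DataFrame) -> pd.DataFrame:
    """Return a cleaned copy of a raw sales frame.

    Parses `date` as a UTC datetime, derives `revenue`, and drops rows
    with a quantity below MIN_QUANTITY. The input frame is not modified.
    """
    return (
        raw
        .assign(
            date=lambda d: pd.to_datetime(d["date"], utc=True),
            revenue=lambda d: d["quantity"] * d["price"],
        )
        .loc[lambda d: d["quantity"] >= MIN_QUANTITY]
        .reset_index(drop=True)
    )


def test_clean_sales_parses_utc_and_filters() -> None:
    """Would live in tests/ and be collected by pytest."""
    raw = pd.DataFrame(
        {
            "date": ["2026-01-05", "2026-01-06"],
            "quantity": [2, 0],
            "price": [10.0, 5.0],
        }
    )
    cleaned = clean_sales(raw)
    assert len(cleaned) == 1
    assert str(cleaned["date"].dt.tz) == "UTC"
    assert cleaned.loc[0, "revenue"] == 20.0
    # The raw frame keeps its original columns and string dates.
    assert "revenue" not in raw.columns
    assert isinstance(raw.loc[0, "date"], str)
```

Keeping transformations in small pure functions like this is what makes the "Tests for data transformation functions" rule workable: the notebook imports the function, and pytest exercises it on tiny hand-built frames.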

Pattern 1: EDA (Exploratory Data Analysis)

# Generate a complete EDA notebook for a dataset
claude "Write Jupyter notebook cells for EDA on this dataset:
File: data/raw/sales_2026.csv
Known columns: date, product_id, quantity, price, region, customer_id

Generate cells for:
1. Load and basic info (shape, dtypes, head)
2. Missing value analysis (heatmap + counts)
3. Distribution of numeric columns (histograms)
4. Time series: monthly revenue trend
5. Top 10 products by revenue
6. Regional breakdown (bar chart)

Use seaborn for plots, save each figure to data/outputs/.
Each section: markdown explanation cell + code cell."
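For a sense of what items 4 and 5 of that prompt might produce, here is a hedged sketch of the aggregation step only, with plotting left out. The tiny `sample` frame is made-up stand-in data rather than the contents of data/raw/sales_2026.csv, and computing revenue as quantity times price is an assumption about the dataset.

```python
import pandas as pd

TOP_N = 10

# Made-up stand-in rows using the column names from the prompt.
sample = pd.DataFrame(
    {
        "date": pd.to_datetime(
            ["2026-01-05", "2026-01-20", "2026-02-03", "2026-02-14"], utc=True
        ),
        "product_id": ["A", "B", "A", "C"],
        "quantity": [2, 1, 3, 5],
        "price": [10.0, 40.0, 10.0, 4.0],
    }
).assign(revenue=lambda d: d["quantity"] * d["price"])  # assumed definition

# Item 4: monthly revenue trend, keyed by "YYYY-MM" strings.
monthly_revenue = (
    sample
    .groupby(sample["date"].dt.strftime("%Y-%m"))["revenue"]
    .sum()
)

# Item 5: top products by total revenue, largest first.
top_products = (
    sample
    .groupby("product_id")["revenue"]
    .sum()
    .nlargest(TOP_N)
)
```

Both results are plain Series, so they can be passed to a seaborn line or bar plot and saved under data/outputs/ as the prompt requests.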

Generated output (example of one section):

# Cell: Missing Value Analysis
import missingno as msno
import matplotlib.pyplot as plt

# Count missing values
missing = df.isnull().sum()
missing_pct = (missing / len(df) * 100).sort_values(ascending=False)
missing_df = pd.DataFrame({'count': missing, 'pct': missing_pct})
print(missing_df[missing_df['count'] > 0])
