In this open-source research project, I explored the Pioneer Venus Orbiter (PVO) datasets to understand how ions escape from Venus’ upper atmosphere — a key question in planetary evolution.
This post walks through the process of building a full, reproducible analysis pipeline: from raw NASA data to cleaned visualizations, ready for Kaggle and GitHub.
🚀 1. Mission Background
The Pioneer Venus Orbiter (PVO), launched by NASA in 1978, carried instruments that measured the Venusian ionosphere, solar wind, and magnetic field.
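Ion escape, the loss of atmospheric ions to space, matters at Venus because the planet has no intrinsic magnetic field, so the solar wind interacts directly with its ionosphere. The process is thought to have contributed to the long-term loss of Venus' water, and so bears on how the planet reached its present hot, dry state.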
🧮 2. The Dataset
We use a concatenated dataset combining multiple PVO orbits, containing parameters like:
| Column | Description | Units |
|---|---|---|
| ORBIT | Orbit number | none |
| PVO_TIME | UTC timestamp | datetime |
| LAT, LON, ALT_km | Position of spacecraft | degrees / km |
| SZA | Solar Zenith Angle | degrees |
| SHA_hr | Solar Hour Angle | hours |
| DENSITY_xx_cc | Ion density (mass/charge ≈ xx amu) | cm⁻³ |
👉 Full dataset and metadata: Kaggle Dataset: Pioneer Venus Orbiter Ion Escape
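As a small sketch of how the `DENSITY_xx_cc` naming can be put to work, the snippet below pulls the mass-per-charge number out of each column name and attaches a species label. The `SPECIES_BY_MASS` table and the `density_species` helper are illustrative assumptions, not part of the dataset's metadata; only the 16 → O⁺ pairing is used elsewhere in this post.

```python
import re

# Assumed mass-per-charge to species labels (illustrative, not from the dataset docs).
SPECIES_BY_MASS = {1: "H+", 4: "He+", 16: "O+"}

# Matches column names of the form DENSITY_xx_cc, where xx is an integer.
DENSITY_PATTERN = re.compile(r"^DENSITY_(\d+)_cc$")


def density_species(columns):
    """Map each DENSITY_xx_cc column name to (mass number, species label)."""
    mapping = {}
    for name in columns:
        match = DENSITY_PATTERN.match(name)
        if match:
            mass = int(match.group(1))
            mapping[name] = (mass, SPECIES_BY_MASS.get(mass, "unknown"))
    return mapping


# Hypothetical column list, only to show the call.
example_columns = ["ORBIT", "PVO_TIME", "ALT_km", "DENSITY_1_cc", "DENSITY_16_cc", "DENSITY_32_cc"]
print(density_species(example_columns))
```

Masses missing from the lookup table come back as "unknown" rather than being guessed, which keeps the labelling honest until the metadata has been checked.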
🧰 3. Project Setup
We maintain a clean project folder structure for reproducibility:
```
PVO-IonEscape-Analysis/
├── data/
├── notebooks/
│   └── PVO_analysis.ipynb
├── scripts/
│   └── clean_and_prepare_data.py
├── docs/
│   └── metadata.md
├── README.md
├── environment.yml
├── LICENSE
└── CITATION.cff
```
Environment Setup
```bash
conda env create -f environment.yml
conda activate venus_env
```
Load and Clean Data
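```python
import numpy as np
import pandas as pd

df = pd.read_csv("ALL_ORBITS_CONCAT_MASS_MAPPED_DENSITY_ONLY.csv")
df["PVO_TIME"] = pd.to_datetime(df["PVO_TIME"], errors="coerce")

# Replace fill values with NaN, in the density columns only:
# 0.0 and 99.99 can be legitimate values of LAT, LON, or SZA.
density_cols = [c for c in df.columns if "DENSITY" in c]
fill_values = [99.99, 999.99, 99999.999, 9.9999e+04]
df[density_cols] = df[density_cols].replace(fill_values, np.nan)

# Keep only strictly positive densities (this also masks 0.0).
df[density_cols] = df[density_cols].where(df[density_cols] > 0)
```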
🌍 4. Scientific Objective
The goal: Quantify and visualize ion escape patterns across different ion species (O⁺, H⁺, He⁺, etc.) as a function of:
- Altitude (km)
- Solar zenith angle (SZA)
- Orbit phase
- Solar activity (if available)
We focus on dayside–nightside asymmetry, solar wind interaction, and density depletion at ionopause altitudes.
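One way to look at the dayside–nightside asymmetry is to bin the cleaned data by altitude and compare median densities on either side of the terminator. The sketch below is a minimal version of that idea, not the notebook's actual code: the `binned_profile` helper, the 50 km bin width, and the SZA < 90° dayside cut are all assumptions made for illustration, and the demo frame holds made-up numbers.

```python
import numpy as np
import pandas as pd


def binned_profile(df, density_col="DENSITY_16_cc", bin_km=50.0):
    """Median density per altitude bin, split into dayside and nightside."""
    data = df[["ALT_km", "SZA", density_col]].dropna()
    # Assumed convention: SZA < 90 degrees is dayside, otherwise nightside.
    side = np.where(data["SZA"] < 90.0, "dayside", "nightside")
    # Lower edge of the altitude bin each sample falls into.
    alt_bin = np.floor(data["ALT_km"] / bin_km) * bin_km
    return (
        data.assign(side=side, alt_bin_km=alt_bin)
        .groupby(["side", "alt_bin_km"])[density_col]
        .median()
        .unstack("side")
    )


# Tiny synthetic frame, only to show the call; these numbers are not PVO data.
demo = pd.DataFrame(
    {
        "ALT_km": [160.0, 180.0, 220.0, 240.0, 170.0, 230.0],
        "SZA": [20.0, 30.0, 40.0, 50.0, 120.0, 150.0],
        "DENSITY_16_cc": [1.0e4, 3.0e4, 8.0e4, 6.0e4, 2.0e3, 5.0e3],
    }
)
print(binned_profile(demo))
```

Medians are used instead of means so that a few extreme samples in a bin do not dominate the profile. The result has one row per altitude bin and one column per side, which makes the two profiles easy to plot against each other.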
📊 5. Exploratory Analysis
Example: plotting ion density versus altitude.
```python
import matplotlib.pyplot as plt
import seaborn as sns

sns.scatterplot(data=df, x="ALT_km", y="DENSITY_16_cc", hue="SZA", s=5, palette="viridis")
plt.title("Ion Density (O⁺) vs Altitude")
plt.xlabel("Altitude (km)")
plt.ylabel("O⁺ Density (cm⁻³)")
plt.show()
```
🧠 Observations:
- The O⁺ ion density peaks around 200–300 km.
- A sharp drop occurs near the ionopause (~500–800 km).
- Higher SZA (toward the nightside, SZA > 90°) corresponds to weaker ion densities, consistent with solar EUV control of ionization.
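The peak and the sharp drop described above can also be located numerically instead of by eye. The following is a rough sketch under stated assumptions: it treats the steepest decrease in log density as a proxy for the ionopause, which is a simplification and not the method used in the notebook. The `peak_and_steepest_drop` helper and the profile values are hypothetical.

```python
import numpy as np


def peak_and_steepest_drop(alt_km, density_cc):
    """Return (peak altitude, midpoint altitude of the steepest log-density decrease)."""
    alt = np.asarray(alt_km, dtype=float)
    log_n = np.log10(np.asarray(density_cc, dtype=float))
    peak_alt = alt[np.argmax(log_n)]
    # Slope of log10(n) between consecutive altitude samples.
    slopes = np.diff(log_n) / np.diff(alt)
    # The most negative slope marks the sharpest drop.
    idx = int(np.argmin(slopes))
    drop_alt = 0.5 * (alt[idx] + alt[idx + 1])
    return float(peak_alt), float(drop_alt)


# Synthetic profile for illustration only (not PVO measurements).
alt = np.array([150.0, 200.0, 250.0, 300.0, 400.0, 500.0, 600.0, 700.0, 800.0])
n = np.array([1e4, 5e4, 8e4, 6e4, 3e4, 1e4, 1e2, 5e1, 3e1])
peak_alt, drop_alt = peak_and_steepest_drop(alt, n)
print(peak_alt, drop_alt)
```

The input is assumed to be sorted by altitude with positive densities, for example a binned median profile from a single orbit or SZA sector. Working in log space keeps the large dynamic range of the densities from hiding the drop.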
💡 6. Toward Escape Flux Estimation
Future steps involve estimating ion escape flux (Φ):
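\[
\Phi = \int n_i v_i \, dA
\]
where \( n_i \) is the ion number density and \( v_i \) is the outflow velocity component normal to the surface element \( dA \) (to be inferred from orbital velocity and solar wind coupling models). Integrated over a surface enclosing the escape region, \( \Phi \) has units of ions per second.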
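To make the integral concrete, here is a minimal numerical sketch that evaluates Φ = ∫ n v dA over a hemispherical shell with constant density and velocity, where the answer can be checked against n·v·2πR². The `escape_rate` helper, the hemispherical geometry, the shell altitude, and the input values are placeholders assumed for illustration; none of them are results from the PVO data.

```python
import numpy as np

R_VENUS_KM = 6051.8  # mean radius of Venus


def escape_rate(n_cc, v_km_s, radius_km, n_theta=200, n_phi=400):
    """Midpoint-rule estimate of the integral of n*v over a hemisphere, in ions/s."""
    # Convert to cm so that cm^-3 * cm/s * cm^2 gives ions per second.
    radius_cm = radius_km * 1.0e5
    v_cm_s = v_km_s * 1.0e5
    d_theta = (np.pi / 2) / n_theta
    d_phi = (2 * np.pi) / n_phi
    # Polar angle measured from the hemisphere axis, sampled at cell midpoints.
    theta = (np.arange(n_theta) + 0.5) * d_theta
    # Area elements dA = R^2 sin(theta) dtheta dphi for one strip in phi.
    d_area = radius_cm**2 * np.sin(theta) * d_theta * d_phi
    # n and v are constants here; with real data they would vary over the surface.
    return float(np.sum(n_cc * v_cm_s * d_area) * n_phi)


# Placeholder inputs chosen for illustration, not derived from PVO data.
radius = R_VENUS_KM + 1000.0
phi_numeric = escape_rate(n_cc=10.0, v_km_s=5.0, radius_km=radius)
phi_analytic = 10.0 * 5.0e5 * 2 * np.pi * (radius * 1.0e5) ** 2
print(f"{phi_numeric:.3e} ions/s (analytic {phi_analytic:.3e})")
```

With measured data, the constants would be replaced by arrays of density and velocity on the same grid, so the uniform case mainly serves as a unit and sanity check before that step.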
🧩 7. Open Science & Reproducibility
This project is fully open-source and designed for reproducibility:
- Data Cleaning Script: scripts/clean_and_prepare_data.py
- Environment Control: environment.yml
- Notebook for Analysis: notebooks/PVO_analysis.ipynb
- Metadata & Docs: docs/metadata.md
📘 8. What’s Next?
Upcoming work includes:
- Cross-comparison with Venus Express and Parker Solar Probe plasma data
- Incorporation of solar wind pressure and magnetic topology
- Machine learning–based ionopause anomaly detection
🧑‍🔬 About the Author
👨‍🚀 Scientist (Physicist)
Exploring the boundary between planetary atmospheres and space plasma environments.
Follow me on GitHub: @yourusername
⭐ If you’re passionate about planetary data science — fork, explore, and contribute!