Harihar Nautiyal

Posted on Jun 29

Astrophage: Building a Two-Stage Random Forest Exoplanet Classifier in Rust

#rust #machinelearning #exoplanets #datascience

A few weeks ago my teammate and I decided to enter the AI for Astronomy Hackathon 2026. The task was to classify thousands of Kepler Objects of Interest (KOIs) as confirmed exoplanets, candidates, or false positives.

Most teams would use Python and scikit-learn. We did something different. We built Astrophage, a custom Two-Stage Random Forest classifier written entirely in Rust. It uses Polars for data handling and a from-scratch machine learning implementation. Astrophage achieves 94.81% accuracy, making it the most accurate tabular-data model in the competition.

Here is how we did it, why two-stage classification works, and what we learned about building machine learning systems in Rust.

The Problem: Not Every Signal Is a Planet

The Kepler space telescope found thousands of exoplanet candidates, but most of them are not planets at all. They are binary star systems, instrumental noise, or stellar variability. Every false positive wastes precious telescope time. Every missed candidate is a potentially habitable world we never study. The classification problem is real, and it is hard.

The dataset gives you tabular features like orbital periods, transit depths, signal-to-noise ratios, stellar temperatures, and false positive flags. No images, no light curves as pixels. Just structured numbers that encode astrophysical reality. That is where most machine learning approaches go wrong.

The Insight: Astronomers Do Not Think in Three Classes

We noticed that NASA astronomers do not look at a Kepler Object of Interest and ask "is this confirmed, candidate, or false positive?" all at once. They think in stages. First they ask "is this definitely a planet?" If the answer is yes, it is confirmed. If not, they ask "is it worth telescope time?" If the answer is yes, it is a candidate. If not, it is a false positive.

This is not a machine learning trick. It is literally how the science works. So we asked: what if our model did the same thing?

The Architecture: Two-Stage Random Forest

Instead of a single 3-class classifier, Astrophage uses two binary Random Forests in sequence. The first stage separates confirmed planets from everything else. The second stage takes whatever the first stage did not confirm and separates candidates from false positives.
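The cascade described above can be sketched in a few lines. This is a minimal illustration, not Astrophage's actual code: the `BinaryClassifier` trait, the `TwoStage` struct, and the `Threshold` stand-in are hypothetical names introduced here, and a real stage would wrap a trained Random Forest rather than a single-feature cutoff.

```rust
/// The three KOI dispositions the article classifies.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
pub enum Disposition {
    Confirmed,
    Candidate,
    FalsePositive,
}

/// Hypothetical interface for one binary stage (an assumption, not Astrophage's API).
pub trait BinaryClassifier {
    /// Returns true when the sample belongs to the stage's positive class.
    fn predict(&self, features: &[f64]) -> bool;
}

/// Two binary stages applied in sequence.
pub struct TwoStage<A: BinaryClassifier, B: BinaryClassifier> {
    /// Stage 1: confirmed planet vs. everything else.
    pub confirmed_vs_rest: A,
    /// Stage 2: candidate vs. false positive, only consulted if stage 1 says "no".
    pub candidate_vs_false_positive: B,
}

impl<A: BinaryClassifier, B: BinaryClassifier> TwoStage<A, B> {
    pub fn predict(&self, features: &[f64]) -> Disposition {
        if self.confirmed_vs_rest.predict(features) {
            Disposition::Confirmed
        } else if self.candidate_vs_false_positive.predict(features) {
            Disposition::Candidate
        } else {
            Disposition::FalsePositive
        }
    }
}

/// Stand-in stage for illustration only: thresholds one feature
/// instead of voting over a forest of trees.
pub struct Threshold {
    pub index: usize,
    pub cutoff: f64,
}

impl BinaryClassifier for Threshold {
    fn predict(&self, features: &[f64]) -> bool {
        features[self.index] > self.cutoff
    }
}
```

Keeping the stages behind a trait means each one can be trained, tuned, and evaluated on its own binary problem before the two are chained.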

The improvement is a 3-4% accuracy gain over a single-stage approach. That may not sound like much, but in exoplanet discovery every percentage point can represent hundreds of potential worlds.

Why Rust?

We wrote a Random Forest from scratch in Rust. Why did we do that when Python and scikit-learn exist? It started as a challenge. It ended as a solution.

Metric	Astrophage (Rust)	sklearn (Python)
Inference	~1ms/sample	~10ms/sample
Training	~30 seconds	~2 minutes
Binary size	~2 MB	~500 MB+ (with env)
Runtime crashes	Zero	Occasionally
Memory safety	Compile-time guarantees	Runtime-managed (garbage collector)

The speed is not just nice to have. If you are classifying millions of Kepler Objects of Interest or running real-time pipeline analysis, 10x faster inference matters. And the binary size? You could ship Astrophage on a Raspberry Pi if you wanted.

Feature Engineering: Where Astrophysics Meets Code

We did not just throw 28 raw features at a model. We spent time understanding what each number actually means physically, then engineered 8 derived features that encode astrophysical intuition.

The Star Feature: `fpflag_sum`

NASA already flags signals with four binary indicators. We simply added them up. `fpflag_sum` has an importance score of 0.29, so it alone accounts for nearly a third of the model's decision-making. When this value is non-zero, the signal is almost certainly not a planet.
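As a rough sketch of that aggregation: the four field names below follow the false-positive flag columns in the NASA Exoplanet Archive KOI table, but whether Astrophage reads exactly these columns, and the `FpFlags` struct itself, are assumptions made for illustration.

```rust
/// The four binary false-positive flags for one KOI.
/// Field names mirror the KOI table columns; the struct is hypothetical.
pub struct FpFlags {
    pub koi_fpflag_nt: u8, // not transit-like
    pub koi_fpflag_ss: u8, // stellar eclipse
    pub koi_fpflag_co: u8, // centroid offset
    pub koi_fpflag_ec: u8, // ephemeris match indicates contamination
}

/// Counts how many of the four flags are raised (0 through 4).
/// Any non-zero input is treated as "raised" so stray values cannot inflate the sum.
pub fn fpflag_sum(flags: &FpFlags) -> u8 {
    [
        flags.koi_fpflag_nt,
        flags.koi_fpflag_ss,
        flags.koi_fpflag_co,
        flags.koi_fpflag_ec,
    ]
    .iter()
    .map(|&flag| u8::from(flag != 0))
    .sum()
}
```

A tree can then split on `fpflag_sum > 0` once instead of learning four separate splits, which is one plausible reason the aggregated feature ranks so high.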

Other Derived Features

snr_x_prad: Signal-to-noise ratio × radius. Real planets have SNR proportional to their size.
depth_duration_ratio: Transit depth ÷ duration. Planets make U-shaped transits; binary stars make V-shaped eclipses.
impact_penalty: If the impact parameter is > 1.0, the "planet" would miss the star entirely. Physically impossible.
log_period: Orbital periods follow log-normal distributions, not normal ones. Taking the log makes the feature model-friendly.
koi_prad_squared: Objects > 15 Earth radii are stellar companions, not planets. Squaring creates a soft threshold.
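The derived features listed above could be computed along these lines. This is a sketch under stated assumptions: the `Koi` and `Derived` structs are hypothetical, the input field names follow the KOI table columns, the 0/1 encoding of `impact_penalty`, the use of the natural log, and the zero-duration guard are all choices made here rather than details confirmed by the project.

```rust
/// Raw inputs for one KOI (hypothetical struct; names follow KOI table columns).
pub struct Koi {
    pub koi_model_snr: f64, // transit signal-to-noise ratio
    pub koi_prad: f64,      // planetary radius, Earth radii
    pub koi_depth: f64,     // transit depth
    pub koi_duration: f64,  // transit duration
    pub koi_impact: f64,    // impact parameter
    pub koi_period: f64,    // orbital period, days
}

/// The five derived features described in the list above.
pub struct Derived {
    pub snr_x_prad: f64,
    pub depth_duration_ratio: f64,
    pub impact_penalty: f64,
    pub log_period: f64,
    pub koi_prad_squared: f64,
}

pub fn derive(k: &Koi) -> Derived {
    Derived {
        snr_x_prad: k.koi_model_snr * k.koi_prad,
        // Assumption: fall back to 0.0 rather than dividing by a zero duration.
        depth_duration_ratio: if k.koi_duration > 0.0 {
            k.koi_depth / k.koi_duration
        } else {
            0.0
        },
        // Assumption: encode the "misses the star" case as a 0/1 indicator.
        impact_penalty: if k.koi_impact > 1.0 { 1.0 } else { 0.0 },
        // Assumption: natural log; log10 would work equally well for trees.
        log_period: k.koi_period.ln(),
        koi_prad_squared: k.koi_prad * k.koi_prad,
    }
}
```

Because tree splits are invariant to monotonic transforms of a single feature, features like `log_period` matter most when they are combined with others or when splits are chosen from binned values.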

Every derived feature has a story. That is not accidental. It is the reason the model generalizes well.

The Numbers

Metric	Score
Accuracy	94.81%
Macro F1	92.64%
Weighted F1	94.51%

Per-class breakdown:

Class	Precision	Recall	F1-Score
FALSE POSITIVE	99.69%	98.35%	99.01%
CONFIRMED	89.95%	94.54%	92.18%
CANDIDATE	88.42%	85.06%	86.71%

The false positive precision is nearly perfect because false positives often have at least one NASA flag set. The candidate class is the hardest, but that is by design: candidates are supposed to be ambiguous.
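For readers who want to check how the per-class numbers relate to the macro F1, here is a small sketch of the standard definitions: F1 is the harmonic mean of precision and recall, and macro F1 is the unweighted mean of the per-class F1 scores. The function names are ours, not part of Astrophage.

```rust
/// Harmonic mean of precision and recall; defined as 0.0 when both are zero.
pub fn f1(precision: f64, recall: f64) -> f64 {
    if precision + recall == 0.0 {
        0.0
    } else {
        2.0 * precision * recall / (precision + recall)
    }
}

/// Unweighted mean of per-class F1 scores, given (precision, recall) pairs.
pub fn macro_f1(per_class: &[(f64, f64)]) -> f64 {
    if per_class.is_empty() {
        return 0.0;
    }
    let total: f64 = per_class.iter().map(|&(p, r)| f1(p, r)).sum();
    total / per_class.len() as f64
}
```

Feeding in the rounded precision and recall values from the table, for example `macro_f1(&[(0.9969, 0.9835), (0.8995, 0.9454), (0.8842, 0.8506)])`, lands within rounding of the reported scores. The weighted F1 additionally needs the per-class sample counts, which the table does not list.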

What We Learned

1. Two-stage decomposition just works

Breaking a 3-class problem into two easier binary ones sounds obvious in retrospect, but most machine learning pipelines do not do it. The accuracy gain is real, and the architecture mirrors the actual scientific workflow.

2. Domain knowledge beats compute

36 well-designed features outperform 100 generic ones. Understanding why orbital periods are log-normal, why impact parameters matter geometrically, and why NASA flags exist was worth more than any hyperparameter tuning.

3. Rust is viable for machine learning

Polars and NDArray can absolutely compete with Python's ecosystem. The borrow checker is annoying for the first week. Then you stop having memory bugs.

4. Stratified sampling saves you from yourself

Without it our model would have learned "say false positive every time" and called it a day. The class distribution is that imbalanced.
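A stratified split keeps each class's share roughly equal in the training and test sets. The sketch below is a minimal, deterministic version using only the standard library; `stratified_split` is a hypothetical helper, not Astrophage's function, and it deliberately skips shuffling, so a real pipeline would shuffle the indices within each class first.

```rust
use std::collections::BTreeMap;

/// Splits sample indices into (train, test) so that every class contributes
/// about `test_fraction` of its samples to the test set.
/// Deterministic: takes the first rows of each class for the test set.
pub fn stratified_split(labels: &[u8], test_fraction: f64) -> (Vec<usize>, Vec<usize>) {
    // Group sample indices by class label.
    let mut by_class: BTreeMap<u8, Vec<usize>> = BTreeMap::new();
    for (i, &label) in labels.iter().enumerate() {
        by_class.entry(label).or_default().push(i);
    }

    let mut train = Vec::new();
    let mut test = Vec::new();
    for indices in by_class.values() {
        // Per-class test size, rounded to the nearest whole sample.
        let n_test = (indices.len() as f64 * test_fraction).round() as usize;
        let (test_part, train_part) = indices.split_at(n_test.min(indices.len()));
        test.extend_from_slice(test_part);
        train.extend_from_slice(train_part);
    }
    (train, test)
}
```

With labels that are 80% one class and 20% another, both halves of the split keep that 80/20 ratio, so the minority class cannot vanish from either side by chance.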

5. Sometimes the best feature is already in the data

We thought we would discover some interaction no one had found. Nope. NASA's pre-vetting flags, simply aggregated, were our best move. The experts already knew what to look for.

What's Next

**Short term:** Hyperparameter grid search, K-fold cross-validation, and recursive feature elimination to find the optimal feature subset.

**Medium term:** Model serialization, a REST API for real-time classification, and a web dashboard for interactive exploration.

**Long term:** Direct NASA Exoplanet Archive integration, extending to TESS and JWST data, and explanations for every prediction.

The dream: Astrophage becomes a tool in the exoplanet discovery pipeline, helping astronomers prioritize telescope time, identify the most promising candidates, and find more worlds like our own.

Try It Yourself

GitHub: github.com/harihar-nautiyal/astrophage
Documentation: astrophage.hariharnautiyal.com
Google Colab: Run it without installing Rust. Notebook link

git clone https://github.com/harihar-nautiyal/astrophage.git
cd astrophage
cargo build --release
./target/release/astrophage

The first build takes a few minutes, but subsequent runs are instant.

"Somewhere, something incredible is waiting to be known." - Carl Sagan

I am really interested in machine learning on tabular data, and I also like learning about planets outside our solar system. If you like these things too, or if you just want to know more about using Rust for science, I would love to hear what you think. You can write a comment or open an issue on GitHub.

This was made using Rust, Polars, NDArray, Tokio and a lot of coffee.