Showcase
Every time I wanted to test a factor idea, the workflow was always the same:
clean the factor
neutralize / standardize
run IC
build long-short portfolios
analyze exposures
And I kept rewriting the same pipeline over and over again.
So I built AlphaPurify — a lightweight library that tries to handle the whole factor research loop in one place.
What My Project Does
AlphaPurify is a Python library for factor construction, preprocessing, backtesting, and return attribution.
The idea is pretty simple:
give it a DataFrame with time, asset, price, and factor — and it handles the rest.
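For anyone unsure what that input looks like, here is a minimal sketch of a long-format DataFrame with one row per (time, asset) pair. The column names and the synthetic values are assumptions for illustration, not AlphaPurify's documented schema, so check the repo for the names it actually expects.

```python
import numpy as np
import pandas as pd

# Hypothetical layout: one row per (time, asset), with a price and a raw factor value.
# Column names here are illustrative assumptions, not AlphaPurify's required schema.
rng = np.random.default_rng(0)
dates = pd.date_range("2024-01-01", periods=5, freq="B")
assets = ["AAA", "BBB", "CCC"]

frames = []
for asset in assets:
    # Random-walk prices and random factor values, purely synthetic.
    prices = 100 * np.exp(rng.normal(0, 0.01, len(dates)).cumsum())
    factors = rng.normal(0, 1, len(dates))
    frames.append(
        pd.DataFrame({"time": dates, "asset": asset, "price": prices, "factor": factors})
    )

df = (
    pd.concat(frames, ignore_index=True)
    .sort_values(["time", "asset"])
    .reset_index(drop=True)
)
print(df.head())
```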
It currently supports:
Factor preprocessing (winsorization, standardization, neutralization, etc.)
IC / Rank IC analysis
Quantile-based long / short / long-short backtests
Factor return attribution (multi-factor exposures)
Interactive reports (via Plotly)
The computations are vectorized and parallelized with multiprocessing, so it runs pretty fast even on large datasets.
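To make the steps concrete, here is a rough plain-pandas sketch of the kind of pipeline I kept rewriting: winsorize and z-score per date, compute Rank IC, then form a top-minus-bottom quantile spread. This is not AlphaPurify's API. The function names, column names, and synthetic data are all hypothetical, and it skips neutralization and attribution entirely.

```python
import numpy as np
import pandas as pd


def make_panel(n_dates=60, n_assets=50, seed=0):
    """Synthetic long-format panel where the factor weakly predicts forward returns."""
    rng = np.random.default_rng(seed)
    dates = pd.date_range("2024-01-01", periods=n_dates, freq="B")
    assets = [f"A{i:03d}" for i in range(n_assets)]
    idx = pd.MultiIndex.from_product([dates, assets], names=["time", "asset"])
    factor = rng.normal(size=len(idx))
    fwd_ret = 0.002 * factor + rng.normal(scale=0.02, size=len(idx))
    return pd.DataFrame({"factor": factor, "fwd_ret": fwd_ret}, index=idx).reset_index()


def preprocess(df, lower=0.01, upper=0.99):
    """Winsorize at the given quantiles, then z-score, within each date."""
    def _one_date(x):
        x = x.clip(x.quantile(lower), x.quantile(upper))
        return (x - x.mean()) / x.std()

    out = df.copy()
    out["factor_z"] = out.groupby("time")["factor"].transform(_one_date)
    return out


def rank_ic(df):
    """Per-date Spearman correlation (Pearson on ranks) of factor vs forward return."""
    ic = {
        t: g["factor_z"].rank().corr(g["fwd_ret"].rank())
        for t, g in df.groupby("time")
    }
    return pd.Series(ic, name="rank_ic")


def long_short(df, n_quantiles=5):
    """Equal-weighted top-quantile minus bottom-quantile forward return per date."""
    out = df.copy()
    # Rank first so ties cannot produce duplicate bin edges in qcut.
    out["q"] = out.groupby("time")["factor_z"].transform(
        lambda x: pd.qcut(x.rank(method="first"), n_quantiles, labels=False)
    ).astype(int)
    by_q = out.groupby(["time", "q"])["fwd_ret"].mean().unstack("q")
    return (by_q[n_quantiles - 1] - by_q[0]).rename("long_short")


panel = preprocess(make_panel())
ic = rank_ic(panel)
spread = long_short(panel)
print(f"mean Rank IC: {ic.mean():.3f}, mean long-short return: {spread.mean():.4%}")
```

Even this stripped-down version ignores costs, missing data, and rebalancing frequency, which is the part that gets tedious to redo for every new idea.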
Target Audience
People who already do factor research (or are trying to get into it), especially:
quant students
researchers
anyone working with cross-sectional factors
It’s not meant to be a full trading system — more like a fast “idea validation tool”.
The project is still early stage, but usable.
Would really appreciate feedback or ideas on what to improve.
Comparison
From my experience using other tools:
vs Alphalens:
Alphalens is great for IC and quantile return analysis, but stops there.
AlphaPurify extends that into full backtesting + attribution.
vs Backtrader:
Backtrader is flexible, but you need to build everything yourself.
AlphaPurify is more opinionated and factor-focused.
vs Qlib:
Qlib is powerful but heavy.
AlphaPurify is much lighter and easier to start with.
vs QuantStats / Pyfolio:
Those focus on performance analysis, not factor testing.
So the goal here isn’t to replace them — just to make the factor workflow faster and simpler.
GitHub
https://github.com/eliasswu/Alphapurify
pip install alphapurify
If you’ve done factor research before —
I’m curious: what part of the workflow do you find the most annoying or repetitive?
That’s probably what I should optimize next.