Building a regime-switching particle filter in Rust — from Kalman 1960 to rayon-parallelized Monte Carlo
A Hungarian mathematician's 1960 invention, three British researchers' 1993 extension, and a Rust rewrite that eliminates 30 seconds of JIT warmup. Here's the story of state estimation under regime switches.
60 years of hidden state estimation
1960 — Rudolf Kálmán publishes "A New Approach to Linear Filtering and Prediction Problems." A single paper that would guide Apollo spacecraft to the Moon, enable GPS, and become the backbone of every control system. But it has a limitation: it assumes linear dynamics and Gaussian noise.
1968 — Extended Kalman Filter (EKF) linearizes nonlinear systems via Taylor expansion. Works well enough for slightly nonlinear systems, fails catastrophically for highly nonlinear ones.
1993 — Gordon, Salmond, and Smith publish the bootstrap particle filter. Instead of assuming a distribution shape, they represent the belief as a cloud of weighted samples (particles). Each particle is a hypothesis about the hidden state. Propagate, weight, resample. Repeat. No linearity assumptions. No Gaussian requirements.
Present — Regime-switching extensions add a second layer: each particle carries both a continuous state (position, velocity) AND a discrete mode (regime). The system can switch between fundamentally different behaviors — trending, mean-reverting, or chaotic — and the filter tracks which regime is active.
The problem: 19 functions × 2-5s JIT each = pain
My trading system uses a particle filter to track price regimes in real time. Three modes:
- RANGE: Mean-reverting. Price oscillates around equilibrium. velocity' = 0.5 × velocity + noise
- TREND: Directional. Price follows order flow imbalance. velocity' = velocity + 0.3 × (gain × imbalance - velocity) × dt + noise
- PANIC: High volatility. Random walk with large noise.
The Python+Numba implementation had 19 JIT-compiled functions. Every process restart: 30+ seconds of JIT compilation before the first estimate. In live trading, 30 blind seconds means missed regime transitions, delayed signals, frozen risk management.
And Numba holds the GIL. While 500 particles propagate through nonlinear dynamics, the entire Python runtime blocks.
19 functions in safe Rust
I rewrote everything in Rust with PyO3 bindings. Six categories:
Core particle filter
predict_particles → Rayon-parallelized regime-specific propagation
update_weights → Bayesian weight update via Gaussian likelihood
transition_regimes → Markov chain mode switching (3×3 matrix; sketched below)
systematic_resample → O(N) two-pointer resampling
effective_sample_size → Degeneracy diagnostic (ESS = 1/Σwᵢ²)
estimate → Weighted mean + regime probabilities
Kalman smoothing
kalman_update → 2D level/slope tracker
slope_confidence_interval → 95% CI for slope
is_slope_significant → Directional significance test
kalman_slope_acceleration → Second derivative for early trend entry
Signal processing
calculate_vwap_bands → Volume-weighted price with σ-bands
calculate_momentum_score → Normalized momentum in [-1, +1]
rolling_kurtosis → Fat-tail detection (excess kurtosis)
Statistical tests
hurst_exponent → R/S analysis: trending vs mean-reverting
cusum_test → Page's CUSUM: structural break detection
volatility_compression → Range squeeze (short/long vol ratio)
Extended
particle_price_variance → Weighted variance of particle cloud
ess_and_uncertainty_margin → Combined ESS + regime dominance
adaptive_vwap_sigma → Kurtosis-adapted VWAP band width
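Before the math, here is what the regime-switching step could look like. This is a minimal sketch, not the library's actual `transition_regimes` implementation: it assumes a flat slice of regime indices, a row-stochastic 3×3 transition matrix, and one pre-drawn uniform per particle, consistent with the no-internal-randomness design described below.

```rust
/// Hypothetical sketch of Markov regime switching: each particle's regime
/// is resampled from the row of a 3x3 transition matrix belonging to its
/// current regime, using a pre-drawn U(0,1) per particle (no internal RNG).
fn transition_regimes_sketch(
    regimes: &mut [usize],
    transition: &[[f64; 3]; 3], // row r = P(next regime | current regime r), rows sum to 1
    uniforms: &[f64],           // one pre-drawn U(0,1) per particle
) {
    for (r, &u) in regimes.iter_mut().zip(uniforms) {
        // Inverse-CDF sample from the current regime's transition row.
        let row = &transition[*r];
        let mut acc = 0.0;
        for (next, &p) in row.iter().enumerate() {
            acc += p;
            if u < acc {
                *r = next;
                break;
            }
        }
        // If rounding leaves u >= acc, the particle keeps its current regime.
    }
}
```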
The math that matters
Particle propagation (the regime-specific part)
Each particle i carries state [xᵢ, vᵢ] (log-price, velocity) and regime rᵢ ∈ {0, 1, 2}.
RANGE (r=0):
xᵢ' = xᵢ + vᵢ·dt + σ_pos[0]·√dt·εₓ
vᵢ' = 0.5·vᵢ + σ_vel[0]·√dt·εᵥ
The 0.5·vᵢ term pulls velocity toward zero — mean reversion.
TREND (r=1):
xᵢ' = xᵢ + vᵢ·dt + σ_pos[1]·√dt·εₓ
vᵢ' = vᵢ + 0.3·(G·imbalance - vᵢ)·dt + σ_vel[1]·√dt·εᵥ
Velocity tracks G × imbalance — the order flow signal.
PANIC (r=2):
xᵢ' = xᵢ + vᵢ·dt + σ_pos[2]·√dt·εₓ
vᵢ' = vᵢ + σ_vel[2]·√dt·εᵥ
Pure random walk with high noise.
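Here is a sketch of how this propagation step can be implemented, with rayon doing the per-particle parallelism. The function name, argument layout, and interleaved particle buffer are assumptions for illustration rather than the library's exact API; the regime branching and noise terms follow the equations above.

```rust
use rayon::prelude::*;

/// Sketch of regime-specific propagation (assumed signature, not the
/// library's exact API). `particles` is a flat [x0, v0, x1, v1, ...] buffer;
/// each particle is advanced under its regime's dynamics using pre-drawn
/// standard-normal noise, in parallel via rayon.
fn predict_particles_sketch(
    particles: &mut [f64],  // interleaved [log-price, velocity] pairs
    regimes: &[usize],
    sigma_pos: &[f64; 3],
    sigma_vel: &[f64; 3],
    imbalance: f64,
    gain: f64,
    dt: f64,
    noise_pos: &[f64],      // pre-drawn N(0,1), one per particle
    noise_vel: &[f64],
) {
    let sqrt_dt = dt.max(1e-8).sqrt(); // dt guard, see "Numerical stability"
    particles
        .par_chunks_mut(2)
        .zip(regimes.par_iter())
        .zip(noise_pos.par_iter().zip(noise_vel.par_iter()))
        .for_each(|((p, &r), (&ex, &ev))| {
            let (x, v) = (p[0], p[1]);
            // Position update is the same in every regime.
            p[0] = x + v * dt + sigma_pos[r] * sqrt_dt * ex;
            // Velocity update branches on the regime.
            p[1] = match r {
                0 => 0.5 * v + sigma_vel[0] * sqrt_dt * ev,                               // RANGE
                1 => v + 0.3 * (gain * imbalance - v) * dt + sigma_vel[1] * sqrt_dt * ev, // TREND
                _ => v + sigma_vel[2] * sqrt_dt * ev,                                     // PANIC
            };
        });
}
```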
Bayesian weight update
After observing the actual price z:
wᵢ' = wᵢ × exp(-0.5·(z - xᵢ)²/σ²_price[rᵢ]) × exp(-0.5·(vᵢ - G·imb)²/σ²_vel[rᵢ])
Particles close to the observation get high weight. Particles far away get low weight. Normalize. Resample when weights degenerate (ESS < N/2).
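A sketch of the weight update and ESS computation under the same assumptions (interleaved particle buffer, hypothetical function name). It returns the ESS so the caller can trigger systematic resampling when ESS drops below N/2.

```rust
/// Sketch of the Bayesian weight update (assumed names, not the library's
/// exact API). Weights are multiplied by Gaussian likelihoods of the
/// observed log-price and the implied velocity, then normalized with an
/// underflow guard; the returned ESS drives the resampling decision.
fn update_weights_sketch(
    weights: &mut [f64],
    particles: &[f64],       // interleaved [log-price, velocity] pairs
    regimes: &[usize],
    z: f64,                  // observed log-price
    imbalance: f64,
    gain: f64,
    sigma_price: &[f64; 3],
    sigma_vel: &[f64; 3],
) -> f64 {
    for (i, w) in weights.iter_mut().enumerate() {
        let (x, v) = (particles[2 * i], particles[2 * i + 1]);
        let r = regimes[i];
        let price_ll = -0.5 * (z - x).powi(2) / sigma_price[r].powi(2);
        let vel_ll = -0.5 * (v - gain * imbalance).powi(2) / sigma_vel[r].powi(2);
        *w *= (price_ll + vel_ll).exp();
    }
    // Normalize with an underflow guard so a degenerate cloud never divides by zero.
    let sum: f64 = weights.iter().sum::<f64>() + 1e-300;
    weights.iter_mut().for_each(|w| *w /= sum);
    // ESS = 1 / sum(w_i^2); the caller resamples when this falls below N/2.
    1.0 / (weights.iter().map(|w| w * w).sum::<f64>() + 1e-12)
}
```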
Why log-price space?
All operations use log(price). This makes the filter scale-invariant — the same parameters and noise levels work identically for a $0.50 penny stock and a $50,000 Bitcoin. Multiplicative price noise becomes additive in log space.
Engineering decisions
Rayon only where it matters
predict_particles is the only parallelized function. It's O(N) with substantial per-particle computation (regime branching, noise injection). The other 18 functions are either memory-bound (weight updates, resampling) or operate on small arrays (Kalman 2×2 matrices, signal windows).
Adding rayon to memory-bound functions would increase latency from thread pool overhead.
No internal randomness
The library takes pre-generated random arrays as input. This gives the caller:
- Full reproducibility (same seed = same output)
- Choice of RNG (numpy default, PCG64, whatever)
- Ability to inject structured noise for testing
Numerical stability
- Weight normalization: a `+1e-300` guard prevents underflow to zero
- ESS denominator: `+1e-12` prevents division by zero
- Kalman covariance: symmetrized after every predict and update step (sketch below)
- dt guard: `max(dt, 1e-8).sqrt()` prevents noise explosion at tiny timesteps
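The dt guard and the `1e-300`/`1e-12` guards appear in the propagation and weight-update sketches above. The covariance symmetrization for the 2D Kalman tracker could look like this (a sketch, not the library's code):

```rust
/// Sketch of the covariance symmetrization guard for a 2x2 Kalman tracker:
/// after each predict or update step, P is replaced by (P + P^T) / 2 so that
/// floating-point drift never produces an asymmetric covariance matrix.
fn symmetrize_2x2(p: &mut [[f64; 2]; 2]) {
    let off = 0.5 * (p[0][1] + p[1][0]);
    p[0][1] = off;
    p[1][0] = off;
}
```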
Hurst exponent — my favorite function
The Hurst exponent tells you if a price series is:
- H > 0.5: Trending (persistent — ups followed by ups)
- H = 0.5: Random walk (no memory)
- H < 0.5: Mean-reverting (anti-persistent — ups followed by downs)
Computed via R/S (Rescaled Range) analysis:
- For each window size `n` in `[min_window, max_window]`:
  - Split the series into blocks of size `n`
  - For each block: compute the range of cumulative deviations from the block mean, divided by the block's standard deviation
  - Average the R/S values across blocks
- Fit `log(R/S) = H × log(n) + c` via least squares
- `H` is the slope
This single number tells you whether to use trend-following or mean-reversion strategies. Combined with the particle filter's regime probabilities, you get a multi-scale view of market behavior.
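A sketch of the R/S computation following the steps above, assuming a plain slice of prices and a caller-supplied set of window sizes; the library's `hurst_exponent` signature may differ.

```rust
/// Sketch of Hurst estimation via rescaled-range (R/S) analysis
/// (illustrative, not the library's exact implementation).
fn hurst_exponent_sketch(series: &[f64], window_sizes: &[usize]) -> f64 {
    let mut log_n = Vec::new();
    let mut log_rs = Vec::new();

    for &n in window_sizes {
        let mut rs_values = Vec::new();
        for block in series.chunks_exact(n) {
            let mean = block.iter().sum::<f64>() / n as f64;
            // Range of cumulative deviations from the block mean.
            let mut cum = 0.0;
            let (mut min, mut max) = (f64::INFINITY, f64::NEG_INFINITY);
            let mut var = 0.0;
            for &x in block {
                cum += x - mean;
                min = min.min(cum);
                max = max.max(cum);
                var += (x - mean).powi(2);
            }
            let std = (var / n as f64).sqrt();
            if std > 1e-12 {
                rs_values.push((max - min) / std);
            }
        }
        if !rs_values.is_empty() {
            let mean_rs = rs_values.iter().sum::<f64>() / rs_values.len() as f64;
            log_n.push((n as f64).ln());
            log_rs.push(mean_rs.ln());
        }
    }

    // Least-squares slope of log(R/S) against log(n) is the Hurst exponent.
    let k = log_n.len() as f64;
    let (mx, my) = (log_n.iter().sum::<f64>() / k, log_rs.iter().sum::<f64>() / k);
    let num: f64 = log_n.iter().zip(&log_rs).map(|(x, y)| (x - mx) * (y - my)).sum();
    let den: f64 = log_n.iter().map(|x| (x - mx).powi(2)).sum();
    num / den
}
```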
Before and after
| Dimension | Python + Numba | Rust + PyO3 |
|---|---|---|
| Cold start (19 functions) | 30-90s JIT warmup | Zero |
| GIL | Held during all compute | Released |
| Parallelism | prange (limited) | Rayon work-stealing |
| Memory safety | Runtime bounds checks | Compile-time guarantees |
| Dependency weight | ~150 MB | ~2 MB single .so |
| Reproducibility | JIT varies by LLVM version | Deterministic binary |
Try it
```bash
pip install particle-filter-rs
```

```python
import numpy as np
from particle_filter_rs import (
    predict_particles, update_weights, systematic_resample,
    effective_sample_size, estimate, transition_regimes,
    kalman_update, hurst_exponent, cusum_test,
)

# Initialize 500 particles in log-price space
N = 500
particles = np.column_stack([
    np.full(N, np.log(100.0)),  # log-price
    np.zeros(N),                # velocity
])
regimes = np.zeros(N, dtype=np.int64)
weights = np.full(N, 1.0 / N)

# One filter step
rng = np.random.default_rng(42)
particles = predict_particles(
    particles, regimes,
    np.array([0.001, 0.002, 0.005]),  # process noise per regime
    np.array([0.01, 0.02, 0.05]),
    imbalance=0.3, dt=1.0, vel_gain=0.1,
    random_pos=rng.standard_normal(N),
    random_vel=rng.standard_normal(N),
)
```
GitHub: https://github.com/RMANOV/particle-filter-rs
What I learned
- Rayon's overhead is real — for memory-bound functions, single-threaded Rust beats rayon-parallelized Rust
- Log-space is non-negotiable — a particle filter in linear price space needs different noise parameters for every asset. Log-space is universal.
- Deterministic filters are testable filters — external RNG makes every test reproducible, every bug reproducible
- The Kalman complement matters — particle filters alone give noisy estimates. A parallel Kalman provides smooth baseline + confidence intervals. Best of both worlds.
- 60 years of math still works — Kálmán's 1960 insight (recursive Bayesian estimation) is alive in every function in this library
Built with Rust, PyO3 0.27, rayon 1.10, and respect for 60 years of state estimation research.