Ruslan Manov

Posted on Jan 31

Rust + PyO3 Enhanced Ichimoku Cloud with Hull MA Smoothing

#ichimoku #hullmovingaverage #technicalanalysis #tradingindicators

Why I rewrote 11 trading indicators from Python to Rust (and got bit-exact parity)

A Japanese newspaper reporter spent 30 years perfecting a trading system by hand. I rewrote it in Rust. Here's the full story — the history, the math, and the engineering.

The problem: Numba's cold start kills live trading

My Python trading system relied on Numba-JIT compiled Ichimoku Cloud calculations. Numba is excellent — until your process restarts.

Every cold start costs 2-5 seconds of JIT compilation per function. In a live trading loop that restarts on errors, those seconds mean missed signals. And unless a function is compiled with nogil=True, Numba holds the GIL during execution, blocking every other Python thread.

I needed:

Zero startup latency
GIL-free execution
Bit-exact results (no behavioral changes)
Single-file deployment (no LLVM runtime)

Rust + PyO3 checked every box.

A brief detour: the man on the mountain

Before we get to code, the history matters — because it explains why Ichimoku is designed the way it is.

Goichi Hosoda was a Japanese newspaper reporter who began developing a trading system in the 1930s. His pen name was Ichimoku Sanjin (一目山人) — literally "a glance from a man on a mountain." His goal: a single chart that shows support, resistance, trend, momentum, and future projections — all at one glance.

He enlisted teams of university students to manually compute and backtest the system across decades of Japanese stock and commodity data. No computers. Just pencils, paper, and price tables.

He published Ichimoku Kinko Hyo (一目均衡表 — "one-glance equilibrium chart") in 1968, after 30 years of development. The parameters 9, 26, 52 weren't arbitrary — they mapped to the Japanese trading calendar: 9 trading days (1.5 weeks), 26 days (1 month), 52 days (2 months).

The system remained almost exclusively Japanese until the internet era. Western traders discovered it in the 2000s and recognized its power: not just an indicator, but a complete trading framework.

The five classical components

Component	Japanese	Formula	Purpose
Conversion Line	Tenkan-sen	(highest high + lowest low) / 2 over short period	Short-term equilibrium
Base Line	Kijun-sen	Same formula, medium period	Primary signal line
Leading Span A	Senkou Span A	(Tenkan + Kijun) / 2	Front cloud edge
Leading Span B	Senkou Span B	(highest high + lowest low) / 2 over long period	Back cloud edge
Lagging Span	Chikou Span	Close shifted back N periods	Trend confirmation

The area between Senkou Span A and B forms the cloud (kumo). Price above cloud = bullish. Below = bearish. Inside = transitioning. Cloud thickness = support/resistance strength.
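The table can be sketched in a few lines of pure Python. This is a minimal illustration, not the package's code: the names midpoint_line, classic_components and cloud_position are hypothetical, the forward and backward plotting shifts are left out, and the warm-up bars are assumed to be NaN.

```python
import math

def midpoint_line(high, low, period):
    """(highest high + lowest low) / 2 over a trailing window; NaN until the window fills."""
    out = [math.nan] * len(high)
    for i in range(period - 1, len(high)):
        window_high = max(high[i - period + 1 : i + 1])
        window_low = min(low[i - period + 1 : i + 1])
        out[i] = (window_high + window_low) / 2
    return out

def classic_components(high, low, tenkan_p=9, kijun_p=26, senkou_p=52):
    """Unshifted Tenkan, Kijun, Span A and Span B (display offsets omitted)."""
    tenkan = midpoint_line(high, low, tenkan_p)
    kijun = midpoint_line(high, low, kijun_p)
    span_a = [(t + k) / 2 for t, k in zip(tenkan, kijun)]  # NaN propagates through warm-up
    span_b = midpoint_line(high, low, senkou_p)
    return tenkan, kijun, span_a, span_b

def cloud_position(price, span_a, span_b):
    """Classify a price against the cloud formed by the two spans."""
    top, bottom = max(span_a, span_b), min(span_a, span_b)
    if price > top:
        return "above"
    if price < bottom:
        return "below"
    return "inside"

# Synthetic steady uptrend: highs climb one unit per bar, lows sit two units below.
high = [10.0 + i for i in range(60)]
low = [h - 2.0 for h in high]
tenkan, kijun, span_a, span_b = classic_components(high, low)
print(tenkan[-1], kijun[-1], span_a[-1], span_b[-1])
print(cloud_position(68.0, span_a[-1], span_b[-1]))
```

In a steady uptrend each longer window reaches further back for its lowest low, so the lines stack in order of period and the last price sits above the cloud.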

The key innovation: Hull Moving Average

Classic Ichimoku uses (max + min) / 2, which only changes when a new extreme enters the window or an old one drops out of it. This creates stepped, laggy lines.

Alan Hull (2005) solved the fundamental lag-vs-smoothness tradeoff with an algebraic trick:

HMA(price, n) = WMA(2 * WMA(price, n/2) - WMA(price, n), sqrt(n))

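Why it works:

WMA(n) (slow) lags a steady trend by (n - 1) / 3 bars, roughly n/3
WMA(n/2) (fast) lags by roughly n/6 bars
2 * fast - slow extrapolates ahead by the gap between the two, which cancels the slow line's lag on a steady trend
Final WMA(sqrt(n)) smoothing adds back only about sqrt(n)/3 bars of lag

Result: on a steady trend the lag falls from roughly n/3 bars to roughly sqrt(n)/3 bars, with smooth output.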
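The lag arithmetic is easy to check on a straight-line ramp, where lag is simply the gap between the latest price and the average. The sketch below is an illustration rather than the package's implementation: it assumes n/2 is floored and sqrt(n) is rounded to the nearest integer, and both conventions vary between Hull MA implementations.

```python
import math

def wma(data, period):
    """Linearly weighted moving average: oldest bar weight 1, newest bar weight `period`."""
    out = [math.nan] * len(data)
    denom = period * (period + 1) / 2
    for i in range(period - 1, len(data)):
        window = data[i - period + 1 : i + 1]
        out[i] = sum(w * x for w, x in enumerate(window, start=1)) / denom
    return out

def hma(data, period):
    """Hull MA. Assumption: n/2 is floored, sqrt(n) is rounded to the nearest integer."""
    half, root = period // 2, round(math.sqrt(period))
    fast, slow = wma(data, half), wma(data, period)
    raw = [2 * f - s for f, s in zip(fast, slow)]  # NaN warm-up propagates
    return wma(raw, root)

ramp = [float(t) for t in range(100)]  # price rises one unit per bar
t = len(ramp) - 1
wma_lag = ramp[t] - wma(ramp, 16)[t]   # (16 - 1) / 3 = 5 bars
hma_lag = ramp[t] - hma(ramp, 16)[t]   # about 2/3 of a bar
print(f"WMA(16) lag: {wma_lag:.3f} bars, HMA(16) lag: {hma_lag:.3f} bars")
```

On this ramp the raw series 2 * fast - slow actually runs a third of a bar ahead of price, and the final WMA(4) pass gives back one bar, which is where the net lag of about two thirds of a bar comes from.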

I applied this to Ichimoku by replacing the midpoint calculation with Hull MA of (high + low) / 2. Same cloud structure, faster reaction, smoother boundaries.
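A rough sketch of that substitution, next to the classic line for comparison. The names midpoint_line and hull_line are hypothetical, and the rounding of n/2 and sqrt(n) is an assumption; the package's ichimoku_line_hull may differ in both.

```python
import math

def wma(data, period):
    """Linearly weighted moving average; NaN until the window fills."""
    out = [math.nan] * len(data)
    denom = period * (period + 1) / 2
    for i in range(period - 1, len(data)):
        window = data[i - period + 1 : i + 1]
        out[i] = sum(w * x for w, x in enumerate(window, start=1)) / denom
    return out

def hma(data, period):
    """Hull MA. Assumption: n/2 floored, sqrt(n) rounded to the nearest integer."""
    raw = [2 * f - s for f, s in zip(wma(data, period // 2), wma(data, period))]
    return wma(raw, round(math.sqrt(period)))

def midpoint_line(high, low, period):
    """Classic line: (highest high + lowest low) / 2 over the window."""
    out = [math.nan] * len(high)
    for i in range(period - 1, len(high)):
        out[i] = (max(high[i - period + 1 : i + 1]) + min(low[i - period + 1 : i + 1])) / 2
    return out

def hull_line(high, low, period):
    """Hull-enhanced line: Hull MA of the per-bar median price (high + low) / 2."""
    median = [(h + l) / 2 for h, l in zip(high, low)]
    return hma(median, period)

# Synthetic uptrend: the bar midpoint rises one unit per bar.
high = [t + 1.0 for t in range(60)]
low = [t - 1.0 for t in range(60)]
classic = midpoint_line(high, low, 9)
hull = hull_line(high, low, 9)
# On this ramp the classic 9-bar line trails the bar midpoint by 4 units,
# while the Hull line sits on it to within rounding error.
print(classic[50], round(hull[50], 6))
```

A clean ramp is the friendliest case for the Hull line. On choppy data the 2 * fast - slow step can overshoot at turning points, which is the price of the reduced lag.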

The Rust implementation

Architecture

Python layer
    │
    ▼
advanced_ichimoku_cloud (Rust, PyO3)
    ├── hull.rs          → wma, hullma (+ inner functions)
    ├── hull_signals.rs  → trend, pullback, bounce detection
    ├── ichimoku.rs      → classic Ichimoku
    ├── ichimoku_hull.rs → Hull-enhanced Ichimoku
    └── indicators.rs    → ema, atr

Key design: inner functions

Every computation exists as a plain fn (no PyO3 overhead). The #[pyfunction] wrappers just handle NumPy conversion and delegate:

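use numpy::{PyArray1, PyReadonlyArray1};
use pyo3::prelude::*;

// Used by ichimoku_hull.rs without FFI cost
pub(crate) fn hullma_inner(data: &[f64], period: usize) -> Vec<f64> {
    // Pure computation, no Python types (body omitted here)
}

#[pyfunction]
fn hullma(py: Python, prices: PyReadonlyArray1<f64>, period: usize) -> PyResult<Py<PyArray1<f64>>> {
    // as_slice() errors on non-contiguous input; `?` raises that in Python instead of panicking
    let slice = prices.as_slice()?;
    // Release the GIL explicitly while the pure-Rust computation runs
    let result = py.detach(|| hullma_inner(slice, period));
    Ok(PyArray1::from_vec(py, result).into())
}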

This enables cross-module reuse: ichimoku_hull.rs calls hull::hullma_inner() directly, with zero FFI overhead.

Zero-copy I/O

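Input: as_slice() borrows the NumPy buffer directly (no copying, no allocation). It only works on contiguous arrays and returns an error for strided views, so a bare .unwrap() would panic there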
Output: PyArray1::from_vec allocates once in Rust, transfers ownership to Python
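The contiguity requirement on the input side is easy to see from Python. The sketch below uses the standard library's memoryview as a stand-in for a NumPy array, which is an assumption made only to keep it dependency-free; with NumPy the equivalent check is arr.flags["C_CONTIGUOUS"] and the equivalent fix is np.ascontiguousarray(arr).

```python
from array import array

buf = array("d", [1.0, 2.0, 3.0, 4.0, 5.0, 6.0])  # six float64 values, back to back
full = memoryview(buf)
strided = full[::2]  # every other value: a view over the same memory, nothing copied

print(full.contiguous, full.strides)        # one 8-byte step between elements
print(strided.contiguous, strided.strides)  # 16-byte steps, so no flat slice exists

# Copying into a fresh buffer restores contiguity, at the cost of one allocation.
compact = memoryview(array("d", strided.tolist()))
print(compact.contiguous, compact.tolist())
```

A strided view like this is what a caller produces by slicing with a step or by taking a column out of a 2-D array, so a wrapper that expects a flat slice needs either an error path or a copy on the Python side.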

GIL release

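PyO3 does not release the GIL on its own: a #[pyfunction] holds it for the whole call unless the computation is wrapped in py.detach(...) (py.allow_threads(...) before PyO3 0.26). The inner functions touch no Python objects, so they can run inside that closure while other Python threads (WebSocket handlers, order management) carry on.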

Proving parity: 25+ assertions at 1e-12 tolerance

The test suite implements every function in pure Python, generates identical random data (seed=42, N=200), and asserts:

# rtol=0 makes atol the only slack; assert_allclose otherwise adds its default rtol=1e-7
np.testing.assert_allclose(rust_result, python_result, rtol=0, atol=1e-12)

All 11 functions. All edge cases (NaN propagation, initial positions, backfill behavior). If Rust disagrees with Python by more than 1e-12, the test fails.

============================================================
  Parity Tests: advanced-ichimoku-cloud
============================================================
  PASS  wma
  PASS  hullma
  PASS  hullma_trend
  PASS  hullma_pullback
  PASS  hullma_bounce
  PASS  ichimoku_line
  PASS  ichimoku_components
  PASS  ichimoku_line_hull
  PASS  ichimoku_components_hull
  PASS  ema
  PASS  atr
============================================================
  ALL 11 FUNCTIONS PASS PARITY TESTS
============================================================
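The shape of such a parity check can be sketched without the compiled module. In the example below a deque-based midpoint_fast stands in for the Rust function and midpoint_reference plays the pure-Python oracle; both names, and the assert_parity helper, are hypothetical and not part of the package.

```python
import math
import random
from collections import deque

def midpoint_reference(high, low, period):
    """Readable oracle: rescan the whole window at every bar."""
    out = [math.nan] * len(high)
    for i in range(period - 1, len(high)):
        out[i] = (max(high[i - period + 1 : i + 1]) + min(low[i - period + 1 : i + 1])) / 2
    return out

def midpoint_fast(high, low, period):
    """Optimised stand-in: monotonic deques track the window's max and min in O(n)."""
    out = [math.nan] * len(high)
    max_q, min_q = deque(), deque()  # indices of candidate extremes
    for i in range(len(high)):
        while max_q and high[max_q[-1]] <= high[i]:
            max_q.pop()
        max_q.append(i)
        while min_q and low[min_q[-1]] >= low[i]:
            min_q.pop()
        min_q.append(i)
        start = i - period + 1
        while max_q[0] < start:  # drop extremes that left the window
            max_q.popleft()
        while min_q[0] < start:
            min_q.popleft()
        if start >= 0:
            out[i] = (high[max_q[0]] + low[min_q[0]]) / 2
    return out

def assert_parity(actual, expected, atol=1e-12):
    """NaNs must line up exactly; finite values must agree within atol."""
    assert len(actual) == len(expected), "length mismatch"
    for i, (a, e) in enumerate(zip(actual, expected)):
        if math.isnan(a) or math.isnan(e):
            assert math.isnan(a) and math.isnan(e), f"NaN mismatch at index {i}"
        else:
            assert abs(a - e) <= atol, f"index {i}: {a!r} vs {e!r}"

rng = random.Random(42)  # fixed seed, N=200, mirroring the setup described above
high = [50.0 + 100.0 * rng.random() for _ in range(200)]
low = [h - 5.0 * rng.random() for h in high]
for period in (9, 26, 52):
    assert_parity(midpoint_fast(high, low, period), midpoint_reference(high, low, period))
print("parity holds for periods 9, 26, 52")
```

Checking NaN positions separately from finite values is what makes warm-up and NaN-propagation differences show up as failures instead of being silently skipped.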

Before and after

Dimension	Python + Numba	Rust + PyO3
First-call latency	2-5s JIT warmup	Zero
GIL	Held during execution	Released
Memory safety	No bounds checks by default	Ownership checked at compile time, indexing checked at run time
Dependency weight	~150 MB (numba + llvmlite)	~2 MB single .so
Reproducibility	JIT varies across LLVM versions	Deterministic binary

Try it

pip install advanced-ichimoku-cloud

from advanced_ichimoku_cloud import (
    ichimoku_components,       # classic cloud
    ichimoku_components_hull,  # Hull-enhanced cloud
    hullma, wma, ema, atr,    # individual indicators
)

import numpy as np
high = np.random.rand(200) * 100 + 50
low = high - np.random.rand(200) * 5

tenkan, kijun, senkou_a, senkou_b = ichimoku_components(high, low, 9, 26, 52)

GitHub: https://github.com/RMANOV/advanced-ichimoku-cloud

What I learned

rust-numpy's as_slice() (the numpy crate that pairs with PyO3) is the killer feature: zero-copy NumPy access makes Rust competitive even for small arrays
Inner function pattern is essential — without it, cross-module reuse requires double FFI
Bit-exact parity testing catches subtle issues (NaN propagation order, integer division rounding) that benchmarks miss
The history of your domain matters — understanding why Hosoda chose those parameters helped me design better enhanced variants

Built with Rust, PyO3 0.27, and a deep appreciation for a journalist who spent 30 years perfecting a chart.
