Len

Posted on May 15 • Edited on Jun 5

A Gym-style API for algorithmic trading research, in Rust

#rust #tutorial #opensource #machinelearning

Separating strategy logic from execution, matching and reporting. Parameter grid search is a first-class citizen.

End-to-end workflow: Running make run in the chapaty-template project executes the example strategy from this blog post and produces a QuantStats tearsheet.

TL;DR: Chapaty is an open-source Rust backtesting framework with a Gym-style [1] reset / step / act API for algorithmic trading. Strategy logic lives in a single act function. Order execution, matching engine, data sync, and reporting sit behind the simulation environment. For the example strategy in this post, a 400-point parameter grid over 9 years of end-of-day market data runs in ~1 second on an 8-core laptop.

One bottleneck in algorithmic trading workflows is the time from ideation to a backtest result.

Existing tools fall into two groups. Hosted platforms like QuantConnect, TradingView, and MetaTrader 5 offer accessible UIs but rely on closed simulators (and in some cases proprietary languages like Pine Script and MQL5) that you can't fully audit. Open-source alternatives in Rust, such as Nautilus Trader and Barter, cover both backtesting and live trading. Chapaty takes a different angle: it separates the algorithm completely from the framework, with a Rust-native Agent trait built on reset / act, observations handed in per step, and parameter grids as a first-class citizen.

To speed up the Build-Measure-Learn loop [2] for trading strategies, we need two things:

An accessible framework that people are already comfortable with and that is well-established: Gym.
A fast programming language with fearless concurrency, so we can natively run many different strategies in parallel: Rust.

The motivation for Chapaty was to unify these worlds in an open-source project, so that everyone has access to a framework for developing algorithms without having to focus on infrastructure.

The approach that worked was separating strategy logic from the order execution, matching engine, data syncing, and reporting. Now, all the strategy code lives in one place, so iterating on ideas no longer requires touching the rest of the pipeline.

Design Decisions

This is how Chapaty handles common backtesting pitfalls:

Timeframe Synchronization: To avoid look-ahead bias across different timeframes (e.g., mixing 1m, 1h, and Economic Calendar data), the simulation steps through time by selecting the next strictly increasing point_in_time timestamp across all data sources. It then extends the agent's view of the data up to, and including, that point in time.

Synchronizing point-in-time data: the cursor advances to the next strictly monotonically increasing timestamp across all data streams (1-minute, 1-hour, and economic calendar). At each step, every event up to the cursor becomes visible to the agent.
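
The cursor logic can be illustrated with a small std-only sketch. This is not Chapaty's implementation: Event, Timeline, and step are hypothetical names, timestamps are plain integers, and each stream is assumed to be sorted by point_in_time already.

```rust
use std::collections::BTreeSet;

/// Hypothetical event type: only the timestamp matters for synchronization.
#[derive(Debug, Clone, PartialEq)]
pub struct Event {
    pub point_in_time: i64, // moment at which the event becomes known
    pub label: &'static str,
}

/// Walks several time-sorted streams with one shared cursor.
pub struct Timeline<'a> {
    streams: Vec<&'a [Event]>,
    ticks: Vec<i64>, // strictly increasing, deduplicated timestamps
    next_tick: usize,
}

impl<'a> Timeline<'a> {
    pub fn new(streams: Vec<&'a [Event]>) -> Self {
        // A BTreeSet both sorts and deduplicates the timestamps of all streams.
        let ticks: BTreeSet<i64> = streams
            .iter()
            .flat_map(|s| s.iter().map(|e| e.point_in_time))
            .collect();
        Self { streams, ticks: ticks.into_iter().collect(), next_tick: 0 }
    }

    /// Advances the cursor and returns, per stream, the prefix visible at that time.
    pub fn step(&mut self) -> Option<(i64, Vec<&'a [Event]>)> {
        let cursor = *self.ticks.get(self.next_tick)?;
        self.next_tick += 1;
        let views: Vec<&'a [Event]> = self
            .streams
            .iter()
            .copied()
            .map(|s| {
                // Streams are sorted, so everything before the partition point is visible.
                let end = s.partition_point(|e| e.point_in_time <= cursor);
                &s[..end]
            })
            .collect();
        Some((cursor, views))
    }
}
```

Each call to step exposes only events with a timestamp at or before the cursor, so a slower stream (for example, hourly candles) never leaks a value that a faster stream has not yet reached.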

Pessimistic Evaluation: If an entry and a take-profit/stop-loss occur in the exact same candle, the framework defaults to pessimistic evaluation. It assumes the worst-case scenario (e.g., the stop-loss was hit before the take-profit). You can toggle this to "optimistic" if preferred.
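
As a rough illustration of the same-candle rule, here is a minimal sketch that decides which exit wins when both levels fall inside one candle's range. All names (Side, ExitReason, Ambiguity, Candle, resolve_exit) are hypothetical and the logic is a simplification, not the code from the Chapaty repository.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Side {
    Long,
    Short,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum ExitReason {
    StopLoss,
    TakeProfit,
}

/// How to break the tie when both levels are touched within one candle.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Ambiguity {
    Pessimistic,
    Optimistic,
}

pub struct Candle {
    pub high: f64,
    pub low: f64,
}

/// Returns the exit triggered by this candle, if any.
pub fn resolve_exit(
    side: Side,
    stop_loss: f64,
    take_profit: f64,
    candle: &Candle,
    mode: Ambiguity,
) -> Option<ExitReason> {
    // A long is stopped out by the low and takes profit at the high; a short is mirrored.
    let (sl_hit, tp_hit) = match side {
        Side::Long => (candle.low <= stop_loss, candle.high >= take_profit),
        Side::Short => (candle.high >= stop_loss, candle.low <= take_profit),
    };
    match (sl_hit, tp_hit, mode) {
        (false, false, _) => None,
        (true, false, _) => Some(ExitReason::StopLoss),
        (false, true, _) => Some(ExitReason::TakeProfit),
        // The candle alone cannot tell us the order of events, so the mode decides.
        (true, true, Ambiguity::Pessimistic) => Some(ExitReason::StopLoss),
        (true, true, Ambiguity::Optimistic) => Some(ExitReason::TakeProfit),
    }
}
```

With OHLC data the intra-candle path is unknown, which is why defaulting to the stop-loss gives the more conservative backtest.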

Slippage & Fees: Fills occur at the SL/TP price, and I manually subtract a percentage fee and slippage per trade in the journal to approximate real-world results without over-engineering the matching engine. Realism on fills and slippage is still pragmatic rather than microstructure-accurate. A future version will improve fill modeling.
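
To make the cost adjustment concrete, here is one possible way to net a percentage fee and slippage out of a trade's gross PnL. The function name, the parameters, and the choice to charge both rates on the notional of the entry and exit legs are assumptions for illustration; the journal's exact formula may differ.

```rust
/// Net PnL after percentage costs.
/// Assumption: `fee_rate` and `slippage_rate` are fractions (0.001 = 0.1%)
/// charged on the traded notional of both the entry and the exit leg.
pub fn net_pnl(
    entry: f64,
    exit: f64,
    quantity: f64,
    is_long: bool,
    fee_rate: f64,
    slippage_rate: f64,
) -> f64 {
    let direction = if is_long { 1.0 } else { -1.0 };
    let gross = direction * (exit - entry) * quantity;
    // Notional traded across both legs of the round trip.
    let notional = (entry + exit) * quantity;
    gross - notional * (fee_rate + slippage_rate)
}
```

For a long from 100 to 110 with quantity 1, a 0.1% fee, and 0.05% slippage, the gross PnL of 10 shrinks by 210 × 0.0015 = 0.315.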

Data Feeds: The framework is decoupled from the data. It can process any structured event with a point_in_time. Personally, I run it mostly on crypto (Binance Spot: OHLCV, Trades, TPO, Volume Profile across multiple timeframes) and economic calendar events. Because of the abstraction, it works equally well with traditional equities or futures.

The matching engine, the fill logic, the look-ahead handling: they're all in the repository. You can read exactly how a trade gets evaluated. That matters more for backtesting than for most software. If you can't audit the simulator, you can't trust the result. It's the reason why Chapaty needs to be open source.

Backtesting 400 Parameter Combinations in 1 Second (with a Stop-and-Reverse SMA Crossover)

Because the strategy logic is isolated in an act function that receives an Observation, the strategy remains independent of the OHLCV data feed.

Here is how we can implement a Stop-and-Reverse SMA Crossover in three steps. To keep things fast, we will use a StreamingSma helper to handle the rolling calculations.

Step 1: Define the Agent State. First, we create our Agent structure. We need to define the market, our parameters (exposed as fields so that we can run a grid search over them later), and our internal state cache. The last_processed_ts field prevents the same candle from being processed twice.

// Brings StreamingSma into scope
use chapaty::prelude::*;

pub struct DemoAgent {
    // Unique identifier for the market (e.g., BTC-USDT)
    pub ohlcv_id: OhlcvId,

    // Strategy parameters
    pub fast_period: u16,
    pub slow_period: u16,

    // === Internal ===

    // Streaming indicators to avoid O(n) window recalculations
    fast_sma: StreamingSma,
    slow_sma: StreamingSma,

    // Internal state cache
    current_fast: Option<f64>,
    current_slow: Option<f64>,
    trade_counter: i64,
    last_processed_ts: Option<DateTime<Utc>>,
}
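
For readers wondering how a streaming SMA avoids recomputing the whole window, here is a minimal std-only sketch of the idea. RollingSma is a hypothetical stand-in written for this post, not Chapaty's StreamingSma; the only behaviour it mirrors is an update method that returns None while the window is still warming up.

```rust
use std::collections::VecDeque;

/// Hypothetical streaming SMA: O(1) work per update via a running sum.
pub struct RollingSma {
    period: usize,
    window: VecDeque<f64>,
    sum: f64,
}

impl RollingSma {
    pub fn new(period: usize) -> Self {
        assert!(period > 0, "period must be positive");
        Self { period, window: VecDeque::with_capacity(period + 1), sum: 0.0 }
    }

    /// Pushes a value; returns None until `period` values have been seen.
    pub fn update(&mut self, value: f64) -> Option<f64> {
        self.window.push_back(value);
        self.sum += value;
        // Once the window overflows, drop the oldest value from the sum.
        if self.window.len() > self.period {
            if let Some(oldest) = self.window.pop_front() {
                self.sum -= oldest;
            }
        }
        (self.window.len() == self.period).then(|| self.sum / self.period as f64)
    }
}
```

The running sum trades a little floating-point drift over very long series for constant-time updates, which is usually an acceptable trade in a backtest.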

Step 2: Implement the Execution Logic. Next, we implement the Agent trait. The logic flows in a strict pipeline: safely fetch the data → update internal state (idempotency check) → verify the signal → execute the trade.

(If you want to add more complex take-profit/stop-loss logic or other entry signals later, you simply extend this act implementation. I've kept it minimal here for brevity).

impl Agent for DemoAgent {
    fn act(&mut self, obs: Observation) -> ChapatyResult<Actions> {
        let market_view = &obs.market_view;

        // 1. Fetch the latest candle safely
        let Some(candle) = market_view.ohlcv().last_event(&self.ohlcv_id) else {
            return Ok(Actions::no_op());
        };

        // 2. Update Internal State (Idempotency check)
        if self.last_processed_ts != Some(candle.close_timestamp) {
            self.current_fast = self.fast_sma.update(candle.close.0);
            self.current_slow = self.slow_sma.update(candle.close.0);
            self.last_processed_ts = Some(candle.close_timestamp);
        }

        // 3. Check Signal Validity
        let (Some(fast), Some(slow)) = (self.current_fast, self.current_slow) else {
            return Ok(Actions::no_op()); // SMAs are still warming up
        };

        // 4. Determine Position Status
        let agent_id = self.identifier();
        let active_trade = obs.states.find_active_trade_for_agent(&agent_id);
        let market_id: MarketId = self.ohlcv_id.into();

        let mut actions = Actions::new();

        // 5a. Determine the Target State
        let desired_dir = if fast > slow {
            Some(TradeType::Long)
        } else if fast < slow {
            Some(TradeType::Short)
        } else {
            None // fast == slow, no clear signal
        };

        // 5b. Determine the Current State
        let current_dir = active_trade.map(|(_, state)| *state.trade_type());

        // 5c. Bridge the Gap
        if current_dir != desired_dir {
            // 1. Clear the old state if it exists
            if let Some((_, state)) = active_trade {
                actions.add(market_id, self.close_market(state.trade_id()));
            }

            // 2. Enter the new state if there is a signal
            if let Some(dir) = desired_dir {
                actions.add(market_id, self.open(dir));
            }
        }

        Ok(actions)
    }
}

To keep the act method readable, the actual order creation is put in helper methods. Passing entry_price: None directly translates into a market order.

impl DemoAgent {
    fn open(&mut self, trade_type: TradeType) -> Action {
        self.trade_counter += 1;
        Action::Open(OpenCmd {
            agent_id: self.identifier(),
            trade_id: TradeId(self.trade_counter),
            trade_type,
            quantity: Quantity(1.0),
            entry_price: None, // Market Order
            stop_loss: None,
            take_profit: None,
        })
    }

    fn close_market(&self, trade_id: TradeId) -> Action {
        Action::MarketClose(MarketCloseCmd {
            agent_id: self.identifier(),
            trade_id,
            quantity: None,
        })
    }
}

Step 3: Scaling it up with Parallel Grid Search. Now that we have a safe, idempotent agent, we can initialise a grid of them to run many parameter combinations in parallel.

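// iproduct! is provided by the itertools crate; skip this import if the prelude already re-exports it.
use itertools::iproduct;

pub struct DemoAgentGrid {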
    ohlcv_id: OhlcvId,
    fast_period: GridAxis,
    slow_period: GridAxis,
}

impl DemoAgentGrid {
    pub fn baseline(ohlcv_id: OhlcvId) -> ChapatyResult<Self> {
        Ok(Self {
            ohlcv_id,
            fast_period: GridAxis::new("10", "30", "1")?,
            slow_period: GridAxis::new("40", "60", "1")?,
        })
    }

    pub fn build(self) -> Vec<(usize, DemoAgent)> {
        let fasts = self.fast_period.generate();
        let slows = self.slow_period.generate();
        let ohlcv_id = self.ohlcv_id;

        iproduct!(fasts, slows)
            .filter(|(f, s)| f < s)
            .enumerate()
            .map(|(uid, (fast, slow))| (uid, DemoAgent::new(ohlcv_id, fast as u16, slow as u16)))
            .collect()
    }
}

Grid search is parallelised across cores via Rayon. On an 8-core M2 MacBook Air, this simple SMA crossover grid (~400 parameter combinations over ~9 years of BTC daily candles) runs in roughly one second.
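
The fan-out pattern itself is simple enough to sketch without any dependencies. The example below swaps Rayon for scoped threads from the standard library so that it stays self-contained; build_grid, evaluate_parallel, and the score closure are hypothetical names, and the grid uses half-open ranges, which is an assumption about how the axis bounds are interpreted.

```rust
use std::ops::Range;
use std::thread;

/// Builds every (fast, slow) pair with fast < slow from two half-open ranges.
pub fn build_grid(fast: Range<u16>, slow: Range<u16>) -> Vec<(u16, u16)> {
    fast.flat_map(|f| slow.clone().map(move |s| (f, s)))
        .filter(|(f, s)| f < s)
        .collect()
}

/// Scores each parameter pair on worker threads, preserving grid order.
pub fn evaluate_parallel<F>(grid: &[(u16, u16)], workers: usize, score: F) -> Vec<f64>
where
    F: Fn(u16, u16) -> f64 + Sync,
{
    let workers = workers.max(1);
    // Ceiling division so that every pair lands in exactly one chunk.
    let chunk_len = ((grid.len() + workers - 1) / workers).max(1);
    let score = &score;
    thread::scope(|scope| {
        let handles: Vec<_> = grid
            .chunks(chunk_len)
            .map(|part| {
                scope.spawn(move || {
                    part.iter().map(|&(f, s)| score(f, s)).collect::<Vec<f64>>()
                })
            })
            .collect();
        // Joining in spawn order keeps results aligned with the grid.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect::<Vec<f64>>()
    })
}
```

Under the half-open assumption, build_grid(10..30, 40..60) yields 20 × 20 = 400 pairs. Each backtest is independent of the others, which is what makes the search embarrassingly parallel.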

Generated Reports

Every run produces a full set of artifacts in chapaty/reports/:

tearsheet.html: QuantStats [3] report (Sharpe, Sortino, drawdown, rolling stats, vs-benchmark)

Cumulative returns of a bidirectional SMA 20/50 crossover on BTC-USDT (EoD) vs. the S&P 500 benchmark.

Heat map of monthly returns for the bidirectional SMA 20/50 crossover on BTC-USDT (EoD).

portfolio_performance.csv: aggregate metrics
equity_curve.csv / cumulative_returns.csv: time series for custom plots
trade_statistics.csv: per-trade summary stats
leaderboard.csv: top-N agents from a grid search, ranked

(Note: This is a truncated view showing just the top 3 results for a few key metrics. The full output ranks all agents across all performance statistics.)

Top-ranked agents across selected performance metrics, from the generated leaderboard.csv.

journal.csv: every trade, fully traceable (entry/exit UTC timestamps, prices, exit reason, realised PnL in ticks and dollars)

(Note: Here is a peek at a simplified version of the journal. I've hidden a few of the more verbose columns like exact stop-loss/take-profit hits and expected metrics for readability here, but the actual output tracks the full state of every trade.)

Simplified journal entries. Every trade is logged with entry/exit timestamps, prices, and realised PnL.

Final Thoughts

This approach really sped up my iteration loop, but I'm biased as I've worked with Gym-style APIs before, and the reset / step / act pattern feels natural to me.

Sweeping 400 parameter combinations is great for exploration, but it's also a fast way to curve-fit. The numbers in the leaderboard above are in-sample by construction. They tell you which parameters would have worked on the data you already have. They don't tell you which ones will work on the data you haven't seen yet. Anything past the exploration stage needs a proper out-of-sample split (walk-forward, train/test, or both) before any conclusion about a strategy holds up. Chapaty gives you the speed to run the search. The discipline to validate it is on you.
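
As a starting point for that validation step, here is a minimal sketch of rolling walk-forward folds over candle indices. It is independent of Chapaty: Fold and walk_forward are hypothetical names, and the scheme shown (fixed-length training window, rolled forward by one test window at a time) is only one of several common variants.

```rust
use std::ops::Range;

/// One walk-forward fold: fit on `train`, evaluate on the `test` range that follows it.
#[derive(Debug, Clone, PartialEq)]
pub struct Fold {
    pub train: Range<usize>,
    pub test: Range<usize>,
}

/// Splits `n` observations into rolling folds with non-overlapping test windows.
pub fn walk_forward(n: usize, train_len: usize, test_len: usize) -> Vec<Fold> {
    let mut folds = Vec::new();
    if train_len == 0 || test_len == 0 {
        return folds;
    }
    let mut start = 0;
    while start + train_len + test_len <= n {
        let split = start + train_len;
        folds.push(Fold { train: start..split, test: split..split + test_len });
        start += test_len; // roll forward by one test window
    }
    folds
}
```

The grid search would run on each train range only, and the winning parameters would then be scored on the matching test range, so every reported number comes from data the search never saw.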