DEV Community

Len

An Open-Source Gym-Style Backtesting Framework for Algorithmic Trading in Rust

End-to-end workflow: Running make run in the chapaty-template project executes the example strategy from this blog post and produces a QuantStats tearsheet.

TL;DR: Chapaty is an open-source Rust backtesting framework with a Gym-style [1] reset / step / act API for algorithmic trading. Strategy logic lives in a single act function. Order execution, matching engine, data sync, and reporting sit behind the simulation environment. For the example strategy in this post, a 400-point parameter grid over 9 years of end-of-day market data runs in ~1 second on an 8-core laptop.


One bottleneck in algorithmic trading workflows is the time from ideation to a backtest result.

Existing tools fall into two camps. Hosted platforms like QuantConnect, TradingView, and MetaTrader 5 offer accessible UIs but rely on closed simulators (and in some cases proprietary languages like Pine Script and MQL5) that you can't fully audit. Open-source alternatives like Nautilus Trader and Barter exist in Rust and cover both backtesting and live trading. Chapaty is built around separating the algorithm completely from the framework: a Rust-native Agent trait with reset / act, observations handed in per step, and parameter grids as first-class citizens.
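To make the reset / step / act split concrete, here is a minimal, dependency-free sketch of a Gym-style loop. Every type in it (Observation, Action, Agent, Environment, Momentum) is a hypothetical stand-in written for this illustration, not Chapaty's actual definition; the real Agent trait returns ChapatyResult<Actions>, as shown later in this post.

```rust
// Hypothetical types for illustration only; these are not Chapaty's definitions.

/// What the agent sees at one step: here just the latest close price.
#[derive(Debug, Clone, Copy)]
pub struct Observation {
    pub close: f64,
}

/// What the agent may ask the environment to do.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Action {
    Hold,
    Buy,
    Sell,
}

/// The strategy side of the contract: all decision logic sits behind `act`.
pub trait Agent {
    fn reset(&mut self);
    fn act(&mut self, obs: Observation) -> Action;
}

/// The simulation side: owns the data and advances time.
pub struct Environment {
    closes: Vec<f64>,
    cursor: usize,
}

impl Environment {
    pub fn new(closes: Vec<f64>) -> Self {
        Self { closes, cursor: 0 }
    }

    /// Rewind to the start and return the first observation, if any.
    pub fn reset(&mut self) -> Option<Observation> {
        self.cursor = 0;
        self.closes.first().map(|&close| Observation { close })
    }

    /// Accept the action (ignored in this sketch) and move one step forward.
    /// Returns `None` once the data is exhausted.
    pub fn step(&mut self, _action: Action) -> Option<Observation> {
        self.cursor += 1;
        self.closes.get(self.cursor).map(|&close| Observation { close })
    }
}

/// Drive one episode and return every action the agent emitted.
pub fn run_episode(env: &mut Environment, agent: &mut impl Agent) -> Vec<Action> {
    agent.reset();
    let mut actions = Vec::new();
    let mut obs = env.reset();
    while let Some(o) = obs {
        let action = agent.act(o);
        actions.push(action);
        obs = env.step(action);
    }
    actions
}

/// Toy agent: buys when the price rose since the last step, sells when it fell.
#[derive(Default)]
pub struct Momentum {
    last: Option<f64>,
}

impl Agent for Momentum {
    fn reset(&mut self) {
        self.last = None;
    }

    fn act(&mut self, obs: Observation) -> Action {
        let action = match self.last {
            Some(prev) if obs.close > prev => Action::Buy,
            Some(prev) if obs.close < prev => Action::Sell,
            _ => Action::Hold,
        };
        self.last = Some(obs.close);
        action
    }
}
```

The agent never touches the price series directly. It only sees what the environment hands to act, which is what keeps the strategy swappable without changes to the simulation loop.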

To speed up the Build-Measure-Learn loop [2] for trading strategies, we need two things:

  1. An accessible framework that people are already comfortable with and that is well-established: Gym.
  2. A fast programming language with fearless concurrency, so we can natively run many different strategies in parallel: Rust.

The motivation for Chapaty was to unify these worlds in an open-source project, so that everyone has access to a framework for developing algorithms without having to focus on infrastructure.

The approach that worked was separating strategy logic from the order execution, matching engine, data syncing, and reporting. Now, all the strategy code lives in one place, so iterating on ideas no longer requires touching the rest of the pipeline.


Design Decisions

This is how Chapaty handles common backtesting pitfalls:

Timeframe Synchronization: To avoid look-ahead bias across different timeframes (e.g., mixing 1m, 1h, and Economic Calendar data), the simulation steps through time with a single cursor. At each step, the cursor moves to the earliest point_in_time timestamp, across all data sources, that is strictly greater than the current one. The agent's view of the data is then extended up to that point in time and no further.

Synchronizing point-in-time data: the cursor advances to the next strictly monotonically increasing timestamp across all data streams (1-minute, 1-hour, and economic calendar). At each step, every event up to the cursor becomes visible to the agent.
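The cursor logic can be sketched in a few lines. This is an illustrative reconstruction from the description above, not code from the Chapaty repository: Stream, next_cursor, visible_len, and replay are hypothetical names, timestamps are plain integers (e.g. Unix seconds), and each stream is assumed to be sorted ascending.

```rust
/// One data stream: event timestamps (e.g. Unix seconds), sorted ascending.
pub struct Stream {
    pub name: &'static str,
    pub timestamps: Vec<i64>,
}

/// Smallest timestamp across all streams that is strictly greater than `cursor`.
/// With `cursor == None`, this is the earliest timestamp overall.
pub fn next_cursor(streams: &[Stream], cursor: Option<i64>) -> Option<i64> {
    streams
        .iter()
        .flat_map(|s| s.timestamps.iter().copied())
        .filter(|&ts| cursor.map_or(true, |c| ts > c))
        .min()
}

/// Number of events in `stream` visible at `cursor` (timestamp <= cursor).
pub fn visible_len(stream: &Stream, cursor: i64) -> usize {
    // `partition_point` is a binary search; it relies on the ascending order.
    stream.timestamps.partition_point(|&ts| ts <= cursor)
}

/// Walk the whole timeline and record, per step, how many events each stream exposes.
pub fn replay(streams: &[Stream]) -> Vec<(i64, Vec<usize>)> {
    let mut steps = Vec::new();
    let mut cursor = None;
    while let Some(next) = next_cursor(streams, cursor) {
        let visible = streams.iter().map(|s| visible_len(s, next)).collect();
        steps.push((next, visible));
        cursor = Some(next);
    }
    steps
}
```

With a 1-minute stream at 60, 120, 180, an hourly event at 120, and a calendar event at 90, the cursor visits 60, 90, 120, 180, and the hourly event only becomes visible from 120 onwards. The linear scan in next_cursor keeps the sketch short; a real implementation would more likely keep one index per stream or a min-heap.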

Pessimistic Evaluation: If an entry and a take-profit/stop-loss occur in the exact same candle, the framework defaults to pessimistic evaluation. It assumes the worst-case scenario (e.g., the stop-loss was hit before the take-profit). You can toggle this to "optimistic" if preferred.
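As a rough sketch of that rule, the function below resolves a single candle against a stop-loss and a take-profit. The types (Side, Exit, Policy, Candle) and the function resolve_exit are hypothetical names for this illustration, and the logic is a simplification rather than Chapaty's matching engine.

```rust
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Side {
    Long,
    Short,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Exit {
    StopLoss,
    TakeProfit,
}

#[derive(Debug, Clone, Copy, PartialEq)]
pub enum Policy {
    Pessimistic,
    Optimistic,
}

/// Only the candle's extremes matter for this check.
pub struct Candle {
    pub high: f64,
    pub low: f64,
}

/// Decide how an open trade exits within one candle, if at all.
/// When both levels lie inside the candle's range, the intrabar order is
/// unknown, so `policy` breaks the tie.
pub fn resolve_exit(
    side: Side,
    candle: &Candle,
    stop_loss: f64,
    take_profit: f64,
    policy: Policy,
) -> Option<Exit> {
    let (sl_hit, tp_hit) = match side {
        Side::Long => (candle.low <= stop_loss, candle.high >= take_profit),
        Side::Short => (candle.high >= stop_loss, candle.low <= take_profit),
    };
    match (sl_hit, tp_hit, policy) {
        (true, true, Policy::Pessimistic) => Some(Exit::StopLoss),
        (true, true, Policy::Optimistic) => Some(Exit::TakeProfit),
        (true, false, _) => Some(Exit::StopLoss),
        (false, true, _) => Some(Exit::TakeProfit),
        (false, false, _) => None,
    }
}
```

For a long trade with a stop at 95 and a target at 105, a candle spanning 90 to 110 touches both levels. The pessimistic policy books the stop-loss, the optimistic one books the take-profit.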

Slippage & Fees: Fills occur at the SL/TP price, and I manually subtract a percentage fee and slippage per trade in the journal to approximate real-world results without over-engineering the matching engine. Realism on fills and slippage is still pragmatic rather than microstructure-accurate. A future version will improve fill modeling.
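One plausible way to apply such an adjustment is sketched below. The exact formula used in the journal is not given in this post, so the assumption here is that fee and slippage are both charged as a fraction of the traded notional on the entry leg and on the exit leg. Costs and net_pnl are hypothetical names.

```rust
/// Cost assumptions, expressed as fractions of traded notional (0.001 = 0.1%).
pub struct Costs {
    pub fee_rate: f64,
    pub slippage_rate: f64,
}

/// Net PnL of a round trip after subtracting fee and slippage on both legs.
/// Assumption: costs scale with notional (price * quantity) at entry and exit.
pub fn net_pnl(is_long: bool, entry: f64, exit: f64, quantity: f64, costs: &Costs) -> f64 {
    let direction = if is_long { 1.0 } else { -1.0 };
    let gross = direction * (exit - entry) * quantity;
    let traded_notional = (entry + exit) * quantity;
    gross - traded_notional * (costs.fee_rate + costs.slippage_rate)
}
```

For example, a long from 100 to 110 with quantity 1, a 0.1% fee, and 0.05% slippage has a gross PnL of 10 and a traded notional of 210, so the costs come to 210 × 0.0015 = 0.315 and the net PnL is 9.685.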

Data Feeds: The framework is decoupled from the data. It can process any structured event with a point_in_time. Personally, I run it mostly on crypto (Binance Spot: OHLCV, Trades, TPO, Volume Profile across multiple timeframes) and economic calendar events. Because of the abstraction, it works equally well with traditional equities or futures.
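The decoupling idea can be illustrated with a small trait. This is a sketch under assumptions: PointInTime, Bar, CalendarEvent, and last_visible are hypothetical names chosen for this example, and Chapaty's own event abstraction may look different.

```rust
/// Anything that can be placed on the simulation timeline.
pub trait PointInTime {
    /// Moment at which the event becomes known (e.g. Unix seconds).
    fn point_in_time(&self) -> i64;
}

/// A price bar is known once it has closed.
pub struct Bar {
    pub close_ts: i64,
    pub close: f64,
}

/// A calendar event is known at its release time.
pub struct CalendarEvent {
    pub release_ts: i64,
    pub title: &'static str,
}

impl PointInTime for Bar {
    fn point_in_time(&self) -> i64 {
        self.close_ts
    }
}

impl PointInTime for CalendarEvent {
    fn point_in_time(&self) -> i64 {
        self.release_ts
    }
}

/// Latest event visible at `cursor`, for any event type.
/// `events` must be sorted ascending by `point_in_time`.
pub fn last_visible<E: PointInTime>(events: &[E], cursor: i64) -> Option<&E> {
    let visible = events.partition_point(|e| e.point_in_time() <= cursor);
    events[..visible].last()
}
```

Because last_visible only depends on the trait, the same lookup serves candles, trades, or calendar releases without knowing anything about their payload.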

The matching engine, the fill logic, the look-ahead handling: they're all in the repository. You can read exactly how a trade gets evaluated. That matters more for backtesting than for most software. If you can't audit the simulator, you can't trust the result. It's the reason why Chapaty needs to be open source.


Backtesting 400 Parameter Combinations in 1 Second (with a Stop-and-Reverse SMA Crossover)

By isolating the strategy logic in an act function that receives an Observation, the strategy remains independent from the OHLCV data feed.

Here is how we can implement a Stop-and-Reverse SMA Crossover in three steps. To keep things fast, we will use a StreamingSma helper to handle the rolling calculations.
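For readers who have not seen a streaming moving average before, the sketch below shows the general idea: keep a running sum and a fixed-size window so each update costs O(1) instead of re-summing the whole window. RollingSma is a hypothetical stand-in written for this post, not the source of Chapaty's StreamingSma; it only mirrors the update-returns-Option shape used by the agent code.

```rust
use std::collections::VecDeque;

/// Simple moving average with O(1) updates via a running sum.
pub struct RollingSma {
    period: usize,
    window: VecDeque<f64>,
    sum: f64,
}

impl RollingSma {
    pub fn new(period: usize) -> Self {
        assert!(period > 0, "period must be positive");
        Self {
            period,
            window: VecDeque::with_capacity(period + 1),
            sum: 0.0,
        }
    }

    /// Feed one value. Returns `None` while warming up, then the average
    /// of the most recent `period` values.
    pub fn update(&mut self, value: f64) -> Option<f64> {
        self.window.push_back(value);
        self.sum += value;
        if self.window.len() > self.period {
            // Drop the value that just left the window from the running sum.
            if let Some(oldest) = self.window.pop_front() {
                self.sum -= oldest;
            }
        }
        if self.window.len() == self.period {
            Some(self.sum / self.period as f64)
        } else {
            None
        }
    }
}
```

With a period of 3, feeding 1, 2, 3, 4 yields None, None, Some(2.0), Some(3.0). A running sum can accumulate floating-point drift over very long series, so a production implementation may want to re-sum the window periodically.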

Step 1: Define the Agent State. First, we create our Agent structure. We need to define the market, our parameters (exposing them as fields makes it easy to run a grid search later), and our internal state cache. The last_processed_ts field prevents the same candle from being processed twice.

// Brings StreamingSma into scope
use chapaty::prelude::*;

pub struct DemoAgent {
    // Unique identifier for the market (e.g., BTC-USDT)
    pub ohlcv_id: OhlcvId,

    // Strategy parameters
    pub fast_period: u16,
    pub slow_period: u16,

    // === Internal ===

    // Streaming indicators to avoid O(n) window recalculations
    fast_sma: StreamingSma,
    slow_sma: StreamingSma,

    // Internal state cache
    current_fast: Option<f64>,
    current_slow: Option<f64>,
    trade_counter: i64,
    last_processed_ts: Option<DateTime<Utc>>,
}

Step 2: Implement the Execution Logic. Next, we implement the Agent trait. The logic flows in a strict pipeline: safely fetch the data → update internal state (idempotency check) → verify the signal → execute the trade.

(If you want to add more complex take-profit/stop-loss logic or other entry signals later, you simply extend this act implementation. I've kept it minimal here for brevity.)

impl Agent for DemoAgent {
    fn act(&mut self, obs: Observation) -> ChapatyResult<Actions> {
        let market_view = &obs.market_view;

        // 1. Fetch the latest candle safely
        let Some(candle) = market_view.ohlcv().last_event(&self.ohlcv_id) else {
            return Ok(Actions::no_op());
        };

        // 2. Update Internal State (Idempotency check)
        if self.last_processed_ts != Some(candle.close_timestamp) {
            self.current_fast = self.fast_sma.update(candle.close.0);
            self.current_slow = self.slow_sma.update(candle.close.0);
            self.last_processed_ts = Some(candle.close_timestamp);
        }

        // 3. Check Signal Validity
        let (Some(fast), Some(slow)) = (self.current_fast, self.current_slow) else {
            return Ok(Actions::no_op()); // SMAs are still warming up
        };

        // 4. Look up active positions
        let agent_id = self.identifier();
        let active_trade = obs.states.find_active_trade_for_agent(&agent_id);
        let market_id: MarketId = self.ohlcv_id.into();

        let mut actions = Actions::new();

        // 5. Signal Logic (Stop-and-Reverse)
        if fast > slow {
            // Bullish Trend
            if let Some((_, state)) = active_trade {
                if state.trade_type() == &TradeType::Short {
                    // Close the Short
                    actions.add(market_id, self.close_market(state.trade_id()));
                    // Open a Long
                    actions.add(market_id, self.open(TradeType::Long));
                }
            } else {
                // Flat -> Open Long
                actions.add(market_id, self.open(TradeType::Long));
            }
        } else if fast < slow {
            // Bearish Trend
            if let Some((_, state)) = active_trade {
                if state.trade_type() == &TradeType::Long {
                    // Close the Long
                    actions.add(market_id, self.close_market(state.trade_id()));
                    // Open a Short
                    actions.add(market_id, self.open(TradeType::Short));
                }
            } else {
                // Flat -> Open Short
                actions.add(market_id, self.open(TradeType::Short));
            }
        }

        Ok(actions)
    }
}

To keep the act method readable, the actual order creation is put in helper methods. Passing entry_price: None directly translates into a market order.

impl DemoAgent {
    fn open(&mut self, trade_type: TradeType) -> Action {
        self.trade_counter += 1;
        Action::Open(OpenCmd {
            agent_id: self.identifier(),
            trade_id: TradeId(self.trade_counter),
            trade_type,
            quantity: Quantity(1.0),
            entry_price: None, // Market Order
            stop_loss: None,
            take_profit: None,
        })
    }

    fn close_market(&self, trade_id: TradeId) -> Action {
        Action::MarketClose(MarketCloseCmd {
            agent_id: self.identifier(),
            trade_id,
            quantity: None,
        })
    }
}

Step 3: Scaling it up with Parallel Grid Search. Now that we have a safe, idempotent agent, we can initialise a grid of them to run many parameter combinations in parallel.

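// `iproduct!` builds the Cartesian product and comes from the itertools crate
use itertools::iproduct;

pub struct DemoAgentGrid {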
    ohlcv_id: OhlcvId,
    fast_period: GridAxis,
    slow_period: GridAxis,
}

impl DemoAgentGrid {
    pub fn baseline(ohlcv_id: OhlcvId) -> ChapatyResult<Self> {
        Ok(Self {
            ohlcv_id,
            fast_period: GridAxis::new("10", "30", "1")?,
            slow_period: GridAxis::new("40", "60", "1")?,
        })
    }

    pub fn build(self) -> (usize, Vec<(usize, DemoAgent)>) {
        let fasts = self.fast_period.generate();
        let slows = self.slow_period.generate();

        // 1. Eagerly collect valid combinations into a flat Vector
        let valid_args = iproduct!(fasts, slows)
            // Example filter: Fast must be less than Slow
            .filter(|(f, s)| f < s)
            .collect::<Vec<_>>();

        let total_combinations = valid_args.len();
        let ohlcv_id = self.ohlcv_id;

        // 2. Map to Agent instances
        let agents = valid_args
            .into_iter()
            .enumerate()
            .map(|(uid, (fast, slow))| (uid, DemoAgent::new(ohlcv_id, fast as u16, slow as u16)))
            .collect::<Vec<_>>();

        (total_combinations, agents)
    }
}

// In main.rs run (`par_bridge` needs `use rayon::prelude::*;`):
// let (count, agents) = DemoAgentGrid::baseline(ohlcv)?.build();
// let leaderboard = env.evaluate_agents(agents.into_iter().par_bridge(), 100, count as u64)?;

Grid search is parallelised across cores via Rayon. On an 8-core M2 MacBook Air, this simple SMA crossover grid (~400 parameter combinations over ~9 years of BTC daily candles) runs in roughly one second.
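The shape of such a search can be reproduced with the standard library alone. The sketch below swaps Rayon for std::thread::scope so it has no dependencies, and it is not how Chapaty schedules its work. The names grid and evaluate_parallel are hypothetical, and the half-open ranges are an assumption chosen so the grid has exactly 400 points; whether GridAxis includes its end value is not stated in this post.

```rust
use std::ops::Range;
use std::thread;

/// All (fast, slow) pairs from two half-open ranges, keeping only fast < slow.
pub fn grid(fast: Range<u16>, slow: Range<u16>) -> Vec<(u16, u16)> {
    fast.flat_map(|f| slow.clone().map(move |s| (f, s)))
        .filter(|(f, s)| f < s)
        .collect()
}

/// Evaluate every parameter pair with `score`, spreading the work over
/// `workers` threads. Results come back in the same order as `params`.
pub fn evaluate_parallel<F>(params: &[(u16, u16)], workers: usize, score: F) -> Vec<f64>
where
    F: Fn(u16, u16) -> f64 + Sync,
{
    // At least one element per chunk, so `chunks` never receives zero.
    let chunk_size = params.len().div_ceil(workers.max(1)).max(1);
    let score = &score;
    thread::scope(|scope| {
        // Spawn all workers first so they run concurrently...
        let handles: Vec<_> = params
            .chunks(chunk_size)
            .map(|chunk| {
                scope.spawn(move || {
                    chunk.iter().map(|&(f, s)| score(f, s)).collect::<Vec<f64>>()
                })
            })
            .collect();
        // ...then join them in order to preserve the input ordering.
        handles
            .into_iter()
            .flat_map(|h| h.join().expect("worker thread panicked"))
            .collect()
    })
}
```

Calling grid(10..30, 40..60) yields 20 × 20 = 400 pairs, and evaluate_parallel(&pairs, 8, score) returns one score per pair. Each backtest is independent of the others, which is why this kind of workload parallelises so cleanly.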


Generated Reports

Every run produces a full set of artifacts in chapaty/reports/:

tearsheet.html: QuantStats [3] report (Sharpe, Sortino, drawdown, rolling stats, vs-benchmark)

Cumulative returns of a bidirectional SMA 20/50 crossover on BTC-USDT (EoD) vs. the S&P 500 benchmark.

Heat map of monthly returns for the bidirectional SMA 20/50 crossover on BTC-USDT (EoD).

portfolio_performance.csv: aggregate metrics

equity_curve.csv / cumulative_returns.csv: time series for custom plots

trade_statistics.csv: per-trade summary stats

leaderboard.csv: top-N agents from a grid search, ranked

(Note: This is a truncated view showing just the top 3 results for a few key metrics. The full output ranks all agents across all performance statistics.)

Top-ranked agents across selected performance metrics, from the generated leaderboard.csv.

journal.csv: every trade, fully traceable (entry/exit UTC timestamps, prices, exit reason, realised PnL in ticks and dollars)

(Note: Here is a peek at a simplified version of the journal. I've hidden a few of the more verbose columns like exact stop-loss/take-profit hits and expected metrics for readability here, but the actual output tracks the full state of every trade.)

Simplified journal entries. Every trade is logged with entry/exit timestamps, prices, and realised PnL.


Final Thoughts

This approach really sped up my iteration loop, but I'm biased as I've worked with Gym-style APIs before, and the reset / step / act pattern feels natural to me.

Sweeping 400 parameter combinations is great for exploration, but it's also a fast way to curve-fit. The numbers in the leaderboard above are in-sample by construction. They tell you which parameters would have worked on the data you already have. They don't tell you which ones will work on the data you haven't seen yet. Anything past the exploration stage needs a proper out-of-sample split (walk-forward, train/test, or both) before any conclusion about a strategy holds up. Chapaty gives you the speed to run the search. The discipline to validate it is on you.
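A walk-forward split is simple to generate by hand. The helper below is a minimal sketch and not part of Chapaty: Fold and walk_forward are hypothetical names, and the indices refer to positions in whatever candle series you backtest on. The idea is to run the grid search on each train range, then evaluate only the winning parameters on the test range that follows it.

```rust
use std::ops::Range;

/// One walk-forward fold: fit parameters on `train`, judge them on `test`.
#[derive(Debug, Clone, PartialEq)]
pub struct Fold {
    pub train: Range<usize>,
    pub test: Range<usize>,
}

/// Rolling walk-forward folds over `n` bars. Each fold trains on `train_len`
/// bars, tests on the next `test_len` bars, then slides forward by `test_len`.
pub fn walk_forward(n: usize, train_len: usize, test_len: usize) -> Vec<Fold> {
    let mut folds = Vec::new();
    if train_len == 0 || test_len == 0 {
        return folds;
    }
    let mut start = 0;
    while start + train_len + test_len <= n {
        let split = start + train_len;
        folds.push(Fold {
            train: start..split,
            test: split..split + test_len,
        });
        start += test_len;
    }
    folds
}
```

With 10 bars, a training window of 4, and a test window of 2, this produces three folds: train 0..4 / test 4..6, train 2..6 / test 6..8, and train 4..8 / test 8..10. Every test range starts where its training range ends, so no fold is scored on data it was fitted to.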


Accessing the Code

The template repo has a make run quick-start that reproduces the demo above. It also ships with prompt templates for LLM-assisted development.

https://github.com/LenWilliamson/chapaty-template


References

[1] Farama Foundation, Gymnasium (2024), https://github.com/Farama-Foundation/Gymnasium

[2] E. Ries, The Lean Startup (2011), Crown Business

[3] R. Aroussi, QuantStats: Portfolio analytics for quants (2019), https://github.com/ranaroussi/quantstats
