Sumana

Building a Low-Latency Trading Engine in Rust

When I started building a perpetual futures engine in Rust, the requirements looked straightforward:

  • Binance streams price updates
  • Users constantly check balances
  • Liquidations happen automatically

All of this needs to run at the same time, with low latency and zero mistakes.

Naturally, I started with a Mutex.
It felt like the safe, obvious choice.

But that assumption didn’t last long.


The First Problem: Reads Were Slower Than They Should Be

In this system, most operations are reads:

  • balance checks
  • position queries

Writes (like price updates) are much less frequent.

But with a Mutex, every operation queues behind every other one, reads included.

So even simple reads were waiting behind writes and, worse, sometimes behind network delays, because the engine was tied directly to the WebSocket feed.

That’s when it became clear:

I was treating all operations equally, even though they aren’t.


Switching to RwLock

The first real improvement was replacing Mutex with RwLock.

let engine = Arc::new(RwLock::new(Engine::new(1000.0)));

Now:

  • Multiple readers can access the state at the same time
  • Writers still get exclusive access
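
// Read: many read guards can be held at once
let state = engine.read().await;
drop(state); // release before asking for the write lock

// Write: exclusive access, so it waits for readers to finish
let mut state = engine.write().await;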

This change alone removed the biggest bottleneck.

Reads stopped blocking each other, and latency dropped significantly.
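
To make the reader/writer rule concrete, here is a minimal sketch. It uses `std::sync::RwLock` rather than Tokio's async lock so it runs without a runtime; the many-readers-or-one-writer rule is the same. The `Engine` struct and its `balance` field are placeholders for illustration, not the real engine.

```rust
use std::sync::RwLock;

/// Hypothetical stand-in for the engine state; the field is illustrative.
struct Engine {
    balance: f64,
}

/// Returns three flags: two readers coexist, a writer is refused while
/// readers are active, and a writer is admitted once the readers are gone.
fn demo_read_write_access() -> (bool, bool, bool) {
    let engine = RwLock::new(Engine { balance: 1000.0 });

    // Two read guards can be held at the same time.
    let first = engine.read().unwrap();
    let second = engine.try_read();
    let readers_coexist = second.is_ok() && first.balance == 1000.0;

    // A writer cannot get in while any reader is active.
    let writer_blocked = engine.try_write().is_err();

    // Dropping the read guards releases the lock for writers.
    drop(second);
    drop(first);
    let writer_admitted = match engine.try_write() {
        Ok(mut state) => {
            state.balance -= 100.0;
            true
        }
        Err(_) => false,
    };

    (readers_coexist, writer_blocked, writer_admitted)
}
```

The practical takeaway is to keep read guards short-lived: a guard that lingers delays every writer behind it.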


The Next Issue: Blocking vs Yielding

Even with RwLock, there’s another subtle issue.

If you use a blocking lock inside async code, the entire worker thread waits.

That means while one task is waiting:

  • no other task scheduled on that thread can run
  • the thread is effectively idle

With async:

let mut state = engine.write().await;

The task yields instead of blocking.

That allows:

  • other API calls to run
  • other tasks to progress

This is what makes it possible to handle a large number of concurrent tasks efficiently.


Network I/O Was Still a Problem

Even after fixing locks, something still felt off.

The engine was directly tied to the WebSocket feed.

And network behavior is unpredictable:

  • sometimes fast
  • sometimes slow
  • sometimes delayed

That means your core engine inherits that unpredictability.


Decoupling with MPSC

The fix here was to separate concerns using a channel.

Instead of processing prices directly:

WebSocket → Channel → Engine

let (tx, mut rx) = tokio::sync::mpsc::channel(100);

  • The WebSocket task just sends updates
  • The engine processes them independently

This removes network jitter from the critical path.

The engine becomes more predictable, even if the network isn’t.
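
Here is a rough sketch of that producer/consumer split. It swaps in `std::sync::mpsc::sync_channel` and plain threads for Tokio's channel and tasks so that it runs on its own; the shape is the same. `PriceUpdate` and `run_pipeline` are made-up names for illustration.

```rust
use std::sync::mpsc::sync_channel;
use std::thread;

/// Hypothetical price update; the field name is illustrative.
struct PriceUpdate {
    price_cents: u64,
}

/// Feeds `prices` through a bounded channel and returns what the consumer saw.
fn run_pipeline(prices: Vec<u64>) -> Vec<u64> {
    // Capacity 100 mirrors `channel(100)` above; when the buffer is full,
    // the producer waits instead of piling up unbounded work.
    let (tx, rx) = sync_channel::<PriceUpdate>(100);

    // Producer: stands in for the WebSocket task. It only forwards updates.
    let producer = thread::spawn(move || {
        for price_cents in prices {
            if tx.send(PriceUpdate { price_cents }).is_err() {
                break; // the receiver is gone, nothing left to do
            }
        }
        // `tx` is dropped here, which closes the channel.
    });

    // Consumer: stands in for the engine loop. It drains updates at its own
    // pace and stops once the channel is closed and empty.
    let mut seen = Vec::new();
    for update in rx {
        seen.push(update.price_cents);
    }

    producer.join().expect("producer thread panicked");
    seen
}
```

The bounded capacity is a design choice: it applies backpressure to the feed when the engine falls behind, rather than letting memory grow.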


Sharing State Across Tasks

At this point, multiple parts of the system needed access:

  • WebSocket task
  • API handlers
  • liquidation logic

Rust doesn’t allow multiple owners by default, so this needs to be explicit.

That’s where Arc comes in:

let engine = Arc::new(RwLock::new(Engine::new(1000.0)));

Each task gets a clone:

let engine_clone = Arc::clone(&engine);

Now everything shares the same state safely.
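
As a small illustration of what "each task gets a clone" means, the sketch below hands an `Arc` clone to several threads that all update the same state. It uses std threads and `std::sync::RwLock` so it runs standalone; the `updates_applied` counter and the function name are invented for the example.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical engine state with a single counter for illustration.
struct Engine {
    updates_applied: u32,
}

/// Spawns `workers` threads that each apply one update to the shared engine.
/// Returns the number of updates applied and the remaining owner count.
fn share_across_threads(workers: u32) -> (u32, usize) {
    let engine = Arc::new(RwLock::new(Engine { updates_applied: 0 }));

    let handles: Vec<_> = (0..workers)
        .map(|_| {
            // Cloning the Arc copies a pointer and bumps a reference count;
            // the Engine itself is not copied.
            let engine = Arc::clone(&engine);
            thread::spawn(move || {
                let mut state = engine.write().unwrap();
                state.updates_applied += 1;
            })
        })
        .collect();

    for handle in handles {
        handle.join().expect("worker thread panicked");
    }

    let applied = engine.read().unwrap().updates_applied;
    // Every clone was dropped with its thread, so only one owner remains.
    (applied, Arc::strong_count(&engine))
}
```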


Financial Accuracy: Why f64 Doesn’t Work

This is one of those things that seems small but isn’t.

Using f64:

1000.0 - 0.1 - 0.1 - 0.1
= 999.6999999999999

That error might look tiny, but in a trading system:

  • it accumulates
  • it affects PnL
  • it can break liquidation logic

So instead:

use rust_decimal::Decimal;

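Now values like 0.1 are stored exactly, so adding and subtracting them doesn't drift.

Operations such as division still round, but they round in base 10 at the type's fixed precision, so there are no binary rounding surprises.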
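
To see the difference without pulling in a crate, here is a sketch that swaps `Decimal` for a plain scaled integer. This is a stand-in technique, not how `rust_decimal` is used; the four-decimal `SCALE` and the function names are assumptions made for the example. The idea is the same: store base-10 quantities as integers so that 0.1 is exact.

```rust
/// One unit is 0.0001, i.e. four decimal places. The scale is an assumption.
const SCALE: i64 = 10_000;

/// The subtraction from the post, done in binary floating point.
fn float_result() -> f64 {
    1000.0_f64 - 0.1 - 0.1 - 0.1
}

/// The same subtraction on scaled integers, where 0.1 is exactly 1_000 units.
fn fixed_point_result() -> i64 {
    let balance = 1000 * SCALE;
    let fee = SCALE / 10; // 0.1
    balance - fee - fee - fee
}

/// Formats a non-negative scaled value with four decimal places.
fn format_fixed(value: i64) -> String {
    format!("{}.{:04}", value / SCALE, value % SCALE)
}
```

Here `float_result()` does not compare equal to `999.7`, while `format_fixed(fixed_point_result())` yields `"999.7000"`. A decimal type gives the same exactness plus proper handling of signs, scales, and rounding modes.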


Handling Errors Without Crashing

One last thing: reliability.

In a system like this, a panic isn’t just a bug—it’s downtime.

So instead of:

.unwrap()

Everything returns a Result<T, E>:

let position = self.positions.get(&id)
    .ok_or("Position not found")?;

Errors are handled and propagated, not ignored.
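
For a slightly fuller picture, here is one way the lookup could sit inside a typed error. `EngineError`, `Position`, and the helper functions are hypothetical names for this sketch; the real engine's types may look different.

```rust
use std::collections::HashMap;
use std::fmt;

/// Hypothetical error type; a real engine would likely have more variants.
#[derive(Debug, PartialEq)]
enum EngineError {
    PositionNotFound(u64),
}

impl fmt::Display for EngineError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            EngineError::PositionNotFound(id) => write!(f, "position {} not found", id),
        }
    }
}

struct Position {
    size: i64,
}

struct Engine {
    positions: HashMap<u64, Position>,
}

impl Engine {
    /// Looks up one position; a missing id becomes an error, not a panic.
    fn position_size(&self, id: u64) -> Result<i64, EngineError> {
        let position = self
            .positions
            .get(&id)
            .ok_or(EngineError::PositionNotFound(id))?;
        Ok(position.size)
    }

    /// Sums several positions; `?` stops at the first failure and returns it.
    fn total_size(&self, ids: &[u64]) -> Result<i64, EngineError> {
        let mut total = 0;
        for &id in ids {
            total += self.position_size(id)?;
        }
        Ok(total)
    }
}

/// Builds a small engine with two positions for demonstration.
fn sample_engine() -> Engine {
    let mut positions = HashMap::new();
    positions.insert(1, Position { size: 5 });
    positions.insert(2, Position { size: -2 });
    Engine { positions }
}
```

A typed error lets callers match on the failure (for example, returning a 404 from an API handler) instead of parsing a string.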


Putting It Together

At a high level, the system looks like this:

WebSocket → MPSC Channel → Engine Processor
                                ↓
                         Arc<RwLock<Engine>>
                                ↓
                  API + Reads + Writes + Liquidations

Each part solves a specific problem:

  • RwLock → efficient reads
  • async/await → no thread blocking
  • MPSC → isolates network behavior
  • Arc → shared ownership
  • Decimal → exact calculations
  • Result<T, E> → reliability
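
The pieces can be wired together in a compact sketch like the one below. It is a simplified model, not the real engine: std threads and `sync_channel` stand in for Tokio tasks and channels, prices are integer cents instead of `Decimal`, and the single-position liquidation rule is invented for the example.

```rust
use std::sync::mpsc::sync_channel;
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical engine state; prices are integer cents to avoid float drift.
struct Engine {
    mark_price: i64,
    liquidation_price: i64,
    liquidated: bool,
}

impl Engine {
    /// Records the price and flags a long position once the price falls to
    /// the liquidation threshold. The flag stays set afterwards.
    fn apply_price(&mut self, price: i64) {
        self.mark_price = price;
        if price <= self.liquidation_price {
            self.liquidated = true;
        }
    }
}

/// Runs feed -> channel -> processor -> shared state, then reads the result.
fn run_engine_pipeline(prices: Vec<i64>, liquidation_price: i64) -> Result<(i64, bool), String> {
    let engine = Arc::new(RwLock::new(Engine {
        mark_price: 0,
        liquidation_price,
        liquidated: false,
    }));
    let (tx, rx) = sync_channel::<i64>(100);

    // Feed: stands in for the WebSocket task and only forwards prices.
    let feed = thread::spawn(move || {
        for price in prices {
            if tx.send(price).is_err() {
                break;
            }
        }
    });

    // Processor: the only writer. It holds the lock just for the update.
    let processor_engine = Arc::clone(&engine);
    let processor = thread::spawn(move || {
        for price in rx {
            match processor_engine.write() {
                Ok(mut state) => state.apply_price(price),
                Err(_) => return Err("engine lock poisoned".to_string()),
            }
        }
        Ok(())
    });

    feed.join().map_err(|_| "feed thread panicked".to_string())?;
    processor
        .join()
        .map_err(|_| "processor thread panicked".to_string())??;

    // Reader: stands in for an API handler checking the state.
    let state = engine
        .read()
        .map_err(|_| "engine lock poisoned".to_string())?;
    Ok((state.mark_price, state.liquidated))
}
```

Feeding it prices that dip below the threshold and then recover leaves the last price in `mark_price` with `liquidated` still set, since the processor sees every update in order.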

Final Thought

None of these choices are “fancy.”

They’re just responses to real constraints:

  • high read volume
  • unpredictable I/O
  • strict correctness requirements

Once those constraints are clear, the architecture almost designs itself.
