DEV Community

Crashland

Everyone Is Talking About AI Trading Bots. We Actually Built One. Here's What Happened.

Search YouTube for "AI trading bot Polymarket" and you'll find dozens of videos. Most of them are people explaining how trading bots could work, or demoing someone else's code, or showing a backtest that conveniently ends before the losing period.

We got tired of watching. So we built one.

The original plan wasn't even market-making. We started looking for pricing arbitrage — mispricings between Polymarket and the underlying reference prices (BTC/ETH on Binance). When the bot first launched, we actually found a handful of real opportunities. The math worked. We captured them.

Then January 2025 hit. The opportunities dried up almost overnight. Commission fees ate the remaining edge. More sophisticated market makers entered the space with deeper capital and tighter spreads. What was briefly a gap in the market closed — quickly and completely.

So we pivoted to market-making. If you can't beat the arbitrage competition, become liquidity. Quote both sides, earn the spread, manage your inventory. That's where the real engineering challenge started.

We ran it on paper. We lost money (paper money, not real). The bot was quoting correctly, getting filled on both sides — and still bleeding. We figured out why. It's not what most people talk about when they discuss trading bots.

We're open sourcing the whole thing — the OMS, the algorithm, the UI, and this postmortem. Including the failure.

The most interesting part of this project wasn't the algorithm. It was realizing how much complexity lives in order and position management before you ever get to ask "does my edge work?" That's what this article is mostly about.

One more thing before we start: the entire codebase was built with Claude as a coding partner — architecture decisions, Rust OMS design, WebSocket handling, EIP-712 signing. An AI agent built a trading bot designed to trade autonomously. We have thoughts on how that actually works in practice — which we'll get to at the end.


What Is Market-Making?

Market-making is the practice of continuously quoting both sides of a market — posting a buy price and a sell price simultaneously. Your profit comes from the bid-ask spread: if you buy at $0.48 and sell at $0.52, you pocket $0.04 per round trip. Do that thousands of times and the math works in your favor.

The catch is adverse selection. Sophisticated traders — people with better information than you — will selectively take your quotes when they know something you don't. They buy your asks when the asset is about to go up. They hit your bids when it's about to go down. You end up holding inventory at the wrong price, and your spread income gets wiped out by directional losses.

This dynamic is worse in prediction markets.

Why Polymarket Is a Different Beast

Polymarket is a decentralized prediction market where you trade contracts that resolve to exactly $1.00 (YES) or $0.00 (NO). You're not trading a continuous asset like a stock or cryptocurrency — you're pricing a probability. A contract trading at $0.65 says the market believes there's a 65% chance the event happens.

This creates several non-trivial problems for a market maker:

  • Binary resolution: every position eventually goes to zero or one. There's no averaging out over time. If you hold the wrong side through resolution, the loss is total.
  • Thin liquidity: most Polymarket markets have spreads of 5–20 cents and shallow books. Your orders move the market. Your fills are visible.
  • On-chain settlement: Polymarket uses Polygon. Every trade involves gas costs and on-chain settlement latency. This isn't just a fee — it affects how quickly you can update your quotes.
  • Information asymmetry: sophisticated participants track news, social media, and off-chain information. Your fair value model is competing against people with much richer signal.

We knew all of this going in. We built anyway. Here's what we found.


The Order Book Problem — Why Flow Control Is the Hard Part

When people think about algo trading, they think about the algorithm. The fancy model. The edge. But in practice, the algorithm is the easy part. The hard part is order flow management — keeping an accurate, real-time picture of every order you've placed and what's happened to it.

This is the problem that broke our TypeScript implementation. It's the reason we rewrote in Rust.

What an Order Book Actually Is

An order book is a live, continuously updating ledger of all outstanding buy and sell orders in a market. On the buy side, you have bids — offers to purchase at specific prices. On the sell side, you have asks — offers to sell. The difference between the best bid and best ask is the spread.

Your bot reads the order book to answer basic questions: What's the best bid right now? What's the best ask? How much volume is sitting at each price level? How deep is the book if I need to exit a large position?

The problem is that the order book changes constantly. Other traders are adding, cancelling, and filling orders at millisecond intervals. Any snapshot you take is already stale by the time you act on it.

Why Flow Control Matters

When you place an order, it enters the book with a status of "pending." It might:

  • Fill completely
  • Fill partially and sit on the book
  • Be cancelled by you, or expire
  • Fail because of an API error or insufficient funds

Meanwhile, your strategy loop is running on a timer, placing new orders. The market is moving. Your position is changing. If you lose track of the relationship between your pending orders and your actual filled position, you're flying blind.

Consider what happens when you lose track of a partial fill. Your strategy thinks you have 100 contracts pending at $0.60. In reality, 40 were filled — so you own 40 contracts, and 60 are still pending. If the strategy decides to buy more because it thinks there's nothing filled yet, you've just doubled your exposure without meaning to. In a market that resolves at zero, that's a disaster.
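A minimal two-ledger sketch makes this failure mode concrete (the types and names here are illustrative, not the repo's actual API):

```rust
// Two ledgers: quantity resting on the book vs. quantity actually owned.
// Losing track of the split between them is the bug described above.

#[derive(Default)]
pub struct Ledger {
    pub pending_qty: i64,  // contracts resting on the book (order inventory)
    pub position_qty: i64, // contracts actually owned (position inventory)
}

impl Ledger {
    /// Record a new resting buy order.
    pub fn place(&mut self, qty: i64) {
        self.pending_qty += qty;
    }

    /// A partial fill moves quantity from "pending" to "owned".
    pub fn fill(&mut self, qty: i64) {
        self.pending_qty -= qty;
        self.position_qty += qty;
    }

    /// Total exposure if every resting order were to fill.
    pub fn worst_case_exposure(&self) -> i64 {
        self.pending_qty + self.position_qty
    }
}
```

A strategy that reads only `position_qty` sees 40 contracts; one that reads only `pending_qty` sees 60. Sizing decisions have to account for both — the worst-case exposure is still 100.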

This is the order tracking problem. And it's not a theoretical concern — it's what caused real (paper) losses in our TypeScript version.

The Async Challenge

A production trading system has multiple things happening simultaneously:

  • Market data arrives on one WebSocket connection — order book updates, price changes, trade feed
  • Fill notifications arrive on another WebSocket — your orders being filled, partially filled, cancelled
  • Your strategy loop runs on a timer — evaluating the market and deciding whether to place, cancel, or modify orders
  • Your web UI needs to display current state — your positions, pending orders, P&L

All of this is concurrent. These tasks don't wait for each other. The fill notification WebSocket doesn't pause while your strategy loop is running. The strategy loop doesn't wait for the UI to finish rendering.

This means you need a data structure that multiple async tasks can safely read and write simultaneously — without locking each other out, without missing updates, without race conditions.

What Naive Implementations Look Like (And Why They Break)

The obvious approaches don't hold up under real conditions:

Python dict + threading.Lock: every read and write acquires a mutex lock. Under load, this creates lock contention. Your fill notification arrives while the strategy loop holds the lock. The fill update blocks. By the time it processes, your strategy has already made decisions based on stale state. You miss fills.

Go channels: elegant for message passing, but adds latency on the critical path. Order updates are queued sequentially. Under burst conditions — multiple fills arriving simultaneously during a market move — you build up queue lag. Your state lags reality.

TypeScript (our original implementation): JavaScript has no native concurrent data structures. The V8 event loop is single-threaded. We tried to manage concurrency with careful async/await sequencing, but event loop contention was unavoidable. Multiple async callbacks competing for the event loop meant we occasionally processed events out of order. Missed fills. Wrong position sizes. Stale quotes.

The TypeScript implementation worked fine in testing. It fell apart under realistic market conditions.

Rust's Answer: DashMap

DashMap<String, OrderInfo> is a concurrent hash map that shards internally. Instead of one big lock protecting the entire map, DashMap divides the map into independent shards and locks only the relevant shard for each operation. Operations on different keys almost never contend, so in practice the map behaves as if it were lock-free.

This means:

  • The strategy loop writes new orders on every tick — no global lock, no contention
  • The WebSocket task updates fill states as notifications arrive — no global lock, no contention
  • The web UI reads the current order state for display — no global lock, no contention

Multiple async tasks access the same data structure simultaneously, each seeing a consistent, up-to-date view of order state. No missed fills. No race conditions. No stale data.

This is why we chose Rust. Not "Rust is fast." Not "Rust is memory safe." DashMap specifically fixed the class of bug that was breaking our production system.


The System Architecture

Here's the full data flow, from market data ingestion to order execution and tracking:

Polymarket WebSocket → OrderBookManager (in-memory book reconstruction)
Binance WebSocket    → CryptoPriceTracker (BTC/ETH reference prices)
                          ↓
              TradingManager.evaluate_algorithms()
                          ↓
              MarketData snapshot (frozen point-in-time view)
                          ↓
              Algorithm.evaluate(&MarketData) → TradingDecision
                          ↓
              TradingEngine.execute_order() → Exchange API
                          ↓
              OrderManager.register_order() → DashMap (lock-free)
                          ↓
              User WebSocket → fills update state in real-time

The Key Design Principle

Strategies never touch the exchange.

This is the decision that made the codebase testable and maintainable. Each strategy receives a frozen MarketData snapshot — a point-in-time view of the order book, your current position, price references, and market metadata — and returns a TradingDecision. That's it. Pure function. No side effects. No direct exchange calls.

The TradingEngine handles execution. It takes a TradingDecision and translates it into API calls. Strategies don't know or care how orders are submitted.

This separation means:

  • You can unit-test strategies without mocking exchange APIs
  • You can swap strategies without touching execution code
  • You can replay historical market snapshots against a strategy to simulate its behavior
  • You can run multiple strategies simultaneously against the same execution layer
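In trait form, the separation looks roughly like this (signatures are illustrative, not the repo's exact API):

```rust
// Strategies are pure functions over a frozen snapshot: no &mut self,
// no exchange handle, nothing to mock in tests.
pub struct MarketData {
    pub best_bid: f64,
    pub best_ask: f64,
    pub fair_value: f64,
    pub net_position: f64,
}

pub enum TradingDecision {
    PlaceBuy { price: f64, size: f64 },
    PlaceSell { price: f64, size: f64 },
    Hold,
}

pub trait Algorithm {
    fn evaluate(&self, data: &MarketData) -> TradingDecision;
}

// A toy strategy, trivially unit-testable with a hand-built snapshot:
pub struct BuyBelowFair;

impl Algorithm for BuyBelowFair {
    fn evaluate(&self, data: &MarketData) -> TradingDecision {
        if data.best_ask < data.fair_value {
            TradingDecision::PlaceBuy { price: data.best_ask, size: 10.0 }
        } else {
            TradingDecision::Hold
        }
    }
}
```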

The Four Crates

The codebase is a Rust workspace with four crates, each with a clear responsibility:

rs-clob-client (~13k LOC) — The full Polymarket SDK. REST client, WebSocket management, and EIP-712 signing for on-chain order submission. Polymarket uses a CLOB (Central Limit Order Book) that settles on Polygon, so every order requires a cryptographic signature. This crate handles all of that. Critically, it's cleanly decoupled — swap it out and the rest of the system works against any exchange.

shared (~1.6k LOC) — Cross-crate types. The star of this crate is InventoryManager, which maintains two separate ledgers: order inventory (pending orders, capital locked but not yet filled) and position inventory (confirmed fills, actual exposure). This distinction matters more than it sounds. A naive implementation that only tracks fills will miscalculate available capital because it doesn't account for orders sitting on the book. We sized positions wrong until we built this properly.

trader (~13.5k LOC) — The market maker itself. TradingManager, OrderBookManager, TradingEngine, the AMM strategy, the web UI backend.

kalshi-trader (~9.2k LOC) — The same architecture applied to Kalshi, a regulated US prediction market exchange. The fact that we could port the system to a second exchange with manageable effort validates the architecture. The strategy/execution separation works.

The Order State Machine

Every order follows a defined lifecycle:

Pending → Partial → Filled
                 → Cancelled
                 → Failed

OrderState is a Rust enum, so an order is exactly one state at a time and contradictory states are unrepresentable. In the TypeScript version, there was nothing stopping a bug from recording an order that was simultaneously pending and filled, or cancelled after being fully filled. With an enum, the first kind of bug is impossible by construction, and the remaining transition rules can be enforced in one place rather than scattered across the codebase.

Each OrderInfo carries a full audit trail: which algorithm placed the order, timestamps for each state transition, fill quantities at each step, and the market and token IDs. When something goes wrong, you have a complete record.
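A stripped-down version of the lifecycle as a typed state machine (our sketch — the real OrderState carries more data):

```rust
// The enum makes an order exactly one state at a time; a single transition
// function rejects invalid moves, so terminal states stay terminal.
#[derive(Clone, Copy, Debug, PartialEq)]
pub enum OrderState {
    Pending,
    Partial,
    Filled,
    Cancelled,
    Failed,
}

impl OrderState {
    /// Attempt a transition; returns Err for anything the lifecycle forbids.
    pub fn transition(self, next: OrderState) -> Result<OrderState, &'static str> {
        use OrderState::*;
        match (self, next) {
            // Pending can move to any of the four downstream states.
            (Pending, Partial) | (Pending, Filled) | (Pending, Cancelled) | (Pending, Failed) => Ok(next),
            // Partial fills can accumulate, complete, or be cancelled.
            (Partial, Partial) | (Partial, Filled) | (Partial, Cancelled) => Ok(next),
            // Filled, Cancelled, and Failed are terminal.
            _ => Err("invalid order state transition"),
        }
    }
}
```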


Why Rust — Four Specific Wins

We're not going to tell you Rust is fast and memory safe. You've heard that. Here are the four concrete ways Rust made this codebase better than the TypeScript original:

1. Illegal States Are Impossible

The OrderState enum means an order is always exactly one well-defined state, and every match over it must handle all variants exhaustively. You can't forget the "partial" case, and you can't represent an order in an undefined transitional state. The compiler catches this class of bug at compile time, before it ever causes a missed trade or wrong position size.

2. DashMap Fixed Our Race Condition

As described above: the TypeScript implementation had missed fills due to event loop contention. DashMap's shard-based concurrent map architecture eliminated this entirely. Multiple async tasks — strategy loop, fill WebSocket, UI server — now access order state simultaneously without any of them blocking the others.

3. Exact Decimal Arithmetic

Order sizing in financial systems uses money. Money requires exact arithmetic. IEEE 754 floating-point arithmetic — what JavaScript, Python, and most languages use by default — introduces subtle rounding errors. 0.1 + 0.2 != 0.3 is the famous example, but in practice you get things like: position size calculated as $49.9999999 that rounds down to $49 instead of $50, causing systematic underbetting.

rust_decimal uses exact decimal representation, not binary floating point. We were overspending on some orders and underspending on others due to float drift in the TypeScript version. Switching to rust_decimal eliminated this. Every calculation throughout the financial math stack uses exact arithmetic.

4. Zero GC Pauses

Rust has no garbage collector. There are no GC pauses. In the TypeScript implementation, V8's garbage collector would occasionally pause execution for 50ms or more during a sweep. In a fast-moving prediction market, a 50ms pause means your quotes are stale. Stale quotes during a rapid market move mean you're offering prices that informed traders will happily take — classic adverse selection. Rust's ownership model means memory is freed deterministically, without pauses, without the GC ever touching your hot path.


The AMM Strategy

The strategy we implemented is an Automated Market Maker adapted for binary prediction markets. Here's how it works:

Fair Value Estimation

We price our quotes using two data streams simultaneously:

1. Binance real-time price feed — BTC/USDT and ETH/USDT spot prices, streamed via WebSocket. For markets correlated to crypto prices (e.g. "will BTC close above $X?"), the Binance price is the primary signal.

2. Theoretical probability model — We compute a fair probability for each binary market outcome using a formula that combines the current Binance price, historical volatility, and time remaining to resolution. This gives us a number between 0 and 1 — our estimate of what the contract should trade at.

The two streams feed into the same AMM pricing engine, which outputs a fair value we use as the center of our quote ladder.
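As a concrete illustration of that kind of model — not necessarily the exact formula the bot uses — here's the standard driftless-lognormal probability that spot finishes above a strike, with the normal CDF via the Abramowitz–Stegun approximation:

```rust
// P(S_T > K) = Phi(d2) with d2 = (ln(S/K) - sigma^2 * tau / 2) / (sigma * sqrt(tau)),
// assuming driftless lognormal returns. Illustrative model only.

fn norm_cdf(x: f64) -> f64 {
    // Abramowitz–Stegun 26.2.17 polynomial approximation (|error| < 7.5e-8).
    let t = 1.0 / (1.0 + 0.2316419 * x.abs());
    let d = 0.3989422804014327 * (-x * x / 2.0).exp();
    let p = d * t * (0.31938153 + t * (-0.356563782 + t * (1.781477937
        + t * (-1.821255978 + t * 1.330274429))));
    if x >= 0.0 { 1.0 - p } else { p }
}

/// Probability that spot ends above `strike` after `tau` years
/// at annualized volatility `sigma`.
pub fn prob_above(spot: f64, strike: f64, sigma: f64, tau: f64) -> f64 {
    if tau <= 0.0 {
        return if spot > strike { 1.0 } else { 0.0 };
    }
    let d2 = ((spot / strike).ln() - 0.5 * sigma * sigma * tau) / (sigma * tau.sqrt());
    norm_cdf(d2)
}
```

With spot at the strike and a day to go, this sits just under 0.50; a 10% gap in either direction pins it near 1 or 0 — the binary-resolution dynamic the article describes.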

How accurate is it? Surprisingly close. Our theoretical probability tracks the actual Polymarket price tightly under normal conditions. The problem isn't accuracy on average — it's what happens during directional moves, which we'll get to in the postmortem.

Price Ladder

Rather than placing a single order at fair value, we post a price ladder: multiple buy orders below fair value and multiple sell orders above it.

Buy at:  fair_value - spread
         fair_value - spread - delta
         fair_value - spread - 2*delta
         ...

The spread and delta (step between rungs) are configurable. Deeper ladders capture more potential fills but expose more capital. Wider spreads mean higher profit per fill but fewer fills. Tuning these parameters is where much of the live strategy work lives.
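The ladder generation itself is a few lines (plain f64 here for brevity; production math should use exact decimals):

```rust
// Build the rungs described above: buys stepped below fair value,
// sells stepped above it. Binary contracts trade in (0, 1), so rungs
// that fall off the book are dropped.
pub fn buy_ladder(fair_value: f64, spread: f64, delta: f64, rungs: usize) -> Vec<f64> {
    (0..rungs)
        .map(|i| fair_value - spread - delta * i as f64)
        .filter(|p| *p > 0.0)
        .collect()
}

pub fn sell_ladder(fair_value: f64, spread: f64, delta: f64, rungs: usize) -> Vec<f64> {
    (0..rungs)
        .map(|i| fair_value + spread + delta * i as f64)
        .filter(|p| *p < 1.0)
        .collect()
}
```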

Inventory Skew

When your position gets large on one side, you want to reduce that position — not add to it. The inventory_skew_factor adjusts quote aggressiveness based on net position.

If you're long (holding a lot of YES contracts), the bot widens the buy spread (buys less aggressively) and tightens the sell spread (sells more aggressively). This naturally mean-reverts your inventory back toward neutral without requiring explicit position management logic.

If you're short, the opposite: tighten the buy spread, widen the sell spread. The math here is straightforward, but getting the skew factor right — how aggressively to lean against inventory — requires empirical calibration.
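A linear version of the skew, as a sketch (the linear form and parameter names are our assumptions — the calibration point stands either way):

```rust
// Lean quotes against the heavy side: long inventory widens the buy spread
// and tightens the sell spread; short inventory does the opposite.
pub fn skewed_spreads(
    base_spread: f64,
    net_position: f64, // + = net long, - = net short
    max_position: f64,
    skew_factor: f64,
) -> (f64, f64) {
    // Normalized inventory in [-1, 1].
    let inv = (net_position / max_position).clamp(-1.0, 1.0);
    let shift = skew_factor * inv * base_spread;
    let buy_spread = base_spread + shift;          // long -> buy less aggressively
    let sell_spread = (base_spread - shift).max(0.0); // long -> sell more aggressively
    (buy_spread, sell_spread)
}
```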

Near-Expiry Logic

Binary markets have a resolution date. Near expiry, the dynamics change completely: probability estimates become more volatile, spreads widen, and the cost of holding the wrong side through resolution is maximal.

Our strategy stops buying when the probability exceeds 0.90 or when there are fewer than 60 seconds to resolution. Sell orders continue — we want to exit any remaining position before the market resolves, not compound it.
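The guard itself is tiny — the hard part was knowing these thresholds were needed:

```rust
// Halt buys above 0.90 probability or inside the final 60 seconds;
// selling stays enabled so remaining inventory can be worked off.
pub fn buying_allowed(probability: f64, seconds_to_resolution: u64) -> bool {
    probability <= 0.90 && seconds_to_resolution >= 60
}
```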

Position Sizing

For pure market-making, position sizing is controlled by max_collateral per market and the ladder depth/spread config. The spread determines profit per fill; the depth determines how much capital is deployed at each price level.

The codebase also includes a Kelly criterion implementation for directional strategies — sizing bets proportional to estimated edge. It's there if you want to experiment with directional approaches on top of the OMS infrastructure, but market-making profit comes from spread across many fills, not from sizing single directional bets. Worth exploring, not central to the core strategy.
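For reference, the Kelly fraction for a binary contract bought at price c (paying $1 on a win) with estimated win probability p reduces to f* = (p − c) / (1 − c). A sketch of that formula — not necessarily the repo's exact implementation:

```rust
// Kelly fraction of bankroll to stake on a binary contract: derived from
// f* = p - q/b with odds b = (1 - c)/c, which simplifies to (p - c)/(1 - c).
// Clamped at 0 when there is no edge. Garbage in (a miscalibrated p),
// garbage out — validate the edge estimate first.
pub fn kelly_fraction(p: f64, c: f64) -> f64 {
    assert!(c > 0.0 && c < 1.0, "contract price must be in (0, 1)");
    ((p - c) / (1.0 - c)).max(0.0)
}
```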


The Honest Numbers — What Actually Went Wrong

Our paper trading simulation works like this: the algorithm generates both sides of the market — bids and asks — using its own pricing. When market price crosses our quoted price, we record a virtual fill. We track fills on each side, positions, and P&L without risking real capital.

The numbers from the simulation (Feb 25, 2026 — single day run):

| Metric | Value |
|---|---|
| Total orders placed | 719,624 |
| Filled | 1,212 (0.17% fill rate) |
| Cancelled (replaced by new quotes) | 463,295 |
| Failed (rejected by sim) | 252,185 |
| Total filled quantity | ~430,000 units |

Fill breakdown by outcome and side:

| | Buy | Sell |
|---|---|---|
| YES | 106 fills | 158 fills |
| NO | 842 fills | 106 fills |

The 0.17% fill rate is normal for market makers — most quotes are cancelled and replaced as the book moves. But the composition of those fills tells the real story.

NO-side buying dominated: 842 of 1,212 fills were NO buys. YES fills were more balanced (106 buys, 158 sells — net seller, correctly reducing long exposure). But on the NO side, the bot was buying heavily and barely selling. 842 NO buys vs 106 NO sells — an 8:1 imbalance.

What the simulation revealed isn't what most people expect when they think "the algorithm was wrong." The bot was quoting. It was getting filled. Both sides. The problem was directional inventory accumulation on the NO side.

The Directional Imbalance Problem

When BTC price moves strongly in one direction during a binary market period, the bot accumulates the losing side. Classic example: price rallies sharply, YES probability goes above 0.6. The bot keeps filling NO buy orders — building up a losing NO position that resolves to zero at settlement. Meanwhile it's barely filling on the YES side because price has moved away from those quotes.

The root cause was subtle: inventory-aware quote skewing was only applied in the balanced price range (0.4–0.6). Once price moved outside that range, the bot switched to "winning-side favoring" mode with zero inventory consideration. Accumulated inventory was never worked off during directional moves — exactly when it needed to be.

From our BTC volatility analysis, the worst windows for this:

| Period | Hours (UTC) | Vol Multiplier |
|---|---|---|
| US equity open | 14–16 | 1.37–1.94x |
| Late night spike | 22 | 1.65x |
| Quietest (safest) | 11–12 | 0.55–0.65x |

Fridays are 1.69x more volatile than average. Saturdays are 0.37x — the safest day to run tight spreads. We were running static spreads through all of this.

One more data point: Binance leads Polymarket price 97.5% of the time, with a median lag of ~5 ticks. This means sophisticated participants can see where Polymarket probability is about to move before the order book reflects it. When they take our liquidity, they already know which way it's going.

The Structural Problem: You Can't Fight the Big Market Makers

This isn't purely a model problem. It's a structural problem for any small market maker in a thin market.

Here's the dynamic: both you and the big market makers are computing theoretical fair value from the same Binance price feeds. Your estimates will be close. But they have more capital, tighter spreads, and crucially — their quotes get filled first.

When price moves directionally, the big market makers aggressively manage their inventory. They widen or pull the side that's accumulating. They're not going to sit on a growing NO position when BTC is pumping. They adjust.

If you can't move as fast, or don't have deep enough capital to offset one side, you end up buying into their order flow. They exit the NO side by selling into your bids. You absorb the inventory they're shedding. You're not trading against the market — you're absorbing the risk that the sophisticated player just offloaded onto you.

  • Your best NO bids get hit precisely when NO is the wrong side to hold
  • Your YES asks get lifted when YES is about to rally
  • You end up long the losing side every time

The only defense is to be the price-setter — enough capital to define the market and aggressively skew your own inventory. With limited capital, you absorb directional risk without the ability to control it.

What We Fixed

The fixes were surgical and data-driven:

Universal inventory skew: extended inventory-aware quote adjustment to all price levels, not just 0.4–0.6. Now the bot continuously pushes quotes away from the heavy side during directional moves.

Time-of-day spread adjustment: spreads now widen to ~1.94x during the US equity open (hour 15) and narrow to ~0.55x during the quiet Asian midday (hour 12). Derived from actual measured BTC volatility by hour. Static spreads during peak volatility were the single biggest source of adverse selection.
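A lookup-table version of that adjustment, using the multipliers from the volatility analysis above (hours not called out explicitly default to 1.0, which is an assumption for illustration):

```rust
// Spread multiplier by UTC hour, from measured BTC volatility:
// widen during the US equity open, narrow during the quiet Asian midday.
pub fn spread_multiplier(hour_utc: u32) -> f64 {
    match hour_utc {
        14 | 16 => 1.37, // US equity open ramp (measured 1.37–1.94x)
        15 => 1.94,      // peak of the open
        22 => 1.65,      // late-night spike
        11 => 0.65,      // quiet Asian midday (measured 0.55–0.65x)
        12 => 0.55,
        _ => 1.0,        // assumed baseline for unmeasured hours
    }
}
```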

Bands strategy now inventory-aware: previously the Bands strategy had zero inventory consideration. Target price shifting now helps avoid one-sided accumulation.

These changes reduce losing-side accumulation and improve adverse selection defense. Whether they're enough to overcome the structural disadvantage of a small market maker in a thin market — that's what the next run will measure.


Why Open Source It If It Doesn't Work Yet

This is the question we knew people would ask.

The OMS is solid. DashMap for concurrent order tracking. Typed state machine for order lifecycle. Full EIP-712 signing stack. WebSocket management that handles reconnects and partial fills cleanly. Clean architecture with clear crate boundaries. This is reusable infrastructure, independent of whether our specific alpha works.

The framework is right. The data flow we built — market data ingestion, snapshot isolation, pure strategy functions, execution layer separation — is correct. These are the right abstractions. Other people building prediction market bots will hit the same architecture problems we hit, and we're saving them the time we spent figuring this out.

More eyeballs fix models faster. The probability calibration problem is exactly the kind of problem that benefits from community engagement. We have one approach (Binance price feeds). Someone else might have a better one. Opening the code invites those contributions in a way that closed development doesn't.

Transparency is valuable and rare. The internet is full of "our trading bot makes X% returns" posts. It's nearly empty of "here's what we built, here's what broke, here are the specific numbers." We think the latter is more useful. Sharing a genuine failure with full technical detail is more valuable to the community than another success story with the embarrassing parts cut out.

Two exchanges, one architecture — and that's the point. The Kalshi variant (kalshi-trader, 9.2k LOC) validates that the design generalizes. Kalshi is a US-regulated prediction market with a completely different API, different authentication model, and different regulatory structure from Polymarket. We ported the entire trading system to it by swapping out one crate — rs-clob-client — while keeping the OMS, the strategy layer, and the web UI untouched. The core abstractions held. If you want to build on top of this for another exchange, that's the pattern: implement the exchange client interface, wire it up, everything else works.


On Building This With AI Agents

We used Claude heavily throughout this project. Here's what that actually looked like — not the marketing version.

The prerequisite is domain knowledge. You can't prompt "build me a market maker" and get something that works. We had already built and run a TypeScript version. We understood order lifecycle edge cases, inventory risk, and market microstructure before any AI was involved. The Rust rewrite worked because we wrote two documents — CODEBASE_SUMMARY.md and TRADING_ARCHITECTURE.md — that contained every hard decision already made. Those docs were the real prompts. Claude turned them into working Rust.

Where it worked well:

  • TypeScript → Rust translation on pattern-heavy code (WebSocket client, the boilerplate layer)
  • Web UI — Axum + WebSocket broadcasting is well-documented, Claude produced clean first-pass code
  • Bug fixes with clear symptoms — "this log shows X, the code does Y, fix it" is reliable

Where it got it wrong:

  • Fill simulator had a division-by-zero bug in the Binance price flip check. AI wrote the initial code but didn't anticipate the edge case where price was unchanged between ticks. Caught from production logs.
  • The OrderSimulator wasn't being passed to the algorithm in the right initialization order — a wiring bug invisible without full execution context in a single prompt.
  • RUST_LOG was silently ignored because EnvFilter wasn't wired into the tracing subscriber. AI generated the setup but missed the integration. Took log inspection to diagnose.
  • Near-expiry exit logic — knowing when to stop buying and start dumping before a market resolves is trading intuition. You can't prompt for that.

The honest framing: AI is an excellent translator and implementer when you can describe precisely what you want. The domain expertise came first. The AI came second. "10x faster implementation" — yes. "AI built a trading system from scratch" — no.


Key Lessons

1. Calibrate Your Fair Value Model Before You Trust It

Your fair value model will produce numbers that look reasonable. That doesn't mean they're accurate. Before trusting your probability estimates with real capital, test them empirically — measure the gap between predicted and actual outcomes on real data.

If you haven't done this, you don't know if your edge is real. Miscalibrated estimates will show up as systematic directional losses, exactly the pattern we saw.

Calibration testing is unglamorous work. It doesn't feel like building. But it's what separates a model that loses slowly and teaches you something from one that just loses.
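One simple way to run this check — our suggestion, not something the repo necessarily ships — is to bucket predictions and compare mean predicted probability against realized frequency per bucket:

```rust
// Calibration buckets: for each decile of predicted probability, report
// (mean predicted probability, realized outcome frequency, sample count).
// A well-calibrated model has the first two numbers close in every bucket.
pub fn calibration_buckets(preds: &[(f64, bool)], n_buckets: usize) -> Vec<(f64, f64, usize)> {
    let mut sums = vec![(0.0_f64, 0.0_f64, 0usize); n_buckets];
    for &(p, outcome) in preds {
        let b = ((p * n_buckets as f64) as usize).min(n_buckets - 1);
        sums[b].0 += p;
        sums[b].1 += if outcome { 1.0 } else { 0.0 };
        sums[b].2 += 1;
    }
    sums.into_iter()
        .map(|(p_sum, hits, n)| {
            if n == 0 { (0.0, 0.0, 0) } else { (p_sum / n as f64, hits / n as f64, n) }
        })
        .collect()
}
```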

2. The OMS Is Boring but Critical

The market-making algorithm is the fun part. The order management system is the plumbing. But a bad OMS will cause more losses than a mediocre algorithm. A bad model with a good OMS is recoverable — you lose slowly, you see what's happening, you can iterate. A good model with a bad OMS is a disaster — you lose fast, in ways you can't diagnose, and the errors compound.

Get your concurrent order tracking right before you worry about your alpha.

3. Paper Trading Flatters You

Paper trading doesn't tell you whether your strategy works. It tells you whether it would have worked in a counterfactual world where your orders don't move the market, sophisticated participants don't adapt to your patterns, and adverse selection doesn't exist.

Our paper losses are almost certainly optimistic. In live trading, the Binance → Polymarket mapping would have been worse (informed traders would have front-run us), the adverse selection would have been more severe (we'd have been more visible in the book), and our order flow impact would have shifted market prices against us.

Paper trading is useful for testing mechanics. It's not useful for testing edge.

We also tried Kalshi's sandbox environment as an alternative testing ground — the idea being that a regulated exchange with a real-money simulation environment might give more realistic fills than our internal simulator. It didn't work. The Kalshi sandbox exists, but there are no other participants trading in it. The order book is empty. A market maker with no one to trade against is just posting orders into a void. Real fills require a real, active market — there's no shortcut around that.

4. The Strategy/Execution Separation Pays Off

Pure strategy functions that receive data and return decisions are testable. You can write unit tests that pass in a market snapshot and assert on the output. You can replay historical data and see what the strategy would have done. You can run two strategies in parallel against the same execution layer for comparison.

The moment a strategy contains direct exchange calls, it becomes untestable without a live exchange connection. You can't unit test it. You can't replay history against it. Bugs in it only surface in production. We were deliberate about this separation, and it paid dividends every time we needed to debug a strategy behavior.


One More Idea: Cross-Exchange Arbitrage

Since the codebase already supports both Polymarket and Kalshi, there's an untested angle worth exploring: pricing differences between the two exchanges on the same underlying market.

When the same event is listed on both Polymarket and Kalshi, their probabilities sometimes diverge — briefly. If Polymarket prices a YES at $0.62 and Kalshi prices the same outcome at $0.58, you could buy on Kalshi and sell on Polymarket simultaneously, locking in a risk-free $0.04 spread if both sides fill.

We haven't tested this. Our belief is that the margins are very thin now — sophisticated participants with low-latency infrastructure have already compressed most of these gaps. But the infrastructure to try it exists in this codebase: two exchange clients, one shared OMS. If you want to explore it, it's a natural next experiment.
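The check itself is trivial once both exchange clients exist (fees and gas are collapsed into one haircut parameter here, which is a simplification — real costs are per-venue):

```rust
// Cross-exchange edge: buy the same outcome where it's cheap, sell where
// it's rich. Positive only if the gap survives the combined cost haircut,
// and real only if both legs actually fill.
pub fn arb_edge(buy_ask: f64, sell_bid: f64, total_costs: f64) -> Option<f64> {
    let edge = sell_bid - buy_ask - total_costs;
    (edge > 0.0).then_some(edge)
}
```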


⚠️ Important Disclaimer

This code can and will trade real money if you configure it to do so.

TEST_MODE: true and simulate_orders: true in the config keep it in paper trading mode. If you set TEST_MODE: false and connect real credentials, it will place real orders on a real exchange with real funds.
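For reference, the paper-trading posture looks like this in config form (only the two flags above come from the repo; treat the actual config file as authoritative):

```yaml
# Paper trading mode: orders are simulated, nothing reaches the exchange.
TEST_MODE: true
simulate_orders: true
```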

We are not responsible for any financial losses incurred from running this code. Prediction market trading carries significant risk. Do not run this in live mode without:

  • Reading and understanding the full codebase
  • Testing thoroughly in paper mode first
  • Setting strict position limits you can afford to lose entirely
  • Understanding the directional risk dynamics described in this article

The OMS is solid. The strategy is experimental. Treat it accordingly.


FAQ

What is market-making on Polymarket?

Market-making on Polymarket means continuously quoting both sides of a prediction market: posting bids and asks (equivalently, offering to buy both YES and NO shares) and profiting from the spread between them. The risk is that better-informed traders take your liquidity when they believe the probability is mispriced in their favor, leaving you holding the wrong side. Polymarket's binary resolution (contracts go to $1 or $0) and on-chain settlement make this harder than market-making in traditional asset classes.

Can I use Kelly criterion with this bot?

The codebase includes a Kelly implementation, but it's most relevant for directional strategies — not pure market-making. For market-making, your profit comes from the spread across many fills, not from sizing a single directional bet. The natural position controls are max_collateral per market and the ladder depth/spread config. If you're building a directional strategy on top of the OMS, Kelly is worth exploring — but validate your edge estimates empirically first.
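For the directional case, the binary-outcome Kelly formula is standard math. A generic sketch, not the codebase's implementation: you risk `price` to win `1 - price`, so the net odds are `b = (1 - price) / price` and the optimal fraction is `f* = (b·p − q) / b`:

```rust
// Generic binary-outcome Kelly sizing (not the codebase's implementation).
// `p_win` is your estimated win probability; `price` is the contract cost
// in (0, 1) for a contract paying $1 on a win.

fn kelly_fraction(p_win: f64, price: f64) -> f64 {
    // Net odds: risk `price` to win `1 - price`.
    let b = (1.0 - price) / price;
    // Classic Kelly: f* = (b*p - q) / b. Negative means the bet has no edge.
    let f = (b * p_win - (1.0 - p_win)) / b;
    f.max(0.0)
}

fn main() {
    // If you believe YES is 65% likely but it trades at $0.60:
    let f = kelly_fraction(0.65, 0.60);
    println!("stake {:.1}% of bankroll", f * 100.0); // ~12.5%
}
```

Note the formula is only as good as `p_win` — overestimating your edge makes Kelly oversize aggressively, which is exactly why the answer above says to validate edge estimates empirically first.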

Why Rust for algorithmic trading?

Not because it's fast (though it is). The specific reasons: DashMap provides lock-free concurrent order tracking that fixed real race conditions in our TypeScript implementation. The OrderState enum makes illegal state transitions a compile error. rust_decimal provides exact decimal arithmetic that eliminated float rounding bugs in order sizing. And with no garbage collector, there are no GC pauses leaving stale quotes during fast market movements.
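The OrderState point deserves a concrete illustration. This is a generic sketch, not the actual enum from the codebase (which has more states and carries order data), but it shows the mechanism: with an exhaustive `match` over (state, event), the compiler forces every combination to be handled, so a fill arriving for a cancelled order becomes an explicit error value instead of silently corrupting state:

```rust
// Generic sketch of an order state machine; not the codebase's OrderState.

#[derive(Debug, PartialEq)]
enum OrderState {
    Pending,
    Open,
    Filled,
    Cancelled,
}

impl OrderState {
    /// Apply an exchange event. Transitions consume the old state; every
    /// (state, event) pair must be handled or the match won't compile.
    fn apply(self, event: &str) -> Result<OrderState, String> {
        use OrderState::*;
        match (self, event) {
            (Pending, "ack") => Ok(Open),
            (Open, "fill") => Ok(Filled),
            (Pending | Open, "cancel") => Ok(Cancelled),
            (state, ev) => Err(format!("illegal transition: {:?} on {:?}", state, ev)),
        }
    }
}

fn main() {
    let s = OrderState::Pending.apply("ack").unwrap();
    assert_eq!(s, OrderState::Open);
    // A fill on an already-cancelled order is rejected, not silently absorbed:
    assert!(OrderState::Cancelled.apply("fill").is_err());
    println!("all transitions checked");
}
```

In a dynamically typed implementation, the equivalent bug is a stray `if` branch that nobody wrote; here it's either unrepresentable or an explicit `Err` at the one place exchange events are applied.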

What caused the losses in paper trading?

Directional inventory accumulation. From the simulation data: 1,212 fills, with NO-side buys dominating at 842 fills vs 106 NO sells — an 8:1 imbalance. The bot was quoting both sides correctly, but NO buys filled at 8x the rate of NO sells. In a BTC-correlated market, this means the bot was accumulating NO inventory exactly when BTC price was moving in a direction that made YES more likely — building a position on the wrong side with no mechanism to exit it.

Is poly-rs-trader profitable?

No, not in its current state. The paper trading run lost $5,499. The OMS architecture is solid and the execution layer is correct, but the probability model needs significant work — particularly empirical calibration of confidence intervals. The alpha doesn't exist yet. That's why we're open sourcing it: the framework is right, the model needs iteration, and we'd rather build that in public with community input.

Can I use the OMS for my own trading bot?

Yes, that's exactly what we're hoping for. The rs-clob-client crate handles Polymarket-specific exchange interaction. The shared crate's types, InventoryManager, and order state machine are exchange-agnostic. The trader crate's TradingEngine and strategy separation pattern are reusable. Swap out the exchange client, implement your own strategy as a pure function, and the rest of the system works. The Kalshi variant proves this is feasible.


Get Involved

The code is at github.com/Crashland/polymarket-orderbook-trader.

If you're building on Polymarket or Kalshi, the OMS infrastructure should save you significant time. The EIP-712 signing, WebSocket management, and concurrent order tracking are the parts that take the longest to get right — they're done.

If you have ideas on probability calibration, we especially want to hear from you. Better approaches to mapping external price signals to prediction market probabilities, empirical calibration methods for confidence intervals, or alternative fair value models for binary markets — open an issue or a PR.

We're going to keep building on this. The next step is proper empirical validation of the confidence model before running any strategy in live conditions. More updates at Crashland.

Follow along for more builder transparency pieces. We'll keep sharing the numbers, the architecture, and the honest postmortems — win or lose.


Built by Crashland. The OMS state machine, AMM inventory skew, crate separation, and all trading logic are original design decisions. AI agents assisted with implementation. The engineering came first.
