<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Boris Fesenko</title>
    <description>The latest articles on DEV Community by Boris Fesenko (@bjftradinggroup).</description>
    <link>https://dev.to/bjftradinggroup</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3872464%2F9357fa32-8357-471c-ac26-70119a8233dd.png</url>
      <title>DEV Community: Boris Fesenko</title>
      <link>https://dev.to/bjftradinggroup</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/bjftradinggroup"/>
    <language>en</language>
    <item>
      <title>Why crypto arbitrage windows close before your REST poll completes</title>
      <dc:creator>Boris Fesenko</dc:creator>
      <pubDate>Tue, 02 Jun 2026 12:53:03 +0000</pubDate>
      <link>https://dev.to/bjftradinggroup/why-crypto-arbitrage-windows-close-before-your-rest-poll-completes-3boc</link>
      <guid>https://dev.to/bjftradinggroup/why-crypto-arbitrage-windows-close-before-your-rest-poll-completes-3boc</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9df5nlav3uvakshsk1ix.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9df5nlav3uvakshsk1ix.png" alt=" "&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;: Crypto arbitrage windows on liquid pairs now close in under 100 ms. A REST polling loop typically takes 1–1.5 seconds round-trip. WebSocket delivers the same data in 20–100 ms. If you're still polling REST endpoints for orderbook data in 2026, you're missing the majority of opportunities — not because your strategy is wrong, but because your data plane is fundamentally too slow.&lt;/p&gt;

&lt;p&gt;This post walks through the math, shows a benchmark I ran on a handful of major exchanges, and provides production-grade Python code for a WebSocket client that handles reconnects, heartbeats, and orderbook reconstruction.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. The numbers that broke REST polling
&lt;/h2&gt;

&lt;p&gt;When I started writing crypto arbitrage bots a few years ago, polling Binance's REST API every 500 ms was perfectly acceptable. Spreads were wide, arbitrage windows lasted multiple seconds, and the orderbook for BTCUSDT moved slowly enough that a half-second-old snapshot was still tradeable.&lt;/p&gt;

&lt;p&gt;In 2026, the same approach doesn't work. Here are the numbers as they stand today:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Metric&lt;/th&gt;
&lt;th&gt;Value&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Median crypto arbitrage window on liquid pairs&lt;/td&gt;
&lt;td&gt;30–80 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Window closes in under 100 ms&lt;/td&gt;
&lt;td&gt;~90% of cases&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;REST round-trip latency (request → response → JSON parse)&lt;/td&gt;
&lt;td&gt;1.0–1.5 seconds&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;WebSocket update delivery latency (push from exchange to client)&lt;/td&gt;
&lt;td&gt;20–100 ms&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The math is brutal. A 100 ms window cannot be caught by a 1500 ms poll: one round trip spans fifteen such windows, so by the time your REST response arrives, the orderbook you're reading is 15 window-lengths stale. You're not "slow"; you're not even in the same temporal universe as the event you're trying to react to.&lt;/p&gt;
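
&lt;p&gt;To make that arithmetic concrete, here is a small sketch of a toy model. The model and the function names &lt;code&gt;actionable_fraction&lt;/code&gt; and &lt;code&gt;staleness_in_windows&lt;/code&gt; are illustrative assumptions, not measurements: it assumes windows open at uniformly random times, one book sample per poll period, and a fixed delay between the sample being taken and your code seeing it (750 ms and 20 ms below are assumed values).&lt;/p&gt;

```python
def actionable_fraction(window_ms, poll_period_ms, delivery_delay_ms):
    """Fraction of windows a client can still act on, under a toy model.

    Assumptions (illustrative, not measured): windows open at uniformly
    random times, the book is sampled once per poll period, and each sample
    reaches the client delivery_delay_ms after it was taken.
    """
    # A sample only helps if it is taken inside the window AND arrives
    # before the window closes, leaving (window - delay) of usable time.
    usable_ms = max(0.0, window_ms - delivery_delay_ms)
    return min(1.0, usable_ms / poll_period_ms)


def staleness_in_windows(round_trip_ms, window_ms):
    """How many window lengths fit into one round trip."""
    return round_trip_ms / window_ms


if __name__ == "__main__":
    # Figures from the table: 100 ms window, 1500 ms REST round trip.
    print(staleness_in_windows(1500, 100))      # 15.0
    # REST: one sample per 1500 ms, assumed 750 ms from sample to client.
    print(actionable_fraction(100, 1500, 750))  # 0.0
    # Push: an update every 100 ms, assumed to arrive 20 ms later.
    print(actionable_fraction(100, 100, 20))    # 0.8
```

&lt;p&gt;Under these assumptions the poller's share of actionable windows is zero as soon as the delivery delay exceeds the window length, no matter how often it polls.&lt;/p&gt;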




&lt;h2&gt;
  
  
  2. Why REST is fundamentally slow
&lt;/h2&gt;

&lt;p&gt;REST APIs over HTTPS carry overhead that adds up:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;TCP handshake&lt;/strong&gt; — three packets to establish, typically 50–150 ms on intercontinental hops.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;TLS handshake&lt;/strong&gt; — another full round-trip, 30–100 ms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;HTTP request/response&lt;/strong&gt; — the actual data exchange.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;JSON parse&lt;/strong&gt; — depending on payload size, 5–50 ms.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Rate-limit budget&lt;/strong&gt; — most exchanges cap REST to 10–20 requests per second per IP. Polling faster gets you banned.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Yes, modern clients use HTTP keep-alive to avoid steps 1 and 2 on every request, but you still pay them whenever the connection has to be re-established. And rate limits are the real killer: even if you could parse responses in 1 ms, the exchange will throttle you once you exceed its per-second request budget.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# A naive REST polling loop. requests.get() without a Session opens a
# fresh connection on each call, so every iteration pays the handshakes
# on top of the request/response round trip.
&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;poll_orderbook&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;interval_ms&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;500&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;response&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;book&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;response&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;elapsed_ms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
        &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Got &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;book&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="s"&gt;bids&lt;/span&gt;&lt;span class="sh"&gt;'&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; bids in &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;elapsed_ms&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; ms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;interval_ms&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;elapsed_ms&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Running this against &lt;code&gt;https://api.binance.com/api/v3/depth?symbol=BTCUSDT&amp;amp;limit=20&lt;/code&gt; from a typical residential or VPS connection produces round-trip times of 800–1500 ms consistently. Best case: maybe 600 ms from a co-located server. Still 6× too slow for a 100 ms window.&lt;/p&gt;
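
&lt;p&gt;The keep-alive point from earlier in this section can be checked by timing requests over one persistent connection. The sketch below is a minimal illustration using only the standard library's &lt;code&gt;http.client&lt;/code&gt; (a &lt;code&gt;requests.Session&lt;/code&gt; plays the same role); the names &lt;code&gt;timed_get&lt;/code&gt;, &lt;code&gt;poll_keepalive&lt;/code&gt; and &lt;code&gt;DEPTH_PATH&lt;/code&gt; are hypothetical, and no results are implied.&lt;/p&gt;

```python
import http.client
import json
import time
from urllib.parse import urlencode


def timed_get(conn, path):
    """Send one GET on an already-open connection; return (payload, elapsed_ms)."""
    start = time.perf_counter()
    conn.request("GET", path)
    response = conn.getresponse()
    # Read the body fully so the same socket can carry the next request.
    payload = json.loads(response.read())
    elapsed_ms = (time.perf_counter() - start) * 1000
    return payload, elapsed_ms


def poll_keepalive(host, path, samples=5, pause_s=0.5):
    """Poll over one persistent HTTPS connection and collect latencies in ms."""
    conn = http.client.HTTPSConnection(host, timeout=5)
    latencies = []
    try:
        for _ in range(samples):
            _, elapsed_ms = timed_get(conn, path)
            latencies.append(elapsed_ms)
            time.sleep(pause_s)
    finally:
        conn.close()
    return latencies


# Same depth endpoint as above; urlencode builds the query string.
DEPTH_PATH = "/api/v3/depth?" + urlencode({"symbol": "BTCUSDT", "limit": 20})
```

&lt;p&gt;Calling &lt;code&gt;poll_keepalive("api.binance.com", DEPTH_PATH)&lt;/code&gt; returns one latency per request. The first sample includes the TCP and TLS handshakes; later samples reuse the socket for as long as the server keeps it open, which isolates the per-request cost from the connection cost.&lt;/p&gt;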




&lt;h2&gt;
  
  
  3. WebSocket: push, not poll
&lt;/h2&gt;

&lt;p&gt;WebSocket inverts the model. Instead of the client asking "what's the orderbook now?" twice a second and accepting that the answer is already stale, the client opens &lt;strong&gt;one persistent connection&lt;/strong&gt; and the exchange &lt;strong&gt;pushes updates the instant they happen&lt;/strong&gt;.&lt;/p&gt;

&lt;p&gt;Concretely:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;One TCP/TLS handshake&lt;/strong&gt; at connection time. Amortised across thousands of messages.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One subscription message&lt;/strong&gt; declaring what streams you want.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A continuous stream of deltas&lt;/strong&gt; flowing from server to client over the same connection.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;No rate limit on inbound messages&lt;/strong&gt; (the exchange controls the rate).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The delivery latency of a properly configured WebSocket client connected to a major crypto exchange is 20–100 ms, depending on geographic distance. That's the time between the exchange's matching engine processing an event and your code receiving the update. There is no polling overhead because there is no polling.&lt;/p&gt;

&lt;p&gt;Here's the bare-minimum Python client for Binance's depth stream:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;websockets&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;stream_depth&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;symbol&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;btcusdt&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wss://stream.binance.com:9443/ws/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;symbol&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;@depth20@100ms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;websockets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;ws&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ws&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;update&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;best_bid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bids&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="n"&gt;best_ask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;update&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;asks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
            &lt;span class="n"&gt;spread&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;best_ask&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;best_bid&lt;/span&gt;
            &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bid=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;best_bid&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; ask=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;best_ask&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; spread=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;spread&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;stream_depth&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run this and you'll get updates every 100 ms (the faster of the two throttled tiers: Binance offers &lt;code&gt;@100ms&lt;/code&gt;, &lt;code&gt;@1000ms&lt;/code&gt;, and unthrottled real-time streams). Each update carries the current top 20 levels of the book, so the first bid and the first ask are the top of book. No polling. No rate-limit risk. The connection stays open until the network or the exchange drops it.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. A simple benchmark
&lt;/h2&gt;

&lt;p&gt;Here's a script that measures REST round-trip vs WebSocket inter-message arrival time for the same orderbook data. It's not a perfect apples-to-apples comparison (REST gives a full snapshot; WebSocket gives a stream of updates), but it makes the order-of-magnitude difference impossible to miss.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;statistics&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;websockets&lt;/span&gt;

&lt;span class="c1"&gt;# ----------------------------------------------------------------
# REST: measure round-trip for orderbook snapshot
# ----------------------------------------------------------------
&lt;/span&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;benchmark_rest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;latencies&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;start&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;requests&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;json&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="n"&gt;elapsed_ms&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;start&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;
        &lt;span class="n"&gt;latencies&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;elapsed_ms&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# respect rate limits
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;median_ms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;statistics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;median&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;latencies&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;p90_ms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;statistics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;quantiles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;latencies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;p99_ms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;latencies&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# with 50 samples the maximum stands in for p99&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# ----------------------------------------------------------------
# WebSocket: measure time between pushed updates
# ----------------------------------------------------------------
&lt;/span&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;benchmark_websocket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;gaps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;websockets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;ws&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;last&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;samples&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;  &lt;span class="c1"&gt;# +1 to discard the first
&lt;/span&gt;            &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;ws&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;recv&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perf_counter&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="n"&gt;gaps&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;now&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;last&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;1000&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;last&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;
    &lt;span class="n"&gt;gaps&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;gaps&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;:]&lt;/span&gt;  &lt;span class="c1"&gt;# discard first
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;median_ms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;statistics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;median&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gaps&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;p90_ms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;statistics&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;quantiles&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gaps&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;n&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)[&lt;/span&gt;&lt;span class="mi"&gt;8&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;p99_ms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;gaps&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;  &lt;span class="c1"&gt;# with 50 samples the maximum stands in for p99&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c1"&gt;# ----------------------------------------------------------------
# Run both
# ----------------------------------------------------------------
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;__name__&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;__main__&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;rest_url&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;https://api.binance.com/api/v3/depth?symbol=BTCUSDT&amp;amp;limit=20&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="n"&gt;ws_uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wss://stream.binance.com:9443/ws/btcusdt@depth20@100ms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;

    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;REST:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;benchmark_rest&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rest_url&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WebSocket:&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;benchmark_websocket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ws_uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;samples&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;50&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A representative run from a European VPS to Binance:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;REST:&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'median_ms':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;920.4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'p&lt;/span&gt;&lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="err"&gt;_ms':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1180.5&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'p&lt;/span&gt;&lt;span class="mi"&gt;99&lt;/span&gt;&lt;span class="err"&gt;_ms':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;1485.2&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;WebSocket:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="err"&gt;'median_ms':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;100.2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'p&lt;/span&gt;&lt;span class="mi"&gt;90&lt;/span&gt;&lt;span class="err"&gt;_ms':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;105.1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;'p&lt;/span&gt;&lt;span class="mi"&gt;99&lt;/span&gt;&lt;span class="err"&gt;_ms':&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mf"&gt;142.8&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The WebSocket median is the throttle setting (&lt;code&gt;@100ms&lt;/code&gt;), not the underlying delivery latency, which is lower. The REST median is genuine round-trip cost. That is a 9× gap on the median (920.4 ms vs 100.2 ms), about 11× at p90, and about 10× at p99.&lt;/p&gt;

&lt;p&gt;Switching to Binance's unthrottled depth stream (&lt;code&gt;btcusdt@depth&lt;/code&gt;) drops the WebSocket median below 50 ms, widening the gap further.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. The architectural shift
&lt;/h2&gt;

&lt;p&gt;Moving from REST polling to WebSocket isn't just changing a library — it changes the architecture of your bot.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Before (REST polling):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────┐    ┌──────────────────┐
│  Poll loop       │ ── │ Strategy engine  │
│  every 500 ms    │    │ runs on snapshot │
└──────────────────┘    └──────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A single thread asks for state on a timer, hands the snapshot to the strategy, repeats. The strategy is &lt;strong&gt;stateless between polls&lt;/strong&gt; — it has no idea what happened in the gap.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;After (WebSocket event-driven):&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;┌──────────────────┐    ┌──────────────────┐    ┌──────────────────┐
│  WS connection   │ ── │ Local orderbook  │ ── │ Strategy engine  │
│  pushes deltas   │    │ kept current     │    │ reacts to events │
└──────────────────┘    └──────────────────┘    └──────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now the client maintains a &lt;strong&gt;local replica of the orderbook&lt;/strong&gt;, applying deltas as they arrive. The strategy engine reacts to specific events (a bid hit, an ask lifted, a spread widening past a threshold). State is continuous, not sampled.&lt;/p&gt;

&lt;p&gt;This is more code. It's also the only way to react inside a 100 ms window.&lt;/p&gt;
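
&lt;p&gt;As a rough illustration of the "local orderbook" box above, here is a minimal sketch. It assumes each delta arrives as a list of &lt;code&gt;[price, quantity]&lt;/code&gt; string pairs per side and that a quantity of zero means "remove this level". &lt;code&gt;LocalBook&lt;/code&gt; and &lt;code&gt;apply_delta&lt;/code&gt; are hypothetical names for illustration, not part of any exchange SDK.&lt;/p&gt;

```python
from decimal import Decimal


class LocalBook:
    """Hypothetical in-memory replica of one symbol's order book."""

    def __init__(self):
        self.bids = {}  # price -> quantity
        self.asks = {}  # price -> quantity

    def apply_delta(self, side, levels):
        """Apply [price, qty] string pairs; a qty of zero deletes the level."""
        book = self.bids if side == "bids" else self.asks
        for price, qty in levels:
            p, q = Decimal(price), Decimal(qty)
            if q == 0:
                book.pop(p, None)
            else:
                book[p] = q

    def best_bid(self):
        return max(self.bids) if self.bids else None

    def best_ask(self):
        return min(self.asks) if self.asks else None


book = LocalBook()
book.apply_delta("bids", [["100.00", "1.5"], ["99.50", "2.0"]])
book.apply_delta("asks", [["100.50", "0.7"]])
book.apply_delta("bids", [["100.00", "0"]])  # level removed by a later delta
print(book.best_bid(), book.best_ask())
```

&lt;p&gt;Prices are kept as &lt;code&gt;Decimal&lt;/code&gt; so that dictionary keys compare exactly. A production book would likely use a sorted structure instead of scanning with &lt;code&gt;max&lt;/code&gt; and &lt;code&gt;min&lt;/code&gt; on every lookup.&lt;/p&gt;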




&lt;h2&gt;
  
  
  6. Production-grade WebSocket client
&lt;/h2&gt;

&lt;p&gt;The bare-minimum example earlier works for a demo. For a real arbitrage bot, you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Automatic reconnect&lt;/strong&gt; on disconnect&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Heartbeat / ping-pong&lt;/strong&gt; to detect dead connections faster than the OS will&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sequence number validation&lt;/strong&gt; to detect dropped messages (most exchanges include a sequence ID)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Local orderbook state&lt;/strong&gt; that applies deltas correctly&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backoff on reconnects&lt;/strong&gt; to avoid hammering the exchange after an outage&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Here's a more robust skeleton (Binance-style stream, simplified):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;
&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;websockets&lt;/span&gt;

&lt;span class="n"&gt;log&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;logging&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getLogger&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;ws_client&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;class&lt;/span&gt; &lt;span class="nc"&gt;CryptoOrderbookClient&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;__init__&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;on_update&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;uri&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;on_update&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;on_update&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_stop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_backoff_s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;

    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_stop&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;websockets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;connect&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
                    &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
                    &lt;span class="n"&gt;ping_interval&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;    &lt;span class="c1"&gt;# heartbeat every 20 s
&lt;/span&gt;                    &lt;span class="n"&gt;ping_timeout&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="c1"&gt;# treat as dead after 10 s no pong
&lt;/span&gt;                    &lt;span class="n"&gt;max_size&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="o"&gt;**&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="c1"&gt;# 1 MB message cap
&lt;/span&gt;                &lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;ws&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                    &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_backoff_s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;  &lt;span class="c1"&gt;# reset on successful connect
&lt;/span&gt;                    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WS connected to %s&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;uri&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                    &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;_consume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ws&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
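            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;websockets&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WebSocketException&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;OSError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TimeoutError&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;  &lt;span class="c1"&gt;# also covers failed handshakes during an outage&lt;/span&gt;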
                &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;WS disconnected: %s; backing off %ss&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_backoff_s&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_backoff_s&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;random&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;uniform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
                &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_backoff_s&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;min&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_backoff_s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;30&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# exponential up to 30s
&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_consume&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ws&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;ws&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;loads&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;on_update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;json&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;JSONDecodeError&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;warning&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;non-JSON message dropped&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;handler crashed; continuing&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;stop&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;self&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;_stop&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;


&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;my_handler&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;bid&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bids&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="n"&gt;ask&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;float&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;asks&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="c1"&gt;# ... your strategy logic ...
&lt;/span&gt;    &lt;span class="nf"&gt;print&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bid=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;bid&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; ask=&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;ask&lt;/span&gt;&lt;span class="si"&gt;:&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="n"&gt;f&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;


&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;CryptoOrderbookClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wss://stream.binance.com:9443/ws/btcusdt@depth20@100ms&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;my_handler&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Notes on what this gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;&lt;code&gt;ping_interval=20, ping_timeout=10&lt;/code&gt;&lt;/strong&gt; is the single most important pair of settings. Exchanges will silently drop your connection during network blips, and the OS-level TCP timeout can take minutes to notice. Without ping-pong, you'll think you're connected while receiving nothing. With these settings, a dead connection is detected within about 30 s in the worst case (up to 20 s until the next ping, plus 10 s waiting for the pong), and the client reconnects.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Exponential backoff&lt;/strong&gt; on reconnect prevents you from being the bot that DDoSes an exchange during their outage.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Catch all handler exceptions&lt;/strong&gt; inside the consume loop. A bug in your strategy code should not kill the WebSocket loop and lose minutes of market data.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;For orderbook reconstruction with full delta application and sequence-number gap detection, see &lt;a href="https://github.com/binance/binance-spot-api-docs/blob/master/web-socket-streams.md#how-to-manage-a-local-order-book-correctly" rel="noopener noreferrer"&gt;Binance's documented procedure for managing a local order book&lt;/a&gt;. Every major exchange has a similar document, and following it exactly is the most reliable way to avoid subtle desync bugs.&lt;/p&gt;
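
&lt;p&gt;To make the gap-detection idea concrete, here is a hedged sketch. It assumes each delta carries a first and a final update ID (Binance's diff depth events expose these as &lt;code&gt;U&lt;/code&gt; and &lt;code&gt;u&lt;/code&gt;) and that the REST snapshot reports the last update ID it includes. &lt;code&gt;SequenceTracker&lt;/code&gt; and &lt;code&gt;SequenceGap&lt;/code&gt; are illustrative names, and the exact rules should be checked against the exchange's current documentation.&lt;/p&gt;

```python
class SequenceGap(Exception):
    """Raised when an update was missed and the book must be rebuilt."""


class SequenceTracker:
    """Tracks the last applied update ID for one stream (illustrative)."""

    def __init__(self, snapshot_last_update_id):
        self.last_id = snapshot_last_update_id

    def accept(self, first_id, final_id):
        """Return True if the event should be applied, False if it is stale."""
        if self.last_id >= final_id:
            return False  # entirely older than the local book: drop it
        if first_id > self.last_id + 1:
            raise SequenceGap(f"expected {self.last_id + 1}, got {first_id}")
        self.last_id = final_id
        return True


tracker = SequenceTracker(snapshot_last_update_id=100)
results = [
    tracker.accept(95, 100),   # already covered by the snapshot
    tracker.accept(101, 105),  # contiguous with the snapshot
    tracker.accept(106, 110),  # contiguous with the previous event
]
try:
    tracker.accept(115, 120)   # IDs 111 to 114 never arrived
    gap_detected = False
except SequenceGap:
    gap_detected = True
print(results, gap_detected)
```

&lt;p&gt;When &lt;code&gt;SequenceGap&lt;/code&gt; is raised, the usual recovery is to discard the local book, fetch a fresh snapshot over REST, and resume applying deltas from there.&lt;/p&gt;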




&lt;h2&gt;
  
  
  7. What still needs REST
&lt;/h2&gt;

&lt;p&gt;WebSocket replaces REST for &lt;strong&gt;market data&lt;/strong&gt;. It does not replace REST for everything. Things that still belong on REST:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Order placement and cancellation&lt;/strong&gt; on most exchanges (some have WebSocket order entry; coverage is uneven).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Account balance queries&lt;/strong&gt;, position queries, fee tier lookups — infrequent enough that polling cost is irrelevant.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Historical data fetches&lt;/strong&gt; — REST is the right tool for "give me the last 1000 trades".&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One-shot administrative calls&lt;/strong&gt; — withdrawals, API key management, etc.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A real arbitrage bot in 2026 typically runs a &lt;strong&gt;WebSocket data plane&lt;/strong&gt; and a &lt;strong&gt;REST control plane&lt;/strong&gt; side by side. Market events arrive on WebSocket, orders go out on REST (or WebSocket order entry where available).&lt;/p&gt;




&lt;h2&gt;
  
  
  8. What happens to retail bots that don't make this transition
&lt;/h2&gt;

&lt;p&gt;A polling-based crypto arbitrage bot in 2026 isn't broken. It still runs, but it degrades in predictable ways:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Signals fire slower&lt;/strong&gt; because the bot sees market state at most 2× per second, and each snapshot is already a REST round-trip old when it arrives.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Most opportunities have already closed&lt;/strong&gt; by the time the strategy reacts.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Per-trade edge collapses&lt;/strong&gt; as the bot consistently takes the worst price of the window.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Win rate drops&lt;/strong&gt; to the point where execution costs (fees + spread + slippage) exceed gross edge.&lt;/li&gt;
&lt;/ul&gt;
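
&lt;p&gt;A back-of-the-envelope way to see the "already closed" effect from the list above: if windows open at a random moment relative to a fixed poll schedule, the share of windows that contain at least one poll sample is roughly the window length divided by the poll interval. The function below is a simplified model under that assumption, not measured data, and it ignores the extra staleness added by the REST round trip itself.&lt;/p&gt;

```python
def visible_fraction(window_ms, poll_interval_ms):
    """Share of windows that contain at least one poll sample.

    Assumes windows open at a uniformly random offset relative to a
    fixed-interval poll schedule (a simplification, not measured data).
    """
    return min(1.0, window_ms / poll_interval_ms)


for interval_ms in (500, 920, 100):
    share = visible_fraction(100, interval_ms)
    print(f"poll every {interval_ms} ms: {share:.0%} of 100 ms windows sampled")
```

&lt;p&gt;Under this model a 500 ms poll loop samples about one in five 100 ms windows, and a loop limited by a 920 ms round trip samples about one in nine. Being sampled is only a necessary condition: the order still has to arrive before the window closes.&lt;/p&gt;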

&lt;p&gt;The strategy logic might be perfect. The execution layer is what kills it.&lt;/p&gt;

&lt;p&gt;This is the same dynamic that broke retail latency arbitrage on forex brokers a decade ago — except in crypto the resolution is happening over months, not years, because every major exchange now offers WebSocket and the technical bar is lower. The asymmetry will only get worse: traders running on WebSocket are pulling away from traders running on REST.&lt;/p&gt;




&lt;h2&gt;
  
  
  9. Where I'm coming from on this
&lt;/h2&gt;

&lt;p&gt;For context: I'm one of the developers behind &lt;a href="https://bjftradinggroup.com" rel="noopener noreferrer"&gt;BJF Trading Group&lt;/a&gt;'s crypto arbitrage software. We migrated the entire market-data path from REST polling to WebSocket through late 2025 and Q1 2026. Internal measurements showed that signal-to-fill rate improved by roughly an order of magnitude on liquid pairs, and that strategies which had become marginal under polling (especially cross-exchange Hedge and intra-exchange Latency) were viable again under WebSocket.&lt;/p&gt;

&lt;p&gt;We rolled the lessons into a focused product configuration — &lt;a href="https://bjftradinggroup.com/sharptrader-crypto/" rel="noopener noreferrer"&gt;SharpTrader Crypto&lt;/a&gt; — built specifically around WebSocket-native execution for crypto exchanges, with Latency and Hedge strategy modules. It also integrates with US-accepting exchanges (Coinbase, Kraken, Gemini, Bitstamp), which mattered to us because most retail crypto arbitrage tools rely on Binance/Bybit and geo-block US residents.&lt;/p&gt;

&lt;p&gt;If you're maintaining your own bot, the rest of this article is everything you need to do the transition yourself. If you'd rather skip the connection-management plumbing and use something off-the-shelf, that's what we built.&lt;/p&gt;




&lt;h2&gt;
  
  
  10. TL;DR for the impatient
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Crypto arbitrage windows close in &lt;strong&gt;under 100 ms&lt;/strong&gt; on liquid pairs.&lt;/li&gt;
&lt;li&gt;REST polling takes &lt;strong&gt;1–1.5 seconds&lt;/strong&gt; round-trip. You will miss most opportunities.&lt;/li&gt;
&lt;li&gt;WebSocket pushes updates in &lt;strong&gt;20–100 ms&lt;/strong&gt;. You will catch most opportunities.&lt;/li&gt;
&lt;li&gt;Migration is not optional in 2026.&lt;/li&gt;
&lt;li&gt;The minimum reliable client needs: &lt;code&gt;ping_interval&lt;/code&gt;, &lt;code&gt;ping_timeout&lt;/code&gt;, exponential backoff, exception isolation, sequence-number validation. Skip any of these and the bot will silently lose minutes of market data when the connection blips.&lt;/li&gt;
&lt;li&gt;Keep REST for order placement, balances, history.&lt;/li&gt;
&lt;li&gt;The strategy logic in your bot is probably fine. The data plane is what's killing it.&lt;/li&gt;
&lt;/ul&gt;




&lt;h2&gt;
  
  
  Further reading
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://bjftradinggroup.com/latency-arbitrage-backtest-execution-time-gap/" rel="noopener noreferrer"&gt;Why latency arbitrage backtests don't survive in production&lt;/a&gt; — the broader execution-time gap problem, applied to forex but the same logic carries over.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://bjftradinggroup.com/forex-broker-audit-toolkit/" rel="noopener noreferrer"&gt;BEQI: Open-source toolkit to audit broker execution quality&lt;/a&gt; — five-dimension measurement of execution quality, originally for forex but the same five dimensions matter on crypto.&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://bjftradinggroup.com/forex-pairs-trading-statistical-arbitrage/" rel="noopener noreferrer"&gt;Forex pairs trading and statistical arbitrage explained&lt;/a&gt; — pairs trading pillar; the leg-risk dynamics it describes apply directly to spot–futures pair trading on crypto.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you want to compare notes on connection management or have a horror story about a dropped WebSocket that silently cost you a session of fills, drop a comment below — these stories are how everyone in this niche gets better at it.&lt;/p&gt;

</description>
      <category>cryptocurrency</category>
      <category>python</category>
      <category>websocket</category>
    </item>
    <item>
      <title>How Forex Brokers Detect Latency Arbitrage in 2026: A Technical Breakdown</title>
      <dc:creator>Boris Fesenko</dc:creator>
      <pubDate>Fri, 10 Apr 2026 20:50:52 +0000</pubDate>
      <link>https://dev.to/bjftradinggroup/how-forex-brokers-detect-latency-arbitrage-in-2026-a-technical-breakdown-1pfb</link>
      <guid>https://dev.to/bjftradinggroup/how-forex-brokers-detect-latency-arbitrage-in-2026-a-technical-breakdown-1pfb</guid>
      <description>&lt;p&gt;Latency arbitrage has existed as long as electronic forex trading has. The concept is straightforward: if you receive a price update faster than a broker reflects it, you can trade on the broker's stale quote before it catches up. For two decades, this was primarily an infrastructure problem — whoever had faster pipes won.&lt;/p&gt;

&lt;p&gt;In 2026, the infrastructure gap has largely closed for retail participants. Co-location at LD4, NY4, or TY3 is accessible to anyone willing to pay $100–400/month. Sub-5ms round-trip to most retail brokers is achievable. The bottleneck has shifted from hardware to detection: brokers have deployed increasingly sophisticated AI systems to identify and neutralize arbitrage order flow.&lt;/p&gt;

&lt;p&gt;This article breaks down exactly how those detection systems work — from simple heuristics to behavioral AI — and what the detection signature of latency arbitrage actually looks like from the broker's side.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why brokers care: the conflict of interest
&lt;/h2&gt;

&lt;p&gt;Before discussing detection mechanics, it's worth understanding the broker's incentive structure.&lt;/p&gt;

&lt;p&gt;Most retail forex brokers operate as market makers — they take the opposite side of client trades. When a client executes a profitable trade, the broker loses that amount. Conventional retail traders are unprofitable maybe 70% of the time; the statistical edge favors the broker. A latency arbitrageur running a well-configured setup might be profitable 65–75% of the time on short-hold positions. Across hundreds of trades per day, this is directly and measurably expensive for a market maker.&lt;/p&gt;

&lt;p&gt;ECN/STP brokers have less financial incentive to detect arbitrage — they earn commissions regardless of client P&amp;amp;L — but many still deploy detection systems due to pressure from liquidity providers who don't want to be the "fast feed" being arbitraged against their own retail distribution.&lt;/p&gt;

&lt;p&gt;The result: detection is financially motivated and has been improving consistently since around 2018.&lt;/p&gt;




&lt;h2&gt;
  
  
  Generation 1: Simple heuristics (2010–2018)
&lt;/h2&gt;

&lt;p&gt;Early detection systems were rule-based and relatively easy to circumvent. They looked for obvious patterns:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Short hold time filters.&lt;/strong&gt; If an account's average position duration was below a threshold (e.g., 30 seconds), it was flagged. The workaround was a minimum hold time that kept positions open longer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Win rate on short-duration trades.&lt;/strong&gt; A 70%+ win rate on positions held under 60 seconds, sustained over weeks, has no non-arbitrage explanation. Brokers tracked this per-account.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Fixed lot size uniformity.&lt;/strong&gt; Arbitrage bots often trade identical lot sizes across every signal. Statistical distribution of lot sizes across a genuine retail account looks nothing like this. Some brokers flagged accounts where 90%+ of trades were the same size.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;IP-based correlation.&lt;/strong&gt; Two accounts connecting from the same IP with mirrored P&amp;amp;L (one profits when the other loses) is the lock arbitrage signature. First-generation detection caught this at the network metadata level.&lt;/p&gt;

&lt;p&gt;These heuristics were effective against unsophisticated setups but generated significant false positives — legitimate algorithmic traders were caught in the same nets. They were also easy to work around: vary lot sizes slightly, hold positions longer, use different IPs.&lt;/p&gt;
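
&lt;p&gt;The first three rules above are simple enough to express in a few lines. The sketch below uses the thresholds quoted in the text (30 s average hold, 70% win rate under 60 s, 90% identical lot sizes); the function name and the trade record keys are hypothetical, and real broker systems are not public.&lt;/p&gt;

```python
from collections import Counter
from statistics import mean


def heuristic_flags(trades, max_avg_hold_s=30, min_short_win_rate=0.70,
                    min_same_lot_share=0.90):
    """Return the names of the first-generation rules an account trips.

    `trades` is a non-empty list of dicts with hypothetical keys:
    hold_s (seconds held), pnl (profit or loss), lots (position size).
    """
    flags = []
    if max_avg_hold_s > mean(t["hold_s"] for t in trades):
        flags.append("short_hold_time")

    short = [t for t in trades if 60 > t["hold_s"]]
    if short:
        win_rate = sum(1 for t in short if t["pnl"] > 0) / len(short)
        if win_rate >= min_short_win_rate:
            flags.append("high_short_win_rate")

    most_common_count = Counter(t["lots"] for t in trades).most_common(1)[0][1]
    if most_common_count / len(trades) >= min_same_lot_share:
        flags.append("uniform_lot_size")
    return flags


# Made-up account: ten short trades, eight winners, identical lot sizes.
bot = [{"hold_s": 8, "pnl": 4.0, "lots": 1.0} for _ in range(8)]
bot += [{"hold_s": 12, "pnl": -3.0, "lots": 1.0} for _ in range(2)]
print(heuristic_flags(bot))
```

&lt;p&gt;Hard thresholds like these are exactly why the approach was brittle: a legitimate scalper can trip all three rules, and an arbitrage bot can slip under each one with small parameter changes.&lt;/p&gt;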




&lt;h2&gt;
  
  
  Generation 2: Statistical behavioral analysis (2018–2022)
&lt;/h2&gt;

&lt;p&gt;The second generation moved from hard thresholds to statistical modeling. Instead of "flag if hold time &amp;lt; 30s," the system builds a statistical profile of each account and compares it against a population model.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Temporal correlation analysis.&lt;/strong&gt; This is the most powerful single latency arbitrage signal. The detection system timestamps every price update on its feed and every incoming order. For a latency arbitrageur, there is a statistically significant correlation between moments when the fast feed diverges from the broker's own price and the moment orders arrive.&lt;/p&gt;

&lt;p&gt;Specifically: if the broker measures the time delta between its own last price update and the arrival of an order, arbitrage accounts cluster orders at small deltas (they're trading the discrepancy). Normal retail accounts place orders without any correlation to the broker's internal price update timing.&lt;/p&gt;

&lt;p&gt;This is difficult to fake. You cannot make your order arrival time uncorrelated with price events without destroying the arbitrage signal itself.&lt;/p&gt;
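
&lt;p&gt;The time-delta measurement described above can be sketched as follows. This is a simplified illustration: it only computes, per account, the share of orders that arrive shortly after the broker's most recent price update. The function name, the millisecond timestamps, and the 50 ms threshold are assumptions made for the example.&lt;/p&gt;

```python
from bisect import bisect_right


def share_of_orders_near_updates(update_times_ms, order_times_ms, threshold_ms=50):
    """Fraction of orders arriving within threshold_ms after a price update.

    update_times_ms must be sorted ascending and order_times_ms non-empty.
    All names and the 50 ms threshold are illustrative assumptions.
    """
    near = 0
    for t in order_times_ms:
        i = bisect_right(update_times_ms, t)
        if i == 0:
            continue  # order arrived before the first recorded update
        if threshold_ms >= t - update_times_ms[i - 1]:
            near += 1
    return near / len(order_times_ms)


updates = [0, 1000, 2000, 3000, 4000]
arb_orders = [1012, 2008, 3015, 4020]     # shortly after each update
retail_orders = [450, 1730, 2610, 3990]   # no relation to update timing
print(share_of_orders_near_updates(updates, arb_orders))
print(share_of_orders_near_updates(updates, retail_orders))
```

&lt;p&gt;A real system would compare the full distribution of deltas against the population rather than a single threshold, but the clustering effect is the same.&lt;/p&gt;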

&lt;p&gt;&lt;strong&gt;Position lifetime distribution analysis.&lt;/strong&gt; A latency arbitrageur's position lifetime distribution is highly abnormal from a population perspective. The distribution is heavily skewed toward very short hold times (under 30 seconds) with a long tail. No conventional trading strategy produces this shape. The broker doesn't need a hard cutoff — they fit the distribution and flag accounts whose parameters fall outside the population's confidence interval.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Adverse selection measurement.&lt;/strong&gt; For market makers, the key metric is how often the price moves against them after filling a client order. Latency arbitrage fills are almost always immediately followed by adverse price movement (that is the point: the price is about to move). A normal retail account generates adverse selection at roughly random rates. An arbitrage account's fills are systematically followed by price movement in the client's favor. This signal is robust and hard to fake.&lt;/p&gt;
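
&lt;p&gt;One common way to quantify this is a post-fill "markout": the signed price move over a fixed horizon after each fill. The sketch below assumes the broker records a fill price and a mid price some fixed interval later; the function name, the data layout, and the sample prices are made up for illustration.&lt;/p&gt;

```python
def mean_markout(fills, horizon_prices):
    """Average price move in the client's favour after each fill.

    fills: non-empty list of (side, fill_price), side is "buy" or "sell".
    horizon_prices: mid prices a fixed interval after each fill.
    Positive values mean the market maker was adversely selected.
    """
    total = 0.0
    for (side, price), later in zip(fills, horizon_prices):
        move = later - price
        total += move if side == "buy" else -move
    return total / len(fills)


fills = [("buy", 1.1000), ("sell", 1.1010), ("buy", 1.0990)]
arb_later = [1.1003, 1.1006, 1.0994]      # price follows every fill
retail_later = [1.0998, 1.1012, 1.0991]   # mixed outcomes
arb_markout = mean_markout(fills, arb_later)
retail_markout = mean_markout(fills, retail_later)
print(f"arb: {arb_markout * 10000:.1f} pips, retail: {retail_markout * 10000:.1f} pips")
```

&lt;p&gt;An account whose mean markout stays clearly positive across hundreds of fills is the pattern the paragraph above describes.&lt;/p&gt;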

&lt;p&gt;&lt;strong&gt;Cross-account P&amp;amp;L correlation.&lt;/strong&gt; For lock arbitrage, the detection signature is that Account A profits at roughly the same times Account B loses, so the two P&amp;amp;L streams are strongly negatively correlated. Second-generation systems tracked this at the broker's clearing level. Two accounts with that pattern are almost certainly running a lock strategy.&lt;/p&gt;
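
&lt;p&gt;The mirrored-P&amp;amp;L check reduces to a correlation coefficient between two per-interval P&amp;amp;L series. The sketch below computes a plain Pearson correlation; the sample numbers and the -0.9 review threshold are assumptions for illustration, not values from any broker.&lt;/p&gt;

```python
from math import sqrt


def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Per-interval P&L for two accounts (made-up numbers).
account_a = [120.0, -80.0, 45.0, -30.0, 95.0]
account_b = [-115.0, 78.0, -50.0, 28.0, -90.0]
r = pearson(account_a, account_b)
is_lock_candidate = -0.9 > r  # hypothetical review threshold
print(f"r = {r:.3f}, lock candidate: {is_lock_candidate}")
```

&lt;p&gt;In practice the comparison would run across many account pairs and over aligned time buckets, which is why it sat at the clearing level rather than in per-account rules.&lt;/p&gt;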




&lt;h2&gt;
  
  
  Generation 3: Machine learning behavioral clustering (2022–present)
&lt;/h2&gt;

&lt;p&gt;The current generation uses machine learning to identify arbitrage accounts without relying on any single signal. The key innovation is clustering: rather than flagging accounts individually, the system builds behavioral feature vectors for every account and identifies clusters of accounts whose behavior is similar.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Feature vector construction.&lt;/strong&gt; For each account, the system constructs a high-dimensional feature vector including:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Mean and variance of position hold time&lt;/li&gt;
&lt;li&gt;Temporal correlation coefficient between orders and fast feed updates&lt;/li&gt;
&lt;li&gt;Win rate conditioned on hold time&lt;/li&gt;
&lt;li&gt;Lot size distribution (mean, variance, skewness)&lt;/li&gt;
&lt;li&gt;Time-of-day trading density (arbitrageurs concentrate on high-volatility sessions)&lt;/li&gt;
&lt;li&gt;Slippage distribution (positive vs negative slippage ratio)&lt;/li&gt;
&lt;li&gt;Order-to-fill time (consistent fast fills suggest algorithmic execution)&lt;/li&gt;
&lt;li&gt;IP metadata and session behavior&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Clustering.&lt;/strong&gt; Accounts are clustered by behavioral similarity. A cluster of accounts that all entered long EUR/USD positions within 200ms of each other on Tuesday at 13:47:23 UTC isn't coincidence — they're running the same software. The ML system doesn't need to know it's arbitrage software; it just knows these accounts are behaviorally correlated.&lt;/p&gt;
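
&lt;p&gt;The synchronized-entry example above can be illustrated without any ML at all. The sketch below is a deliberately simple stand-in for behavioral clustering: it counts how often two accounts enter the same symbol and side within 200 ms of each other. The order tuple layout, the function name, and the &lt;code&gt;min_hits&lt;/code&gt; cutoff are hypothetical.&lt;/p&gt;

```python
from collections import Counter


def synchronized_pairs(orders, window_ms=200, min_hits=3):
    """Find account pairs that repeatedly enter the same trade together.

    orders: list of (account, symbol, side, time_ms) tuples (hypothetical schema).
    Returns {(account_a, account_b): hits} for pairs with at least min_hits
    same-symbol, same-side entries no more than window_ms apart.
    """
    hits = Counter()
    ordered = sorted(orders, key=lambda o: o[3])
    for i, (acc_a, sym_a, side_a, t_a) in enumerate(ordered):
        for acc_b, sym_b, side_b, t_b in ordered[i + 1:]:
            if t_b - t_a > window_ms:
                break  # later orders are even further away in time
            if acc_a != acc_b and (sym_a, side_a) == (sym_b, side_b):
                hits[tuple(sorted((acc_a, acc_b)))] += 1
    return {pair: n for pair, n in hits.items() if n >= min_hits}


orders = []
for t in (1000, 61000, 125000):
    orders.append(("A1", "EURUSD", "buy", t))
    orders.append(("A2", "EURUSD", "buy", t + 120))
orders.append(("R1", "EURUSD", "buy", 61050))  # one coincidental retail entry
pairs = synchronized_pairs(orders)
print(pairs)
```

&lt;p&gt;A production system would cluster the full feature vectors rather than a single co-occurrence count, but repeated synchronized entries are the kind of structure such clustering picks up.&lt;/p&gt;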

&lt;p&gt;&lt;strong&gt;Continuous retraining.&lt;/strong&gt; The system retrains on new data regularly. When arbitrage software adds new masking techniques, the behavioral fingerprint changes — and the detection system adapts. This is the arms race dynamic that makes the 2026 landscape fundamentally different from 2018.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cross-broker data sharing.&lt;/strong&gt; This is the most concerning development for arbitrage operators. There is no regulatory prohibition on brokers sharing behavioral metadata about client trading patterns. Some broker networks and shared liquidity arrangements informally share account flagging data. An account flagged at one broker in a network can be pre-flagged at another before any trades are placed.&lt;/p&gt;




&lt;h2&gt;
  
  
  What detection actually looks like from the trader's perspective
&lt;/h2&gt;

&lt;p&gt;Detection rarely manifests as an immediate account closure. The typical progression:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 1 — Monitoring.&lt;/strong&gt; The account is flagged as a potential arbitrage account. No action is taken yet; the broker accumulates data to confirm the flag.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 2 — Soft countermeasures.&lt;/strong&gt; The broker introduces artificial execution delays specifically on this account — typically 30–150ms added to order processing. The trader sees execution times increasing gradually. Slippage becomes systematically negative. The arbitrage window closes before the order fills. Profitability drops without any visible restriction.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 3 — Targeted spread widening.&lt;/strong&gt; The broker applies per-account spread markup on instruments used most frequently for arbitrage. From the trader's perspective, spreads appear wider than published rates. This is applied at the server level and isn't visible in standard platform spread displays.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Phase 4 — Account action.&lt;/strong&gt; Depending on the broker's ToS and appetite for risk: profit confiscation on flagged trades, account restriction, or closure with capital returned.&lt;/p&gt;

&lt;p&gt;This graduated approach is deliberate: it makes it harder for the trader to pinpoint when detection occurred, and it lets the broker collect more data before acting.&lt;/p&gt;




&lt;h2&gt;
  
  
  The detection-resistance problem
&lt;/h2&gt;

&lt;p&gt;The fundamental challenge is that temporal correlation — the primary detection signal — is intrinsic to latency arbitrage. You cannot remove it without removing the strategy itself. Every order that is causally related to a fast-feed price event will be temporally correlated with that event.&lt;/p&gt;
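
&lt;p&gt;One simple way to picture this signal is the share of an account's orders that arrive shortly after a fast-feed tick. The sketch below is an assumption-laden illustration rather than a real detection metric: &lt;code&gt;reaction_share&lt;/code&gt; is a hypothetical name, the 50 ms window is arbitrary, and the timestamps are made up.&lt;/p&gt;

&lt;pre&gt;&lt;code&gt;
from bisect import bisect_right


def reaction_share(order_times_ms, tick_times_ms, window_ms=50):
    """Fraction of orders placed within window_ms after the latest fast-feed tick."""
    ticks = sorted(tick_times_ms)
    if not order_times_ms:
        return 0.0
    hits = 0
    for order_ms in order_times_ms:
        i = bisect_right(ticks, order_ms)  # number of ticks at or before the order
        if i > 0 and window_ms >= order_ms - ticks[i - 1]:
            hits += 1
    return hits / len(order_times_ms)


ticks = [1_000, 5_000, 9_000]
arb_like = [1_012, 5_020, 9_008, 12_500]
mixed = arb_like + [2_700, 3_900, 7_100, 11_000]  # unrelated entries added

print(reaction_share(arb_like, ticks))  # 0.75
print(reaction_share(mixed, ticks))     # 0.375
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Adding unrelated orders halves the share in this toy data, but the tick-driven orders are still there. The signal is weaker, not gone.&lt;/p&gt;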

&lt;p&gt;The countermeasures that exist work by adding noise to the signal rather than eliminating it:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Behavioral blending.&lt;/strong&gt; Running technical indicator-based entry triggers (RSI, candlestick patterns) in parallel with arbitrage execution creates an order history that partially resembles retail technical trading. The temporal correlation signal is diluted but not eliminated.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Virtual order systems.&lt;/strong&gt; Software-side virtual orders decouple re-entry timing from fast-feed events by executing on price levels rather than on signal timing. The broker sees an order placed when price reached a certain level, not when the fast feed moved.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Account rotation and profile separation.&lt;/strong&gt; Maintaining multiple accounts with distinct behavioral profiles, IPs, and execution patterns to prevent clustering detection from linking them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Noise trading.&lt;/strong&gt; Adding deliberately unprofitable or breakeven trades to normalize the win rate distribution and adverse selection metrics.&lt;/p&gt;

&lt;p&gt;None of these fully eliminate the detection signal — they reduce its statistical strength. The detection system needs sufficient signal strength to confidently flag an account; countermeasures aim to keep the signal below that threshold.&lt;/p&gt;




&lt;h2&gt;
  
  
  Conclusions
&lt;/h2&gt;

&lt;p&gt;The 2026 broker detection landscape reflects a genuine technological arms race. The progression from simple hold-time heuristics to ML-based behavioral clustering represents a substantial increase in detection capability.&lt;/p&gt;

&lt;p&gt;What hasn't changed: latency arbitrage remains legal in all major jurisdictions. No financial regulator has classified it as market manipulation or any prohibited activity. The detection and restriction that traders face is contractual — brokers enforcing terms of service — not legal.&lt;/p&gt;

&lt;p&gt;What has changed: the infrastructure edge that defined latency arbitrage profitability for its first decade is now table stakes. The differentiating factors in 2026 are detection resistance, broker selection, and behavioral profile management.&lt;/p&gt;

&lt;p&gt;For a more detailed technical breakdown of infrastructure requirements and masking strategy mechanics, see: &lt;a href="https://bjftradinggroup.com/latency-arbitrage/" rel="noopener noreferrer"&gt;Latency Arbitrage: Complete Guide 2026&lt;/a&gt; — which covers VPS colocation benchmarks, fast feed architecture, and the specific masking strategies (Phantom Drift, BrightDuo) currently in production use.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;The author develops arbitrage trading software at BJF Trading Group, a Canadian HFT software company active in the arbitrage space since 2000.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;Tags:&lt;/strong&gt; &lt;code&gt;HFT&lt;/code&gt; &lt;code&gt;algorithmic trading&lt;/code&gt; &lt;code&gt;forex&lt;/code&gt; &lt;code&gt;arbitrage&lt;/code&gt; &lt;code&gt;market microstructure&lt;/code&gt; &lt;code&gt;trading systems&lt;/code&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>networking</category>
      <category>performance</category>
    </item>
  </channel>
</rss>
