TL;DR
I built Funding Finder, a free cross-exchange perpetual funding-rate arbitrage scanner. It polls Binance, Bybit, OKX, Bitget, MEXC, Hyperliquid, and Gate.io every 5 minutes — about 3700 USDT-margined perpetuals — ranks every base coin by the spread between its cheapest long leg and most expensive short leg, annualizes correctly per-symbol, and filters out illiquid noise.
Free web view, free JSON API, no signup. The OSS data collector is on GitHub under MIT.
This post is about the three non-obvious things most existing tools get wrong and how I handled each one. If you trade funding-rate arbitrage even casually, two of them have probably been silently distorting your scanner output.
Why I built it
I trade a small funding-arb book on the side. The data ergonomics for this strategy are surprisingly bad.
- Coinglass has the best UI but the entry API plan starts at $29/month with 30 req/min and a 180-day history cap.
- CoinAPI is commercial and pitched at institutions.
- Coinalyze is free but BTC-only.
- Each exchange's native API is free, but you have to scrape, normalize symbols, and reconcile funding intervals per venue. By the time you've done that, you've built half the product.
The strategy itself is dead simple: long the perp on the exchange with the lowest funding rate, short the perp on the exchange with the highest funding rate. Both legs are perp, no spot, beta-neutral. You capture the funding spread until convergence.
What I needed was a no-bullshit table that ranks current cross-exchange spreads by annualized yield, filtered by liquidity, across all the major venues. So I built it.
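The leg selection itself is just a min/max over one coin's funding rates across venues. A minimal sketch with made-up numbers (per-8h-period rates; the venue assignments are illustrative only):

```python
# Hypothetical snapshot for one coin: funding rate per 8h period, by venue.
# Positive funding means longs pay shorts; negative means shorts pay longs.
legs = {"binance": -0.0001, "bybit": 0.0004, "okx": 0.0002}

long_venue = min(legs, key=legs.get)   # cheapest leg: go long here
short_venue = max(legs, key=legs.get)  # richest leg: go short here

# Per-period spread captured while both legs are open.
spread = legs[short_venue] - legs[long_venue]
print(long_venue, short_venue, round(spread, 6))  # binance bybit 0.0005
```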
The architecture (deliberately boring)
collector.py → funding.db (SQLite + WAL) → api.py (Flask) → static/index.html
     ↑
     │
     5-min loop, polls 7 exchange public APIs in parallel
That's it. No Kafka, no Redis, no microservices, no Docker stack. One SQLite file, one Flask process, one HTML file. The whole thing fits in ~700 lines of Python and runs on a $5 VPS.
The schema is one table:
CREATE TABLE funding_rates (
    exchange TEXT NOT NULL,
    symbol TEXT NOT NULL,
    base TEXT NOT NULL,
    funding_rate REAL NOT NULL,
    funding_interval_hours INTEGER NOT NULL DEFAULT 8,
    next_funding_time INTEGER,
    mark_price REAL,
    volume_24h_usd REAL DEFAULT 0,
    fetched_at INTEGER NOT NULL,
    PRIMARY KEY (exchange, symbol, fetched_at)
);
The composite primary key gives you a free historical record. Run the collector in a 5-minute loop and after a week you have a queryable funding-rate history per (exchange, symbol).
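To show what that free history buys you, here's a self-contained sketch against an in-memory copy of the schema. The real collector writes to funding.db; the query pattern (latest row per (exchange, symbol)) is what a "current rates" view boils down to:

```python
import sqlite3
import time

con = sqlite3.connect(":memory:")  # stand-in for funding.db
con.execute("""CREATE TABLE funding_rates (
    exchange TEXT NOT NULL, symbol TEXT NOT NULL, base TEXT NOT NULL,
    funding_rate REAL NOT NULL, funding_interval_hours INTEGER NOT NULL DEFAULT 8,
    next_funding_time INTEGER, mark_price REAL, volume_24h_usd REAL DEFAULT 0,
    fetched_at INTEGER NOT NULL,
    PRIMARY KEY (exchange, symbol, fetched_at))""")

now = int(time.time())
rows = [
    # two collector passes for binance, one for bybit (values made up)
    ("binance", "BTCUSDT", "BTC", 0.0001, 8, None, 97000.0, 1e10, now - 300),
    ("binance", "BTCUSDT", "BTC", 0.0002, 8, None, 97100.0, 1e10, now),
    ("bybit",   "BTCUSDT", "BTC", 0.0004, 8, None, 97050.0, 5e9,  now),
]
con.executemany("INSERT INTO funding_rates VALUES (?,?,?,?,?,?,?,?,?)", rows)

# Latest snapshot per (exchange, symbol) for one base coin.
# SQLite's bare-column-with-MAX rule picks funding_rate from the newest row.
latest = con.execute("""
    SELECT exchange, funding_rate, MAX(fetched_at)
    FROM funding_rates WHERE base = ?
    GROUP BY exchange, symbol
""", ("BTC",)).fetchall()
print(latest)  # one row per exchange, carrying the newest funding_rate
```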
Gotcha #1 — funding interval is not always 8 hours
Most tools you find on GitHub assume an 8-hour funding interval everywhere. They compute annualized yield as funding_rate × 3 × 365 (3 funding settlements per day × 365 days).
That's wrong.
Binance has 425 USDT-margined perps on a 4-hour cycle and 4 perps on a 1-hour cycle, in addition to the ~241 standard 8-hour ones. Bybit has 377 on 4-hour and 3 on 1-hour. Bitget and MEXC each show a similar mix. Hyperliquid is entirely on a 1-hour cycle. That's roughly 30% of the available CEX market (and 100% of Hyperliquid) that gets mis-annualized if you assume 8h.
A 0.05% per-period funding rate is:
- ≈ 55% APY on an 8h symbol (×3×365)
- ≈ 109% APY on a 4h symbol (×6×365)
- ≈ 438% APY on a 1h symbol (×24×365)
If your scanner shows the same annualized number for all of them, your scanner is lying to you.
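The correct annualization is just periods-per-day × 365, derived from the per-symbol interval. A minimal helper reproducing the numbers above:

```python
def annualize(rate_per_period: float, interval_hours: int) -> float:
    """Annualized funding yield: (periods per day) x 365 x per-period rate."""
    periods_per_day = 24 / interval_hours
    return rate_per_period * periods_per_day * 365

# The 0.05%-per-period example, at each interval:
for hours in (8, 4, 1):
    print(hours, round(annualize(0.0005, hours) * 100, 1))  # 54.8 / 109.5 / 438.0 % APY
```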
The fix is to fetch the per-symbol funding interval at startup and cache it:
import requests

def _binance_funding_intervals() -> dict[str, int]:
    """Symbol → funding interval in hours; Binance defaults to 8h when the field is absent."""
    r = requests.get("https://fapi.binance.com/fapi/v1/fundingInfo", timeout=15)
    r.raise_for_status()
    return {
        item["symbol"]: int(item.get("fundingIntervalHours", 8))
        for item in r.json()
    }
For Bybit it's /v5/market/instruments-info?category=linear, paginated, with fundingInterval in minutes. Convert to hours, cache, you're done. For Bitget it's in /api/v2/mix/market/current-fund-rate. For MEXC it's collectCycle in the funding-rate response. For Gate.io it's in seconds in /futures/usdt/contracts. For Hyperliquid it's hardcoded to 1 (their entire venue is 1h).
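Normalizing those conventions into a single hours value looks roughly like this. The field names follow the endpoints mentioned above, but the exact response shapes are assumptions worth re-checking against each venue's docs:

```python
def interval_hours(exchange: str, raw: dict) -> int:
    """Best-effort normalization of one venue's raw instrument/funding record
    to a funding interval in hours. Field names are assumptions from memory."""
    if exchange == "binance":
        return int(raw.get("fundingIntervalHours", 8))
    if exchange == "bybit":
        return int(raw["fundingInterval"]) // 60     # reported in minutes
    if exchange == "gateio":
        return int(raw["funding_interval"]) // 3600  # reported in seconds
    if exchange == "mexc":
        return int(raw["collectCycle"])              # assumed already in hours
    if exchange == "hyperliquid":
        return 1                                     # whole venue is 1h
    return 8                                         # conservative default

print(interval_hours("bybit", {"fundingInterval": 240}))  # 4
```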
Six venues (counting Binance above), six subtly different conventions for the same field. Welcome to crypto data plumbing.
This single fix changed my own top-10 substantially. The "best" opportunity before the fix, an 8h symbol, turned out to be middling once the 4h symbols were annualized correctly and moved above it.
Gotcha #2 — the liquidity filter is the actual product
Before the liquidity filter, my scanner found 279 cross-exchange opportunities at any given time. Sounds great.
It wasn't. The top 10 were a parade of:
- Coins about to be delisted (one exchange's funding swings violently as market makers hedge their unwind)
- New listings with thin order books on one of the two legs
- Tokens with $40k of daily volume — your $5k position would move the mark price by 2% on entry alone
Adding min_volume_24h_usd as a both-legs filter cut the list from 279 to 51 at a $5M floor, and to 6 at $50M. The 6 at $50M+ are the actually-tradeable ones. Everything else is bait.
Code is trivial:
liquid = [g for g in group if (g.get("volume_24h_usd") or 0) >= min_volume]
if len(liquid) < 2:
    continue
But nobody seems to talk about the real lesson here: a scanner is only as good as the liquidity floor it imposes. That floor isn't a feature of "advanced" scanners; it's the default that should ship.
Gotcha #3 — symbol normalization matters more than you think
Each exchange has its own naming convention. Binance and Bybit use BTCUSDT. OKX uses BTC-USDT-SWAP. Bitget uses BTCUSDT. MEXC uses BTC_USDT. Gate.io uses BTC_USDT. Hyperliquid uses just BTC (no quote suffix because everything is USD-margined intrinsically).
If you join naively, you'll miss half your cross-exchange opportunities. Worse: if you don't know that Binance's 1000PEPEUSDT is the same coin as MEXC's PEPE_USDT (different multiplier), you'll compute a "spread" that's completely fake.
The simple normalization rules for the 90% case:
- `BTCUSDT` → base `BTC`
- `BTC-USDT-SWAP` → strip `-USDT-SWAP`, base `BTC`
- `BTC_USDT` → strip `_USDT`, base `BTC`
- `BTC` (Hyperliquid) → base `BTC`, append `USDT` for the joined symbol
For the 10% edge cases (1000-multiplier coins, 10000-multiplier coins, exotic pairs), you either build a manual mapping or filter them out. I filter them out for v0 — better to under-report than to publish fake spreads.
def _base_from(sym: str, ex: str) -> str | None:
    if ex == "okx" and sym.endswith("-USDT-SWAP"):
        return sym.removesuffix("-USDT-SWAP")
    if ex in ("mexc", "gateio") and sym.endswith("_USDT"):
        return sym[:-5]
    if sym.endswith("USDT"):
        return sym[:-4]
    return sym  # Hyperliquid: name is already the base
What's next
- dYdX v4 and Coinbase Derivatives to round out the universe of major venues
- Persistent history endpoint is already live (`/api/funding/history/<base>?hours=720`) — the rolling collection populates SQLite, the endpoint just exposes it
- Telegram alert bot, free, when an opportunity above a user-defined yield + liquidity floor appears
- API key system with a paid tier under $10/month for full history (12 months+) and higher rate limits
- Slippage estimation using order book depth, so the displayed APY accounts for the entry/exit cost on each leg
Source for the OSS collector + schema is on GitHub at <TBD-after-push> under MIT. The API/dashboard layer stays as a hosted service.
Try it: http://178.104.60.252:8083/
API docs: http://178.104.60.252:8083/docs
Tell me what's broken. Especially if you trade this strategy live — I want to hear about every gotcha that bit you, because the difference between a paper backtest and a real position is all in the details (settlement timing, mark-price gaps at funding time, withdrawal delays, leg correlation breakdown). I'll write a follow-up post on that once I've burned myself a few times.
Built solo. No funding round, no marketing budget, no enterprise sales team. Just one developer who got tired of paying for data that should be in the public domain anyway.