DEV Community

manja316

How to backtest a Polymarket strategy with free 15-minute historical data

Draft — queued 2026-06-10 (Wed) 04:20 IST by the growth worker. Publish-ready for the 16:13 window.
Pre-publish checklist:
(1) Re-pull live headline numbers from api.protodex.io/stats and bump §0 if the archive grew (was 18.0M snapshots / 18,039 markets / 74 days at draft time).
(2) The code recipe is a pattern, not a backtest result — no per-dataset return number is claimed, so it ships safe. If you run it and want a punchier hook, paste the real Brier/calibration delta into the §3 callout.
(3) Confirm the Gumroad agyjd listing headline matches before publish — at draft time the listing still read "10.8M+ / 13,900+", behind every free surface. Bump the listing first or the article will out-claim the store.

Most people backtest prediction-market strategies wrong, and it's not their fault — the data to do it right is annoying to assemble. You need time series per contract (not just the final resolution), aligned to a clock, with the resolution label attached so you know who won. Polymarket's API gives you the live order book, but the moment a market resolves, that history is gone from where most people look.

So here's a clean recipe. Free data, ~40 lines of pandas, and the caveats that separate a backtest you can trust from one that lies to you.

0. The data

You need a price history per market at a fixed interval, plus the resolution outcome. I've been archiving Polymarket at a 15-minute cadence since late March — 18M+ price snapshots across 18,000+ markets, 74 days of history as of this writing, with resolution labels attached. The live market index is free to browse at protodex.io.

You can roll your own collector against the Polymarket CLOB + Gamma APIs (the snapshot loop is maybe 60 lines), or skip the three-month wait and grab the parquet bundle — link at the end. Either way, the analysis below is identical.
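If you do roll your own, the part that's easy to get wrong is clock alignment: snapshots should land on the same 15-minute boundaries for every market, or the series won't line up later. Here's a minimal sketch of that loop. `fetch_prices` and `write_rows` are hypothetical callables you'd supply yourself (nothing here assumes a specific Polymarket endpoint), and `minutes` is assumed to divide 60 evenly.

```python
import time
from datetime import datetime, timedelta, timezone

def next_boundary(now: datetime, minutes: int = 15) -> datetime:
    """Return the first clock-aligned boundary strictly after `now`."""
    floored = now.replace(minute=now.minute - now.minute % minutes,
                          second=0, microsecond=0)
    return floored + timedelta(minutes=minutes)

def run_collector(fetch_prices, write_rows, minutes: int = 15, max_ticks=None):
    """Sleep until each boundary, then stamp every row with the boundary time.

    fetch_prices() -> list of dicts with market_id and price (hypothetical;
    implement it against whichever API you use). write_rows(rows) persists them.
    """
    ticks = 0
    while max_ticks is None or ticks < max_ticks:
        target = next_boundary(datetime.now(timezone.utc), minutes)
        wait = (target - datetime.now(timezone.utc)).total_seconds()
        time.sleep(max(0.0, wait))
        # Use the boundary, not the fetch time, so all markets share one clock
        rows = [{**r, "timestamp": target} for r in fetch_prices()]
        write_rows(rows)
        ticks += 1
```

Stamping rows with the boundary rather than the wall-clock fetch time is a design choice: it trades a few seconds of timestamp accuracy for series that align without resampling.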

1. Load it

Assume a parquet with columns market_id, timestamp, price (the YES probability, 0–1), and resolved_yes (1/0, the eventual outcome).

import pandas as pd

df = pd.read_parquet("polymarket_history.parquet")
df["timestamp"] = pd.to_datetime(df["timestamp"], utc=True)
df = df.sort_values(["market_id", "timestamp"])

# Keep only resolved markets — you can't score an open one
resolved = df.dropna(subset=["resolved_yes"]).copy()
print(resolved["market_id"].nunique(), "resolved markets")
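Before scoring anything, it's worth a quick integrity pass over the table. The sketch below assumes the four columns described above; `check_history` is an illustrative helper name, not part of any library.

```python
import pandas as pd

def check_history(df: pd.DataFrame) -> dict:
    """Return basic integrity counts for a snapshot table."""
    required = {"market_id", "timestamp", "price", "resolved_yes"}
    missing = required - set(df.columns)
    if missing:
        raise ValueError(f"missing columns: {sorted(missing)}")
    # A market should carry at most one distinct resolution label
    labels = df.groupby("market_id")["resolved_yes"].nunique(dropna=True)
    return {
        "rows": len(df),
        "duplicate_snapshots": int(df.duplicated(["market_id", "timestamp"]).sum()),
        # between() is False for NaN, so missing prices are counted here too
        "price_out_of_range": int((~df["price"].between(0, 1)).sum()),
        "markets_with_conflicting_labels": int((labels > 1).sum()),
    }
```

Any non-zero count besides `rows` is a reason to stop and look at the raw data before trusting a backtest number.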

2. The strategy: fade the longshots

The favorite-longshot bias is one of the most-documented inefficiencies in betting markets: longshots (low-probability contracts) tend to be over-priced and favorites slightly under-priced. If the same bias holds on Polymarket, a contract trading at 8¢ should resolve YES less than 8% of the time on average. The naive edge is then to short the longshots (buy NO when YES is cheap).

Let's test whether that holds in the snapshots. Take each market's price at a fixed point before resolution, keep the markets whose price falls in a chosen band, and compare the band's average price to its realized outcome rate.

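# For each market, take the last snapshot at least 24h before its final timestamp.
# Note: the final snapshot is only a proxy for the actual resolution time.
end = resolved.groupby("market_id")["timestamp"].transform("max")
pre = resolved[resolved["timestamp"] <= end - pd.Timedelta(hours=24)]

# One row per market; markets with under 24h of history drop out here.
# This avoids groupby.apply with a function that sometimes returns None,
# and avoids a blanket dropna() that would discard rows with any missing column.
picks = pre.sort_values("timestamp").groupby("market_id").tail(1)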

# Longshot band: YES priced 2–15¢
band = picks[(picks["price"] >= 0.02) & (picks["price"] <= 0.15)]
implied = band["price"].mean()          # what the market said
realized = band["resolved_yes"].mean()  # what actually happened
print(f"implied YES {implied:.3f} vs realized YES {realized:.3f}")

If realized < implied, the longshots were overpriced — the bias is present and shorting them has positive expectancy before costs.
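A single band gives you one number and no sense of whether the gap is noise. A rough way to check, sketched below, is to tabulate implied vs realized across several bands and attach a standard error. The band edges are arbitrary choices of mine, and the standard error is computed under the null that prices are calibrated and that markets are independent, which correlated markets (many contracts on one event) will violate.

```python
import numpy as np
import pandas as pd

def calibration_table(picks: pd.DataFrame,
                      edges=(0.0, 0.05, 0.15, 0.35, 0.65, 0.85, 0.95, 1.0)):
    """Implied vs realized YES rate per price band, with a z-score for the gap."""
    bands = pd.cut(picks["price"], bins=list(edges), include_lowest=True).rename("band")
    out = (picks.groupby(bands, observed=True)
                .agg(n=("price", "size"),
                     implied=("price", "mean"),
                     realized=("resolved_yes", "mean")))
    out["gap"] = out["realized"] - out["implied"]
    # Binomial standard error if the market were perfectly calibrated (approximation)
    out["se"] = np.sqrt(out["implied"] * (1 - out["implied"]) / out["n"])
    out["z"] = out["gap"] / out["se"]
    return out

def brier_score(picks: pd.DataFrame) -> float:
    """Mean squared error between price and outcome; lower is better."""
    return float(((picks["price"] - picks["resolved_yes"]) ** 2).mean())
```

For the longshot story to hold up, `gap` should be negative in the low bands, with `z` well away from zero and enough markets in `n` to take it seriously.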

3. The part most backtests skip — costs and survivorship

A backtest without these three corrections is marketing, not research.

1. Spread + fees. You don't trade at the mid. On thin Polymarket longshots the bid/ask can be 2–4¢ wide. A 1¢ edge on a 6¢ contract evaporates the moment you cross a 3¢ spread. Always subtract a realistic fill cost from the implied price before scoring P&L.

2. Survivorship / resolution timing. If your archive only kept markets that resolved cleanly, you've dropped the messy ones (extended, disputed, voided) — and those aren't random. Score against every market that hit your entry filter, not just the tidy winners.

3. Liquidity ceiling. A 20¢ edge on a market with $300 of depth is a $60 edge, not a strategy. Weight every backtested position by the order-book depth at entry, or you'll "discover" an edge you can't actually fill. (This is exactly why snapshot data needs the order book, not just last-price — depth is the difference between a paper edge and a real one.)

Do these three and the favorite-longshot edge usually survives — but smaller than the raw number, and only in the deeper markets. That gap is the finding.
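To make the cost and depth corrections concrete, here's a sketch of per-market P&L for buying NO on each row of the longshot band. The `half_spread` and `fee` defaults are placeholders rather than measured Polymarket values, and `depth_col` names a hypothetical column of fillable USD depth at entry, which the four-column schema above does not include.

```python
import pandas as pd

def net_pnl_short_longshots(band: pd.DataFrame, half_spread: float = 0.015,
                            fee: float = 0.0, stake_usd: float = 100.0,
                            depth_col=None) -> pd.Series:
    """Dollar P&L per market from buying NO, after assumed costs.

    half_spread and fee are per-share costs in dollars (assumptions).
    depth_col, if given, caps each stake at the book depth in that column.
    """
    cost = (1.0 - band["price"]) + half_spread + fee  # price paid per NO share
    payoff = 1.0 - band["resolved_yes"]               # NO pays $1 if YES fails
    stake = pd.Series(stake_usd, index=band.index, dtype="float64")
    if depth_col is not None:
        stake = stake.clip(upper=band[depth_col])     # can't buy more than the book offers
    shares = stake / cost
    return shares * (payoff - cost)
```

Running it once with `depth_col=None` and once with a real depth column shows how much of the headline edge sits in markets you couldn't actually fill.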

4. Why 15-minute snapshots specifically

Tick data is overkill for this and a nightmare to store; daily closes are too coarse to catch the late convergence where most of the price action lives (markets snap toward 0/1 in the final hours). 15 minutes is the sweet spot: dense enough to study convergence and intraday moves, sparse enough that three months fits in a few hundred MB of parquet.
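You can check the late-convergence claim on your own copy of the data. The sketch below measures the average distance between price and eventual outcome, bucketed by hours before each market's last snapshot. It assumes the same columns as above and treats the last snapshot as a stand-in for resolution time; the bucket edges are arbitrary.

```python
import pandas as pd

def convergence_profile(resolved: pd.DataFrame,
                        bucket_hours=(0, 1, 3, 6, 12, 24, 72)) -> pd.Series:
    """Mean |price - outcome| by hours remaining before the last snapshot."""
    end = resolved.groupby("market_id")["timestamp"].transform("max")
    hours_out = (end - resolved["timestamp"]).dt.total_seconds() / 3600
    # Snapshots older than the last edge fall outside every bucket and are dropped
    buckets = pd.cut(hours_out, bins=list(bucket_hours), include_lowest=True)
    err = (resolved["price"] - resolved["resolved_yes"]).abs()
    return err.groupby(buckets, observed=True).mean()
```

If most of the drop in error happens inside the last few buckets, daily closes would miss it, which is the argument for the 15-minute cadence.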


If you'd rather not run a collector for three months before you can test a single idea, I've packaged the full archive — parquet + CSV, resolution labels included, one-time purchase:

👉 Polymarket Full Dataset — 18M+ price snapshots, 18,000+ markets

And the live market index is free to poke at first: protodex.io.

What strategy would you test first — favorite-longshot, late-convergence momentum, or cross-market arbitrage? Drop it in the comments and I'll point you at the columns you'd need.
