manja316

308 Labeled Polymarket Crash Trades — Free Dataset For Mean-Reversion Research

If you want to study mean-reversion on prediction markets, the data you actually need does not exist publicly. Most "Polymarket datasets" are either:

  • Synthetic — generated for academic papers, no real money behind them.
  • Aggregate — hourly volume and last-price across thousands of markets. Useless for tactical signal research.

So I built one and open-sourced it: cross-signal-data.

pip install cross-signal-data

from cross_signal_data import load
df = load()                          # pandas DataFrame, 308 rows
print(df["is_profitable"].mean())    # 0.802

These are the actual labeled outcomes of 308 closed trades from a live Polymarket crash-recovery bot, with the signal features and the resolved outcome for each trade.

Also mirrored on HuggingFace: huggingface.co/datasets/LuciferForge/cross-signal-data.


What's in the dataset

19 columns, one row per closed trade:

Column                    Description
trade_id                  Sequential 0-indexed
market_id                 Polymarket market ID (queryable via gamma-api.polymarket.com)
question                  Market question text
outcome_label             YES/NO outcome the bot bet on
entry_time                When the crash signal fired (ISO-8601 UTC)
exit_time                 When the position closed
entry_price               Per-share price at entry (0–1, Polymarket prices are probabilities)
exit_price                Per-share price at exit
pre_crash_high            Recent local-window high before the crash trigger
drop_pct                  (pre_crash_high − entry_price) / pre_crash_high × 100
size_usd                  USD allocated (typically $5)
shares                    Share count purchased
hold_hours                Wall-clock hours from entry to exit
pnl_usd                   Realized P&L (theoretical, see below)
is_profitable             1 if pnl_usd > 0 else 0
exit_reason               RECOVERY / TIMEOUT_48H / TIMEOUT
entry_hour_utc            Hour-of-day at entry
entry_dow                 Day-of-week at entry (0=Monday)
recovered_to_pct_of_high  exit_price / pre_crash_high × 100
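
The derived columns follow directly from the raw price fields. Here is a minimal sketch of that arithmetic in plain Python, so it runs without the dataset. The helper name `derive_fields` and the sample numbers are illustrative. The `shares` and `pnl_usd` lines assume full fills at the recorded prices and no fees, which this post does not spell out.

```python
def derive_fields(entry_price, exit_price, pre_crash_high, size_usd):
    """Recompute the derived columns for one trade (hypothetical helper)."""
    shares = size_usd / entry_price  # assumption: whole allocation filled at entry_price
    drop_pct = (pre_crash_high - entry_price) / pre_crash_high * 100
    recovered = exit_price / pre_crash_high * 100
    pnl_usd = shares * (exit_price - entry_price)  # assumption: no fees, full fills
    return {
        "shares": shares,
        "drop_pct": drop_pct,
        "recovered_to_pct_of_high": recovered,
        "pnl_usd": pnl_usd,
        "is_profitable": 1 if pnl_usd > 0 else 0,
    }

# Made-up trade: market fell from 0.25 to 0.20, bot bought $5, sold at 0.225.
row = derive_fields(entry_price=0.20, exit_price=0.225, pre_crash_high=0.25, size_usd=5.0)
for name, value in row.items():
    print(f"{name:>26}: {value:.3f}")
```

In this made-up case the entry sits exactly 20% below the high and the exit at 90% of it, which is the shape of a typical RECOVERY row.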

Aggregate stats

  • 308 trades, 247 profitable (80.2% WR)
  • Date range: March 2026 – April 2026
  • Median hold: ~3 hours
  • Average drop_pct at entry: ~22%
  • Average recovery: ~85% of pre-crash high

Exit reason distribution

Reason       Count  What it means
RECOVERY     235    Price climbed back to ~90% of pre-crash high. Took profit.
TIMEOUT_48H  62     Held 48 hours without recovery. Sold at whatever the bid offered.
TIMEOUT      11     Older shorter-window timeout from earlier in the dataset.

Sports markets where the team had already lost the underlying game often end up in TIMEOUT_48H. So do political markets that crashed because the resolution fundamentals shifted, not just because of momentary panic. The bot's job is to filter those out before entering; the dataset shows where the filter fails.
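
The exit counts can be cross-checked against the headline numbers. This small sketch uses only figures quoted in this post. The last step assumes every RECOVERY exit was profitable, which is plausible (the exit sits above the entry) but not stated explicitly.

```python
exit_counts = {"RECOVERY": 235, "TIMEOUT_48H": 62, "TIMEOUT": 11}
total_trades = 308
profitable_trades = 247

assert sum(exit_counts.values()) == total_trades  # the three reasons cover every row

for reason, count in exit_counts.items():
    print(f"{reason:<12} {count:>4}  {count / total_trades:6.1%}")

# Assumption: all RECOVERY exits are winners. Then the remaining winners
# must have come from timeout exits that still sold above the entry price.
profitable_timeouts = profitable_trades - exit_counts["RECOVERY"]
timeouts = exit_counts["TIMEOUT_48H"] + exit_counts["TIMEOUT"]
print(f"profitable timeouts: {profitable_timeouts} of {timeouts}")
```

With the dataset loaded, `pd.crosstab(df["exit_reason"], df["is_profitable"])` gives the exact split without that assumption.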


How I used it

Loaded the data with the bundled loader, ran a logistic regression and a random forest, got 79.9% cross-validated accuracy from 7 features:

Feature         RF importance
drop_pct        0.254
shares          0.200
entry_price     0.174
pre_crash_high  0.171
entry_hour_utc  0.110
entry_dow       0.059
size_usd        0.031

Translation: the bot's trigger filter is doing almost all of the work. The 79.9% accuracy is essentially the dataset's 80.2% base rate, so a simple model that learns "crashes with bigger drop_pct in the right time-of-day window are more likely to recover" reproduces the bot's win rate but does not improve on it. There's no obvious feature-engineering trick that beats the trigger.
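
One way to read the 79.9% figure is against a no-skill baseline: a classifier that always predicts "profitable". This minimal sketch uses only the counts quoted in this post; the helper name `majority_baseline` is illustrative.

```python
def majority_baseline(n_positive, n_total):
    """Accuracy of a classifier that always predicts the more common class."""
    p = n_positive / n_total
    return max(p, 1 - p)

baseline = majority_baseline(n_positive=247, n_total=308)
model_cv_accuracy = 0.799  # cross-validated accuracy quoted above

print(f"always-profitable baseline: {baseline:.3f}")
print(f"model minus baseline:       {model_cv_accuracy - baseline:+.3f}")
```

Accuracy is a blunt metric at an 80/20 class split. A ranking metric such as ROC AUC, or precision on the losing class, would likely say more about whether the features separate winners from losers.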

The diurnal pattern is interesting. Hours 16, 21, 22 UTC have ~100% WR (small samples). Hour 8 UTC dips to ~55%. Off-peak hours (when US/EU traders are asleep, books are thin) are punishing.

df.groupby("entry_hour_utc")["is_profitable"].mean()

Run that yourself and see.
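
Because several hours have only a handful of trades, it helps to look at counts and an interval next to each mean. This rough sketch uses a Wilson score interval, which is one reasonable choice here and not something the package provides. The sample rows are made up so the snippet runs without the dataset.

```python
import math
from collections import defaultdict

def wilson_interval(wins, n, z=1.96):
    """Approximate 95% Wilson score interval for a win rate."""
    if n == 0:
        return (0.0, 1.0)
    p = wins / n
    denom = 1 + z**2 / n
    centre = (p + z**2 / (2 * n)) / denom
    half = z * math.sqrt(p * (1 - p) / n + z**2 / (4 * n**2)) / denom
    return (max(0.0, centre - half), min(1.0, centre + half))

# Made-up (entry_hour_utc, is_profitable) pairs standing in for the real rows.
sample = [(16, 1), (16, 1), (16, 1), (8, 1), (8, 0), (8, 1), (8, 0), (8, 1)]

by_hour = defaultdict(lambda: [0, 0])  # hour -> [wins, trades]
for hour, won in sample:
    by_hour[hour][0] += won
    by_hour[hour][1] += 1

for hour in sorted(by_hour):
    wins, n = by_hour[hour]
    lo, hi = wilson_interval(wins, n)
    print(f"hour {hour:02d}: {wins}/{n} wins, WR {wins / n:.0%}, 95% CI {lo:.0%} to {hi:.0%}")
```

A 3-for-3 hour still has a lower bound of roughly 44%, so a "100% WR" bucket with few trades is weak evidence. On the real DataFrame, `df.groupby("entry_hour_utc")["is_profitable"].agg(["sum", "count"])` yields the wins and counts to feed into the interval.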


Important caveat: theoretical P&L vs on-chain P&L

The pnl_usd column is theoretical — computed from the bot's recorded entry_price and exit_price. This assumes you got every share filled at those prices. In practice on thin Polymarket books, fills come in slightly worse, especially for TIMEOUT exits.

I built a separate audit tool that reconciles the bot's records against on-chain fills: pnl-truthteller. On this same 308-trade dataset, it surfaces:

Theoretical P&L:  +$33.49
Actual P&L:       -$89.01
Slippage cost:    -$122.50  (-365.8% of theoretical)

So the bot has an 80.2% trigger-level WR but ends up underwater (-$89.01) once slippage is included. That gap matters more than the trigger itself: it suggests the exit ladder strategy was walking thin books down. That's an interesting research question, and exactly the kind of thing the dataset enables.
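
The audit figures are related by simple arithmetic, sketched here from the numbers quoted above. The per-trade average is my own derived figure and assumes slippage is spread evenly, which it almost certainly is not.

```python
theoretical_pnl = 33.49   # from recorded entry/exit prices
actual_pnl = -89.01       # from on-chain fills
n_trades = 308

slippage = actual_pnl - theoretical_pnl
slippage_pct = slippage / theoretical_pnl * 100

print(f"slippage cost: {slippage:+.2f} USD")
print(f"relative to theoretical: {slippage_pct:+.1f}%")
print(f"average slippage per trade: {slippage / n_trades:+.3f} USD")
```

Spread over 308 trades that is roughly $0.40 per trade, or about 8% of a typical $5 position, which is why a high win rate alone does not settle profitability.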

pip install pnl-truthteller
pnl-truthteller --wallet 0xYourProxyAddress

If you build a strategy on top of the dataset, run pnl-truthteller against your live wallet too. Otherwise you'll think you're profitable when you aren't.


What this dataset is good for

  • Mean-reversion alpha studies — does crash-recovery actually work? At what drop_pct does it start working? The data has all the inputs.
  • Time-of-day effects: entry_hour_utc × is_profitable reveals diurnal patterns.
  • Hold-time tradeoffs — the win-rate vs hold-hours curve is in here.
  • Feature-engineering exercises — if you can predict is_profitable better than 80% accuracy from these features, you've found something.
  • Backtesting frameworks — real labeled data with real prices, suitable for cross-validation.
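
As a starting point for the "at what drop_pct does it start working?" question, here is a small bucketing sketch in plain Python. The rows and bucket edges are made up for illustration, and `win_rate_by_bucket` is a hypothetical helper, not part of the package.

```python
from collections import defaultdict

def win_rate_by_bucket(rows, edges):
    """Group (drop_pct, is_profitable) rows into [lo, hi) buckets and return win rates."""
    stats = defaultdict(lambda: [0, 0])  # (lo, hi) -> [wins, trades]
    for drop_pct, won in rows:
        for lo, hi in zip(edges, edges[1:]):
            if lo <= drop_pct < hi:
                stats[(lo, hi)][0] += won
                stats[(lo, hi)][1] += 1
                break
    return {bucket: (wins / n, n) for bucket, (wins, n) in sorted(stats.items())}

# Made-up rows for illustration only; substitute the real columns.
rows = [(21.0, 1), (22.5, 0), (24.0, 1), (27.0, 1), (31.0, 1), (33.0, 1), (45.0, 0)]
edges = [20, 25, 30, 40, 100]  # hypothetical bucket boundaries

result = win_rate_by_bucket(rows, edges)
for (lo, hi), (wr, n) in result.items():
    print(f"drop {lo:>3}-{hi:<3}%: WR {wr:.0%} over {n} trades")
```

On the real DataFrame the same grouping can be done with `pd.cut(df["drop_pct"], bins=edges, right=False)` followed by a `groupby`. Keeping the trade count next to each win rate guards against over-reading thin buckets.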

What it's NOT good for

  • General Polymarket research. Too narrow a slice (one bot, one signal, two months).
  • High-frequency studies. Only entry/exit timestamps, not tick-level.
  • Counterfactuals ("what would a different bot have done?"). Only triggered trades are recorded.

Known biases

1. Survivorship in the trigger

The dataset only contains markets where the trigger fired (>20% drop, $0.04–$0.30 entry range). With a different threshold, you'd see a different set of markets.

2. Selection in the entry-price band

Most rows are concentrated in $0.04–$0.30. Markets that crashed from $0.80 → $0.50 are absent (above the range). Markets at $0.02 are absent (below the floor).
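
To make the selection effect concrete, here is a sketch of the filter as described in this post. The function name, the inclusive band edges, and the strict drop comparison are all assumptions; the bot's real rule may differ in those details.

```python
DROP_THRESHOLD_PCT = 20.0             # trigger fires on drops larger than this
ENTRY_MIN, ENTRY_MAX = 0.04, 0.30     # entry-price band quoted above

def would_trigger(pre_crash_high, entry_price):
    """Return True if a crash would pass the filter described in this post.

    Assumption: band edges are inclusive and the drop test is strict.
    """
    drop_pct = (pre_crash_high - entry_price) / pre_crash_high * 100
    return drop_pct > DROP_THRESHOLD_PCT and ENTRY_MIN <= entry_price <= ENTRY_MAX

examples = {
    "0.25 -> 0.18": (0.25, 0.18),   # 28% drop, inside the band
    "0.80 -> 0.50": (0.80, 0.50),   # big drop, but entry above the band
    "0.05 -> 0.02": (0.05, 0.02),   # big drop, but entry below the floor
    "0.20 -> 0.17": (0.20, 0.17),   # inside the band, drop too small (15%)
}
for label, (high, entry) in examples.items():
    print(f"{label}: {would_trigger(high, entry)}")
```

Only the first case would appear as a row in the dataset; the other three are the kinds of crashes the data says nothing about.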

3. Theoretical PnL ≠ realized PnL

See above. Use pnl-truthteller for slippage-adjusted analysis.

4. Time period

March–April 2026. Includes one Polymarket V1 → V2 migration window, various political events specific to the period, and Polygon-specific gas conditions.

Don't assume the patterns extrapolate forward indefinitely. Re-run the dataset extraction quarterly as it grows.
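
One way to respect this caveat when validating a model is to split chronologically rather than randomly, so the evaluation only uses trades that came after the ones used for fitting. This is a minimal sketch with made-up rows; the cutoff date and the helper name `chronological_split` are hypothetical.

```python
from datetime import datetime, timezone

def chronological_split(rows, cutoff):
    """Split trades into (before cutoff, on or after cutoff) by entry_time."""
    ordered = sorted(rows, key=lambda r: r["entry_time"])
    earlier = [r for r in ordered if r["entry_time"] < cutoff]
    later = [r for r in ordered if r["entry_time"] >= cutoff]
    return earlier, later

# Made-up rows; timestamps use an explicit +00:00 offset so that
# datetime.fromisoformat parses them as timezone-aware UTC values.
rows = [
    {"entry_time": datetime.fromisoformat("2026-04-02T21:05:00+00:00"), "is_profitable": 1},
    {"entry_time": datetime.fromisoformat("2026-03-14T09:30:00+00:00"), "is_profitable": 1},
    {"entry_time": datetime.fromisoformat("2026-03-28T16:10:00+00:00"), "is_profitable": 0},
]

cutoff = datetime(2026, 4, 1, tzinfo=timezone.utc)  # hypothetical split date
march, april = chronological_split(rows, cutoff)
print(f"fit on {len(march)} earlier trades, evaluate on {len(april)} later trades")
```

If accuracy on the later slice falls well below the earlier one, that is a hint the pattern is period-specific.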


Reproducibility

The script that generated the dataset from the bot's positions.json is checked in: scripts/extract.py. Anyone with the bot's source data can rerun it and get the same output.

git clone https://github.com/LuciferForge/cross-signal-data
cd cross-signal-data
python scripts/extract.py \
    --positions /path/to/positions.json \
    --output data/crashes_v1.csv

The dataset file is also bundled inside the pip package — cross_signal_data.load() returns the data without any external download.


License & citation

MIT. Use it, fork it, train on it, build a competitor strategy. The chain is public; the data is public; the code is public.

If you publish research using it:

@dataset{cross_signal_data_2026,
    title  = {cross-signal-data: Polymarket crash-recovery labeled dataset},
    author = {LuciferForge},
    year   = {2026},
    url    = {https://github.com/LuciferForge/cross-signal-data}
}

Resources

If you build a model that beats 80% on this dataset, I want to know what feature you used. The bot's edge is mine until someone finds a better one.


LuciferForge runs a publicly audited Polymarket trading bot, protodex.io (5,800+ MCP servers indexed), and the free Polymarket data API.
