If you want to study mean-reversion on prediction markets, the data you actually need does not exist publicly. Most "Polymarket datasets" are either:
- Synthetic — generated for academic papers, no real money behind them.
- Aggregate — hourly volume and last-price across thousands of markets. Useless for tactical signal research.
So I built one and open-sourced it: cross-signal-data.
```bash
pip install cross-signal-data
```

```python
from cross_signal_data import load

df = load()                        # pandas DataFrame, 308 rows
print(df["is_profitable"].mean())  # 0.802
```
These are the labeled outcomes of 308 closed trades from a live Polymarket crash-recovery bot, with the signal features and the resolved outcome for each trade.
Also mirrored on HuggingFace: huggingface.co/datasets/LuciferForge/cross-signal-data.
## What's in the dataset
19 columns, one row per closed trade:
| Column | Description |
|---|---|
| `trade_id` | Sequential, 0-indexed |
| `market_id` | Polymarket market ID (queryable via gamma-api.polymarket.com) |
| `question` | Market question text |
| `outcome_label` | YES/NO outcome the bot bet on |
| `entry_time` | When the crash signal fired (ISO-8601 UTC) |
| `exit_time` | When the position closed |
| `entry_price` | Per-share price at entry (0–1; Polymarket prices are probabilities) |
| `exit_price` | Per-share price at exit |
| `pre_crash_high` | Recent local-window high before the crash trigger |
| `drop_pct` | (pre_crash_high − entry_price) / pre_crash_high × 100 |
| `size_usd` | USD allocated (typically $5) |
| `shares` | Share count purchased |
| `hold_hours` | Wall-clock hours from entry to exit |
| `pnl_usd` | Realized P&L (theoretical; see below) |
| `is_profitable` | 1 if pnl_usd > 0, else 0 |
| `exit_reason` | RECOVERY / TIMEOUT_48H / TIMEOUT |
| `entry_hour_utc` | Hour of day at entry |
| `entry_dow` | Day of week at entry (0 = Monday) |
| `recovered_to_pct_of_high` | exit_price / pre_crash_high × 100 |
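The two derived columns follow directly from the price columns, so they are easy to sanity-check. A minimal sketch on toy rows that mimic the schema (these are made-up numbers, not real trades; the real frame comes from `cross_signal_data.load()`):

```python
import pandas as pd

# Toy rows mimicking the dataset schema (not real trades)
df = pd.DataFrame({
    "entry_price":    [0.12, 0.20],
    "exit_price":     [0.27, 0.15],
    "pre_crash_high": [0.30, 0.25],
    "pnl_usd":        [1.25, -0.50],
})

# Recompute the derived columns exactly as defined in the table above
df["drop_pct"] = (df["pre_crash_high"] - df["entry_price"]) / df["pre_crash_high"] * 100
df["recovered_to_pct_of_high"] = df["exit_price"] / df["pre_crash_high"] * 100
df["is_profitable"] = (df["pnl_usd"] > 0).astype(int)

print(df[["drop_pct", "recovered_to_pct_of_high", "is_profitable"]])
```

Recomputing these against the shipped columns is a cheap integrity check before building anything on top.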
## Aggregate stats
- 308 trades, 247 profitable (80.2% WR)
- Date range: March 2026 – April 2026
- Median hold: ~3 hours
- Average drop_pct at entry: ~22%
- Average recovery: ~85% of pre-crash high
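All of the headline numbers above reduce to one-liners on the frame. A sketch of the calls, on a toy stand-in with the same column names (real values come from `load()`):

```python
import pandas as pd

# Toy stand-in for cross_signal_data.load() (not the real 308 trades)
df = pd.DataFrame({
    "is_profitable":             [1, 1, 1, 0, 1],
    "hold_hours":                [2.0, 3.0, 48.0, 48.0, 1.5],
    "drop_pct":                  [25.0, 30.0, 21.0, 40.0, 22.0],
    "recovered_to_pct_of_high":  [92.0, 90.0, 40.0, 35.0, 91.0],
})

print(len(df), df["is_profitable"].mean())    # trade count, win rate
print(df["hold_hours"].median())              # median hold
print(df["drop_pct"].mean())                  # average drop at entry
print(df["recovered_to_pct_of_high"].mean())  # average recovery
```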
## Exit reason distribution
| Reason | Count | What it means |
|---|---|---|
| RECOVERY | 235 | Price climbed back to ~90% of pre-crash high. Took profit. |
| TIMEOUT_48H | 62 | Held 48 hours without recovery. Sold at whatever the bid offered. |
| TIMEOUT | 11 | Older shorter-window timeout from earlier in the dataset. |
Sports markets where the team had already lost the underlying game often end up in TIMEOUT_48H. So do political markets that crashed because the resolution fundamentals shifted, not just because of momentary panic. The bot's job is to filter those out before entering; the dataset shows where the filter fails.
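One quick way to see where the filter fails is to split win rate by exit path. A sketch on toy rows (the real counts are 235 / 62 / 11, as in the table above):

```python
import pandas as pd

# Toy rows; the real frame comes from cross_signal_data.load()
df = pd.DataFrame({
    "exit_reason":   ["RECOVERY", "RECOVERY", "RECOVERY", "TIMEOUT_48H", "TIMEOUT"],
    "is_profitable": [1, 1, 1, 0, 0],
})

# Count and win rate per exit path in one pass
summary = df.groupby("exit_reason")["is_profitable"].agg(count="count", win_rate="mean")
print(summary)
```

On the real data this makes the asymmetry explicit: RECOVERY exits carry the win rate, the timeout buckets carry the losses.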
## How I used it
I loaded the data with the bundled loader, ran a logistic regression and a random forest, and got 79.9% cross-validated accuracy from 7 features:
| Feature | RF importance |
|---|---|
| `drop_pct` | 0.254 |
| `shares` | 0.200 |
| `entry_price` | 0.174 |
| `pre_crash_high` | 0.171 |
| `entry_hour_utc` | 0.110 |
| `entry_dow` | 0.059 |
| `size_usd` | 0.031 |
Translation: the bot's trigger filter is doing 100% of the work. A simple model that just learns "crashes with bigger drop_pct in the right time-of-day window are more likely to recover" basically reproduces the bot's actual win rate. There's no obvious feature engineering trick that beats the trigger.
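The experiment above is a standard cross-validation plus feature-importance run. A sketch of the shape of it, on synthetic stand-in features (NOT the real data, so the scores here mean nothing; it assumes scikit-learn is installed):

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(0)
n = 308

# Synthetic stand-ins for the 7 features (same order as the table above)
X = np.column_stack([
    rng.uniform(20, 60, n),       # drop_pct
    rng.uniform(10, 120, n),      # shares
    rng.uniform(0.04, 0.30, n),   # entry_price
    rng.uniform(0.05, 0.40, n),   # pre_crash_high
    rng.integers(0, 24, n),       # entry_hour_utc
    rng.integers(0, 7, n),        # entry_dow
    np.full(n, 5.0),              # size_usd
])
# Label loosely tied to drop_pct, mimicking "bigger drops recover more often"
y = (X[:, 0] + rng.normal(0, 10, n) > 38).astype(int)

rf = RandomForestClassifier(n_estimators=200, random_state=0)
scores = cross_val_score(rf, X, y, cv=5)
print(f"cross-validated accuracy: {scores.mean():.3f}")

rf.fit(X, y)
for name, imp in zip(
    ["drop_pct", "shares", "entry_price", "pre_crash_high",
     "entry_hour_utc", "entry_dow", "size_usd"],
    rf.feature_importances_,
):
    print(f"{name:15s} {imp:.3f}")
```

Swap the synthetic `X`/`y` for the columns from `load()` and you reproduce the table above.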
The diurnal pattern is interesting. Hours 16, 21, 22 UTC have ~100% WR (small samples). Hour 8 UTC dips to ~55%. Off-peak hours (when US/EU traders are asleep, books are thin) are punishing.
```python
df.groupby("entry_hour_utc")["is_profitable"].mean()
```
Run that yourself and see.
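Since several of those hours are thin, it helps to pull the sample count alongside the mean so ~100%-WR-on-three-trades hours are obvious. A sketch on toy data (the real frame comes from `load()`):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(2)
# Toy stand-in with the two relevant columns
df = pd.DataFrame({
    "entry_hour_utc": rng.integers(0, 24, 300),
    "is_profitable":  rng.integers(0, 2, 300),
})

# Count and win rate per hour, so thin hours are visible next to their WR
by_hour = df.groupby("entry_hour_utc")["is_profitable"].agg(n="count", win_rate="mean")
print(by_hour)
```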
## Important caveat: theoretical P&L vs on-chain P&L
The pnl_usd column is theoretical — computed from the bot's recorded entry_price and exit_price. This assumes you got every share filled at those prices. In practice on thin Polymarket books, fills come in slightly worse, especially for TIMEOUT exits.
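To make the assumption concrete: theoretical P&L is just shares times the recorded price delta, and a per-share slippage haircut on both legs eats into it directly. A sketch with assumed numbers (the prices and 1-cent haircut here are illustrative, not from the dataset):

```python
# One trade's theoretical P&L from recorded prices (assumed numbers)
shares = 40.0
entry_price, exit_price = 0.125, 0.270

theoretical = shares * (exit_price - entry_price)

# A per-share slippage haircut on both legs (1 cent assumed here):
# you buy slightly above the recorded entry and sell slightly below the recorded exit
slip = 0.01
actual = shares * ((exit_price - slip) - (entry_price + slip))

print(round(theoretical, 2), round(actual, 2))
```

On penny-priced shares even a 1-cent haircut per side is a large fraction of the edge, which is why the aggregate gap below is so big.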
I built a separate audit tool that reconciles the bot's records against on-chain fills: pnl-truthteller. On this same 308-trade dataset, it surfaces:
```
Theoretical P&L:  +$33.49
Actual P&L:       -$89.01
Slippage cost:   -$122.50  (-365.8% of theoretical)
```
So the bot has an 80.2% trigger-level win rate but is underwater once slippage is included. That gap is worth more than the trigger itself: it tells you the exit ladder strategy was walking thin books down. Interesting research question, and exactly the kind of thing the dataset enables.
```bash
pip install pnl-truthteller
pnl-truthteller --wallet 0xYourProxyAddress
```
If you build a strategy on top of the dataset, run pnl-truthteller against your live wallet too. Otherwise you'll think you're profitable when you aren't.
## What this dataset is good for
- **Mean-reversion alpha studies**: does crash-recovery actually work? At what drop_pct does it start working? The data has all the inputs.
- **Time-of-day effects**: cross `entry_hour_utc` with `is_profitable` to reveal diurnal patterns.
- **Hold-time tradeoffs**: the win-rate vs hold-hours curve is in here.
- **Feature-engineering exercises**: if you can predict `is_profitable` with better than 80% accuracy from these features, you've found something.
- **Backtesting frameworks**: real labeled data with real prices, suitable for cross-validation.
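The hold-time tradeoff mentioned above is a one-liner with binned hours. A sketch on toy data (bin edges are my choice, not from the dataset):

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(3)
# Toy stand-in; the real columns come from cross_signal_data.load()
df = pd.DataFrame({
    "hold_hours":    rng.exponential(5.0, 300).clip(0.1, 48.0),
    "is_profitable": rng.integers(0, 2, 300),
})

# Bucket hold time, then win rate and count per bucket
bins = pd.cut(df["hold_hours"], [0, 1, 3, 6, 12, 24, 48])
curve = df.groupby(bins, observed=True)["is_profitable"].agg(n="count", win_rate="mean")
print(curve)
```

On the real data the long-hold buckets are dominated by the timeout exits, so the curve falls off sharply past the median hold.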
## What it's NOT good for
- General Polymarket research. Too narrow a slice (one bot, one signal, two months).
- High-frequency studies. Only entry/exit timestamps, not tick-level.
- Counterfactuals ("what would a different bot have done?"). Only triggered trades are recorded.
## Known biases
1. **Survivorship in the trigger.** The dataset only contains markets where the trigger fired (>20% drop, $0.04–$0.30 entry range). A different threshold would surface different markets.
2. **Selection in the entry-price band.** Most rows are concentrated in $0.04–$0.30. Markets that crashed from $0.80 → $0.50 are absent (above the range); markets at $0.02 are absent (below the floor).
3. **Theoretical PnL ≠ realized PnL.** See the caveat above; use pnl-truthteller for slippage-adjusted analysis.
4. **Time period.** March–April 2026, which includes one Polymarket V1 → V2 migration window, political events specific to the period, and Polygon-specific gas conditions.
Don't assume the patterns extrapolate forward indefinitely. Re-run the dataset extraction quarterly as it grows.
## Reproducibility
The script that generated the dataset from the bot's positions.json is checked in: scripts/extract.py. Anyone with the bot's source data can rerun it and get the same output.
```bash
git clone https://github.com/LuciferForge/cross-signal-data
cd cross-signal-data
python scripts/extract.py \
    --positions /path/to/positions.json \
    --output data/crashes_v1.csv
```
The dataset file is also bundled inside the pip package — cross_signal_data.load() returns the data without any external download.
## License & citation
MIT. Use it, fork it, train on it, build a competitor strategy. The chain is public; the data is public; the code is public.
If you publish research using it:
```bibtex
@dataset{cross_signal_data_2026,
  title  = {cross-signal-data: Polymarket crash-recovery labeled dataset},
  author = {LuciferForge},
  year   = {2026},
  url    = {https://github.com/LuciferForge/cross-signal-data}
}
```
## Resources
- Repo: github.com/LuciferForge/cross-signal-data
- PyPI: `pip install cross-signal-data` (pypi.org/project/cross-signal-data)
- HuggingFace mirror: huggingface.co/datasets/LuciferForge/cross-signal-data
- Slippage audit tool: `pip install pnl-truthteller` (GitHub)
- Bot source: github.com/LuciferForge/polymarket-crash-bot, the same bot that produced this data
If you build a model that beats 80% on this dataset, I want to know what feature you used. The bot's edge is mine until someone finds a better one.
LuciferForge runs a public-audited Polymarket trading bot, protodex.io (5,800+ MCP servers indexed), and the free Polymarket data API.