DEV Community

vesper_finch

Building a Prediction Market Arbitrage Scanner in Python: Architecture and Results

Prediction markets are one of the few areas where AI automation can generate real alpha — not because AI is smarter at predicting events, but because it can analyze structural mispricings across thousands of markets simultaneously.

I built a scanner that does exactly this. Here's the technical breakdown.

The Problem

Polymarket has 750+ active events and 11,000+ markets. No human can monitor all of them. But three types of structural mispricings can be detected programmatically:

1. Outcome Sum Errors

For mutually exclusive events ("Who wins the election?"), the Yes prices across all outcomes should sum to 1.0. Any deviation is a potential risk-free profit, before fees, spreads, and liquidity limits.

from polymarket_scanner import PolymarketClient, ArbitrageScanner

client = PolymarketClient()
scanner = ArbitrageScanner(client)

events = client.get_events(max_pages=15)
opportunities = scanner.scan_exclusive_outcomes(events)

for opp in opportunities:
    if opp.confidence == "high":
        print(f"{opp.description}")
        print(f"  Profit: {opp.estimated_profit_pct}%")
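To make the arithmetic behind the outcome-sum check concrete, here is a minimal sketch. It is not the toolkit's implementation: `outcome_sum_deviation`, `describe`, the 0.5% tolerance, and the sample prices are all assumptions made for illustration, and it ignores fees, spreads, and order-book depth. It also assumes each No price is roughly 1 minus the Yes price.

```python
def outcome_sum_deviation(yes_prices):
    """Return how far the Yes prices of one event sum away from 1.0."""
    return sum(yes_prices) - 1.0


def describe(yes_prices, tolerance=0.005):
    """Classify a set of mutually exclusive outcomes by their price sum."""
    deviation = outcome_sum_deviation(yes_prices)
    if abs(deviation) <= tolerance:
        return "fair"
    # Sum below 1: buying every Yes costs less than the 1.0 payout.
    # Sum above 1: buying every No is the cheap side instead.
    return "buy all Yes" if deviation < 0 else "buy all No"


# Hypothetical three-candidate event (prices invented for illustration)
prices = [0.40, 0.35, 0.20]
print(round(outcome_sum_deviation(prices), 4))
print(describe(prices))
```

With these invented prices, all three Yes contracts cost 0.95 in total and exactly one of them pays out 1.0, which is where the roughly 5% edge comes from.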

2. Ladder Contradictions

Threshold markets must follow monotonicity: for K1 < K2, P("BTC > K1") >= P("BTC > K2").

Violations are logical impossibilities, and they appear more often than you'd think — I found 45 in a single scan.

ladders = scanner.scan_ladder_contradictions(events)

for opp in ladders:
    print(f"{opp.description}")
    print(f"  Strategy: {opp.strategy}")
    print(f"  Deviation: {opp.deviation_pct}%")
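The monotonicity rule itself is simple to check once a ladder is sorted by threshold. The sketch below is one possible way to do it, not necessarily what `scan_ladder_contradictions` does internally; the function name, the `(threshold, yes_price)` tuple layout, and the BTC prices are assumptions for illustration.

```python
def find_ladder_violations(ladder):
    """ladder: list of (threshold, yes_price) for "asset > threshold" markets.

    Returns (lower_threshold, higher_threshold, price_gap) for each pair of
    adjacent rungs where the higher threshold is priced above the lower one.
    """
    rungs = sorted(ladder)  # ascending by threshold
    violations = []
    for (k_low, p_low), (k_high, p_high) in zip(rungs, rungs[1:]):
        if p_high > p_low:
            violations.append((k_low, k_high, round(p_high - p_low, 4)))
    return violations


# Hypothetical BTC ladder (thresholds and prices invented)
ladder = [(100_000, 0.42), (90_000, 0.55), (110_000, 0.47), (120_000, 0.10)]
print(find_ladder_violations(ladder))
```

Checking adjacent rungs is enough: if any two rungs in the sorted ladder are out of order, at least one adjacent pair between them must be too.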

3. Cross-Market Implications

"Trump wins" implies "Republican wins". If P(Trump) > P(Republican), that's a contradiction.

Finding these requires comparing markets across different events — an O(n²) problem that becomes tractable with keyword matching or embeddings.
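As a rough illustration of the keyword-matching route, the sketch below compares every ordered pair of markets against a small rule table. The `IMPLIES` rules, the function name, and the sample questions and prices are all hypothetical; a real rule set would need careful curation, and an embedding-based matcher would replace the substring test.

```python
from itertools import permutations

# Hypothetical implication rules: if the first phrase resolves Yes,
# the second must resolve Yes as well.
IMPLIES = [("trump wins", "republican wins")]


def find_implication_violations(markets, rules=IMPLIES):
    """markets: list of (question, yes_price). Returns violating pairs."""
    violations = []
    # Pairwise comparison: O(n^2) in the number of markets
    for (q_a, p_a), (q_b, p_b) in permutations(markets, 2):
        for narrow, broad in rules:
            if narrow in q_a.lower() and broad in q_b.lower() and p_a > p_b:
                violations.append((q_a, q_b, round(p_a - p_b, 4)))
    return violations


markets = [
    ("Trump wins the presidency?", 0.54),
    ("Republican wins the presidency?", 0.51),
    ("Democrat wins the presidency?", 0.47),
]
print(find_implication_violations(markets))
```

In practice the keyword filter would run first to shrink the candidate set, so the quadratic comparison only touches markets that share a topic.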

Architecture Decisions

Why dataclasses, not dicts?

Every market and opportunity is a typed dataclass. This makes the code self-documenting and lets a type checker or IDE flag mistakes, such as a misspelled field, during development rather than at runtime.

from dataclasses import dataclass

@dataclass
class Market:
    question: str
    yes_price: float
    no_price: float
    liquidity: float
    volume_24h: float
    condition_id: str

@dataclass
class Opportunity:
    type: str
    description: str
    markets: list
    deviation_pct: float
    estimated_profit_pct: float
    min_liquidity: float
    strategy: str
    confidence: str  # "high", "medium", "low"

Why confidence levels?

Not all "mispricings" are real. "Which teams make the playoffs?" has 30 markets summing to 16 — because 16 teams qualify. Without confidence filtering, you'd trade on noise.
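One way such a filter could work is to look at what the price sum is close to. The heuristic below is only a guess at the idea, not the scanner's actual rule: the function name, the thresholds, and the playoff-style prices are assumptions. A sum near 1 suggests a genuine single-winner event, while a sum near a larger whole number suggests several outcomes can resolve Yes.

```python
def classify_confidence(yes_prices, tolerance=0.05):
    """Heuristic confidence that an outcome-sum deviation is a real mispricing.

    "high":   sum is close to 1, so the event looks single-winner
    "low":    sum is close to an integer k > 1, which suggests k winners
    "medium": anything else
    """
    total = sum(yes_prices)
    if abs(total - 1.0) <= tolerance:
        return "high"
    nearest = round(total)
    # Allow more slack as the number of expected winners grows
    if nearest > 1 and abs(total - nearest) <= tolerance * nearest:
        return "low"
    return "medium"


# Hypothetical playoff-style event: 30 teams, 16 of them near-certain
playoffs = [0.9] * 16 + [0.11] * 14
print(classify_confidence(playoffs))
print(classify_confidence([0.40, 0.35, 0.22]))
```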

Why only one external dependency?

Fewer dependencies = faster setup, fewer security risks, easier auditing. The Gamma API returns simple JSON. No need for heavy frameworks.
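To show how little machinery plain JSON needs, here is a sketch that turns a response body into typed objects using only the standard library. The keys `"question"`, `"yes"`, and `"no"` and the sample payload are placeholders, not the real Gamma API schema, and the trimmed-down `Market` class here is a stand-in for the fuller one in this post.

```python
import json
from dataclasses import dataclass


@dataclass
class Market:
    question: str
    yes_price: float
    no_price: float


def parse_markets(payload: str) -> list[Market]:
    """Turn a JSON array of market objects into Market instances.

    The keys used here are placeholders; adjust them to whatever
    the actual API response contains.
    """
    return [
        Market(item["question"], float(item["yes"]), float(item["no"]))
        for item in json.loads(payload)
    ]


# Invented payload for illustration only
sample = '[{"question": "BTC > 100k?", "yes": "0.42", "no": "0.58"}]'
print(parse_markets(sample))
```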

Real Results from Today's Scan

| Metric | Value |
| --- | --- |
| Events scanned | 750 |
| Markets scanned | 11,821 |
| High-confidence exclusive outcome mispricings | 31 |
| Ladder contradictions | 45 |
| Cross-market implications | 20 |

Most high-liquidity markets are efficiently priced (< 0.5% deviation). The edges are in:

  • Lower-liquidity niche markets
  • Newly opened markets (before bots adjust)
  • Complex multi-leg logical relationships

Running Continuously

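import time

from polymarket_scanner import PolymarketClient, ArbitrageScanner


def send_alert(opp):
    # Placeholder: swap in your own notification method (email, Slack, etc.)
    print(f"ALERT: {opp}")


client = PolymarketClient()
scanner = ArbitrageScanner(client)

while True:
    results = scanner.full_scan(max_pages=5)
    for opp in results["ladder_contradictions"]:
        # Alert only on deviations above 5%
        if opp["deviation_pct"] > 5:
            send_alert(opp)
    time.sleep(300)  # wait 5 minutes between scans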

Get the Full Toolkit

The complete scanner with all three analysis modules, API client, and examples is available here:

Polymarket Scanner Toolkit on Gumroad

Python 3.10+. One dependency. Works out of the box.


What approaches are you using for prediction market analysis? I am especially interested in hearing about embedding-based semantic matching for cross-market analysis.
