Prediction markets like Polymarket don't operate on traditional order books. They use an automated market maker called the Logarithmic Market Scoring Rule (LMSR) — a single equation that governs how prices move, how liquidity behaves, and where exploitable edges appear. Most participants treat these markets like a sportsbook. The traders who compound quietly treat them like a pricing model with known inputs and measurable inefficiencies.
This article builds a complete LMSR-based analysis toolkit in Python. We will implement the core cost function, an expected value calculator that surfaces edge versus the current market price, a slippage simulator that estimates price impact before you place a bet, and a Kelly Criterion position sizer that keeps risk per trade proportional to estimated edge. Every component runs on plain NumPy and is designed to be dropped into a Colab notebook or local script.
Most algo trading content gives you theory. This gives you the code. 3 Python strategies, fully backtested, Colab notebook included. Plus a free ebook with 5 more strategies the moment you subscribe. 5,000 quant traders already run these:

Subscribe | AlgoEdge Insights
This article covers:
- **Section 1 — The LMSR Mechanism:** What LMSR is, how it differs from an order book, and the intuition behind the cost function
- **Section 2 — Python Implementation:** Setup, core LMSR functions, expected value and slippage logic, Kelly sizing, and a visualization of price impact curves
- **Section 3 — Results and Interpretation:** What the toolkit reveals about a sample market and what realistic edge looks like
- **Section 4 — Use Cases:** Where this framework applies beyond a single Polymarket trade
- **Section 5 — Limitations and Edge Cases:** Where the model breaks down and what assumptions to challenge
1. The LMSR Mechanism
The Logarithmic Market Scoring Rule was introduced by Robin Hanson as a way to run a prediction market without requiring a counterparty for every trade. Instead of matching a buyer to a seller, LMSR uses a liquidity parameter b and a cost function to determine the price of any trade automatically. The market always accepts your order. What changes is the price you pay, which increases as you buy more of one outcome.
The core intuition is this: LMSR tracks the number of shares outstanding for each outcome and prices each marginal share according to a softmax over those quantities scaled by b. If 100 YES shares and 50 NO shares are outstanding, YES is more expensive than NO. When you buy YES shares, you push that price higher and the NO price drops proportionally. The market is always internally consistent — probabilities sum to one.
The cost function is C(q) = b * log(sum(exp(q_i / b))), where q is the vector of outstanding shares for each outcome. The price of an outcome at any point is exp(q_i / b) / sum(exp(q_j / b)) — exactly a softmax. This is the same operation used in the output layer of a neural network classifier, which is not a coincidence. Both are computing a normalized probability distribution from unnormalized scores.
To buy delta shares of outcome i, you pay C(q_after) - C(q_before). That difference is your cost. If the cost per share is lower than your estimated true probability, you have positive expected value. The entire analytical problem reduces to accurately estimating that probability and sizing your position to survive the variance until the market resolves.
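To make that cost difference concrete, here is a standalone sketch using the same b = 100 and share state [80, 60] that appear in the implementation below. Note that the average price paid per share comes out above the pre-trade spot price: that gap is the slippage the rest of the article models.

```python
import math

b = 100.0

def cost(q):
    # C(q) = b * ln(sum(exp(q_i / b)))
    return b * math.log(sum(math.exp(qi / b) for qi in q))

def price(q):
    # Softmax over q / b; entries sum to 1
    z = [math.exp(qi / b) for qi in q]
    return [zi / sum(z) for zi in z]

before = [80.0, 60.0]   # [YES, NO] shares outstanding
after = [90.0, 60.0]    # state after buying 10 YES shares

pay = cost(after) - cost(before)   # total paid for the 10 shares
spot = price(before)[0]            # marginal YES price before the trade
print(f"paid ${pay:.2f} for 10 shares "
      f"(avg ${pay / 10:.4f}/share vs spot {spot:.4f})")
```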
2. Python Implementation
2.1 Setup and Parameters
The model requires six configurable values. b is the LMSR liquidity parameter; higher values mean larger trades move prices less. shares is the current vector of outstanding shares per outcome (YES, NO). true_prob is your estimated probability for the YES outcome, the only input that requires genuine research. bankroll is your total capital available for this market. kelly_fraction scales the Kelly bet down to account for model uncertainty, and max_shares_to_buy sets the range swept by the slippage simulation.
```python
import numpy as np
import matplotlib.pyplot as plt
import matplotlib.ticker as mticker

# --- Parameters ---
b = 100.0                        # LMSR liquidity parameter (higher = deeper book)
shares = np.array([80.0, 60.0])  # Outstanding shares: [YES, NO]
true_prob = 0.55                 # Your estimated probability for YES
bankroll = 1000.0                # Total capital ($)
kelly_fraction = 0.25            # Fractional Kelly (0.25 = quarter-Kelly)
max_shares_to_buy = 200          # Range for slippage simulation
```
2.2 LMSR Cost Function and Price Engine
These three functions are the complete pricing engine. lmsr_cost computes the log-sum-exp of outstanding shares scaled by b. lmsr_price returns the current softmax probability for each outcome. lmsr_trade_cost computes exactly what you pay for delta shares of a given outcome by differencing the cost function before and after the hypothetical purchase.
```python
def lmsr_cost(q: np.ndarray, b: float) -> float:
    """Total cost state: b * log(sum(exp(q / b)))"""
    return b * np.log(np.sum(np.exp(q / b)))

def lmsr_price(q: np.ndarray, b: float) -> np.ndarray:
    """Current market prices via softmax — sums to 1.0"""
    exp_q = np.exp(q / b)
    return exp_q / np.sum(exp_q)

def lmsr_trade_cost(q: np.ndarray, b: float, outcome: int, delta: float) -> float:
    """Cost to buy `delta` shares of `outcome` given current share state `q`."""
    q_after = q.copy()
    q_after[outcome] += delta
    return lmsr_cost(q_after, b) - lmsr_cost(q, b)

# --- Current market state ---
current_prices = lmsr_price(shares, b)
yes_market_price = current_prices[0]
print(f"YES market price : {yes_market_price:.4f}")
print(f"NO  market price : {current_prices[1]:.4f}")
```
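A property worth verifying whenever you reimplement LMSR: the softmax price is exactly the partial derivative of the cost function with respect to each share count, so a finite-difference approximation of the cost should land on the price vector. A standalone sanity check (the two functions are restated so the snippet runs on its own):

```python
import numpy as np

def lmsr_cost(q, b):
    # C(q) = b * log(sum(exp(q / b)))
    return b * np.log(np.sum(np.exp(q / b)))

def lmsr_price(q, b):
    # Softmax prices; should equal dC/dq
    exp_q = np.exp(q / b)
    return exp_q / np.sum(exp_q)

b = 100.0
q = np.array([80.0, 60.0])
eps = 1e-5

# Finite-difference gradient of the cost function, one outcome at a time
grad = np.zeros_like(q)
for i in range(len(q)):
    bumped = q.copy()
    bumped[i] += eps
    grad[i] = (lmsr_cost(bumped, b) - lmsr_cost(q, b)) / eps

print(np.max(np.abs(grad - lmsr_price(q, b))))  # near zero: prices are the gradient
```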
2.3 Expected Value, Kelly Sizing, and Slippage Simulation
expected_value computes edge as the difference between your estimated probability and the effective cost per share (accounting for slippage on a given trade size). kelly_shares implements fractional Kelly sizing, capping position size at a configurable percentage of bankroll. simulate_slippage sweeps trade sizes from 1 to max_shares_to_buy and records the average cost per share at each size — this is the slippage curve.
```python
def expected_value(true_p: float, cost_per_share: float) -> float:
    """EV = true_prob * 1.0 - cost_per_share (binary outcome pays $1)"""
    return true_p - cost_per_share

def kelly_shares(true_p: float, market_p: float, bankroll: float,
                 fraction: float = 0.25) -> float:
    """
    Kelly formula for a binary market: f* = (p * (1 + b_odds) - 1) / b_odds,
    where b_odds = (1 / market_p) - 1 are the net odds per $1 risked.
    Returns a dollar-denominated fractional-Kelly position, floored at 0.
    """
    if market_p <= 0 or market_p >= 1:
        return 0.0
    b_odds = (1.0 / market_p) - 1.0
    f_star = (true_p * (1 + b_odds) - 1) / b_odds
    f_star = max(f_star, 0.0)
    return fraction * f_star * bankroll

def simulate_slippage(q: np.ndarray, b: float, outcome: int,
                      max_delta: int) -> tuple:
    """Returns arrays of share counts and average cost-per-share."""
    deltas = np.arange(1, max_delta + 1, dtype=float)
    avg_costs = np.array([
        lmsr_trade_cost(q, b, outcome, d) / d for d in deltas
    ])
    return deltas, avg_costs

# --- Run calculations ---
spot_cost = lmsr_trade_cost(shares, b, 0, 1.0)  # cost for 1 YES share
ev_spot = expected_value(true_prob, spot_cost)
kelly_dollars = kelly_shares(true_prob, yes_market_price, bankroll, kelly_fraction)
kelly_num_shares = kelly_dollars / spot_cost if spot_cost > 0 else 0

print(f"\nSpot cost (1 YES share) : ${spot_cost:.4f}")
print(f"Expected value          : {ev_spot:+.4f}")
print(f"Kelly position (dollars): ${kelly_dollars:.2f}")
print(f"Kelly position (shares) : {kelly_num_shares:.1f}")

deltas, avg_costs = simulate_slippage(shares, b, 0, max_shares_to_buy)
ev_curve = expected_value(true_prob, avg_costs)
breakeven_idx = np.where(ev_curve <= 0)[0]
breakeven_shares = int(deltas[breakeven_idx[0]]) if len(breakeven_idx) > 0 else None

if breakeven_shares:
    print(f"\nEdge disappears at : {breakeven_shares} shares")
else:
    print("\nEdge persists across full range.")
```
2.4 Visualization
The chart below overlays the average cost-per-share curve against your estimated true probability. The gap between the two lines is your EV at each trade size. The vertical dashed line marks the Kelly-optimal share count, and the red crossover marks where slippage fully erodes edge.
```python
plt.style.use("dark_background")
fig, axes = plt.subplots(2, 1, figsize=(10, 8), sharex=True)
fig.suptitle("LMSR Slippage and Expected Value — YES Outcome", fontsize=14, y=0.98)

# --- Top panel: cost curve vs true probability ---
axes[0].plot(deltas, avg_costs, color="#00BFFF", linewidth=2, label="Avg cost/share")
axes[0].axhline(true_prob, color="#FFD700", linewidth=1.5, linestyle="--",
                label=f"True prob estimate ({true_prob:.2f})")
axes[0].axvline(kelly_num_shares, color="#7FFF00", linewidth=1.2,
                linestyle=":", label=f"Kelly size ({kelly_num_shares:.0f} shares)")
if breakeven_shares:
    axes[0].axvline(breakeven_shares, color="#FF4500", linewidth=1.2,
                    linestyle="-.", label=f"Breakeven ({int(breakeven_shares)} shares)")
axes[0].set_ylabel("Price / Probability")
axes[0].legend(fontsize=9)
axes[0].yaxis.set_major_formatter(mticker.FormatStrFormatter("%.3f"))

# --- Bottom panel: EV curve ---
axes[1].plot(deltas, ev_curve, color="#FF69B4", linewidth=2, label="Expected Value")
axes[1].axhline(0, color="white", linewidth=0.8, linestyle="--")
axes[1].fill_between(deltas, ev_curve, 0,
                     where=(ev_curve > 0), alpha=0.25, color="#00FF7F",
                     label="Positive EV")
axes[1].fill_between(deltas, ev_curve, 0,
                     where=(ev_curve <= 0), alpha=0.25, color="#FF4500",
                     label="Negative EV")
axes[1].set_xlabel("Shares Purchased")
axes[1].set_ylabel("EV per Share ($)")
axes[1].legend(fontsize=9)

plt.tight_layout()
plt.savefig("lmsr_slippage_ev.png", dpi=150, bbox_inches="tight")
plt.show()
```
Figure 1. Top panel shows average cost per share rising with trade size against the fixed true-probability estimate; bottom panel shows the resulting EV curve, with the green region representing profitable trade sizes and the red region where slippage has consumed all edge.
3. Results and Interpretation
With b=100, shares=[80, 60], and true_prob=0.55, the spot YES price comes out to approximately 0.5498, leaving an edge of less than two hundredths of a cent per marginal share. Because LMSR charges the average price over the whole purchase, the cost of even a single share (about 0.5512) already exceeds the 0.55 estimate, so the simulator reports breakeven at the very first share: this configuration has no tradeable edge at any size. The decay pattern is general. Each additional share purchased pushes the price you pay further above the spot price, widening the shortfall, and the breakeven share count shrinks further as b shrinks.
The quarter-Kelly position delivers the same verdict. For a binary market the formula reduces to f* = (p - price) / (1 - price), so here f* = (0.55 - 0.5498) / (1 - 0.5498) ≈ 0.0004, and quarter-Kelly allocates roughly $0.09 of the $1,000 bankroll: a pass in all but name. Raise true_prob to 0.58 and the same formula allocates about $17, the scale of position a genuine three-point edge supports. This smallness is intentional. At quarter-Kelly, the growth rate is near-optimal while drawdown risk is substantially reduced compared to full Kelly, and the formula is highly sensitive to probability estimation errors: if your true_prob estimate is off by five percentage points, a full-Kelly position can turn a perceived edge into meaningful capital loss. Fractional Kelly is the practitioner's adjustment for that model uncertainty.
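Because the Kelly output moves fast with the probability input, it is worth sweeping the estimate before trusting any single number. A minimal sensitivity sketch, using the equivalent closed form f* = (p - price) / (1 - price) for a binary $1 market and the spot price implied by the article's parameters (≈ 0.5498):

```python
# Sensitivity of the quarter-Kelly allocation to the probability estimate.
market_p, bankroll, fraction = 0.5498, 1000.0, 0.25

def quarter_kelly(true_p):
    # Binary-market Kelly fraction, floored at zero, scaled to dollars
    f_star = max((true_p - market_p) / (1.0 - market_p), 0.0)
    return fraction * f_star * bankroll

for p in (0.55, 0.57, 0.60, 0.65):
    # Allocations climb steeply as the estimate moves away from the price
    print(f"true_p={p:.2f} -> ${quarter_kelly(p):7.2f}")
```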
The most actionable output is the breakeven share count from the slippage simulator. Before placing any trade, you can read this number directly: if your research supports buying 80 shares, but the model says edge disappears at 45, you are walking into a losing trade in the top half of your position. This forces a discipline that pure intuition cannot replicate.
4. Use Cases
Pre-trade sizing discipline. Run the slippage simulator before any trade to confirm that your intended position size falls within the positive-EV zone. This is especially important in low-liquidity markets where b is small and price impact is severe.

Market screening. Loop this toolkit over multiple open markets with your probability estimates. Filter to only those where EV at your minimum trade size exceeds a threshold (e.g., +0.02). This creates a systematic entry queue rather than an ad-hoc one.

Probability calibration tracking. Log your true_prob estimates and actual market resolutions over time. Track your Brier score. If your estimates are consistently miscalibrated in one direction, the EV calculations are systematically wrong and need adjustment before sizing decisions can be trusted.

Liquidity parameter estimation. For markets where you don't know b directly, you can back it out by fitting the LMSR cost function to observed price changes. Two observed price states plus the cost equation give you enough information to estimate b numerically, which then anchors all subsequent slippage projections.
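In the binary case the back-out even has a closed form, provided you also know the size of the trade between the two price snapshots: the YES price is a sigmoid of (q_yes - q_no) / b, so buying delta YES shares shifts the log-odds by exactly delta / b. A sketch under that assumption, with synthetic prices generated from a known b to check the recovery:

```python
import math

def logit(p):
    return math.log(p / (1.0 - p))

def estimate_b(p_before, p_after, delta_yes):
    """Binary LMSR: YES price = sigmoid((q_yes - q_no) / b), so a trade of
    delta_yes shares moves the log-odds by delta_yes / b."""
    return delta_yes / (logit(p_after) - logit(p_before))

# Synthetic example: prices generated from b = 100
p1 = 1.0 / (1.0 + math.exp(-20.0 / 100.0))  # q_yes - q_no = 20 before the trade
p2 = 1.0 / (1.0 + math.exp(-50.0 / 100.0))  # after 30 YES shares are bought
print(estimate_b(p1, p2, 30.0))             # recovers 100.0
```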
5. Limitations and Edge Cases
Probability estimation is the hard problem. The entire framework is only as good as true_prob. LMSR mechanics are exact — the math has no error. Your edge estimate carries all the uncertainty. A trader who systematically overestimates their own accuracy will apply Kelly sizing to negative-EV positions confidently. The model provides no protection against this.
b is not always observable. Polymarket's actual liquidity parameter varies by market and can change as liquidity is added or withdrawn. If you calibrate b from stale data, your slippage estimates will be wrong. Always re-estimate b from the most recent observable price states before running the simulator.
Binary outcome assumption. This implementation assumes a two-outcome market paying $1 on resolution. Multi-outcome markets (elections with more than two candidates, for example) require extending the share vector and adjusting the Kelly formula to account for correlated outcomes.
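The pricing functions themselves already generalize, since cost and price operate on a vector: a three-outcome market only needs a longer share array, and it is the sizing logic that does not carry over automatically. A sketch with hypothetical share counts:

```python
import numpy as np

def lmsr_price(q, b):
    # Softmax prices over the full outcome vector
    exp_q = np.exp(q / b)
    return exp_q / np.sum(exp_q)

b = 100.0
shares3 = np.array([80.0, 60.0, 40.0])  # hypothetical 3-candidate market

p = lmsr_price(shares3, b)
print(p, p.sum())  # three prices that still sum to 1.0
```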
No model for adverse selection. LMSR assumes all participants are noise traders or honest probability estimators. In practice, some market participants have genuinely superior information. If you are systematically on the wrong side of large, informed trades, the EV formula will look positive right up until resolution.
Resolution risk is not modeled. Polymarket markets occasionally resolve incorrectly, are voided, or face liquidity crises near resolution. These tail events are not captured in the EV calculation. In high-stakes positions, a separate risk haircut for resolution uncertainty is warranted.
Concluding Thoughts
The LMSR pricing engine reduces to three functions: a log-sum-exp cost state, a softmax price, and a cost difference for any given trade. Everything else — expected value, Kelly sizing, slippage modeling — is arithmetic on top of those primitives. Understanding this unlocks a systematic approach to prediction market trading that has nothing to do with intuition about the underlying event and everything to do with price versus probability.
The most practical next step is to calibrate your own probability estimates rigorously. Back-test your historical estimates against resolutions. Compute your Brier score. If your calibration is reliable, the Kelly sizing formula will compound capital over time. If it is not, no position sizing formula can save you. The math is only as useful as the inputs you feed it.
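Calibration tracking needs nothing more than a running log. The Brier score is the mean squared difference between your forecast probabilities and the binary resolutions (lower is better; always guessing 0.5 scores 0.25). A minimal sketch with made-up history:

```python
import numpy as np

# Hypothetical log: your true_prob estimates and how each market resolved
forecasts = np.array([0.55, 0.70, 0.30, 0.80, 0.60])
outcomes = np.array([1, 1, 0, 1, 0])  # 1 = YES resolved true

brier = np.mean((forecasts - outcomes) ** 2)
print(f"Brier score: {brier:.4f}")
```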
Future extensions worth building include a multi-outcome LMSR generalizing the share vector to n outcomes, a real-time data pull from the Polymarket API to populate shares and b automatically, and a portfolio-level Kelly allocation across correlated markets. Each of those components follows directly from the foundation built here.