Waqas R

How we predict the FIFA World Cup 2026 with a Dixon-Coles bivariate Poisson model

We're building Onside Arena — an open AI football analytics platform for the FIFA World Cup 2026 and FPL. Live model record: 75% of MD1 winners called correctly. Here's the technical core.

TL;DR

  • Dixon-Coles bivariate Poisson on team goal expectations
  • Bayesian-shrunk ratings learned from 12 past World Cups + 8 Premier League seasons (~32K matches)
  • Live recalibration after every played match in the tournament
  • Outputs per-match win/draw/loss probabilities, scoreline distributions, and Monte Carlo simulations of the bracket
  • Receipts published live at onsidearena.com/world-cup-2026/model-record

Why Dixon-Coles

A standard independent-Poisson model assumes home and away goal counts are independent given attack/defence rates. That's wrong for football: 0-0 and 1-1 occur more often than independent Poisson predicts, and 1-0 / 0-1 occur less often. Dixon-Coles (1997) adds a correction term, controlled by a single dependence parameter rho, that rescales the probabilities of those four low scores and leaves every other scoreline at its independent-Poisson value.

The rho parameter is learned from data. For our WC + PL training set, rho is approximately -0.13, which raises predicted draw probabilities by 4-6 percentage points on average relative to independent Poisson.
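
To make the correction concrete, here is a minimal sketch of the standard Dixon-Coles adjustment applied to a scoreline grid. The function names and the example goal rates (1.4 and 1.1) are illustrative assumptions, not our production code; only rho = -0.13 comes from the fit described above.

```python
import math

def poisson_pmf(k: int, rate: float) -> float:
    """Probability of exactly k goals under a Poisson with the given rate."""
    return math.exp(-rate) * rate**k / math.factorial(k)

def tau(x: int, y: int, lam: float, mu: float, rho: float) -> float:
    """Dixon-Coles low-score factor; x = home goals, y = away goals."""
    if x == 0 and y == 0:
        return 1.0 - lam * mu * rho
    if x == 0 and y == 1:
        return 1.0 + lam * rho
    if x == 1 and y == 0:
        return 1.0 + mu * rho
    if x == 1 and y == 1:
        return 1.0 - rho
    return 1.0  # every other scoreline keeps its independent-Poisson value

def score_matrix(lam: float, mu: float, rho: float, max_goals: int = 10):
    """matrix[x][y] = P(home scores x, away scores y), truncated at max_goals."""
    return [
        [tau(x, y, lam, mu, rho) * poisson_pmf(x, lam) * poisson_pmf(y, mu)
         for y in range(max_goals + 1)]
        for x in range(max_goals + 1)
    ]

def outcome_probs(matrix):
    """Collapse the scoreline grid into (home win, draw, away win)."""
    home = sum(p for x, row in enumerate(matrix) for y, p in enumerate(row) if x > y)
    draw = sum(p for x, row in enumerate(matrix) for y, p in enumerate(row) if x == y)
    away = sum(p for x, row in enumerate(matrix) for y, p in enumerate(row) if x < y)
    total = home + draw + away  # renormalise away the truncated tail
    return home / total, draw / total, away / total

# Illustrative goal rates (assumed), with and without the correction.
independent = outcome_probs(score_matrix(1.4, 1.1, rho=0.0))
dixon_coles = outcome_probs(score_matrix(1.4, 1.1, rho=-0.13))
```

With a negative rho the draw probability in `dixon_coles` comes out higher than in `independent`, because probability mass moves from 1-0 and 0-1 onto 0-0 and 1-1. The four adjustments cancel exactly, so the grid still sums to one.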

Where the team ratings come from

Attack/defence rates are not observed — they're estimated. We use a hierarchical Bayesian shrinkage model:

  • Each team has a latent attack strength and defence strength
  • Priors centered on confederation mean (UEFA, CONMEBOL, etc.) so newly-qualified nations aren't extreme outliers
  • Likelihood: every observed match score in our 32K-match corpus contributes evidence
  • MAP point estimates from a Stan-style fit, cached per nation pair for fast scoring
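
The shrinkage idea in the list above can be sketched with a one-parameter normal-normal update. This is a deliberately simplified stand-in for the full hierarchical model: the function name, the noise level, and the example numbers are all assumptions chosen for illustration.

```python
def shrink_toward_prior(observed_mean: float, n_matches: int,
                        prior_mean: float, prior_sd: float,
                        noise_sd: float):
    """Precision-weighted blend of a confederation prior and a team's own data.

    Returns (posterior mean, posterior sd) for a single latent strength,
    assuming a normal prior and normally distributed per-match noise.
    """
    prior_precision = 1.0 / prior_sd**2
    data_precision = n_matches / noise_sd**2
    total_precision = prior_precision + data_precision
    post_mean = (prior_precision * prior_mean
                 + data_precision * observed_mean) / total_precision
    post_sd = total_precision ** -0.5
    return post_mean, post_sd

# Hypothetical newcomer: 4 matches suggest a large attack strength of +0.9,
# while the (assumed) confederation mean is -0.1.
few_matches = shrink_toward_prior(0.9, n_matches=4, prior_mean=-0.1,
                                  prior_sd=0.3, noise_sd=1.0)
many_matches = shrink_toward_prior(0.9, n_matches=200, prior_mean=-0.1,
                                   prior_sd=0.3, noise_sd=1.0)
```

With only a handful of matches the estimate stays close to the confederation mean; with a long record the team's own results dominate and the posterior tightens.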

Home advantage is a single global parameter (~0.31 log-goals), with a learned multiplier for neutral-venue WC matches (~0.83x of league home advantage).
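
As a rough sketch of how these pieces could combine into goal expectations, assume a log-linear form in which a team's rate is its attack strength minus the opponent's defence strength, plus the venue term. The parametrisation, the function name, and `base_log_rate` are assumptions; only the 0.31 and 0.83 figures come from the fit above.

```python
import math

HOME_ADVANTAGE = 0.31      # log-goals, from the fitted model
NEUTRAL_MULTIPLIER = 0.83  # scaling applied at neutral-venue WC matches

def expected_goals(attack_home: float, defence_home: float,
                   attack_away: float, defence_away: float,
                   neutral_venue: bool = False,
                   base_log_rate: float = 0.0):
    """Return (lam, mu): expected goals for the nominal home and away sides.

    Strengths are on the log scale; a larger defence value means fewer
    goals conceded. This log-linear form is an assumed parametrisation.
    """
    venue_scale = NEUTRAL_MULTIPLIER if neutral_venue else 1.0
    home_edge = HOME_ADVANTAGE * venue_scale
    lam = math.exp(base_log_rate + attack_home - defence_away + home_edge)
    mu = math.exp(base_log_rate + attack_away - defence_home)
    return lam, mu

# Two evenly matched (hypothetical) teams at a league ground vs a neutral venue.
league = expected_goals(0.2, 0.1, 0.2, 0.1)
neutral = expected_goals(0.2, 0.1, 0.2, 0.1, neutral_venue=True)
```

For evenly matched sides the ratio `lam / mu` is `exp(0.31)` at a league ground and `exp(0.31 * 0.83)` at a neutral venue, so the venue term is the only source of asymmetry.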

Live recalibration

This is the part most public models don't do. After every WC 2026 match plays out:

  1. Take the model's pre-match attack/defence rates and the actual scoreline
  2. Compute the Bayesian update to that team-pair's posterior
  3. Propagate the update to the team's confederation-cluster prior
  4. Re-score all future matches involving either team

Net effect: a side like Iraq, which had a wide posterior because of limited recent international form, sharpened ~2x faster than a side like France whose prior was already tight.
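
The wide-versus-tight behaviour is easy to see in a toy conjugate update. The sketch below puts a Gamma prior on a single team's goal rate and updates it with one observed result; it is a simplified stand-in for the real update (which works on attack/defence strengths and propagates to the confederation prior), and the prior parameters are invented for illustration.

```python
def update_goal_rate(shape: float, rate: float, goals: int, matches: int = 1):
    """Gamma(shape, rate) prior + Poisson goal counts -> Gamma posterior."""
    return shape + goals, rate + matches

def posterior_mean(shape: float, rate: float) -> float:
    """Mean of a Gamma(shape, rate) distribution."""
    return shape / rate

# Both priors have mean 1.0 goal per match, but very different certainty.
wide_prior = (2.0, 2.0)     # little recent evidence (hypothetical)
tight_prior = (20.0, 20.0)  # long, stable record (hypothetical)

# Each team then scores 3 goals in a single match.
wide_post = update_goal_rate(*wide_prior, goals=3)
tight_post = update_goal_rate(*tight_prior, goals=3)

wide_shift = posterior_mean(*wide_post) - posterior_mean(*wide_prior)
tight_shift = posterior_mean(*tight_post) - posterior_mean(*tight_prior)
```

The same 3-goal result moves the wide-prior estimate by about 0.67 goals per match and the tight-prior estimate by under 0.1, which is the mechanism behind the difference between a thin-record side and an established one.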

Sanity-check: what we got right and wrong

From MD1:

  • Argentina to top Group H @ 73% -> 2-0 vs Austria (correct)
  • France to top Group K @ 81% -> 3-0 vs Iraq (correct)
  • England to win Group C @ 68% -> won 2-0 (correct)
  • Germany draw @ 64% -> lost (model was too confident in Germany's defensive solidity vs current form)

Live accuracy: 24/32 calls correct = 75%. Brier score on win-probability: 0.179 (lower is better; a forecaster who always says 50% scores 0.25).
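
For readers who want to check the metric themselves, here is a minimal sketch of a binary Brier score. The function name is ours for illustration, and the sample forecasts are made up rather than taken from the model record.

```python
def brier_score(probabilities, outcomes):
    """Mean squared error between forecast probabilities and 0/1 outcomes."""
    if len(probabilities) != len(outcomes) or not probabilities:
        raise ValueError("need two non-empty sequences of equal length")
    squared_errors = [(p - o) ** 2 for p, o in zip(probabilities, outcomes)]
    return sum(squared_errors) / len(squared_errors)

# Always forecasting 50% scores 0.25 whatever happens: the naive baseline.
baseline = brier_score([0.5, 0.5, 0.5, 0.5], [1, 0, 1, 0])

# Made-up forecasts: one confident hit and one confident miss.
example = brier_score([0.8, 0.7], [1, 0])
```

A confident miss is penalised heavily: the 70% call that failed contributes 0.49 on its own, which is why a single overconfident pick can drag the score above the 0.25 baseline.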

What's in the API

We publish the model's outputs as free JSON via MCP and REST:

  • GET /api/v1/wc/probabilities — per-match win/draw probabilities
  • GET /api/v1/wc/champions — current Monte Carlo champion distribution (10K sims)
  • GET /api/v1/wc/upsets — biggest projected upsets in upcoming 7 days
  • npm: onside-football-mcp — drop-in for Claude / Cursor / ChatGPT App Directory

Full docs at onsidearena.com/llms.txt.
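
A minimal sketch of calling the REST endpoints from Python using only the standard library. The base URL is an assumption (the paths above are relative), and the response schema is not described here, so the helper just returns whatever JSON comes back; check the docs for the actual host and fields.

```python
import json
from urllib.request import urlopen

BASE_URL = "https://onsidearena.com"  # assumption: host not stated with the paths

ENDPOINTS = {
    "probabilities": "/api/v1/wc/probabilities",
    "champions": "/api/v1/wc/champions",
    "upsets": "/api/v1/wc/upsets",
}

def build_url(name: str) -> str:
    """Full URL for one of the endpoints listed above."""
    return BASE_URL + ENDPOINTS[name]

def fetch(name: str, timeout: float = 10.0):
    """GET an endpoint and return the parsed JSON body as-is."""
    with urlopen(build_url(name), timeout=timeout) as response:
        return json.load(response)
```

Calling `fetch("champions")` would then return the current champion distribution as parsed JSON, assuming the host above is correct.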

What we'd love feedback on

Things we're still tuning:

  1. Squad-rotation prior: We don't yet condition on starting XI announcements, so the model still uses pre-tournament team ratings. A fix is in progress.
  2. Set-piece specialist weighting: A team's set-piece goal share is volatile and we under-weight it.
  3. Tail risk in knockouts: The model is conservative on extra-time and penalty shootouts. We use a separate logistic mixture there.

If you build prediction models for sports, or are interested in Bayesian methods applied to live-recalibrating systems, we'd love to hear how you handle these problems.


Live model record (we update it after every match): https://onsidearena.com/world-cup-2026/model-record

Follow @onsidearena on X for daily picks and post-match receipts.
