I open-sourced a World Cup 2026 prediction model — and tested it honestly

#opensource #javascript #datascience #showdev

Every World Cup, "supercomputer predicts the winner" headlines show up everywhere — and almost none of them let you see how the sausage is made. I wanted a forecast I could actually read, run, and argue with. So I built one for the 2026 World Cup, and I open-sourced the whole thing:

👉 github.com/Hicruben/world-cup-2026-prediction-model (MIT)

No machine-learning black box, no scraped bookmaker odds — just three classic, transparent pieces. And, more importantly, an honest, reproducible test of how good it actually is.

The model in three layers

1. Team strength (Elo). Every nation gets an Elo rating, seeded from long-run strength and then calibrated on hundreds of recent real internationals. Wins over strong sides in important games move a rating more than friendlies; recent form outweighs old form.
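
To make the rating step concrete, here is a minimal sketch of a textbook Elo update with a match-importance weight. It is not lifted from the repo: the function names and the K-factor values are assumptions chosen for illustration.

```javascript
// Expected score for side A (win = 1, draw = 0.5, loss = 0), standard Elo curve.
function expectedScore(ratingA, ratingB) {
  return 1 / (1 + 10 ** ((ratingB - ratingA) / 400));
}

// Hypothetical K-factors: bigger games move ratings more.
// These values are illustrative assumptions, not the repo's.
const K_BY_IMPORTANCE = { friendly: 20, qualifier: 40, finals: 60 };

// Returns the new [ratingA, ratingB] after one match.
// scoreA is 1 for an A win, 0.5 for a draw, 0 for a B win.
function updateElo(ratingA, ratingB, scoreA, importance = "friendly") {
  const k = K_BY_IMPORTANCE[importance];
  const delta = k * (scoreA - expectedScore(ratingA, ratingB));
  return [ratingA + delta, ratingB - delta];
}

// The same upset is worth more in a finals match than in a friendly.
const [afterFinals] = updateElo(1900, 2000, 1, "finals");
const [afterFriendly] = updateElo(1900, 2000, 1, "friendly");
console.log(afterFinals.toFixed(1), afterFriendly.toFixed(1)); // 1938.4 1912.8
```

Because each update is proportional to the gap between the result and the expectation, beating a stronger side earns more than beating a weaker one, and every new result shifts the rating away from what older matches implied.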

2. Each match (Dixon-Coles adjusted Poisson). The two ratings are converted into expected goals for each side, which feed a Dixon-Coles model to produce win/draw/loss probabilities. Dixon-Coles (1997) fixes a well-known flaw of plain independent Poisson scoring, which under-counts the low-scoring draws (0-0, 1-1) that are so common in football.

import { matchProb } from "./elo.mjs";

// Elo 2056 vs Elo 1951, neutral venue
const p = matchProb(2056, 1951);
// → { winA: 0.45, draw: 0.26, winB: 0.29, expectedGoalsA: 1.6, expectedGoalsB: 1.2 }
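
For readers who have not met the Dixon-Coles correction, the sketch below shows the idea in isolation. It starts from the two expected-goal values in the example output (1.6 and 1.2) and does not attempt the Elo-to-goals mapping, which the post does not spell out. The function names, `rho = -0.1`, and the 10-goal cap are assumptions, not the repo's settings.

```javascript
// Poisson probability of exactly k goals given mean lambda.
function poisson(k, lambda) {
  let p = Math.exp(-lambda);
  for (let i = 1; i <= k; i++) p *= lambda / i;
  return p;
}

// Dixon-Coles correction factor for scoreline (x, y).
// Only 0-0, 0-1, 1-0 and 1-1 are adjusted; every other score keeps factor 1.
function tau(x, y, lambda, mu, rho) {
  if (x === 0 && y === 0) return 1 - lambda * mu * rho;
  if (x === 0 && y === 1) return 1 + lambda * rho;
  if (x === 1 && y === 0) return 1 + mu * rho;
  if (x === 1 && y === 1) return 1 - rho;
  return 1;
}

// Sum the corrected score grid into win/draw/loss probabilities.
// rho and maxGoals defaults are illustrative assumptions.
function outcomeProbs(lambda, mu, rho = -0.1, maxGoals = 10) {
  let winA = 0, draw = 0, winB = 0;
  for (let x = 0; x <= maxGoals; x++) {
    for (let y = 0; y <= maxGoals; y++) {
      const p = tau(x, y, lambda, mu, rho) * poisson(x, lambda) * poisson(y, mu);
      if (x > y) winA += p;
      else if (x === y) draw += p;
      else winB += p;
    }
  }
  const total = winA + draw + winB; // renormalise the truncated grid
  return { winA: winA / total, draw: draw / total, winB: winB / total };
}

const plain = outcomeProbs(1.6, 1.2, 0); // rho = 0 is plain independent Poisson
const adjusted = outcomeProbs(1.6, 1.2, -0.1);
console.log(plain.draw.toFixed(3), adjusted.draw.toFixed(3));
```

With a negative `rho`, probability mass moves from 1-0 and 0-1 onto 0-0 and 1-1, so the draw probability comes out higher than under plain Poisson while the four adjustments cancel and the total stays at 1.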

3. The tournament (Monte Carlo). Play all 104 matches through the real bracket 10,000 times. Count how often each team reaches each round → championship and advancement probabilities.
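
The counting idea is easier to see on a toy bracket. The sketch below simulates a hypothetical four-team knockout, not the real 104-match tournament. The team names, ratings, the plain Elo win curve standing in for the match model, and the seeded `mulberry32` generator are all assumptions made for illustration.

```javascript
// Small seeded PRNG (mulberry32) so repeated runs give the same numbers.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function () {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Stand-in for the match model: chance that A beats B in a knockout tie.
function knockoutWinProb(ratingA, ratingB) {
  return 1 / (1 + 10 ** ((ratingB - ratingA) / 400));
}

// Play a single-elimination bracket once; `teams` is ordered by bracket slot.
function playBracket(teams, ratings, rand) {
  let round = teams;
  while (round.length > 1) {
    const next = [];
    for (let i = 0; i < round.length; i += 2) {
      const a = round[i];
      const b = round[i + 1];
      next.push(rand() < knockoutWinProb(ratings[a], ratings[b]) ? a : b);
    }
    round = next;
  }
  return round[0];
}

// Repeat the bracket many times and turn title counts into probabilities.
function titleOdds(teams, ratings, sims = 10000, seed = 1) {
  const rand = mulberry32(seed);
  const wins = Object.fromEntries(teams.map((t) => [t, 0]));
  for (let s = 0; s < sims; s++) wins[playBracket(teams, ratings, rand)] += 1;
  return Object.fromEntries(teams.map((t) => [t, wins[t] / sims]));
}

// Hypothetical four-team bracket with made-up ratings.
const ratings = { A: 2050, B: 1950, C: 1900, D: 1800 };
console.log(titleOdds(["A", "D", "B", "C"], ratings));
```

Tracking advancement to each round works the same way: keep one counter per team per round instead of counting only the eventual winner.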

There's a tiny CLI to poke at it:

$ node predict.mjs brazil argentina

  brazil (Elo 1994)  vs  argentina (Elo 2064)   [neutral]
  brazil           win   26.7%  ████████
  draw                   28.3%  █████████
  argentina        win   45.0%  █████████████

The part I actually care about: is it any good?

Anyone can spit out percentages. The hard question is whether they mean anything. So I tested it the honest way — walk-forward, out-of-sample. The script steps through 920 real internationals (Oct 2023 → May 2026) in date order, predicts each match using only data available before kickoff, then reveals the result and updates the ratings. No hindsight, no curve-fitting. One command reproduces it:

$ node backtest.mjs

=== Walk-forward backtest — 770 of 920 matches ===
MODEL
  Accuracy (top pick):   61.0%
  Favourite acc (p≥50%): 66.8%
  Brier (3-way, ↓):      0.536
BASELINES (same matches)
  Always pick home:      48.6%
  Coin-flip (uniform):   Brier 0.667

So: ~61% correct on a three-way (win/draw/loss) outcome across the 770 scored matches, versus ~49% for "always pick home" and ~33% for a uniform random guess. When the model had a clear favourite (p≥50%), it was right about two times in three. The Brier score (0.54 vs 0.67 for uniform; lower is better) says the probabilities carry real information, not just the top pick.
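
To show what the two headline numbers measure, here is a minimal sketch of a walk-forward loop that reports top-pick accuracy and a three-way Brier score. It is not the repo's backtest.mjs. The toy model, its fixed 25% draw share, the K-factor of 30, the sample matches, and the field names (`date`, `home`, `away`, `result`) are all assumptions for illustration.

```javascript
// Three-way Brier score for one match: squared error summed over the outcomes.
function brier3(probs, result) {
  return ["home", "draw", "away"].reduce(
    (sum, k) => sum + (probs[k] - (k === result ? 1 : 0)) ** 2,
    0
  );
}

// Toy model (hypothetical): Elo ratings plus a fixed 25% draw share.
function makeToyModel(k = 30) {
  const elo = {};
  const get = (team) => elo[team] ?? 1500;
  const expected = (a, b) => 1 / (1 + 10 ** ((get(b) - get(a)) / 400));
  return {
    predict(home, away) {
      const e = expected(home, away);
      return { home: 0.75 * e, draw: 0.25, away: 0.75 * (1 - e) };
    },
    update({ home, away, result }) {
      const score = result === "home" ? 1 : result === "draw" ? 0.5 : 0;
      const delta = k * (score - expected(home, away));
      elo[home] = get(home) + delta;
      elo[away] = get(away) - delta;
    },
  };
}

// Walk-forward loop: predict, score, and only then reveal the result.
function walkForward(matches, model) {
  const ordered = [...matches].sort((a, b) => a.date.localeCompare(b.date));
  let correct = 0;
  let brierSum = 0;
  for (const m of ordered) {
    const probs = model.predict(m.home, m.away); // sees earlier matches only
    const pick = Object.keys(probs).reduce((best, key) =>
      probs[key] > probs[best] ? key : best
    );
    if (pick === m.result) correct += 1;
    brierSum += brier3(probs, m.result);
    model.update(m); // ratings change after the prediction is scored
  }
  return { accuracy: correct / ordered.length, brier: brierSum / ordered.length };
}

// Made-up matches, purely to exercise the loop.
const matches = [
  { date: "2024-01-10", home: "X", away: "Y", result: "home" },
  { date: "2024-02-10", home: "X", away: "Z", result: "home" },
  { date: "2024-03-10", home: "Y", away: "X", result: "draw" },
];
console.log(walkForward(matches, makeToyModel()));

// A uniform forecast scores 2/3 whatever happens, matching the 0.667 baseline.
const uniform = { home: 1 / 3, draw: 1 / 3, away: 1 / 3 };
console.log(brier3(uniform, "draw").toFixed(3)); // 0.667
```

The ordering inside the loop is the whole point: each match is scored before `update` runs, so no prediction ever sees its own result.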

What I learned (and what I won't claim)

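It is not state-of-the-art, and it does not beat the betting market. A 61% hit rate also means ~2 in 5 matches surprise it, which is expected of a probabilistic forecast. Draws are genuinely the hardest thing to predict, and a tournament in which the champion plays just 8 games (3 group matches plus 5 knockout rounds in the 48-team format) is dominated by variance.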
Transparent baselines are underrated. No deep learning, ~300 lines of plain Node, zero dependencies — and it still lands in the same ballpark as far fancier models for tournament-level questions.
Calibration > accuracy. Getting the probabilities shaped right matters more than the headline hit rate, especially for a bracket simulation.

Try it / see it live

Clone it and run the backtest yourself (Node 18+, no deps):

git clone https://github.com/Hicruben/world-cup-2026-prediction-model.git
cd world-cup-2026-prediction-model
node backtest.mjs      # reproduce the numbers
node predict.mjs spain germany

The full 48-team tournament simulator (10k sims, live title odds, an interactive bracket) runs the same engine at cup26matches.com, and there's a plain-English write-up of the methodology and the backtest here.

I'd genuinely love feedback on the modelling — the Dixon-Coles ρ, the home-field handling, the best-third tiebreaks. Tear it apart in the comments or open an issue. ⭐ the repo if it's useful!