<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jerry Chen</title>
    <description>The latest articles on DEV Community by Jerry Chen (@jerry_chen_dbaa6838e17336).</description>
    <link>https://dev.to/jerry_chen_dbaa6838e17336</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3031788%2F89020bf3-ba11-453c-b3ca-47121d8ae19b.png</url>
      <title>DEV Community: Jerry Chen</title>
      <link>https://dev.to/jerry_chen_dbaa6838e17336</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/jerry_chen_dbaa6838e17336"/>
    <language>en</language>
    <item>
      <title>I open-sourced a World Cup 2026 prediction model — and tested it honestly</title>
      <dc:creator>Jerry Chen</dc:creator>
      <pubDate>Sun, 31 May 2026 15:08:33 +0000</pubDate>
      <link>https://dev.to/jerry_chen_dbaa6838e17336/i-open-sourced-a-world-cup-2026-prediction-model-and-tested-it-honestly-44d1</link>
      <guid>https://dev.to/jerry_chen_dbaa6838e17336/i-open-sourced-a-world-cup-2026-prediction-model-and-tested-it-honestly-44d1</guid>
      <description>&lt;p&gt;Every World Cup, "supercomputer predicts the winner" headlines show up everywhere — and almost none of them let you see how the sausage is made. I wanted a forecast I could actually read, run, and argue with. So I built one for the 2026 World Cup, and I open-sourced the whole thing:&lt;/p&gt;

&lt;p&gt;👉 &lt;strong&gt;&lt;a href="https://github.com/Hicruben/world-cup-2026-prediction-model" rel="noopener noreferrer"&gt;github.com/Hicruben/world-cup-2026-prediction-model&lt;/a&gt;&lt;/strong&gt; (MIT)&lt;/p&gt;

&lt;p&gt;No machine-learning black box, no scraped bookmaker odds — just three classic, transparent pieces. And, more importantly, an &lt;strong&gt;honest, reproducible test of how good it actually is.&lt;/strong&gt;&lt;/p&gt;

&lt;h2&gt;The model in three layers&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;1. Team strength (Elo).&lt;/strong&gt; Every nation gets an Elo rating, seeded from long-run strength and then calibrated on hundreds of recent real internationals. A win over a strong side in an important game moves a rating more than a win in a friendly, and recent form outweighs old form.&lt;/p&gt;
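
&lt;p&gt;To make the rating step concrete, here is a minimal sketch of a standard Elo update with a match-importance multiplier. The expected-score formula is the usual Elo one; the function names, the base K of 20 and the importance weights are illustrative assumptions, not values taken from the repo.&lt;/p&gt;

```javascript
// Standard Elo expected score for side A (a value between 0 and 1).
function expectedScore(ratingA, ratingB) {
  return 1 / (1 + Math.pow(10, (ratingB - ratingA) / 400));
}

// Hypothetical importance weights: competitive games move ratings more than friendlies.
const IMPORTANCE = { friendly: 1.0, qualifier: 1.5, tournament: 2.0 };

// score is 1 for an A win, 0.5 for a draw, 0 for an A loss.
function updateElo(ratingA, ratingB, score, kind) {
  const k = 20 * IMPORTANCE[kind]; // assumed base K of 20
  const delta = k * (score - expectedScore(ratingA, ratingB));
  // Whatever A gains, B loses, so the rating pool stays constant.
  return { ratingA: ratingA + delta, ratingB: ratingB - delta };
}

// The same upset win moves the underdog further in a tournament game than in a friendly.
console.log(updateElo(1951, 2056, 1, "tournament"));
console.log(updateElo(1951, 2056, 1, "friendly"));
```

&lt;p&gt;Because each update is proportional to the surprise (actual minus expected score), recent results keep overwriting older ones without any explicit time decay.&lt;/p&gt;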

&lt;p&gt;&lt;strong&gt;2. Each match (Dixon-Coles bivariate Poisson).&lt;/strong&gt; The two ratings are converted into expected goals for each side, which feed a Dixon-Coles model to produce win/draw/loss probabilities. Dixon-Coles (1997) corrects a well-known flaw of plain independent Poisson models, which under-count the low-scoring draws (0-0, 1-1) that are so common in football.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="nx"&gt;matchProb&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;./elo.mjs&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;// Elo 2056 vs Elo 1951, neutral venue&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;p&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;matchProb&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2056&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1951&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="c1"&gt;// → { winA: 0.45, draw: 0.26, winB: 0.29, expectedGoalsA: 1.6, expectedGoalsB: 1.2 }&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
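
&lt;p&gt;For readers curious about what happens between expected goals and the three probabilities, here is a rough sketch of the Dixon-Coles low-score correction applied to two independent Poisson distributions. It is not the repo's &lt;code&gt;matchProb&lt;/code&gt;: the helper names, the ρ value of -0.1 and the 10-goal cutoff are assumptions chosen for illustration.&lt;/p&gt;

```javascript
// Integers 0 .. n-1.
function range(n) {
  return [...Array(n).keys()];
}

// Poisson probability of exactly k goals given mean lambda.
function poisson(k, lambda) {
  let p = Math.exp(-lambda);
  for (const i of range(k)) p *= lambda / (i + 1);
  return p;
}

// Dixon-Coles correction: only the four lowest scorelines are adjusted.
// x and lambda belong to side A, y and mu to side B.
function tau(x, y, lambda, mu, rho) {
  switch (x + "-" + y) {
    case "0-0": return 1 - lambda * mu * rho;
    case "0-1": return 1 + lambda * rho;
    case "1-0": return 1 + mu * rho;
    case "1-1": return 1 - rho;
    default: return 1;
  }
}

// Sum the corrected score grid into win/draw/loss probabilities.
function outcomeProbs(lambda, mu, rho, maxGoals) {
  const out = { winA: 0, draw: 0, winB: 0 };
  for (const x of range(maxGoals + 1)) {
    for (const y of range(maxGoals + 1)) {
      const p = tau(x, y, lambda, mu, rho) * poisson(x, lambda) * poisson(y, mu);
      if (x === y) out.draw += p;
      else if (x > y) out.winA += p;
      else out.winB += p;
    }
  }
  return out;
}

// Expected goals 1.6 and 1.2, with an assumed rho of -0.1 versus no correction.
console.log(outcomeProbs(1.6, 1.2, -0.1, 10));
console.log(outcomeProbs(1.6, 1.2, 0, 10));
```

&lt;p&gt;With a negative ρ the 0-0 and 1-1 cells gain probability at the expense of 1-0 and 0-1, so the draw share rises while the grid still sums to one.&lt;/p&gt;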



&lt;p&gt;&lt;strong&gt;3. The tournament (Monte Carlo).&lt;/strong&gt; Simulate all 104 matches through the real bracket 10,000 times. The share of simulations in which a team reaches a given round is its advancement probability, and the share in which it lifts the trophy is its championship probability.&lt;/p&gt;
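
&lt;p&gt;The counting idea can be sketched on a toy bracket. This is a deliberately simplified stand-in, not the real simulator: it assumes a four-team single-elimination field with made-up ratings, decides each tie from the Elo gap alone (no draws, no group stage), and uses a small seeded generator so runs are repeatable. All names here are hypothetical.&lt;/p&gt;

```javascript
// Chance that A beats B in a knockout tie, from the Elo gap alone.
// Assumption: draws are ignored in this toy version.
function winProb(eloA, eloB) {
  return 1 / (1 + Math.pow(10, (eloB - eloA) / 400));
}

// Tiny seeded generator (linear congruential) so runs are repeatable.
function makeRng(seed) {
  let state = seed;
  return function () {
    state = (state * 1664525 + 1013904223) % 4294967296;
    return state / 4294967296;
  };
}

// Play one single-elimination bracket; teams are paired in list order.
// Assumes the number of teams is a power of two.
function playBracket(teams, rng) {
  let round = teams;
  while (round.length > 1) {
    const next = [];
    for (let i = 0; i !== round.length; i += 2) {
      const a = round[i];
      const b = round[i + 1];
      next.push(rng() > winProb(a.elo, b.elo) ? b : a);
    }
    round = next;
  }
  return round[0];
}

// Repeat the bracket many times and turn title counts into probabilities.
function titleOdds(teams, sims, seed) {
  const rng = makeRng(seed);
  const wins = {};
  for (const t of teams) wins[t.name] = 0;
  for (let s = 0; s !== sims; s++) wins[playBracket(teams, rng).name] += 1;
  for (const name of Object.keys(wins)) wins[name] /= sims;
  return wins;
}

// Hypothetical four-team field; ratings are made up for illustration.
const field = [
  { name: "A", elo: 2050 },
  { name: "B", elo: 1950 },
  { name: "C", elo: 1900 },
  { name: "D", elo: 1800 },
];
console.log(titleOdds(field, 10000, 42));
```

&lt;p&gt;The same loop scales to the full tournament by swapping in the real bracket and the full match model; tracking every round reached, rather than only the winner, gives the advancement probabilities.&lt;/p&gt;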

&lt;p&gt;There's a tiny CLI to poke at it:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;node predict.mjs brazil argentina

  brazil &lt;span class="o"&gt;(&lt;/span&gt;Elo 1994&lt;span class="o"&gt;)&lt;/span&gt;  vs  argentina &lt;span class="o"&gt;(&lt;/span&gt;Elo 2064&lt;span class="o"&gt;)&lt;/span&gt;   &lt;span class="o"&gt;[&lt;/span&gt;neutral]
  brazil           win   26.7%  ████████
  draw                   28.3%  █████████
  argentina        win   45.0%  █████████████
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;The part I actually care about: is it any good?&lt;/h2&gt;

&lt;p&gt;Anyone can spit out percentages. The hard question is whether they mean anything. So I tested it the honest way: &lt;strong&gt;walk-forward, out-of-sample&lt;/strong&gt;. The script steps through &lt;strong&gt;920 real internationals (Oct 2023 → May 2026)&lt;/strong&gt; in date order, predicts each match using &lt;em&gt;only&lt;/em&gt; data available before kickoff, then reveals the result and updates the ratings. No hindsight, no curve-fitting. One command reproduces it:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nv"&gt;$ &lt;/span&gt;node backtest.mjs

&lt;span class="o"&gt;===&lt;/span&gt; Walk-forward backtest — 770 of 920 matches &lt;span class="o"&gt;===&lt;/span&gt;
MODEL
  Accuracy &lt;span class="o"&gt;(&lt;/span&gt;top pick&lt;span class="o"&gt;)&lt;/span&gt;:   61.0%
  Favourite acc &lt;span class="o"&gt;(&lt;/span&gt;p≥50%&lt;span class="o"&gt;)&lt;/span&gt;: 66.8%
  Brier &lt;span class="o"&gt;(&lt;/span&gt;3-way, ↓&lt;span class="o"&gt;)&lt;/span&gt;:      0.536
BASELINES &lt;span class="o"&gt;(&lt;/span&gt;same matches&lt;span class="o"&gt;)&lt;/span&gt;
  Always pick home:      48.6%
  Coin-flip &lt;span class="o"&gt;(&lt;/span&gt;uniform&lt;span class="o"&gt;)&lt;/span&gt;:   Brier 0.667
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So, on the 770 scored matches: &lt;strong&gt;~61% correct on a three-way (win/draw/loss) outcome&lt;/strong&gt;, versus 49% for "always pick home" and ~33% for a uniform random guess. When the model gave one outcome at least a 50% chance, that outcome happened about two times in three. The Brier score (0.54 vs 0.67 for uniform; lower is better) says the &lt;em&gt;probabilities&lt;/em&gt; carry real information, not just the top pick.&lt;/p&gt;
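
&lt;p&gt;For anyone who wants to see the scoring mechanics, here is a small sketch of a walk-forward loop with a three-way Brier score. It is not the code in &lt;code&gt;backtest.mjs&lt;/code&gt;: the &lt;code&gt;predict&lt;/code&gt;/&lt;code&gt;update&lt;/code&gt; interface and the sample matches are assumptions made for illustration. The 0.667 uniform baseline in the output above implies the squared errors are summed over the three outcomes rather than averaged, and the sketch follows that convention.&lt;/p&gt;

```javascript
// Three-way Brier score for one match: squared error summed over the outcomes.
// probs and outcome are ordered [home, draw, away]; outcome is one-hot.
function brier(probs, outcome) {
  return probs.reduce(function (sum, p, i) {
    return sum + (p - outcome[i]) * (p - outcome[i]);
  }, 0);
}

// Walk forward in date order: predict first, score, and only then learn from the result.
function walkForward(matches, model) {
  const ordered = matches.slice().sort(function (a, b) {
    return a.date.localeCompare(b.date);
  });
  let hits = 0;
  let brierSum = 0;
  for (const m of ordered) {
    const probs = model.predict(m); // may use only information from earlier matches
    const top = probs.indexOf(Math.max(...probs));
    if (m.outcome[top] === 1) hits += 1;
    brierSum += brier(probs, m.outcome);
    model.update(m); // the result is revealed only after scoring
  }
  return { accuracy: hits / ordered.length, brier: brierSum / ordered.length };
}

// The uniform baseline: one third on every outcome, and it never learns.
const uniform = {
  predict: function () { return [1 / 3, 1 / 3, 1 / 3]; },
  update: function () {},
};

// Made-up results for illustration only (ISO dates sort correctly as strings).
const sample = [
  { date: "2024-03-21", outcome: [1, 0, 0] },
  { date: "2023-10-12", outcome: [0, 1, 0] },
  { date: "2025-06-05", outcome: [0, 0, 1] },
];
console.log(walkForward(sample, uniform).brier.toFixed(3)); // prints 0.667
```

&lt;p&gt;Calling &lt;code&gt;update&lt;/code&gt; strictly after scoring is what keeps the test out-of-sample: swapping the two lines would leak each result into its own prediction.&lt;/p&gt;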

&lt;h2&gt;What I learned (and what I won't claim)&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;It is not state-of-the-art, and it does not beat the betting market.&lt;/strong&gt; A 61% hit rate also means ~2 in 5 matches surprise it, by design. Draws are genuinely the hardest outcome to predict, and an 8-game run to the title (three group games plus five knockout rounds in the 48-team format) is dominated by variance.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transparent baselines are underrated.&lt;/strong&gt; No deep learning, ~300 lines of plain Node, zero dependencies — and it still lands in the same ballpark as far fancier models for tournament-level questions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Calibration &amp;gt; accuracy.&lt;/strong&gt; Getting the &lt;em&gt;probabilities&lt;/em&gt; shaped right matters more than the headline hit rate, especially for a bracket simulation.&lt;/li&gt;
&lt;/ul&gt;
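
&lt;p&gt;One way to check calibration, as opposed to accuracy, is to bin the forecasts and compare the average predicted probability in each bin with how often the event actually happened. The sketch below is an illustration under assumed inputs: the function name, the record shape and the sample forecasts are all made up, and none of it comes from the repo.&lt;/p&gt;

```javascript
// Group forecasts into equal-width probability bins and compare
// the mean forecast in each bin with the observed frequency.
// Each forecast is { p: predicted probability, hit: true if the event occurred }.
function calibrationTable(forecasts, binCount) {
  const bins = [];
  for (let b = 0; b !== binCount; b++) bins.push({ n: 0, pSum: 0, hits: 0 });
  for (const f of forecasts) {
    // p = 1 would land one past the last bin, so clamp it.
    const index = Math.min(binCount - 1, Math.floor(f.p * binCount));
    bins[index].n += 1;
    bins[index].pSum += f.p;
    bins[index].hits += f.hit ? 1 : 0;
  }
  return bins
    .filter(function (bin) { return bin.n > 0; })
    .map(function (bin) {
      return { meanForecast: bin.pSum / bin.n, observed: bin.hits / bin.n, n: bin.n };
    });
}

// Made-up forecasts for illustration only.
const forecasts = [
  { p: 0.62, hit: true }, { p: 0.68, hit: true }, { p: 0.65, hit: false }, { p: 0.70, hit: true },
  { p: 0.30, hit: false }, { p: 0.35, hit: false }, { p: 0.32, hit: true }, { p: 0.38, hit: false },
];
console.table(calibrationTable(forecasts, 4));
```

&lt;p&gt;A well-calibrated model has &lt;code&gt;meanForecast&lt;/code&gt; close to &lt;code&gt;observed&lt;/code&gt; in every bin, which is what matters when thousands of simulated matches are chained into a bracket.&lt;/p&gt;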

&lt;h2&gt;Try it / see it live&lt;/h2&gt;

&lt;p&gt;Clone it and run the backtest yourself (Node 18+, no deps):&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/Hicruben/world-cup-2026-prediction-model.git
&lt;span class="nb"&gt;cd &lt;/span&gt;world-cup-2026-prediction-model
node backtest.mjs      &lt;span class="c"&gt;# reproduce the numbers&lt;/span&gt;
node predict.mjs spain germany
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The full 48-team tournament simulator (10k sims, live title odds, an interactive bracket) runs the same engine at &lt;strong&gt;&lt;a href="https://cup26matches.com" rel="noopener noreferrer"&gt;cup26matches.com&lt;/a&gt;&lt;/strong&gt;, and there's a plain-English write-up of the methodology and the backtest &lt;a href="https://cup26matches.com/en/methodology/" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;I'd genuinely love feedback on the modelling — the Dixon-Coles ρ, the home-field handling, the best-third tiebreaks. Tear it apart in the comments or open an issue. ⭐ the repo if it's useful!&lt;/p&gt;

</description>
      <category>opensource</category>
      <category>javascript</category>
      <category>datascience</category>
      <category>showdev</category>
    </item>
  </channel>
</rss>
