How likely is it for a football team to win a certain number of matches in a season? In this article, I use Python and binomial probability theory to estimate how likely each team in the 2024/2025 English Premier League season was to achieve its observed win count.
Project Goal:
- Calculate the number of games each team won.
- Calculate the cumulative binomial probability: this measures how consistent or likely the team's win total is, given its own win rate, using the binomial formula.
- Bonus: calculate the empirical probability, which estimates how likely a team is to win a match based on past outcomes.
Tools and Data
- Data source: Premier League standings from the football-data.org API.
- Language: Python
- Libraries used: requests, pandas, math, dotenv (for API key), and matplotlib for visualization.
Steps in the Code
Load the API Key
We keep our API key in a .env file for security and load it using dotenv.
Fetch EPL Standings
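Loading the key and requesting the league table could look roughly like the sketch below. It is not the article's exact code: the environment variable name `FOOTBALL_DATA_API_KEY` and the helper names are hypothetical, and the endpoint, the `X-Auth-Token` header, the `season` filter, and the `standings` / `table` / `won` response fields reflect my understanding of the football-data.org v4 API, so check them against the API documentation before relying on them.

```python
import os

# Assumed v4 endpoint for Premier League standings (verify against the API docs).
API_URL = "https://api.football-data.org/v4/competitions/PL/standings"


def load_api_key(var_name="FOOTBALL_DATA_API_KEY"):
    """Read the API key from the environment (var_name is a hypothetical name).

    If python-dotenv is installed, values from a local .env file are loaded first.
    """
    try:
        from dotenv import load_dotenv  # provided by the python-dotenv package
        load_dotenv()
    except ImportError:
        pass  # fall back to variables already set in the environment
    return os.environ.get(var_name)


def fetch_standings(api_key, season=2024):
    """Request the standings JSON for the season that starts in `season`."""
    import requests  # imported here so the parsing helper works without it

    response = requests.get(
        API_URL,
        headers={"X-Auth-Token": api_key},
        params={"season": season},
        timeout=10,
    )
    response.raise_for_status()
    return response.json()


def extract_wins(payload):
    """Return {team name: wins} from the overall ('TOTAL') table, if present."""
    for standing in payload.get("standings", []):
        if standing.get("type") == "TOTAL":
            return {row["team"]["name"]: row["won"] for row in standing["table"]}
    return {}
```

With a valid key, `extract_wins(fetch_standings(load_api_key()))` would give a plain dictionary of win counts that the later steps can work from.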
The code uses requests to pull the final 2024/25 league table. We extract team names and their total number of wins.
Calculate Win Probabilities
For each team, we:
Estimate their win rate:
n = 38
p = wins / 38
Use the binomial probability formula to compute the probability of winning at least that many matches:
sum(binomial_prob(n, k, p) for k in range(wins, 39))
This tells us how likely a team is to reach at least its final win count over 38 matches, assuming it wins each match independently with probability p.
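The calculation above can be sketched with the standard library alone. The name `binomial_prob` comes from the snippet in the article; `prob_at_least` is a hypothetical wrapper, and the win count of 25 in the usage line is only an illustrative input.

```python
from math import comb


def binomial_prob(n, k, p):
    """Probability of exactly k wins in n matches with per-match win probability p."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def prob_at_least(wins, n=38):
    """Probability of winning at least `wins` of n matches, with p = wins / n."""
    p = wins / n
    return sum(binomial_prob(n, k, p) for k in range(wins, n + 1))


# Illustrative input only: a team that won 25 of its 38 matches.
print(round(prob_at_least(25), 4))
```

Because p is estimated from the same win total it is then tested against, the result tends to sit a little above 0.5 for most teams, so it says more about the shape of the binomial distribution than about differences between teams.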
Build the Results Table
The final DataFrame includes:
- Team name
- Total wins
- Estimated win rate (in %)
- Binomial win probability (as decimal and percentage)
Teams are sorted by number of wins, from highest to lowest.
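A table with these columns could be assembled as in the sketch below. The column labels and the `build_results` helper are assumptions for illustration, not the article's exact names, and the input is expected to be a dictionary mapping team names to win counts.

```python
from math import comb

import pandas as pd


def build_results(wins_by_team, n=38):
    """Build a results table from {team name: wins}, sorted by wins (descending)."""
    rows = []
    for team, wins in wins_by_team.items():
        p = wins / n  # empirical win rate
        # Probability of at least `wins` wins in n matches at rate p.
        prob = sum(
            comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(wins, n + 1)
        )
        rows.append(
            {
                "Team": team,
                "Wins": wins,
                "Estimated win rate (%)": round(p * 100, 2),
                "Win probability": round(prob, 4),
                "Win probability (%)": round(prob * 100, 2),
            }
        )
    df = pd.DataFrame(rows)
    return df.sort_values("Wins", ascending=False).reset_index(drop=True)
```

Resetting the index after sorting keeps the row labels in line with the final ranking, which makes it easy to slice off the top teams later with `df.head(10)`.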
Visualize
Using matplotlib, the code plots the top 10 teams and their estimated win rates.
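One way to draw such a chart is sketched below. The article does not say which chart type it uses, so the horizontal bar chart, the labels, and the `plot_top_win_rates` helper are all assumptions. The function takes a dictionary mapping team names to win rates in percent.

```python
import matplotlib.pyplot as plt


def plot_top_win_rates(win_rates, top_n=10):
    """Draw a horizontal bar chart of the top_n teams by win rate (%)."""
    # Sort teams by win rate, highest first, and keep the first top_n.
    top = sorted(win_rates.items(), key=lambda item: item[1], reverse=True)[:top_n]
    teams = [team for team, _ in top]
    rates = [rate for _, rate in top]

    fig, ax = plt.subplots(figsize=(8, 5))
    ax.barh(teams, rates)
    ax.invert_yaxis()  # put the highest win rate at the top of the chart
    ax.set_xlabel("Estimated win rate (%)")
    ax.set_title("Top teams by estimated win rate")
    fig.tight_layout()
    return ax
```

Returning the axes instead of calling `plt.show()` inside the function leaves it to the caller to display the figure or save it with `ax.figure.savefig(...)`.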
Conclusion
By applying both binomial probability and empirical (frequentist) methods, we gain two complementary views of Premier League team performance in the 2024/25 season. The Win Probability (Binomial) evaluates how likely it is for a team to achieve at least its observed number of wins, based on its own success rate, while the Estimated Win Rate (%) gives a straightforward measure of how often a team won based on observed data.