DEV Community

MakenaKinyua

Calculating win probabilities of the EPL.

The English Premier League is about to resume for the next season and I hope all fans are ready for it! This is a simple experiment to calculate and visualize win probabilities as a Bernoulli distribution and a binomial distribution using Python.

The data for the 2024/2025 season was obtained from the Football Data Org API, an API for football data, and I used several Python libraries:

import requests 
import pandas as pd 
from scipy import stats 
import seaborn as sns

After fetching the data, I converted the JSON response into a pandas DataFrame for wrangling and visualization.
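As a rough sketch of that conversion step, the snippet below flattens a nested standings payload into a DataFrame with `pd.json_normalize`. The payload shape, the field names (`standings`, `table`, `playedGames`, `won`, `draw`, `lost`) and the team rows are assumptions made for illustration; the actual API response may be structured differently.

```python
import pandas as pd

# Hypothetical payload with placeholder teams and values. In the real workflow
# this dict would come from requests.get(...).json(); the URL and auth details
# are not shown in the post, so they are left out here.
payload = {
    "standings": [
        {
            "type": "TOTAL",
            "table": [
                {"position": 1, "team": {"name": "Team A"},
                 "playedGames": 38, "won": 25, "draw": 9, "lost": 4},
                {"position": 2, "team": {"name": "Team B"},
                 "playedGames": 38, "won": 15, "draw": 10, "lost": 13},
                {"position": 3, "team": {"name": "Team C"},
                 "playedGames": 38, "won": 4, "draw": 8, "lost": 26},
            ],
        }
    ]
}

# Flatten the nested records; the nested team name becomes the column "team.name".
table = pd.json_normalize(payload["standings"][0]["table"])

# Keep only the columns needed for the probability calculations.
table = table.rename(columns={"team.name": "team"})
table = table[["position", "team", "playedGames", "won", "draw", "lost"]]
print(table)
```

Normalizing first and selecting columns afterwards keeps the wrangling step independent of any extra fields the response might carry.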

1. Defining a function to calculate probabilities.
The function has two objectives: calculate each team's win, draw and loss probabilities, and calculate the binomial probability of a team winning the same number of games as it actually won.

Function to calculate Probability

i. The first part of the function
Calculates the win, loss and draw probabilities. For each outcome, it divides the number of games with that outcome by the total number of games played. This gives the probability of each outcome for an individual team at any point during the season. It can be likened to a Bernoulli distribution, which models a single trial with two outcomes: here, a win or no win.
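The first part could look roughly like the sketch below. It is not the original function (which is only shown as an image); the helper name `outcome_probabilities`, the column names and the sample rows are all assumptions for illustration.

```python
import pandas as pd


def outcome_probabilities(table: pd.DataFrame) -> pd.DataFrame:
    """Add win, draw and loss probabilities as outcome count / games played."""
    out = table.copy()
    for count_col, prob_col in [("won", "win_prob"),
                                ("draw", "draw_prob"),
                                ("lost", "loss_prob")]:
        out[prob_col] = out[count_col] / out["playedGames"]
    return out


# Placeholder rows (not real standings) to show the calculation.
sample = pd.DataFrame({
    "team": ["Team A", "Team B", "Team C"],
    "playedGames": [38, 38, 38],
    "won": [25, 15, 4],
    "draw": [9, 10, 8],
    "lost": [4, 13, 26],
})

probs = outcome_probabilities(sample)
print(probs[["team", "win_prob", "draw_prob", "loss_prob"]])
```

Because every game ends in exactly one of the three outcomes, the three probabilities for a team should sum to 1, which is a handy sanity check on the input data.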

ii. The second part of the function
Calculates the binomial probabilities using the scipy Python library. We use stats.binom.pmf, which takes the arguments (k, n, p), where:

k - number of successes which is number of games won
n - total games played
p - probability of a win

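The binomial probability is interpreted as the probability of the team winning exactly the same number of games next season. This assumes the same number of games is played and that each game is an independent trial with the same win probability p.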
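A minimal sketch of the binomial step, using the (k, n, p) arguments described above. The helper name `same_wins_probability` and the example counts are hypothetical; the closed-form expression is included only as a cross-check of what `stats.binom.pmf` computes.

```python
from math import comb

from scipy import stats


def same_wins_probability(won: int, played: int) -> float:
    """Probability of exactly `won` wins in `played` games, with p = won / played."""
    p = won / played  # win probability estimated from the season itself
    return float(stats.binom.pmf(won, played, p))


# Placeholder counts (not real standings): 25 wins from 38 games.
k, n = 25, 38
p = k / n

prob_scipy = same_wins_probability(k, n)

# Cross-check against the binomial formula C(n, k) * p^k * (1 - p)^(n - k).
prob_manual = comb(n, k) * p**k * (1 - p) ** (n - k)

print(prob_scipy, prob_manual)
```

Since p is estimated as k / n, this value is the probability of the single most likely win count under the model, so it is always well below 1 even for a very consistent team.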

2. Visualizing the results
From the results, I noticed that ranking the teams by the calculated probabilities changes their positions. To help me understand the analysis, I plotted both the win rate (in orange) and the binomial win probability (in blue).

Team Positions

Based on this, we see that Liverpool FC is most likely to be at the top of the table, followed by Manchester City FC and Chelsea FC. For the three bottom-most teams, the probability of winning the same number of games is higher than their win rate. These teams suffer the penalty of relegation to a lower competition.

Conclusion
Working on this was interesting and I learned a lot through trial and error. So much goes into predicting football outcomes, such as form, stage, players, etc. I can't wait to explore these variables for a more informed prediction. As for now, I stand with Manchester United FC.
