UNVEILING THE ODDS
In a recent assignment, I was asked to estimate next-season win probabilities for the teams in the most recent Premier League season. My approach, written in Python, uses the football-data.org API as a data-driven source of each team's match results. A single API call retrieves all the data this simple model needs. I focused on the 2023 season of the Premier League for this project.
The Python script uses an API key loaded from an environment file, so the key stays out of the source code. The core of the data retrieval is a GET request to the following URL, which fetches all matches from the specified league and season: https://api.football-data.org/v4/competitions/PL/matches?season=2023. The response contains every match needed for the calculation.
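The request described above might look roughly like the sketch below. It is not the original script: it assumes the standard-library urllib rather than whichever HTTP client the script actually used, and the function names build_request and fetch_matches are made up for illustration. football-data.org reads the key from the X-Auth-Token request header.

```python
import json
import urllib.request

# URL taken from the article: all Premier League matches for the 2023 season.
API_URL = "https://api.football-data.org/v4/competitions/PL/matches?season=2023"


def build_request(api_key, url=API_URL):
    """Build the GET request (hypothetical helper).

    football-data.org expects the API key in the X-Auth-Token header.
    """
    return urllib.request.Request(url, headers={"X-Auth-Token": api_key})


def fetch_matches(api_key):
    """Send the request and return the list stored under the "matches" key.

    Assumption: the response is a JSON object with a top-level "matches" list.
    """
    with urllib.request.urlopen(build_request(api_key), timeout=30) as response:
        payload = json.load(response)
    return payload.get("matches", [])
```

To use it, read the key from the environment (for example os.environ["FOOTBALL_DATA_API_KEY"], a variable name assumed here) and pass it to fetch_matches.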
THE CALCULATION METHODOLOGY
The approach is simple: take a team's total wins and the total number of matches it played in that season. The win percentage is the ratio of wins to matches played, expressed as a percentage. The Python script processes the data in the following steps:
- Setup and Data Retrieval: The API credentials are set up and a request is made to the football-data.org endpoint. The JSON response is parsed into a Python dictionary.
- Filtering for Final Results: The API provides data for matches in various statuses (e.g., scheduled, live, finished). Our analysis is based on past performance, so we iterate through the list of matches and only consider those with the status "FINISHED".
- Aggregating Team Statistics: We create an empty dictionary called teams to store our performance data. For each finished match, we identify the home team, the away team, and the winner. We then update our teams dictionary. For both the home and away team, we increment the played counter. Based on the winner field ("HOME_TEAM" or "AWAY_TEAM"), we increment the wins counter for the corresponding team. A drawn match counts as played for both teams but adds no win for either.
- Calculating and Displaying Probabilities: After iterating through all the finished matches, the teams dictionary contains the total number of games played and won for every team in the league. The final step is to loop through this dictionary and calculate the win probability using the formula:
- Win Probability = (Total Wins / Total Matches Played) × 100
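The steps above can be sketched as a single function. This is a minimal illustration, not the original script: the function name win_probabilities and the sample data are invented, and the field names (status, homeTeam.name, awayTeam.name, score.winner) are assumed to match the v4 match objects returned by the API.

```python
def win_probabilities(matches):
    """Return {team name: win percentage} from a list of match dictionaries."""
    teams = {}
    for match in matches:
        # Step 2: only finished matches count toward past performance.
        if match.get("status") != "FINISHED":
            continue

        home = match["homeTeam"]["name"]
        away = match["awayTeam"]["name"]
        winner = match["score"]["winner"]

        # Step 3: both sides have played one more match.
        for name in (home, away):
            stats = teams.setdefault(name, {"played": 0, "wins": 0})
            stats["played"] += 1

        # A draw (or any other value) adds no win for either team.
        if winner == "HOME_TEAM":
            teams[home]["wins"] += 1
        elif winner == "AWAY_TEAM":
            teams[away]["wins"] += 1

    # Step 4: wins divided by matches played, as a percentage.
    return {
        name: stats["wins"] / stats["played"] * 100
        for name, stats in teams.items()
    }


# Made-up sample data, for illustration only.
sample_matches = [
    {"status": "FINISHED", "homeTeam": {"name": "Team A"},
     "awayTeam": {"name": "Team B"}, "score": {"winner": "HOME_TEAM"}},
    {"status": "FINISHED", "homeTeam": {"name": "Team B"},
     "awayTeam": {"name": "Team A"}, "score": {"winner": "DRAW"}},
    {"status": "SCHEDULED", "homeTeam": {"name": "Team A"},
     "awayTeam": {"name": "Team B"}, "score": {"winner": None}},
]

for team, pct in sorted(win_probabilities(sample_matches).items()):
    print(f"{team}: {pct:.1f}%")
```

With this sample, Team A has one win from two finished matches and Team B has none, and the scheduled match is ignored.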
OBSERVATION/CONCLUSION
Although this method does not account for opponent strength, home or away advantage, or other factors, it provides a solid baseline for understanding team performance and is a quick, efficient way to summarize a team's historical success within a season.