This article is an English translation of an article I originally wrote in Brazilian Portuguese and posted here on my dev.to profile.
Initial Considerations
In this article we will use mathematical concepts such as Expected Value and Probability Distribution. If you are not familiar with these concepts, you should still be able to follow everything that is done in the article. If you want to learn more about them, I recommend the Khan Academy website, especially the modules on Probability Distribution and Average - Expected Value, which are short and clear videos about the concepts.
Introduction to the Project
In this project, we will use probability and statistics to predict the results of football matches. For this, we will use Python and its NumPy library.
We will proceed as follows. First, we will read a file containing the results of all AFC Ajax matches in the Dutch football league (Eredivisie) during the 18/19 season. Then, for each round, we will predict the score of the team's next match. This prediction consists of the Expected Value (EV) of goals scored by the club and the EV of goals conceded.
What Happens If We Guess Random Values?
For future reference, we will look at what would happen if we tried to guess the results of the matches using random values.
We will assume that Ajax can score from 0 (the minimum number of goals scored in a match in our data) to 8 goals (the maximum recorded in a match in our data), a total of 9 possible outcomes, and concede from 0 (the recorded minimum) to 6 goals (the recorded maximum), a total of 7 possibilities. To get a match score prediction right, we need both the right number of goals scored and the right number of goals conceded, so if we choose both values uniformly at random within these intervals we will have:
P(right score) = 1/9 * 1/7 = 1/63 ≈ 0.0159
Each team plays 34 matches in a Dutch league season. We will not make a prediction for the first round, as we have no previous data to base it on. So, with 33 matches to predict, we multiply 33 by the probability of a right match score, which gives an expected value of about 0.5238 right scores (33 * 1/63). This means that without mathematical tools, using random values, we expect to get the right score in fewer than one of the 33 matches analyzed. For the number of goals scored in a match, we have an expected value of 3.6667 right results (33 * 1/9), and for goals conceded 4.7143 (33 * 1/7).
So let's try to improve these values (which are very low) using math and programming.
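The baseline figures above can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes the two guesses are uniform and independent, and the variable names are illustrative, not part of the project code.

```python
from fractions import Fraction

# Ranges taken from the text: 0-8 goals scored, 0-6 goals conceded
scored_outcomes = 9
conceded_outcomes = 7
matches_predicted = 33  # 34 rounds minus the first one

p_scored = Fraction(1, scored_outcomes)
p_conceded = Fraction(1, conceded_outcomes)
# Assumption: both guesses are uniform and independent of each other
p_full_score = p_scored * p_conceded

expected_full = matches_predicted * p_full_score
expected_scored_hits = matches_predicted * p_scored
expected_conceded_hits = matches_predicted * p_conceded

print(f"P(right score) = {p_full_score} = {float(p_full_score):.4f}")
print(f"Expected right scores: {float(expected_full):.4f}")
print(f"Expected right goals scored: {float(expected_scored_hits):.4f}")
print(f"Expected right goals conceded: {float(expected_conceded_hits):.4f}")
```

Using Fraction keeps the probabilities exact until they are formatted for printing.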
Project Implementation
To create our project, we will first create our scores file. Each line of this file has a specific format:
goalsscored,goalsconceded
For example, if Ajax scored 4 goals and conceded 2 in a match we will have in the file:
4,2
This file will be named resultados.txt, and it is available in the project repository.
Now we are going to start the coding part of our project! We will begin by importing the necessary library.
import numpy as np
Then we will open our scores file.
# Opening the file with our scores
fileResults = open("resultados.txt", "r")
After opening the file, we will insert its contents into a list called matchesScores using a list comprehension, which is a concise way of creating lists in Python. With this tool, we can iterate over values and fill a list in a single line of code.
At the end of the iteration, we will close the file (resultados.txt) that was opened at the beginning of our code.
# Declaring our score list
matchesScores = []
# The for loop will work with one line of the file in each iteration
for lineofFile in fileResults:
    """
    The next line of code will add the contents of a file line,
    inside the brackets we have a list comprehension which
    does the exact same work as the following code:
    scores = []
    for x in lineofFile.split(","):
        scores.append(int(x))
    matchesScores.append(scores)
    """
    matchesScores.append([int(x) for x in lineofFile.split(",")])
# Then we will close our file
fileResults.close()
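As an alternative to calling close() by hand, the same reading step can be sketched with a with block, which closes the file automatically. The function name read_scores and the throwaway demo file below are assumptions made for illustration; they are not part of the original project.

```python
import os
import tempfile


def read_scores(path):
    """Return a list of [goals_scored, goals_conceded] pairs read from path."""
    scores = []
    # The with block closes the file automatically, even if parsing fails
    with open(path, "r") as file_results:
        for line in file_results:
            line = line.strip()
            if not line:  # skip blank lines, such as a trailing empty line
                continue
            scores.append([int(x) for x in line.split(",")])
    return scores


# Demonstration with a throwaway file holding made-up scores
with tempfile.NamedTemporaryFile("w", suffix=".txt", delete=False) as tmp:
    tmp.write("4,2\n1,0\n\n3,3\n")
    demo_path = tmp.name

demo_scores = read_scores(demo_path)
os.remove(demo_path)
print(demo_scores)  # [[4, 2], [1, 0], [3, 3]]
```

Skipping blank lines is a defensive choice: int("") raises a ValueError, so an empty line in the file would otherwise stop the program.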
Now we will start analyzing the data obtained. But first, we will initialize some variables that will store our formatted data.
# We will declare two lists, one containing the goals scored and one with the goals conceded
goals_scored = []
goals_conceded = []
# We will count how many times we got the full score, the goals scored and the goals conceded right
right_round = 0
right_goals_scored = 0
right_goals_conceded = 0
We will then iterate through the entire matchesScores list, separating the values it contains into goals scored and goals conceded, and then calculating the expected value of each of these categories to obtain a score prediction for the next round.
To do this, we will obtain the frequency of each number of goals, that is, how many times the team has scored 0 goals, 1 goal, 2 goals, and so on. We will do the same with the goals conceded. With the frequency of each number of goals, we will have the data to calculate our expected value.
For example, we can have a frequency like the one shown in the graph below (this is not the actual frequency of the data).
[Figure: example of what the frequency distribution could look like]
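To make the idea of a frequency concrete, here is a minimal sketch using np.unique on a made-up list of goals. The values in example_goals are invented for illustration and are not the real Ajax data.

```python
import numpy as np

# Made-up goals scored over six rounds (not the real Ajax data)
example_goals = [2, 0, 3, 2, 1, 2]

# np.unique returns the sorted distinct values and how often each one occurs
values, counts = np.unique(example_goals, return_counts=True)

# Convert NumPy scalars to plain ints for a tidy dictionary
frequency = {int(v): int(c) for v, c in zip(values, counts)}
print(frequency)  # {0: 1, 1: 1, 2: 3, 3: 1}

# Dividing each count by the number of rounds gives an empirical probability
probability = {goals: count / len(example_goals) for goals, count in frequency.items()}
print(probability)
```

The probabilities add up to 1, which is what lets us treat them as a probability distribution when computing the expected value.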
To define the goals scored and conceded we will code:
"""
We will go through our list of scores per round
and calculate the expected value of goals scored
and conceded for each round,
we will predict with these values and
then we will check if these values correspond
to the result that happened in the match.
"""
for round in range(len(matchesScores)):
    goals_scored.append(matchesScores[round][0])
    goals_conceded.append(matchesScores[round][1])
    # Now we will get the frequency of the number of goals scored so far
    num_goals, freq_num_goals = np.unique(goals_scored, return_counts=True)
    # For organizational reasons, we will transform our values into a dictionary 'goals': frequency
    dic_goals_scored = dict(zip(num_goals, freq_num_goals))
    # We will do the same with the goals conceded
    num_goals, freq_num_goals = np.unique(goals_conceded, return_counts=True)
    # For organizational reasons, we will transform our values into a dictionary 'goals': frequency
    dic_goals_conceded = dict(zip(num_goals, freq_num_goals))
After that, we will calculate the expected value of the goals, that is, the values that are expected in the next match considering the values of the previous rounds. To calculate this value, we multiply each number of goals in the dictionary by its probability of occurrence (its frequency divided by the number of rounds played so far) and add up the products, which gives us our expected values.
    expected_scored = 0
    for goal in dic_goals_scored:
        expected_scored += goal*(dic_goals_scored[goal]/len(goals_scored))
    expected_conceded = 0
    for goal in dic_goals_conceded:
        expected_conceded += goal*(dic_goals_conceded[goal]/len(goals_conceded))
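As a side note, an expected value computed from observed frequencies is the same number as the plain average of the observations. The sketch below checks this on made-up data (example_goals is invented, not the Ajax results) and shows the rounding step that turns the value into a whole-number prediction.

```python
import numpy as np

# Made-up goals scored over six rounds (not the real Ajax data)
example_goals = [2, 0, 3, 2, 1, 2]
values, counts = np.unique(example_goals, return_counts=True)

# Expected value: sum of (number of goals * probability of that number)
expected = float(np.sum(values * counts / len(example_goals)))

# The arithmetic mean of the raw observations gives the same result
mean = float(np.mean(example_goals))
print(expected, mean)

# Rounding to the nearest integer gives the predicted number of goals
prediction = int(np.around(expected))
print(prediction)  # 2

# Note: np.around rounds exact halves to the nearest even value
print(np.around(2.5))  # 2.0
```

In other words, the prediction for the next round is the rounded average of the goals observed so far.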
After calculating our expected values, we will print our prediction and compare it with the result of the next round to see if we got the result of the match, the number of goals scored and the number of goals conceded right with our prediction.
    # After calculating our prediction we will print it and compare to the real result
    # The next lines will round our values to the closest integer
    expected_scored = int(np.around(expected_scored))
    expected_conceded = int(np.around(expected_conceded))
    """
    If we are in the last round we have no future round
    to predict so we will stop our iteration
    """
    if round+1 == len(matchesScores):
        break
    """
    Now we will print our expected value for the next round
    as lists start at number 0 we have to add
    1 to the round value to get the round currently being read,
    that is, we have to add 2 to the number of the `round`
    to get the value of the NEXT round.
    """
    print(f'At the {round+2} round we predicted a result of Ajax {expected_scored} x {expected_conceded} opponent')
    print(f'At the {round+2} we got a result of Ajax {matchesScores[round+1][0]} x {matchesScores[round+1][1]} opponent')
    # We will check the results
    if expected_scored == matchesScores[round+1][0] and expected_conceded == matchesScores[round+1][1]:
        right_round += 1
    if expected_scored == matchesScores[round+1][0]:
        right_goals_scored += 1
    if expected_conceded == matchesScores[round+1][1]:
        right_goals_conceded += 1
After the loop execution, we will check our number of right guesses.
# We will print the results
print("We got {0:1d} of the matches results right, this is, {1:2.2f}%".format(right_round, (right_round/33)*100))
print("We got {0:1d} of the goals scored in a match right, this is, {1:2.2f}%".format(right_goals_scored, (right_goals_scored/33)*100))
print("We got {0:1d} of the goals conceded in a match right, this is, {1:2.2f}%".format(right_goals_conceded, (right_goals_conceded/33)*100))
The output of our program will look like this
> At the 2 round we predicted a result of Ajax 1 x 1 opponent
> At the 2 we got a result of Ajax 1 x 0 opponent
...
> At the 34 round we predicted a result of Ajax 3 x 1 opponent
> At the 34 we got a result of Ajax 4 x 1 opponent
> We got 4 of the matches results right, this is, 12.12%
> We got 7 of the goals scored in a match right, this is, 21.21%
> We got 15 of the goals conceded in a match right, this is, 45.45%
Note that we got 4 complete match scores right, about 8 times more than expected with random values (0.5238); 7 predictions of goals scored, about 2 times more (3.6667); and 15 predictions of goals conceded, about 3 times more (4.7143).
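The comparison with the random baseline can be checked with a short calculation. This sketch only uses numbers already given in the article (the hit counts from the program output and the random-guess expected values); the dictionary names are illustrative.

```python
# Expected number of right guesses with random values (33 predicted matches)
baseline = {
    "match score": 33 / 63,
    "goals scored": 33 / 9,
    "goals conceded": 33 / 7,
}

# Right guesses reported in the program output above
hits = {
    "match score": 4,
    "goals scored": 7,
    "goals conceded": 15,
}

# How many times better the expected-value approach did than random guessing
ratios = {name: hits[name] / baseline[name] for name in hits}
for name, ratio in ratios.items():
    print(f"{name}: {ratio:.2f} times the random baseline")
```

The exact ratios are about 7.64, 1.91 and 3.18, which round to the 8, 2 and 3 quoted in the text.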
The use of expected values helped a lot to improve our number of correct guesses. This shows how powerful simple concepts of probability and statistics can be in data analysis.
The program developed in this article is available in my GitLab repository. I hope I have helped you in some way. If you have any problems or questions, feel free to leave a comment on this post or send me an email ;).