Introduction
At some point in time, you have probably heard that your chances of winning a lottery are very slim. As with all things related to probability, multiple trials might tilt an outcome in your favor. Now, if you were to participate in many lotteries, your chances of winning one would be a bit better, depending on how many more lotteries you participated in. This is still by no means a guarantee that you will eventually win, but with uniform distribution, and going by the law of large numbers (in this case meaning a large number of lotteries), we can arrive at relatively more likely possibilities.
It's important to understand that each new lottery is independent of any other, and the same lottery "ticket number" could win many different lotteries (following the law of large numbers). You could also be unlucky and pick the wrong number in every lottery, regardless of how many times you tried. You have two options now:
- You can try a random number every time.
- You can try the same number every time.
Theoretically (and mathematically), both scenarios have an equal likelihood of occurring. However, scenario 2 will give you a slight edge. As the number of times approaches infinity, every number will eventually be selected. The problem is that with scenario 1, you need to try more times with the hope that the number you've picked at the time matches the number that wins. With scenario 2, you're sure that as the trials tend to infinity, your number will at some point "win". For this blog post, we will use scenario 2.
So, do you think you can answer this question before I tell you the answer?
"If all lotteries around you had slots for exactly 1 million people and you picked the same ticket [x] for everyone you played, how many lotteries would you have to play to finally be a winner?" (Feel free to comment on what your initial answer was)
About 14.4 million times.
The rest of this blog post will be about how I arrived at that value, how the simulations were done, and some caveats. Things will get more technical from here.
The Logic
The ticket numbers of a lottery of 1 million people would range from 1 - 1,000,000 (or 0 - 999,999). Players can only pick a number within that range for each lottery, and the winning ticket can only be from that range. Essentially, we can say we will have a set of 1 million numbers.
Accounting for the fact that a user can pick any number within that range, we need to satisfy the condition of every item in the set being hit at least once. This is because if every number has been called at least once, it would cover any possible ticket number a player could have picked. This also means we do not care about how many times each number runs, making a "set" the ideal Python data structure to use for our simulation. We will start with an empty set, and populate it with a randomly generated number at each iteration until the set contains every number within the specified range. Since Python sets do not repeat numbers, we do not have to worry about ensuring uniqueness.
def calculate_lottery_chances(lottery_players_count):
number_set = set()
count = 0
while len(number_set) < lottery_players_count:
gen_number = random.randint(1, lottery_players_count)
number_set.add(gen_number)
count += 1
return count
For a lottery of 1,000,000 people, the function call would look like: calculate_lottery_chances(1000000)
, and it would return the number of lottery attempts before winning. Arranging the code in this manner makes it very extendable.
The Problem
In a nutshell, the root cause of the problem is "variation". The first time I ran the function, I got "13.1 million" times as my value. I reran it, and got something along the lines of 13.9 million. I did this even more times and got widely varying answers -at some point, I got 15 million. It was clear that I would need to do this and find an average. Following the existing pattern so far, I figured that as the number of iterations to average it by tended towards infinity, I would be closer to having one reliable answer. There was a need for something that could do this, and do it fast, and that led me to write this function:
def average_over_n_times(function, function_arg, n):
"""
This returns the average of the returned value of a function
when it is called n times, with its (one) arg
"""
total = 0
for x in range(0, n):
total += function(function_arg)
return round(total/n)
Subsequently, everything would then be patched up as:
num_of_trials = average_over_n_times(calculate_lottery_chances, lottery_players_count, n)
Where "n" would represent the number of times to average results by. This, however, brings up another problem that will be discussed in the next section.
What should "n" be?
The larger the value of n, the closer to an "average-case" result. However, considering that there are still no absolutes or certainties, performing this series of tasks too many times stops being productive. I say this for the following reasons:
- Time is not infinite, and we cannot perform these calculations indefinitely, meaning that there will always be a variation (no matter how little) each time it is run, defeating the idea of an "absolute".
- Computational resources are finite.
- One of the assumptions of this experiment is that "randomness" as generated by computers can accurately mimic reality.
- Just as with algorithm runtimes, smaller magnitudes stop being as important as the greater ones. A variation of about 100,000 wouldn't be as significant when dealing with values greater than 13,000,000.
Keeping these in mind, I tested "n" with the values: 10, 20, 30, 50, 100, 1000, and 5000 times.
Where does PyTorch come in?
At this point, you're probably wondering why the word "PyTorch" in the title of the blog post hasn't even been mentioned. Well, although I mentioned testing n with different values, it wasn't the same code I used for all the tests.
These were computationally heavy experiments, and my CPU had a word with me. The code snippets I shared earlier were written in one file that had zero external package dependencies, and the file was run in the bash shell with the time
command prepended to track execution times. Here's what the execution times looked like when using just the CPU:
n | Time (min and sec) |
---|---|
10 | 1m34.494s |
20 | 3m2.591s |
30 | 5m19.903s |
50 | 10m58.844s |
100 | 14m56.157s |
At 1000, I couldn't get the program to work anymore. I wasn't sure if it broke halfway and failed to stop the execution, but I canceled it after 4 hours and 57 minutes. There are a few factors I feel influenced this, which I will discuss in the "caveats" section. Anyway, my fan noise was blaring, and I knew I may have pushed my laptop's modestly powered CPU a bit too much. I refused to accept defeat, and while thinking of what I could do to at least run 4-digit iterations, I remembered something a friend of mine who worked with PyTorch told me:
"GPUs are generally more efficient at computationally intensive than CPUs"
PyTorch uses the GPU, making it the perfect tool for the job.
Refactoring
PyTorch would be used for calculations for our purposes, so refactoring the existing calculate_lottery_chances()
code would mean changing CPU-reliant numerical operations and switching to suitable PyTorch data structures. In a nutshell:
- The Python
set()
data type would no longer suffice. - The Python
randint()
function would be swapped for its PyTorch equivalent. - Seeing as the
set()
data type would be insufficient, there would be a switch to generating a tensor of zeros that matches the size of thelottery_players_count
, with a boolean to indicate whether or not a number had previously won.
The refactor of calculate_lottery_chances
would look like:
import torch
def calculate_lottery_chances(lottery_players_count):
seen_numbers = torch.zeros(lottery_players_count, dtype=torch.bool, device='xpu')
batch_size = 10000
count = 0
while not torch.all(seen_numbers):
# Generate a batch of random numbers
random_numbers = torch.randint(
0, lottery_players_count,
(batch_size,),
device='xpu'
)
seen_numbers[random_numbers] = True
count += batch_size
return count
I set my device as "xpu" because my computer uses an Intel Graphics GPU, which PyTorch supports.
Output
To ensure my GPU was used during execution, I opened my Windows task manager and navigated to the "performance" section before running. On running, I saw a noticeable spike in GPU resource usage.
For context, here is a before vs after:
Before:
After:
Notice the GPU usage is at 49%
For the runtimes for varying values of n, the GPU was several times faster. It ran values of n below 100 consistently at less than a minute, and was able to calculate for a value of n at 5000 (five thousand!)
Here's a table of runtimes using the GPU:
n | Time (min and sec) |
---|---|
10 | 0m13.920s |
20 | 0m18.797s |
30 | 0m24.749s |
50 | 0m34.076s |
100 | 1m12.726s |
1000 | 16m9.831s |
To have a visual sense of how large the performance gap between the GPU and CPU operations was for this experiment, here's a data visualization to think about:
The x-axis was capped at 100 because I could no longer get realistically "timely" output from the CPU, thus leaving no room for comparison with the GPU. Performing the experiments with numbers within the range of 1000 - 5000 gave me about "14.4 million times" as a result, more often than not. That's how I got the answer from earlier.
Caveats
This experiment made assumptions and relied on certain ways of doing things. Additionally, my inexperience with PyTorch potentially means there may have been a more efficient approach. Here are some factors to consider that may have influenced either the accuracy of my findings or the execution times:
- I made a subtle assumption that computer-generated randomness mimics randomness in real life (the physical world).
- While I switched a bit of the logic to use PyTorch, the rest of the code still relied on the CPU. For instance, in the
average_over_n_times()
function, it's possible that both the addition in the loop and the averaging may have benefitted from PyTorch equivalents. I suspect there would have been a performance boost. - I'm unsure of the impact of the
batch_size
I used on accuracy and performance. - All CPU and GPU tests were done with my PC plugged in, to allow the machine to work at its best. Running them with a device on battery power may see longer runtimes.
- PyTorch's CUDA might have the edge over "XPU", but my PC doesn't have support for the former.
- I avoided letting my PC "sleep" during the tests. Tests may potentially take longer to run if your computer sleeps.
Finally, I'd like to point out that this was my first time using PyTorch for anything, and I was quite impressed with the performance.
Conclusion
When I went down the rabbit hole with this, I didn't expect to see such gains in performance. I learned the idea behind tensors and a few things about the supporting mechanisms behind even more computationally complex tasks. You have the freedom to use, replicate, or modify the code snippets as you please.
Thank you for indulging me, and I hope you had a fun read.
Till next time,
Cheers. 🥂
Top comments (4)
I enjoyed how you walked me through every single thing you did. This is beautiful.
I think you excel at articles like these. Keep it coming.
Thank you!
I'm glad you enjoyed it.
These insights are amazing!
Well done🙌🏼🙌🏼
Thank you, Fayo! 🙌🏾