
Ayane Satomi

Posted on Oct 11, 2020 • Edited on Feb 18, 2022

Mapping out Gacha Pull Probabilities using Python and Google Colaboratory

#gaming #colab #python #jupyter

COVID-19 boredom brought us some good things. Some of you may have picked up applications you had abandoned for lack of time, and some of you are probably giving your cats a well-earned petting. I spent one day of my quarantine doing what most otakus with computer science degrees do best: using our knowledge of programming and technology to "cheat" Gacha pulls.

First of all, I do not condone cheating. The idea here is that a group of people were building a statistical model of Gacha probabilities, which tend to come from a very predictable RNG. I got interested, so I decided to hop in on the fun.

Background

Gacha originates from Gacha vending machines, where you usually drop in a hundred yen to get a capsule with a cute thing inside. However, one out of a thousand of the capsules inside that machine holds an exceedingly rare item.

Fast forward to modern gaming: the gacha mechanic is used by a lot of games, especially anime-themed ones like Arknights.

Gacha games usually employ an index of items to check and a Pseudo-Random Number Generator (PRNG), seeded with either your User ID or, in more extreme cases, the server time. On top of that, Gacha RNG has a "drop rate", the probability that you get your preferred item within N drops. Testing this thoroughly by hand involves a lot of money, so why not use our knowledge of data science to figure out the distribution instead of wasting buckets of money?
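To make the "drop rate over N drops" idea concrete, here is a minimal sketch. It assumes a flat rate with no pity system, and it uses Python's built-in `random.Random` (a Mersenne Twister) as a stand-in for whatever PRNG a real game uses. The function names and the 2% rate are made up for illustration.

```python
import random


def chance_of_at_least_one(rate, pulls):
    """Probability of one or more hits in `pulls` independent rolls at a flat `rate`."""
    return 1.0 - (1.0 - rate) ** pulls


def seeded_rolls(seed, rate, pulls):
    """Simulate `pulls` rolls from a PRNG seeded with a fixed value (e.g. a user ID).

    Assumption: random.Random stands in for the game's actual PRNG.
    """
    rng = random.Random(seed)
    return [rng.random() < rate for _ in range(pulls)]


if __name__ == "__main__":
    # Hypothetical 2% drop rate over 50 pulls.
    print(round(chance_of_at_least_one(0.02, 50), 4))
    # The same seed always reproduces the same sequence of hits and misses.
    print(seeded_rolls(1234, 0.02, 50) == seeded_rolls(1234, 0.02, 50))
```

The second function is the reason a seed matters at all: with a fixed seed, the whole sequence of rolls is decided in advance.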

Starting out

The entire project was started by an acquaintance who goes by KaidenFrizu, with one simple goal: they wanted to figure out how many rolls it would take to get the desired result N times.

This had been going on for a while. I only joined the party because I believed the RNG algorithm also affects the outcome of a roll (not all RNGs are created equal: some of them use bit shifts and some of them use cryptographically secure algorithms).

After a while, my involvement became more hands-on as I started to port Python code written by another member of the discussion who goes by the name "Eyenine". Together with the resident statistician of the group, SurfChu85, we implemented a Jupyter notebook that maps out the probability of a roll over a specific number of iterations.

The implementation

The implementation focuses on the Subtractive PRNG, the same PRNG used by C# for its Random API, and on a Numba-optimized loop function that does the following:
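For readers who have not met a subtractive generator before, here is a rough sketch of the general technique: each new value is the difference of two earlier values, taken modulo M. This is not the notebook's code and it is not bit-compatible with C#'s Random. The class name, the modulus, and the LCG-based seeding are all assumptions chosen to keep the example short.

```python
class SubtractiveRNG:
    """Illustrative subtractive lagged-Fibonacci generator.

    Recurrence: x[n] = (x[n-55] - x[n-24]) mod M.
    Assumption: the seeding below (a simple LCG) is a placeholder, so the
    output will NOT match .NET's System.Random for the same seed.
    """

    M = 2**31
    LONG_LAG = 55
    SHORT_LAG = 24

    def __init__(self, seed):
        # Fill the 55-entry state table from the seed with a basic LCG.
        self.state = []
        value = seed % self.M
        for _ in range(self.LONG_LAG):
            value = (value * 1103515245 + 12345) % self.M
            self.state.append(value)
        self.index = 0  # position of the oldest value, x[n-55]

    def next_int(self):
        i = self.index
        # x[n-24] sits 31 slots after x[n-55] in the circular buffer.
        j = (i + self.LONG_LAG - self.SHORT_LAG) % self.LONG_LAG
        value = (self.state[i] - self.state[j]) % self.M
        self.state[i] = value  # overwrite the oldest value with the new one
        self.index = (i + 1) % self.LONG_LAG
        return value

    def random(self):
        """Return a float in [0, 1)."""
        return self.next_int() / self.M
```

A roll against a drop rate is then just `rng.random() < rate`, the same comparison you would make with any other PRNG.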

For a drop rate X, N iterations, and a sample size Y:
- Iterate until N is reached, while...
- ...an inner iterator runs until sample size Y is reached.
- Keep iterating on attempts, comparing each RNG output against the rate to find a hit. If a hit is found, increment the rate and start over until the maximum increment range is reached.
- After one iteration of N is complete, save the output to an iterable and repeat the loop until N is reached.

This algorithm was designed to match the Arknights RNG system, which includes a "pity" system that increases the rate once the RNG has failed to produce the target drop over a specified number of rolls.

You can view the entire implementation on GitHub:
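The pity idea can be sketched as a small Monte Carlo simulation. This is a simplified stand-in, not the notebook's implementation: it uses Python's built-in `random` module instead of the Subtractive PRNG, skips Numba entirely, and the numbers (2% base rate, pity kicking in after 50 pulls, +2% per pull after that) are assumptions picked for illustration.

```python
import random
from collections import Counter


def pulls_until_hit(rng, base_rate=0.02, pity_start=50, pity_step=0.02):
    """Return how many pulls it took to land the first hit.

    Assumed pity rule: pulls 1..pity_start use base_rate, and every pull
    after that adds pity_step to the rate, capped at 100%.
    """
    pull = 0
    while True:
        pull += 1
        rate = min(1.0, base_rate + pity_step * max(0, pull - pity_start))
        if rng.random() < rate:
            return pull


def simulate(samples, seed=0, **kwargs):
    """Histogram of pulls-to-first-hit over `samples` independent runs."""
    rng = random.Random(seed)  # fixed seed, so the run is reproducible
    return Counter(pulls_until_hit(rng, **kwargs) for _ in range(samples))


if __name__ == "__main__":
    histogram = simulate(10_000, seed=2020)
    # Under the assumed numbers the rate reaches 100% around the 99th pull.
    print("longest dry streak:", max(histogram))
    print("share decided by pull 50:",
          sum(n for pulls, n in histogram.items() if pulls <= 50) / 10_000)
```

Plotting `histogram` gives the kind of distribution curve the notebook maps out. Swapping `random.Random(seed)` for another generator with a `random()` method lets you compare PRNGs with the same loop.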

sr229 / gacha-prng

A Jupyter Notebook that maps the probability distribution of the Arknights RNG for Gacha rolls.

You can open the notebook in Google Colaboratory. It should work with Binder as well if you're not into using Google services.

The results

From the output of the Jupyter notebook, we concluded the following:

Fixed UID as seeds

There is a much steeper curve with fixed seeds (that is, seeds derived from UIDs).

Therefore, from what I have analyzed, fixed UID seeds will require more pulls to get a desired drop, and only at the 51st pull will you see the probability increase.

Time as seed

One of the most interesting simulations we ran tested a theory that a variable seed would yield a much wider graph, which would imply that with an ever-increasing seed the distribution also gets higher as time goes on.

Again, because of how Arknights' pity system modifies the Gacha algorithm, the pity system would be overkill here: the probability would increase at lower roll counts and, uniformly, at higher roll counts as well. This gives some insight into why time-based seeds are not used very often in Gacha games.

Conclusion

Based on our discovery, I can only offer one TL;DR: prepare to simp even harder, because you'll need to roll even more.

However, from an experience standpoint, it really shows what we can do with simple statistics and data science to solve a literal "million-dollar question". Such methods have been used to solve more practical problems, but I can now rest well knowing we just helped an entire Gacha community plan properly for the money drain we call Gacha rolls.