Salmon Run Statistics: Matchmaking (Part 4)

#showdev #python

Hi y'all.
I'm back with a new post on Salmon Run, this time about matchmaking.

You see that big graph at the top?
That's a scatter plot, with the player's rank on the x-axis and the hazard level of the match on the y-axis.
I then fitted a linear regression line to the data.
One important detail about this graph: I treat each title as being worth 100 points, as that's how many points you need to move from one title to the next.

With this graph, the equation of the line is:

y = 0.111x + 50.691
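
To make the line a bit more concrete, here is a minimal sketch that plugs a rank into it. The names `rank_points` and `predicted_hazard` are made up for illustration, and the title index is assumed to be a small integer that goes up by one per title. This only reads a value off the trend line; it says nothing about how the game itself computes hazard level.

```python
# Slope and intercept copied from the fitted line above.
SLOPE = 0.111
INTERCEPT = 50.691


def rank_points(title_index: int, title_exp: int) -> int:
    """Combine a title index and the points within that title into one rank.

    Each title is treated as worth 100 points, as in the graph.
    (Hypothetical helper, not part of the original script.)
    """
    return title_index * 100 + title_exp


def predicted_hazard(rank: float) -> float:
    """Hazard level that the regression line gives for a rank."""
    return SLOPE * rank + INTERCEPT


if __name__ == "__main__":
    rank = rank_points(4, 50)  # title index 4 with 50 points -> rank 450
    print(rank, round(predicted_hazard(rank), 3))
```

Going by the slope, every extra 100 points of rank (one full title) moves the line up by roughly 11 points of hazard level.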

As for how I made this graph:
the code lives in the scripts/rank_and_danger_rate.py file.

data = core.init("All")
x = []
y = []

First we prepare the dataset, and create empty lists to store the x and y values.

with gzip.open(data) as reader:
    for job in jsonlines.Reader(reader, ujson.loads):
        if "title" in job and "title_exp" in job and job["title"] is not None:
            x.append(job["title"]["splatnet"] * 100 + job["title_exp"])
            y.append(float(job["danger_rate"]))

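Here we collect what we're interested in from the dataset. For every job that has a title, the rank (title index times 100, plus the points within that title) goes into x, and the hazard level goes into y.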
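
If you want to try this step without the real dataset, here is a rough, self-contained sketch. It uses only the standard library (`gzip` and `json` in place of `jsonlines` and `ujson`), and the records in `SAMPLE_JOBS` are invented for illustration; they only carry the three fields the script touches. `make_sample_file` and `collect_points` are hypothetical helper names.

```python
import gzip
import io
import json

# Made-up records for illustration only; the real data has many more fields.
SAMPLE_JOBS = [
    {"title": {"splatnet": 4}, "title_exp": 50, "danger_rate": "100.4"},
    {"title": None, "title_exp": None, "danger_rate": "20.0"},  # skipped: title is None
    {"title": {"splatnet": 5}, "title_exp": 120, "danger_rate": "160.2"},
    {"danger_rate": "35.0"},  # skipped: no title fields at all
]


def make_sample_file() -> io.BytesIO:
    """Build an in-memory gzip file with one JSON object per line."""
    raw = "\n".join(json.dumps(job) for job in SAMPLE_JOBS).encode("utf-8")
    return io.BytesIO(gzip.compress(raw))


def collect_points(fileobj):
    """Return (x, y) lists using the same filter and rank formula as the script."""
    x, y = [], []
    with gzip.open(fileobj, "rt", encoding="utf-8") as reader:
        for line in reader:
            if not line.strip():
                continue  # ignore blank lines
            job = json.loads(line)
            if "title" in job and "title_exp" in job and job["title"] is not None:
                x.append(job["title"]["splatnet"] * 100 + job["title_exp"])
                y.append(float(job["danger_rate"]))
    return x, y


if __name__ == "__main__":
    xs, ys = collect_points(make_sample_file())
    print(xs, ys)
```

Only the two records that have a title end up in the lists, which is the same filtering the real script applies.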

plt.scatter(x, y)
plt.xlabel("Rank")
plt.ylabel("Hazard Level")

Now we prepare the scatter plot.

m, b = numpy.polyfit(x, y, 1)
# Evaluate the line at the sorted x values so each y stays paired with its x.
x_sorted = sorted(x)
y_equation = [x_val * m + b for x_val in x_sorted]
print("y = {:.3f}x + {:.3f}".format(m, b))
plt.plot(x_sorted, y_equation)
plt.show()

Create and plot the linear regression, and show the graph.
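
For anyone curious what a degree-1 fit works out, here is a small pure-Python sketch of the closed-form least-squares line. It is not what the script runs (the script uses `numpy.polyfit`), and `fit_line` and `line_points` are hypothetical names, but on the same data it should give the same slope and intercept up to floating-point error.

```python
def fit_line(x, y):
    """Return (m, b) for the least-squares line y = m * x + b."""
    n = len(x)
    if n < 2 or n != len(y):
        raise ValueError("need two or more points, and x and y of equal length")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    # Sum of squared deviations of x, and of x-y co-deviations.
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    if sxx == 0:
        raise ValueError("all x values are identical; the slope is undefined")
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    m = sxy / sxx
    b = mean_y - m * mean_x
    return m, b


def line_points(x, m, b):
    """Evaluate the line at the sorted x values, ready to pass to a plot call."""
    xs = sorted(x)
    return xs, [m * x_val + b for x_val in xs]


if __name__ == "__main__":
    # Three points that lie exactly on y = 2x + 1.
    m, b = fit_line([3, 1, 2], [7, 3, 5])
    print("y = {:.3f}x + {:.3f}".format(m, b))
    print(line_points([3, 1, 2], m, b))
```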

Top comments (5)

sunmarsh • Jul 23 '21

First off- this is all super cool! I don't know anything about coding, but I'm working on a book all about Salmon Run and so I googled my way to your page. I have a few questions for you:

Where is this dataset from? Is this all of the salmon run matches held between a certain timeframe? Is it all users or a subset of users?

In regard to this specific set of statistics (Matchmaking), I'm interested in whether certain titles are allowed to match with each other. Looking at your data, it seems like the answer is no, otherwise we would have seen a larger range in hazard levels at lower ranks, since hazard level is determined by the average rank of the team. Do you have any insight in regard to this?

Cassandra de la Cruz-Munoz • Jul 23 '21

I had another thought. I wonder if there's a direct relationship between the average lobby rank and the hazard level, or if there's a little fuzz factor. That wouldn't be too hard to test, but I would need 3 other players to test it with, since I'd need to know everyone's title and points.

sunmarsh • Jul 23 '21

Well, we know (thanks to datamining, I assume) that the average lobby rank is related to hazard level according to this table: splatoonwiki.org/wiki/Salmon_Run_d...

What I'm trying to figure out, is how to read the table/understand which title to use to determine what the hazard level will be. Based on the hazard levels I've gotten when playing online, I know that I'm being matched with players who are much lower than me. But as you said, because I don't know their titles/points, I don't know how low.

It's also not clear from the table how to perform the mathematical 'average' of users between ranks. What is the true value of someone at 50 Overachiever vs 50 Part-Timer? Is it 50 in each case?

Cassandra de la Cruz-Munoz • Jul 23 '21

That's something I'd need to collect data for.

Presumably there's a linear relationship between total party rank and hazard level. By knowing the rank of all four party members, across different titles, and getting the hazard level, for a few different combinations of ranks and hazard level, the exact properties of that relationship could be discovered.

Cassandra de la Cruz-Munoz • Jul 23 '21

So this dataset is of Salmon Run results uploaded to Stat.Ink, a fan-made site for tracking Splatoon and Splatoon 2 game results. Players have to manually upload their matches to the website using a 3rd party tool. As such, this should be considered a sample of Salmon Run games played, not the whole population. It should not be considered a random sample either, due to the self-selection bias.

My impression from looking at the data is that by default, matchmaking tries to pair you with people with the same title as you. The exception to that is if you have a mixed title lobby of players joining each other, then it fills remaining slots with players whose titles fit the average rank of the players already in the lobby.