~~~ This post was originally published on my blog ~~~
I’ve started watching a lot of chess streams during the lockdown. The quartet of Anish Giri, Vidit, Sagar Shah and Samay Raina has rekindled my interest in chess, and while my low rating on chess.com clearly indicates I haven’t progressed much, I do enjoy a 5-minute chess game when I can.
India toured Australia recently, and the battle was a joy to behold. Never had I been so hooked on Test cricket as I was during the series.
If your question is “What do I do with these two pieces of trivia about how you spent your lockdown?”, I would say that’s a fair question. My newfound interests reminded me of a side project I had worked on in 2019.
In 2019, I had recently joined the platform engineering team at Freshworks and the Cricket World Cup had just begun. We played a game within the team; the aim was to see who got the most match predictions right.
As developers do, our first step when faced with any challenge was to open a Jira ticket... I mean, open Google Sheets. The sheet had a column for each participant and a row for each match, along with its venue.
Was this a low (zero) stakes game to hide the fact that we had a gambling problem? Maybe.
Inspired by the data folks at FiveThirtyEight, https://fivethirtyeight.com/methodology/how-our-nfl-predictions-work/ , I wanted to predict World Cup match results based on Elo ratings. The Elo rating system is widely used in chess. It was invented by Arpad Elo, a chess player himself, who wanted a better way to rate players than the system that was prevalent at the time.
A player's Elo rating is represented by a number which may change depending on the outcome of rated games played. After every game, the winning player takes points from the losing one. The difference between the ratings of the winner and loser determines the total number of points gained or lost after a game. If the high-rated player wins, then only a few rating points will be taken from the low-rated player. However, if the lower-rated player scores an upset win, many rating points will be transferred. The lower-rated player will also gain a few points from the higher rated player in the event of a draw. This means that this rating system is self-correcting. Players whose ratings are too low or too high should, in the long run, do better or worse correspondingly than the rating system predicts and thus gain or lose rating points until the ratings reflect their true playing strength.
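The point transfer described above can be sketched in a few lines of plain Ruby. This is a minimal illustration of the standard Elo formula, not the code from the project or the internals of any gem; the method names `expected_score` and `update_ratings`, the example ratings, and the K-factor of 32 are all assumptions picked for the example.

```ruby
# Expected score of a player rated `rating` against one rated `opponent`.
# Standard Elo logistic curve: a 400-point lead means roughly 10:1 odds.
def expected_score(rating, opponent)
  1.0 / (1.0 + 10.0**((opponent - rating) / 400.0))
end

# Returns the new [rating_a, rating_b] after one game.
# score_a is 1.0 if A wins, 0.5 for a draw, 0.0 if A loses.
# k (the K-factor) caps how far one game can move a rating; 32 is an assumed value.
def update_ratings(rating_a, rating_b, score_a, k: 32)
  delta = k * (score_a - expected_score(rating_a, rating_b))
  # Whatever A gains, B loses, so the total number of points is unchanged.
  [rating_a + delta, rating_b - delta]
end

favourite_wins = update_ratings(1800, 1600, 1.0)
upset          = update_ratings(1800, 1600, 0.0)
draw           = update_ratings(1800, 1600, 0.5)
```

With these made-up numbers, the 1800-rated favourite gains only about 7.7 points for a win but loses about 24.3 for an upset, and a draw moves about 8.3 points to the lower-rated player.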
Scraping Historical Data
A lot of Elo depends on historical matchups between two teams, so it was critical that I had this information.
Calculating winning %
I wanted to give the program two teams as input and have it tell me which one has the higher chance of winning, based on their Elo ratings.
As a bonus, I wanted the program to look up today’s fixtures, so that I could query the likely winner of each match.
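The core of that lookup, turning two Elo ratings into a win percentage, might look something like the sketch below. It uses plain Ruby and the standard Elo expected-score formula; the `RATINGS` table, its values, and the `win_percentages` method are hypothetical and made up for illustration, not taken from the project or from real team ratings.

```ruby
# Hypothetical ratings, invented for this example; not real team ratings.
RATINGS = { "India" => 1650, "Australia" => 1600 }.freeze

# Returns each team's chance of winning as a percentage, rounded to one decimal.
# Raises KeyError if a team is missing from the ratings table.
def win_percentages(team_a, team_b, ratings = RATINGS)
  rating_a = ratings.fetch(team_a)
  rating_b = ratings.fetch(team_b)
  # Elo expected score for team_a, read here as a win probability.
  chance_a = 1.0 / (1.0 + 10.0**((rating_b - rating_a) / 400.0))
  {
    team_a => (chance_a * 100).round(1),
    team_b => ((1.0 - chance_a) * 100).round(1)
  }
end

win_percentages("India", "Australia").each do |team, percent|
  puts "#{team}: #{percent}%"
end
```

Note that this treats the expected score as a straight win probability, so it ignores ties and washed-out matches.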
I broke my goal down into three smaller problems.
I wanted to build a CLI tool but didn’t want to get my hands dirty with Ruby’s OptionParser class, so I was looking for an out-of-the-box solution. Thor fit the bill perfectly.
Fetching / Scraping Data
If you are parsing XML in Ruby, the default choice is Nokogiri, so I didn’t spend much time trying other libraries.
Ruby’s Elo gem to the rescue. (That was a Ruby pun.)
At this point, there’s an argument to be made that Elo is a rating mechanism and not a means to predict results, which is true. You could also say that all of this was an elaborate ruse to build a CLI tool in Ruby and do a bit of data scraping, and that wouldn’t be far from the truth.
Code - https://github.com/girish-koundinya/Predicta
Tinder and Elo - https://twitter.com/iamkoshiek/status/1201111952916975617
Elo Ratings in Tinder, World of Warcraft - https://www.theatlantic.com/entertainment/archive/2016/01/how-tinder-matchmaking-is-like-warcraft/424350/