Yuval

Testing GitHub Copilot and Trying to Win a World Cup Bet

The world of algorithmic betting is very rich, and a lot has been written about arbitrage betting. The idea is that different bookmakers offer different odds for the same game, so by betting with multiple bookmakers you can guarantee a profit.

However, this requires a lot of effort: real-time betting, scraping, etc.

This post will be about trying to find the best strategy to gamble with friends.

2 options exist:

  • Bet "single" - single result on a game - home/draw/away - 2 points if correct
  • Bet "double" - choose 2 of home/draw/away - 1 point if correct

Bonus - guess 5/6 games in a group - get 1 bonus point; guess 6/6 games in a group - get 3 bonus points.

I wrote a script to find the odds cut-off below which you should place a single bet. For example, if a team is expected to win at odds of 1.1 and the cut-off is 1.2, bet single.
If a team is expected to win at odds of 1.5 but the cut-off is 1.4, bet double: this team plus the next most likely result.
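The rule above can be sketched in a few lines. This is only an illustration, not the script itself: the function names, the dict layout for the odds, and the choice of a strict "less than" comparison at the cut-off are all assumptions.

```python
def choose_bet(odds, single_limit):
    """Return the list of outcomes to bet on for one game.

    odds: dict mapping 'home' / 'draw' / 'away' to decimal odds
          (lower odds = more likely outcome). Hypothetical layout.
    """
    ranked = sorted(odds, key=odds.get)  # most likely outcome first
    if odds[ranked[0]] < single_limit:
        return ranked[:1]  # favourite is strong enough: single bet
    return ranked[:2]      # otherwise cover the two most likely outcomes


def score_bet(bet, result):
    """2 points for a correct single, 1 point for a correct double, else 0."""
    if result not in bet:
        return 0
    return 2 if len(bet) == 1 else 1
```

With odds of `{'home': 1.1, 'draw': 8.0, 'away': 15.0}` and a cut-off of 1.2 this picks a single bet on `home`; with `{'home': 1.5, 'draw': 4.0, 'away': 6.0}` and a cut-off of 1.4 it picks the double `home` + `draw`.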

Howto?

We start with RapidAPI: do a Google search for rapid api soccer bet, then look for a free provider.
We'll go with Pinnacle. Subscribing to the free plan is enough.
Next we scrape the markets, scrape the games for each market, and save the results in a cache.
Finally we set a single_limit, i.e. the cut-off between a single and a double bet: below this limit we always place a single bet.

Teams to Groups

Each group has 6 games: 4 choose 2, i.e. 4!/(2!2!) = 6.
For 5 correct answers we get 1 bonus point; for 6 we get 3 bonus points.
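As a small sanity check of the arithmetic, here is a sketch of the bonus rule. The function name is made up for illustration, and how a correct "double" counts towards the 5/6 or 6/6 is not specified here, so the function just takes the number of correct guesses as input.

```python
from math import comb

# 4 teams per group, every pair plays once: C(4, 2) games
GAMES_PER_GROUP = comb(4, 2)


def group_bonus(correct_guesses):
    """Bonus points for one group, given how many of its games were guessed."""
    if correct_guesses == GAMES_PER_GROUP:
        return 3  # 6/6
    if correct_guesses == GAMES_PER_GROUP - 1:
        return 1  # 5/6
    return 0
```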

How do we know if we have 5 or 6 correct answers? We need to map game (team) -> group.

How to do this? It can be done automatically! Since "being in the same group" is an equivalence relation,
we can build the groups from the games. If we have the games:

  • Team A <-> Team B, and
  • Team B <-> Team C, and
  • Team C <-> Team D,

So we know all A,B,C,D are in the same group!
And we don't have to enter all the data manually. In the code a similar solution is implemented in GroupsHelper.
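One common way to build such equivalence classes is a union-find (disjoint-set) structure. The sketch below is not the GroupsHelper code; it is a minimal illustration, and the function name and the `(team, team)` tuple format for games are assumptions.

```python
def build_groups(games):
    """Group teams that are connected through a chain of games.

    games: iterable of (team_1, team_2) pairs (hypothetical format).
    Returns a list of sets, one set of team names per group.
    """
    parent = {}

    def find(team):
        # Each team starts as its own representative
        parent.setdefault(team, team)
        while parent[team] != team:
            parent[team] = parent[parent[team]]  # shorten the path as we go
            team = parent[team]
        return team

    for team_1, team_2 in games:
        # Playing each other means both teams share a group: merge them
        parent[find(team_1)] = find(team_2)

    groups = {}
    for team in parent:
        groups.setdefault(find(team), set()).add(team)
    return list(groups.values())
```

For the games A-B, B-C and C-D this yields a single group containing A, B, C and D. From there, a team -> group mapping is one loop over the returned sets.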

Use Copilot!

There is some controversy around GitHub Copilot.

So please don't use it in your corporate job, lol. Or make sure to ask legal before doing so.
I've used Copilot for this toy project, and got some nice results.

Main takeaways:

  • It can generate a complete class, if the class is trivial (e.g. the Game class).
  • You can add a comment before a function to give Copilot a "hint" about what is expected from it. The function name is a hint, of course, but so is the comment.

Sometimes, even hints don't help. Copilot gave us the 2018 World Cup groups, not the 2022 ones as instructed.

(Screenshot: Copilot giving the 2018 World Cup groups.)

Q&A

Q: What is RAPID_API_KEY = os.environ.get('RAPID_API_KEY')?
A: You should store configuration in environment variables, never in code. See the Twelve-Factor App.
Python .pyc files can easily be "decompiled" to .py and reveal all secrets in code.
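A slightly more defensive variant is to fail early when the variable is missing, instead of carrying a None key into the first API call. This is a sketch; the helper name is made up, and only `os.environ.get` comes from the project.

```python
import os


def load_api_key(name="RAPID_API_KEY"):
    """Read the API key from the environment and fail loudly if it is absent."""
    key = os.environ.get(name)
    if not key:
        raise RuntimeError(f"environment variable {name} is not set")
    return key
```

On Linux or macOS you would set it with `export RAPID_API_KEY=...` in the shell before running the script.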

Q: What methods can be used to explore the API?
A: The best option is to search for a Swagger file. Swagger is "open source editor to design, define and document RESTful APIs in the Swagger Specification".

Another alternative is to search for a Postman collection for the relevant product / service. Some Postman collection examples.
For this project, I've used a hacky method:

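import json, requests
resp = requests.get(url)
with open('out.txt', 'w', encoding=resp.encoding) as f:  # the with block closes the file for us
    json.dump(resp.json(), f, indent=4)  # indented JSON is easy to read and explore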

Q: What about exception handling?
A: The scraping part was manual, i.e. scrape the results of all of the games, and that's it. So any error was handled manually.

Q: I guess there are some libraries to handle caching of HTTP requests?
A: Yes, there are indeed; however, it would take too much dev time to learn those libs.

Q: What is the req_id in do_req?
A: Libs like requests-cache integrate into (i.e. patch) requests automatically. Since we implement our own "cache", we need a way to know whether a request was already fetched or not.

The req_id is a signature of the request, which allows us to check quickly whether it already exists in the cache. For example, we could save responses in some key-value DB (Redis) and query by the signature. Actually, we use the disk (/tmp/cache/) as the key-value store.
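To make the idea concrete, here is a minimal sketch of a disk-backed cache keyed by a request signature. It is not the project's do_req: the function names, the hashing scheme, and the `fetch` callable (which stands in for the real HTTP call) are assumptions made for illustration.

```python
import hashlib
import json
import os


def make_req_id(url, params=None):
    """Build a stable signature from the URL and its parameters."""
    raw = url + "?" + json.dumps(params or {}, sort_keys=True)
    return hashlib.sha256(raw.encode("utf-8")).hexdigest()


def cached_fetch(url, fetch, params=None, cache_dir="/tmp/cache"):
    """Return the cached response if present, else call fetch and store it.

    fetch: any callable (url, params) -> JSON-serialisable data,
           e.g. a thin wrapper around requests.get(...).json().
    """
    os.makedirs(cache_dir, exist_ok=True)
    path = os.path.join(cache_dir, make_req_id(url, params) + ".json")
    if os.path.exists(path):  # cache hit: the file name is the key
        with open(path, encoding="utf-8") as f:
            return json.load(f)
    data = fetch(url, params)  # cache miss: do the real request
    with open(path, "w", encoding="utf-8") as f:
        json.dump(data, f)
    return data
```

Passing the fetch function in as a parameter keeps the cache logic testable without touching the network.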

Q: Why does numpy give a warning on the print() line?
(Screenshot: numpy warning message.)
A: Check the return type of np.mean(), for example.

Q: What is the name of the technique that describes the line for single_limit in [1.1, 1.2, 1.3, 1.4, 1.5, 1.6, 1.7, 1.8, 1.9, 2.0]:?
Which scikit-learn method corresponds to it?
A: Grid search; we search for the best parameters for the model.
scikit-learn ref - https://scikit-learn.org/stable/modules/grid_search.html
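Stripped of scikit-learn, a grid search is just "try every candidate, keep the best". The sketch below shows that loop with a toy scoring function. Everything in it is an assumption for illustration: the function names, the idea that outcome probabilities are proportional to 1/odds, and the fact that it ignores the group bonus, so the winning value it reports should not be read as a betting recommendation.

```python
import random


def grid_search(candidates, evaluate):
    """Return (best_candidate, best_score) over all candidates."""
    best, best_score = None, float("-inf")
    for candidate in candidates:
        score = evaluate(candidate)
        if score > best_score:
            best, best_score = candidate, score
    return best, best_score


def make_evaluator(games, runs=200, seed=42):
    """Toy evaluator: average points per simulated round, bonus ignored."""
    def evaluate(single_limit):
        rng = random.Random(seed)  # same simulated results for every candidate
        total = 0
        for _ in range(runs):
            for odds in games:
                ranked = sorted(odds, key=odds.get)
                single = odds[ranked[0]] < single_limit
                bet = ranked[:1] if single else ranked[:2]
                # Assumption: probability of an outcome ~ 1 / its odds
                weights = [1 / odds[o] for o in ranked]
                result = rng.choices(ranked, weights=weights)[0]
                if result in bet:
                    total += 2 if single else 1
        return total / runs
    return evaluate
```

Usage would look like `grid_search([1.1, 1.2, ..., 2.0], make_evaluator(games))`, where `games` is a list of odds dicts. Re-seeding inside `evaluate` means every cut-off is judged on the same simulated results, so differences in score come from the cut-off and not from luck.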

Q: What about a Genetic Algorithm?
A: This was the original plan: to use some GA to get the best betting strategy. But there are not enough parameters for it.

Q: In tests.py, in some of the tests, you're missing assert statement!
A: Correct; these are statistical tests, so I print values and check them manually; this is also part of the "type" of the project.

If it were a production system, we'd have to do something like making sure the values are "similar", e.g. within 2 or 3 standard deviations of each other.
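One possible shape for such a check, using only the standard library, is sketched below. The helper name and the choice to compare a sample mean against an expected value using the standard error are assumptions; tests.py does not contain this.

```python
import statistics


def assert_mean_close(samples, expected, n_sigmas=3):
    """Fail if the sample mean is more than n_sigmas standard errors away."""
    mean = statistics.mean(samples)
    # Standard error of the mean: sample standard deviation / sqrt(n)
    stderr = statistics.stdev(samples) / len(samples) ** 0.5
    assert abs(mean - expected) <= n_sigmas * stderr, (mean, expected, stderr)
```

Keep in mind that a test like this fails now and then by pure chance (roughly 0.3% of runs at 3 standard deviations, if the mean is approximately normal), which is one more reason to fix the random seed in tests.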

Q: What is the random.seed(42)?
A: In case of bugs, we want to be able to reproduce them. So random.seed() allows us to get reproducible results.

Q: But then you get the same results every time; don't you want randomization?
A: Actually, we do. So we can use something like this:

import time, random
time_seed = int(time.time())
print("seeding with %d" % (time_seed))
random.seed(time_seed)

Source Code