DEV Community: aishahsofea

Dynamically change scheduled tasks with Django & celery beat

aishahsofea — Fri, 17 Sep 2021 15:49:30 +0000

Background

I was trying to create something similar to CoinGecko/CoinMarketCap, where a list of coins is displayed along with their current prices. In the first iteration, I fetched the current price via an HTTP request, and while this works, it is not ideal and can be rather inconvenient. Cryptocurrency is known for its volatility, so if the user stays idle for a couple of minutes, the price displayed will quickly be outdated. They can only get the updated price after refreshing the page. On average, the API call takes around 600 to 1000ms. Imagine having to wait 1 second every time you want to get an updated Bitcoin price on top of having to refresh the page. That's pretty annoying, isn't it? So how do we go about solving this? Let's dive right in.

Tech stack

Django as the web framework
React for the UI stuff (the bulk of this article will be on the BE so it's totally fine if you don't know React)
Celery for asynchronous task execution
Channels to allow WebSocket communication
Redis as the message queue

Note: If you are on Windows, chances are you'll encounter an issue when running Celery since it does not officially support Windows. So I highly recommend you to use WSL instead. It is also easier to install Redis on WSL.

Also note: This tutorial assumes that you already have some experience with Django.

Setup and installation

Set up Django and React boilerplate.

Create a folder for our project, let's call it cmc_clone.
Assuming you already have Python installed, create a virtual environment inside cmc_clone; python3 -m venv venv. Note that I use python3, that's because I have both Python 2 & 3 installed on my WSL.
Now you'll see venv folder inside your directory. Activate the virtual env like so; source venv/bin/activate.
Now let's install Django; pip install django. Run pip freeze | grep Django or simply python -m django --version to ensure that it's been installed. (Note that I'm now using python instead of python3 since I'm already inside the virtual environment)
Once Django has been successfully installed, let's start a Django project called server; django-admin startproject server.
Inside server folder, you will see another server folder and manage.py file. This is the boilerplate that Django automagically creates for us. To make sure that it is indeed working, run python manage.py runserver and go to localhost:8000 on your browser. You should see a rocket animation and that means your Django server is properly running.
Cool, now let's create an app called coin; python manage.py startapp coin on the same level as manage.py file. You will see a coin folder being created, this is what we'll use later for our logic. Make sure to include it under INSTALLED_APPS.
Let's move on to React. For this we will also use a boilerplate from create-react-app. Assuming you have npm installed, let's generate React boilerplate inside cmc_clone folder (same level as venv folder); npx create-react-app client. This might take a few minutes.
Go inside client folder and run npm start. A development server will run at localhost:3000

Install Redis and make sure it is properly working.

Follow this tutorial for Windows 10.

I will structure this article in the sequence of the mistakes that I made. So bear with me.

Getting current price of the coins without the hassle of refreshing the page

So the first problem that we want to solve is getting the updated coin price without having to refresh the page. This is where WebSocket comes in. Unlike HTTP where client needs to send a request each time they want to get a response, WebSocket makes sure that the connection between a client and a server stays open.

As for the coin data, we will use the CoinGecko API. There are a bunch of endpoints that you can leverage, but for our purpose we are only interested in the /coins/markets endpoint.

Let's say we want to get the price every 30 seconds. Since we don't want to refresh the page, there should be a background process that does this for us, something similar to a cron job. Luckily, Celery is pretty good at this.

So there are 2 main parts here:

Setting up a WebSocket connection
Execute background tasks

Setting up a WebSocket connection

For this we will use a package called channels. Follow the official installation guide.
Make sure you include channels in the INSTALLED_APPS inside settings.py. Also, under WSGI settings, please add this line ASGI_APPLICATION = 'server.asgi.application'.
If you encounter any error, please consider upgrading pip.

Then, install the channels_redis package; pip install channels-redis. It provides channel layers that use Redis. Visit the GitHub repository for more config options.
But for us, we will use the following. Include this inside settings.py:

CHANNEL_LAYERS = {
        'default': {
            'BACKEND': 'channels_redis.core.RedisChannelLayer',
            'CONFIG': {
                'hosts': [('127.0.0.1', 6379)]
            }
        }
}

Now modify asgi.py so it can handle WebSocket communication.

import os

from channels.auth import AuthMiddlewareStack
from channels.routing import ProtocolTypeRouter, URLRouter
from coin.routing import ws_urlpatterns
from django.core.asgi import get_asgi_application

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'server.settings')

application = ProtocolTypeRouter({
    'http': get_asgi_application(),
    'websocket': AuthMiddlewareStack(URLRouter(ws_urlpatterns))
})

We have yet to create the ws_urlpatterns, so don't worry about that for now.

Inside the coin folder, create consumers.py and routing.py. You can think of consumers.py as the ASGI counterpart of views.py in a normal Django application; like views, consumers have their own routing.

Let's deal with consumers first. Paste the following code inside consumers.py.

import json

from channels.generic.websocket import AsyncWebsocketConsumer

class CoinListConsumer(AsyncWebsocketConsumer):

    async def connect(self):
        await self.channel_layer.group_add('coin_list', self.channel_name)
        await self.accept()
        await self.send(json.dumps({'message': 'hey im server'}))

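    # Handlers on an async consumer are awaited, so they must be coroutines.
    async def receive(self, text_data=None, bytes_data=None):
        print(text_data)

    # Channels passes the WebSocket close code to disconnect().
    async def disconnect(self, close_code):
        await self.channel_layer.group_discard('coin_list', self.channel_name)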

Here I'm writing the consumer class as asynchronous by extending AsyncWebsocketConsumer class provided by channels. To understand consumers better, do read the doc.

Now let's create the route for our CoinListConsumer. Inside routing.py:

from django.urls import path

from .consumers import CoinListConsumer

ws_urlpatterns = [
    path('ws/coin_list/', CoinListConsumer.as_asgi())
]

Note that the ws_urlpatterns is the one that we imported inside asgi.py.

Now, let's connect to the WebSocket from the client. Inside /client/src folder, modify App.js file.

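import { useEffect } from "react";

// The path must match the route declared in routing.py (ws/coin_list/).
const socket = new WebSocket("ws://localhost:8000/ws/coin_list/");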

function App() {

    useEffect(() => {
        socket.onmessage = (message) => {
            const data = JSON.parse(message.data);
            console.log(data);
        };
    }, []);

    const handleButtonClick = () => {
        socket.send(
            JSON.stringify({
                message: "hey im client",
            })
        );
    };

    return (
        <div  className="App">
            <button  onClick={handleButtonClick}>Send message to the server</button>
        </div>
    );
}

export  default App;

Now you should see a button in the React app at localhost:3000.
Restart Django server, and you'll see an additional line saying something like Starting ASGI/Channels version 3.0.4 development server at http://127.0.0.1:8000/. This means that our ASGI is properly configured and the client can now talk to our server via WebSocket.

Open console tab on your browser, and refresh the page. You should see: { "message": "hey im server" }. It's a JSON that we send from the consumer inside connect() function.

Try clicking on the button and monitor the Django terminal. You should see the JSON string printed there: {"message":"hey im client"}

It's the message that we send from the client using socket.send() method, and received by the consumer as a text_data in the receive() function.

Great, WebSocket is working. Let's make it more exciting by integrating Celery beat.

Execute scheduled background tasks with Celery beat

First, we'll install Celery. In our case the background task that we want to execute is the API call and we'd like to use Redis for our message broker. So make sure to also install all the required packages; pip install celery requests redis
On the same level as settings.py, create a file called celery.py. This is where we will configure Celery. You can visit this link for the explanation on the configuration. Inside celery.py, paste the following:

import os

from celery import Celery

os.environ.setdefault('DJANGO_SETTINGS_MODULE', 'server.settings')

app = Celery('server')

app.config_from_object('django.conf:settings', namespace='CELERY')

app.autodiscover_tasks()

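And then, make sure to import the app inside __init__.py (the one next to settings.py) so that it is loaded when Django starts; from .celery import app as celery_app. Next, create tasks.py inside the coin folder and paste the code below: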

import requests
from asgiref.sync import async_to_sync
from celery import shared_task
from channels.layers import get_channel_layer

channel_layer = get_channel_layer()

def get_market_api(page=1, per_page=100, currency='usd'):
    market_api = f'https://api.coingecko.com/api/v3/coins/markets?vs_currency={currency}&page={page}&per_page={per_page}'
    return market_api

@shared_task
def get_coin_list():
    data = requests.get(get_market_api(1, 100, 'usd')).json()

    async_to_sync(channel_layer.group_send)(
        'coin_list', {
            'type': 'send_coin_list',
            'coin_list': data
        }
    )

get_coin_list() is where we do the API call, and since we want to schedule this task, we're using the shared_task decorator so that it can be used anywhere in the project. Note that we use async_to_sync() function, and this is because get_coin_list() is a synchronous function but we would like to send the data to an asynchronous function inside consumers; send_coin_list(). Speaking of which, we haven't actually created the said function. Inside the CoinListConsumer class, just add the following.

async def send_coin_list(self, event):
    coin_list = event['coin_list']
    # Wrap the list in an object so the client can read data["coin_list"].
    await self.send(json.dumps({'coin_list': coin_list}))

Remember that we want to schedule the API call so that it will be executed every 30 seconds. Inside celery.py, add the following code right before app.autodiscover_tasks().

app.conf.beat_schedule = {
    'get_coin_list_30s': {
        'task': 'coin.tasks.get_coin_list',
        'schedule': 30.0
    }
}

We want to use Redis as our message broker, so add the following inside settings.py:

CELERY_BROKER_URL = 'redis://localhost:6379'

Before integrating with the front-end, let's make sure the scheduler is actually working. Open new terminals and activate the same python virtual environment. Inside the first terminal, run celery -A server beat -l INFO and in the second one, run celery -A server worker -l INFO --pool=solo. If you're on linux you may omit the pool argument. Once celery beat has started, take note of the configuration, right now it is using PersistentScheduler as the scheduler. This information will be useful later.

So what is happening here is that celery beat will send the task to Redis message broker every 30 seconds, and celery worker will check the queue and execute the first item in it.
The beat terminal should look as follows:

[2021-09-17 15:34:04,540: INFO/MainProcess] Scheduler: Sending due task get_coin_list_30s (coin.tasks.get_coin_list)

and worker terminal should look as follows:

[2021-09-17 15:34:06,041: INFO/MainProcess] celery@aishahsofea ready.
[2021-09-17 15:34:06,051: INFO/MainProcess] Task coin.tasks.get_coin_list[e211f23a-63a3-498d-9068-845beaf6c0e1] received
[2021-09-17 15:34:07,676: INFO/MainProcess] Task coin.tasks.get_coin_list[e211f23a-63a3-498d-9068-845beaf6c0e1] succeeded in 1.6234161000029417s: None

Let's display the scheduled data in the UI. Modify App.js as follows:

import { useEffect, useState } from  "react";

const socket = new WebSocket("ws://localhost:8000/ws/coin_list/");

function App() {
    const [coins, setCoins] = useState([]);

    useEffect(() => {
        socket.onmessage = (message) => {
            const data = JSON.parse(message.data);
            setCoins(data["coin_list"]);
        };
    }, []);

    return (
        <div  className="App">
            <ol>
                {coins
                    ? coins.map((coin) => (
                        <li key={coin.id}>
                            {coin.name} | {coin.current_price} USD
                        </li>
                    ))
                : null}
            </ol>
        </div>
    );
}

export  default App;

A list of coins will be displayed along with their prices, and the list will be refreshed every 30 seconds, although how often the values actually change depends on the data returned by the API. If you want to confirm, just console.log the parsed message.

Alright, this is perfect: we can now get a near real-time price without having to refresh the page. But what if we need to change the currency? The scheduled task needs to be replaced with one that calls the API with the currency that we choose. This is going to be a problem, because recall that we are using PersistentScheduler, which takes its schedule from the configuration in celery.py when beat starts, so we cannot simply change the tasks at runtime.

So let's use DatabaseScheduler, where we can manage our tasks through database tables. It does not come with Celery, so we have to install an extension package; pip install django-celery-beat. Once it's done installing, include django_celery_beat under INSTALLED_APPS. Since we want to use its tables, don't forget to migrate; python manage.py migrate

Now let's add a simple dropdown to allow the user to select a currency. Add the following JSX inside the div returned by the App component.

<label  htmlFor="currency">Switch currency:</label>
<select  name="currency"  id="currency"  onChange={handleCurrency}>
    <option  value="usd">US Dollars</option>
    <option  value="eur">Euro</option>
    <option  value="myr">Malaysian Ringgit</option>
    <option  value="btc">Bitcoin</option>
</select>

as well as the handleCurrency function:

const handleCurrency = () => {
    const currency = document.getElementById("currency").value;
    socket.send(
        JSON.stringify({
            currency: currency,
        })
    );
};

Modify consumers.py so it can receive the selected currency.

import json

from channels.db import database_sync_to_async
from channels.generic.websocket import AsyncWebsocketConsumer
from django_celery_beat.models import IntervalSchedule, PeriodicTask

from .tasks import get_coin_list

class CoinListConsumer(AsyncWebsocketConsumer):

    async def connect(self):
        await self.channel_layer.group_add('coin_list', self.channel_name)
        await self.accept()
        await self.send(json.dumps({'message': 'hey im server'}))

    async def receive(self, text_data=None, bytes_data=None):
        text_data_json = json.loads(text_data)
        currency = text_data_json['currency']
        # Fetch once right away so the client doesn't wait for the next tick.
        get_coin_list.delay(currency)
        await self.schedule_coin_list(currency)

    @database_sync_to_async
    def schedule_coin_list(self, currency):
        # The Django ORM is synchronous, so this runs in a worker thread.
        # get_or_create avoids inserting a duplicate interval on every message.
        schedule, _ = IntervalSchedule.objects.get_or_create(
            every=30, period=IntervalSchedule.SECONDS
        )
        # Create the periodic task, or update its arguments if it already exists.
        # update_or_create calls save(), so beat is notified of the change.
        PeriodicTask.objects.update_or_create(
            name='Get coin list',
            defaults={
                'interval': schedule,
                'task': 'coin.tasks.get_coin_list',
                'args': json.dumps([currency]),
            },
        )

    # send_coin_list() from earlier stays in the class unchanged.

    async def disconnect(self, close_code):
        await self.channel_layer.group_discard('coin_list', self.channel_name)

Inside the receive method, we captured the currency sent from the client. And then we create an IntervalSchedule object. We're going to name our task as 'Get coin list'. So first we look the name up inside the PeriodicTask table. If it's not in the table, create one and set the currency as the arguments. If it already exists, simply update the argument. Since task scheduling is now handled here, we can remove the one we configured inside celery.py.

Also, the get_coin_list function in tasks.py should receive a currency argument like so:

def get_coin_list(currency='usd'):
    data = requests.get(get_market_api(1, 100, currency)).json()

For the sake of brevity, I omitted the rest of the code. View the complete code here.

The celery beat needs to be restarted, but the command would be slightly different now that we want to use the database scheduler.

celery -A server beat -l INFO --scheduler django_celery_beat.schedulers:DatabaseScheduler

Restart the worker as well and the command should be the same as before. Refresh the UI and select any currency and you should see the price changing accordingly.
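
As an aside, the scheduler can also be set once in settings.py instead of being passed on the command line. This is a small sketch, assuming the CELERY namespace configured in celery.py earlier, under which Celery's beat_scheduler option is read from CELERY_BEAT_SCHEDULER:

```python
# server/settings.py (fragment)
# Assumes app.config_from_object('django.conf:settings', namespace='CELERY'),
# so Celery's `beat_scheduler` option is spelled CELERY_BEAT_SCHEDULER here.
CELERY_BEAT_SCHEDULER = 'django_celery_beat.schedulers:DatabaseScheduler'
```

With this in place, celery -A server beat -l INFO should pick up the database scheduler without the extra --scheduler flag.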

References:

An article on WebSocket.

Working on a branch that depends on another branch

aishahsofea — Fri, 05 Mar 2021 15:17:22 +0000

Okay, let's imagine that you are assigned to work on 2 different features on 2 separate branches. Let's call these branches 'feature-A' and 'feature-B', and they both branch off a main branch called 'master'.

So, what we'd do is that we go to master and checkout to feature-A from there:

git checkout master
git checkout -b feature-A

After working on feature-A and you are happy with your work, you'd push your changes to the remote branch and open a pull request. Now your work is under review. Cool. Now you're ready to move on to the next task, feature-B. Likewise, you checkout to master branch and again checkout feature-B from there:

git checkout master
git checkout -b feature-B

Now you've successfully created a new branch for feature-B and you are so thrilled to work on this exciting feature... until you realized that you can't actually work on it because you need the changes that are on feature-A!

Okay, why not just wait until feature-A is merged to master and then rebase feature-B onto master? NO! You don't wait. What are you gonna say in tomorrow's stand-up?

Luckily for you, git is very versatile and there are many ways to resolve this. But a solution that I like the most is the one by AnoE on stackexchange.

The main idea is that we need the changes that we made on feature-A in order to start working on feature-B. No problem, let's just get it! Make sure you are on feature-B and rebase feature-A:

git checkout feature-B
git rebase feature-A

Or you can delete feature-B and create a new one from feature-A instead (remember, you branched from master earlier). Switch to feature-A first, because git will not delete the branch you are currently on:

git checkout feature-A
git branch -d feature-B
git checkout -b feature-B

Cool, now that you have everything you need, you can work on feature-B like you normally would. And if there are any changes made on feature-A, you'd just rebase it as usual.

Now, what would happen if feature-A is finally approved and merged to master? It means that feature-B does not have to be dependent on feature-A anymore because everything that was in feature-A is now on master. So feature-B should be dependent on master now. How do you go about doing this? Ok, pull all the latest changes into your local master branch:

git checkout master
git pull origin master

Then, the moment of truth:

git checkout feature-B
git rebase --onto master feature-A feature-B

I don't know about you but when I first saw the last command, I couldn't make sense of it. What does --onto mean? Why are we mentioning all the 3 branches in one command? That's a very loaded command.

Generally, git rebase --onto is the command to use when you want to change a branch's parent. The formula is git rebase --onto new-parent-branch current-parent-branch child-branch: git takes the commits that are on child-branch but not on current-parent-branch and replays them on top of new-parent-branch. This is a good article explaining the topic. Also, technically, you don't have to specify the child branch if you are already on that branch, which in our case is feature-B. The command would be as follows:

git rebase --onto master feature-A

Awesome! Your feature-B is now again based off master!

Linear Algebra Operations with PyTorch

aishahsofea — Wed, 27 May 2020 11:15:55 +0000

Linear algebra is a core mathematical concept in machine learning, especially deep learning, a sub-field of ML. There are a number of instances where linear algebra comes in handy when implementing a neural network. One of them is when dealing with unstructured data like images. An image consists of pixels, which are commonly represented as a tensor or a matrix. In this blog post, I will briefly talk about basic linear algebra operations in PyTorch that are used in deep learning.

1. `torch.dot()`

This function allows us to perform dot product aka inner product between two vectors of the same size. The first element of t1 is multiplied with the first element of t2 and the second element of t1 is multiplied with the second element of t2 and so on and so forth. These products are then summed together. Note that a dot product between 2 vectors always returns a scalar value.

This operation is also commutative, in which t1 . t2 = t2 . t1. If we pass in t2 as the first argument and t1 as the second argument, we will get the same answer. One thing to keep in mind is that the dot product only works if the vectors are of the same size. If not, it will spit out an error complaining about inconsistent number of elements.
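
The behavior described above can be sketched as follows. The vectors and their values are made up for illustration:

```python
import torch

# Two vectors of the same size (illustrative values).
t1 = torch.tensor([1., 2., 3.])
t2 = torch.tensor([4., 5., 6.])

# 1*4 + 2*5 + 3*6 = 32, returned as a 0-dimensional (scalar) tensor.
result = torch.dot(t1, t2)
print(result)  # tensor(32.)

# Swapping the arguments gives the same value.
swapped = torch.dot(t2, t1)

# Vectors of different sizes are rejected with a RuntimeError.
try:
    torch.dot(t1, torch.tensor([1., 2.]))
    size_mismatch_raised = False
except RuntimeError as err:
    size_mismatch_raised = True
    print(err)
```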

2. `torch.mm()`

torch.mm() is responsible for multiplication between 2 matrices. Similar to vector multiplication, matrix multiplication makes use of dot product and requires the matrices to have certain sizes. The number of columns of the first matrix must be equal to the number of rows of the second matrix. Each row of the first matrix will be transposed and multiplied against each column in the second matrix. This is basically a vector multiplication where each row in the first matrix is transposed to make sure it has the same dimension as each column in the second matrix.

For example, the matrix product is valid if the first matrix has a dimension of (3, 2) and the second matrix has a dimension of (2, 2), but not the other way around, since a (2, 2) matrix has 2 columns while a (3, 2) matrix has 3 rows. A bunch of words might not help much, so let's look at a couple of examples.

Example 2 uses the same arguments as example 1, except that the order of the arguments is swapped. We can see that swapping the order results in a completely different outcome. So, unlike the dot product between 2 vectors, matrix multiplication is not commutative; t1 x t2 != t2 x t1.

Example below shows that it is important to make sure the rows of the first matrix have the same number of entries as the columns of the second matrix.
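
A minimal sketch of the three cases discussed in this section (multiplying, swapping the order, and mismatched shapes), using made-up matrices:

```python
import torch

a = torch.tensor([[1., 2.],
                  [3., 4.]])
b = torch.tensor([[0., 1.],
                  [1., 0.]])

ab = torch.mm(a, b)  # swaps the columns of a: [[2, 1], [4, 3]]
ba = torch.mm(b, a)  # swaps the rows of a:    [[3, 4], [1, 2]]

# (3, 2) x (2, 2) is valid and produces a (3, 2) matrix.
c = torch.ones(3, 2)
ca = torch.mm(c, a)

# (2, 2) x (3, 2) is not: 2 columns on the left vs 3 rows on the right.
try:
    torch.mm(a, c)
    shape_mismatch_raised = False
except RuntimeError:
    shape_mismatch_raised = True
```

Here ab and ba differ even though the same two matrices are involved, which is the non-commutativity described above.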

3. `torch.matmul()`

This function also performs multiplication, but it is not limited to one particular shape of tensor. torch.matmul() allows us to multiply tensors of different ranks. Based on PyTorch's official documentation, this function behaves according to the dimensionality of the input tensors. For instance, if both arguments are vectors of the same size, it will behave exactly like torch.dot(). If both arguments are matrices, it will perform matrix multiplication just like torch.mm(). It also supports multiplication between a vector and a matrix, by temporarily promoting the vector to a rank-2 tensor so the two tensors become compatible. (A true 0-dimensional scalar is not accepted; both arguments must be at least 1-D.) For inputs with more than 2 dimensions, the leading batch dimensions are broadcast. Check out this blogpost for a detailed explanation on broadcasting.

For example, suppose t1 is a vector of size 2 and t2 is a matrix of size (2, 3). The way matmul handles this is by prepending a 1 to the dimension of t1 so the new dimension becomes (1, 2). It is now compatible with t2, which has a dimension of (2, 3). The prepended 1 is removed after the multiplication is performed, so the result is a vector of size 3.
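
The dimension-dependent behavior can be sketched like this. The tensors are made up for illustration, and the batched case at the end is an extra taken from the matmul documentation rather than from this post:

```python
import torch

v = torch.tensor([1., 2.])                # vector, shape (2,)
m = torch.tensor([[1., 2., 3.],
                  [4., 5., 6.]])          # matrix, shape (2, 3)

# Vector x matrix: v is treated as (1, 2), multiplied, then squeezed to (3,).
vec_mat = torch.matmul(v, m)              # [9, 12, 15]

# Vector x vector behaves like torch.dot().
vec_vec = torch.matmul(v, v)              # 1*1 + 2*2 = 5

# Matrix x matrix behaves like torch.mm().
mat_mat = torch.matmul(m.t(), m)          # (3, 2) x (2, 3) -> (3, 3)

# Batched input: the (4, 5) matrix is broadcast across the batch of 10.
batched = torch.matmul(torch.ones(10, 3, 4), torch.ones(4, 5))
```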

4. `torch.transpose()`

Sometimes the tensor that we have is not the shape or dimension that we desire, and this happens a lot. So this is where transposing an array or a matrix comes in handy. One of the applications is when doing an operation within matrices itself, which I mentioned earlier in torch.mm() section. In terms of a matrix, transposing can be thought of as flipping the elements over the diagonal axis. torch.transpose() accepts 3 arguments; first argument being the tensor, second and third arguments being the dimensions that we want to swap.

The above example shows that we swap between dimension 0 and dimension 1, so the row of the output is the column of the input and vice versa.

torch.transpose() allows us to swap between the same dimension, but it is the same as not swapping. So the output will be just the same as the input. Example below demonstrates this behavior:
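
A short sketch of both cases, swapping two different dimensions and "swapping" a dimension with itself, using a made-up (2, 3) matrix:

```python
import torch

t = torch.tensor([[1, 2, 3],
                  [4, 5, 6]])              # shape (2, 3)

# Swap dimension 0 and dimension 1: rows become columns.
swapped = torch.transpose(t, 0, 1)         # shape (3, 2)

# Swapping a dimension with itself leaves the tensor as it was.
same = torch.transpose(t, 0, 0)
```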

5. `torch.add()`

So far, we haven't mentioned element-wise operations between tensors. There are a number of functions in PyTorch that allow us to do that, and one of them is torch.add(). In order to add two matrices together, they must have the same size or sizes that can be broadcast to a common shape.

t1 and t2 are both (2, 2) matrices allowing values at the same position of the 2 matrices to be added together resulting in also a (2, 2) matrix.

Example above shows that the function supports broadcasting in which it modifies the dimension of t1 so it becomes compatible with the second argument, t2.
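
Both situations, same-size addition and broadcasting, can be sketched with made-up values:

```python
import torch

t1 = torch.tensor([[1., 2.],
                   [3., 4.]])
t2 = torch.tensor([[10., 20.],
                   [30., 40.]])

# Same shape: values at the same position are added together.
same_shape = torch.add(t1, t2)         # [[11, 22], [33, 44]]

# Broadcasting: the (2,) row is stretched to (2, 2) before adding.
row = torch.tensor([100., 200.])
broadcast = torch.add(row, t2)         # [[110, 220], [130, 240]]
```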

Conclusion

It is certainly important to have a good understanding and know when to implement a particular linear algebra operation if we want to delve into the world of deep learning. That being said, the list of functions above is far from exhaustive. Fret not, as we dive deeper, we are likely to discover more functions and the list will only grow from here on out!

CartPole with Q-Learning

aishahsofea — Wed, 13 May 2020 08:34:16 +0000

Motivation

I recently finished the CS50 AI course by Harvard. If you are interested in learning modern AI concepts and looking to do hands-on projects, this course is for you. All you need is basic math and programming knowledge. Also, did I mention that it is completely free? Anyway, in week 4, we were introduced to different types of learning in Machine Learning (supervised learning, unsupervised learning and reinforcement learning) along with commonplace algorithms like SVM, k-nearest neighbors (KNN) and k-means clustering.

What caught my attention the most was the RL algorithm; Q-learning. Unlike most other algorithms, where we need to prepare the data before training, Q-learning(or just RL in general) collects the data while training, sort of. For the project assignment, we need to implement Nim. Our agent is trained by playing against itself for 10,000 times prior to playing against a human. I would say the outcome was impressive, I mean, I lost 100% of the time. Anyhow, I wanted to reinforce(no pun intended) my understanding and implemented it for a different environment.

Check out my implementation!

CartPole Problem

Luckily for us, OpenAI Gym provides a number of environments we can choose from. The most popular one is --wait for it-- the CartPole, so I decided to go with that. Refer to this wiki for the problem details.

It is considered solved when the average reward is greater than or equal to 195 over 100 consecutive trials. (This is the criterion for CartPole-v0, where an episode is capped at 200 timesteps.)

Challenges

Data collected during training is stored in Q-table. For problems with finite states like Nim, storing state-action pairs with their respective rewards is not an issue. However, for our cartpole environment, the states are continuous. To get a better idea, below are the minimum and maximum values for each variable.

Maximum values:
[4.8000002e+00 3.4028235e+38 4.1887903e-01 3.4028235e+38]

Minimum values:
[-4.8000002e+00 -3.4028235e+38 -4.1887903e-01 -3.4028235e+38]

Imagine all the possible numbers between the max and min values: it is simply impossible to evaluate the reward at each distinct state. For this reason, we have to discretize the values into buckets. The code to discretize the state space is inspired by sanjitjain2, with some minor tweaks.

import gym

class CartPoleEnvironment:

    def __init__(self, buckets=(1, 1, 6, 12,)):
        self.env = gym.make('CartPole-v1')
        self.buckets = buckets

    def discretize(self, obs):
        """
        Convert continuous observation space into discrete values
        """
        high = self.env.observation_space.high
        low = self.env.observation_space.low
        upper_bounds = [high[0], high[1] / 1e38, high[2], high[3] / 1e38]
        lower_bounds = [low[0], low[1] / 1e38, low[2], low[3] / 1e38]

        ratios = [(obs[i] + abs(lower_bounds[i])) / (upper_bounds[i] - lower_bounds[i]) for i in range(len(obs))]
        new_obs = [int(round((self.buckets[i] - 1) * ratios[i])) for i in range(len(obs))]
        new_obs = [min(self.buckets[i] - 1, max(0, new_obs[i])) for i in range(len(obs))]

        return tuple(new_obs)
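
To get a feel for what discretize() returns, here is a standalone sketch of the same arithmetic that runs without Gym. The bounds are hard-coded from the min/max values listed above, with the two velocity bounds already divided by 1e38 as in the class. The function name and the sample observations are made up for illustration.

```python
def discretize_obs(obs, buckets=(1, 1, 6, 12)):
    """Map a continuous CartPole observation to a tuple of bucket indices."""
    # Assumed bounds: position, velocity / 1e38, angle, angular velocity / 1e38.
    upper = [4.8, 3.4028235, 0.41887903, 3.4028235]
    lower = [-bound for bound in upper]

    indices = []
    for i, value in enumerate(obs):
        # Position of the value within [lower, upper], as a fraction.
        ratio = (value + abs(lower[i])) / (upper[i] - lower[i])
        index = int(round((buckets[i] - 1) * ratio))
        # Clip so out-of-range values land in the first or last bucket.
        indices.append(min(buckets[i] - 1, max(0, index)))
    return tuple(indices)


slightly_tilted = discretize_obs([0.0, 0.0, 0.1, 0.5])
far_right = discretize_obs([10.0, 10.0, 1.0, 10.0])
far_left = discretize_obs([-10.0, -10.0, -1.0, -10.0])
print(slightly_tilted, far_right, far_left)
```

With buckets=(1, 1, 6, 12), the cart position and velocity always map to bucket 0, so only the pole angle and angular velocity distinguish states.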

Training

Our agent is trained for 5,000 episodes.

For each episode:

CartPole environment is initialized.
Initial state is extracted from the environment.
Exploration rate is decayed, since we want to explore less and exploit more over time.
Agent can train for a maximum of 200 timesteps.

At each timestep:

Using epsilon-greedy algorithm, select an action.
Passing the selected action to the gym's step() function, we can get the new_state, reward and done. done is true if the pole is no longer upright.
Update our Q-table using Bellman equation.
If the pole is no longer upright, break out of the loop and start a new episode.
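
The two core steps of the loop above, epsilon-greedy action selection and the Q-table update, can be sketched without Gym as follows. This is a minimal illustration and not the exact code from my implementation: the function names, the learning rate alpha, the discount factor gamma and the decay constants are all assumed values.

```python
import random
from collections import defaultdict


def choose_action(q_table, state, epsilon, n_actions=2, rng=random):
    """Epsilon-greedy: explore with probability epsilon, otherwise exploit."""
    if rng.random() < epsilon:
        return rng.randrange(n_actions)
    values = q_table[state]
    return max(range(n_actions), key=lambda action: values[action])


def update_q(q_table, state, action, reward, new_state, alpha=0.1, gamma=0.99):
    """Q-learning update derived from the Bellman equation."""
    best_next = max(q_table[new_state])
    td_target = reward + gamma * best_next
    q_table[state][action] += alpha * (td_target - q_table[state][action])


# Q-table keyed by discretized state, one value per action (left, right).
q_table = defaultdict(lambda: [0.0, 0.0])
state, new_state = (0, 0, 3, 6), (0, 0, 3, 7)
q_table[new_state] = [1.0, 2.0]

update_q(q_table, state, action=1, reward=1.0, new_state=new_state)
greedy_action = choose_action(q_table, new_state, epsilon=0.0)

# Decay the exploration rate once per episode (assumed schedule).
epsilon = max(0.01, 1.0 * 0.995)
```

After the update, the value for (state, action=1) moves from 0 toward 1.0 + 0.99 * 2.0, scaled by alpha.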

Evaluation

For every 500 episodes, I average out the total rewards.

After 5000 episodes of training, the average reward is starting to look good. On average, over episodes 4501-5000, the pole stayed upright for about 196 timesteps. In fact, for the last 300 episodes or so, the pole was upright for the full 200 timesteps. This shows that our agent indeed learns over time.

Observe trained agent

In the play() method, we initialize a new CartPole environment. By default, the maximum number of timesteps for each CartPole episode is 500. However, I want to observe the agent balancing the pole for at most 1000 steps. This can be easily achieved by setting env._max_episode_steps = 1000. After the environment is set, we keep rendering it until done = True. Note that we are now utilizing the populated Q-table, and actions are selected with a greedy algorithm instead of epsilon-greedy.

Outcome: Our agent does really well!

Agent finished with a reward of 1000.0

P/S: Please check out deeplizard for Q-learning implementation with Gym. Parts of my code are inspired by their implementation. They also have awesome tutorials on topics like Deep Learning, Neural Networks and how to put the knowledge together using tools like Keras and Pytorch.