Viper

Posted on Nov 25, 2020

Deploying Data Visualizer for Nepal Stock Exchange using Streamlit on Heroku

#datascience #machinelearning #webdev #heroku

Introduction

Visualise Nepal Stock Exchange Data and Deploy it On Heroku Using Streamlit and Plotly.

Originally Published on q-viper.github.io

If you are from Nepal then you already know what Nepal Stock Exchange means. Honestly I don't know how to describe stock market but after watching movie Pi, I wanted to take a look at NEPSE data by myself. I am not sure about how old data are present on the NEPSE but I think there are data available from 1980. I started to scrap data on 2019 January and it was very difficult task for me to write the right codes. But there are great and kind programmers who shares their great skills on GitHub. So I came across this repository. I started to scrap all the data of past transaction and it was quite frustrating after server hanged up so I found the best way of scrapping is by scrapping data by each year and then write it on CSV file. Please follow this mine repository for more scrapping code but note that We don't use any of those data so just look that repository for more information only (if available). And rest is on this blog.

Here is a demo Heroku app that we will be building.

Demo

Where was I lost?

From nearly a month, I had not written any blogs because we had our huge festival in Nepal, Dashain and Tihar. Also I was playing lots of games like Injustice Gods Among Us. And I have also been busy watching series like Silicon Valley and Mr. Robot. And good news about me must be I have joined NAAMI as Junior Unity Software Developer from last week.

Inspiration

2020 is not a great year for anyone because of the pandemic and students like me are suffering from hard anxiety of their career and I am one of them. But I have met some awesome people on the web whose work always keeps me inspiring to write more. And I have noticed that Jithendra's blogs and LinkedIn posts are very insightful for learning Data Science.

Contribution

If you are looking right into this blog please try to contribute to Agriculture field by finding a model to classify and detect corn leaf infection. I have worked hard to prepare dataset and made it live on kaggle. Also if you are interested in my more awesome work then Contour Based Writing System must be your first choice.

Getting Systems Ready

Of course we need python along with few other libraries. If you are wondering what should be the requirements then please follow the repository and download all the codes then TADAAAAA!!!! But to describe it on steps, please follow below steps:

Make a virtual environment using venv

Install below libraries best way to download is using pip install -r requirements.txt(Assuming that you have downloaded my repository above):

beautifulsoup4==4.8.0
numpy==1.17.3
pandas==0.25.1
plotly==4.12.0
requests==2.18.4
requests-oauthlib==1.2.0
requests-toolbelt==0.9.1
retrying==1.3.3
streamlit==0.71.0
urllib3==1.22
uvicorn==0.12.2
matplotlib==3.3.3
html5lib==1.0.1
lxml==4.4.1

Importing Dependencies

Make a python file nepse.py under your project folder.

import streamlit as st
import numpy as np
import pandas as pd
from bs4 import BeautifulSoup as BS
import requests
import urllib3
import time
import matplotlib.pyplot as plt
import plotly.express as px

streamlit is used to make a web app and honestly this is the most easiest, fastest and awesome way of deploying ML APPS.
numpy as usual.
pandas for data processing (mostly masking, making dataframes).
BeautifulSoup, we can scrap the data from NEPSE's site.
requests for making HTTP requests.
urllib3 for handling HTTP client.
time as usual.
matplotlib for making simple visualization.
plotly for making cool visualization.

Method: `company_names()`

import streamlit as st
import numpy as np
import pandas as pd
from bs4 import BeautifulSoup as BS
import requests
import urllib3
import time
import matplotlib.pyplot as plt
import plotly.express as px

def company_names():
    http = urllib3.PoolManager()
    http.addheaders = [('User-agent', 'Mozilla/61.0')]
    web_page = http.request('GET', "http://www.nepalstock.com.np/company?_limit=500")
    soup = BS(web_page.data, 'html5lib')
    table = soup.find('table')
    company=[]
    rows = [row.findAll('td') for row in table.findAll('tr')[1:-2]]
    col = 0
    notfirstrun = False
    for row in rows:
        companydata =[]
        for data in row:
            if col == 5 and notfirstrun:
                companydata.append(data.a.get('href').split('/')[-1])
            else:
                companydata.append(data.text.strip())
            col += 1
        company.append(companydata)
        col =0
        notfirstrun = True

    df = pd.DataFrame(company[1:],columns=company[0])
    df.rename(columns={'Operations':'Symbol No'},inplace=True)
    df.index.name = "SN"
    df.drop(columns='',inplace=True)
    df.drop(columns='S.N.',inplace=True)
    return df
company_names()

.dataframe tbody tr th:only-of-type {
    vertical-align: middle;
}

.dataframe tbody tr th {
    vertical-align: top;
}

.dataframe thead th {
    text-align: right;
}

	Stock Name	Stock Symbol	Sector	Symbol No
SN
0	10 % NMB DEBENTURE 2085	NMBD2085	Corporate Debenture	2850
1	10% Himalayan Bank Debenture 2083	HBLD83	Corporate Debenture	2873
2	10% Nabil Debenture 2082	NBLD82	Corporate Debenture	2892
3	10% Nepal SBI Bank Debenture 2086	SBIBD86	Corporate Debenture	2890
4	10% Prabhu Bank Debenture 2084	PBLD84	Corporate Debenture	2904
...	...	...	...	...
268	UNIVERSAL POWER COMPANY LTD	UPCL	Hydro Power	2810
269	Unnati Sahakarya Laghubitta Bittiya Sanstha Li...	USLB	Microfinance	2774
270	Upper Tamakoshi Hydropower Ltd	UPPER	Hydro Power	2792
271	Vijaya laghubitta Bittiya Sanstha Ltd.	VLBS	Microfinance	687
272	Womi Microfinance Bittiya Sanstha Ltd.	WOMI	Microfinance	706

273 rows × 4 columns

This method is used to get list of company names.

Create a HTTP client and make a GET request.
Find the table of result from http://www.nepalstock.com.np/company?_limit=500
Do some cleaning to get the company data for each company and make a DataFrame for it.

After calling the method above, a dataframe just like above must be shown.
It is clear that there are 273 companies. We will be using columns like Stock Symbol and Symbol No.

Visualising Only Limited Data on Streamlit

Why to show entire dataframe on webapp while we can show only desired? So we will be using iloc on dataframe i.e. slicing on index.

@st.cache(suppress_st_warning=True)
def load_data():
    # data = pd.read_csv(DATA_URL, nrows=nrows)
    data = company_names()

    lowercase = lambda x: str(x).lower()
    data.rename(lowercase, axis='columns', inplace=True)
    return data

st.subheader("View the list of Companies.")
def view_company_names():
    st.markdown("(Base URL is [http://www.nepalstock.com.np/company?_limit=500](http://www.nepalstock.com.np/company?_limit=500))")
    num = st.number_input("Enter how many records to view?", min_value=1, max_value=None, step=1)
    data_load_state = st.text('Loading data...')
    data = load_data()
    data_load_state.text("Done! (using st.cache)")
    if st.checkbox('Show raw data'):
        st.subheader('Raw data')
        st.write(data.iloc[:num])
    return data

cdf = view_company_names()

2 methods are given above.

Method: `load_data()`

This method uses the decorator cache to store data in cache and to ignore the cache warning, suppress_st_warning=True is used. This method calls the method comapny_names() and gets the dataframe then makes all the column name to lowercase and then returns that dataframe.

Method: `view_company_names()`

This method is for visualizing actual data. It allows us to input the number and then the dataframe is sliced up to that index.

Streamlit allows us to take number input using number_input() method.
streamlit allows us to show text usign text() method.
Then we load our data from previous method and then after data has been loaded, show log like text.
streamlit allows us to use checkbox using checkbox() then if it is ticked then show data up to the number user has entered.
Also return the original data because it will be used later.

Now save the code above and then from terminal opening on same directory as this file(assume that file name is neplse.py), do

streamlit run nepse.py

If everything is okay then streamlit will provide us a local link with unique port number. By clicking it, we can visualize our first ever NEPSE data visualizer. It should look like below.

View Company Details

st.subheader("View Company Details.")
def view_company_details():
    symbol = st.text_input("Enter Stock Symbol. (For Reference, see above record table.)")
    symbol_no = None
    if len(symbol)>=2:
        url = "http://www.nepalstock.com/company/"

        try:
            req = requests.post(url, data={"stock_symbol":symbol}, verify=False)
            symbol_no = cdf[cdf["stock symbol"]==symbol]["symbol no"]
        except requests.exceptions.RequestException as e:
            print(e)

        response = req.text
        soup = BS(response, "lxml")
        table = soup.find("table")

        for row in table.findAll("tr")[4:]:
            col = row.findAll("td")
            st.write(col[0].string,": ",col[1].string)

    return symbol_no

symbol_no = view_company_details()
if  symbol_no is not None:
    symbol_no = symbol_no.tolist()[0]

The method above is used to visualize the company details.

Ask user for the Stock Symbol (ex NMB, ADBL etc).
Same as earlier method, it makes post request to url http://www.nepalstock.com/company/ and then uses that result to view details.
If the Stock Symbol matches then get symbol number assigned to it. Then do some data cleaning and visualize the result by st.write()
Return the Symbol Number.

The result must be something like below.

Visualize Data of some time period

First we need to get that data. As I stated earlier, sometimes scraping data of entire history will not be feasible and last time I tried that, my laptop crashed so scrapping part by part will be great idea.

Method: `CompanyStocksTransactions()`

Make a GET request to http://www.nepalstock.com.np/company/transactions/%s/0/?startDate=%s&endDate=%s&_limit=9000000 by providing start and end date in required field.
Then as previous methods, clean the data and make a dataframe.
Return that dataframe.

Method: `view_by_year()`

Method takes start date, end date and symbol number of that company symbol.
Call the method CompanyStocksTransactions to get dataframe within that date.
Return that dataframe.

Before calling methods, we try to get date from the streamlit's date_input method and then pass that date to view_by_year method. Also make a checkbox whether to see raw dataframe or not.

st.subheader("Check Company's Progress in Years")
def CompanyStocksTransactions(SymbolNo,startDate, endDate):
    url="http://www.nepalstock.com.np/company/transactions/%s/0/?startDate=%s&endDate=%s&_limit=9000000"%(SymbolNo,startDate, endDate)
    #print("Connecting to %s "%url)
    http = urllib3.PoolManager()
    http.addheaders = [('User-agent', 'Mozilla/61.0')]
    web_page = http.request('GET',url)
    #print("Adding to DataFrame")
    soup = BS(web_page.data, 'html5lib')
    table = soup.find('table')
    FloorSheet=[]
    rows = [row.findAll('td') for row in table.findAll('tr')[1:-2]]
    for row in rows:
          FloorSheet.append([data.text.strip() for data in row])
    if(len(FloorSheet) != 0):
          FloorSheetdf = pd.DataFrame(FloorSheet[1:],columns=FloorSheet[0])
          FloorSheetdf['Date']=pd.to_datetime(FloorSheetdf['Contract No'], format='%Y%m%d%H%M%f', errors='ignore')
          return (1, FloorSheetdf)
    else:
          return (0, None)

@st.cache(suppress_st_warning=True)    
def view_by_year(start_date="2020-1-1", end_date="2020-1-2", symbol="2810"):
    st.write("From year %s to %s "%(start_date, end_date))

    success, dftest=CompanyStocksTransactions(symbol, start_date, end_date)
    if success==1:
        st.write("Successfully scrapped data. Showing results.")
        # st.write(dftest)
    else:
        st.write("Can't scrap data. Try using another symbol. Or Another date.")
    return dftest
start_date = st.date_input("Please input start date.")
end_date = st.date_input("Please input end date.", min_value=start_date)
dfyear = view_by_year(start_date=str(start_date), end_date=str(end_date), symbol=symbol_no)
show_df = st.checkbox("Show Data")

if show_df:
    st.write(dfyear)

The result must be like below:

Do Real Visualization

Now is the time for doing visualization of our time series data. We have already scrapped the data of our interest now is the time to use image to tell story.

Prepare the checkboxes to visualize data.
- Date vs Buyer Broker
- Date vs Seller Broker
- Date vs Amount
- Date vs Rate

date_vs_bbroker = st.checkbox("Date Vs Buyer Broker")
date_vs_sbroker = st.checkbox("Date Vs Seller Broker")
date_vs_amount = st.checkbox("Date vs Amount")
date_vs_rate = st.checkbox("Date vs Rate")

def visualise_broker():
    if date_vs_bbroker:
        st.subheader("Date Vs Buyer Broker")
        fig = px.scatter(dfyear, x="Date", y="Buyer Broker")
        st.plotly_chart(fig)

    if date_vs_amount:
        st.subheader("Date Vs Amount")
        fig = px.scatter(dfyear, x="Date", y="Amount")
        st.plotly_chart(fig)

    if date_vs_sbroker:
        st.subheader("Date Vs Seller Broker")
        fig = px.scatter(dfyear, x="Date", y="Seller Broker")
        st.plotly_chart(fig)

    if date_vs_rate:
        st.subheader("Date Vs Rate")
        fig = px.scatter(dfyear, x="Date", y="Rate")
        st.plotly_chart(fig)

visualise_broker()

Use plotly to scatter our data by giving dataframe and then x and y axis.
Use the figure made by plotly to view it on web app using streamlit's plotly_chart.

Our result must be like below.

If you have came this far then congratulations, you have just made your streamlit app to visualize NEPSE data. Now is the time to make it live on Heroku.

To deploy it on Heroku, we need to have a Heroku Account. Please make one. After making an account, make a new heroku app from here. Give it a proper name.

Preparing additional files

Assume that your main file name is nepse.py. I am not going to explain in detail but if you are concerned then please follow this blog where i have written about deploying Face Mask Classifier on Heroku. Additional files required are:

requirements.txt: Using
```
pip freeze > requirements.txt
```

Procfile: Process file.

web: sh setup.sh && streamlit run nepse.py

setup.sh: Setup file.

mkdir -p ~/.streamlit

echo "[server]
headless = true
port = $PORT
enableCORS = false
" > ~/.streamlit/config.toml

After making all files then make one Git repository and push all the changes to it because we will be integrating our repository to Heroku and Heroku itself will process our files but beware of free version because we have limited number of build number available.

Make sure you are selecting GitHub on Deployement Method.
Then select the repository you are using to visualize on App Connected to GitHub.
Chose the right branch.
If you want automatic deploys, i.e re deploy whenever changes found. Using manual deploy will be best for free version.
Click on Deploy Branch and let it finish the install dependencies and then after completing run launch the app.
If error occurs please let me know on comment section.

Finally

Thank you so much for reading my blog and I really hope that this blog has been some help for you too. I am always eager to meet awesome people like you so please find me on Twitter as QuassarianViper and LinkedIn as Ramkrishna Acharya.

DEV Community

Deploying Data Visualizer for Nepal Stock Exchange using Streamlit on Heroku

Introduction

Where was I lost?

Inspiration

Contribution

Getting Systems Ready

Importing Dependencies

Method: `company_names()`

Visualising Only Limited Data on Streamlit

Method: `load_data()`

Method: `view_company_names()`

View Company Details

Visualize Data of some time period

Method: `CompanyStocksTransactions()`

Method: `view_by_year()`

Do Real Visualization

Preparing additional files

Finally

Why not read more?

Top comments (0)

Introduction

Where was I lost?

Inspiration

Contribution

Getting Systems Ready

Importing Dependencies

Method: company_names()

Visualising Only Limited Data on Streamlit

Method: load_data()

Method: view_company_names()

View Company Details

Visualize Data of some time period

Method: CompanyStocksTransactions()

Method: view_by_year()

Do Real Visualization

Preparing additional files

Finally

Why not read more?

Method: `company_names()`

Method: `load_data()`

Method: `view_company_names()`

Method: `CompanyStocksTransactions()`

Method: `view_by_year()`