If you are a JavaScript developer interested in machine learning or artificial intelligence, the first question that may have crossed your mind is: do I need to learn Python or R to get started?
More often than not, the obvious answer is yes: Python is best suited for AI, and you can build models in online Jupyter notebooks such as Google Colab. But wait, you can do the same in JavaScript. It is not an alternative for very large datasets, or if you want to run models on gigabytes of data. Still, you can satisfy your curiosity. In this article I am going to demonstrate the use of TensorFlow.js, a Google library for doing AI in the browser using JavaScript.
This example is meant to explain how we can do AI on simple time series data; it is not a comprehensive tutorial on finding the best model for a time series. Also, I have used ReactJS to build the example for this article, but I am not a React expert.
What is a time series
First, what is time series data? Any data with a time dimension, for example data collected at particular intervals of time, forms a time series.
The data can also be collected at irregular intervals and later binned into regular intervals in a preprocessing step, but that is beyond the scope of this article. We assume that the data is already binned and available at a regular time cadence.
Data Processing
We are going to consider a univariate time series, i.e. there is no other variable in the model apart from the dependent variable (the one we want to forecast). So a natural question is: how can our model predict without independent (input) variables/features? We are going to do a little feature engineering on our data.
But before that, we first need to build an application where the user can select the time variable and the variable they want to forecast (the dependent variable). A simple UI wizard drives this process.
To prepare the data and hold it in a tabular form similar to pandas in Python, I am using the npm package dataframe.js. It lets us manipulate data in rows and columns, run queries, and load data easily.
Feature Engineering
Once the time and dependent features are selected, we need to do some feature engineering by generating independent variables. The question is how? We are going to use a simple technique called lag. The idea behind lag is that we assume the current point is correlated with previous time points; this relationship is called autocorrelation. For example, we might say that today's stock price is correlated with the stock prices of the previous 6 days. The value 6 here is the number of lags. We obviously do not know the right value in advance, so it is one hyperparameter of our model, meaning that by varying it we can see how the model performs. Once we get this value from the user, we split the time series into sequences whose length is the number of lags. For example, assuming 3 lags, we effectively generate a table with 4 columns: the 3 lagged values as inputs and the value that follows them as the target. In the accompanying image, the left-hand side shows the actual data and the right-hand side shows the split sequences.
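The splitting step described above can be sketched in plain JavaScript. This is only a minimal illustration, not the code from the application: the function name `makeLagWindows` and the sample numbers are assumptions made for the example.

```javascript
// Turn a univariate series into (inputs, target) pairs using `lags` previous values.
// `makeLagWindows` is an illustrative name, not a function from the original app.
function makeLagWindows(series, lags) {
  if (!Number.isInteger(lags) || lags < 1) {
    throw new RangeError('lags must be a positive integer');
  }
  const inputs = [];
  const targets = [];
  // Each row needs `lags` inputs plus one target, so stop `lags` short of the end.
  for (let i = 0; i + lags < series.length; i++) {
    inputs.push(series.slice(i, i + lags)); // the lagged values
    targets.push(series[i + lags]);         // the value to forecast
  }
  return { inputs, targets };
}

const { inputs, targets } = makeLagWindows([10, 20, 30, 40, 50, 60], 3);
console.log(inputs);  // [[10, 20, 30], [20, 30, 40], [30, 40, 50]]
console.log(targets); // [40, 50, 60]
```

With 3 lags, each row of `inputs` together with the matching entry of `targets` forms one row of the 4-column table. A series shorter than `lags + 1` points produces no rows at all.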
Hyperparameters
We are going to consider just two simple hyperparameters: "Number of lags" and "Epochs". One epoch is one full pass of training over the training data.
Model Training
We are going to train a two-layer model. The first layer is an LSTM (Long Short-Term Memory) layer with 50 units. The number of units could also be a hyperparameter, but to keep things simple it is hardcoded. If you do not know what an LSTM is, do not worry too much. It is a more complex form of RNN (Recurrent Neural Network) used to model sequential data such as time series or language data.
The RNN structure looks like
Image Courtesy: fdeloche
While an LSTM looks like
Image Courtesy: Guillaume Chevalier
Don't worry too much about the model; simply understand that it is a two-layer model whose first layer is an LSTM with 50 units and "relu" activation.
The second layer is a simple dense layer with one unit. Since our model outputs a number, it is a regression model, and its loss function is mean squared error.
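To make the loss function concrete, here is a small plain-JavaScript sketch of mean squared error. TensorFlow.js computes this internally during training; the helper name `meanSquaredError` and the sample numbers below are only assumptions for illustration.

```javascript
// Mean squared error: the average of the squared differences
// between the actual values and the predicted values.
function meanSquaredError(actual, predicted) {
  if (actual.length !== predicted.length || actual.length === 0) {
    throw new Error('inputs must be non-empty and of equal length');
  }
  let sum = 0;
  for (let i = 0; i < actual.length; i++) {
    const diff = actual[i] - predicted[i];
    sum += diff * diff;
  }
  return sum / actual.length;
}

// Squared errors are 4, 1 and 9, so the result is 14 / 3 (about 4.667).
console.log(meanSquaredError([40, 50, 60], [42, 49, 57]));
```

Because the errors are squared, one large miss is penalised more heavily than several small ones.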
Here is what the JS code for model building looks like.
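A minimal sketch of such a two-layer model in TensorFlow.js might look like the following. It is not necessarily the exact code used in the application. The function name `buildModel`, the idea of passing `tf` in as an argument, and the `adam` optimizer are assumptions for this example; the 50 LSTM units, the "relu" activation, the single dense unit and the mean squared error loss come from the description above.

```javascript
// `tf` is the TensorFlow.js namespace: the global created by the <script> tag
// in a browser, or the result of require('@tensorflow/tfjs') in Node.
// It is passed in as an argument so this function has no hidden dependency.
// `buildModel` is an illustrative name, not taken from the original app.
function buildModel(tf, numLags, lstmUnits = 50) {
  const model = tf.sequential();

  // Layer 1: LSTM that reads `numLags` time steps with one feature per step.
  model.add(tf.layers.lstm({
    units: lstmUnits,
    activation: 'relu',
    inputShape: [numLags, 1],
  }));

  // Layer 2: a single dense unit that outputs the forecast value.
  model.add(tf.layers.dense({ units: 1 }));

  // Regression setup with mean squared error as the loss.
  // The optimizer is an assumption; the article does not name one.
  model.compile({ optimizer: 'adam', loss: 'meanSquaredError' });
  return model;
}
```

In a page that loads TensorFlow.js, this would be called as `buildModel(tf, 3)` for 3 lags. The input shape `[numLags, 1]` reflects the univariate case: each time step carries exactly one value.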
Once the model is trained, we can check the loss over the epochs and see that we are really minimising the loss.
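One way to get at the loss per epoch is to read it from the history object that `model.fit` resolves with. The sketch below assumes a compiled TensorFlow.js model and input/target tensors `xs` and `ys` that already exist; the helper names `trainAndGetLoss` and `lossImproved` are made up for this example.

```javascript
// Train the model and return the training loss recorded after each epoch.
// `model` is assumed to be a compiled TensorFlow.js model; `xs` and `ys`
// are assumed to be the input and target tensors.
async function trainAndGetLoss(model, xs, ys, epochs) {
  const history = await model.fit(xs, ys, { epochs });
  return history.history.loss; // one number per epoch
}

// True when the last epoch's loss is lower than the first epoch's loss.
function lossImproved(losses) {
  return losses.length > 1 && losses[losses.length - 1] < losses[0];
}

// Usage, inside an async function:
//   const losses = await trainAndGetLoss(model, xs, ys, 20);
//   console.log(losses, lossImproved(losses));
```

Plotting the returned array against the epoch number gives the usual loss curve. A curve that stays flat or rises is a hint to revisit the number of lags or epochs.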
Predicting
The final step is predicting with the model and comparing the result with the actual series to see how well the model predicts.
The prediction code is very simple: we just take the original series and run it through the model with the predict function.
Of course, for everything we need to convert the values to tensors.
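As a rough sketch of that conversion and prediction step, assuming the series has already been split into lag sequences (one array of lagged values per row), the code could look like this. The function name `predictFromWindows` and passing `tf` and `model` as arguments are assumptions for the example, not the application's actual code.

```javascript
// `tf` is the TensorFlow.js namespace and `model` a trained model; both are
// passed in as arguments. `windows` is an array of lag sequences,
// e.g. [[10, 20, 30], [20, 30, 40]] for 3 lags, and must not be empty.
function predictFromWindows(tf, model, windows) {
  // Wrap every value in its own array so that the nested array has the
  // shape [samples, lags, 1], which tf.tensor3d infers automatically.
  const xs = tf.tensor3d(windows.map((w) => w.map((v) => [v])));
  const output = model.predict(xs);

  // Copy the predictions out of the tensor into a plain JavaScript array.
  const predictions = Array.from(output.dataSync());

  // Free the memory held by the tensors.
  xs.dispose();
  output.dispose();
  return predictions;
}
```

The trailing dimension of 1 is there because the series is univariate: each time step has a single feature.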
And then we compare the predictions with the actuals.
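When comparing, it helps to remember that the first few points of the series have no prediction, because they are used up as lags. A small plain-JavaScript sketch of lining the two up, assuming one prediction per lag sequence (the function name `alignPredictions` and the numbers are made up for illustration):

```javascript
// Pair each prediction with the actual value it is meant to forecast.
// With `lags` inputs per sequence, prediction i corresponds to series[i + lags].
function alignPredictions(series, predictions, lags) {
  return predictions.map((predicted, i) => ({
    index: i + lags,
    actual: series[i + lags],
    predicted,
    error: predicted - series[i + lags],
  }));
}

console.log(alignPredictions([10, 20, 30, 40, 50], [41, 48], 3));
// [ { index: 3, actual: 40, predicted: 41, error: 1 },
//   { index: 4, actual: 50, predicted: 48, error: -2 } ]
```

The resulting rows can be fed straight into a chart that overlays the predicted line on the actual one.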
If this article has sparked some curiosity in you, feel free to check out the entire code.
Feel free to fork it on GitHub and try digging deeper into the code.
Top comments (2)
Thanks for the tutorial, awesome!
Quick question: every example uses ONE variable.
Let's assume I want to use an LSTM and predict a value Y (say, the closing price of the next day) based on 2 or 3 features (say, the open, high and close of today). How is that done (the model/layer/tensor/compile/fit/predict)?
And let's assume I then want to add an additional feature, volume, to that code. How can that "add" be done?
Can you give an example, please, in JavaScript/Node etc.? I can do this in Python all right.