As recommended by most people, I decided to get my hands dirty in the field of computer vision by playing with the famous MNIST data-set to create a digit recognizer.
Link to app: https://usamasubhani.tech/projects/digit-recognizer
Source code: https://github.com/usamasubhani/digit-recognizer
Note: This is the first iteration of my first computer vision project. I will be improving this project (and this post) as I learn more.
Data-set
Considered the "hello world" data-set of computer vision, the MNIST data-set contains labeled gray-scale images of handwritten digits (0-9). Each image is 28x28 pixels, and the pixel values are between 0 and 255.
This data-set is very popular, so it is already available in TensorFlow.
Load and Normalize the Data:
import tensorflow as tf
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()
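# Normalize
x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)
# Add the channel dimension that Conv2D expects: (n, 28, 28) -> (n, 28, 28, 1)
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)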
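One detail worth knowing about the normalization step: by default, tf.keras.utils.normalize performs L2 normalization along the given axis, which is not the same thing as dividing every pixel by 255. The sketch below shows the idea on a single list of numbers in plain Python. The helper name l2_normalize and the sample pixel values are made up for illustration and are not part of the project.

```python
import math


def l2_normalize(values):
    """Scale a list of numbers so that its L2 (Euclidean) norm becomes 1."""
    norm = math.sqrt(sum(v * v for v in values))
    if norm == 0:
        norm = 1.0  # leave all-zero vectors unchanged instead of dividing by zero
    return [v / norm for v in values]


# Hypothetical raw pixel intensities (0-255), made up for illustration.
pixels = [0, 128, 255, 64]
normalized = l2_normalize(pixels)
print(normalized)
```

The resulting values still fall between 0 and 1, but they depend on the other values along the same axis, so the same raw intensity can map to different numbers in different images.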
Training and testing
Training
Keras is a library for implementing neural networks and can work as a high-level API for TensorFlow. For this project, I used the Sequential model.
According to the Keras documentation, a Sequential model is appropriate for a plain stack of layers where each layer has exactly one input tensor and one output tensor.
Creating a neural network:
from keras.models import Sequential
from keras.layers import Dense, Conv2D, Dropout, Flatten, MaxPool2D
input_shape = (28, 28, 1)
model = Sequential()
model.add( Conv2D(28, kernel_size=(3,3), input_shape=input_shape) )
model.add( MaxPool2D( pool_size=(2,2) ) )
model.add( Flatten() )
model.add( Dense(128, activation='relu') )
model.add( Dropout(0.2) )
model.add( Dense(10, activation='softmax') )
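To get a feel for what this stack of layers does to a 28x28x1 image, here is a small sketch that follows the tensor shape and the number of trainable parameters through each layer using plain arithmetic. It assumes the Keras defaults that the code above relies on ('valid' padding and stride 1 for Conv2D, stride equal to the pool size for MaxPool2D). The helper names are made up for illustration.

```python
def conv2d_output(height, width, kernel, filters):
    """Output shape of a Conv2D layer with 'valid' padding and stride 1."""
    return height - kernel + 1, width - kernel + 1, filters


def maxpool_output(height, width, channels, pool):
    """Output shape of a MaxPool2D layer whose stride equals its pool size."""
    return height // pool, width // pool, channels


# Conv2D(28, kernel_size=(3,3)) on a 28x28x1 input
conv_shape = conv2d_output(28, 28, kernel=3, filters=28)
conv_params = (3 * 3 * 1 + 1) * 28  # 3x3 weights on 1 channel, plus a bias, per filter

# MaxPool2D(pool_size=(2,2)) has no trainable parameters
pool_shape = maxpool_output(*conv_shape, pool=2)

# Flatten() turns the pooled feature maps into one long vector
flat_size = pool_shape[0] * pool_shape[1] * pool_shape[2]

# Dense(128) and Dense(10): one weight per input per unit, plus one bias per unit
dense1_params = flat_size * 128 + 128
dense2_params = 128 * 10 + 10

total_params = conv_params + dense1_params + dense2_params
print(conv_shape, pool_shape, flat_size, total_params)
```

Almost all of the parameters sit in the first Dense layer, because it connects every flattened value to each of its 128 units. Dropout adds no parameters; it only zeroes a random 20% of activations during training.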
Compiling and Fitting:
# Set optimizer and loss function
model.compile(
    optimizer = 'adam',
    loss = 'sparse_categorical_crossentropy',
    metrics = ['accuracy']
)
# Fit the model using training data
model.fit(
    x = x_train,
    y = y_train,
    epochs=10
)
I will add an explanation of the above code once I learn more about neural networks.
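In the meantime, here is a rough sketch of what the softmax output layer and the sparse_categorical_crossentropy loss compute for a single image. It is written in plain Python for readability and is not how Keras implements them internally (Keras works on batches and clips probabilities for numerical safety). The scores below are made up for illustration.

```python
import math


def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    highest = max(scores)
    exps = [math.exp(s - highest) for s in scores]  # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]


def sparse_categorical_crossentropy(label, probabilities):
    """Loss for one example: negative log of the probability given to the true class.

    'sparse' means the label is an integer class index (e.g. 7), not a one-hot vector.
    """
    return -math.log(probabilities[label])


# Hypothetical raw scores for the 10 digits; the network favors digit 7.
scores = [0.0] * 10
scores[7] = 4.0
probabilities = softmax(scores)

loss_if_label_is_7 = sparse_categorical_crossentropy(7, probabilities)
loss_if_label_is_3 = sparse_categorical_crossentropy(3, probabilities)
print(loss_if_label_is_7, loss_if_label_is_3)
```

The loss is small when the network puts high probability on the correct digit and large when it does not. The optimizer (adam here) adjusts the weights to reduce this loss averaged over the training data.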
Testing
Test the accuracy of the trained model by running:
model.evaluate(x_test, y_test)
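Since the model was compiled with metrics = ['accuracy'], model.evaluate returns the loss followed by the accuracy. As a rough sketch of what that accuracy number means, the snippet below picks the most probable class for each prediction and counts how often it matches the label. The predictions and labels are made up (three classes instead of ten) to keep the example short.

```python
def accuracy(predicted_probabilities, labels):
    """Fraction of examples whose most probable class equals the true label."""
    correct = 0
    for probabilities, label in zip(predicted_probabilities, labels):
        # Index of the largest probability, i.e. the predicted class
        predicted_class = max(range(len(probabilities)), key=probabilities.__getitem__)
        if predicted_class == label:
            correct += 1
    return correct / len(labels)


# Hypothetical softmax outputs for three examples and their true labels.
predictions = [
    [0.1, 0.7, 0.2],  # predicts class 1
    [0.8, 0.1, 0.1],  # predicts class 0
    [0.3, 0.3, 0.4],  # predicts class 2
]
labels = [1, 0, 1]

score = accuracy(predictions, labels)
print(score)
```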
Save the model by running:
# Save the architecture as JSON
model_json = model.to_json()
with open("model.json", "w") as jsonFile:
    jsonFile.write(model_json)
# Save the trained weights separately
model.save_weights("Model_mnist.h5")
The trained model is now saved and can be deployed.
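Because the architecture and the weights are saved in two separate files, loading the model later takes two steps as well. The following is a minimal sketch of that, assuming the file names used above. The helper name load_saved_model is made up for illustration and is not necessarily how the app in the repository does it.

```python
def load_saved_model(json_path="model.json", weights_path="Model_mnist.h5"):
    """Rebuild the architecture from JSON, then load the trained weights into it."""
    # Imported inside the function so that defining this helper
    # does not itself require Keras to be installed.
    from keras.models import model_from_json

    with open(json_path) as json_file:
        model = model_from_json(json_file.read())
    model.load_weights(weights_path)
    return model
```

A model restored this way can be used for model.predict right away. The JSON file only describes the layers, so the model would need to be compiled again before calling model.evaluate or model.fit.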
Deployment
The plan was to create a web application that allows a user to draw a digit using a mouse.
I decided to use Flask, which is a minimalist web framework for Python and is perfect for small to medium scale projects.
Directory structure for the Flask app:
Digit-recognizer
|
├─api
|  ├─templates
|  |  └─index.html
|  ├─static
|  |  └─sketchpad.js
|  └─__init__.py
|
└─model
   ├─model.json
   └─modl_mnist.h5
The complete source code can be seen in the GitHub repository. Pasting all the code here would be unnecessary.
Deployment on Heroku
To deploy a Flask app on Heroku, we need three files.
- Procfile
web: gunicorn wsgi:app
- wsgi.py
from api import app
- requirements.txt
Generate it by running these in the project directory:
pip install pipreqs
pipreqs
- Create a project on Heroku and connect it to your GitHub repo.
- Push the application code to the repo. The app is now deployed on Heroku.
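A note on the Procfile line: wsgi:app tells gunicorn to import the module wsgi and serve the object named app inside it. That object has to be a WSGI application, which a Flask app object is. To show what gunicorn expects, here is a tiny stand-in written with only the standard library. It is not the project's Flask app; the response text is a placeholder.

```python
from wsgiref.util import setup_testing_defaults


def app(environ, start_response):
    """A bare WSGI application: takes the request environ, returns the response body."""
    body = b"digit recognizer placeholder"
    headers = [
        ("Content-Type", "text/plain"),
        ("Content-Length", str(len(body))),
    ]
    start_response("200 OK", headers)
    return [body]


# Simulate what a WSGI server such as gunicorn does for a single request.
captured = {}


def fake_start_response(status, headers):
    captured["status"] = status
    captured["headers"] = headers


environ = {}
setup_testing_defaults(environ)  # fills in a minimal fake GET request
response_body = b"".join(app(environ, fake_start_response))
print(captured["status"], response_body)
```

In the real project, wsgi.py simply imports the Flask app from the api package, and Flask takes care of the environ and start_response details.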
Conclusion
There is a lot to improve. I am listing some to-dos here. Feel free to contribute to the project or give feedback.
- Improve the model
- (Done) Enable drawing on touch devices.