
Usama Subhani

Digit Recognizer: A Computer Vision Project From Training to Deployment.

As recommended by most people, I decided to get my hands dirty in the field of computer vision by playing with the famous MNIST data-set to create a digit recognizer.

Link to app: https://usamasubhani.tech/projects/digit-recognizer
Source code: https://github.com/usamasubhani/digit-recognizer

Note: This is the first iteration of my first computer vision project. I will be improving this project (and this post) as I learn more.

Table of Contents

  1. Data-set
  2. Training and Testing
  3. Web app + Deployment

Data-set

Considered the "hello world" data-set of computer vision, the MNIST data-set contains labeled gray-scale images of handwritten digits (0-9): 60,000 for training and 10,000 for testing. Each image is 28x28 pixels, and the pixel values are between 0 and 255.

This data-set is very popular, so it is already available in TensorFlow.

Load and Normalize the Data:

import tensorflow as tf

# Load the MNIST training and test splits bundled with Keras
(x_train, y_train), (x_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalize the pixel values
x_train = tf.keras.utils.normalize(x_train, axis=1)
x_test = tf.keras.utils.normalize(x_test, axis=1)

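One detail worth knowing: `tf.keras.utils.normalize` does not rescale pixels to the 0-1 range by dividing by 255. With its default `order=2` it divides by the L2 norm along the chosen axis. The NumPy sketch below mimics that behaviour on a tiny made-up array and compares it with plain division by 255. The helper name `l2_normalize` and the 2x2 "image" are my own illustration, not part of TensorFlow or of this project.

```python
import numpy as np

def l2_normalize(x, axis=1):
    """Rough NumPy equivalent of tf.keras.utils.normalize(x, axis=axis)."""
    x = np.asarray(x, dtype="float64")
    norms = np.linalg.norm(x, ord=2, axis=axis, keepdims=True)
    norms[norms == 0] = 1.0  # leave all-zero slices untouched
    return x / norms

# A fake batch holding one 2x2 image (instead of 28x28), values in 0-255.
batch = np.array([[[0, 255],
                   [0, 255]]])

normalized = l2_normalize(batch, axis=1)  # unit L2 norm along axis 1
scaled = batch / 255.0                    # the more common 0-1 scaling

print(normalized[0])
print(scaled[0])
```

Either approach can work, as long as the images sent to the model at prediction time go through the same preprocessing as the training images.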

Training and testing

Training

Keras is a library for implementing neural networks and can work as a high-level API for TensorFlow. For this project, I used the Sequential model.

According to the Keras documentation, a Sequential model is appropriate for a plain stack of layers where each layer has exactly one input tensor and one output tensor.

Creating a neural network:

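from keras.models import Sequential
from keras.layers import Dense, Conv2D, Dropout, Flatten, MaxPool2D

# Conv2D expects a channel axis, so reshape (N, 28, 28) to (N, 28, 28, 1)
x_train = x_train.reshape(-1, 28, 28, 1)
x_test = x_test.reshape(-1, 28, 28, 1)
input_shape = (28, 28, 1)

model = Sequential()
# Convolution + pooling extract local features from the image
model.add(Conv2D(28, kernel_size=(3, 3), input_shape=input_shape))
model.add(MaxPool2D(pool_size=(2, 2)))
# Flatten the feature maps and classify with fully connected layers
model.add(Flatten())
model.add(Dense(128, activation='relu'))
model.add(Dropout(0.2))
# One output per digit (0-9), as probabilities
model.add(Dense(10, activation='softmax'))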
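
To see what each layer does to the data, the shapes and parameter counts can be worked out by hand. The sketch below is plain Python arithmetic, assuming the Keras defaults for `Conv2D` (stride 1, `'valid'` padding) and `MaxPool2D` (stride equal to the pool size). The variable names are only illustrative.

```python
# Input: one 28x28 gray-scale image with a single channel.
height, width, channels = 28, 28, 1

# Conv2D(28, kernel_size=(3, 3)): 'valid' padding trims kernel - 1 pixels.
filters, kernel = 28, 3
conv_h, conv_w = height - kernel + 1, width - kernel + 1
conv_params = (kernel * kernel * channels + 1) * filters  # +1 bias per filter

# MaxPool2D(pool_size=(2, 2)): halves height and width, no parameters.
pool_h, pool_w = conv_h // 2, conv_w // 2

# Flatten: the feature maps become one long vector.
flat = pool_h * pool_w * filters

# Dense layers: one weight per input per unit, plus one bias per unit.
dense1_params = (flat + 1) * 128
dense2_params = (128 + 1) * 10
total_params = conv_params + dense1_params + dense2_params

print("after Conv2D:   ", (conv_h, conv_w, filters))
print("after MaxPool2D:", (pool_h, pool_w, filters))
print("after Flatten:  ", flat)
print("total parameters:", total_params)
```

Calling `model.summary()` on the network above should report the same shapes and counts, which makes it a handy cross-check.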

Compiling and Fitting:

# Set optimizer and loss function
model.compile( 
    optimizer = 'adam',
    loss = 'sparse_categorical_crossentropy',
    metrics = ['accuracy']
)
# Fit the model using training data
model.fit(
    x = x_train,
    y = y_train,
    epochs=10
)

I will add an explanation of the above code once I learn more about neural networks.
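
In the meantime, here is a rough idea of what the loss measures. The `softmax` output layer turns 10 raw scores into probabilities, and `sparse_categorical_crossentropy` is the negative log of the probability assigned to the correct digit, where the label is a plain integer rather than a one-hot vector. The sketch below is plain Python with made-up scores; the function names are mine and this is not how Keras implements it internally.

```python
import math

def softmax(scores):
    """Turn raw scores into probabilities that sum to 1."""
    top = max(scores)  # subtracting the max keeps exp() numerically stable
    exps = [math.exp(s - top) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def sparse_categorical_crossentropy(label, probabilities):
    """Negative log-probability of the true class (label is an integer)."""
    return -math.log(probabilities[label])

# Made-up raw scores for one image: the network favours the digit 7.
scores = [0.0] * 10
scores[7] = 4.0
probs = softmax(scores)

loss_right = sparse_categorical_crossentropy(7, probs)  # true label is 7
loss_wrong = sparse_categorical_crossentropy(3, probs)  # true label is 3

print(f"loss if the label is 7: {loss_right:.3f}")
print(f"loss if the label is 3: {loss_wrong:.3f}")
```

The loss is small when the network puts most of its probability on the correct digit and grows when it does not. The optimizer (`adam` here) adjusts the weights to push that number down.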

Testing

Test the accuracy of the trained model on the test data by running:

model.evaluate(x_test, y_test)
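
`model.evaluate` returns the loss followed by the metrics passed to `compile`, so here it returns the loss and the accuracy. As a minimal sketch of what the accuracy number means, the plain-Python function below counts how often the most probable class matches the label. The predictions are made up and use 3 classes instead of 10 to keep the example short.

```python
def accuracy(predictions, labels):
    """Fraction of samples whose most probable class equals the label."""
    correct = 0
    for probs, label in zip(predictions, labels):
        predicted = max(range(len(probs)), key=probs.__getitem__)  # argmax
        if predicted == label:
            correct += 1
    return correct / len(labels)

# Made-up softmax outputs for three samples.
predictions = [
    [0.1, 0.8, 0.1],  # predicts class 1
    [0.7, 0.2, 0.1],  # predicts class 0
    [0.3, 0.3, 0.4],  # predicts class 2
]
labels = [1, 0, 1]    # the last prediction is wrong

print(accuracy(predictions, labels))
```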

Save the model by running:

model_json = model.to_json()
with open("model.json", "w") as jsonFile:
    jsonFile.write(model_json)

model.save_weights("Model_mnist.h5")

The trained model is now saved and can be deployed.

Deployment

The plan was to create a web application that allows the user to draw a digit with the mouse and get a prediction from the model.
I decided to use Flask, a minimalist web framework for Python that is well suited to small and medium-sized projects.
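
The model expects 28x28 inputs, so whatever the user draws has to be shrunk first. The sketch below is not the app's actual code. It is one possible approach, assuming a hypothetical square 280x280 gray-scale canvas that is already available as a NumPy array, and it uses simple block averaging to shrink it. The function name `canvas_to_model_input` is made up for this example.

```python
import numpy as np

def canvas_to_model_input(canvas, size=28):
    """Shrink a square gray-scale canvas to size x size by block averaging."""
    canvas = np.asarray(canvas, dtype="float64")
    factor = canvas.shape[0] // size
    # Crop so both sides are exact multiples of `size`.
    canvas = canvas[:factor * size, :factor * size]
    # Group pixels into factor x factor blocks and average each block.
    blocks = canvas.reshape(size, factor, size, factor)
    small = blocks.mean(axis=(1, 3))
    # Add batch and channel axes: (1, 28, 28, 1), as the model expects.
    return small.reshape(1, size, size, 1)

# Hypothetical drawing: a white vertical bar on a black canvas.
canvas = np.zeros((280, 280))
canvas[100:180, 130:150] = 255

model_input = canvas_to_model_input(canvas)
print(model_input.shape)
```

Two things to keep in mind: MNIST digits are light strokes on a dark background, so a canvas with dark strokes on a light background would need its values inverted, and the result should go through the same normalization as the training data before it reaches `model.predict`.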

Directory structure for the Flask app:

Digit-recognizer
|
└─api
| └─templates
| | └─index.html
| └─static
| | └─sketchpad.js
| └─__init__.py
|
└─model
  └─model.json
  └─Model_mnist.h5


The complete source code is available in the GitHub repository, so I will not paste all of it here.
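
For readers who only want the general shape of the app, here is a minimal sketch of a Flask prediction endpoint. It is not the code from the repository: the `/predict` route, the JSON field `pixels`, and the `predict_digit` stub are all assumptions made for illustration, and the stub returns a fixed answer where a real app would call the loaded Keras model.

```python
from flask import Flask, jsonify, request

app = Flask(__name__)

def predict_digit(pixels):
    """Hypothetical stand-in for the model.

    A real app would reshape and normalize the pixels, call
    model.predict, and return the index of the highest probability.
    """
    return 0

@app.route("/predict", methods=["POST"])
def predict():
    data = request.get_json(silent=True) or {}
    pixels = data.get("pixels")
    # Expect a flat list of 28 * 28 gray-scale values.
    if not isinstance(pixels, list) or len(pixels) != 28 * 28:
        return jsonify(error="expected 784 pixel values"), 400
    return jsonify(digit=predict_digit(pixels))
```

With something like this in `api/__init__.py`, the sketchpad script would POST the drawn pixels to the endpoint and display the returned digit.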

Deployment on Heroku

To deploy a Flask app on Heroku, we need three files:

  • Procfile
web: gunicorn wsgi:app
  • wsgi.py
from api import app
  • requirements.txt

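Generate it with pipreqs, run from the project root. pipreqs only lists packages that the code imports, so gunicorn (used only in the Procfile) may need to be added to requirements.txt by hand: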

pip install pipreqs
pipreqs
  • Create a project on Heroku and connect it to your Github repo.
  • Push the application code to the repo. The app is now deployed on Heroku.

Conclusion

There is a lot to improve. I am listing some to-dos here; feel free to contribute to the project or give feedback.

  • Improve the model
  • (Done) Enable drawing on touch devices.
