Moses Odhiambo

TinyML: Deploying TensorFlow models to Android

What is TinyML?

Tiny machine learning (TinyML) is a field that focuses on running machine learning (mostly deep learning) algorithms directly on edge devices such as microcontrollers and mobile devices. The algorithms have to be highly optimized to run on such systems, because most of these devices have limited compute, memory and power.

Wait, what do you mean by 'edge devices'?

An edge device is the device that makes use of the final output of a machine learning algorithm, for instance a camera that displays the result of image recognition, or a smartphone that plays speech synthesized from text. Most practitioners run machine learning models on more powerful machines and then send the output to edge devices, but this is starting to change with the advent of TinyML.

Why TinyML?

The need to run machine learning directly on edge devices, and the convenience that comes with it, has made TinyML one of the fastest growing fields in deep learning.

How does one go about deploying ML to edge devices?

  1. Train a machine learning model in a more powerful environment, such as a cloud virtual machine or a faster computer.

  2. Optimize the model, say, by reducing the number of parameters or by using low-precision data types such as 16-bit floats. This makes the model smaller and inference faster and more power efficient, usually at some cost in accuracy, which is a compromise you'll have to make.

  3. Run the model 'on the edge'!
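
To get a feel for the size versus accuracy trade-off in step 2, here is a minimal sketch that uses only Python's standard library. It stores the same weights as 32-bit and as 16-bit floats and measures the rounding error. The weight values and helper names are made up for illustration; a real model would be optimized with TensorFlow Lite's own tooling rather than by hand.

```python
import struct

def pack_weights(weights, fmt):
    """Serialize a list of floats; fmt is 'f' (float32) or 'e' (float16)."""
    return struct.pack(f"<{len(weights)}{fmt}", *weights)

def unpack_weights(blob, fmt):
    """Inverse of pack_weights."""
    count = len(blob) // struct.calcsize(f"<{fmt}")
    return list(struct.unpack(f"<{count}{fmt}", blob))

# Hypothetical weights, chosen only for illustration.
weights = [0.1, -0.25, 0.333333, 1.5, -2.71828]

blob32 = pack_weights(weights, "f")
blob16 = pack_weights(weights, "e")

# Round-trip through float16 and measure the worst-case rounding error.
restored16 = unpack_weights(blob16, "e")
max_error = max(abs(a - b) for a, b in zip(weights, restored16))

print(len(blob32), len(blob16))  # float16 storage is half the size of float32
print(max_error)                 # small but non-zero rounding error
```

Halving the storage per weight is where the smaller model comes from; the non-zero error is the accuracy you give up in exchange.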

TensorFlow Lite Quick Start

TensorFlow Lite is TensorFlow's take on TinyML.

Converting a saved model from TensorFlow to TensorFlow Lite

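import tensorflow as tf
# Load the trained Keras model from disk
model = tf.keras.models.load_model("/path_to_model.h5")
# Convert the Keras model to the TensorFlow Lite format
converter = tf.lite.TFLiteConverter.from_keras_model(model)
tflite_model = converter.convert()
# Write the converted model (a bytes object) to disk and close the file
with open("/tflite_model.tflite", "wb") as f:
    f.write(tflite_model)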

As you can see, it only takes a few lines of code 😊.
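
The conversion step is also a natural place to apply the optimization mentioned earlier (step 2 of the workflow). The sketch below shows one way to do it, post-training float16 quantization, wrapped in a helper function. The function name and the output path in the example call are hypothetical, and the import sits inside the function only so that the snippet loads on machines without TensorFlow installed.

```python
def convert_to_float16_tflite(keras_model_path, tflite_path):
    """Convert a saved Keras model to a float16-quantized .tflite file.

    Both arguments are placeholder paths supplied by the caller.
    """
    # Imported lazily so the sketch can be loaded without TensorFlow present.
    import tensorflow as tf

    model = tf.keras.models.load_model(keras_model_path)
    converter = tf.lite.TFLiteConverter.from_keras_model(model)

    # Enable the default post-training optimizations...
    converter.optimizations = [tf.lite.Optimize.DEFAULT]
    # ...and allow weights to be stored as 16-bit floats.
    converter.target_spec.supported_types = [tf.float16]

    tflite_model = converter.convert()
    with open(tflite_path, "wb") as f:
        f.write(tflite_model)
    return len(tflite_model)  # size of the converted model in bytes

# Example call (hypothetical output path):
# convert_to_float16_tflite("/path_to_model.h5", "/tflite_model_fp16.tflite")
```

Comparing the returned size with the size of the unoptimized file is a quick way to see what the optimization bought you; accuracy should still be checked on your own validation data.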

Running a TensorFlow Lite model in TensorFlow Lite's Python Interpreter

import tensorflow as tf
# Initialize the interpreter with the converted model
interpreter = tf.lite.Interpreter(model_path="/tflite_model.tflite")
interpreter.allocate_tensors()  # allocate memory for the input and output tensors
input_details = interpreter.get_input_details()
output_details = interpreter.get_output_details()

# Fill this list with one array per model input, matching the shape and dtype in input_details
inputs = []
for index, item in enumerate(inputs):
    interpreter.set_tensor(input_details[index]['index'], item)

interpreter.invoke()  # run the model

# Collect every output tensor
outputs = []
for detail in output_details:
    outputs.append(interpreter.get_tensor(detail['index']))

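
Each entry returned by get_input_details() is a dictionary that includes, among other keys, the tensor's 'index' and 'shape'. Since set_tensor expects inputs that match those shapes, it can help to validate them first. The following is a rough sketch of such a check; the helper names and the stand-in details list are assumptions made for illustration, not part of TensorFlow Lite.

```python
def shape_of(value):
    """Return the shape of a NumPy-style array or a rectangular nested list."""
    if hasattr(value, "shape"):
        return tuple(value.shape)
    shape = []
    while isinstance(value, (list, tuple)):
        shape.append(len(value))
        value = value[0] if value else None
    return tuple(shape)

def check_inputs(inputs, input_details):
    """Raise ValueError if inputs do not line up with the interpreter's input_details."""
    if len(inputs) != len(input_details):
        raise ValueError(f"expected {len(input_details)} inputs, got {len(inputs)}")
    for position, (item, detail) in enumerate(zip(inputs, input_details)):
        expected = tuple(int(d) for d in detail["shape"])
        actual = shape_of(item)
        if actual != expected:
            raise ValueError(f"input {position}: expected shape {expected}, got {actual}")

# Stand-in for interpreter.get_input_details(); real entries carry more keys.
fake_details = [{"index": 0, "shape": [1, 3]}]
check_inputs([[[0.1, 0.2, 0.3]]], fake_details)  # passes silently
```

With a real interpreter you would call check_inputs(inputs, input_details) just before the set_tensor loop.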

Running a TensorFlow Lite model in an Android application

1. Create a new Android Studio Project

2. Import the model into Android Studio

Copy the .tflite model to app/src/main/assets/ - create the assets folder if it does not exist.
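
Before copying the file, it can be worth checking that it really is a TensorFlow Lite model. The sketch below is a quick heuristic, assuming the model carries the standard "TFL3" FlatBuffer file identifier in bytes 4 to 7. The function name and the throwaway demo files are hypothetical, and passing this check does not guarantee the model will load on the device.

```python
import os
import tempfile

def looks_like_tflite(path):
    """Return True if the file carries the TensorFlow Lite FlatBuffer identifier."""
    with open(path, "rb") as f:
        header = f.read(8)
    # Bytes 0-3 hold the FlatBuffer root offset; bytes 4-7 hold the file identifier.
    return len(header) == 8 and header[4:8] == b"TFL3"

# Demonstration with two throwaway files instead of a real model.
with tempfile.TemporaryDirectory() as tmp:
    good = os.path.join(tmp, "good.tflite")
    bad = os.path.join(tmp, "bad.tflite")
    with open(good, "wb") as f:
        f.write(b"\x1c\x00\x00\x00TFL3" + b"\x00" * 16)
    with open(bad, "wb") as f:
        f.write(b"not a model")
    good_result = looks_like_tflite(good)
    bad_result = looks_like_tflite(bad)

print(good_result, bad_result)
```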

3. Import TensorFlow Lite into your project

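Add the following dependency to your app-level build.gradle. The + wildcard resolves to the latest published version; pin an explicit version if you need reproducible builds.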

implementation 'org.tensorflow:tensorflow-lite:+'

4. Load Model

Load the .tflite model you placed in your assets folder as a MappedByteBuffer.

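import android.content.Context;
import android.content.res.AssetFileDescriptor;
import java.io.FileInputStream;
import java.io.IOException;
import java.nio.MappedByteBuffer;
import java.nio.channels.FileChannel;

// Memory-map the model stored in the assets folder
private MappedByteBuffer loadModelFile(Context c, String MODEL_FILE) throws IOException {
    AssetFileDescriptor fileDescriptor = c.getAssets().openFd(MODEL_FILE);
    FileInputStream inputStream = new FileInputStream(fileDescriptor.getFileDescriptor());
    FileChannel fileChannel = inputStream.getChannel();
    long startOffset = fileDescriptor.getStartOffset();
    long declaredLength = fileDescriptor.getDeclaredLength();
    return fileChannel.map(FileChannel.MapMode.READ_ONLY, startOffset, declaredLength);
}

// loadModelFile throws IOException, so call it where that exception is declared or handled
model = loadModelFile(this, "name_of_model.tflite");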

5. Initialize Interpreter

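try {
    interpreter = new Interpreter(model);
}
// The constructor does not throw IOException (catching it here would not compile);
// it throws the unchecked IllegalArgumentException when it rejects the buffer
catch (IllegalArgumentException e) {
    e.printStackTrace();
}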

6. Run Model

// Each element of inputs is a multidimensional array (or ByteBuffer) holding one input tensor
Object[] inputs = {input1, input2, ...};

// Map each output index to a preallocated array that the interpreter fills in
Map<Integer, Object> outputs = new HashMap<>();
outputs.put(0, output1);

// Run inference; the results are written into the objects stored in outputs
interpreter.runForMultipleInputsOutputs(inputs, outputs);

And there you go.

Sample project

Here is the source code for a GAN deployed to an Android app with TensorFlow Lite. Here is the Android app for you to play with.
