Google launched Teachable Machine 2.0 a couple of months ago, making it easy for everyone to create machine learning models. Amazing, isn't it? Let's look at what it is and what we can do with it.
Teachable Machine is a web-based tool that makes creating machine learning models fast, easy, and accessible to everyone. Check out these amazing examples. You can also check out this YouTube video.
There are two versions: the first was released in 2017, and 2.0 is the current one. Here's when to use each:
- 2017 (if you're a learner who just wants to quickly understand or present a demo of how machine learning works and doesn't need to save anything)
- Current (if you want to save your model and build a working project)
With the current version, you can train a computer to recognize your images, sounds, and poses without writing any machine learning code, then use the resulting model in your projects, sites, apps, and more. Google has made it a very simple three-step process: Gather (collect data for training the model), Train (train it with a single click), and Export (export it to your project).
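To make the Gather/Train/Export flow concrete, here is a minimal, self-contained Python sketch of the same three steps. It deliberately uses a toy nearest-centroid classifier instead of Teachable Machine's real neural networks, and all the class names and data are made up for illustration:

```python
import json

# --- Gather: collect labeled examples (toy 2-feature vectors) ---
samples = {
    "cat": [[0.9, 0.1], [0.8, 0.2], [0.95, 0.15]],
    "dog": [[0.1, 0.9], [0.2, 0.85], [0.15, 0.8]],
}

# --- Train: fit a deliberately simple model (one centroid per class) ---
def train(samples):
    return {label: [sum(col) / len(col) for col in zip(*vecs)]
            for label, vecs in samples.items()}

model = train(samples)

# --- Export: serialize the model so another app can load and use it ---
exported = json.dumps(model)

def predict(model, x):
    def dist(centroid):  # squared Euclidean distance to a class centroid
        return sum((a - b) ** 2 for a, b in zip(x, centroid))
    return min(model, key=lambda label: dist(model[label]))

loaded = json.loads(exported)
print(predict(loaded, [0.85, 0.2]))  # → cat
```

The point is the workflow, not the algorithm: you gather examples per class, a single function call trains the model, and the exported artifact can be loaded elsewhere for predictions.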
It currently works with three kinds of input: images, sounds, and poses.
Note that all processing happens in the browser itself; your data is never uploaded to a server. For the same reason, you should not close or refresh the browser while training.
Under the hood, Teachable Machine is based on a technique known as transfer learning.
Transfer learning is a machine learning method where a model developed for a task is reused as the starting point for a model on a second task. It is a popular approach in deep learning where pre-trained models are used as the starting point on computer vision and natural language processing tasks.
Think of it this way: there's a pre-trained neural network, and when you create your classes, they effectively become the last layer or step of that network. Specifically, both the image and pose models learn on top of pre-trained MobileNet models, and the sound model is built on Speech Commands.
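Here is a tiny sketch of that idea in plain Python. The "pre-trained" feature extractor below is a hypothetical stand-in for MobileNet's frozen layers (its weights are random and never updated); only the new last layer for your classes gets trained:

```python
import math
import random

random.seed(0)

# Stand-in for a pre-trained network: its weights are FROZEN and never change.
FROZEN_W = [[random.uniform(-1, 1) for _ in range(2)] for _ in range(4)]

def extract_features(x):
    # tanh of a fixed linear map; in real transfer learning this would be
    # the output of MobileNet's convolutional layers
    return [math.tanh(sum(w * xi for w, xi in zip(row, x))) for row in FROZEN_W]

# Your classes become the new last layer: a small trainable logistic head.
head = [0.0] * 4
bias = 0.0

def predict_prob(x):
    z = sum(w * f for w, f in zip(head, extract_features(x))) + bias
    return 1 / (1 + math.exp(-z))

# Toy dataset: a few examples of class 1 and class 0
data = [([1.0, 0.0], 1), ([0.9, 0.1], 1), ([0.0, 1.0], 0), ([0.1, 0.9], 0)]

# Train ONLY the head with gradient descent; the frozen layers stay untouched.
for _ in range(500):
    for x, y in data:
        feats = extract_features(x)
        err = predict_prob(x) - y
        head = [w - 0.5 * err * f for w, f in zip(head, feats)]
        bias -= 0.5 * err

print(predict_prob([1.0, 0.0]) > 0.5)  # classified as class 1
```

Because only the tiny last layer is trained, very little data and compute are needed, which is exactly why Teachable Machine can train in seconds in a browser tab.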
MobileNets are based on a streamlined architecture that uses depthwise separable convolutions to build lightweight deep neural networks. Google has published a paper about it, which you can read here. The dataset this model is trained on is called ImageNet.
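The savings from depthwise separable convolutions are easy to quantify. A standard k×k convolution costs k²·Cin·Cout·F² multiply-accumulates per layer, while splitting it into a depthwise convolution plus a 1×1 pointwise convolution costs k²·Cin·F² + Cin·Cout·F². A quick back-of-the-envelope calculation (the layer sizes are chosen just for illustration):

```python
# Multiply-accumulate cost of one standard conv layer vs. the
# depthwise-separable version that MobileNet uses.
def standard_conv_cost(k, cin, cout, fmap):
    return k * k * cin * cout * fmap * fmap

def depthwise_separable_cost(k, cin, cout, fmap):
    depthwise = k * k * cin * fmap * fmap  # one k x k filter per input channel
    pointwise = cin * cout * fmap * fmap   # 1x1 conv mixes the channels
    return depthwise + pointwise

# Example layer: 3x3 kernel, 256 -> 256 channels, 14x14 feature map
std = standard_conv_cost(3, 256, 256, 14)
sep = depthwise_separable_cost(3, 256, 256, 14)
print(std / sep)  # roughly 8-9x cheaper
```

The speedup ratio works out to 1/Cout + 1/k², so for 3×3 kernels the separable version is close to 9x cheaper, which is what makes MobileNet light enough for browsers and phones.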
ImageNet is a large-scale image database and ontology built upon the backbone of the WordNet structure. It aims to populate the majority of WordNet's 80,000 synsets with an average of 500-1,000 clean, full-resolution images each. You can check out this paper to learn more about it. Now, let's jump to Speech Commands.
The Speech Commands model uses the web browser's WebAudio API. It is built on top of TensorFlow.js and can perform inference and transfer learning entirely in the browser, using WebGL GPU acceleration.
The possibilities are endless, from projects that are just for fun to ones that genuinely help people. For inspiration, check out this amazing project, Euphonia: Steve Saling is using Teachable Machine to communicate in new ways, such as using facial gestures to trigger sounds.
I hope this gave you a good overview. Teachable Machine is a complex system, and Google has mentioned that an in-depth write-up is coming, which will help us understand what's going on under the hood, so stay tuned! I hope you enjoyed this article.
Level up every day