For the past two months, I've been doing some research in the field of computer vision on the web.
Today's JavaScript engines are really fast. As a result, some computationally intensive tasks that were reserved for other languages and platforms just a few years ago are now feasible in web browsers or Node.js.
So if you are a JavaScript developer interested in computer vision, you are in luck.
First of all, we must distinguish computer vision from image processing. Some of the JavaScript libraries in this article are in fact just image processing libraries; performing computer vision requires more complex and sophisticated algorithms and techniques.
Image processing makes extensive use of math and algorithms to extract important image features. Computer vision combines image processing with other techniques (decision trees, Bayes classifiers, deep neural networks...) in order to recognize objects or categorize images.
Computer vision tries to do what a human brain does when it recognizes shapes, objects or situations in an image, while image processing is mainly focused on processing raw images, making them optimal for other tasks (noise reduction, for example) and extracting key features.
Some JavaScript computer vision libraries, like tracking.js or handtrack.js, are very specialized in their scope: they try to detect specific kinds of "objects" such as faces, eyes or hands. These libraries give you ready-to-go systems for actual computer vision tasks. Others, like Opencv4nodejs / OpenCV, aim to provide more general frameworks that can help solve a wider range of computer vision problems.
Here are some of the libraries that I found especially interesting in the field of image processing and computer vision, all of them open source.
GammaCV
A WebGL-accelerated computer vision library. It uses a data flow paradigm to create and run graphs on the GPU. This is a very compact library: it weighs just 32.5K minified.
In addition to the most common algorithms (grayscaling, color segmentation…) it implements some more sophisticated ones like Canny edges, the Sobel operator and line detection, but it lacks important feature extraction algorithms like FAST or ORB.
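To give an idea of what an operator like Sobel computes, here is a minimal plain JavaScript sketch that runs on the CPU. It is not GammaCV's API or implementation; the function name `sobelMagnitude` and the flat grayscale array layout are assumptions made for illustration.

```javascript
// Sobel gradient magnitude for a grayscale image stored row by row.
// `gray` holds width * height intensity values; border pixels are left at 0.
function sobelMagnitude(gray, width, height) {
  const out = new Float32Array(width * height);
  for (let y = 1; y < height - 1; y++) {
    for (let x = 1; x < width - 1; x++) {
      const i = y * width + x;
      // Horizontal gradient: right column minus left column (centre row weighted by 2).
      const gx =
        gray[i - width + 1] + 2 * gray[i + 1] + gray[i + width + 1] -
        (gray[i - width - 1] + 2 * gray[i - 1] + gray[i + width - 1]);
      // Vertical gradient: bottom row minus top row (centre column weighted by 2).
      const gy =
        gray[i + width - 1] + 2 * gray[i + width] + gray[i + width + 1] -
        (gray[i - width - 1] + 2 * gray[i - width] + gray[i - width + 1]);
      out[i] = Math.hypot(gx, gy);
    }
  }
  return out;
}

// Tiny 4x4 image: dark left half, bright right half (one vertical edge).
const sobelInput = Uint8Array.from([
  0, 0, 255, 255,
  0, 0, 255, 255,
  0, 0, 255, 255,
  0, 0, 255, 255,
]);
const sobelEdges = sobelMagnitude(sobelInput, 4, 4);
console.log(sobelEdges[5], sobelEdges[6]); // prints 1020 1020
```

A GPU library evaluates the same kind of 3x3 neighbourhood for every pixel in parallel, which is where the speed-up comes from.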
Website: https://gammacv.com
Github repository: https://github.com/PeculiarVentures/GammaCV
Opencv4nodejs
Opencv4nodejs is not exactly a pure JavaScript library but an npm package that provides Node.js bindings to OpenCV through an asynchronous API. It supports OpenCV 3 and OpenCV 4, so it brings all the performance benefits of the native OpenCV library to your Node.js application and makes it easy to implement multithreaded CV tasks via Promises. It sounds really great.
OpenCV (Open Source Computer Vision Library) is a library of programming functions mainly aimed at real-time computer vision.
If execution in the browser is not an important requirement, Opencv4nodejs is probably the most interesting option given the performance and maturity of OpenCV.
Github repository: https://github.com/justadudewhohacks/opencv4nodejs/
OpenCV.js
If you are looking for a 100% browser solution, OpenCV.js takes a different approach: it offers JavaScript bindings for a subset of the OpenCV library, compiled to WebAssembly.
You can't expect OpenCV.js to do everything you can do with OpenCV using C or Python, or even with Opencv4nodejs. The documentation is not that good either.
An additional problem to take into account is the size of the library itself, 2 MB, which makes it a poor fit for some networks and devices.
To be clear, OpenCV.js is a really interesting WebAssembly implementation but, in my opinion, you can probably find better alternatives depending on the kind of task you are trying to achieve.
MarvinJ
MarvinJ is a pure JavaScript image processing library. It derives from Marvin Framework, a cross-platform Java image processing framework.
MarvinJ provides a set of algorithms and filters (Gaussian, emboss, grayscale, thresholding…) that could be wide enough for your purposes but, as in the case of GammaCV, it lacks feature extraction algorithms. I could only find the Prewitt edge filter, which is not exactly one of the most used.
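Two of the basic filters mentioned above, grayscale and thresholding, are simple enough to sketch in a few lines of plain JavaScript. This is not MarvinJ's API; the function names are hypothetical, and the input is assumed to be RGBA bytes in the same layout as a canvas `ImageData.data` array.

```javascript
// Convert RGBA pixel data to one luminance value per pixel.
function toGrayscale(rgba) {
  const gray = new Uint8ClampedArray(rgba.length / 4);
  for (let i = 0, j = 0; i < rgba.length; i += 4, j++) {
    // ITU-R BT.601 luma weights; the alpha channel is ignored.
    gray[j] = 0.299 * rgba[i] + 0.587 * rgba[i + 1] + 0.114 * rgba[i + 2];
  }
  return gray;
}

// Binarize: pixels at or above `cutoff` become white, the rest black.
function thresholdGray(gray, cutoff) {
  return gray.map((v) => (v >= cutoff ? 255 : 0));
}

// Three pixels: white, black and pure red.
const rgbaSample = Uint8ClampedArray.from([
  255, 255, 255, 255,
  0, 0, 0, 255,
  255, 0, 0, 255,
]);
const binarySample = thresholdGray(toGrayscale(rgbaSample), 128);
console.log(Array.from(binarySample)); // prints [ 255, 0, 0 ]
```

In the browser, the input array would typically come from `getImageData()` on a 2D canvas context.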
Website: http://www.marvinj.org/en/index.html
Github repository: https://github.com/gabrielarchanjo/marvinj
tracking.js
This library brings some well-known image processing algorithms (Gaussian blur, grayscale, convolution...) along with various computer vision algorithms to JavaScript. It can perform color tracking, face detection and feature detection. It is well documented and the examples on the website are very illustrative.
It is very easy to implement color tracking, face detection (not recognition) or eye tracking from video or webcam. Tracking.js also provides a simple framework to implement your own object tracking algorithm. Of course, it comes with some filters and feature extraction tools like FAST and BRIEF.
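The idea behind color tracking can be sketched without any library: scan the frame, keep the pixels close to a target color, and report the rectangle that encloses them. The code below is a simplified illustration, not how tracking.js implements it; `trackColor` and its parameters are hypothetical names, and a real tracker would also separate distinct regions.

```javascript
// Bounding box of all pixels within `tolerance` (Euclidean RGB distance) of `target`.
// `rgba` uses the canvas ImageData layout; returns null when nothing matches.
function trackColor(rgba, width, height, target, tolerance) {
  let minX = width, minY = height, maxX = -1, maxY = -1;
  const maxDist2 = tolerance * tolerance;
  for (let y = 0; y < height; y++) {
    for (let x = 0; x < width; x++) {
      const i = (y * width + x) * 4;
      const dr = rgba[i] - target.r;
      const dg = rgba[i + 1] - target.g;
      const db = rgba[i + 2] - target.b;
      if (dr * dr + dg * dg + db * db <= maxDist2) {
        if (x < minX) minX = x;
        if (x > maxX) maxX = x;
        if (y < minY) minY = y;
        if (y > maxY) maxY = y;
      }
    }
  }
  if (maxX < 0) return null;
  return { x: minX, y: minY, width: maxX - minX + 1, height: maxY - minY + 1 };
}

// 4x4 black frame with a 2x2 magenta patch whose top-left corner is at (1, 1).
const frame = new Uint8ClampedArray(4 * 4 * 4);
for (const [px, py] of [[1, 1], [2, 1], [1, 2], [2, 2]]) {
  frame.set([255, 0, 255, 255], (py * 4 + px) * 4);
}
console.log(trackColor(frame, 4, 4, { r: 255, g: 0, b: 255 }, 30));
// prints { x: 1, y: 1, width: 2, height: 2 }
```

Running this on every video frame and drawing the returned rectangle is the basic loop of a color tracker.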
Website: http://trackingjs.com/
Github repository: https://github.com/eduardolundgren/tracking.js/
jsfeat
jsfeat has a rich and varied feature set for implementing image processing in any browser. It can perform tasks such as edge detection, basic image processing (grayscale, blur, etc.), corner detection, object detection and optical flow.
This library is very lightweight (23 kB) and really fast, with very good performance on desktop computers and even mobile devices. On its website you can find lots of real-time demos and examples using your webcam (WebRTC required), so you can check the resulting frame rate in all of them.
The JSFeat documentation is very good. Of course, this library includes basic filters and algorithms (grayscale, derivatives, box blur, resample, Gaussian blur, histogram equalization) but also more advanced operations like:
Canny edges
Fast Corners feature detector
Lucas-Kanade optical flow
HAAR object detector
BBF object detector
All of these can be considered advanced feature extractors.
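HAAR-style object detectors like the one in the list above are usually built on integral images (the Viola-Jones approach), which return the sum of any rectangle of pixels in four lookups. The following is a minimal plain JavaScript sketch of that building block, not jsfeat's API; the function names are hypothetical.

```javascript
// Integral image with one extra row and column of zeros:
// sum[(y + 1) * (width + 1) + (x + 1)] holds the sum of all pixels in rows 0..y, columns 0..x.
function integralImage(gray, width, height) {
  const stride = width + 1;
  const sum = new Uint32Array(stride * (height + 1));
  for (let y = 0; y < height; y++) {
    let rowSum = 0;
    for (let x = 0; x < width; x++) {
      rowSum += gray[y * width + x];
      sum[(y + 1) * stride + (x + 1)] = sum[y * stride + (x + 1)] + rowSum;
    }
  }
  return sum;
}

// Sum of the w x h rectangle whose top-left pixel is (x, y), using four lookups.
function boxSum(sum, width, x, y, w, h) {
  const stride = width + 1;
  return (
    sum[(y + h) * stride + (x + w)] -
    sum[y * stride + (x + w)] -
    sum[(y + h) * stride + x] +
    sum[y * stride + x]
  );
}

// 3x3 image with the values 1..9.
const integralInput = Uint8Array.from([1, 2, 3, 4, 5, 6, 7, 8, 9]);
const integral = integralImage(integralInput, 3, 3);
console.log(boxSum(integral, 3, 1, 1, 2, 2)); // prints 28 (5 + 6 + 8 + 9)
```

A HAAR feature is then just a difference between a few such rectangle sums, which is why the detector can evaluate thousands of features per window.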
Website: http://inspirit.github.io/jsfeat/
Github repository: https://github.com/inspirit/jsfeat
In a future article I will show a small experiment using this library.
PoseNet
A machine learning model, built on TensorFlow.js, that allows real-time human pose estimation in the browser.
PoseNet can estimate either a single pose or multiple poses: one version of the algorithm detects only one person in an image or video, and another version detects several people.
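Once a pose has been estimated, the usual next step is to keep only the keypoints the model is confident about. The sketch below assumes the pose shape described in the TensorFlow.js PoseNet documentation (an overall `score` plus a `keypoints` array of `{ part, score, position }` objects); the sample values are made up and `confidentKeypoints` is a hypothetical helper, not part of PoseNet.

```javascript
// Hypothetical sample in the shape PoseNet returns for one pose (values are made up).
const samplePose = {
  score: 0.8,
  keypoints: [
    { part: "nose", score: 0.99, position: { x: 120, y: 80 } },
    { part: "leftEye", score: 0.95, position: { x: 110, y: 70 } },
    { part: "leftAnkle", score: 0.12, position: { x: 100, y: 400 } },
  ],
};

// Keep only the keypoints whose confidence is at least `minScore`.
function confidentKeypoints(pose, minScore) {
  return pose.keypoints.filter((kp) => kp.score >= minScore);
}

console.log(confidentKeypoints(samplePose, 0.5).map((kp) => kp.part));
// prints [ 'nose', 'leftEye' ]
```

With the multi-pose version you would get an array of such pose objects and could apply the same filter to each one.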
As you can see, there are some interesting options if you don't want to start coding your image processing system from scratch. If you are planning to learn about computer vision, I encourage you to start experimenting with them.
And that is all! Thanks for reading. This is my first article, and I hope you found it useful. I look forward to hearing any feedback or suggestions.