jesusramirezs

Posted on Sep 30, 2020 • Edited on Oct 23, 2020

A simple experiment with the JSFeat library combining skin and edge detection

#javascript #frontend #development #todayilearned

In a previous article, I briefly reviewed some libraries that allow experimenting with computer vision and image processing in JavaScript. This is an area that I find fascinating and fun.

Among the libraries listed, one in particular caught my attention: JSFeat. It seems to be quite a complete library in terms of the filters and algorithms it offers, and it also has good documentation and some quite illustrative examples.

I found it very easy to start playing with this library. Each filter or algorithm in the library is documented with a simple example, and all of them work in real time with the PC’s webcam.

I thought it would be interesting to try something I have been thinking about: a simple hand gesture/movement detector. To do this, I will first apply a simple pre-filter to the image in real time to separate the skin tones from the rest of the colors in the image.

I know that the result won’t be rigorous, but I am not trying to get a 100% reliable result: it is just a test intended to simplify the initial problem as much as possible.

To start the experiment, we only need a local HTTP server, for example Apache, and the code from one of JSFeat’s most basic examples to use as a template. We can start from the “canny edge demo”, which already uses one of the best-known edge detection algorithms, the Canny edge detector:

https://inspirit.github.io/jsfeat/sample_canny_edge.html

The JSFeat website does not offer the examples as a ready-to-clone package, so you will have to set up a “js” folder with the necessary libraries next to your .html file, or modify the code so that it does not use them:

jsfeat-min.js: Github: https://github.com/inspirit/jsfeat
profiler.js
compatibility.js
bootstrap.js

and in a folder named “css”:

js-feat.css // basic styles
bootstrap.css // bootstrap CSS

There is a fair amount of code dedicated to initializing the webcam and creating the canvas onto which the webcam video stream is drawn and the algorithms are applied. Let's skip all of this and focus on just two functions:

    demo_app()
    tick()

demo_app() is an initialization function, while tick() is executed for each frame of video captured from our webcam.

In demo_app() we find two important lines of code. The first is:

    ctx = canvas.getContext('2d');

The getContext() function returns the drawing context from the HTML canvas - which is an object that has all the drawing properties and functions you use to draw on the canvas.

At each frame, we will draw the image captured from our webcam into this drawing context.

The second line is:

    img_u8 = new jsfeat.matrix_t(640, 480, jsfeat.U8_t | jsfeat.C1_t);

JSFeat uses a data structure called “matrix_t”, a matrix that we create with the dimensions of our HTML canvas, which match the resolution chosen for the video captured from our webcam: in our case, 640 x 480 pixels. The edge detection algorithm will be applied to this matrix once we have filtered the skin tones.

The matrix must be initialized with the type of data that represents each pixel and the number of channels to use. In our case this is “single-channel unsigned char” (jsfeat.U8_t | jsfeat.C1_t), because once we have separated the skin from the rest of the image, we will apply edge detection to the monochrome image produced by the “grayscale” function.
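
To make the “single-channel unsigned char” layout more concrete, here is a minimal sketch of such a buffer in plain JavaScript. It is not JSFeat's matrix_t; the helper names createGrayBuffer, setGray and getGray are hypothetical, and the row-major indexing is an assumption made for illustration only.

```javascript
// Hypothetical stand-in for a single-channel, unsigned 8-bit image buffer.
// This is NOT jsfeat's matrix_t, only an illustration of the idea.
function createGrayBuffer(cols, rows) {
    return { cols: cols, rows: rows, data: new Uint8Array(cols * rows) };
}

// Assumed row-major layout: one byte per pixel, rows stored one after another.
function setGray(buf, x, y, value) {
    buf.data[y * buf.cols + x] = value;
}

function getGray(buf, x, y) {
    return buf.data[y * buf.cols + x];
}

const gray = createGrayBuffer(640, 480);
setGray(gray, 10, 20, 200);
console.log(gray.data.length);      // 307200 (640 * 480)
console.log(getGray(gray, 10, 20)); // 200
```

A 640 x 480 frame therefore takes 307200 bytes in this form, a quarter of the size of the RGBA data that comes from the canvas.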

It is important to note that the skin pre-filtering is not performed with any JSFeat algorithm but with a function written from scratch, in which the “img_u8” data structure is not involved.

This function traverses an array of “RGBA” data, where each pixel is represented by four bytes: the red, green and blue color components and the alpha channel.
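
As a small illustration of that four-bytes-per-pixel layout, the sketch below computes where a given pixel lives in a flat RGBA array. The helper rgbaOffset and the tiny 2 x 1 test image are hypothetical and only serve to show the indexing.

```javascript
// Byte offset of pixel (x, y) in a flat RGBA buffer of the given width.
function rgbaOffset(x, y, width) {
    return (y * width + x) * 4;
}

// A 2 x 1 test image: one opaque red pixel, one half-transparent blue pixel.
const rgba = new Uint8ClampedArray([255, 0, 0, 255,   0, 0, 255, 128]);

const offset = rgbaOffset(1, 0, 2);
console.log(offset); // 4
console.log(rgba[offset], rgba[offset + 1], rgba[offset + 2], rgba[offset + 3]); // 0 0 255 128
```

This is why a loop over such data advances in steps of 4: each step moves to the red component of the next pixel.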

To determine whether or not a pixel corresponds to skin, we first convert its color from RGB format to HSV format using the following function:

    function rgb2hsv(r, g, b) {
                let rabs, gabs, babs, rr, gg, bb, h, s, v, diff, diffc, percentRoundFn;
                rabs = r / 255;
                gabs = g / 255;
                babs = b / 255;
                v = Math.max(rabs, gabs, babs);
                diff = v - Math.min(rabs, gabs, babs);
                diffc = c => (v - c) / 6 / diff + 1 / 2;
                percentRoundFn = num => Math.round(num * 100) / 100;
                if (diff == 0) {
                    h = s = 0;
                } else {
                    s = diff / v;
                    rr = diffc(rabs);
                    gg = diffc(gabs);
                    bb = diffc(babs);

                    if (rabs === v) {
                        h = bb - gg;
                    } else if (gabs === v) {
                        h = (1 / 3) + rr - bb;
                    } else if (babs === v) {
                        h = (2 / 3) + gg - rr;
                    }
                    if (h < 0) {
                        h += 1;
                    } else if (h > 1) {
                        h -= 1;
                    }
                }
                return {
                    h: Math.round(h * 360),
                    s: percentRoundFn(s * 100),
                    v: percentRoundFn(v * 100)
                };
            }
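
For a quick sanity check of this kind of conversion, here is a minimal sketch of the more common piecewise RGB-to-HSV formulation. The name hsvReference is hypothetical and not part of the demo; it rounds s and v to whole percentages, so it may differ from rgb2hsv in the decimals.

```javascript
// Textbook RGB -> HSV conversion: hue in degrees, s and v in percent.
function hsvReference(r, g, b) {
    const rn = r / 255, gn = g / 255, bn = b / 255;
    const max = Math.max(rn, gn, bn);
    const min = Math.min(rn, gn, bn);
    const delta = max - min;
    let h = 0;
    if (delta !== 0) {
        // The hue sector depends on which channel is the largest.
        if (max === rn) h = ((gn - bn) / delta) % 6;
        else if (max === gn) h = (bn - rn) / delta + 2;
        else h = (rn - gn) / delta + 4;
        h *= 60;
        if (h < 0) h += 360;
    }
    const s = max === 0 ? 0 : delta / max;
    return {
        h: Math.round(h) % 360,
        s: Math.round(s * 100),
        v: Math.round(max * 100)
    };
}

console.log(hsvReference(255, 0, 0)); // { h: 0, s: 100, v: 100 }
console.log(hsvReference(0, 255, 0)); // { h: 120, s: 100, v: 100 }
console.log(hsvReference(0, 0, 255)); // { h: 240, s: 100, v: 100 }
```

Pure red, green and blue land at 0, 120 and 240 degrees, which is a handy way to confirm that a conversion function is wired correctly before using it in a filter.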

Next, we use the algorithm proposed in the following paper, which reports the results of analyzing the “Pratheepan dataset for human skin detection”:

https://arxiv.org/ftp/arxiv/papers/1708/1708.02694.pdf

This simple algorithm is run over the pixel data obtained from the canvas declared in our HTML document:

    function filterSkin(data) {

        for (var i = 0; i < data.length; i += 4) {

            var hsv = rgb2hsv(data[i], data[i + 1], data[i + 2]);

            if (!(((0.0 <= hsv.h && hsv.h <= 50.0)) && 23 <= hsv.s && hsv.s <= 68  &&
                data[i] > 95 && data[i + 1] > 40 && data[i + 2] > 20 && data[i] > data[i + 1] &&
                data[i] > data[i + 2] && (data[i] - data[i + 1]) > 15 && data[i + 3] > 15) ) {

                data[i] = 0;
                data[i + 1] = 0;
                data[i + 2] = 0;
            }


        }
    }
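
To try the rule without a webcam, the sketch below restates the same thresholds as a standalone predicate and runs it on a two-pixel buffer. The names isSkinPixel and maskNonSkin are hypothetical. As a simplification, the sketch relies on the fact that the rule already requires red to be the largest channel, so hue and saturation can be computed directly, and it skips the rounding done in rgb2hsv.

```javascript
// Sketch of the per-pixel skin rule, written as a standalone predicate.
function isSkinPixel(r, g, b, a) {
    // RGB and alpha conditions first: they are cheap and reject most pixels.
    if (!(r > 95 && g > 40 && b > 20 && r > g && r > b && r - g > 15 && a > 15)) {
        return false;
    }
    // Here r is strictly the largest channel, so the HSV formulas simplify.
    const min = Math.min(g, b);
    const s = ((r - min) / r) * 100;    // saturation in percent
    let h = (60 * (g - b)) / (r - min); // hue in degrees
    if (h < 0) h += 360;
    return h <= 50 && s >= 23 && s <= 68;
}

// Blacks out every non-skin pixel of a flat RGBA array, in place.
function maskNonSkin(data) {
    for (let i = 0; i < data.length; i += 4) {
        if (!isSkinPixel(data[i], data[i + 1], data[i + 2], data[i + 3])) {
            data[i] = data[i + 1] = data[i + 2] = 0;
        }
    }
    return data;
}

// Two opaque pixels: a light skin-like tone and a saturated green.
const pixels = new Uint8ClampedArray([224, 172, 105, 255,   30, 200, 40, 255]);
maskNonSkin(pixels);
console.log(pixels.join(',')); // 224,172,105,255,0,0,0,255
```

Because the rounding is skipped, pixels that sit exactly on a threshold could be classified differently from filterSkin; for an experiment like this one the difference should be negligible.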

So the final data flow in the tick() function is:

        // the frame is drawn from the video stream into the 2D context of the canvas 
        ctx.drawImage(video, 0, 0, 640, 480);

        // we get the image data (matrix+metadata) from the 2D context
        var imageData = ctx.getImageData(0, 0, 640, 480);

        // the image data matrix is passed to the Skin Filtering function
        filterSkin(imageData.data);   

        // the new image content is passed to grayscale function. The result is a one byte per pixel image
        jsfeat.imgproc.grayscale(imageData.data, 640, 480, img_u8);

        // let's apply some Gaussian blur to reduce noise
        jsfeat.imgproc.gaussian_blur(img_u8, img_u8, 4, 0);

        // the monochrome image is passed to the Canny edge detection algorithm
        jsfeat.imgproc.canny(img_u8, img_u8, 35, 40);
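
The snippet above stops at the edge detection step. To display the result, the one-byte-per-pixel image presumably has to be expanded back into RGBA before it can be drawn on the canvas. The sketch below shows one way to do that; the helper grayToRgba is hypothetical, and the small array only stands in for the data held by img_u8.

```javascript
// Expands a one-byte-per-pixel image into an opaque RGBA buffer.
function grayToRgba(gray, rgba) {
    for (let i = 0, j = 0; i < gray.length; i++, j += 4) {
        const v = gray[i];
        rgba[j] = rgba[j + 1] = rgba[j + 2] = v; // same value for R, G and B
        rgba[j + 3] = 255;                       // fully opaque
    }
    return rgba;
}

// Stand-in for the edge map: one black pixel and one white (edge) pixel.
const edges = new Uint8Array([0, 255]);
const out = grayToRgba(edges, new Uint8ClampedArray(edges.length * 4));
console.log(out.join(',')); // 0,0,0,255,255,255,255,255
```

In the browser, the expanded values would be written into imageData.data and drawn with ctx.putImageData(imageData, 0, 0).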

I would like to continue with these experiments and see how far I can go.

Thanks for reading this article. Any feedback will be greatly appreciated.

Connect with me on Twitter or LinkedIn

Oldest comments (1)

Mes • Feb 28 '22

Thank you, Great experiment.
Do you think it is "safe" to use this library? Although it is really impressive, it was created 9 years ago and I don't see anyone taking care of it right now.
I am looking for a good JS library to stabilize video. I saw that jsfeat provides "Multiview", which does this, but I can't get it working and I am not sure about its community and maintenance...