We'll see how to use the OpenCV library directly on the browser! To do this, we will compile OpenCV to webassembly and then run it inside a webworker.
What is OpenCV
OpenCV is the most popular library of Computer Vision, and has existed since 1999! What it does is providing a user-friendly and highly efficient development environment. It is a library written in C and C++ by Intel.
OpenCV can also use Intel's embedded performance primitives, a set of low-level routines specific of Intel.
With OpenCV you can develop things like:
- 2D and 3D feature toolkits
- Egomotion estimation
- Facial recognition system
- Gesture recognition
- Human–computer interaction (HCI)
- Mobile robotics
- Motion understanding
- Object identification
- Segmentation and recognition
- Stereopsis stereo vision: depth perception from 2 cameras
- Structure from motion (SFM)
- Motion tracking
- Augmented reality
Why in the browser
Being able to run computer vision algorithms directly from the browser allows us to move costs to the client device, and thus save many costs on the server.
Imagine you want to get the characteristics of a label of wine from a picture. There are many ways to do this. If we look for the most ergonomic way for our server, we'd move part of the wine label detection logic in the browser. Then, when we fetch the request to the server, we'll only need to send the final vector. This way, we avoid processing the image on the server.
Or even if it's an embedded app for the private use of a company, we could put all the logic in the browser.
Starting a new Next.js project
We're going to use the Next.js framework with React, to ease the setup and use of the project. However, the same can be applied to a project with Angular, Vue.js, Svelte... or vanilla.js.
To start with, let's just create a new Next.js project with the following command:
yarn create next-app
Once you fill in the name of your project, raise the local environment with yarn dev
. Now we are ready to start using OpenCV in our Next.js project.
Compile OpenCV into Webassembly
To compile OpenCV to webassembly we can follow the official documentation at:
However, I'll tell you the steps I've taken:
First clone the OpenCV repo:
git clone https://github.com/opencv/opencv.git
Now, once inside the repo directory we've cloned, let's compile with Docker!
For Linux / Mac:
docker run --rm --workdir /code -v "$PWD":/code "trzeci/emscripten:latest" python ./platforms/js/build_js.py build
For Windows:
docker run --rm --workdir /code -v "$(get-location):/code" "trzeci/emscripten:latest" python ./platforms/js/build_js.py build
Now it's time to wait... it may take about 15 minutes.
Once finished, copy the file you've generated into the project and then move it into /public
.
public
├── favicon.ico
├── js
+│ ├── opencv.js
└── vercel.svg
Loading OpenCV on a Worker
Once we have the OpenCV file in webassembly inside the /public
directory, it's ready to use it inside a worker.
It is important to use a worker because all OpenCV functions are very expensive and would block the UI. It is not mandatory to use a worker, but highly recommended.
Creating the worker
Let's create the worker within the same /public
directory.
public
├── favicon.ico
├── js
+│ ├── cv.worker.js
│ ├── opencv.js
└── vercel.svg
The initial content will be like this:
/**
* Here we will check from time to time if we can access the OpenCV
* functions. We will return in a callback if it's been resolved
* well (true) or if there has been a timeout (false).
*/
function waitForOpencv(callbackFn, waitTimeMs = 30000, stepTimeMs = 100) {
if(cv.Mat) callbackFn(true);
let timeSpentMs = 0;
const interval = setInterval(() => {
const limitReached = timeSpentMs > waitTimeMs;
if(cv.Mat || limitReached) {
clearInterval(interval);
return callbackFn(!limitReached);
}
else {
timeSpentMs += stepTimeMs;
}
}, stepTimeMs);
}
/**
* This exists to capture all the events that are thrown out of the worker
* into the worker. Without this, there would be no communication possible
* with the project.
*/
onmessage = function(e) {
switch(e.data.msg) {
case 'load': {
// Import Webassembly script
self.importScripts('./opencv.js');
waitForOpencv(function(success){
if(success) postMessage({ msg: e.data.msg });
else throw new Error('Error on loading OpenCV');
});
break;
}
default: break;
}
}
Loading the worker in our project
Okay, now we can create in our project a service that communicates with the worker. For this, we are going to create a services
directory where we will put our file.
services
+└── cv.js
Once the file has been created, we will enter this initial code, which will allow us to load OpenCV into our project:
class CV {
/**
* We will use this method privately to communicate with the worker and
* return a promise with the result of the event. This way we can call
* the worker asynchronously.
*/
_dispatch(event) {
const { msg } = event
this._status[msg] = ['loading']
this.worker.postMessage(event)
return new Promise((res, rej) => {
let interval = setInterval(() => {
const status = this._status[msg]
if(status[0] === 'done') res(status[1])
if(status[0] === 'error') rej(status[1])
if(status[0] !== 'loading') {
delete this._status[msg]
clearInterval(interval)
}
}, 50)
})
}
/**
* First, we will load the worker and capture the onmessage
* and onerror events to always know the status of the event
* we have triggered.
*
* Then, we are going to call the 'load' event, as we've just
* implemented it so that the worker can capture it.
*/
load() {
this._status = {}
this.worker = new Worker('/js/cv.worker.js') // load worker
// Capture events and save [status, event] inside the _status object
this.worker.onmessage = e => this._status[e.data.msg] = ['done', e]
this.worker.onerror = e => this._status[e.data.msg] = ['error', e]
return this._dispatch({ msg: 'load' })
}
}
// Export the same instant everywhere
export default new CV()
Using the service
Since we are exporting the instance directly, we can import it into our page or component.
For example, we could load it on an onClick
event:
async function onClick() {
await cv.load()
// Ready to use OpenCV on our component
}
Using OpenCV in the browser
Now that we have managed to load the OpenCV library in our browser we will see how to run some utilities from the library.
Of course you can do many things with OpenCV. Here I'll show a simple example. Then it will be your job to read the official documentation and learn how to use OpenCV.
The example we're going to use is a simple image processing, to take pictures with the camera and processing them to a grayscale. Although it may seem simple, this is our first "hello world" with OpenCV.
import { useEffect, useRef, useState } from 'react'
import cv from '../services/cv'
// We'll limit the processing size to 200px.
const maxVideoSize = 200
/**
* What we're going to render is:
*
* 1. A video component so the user can see what's on the camera.
*
* 2. A button to generate an image of the video, load OpenCV and
* process the image.
*
* 3. A canvas to allow us to capture the image of the video and
* show it to the user.
*/
export default function Page() {
const [processing, updateProcessing] = useState(false)
const videoElement = useRef(null)
const canvasEl = useRef(null)
/**
* In the onClick event we'll capture a frame within
* the video to pass it to our service.
*/
async function onClick() {
updateProcessing(true)
const ctx = canvasEl.current.getContext('2d')
ctx.drawImage(videoElement.current, 0, 0, maxVideoSize, maxVideoSize)
const image = ctx.getImageData(0, 0, maxVideoSize, maxVideoSize)
// Load the model
await cv.load()
// Processing image
const processedImage = await cv.imageProcessing(image)
// Render the processed image to the canvas
ctx.putImageData(processedImage.data.payload, 0, 0)
updateProcessing(false)
}
/**
* In the useEffect hook we'll load the video
* element to show what's on camera.
*/
useEffect(() => {
async function initCamara() {
videoElement.current.width = maxVideoSize
videoElement.current.height = maxVideoSize
if (navigator.mediaDevices && navigator.mediaDevices.getUserMedia) {
const stream = await navigator.mediaDevices.getUserMedia({
audio: false,
video: {
facingMode: 'user',
width: maxVideoSize,
height: maxVideoSize,
},
})
videoElement.current.srcObject = stream
return new Promise(resolve => {
videoElement.current.onloadedmetadata = () => {
resolve(videoElement.current)
}
})
}
const errorMessage =
'This browser does not support video capture, or this device does not have a camera'
alert(errorMessage)
return Promise.reject(errorMessage)
}
async function load() {
const videoLoaded = await initCamara()
videoLoaded.play()
return videoLoaded
}
load()
}, [])
return (
<div
style={{
display: 'flex',
justifyContent: 'center',
alignItems: 'center',
flexDirection: 'column',
}}
>
<video className="video" playsInline ref={videoElement} />
<button
disabled={processing}
style={{ width: maxVideoSize, padding: 10 }}
onClick={onClick}
>
{processing ? 'Processing...' : 'Take a photo'}
</button>
<canvas
ref={canvasEl}
width={maxVideoSize}
height={maxVideoSize}
></canvas>
</div>
)
}
In our service:
class CV {
// ...previous service code here...
/**
* We are going to use the _dispatch event we created before to
* call the postMessage with the msg and the image as payload.
*
* Thanks to what we've implemented in the _dispatch, this will
* return a promise with the processed image.
*/
imageProcessing(payload) {
return this._dispatch({ msg: 'imageProcessing', payload })
}
}
In our worker:
// ...previous worker code here...
/**
* With OpenCV we have to work with the images as cv.Mat (matrices),
* so you'll have to transform the ImageData to it.
*/
function imageProcessing({ msg, payload }) {
const img = cv.matFromImageData(payload)
let result = new cv.Mat()
// This converts the image to a greyscale.
cv.cvtColor(img, result, cv.COLOR_BGR2GRAY)
postMessage({ msg, payload: imageDataFromMat(result) })
}
/**
* This function converts again from cv.Mat to ImageData
*/
function imageDataFromMat(mat) {
// converts the mat type to cv.CV_8U
const img = new cv.Mat()
const depth = mat.type() % 8
const scale =
depth <= cv.CV_8S ? 1.0 : depth <= cv.CV_32S ? 1.0 / 256.0 : 255.0
const shift = depth === cv.CV_8S || depth === cv.CV_16S ? 128.0 : 0.0
mat.convertTo(img, cv.CV_8U, scale, shift)
// converts the img type to cv.CV_8UC4
switch (img.type()) {
case cv.CV_8UC1:
cv.cvtColor(img, img, cv.COLOR_GRAY2RGBA)
break
case cv.CV_8UC3:
cv.cvtColor(img, img, cv.COLOR_RGB2RGBA)
break
case cv.CV_8UC4:
break
default:
throw new Error(
'Bad number of channels (Source image must have 1, 3 or 4 channels)'
)
}
const clampedArray = new ImageData(
new Uint8ClampedArray(img.data),
img.cols,
img.rows
)
img.delete()
return clampedArray
}
onmessage = function(e) {
switch(e.data.msg) {
// ...previous onmessage code here...
case 'imageProcessing':
return imageProcessing(e.data)
default:
break
}
}
The result:
Although we have processed the image in a very simple way and we could have done it without using OpenCV, this is our "hello world" with OpenCV. It opens the doors to more complex things.
Conclusion
We have seen how to use the most used library for computer vision in the browser. We've seen how to compile OpenCV into webassembly and use it in a worker to not block the UI for a good performance. I hope that even if you have never heard of this library, now you'll give it a try.
Code
I've uploaded the code of this article on GitHub in case you want to take a look.
To see a more sophisticated example implemented in Vue.js, take a look at this other repo:
Top comments (3)
Another example that I'm working now: label detection 😊
Is this work in React Native?
I didn't try it so I dunno. But I guess that yes because the javascript used is not browser-specific.