How to detect objects in videos in a web browser using YOLOv8 neural network and JavaScript

Andrey Germanov on May 31, 2023

h-pozuelo

I tried everything.
Your tutorial works fine with the 'yolov8n.onnx' model I've just exported.
But if I put in my own 'yolov8_custom.onnx' model (a large model I trained), it doesn't detect anything.

Could you help me?

Andrey Germanov • Edited

Could you share the model and a sample image or video for testing?

h-pozuelo

I was using webcam input (it worked with the yolov8n model).
Here is the model:
we.tl/t-NMFDafC8Pc
Also, here is the project folder if you want to try it:
we.tl/t-n2lGv4o0oW

I wasn't able to upload it to GitHub, the file is too large :/

Andrey Germanov • Edited

These links do not work in my location.
Can you try Google Drive? It works fine here.

h-pozuelo

Okay, one moment.

Andrey Germanov • Edited

Cool, I just ran your code and it worked. (However, I do not know American Sign Language, so maybe the model predicts incorrect labels.)

h-pozuelo

I think the model that is loading is yolov8n, not yolov8_custom.

Maybe you need to modify the line that loads the model.

h-pozuelo

Yeah, I verified it; it is loading the yolov8n model.

Comment out that line and uncomment the other line that loads my model (yolov8_custom.onnx).

Andrey Germanov • Edited

Yes, I changed it to your custom model. It ran much slower, because the model is large, but it finally predicted something.

h-pozuelo

So you didn't change any code, right?
It works as intended?

Andrey Germanov • Edited

Yes, it works. I did not change any code except the model file name.

But it's too slow for real-time detection in videos on an average user's CPU. I think it's better to train based on the Nano or Small models.

h-pozuelo

I will try to train on the YOLOv8 Nano model.
Any other tips for my training you can give me?
Like how many epochs I should train with, what batch size, etc.
For the YOLO training command, I mean.

Thanks for everything, btw.

h-pozuelo

Is it better to train with PyTorch + CPU or PyTorch + GPU if I'm going to export the model to ONNX format?

Andrey Germanov

You can try 50 epochs.
For the batch size, you can set -1 to use AutoBatch (docs.ultralytics.com/reference/yol...).

Andrey Germanov • Edited

The GPU only increases training speed. For ONNX export, it does not matter which you use; the resulting model will be the same.

Nomaan • Edited

Hi,
Thank you for your article, it has been a very big help to my project.

I am using a Python server to run my HTML page.
I have downloaded the ort-wasm-simd.wasm file from the link you provided in an earlier reply and added it to the same directory as my index.html file, but I am still getting the errors attached below.

I have also imported ort.min.js using the importScripts function in my worker.js file.

Could you please help me solve this problem?
(screenshot attached)

Andrey Germanov • Edited

Yes, this is a common problem. It means that the ort.min.js and ort-wasm-simd.wasm files are from different versions of the ONNX runtime.

You need to download and use both files from the same version. When the article was written it was 1.15.1, but now it is 1.17.1.

Download both files from the latest ONNX runtime distribution:

cdn.jsdelivr.net/npm/onnxruntime-w...

Include them and try again.
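
For example, here is a minimal sketch of a worker.js preamble that pins both files to the same release. The version number below is an assumption; use whatever release you downloaded the .wasm binary from:

// worker.js — pin ort.min.js and the .wasm binary to the SAME release
importScripts("https://cdn.jsdelivr.net/npm/onnxruntime-web@1.17.1/dist/ort.min.js");
// Load ort-wasm-simd.wasm from the project folder instead of letting the
// runtime fetch it from a (possibly mismatched) CDN path.
ort.env.wasm.wasmPaths = "./";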

Nomaan

Hi, I included the latest files and that error has been resolved, but I am facing another error now.

(screenshot attached)

Andrey Germanov

Hi,

Sorry, but I can't read the text of the error message. Can you add a bigger screenshot?

Nomaan

The above error has been resolved (thank you for your help), but the code is drawing the boxes in random places.
I wanted to ask if this code will work for 800x800 images, since my ONNX file accepts input of 800x800.

Andrey Germanov

The standard YOLOv8 model accepts 640x640 images, so the code resizes any image to 640x640 before processing, and then scales the boxes back with this size in mind.

To make it work with an 800x800 model, you need to replace all occurrences of the number "640" with "800" in the "prepare_input", "run_model" and "process_output" functions.
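
For example, here is a sketch (not the article's exact code) that turns the input size into a single constant. One caveat, assuming the default YOLOv8 strides of 8/16/32: the number of output rows also depends on the input size, so the hard-coded 8400 in "process_output" (which is 80² + 40² + 20² for 640x640) would become 100² + 50² + 25² = 13125 for 800x800:

// Sketch: parameterize the model input size instead of hard-coding 640.
const MODEL_SIZE = 800;   // 640 for the standard YOLOv8 export
const NUM_ROWS = 13125;   // (size/8)² + (size/16)² + (size/32)²; 8400 for 640x640

// Scale one box from model coordinates back to original image pixels.
function scale_box(xc, yc, w, h, img_width, img_height) {
    const x1 = (xc - w / 2) / MODEL_SIZE * img_width;
    const y1 = (yc - h / 2) / MODEL_SIZE * img_height;
    const x2 = (xc + w / 2) / MODEL_SIZE * img_width;
    const y2 = (yc + h / 2) / MODEL_SIZE * img_height;
    return [x1, y1, x2, y2];
}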

[Comment deleted]
 
Andrey Germanov • Edited

Sorry, Django is out of scope here.
If the solution works standalone, then the problem is definitely on the Django side. Search for how to correctly integrate static web pages with JavaScript and external links into Django.

Nomaan

After the model runs for a couple of minutes, this error is logged in the console and the model stops working until the page is refreshed.

This is the error it is throwing.

(screenshot attached)

Andrey Germanov

What do you have on line 12 of the "worker.js"?

Andrey Germanov

I haven't experienced this, but it seems to be a bug in some versions of onnxruntime-web that others have experienced on different models: github.com/xenova/transformers.js/...

Try updating ort-wasm-simd.wasm to the latest version and use the latest version of ort.min.js.

h-pozuelo

Hello, I've just followed your whole tutorial.
But I am getting this error:

(screenshot attached)

I just copy/pasted all your code, but I don't know why I get this error.

I'm running the project in Visual Studio Code with the Live Server extension.
I think the separate 'worker.js' thread is giving the error.

Can you help me solve it?

Andrey Germanov • Edited

Hello,
This is a common error when importing the ONNX runtime from a worker. It can't download the required WASM file automatically.

Did you download the ort-wasm-simd.wasm file to the project folder?

cdn.jsdelivr.net/npm/onnxruntime-w...

h-pozuelo

Oh, I forgot:

I thought this line of code in worker.js was the only thing I needed: importScripts("cdn.jsdelivr.net/npm/onnxruntime-w...")

Andrey Germanov • Edited

It's ok. This annoying issue is mentioned in the "Running the model in a background thread" section, and the link to this file should also be there.

h-pozuelo

Btw, if my model was trained using PyTorch + GPU, is there going to be any problem?

Andrey Germanov

No, as long as it's based on YOLOv8 and successfully exported to ONNX.

h-pozuelo

Okay, thanks.
Another question.
If my model has only 26 labels (it's an American Sign Language detector), do I also have to modify the following line and change 80 to 26?

const [class_id, prob] = [...Array(80).keys()] // THIS LINE

function process_output(output, img_width, img_height) {
    let boxes = [];
    for (let index = 0; index < 8400; index++) {
        const [class_id, prob] = [...Array(80).keys()] // THIS LINE
            .map(col => [col, output[8400 * (col + 4) + index]])
            .reduce((accum, item) => item[1] > accum[1] ? item : accum, [0, 0]);
        if (prob < 0.5) {
            continue;
        }
        const label = yolo_classes[class_id];
        const xc = output[index];
        const yc = output[8400 + index];
        const w = output[2 * 8400 + index];
        const h = output[3 * 8400 + index];
        const x1 = (xc - w / 2) / 640 * img_width;
        const y1 = (yc - h / 2) / 640 * img_height;
        const x2 = (xc + w / 2) / 640 * img_width;
        const y2 = (yc + h / 2) / 640 * img_height;
        boxes.push([x1, y1, x2, y2, label, prob]);
    }
    boxes = boxes.sort((box1, box2) => box2[5] - box1[5]);
    const result = [];
    while (boxes.length > 0) {
        result.push(boxes[0]);
        boxes = boxes.filter(box => iou(boxes[0], box) < 0.7 || boxes[0][4] !== box[4]);
    }
    return result;
}

Andrey Germanov • Edited

Yes, you should.

Andrey Germanov

Or you can replace it like this:

const [class_id,prob] = [...Array(yolo_classes.length).keys()]

chiheb nouri

First of all, thank you for being so helpful. I have a problem: I downloaded your code and tried to run it with the web server extension in VS Code, but only the video works, with no detections. When I opened the browser inspector, I got this error:

Error: no available backend found. ERR: [wasm] RuntimeError: indirect call to null, [cpu] Error: previous call to 'initializeWebAssembly()' failed., [xnnpack] Error: previous call to 'initializeWebAssembly()' failed.

Andrey Germanov • Edited

From time to time, Microsoft updates the ONNX runtime library without worrying about backward compatibility. This problem has already been discussed here before. To solve it, ensure that the version of the "ort.min.js" that you import matches the version of the "ort-wasm-simd.wasm" binary that you downloaded. Do the following:

1. Download the latest ort-wasm-simd.wasm file from here: cdn.jsdelivr.net/npm/onnxruntime-w...

2. Ensure that you load the latest version of "ort.min.js". The first line of "worker.js" should be:

importScripts("https://cdn.jsdelivr.net/npm/onnxruntime-web/dist/ort.min.js");

3. You may need to restart the live server to apply these changes.

chiheb nouri • Edited

Thank you so much, it worked! But I still have 2 problems. I tried using my webcam and it's very slow; how can we optimize it? Also, I want to count detections after they pass a line (I did that in Python with OpenCV when I was using Flask, before I saw your solution). How can I put that logic into your solution? Thank you.

Andrey Germanov • Edited

Hi, I am not sure how you can dramatically increase YOLOv8 inference speed without a GPU.

To count detections that passed a line, you need simple geometry. If you have the coordinates of a detected box [x1,y1,x2,y2] for each frame, and you have the coordinates of the line [x1,y1,x2,y2], you can calculate the intersection and see whether the detected box has passed it or not.
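
Here is a minimal sketch of that check, assuming a horizontal counting line and that you can match each box to its center on the previous frame (the frame-to-frame matching itself needs a simple tracker, not shown here):

const lineY = 300; // y-coordinate of the counting line, in canvas pixels (placeholder)

// box = [x1, y1, x2, y2, label, prob], as produced by process_output
function box_center(box) {
    return [(box[0] + box[2]) / 2, (box[1] + box[3]) / 2];
}

// True when the box center moved downward across the line since the previous frame.
function crossed_line(prevCenterY, box) {
    const cy = box_center(box)[1];
    return prevCenterY < lineY && cy >= lineY;
}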

camerayuhang • Edited

First of all, thank you very much for your tutorials. I've followed all of them and have some questions. I hope you can help me clarify.

  1. In your tutorial, you use the canvas element to replace the video element. On each prediction, the canvas draws both the current video frame and the bounding boxes. In my project, I still use the video element to display the video, with the canvas overlaid on the video element for drawing. This way, the video controls are retained. Would the latter approach be better, since the canvas doesn't need to draw the image, only the bounding boxes?

  2. According to the official Canvas documentation, OffscreenCanvas and workers enable rendering operations to run in a separate thread, avoiding heavy work on the main thread. Therefore, moving the run_model function and the bounding-box drawing into a worker should further enhance performance.

  3. In the run_model function, you reload the model for every prediction. Moving the model loading outside the detection loop should significantly improve speed (see the sketch after this list). In my code, loading an ONNX format model takes about 400ms. I don't know why your real-time detection performance still remains good despite reloading the model every time.

  4. I trained a custom dataset using the pre-trained YOLOv8m model and obtained a best.pt model file with a size of 49MB. After converting it to an ONNX model, the file size increased to 98MB. However, my custom model takes over 4000ms to predict an image, which is insufficient for real-time detection tasks. I'm curious how many milliseconds it takes for you to predict an image, and why my prediction time is so long. My two devices, an M1 MacBook Air and an Arch Linux machine with an i7-12700 processor, both exhibit inference times exceeding 4000ms.
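
A sketch of what I mean in point 3, reusing the article's run_model name (the input name "images" and output name "output0" match the standard YOLOv8 ONNX export):

let model = null; // created once, then reused for every frame

async function run_model(input) {
    if (!model) {
        model = await ort.InferenceSession.create("yolov8n.onnx");
    }
    const tensor = new ort.Tensor(Float32Array.from(input), [1, 3, 640, 640]);
    const outputs = await model.run({ images: tensor });
    return outputs["output0"].data;
}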

(screenshots attached)

Foxconn.AI Tuan Anh • Edited

I already followed your steps and downloaded the code from the Google Drive here (someone commented that this code runs OK):
drive.google.com/drive/folders/1FQ...

I changed the ONNX path.
I put it into a folder of XAMPP on my local server (localhost does not have HTTPS), and when I run it, it shows a problem with WASM. How do I solve it, please?

Errors:

wasm streaming compile failed: LinkError: WebAssembly.instantiate(): Import #37 module="a" function="L": function import requires a callable
(anonymous) @ ort-wasm.js:15
ort-wasm.js:15  falling back to ArrayBuffer instantiation
(anonymous) @ ort-wasm.js:15
ort-wasm.js:14  failed to asynchronously prepare wasm: LinkError: WebAssembly.instantiate(): Import #37 module="a" function="L": function import requires a callable
(anonymous) @ ort-wasm.js:14
ort-wasm.js:13  Aborted(LinkError: WebAssembly.instantiate(): Import #37 module="a" function="L": function import requires a callable)
G @ ort-wasm.js:13
backend-impl.js:91  Uncaught (in promise) Error: no available backend found. ERR: [wasm] RuntimeError: Aborted(LinkError: WebAssembly.instantiate(): Import #37 module="a" function="L": function import requires a callable). Build with -sASSERTIONS for more info.
    at resolveBackend (backend-impl.js:91:1)
    at async InferenceSession.create (inference-session-impl.js:175:1)
    at async run_model (worker.js:17:19)
    at async onmessage (worker.js:10:20)
Andrey Germanov • Edited

Hello,

The WASM file is outdated.

Please replace the ort-wasm-simd.wasm file with the one from here: cdn.jsdelivr.net/npm/onnxruntime-w... and try again.

fatmaboodai

Hello,
Is there a way to make the detection start from the first second the video is played?

I'm trying to build a warning system where, if a specific label is detected within the frame, a webpage alert is displayed.

But because the detection doesn't start from the first second, the first few frames are not detected.

Do you have any idea how I can make this work?

Andrey Germanov • Edited

To capture each individual frame, you can run the model inside the "timeupdate" event handler of the video player, like here:

video.addEventListener("timeupdate", async () => {
    const canvas = document.querySelector("canvas");
    canvas.width = video.videoWidth;
    canvas.height = video.videoHeight;
    const context = canvas.getContext("2d");
    context.drawImage(video, 0, 0);
    const input = prepare_input(canvas);
    const output = await run_model(input);
    const boxes = process_output(output, canvas.width, canvas.height);
    // find the required label inside the "boxes" array
})

Also, you can repeat the same code inside the "play" event handler to ensure that it captures the earliest frame right at the moment the video starts playing.
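
And here is a minimal sketch of the alert check itself; the watched label "person" and the one-shot flag are placeholders for your own logic:

let alerted = false;

// boxes come from process_output; each box is [x1, y1, x2, y2, label, prob]
function check_alert(boxes) {
    const hit = boxes.find(box => box[4] === "person");
    if (hit && !alerted) {
        alerted = true; // fire the warning only once
        alert("Detected " + hit[4] + " with probability " + hit[5].toFixed(2));
    }
}

You would call check_alert(boxes) at the end of the "timeupdate" handler above.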

fatmaboodai

Thank you so much, I really appreciate it.

h-pozuelo

By the way, have you worked with the tfjs format model exported from a YOLOv8 model?
I don't know how to interpret the output tensor I got.
That's because onnx2tf exports all the info (bbox, score, class, ...) in just one tensor (not an array of tensors), so I'm unable to read it; I can't understand it.

Andrey Germanov

No, I haven't worked with it.

h-pozuelo

Hello, I read on the ONNX webpage that with onnxruntime-web we can use WebGL or WASM.
In your project you're using WASM. Do you know how I can use WebGL for processing?

Andrey Germanov • Edited

I did not use it in practice, because WebGL is not stable enough; it does not support all operators. It did not work with the YOLOv8 model when I tried.

In general, you can try it by constructing the model this way:

const model = await ort.InferenceSession.create('yolov8n.onnx', {
    executionProviders: ['webgl']
});
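
As a side note, onnxruntime-web accepts a list of providers and tries them in order, so you can keep WASM as a fallback in case WebGL fails to initialize (a sketch, not something I used in the article):

const model = await ort.InferenceSession.create('yolov8n.onnx', {
    executionProviders: ['webgl', 'wasm'] // try WebGL first, fall back to WASM
});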

h-pozuelo

With the same .wasm file?

Andrey Germanov

The .wasm file is not required for WebGL.

Foxconn.AI Tuan Anh

I also face an error when using WebGL. Don't know how to solve it :)

Andrey Germanov • Edited

No, the YOLOv8 model has operators that are not supported in the ONNX WebGL implementation (at least in the current version).