<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Kathan</title>
    <description>The latest articles on DEV Community by Kathan (@kiyo).</description>
    <link>https://dev.to/kiyo</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F1237294%2F3bbe4a05-a015-4693-b7ea-4be46adafac1.png</url>
      <title>DEV Community: Kathan</title>
      <link>https://dev.to/kiyo</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/kiyo"/>
    <language>en</language>
    <item>
      <title>Integrating @mediapipe/tasks-vision for Hand Landmark Detection in React</title>
      <dc:creator>Kathan</dc:creator>
      <pubDate>Wed, 20 Dec 2023 13:38:04 +0000</pubDate>
      <link>https://dev.to/kiyo/integrating-mediapipetasks-vision-for-hand-landmark-detection-in-react-2lbg</link>
      <guid>https://dev.to/kiyo/integrating-mediapipetasks-vision-for-hand-landmark-detection-in-react-2lbg</guid>
      <description>&lt;p&gt;I came across a project where I needed to check if a hand was present or not. During my research, I discovered that @mediapipe/hands are not functioning as they used to, and Google has transitioned to @mediapipe/task-vision, which is utilized for their MediaPipe projects. Using their documentation and making some changes, I developed a tool that detects the presence of hands and displays landmarks on a canvas. For future reference, I thought of creating an example of this, so you can go through it and easily work with any task-vision models in React.&lt;br&gt;
here is how To do this:&lt;br&gt;
(Directly scroll to last if you just want to see the full code.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Setting Up the Environment&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Install MediaPipe.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;npm i @mediapipe/tasks-vision&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; Download the HandLandmarker model from &lt;a href="https://developers.google.com/mediapipe/solutions/vision/hand_landmarker#models"&gt;here&lt;/a&gt;.&lt;br&gt;
(Note: For different models, refer to the models section within the MediaPipe documentation.)&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Implementing in React&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Now let's jump to our demo file.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Import the Model and FilesetResolver.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import { FilesetResolver, HandLandmarker } from "@mediapipe/tasks-vision";
import hand_landmarker_task from "../models/hand_landmarker.task";
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;FilesetResolver - Locates and loads the set of (WebAssembly) files the vision tasks need.&lt;br&gt;
HandLandmarker - The task used for detecting hands and their landmarks in images or videos.&lt;br&gt;
hand_landmarker_task - The model file we downloaded earlier; the import resolves to its path.&lt;/p&gt;
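&lt;p&gt;One note on that last import: a .task file is a binary asset, so your bundler has to be told how to handle it. Below is a minimal sketch assuming webpack 5 (this rule is my assumption, not part of the original setup); alternatively, you can skip the local import entirely and pass a hosted model URL as modelAssetPath.&lt;/p&gt;

```javascript
// webpack.config.js (sketch, assuming webpack 5):
// emit .task files as asset URLs so
// `import hand_landmarker_task from "../models/hand_landmarker.task"`
// resolves to a path the browser can fetch at runtime.
module.exports = {
  module: {
    rules: [
      {
        test: /\.task$/,
        type: "asset/resource", // copy the file to the output dir and import its URL
      },
    ],
  },
};
```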

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; Initialize hand detection.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
const initializeHandDetection = async () =&amp;gt; {
    try {
        const vision = await FilesetResolver.forVisionTasks(
            "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm",
        );
        handLandmarker = await HandLandmarker.createFromOptions(vision, {
            baseOptions: { modelAssetPath: hand_landmarker_task },
            numHands: 2,
            runningMode: "video"
        });
        detectHands();
    } catch (error) {
        console.error("Error initializing hand detection:", error);
    }
};
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The function first uses FilesetResolver.forVisionTasks to load the necessary WebAssembly files from a CDN. Next, HandLandmarker.createFromOptions creates a hand landmarker configured for video (runningMode: "video"); for still images, use runningMode: "image" instead.&lt;br&gt;
Once everything is set up, the function calls detectHands() to start the actual hand detection loop.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; Detect hands using detectForVideo.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;if (videoRef.current &amp;amp;&amp;amp; videoRef.current.readyState &amp;gt;= 2) {
const detections = handLandmarker.detectForVideo(videoRef.current, performance.now());

requestAnimationFrame(detectHands);
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is where the hand detection actually happens. The handLandmarker.detectForVideo function is called with two arguments:&lt;br&gt;
videoRef.current: the current video element.&lt;br&gt;
performance.now(): the current time in milliseconds, used to timestamp the frame and keep the detections in sync with the video.&lt;/p&gt;
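&lt;p&gt;The landmarks that come back are normalized: each x and y is in the 0-1 range relative to the frame, so drawing them means scaling by the canvas size (this is exactly what the drawLandmarks helper in the full code does). A quick sketch with a made-up sample landmark:&lt;/p&gt;

```javascript
// MediaPipe landmark coordinates are normalized to [0, 1]; scale them by the
// canvas dimensions to get pixel positions. `wrist` is a made-up sample point.
const toPixel = (landmark, canvasWidth, canvasHeight) => ({
  x: landmark.x * canvasWidth,
  y: landmark.y * canvasHeight,
});

const wrist = { x: 0.5, y: 0.25, z: 0 };
const p = toPixel(wrist, 600, 480);
// p is { x: 300, y: 120 } on a 600x480 canvas
```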

&lt;p&gt;requestAnimationFrame is a method that tells the browser to perform an animation and requests that the browser calls a specified function (in this case, detectHands) to update an animation before the next repaint. This creates a loop where detectHands is called repeatedly in sync with the browser's refresh rate.&lt;/p&gt;
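&lt;p&gt;One detail worth keeping: requestAnimationFrame returns an ID, and storing it lets the cleanup code stop the loop with cancelAnimationFrame when the component unmounts. Here is a self-contained sketch of that lifecycle, with tiny stand-ins for the two browser APIs so it runs anywhere (the real component uses the browser's own versions):&lt;/p&gt;

```javascript
// Minimal stand-ins for the browser APIs, only so this sketch is runnable
// outside a browser - the real app uses window.requestAnimationFrame.
const pending = new Map();
let nextId = 0;
const requestAnimationFrame = (cb) => { pending.set(++nextId, cb); return nextId; };
const cancelAnimationFrame = (id) => { pending.delete(id); };

let animationFrameId;
const detectHands = () => {
  // ...handLandmarker.detectForVideo(...) and drawing would happen here...
  animationFrameId = requestAnimationFrame(detectHands); // schedule the next frame
};

detectHands();                          // start the loop
cancelAnimationFrame(animationFrameId); // stop it - this is what the useEffect cleanup does
```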

&lt;p&gt;&lt;strong&gt;Step 4:&lt;/strong&gt; Start the webcam.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;const startWebcam = async () =&amp;gt; {
            try {
                const stream = await navigator.mediaDevices.getUserMedia({ video: true });
                videoRef.current.srcObject = stream;
                await initializeHandDetection();
            } catch (error) {
                console.error("Error accessing webcam:", error);
            }
        };

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ul&gt;
&lt;li&gt;videoRef.current.srcObject = stream; Here, the video stream from the webcam is assigned to a video element (referred to by videoRef.current).&lt;/li&gt;
&lt;li&gt;await initializeHandDetection(); After the webcam starts, this line calls the initializeHandDetection function.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Step 5:&lt;/strong&gt; Clean up in useEffect.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;return () =&amp;gt; {
            if (videoRef.current &amp;amp;&amp;amp; videoRef.current.srcObject) {
                videoRef.current.srcObject.getTracks().forEach(track =&amp;gt; track.stop());
            }
            if (handLandmarker) {
                handLandmarker.close();
            }
            if (animationFrameId) {
                cancelAnimationFrame(animationFrameId);
            }
        };
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Inside the useEffect hook, we wire these functions together and return a cleanup function that stops the webcam tracks, closes the hand landmarker, and cancels the pending animation frame. Using useEffect this way makes sure everything starts and stops at the right time, keeping the app fast and reliable.&lt;br&gt;
Here's a summary with the complete code of our 'Demo' component: it uses the webcam to detect whether a hand is present and, alongside, displays a canvas showing all the hand landmarks - live video and graphical landmark representation combined in one component.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;em&gt;Here is the full code of Demo.js:&lt;/em&gt;&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import React, { useEffect, useRef, useState } from "react";
import { FilesetResolver, HandLandmarker } from "@mediapipe/tasks-vision";
import hand_landmarker_task from "../models/hand_landmarker.task";

const Demo = () =&amp;gt; {
    const videoRef = useRef(null);
    const canvasRef = useRef(null);
    const [handPresence, setHandPresence] = useState(null);

    useEffect(() =&amp;gt; {
        let handLandmarker;
        let animationFrameId;

        const initializeHandDetection = async () =&amp;gt; {
            try {
                const vision = await FilesetResolver.forVisionTasks(
                    "https://cdn.jsdelivr.net/npm/@mediapipe/tasks-vision@latest/wasm",
                );
                handLandmarker = await HandLandmarker.createFromOptions(
                    vision, {
                        baseOptions: { modelAssetPath: hand_landmarker_task },
                        numHands: 2,
                        runningMode: "video"
                    }
                );
                detectHands();
            } catch (error) {
                console.error("Error initializing hand detection:", error);
            }
        };

        const drawLandmarks = (landmarksArray) =&amp;gt; {
            const canvas = canvasRef.current;
            const ctx = canvas.getContext('2d');
            ctx.clearRect(0, 0, canvas.width, canvas.height);
            ctx.fillStyle = 'white';

            landmarksArray.forEach(landmarks =&amp;gt; {
                landmarks.forEach(landmark =&amp;gt; {
                    const x = landmark.x * canvas.width;
                    const y = landmark.y * canvas.height;

                    ctx.beginPath();
                    ctx.arc(x, y, 5, 0, 2 * Math.PI); // Draw a circle for each landmark
                    ctx.fill();
                });
            });
        };

        const detectHands = () =&amp;gt; {
            if (videoRef.current &amp;amp;&amp;amp; videoRef.current.readyState &amp;gt;= 2) {
                const detections = handLandmarker.detectForVideo(videoRef.current, performance.now());
                setHandPresence(detections.handednesses.length &amp;gt; 0);

                // Assuming detections.landmarks is an array of landmark objects
                if (detections.landmarks) {
                    drawLandmarks(detections.landmarks);
                }
            }
            animationFrameId = requestAnimationFrame(detectHands);
        };

        const startWebcam = async () =&amp;gt; {
            try {
                const stream = await navigator.mediaDevices.getUserMedia({ video: true });
                videoRef.current.srcObject = stream;
                await initializeHandDetection();
            } catch (error) {
                console.error("Error accessing webcam:", error);
            }
        };

        startWebcam();

        return () =&amp;gt; {
            if (videoRef.current &amp;amp;&amp;amp; videoRef.current.srcObject) {
                videoRef.current.srcObject.getTracks().forEach(track =&amp;gt; track.stop());
            }
            if (handLandmarker) {
                handLandmarker.close();
            }
            if (animationFrameId) {
                cancelAnimationFrame(animationFrameId);
            }
        };
    }, []);

    return (
        &amp;lt;&amp;gt;
        &amp;lt;h1&amp;gt;Is there a Hand? {handPresence ? "Yes" : "No"}&amp;lt;/h1&amp;gt;
        &amp;lt;div style={{ position: "relative" }}&amp;gt;
            &amp;lt;video ref={videoRef} autoPlay playsInline &amp;gt;&amp;lt;/video&amp;gt;
            &amp;lt;canvas ref={canvasRef} width={600} height={480} style={{ backgroundColor: "black", width: "600px", height: "480px" }}&amp;gt;&amp;lt;/canvas&amp;gt;
        &amp;lt;/div&amp;gt;
    &amp;lt;/&amp;gt;
    );
};

export default Demo;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It will show something like this:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--zU0-F5ir--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/myz6mdcayt8lmg7x0psr.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--zU0-F5ir--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_800/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/myz6mdcayt8lmg7x0psr.png" alt="Hand landmark showing on canvas" width="800" height="378"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(Your webcam feed will appear where the purple screen is.)&lt;/p&gt;

&lt;p&gt;Thanks for reading :)&lt;/p&gt;

</description>
      <category>react</category>
      <category>handlandmark</category>
      <category>mediapipe</category>
      <category>computervision</category>
    </item>
  </channel>
</rss>
