Use Pose detection of TensorFlow with Next.js and TypeScript: Let's become pictograms with Pose detection #Tokyo2020

Hello guys,

Good news! Pictograms are now officially a sport at #Tokyo2020.
So, I have developed an application with Pose Detection in TensorFlow.js that lets us become pictograms :D

(Demo GIF)

In this article, I'll explain how this application works.

DEMO→https://pictogram-san.com/
GitHub→https://github.com/tommy19970714/pictogram-san

※ I developed this application with my friends, Tommy, Waserin, Nishikawa, and mikkame.

Why Pictograms?

Pictograms are now official at the Olympic Games.

The Tokyo Organising Committee of the Olympic and Paralympic Games (Tokyo 2020) today unveiled the official sport pictograms of the Olympic Games Tokyo 2020

Tokyo 2020 unveils Olympic Games sport pictograms



Do you want to try pictograms yourself? Just try this application!

Usage

Visit https://pictogram-san.com/, then click Start Game or Take photo.

When you click Start Game, the music starts right away!
Strike a pose that matches the subject! lol

(Demo GIF)

When you click Take photo, you can take a screenshot.

If you want to switch between the front and rear cameras, just click the following button! (This is only available on smartphones; on a PC, the button is disabled.)


Features

This application has the following features.

  • Make a pictogram of the image acquired by the webcam.
  • Split the screen in two vertically, playing the pictogram on the top and the webcam video on the bottom.
  • Clicking the play button automatically starts the music.
  • Switch between the front and rear cameras.
  • Take a screenshot of the screen when the music ends.

Here, we used the pose-detection model in TensorFlow.js, and the application is built with Next.js and TypeScript.

I can't write about all the technical details, so I'll just cover a few important points.

If you need more detail, please check our GitHub repository.
Any issues and PRs are really welcome!

Set up to use pose-detection with Next.js and TypeScript

$ yarn add @tensorflow-models/pose-detection @tensorflow/tfjs-core @tensorflow/tfjs-converter @tensorflow/tfjs-backend-webgl

※ I wrote in another article about why I don't just do yarn add @tensorflow/tfjs. Have a look if you're interested.

If the model you use is based on WASM, you must install @tensorflow/tfjs-backend-wasm instead of @tensorflow/tfjs-backend-webgl.
Here, because the pose-detection model runs on the WebGL backend, I installed @tensorflow/tfjs-backend-webgl.
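
For reference, here is a minimal sketch of making the backend choice explicit. The setBackend call is optional, since importing @tensorflow/tfjs-backend-webgl already registers the backend, but it makes the intent clear:

import * as tf from '@tensorflow/tfjs-core'
// Importing the backend package registers the WebGL backend with tfjs-core
import '@tensorflow/tfjs-backend-webgl'

const setupBackend = async () => {
  await tf.setBackend('webgl')
  await tf.ready() // wait until the backend has finished initializing
}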

Then, load PoseNet.
If necessary, you can specify the architecture, etc.

    import { createDetector, SupportedModels } from '@tensorflow-models/pose-detection'

    const modelName = SupportedModels.PoseNet
    const net = await createDetector(modelName, {
      quantBytes: 2,
      architecture: 'MobileNetV1',
      outputStride: 16,
      inputResolution: resolution, // e.g. { width: 500, height: 500 }
    })

There are two architectures, MobileNetV1 and ResNet50. ResNet50 gives higher accuracy, but it is quite heavy, especially on phones, so we used MobileNetV1 here.

After PoseNet has been loaded, pass the image or video data as the argument of estimatePoses.
In this case, we used the webcam data as is.

      const predictions = await net.estimatePoses(webcam, {
        maxPoses: 1,
        flipHorizontal: false,
      })

Then, you can use the detected predictions and draw them as you like.
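
As one example, here is a minimal sketch that draws the detected keypoints as dots on a canvas. The 0.5 score threshold is just an assumption I picked for illustration:

import { Pose } from '@tensorflow-models/pose-detection'

const drawKeypoints = (poses: Pose[], ctx: CanvasRenderingContext2D) => {
  for (const pose of poses) {
    for (const keypoint of pose.keypoints) {
      // Skip keypoints the model is not confident about
      if ((keypoint.score ?? 0) < 0.5) continue
      ctx.beginPath()
      ctx.arc(keypoint.x, keypoint.y, 5, 0, 2 * Math.PI)
      ctx.fillStyle = 'blue'
      ctx.fill()
    }
  }
}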

Start pose detection as soon as the webcam starts up

In order to use the information from the webcam, we need to wait until the video tag has loaded.
So, I first wrote something like the following.

if (webcamRef.current.video.readyState === 4) {
  // readyState === 4 (HAVE_ENOUGH_DATA) means the video is ready to play
  // Write the process that uses TensorFlow.js here
}

However, since readyState is not yet 4 when the page loads, this block would simply be skipped forever.

So, I wait until readyState becomes 4 with the following code.

  const handleLoadWaiting = async () => {
    return new Promise((resolve) => {
      // Poll every 500 ms until the video has enough data to play
      const timer = setInterval(() => {
        if (webcamRef.current?.video?.readyState === 4) {
          resolve(true)
          clearInterval(timer)
        }
      }, 500)
    })
  }

  const handleStartDrawing = async () => {
    await handleLoadWaiting()
    // Start pose detection here
  }

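To give an idea of what goes inside, here is a minimal sketch of handleStartDrawing running estimatePoses in a requestAnimationFrame loop. This is just an assumption of how the pieces fit together: net is the detector created earlier, and drawResult is a hypothetical helper that renders the pictogram.

  const handleStartDrawing = async () => {
    await handleLoadWaiting()
    const webcam = webcamRef.current!.video as HTMLVideoElement
    const drawFrame = async () => {
      // Estimate poses on the current video frame
      const predictions = await net.estimatePoses(webcam, {
        maxPoses: 1,
        flipHorizontal: false,
      })
      // drawResult is a hypothetical helper that draws the pictogram
      drawResult(predictions)
      requestAnimationFrame(drawFrame)
    }
    drawFrame()
  }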

By the way, I had used useEffect in a past application of mine that puts a mask on your face.
※ Since it has no loading indicator, you may be confused because the mask is not displayed at first, but if you wait a few seconds, the mask will appear.

  useEffect(() => {
    runFaceDetect();
  }, [webcamRef.current?.video?.readyState])

// https://github.com/yuikoito/mask-app/blob/master/src/App.tsx#L43-L46

Here, it was better to wait until loading completed with the handleLoadWaiting function, so that's what I used.

Then, don't forget to specify the sizes of both the input element and the output element.

      const webcam = webcamRef.current.video as HTMLVideoElement
      const canvas = canvasRef.current
      // Input size: match the video element to the actual stream resolution
      webcam.width = webcam.videoWidth
      webcam.height = webcam.videoHeight
      // Output size: the canvas is twice as tall to hold the split screen
      canvas.width = webcam.videoWidth
      canvas.height = webcam.videoHeight * 2

At first, I totally forgot to specify the width of the input webcam element, so the input width was recognized as 0 and I was stuck on an error saying the roi was 0. So, DON'T forget!

      webcam.width = webcam.videoWidth
      webcam.height = webcam.videoHeight
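With these sizes in place, the split-screen drawing itself can be sketched roughly like this. drawPictogram is a hypothetical helper here; only the drawImage call for the bottom half is the standard canvas API.

      const ctx = canvas.getContext('2d') as CanvasRenderingContext2D
      // Top half: render the pictogram from the detected pose (hypothetical helper)
      drawPictogram(predictions, ctx, webcam.videoWidth, webcam.videoHeight)
      // Bottom half: draw the raw webcam frame below the pictogram
      ctx.drawImage(webcam, 0, webcam.videoHeight, webcam.videoWidth, webcam.videoHeight)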

Switch a camera between in-camera and rear camera

To switch between the front and rear cameras, simply switch facingMode between 'user' and 'environment'.

  import { useState } from 'react'
  import Webcam from 'react-webcam'

  const [facingMode, setFacingMode] = useState<'user' | 'environment'>('user')
  const videoConstraints = {
    width: 720, // specify the width you want
    height: 1280, // specify the height you want
    facingMode: facingMode,
  }

// webcam part
          <Webcam
            audio={false}
            mirrored={true}
            videoConstraints={videoConstraints}
            ref={webcamRef}
          />

You can then switch the value of facingMode by clicking a button.
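
For example, a minimal toggle could look like the following (the button markup is just an assumption for illustration):

  const handleSwitchCamera = () => {
    // Flip between the front ('user') and rear ('environment') cameras
    setFacingMode((prev) => (prev === 'user' ? 'environment' : 'user'))
  }

// button part
          <button onClick={handleSwitchCamera}>Switch camera</button>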

References

This idea is based on Mr. Takahashi's work:

https://github.com/Kazuhito00/Tokyo2020-Pictogram-using-MediaPipe

That's it!

As I wrote, I developed this application with my friends, Tommy, Waserin, Nishikawa, and mikkame.

We developed it over two days, hackathon-style, while connected on Discord.
It was a lot of fun, and I'd like to do it again.

If you want to develop something with me, please contact me :)!

===

This article marks the eleventh week of my challenge to write at least one article every week.

If you'd like, please take a look at my previous weekly posts!
See you soon!

Contact

Please send me a message if you want to offer a job or ask me something.

yuiko.dev@gmail.com
https://twitter.com/yui_active

Thank you!
