DEV Community

Cover image for Building UIs in Figma with hand movements
Charlie Gerard
Charlie Gerard

Posted on

Building UIs in Figma with hand movements

Post originally shared on my blog.

Since the release of the latest version of the MediaPipe handpose detection machine learning model that allows the detection of multiple hands, I've had in mind to try to use it to create UIs, and here's the result of a quick prototype built in a few hours!

Before starting this, I also came across 2 projects mixing TensorFlow.js and Figma, one by Anthony DiSpezio to turn gestures into emojis and one by Siddharth Ahuja to move Figma's canvas with hand gestures.

I had never made a Figma plugin before but decided to look into it to see if I could build one to design UIs using hand movements.

The first thing to know is that you can't test your plugins in the web version so you need to install the Desktop version while you're developing.

Then, even though you have access to some Web APIs in a plugin, access to the camera and microphone isn't allowed, for security reasons, so I had to figure out how to send the hand data to the plugin.

The way I went about it is using Socket.io to run a separate web app that handles the hand detection and send specific events to my Figma plugin via websockets.

Here's a quick visualization of the architecture:

A web app is running the hand detection with TensorFlow.js, sends events such as pinch or zoom to the express server with socket.io running on port 8080. The Figma plugin is listening on the same port to the zoom event and when received, triggers different actions in the Figma UI.

Gesture detection with TensorFlow.js

In my separate web app, I am running TensorFlow.js and the hand pose detection model to get the coordinates of my hands and fingers on the screen and create some custom gestures.

Without going into too much details, here's a code sample for the "zoom" gesture:

let leftThumbTip,
    rightThumbTip,
    leftIndexTip,
    rightIndexTip,
    leftIndexFingerDip,
    rightIndexFingerDip,
    rightMiddleFingerDip,
    rightRingFingerDip,
    rightMiddleFingerTip,
    leftMiddleFingerTip,
    leftMiddleFingerDip,
    leftRingFingerTip,
    leftRingFingerDip,
    rightRingFingerTip;

if (hands && hands.length > 0) {
    hands.map((hand) => {
      if (hand.handedness === "Left") {
        //---------------
        // DETECT PALM
        //---------------
        leftMiddleFingerTip = hand.keypoints.find(
          (p) => p.name === "middle_finger_tip"
        );
        leftRingFingerTip = hand.keypoints.find(
          (p) => p.name === "ring_finger_tip"
        );
        leftIndexFingerDip = hand.keypoints.find(
          (p) => p.name === "index_finger_dip"
        );
        leftMiddleFingerDip = hand.keypoints.find(
          (p) => p.name === "middle_finger_dip"
        );
        leftRingFingerDip = hand.keypoints.find(
          (p) => p.name === "ring_finger_dip"
        );

        if (
          leftIndexTip.y < leftIndexFingerDip.y &&
          leftMiddleFingerTip.y < leftMiddleFingerDip.y &&
          leftRingFingerTip.y < leftRingFingerDip.y
        ) {
          palmLeft = true;
        } else {
          palmLeft = false;
        }
      } else {

        //---------------
        // DETECT PALM
        //---------------
        rightMiddleFingerTip = hand.keypoints.find(
          (p) => p.name === "middle_finger_tip"
        );
        rightRingFingerTip = hand.keypoints.find(
          (p) => p.name === "ring_finger_tip"
        );
        rightIndexFingerDip = hand.keypoints.find(
          (p) => p.name === "index_finger_dip"
        );
        rightMiddleFingerDip = hand.keypoints.find(
          (p) => p.name === "middle_finger_dip"
        );
        rightRingFingerDip = hand.keypoints.find(
          (p) => p.name === "ring_finger_dip"
        );

        if (
          rightIndexTip.y < rightIndexFingerDip.y &&
          rightMiddleFingerTip.y < rightMiddleFingerDip.y &&
          rightRingFingerTip.y < rightRingFingerDip.y
        ) {
          palmRight = true;
        } else {
          palmRight = false;
        }

        if (palmRight && palmLeft) {
          // zoom
          socket.emit("zoom", rightMiddleFingerTip.x - leftMiddleFingerTip.x);
        }
      }
    });
  }
}
Enter fullscreen mode Exit fullscreen mode

This code looks a bit messy but that's intended. The goal was to validate the hypothesis that this solution would work before spending some time improving it.

What I did in this sample was checking that the y coordinate of the tips of my index, middle finger and ring finger was smaller than the y coordinate of their dip cause it would mean my fingers are straight so I'm doing some kind of "palm" gesture.
Once it is detected, I'm emitting a "zoom" event and sending the difference in x coordinate between my right middle finger and left middle finger to represent some kind of width.

Express server with socket.io

The server side uses express to serve my front-end files and socket.io to receive and emit messages.

Here's a code sample of the server listening for the zoom event and emitting it to other applications.

const express = require("express");
const app = express();
const http = require("http");
const server = http.createServer(app);
const { Server } = require("socket.io");
const io = new Server(server);

app.use("/", express.static("public"));

io.on("connection", (socket) => {
  console.log("a user connected");

  socket.on("zoom", (e) => {
    io.emit("zoom", e);
  });
});

server.listen(8080, () => {
  console.log("listening on *:8080");
});
Enter fullscreen mode Exit fullscreen mode

Figma plugin

On the Figma side, there's two parts. A ui.html file is usually responsible for showing the UI of the plugin and a code.js file is reponsible for the logic.
My html file starts the socket connection by listening to the same port as the one used in my Express server and sends the events to my JavaScript file.

For example, here's a sample to implement the "Zoom" functionality:

In ui.html:

<script src="https://cdnjs.cloudflare.com/ajax/libs/socket.io/4.4.1/socket.io.js"></script>
<script>
  var socket = io("ws://localhost:8080", { transports: ["websocket"] });
</script>

<script>
  // Zoom zoom
  socket.on("zoom", (msg) => {
    parent.postMessage({ pluginMessage: { type: "zoom", msg } }, "*");
  });
</script>
Enter fullscreen mode Exit fullscreen mode

In code.js:

figma.showUI(__html__);
figma.ui.hide();

figma.ui.onmessage = (msg) => {
  // Messages sent from ui.html
  if (msg.type === "zoom") {
    const normalizedZoom = normalize(msg.msg, 1200, 0);
    figma.viewport.zoom = normalizedZoom;
  }
};
const normalize = (val, max, min) =>
  Math.max(0, Math.min(1, (val - min) / (max - min)));
Enter fullscreen mode Exit fullscreen mode

According to the Figma docs, the zoom level needs to be a number between 0 and 1, so I am normalizing the coordinates I get from the hand detection app to be a value between 0 and 1.

So as I move my hands closer or further apart, I am zooming in or out on the design.

Gif showing me moving my hands closer and further apart to zoom in and out of a UI design

It's a pretty quick walkthrough but from there, any custom gesture from the frontend can be sent to Figma and used to trigger layers, create shapes, change colors, etc!

Having to run a separate app to be able to do this is not optimal but I doubt Figma will ever enable access to the getUserMedia Web API in a plugin so in the meantime, that was an interesting workaround to figure out!

Top comments (13)

Collapse
 
ben profile image
Ben Halpern

Whoa

Collapse
 
nickytonline profile image
Nick Taylor

This is so so cool!

Collapse
 
sherrydays profile image
Sherry Day

This is so cool

Collapse
 
giacomorebonato profile image
Giacomo Rebonato

So cool!

Collapse
 
vaibhavkhulbe profile image
Vaibhav Khulbe

Crazyy!

Collapse
 
rosapaula739 profile image
Rosa paula

online gaming describes any computer game that gives on-line interactions with different players. Video games accustomed be classified by {an on-line|a web|an internet} Content PEGI descriptor to indicate whether or not they were online or not. However, as most games currently give on-line interactions this distinction is not any longer used.

Collapse
 
lepinekong profile image
lepinekong

"access to the camera and microphone isn't allowed, for security reasons" : yeah that sucks even for end user it overcomplicates thing, not sure figma did so really for security reason because they sell audio chat maybe they don't want a plugin which would do the same ;)

Collapse
 
david050708 profile image
david050708

Super project I love it

Collapse
 
charlesinto profile image
charles

So cool

Collapse
 
ganwani_kamal profile image
kamal ganwani

the future is here