Computer Vision has developed rapidly over the past few years. You might have come across the AI-generated avatars that have been booming all over social media lately, but Computer Vision applications are far more varied, ranging from face and ID verification to self-driving cars, e-commerce and retail, healthcare, and more.
Computer Vision is a subfield of AI that gives machines a sense of sight. Unlike human vision, a computer sees an image as nothing more than a grid of integer values representing intensities across the color channels. It then uses a model, produced by an algorithm and trained on a set of data, to "learn" to recognize certain types of patterns.
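To make that concrete, here is a minimal browser-side sketch (assuming an img element with the hypothetical id "photo" is on the page) that dumps the raw integer values the machine actually "sees":
const img = document.getElementById('photo');
const canvas = document.createElement('canvas');
canvas.width = img.width;
canvas.height = img.height;
const ctx = canvas.getContext('2d');
ctx.drawImage(img, 0, 0);
// data is a flat array of integers [R, G, B, A, R, G, B, A, ...], each between 0 and 255
const { data } = ctx.getImageData(0, 0, canvas.width, canvas.height);
console.log(data.slice(0, 8)); // the first two pixels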
Take the AI-generated avatars as an example: the model behind them is trained to analyze facial features (biometrics) so that it can recognize and extract a face from an image, modify it, and use it to generate the avatar.
We won't get too deep into machine learning concepts here, since we will be building our app today with Clarifai, a computer vision solution, so let's jump right in and start coding!
We will start by building the server that will hold the Clarifai integration logic
Open your terminal and copy the following command into it
mkdir face-detection-react-node-clarifai && cd face-detection-react-node-clarifai && mkdir server && cd server
Now that we are in the server directory, we initialize our project
npm init -y
Then we install some dependencies
npm install express dotenv clarifai-nodejs-grpc --save
Let's now create an index.js file
touch index.js
Now we create a simple HTTP server
Copy the following code into the file we just created
const express = require('express');
const http = require('http');
const app = express();
app.use(express.urlencoded({ extended: false }))
app.use(express.json())
const port = process.env.PORT || '8080';
app.set('port', port);
const server = http.createServer(app);
server.listen(port);
server.on('listening', () => {
  console.log(`Listening on ${port}`);
});
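If you want to sanity-check it right away, you can run the file directly (assuming you are still in the server directory) and you should see the listening log:
node index.js
# Listening on 8080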
Now we create a detect.js file, which will contain the Clarifai logic
touch detect.js
Before we get into the logic, let's hop over to the Clarifai website and create a new account
Once you are logged in, navigate to My Apps, where you will find your first application already created (create one if it isn't there), click on it, go to App Settings, and copy the API key.
Now, back in our server directory, create a new .env file and copy the following variable into it, pasting the API key you just copied after the equals sign.
CLARIFAI_AUTH_KEY=
Now we start the integration
We construct the Clarifai stub, which contains all the methods available in the Clarifai API, and the Metadata object that's used to authenticate.
const { ClarifaiStub, grpc } = require("clarifai-nodejs-grpc");
const dotenv = require('dotenv');
dotenv.config();
const metadata = new grpc.Metadata();
metadata.set("authorization", `Key ${process.env.CLARIFAI_AUTH_KEY}`);
const stub = ClarifaiStub.grpc();
We create a new promise to wrap the asynchronous PostModelOutputs call and provide the id of the model we want predictions from. Here we are using one of Clarifai's public, pre-trained face detection models.
You can dive deeper into the public Clarifai models here
const detectFace = (inputs) => {
  return new Promise((resolve, reject) => {
    stub.PostModelOutputs(
      {
        model_id: "a403429f2ddf4b49b307e318f00e528b",
        inputs: inputs
      },
      metadata,
      (error, response) => {
        if (error) {
          reject("Error: " + error);
          return;
        }
        if (response.status.code !== 10000) {
          reject("Received failed status: " + response.status.description + "\n" + response.status.details);
          return;
        }
        let results = response.outputs[0].data.regions;
        resolve(results);
      }
    );
  })
}
We also need to supply the image for the model to process, so the function receives inputs, an array that we will construct in a moment.
As a result, we should expect to receive the regions of the faces that the model predicts.
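To give an idea of what we will work with on the frontend, each region should look roughly like this (a simplified sketch; the real response carries more fields, and the bounding box values are fractions of the image dimensions):
{
  region_info: {
    bounding_box: {
      top_row: 0.21,     // example values, all between 0 and 1
      left_col: 0.33,
      bottom_row: 0.64,
      right_col: 0.58
    }
  },
  data: { /* additional prediction data, depending on the model */ }
}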
Now we create a handler that fires the asynchronous call using the detectFace function we just created.
const handleDetect = async (req, res) => {
  try {
    const { imageUrl } = req.body;
    const inputs = [
      {
        data: {
          image: {
            url: imageUrl
          }
        }
      }
    ];
    const results = await detectFace(inputs);
    return res.send({
      results
    })
  }
  catch (error) {
    return res.status(400).send({
      error: error
    })
  }
}
Here we construct the inputs array, assigning it the imageUrl that we receive from the frontend.
You can also provide the image directly by sending its bytes, but for this tutorial we are just going to use URLs.
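For reference, a bytes-based input could look roughly like this (a sketch assuming a local file named photo.jpg sits next to detect.js; the rest of the detectFace call stays the same):
const fs = require('fs');
// Read the local file and send its raw bytes instead of a URL
const imageBytes = fs.readFileSync('./photo.jpg');
const inputs = [
  {
    data: {
      image: {
        base64: imageBytes
      }
    }
  }
];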
Then we export the handleDetect function
module.exports = {
  handleDetect
}
Let's now modify the server file index.js and add the /detect route that will be called later from the frontend.
Full Code:
const express = require('express');
const http = require('http');
const detect = require('./detect');
const app = express();
app.use(express.urlencoded({ extended: false }))
app.use(express.json())
app.post('/detect', (req, res) => detect.handleDetect(req, res));
const port = process.env.PORT || '8080';
app.set('port', port);
const server = http.createServer(app);
server.listen(port);
server.on('listening', () => {
  console.log(`Listening on ${port}`);
});
Now we add a start script to package.json
"scripts": {
"start": "node ./index.js"
},
Start the server and make sure everything is working as expected.
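For example, with the server running you could hit the endpoint with curl (the image URL below is only a placeholder; swap in any publicly accessible image):
curl -X POST http://localhost:8080/detect \
  -H "Content-Type: application/json" \
  -d '{"imageUrl": "https://example.com/photo-with-faces.jpg"}'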
Now let’s navigate to the root directory and start building our frontend.
We will use create-react-app to bootstrap our project.
Copy the command below into your terminal
npx create-react-app client && cd client
We install a couple of dependencies
npm install styled-components axios --save
We start by specifying the proxy in the client's package.json, so that requests to /detect from the dev server are forwarded to our backend on port 8080
"proxy": "http://localhost:8080"
We are going to use styled-components to build our app interface.
In the src folder, create a components directory and inside it create a styled.js file
cd src && mkdir components && cd components && touch styled.js
Copy the following code into the file. Note that it references a FaceDetectIcon.svg from a src/assets folder, so add your own placeholder SVG there (or remove the background-image line).
import styled from "styled-components";
import FaceDetectIcon from '../assets/FaceDetectIcon.svg';
const FaceDetectionWrapper = styled.div`
  display: flex;
  flex-direction: column;
  align-items: center;
  justify-content: center;
`
const Title = styled.h2`
  margin: 20px 0;
  font-weight: 800;
  font-size: 20px;
  text-align: center;
  color: #383838;
`
const ImagePreviewWrapper = styled.div`
  position: relative;
  width: ${props => props.width};
  height: ${props => props.height};
  display: flex;
  align-items: center;
  justify-content: center;
  background: #F9F9FB;
  background-image: url(${FaceDetectIcon});
  background-repeat: no-repeat;
  background-position: center;
  border-radius: 16px;
  border: 2px dashed #D2D7E5;
`
const Image = styled.img`
  border-radius: ${props => props.borderRadius};
`
const FormWrapper = styled.form`
  display: flex;
  flex-direction: column;
`
const InputWrapper = styled.input`
  width: 500px;
  height: 45px;
  margin: 12px 0;
  padding: 8px 20px;
  box-sizing: border-box;
  background: #FFFFFF;
  border: 1px solid #D2D7E5;
  border-radius: 16px;
  outline: none;
`
const SubmitButton = styled.button`
  width: 500px;
  height: 45px;
  display: flex;
  align-items: center;
  justify-content: center;
  cursor: pointer;
  border-radius: 16px;
  border: none;
  outline: none;
  background: #576684;
  color: #FFFFFF;
  font-weight: 300;
  font-size: 20px;
  text-align: center;
`
const FaceBox = styled.div`
  position: absolute;
  border: 2px solid #FFFFFF;
  border-radius: 8px;
`;
const FaceBoxesWrapper = styled.div``;
export {
  FaceDetectionWrapper,
  Title,
  ImagePreviewWrapper,
  Image,
  FormWrapper,
  InputWrapper,
  SubmitButton,
  FaceBox,
  FaceBoxesWrapper
}
Create a Form.js file, where we will set up our input
touch Form.js
Copy the code below into it
import {
  FormWrapper,
  InputWrapper,
  SubmitButton
} from './styled';
const FaceDetectionForm = ({ setInput, input, detectFaces, loading }) => {
  return (
    <FormWrapper onSubmit={detectFaces}>
      <InputWrapper type="text" placeholder="Enter Image Link" onChange={e => setInput(e.target.value)} value={input} />
      <SubmitButton type="submit">{loading ? "Loading..." : "Detect"}</SubmitButton>
    </FormWrapper>
  )
}
export default FaceDetectionForm;
Nothing but a simple input and a submit button.
Create a Preview.js file, where we will display the image along with the received face predictions
touch Preview.js
Copy the following code into it
import {
  ImagePreviewWrapper,
  Image,
  FaceBoxesWrapper,
  FaceBox
} from './styled';
const FaceDetectionPreview = ({ boxDimensions, imageUrl }) => {
  const faceBoxGenerator = (dimensions) => {
    let boxes = [];
    for (let i = 0; i < dimensions.length; i++) {
      boxes.push(
        <FaceBox key={i}
          style={{
            top: dimensions[i].topRow,
            right: dimensions[i].rightCol,
            bottom: dimensions[i].bottomRow,
            left: dimensions[i].leftCol
          }}
        ></FaceBox>
      );
    }
    return <FaceBoxesWrapper>{boxes}</FaceBoxesWrapper>
  }
  return (
    <ImagePreviewWrapper width="500px" height={imageUrl ? "auto" : "300px"}>
      {imageUrl && <Image src={imageUrl}
        id="inputImage"
        alt="face detect output"
        width='500px' height='auto'
        borderRadius="16px"
      />}
      {faceBoxGenerator(boxDimensions)}
    </ImagePreviewWrapper>
  );
}
export default FaceDetectionPreview;
Finally, let's wire up our API call
We will use Axios as our HTTP client
Copy the code below into App.js
import { useState } from "react";
import axios from "axios";
import {
  FaceDetectionWrapper,
  Title
} from './components/styled';
import FaceDetectionForm from './components/Form';
import FaceDetectionPreview from './components/Preview';
import './App.css';
function App() {
  const [input, setInput] = useState("");
  const [imageUrl, setImageUrl] = useState("");
  const [boxDimensions, setBoxDimensions] = useState([]);
  const [loading, setLoading] = useState(false);
  const detectFaces = async (e) => {
    e.preventDefault();
    setBoxDimensions([]);
    setLoading(true);
    setImageUrl(input);
    const boxDimensionArray = [];
    try {
      const detect = await axios.post('/detect', {
        imageUrl: input
      });
      const results = detect.data.results;
      results.forEach(region => {
        const faceBoxDimensions = region.region_info.bounding_box;
        // The model returns box coordinates as fractions of the image,
        // so we scale them against the rendered image's width and height.
        const image = document.getElementById('inputImage');
        const width = Number(image.width);
        const height = Number(image.height);
        const boxDimension = {
          leftCol: faceBoxDimensions.left_col * width,
          topRow: faceBoxDimensions.top_row * height,
          rightCol: width - (faceBoxDimensions.right_col * width),
          bottomRow: height - (faceBoxDimensions.bottom_row * height)
        };
        boxDimensionArray.push(boxDimension);
      });
      setBoxDimensions(boxDimensionArray);
    } catch (error) {
      console.error(error);
    } finally {
      setLoading(false);
    }
  }
  return (
    <FaceDetectionWrapper>
      <Title>Face Detection</Title>
      <FaceDetectionPreview boxDimensions={boxDimensions} imageUrl={imageUrl} />
      <FaceDetectionForm setInput={setInput} input={input} detectFaces={detectFaces} loading={loading} />
    </FaceDetectionWrapper>
  );
}
export default App;
As mentioned earlier, we expect an array of predictions for the image we provided; we then read the rendered image's width and height so we can calculate the position of each box's sides on the preview.
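As a concrete example with made-up numbers: if the rendered image is 500×300 pixels and the model returns { top_row: 0.2, left_col: 0.3, bottom_row: 0.6, right_col: 0.7 }, the offsets work out as follows:
const width = 500, height = 300; // rendered image size (example)
const box = { top_row: 0.2, left_col: 0.3, bottom_row: 0.6, right_col: 0.7 }; // example prediction
const offsets = {
  leftCol: box.left_col * width,              // 150px from the left edge
  topRow: box.top_row * height,               // 60px from the top edge
  rightCol: width - box.right_col * width,    // 150px from the right edge
  bottomRow: height - box.bottom_row * height // 120px from the bottom edge
};
// These four values are applied as the absolute top/right/bottom/left of each FaceBox.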
Let's now run the app (make sure the server is running, then run npm start inside the client directory), et voilà!
The full code can be found in the GitHub repo here.
Conclusion:
There are a lot of computer vision and face detection applications and use cases out there, and this technology will soon become part of our everyday life (if it hasn't already!).
I'm planning to get deeper into AI and machine learning, and I will be writing articles and tutorials about TensorFlow and OpenCV, so follow me and stay tuned!
P.S. I'm currently in the middle of building my own lab on my website.
Subscribe here and get notified when the project is launched.
Have a good one!