Eyitayo Itunu Babatope


Develop a Video Chat App with WebRTC, Socket.IO, Express and React.

INTRODUCTION

Google released Web Real-Time Communication (WebRTC) as an open-source project in 2011 to enable peer-to-peer communication in the browser. WebRTC lets web browsers capture audio and video, exchange data, and hold teleconferences without plugins, with media flowing directly between peers where the network allows. It achieves this through a set of APIs and protocols that work together. Combined with Socket.IO for signaling, WebRTC media streaming yields an application that streams media and exchanges data in real time.

Socket.IO is a library that provides low-latency, bi-directional communication between client and server. It uses WebSocket, a protocol for full-duplex, low-latency communication between server and browser, and falls back to HTTP long-polling when a WebSocket connection cannot be established. In this article, readers will learn how to build a video chat application using WebRTC and Socket.IO. This article is for web developers who wish to develop web applications that can stream media between two computers in real time without installing any plugins.

Prerequisite

This article assumes the reader is familiar with setting up an Express server on Node.js and with Socket.IO, and has a working knowledge of React.js, JavaScript, and CSS. We built the UI of the video chat with React and vanilla CSS, and the signaling server with Express and Socket.IO.

Contents

  • WebRTC Media Streaming Concept
  • Advantages of WebRTC
  • Drawbacks of WebRTC
  • Socket.IO Concept
  • Advantages of Socket.IO
  • Drawbacks of Socket.IO
  • Socket.IO Signaling Server setup
  • WebRTC Setup
  • Video Chat UI
  • Conclusion

WebRTC Media Streaming Concept

At the top level, WebRTC media streaming occurs in four phases:

  1. Offer.
  2. Signaling.
  3. Answer.
  4. Data exchange between two browsers.

WebRTC uses the Session Description Protocol (SDP) for its offer-and-answer mechanism. An SDP description is a profile of your device's side of the session, sent to the peer that is trying to connect to it. SDP is text-based and describes the types of media streams, the codecs, the transport, and other session details.
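To make the text-based nature of SDP concrete, here is a minimal sketch that lists the media sections of a description. The sample SDP is hand-written and heavily shortened for illustration, and listMediaSections is a hypothetical helper name, not part of any WebRTC API.

```javascript
// Hand-written sample for illustration; real browser SDP is much longer.
const sampleSdp = [
  "v=0",
  "o=- 46117317 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");

// Hypothetical helper: collect each "m=" (media) line and the codecs
// announced for it by the "a=rtpmap:" lines that follow.
function listMediaSections(sdp) {
  const sections = [];
  for (const line of sdp.split(/\r?\n/)) {
    if (line.startsWith("m=")) {
      // "m=<kind> <port> <protocol> <payload types...>"
      const [kind, port, protocol] = line.slice(2).split(" ");
      sections.push({ kind, port: Number(port), protocol, codecs: [] });
    } else if (line.startsWith("a=rtpmap:") && sections.length > 0) {
      // "a=rtpmap:<payload type> <codec>/<clock rate>[/<channels>]"
      const codec = line.slice(line.indexOf(" ") + 1);
      sections[sections.length - 1].codecs.push(codec);
    }
  }
  return sections;
}

for (const s of listMediaSections(sampleSdp)) {
  console.log(`${s.kind} over ${s.protocol}: ${s.codecs.join(", ")}`);
}
// audio over UDP/TLS/RTP/SAVPF: opus/48000/2
// video over UDP/TLS/RTP/SAVPF: VP8/90000
```

In a browser, the sdp string of an offer or answer could be passed to the same function to inspect what a peer is proposing.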

To make an offer, a peer creates an RTCPeerConnection object, which takes an optional configuration parameter. The configuration lists the servers used by Interactive Connectivity Establishment (ICE). ICE finds the best available path for media to travel between two peers. It helps devices connect across the internet and get past Network Address Translators (NATs), firewalls, and anything else that can hinder peer-to-peer communication.

For devices to communicate over the internet, they use IP addresses and port numbers. However, most devices operate behind a firewall or NAT that hides their IP addresses, which prevents peer-to-peer communication. During ICE gathering, each device collects potential network addresses called ICE candidates. A candidate contains an IP address, a port number, and a transport protocol. A device can have multiple ICE candidates. The peers exchange their candidates through the signaling channel, and ICE then picks the best network path for the session from the candidates. ICE servers help a device discover candidates that are reachable from outside its own network. There are two types of ICE servers:

  1. Session Traversal Utilities for NAT (STUN)
  2. Traversal Using Relays around NAT (TURN)

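A STUN server tells a device the public address and port that its NAT exposes. TURN servers relay the media itself and are used in more restrictive networks, for example when both peers are behind firewalls or symmetric NATs that prevent direct communication between devices.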
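The contents of an ICE candidate can be seen by splitting its candidate string into fields. The sketch below assumes the usual field order (foundation, component, protocol, priority, address, port, then "typ" and the type). The sample strings are made up with documentation-range addresses, and parseCandidate is a hypothetical helper. The type shows where a candidate came from: host is a local address, srflx is an address discovered through STUN, and relay is an address allocated on a TURN server.

```javascript
// Hypothetical helper: split an ICE candidate string into named fields.
function parseCandidate(line) {
  const parts = line.replace(/^candidate:/, "").split(" ");
  const [foundation, component, protocol, priority, address, port] = parts;
  // The candidate type follows the literal token "typ".
  const type = parts[parts.indexOf("typ") + 1];
  return {
    foundation,
    component: Number(component),
    protocol: protocol.toLowerCase(),
    priority: Number(priority),
    address,
    port: Number(port),
    type,
  };
}

// Made-up samples; the addresses come from ranges reserved for documentation.
const sampleCandidates = [
  "candidate:1 1 udp 2122260223 192.168.1.5 54321 typ host",
  "candidate:2 1 udp 1686052607 203.0.113.7 46154 typ srflx raddr 192.168.1.5 rport 54321",
  "candidate:3 1 udp 41885439 198.51.100.9 50000 typ relay raddr 203.0.113.7 rport 46154",
];

for (const line of sampleCandidates) {
  const c = parseCandidate(line);
  console.log(`${c.type}: ${c.address}:${c.port} over ${c.protocol}`);
}
// host: 192.168.1.5:54321 over udp
// srflx: 203.0.113.7:46154 over udp
// relay: 198.51.100.9:50000 over udp
```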

Advantages of WebRTC

  1. High communication quality.
  2. Automatically adjusts to any type of connection.
  3. Automatic gain control (AGC) adjusts microphone sensitivity for all connections.
  4. All connections are encrypted (DTLS-SRTP), and browsers expose the camera and microphone only to pages served from a secure origin such as HTTPS.
  5. A WebRTC application works on desktop and mobile operating systems, provided the browser supports it.

Drawbacks of WebRTC

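  1. WebRTC does not mix audio and video itself, so mixing for group audio or video conferences must be handled separately.
  2. WebRTC solutions from different vendors are often incompatible with each other.
  3. WebRTC does not standardize signaling, so each vendor decides on signaling, messaging, file transfer, conference scheduling, etc., and there is no uniformity among them.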

Socket.IO Concept

Socket.IO is a library that facilitates bi-directional, low-latency communication between the client and the server. The communication is event-based: the client and the server both emit events and listen for them. To use Socket.IO, install the server library on the server and the client library on the client. Socket.IO suits applications that depend on real-time data, such as stock tickers, weather updates, and chat. Socket.IO has implementations in several major languages, including Python, Java, and C++. In this article, we used the JavaScript implementation on both the client and the server.

Advantages of Socket.IO

  1. Multiplexing: multiple namespaces share a single Engine.IO session.

  2. Messages are encoded as named events that carry JSON or binary data.

  3. Acknowledgment callbacks per event.

Drawbacks of Socket.IO

  1. Socket.IO doesn't provide end-to-end encryption.

  2. Socket.IO does not guarantee exactly-once messaging semantics. By default, it provides an at-most-once guarantee.
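Acknowledgment callbacks can be combined with a retry to work around the at-most-once default when a sender needs confirmation. The sketch below is one possible approach, not something the signaling server in this article does. It assumes a Socket.IO client version that provides socket.timeout() (added in v4.4.0) and a server handler that calls the acknowledgment callback. emitWithRetry and its options are hypothetical names.

```javascript
// Hypothetical helper: emit an event, wait for the acknowledgment, and
// retry when the acknowledgment does not arrive in time.
// Assumes the server handler calls the ack callback, for example:
//   socket.on("message", (message, ack) => { /* relay */ ack({ ok: true }); });
function emitWithRetry(socket, event, payload, { retries = 3, timeoutMs = 2000 } = {}) {
  return new Promise((resolve, reject) => {
    const attempt = (remaining) => {
      // With socket.timeout(), the callback receives an error first
      // when no acknowledgment arrives within timeoutMs.
      socket.timeout(timeoutMs).emit(event, payload, (err, response) => {
        if (!err) return resolve(response);
        if (remaining === 0) return reject(err);
        attempt(remaining - 1);
      });
    };
    attempt(retries);
  });
}

// Usage with a real client (not executed here):
//   emitWithRetry(socket, "message", { type: "ready" })
//     .then((response) => console.log("acknowledged", response))
//     .catch((err) => console.error("gave up", err));
```

Retrying turns at-most-once into at-least-once delivery, so the receiver may see the same message twice and should tolerate duplicates.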

Socket.IO Signaling Server setup

Create a directory and name it videoserver or any name of your choice.

mkdir videoserver
cd videoserver
npm init

The npm init command creates a package.json file in the folder.
Install the Express, Socket.IO, cors, and dotenv packages that server.js requires.

npm install express socket.io cors dotenv

Create a file in the root of the directory and name it server.js. Open the server.js file, and import the necessary packages. Create the server as shown in the code below.

require("dotenv").config();
const express = require("express");
const cors = require("cors");

const app = express();
app.use(cors());

const server = require("http").createServer(app);
// Allow the React dev server origin to connect.
const io = require("socket.io")(server, {
  cors: { origin: "http://127.0.0.1:5173" },
});

// Prefer the port from the environment and fall back to 3000.
const PORT = process.env.PORT || 3000;
// Skip error logging when running under a test runner.
const test = process.env.NODE_ENV === "test";

io.on("connection", (socket) => {
  console.log("Connected");

  // Relay every signaling message to all other connected clients.
  socket.on("message", (message) => {
    socket.broadcast.emit("message", message);
  });

  socket.on("disconnect", () => {
    console.log("Disconnected");
  });
});

// Express error-handling middleware (identified by its four arguments).
function error(err, req, res, next) {
  // log it
  if (!test) console.error(err.stack);

  // respond with 500 "Internal Server Error".
  res.status(500);
  res.send("Internal Server Error");
}
app.use(error);

server.listen(PORT, () => {
  console.log(`listening on Port ${PORT}`);
});

In the script above, the require statements load the packages the script needs. The script uses the Cross-Origin Resource Sharing (CORS) package to allow cross-origin requests to the server. The code snippet below creates the Socket.IO server instance.

const io = require("socket.io")(server, {
  cors: { origin: "http://127.0.0.1:5173" },
});

Socket.IO permits event-based communication between client and server. The code below shows how to emit an event with Socket.IO.

socket.emit("custom-named-event", data)

The code below shows how to listen to an event.

socket.on("custom-named-event", (data)=>{
//process the data 
})

The script listens for client connections with io.on. When a Socket.IO client connects, the server logs 'Connected' to the console and relays every message event from that client to all other connected clients with socket.broadcast.emit. The error function is Express error-handling middleware that responds with a 500 status when a request handler fails.
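The relay behavior of socket.broadcast.emit can be pictured with a small in-memory stand-in. This is only a simulation for illustration: FakeSignalingServer and its methods are hypothetical and are not part of Socket.IO. It shows that a message reaches every connected client except the one that sent it.

```javascript
// A tiny in-memory stand-in for the signaling server (not Socket.IO itself).
class FakeSignalingServer {
  constructor() {
    this.clients = new Set();
  }

  // Register a client and give it an inbox for received messages.
  connect(name) {
    const client = { name, inbox: [] };
    this.clients.add(client);
    return client;
  }

  // Mirrors socket.broadcast.emit: deliver to everyone except the sender.
  broadcastFrom(sender, message) {
    for (const client of this.clients) {
      if (client !== sender) client.inbox.push(message);
    }
  }
}

const relay = new FakeSignalingServer();
const caller = relay.connect("caller");
const callee = relay.connect("callee");

// The message sequence of one call setup, with placeholder SDP strings.
relay.broadcastFrom(callee, { type: "ready" });
relay.broadcastFrom(caller, { type: "offer", sdp: "<caller sdp>" });
relay.broadcastFrom(callee, { type: "answer", sdp: "<callee sdp>" });

console.log(caller.inbox.map((m) => m.type).join(", ")); // ready, answer
console.log(callee.inbox.map((m) => m.type).join(", ")); // offer
```

Because every other client receives every message, this simple relay fits a call between two peers. Supporting several independent calls would need some way to group clients, which this article does not cover.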

WebRTC Setup

We mentioned earlier that peer-to-peer connections in WebRTC are set up through an offer-and-answer mechanism. A peer (the caller) uses the browser API getUserMedia to access the device's camera and microphone. It then creates an instance of RTCPeerConnection, passing an optional configuration object that contains the iceServers, and attaches the media tracks to it. The RTCPeerConnection creates an offer, which the caller sets as its local session description and sends to the remote peer through the signaling server.

The remote peer (the callee) also creates an instance of RTCPeerConnection and attaches its own media tracks. When the callee receives an offer, it sets the offer as its remote session description, creates an answer, and sets the answer as its local session description. The callee sends the answer to the caller through the signaling server.

The connection between the caller and callee can only be established after they have exchanged ICE candidates. As mentioned before, ICE candidates are potential network addresses that peers can use to reach one another. The caller sends each of its ICE candidates to the callee through the signaling server and listens for the callee's candidates. When the caller receives a candidate from the callee, it adds it to its RTCPeerConnection. The callee does the same.
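Candidates can arrive over the signaling channel before the remote description has been set, and addIceCandidate rejects in that state. One common precaution, shown here as a sketch and not as part of this article's code, is to hold early candidates in a queue and replay them afterwards. createCandidateQueue is a hypothetical helper, and the stub peer connection exists only so the sketch runs outside a browser.

```javascript
// Hypothetical helper: holds remote ICE candidates until the remote
// description has been applied, then replays them in arrival order.
function createCandidateQueue(pc) {
  const pending = [];
  return {
    // Call this from the "candidate" branch of the message handler.
    async add(candidate) {
      if (pc.remoteDescription) {
        await pc.addIceCandidate(candidate);
      } else {
        pending.push(candidate);
      }
    },
    // Call this right after pc.setRemoteDescription(...) resolves.
    async flush() {
      while (pending.length > 0) {
        await pc.addIceCandidate(pending.shift());
      }
    },
    get size() {
      return pending.length;
    },
  };
}

// Stub standing in for RTCPeerConnection so the sketch runs in Node.js.
function createStubPeerConnection() {
  return {
    remoteDescription: null,
    added: [],
    async setRemoteDescription(description) {
      this.remoteDescription = description;
    },
    async addIceCandidate(candidate) {
      this.added.push(candidate);
    },
  };
}

async function demo() {
  const pc = createStubPeerConnection();
  const queue = createCandidateQueue(pc);

  // Arrives early: held back because there is no remote description yet.
  await queue.add({
    candidate: "candidate:1 1 udp 2122260223 192.168.1.5 54321 typ host",
    sdpMid: "0",
    sdpMLineIndex: 0,
  });
  console.log(queue.size); // 1

  await pc.setRemoteDescription({ type: "offer", sdp: "<remote sdp>" });
  await queue.flush();
  console.log(queue.size, pc.added.length); // 0 1
}

demo();
```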

We implemented the offer and answer mechanism in the code below.

import { io } from "socket.io-client";
import { useRef, useEffect, useState } from "react";
import { FiVideo, FiVideoOff, FiMic, FiMicOff } from "react-icons/fi";

const configuration = {
  iceServers: [
    {
      urls: ["stun:stun1.l.google.com:19302", "stun:stun2.l.google.com:19302"],
    },
  ],
  iceCandidatePoolSize: 10,
};
const socket = io("http://localhost:3000", { transports: ["websocket"] });

let pc;
let localStream;
let startButton;
let hangupButton;
let muteAudButton;
let remoteVideo;
let localVideo;
socket.on("message", (e) => {
  if (!localStream) {
    console.log("not ready yet");
    return;
  }
  switch (e.type) {
    case "offer":
      handleOffer(e);
      break;
    case "answer":
      handleAnswer(e);
      break;
    case "candidate":
      handleCandidate(e);
      break;
    case "ready":
      // A second tab joined. This tab will initiate a call unless in a call already.
      if (pc) {
        console.log("already in call, ignoring");
        return;
      }
      makeCall();
      break;
    case "bye":
      if (pc) {
        hangup();
      }
      break;
    default:
      console.log("unhandled", e);
      break;
  }
});

async function makeCall() {
  try {
    pc = new RTCPeerConnection(configuration);
    pc.onicecandidate = (e) => {
      const message = {
        type: "candidate",
        candidate: null,
      };
      if (e.candidate) {
        message.candidate = e.candidate.candidate;
        message.sdpMid = e.candidate.sdpMid;
        message.sdpMLineIndex = e.candidate.sdpMLineIndex;
      }
      socket.emit("message", message);
    };
    pc.ontrack = (e) => (remoteVideo.current.srcObject = e.streams[0]);
    localStream.getTracks().forEach((track) => pc.addTrack(track, localStream));
    const offer = await pc.createOffer();
    socket.emit("message", { type: "offer", sdp: offer.sdp });
    await pc.setLocalDescription(offer);
  } catch (e) {
    console.log(e);
  }
}

async function handleOffer(offer) {
  if (pc) {
    console.error("existing peerconnection");
    return;
  }
  try {
    pc = new RTCPeerConnection(configuration);
    pc.onicecandidate = (e) => {
      const message = {
        type: "candidate",
        candidate: null,
      };
      if (e.candidate) {
        message.candidate = e.candidate.candidate;
        message.sdpMid = e.candidate.sdpMid;
        message.sdpMLineIndex = e.candidate.sdpMLineIndex;
      }
      socket.emit("message", message);
    };
    pc.ontrack = (e) => (remoteVideo.current.srcObject = e.streams[0]);
    localStream.getTracks().forEach((track) => pc.addTrack(track, localStream));
    await pc.setRemoteDescription(offer);

    const answer = await pc.createAnswer();
    socket.emit("message", { type: "answer", sdp: answer.sdp });
    await pc.setLocalDescription(answer);
  } catch (e) {
    console.log(e);
  }
}

async function handleAnswer(answer) {
  if (!pc) {
    console.error("no peerconnection");
    return;
  }
  try {
    await pc.setRemoteDescription(answer);
  } catch (e) {
    console.log(e);
  }
}

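async function handleCandidate(candidate) {
  try {
    if (!pc) {
      console.error("no peerconnection");
      return;
    }
    // The message object is always truthy, so check its candidate field.
    // A null candidate field signals that the peer has no more candidates.
    if (!candidate.candidate) {
      await pc.addIceCandidate(null);
    } else {
      await pc.addIceCandidate(candidate);
    }
  } catch (e) {
    console.log(e);
  }
}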
async function hangup() {
  if (pc) {
    pc.close();
    pc = null;
  }
  // The stream may be missing if the camera never started.
  if (localStream) {
    localStream.getTracks().forEach((track) => track.stop());
    localStream = null;
  }
  startButton.current.disabled = false;
  hangupButton.current.disabled = true;
  muteAudButton.current.disabled = true;
}

function App() {
  startButton = useRef(null);
  hangupButton = useRef(null);
  muteAudButton = useRef(null);
  localVideo = useRef(null);
  remoteVideo = useRef(null);
  useEffect(() => {
    hangupButton.current.disabled = true;
    muteAudButton.current.disabled = true;
  }, []);
  // Tracks whether the microphone is live; it is live as soon as a call starts.
  const [audiostate, setAudio] = useState(true);

  const startB = async () => {
    try {
      localStream = await navigator.mediaDevices.getUserMedia({
        video: true,
        audio: { echoCancellation: true },
      });
      localVideo.current.srcObject = localStream;
    } catch (err) {
      // Without a local stream there is nothing to send, so stay idle.
      console.log(err);
      return;
    }

    setAudio(true);
    startButton.current.disabled = true;
    hangupButton.current.disabled = false;
    muteAudButton.current.disabled = false;

    socket.emit("message", { type: "ready" });
  };

  const hangB = async () => {
    hangup();
    socket.emit("message", { type: "bye" });
  };

  // Mute or unmute the outgoing audio by toggling the local audio tracks.
  function muteAudio() {
    if (!localStream) return;
    const enabled = !audiostate;
    localStream.getAudioTracks().forEach((track) => (track.enabled = enabled));
    setAudio(enabled);
  }

  return (
    <>
      <main className="container  ">
        <div className="video bg-main">
          <video
            ref={localVideo}
            className="video-item"
            autoPlay
            playsInline
            muted
          ></video>
          <video
            ref={remoteVideo}
            className="video-item"
            autoPlay
            playsInline
          ></video>
        </div>

        <div className="btn">
          <button
            className="btn-item btn-start"
            ref={startButton}
            onClick={startB}
          >
            <FiVideo />
          </button>
          <button
            className="btn-item btn-end"
            ref={hangupButton}
            onClick={hangB}
          >
            <FiVideoOff />
          </button>
          <button
            className="btn-item btn-start"
            ref={muteAudButton}
            onClick={muteAudio}
          >
            {audiostate ? <FiMic /> : <FiMicOff />}
          </button>
        </div>
      </main>
    </>
  );
}

export default App

In the script above, we initialized a variable named configuration and assigned it an object with two fields, iceServers and iceCandidatePoolSize. The value of the iceServers field is an array of objects that hold the URLs of the ICE servers. Next, we created a Socket.IO client and assigned it to the variable socket. The io function takes two arguments: the URL of the Socket.IO server and an options object that selects the transport to use. Then, we declared seven module-level variables without assigning any value to them.
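The configuration above lists only STUN servers. For networks that need a relay, a TURN entry can be added to the same iceServers array. The sketch below assumes you run or rent a TURN server; the URL, username, and credential are placeholders, and buildIceConfiguration is a hypothetical helper.

```javascript
// Hypothetical helper: build the RTCPeerConnection configuration and
// append a TURN server when one is supplied.
function buildIceConfiguration(turn) {
  const iceServers = [
    {
      urls: ["stun:stun1.l.google.com:19302", "stun:stun2.l.google.com:19302"],
    },
  ];
  if (turn) {
    // TURN servers normally require credentials.
    iceServers.push({
      urls: turn.urls,
      username: turn.username,
      credential: turn.credential,
    });
  }
  return { iceServers, iceCandidatePoolSize: 10 };
}

// Placeholder values; replace them with your own TURN server details.
const turnConfiguration = buildIceConfiguration({
  urls: "turn:turn.example.com:3478",
  username: "demo-user",
  credential: "demo-secret",
});

console.log(turnConfiguration.iceServers.length); // 2
```

The returned object can be passed to new RTCPeerConnection(...) in place of the configuration constant.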

The Socket.IO client listens for message events. Each incoming message passes through a switch block that dispatches on its type. The offer-and-answer mechanism is implemented through five async functions:

  1. makeCall
  2. handleOffer
  3. handleAnswer
  4. handleCandidate
  5. hangup

Video Chat UI

We used React.js to create the UI of the video chat app. We created a component named App. To start a video call, click the video icon. This calls the browser API navigator.mediaDevices.getUserMedia({video: true, audio: {echoCancellation: true}}), and the stream from the camera appears in the first video element of the App component. Open another tab in the browser, enter the URL of the React app, and click the video icon there too. You will then see two video streams in each tab, one local and one remote.
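getUserMedia rejects when the user denies permission or no device is available, and the name of the error says which case occurred. The sketch below maps the standard error names to messages that could be shown to the user. describeMediaError, startLocalMedia, and the message wording are assumptions made for illustration.

```javascript
// Hypothetical helper: turn a getUserMedia error into a readable message.
function describeMediaError(err) {
  switch (err && err.name) {
    case "NotAllowedError":
      return "Permission to use the camera or microphone was denied.";
    case "NotFoundError":
      return "No camera or microphone was found.";
    case "NotReadableError":
      return "The camera or microphone is unavailable or already in use.";
    case "OverconstrainedError":
      return "No device satisfies the requested constraints.";
    default:
      return "Could not start the camera or microphone.";
  }
}

// Browser-only usage sketch; it is defined here but not called.
async function startLocalMedia(videoElement) {
  try {
    const stream = await navigator.mediaDevices.getUserMedia({
      video: true,
      audio: { echoCancellation: true },
    });
    videoElement.srcObject = stream;
    return stream;
  } catch (err) {
    console.error(describeMediaError(err));
    return null;
  }
}

console.log(describeMediaError({ name: "NotAllowedError" }));
// Permission to use the camera or microphone was denied.
```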

Below are the Cascading Style Sheets (CSS) and HTML for the UI.

.bg-body {
  background-color: #332e33;
}

.container {
  display: flex;
  flex-direction: column;
  gap: 5px;
  padding: 10px;
}
.col-container {
  display: flex;
  flex-direction: column;
  width: inherit;
}
.label-text {
  color: #fff;
  text-align: center;
  font-size: 20px;
}
.btn-start {
  background-color: #0ced23;
}
.btn-end {
  background-color: rgb(225, 5, 5);
}
.btn {
  display: flex;
  flex-direction: row;
  justify-content: center;
  column-gap: 20px;
  margin-top: 10px;
}
.btn-item {
  width: 50px;
  height: 50px;
  color: #fff;
  border-radius: 50%;
}
.video {
  display: flex;
  flex-direction: column;
  justify-content: center;
  align-items: center;
  row-gap: 10px;
}
.video-item {
  width: 90%;
  height: 250px;
  border: 2px solid #fff;
  border-radius: 10px;
  margin: 10px auto;
}
@media only screen and (min-width: 800px) {
  .container {
    margin-top: 30px;
  }
  .video {
    display: flex;
    flex-direction: row;
    column-gap: 0.5%;
    justify-content: center;
  }
  .video-item {
    width: 40%;
    height: 400px;
  }
  .btn-item {
    width: 80px;
    height: 80px;
    color: #fff;
    border-radius: 50%;
  }
}

The HTML page that loads the stylesheet and mounts the React app:

<!doctype html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <link rel="icon" type="image/svg+xml" href="/vite.svg" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <link rel="stylesheet" href="/style.css" />
    <title>Video Chat</title>
  </head>
  <body class="bg-body">
    <div id="root"></div>
    <script type="module" src="/src/main.jsx"></script>
  </body>
</html>

Conclusion

In this article, we explained the concepts behind WebRTC and Socket.IO and showed how to set them up to create a video chat app. We also discussed the advantages and drawbacks of WebRTC and Socket.IO.

Like, Share, or Comment if you find this article interesting.

Top comments (2)

Jakub T. Jankiewicz • Edited

I suggest not using text files for source code; it's hard to read when it's not syntax highlighted. Add a real extension. If it's React, use jsx.

Another thing is indentation. It is important; you can't just add random spaces like this.

Overall, the code is hard to read.

Temiogundeji

Great article on WebRTC