DEV Community

Pete


WebRTC For Beginners. Part 1

WebRTC is an open-source project that enables peer-to-peer communication between browsers or applications.
It lets two internet clients exchange media over the web without any plugin or framework. It doesn't matter whether that media is audio, video, or any other kind of data.

WebRTC is currently being used by many well-known applications, including but not limited to:

  • Google Meet
  • Google Hangouts
  • Facebook Messenger
  • Discord

Components Of A WebRTC System

[Diagram: components of a WebRTC system]

The basic components of any WebRTC system are:

  1. Signaling Server
  2. STUN Server
  3. TURN Server
  4. Client

Signaling Server

A signaling server is a server that clients use to exchange necessary information about each other. The clients use this exchanged information to establish a connection between themselves.
The process of exchanging this information is called signaling, and it can be done using WebSocket or XMLHttpRequest.
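
To make the idea of signaling a bit more concrete, here is a minimal in-memory sketch of what a signaling server does: it forwards messages from one client to another without interpreting them. The `SignalingRelay` class, the `SignalingMessage` shape, and the client ids are hypothetical names chosen for illustration. A real server would deliver the messages over WebSocket or HTTP rather than through a direct function call.

```typescript
// Hypothetical message shape: WebRTC does not define a signaling format,
// so every application chooses its own.
interface SignalingMessage {
  from: string; // sender's client id
  to: string; // recipient's client id
  kind: "offer" | "answer" | "candidate";
  payload: string; // SDP text or a serialized ICE candidate
}

type Deliver = (message: SignalingMessage) => void;

// In-memory stand-in for a signaling server.
class SignalingRelay {
  private clients = new Map<string, Deliver>();

  // Register a client and the callback used to deliver messages to it.
  join(clientId: string, deliver: Deliver): void {
    this.clients.set(clientId, deliver);
  }

  // Forward a message to its recipient; returns false if the recipient is unknown.
  send(message: SignalingMessage): boolean {
    const deliver = this.clients.get(message.to);
    if (!deliver) return false;
    deliver(message);
    return true;
  }
}

// Usage: "alice" sends an offer to "bob" through the relay.
const relay = new SignalingRelay();
const bobInbox: SignalingMessage[] = [];
relay.join("alice", () => {});
relay.join("bob", (message) => bobInbox.push(message));
const delivered = relay.send({
  from: "alice",
  to: "bob",
  kind: "offer",
  payload: "v=0", // placeholder, not a complete session description
});
```

Note that the relay never looks inside `payload`. It only needs to know who the message is for.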

STUN Server

STUN stands for Session Traversal Utilities for NAT. A STUN server lets a client request information about its public address and the type of NAT it is behind. This information is part of the data sent to the signaling server.

So the way STUN servers fit into this flow is: the client asks the STUN server for information, the STUN server responds with it, and the client then sends that information to the signaling server.

TURN Server

TURN stands for Traversal Using Relays around NAT.
It is a protocol for relaying network traffic. In 15 to 20% of cases, the STUN server cannot give a client the information it needs. When that happens, the clients cannot set up a peer-to-peer connection with each other.
Cases like that are where the TURN server comes in.
If a peer-to-peer connection cannot be established, a TURN server is used to connect the clients to each other. In this case, the clients still send data to each other, but instead of travelling directly, the data passes through the TURN server.

You can look at the diagram above for more clarity.

Basically, a TURN server acts as a fallback or backup communication channel in cases where the peer-to-peer connection cannot be created.
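
In browser code, the STUN and TURN servers are usually handed to the connection as configuration. The sketch below builds such a configuration object. `iceServers` and `iceTransportPolicy` are standard `RTCPeerConnection` configuration fields, but the server URLs, username, and credential are placeholders, and `buildIceConfig` is a hypothetical helper, not part of any library.

```typescript
// Local types mirroring the shape of the browser's RTCConfiguration,
// declared here so the sketch also type-checks outside a browser.
interface IceServerEntry {
  urls: string | string[];
  username?: string;
  credential?: string;
}

interface IceConfig {
  iceServers: IceServerEntry[];
  iceTransportPolicy: "all" | "relay";
}

// Hypothetical helper. With forceRelay = true, only relayed (TURN) routes
// are allowed, which is handy for testing the fallback path.
function buildIceConfig(forceRelay = false): IceConfig {
  return {
    iceServers: [
      // Placeholder STUN server: used to discover the public address.
      { urls: "stun:stun.example.org:3478" },
      // Placeholder TURN server: used to relay traffic when direct routes fail.
      {
        urls: "turn:turn.example.org:3478",
        username: "demo-user",
        credential: "demo-secret",
      },
    ],
    iceTransportPolicy: forceRelay ? "relay" : "all",
  };
}

const iceConfig = buildIceConfig();
// In a browser: const peer = new RTCPeerConnection(iceConfig);
```

With the default policy of `"all"`, the browser tries direct routes first and only falls back to the TURN relay when they fail, which matches the fallback role described above.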

How It Works

There are usually two clients involved in a WebRTC connection, but to make things easier to follow, I'll explain the process from the perspective of one client, not two.

First off, we must know the types of data that a client sends to the signaling server. They are:

  1. A description of the multimedia communication session, in SDP format
  2. ICE Candidates

Don't get overwhelmed by these terms. As you have probably already discovered, verbose terms in computer science often refer to very simple concepts. I don't know why computer science chooses to embrace verbosity, but I know that the underlying concepts are usually simple.

SDP

SDP stands for Session Description Protocol. SDP is a format for describing the properties of the media you want to exchange, such as audio and video codecs, source addresses, and timing information. Together, these properties describe a multimedia communication session, and they are used for creating WebRTC offers and answers.

SDP is not used to deliver the actual media. It is just used to tell the signaling server and the other client what kind of media you want to exchange.

It is important to note that the data sent using SDP does not come from the STUN server. It comes from the client itself.
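
To show what a session description looks like, here is a small sketch that reads a heavily trimmed SDP sample and lists the media kinds and codecs it declares. The sample text is illustrative only (a real browser offer is much longer), and `parseSdpMedia` and `MediaSection` are hypothetical names, not a WebRTC API.

```typescript
interface MediaSection {
  mediaKind: string; // "audio", "video", ...
  codecs: string[]; // codec names taken from a=rtpmap lines
}

// Trimmed, illustrative sample. SDP lines are separated by CRLF.
const sampleSdp = [
  "v=0",
  "o=- 4611731400430051336 2 IN IP4 127.0.0.1",
  "s=-",
  "t=0 0",
  "m=audio 9 UDP/TLS/RTP/SAVPF 111",
  "a=rtpmap:111 opus/48000/2",
  "m=video 9 UDP/TLS/RTP/SAVPF 96",
  "a=rtpmap:96 VP8/90000",
].join("\r\n");

function parseSdpMedia(sdp: string): MediaSection[] {
  const sections: MediaSection[] = [];
  let current: MediaSection | undefined;
  for (const rawLine of sdp.split(/\r?\n/)) {
    const line = rawLine.trim();
    if (line.startsWith("m=")) {
      // "m=<media> <port> <proto> <formats...>" starts a new media section.
      current = { mediaKind: line.slice(2).split(" ")[0] ?? "", codecs: [] };
      sections.push(current);
    } else if (line.startsWith("a=rtpmap:") && current) {
      // "a=rtpmap:<payload type> <codec>/<clock rate>[/<channels>]"
      const codec = line.split(" ")[1]?.split("/")[0];
      if (codec) current.codecs.push(codec);
    }
  }
  return sections;
}

const mediaSections = parseSdpMedia(sampleSdp);
```

The sample contains no audio or video data at all. It only describes what the client is prepared to send and receive.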

ICE Candidates

ICE Candidates hold information about a client's internet connection.
ICE Candidates are gathered with the help of the STUN server. Each one contains information such as the client's IP address and the way the peers can use it to communicate (either directly or through a TURN server).
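
On the wire, an ICE candidate is a single line of text. The sketch below pulls the main fields out of such a line. The sample lines use made-up addresses from documentation ranges, and `parseCandidate` is a hypothetical helper. In a browser, the raw string would be available on the `candidate` property of an `RTCIceCandidate`.

```typescript
interface ParsedCandidate {
  transport: string; // "udp" or "tcp"
  address: string;
  port: number;
  // "host" = local address, "srflx" = public address learned via STUN,
  // "relay" = address allocated on a TURN server
  candidateType: string;
}

function parseCandidate(line: string): ParsedCandidate | null {
  // Layout: candidate:<foundation> <component> <transport> <priority>
  //         <address> <port> typ <type> [extra attributes...]
  const fields = line
    .replace(/^a=/, "")
    .replace(/^candidate:/, "")
    .trim()
    .split(/\s+/);
  const [, , transport, , address, port, typKeyword, candidateType] = fields;
  if (!transport || !address || !port || typKeyword !== "typ" || !candidateType) {
    return null; // not a candidate line in the expected layout
  }
  return {
    transport: transport.toLowerCase(),
    address,
    port: Number(port),
    candidateType,
  };
}

// Made-up sample lines for illustration.
const hostLine = "candidate:1 1 udp 2122260223 192.168.1.10 54321 typ host";
const srflxLine =
  "candidate:2 1 udp 1686052607 203.0.113.7 54321 typ srflx raddr 192.168.1.10 rport 54321";
const relayLine =
  "candidate:3 1 udp 41885439 198.51.100.20 60000 typ relay raddr 203.0.113.7 rport 54321";

const parsedSrflx = parseCandidate(srflxLine);
```

A `host` or `srflx` candidate describes a direct route, while a `relay` candidate means the traffic would pass through a TURN server.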

In essence, the process of signaling is:

  1. Client sends info to the signaling server using SDP
  2. Client asks STUN server for ICE Candidate
  3. STUN Server sends ICE candidate to client
  4. Client sends ICE Candidate to signaling server
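
The steps above can be sketched from one client's side as follows. This is a dry run under stated assumptions, not a definitive implementation: `PeerLike` is a hypothetical minimal interface that borrows method names from the browser's `RTCPeerConnection` (`createOffer`, `setLocalDescription`, `onicecandidate`), and `fakePeer` stands in for a real connection so the flow can run outside a browser.

```typescript
// Minimal shapes loosely modeled on RTCPeerConnection, declared locally.
interface OfferLike {
  type: string;
  sdp?: string;
}
interface CandidateEventLike {
  candidate: { candidate: string } | null;
}
interface PeerLike {
  onicecandidate: ((ev: CandidateEventLike) => void) | null;
  createOffer(): Promise<OfferLike>;
  setLocalDescription(description: OfferLike): Promise<void>;
}
interface OutgoingSignal {
  kind: "offer" | "candidate";
  payload: string;
}

async function startCall(
  peer: PeerLike,
  sendToSignaling: (signal: OutgoingSignal) => void,
): Promise<void> {
  // Steps 2 to 4: candidates arrive asynchronously, so forward each one
  // to the signaling server as it shows up.
  peer.onicecandidate = (ev) => {
    if (ev.candidate) {
      sendToSignaling({ kind: "candidate", payload: ev.candidate.candidate });
    }
  };
  // Step 1: describe the session and send the description.
  const offer = await peer.createOffer();
  await peer.setLocalDescription(offer); // in a browser, this starts candidate gathering
  sendToSignaling({ kind: "offer", payload: offer.sdp ?? "" });
}

// Fake peer for a dry run. In a browser, the equivalent calls would be made
// on a real RTCPeerConnection.
const fakePeer: PeerLike = {
  onicecandidate: null,
  createOffer: async () => ({ type: "offer", sdp: "v=0" }),
  async setLocalDescription() {
    // Pretend that exactly one candidate was gathered.
    this.onicecandidate?.({
      candidate: { candidate: "candidate:1 1 udp 2122260223 192.168.1.10 54321 typ host" },
    });
  },
};

const sentKinds: string[] = [];
const callFinished = startCall(fakePeer, (signal) => sentKinds.push(signal.kind));
```

The candidate handler is registered before the offer is created so that no candidate is missed once gathering begins.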

Each of these steps is also repeated on the receiving client's end. So the final flow diagram for establishing a connection would, in essence, look like this 👇

[Diagram: a successful p2p connection]

[Diagram: a fallback connection with a TURN server]
