
Lakshya Purohit

Why Your WebRTC App Breaks After 3 Users (And How Zoom Fixes It)

A simple explanation of WebRTC scaling problems and how Mediasoup solves them using an SFU architecture.


Introduction

If you've tried building a video calling app with WebRTC, you probably thought it was easy.

Until the third or fourth user joins and everything starts breaking.

  • Lag increases
  • Video freezes
  • CPU usage spikes
  • Bandwidth explodes

So how do apps like Zoom or Google Meet handle large meetings smoothly?

They don’t connect every participant directly to every other participant.

They use a smarter architecture.


The Simple Promise of WebRTC

WebRTC allows direct browser-to-browser communication.

  • No plugins
  • No external software
  • Ultra-low latency

At its core, it creates a direct connection between users.

Sounds perfect — until it isn’t.


Where Things Break

WebRTC connections are peer-to-peer, so the simplest way to build a group call is a full mesh.

That means every user sends video to every other user.

Example:

  • 2 users = 1 connection
  • 4 users = 6 connections
  • 10 users = 45 connections

This grows quadratically: n users need n(n - 1)/2 connections.
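
The counts in the list above follow from one formula: a full mesh of n users needs n(n - 1)/2 connections. Here is a minimal sketch that reproduces them. The function name `meshConnections` is made up for this example.

```typescript
// Number of peer-to-peer links in a full mesh of `participants` users: n * (n - 1) / 2.
function meshConnections(participants: number): number {
  if (participants < 0 || Math.floor(participants) !== participants) {
    throw new RangeError("participants must be a non-negative integer");
  }
  return (participants * (participants - 1)) / 2;
}

for (const n of [2, 4, 10]) {
  console.log(`${n} users = ${meshConnections(n)} connections`);
}
// prints:
// 2 users = 1 connections
// 4 users = 6 connections
// 10 users = 45 connections
```

Doubling the room size roughly quadruples the number of links, which is why the pain shows up so early.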


Mesh architecture: every user connects to every other user, so the number of connections grows quadratically.


The Real Problem

In a mesh, each participant uploads a separate copy of their video to every other participant, so every new user increases:

  • Bandwidth usage
  • CPU load
  • Network complexity

This is why:

  • Browsers crash
  • Calls lag
  • Systems fail to scale
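
To put rough numbers on this, here is a small sketch of the per-user load in a mesh. The 1500 kbps bitrate is an assumed, illustrative figure (not a measurement), and `meshLoadPerUser` is a hypothetical helper.

```typescript
// ASSUMPTION: 1500 kbps per video stream is an illustrative figure only.
const ASSUMED_VIDEO_KBPS = 1500;

interface MeshLoad {
  outgoingStreams: number; // one copy of the video per remote peer
  uploadKbps: number;
  downloadKbps: number;
}

// Per-user load in a full mesh: one stream up and one stream down per remote peer.
function meshLoadPerUser(
  participants: number,
  kbpsPerStream: number = ASSUMED_VIDEO_KBPS
): MeshLoad {
  const peers = Math.max(participants - 1, 0);
  return {
    outgoingStreams: peers,
    uploadKbps: peers * kbpsPerStream,
    downloadKbps: peers * kbpsPerStream,
  };
}

for (const n of [2, 4, 10]) {
  const load = meshLoadPerUser(n);
  console.log(`${n} users: ${load.uploadKbps} kbps up, ${load.downloadKbps} kbps down per user`);
}
```

Under that assumption, a 10-person call asks every participant to upload 13,500 kbps, which is more than many home connections can sustain.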

The Breakthrough: Mediasoup (SFU)

Instead of connecting everyone to everyone, we introduce a server.

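This is called an SFU (Selective Forwarding Unit): it forwards media packets without decoding or re-encoding them. Mediasoup is an open-source SFU with a Node.js API.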

New flow:

  • User sends stream → server
  • Server forwards stream → other users

Now each user uploads their stream once, no matter how many people are in the room.
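
The forwarding rule itself is tiny. Below is a toy in-memory model of it, just to show the idea. It is not mediasoup's API, and `ToySfu` and its methods are invented for this sketch.

```typescript
// Toy model of selective forwarding. This is NOT mediasoup's API.
interface ForwardedPacket {
  from: string;
  payload: string;
}

class ToySfu {
  private inboxes: { [userId: string]: ForwardedPacket[] } = {};

  private has(userId: string): boolean {
    return Object.prototype.hasOwnProperty.call(this.inboxes, userId);
  }

  join(userId: string): void {
    if (!this.has(userId)) {
      this.inboxes[userId] = [];
    }
  }

  // The sender uploads once; the server copies the packet to every other user.
  publish(from: string, payload: string): number {
    if (!this.has(from)) {
      throw new Error(`unknown user: ${from}`);
    }
    let forwarded = 0;
    for (const userId of Object.keys(this.inboxes)) {
      if (userId === from) continue;
      this.inboxes[userId].push({ from, payload });
      forwarded++;
    }
    return forwarded;
  }

  inbox(userId: string): ForwardedPacket[] {
    return this.has(userId) ? this.inboxes[userId] : [];
  }
}

const sfu = new ToySfu();
for (const user of ["alice", "bob", "carol"]) sfu.join(user);
const copies = sfu.publish("alice", "video-frame-1");
console.log(copies); // 2: alice uploaded once, the server forwarded to bob and carol
```

The fan-out work moves from the sender's browser to the server, which usually has far more bandwidth to spare.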


SFU architecture: a central server forwards streams efficiently.


Why This Changes Everything

Mesh architecture

  • Too many connections
  • High bandwidth usage
  • Not scalable

SFU architecture

  • Each user connects only to the server
  • Each stream is uploaded once and forwarded by the server
  • Scales to much larger rooms
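
Here is a rough side-by-side sketch of the two lists. It assumes one client-to-server connection per user in the SFU case, which is a simplification (real apps may use separate send and receive connections), and `roomCost` is a hypothetical helper.

```typescript
type Architecture = "mesh" | "sfu";

interface RoomCost {
  connectionsTotal: number; // peer links (mesh) or client-to-server links (SFU)
  uploadsPerUser: number;
  downloadsPerUser: number;
}

// ASSUMPTION: at least one participant, and one server connection per user for the SFU.
function roomCost(architecture: Architecture, participants: number): RoomCost {
  const others = Math.max(participants - 1, 0);
  if (architecture === "mesh") {
    return {
      connectionsTotal: (participants * others) / 2,
      uploadsPerUser: others,
      downloadsPerUser: others,
    };
  }
  return {
    connectionsTotal: participants,
    uploadsPerUser: 1,
    downloadsPerUser: others,
  };
}

for (const n of [4, 10, 50]) {
  const mesh = roomCost("mesh", n);
  const sfuCost = roomCost("sfu", n);
  console.log(
    `${n} users: mesh ${mesh.connectionsTotal} links, ${mesh.uploadsPerUser} uploads each | ` +
      `SFU ${sfuCost.connectionsTotal} links, ${sfuCost.uploadsPerUser} upload each`
  );
}
```

Note that downloads per user are the same in both cases. The SFU saves on uploads and connection count, not on what each viewer has to receive.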

This is how real systems are built.


Inside Mediasoup

Mediasoup acts as a media router.

Core components:

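  • Worker
    A C++ subprocess that handles the media; each worker runs on a single CPU core, so servers usually start one per core

  • Router
    Routes media between producers and consumers; typically one per room

  • Transport
    Carries media between one client and the router over a WebRTC connection

  • Producer
    A media track that a client sends into the router

  • Consumer
    A media track that the router forwards to a client

Every user typically owns producers (for their own camera and microphone) and consumers (for everyone else's media).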
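
To see how many of these objects a room ends up with, here is a small counting sketch. It assumes every user publishes the same number of tracks (2 by default: camera and microphone) and receives every other user's tracks. `countRoomObjects` is a hypothetical helper, not part of mediasoup.

```typescript
interface RoomObjects {
  producers: number;
  consumers: number;
}

// ASSUMPTION: every user publishes `tracksPerUser` tracks and consumes all tracks from all other users.
function countRoomObjects(users: number, tracksPerUser: number = 2): RoomObjects {
  const producers = users * tracksPerUser;
  // Each producer needs one consumer per receiving user.
  const consumers = producers * Math.max(users - 1, 0);
  return { producers, consumers };
}

console.log(countRoomObjects(4));  // { producers: 8, consumers: 24 }
console.log(countRoomObjects(10)); // { producers: 20, consumers: 180 }
```

The consumer count still grows quadratically, but that work now sits on the server, where it is cheap packet forwarding rather than encoding in a browser.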


How a Call Actually Works

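  • User connects via Socket.IO
  • Server creates a worker (usually once, at startup) and a router for the room
  • User joins the room
  • User sends video through a send transport (as a producer)
  • Other users receive it through their receive transports (as consumers)

The server only forwards packets; it does not decode or re-encode the video.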
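
The steps above can be sketched as an ordered list of signaling steps. Mediasoup leaves the signaling protocol up to your application, so every step name here is hypothetical, and real apps may interleave the send and receive sides differently.

```typescript
// HYPOTHETICAL step names: mediasoup does not define a signaling protocol.
const CALL_STEPS = [
  "connect-socket",
  "join-room",
  "create-send-transport",
  "produce",
  "create-recv-transport",
  "consume",
] as const;

type CallStep = (typeof CALL_STEPS)[number];

// Returns the step that follows `current`, or null when the flow is complete.
function nextStep(current: CallStep): CallStep | null {
  const index = CALL_STEPS.indexOf(current);
  return index < CALL_STEPS.length - 1 ? CALL_STEPS[index + 1] : null;
}

// True when `steps` is a prefix of CALL_STEPS, i.e. nothing was skipped or reordered.
function isValidFlow(steps: CallStep[]): boolean {
  return steps.every((step, i) => step === CALL_STEPS[i]);
}

console.log(nextStep("join-room")); // "create-send-transport"
console.log(isValidFlow(["connect-socket", "join-room", "produce"])); // false: transport step skipped
```

A check like this is handy in Socket.IO handlers, where out-of-order messages are a common source of confusing bugs.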


Mediasoup flow: producer sends media to server, which routes it to consumers.


Tech Stack

  • Frontend: React / Next.js
  • Backend: Node.js
  • Signaling: Socket.IO
  • Media: Mediasoup

The Hard Truth

This is not easy.

You will face:

  • WebRTC debugging challenges
  • NAT traversal (STUN/TURN) issues
  • Media synchronization problems
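
For the NAT traversal item, most of the setup is configuration. Here is a sketch of an ICE server list in the shape that the browser's `RTCPeerConnection` accepts. The STUN entry is Google's public server. The TURN host and credentials are placeholders that you would replace with your own relay.

```typescript
// Local type mirroring the shape of the browser's RTCIceServer dictionary.
interface IceServer {
  urls: string | string[];
  username?: string;
  credential?: string;
}

const iceServers: IceServer[] = [
  // STUN: lets the client discover its public address.
  { urls: "stun:stun.l.google.com:19302" },
  // HYPOTHETICAL TURN relay: host, username and credential are placeholders.
  {
    urls: [
      "turn:turn.example.com:3478?transport=udp",
      "turn:turn.example.com:3478?transport=tcp",
    ],
    username: "demo-user",
    credential: "demo-pass",
  },
];

// In a browser you would pass it like this:
// const pc = new RTCPeerConnection({ iceServers });
console.log(`${iceServers.length} ICE servers configured`);
```

STUN is enough when a direct path exists. TURN relays the media when it does not, for example behind strict corporate firewalls.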

But once you understand it, you can build production-grade systems.


Final Insight

WebRTC gives you real-time communication.

Mediasoup gives you scalability.

That is the difference between:

  • A demo project
  • A real product

Originally Published

This article was originally written and published on my portfolio:

https://www.lakhsyapurohit.online/blog/real-time-video-with-webrtc-and-mediasoup


If this helped you, consider sharing it 🚀
