Lakshya Purohit

Posted on Mar 18

Why Your WebRTC App Breaks After 3 Users (And How Zoom Fixes It)

A simple explanation of WebRTC scaling problems and how Mediasoup solves them using SFU architecture.

Introduction

If you've tried building a video calling app with WebRTC, you probably thought it was easy.

Until the third or fourth user joins and everything starts breaking.

Lag increases
Video freezes
CPU usage spikes
Bandwidth explodes

So how do apps like Zoom or Google Meet handle thousands of users smoothly?

They don’t rely on pure peer-to-peer WebRTC.

They use a smarter architecture.

The Simple Promise of WebRTC

WebRTC allows direct browser-to-browser communication.

No plugins
No external software
Ultra-low latency

At its core, it creates a direct connection between users.

Sounds perfect — until it isn’t.

Where Things Break

A basic multi-party WebRTC app uses a mesh architecture.

That means every user sends video to every other user.

Example:

2 users = 1 connection
4 users = 6 connections
10 users = 45 connections

This grows quadratically: n users need n(n-1)/2 connections.
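The numbers in the example follow from counting pairs: each of the n users connects to the other n - 1, and every connection is shared by two users, which gives n(n-1)/2. A minimal sketch of that arithmetic (the function name `meshConnectionCount` is only illustrative):

```typescript
// Number of peer-to-peer connections in a full mesh of `users` participants.
// Each user pairs with the other (users - 1), and every pair is counted once.
function meshConnectionCount(users: number): number {
  if (!Number.isInteger(users) || users < 0) {
    throw new RangeError("users must be a non-negative integer");
  }
  return (users * (users - 1)) / 2;
}

for (const n of [2, 4, 10, 50]) {
  console.log(`${n} users = ${meshConnectionCount(n)} connections`);
}
```

Doubling the room size roughly quadruples the number of connections, which is why a mesh that feels fine at three users struggles at ten.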

Mesh Architecture

Mesh architecture: every user connects to every other user, so the number of connections grows quadratically.

The Real Problem

Every new user adds one more stream for every existing user to encode, upload, and decode. That drives up:

Bandwidth usage
CPU load
Network complexity

This is why:

Browsers crash
Calls lag
Systems fail to scale
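To see what this means for a single participant, here is a rough sketch of the per-user load in a mesh. It assumes every stream uses the same bitrate, and the 1.5 Mbps figure is an assumed value for illustration, not a measurement. The names `MeshPeerLoad` and `meshPeerLoad` are made up for this example.

```typescript
// Per-user cost in a full mesh, assuming every stream is sent at the same bitrate.
interface MeshPeerLoad {
  outgoingStreams: number; // copies of your own video you upload
  incomingStreams: number; // streams you download and decode
  uploadMbps: number;      // total upstream bandwidth needed
}

function meshPeerLoad(users: number, streamMbps: number): MeshPeerLoad {
  // You exchange media with everyone except yourself.
  const others = Math.max(users - 1, 0);
  return {
    outgoingStreams: others,
    incomingStreams: others,
    uploadMbps: others * streamMbps,
  };
}

console.log(meshPeerLoad(4, 1.5));  // uploadMbps: 4.5
console.log(meshPeerLoad(10, 1.5)); // uploadMbps: 13.5
```

Upload bandwidth is usually the first limit users hit, since home connections tend to have much less upstream than downstream capacity.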

The Breakthrough: Mediasoup (SFU)

Instead of connecting everyone to everyone, we introduce a server.

This is called an SFU (Selective Forwarding Unit).

New flow:

User sends stream → server
Server forwards stream → other users

Each user now uploads once, to the server, instead of once per participant.
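The forwarding idea can be shown with a toy model that moves strings instead of real media. `ToySfu` and its methods are hypothetical names for this sketch and are not part of Mediasoup or any other library.

```typescript
// Toy model of selective forwarding: no real media, just who receives what.
type PeerId = string;

class ToySfu {
  private inboxes = new Map<PeerId, string[]>();

  join(peer: PeerId): void {
    if (!this.inboxes.has(peer)) this.inboxes.set(peer, []);
  }

  // The sender uploads one packet; the server copies it to everyone else.
  // Returns how many copies the server forwarded.
  publish(sender: PeerId, packet: string): number {
    if (!this.inboxes.has(sender)) throw new Error(`unknown peer: ${sender}`);
    let forwarded = 0;
    this.inboxes.forEach((inbox, peer) => {
      if (peer === sender) return; // never echo a stream back to its sender
      inbox.push(`${sender}:${packet}`);
      forwarded++;
    });
    return forwarded;
  }

  received(peer: PeerId): readonly string[] {
    return this.inboxes.get(peer) ?? [];
  }
}

const sfu = new ToySfu();
["alice", "bob", "carol"].forEach((p) => sfu.join(p));
console.log(sfu.publish("alice", "frame-1")); // 2
console.log(sfu.received("bob"));             // ["alice:frame-1"]
console.log(sfu.received("alice"));           // []
```

Alice sends one packet and the server makes the two copies, so her upload cost stays the same no matter how many people join.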

SFU Architecture

SFU architecture: a central server forwards streams efficiently.

Why This Changes Everything

Mesh architecture

Too many connections
High bandwidth usage
Not scalable

SFU architecture

Each user uploads their stream once
Efficient forwarding
Scales to large rooms

This is how most production video-calling systems are built.
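A quick way to compare the two designs is to count streams. The sketch below is a simplification: it assumes a room of at least two users where everyone publishes one stream and watches everyone else. The names `TopologyCost`, `meshCost`, and `sfuCost` are illustrative only.

```typescript
// Stream counts for a room of n users (n >= 2), one published stream each.
interface TopologyCost {
  uploadsPerClient: number;
  downloadsPerClient: number;
  serverForwards: number; // streams the server sends out (0 when there is no server)
}

function meshCost(n: number): TopologyCost {
  return { uploadsPerClient: n - 1, downloadsPerClient: n - 1, serverForwards: 0 };
}

function sfuCost(n: number): TopologyCost {
  // The fan-out does not disappear: it moves from each client to the server.
  return { uploadsPerClient: 1, downloadsPerClient: n - 1, serverForwards: n * (n - 1) };
}

for (const n of [4, 10]) {
  console.log(n, "mesh", meshCost(n), "sfu", sfuCost(n));
}
```

Note that downloads per client are the same in both designs. The SFU removes the upload and encoding burden from clients and shifts the copying work to a server, which is easier to scale than a user's laptop.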

Inside Mediasoup

Mediasoup acts as a media router.

Core components:

Worker
A separate process that handles media on a single CPU core (run one worker per core)
Router
Routes media between participants; typically one per room
Transport
Manages WebRTC connections
Producer
Sends media
Consumer
Receives media

Every user is both a producer and a consumer.
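In practice a user has one producer per track they send (for example audio and video) and one consumer per track they receive. The sketch below only does the bookkeeping to show how those counts add up. The types and the `planConsumers` function are hypothetical and are not the Mediasoup API.

```typescript
// Hypothetical bookkeeping for one room; mirrors the Producer/Consumer roles only.
interface ProducerInfo { id: string; peerId: string; kind: "audio" | "video"; }
interface ConsumerInfo { peerId: string; producerId: string; }

// For every producer in the room, each *other* peer needs one consumer.
function planConsumers(peerIds: string[], producers: ProducerInfo[]): ConsumerInfo[] {
  const plan: ConsumerInfo[] = [];
  for (const producer of producers) {
    for (const peerId of peerIds) {
      if (peerId !== producer.peerId) {
        plan.push({ peerId, producerId: producer.id });
      }
    }
  }
  return plan;
}

const roomPeers = ["alice", "bob", "carol"];
const roomProducers: ProducerInfo[] = [];
for (const peerId of roomPeers) {
  roomProducers.push({ id: `${peerId}-audio`, peerId, kind: "audio" });
  roomProducers.push({ id: `${peerId}-video`, peerId, kind: "video" });
}
const roomConsumers = planConsumers(roomPeers, roomProducers);
console.log(roomProducers.length); // 6 producers: 2 per peer
console.log(roomConsumers.length); // 12 consumers: 4 per peer
```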

How a Call Actually Works

User connects via Socket.IO
Server creates a router for the room on a worker (workers are usually started once, when the server boots)
User joins a room
User sends video
Other users receive it

Everything is routed efficiently.
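The steps above can be sketched as a small signaling handler. Everything here is an assumption made for illustration: the message names, their shapes, and the in-memory `Room`, which stands in for a Mediasoup router. A real server would receive these messages over Socket.IO and create transports, producers, and consumers in response.

```typescript
// Hypothetical signaling layer: message names and shapes are assumptions.
type ClientMessage =
  | { type: "join"; roomId: string; peerId: string }
  | { type: "produce"; roomId: string; peerId: string; kind: "audio" | "video" };

type ServerEvent =
  | { type: "peerJoined"; to: string; peerId: string }
  | { type: "newProducer"; to: string; peerId: string; kind: "audio" | "video" };

interface Room { peers: Set<string>; }

const rooms = new Map<string, Room>();

// Returns the notifications the server should send to the other peers.
function handleMessage(msg: ClientMessage): ServerEvent[] {
  // One room per roomId, created lazily on the first join.
  let room = rooms.get(msg.roomId);
  if (!room) {
    if (msg.type !== "join") throw new Error(`room ${msg.roomId} does not exist`);
    room = { peers: new Set<string>() };
    rooms.set(msg.roomId, room);
  }

  const events: ServerEvent[] = [];
  const others = Array.from(room.peers).filter((p) => p !== msg.peerId);

  if (msg.type === "join") {
    room.peers.add(msg.peerId);
    for (const to of others) events.push({ type: "peerJoined", to, peerId: msg.peerId });
  } else {
    if (!room.peers.has(msg.peerId)) throw new Error("peer must join before producing");
    // Tell everyone else there is a new stream they can start consuming.
    for (const to of others) {
      events.push({ type: "newProducer", to, peerId: msg.peerId, kind: msg.kind });
    }
  }
  return events;
}

handleMessage({ type: "join", roomId: "demo", peerId: "alice" });
handleMessage({ type: "join", roomId: "demo", peerId: "bob" });
console.log(handleMessage({ type: "produce", roomId: "demo", peerId: "alice", kind: "video" }));
// one "newProducer" event addressed to bob
```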

Mediasoup Flow

Mediasoup flow: producer sends media to server, which routes it to consumers.

Tech Stack

Frontend: React / Next.js
Backend: Node.js
Signaling: Socket.IO
Media: Mediasoup

The Hard Truth

This is not easy.

You will face:

WebRTC debugging challenges
NAT traversal (STUN/TURN) issues
Media synchronization problems

But once you understand it, you can build production-grade systems.
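For the NAT traversal item, much of the work comes down to giving the browser a list of STUN and TURN servers. The sketch below builds an object with the same shape as the browser's `RTCConfiguration`. The interfaces are declared locally so the snippet also type-checks outside a browser, and the hostnames and credentials are placeholders, not real servers.

```typescript
// Local copies of the relevant RTCConfiguration fields.
interface IceServer { urls: string | string[]; username?: string; credential?: string; }
interface PeerConfig { iceServers: IceServer[]; iceTransportPolicy?: "all" | "relay"; }

// Hostnames below are placeholders; replace them with your own STUN/TURN servers.
function buildIceConfig(turnUser: string, turnPass: string, forceRelay = false): PeerConfig {
  return {
    iceServers: [
      // STUN lets a client discover its public address.
      { urls: "stun:stun.example.org:3478" },
      // TURN relays media when a direct path cannot be established.
      {
        urls: [
          "turn:turn.example.org:3478?transport=udp",
          "turn:turn.example.org:3478?transport=tcp",
        ],
        username: turnUser,
        credential: turnPass,
      },
    ],
    // "relay" forces all media through TURN, which is handy for testing the TURN server.
    iceTransportPolicy: forceRelay ? "relay" : "all",
  };
}

const iceConfig = buildIceConfig("demo-user", "demo-pass");
console.log(iceConfig.iceServers.length); // 2
// In a browser: new RTCPeerConnection(iceConfig)
```

Setting `forceRelay` to `true` during testing is one way to check that the TURN server actually works before a user behind a strict firewall finds out for you.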

Final Insight

WebRTC gives you real-time communication.

Mediasoup gives you scalability.

That is the difference between:

A demo project
A real product

Originally Published

This article was originally written and published on my portfolio:

https://www.lakhsyapurohit.online/blog/real-time-video-with-webrtc-and-mediasoup

If this helped you, consider sharing it 🚀

DEV Community