Konstantin Shpinev

Load testing Jitsi: Open-source alternative to Zoom

Jitsi is an open-source video conferencing system distributed under the Apache license (Terms of Service - Jitsi). You can install it on your own server to run an independent instance of Jitsi Meet.

First, let's briefly discuss what Jitsi is and what solutions exist for multi-user video calling, and then we'll move on to testing. At the end, I will share the formula for calculating the hardware resources of the server to support video conferences of different configurations.

Jitsi consists of the following main components:

Jitsi modules scheme

Jitsi Videobridge - the core of the system, a WebRTC media server with an SFU/Simulcast architecture that routes media streams between conference participants. Stack: Java, WebRTC.

Prosody - an XMPP server that provides signaling between clients and media servers. Stack: Lua, XMPP.

Jibri - a set of scripts for recording a conference to a file. It emulates a client session and records video from a virtual screen to a file using ffmpeg. Stack: Java, ffmpeg, Chrome.

Jicofo - the conference control service. Stack: Java.

Jitsi Meet - a web client for connecting to a video conference. Stack: Web, React, WebRTC.

Jitsi Desktop - a desktop client for various OSs (Windows, Linux, macOS). Stack: Jitsi Meet + Electron.

Jitsi Meet Mobile - an application for iOS and Android. Stack: React Native.

What are the different types of systems for video conferencing?

There are several server architectures for video conferencing systems, including:

  • MCU (Multipoint Control Unit) - the server mixes/transcodes media streams.
  • SFU (Selective Forwarding Unit) - the server multiplexes/routes media streams between clients.
  • Simulcast - similar to SFU, but each client sends 3 media streams at different bitrates to optimize traffic.
  • SVC (Scalable Video Coding) - similar to Simulcast, but with adaptive video coding on the client side.

In my opinion, SFU/Simulcast is the best option: the server does not have to mix or transcode video, which saves a significant amount of server resources, and it relies on the widely adopted WebRTC protocol.

Note that not all browsers support Simulcast (in our testing, it did not work in Firefox), so for Jitsi it is recommended to use Google Chrome or the desktop client.

Load testing Jitsi Videobridge

Input data

The tests were conducted on the following testbed:

  • 2 Supermicro servers with Intel Xeon E3-12XX CPUs (4 physical cores, 8 threads with Hyper-Threading), 8 GB RAM, Intel Server Adapter (with hardware queues enabled). OS: Debian Buster, CPU in Performance mode.
  • A virtual server based on AMD: 2 x 2 GHz CPUs, 5 GB RAM.
  • 8 client computers with different configurations.

The objectives of the testing were to analyze the nature of the load, identify critical performance indicators, and develop a formula for calculating the required resources for a given number of users.

It is important that the CPU on the Jitsi Videobridge server supports hardware AES encryption (AES-NI) and that AES-NI support is enabled in the operating system and the OpenSSL library; otherwise, performance can be 2-3 times worse. The nixCraft article "How to find out AES-NI (Advanced Encryption) Enabled on Linux System" describes how to check that the feature is present and active.

The reason is that WebRTC encrypts all media streams (SRTP), so each incoming stream from a client is decrypted and then re-encrypted on the server.

On our testbed, hardware support for AES was enabled.
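As a quick first check, the CPU flags can be inspected from a script. The sketch below is only an illustration and assumes an x86 Linux host, where `/proc/cpuinfo` lists the `aes` flag when the CPU advertises AES-NI; the function name `cpu_has_aes` is my own. It does not confirm that OpenSSL actually uses the instructions, so treat it as a starting point rather than a full verification.

```python
from pathlib import Path


def cpu_has_aes(cpuinfo_text: str) -> bool:
    """Return True if a 'flags' line in /proc/cpuinfo-style text lists 'aes'."""
    for line in cpuinfo_text.splitlines():
        key, _, value = line.partition(":")
        # Match the whole token so that e.g. 'vaes' is not counted as 'aes'.
        if key.strip() == "flags" and "aes" in value.split():
            return True
    return False


if __name__ == "__main__":
    cpuinfo = Path("/proc/cpuinfo")  # assumption: Linux on x86
    if cpuinfo.exists():
        print("CPU advertises AES-NI:", cpu_has_aes(cpuinfo.read_text()))
    else:
        print("/proc/cpuinfo not found; this check only works on Linux")
```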

First test

Using the Chromium browser in video emulation mode, connections to the conference were established from multiple computers. This was done in several stages.

Testing in Chromium

The first stage tested a large number of participants in a single conference, from 20 up to 38 participants. Performance measurements were taken at each step.

The second stage tested multiple conferences with a small number of participants in each one (10 and 5 people).

In the first test, performance was measured manually. The results of the first test are summarized in the table:

First results

Second test

To verify the results, the test was repeated in similar configurations, but with automated measurement of averaged performance indicators.

The results of the second test are presented in the table below.

Second results

Conclusion

The CPU is the main resource that limits server performance. Analyzing the measurements in several ways revealed the key relationship: CPU consumption depends on the network traffic passing through the server, specifically on the total bitrate (upload + download).

CPU load dependency on the total bitrate

The graph shows that CPU load grows linearly with the total bitrate, regardless of the conference configuration, while the bitrate itself depends directly on the configuration. The linear relationship is also consistent with how WebRTC works: every media stream is encrypted, so the higher the total bitrate, the more encryption operations the processor has to perform.

The last column of the load test results tables shows the amount of traffic handled per 1% of CPU. After the second test, the average value was 5.82 Mbit/sec per 1% CPU.

In our case, the server ran on 8 vCPUs, so we can calculate the bitrate that 1 vCPU can handle (on average):
5.82 * 100 / 8 = 72.75 Mbit/sec per 1 vCPU with a clock speed of about 3 GHz.
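The same arithmetic can be written as a small linear model. This is a minimal sketch that uses only the two numbers from the text (5.82 Mbit/sec per 1% CPU, 8 vCPUs); the function names and the 100 Mbit/sec input are hypothetical and serve only as an illustration. The model is valid only to the extent that the linear relationship measured on our testbed holds on other hardware.

```python
MBIT_PER_CPU_PERCENT = 5.82  # average from the second test, Mbit/sec per 1% CPU
SERVER_VCPUS = 8             # vCPUs on the test server


def mbit_per_vcpu(mbit_per_cpu_percent: float = MBIT_PER_CPU_PERCENT,
                  vcpus: int = SERVER_VCPUS) -> float:
    """Average total bitrate (Mbit/sec) that one fully loaded vCPU can handle."""
    return mbit_per_cpu_percent * 100 / vcpus


def expected_cpu_percent(total_bitrate_mbit: float,
                         mbit_per_cpu_percent: float = MBIT_PER_CPU_PERCENT) -> float:
    """Estimated CPU load (% of the whole server) for a given total bitrate."""
    return total_bitrate_mbit / mbit_per_cpu_percent


print(f"{mbit_per_vcpu():.2f} Mbit/sec per vCPU")
# Hypothetical input: 100 Mbit/sec of total (upload + download) traffic.
print(f"{expected_cpu_percent(100.0):.1f}% CPU at 100 Mbit/sec")
```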

The Jitsi Videobridge server is the main consumer of CPU.

Memory consumption is not shown in the tables because it was not significant. The calculations below assume 2 GB of RAM per 1 vCPU.

As a result of the tests, a universal formula was developed for calculating the required resources based on the total number of conferences, their type, and the simultaneous number of participants in them.

The formula is based on the following universal representation of an SFU/Simulcast conference:

Universal scheme

The generalized diagram uses two roles: Speaker, a conference participant who can see other participants, and Listener, a participant who can only see the Speaker.

The variables for the universal formula are marked in blue:

  • A - the number of participants that Speakers can see in the grid.
  • B - the number of Listener participants.
  • C - the number of Speaker participants.
  • D - the minimum Simulcast bitrate.
  • E - the maximum Simulcast bitrate.
  • F - the number of Listener participants who Speakers view in a large tile.

Thus, the server's incoming (Download) and outgoing (Upload) bitrate can be calculated with the formula:

Download = A*D + C*E + F*E
Upload = B*E*C + C*(E*F + A*D)

With these parameters, you can calculate the required bitrate for a specific video conference configuration with a given number of participants. Combined with the average bitrate that 1 vCPU can handle and the other load testing data, this also gives the necessary hardware resources.
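For convenience, the two formulas can be wrapped in a function. This is a minimal sketch: the function and argument names are my own, with the letters A-F from the list above noted in the comments, and the input values at the bottom are hypothetical.

```python
def conference_bitrate(grid_size: int,         # A: participants Speakers see in the grid
                       listeners: int,         # B: number of Listeners
                       speakers: int,          # C: number of Speakers
                       min_bitrate: float,     # D: minimum Simulcast bitrate, Mbit/sec
                       max_bitrate: float,     # E: maximum Simulcast bitrate, Mbit/sec
                       large_tile_listeners: int  # F: Listeners shown in a large tile
                       ) -> tuple[float, float]:
    """Return (download, upload) server bitrate in Mbit/sec for one conference."""
    a, b, c = grid_size, listeners, speakers
    d, e, f = min_bitrate, max_bitrate, large_tile_listeners
    download = a * d + c * e + f * e
    upload = b * e * c + c * (e * f + a * d)
    return download, upload


# Hypothetical configuration, only to show the call.
download, upload = conference_bitrate(grid_size=4, listeners=10, speakers=2,
                                      min_bitrate=0.2, max_bitrate=1.5,
                                      large_tile_listeners=1)
print(f"Download {download:.1f}, Upload {upload:.1f}, "
      f"Total {download + upload:.1f} Mbit/sec")
```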

For example, in the field of education, it can be assumed that the average conference size is 31 people, with 30 listeners and 1 speaker. Let the lower Simulcast bitrate be 0.2 Mbit/sec, and the upper bitrate be 1.5 Mbit/sec. Suppose the speaker sees 6 participants in the grid and always sees one of them in an enlarged window. Based on this:

A = 6
B = 30
C = 1
D = 0.2
E = 1.5
F = 1
Download = 6*0.2 + 1*1.5 + 1*1.5 = 4.2 Mbit/sec
Upload = 30*1.5*1 + 1*(1.5*1 + 6*0.2) = 47.7 Mbit/sec
Total bitrate = 4.2 + 47.7 = 51.9 Mbit/sec
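To turn this bitrate into hardware, divide it by the measured per-vCPU capacity and apply the 2 GB of RAM per vCPU rule. The sketch below is an illustration, not the exact logic of the spreadsheet: the function name, the number of simultaneous conferences (10), and the `max_cpu_utilization` parameter are my assumptions. With the default of 1.0 there is no headroom, so in practice you may want a lower target.

```python
import math

MBIT_PER_VCPU = 72.75  # measured average, Mbit/sec per vCPU (about 3 GHz)
RAM_GB_PER_VCPU = 2    # rule used in the calculations


def required_resources(total_bitrate_mbit: float,
                       conferences: int = 1,
                       max_cpu_utilization: float = 1.0) -> tuple[int, int]:
    """Return (vCPUs, RAM in GB) for identical simultaneous conferences.

    max_cpu_utilization is an assumed tuning knob (0 < value <= 1):
    lower values leave headroom on each vCPU.
    """
    if not 0 < max_cpu_utilization <= 1:
        raise ValueError("max_cpu_utilization must be in (0, 1]")
    load = conferences * total_bitrate_mbit
    vcpus = max(1, math.ceil(load / (MBIT_PER_VCPU * max_cpu_utilization)))
    return vcpus, vcpus * RAM_GB_PER_VCPU


# 51.9 Mbit/sec is the education example; 10 conferences is a hypothetical load.
vcpus, ram_gb = required_resources(51.9, conferences=10)
print(f"{vcpus} vCPU, {ram_gb} GB RAM")
```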

Scalability capabilities of Jitsi

Several Jitsi Videobridge servers can be combined into a cluster without losing functionality and quality according to the following scheme:

Scalability scheme

However, the XMPP server Prosody is single-threaded and is not designed to work in a cluster, although it can serve multiple Videobridge servers. This looks like a potential bottleneck, but in our load tests we could not generate enough load on Prosody to cause high resource consumption. In theory, it is also possible to run multiple instances of Prosody.

Conclusion and table for equipment performance calculation

Thank you for reading. I hope the test results and my calculations will be useful to you. As promised, I'm sharing an Excel spreadsheet that will help you calculate the number of vCPUs and the amount of RAM required to host Jitsi for your tasks.
