So far in our series we have discussed RTP: The Voice Courier and how it is crucial for carrying voice and video. But how do we know if those packets are arriving on time, or if the call quality is suffering?
Introducing RTCP (RTP Control Protocol, also known as the Real-time Transport Control Protocol). While RTP carries voice or video, RTCP monitors quality, synchronizes streams, and reports stats like jitter, packet loss, and delay. Defined in RFC 3550 (the same spec as RTP), RTCP is the control plane for RTP streams.
What is RTCP?
- RTP sends the media stream (voice, video, etc.).
- RTCP sends control and quality feedback alongside RTP.
Together, they make a complete real-time transport framework.
Why RTCP Exists
- Quality Feedback: reports on packet loss, jitter, and delay.
- Synchronization: aligns RTP streams (e.g., audio and video) using NTP + RTP timestamps.
- Participant Identification: each source is identified by its SSRC, and the SDES CNAME ties a participant's SSRCs together, so you know who's who.
- Scalability: bandwidth control ensures RTCP doesn't eat into RTP's share.
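To make the scalability point concrete, here is a rough sketch of how a report interval can be derived from a bandwidth budget. It assumes the RFC 3550 recommendations of roughly 5% of session bandwidth for RTCP and a 5-second minimum interval. It deliberately leaves out the sender/receiver bandwidth split and the randomization the RFC also describes, and the function and parameter names are made up for illustration.

```python
def rtcp_interval_seconds(session_bw_bps, members, avg_rtcp_packet_bytes,
                          rtcp_fraction=0.05, min_interval=5.0):
    """Simplified, deterministic RTCP report interval (illustrative only)."""
    if members < 1:
        raise ValueError("a session has at least one member")
    # RTCP budget in bytes per second (assumed 5% of session bandwidth).
    rtcp_bw_bytes_per_s = session_bw_bps * rtcp_fraction / 8.0
    # Time for every member to send one average-sized report within budget.
    interval = members * avg_rtcp_packet_bytes / rtcp_bw_bytes_per_s
    # Never report more often than the minimum interval.
    return max(min_interval, interval)


# A 64 kbit/s session: two parties hit the minimum, 1000 parties back off.
two_party = rtcp_interval_seconds(64_000, members=2, avg_rtcp_packet_bytes=100)
big_conference = rtcp_interval_seconds(64_000, members=1000, avg_rtcp_packet_bytes=100)
print(two_party, big_conference)
```

The useful takeaway is the shape of the formula: as the number of participants grows, each one reports less often, so total RTCP traffic stays roughly constant.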
RTCP packets travel periodically (every few seconds) to:
- Report on packet loss, jitter, and round-trip time (RTT).
- Provide sender statistics (how many RTP packets were sent).
- Carry source descriptions (like CNAMEs for sync).
- Enable feedback mechanisms (NACK, picture loss indication, etc.).
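Round-trip time is not carried as a field of its own; the sender derives it from the "last SR" (LSR) and "delay since last SR" (DLSR) values a receiver echoes back. The sketch below shows that arithmetic, assuming all three inputs are already in the 1/65536-second units RTCP uses. The function names are illustrative, and the sample values are chosen to match the round-trip illustration in RFC 3550.

```python
def ntp_middle_32(ntp_msw, ntp_lsw):
    """Middle 32 bits of a 64-bit NTP timestamp (16.16 fixed point)."""
    return ((ntp_msw & 0xFFFF) << 16) | (ntp_lsw >> 16)


def rtt_seconds(arrival_mid32, lsr, dlsr):
    """RTT = arrival time - LSR - DLSR, computed modulo 2**32."""
    rtt_units = (arrival_mid32 - lsr - dlsr) & 0xFFFFFFFF
    return rtt_units / 65536.0


# LSR is the middle 32 bits of the NTP timestamp from the sender's last SR.
lsr = ntp_middle_32(0xB44DB705, 0x20000000)   # 0xB7052000
dlsr = 0x00054000                             # receiver held it for 5.25 s
arrival = 0xB7108000                          # when the RR reached the sender
print(rtt_seconds(arrival, lsr, dlsr))        # 6.125
```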
RTCP in the Protocol Stack
Protocol Stack |
---|
Application Signaling (SIP/SDP) |
RTCP (control, reporting) / RTP (media transport) |
Transport (UDP / DTLS) |
IP |
- RTP: Media stream
- RTCP: Control/feedback stream
- Both usually run on adjacent UDP ports
RTCP Packet Structure
Every RTCP packet starts with a common header:
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P| Count | Packet Type | Length |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- V (2 bits): Protocol version (always 2).
- P (1 bit): Padding.
- Count (5 bits): Varies by packet type (e.g., number of reports).
- Packet Type (PT, 8 bits): Identifies the RTCP message type.
- Length (16 bits): Length of this RTCP packet in 32-bit words minus one, including the header and any padding.
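The field list above maps directly onto a few bit operations. Here is a minimal sketch of decoding the 4-byte common header with Python's `struct` module; the function name and the dictionary keys are illustrative choices, not part of any library.

```python
import struct


def parse_rtcp_header(data):
    """Decode the 4-byte RTCP common header."""
    if len(data) < 4:
        raise ValueError("RTCP header needs at least 4 bytes")
    first, packet_type, length_words = struct.unpack_from("!BBH", data, 0)
    return {
        "version": first >> 6,            # top 2 bits
        "padding": bool(first & 0x20),    # next bit
        "count": first & 0x1F,            # low 5 bits
        "packet_type": packet_type,
        "length_words": length_words,     # 32-bit words minus one
        "size_bytes": (length_words + 1) * 4,
    }


header = parse_rtcp_header(bytes.fromhex("81c80006"))
print(header)
```

Note that the size in bytes is `(length + 1) * 4`, because the length field counts 32-bit words minus one.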
Types of RTCP Messages
RTCP defines multiple packet types, each with a specific purpose.
1. Sender Report (SR), PT=200
- Sent by active senders.
- Contains:
  - NTP timestamp (wall clock).
  - RTP timestamp (media clock).
  - Count of packets/bytes sent.
- Used for synchronization and quality monitoring.
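2. Receiver Report (RR), PT=201
- Sent by receivers.
- Contains:
  - Fraction of packets lost since the last report, plus the cumulative number lost.
  - Interarrival jitter.
  - Last SR timestamp (LSR) and delay since last SR (DLSR), which let the sender estimate round-trip time.
- Gives feedback to the sender about stream quality.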
3. Source Description (SDES), PT=202
- Provides metadata about streams.
- Example: CNAME, name, email, tool used.
4. BYE, PT=203
- Indicates end of participation.
- Graceful "leaving the game."
5. APP, PT=204
- Application-specific RTCP extensions.
- Rare, custom use cases.
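6. Extended Reports (XR), PT=207
- Defined later in RFC 3611 to carry detailed stats (e.g., round-trip delay, burst metrics).
7. Feedback Messages (RTP/AVPF extensions, RFC 4585)
- NACK (Negative ACK), transport-layer feedback (PT=205): requests retransmission of lost packets.
- PLI (Picture Loss Indication), payload-specific feedback (PT=206): requests a new video keyframe.
- FIR (Full Intra Request, RFC 5104), payload-specific feedback (PT=206): forces a full intra-frame.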
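As a quick way to tell these messages apart on the wire, the sketch below maps the packet type byte to a name. For the feedback messages it also reads the 5-bit field that normally holds the count, which RFC 4585 reuses as a feedback message type (FMT). The numeric values come from RFC 3550, RFC 3611, RFC 4585, and RFC 5104; the table and function names are illustrative, and only a handful of FMT values are listed.

```python
RTCP_PACKET_TYPES = {
    200: "SR", 201: "RR", 202: "SDES", 203: "BYE", 204: "APP",
    205: "RTPFB",  # transport-layer feedback (RFC 4585)
    206: "PSFB",   # payload-specific feedback (RFC 4585)
    207: "XR",     # extended reports (RFC 3611)
}

# (packet type, FMT) pairs for the feedback messages discussed here.
FEEDBACK_FORMATS = {
    (205, 1): "Generic NACK",
    (206, 1): "PLI",
    (206, 4): "FIR",
}


def describe_rtcp(first_two_bytes):
    """Name an RTCP packet from the first two bytes of its header."""
    count_or_fmt = first_two_bytes[0] & 0x1F
    packet_type = first_two_bytes[1]
    name = RTCP_PACKET_TYPES.get(packet_type, f"unknown ({packet_type})")
    if packet_type in (205, 206):
        fmt = FEEDBACK_FORMATS.get((packet_type, count_or_fmt),
                                   f"FMT={count_or_fmt}")
        return f"{name}: {fmt}"
    return name


print(describe_rtcp(bytes([0x81, 200])))  # SR
print(describe_rtcp(bytes([0x81, 206])))  # PSFB: PLI
print(describe_rtcp(bytes([0x84, 206])))  # PSFB: FIR
```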
Real RTCP Packet Example
Let's look at a real RTCP Sender Report (SR) packet captured with Wireshark:
0000 81 c8 00 06 12 34 56 78 00 00 01 63 89 ab cd ef
0010 11 22 33 44 00 00 27 10 00 00 09 f6 00 00 00 01
0020 5e f6 7a 2c 00 00 00 00
Field Breakdown
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
|V=2|P=0|RC=1| PT=200 (SR) | length=6 |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
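- `81 c8 00 06`
  - `V=2`: RTP version 2
  - `P=0`: No padding
  - `RC=1`: 1 report block will follow
  - `PT=200`: Packet Type = SR (Sender Report)
  - `length=6`: This RTCP packet is 6 + 1 = 7 words (28 bytes) long. That covers only the header, SSRC, and sender info; a complete SR with one 24-byte report block would be 52 bytes (`length=12`), so treat this dump as illustrative rather than byte-exact.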
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| SSRC of sender |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- `12 34 56 78`: SSRC = `0x12345678` (unique sender identifier)
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| NTP timestamp, most significant word |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| NTP timestamp, least significant word |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- `00 00 01 63 89 ab cd ef`: NTP timestamp
  - High 32 bits: `0x00000163` (whole seconds since 1 January 1900)
  - Low 32 bits: `0x89abcdef` (fraction of a second)
  - Used to sync RTP streams to real wall-clock time
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| RTP timestamp |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- `11 22 33 44`: RTP timestamp = `0x11223344`
  - The same instant as the NTP timestamp, expressed in media clock units; pairing the two lets receivers map RTP time to wall-clock time for sync
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| sender's packet count |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| sender's octet count |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
- `00 00 27 10`: Sender's packet count = 10000 packets
- `00 00 09 f6`: Sender's octet count = 2550 bytes
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
| Report Block (for one source) |
+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+-+
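- `00 00 00 01 5e f6 7a 2c 00 00 00 00`
  - `SSRC=0x00000001` (source being reported)
  - A full report block is 24 bytes: SSRC, fraction lost, cumulative lost, extended highest sequence number received, jitter, last SR timestamp, and delay since last SR. Only the first 12 bytes are shown here.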
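If you want to reproduce the breakdown yourself, here is a small sketch that decodes the header and sender info (the first 28 bytes) of the hex dump above with `struct`. The function name and dictionary keys are illustrative. The trailing bytes of the dump are left alone, since they do not form a complete 24-byte report block.

```python
import struct

# The 40 bytes from the hex dump above.
SR_DUMP = bytes.fromhex(
    "81c80006" "12345678" "00000163" "89abcdef"
    "11223344" "00002710" "000009f6"
    "00000001" "5ef67a2c" "00000000"
)


def parse_sender_report(data):
    """Decode the header and sender info of an RTCP Sender Report."""
    if len(data) < 28:
        raise ValueError("an SR needs at least 28 bytes")
    first, packet_type, length_words = struct.unpack_from("!BBH", data, 0)
    if packet_type != 200:
        raise ValueError(f"not a Sender Report (PT={packet_type})")
    ssrc, ntp_msw, ntp_lsw, rtp_ts, packets, octets = struct.unpack_from(
        "!6I", data, 4)
    return {
        "report_count": first & 0x1F,
        "length_words": length_words,
        "ssrc": ssrc,
        "ntp_msw": ntp_msw,
        "ntp_lsw": ntp_lsw,
        # Seconds since 1900 as a float: integer part plus 32-bit fraction.
        "ntp_seconds": ntp_msw + ntp_lsw / 2**32,
        "rtp_timestamp": rtp_ts,
        "packet_count": packets,
        "octet_count": octets,
    }


sr = parse_sender_report(SR_DUMP)
print(hex(sr["ssrc"]), sr["packet_count"], sr["octet_count"])
```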
Why This Matters
This single RTCP packet tells us:
- Who the sender is (SSRC)
- When media was sent (NTP + RTP timestamps)
- How much was sent (packet & octet counts)
- How the receiver is doing (loss/jitter stats from the report block)
Together, SRs and RRs form the feedback loop that lets endpoints monitor and adapt RTP streams for real-time communication.
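To show what "loss/jitter stats" look like once decoded, here is a sketch that unpacks one complete 24-byte report block. The sample block is built from made-up values (it is not taken from the capture above), the 8000 Hz clock rate is an assumption typical of narrowband audio, and the function name is illustrative.

```python
import struct


def parse_report_block(block, clock_rate):
    """Decode one 24-byte RTCP report block."""
    if len(block) != 24:
        raise ValueError("a report block is exactly 24 bytes")
    ssrc, fraction, cumulative, ext_seq, jitter, lsr, dlsr = struct.unpack(
        "!IB3sIIII", block)
    return {
        "ssrc": ssrc,
        # Fraction lost is an 8-bit fixed-point number (value / 256).
        "fraction_lost_percent": 100.0 * fraction / 256,
        # Cumulative lost is a signed 24-bit integer.
        "cumulative_lost": int.from_bytes(cumulative, "big", signed=True),
        "sequence_cycles": ext_seq >> 16,
        "highest_sequence": ext_seq & 0xFFFF,
        # Jitter is reported in RTP timestamp units.
        "jitter_seconds": jitter / clock_rate,
        "lsr": lsr,
        "dlsr_seconds": dlsr / 65536.0,
    }


# Hypothetical block: ~10% recent loss, 5 packets lost overall, 20 ms jitter.
sample = struct.pack("!IB3sIIII", 0x00000001, 26, (5).to_bytes(3, "big"),
                     0x00012345, 160, 0xB7052000, 0x00054000)
report = parse_report_block(sample, clock_rate=8000)
print(report)
```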
Real Life Examples
- Monitoring Voice Call Quality
  - RTCP RRs reveal packet loss % and jitter.
  - VoIP monitoring tools (like Wireshark) use these stats.
- Video Call Resync
  - Receiver sends PLI when frames are corrupted.
  - Sender reacts by sending a new keyframe.
- Conference Systems
  - SRs synchronize multiple media streams (audio + video).
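The synchronization case boils down to simple arithmetic: each SR pairs an RTP timestamp with a wall-clock time, so any later RTP timestamp on that stream can be placed on the same wall clock. A minimal sketch, assuming the SR's NTP time has already been converted to seconds and using made-up timestamps and typical clock rates (8 kHz audio, 90 kHz video):

```python
def rtp_to_wallclock(rtp_ts, sr_rtp_ts, sr_ntp_seconds, clock_rate):
    """Map an RTP timestamp to wall-clock seconds using the latest SR."""
    # Signed 32-bit difference so RTP timestamp wraparound is handled.
    diff = (rtp_ts - sr_rtp_ts) & 0xFFFFFFFF
    if diff >= 0x80000000:
        diff -= 0x100000000
    return sr_ntp_seconds + diff / clock_rate


# Hypothetical audio and video frames, each with its own SR reference.
audio_time = rtp_to_wallclock(168_000, sr_rtp_ts=160_000,
                              sr_ntp_seconds=1000.0, clock_rate=8_000)
video_time = rtp_to_wallclock(990_000, sr_rtp_ts=900_000,
                              sr_ntp_seconds=1000.5, clock_rate=90_000)
print(video_time - audio_time)  # 0.5: play this video frame 0.5 s after the audio
```

A player would use that difference to delay one stream until both line up, which is what lip sync comes down to.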
Port Usage
Traditionally:
- RTP uses port N.
- RTCP uses port N+1.
But in practice:
- RTCP Mux (RFC 5761): RTP and RTCP share the same port to save resources (common in WebRTC).
- Bundle (RFC 8843): Multiple media streams (audio, video, data) share a single 5-tuple (IP, port pair).
Why?
- NAT/firewall traversal is easier.
- Fewer ports to manage.
- Cleaner for ICE/STUN/TURN usage.
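When RTP and RTCP share one port, the receiver has to tell them apart by looking at the packet itself. A commonly used check, sketched below, looks at the second byte: RTCP packet types fall in the range 192-223, which is why RFC 5761 asks that RTP payload types 64-95 be avoided on muxed sessions. The function name is illustrative, and real stacks also have to separate STUN and DTLS traffic on the same socket.

```python
def classify_muxed_packet(data):
    """Guess whether a packet on a muxed port is RTP or RTCP."""
    # Both protocols start with version 2 in the top two bits.
    if len(data) < 2 or (data[0] >> 6) != 2:
        return "unknown"
    # In RTCP the second byte is the packet type (200 = SR, 201 = RR, ...).
    # In RTP it is the marker bit plus a 7-bit payload type.
    return "rtcp" if 192 <= data[1] <= 223 else "rtp"


print(classify_muxed_packet(bytes.fromhex("81c80006")))  # rtcp
print(classify_muxed_packet(bytes.fromhex("80600001")))  # rtp
```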
Quick RTCP Reference Table
PT | Message | Purpose |
---|---|---|
200 | Sender Report | Stats from sender + sync |
201 | Receiver Report | Feedback on quality from receiver |
202 | Source Description | Metadata (CNAME, user, tool) |
203 | BYE | End participation |
204 | APP | Application specific |
207 | Extended Reports (XR) | Detailed metrics |
205 | Transport-layer feedback (NACK) | Retransmission request |
206 | Payload-specific feedback (PLI) | Request video keyframe |
206 | Payload-specific feedback (FIR) | Request full intra-frame |
Wrap Up
- RTP carries the voice/video. RTCP monitors it.
- With reports, feedback, and metadata, RTCP keeps the real-time flow honest.
- In modern deployments, RTCP-mux and Bundle simplify port usage.
RTCP is the quiet referee in the SIP Games: not in the spotlight, but making sure everything runs fair.
Thanks for playing another round of SIP Games!
Follow @sip_games to unlock the next level.