Maria Artamonova for Red5

Posted on Mar 10 • Originally published at red5.net

SRT vs MOQT: Low-Latency Video Transport Comparison


Questions about SRT vs MOQT come up often when engineers evaluate low-latency video transport options. As a Lead Real-Time Video Architect working on MOQ development at Red5, I ran into this question during architecture discussions and realized others will likely face the same comparison soon. This article is written from my perspective and has been reviewed and verified by my teammate Paul Gregoire, Red5 Solutions Architect.

If you want a quick summary, read the Key Takeaways section. If you want a deeper technical comparison, continue through the rest of the blog.

Key Takeaways

SRT is a well-established, reliable protocol for video contribution, offering strict end-to-end latency controls. However, its payload-agnostic approach to packet dropping can introduce playback instability under severe congestion, and its single-stream architecture can still create Head-of-Line blocking conditions, limiting its applicability to modern SVC workflows.

MOQT, paired with flexible streaming formats whose media components map to independent QUIC streams, provides equivalent latency controls while enabling granular, payload-aware data handling and discard strategies. By using parallel streams, isolated packet loss recovery, and priority-based delivery, it can safely drop late data and natively support SVC adaptation. The protocol’s architecture is well suited to resilient, low-latency, bandwidth-efficient media distribution.

Baseline for Comparison

When comparing Secure Reliable Transport (SRT) and Media over QUIC Transport (MOQT), it is important to establish equivalent architectural layers. Technically, SRT is a payload-agnostic transport protocol. However, in standard broadcast and streaming workflows, it is predominantly used to carry multiplexed MPEG-TS (Transport Stream) payloads.

MOQT is an end-to-end media transport protocol designed to operate in conjunction with a wide range of application-layer streaming formats (current drafts define the MOQT Streaming Format (MSF) and CMAF MSF (CMSF)). The streaming format and its related container structure provide the necessary media metadata, including timestamps. Therefore, a functional comparison for video delivery is best framed as SRT + MPEG-TS versus MOQT + CMSF.

Note: SRT and MOQT occupy different use-case spaces, with overlap primarily on the contribution side. SRT is generally not considered for distribution at scale to end consumers.

Packet Scheduling and Multiplexing

Under network congestion, transport protocols must manage how data is queued and transmitted through restricted bandwidth.

SRT Scheduling: SRT processes packets in a primarily First-In, First-Out (FIFO) sequence over a single UDP connection. Because it does not natively inspect the payload, it cannot differentiate between critical media (like base video layers or audio) and supplemental media (like video enhancement layers) without custom application-layer multiplexing.
MOQT Scheduling: MOQT incorporates a prioritization model utilizing various priority parameters per stream. This allows the sender and intermediate relays to identify the relative importance of different media components. Under bandwidth constraints, an MOQT relay can use this logic to selectively delay or drop lower-priority streams to ensure the timely delivery of higher-priority streams.
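
To make the scheduling difference concrete, here is a minimal Python sketch of a sender with a limited byte budget for one scheduling interval. It is a toy model, not code from any SRT or MOQT implementation: the QueuedObject type, the track names, the sizes, and the convention that a lower priority value means more important are all assumptions made for illustration.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class QueuedObject:
    track: str     # hypothetical label for the media component
    priority: int  # assumption for this sketch: lower value = more important
    size: int      # bytes

def fifo_schedule(queue, budget):
    """Send in arrival order; stop at the first object that does not fit."""
    sent = []
    for obj in queue:
        if obj.size > budget:
            break  # FIFO: everything queued behind this object waits too
        sent.append(obj)
        budget -= obj.size
    return sent

def priority_schedule(queue, budget):
    """Send the most important objects first and skip what does not fit."""
    sent = []
    # sorted() is stable, so arrival order is kept within one priority level
    for obj in sorted(queue, key=lambda o: o.priority):
        if obj.size <= budget:
            sent.append(obj)
            budget -= obj.size
    return sent

queue = [
    QueuedObject("video-enhancement", priority=2, size=900),
    QueuedObject("video-base", priority=1, size=400),
    QueuedObject("audio", priority=0, size=100),
]

fifo_tracks = [o.track for o in fifo_schedule(queue, budget=600)]
prio_tracks = [o.track for o in priority_schedule(queue, budget=600)]
print(fifo_tracks)  # []
print(prio_tracks)  # ['audio', 'video-base']
```

With a 600-byte budget, the FIFO queue is stuck behind the large enhancement object, while the priority-aware queue still delivers audio and the base layer and leaves only the enhancement layer behind.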

Packet Loss Recovery and Head-of-Line Blocking

Both SRT and MOQT (via QUIC) use Automatic Repeat reQuest (ARQ) to recover lost packets. When a packet is dropped by the network, the receiver asks the sender to retransmit it. The operational difference lies in how this recovery impacts the rest of the data in transit.

SRT (Head-of-Line Blocking): SRT transmits its payload sequentially over a single connection. If a packet is lost, the receiver must hold all subsequent packets in a buffer until the lost packet is retransmitted and successfully arrives. This creates Head-of-Line (HoL) blocking. Because all media (audio, video base layer, video enhancement layer) shares this single pipeline, a single lost network packet stalls delivery of the entire transport stream, increasing latency across the board. The stall is bounded: if the retransmission delay exceeds the configured latency buffer, SRT drops the packet entirely and the stream continues rather than waiting indefinitely.
MOQT (Independent Stream Recovery): MOQT relies on QUIC’s multiplexed stream architecture. Because different media components (e.g., audio and video) are mapped to independent QUIC streams, packet loss is isolated. If a packet containing video data is lost, only the specific video stream waits for a retransmission. The audio stream, operating on a parallel QUIC stream, continues to deliver data to the application without interruption. This prevents a single network packet loss from stalling the entire media presentation.
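
The two recovery behaviors can be illustrated with a small Python simulation. It captures only the moment after a loss and before the retransmission arrives, and it ignores SRT's latency window; the packet list and function names are hypothetical and not taken from either protocol's API.

```python
def deliver_single_ordered(packets, lost_seq):
    """One ordered pipe: nothing behind the lost sequence number is released."""
    delivered = []
    for seq, track in packets:
        if seq == lost_seq:
            break  # the receiver buffers everything behind the gap
        delivered.append((seq, track))
    return delivered

def deliver_per_stream(packets, lost_seq):
    """Independent ordered streams: only the stream with the gap stalls."""
    stalled = set()
    delivered = []
    for seq, track in packets:
        if seq == lost_seq:
            stalled.add(track)  # this stream now waits for a retransmission
        if track not in stalled:
            delivered.append((seq, track))
    return delivered

# Interleaved audio and video packets; the packet with seq 1 (video) is lost
packets = [(0, "audio"), (1, "video"), (2, "audio"), (3, "video"), (4, "audio")]

single = deliver_single_ordered(packets, lost_seq=1)
multi = deliver_per_stream(packets, lost_seq=1)
print(single)  # [(0, 'audio')]
print(multi)   # [(0, 'audio'), (2, 'audio'), (4, 'audio')]
```

In the single-pipe case, audio packets 2 and 4 have arrived but cannot be handed to the application. In the per-stream case, they are delivered while only the video stream waits.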

Handling Late Data under Congestion

When network delays cause data to arrive past its intended playback deadline, the two protocols use different mechanisms to discard that data.

SRT (Network-Level Dropping): SRT utilizes a configured latency buffer. If a packet cannot be delivered within this timeframe, SRT’s Too-Late Packet Drop (TLPKTDROP) mechanism discards it. Because SRT drops data at the network packet level without payload awareness, this can result in the delivery of partial media frames. In an MPEG-TS workflow, this fragmentation can lead to decoder errors or visual artifacts, potentially persisting until the next keyframe.
MOQT (Application-Aware Dropping): MOQT relies on a feedback loop between the application’s Streaming Format and the QUIC transport layer. The application layer evaluates the Presentation Timestamp (PTS); if a frame has passed its playback deadline, the receiver instructs the transport layer to send a QUIC STOP_SENDING frame for that stream. The sender then abandons the complete, semantic media unit with a RESET_STREAM. This preserves the structural integrity of the remaining video streams and avoids corrupting the decoder.
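
The effect of the two discard granularities can be sketched in Python. This is a simplified model under stated assumptions: every frame is split into three packets, a single latency budget applies to all of them, and the function names are invented for illustration. It does not model the real TLPKTDROP logic or the STOP_SENDING and RESET_STREAM exchange.

```python
from collections import defaultdict

PACKETS_PER_FRAME = 3    # assumption: every frame is split into 3 packets
LATENCY_BUDGET_MS = 100  # hypothetical latency budget

# (frame_id, packet_index, delivery_delay_ms); one packet of frame 1 is late
arrivals = [
    (0, 0, 20), (0, 1, 22), (0, 2, 25),
    (1, 0, 30), (1, 1, 140), (1, 2, 35),
    (2, 0, 28), (2, 1, 31), (2, 2, 33),
]

def packet_level_drop(arrivals, budget_ms):
    """Discard only the late packets; count the packets kept per frame."""
    kept = defaultdict(int)
    for frame_id, _index, delay in arrivals:
        if delay <= budget_ms:
            kept[frame_id] += 1
    return dict(kept)

def unit_level_drop(arrivals, budget_ms):
    """Discard the whole frame as soon as any of its packets is late."""
    late_frames = {f for f, _i, d in arrivals if d > budget_ms}
    kept = defaultdict(int)
    for frame_id, _index, _delay in arrivals:
        if frame_id not in late_frames:
            kept[frame_id] += 1
    return dict(kept)

def partial_frames(kept):
    """Frames that reach the decoder with some of their packets missing."""
    return sorted(f for f, n in kept.items() if n < PACKETS_PER_FRAME)

by_packet = packet_level_drop(arrivals, LATENCY_BUDGET_MS)
by_unit = unit_level_drop(arrivals, LATENCY_BUDGET_MS)
print(by_packet, partial_frames(by_packet))  # {0: 3, 1: 2, 2: 3} [1]
print(by_unit, partial_frames(by_unit))      # {0: 3, 2: 3} []
```

Packet-level dropping hands the decoder a fragment of frame 1, while unit-level dropping removes frame 1 entirely and leaves only complete frames.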

Support for Video Adaptation (ABR and SVC)

Modern video delivery relies on Adaptive Bitrate (ABR) and Scalable Video Coding (SVC) to adjust to changing network conditions.

SRT and SVC: Because SRT typically carries a single, multiplexed MPEG-TS stream, all SVC layers (base resolution and enhancement details) share the same transport queue. If the network drops a base layer packet while successfully delivering an enhancement layer packet, the enhancement data cannot be decoded, limiting the practical effectiveness of SVC over a standard SRT link.
MOQT and SVC/ABR Integration: MOQT maps SVC layers to independent QUIC streams (Subgroups), facilitating two types of adaptation:
1. Layer Dropping (SVC): During transient network drops, MOQT relays autonomously discard low-priority enhancement Subgroups. The player experiences a temporary reduction in quality while maintaining uninterrupted playback.
2. Track Switching (ABR): For sustained changes in network capacity, the client can issue a SUBSCRIBE message for a lower-bitrate MOQT Track. MOQT processes these switches at defined Group boundaries (keyframes), providing clean transitions between quality tiers.
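
Both adaptation paths can be sketched as simple decision functions in Python. Everything here is hypothetical: the track names, bitrates, the 0.8 headroom factor, and the function names are illustrative assumptions, and the sketch leaves out the MOQT messages and Group-boundary handling that a real client or relay would need.

```python
# Hypothetical ABR ladder: track name -> bitrate in kbps
TRACKS = {"1080p": 6000, "720p": 3000, "360p": 800}

def layers_to_forward(layers, available_kbps):
    """Relay side (SVC): always forward the base layer, then as many
    enhancement layers as fit. `layers` is ordered base layer first."""
    forwarded, used = [], 0
    for name, kbps in layers:
        if forwarded and used + kbps > available_kbps:
            break  # drop this enhancement layer and everything above it
        forwarded.append(name)
        used += kbps
    return forwarded

def pick_track(estimated_kbps, tracks=TRACKS, headroom=0.8):
    """Client side (ABR): highest-bitrate track within a safety margin,
    falling back to the lowest-bitrate track if nothing fits."""
    usable = estimated_kbps * headroom
    fitting = {name: kbps for name, kbps in tracks.items() if kbps <= usable}
    if not fitting:
        return min(tracks, key=tracks.get)
    return max(fitting, key=fitting.get)

svc_layers = [("base", 1000), ("enh1", 1500), ("enh2", 3000)]
print(layers_to_forward(svc_layers, available_kbps=3000))  # ['base', 'enh1']
print(layers_to_forward(svc_layers, available_kbps=500))   # ['base']
print(pick_track(5000))  # 720p
print(pick_track(500))   # 360p
```

The first function reacts per interval and needs no signaling from the player, which matches the transient case. The second stands in for the decision that would trigger a new subscription in the sustained case.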

Integration with Transcoding Workflows

The integration of these protocols into transcoding pipelines involves a tradeoff between current ecosystem support and architectural efficiency.

SRT Ecosystem Maturity: SRT combined with MPEG-TS is a mature, widely adopted standard. It possesses extensive support across legacy hardware encoders, software transcoders (e.g., FFmpeg), and existing cloud broadcast infrastructure.
MOQT Processing Efficiency: SRT’s monolithic payload requires transcoders to ingest, demultiplex, and decode the entire Transport Stream before processing. By contrast, MOQT’s architecture separates media into independent Tracks and Subgroups. This allows a modern transcoder to selectively ingest only the required streams (e.g., processing a 1080p base layer while actively ignoring higher-resolution streams), offering a more compute-efficient pipeline as the software ecosystem matures.

Conclusion

In summary, the SRT vs MOQT comparison highlights the difference between a mature contribution protocol built around a single transport stream and a newer architecture designed for multiplexed, media-aware delivery. SRT remains widely used and reliable, while MOQT introduces transport-level capabilities that align better with modern scalable video workflows and adaptive streaming models.

If you want to explore how MOQ compares with other real-time delivery approaches, read our related blog on MOQ vs WebRTC.
