Introduction
MPEG-DASH is one of the most popular protocols for streaming media content over the internet. Instead of downloading one large file, a client fetches the content in small segments, which keeps bandwidth use in check and reduces interruptions during playback. YouTube and Netflix, two major video streaming platforms, use MPEG-DASH for video streaming. An MPEG-DASH client adapts to the user's connection bandwidth by switching from one bitrate to another. In this post, the reader will learn about MPEG-DASH, how to encode media files for MPEG-DASH, and how to serve and play the resulting stream.
Contents
- MPEG-DASH
- Dynamic Adaptive Streaming
- Media Presentation Description
- How to encode video files using MPEG-DASH Protocol?
- How to Serve and Consume DASH Stream
- Conclusion
MPEG-DASH
In a bid to have a unified protocol for streaming media content over the internet, the Moving Picture Experts Group (MPEG) issued a call for proposals in 2009. In 2012, the group published a standard specification called Dynamic Adaptive Streaming over HTTP (MPEG-DASH).
MPEG-DASH is a streaming protocol. Its advantages over other streaming protocols are that it is codec agnostic and supports both multiplexed and unmultiplexed encoded content. MPEG-DASH also supports multiple Digital Rights Management (DRM) schemes and encryption of media content.
According to the 2012 MPEG-DASH specification (1), MPEG-DASH has the following features.
- Switching and selectable streams: the MPD provides adequate information to the client for selecting and switching between streams, e.g. selecting one audio stream from different languages, selecting video between different camera angles, selecting the subtitles from provided languages, switching between different bitrates of the same video camera dynamically.
- Ad-insertion: Permits advertisement placement as a period between periods or segments between segments in both on-demand and live cases.
- Compact manifest: the segments' address URLs can be signaled using a template scheme that produces a compact MPD.
- Fragmented manifest: the MPD can be divided into multiple parts, or some of its elements can be externally referenced, enabling the MPD to be downloaded in multiple steps.
- Segments with variable durations: the duration of segments can vary. With live streaming, the duration of the next segment can also be signaled with the delivery of the current segment.
- Multiple base URLs: the same content can be available at multiple URLs, i.e. at different servers or CDNs, and the client can stream from any of them to maximize the available network bandwidth.
- Clock drift control for live sessions: the UTC time can be included with each segment to enable the client to control its clock drift.
- Scalable Video Coding (SVC) and Multiview Video Coding (MVC) support: the MPD provides adequate information regarding the decoding dependencies between representations and is used for streaming any multi-layer coded streams such as SVC and MVC.
- A flexible set of descriptors: for describing content rating, components roles, accessibility features, camera views, frame-packing, and audio channel configuration.
- Subsetting of adaptation sets into groups according to the content author's guidance.
- Quality metrics for reporting the session experience: A set of well-defined quality metrics for the client to measure and report back to a reporting server.
MPEG-DASH streams multimedia content by dividing the content into segments. To reduce buffering, an MPEG-DASH client adjusts the video quality of the stream according to the user's internet connection speed. MPEG-DASH uses a Media Presentation Description (MPD) to achieve dynamic adaptive streaming.
Dynamic Adaptive Streaming
Dynamic adaptive streaming requires the media content to be available on the server at multiple bitrates. For example, suppose we have a video file named match.mp4. Typically, this file will contain video, audio, and text tracks such as subtitles. To use the DASH protocol, we must have encodings of this file at multiple bitrates on the server. Let's say we encoded match.mp4 at the following bitrates: 100kb/s, 64kb/s, and 24kb/s. After encoding, each alternative bitrate is split into chunks called segments, each holding about 10s of multimedia content. Each segment contains approximately 100 Access Units. Also, an I-frame-only bitstream with a low frame rate is available for streaming during trick play mode.
A client device starts streaming the first segment of match.mp4 at the highest bitrate of 100kb/s while monitoring the network bandwidth. On completion of the first segment, if the device finds that the bandwidth is lower than 100kb/s, it switches to a lower bitrate. If the bandwidth increases, the device switches back to the highest bitrate. When the user pauses the stream and rewinds, the device streams the video in trick mode, playing the video in reverse order with muted audio until the user presses play at the desired point. The device then resumes streaming at the highest bitrate while monitoring the network bandwidth.
Dynamic adaptive streaming also supports 3D video, streams with subtitles and captions, ad insertion, and multiple camera views. The client uses the MPD file to switch from one bitrate to another.
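The switching rule described above can be sketched in a few lines of Python. This is a simplified illustration, not the algorithm of any particular player: real clients typically also smooth their bandwidth estimates and take buffer levels into account. The function name `pick_bitrate` and the bandwidth measurements are assumptions made for this example; only the three bitrates come from the match.mp4 scenario.

```python
from typing import Sequence


def pick_bitrate(available_kbps: Sequence[int], measured_kbps: float) -> int:
    """Return the highest available bitrate that fits the measured bandwidth.

    Falls back to the lowest bitrate when even that one does not fit.
    """
    if not available_kbps:
        raise ValueError("at least one bitrate is required")
    candidates = [b for b in available_kbps if b <= measured_kbps]
    return max(candidates) if candidates else min(available_kbps)


# The three encodings of match.mp4 from the example, in kb/s.
BITRATES = [100, 64, 24]

# Hypothetical bandwidth measurements (kb/s), one taken after each segment.
measurements = [150, 80, 30, 10, 120]
choices = [pick_bitrate(BITRATES, m) for m in measurements]
print(choices)  # [100, 64, 24, 24, 100]
```

The decision is made once per segment, which is why segment boundaries are the points at which the stream can change quality.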
Media Presentation Description
MPD is an XML document that describes the characteristics of the multimedia content. To stream multimedia content, a DASH client first downloads and parses the MPD file. An MPD file consists of one or more periods. A period is an interval of the multimedia content along the time axis. A period consists of one or more adaptation sets. An adaptation set contains information about one or more multimedia components and their alternative bitrates.
An adaptation set consists of representations: each representation describes one encoded bitrate alternative of the same multimedia component. For example, rep-24.mp4, rep-64.mp4, and rep-100.mp4 would be alternative encoded bitrate representations of the same multimedia content. A sample MPD file is shown below.
<?xml version="1.0" encoding="utf-8"?>
<MPD xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
xmlns="urn:mpeg:dash:schema:mpd:2011"
xmlns:xlink="http://www.w3.org/1999/xlink"
xsi:schemaLocation="urn:mpeg:DASH:schema:MPD:2011 http://standards.iso.org/ittf/PubliclyAvailableStandards/MPEG-DASH_schema_files/DASH-MPD.xsd"
profiles="urn:mpeg:dash:profile:isoff-live:2011"
type="static"
mediaPresentationDuration="PT8M59.9S"
maxSegmentDuration="PT5.0S"
minBufferTime="PT16.6S">
<ProgramInformation>
</ProgramInformation>
<ServiceDescription id="0">
</ServiceDescription>
<Period id="0" start="PT0.0S">
<AdaptationSet id="0" contentType="video" startWithSAP="1" segmentAlignment="true" bitstreamSwitching="true" frameRate="60000/1001" maxWidth="1280" maxHeight="720" par="16:9" lang="eng">
<Representation id="0" mimeType="video/mp4" codecs="avc1.640020" bandwidth="800000" width="1280" height="720" sar="1:1">
<SegmentTemplate timescale="60000" initialization="init-stream$RepresentationID$.m4s" media="chunk-stream$RepresentationID$-$Number%05d$.m4s" startNumber="61">
<SegmentTimeline>
<S t="30013984" d="500500" r="3" />
<S d="383383" />
</SegmentTimeline>
</SegmentTemplate>
</Representation>
<Representation id="2" mimeType="video/mp4" codecs="avc1.640020" bandwidth="300000" width="1280" height="720" sar="1:1">
<SegmentTemplate timescale="60000" initialization="init-stream$RepresentationID$.m4s" media="chunk-stream$RepresentationID$-$Number%05d$.m4s" startNumber="61">
<SegmentTimeline>
<S t="30013984" d="500500" r="3" />
<S d="383383" />
</SegmentTimeline>
</SegmentTemplate>
</Representation>
</AdaptationSet>
<AdaptationSet id="1" contentType="audio" startWithSAP="1" segmentAlignment="true" bitstreamSwitching="true" lang="eng">
<Representation id="1" mimeType="audio/mp4" codecs="mp4a.40.2" bandwidth="128000" audioSamplingRate="48000">
<AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2" />
<SegmentTemplate timescale="48000" initialization="init-stream$RepresentationID$.m4s" media="chunk-stream$RepresentationID$-$Number%05d$.m4s" startNumber="104">
<SegmentTimeline>
<S t="24784896" d="240640" r="3" />
<S d="171509" />
</SegmentTimeline>
</SegmentTemplate>
</Representation>
<Representation id="3" mimeType="audio/mp4" codecs="mp4a.40.2" bandwidth="128000" audioSamplingRate="48000">
<AudioChannelConfiguration schemeIdUri="urn:mpeg:dash:23003:3:audio_channel_configuration:2011" value="2" />
<SegmentTemplate timescale="48000" initialization="init-stream$RepresentationID$.m4s" media="chunk-stream$RepresentationID$-$Number%05d$.m4s" startNumber="104">
<SegmentTimeline>
<S t="24784896" d="240640" r="3" />
<S d="171509" />
</SegmentTimeline>
</SegmentTemplate>
</Representation>
</AdaptationSet>
</Period>
</MPD>
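The Period, AdaptationSet, and Representation hierarchy of the manifest above can be walked with Python's standard XML parser. The sketch below embeds a trimmed copy of that manifest (segment details and most attributes are left out to keep it short), and the helper name `list_representations` is an assumption made for this example.

```python
import xml.etree.ElementTree as ET

# Trimmed copy of the sample manifest: only the hierarchy and a few attributes.
MPD_XML = """
<MPD xmlns="urn:mpeg:dash:schema:mpd:2011" type="static">
  <Period id="0" start="PT0.0S">
    <AdaptationSet id="0" contentType="video">
      <Representation id="0" codecs="avc1.640020" bandwidth="800000"/>
      <Representation id="2" codecs="avc1.640020" bandwidth="300000"/>
    </AdaptationSet>
    <AdaptationSet id="1" contentType="audio">
      <Representation id="1" codecs="mp4a.40.2" bandwidth="128000"/>
      <Representation id="3" codecs="mp4a.40.2" bandwidth="128000"/>
    </AdaptationSet>
  </Period>
</MPD>
"""

# MPD elements live in this XML namespace, so lookups must be namespace-qualified.
NS = {"mpd": "urn:mpeg:dash:schema:mpd:2011"}


def list_representations(mpd_text):
    """Return (period id, content type, representation id, bandwidth) tuples."""
    root = ET.fromstring(mpd_text)
    rows = []
    for period in root.findall("mpd:Period", NS):
        for adaptation_set in period.findall("mpd:AdaptationSet", NS):
            for rep in adaptation_set.findall("mpd:Representation", NS):
                rows.append((
                    period.get("id"),
                    adaptation_set.get("contentType"),
                    rep.get("id"),
                    int(rep.get("bandwidth")),
                ))
    return rows


for row in list_representations(MPD_XML):
    print(row)
```

A client would use the `bandwidth` values collected this way as the list of bitrates it can switch between within one adaptation set.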
A representation contains information about one or more segments, the actual chunks of a media file. The segment information gives the URI of each media chunk on the server and its start time. A device uses the URI to download the media segment with an HTTP GET request.
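To make the link between the manifest and the downloaded files concrete, here is a rough sketch of how a client could turn the SegmentTemplate and SegmentTimeline of the first video representation into segment file names and timings. The values are copied from the sample MPD; the helper names `expand_template` and `expand_timeline` are assumptions for this example, and only the `$RepresentationID$` and `$Number$` identifiers used in that manifest are handled.

```python
import re


def expand_template(template, rep_id, number):
    """Fill $RepresentationID$ and $Number$ (optionally $Number%0Nd$) in a template."""
    url = template.replace("$RepresentationID$", str(rep_id))

    def fill_number(match):
        fmt = match.group(1) or "%d"  # e.g. "%05d" pads the number to five digits
        return fmt % number

    return re.sub(r"\$Number(%0\d+d)?\$", fill_number, url)


def expand_timeline(entries, timescale, start_number):
    """Turn (t, d, r) timeline entries into (number, start_s, duration_s) tuples.

    t is the start time (None means "continue after the previous segment"),
    d is the duration, and r is the number of extra repeats. Negative r values,
    which the specification also allows, are not handled in this sketch.
    """
    segments = []
    current = 0
    number = start_number
    for t, d, r in entries:
        if t is not None:
            current = t
        for _ in range(r + 1):
            segments.append((number, current / timescale, d / timescale))
            current += d
            number += 1
    return segments


# Values from Representation id="0" in the sample MPD.
MEDIA = "chunk-stream$RepresentationID$-$Number%05d$.m4s"
TIMELINE = [(30013984, 500500, 3), (None, 383383, 0)]

segments = expand_timeline(TIMELINE, timescale=60000, start_number=61)
for number, start, duration in segments:
    print(expand_template(MEDIA, "0", number),
          f"start={start:.3f}s", f"duration={duration:.3f}s")
```

Each resulting name is resolved relative to the manifest's location and fetched with a separate HTTP GET request.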
How to encode video files using MPEG-DASH Protocol?
To encode our video file, we will use FFmpeg, a tool for encoding multimedia content. Other encoders such as HandBrake can also be used for the encoding step.
Download and install FFmpeg
To install FFmpeg on Windows, download the latest build here
https://www.gyan.dev/ffmpeg/builds
Extract the zip file, rename the extracted folder to ffmpeg, and copy it to C:\ so that the executables are in C:\ffmpeg\bin. Open the Command Prompt as an administrator, then type the following command and press Enter.
setx /m PATH "C:\ffmpeg\bin;%PATH%"
Restart the system. To confirm the installation, run ffmpeg -h. The command prints the FFmpeg help text, which lists the available options.
After installation, create a folder and give it a name of your choice. In this case, we named the folder videodash. Copy your video into the videodash folder. Section 4.9 of the FFmpeg documentation specifies how to use FFmpeg to encode video and create MPD files. Open the command line, cd into the videodash folder, then copy and run the command below.
ffmpeg -re -i culture.MOV -map 0 -map 0 -c:a aac -c:v libx264 -b:v:0 800k -b:v:1 300k -s:v:1 320x170 -profile:v:1 baseline -profile:v:0 main -bf 1 -keyint_min 120 -g 120 -sc_threshold 0 -b_strategy 0 -ar:a:1 22050 -use_timeline 1 -use_template 1 -window_size 5 -adaptation_sets "id=0,streams=v id=1,streams=a" -f dash output.mpd
Code Explanation
An ffmpeg command is executed with a set of options and parameters. The command above produces a DASH stream.
- -re reads the input at its native frame rate
- -i specifies the input file
- -map 0 selects all streams from the first input
- -map 0 (given a second time) selects the same streams again, so every input stream appears twice in the output
- -c:a aac sets the audio codec to aac
- -c:v libx264 sets the video codec to libx264
- -b:v:0 sets the bitrate of the first video stream to 800k
- -b:v:1 sets the bitrate of the second video stream to 300k
- -s:v:1 sets the frame size of the second video stream to 320x170
- -profile:v:1 uses baseline as the profile of the second video stream
- -profile:v:0 uses main as the profile of the first video stream
- -bf 1 allows at most 1 B-frame between non-B-frames
- -keyint_min 120 sets the minimum interval between keyframes to 120 frames
- -g 120 sets the GOP (group of pictures) size to 120 frames
- -sc_threshold 0 disables keyframe insertion on scene changes, so keyframes fall at fixed intervals
- -b_strategy 0 turns off adaptive B-frame placement
- -ar:a:1 sets the sample rate of the second audio stream to 22050 Hz
- -use_timeline enables the use of SegmentTimeline in the SegmentTemplate
- -use_template enables the use of SegmentTemplate instead of a list of segments
- -window_size sets the maximum number of segments kept in the manifest to 5
- -adaptation_sets assigns streams to adaptation sets: the video streams to id 0 and the audio streams to id 1
- -f specifies the output format, in this case dash
The output file is output.mpd. You can give the output file any name of your choice. For a better understanding of the ffmpeg options, read the FFmpeg documentation. After the command finishes, the videodash folder will contain a series of chunk files with the .m4s extension and an MPD file.
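If you prefer to launch the encode from a script, the same command can be assembled as an argument list. The sketch below only builds and prints the command; the function name `build_dash_command` is an assumption for this example, while every flag and value is copied from the ffmpeg command used in this post.

```python
import shlex


def build_dash_command(source, output="output.mpd"):
    """Assemble the ffmpeg arguments from this post as a list (nothing is executed)."""
    return [
        "ffmpeg", "-re", "-i", source,
        "-map", "0", "-map", "0",
        "-c:a", "aac", "-c:v", "libx264",
        "-b:v:0", "800k", "-b:v:1", "300k",
        "-s:v:1", "320x170",
        "-profile:v:1", "baseline", "-profile:v:0", "main",
        "-bf", "1", "-keyint_min", "120", "-g", "120",
        "-sc_threshold", "0", "-b_strategy", "0",
        "-ar:a:1", "22050",
        "-use_timeline", "1", "-use_template", "1", "-window_size", "5",
        # Kept as ONE argument: the space separates the two adaptation sets.
        "-adaptation_sets", "id=0,streams=v id=1,streams=a",
        "-f", "dash", output,
    ]


command = build_dash_command("culture.MOV")
print(shlex.join(command))
```

To actually run the encode, the list can be passed to `subprocess.run(command, check=True)` from inside the videodash folder. Passing a list rather than a single string avoids shell quoting problems with the `-adaptation_sets` value.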
How to Serve and Consume DASH Stream
A video streamed using MPEG-DASH can only be played by a DASH client. dash.js is a JavaScript implementation of a DASH client, which you can download or load from a CDN.
Copy the videodash folder to your localhost or remote server.
In your server folder, create an HTML file named video.html, then copy the code below into it.
<!doctype html>
<html>
<head>
<title>Dash.js Rocks</title>
<style>
video {
width: 640px;
height: 360px;
}
</style>
</head>
<body>
<div>
<video id="videoPlayer" controls></video>
</div>
<script src="yourPathToDash/dash.all.min.js"></script>
<script>
(function(){
// Sample manifest; replace with the URL of your own manifest, e.g. "videodash/output.mpd".
var url = "https://dash.akamaized.net/envivio/EnvivioDash3/manifest.mpd";
var player = dashjs.MediaPlayer().create();
player.initialize(document.querySelector("#videoPlayer"), url, true);
})();
</script>
</body>
</html>
Open your browser and enter the URL of video.html on your server.
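If you do not have a web server at hand, a small static file server is enough for local testing. The following is a minimal sketch using Python's standard library, not a production setup. The names `DashRequestHandler` and `make_server` and the port number are assumptions for this example; the two MIME types are the registered media types for DASH manifests and segments.

```python
import functools
from http.server import SimpleHTTPRequestHandler, ThreadingHTTPServer


class DashRequestHandler(SimpleHTTPRequestHandler):
    """Static file handler that knows the DASH manifest and segment MIME types."""

    extensions_map = {
        **SimpleHTTPRequestHandler.extensions_map,
        ".mpd": "application/dash+xml",
        ".m4s": "video/iso.segment",
    }


def make_server(directory=".", port=8000):
    """Create (but do not start) a server for the given folder on localhost."""
    handler = functools.partial(DashRequestHandler, directory=directory)
    return ThreadingHTTPServer(("127.0.0.1", port), handler)
```

Calling `make_server("path/to/server/folder").serve_forever()` starts the server, after which video.html is reachable at http://127.0.0.1:8000/video.html. The folder path here is a placeholder for wherever you copied video.html and the videodash folder.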
Conclusion
Previously, there was no interoperability between commercial multimedia streaming platforms. Each commercial platform had its own manifest, content format, and protocol. Nowadays, media content on the internet is available to a wide range of devices such as mobile phones, PCs, laptops, consoles, and TVs using the same manifest, protocol, and content format. This is due to the advent of protocols like MPEG-DASH. In this post, we examined the fundamentals of MPEG-DASH, how to encode media files for MPEG-DASH, and how to serve MPEG-DASH encoded media files.
References
- ISO/IEC 23009-1:2012(E). Information technology — Dynamic adaptive streaming over HTTP (DASH) — Part 1: Media presentation description and segment formats