November 03, 2014 by Constance Gangneux

How Bitmovin leverages MP4Box to package their content for their bitdash player

This is an invited article from Bitmovin. This article belongs to the "GPAC industry use-cases" category, which shows how industry actors use GPAC in their projects. Please click here to read the original version of this article.

MPEG-DASH Generation with x264 and MP4Box

As many people have asked how to create MPEG-DASH content, e.g. to test it with the bitdash MPEG-DASH player [1], we will answer this question in this post.

The Situation

A video is given in some container format, with a certain codec, probably including one or more audio tracks. Let’s call this file “inputvideo.mkv”. This video should be prepared for MPEG-DASH playout.

H.264/AVC for video will be used within segmented mp4 containers.

Tools

Two tools will be used: x264 [2] to prepare the video content, and MP4Box [3] to segment the file and create a Media Presentation Description (MPD).

Preparation

If the source video is already in the correct format, this step can be skipped. However, this is unlikely to be the case.

The following command (re-) encodes the video in H.264/AVC with the properties we will need. All the command line parameters are explained after the code.

x264 --output intermediate_2400k.264 --fps 24 --preset slow --bitrate 2400 --vbv-maxrate 4800 --vbv-bufsize 9600 --min-keyint 96 --keyint 96 --no-scenecut --pass 1 --video-filter "resize:width=1280,height=720" inputvideo.mkv

--output intermediate_2400k.264	Specifies the output filename. File extension is .264 as it is a raw H.264/AVC stream.
--fps 24	Specifies the framerate which shall be used, here 24 frames per second.
--preset slow	Presets are an easy way to tell x264 whether it should favor encoding speed or compression/quality. Slow is a good default.
--bitrate 2400	The bitrate this representation should achieve in kbps.
--vbv-maxrate 4800	Rule of thumb: set this value to the double of --bitrate.
--vbv-bufsize 9600	Rule of thumb: set this value to the double of --vbv-maxrate.
--keyint 96	Sets the maximum interval between keyframes. This setting is important because we will later split the video into segments, and each segment should begin with a keyframe. Therefore, --keyint should match the desired segment length in seconds multiplied by the frame rate. Here: 4 seconds * 24 frames/second = 96 frames.
--min-keyint 96	Sets the minimum interval between keyframes. See --keyint for more information. We achieve a constant segment length by setting the minimum and maximum keyframe interval to the same value and by disabling scenecut detection with the --no-scenecut parameter.
--no-scenecut	Completely disables adaptive keyframe decision.
--pass 1	Only one encoding pass is used. For two-pass encoding, run x264 again with --pass 2 after the first pass; this can further improve quality, but takes longer.
--video-filter "resize:width=1280,height=720"	Is used to change the resolution. Can be omitted if the resolution should stay the same as in the source video.
inputvideo.mkv	The source video

Note that these are only example values. Depending on the use case you might need to use totally different options. For more details and options consult x264’s documentation [4].
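The arithmetic behind the keyframe interval and the two rules of thumb can be sketched as a small shell script. This is only an illustration under the example values used above; the variable names are our own and not part of x264. The script prints the resulting command instead of running it.

```shell
#!/bin/sh
# Example values from this post; adjust them to your use case.
FPS=24              # frames per second
SEGMENT_SECONDS=4   # desired segment length in seconds
BITRATE=2400        # target bitrate in kbps

# Keyframe interval in frames: segment length times frame rate.
KEYINT=$((SEGMENT_SECONDS * FPS))
# Rules of thumb from the option list above.
VBV_MAXRATE=$((BITRATE * 2))
VBV_BUFSIZE=$((VBV_MAXRATE * 2))

# Dry run: build and print the x264 command instead of executing it.
X264_CMD="x264 --output intermediate_${BITRATE}k.264 --fps $FPS --preset slow --bitrate $BITRATE --vbv-maxrate $VBV_MAXRATE --vbv-bufsize $VBV_BUFSIZE --min-keyint $KEYINT --keyint $KEYINT --no-scenecut --pass 1 --video-filter \"resize:width=1280,height=720\" inputvideo.mkv"
echo "$X264_CMD"
```

Changing SEGMENT_SECONDS or FPS keeps --keyint and --min-keyint in step with the segment length that is later passed to MP4Box.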

Segmenting

Now we add the previously created raw H.264 video to an mp4 container, as this is our container format of choice.

MP4Box -add intermediate_2400k.264 -fps 24 output_2400k.mp4

intermediate_2400k.264	The H.264/AVC raw video we want to put in an mp4 container.
-fps 24	Specifies the framerate. H.264 doesn’t provide meta information about the framerate so it’s recommended to specify it. The number (in this example 24 frames per second) must match the framerate used in the x264 command.
output_2400k.mp4	The output file name.

What follows is the step to actually create the segments and the corresponding MPD.

MP4Box -dash 4000 -frag 4000 -rap -segment-name segment_ output_2400k.mp4

-dash 4000	Segments the given file into 4000ms chunks.
-frag 4000	Creates subsegments within segments; the duration therefore must not be longer than the duration given to -dash. By setting it to the same value, there will be only one subsegment per segment. Please see [5] for more information on fragmentation, segmentation, splitting and interleaving.
-rap	Forces segments to start at random access points, i.e. keyframes. Segment duration may vary depending on where the keyframes are in the video - that’s why we (re-)encoded the video before with the appropriate settings!
-segment-name segment_	The name of the segments. An increasing number and the file extension are added automatically. So in this case, the segments will be named like this: segment_1.m4s, segment_2.m4s, ...
output_2400k.mp4	The video we have created just before which should be segmented.

For more details please refer to the MP4Box documentation [6].

The output is one video representation, in the form of segments. Additionally, there is one initialization segment, called output_2400k_dash.mp4. Finally, there is an MPD.
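Before uploading anything, it can help to confirm that all three kinds of output are present. The helper below is a hypothetical sketch, not part of MP4Box. It assumes the file names from this example (an .mpd file, segment_*.m4s and, unless a second argument says otherwise, output_2400k_dash.mp4 as initialization segment).

```shell
#!/bin/sh
# check_dash_output DIR [INIT_NAME]
# Reports whether DIR holds an MPD, the initialization segment and at
# least one media segment. File names are assumptions based on the
# example in this post.
check_dash_output() {
    dir=$1
    init=${2:-output_2400k_dash.mp4}
    status=0
    ls "$dir"/*.mpd >/dev/null 2>&1 || { echo "missing: MPD"; status=1; }
    [ -f "$dir/$init" ] || { echo "missing: initialization segment $init"; status=1; }
    ls "$dir"/segment_*.m4s >/dev/null 2>&1 || { echo "missing: media segments"; status=1; }
    return $status
}
```

Calling check_dash_output . in the output directory prints one line per missing item and returns a non-zero status if anything is absent.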

And that’s it.

What’s next?

Just put the segments, the initialization segment, and the MPD onto a web server. Then point the bitdash player’s config [7] to the MPD on the web server and enjoy your content.

In case of problems with the player, please refer to the F.A.Q. [8].

What about more representations?

The steps explained in this post can be repeated for each additional representation: just pass another bitrate to x264 and make sure previously created files are not overwritten.
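Repeating the steps for several bitrates could be scripted roughly as follows. This is a sketch under assumptions: the bitrate ladder is hypothetical, the resolution is kept the same for every representation for simplicity, and the bitrate is added to every file and segment name so that nothing gets overwritten. The commands are only printed; pipe the output to sh to actually run them.

```shell
#!/bin/sh
# Hypothetical bitrate ladder in kbps; adjust to the use case.
BITRATES="1200 2400 4800"

# Print the three commands for one representation. Every file name
# carries the bitrate, so no representation overwrites another.
print_commands() {
    b=$1
    echo "x264 --output intermediate_${b}k.264 --fps 24 --preset slow --bitrate $b --vbv-maxrate $((b * 2)) --vbv-bufsize $((b * 4)) --min-keyint 96 --keyint 96 --no-scenecut --pass 1 --video-filter \"resize:width=1280,height=720\" inputvideo.mkv"
    echo "MP4Box -add intermediate_${b}k.264 -fps 24 output_${b}k.mp4"
    echo "MP4Box -dash 4000 -frag 4000 -rap -segment-name segment_${b}k_ output_${b}k.mp4"
}

for b in $BITRATES; do
    print_commands "$b"
done
```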

For each representation another MPD will be created. As it is just an XML file, it is possible to open it with a text editor and copy & paste the representation into the video AdaptationSet of another MPD file until all needed representations are in the same MPD.
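The copy & paste step can also be approximated with awk. The function below is a minimal sketch and not an MP4Box feature. It assumes that every opening and closing Representation tag sits on its own line, that the element is not self-closing, and that the target MPD has a single video AdaptationSet. Representation ids must be unique within an MPD, so the id attributes may still need to be adjusted by hand afterwards.

```shell
#!/bin/sh
# merge_representation SOURCE_MPD TARGET_MPD
# Prints TARGET_MPD with every Representation element of SOURCE_MPD
# inserted before the first closing AdaptationSet tag.
merge_representation() {
    awk '
        # First file: collect the Representation blocks line by line.
        NR == FNR {
            if ($0 ~ /<Representation[ >]/) inside = 1
            if (inside) block = block $0 "\n"
            if ($0 ~ /<\/Representation>/) inside = 0
            next
        }
        # Second file: emit the collected blocks once, right before
        # the first closing AdaptationSet tag.
        /<\/AdaptationSet>/ && !merged { printf "%s", block; merged = 1 }
        { print }
    ' "$1" "$2"
}
```

A call such as merge_representation output_1200k_dash.mpd output_2400k_dash.mpd > combined.mpd would then write the merged description to a new file; the MPD file names here are assumptions and leave both inputs untouched.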