HH:MM:SS:FF expressions in IMSC1

IMSC1 requires the use of ttp:timeBase="media". As a result, time expressions of the form:

hours ":" minutes ":" seconds ":" frames

are neither SMPTE ST 12 time labels nor related to any timecode or frame rate of the related video object.

Instead, they express an offset in seconds, given by the following formula (see Annex N.2 in TTML):

3600 * hours + 60 * minutes + seconds + (frames / (ttp:frameRate * ttp:frameRateMultiplier))

Note that the offset depends on the values of ttp:frameRate and ttp:frameRateMultiplier, which are not necessarily equal to the Composition Edit Rate.

For example, the time expression 00:00:02:22 corresponds to an offset of:

  • 2 + 22 / (24 * 1000/1001) = 2.9175833… seconds if ttp:frameRate="24" and ttp:frameRateMultiplier="1000 1001"
  • 2 + 22 / (30 * 1000/1001) = 2.734066… seconds if ttp:frameRate="30" and ttp:frameRateMultiplier="1000 1001"
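
The formula above can be sketched in Python as follows. This is a minimal illustration and not part of TTML or IMSC1: the function name frames_expression_to_seconds and its parameters are assumptions made for this example. Exact rational arithmetic is used so that the 1000/1001 multiplier does not introduce rounding.

```python
from fractions import Fraction


def frames_expression_to_seconds(expr, frame_rate, multiplier=(1, 1)):
    """Convert "HH:MM:SS:FF" to an offset in seconds (hypothetical helper).

    frame_rate corresponds to ttp:frameRate; multiplier is the
    (numerator, denominator) pair of ttp:frameRateMultiplier.
    """
    hours, minutes, seconds, frames = (int(part) for part in expr.split(":"))
    # Effective frame rate = ttp:frameRate * ttp:frameRateMultiplier
    effective_rate = Fraction(frame_rate) * Fraction(*multiplier)
    return 3600 * hours + 60 * minutes + seconds + Fraction(frames) / effective_rate


offset_24 = frames_expression_to_seconds("00:00:02:22", 24, (1000, 1001))
offset_30 = frames_expression_to_seconds("00:00:02:22", 30, (1000, 1001))
print(float(offset_24))
print(float(offset_30))
```

The same time expression yields two different offsets, which is why the frame rate parameters have to be known before the expression can be interpreted.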

To avoid ambiguities, one can always use offset-time, e.g. 2.9175833s, or clock-time with fraction, e.g. 00:00:02.9175833, neither of which uses frames.

Positioning in TTML and IMSC1

Positioning of elements in IMSC1 uses the tts:origin and tts:extent attributes defined in TTML.

These attributes apply to region elements only (see 8.2.7 and 8.2.14 at 2) and have no effect when used on a p or div element.

For example, <p begin="1s" end="2s" tts:origin="47.59% 84.67%" tts:extent="auto">hello</p> will not position the p element at "47.59% 84.67%".

In order to position p or div elements, a separate region element has to be defined for each target position.

For example, the following positions each of the two p elements at different locations:

<?xml version="1.0" encoding="UTF-8"?>
<tt xmlns="http://www.w3.org/ns/ttml" xmlns:tts="http://www.w3.org/ns/ttml#styling">
  <head>
    <layout>
      <region xml:id="pop1" tts:origin="20% 80%" tts:extent="50% 15%"/>
      <region xml:id="pop2" tts:origin="50% 80%" tts:extent="30% 15%"/>
    </layout>
  </head>
  <body>
    <div>
      <p region="pop1" begin="1s" end="2s">One</p>
      <p region="pop2" begin="3s" end="4s">Two</p>
    </div>
  </body>
</tt>
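
Since misplaced tts:origin and tts:extent attributes are silently ignored, it can help to scan a document for them. The following Python sketch, which uses only the standard library, reports such attributes on p and div elements. The function name find_ignored_positioning and the sample document are assumptions made for this example, not part of any IMSC1 tool.

```python
import xml.etree.ElementTree as ET

TT_NS = "http://www.w3.org/ns/ttml"
TTS_NS = "http://www.w3.org/ns/ttml#styling"
POSITION_ATTRS = ("{%s}origin" % TTS_NS, "{%s}extent" % TTS_NS)


def find_ignored_positioning(ttml_text):
    """Return (element name, attribute name) pairs for tts:origin or
    tts:extent attributes placed on p or div elements (hypothetical helper)."""
    root = ET.fromstring(ttml_text)
    findings = []
    for local_name in ("p", "div"):
        for element in root.iter("{%s}%s" % (TT_NS, local_name)):
            for attr in POSITION_ATTRS:
                if attr in element.attrib:
                    # Strip the namespace part, keeping "origin" or "extent"
                    findings.append((local_name, attr.split("}")[1]))
    return findings


sample = """<tt xmlns="http://www.w3.org/ns/ttml"
    xmlns:tts="http://www.w3.org/ns/ttml#styling">
  <body><div>
    <p begin="1s" end="2s" tts:origin="47.59% 84.67%" tts:extent="auto">hello</p>
  </div></body>
</tt>"""
print(find_ignored_positioning(sample))
```

An empty result means that no p or div element carries these attributes; it does not by itself confirm that every p or div references a region.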

Edit Rate in IMF Composition

The IMF Composition Playlist defines Edit Rate values for the Composition itself (/CompositionPlaylist/EditRate) as well as for individual Resources (//Resource/EditRate).

These values define the granularity of time-related parameters specified in the Composition Playlist, including:

  • the duration of the Composition itself, since the duration of each segment is a whole number of edit units, where one edit unit is the reciprocal of /CompositionPlaylist/EditRate; and
  • the EntryPoint and SourceDuration parameters used to specify the Playable Region of a Resource.

These values do not, however, specify the sampling rate or playback speed of the essence, which is set by metadata in the underlying Track File, e.g. the Sample Rate item of the File Descriptor for MXF Track Files.
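
As an illustration of how an Edit Rate sets the time granularity, the Python sketch below converts a count of edit units (such as an EntryPoint or SourceDuration value) into seconds. The function names and the sample value of 240 edit units are assumptions chosen for this example; the Edit Rate is given as a numerator and denominator pair, e.g. "24000 1001".

```python
from fractions import Fraction


def parse_edit_rate(text):
    """Parse an Edit Rate written as "numerator denominator", e.g. "24000 1001"."""
    numerator, denominator = (int(part) for part in text.split())
    return Fraction(numerator, denominator)


def edit_units_to_seconds(edit_units, edit_rate):
    """Duration in seconds of a whole number of edit units."""
    return Fraction(edit_units) / edit_rate


rate = parse_edit_rate("24000 1001")
# Hypothetical Resource with a SourceDuration of 240 edit units
duration = edit_units_to_seconds(240, rate)
print(duration, float(duration))
```

The result is only the duration implied by the Composition Playlist parameters; the actual sampling rate of the essence still comes from the Track File metadata.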

Selecting a J2K profile for 4K UHD video in IMF

IMF Application 2e (SMPTE ST 2067-21) offers a choice between two sets of JPEG 2000 profiles when performing lossy encoding of UHD video (3840×2160 pixels): Broadcast Contribution Single Tile Profile or 4k IMF Single Tile Lossy Profile.

While Broadcast Contribution Single Tile Profile is sometimes desirable for compatibility with legacy encoders, it also severely limits the maximum sampling rate of the video to 520 MSamples/s (1 MSample = 10^6 samples) at its highest level (Level 5). In contrast, 4k IMF Single Tile Lossy Profile supports sampling rates up to 38400 MSamples/s.

The sampling rate of video is calculated according to:

sampling_rate = frame_width_in_pixels × frame_height_in_pixels × frame_rate × average_number_of_components_per_pixel

where

average_number_of_components_per_pixel = 3 for 4:4:4 R’G’B’ images since there is one R’, one G’ and one B’ value per pixel; and

average_number_of_components_per_pixel = 2 for 4:2:2 Y’C’bC’r images since there are two Y’, one C’b and one C’r value per 2 pixels.

As a practical example, 4K UHD 4:4:4 R’G’B’ images at 23.98 fps exceed the capabilities of Broadcast Contribution Single Tile Profile Level 5 since:

sampling_rate = 3840 × 2160 × 24000/1001 × 3 ≈ 596.6 MSamples/s > 520 MSamples/s

Instead, 4k IMF Single Tile Lossy Profile Main Level 6 Sublevel 3, which has a maximum sampling rate of 1200 MSamples/s but the same maximum compressed bit rate (800 Mbit/s) as Broadcast Contribution Single Tile Profile Level 5, can be used.
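
The comparison above can be reproduced with a short Python sketch. The function and constant names are assumptions made for this example, and the two limits are simply the figures quoted in the text (520 and 1200 MSamples/s). The 4:2:2 case is added for illustration, using the value of 2 components per pixel given earlier.

```python
from fractions import Fraction

# Maximum sampling rates quoted in the text, in samples per second
BROADCAST_LEVEL_5_MAX = 520 * 10**6
IMF_MAINLEVEL_6_MAX = 1200 * 10**6


def sampling_rate(width, height, frame_rate, components_per_pixel):
    """Samples per second: width x height x frame rate x components per pixel."""
    return width * height * Fraction(frame_rate) * components_per_pixel


# 4K UHD at 24000/1001 fps (23.98 fps)
uhd_444 = sampling_rate(3840, 2160, Fraction(24000, 1001), 3)
uhd_422 = sampling_rate(3840, 2160, Fraction(24000, 1001), 2)

for label, rate in (("4:4:4", uhd_444), ("4:2:2", uhd_422)):
    print(label, round(float(rate) / 1e6, 1), "MSamples/s",
          "fits Broadcast Level 5:", rate <= BROADCAST_LEVEL_5_MAX,
          "fits IMF Main Level 6:", rate <= IMF_MAINLEVEL_6_MAX)
```

The check covers the sampling rate only; a real profile selection also has to consider the maximum compressed bit rate and the other constraints of each profile.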

Overall, it is therefore always preferable to use 4k IMF Single Tile Lossy Profile when creating 4K UHD video in IMF Application 2e.