IBM Research

Standards work


The Composite Media Group has made many contributions to the MPEG-4 standard, including technologies such as FlexTime and XMT (eXtensible MPEG Textual format) that have now been integrated into the standard. We contributed and promoted the MPEG-4 Systems Main2D profile, which has now been standardized. A profile defines the set of features that a terminal must support; the Main2D features were selected to address streaming media applications for the Internet. We have also provided feedback and technical corrections in support of interoperability, and served as editor for a corrigendum to the standard.

Learn more about MPEG-4.


MPEG-4 XMT (Extensible Textual Format)

MPEG-4 is an ISO/IEC standard developed by MPEG (Moving Picture Experts Group) for representing and compressing interactive, audio-visual scenes. The standard defines a set of tools for the coded representation of individual audio-visual objects, text/graphics and synthetic objects. An MPEG-4 scene defines the interactive behavior of these coded objects and the way they are composed in space and time. The scene description is coded in a highly compressed binary format known as BIFS (BInary Format for Scenes).

Compressed binary-coded MPEG-4 content can be stored in the MPEG-4 Systems file format, commonly known as the mp4 format. While the mp4 file is exchangeable, it is often difficult to use it later to edit or re-purpose the stored content. This is because the binary-coded representation often cannot be "reverse-engineered" in a consistent manner to recover the content author's original intentions. At the end of 1999, recognizing the need for a user-friendly, high-level textual format for authoring, MPEG issued a call for proposals. In March 2000 we took a proposal to MPEG for a high-level XML-based textual format - this has subsequently become part of the MPEG-4 standard and is now known as XMT.

XMT Logo

XMT (eXtensible MPEG Textual format) was designed to provide an exchangeable format between content authors while preserving the author's intentions in a high-level textual format. In addition to providing an author-friendly abstraction of the underlying MPEG-4 technologies, another important consideration for the XMT design was to respect the existing practices of content authors, such as Web3D X3D, W3C SMIL and HTML. A brief overview of XMT can be found in this ACM Multimedia 2000 workshop paper. The XMT logo here was designed for the MPEG-4 standard.

The XMT is suitable for many uses including manually authored content as well as machine-generated content using multimedia database material and templates. The XMT may be encoded and stored in the exchangeable mp4 binary file or may also be encoded directly into streams and transmitted. XMT encoding and delivery hints exist to assist this process.

Learn more about XMT.


MPEG-4 Flextime

Streaming media solutions, in order to be effective in a best-effort (non-QoS) Internet environment, need to be able to cope with network delivery delays across the media streams over the duration of the presentation playback. The media streams may all be delivered from a single server, or, in a more advanced case, a client may be composing streams from multiple servers that are widely dispersed over the network geography.

To operate better in such a best-effort network environment, we developed Flextime and brought it to the MPEG-4 standard. Flextime defines a flexible, relative-timing-based system so that presentation playback can be altered to accommodate network delays and avoid disruptions to the end user's experience. In Flextime, the playback of media streams can be delayed and/or their playback durations altered in response to network delay. Content that includes Flextime allows the author to specify both the relative timing and the bounds of the flexibility by which the playback may be altered. Flextime can thus be seen as an expression of application-level QoS over the presentation playback.

Flextime example figure

The diagram above pictorially represents three media objects, their temporal relationships and bounded playback durations as expressed by Flextime.
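
The idea of author-specified bounds on playback duration can be illustrated with a minimal sketch. It is not the MPEG-4 Flextime algorithm or its syntax; the class `FlexObject`, its fields and the function `fit_duration` are hypothetical names chosen for illustration only. The sketch assumes each media object carries a preferred duration plus a shortest and a longest duration the author will accept, and that the player asks for a target duration in response to network delay.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class FlexObject:
    """Hypothetical media object with author-specified duration bounds (seconds)."""
    name: str
    preferred: float  # duration the author would like
    shortest: float   # lower bound the author allows
    longest: float    # upper bound the author allows


def fit_duration(obj: FlexObject, target: float) -> Tuple[float, float]:
    """Clamp a requested playback duration into the author's bounds.

    Returns (adjusted_duration, residual), where residual is the part of
    the request that could not be absorbed: positive if the object could
    not be stretched far enough, negative if it could not be shrunk enough.
    """
    adjusted = max(obj.shortest, min(target, obj.longest))
    return adjusted, target - adjusted


# Illustrative values only: a 10 s clip that may play for 8 s to 12 s.
clip = FlexObject(name="video", preferred=10.0, shortest=8.0, longest=12.0)
stretched = fit_duration(clip, 11.5)  # next stream is 1.5 s late: fully absorbed
capped = fit_duration(clip, 13.0)     # 3 s late: only 2 s can be absorbed
shrunk = fit_duration(clip, 7.0)      # asked to catch up 3 s: only 2 s possible
```

A non-zero residual is the portion of the delay that falls outside the author's bounds, which a player would have to handle some other way, for example by delaying the start of a later object.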

Learn more about Flextime.


MPEG-4 Main2D Profile

Profiles in MPEG-4 define the tools (parts of the standard) that will be implemented in terminals conforming to that profile. MPEG-4, like other MPEG standards, is essentially a kit of standardized 'tools', and those tools, although part of the standard, will only get used if terminal manufacturers agree to define terminals that use them. A profile represents a set of tools that multiple companies have agreed upon to address particular application requirements.

We (IBM) promoted a profile for MPEG-4 Systems called Main2D. This profile includes the Flextime tools and was designed with a particular focus towards Internet media applications.

Profiles can contain levels, where a level bounds the amount or complexity of a tool that a terminal must support. Main2D has three levels of increasing complexity. The lowest level addresses simpler devices and was designed with portable phones, PDAs and similar devices in mind. The higher levels are for more powerful devices such as PCs and set-top boxes.


MPEG-4 Corrigenda

Our focus in MPEG-4 has been on the Systems part of the standard. We now have considerable experience in this area and have one of the few players capable of playing MPEG-4 Systems content. We have submitted many corrections and clarifications to the standard in order to ensure its promise of broad interoperability across manufacturers. As part of our participation in the standard, we held the editorship of the second corrigendum for MPEG-4.


SMIL

The Synchronized Multimedia Integration Language (SMIL) is a recommendation from the World Wide Web Consortium (W3C) for integrating rich media in interactive audiovisual presentations. It is a language in XML for specifying this integration of audio, video, text and graphics.

Being engaged in composite media work we joined the working group for SMIL 2.0 and actively contributed to the development of the specification. The following were the main areas of our involvement:

  1. We proposed to SMIL 2.0 aspects of the Flexible timing model, which we had standardized in MPEG-4, in the form of a min and max module for the timing. This was accepted and the module is now part of the Basic Profile for SMIL 2.0.

  2. We worked jointly with both MPEG and SMIL in support of the XMT textual format to ensure interoperability.

  3. We participated in the SMIL 2.0 interoperability testing to verify that the SMIL 2.0 timing constructs could be implemented, as part of the process SMIL 2.0 had to go through in order to become a W3C Recommendation.

  4. We co-edited the SMIL 2.0 specification.
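
As an illustration of the min and max timing mentioned in item 1, the following sketch reads `min` and `max` attributes from a small SMIL-style fragment. The fragment, its file names and the helper functions `seconds` and `active_bounds` are assumptions made for this example, and the parser handles only simple values of the form `8s`, not the full SMIL clock-value syntax. The defaults used here (a minimum of zero and an unbounded maximum, with both attributes ignored when max is less than min) follow our reading of the SMIL 2.0 timing specification and should be checked against it.

```python
import xml.etree.ElementTree as ET

# Hypothetical fragment: a video whose active duration may flex between
# 8 s and 12 s, played in parallel with an unconstrained audio track.
SMIL_FRAGMENT = """
<par>
  <video src="clip.mpg" dur="10s" min="8s" max="12s"/>
  <audio src="narration.mp3" dur="10s"/>
</par>
"""


def seconds(value):
    """Parse the simple '<number>s' values used in this sketch only."""
    if not value.endswith("s"):
        raise ValueError("unsupported clock value: " + value)
    return float(value[:-1])


def active_bounds(element):
    """Return (min, max) bounds on an element's active duration in seconds."""
    lower = seconds(element.get("min", "0s"))
    max_attr = element.get("max")
    if max_attr is None or max_attr == "indefinite":
        upper = float("inf")
    else:
        upper = seconds(max_attr)
    if upper < lower:
        # Assumed rule: an inconsistent pair is ignored entirely.
        lower, upper = 0.0, float("inf")
    return lower, upper


root = ET.fromstring(SMIL_FRAGMENT)
bounds = {child.get("src"): active_bounds(child) for child in root}
```

Here `bounds` maps each media source to the range within which a player may adjust its active duration.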



