|
The Composite Media Group has made many contributions to the MPEG-4
standard. In particular, these include technologies such as Flextime and XMT (eXtensible MPEG
Textual format), which have now been successfully integrated into the standard. We contributed and
promoted the MPEG-4 Systems Main2D profile, which has now been standardized. A profile defines the set of features
that a terminal must support - Main2D features were selected to address streaming media applications
for the Internet. We have also provided valuable feedback and technical corrections in support
of interoperability and served as editor of a corrigendum to the standard.
Learn more about MPEG-4.
| MPEG-4 XMT (Extensible Textual Format) |
MPEG-4 is an ISO/IEC standard developed by MPEG (Moving Picture Experts Group) for
representing and compressing interactive, audio-visual scenes. The standard defines a set of tools
for the coded representation of individual audio-visual objects, text/graphics and synthetic objects.
An MPEG-4 scene defines the interactive behavior of these coded objects and the way
they are composed in space and time. The scene description is coded in a highly compressed binary format known as
BIFS (BInary Format for Scenes).
Compressed binary-coded MPEG-4 content can be stored in the MPEG-4
Systems file format, commonly known as the mp4 format. While an mp4 file is exchangeable,
it is often difficult to subsequently use it to edit or re-purpose the stored content.
This is because the binary coded representation often cannot be "reverse-engineered" in a
consistent manner to recover the content author's original intentions. At the end of 1999,
recognizing the requirement for a user-friendly, high-level textual format for authoring, MPEG issued
a call for proposals. In March 2000 we took a proposal to MPEG for a high-level XML-based textual
format - this has subsequently become part of the MPEG-4 standard and is now known as XMT.
The XMT (eXtensible MPEG Textual format) was designed to provide an
exchangeable format between content
authors while preserving the author's intentions in a high-level textual format. In addition to
providing an author-friendly abstraction of the underlying MPEG-4 technologies, another important
consideration for the XMT design was to respect existing practices of content authors, such as
Web3D X3D, W3C SMIL and HTML. A brief overview of XMT can be found in this
ACM Multimedia 2000 workshop paper. The XMT logo here was designed for the MPEG-4 standard.
The XMT is suitable for many uses including manually authored content as well
as machine-generated content using multimedia database material and templates. The XMT may be encoded
and stored in the exchangeable mp4 binary file or may also be encoded directly into streams
and transmitted. XMT encoding and delivery hints exist to assist this process.
Learn more about XMT.
Streaming media solutions, in order to be effective in a best-effort (non-QoS) Internet environment,
need to be able to cope with network delivery delays across the media streams over the duration of the
presentation playback. The media streams may all be delivered from a single server, or, in a more advanced case,
a client may be composing streams from multiple servers that are widely dispersed over the network geography.
To better operate in such a best-effort network environment, we developed Flextime and
brought it to the MPEG-4 standard.
Flextime defines a flexible system
based on relative timing, so that presentation playback can be altered to accommodate network delays
and avoid disruptions to the end user's experience. In Flextime, the playback of media streams can
be delayed and/or their playback durations altered in response to network delay.
Content that includes Flextime allows the author to specify both the relative timing
and the bounds of the flexibility by which the playback may be altered. Flextime
can thus be seen as an expression of application level QoS over the presentation playback.
The diagram above pictorially represents three media objects, their temporal relationships and
bounded playback durations as expressed by Flextime.
Learn more about Flextime.
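The adaptation described above can be illustrated with a minimal Python sketch. It is not the MPEG-4 Flextime syntax or algorithm: the `FlexObject` class, its `preferred`, `minimum` and `maximum` fields and the `adapt` function are hypothetical names, and the rule applied (shorten a late-starting object so that it still ends on time, within the author's bounds) is just one simple policy assumed for illustration.

```python
from dataclasses import dataclass


@dataclass
class FlexObject:
    """A media object with an author-specified flexible duration (hypothetical model)."""
    name: str
    preferred: float  # duration the author would like, in seconds
    minimum: float    # shortest acceptable playback duration, in seconds
    maximum: float    # longest acceptable playback duration, in seconds


def adapt(obj: FlexObject, start_delay: float) -> tuple[float, float]:
    """Absorb a late start by shortening playback within the author's bounds.

    Returns (duration, end_shift), where end_shift is how far the end of the
    object moves relative to its originally scheduled end time.
    """
    # Duration that would keep the original end time despite the late start.
    target = obj.preferred - start_delay
    # Never leave the range the author allowed.
    duration = min(obj.maximum, max(obj.minimum, target))
    end_shift = start_delay + duration - obj.preferred
    return duration, end_shift


video = FlexObject("video", preferred=10.0, minimum=8.0, maximum=12.0)
print(adapt(video, 1.5))  # (8.5, 0.0): the delay is fully absorbed
print(adapt(video, 3.0))  # (8.0, 1.0): the bound is hit, so the end slips by 1 s
```

In the second call the delay exceeds the flexibility the author granted, so the remaining second has to be passed on to whatever is scheduled relative to the end of this object.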
| MPEG-4 Main2D Profile |
Profiles in MPEG-4 define the tools (parts of the standard) that will be implemented in terminals
conforming to that profile. MPEG-4, like other MPEG standards, is essentially a kit of standardized 'tools'
and those tools, although part of the standard, will only get used if terminal manufacturers agree to define terminals
that use them. A profile represents a set of tools that multiple companies have agreed upon to address particular
application requirements.
We (IBM) promoted a profile for MPEG-4 Systems called Main2D.
This profile includes the Flextime tools and was designed
with a particular focus towards Internet media applications.
Profiles can contain levels, where a level sets the amount or complexity of a tool. Main2D
has three levels of increasing complexity. The lowest level addresses simpler devices and was designed with a view to
portable phones, PDAs and similar devices. The higher levels are for more powerful devices such as PCs and set-top boxes.
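The relationship between tools, profiles and levels can be sketched in a few lines of Python. This is an illustrative model only, not the conformance procedure defined by the standard: the `Profile` and `Terminal` classes, the `can_play` function and the tool names `tool_a` and `tool_b` are hypothetical placeholders (only the presence of the Flextime tools in Main2D comes from the text above).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Profile:
    name: str
    tools: frozenset  # tools a conforming terminal must implement
    levels: int       # number of levels; 1 is the simplest


@dataclass(frozen=True)
class Terminal:
    tools: frozenset  # tools the device implements
    max_level: int    # highest level the device can handle


def can_play(terminal: Terminal, profile: Profile, content_level: int) -> bool:
    """True if the terminal implements every tool in the profile at the content's level."""
    if not 1 <= content_level <= profile.levels:
        raise ValueError("level outside the profile's range")
    has_all_tools = profile.tools <= terminal.tools  # subset test
    return has_all_tools and content_level <= terminal.max_level


# Placeholder tool names; only "flextime" is taken from the text.
main2d = Profile("Main2D", frozenset({"flextime", "tool_a"}), levels=3)
phone = Terminal(frozenset({"flextime", "tool_a"}), max_level=1)
pc = Terminal(frozenset({"flextime", "tool_a", "tool_b"}), max_level=3)

print(can_play(phone, main2d, 1))  # True
print(can_play(phone, main2d, 3))  # False
print(can_play(pc, main2d, 3))     # True
```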
Our focus in MPEG-4 has been on the Systems part of the standard. Here we now have
considerable experience, and we have one of the few players capable of playing MPEG-4 Systems content. We have
contributed many corrections and clarifications to the standard in order to ensure its promise of broad interoperability
across manufacturers. As part of our participation in the standard, we held the editorship of
the second corrigendum for MPEG-4.
The Synchronized Multimedia Integration Language (SMIL) is a Recommendation from the World Wide Web
Consortium (W3C) for integrating rich media in interactive audiovisual presentations. It is an XML-based language for
specifying the integration of audio, video, text and graphics.
Being engaged in composite media work, we joined the working group for SMIL
2.0 and actively contributed to the development of the specification. The following were the
main areas of our involvement:
We proposed to SMIL 2.0 aspects of the flexible timing model that we had
standardized in MPEG-4, in the form of a min and max module for timing. The proposal was accepted, and the
module is now part of the Basic Profile for SMIL 2.0.
We worked jointly with both MPEG and SMIL in support of the XMT textual format to ensure
interoperability.
We participated in the SMIL 2.0 interoperability testing, which verified that the SMIL 2.0 Timing constructs could be implemented,
as part of the process SMIL 2.0 had to go through in order to become a W3C Recommendation.
We co-edited the SMIL 2.0 specification.
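The min and max timing constraint mentioned in the first item above can be sketched in Python as a clamp on an element's active duration. This is a simplified sketch, not the full SMIL 2.0 timing algorithm: the function name, the use of seconds and the use of `math.inf` for an indefinite maximum are assumptions, and the handling of a min greater than max (both are ignored) reflects our reading of the SMIL 2.0 timing rules.

```python
import math


def constrain_active_duration(preliminary: float,
                              min_dur: float = 0.0,
                              max_dur: float = math.inf) -> float:
    """Clamp a preliminary active duration (seconds) to the [min, max] range."""
    if min_dur > max_dur:
        # Inconsistent bounds: fall back to the unconstrained defaults (assumed rule).
        min_dur, max_dur = 0.0, math.inf
    return min(max_dur, max(min_dur, preliminary))


print(constrain_active_duration(5.0, min_dur=10.0, max_dur=20.0))   # 10.0
print(constrain_active_duration(30.0, min_dur=10.0, max_dur=20.0))  # 20.0
print(constrain_active_duration(15.0, min_dur=10.0, max_dur=20.0))  # 15.0
```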

|