|
This section provides a high-level feature analysis, or executive overview,
of MPEG-4 compared with other media technologies and standards in use in the industry today.
An official overview of MPEG-4, with full detail from the ISO/IEC standards body,
can be read in a
document created by the MPEG group.
MPEG-4 follows on from the very successful MPEG-1 and
MPEG-2 standards. MPEG-1 and MPEG-2 are video and audio compression standards for CD-quality
and broadcast-quality video/audio content. MPEG-1 audio includes the popular MP3 compression
(MPEG-1 Audio Layer III); MPEG-2 is used for digital TV and on DVDs.
MPEG-4 built on that compression work and introduced new audio and video compression
technologies that scale from very low bit rate Internet-type applications to high-quality TV broadcast and studio
applications. In addition, the MPEG-4 standard includes MPEG-4 Systems, which can describe
complete dynamic, interactive, intelligent, animated 2D and 3D content that can include this video and audio. With
this, MPEG-4 moved beyond simple video/audio-only content to allow engaging, immersive rich-media applications.
MPEG-4 Introduction
MPEG-4 is an International Standard developed by ISO/IEC. The standard itself,
ISO/IEC 14496, comprises several parts, including reference software and conformance parts. The first three
parts are of most interest:
- ISO/IEC 14496-1: Systems
- ISO/IEC 14496-2: Visual
- ISO/IEC 14496-3: Audio
Each part is self-contained in that the technologies can be used on their own. The Systems
part, although it can be used standalone, is more often used to integrate the visual and audio functions into one
seamless composite media presentation.
The Visual and Audio parts are fairly self-explanatory. The Visual part contains MPEG-4 video
together with other visual technologies such as still image texture,
compressed mesh, and face and body animation. The video covers a wide range of applications, from low bit rates
suitable for low-complexity mobile devices, through broadcast TV quality (overlapping with MPEG-2), right up to
studio applications with very high quality and resolution. The Audio part contains audio codecs covering a wide range
of audio applications, from very low bit rate speech to high-quality music. It also has synthetic audio and
text-to-speech. Audio objects can be AAC, TwinVQ, CELP, HVXC (parametric speech), TTSI (text-to-speech), Main synthetic,
Wavetable synthesis, General MIDI, Algorithmic Synthesis and Audio FX, plus error-resilient flavors of the above,
including scalable AAC.
The Systems part contains description and control frameworks for the composite scene
presentation as well as for the fundamental elementary streams (audio, video, etc.) that can go into making up that
scene. The scene takes the form of a tree-like description that relates objects to one another hierarchically
and provides links to the media in the elementary stream framework that are to be rendered. The elementary streams
can contain media, such as video or audio, as well as a number of other types. The scene was based on VRML
(ISO/IEC 14772), with additions for 2D and other animation and with streaming extensions (VRML being a static,
non-streamed scene). Systems also contains the mp4 file format for storage and exchange of mp4 content; it defines
MPEG-J interfaces so that Java byte code forming so-called MPEG-lets can be executed and
can interact with the scene; and it defines the eXtensible MPEG-4 Textual format (XMT), an XML-based language
designed for MPEG-4 Systems for the express purposes of authoring, machine creation and interchange of
content.
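The tree-like scene described above can be pictured with a small sketch. This is only an illustration of the idea of a hierarchy whose nodes may link to elementary streams; the class names, fields and node names (SceneNode, ElementaryStream, es_id, "movie", and so on) are assumptions made for this example and are not taken from the MPEG-4 Systems specification.

```python
from dataclasses import dataclass, field
from typing import Dict, Iterator, List, Optional, Tuple


@dataclass
class ElementaryStream:
    """A media stream the scene can point at (hypothetical fields)."""
    es_id: int
    kind: str  # e.g. "video" or "audio"


@dataclass
class SceneNode:
    """One node of the scene tree; names are illustrative only."""
    name: str
    children: List["SceneNode"] = field(default_factory=list)
    es_id: Optional[int] = None  # optional link to an elementary stream

    def walk(self, depth: int = 0) -> Iterator[Tuple[int, "SceneNode"]]:
        """Yield (depth, node) pairs in depth-first order."""
        yield depth, self
        for child in self.children:
            yield from child.walk(depth + 1)


def referenced_streams(
    root: SceneNode, streams: Dict[int, ElementaryStream]
) -> List[ElementaryStream]:
    """Collect the streams that nodes in the tree link to."""
    return [streams[n.es_id] for _, n in root.walk() if n.es_id is not None]


# Made-up scene: a group holding a video and a caption, plus a soundtrack.
streams = {
    1: ElementaryStream(1, "video"),
    2: ElementaryStream(2, "audio"),
}
scene = SceneNode("root", [
    SceneNode("group", [
        SceneNode("movie", es_id=1),
        SceneNode("caption"),
    ]),
    SceneNode("soundtrack", es_id=2),
])

# Print the hierarchy, indenting each node by its depth.
for depth, node in scene.walk():
    print("  " * depth + node.name)
```

Keeping the stream reference as an identifier rather than embedding the media in the node mirrors the separation the text describes between the scene and the elementary stream framework.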
There are two other parts of the standard that may be of interest:
- ISO/IEC 14496-6: DMIF
- ISO/IEC 14496-8: Carriage of ISO/IEC 14496 contents over IP networks
DMIF is the Delivery Multimedia Integration Framework. It provides an abstraction of a delivery
interface for the purposes of specification; MPEG-4 has been designed to be transport independent and
so does not specify networks or network protocols. MPEG has, however, generated a framework document for carriage of
MPEG-4 content over IP. This work has been done in conjunction with the IETF, and a number of RFCs have
been created to specify payload formats and related details.
Finally, one new part under development is a new video codec. This is joint work with the
ITU, which was defining the H.26L codec (a follow-on beyond H.261 and H.263). The work is being done by the Joint Video
Team (JVT) working group and will become a new MPEG-4 video standard as part 10, i.e. ISO/IEC
14496-10, called Advanced Video Coding. The specification is technically complete and thus, as
part of the standardization process, is technically frozen apart from necessary corrections. Corrections, review
and voting cycles for the national bodies mean that the standard will be finally published, and be publicly
available, after February 2003.
MPEG-4 Profiles and Levels
MPEG as a standards organization does not specify end-user products or equipment. MPEG
standardizes what it calls tools, which can then be selected and used to build products. For video codecs, tools
would be advanced motion vectors, quarter-pel (¼ pel) motion compensation, etc.; for Systems they are the individual
nodes that represent the scene, individual commands, etc.
What MPEG does standardize, though, are Profiles. A Profile is a selection of tools that a group
of companies participating in the standard have chosen as a basis for deploying products to meet specified
application areas. To be standardized, a Profile passes through a requirements process in which the tools and
applications are reviewed and voted on; if there are sufficient supporting companies, the profile can be
standardized as an interoperable profile for the industry.
Within each profile there can be one or more levels. Levels allow for increasing complexity of
the tools, giving some diversity within a profile for addressing devices of varying performance. Levels may thus
restrict bit rates, size, number of nodes, etc. The restrictions are tighter at the low end of the level scale than at
the high end, where there may even be no restriction.
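The relationship between tools, profiles and levels can be sketched as a small data model. This is a minimal illustration under stated assumptions: the profile name "Example 2D", the level names and every numeric limit are invented for the example and are not values from the standard; only the tool names Circle, Rectangle and Text echo renderable tools named elsewhere in this section.

```python
from dataclasses import dataclass
from typing import FrozenSet, Optional, Tuple


@dataclass(frozen=True)
class Level:
    """Limits for one level; None means 'no restriction'."""
    name: str
    max_bitrate_kbps: Optional[int]
    max_nodes: Optional[int]


@dataclass(frozen=True)
class Profile:
    """A named selection of tools plus its levels, lowest first."""
    name: str
    tools: FrozenSet[str]
    levels: Tuple[Level, ...]


@dataclass(frozen=True)
class Content:
    """What a piece of content actually needs from a decoder."""
    tools_used: FrozenSet[str]
    bitrate_kbps: int
    node_count: int


def within(limit: Optional[int], value: int) -> bool:
    """True if the value respects the limit (None = unlimited)."""
    return limit is None or value <= limit


def lowest_conforming_level(profile: Profile, content: Content) -> Optional[Level]:
    """Return the first level the content fits into, or None."""
    if not content.tools_used <= profile.tools:
        return None  # content uses a tool outside the profile
    for level in profile.levels:
        if (within(level.max_bitrate_kbps, content.bitrate_kbps)
                and within(level.max_nodes, content.node_count)):
            return level
    return None


# All names and limits below are hypothetical placeholders.
example_profile = Profile(
    name="Example 2D",
    tools=frozenset({"Circle", "Rectangle", "Text"}),
    levels=(
        Level("L1", max_bitrate_kbps=64, max_nodes=50),
        Level("L2", max_bitrate_kbps=384, max_nodes=500),
        Level("L3", max_bitrate_kbps=None, max_nodes=None),
    ),
)

clip = Content(frozenset({"Circle", "Text"}), bitrate_kbps=200, node_count=120)
level = lowest_conforming_level(example_profile, clip)
print(level.name if level else "does not conform")
```

The top level with no limits corresponds to the remark above that the high end of the level scale may carry no restriction at all.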
The wide application range for MPEG-4 can be seen in the names of the profiles.
MPEG-4 Systems is broken down into four profile sets: two profile sets for the
scene, one set for the Object Descriptor (OD) framework describing the elementary streams, and one set for
MPEG-J. Audio and Visual have just one set of profiles each.
The two profile sets for MPEG-4 Systems covering the scene are the
SceneGraph profiles, which contain mainly the tools forming the structure of the tree, and the Graphics profiles,
which contain the renderable tools such as Circle, Rectangle, Text, etc.
The Systems SceneGraph Profiles specified are:
- Simple 2D
- Audio
- 3D Audio
- Basic 2D
- Core 2D
- Main 2D
- Advanced 2D
and the Systems Graphics Profiles are:
- Simple 2D
- Simple 2D + Text
- Core 2D
- Advanced 2D
For Audio the following profiles are defined (some have up to 8 levels):
- Main
- Scalable
- Speech
- Synthesis
- High Quality Audio
- Low Delay Audio
- Natural Audio
- Mobile Audio Internetworking
And for Visual the following:
- Simple
- Simple Scalable
- Core
- Main
- N-Bit
- Hybrid
- Basic Animated Texture
- Scalable Texture
- Simple Face Animation
- Simple FBA
- Advanced Real Time Simple
- Core Scalable
- Advanced Coding Efficiency
- Advanced Core
- Advanced Scalable Texture
- Simple Studio
- Core Studio
- Advanced Simple
- FGS
It is expected and highly desired, although not required, that industry groups building
products select tools by selecting one or more profiles and levels as standardized by MPEG. Further restriction of
the profiles is acceptable practice; for example, ISMA has selected the Simple visual profile but restricted
the maximum bit rate to 64 kbps and allowed only one video object to be coded within the stream.
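Restricting a profile for a particular deployment, as in the ISMA example, amounts to tightening limits without ever loosening them. The sketch below illustrates that idea only. The 64 kbps and single-video-object figures come from the text above; the base limits (384 kbps, 4 objects), the Constraints class and the restrict helper are hypothetical placeholders rather than values or names from the standard or from ISMA.

```python
from dataclasses import dataclass, replace


@dataclass(frozen=True)
class Constraints:
    """Limits a decoder is promised for a given profile (illustrative)."""
    profile: str
    max_bitrate_kbps: int
    max_video_objects: int


def restrict(base: Constraints, **tighter: int) -> Constraints:
    """Return a copy of base with tighter limits; refuse to loosen any."""
    for key, value in tighter.items():
        if value > getattr(base, key):
            raise ValueError(f"{key}={value} would loosen the base limit")
    return replace(base, **tighter)


# Placeholder base limits, assumed purely for the example.
base = Constraints("Simple", max_bitrate_kbps=384, max_video_objects=4)

# The restriction described in the text: 64 kbps, one video object.
isma_like = restrict(base, max_bitrate_kbps=64, max_video_objects=1)
print(isma_like)
```

Because the restricted set is a subset of what the base profile allows, content authored to the tighter limits should still be decodable by any decoder that conforms to the base profile.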
Patents and Licensing
Companies can bring technologies to MPEG on which they have patents, and the technologies may be
selected for inclusion in the standard. Each part of the standard lists the companies that have provided patent
statements to the standards organization. Patents generally cover certain tools and may cover encoding
and/or decoding processes. MPEG, however, standardizes only decoders, so the specification, the conformance tests and
the reference bit streams are all for decoders. Not standardizing encoders allows their implementation to
vary, so long as they produce conformant bit streams that can be decoded by conformant decoders. Patents
can still cover either or both aspects, though.
When patented tools or technologies are accepted into the standard, the company or companies in question are
required to provide licenses for any patents reading on those tools under reasonable and non-discriminatory
terms.
Profiles were described above as sets of tools selected from the
standard to address particular application areas. The tools in a chosen profile may be covered by patents,
in which case licensing is required.
So how does one obtain a license? For essential patents, a convenient way is to go to a licensing
administration company that has set up a patent pool covering those technologies. For
MPEG-4 Visual and Systems licensing, MPEG-LA
is such an administrator. As of June 2003, Via
Licensing is providing licensing for MPEG-4 audio and has begun the process for AVC (Advanced Video
Coding).
To give an idea of how this comes about: under MPEG-LA, any company believing it
holds relevant essential patents is invited to submit them, and for a fee they are evaluated by independent experts
to determine their essentiality. The companies determined to hold essential patents form a pool
under the management of MPEG-LA. Terms and conditions for the licensing are then worked out
amongst those companies.
Note that the presence of these licensing pools does not preclude negotiations being held
individually with each of the companies holding patents. A company that wants to ship a product can therefore either
negotiate with the individual companies involved or, more straightforwardly, license the technologies from a
relevant licensing administration where one exists.
See the MPEG-4 Industry Forum site for further comprehensive
information on patents and licensing.

|