Abstract: What Is a Codec?
In the age of smartphones and internet-ready devices, audio/video transport and distribution has
evolved from sharing low-quality files to providing high-quality mobile device streams,
click-to-play content, over-the-air broadcasting, audio distribution in large facilities, and more.
Each medium has several methods of compressing content by means of a codec. This session will
explain which codecs are appropriate for which purposes, common misuses of audio codecs, and
how to maintain audio quality by implementing codecs professionally.
Introduction:
According to Wikipedia, "A codec is a device or program capable of performing encoding and
decoding on a digital data stream or signal." In plain English I'd put it this way: a codec allows
one to read and save audio and video files, often for the purposes of saving space.
The best known example of a codec is MP3. It compresses bulky audio files such as WAV to much
smaller MP3 files.
All codecs involve a tradeoff between the amount of compression and the resultant quality. If you
compress too much the quality loss may become intolerable.
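As a rough illustration of that tradeoff, the sketch below compares the bit rate of uncompressed
CD-quality PCM audio (the kind of data a WAV file typically holds) with a few MP3 bit rates. The
MP3 bit rates and the function names are illustrative assumptions, not values taken from this
document.

```python
# CD-quality PCM parameters (standard audio CD values).
SAMPLE_RATE_HZ = 44_100
BITS_PER_SAMPLE = 16
CHANNELS = 2


def pcm_bitrate_kbps(sample_rate_hz=SAMPLE_RATE_HZ, bits=BITS_PER_SAMPLE, channels=CHANNELS):
    """Bit rate of uncompressed PCM audio in kilobits per second."""
    return sample_rate_hz * bits * channels / 1000


def compression_ratio(encoded_kbps):
    """How many times smaller the encoded stream is than CD-quality PCM."""
    return pcm_bitrate_kbps() / encoded_kbps


if __name__ == "__main__":
    print(f"PCM: {pcm_bitrate_kbps():.1f} kbps")
    for kbps in (320, 192, 128):  # assumed, illustrative MP3 bit rates
        print(f"MP3 at {kbps} kbps is about {compression_ratio(kbps):.1f}x smaller")
```

The lower the chosen bit rate, the larger the ratio and the more audio information the encoder
has to discard to reach it.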
A codec can consist of two components: an encoder and a decoder. The encoder performs the
compression (encoding) function and the decoder performs the decompression (decoding)
function. Some codecs include both of these components and some codecs only include one of
them.
For example, when you rip a song from an audio CD to your computer, Windows Media Player uses
the Windows Media Audio codec by default to compress the song into a compact WMA file. When
you play that WMA file (or any WMA file that might be streamed from a website), the Player uses
the Windows Media Audio codec to decompress the file so the music can be played through your
speakers.
Hence, a codec is software that is used to compress or decompress a digital media file, such as a
song or video.
Compression has always had a lot to do with the ability to share video over various connection
speeds through the internet. Full-frame-resolution NTSC video with CD-quality audio,
uncompressed, has a data rate of nearly 30 MB per second. Even with large amounts of
compression, that stream would choke over a 56K connection. Over the years many different
types of codecs have been developed in the hope of achieving better quality video at lower
data rates.
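The "nearly 30 MB per second" figure can be reproduced with a little arithmetic. The sketch
below is only one way to arrive at it: the frame size (720x480), the pixel format (24-bit RGB),
and the reading of "MB" as 2^20 bytes are all assumptions, since the text does not state them.
Other sampling formats would give a lower number.

```python
WIDTH, HEIGHT = 720, 480               # assumed full-frame NTSC resolution
BYTES_PER_PIXEL = 3                    # assumed 24-bit RGB, no chroma subsampling
FPS = 30000 / 1001                     # NTSC frame rate, about 29.97 frames per second
AUDIO_BYTES_PER_SEC = 44_100 * 2 * 2   # CD audio: 44.1 kHz, 16-bit (2 bytes), stereo


def uncompressed_bytes_per_sec():
    """Data rate of uncompressed video plus CD-quality audio, in bytes per second."""
    video = WIDTH * HEIGHT * BYTES_PER_PIXEL * FPS
    return video + AUDIO_BYTES_PER_SEC


def required_ratio(link_bits_per_sec):
    """Compression ratio needed to fit the stream into a link of the given speed."""
    return uncompressed_bytes_per_sec() * 8 / link_bits_per_sec


if __name__ == "__main__":
    rate = uncompressed_bytes_per_sec()
    print(f"Uncompressed: {rate / 2**20:.1f} MB/s")
    print(f"Ratio needed for a 56K link: about {required_ratio(56_000):.0f}:1")
```

Under these assumptions the stream would have to shrink by a factor of several thousand to fit a
56K connection, which is why it "chokes" even after heavy compression.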
This is where MPEG-2 compression outshines all the rest, lowering the megabytes to megabits
while still offering a high-quality picture with sound. Unlike many of the non-DV video
algorithms, MPEG-2 does not have sub-format frame ratios.
Most video applications using software-based codecs assume a 4:3 ratio and the sub-ratios
based upon it, i.e. 640x480, 320x240, 160x120, etc., all square-pixel ratios. This is because DV
was not yet around and analog capture was done in square-pixel 4:3 screen ratios. MPEG-2, on the
other hand, expects the DV pixel ratio and frame resolution. There is a breaking point with any
codec in relation to compression: the greater the compression, the lower the picture quality.
Naturally, the less the compression, the better the picture quality.
Most codecs are LOSSY, in order to get a reasonably small file size. There are LOSSLESS codecs
as well, but for most purposes the almost imperceptible increase in quality is not worth the
considerable increase in data size. The main exception is if the data will undergo more
processing in the future, in which case the repeated lossy encoding would damage the eventual
quality too much.
Examples of lossy file formats: AAC (Advanced Audio Coding), MP3, Vorbis (filename
extension .OGG), lossy Windows Media Audio (filename extension .WMA)...
Examples of lossless file formats: Apple Lossless (filename extension .m4a), FLAC, Monkey's
Audio (filename extension .APE), Shorten, TTA, lossless Windows Media Audio (filename
extension .WMA), WavPack.
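The difference between the two families can be sketched with toy stand-ins. In the example
below, zlib plays the role of a lossless codec, and a crude "keep only the top bits" step plays
the role of a lossy one. Neither is how the formats listed above actually work; the function
names and test data are hypothetical and serve only to show what "lossless" and "lossy" mean for
a round trip.

```python
import zlib


def lossless_roundtrip(samples: bytes) -> bytes:
    """Compress and decompress with zlib; the output is bit-for-bit identical."""
    return zlib.decompress(zlib.compress(samples))


def lossy_roundtrip(samples: bytes, keep_bits: int = 4) -> bytes:
    """Toy lossy step: keep only the top keep_bits bits of each 8-bit sample."""
    mask = (0xFF << (8 - keep_bits)) & 0xFF
    return bytes(s & mask for s in samples)


# Hypothetical 8-bit test data standing in for audio samples.
original = bytes((i * 7) % 256 for i in range(1000))

if __name__ == "__main__":
    print("lossless identical:", lossless_roundtrip(original) == original)
    print("lossy identical:   ", lossy_roundtrip(original) == original)
    worst = max(abs(a - b) for a, b in zip(original, lossy_roundtrip(original)))
    print("largest lossy error per sample:", worst)
```

The lossless round trip returns exactly the input, while the lossy one returns something close
to it; the discarded detail cannot be recovered afterwards.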
Classification of Codecs:
1. Audio Codecs:
In software, an audio codec is a computer program implementing an algorithm that
compresses and decompresses digital audio data according to a given audio file format or
streaming media audio format. The object of the algorithm is to represent the high-fidelity
audio signal with the minimum number of bits while retaining the quality. This can effectively
reduce the storage space and the bandwidth required for transmission of the stored audio
file. Most codecs are implemented as libraries which interface to one or more multimedia
players.
2. Video Codecs:
A video codec is a combination of hardware and/or software that creates a binary stream
of data that represents the video and audio captured by a camera. Encoders differ from
capture devices primarily in what they are intended to create as output. A capture card
usually creates a binary stream that will be stored as a file. An encoder usually creates a
stream of data that is to be transferred to a second device. This second device has
various names such as set-top box or decoder, but it essentially reverses the process
carried out by the encoder and re-creates the representation of the scene picked up by
the camera.
Video codecs basically do three things. If the camera is analog, they sample the output
signal. The rate at which this is done is referred to as the sampling rate. Each sample is
then converted into a certain number of bits (often 8) during the analog-to-digital
conversion process (A/D). This is called quantization. Finally, the codec must do
compression on the resulting bit stream because it is usually too much information to be
efficiently transmitted.
Hence, a video codec is software or a device that provides encoding and decoding which
may or may not include the use of video compression and/or decompression for digital
video.
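The three steps described above (sampling, quantization, compression) can be sketched on a
one-dimensional signal. This is a simplified illustration, not a video codec: the 5 Hz sine
wave, the 1000 Hz sampling rate, and the use of zlib as the compression step are all assumptions
chosen to keep the example short.

```python
import math
import zlib


def sample(signal, sampling_rate_hz, duration_s):
    """Step 1: read the analog signal at evenly spaced instants."""
    count = int(sampling_rate_hz * duration_s)
    return [signal(n / sampling_rate_hz) for n in range(count)]


def quantize(samples, bits=8):
    """Step 2: map each sample in [-1.0, 1.0] to an integer code (bits must be <= 8 here)."""
    levels = (1 << bits) - 1
    return [round((max(-1.0, min(1.0, s)) + 1.0) / 2.0 * levels) for s in samples]


def compress(codes):
    """Step 3: shrink the resulting byte stream (zlib stands in for a real codec)."""
    return zlib.compress(bytes(codes))


def analog(t):
    """Hypothetical analog source: a 5 Hz sine wave."""
    return math.sin(2 * math.pi * 5 * t)


codes = quantize(sample(analog, sampling_rate_hz=1000, duration_s=1.0))
packed = compress(codes)

if __name__ == "__main__":
    print(f"{len(codes)} samples of 8 bits each, {len(packed)} bytes after compression")
```

A higher sampling rate or more bits per sample captures the source more faithfully but produces
more data for the compression step to deal with.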
3. Text Codecs
A Text Codec is a function that transforms text into (when encoding) or out of (when
decoding) another kind of representation. Usually, the most human-readable
representation is said to be "decoded".
"Encoders" will turn the (selected or whole) text into something less readable; "decoders"
try to revert those effects as well as possible.
E.g.: ROT-13, Base64, URI codecs, Unicode codecs, case encoders, CMML, BiM.
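Three of the text codecs named above can be tried with the Python standard library. The sketch
below encodes a sample string with ROT-13, Base64, and URI (percent) encoding, then decodes each
result back; the sample string is arbitrary.

```python
import base64
import codecs
from urllib.parse import quote, unquote

text = "Hello, Codec!"

# ROT-13: rotate each letter 13 places; applying it twice restores the text.
rot13 = codecs.encode(text, "rot_13")

# Base64: represent the UTF-8 bytes using 64 printable ASCII characters.
b64 = base64.b64encode(text.encode("utf-8")).decode("ascii")

# URI encoding: replace characters that are unsafe in URLs with %XX escapes.
uri = quote(text)

if __name__ == "__main__":
    for name, encoded in (("ROT-13", rot13), ("Base64", b64), ("URI", uri)):
        print(f"{name}: {encoded}")

    # Decoding reverses each transformation exactly.
    print(codecs.decode(rot13, "rot_13") == text)
    print(base64.b64decode(b64).decode("utf-8") == text)
    print(unquote(uri) == text)
```

All three are lossless: the "decoded", human-readable form is recovered exactly.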
Codecs can also be classified into specialized codecs, such as ‘speech codecs’, which are
designed to deal with the characteristics of voice, and ‘audio codecs’, which are developed for
music. The difference between speech and audio codecs is that speech codecs look for
speech patterns in order to compress the data further.
Codecs may also be able to transcode from one digital format to another; for example, from PCM
audio to MP3 audio.
• Decoding
Simply opening and watching video files with a video player (e.g. a DivX decoder for opening
DivX files or an MPEG-1 decoder for opening an MPEG-1 video file).
• Encoding
Creating a video file in a particular format (e.g. DivX, MPEG-2, MPEG-4, ...). You'll need a DivX
encoder to create DivX files and an MPEG-2 encoder to create MPEG-2 videos. You'll need these
encoders for transcoding and recoding video files as well.
• Recoding
Converting a video file from a particular format with particular attributes into the same
format with different attributes (e.g. a 2 h movie at 3000 kbit/s into a 2 h movie at 2000 kbit/s).
• Transcoding
Converting a video file from one format into another video format (e.g. DivX to DVD or
MPEG-2 to DivX).
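The effect of the recoding example (a 2 h movie going from 3000 to 2000 kbit/s) on file size can
be estimated with simple arithmetic. The sketch below assumes 1 kbit = 1000 bits and ignores
container overhead; the function name is hypothetical.

```python
def video_size_bytes(duration_s: float, bitrate_kbit_s: float) -> float:
    """Approximate stream size: duration times bit rate, converted from kbit to bytes."""
    return duration_s * bitrate_kbit_s * 1000 / 8


TWO_HOURS_S = 2 * 60 * 60

before = video_size_bytes(TWO_HOURS_S, 3000)
after = video_size_bytes(TWO_HOURS_S, 2000)

if __name__ == "__main__":
    print(f"3000 kbit/s: {before / 1e9:.1f} GB")
    print(f"2000 kbit/s: {after / 1e9:.1f} GB")
    print(f"saved: {(before - after) / 1e6:.0f} MB")
```

Because size scales linearly with bit rate at a fixed duration, dropping the bit rate by a third
drops the file size by a third as well.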
These video codecs are technically differentiated from each other by various factors, including
compression technology/algorithm, platforms supported, sampling, operating systems
supported, etc.
One can easily compare the various video codecs on various websites. Still, there is often
confusion about which codec is appropriate, and the answer depends on the application.
Understanding the pros and cons of some of these codecs gives better information and deeper
insight.
MPEG-4
MPEG-4 is a standard currently under development for the delivery of interactive multimedia
across networks. As such, it is more than a single codec, and will include specifications for audio,
video, and interactivity.
The video component of MPEG-4 is very similar to H.263. It is optimized for delivery of video at
Internet data rates. One implementation of MPEG-4 video is included in Microsoft’s NetShow.
Pros
Good image quality at low data rates
Cons
Standard is still being designed
DivX
DivX is a brand name of products created by DivX, Inc. The DivX codec uses lossy MPEG-4 Part 2
compression and is fully compliant with the MPEG-4 Advanced Simple Profile (MPEG-4 ASP).
Pros
The DivX codec is quite simple to set up and use
It is popular due to its ability to compress lengthy video segments into small sizes while
maintaining relatively high visual quality.
Cons
It’s a commercial codec, so you have to pay to get all of its options.
x264
x264 is a freely available open source implementation of the H.264 standard. H.264, or AVC as it
is sometimes known, is a very advanced compression method that is part of the MPEG-4
standard.
Pros
It offers the best quality at the smallest file size
Cons
Video encoded with x264 (or any H.264 codec, for that matter) can take a fair bit of CPU power
to play
Real Video
RealMedia currently has only two video codecs: RealVideo (Standard) and RealVideo (Fractal).
Please bear in mind that this section only compares one to the other.
Pros
RealVideo (Standard) is usually best for data rates below 3 KBps.
It works better with relatively static material than it does with higher action content.
It usually encodes faster.
Cons
RealVideo (Standard) is significantly more CPU intensive than the RealVideo (Fractal)
codec.
It usually requires a very fast PowerMac or Pentium for optimal playback.
Sorenson
The Sorenson Video Codec produces excellent Web video suitable for playback on any Pentium
or PowerMac. It also delivers outstanding quality CD-ROM video at a fraction of traditional data
rates.
Pros
Provides much higher image quality than Cinepak, with smaller files. It is often possible to
get twice the image quality at less than half the data rate.
Tuned to work well from 2 - 100 KBps.
Supports Media Cleaner Pro’s variable bitrate encoding, which provides the best possible
results at any data rate.
Cons
Playback of CD-ROM video requires faster computers than Cinepak
Movies larger than 320×240, or at data rates above 100 KBps, do not play smoothly
except on high-end machines (such as a Macintosh G3). While picture quality is usually
outstanding at higher rates, you should test these movies on your target machines to
determine if playback performance is acceptable.
MPEG-1
MPEG-1 provides excellent image quality at CD-ROM data rates. One of the most popular uses of
MPEG-1 is the VCD, or “white book” video CD. MPEG includes both audio and video compression.
The biggest problem with MPEG is that it has high requirements for playback. Either a dedicated
MPEG decoder card must be installed, or a high-end CPU is required for software-only playback.
Because of this limitation, MPEG-1 has not gained wide acceptance in consumer titles.
Pros
Excellent image quality
Cons
Very high playback requirements
Majority of installed base not capable of viewing MPEG
Licensing fees (typically US $0.04 - $0.40 per unit) are required to distribute MPEG-2
video. There may also be fees for MPEG-1; there is some uncertainty regarding this.
Not well-suited to WWW video (the upcoming MPEG-4 standard will address this)
MPEG-2
MPEG-2 is a standard for broadcast-quality digitally encoded video. It offers outstanding image
quality and resolution. MPEG-2 is the primary video standard for DVD-Video.
Pros
Excellent image quality
Cons
Very few people are currently capable of viewing MPEG-2
Licensing fees (typically US $0.04 - $0.40 per unit) are required to distribute MPEG-2
video.
H.261
H.261 is a standard video-conferencing codec. As such, it is optimized for low data rates and
relatively low motion.
Pros
H.261 is optimized for low data rates.
H.261 has a strong temporal compression component, and works best on movies in which
there is little change between frames.
Cons
Not generally as good quality as H.263.
It may not play well on lower-end machines.
H.263
H.263 is a standard video-conferencing codec. As such, it is optimized for low data rates and
relatively low motion. H.263 is an advancement of the H.261 standard; it was mainly used as a
starting point for the development of MPEG (which is optimized for higher data rates).
Pros
H.263 is optimized for low data rates.
Generally better quality than H.261
H.263 has a strong temporal compression component, and works best on movies in which
there is little change between frames.
Cons
H.263 is CPU intensive
It may not play well on lower-end machines.
Understanding various Audio Codecs:
AAC
AAC+ / AAC+ Enhanced
AC3 or Dolby Digital
Dolby Digital Plus
Speex
FLAC
MIDI
MP3
MP3 Pro
Monkey’s audio
Ogg Vorbis
QCELP
Real Audio
WMA
Melody
HVXC
These audio codecs are technically differentiated from each other by various factors, including
compression technology/algorithm, platforms supported, sampling, operating systems
supported, etc.
One can easily compare the various audio codecs on Wikipedia. Still, there is often confusion
about which codec is appropriate, and the answer depends on the application. Understanding the
pros and cons of some of these codecs gives better information and deeper insight.
AAC
Pros
An international standard approved by the ISO
Flexible: supports several sampling rates (8000-96000 Hz), bit depths, and multichannel
(up to 48 channels)
Several implementations, including free and high quality ones.
Reaches transparency in most samples and for most users at around 150 kbps
Part of MPEG-4 specs
Anyone can create their own implementation (specifications and demo sources available)
Cons
Problem cases that trip up all transform codecs
Heavily patented
Increased complexity
AAC comes in different “flavors” (object types: AAC LC, AAC HE, AAC PS etc.).
Many (especially portable) players only support LC (at the moment) so you can have files
that are valid but your player won’t play them.
AC3 (Dolby Digital)
Cons
Supports at most 5.1-channel audio, limited to a maximum of 448 kbps for Dolby Digital
SPEEX
Pros
Speex is an Open Source/Free Software patent-free audio compression format
Speex is based on CELP and is designed to compress voice at bitrates ranging from 2 to 44
kbps
Speex has a number of features that aren’t in other codecs, such as intensity stereo
encoding, integration of multiple sampling rates in the same bitstream, and a VBR mode
Cons
Speex is mainly designed for only three different sampling rates: 8 kHz, 16 kHz, and 32 kHz
FLAC
Pros
FLAC is portable to many systems
Open source and freely licensed
The encoding of audio data incurs no loss of information.
Hardware support & Streaming support
Extremely fast decoding
Supports multi-channel and high resolution streams
Supports Replay Gain & cue-sheet (with some limitations)
Gaining wide use as successor to Shorten
Cons
Compresses less efficiently than other popular modern compressors (Monkey’s Audio,
OptimFROG)
Higher compression modes are slow, for little gain over the default setting.
MP3
Pros
Widespread acceptance, support in nearly all hardware audio players and devices
An ISO standard, part of MPEG specs
Fast decoding, lower complexity than AAC or Vorbis
Anyone can create their own implementation (Specs and demo sources available)
Relaxed licensing schedule
Cons
Lower performance/efficiency than modern codecs.
Problem cases that trip up all transform codecs.
Sometimes, the maximum bitrate (320 kbps) isn’t enough.
No multichannel implementations.
Unusable for high definition audio (sampling rates higher than 48 kHz).
OGG VORBIS
Pros
The (Ogg) Vorbis specification is in the public domain; it is free for commercial or
noncommercial use, under both the LGPL and BSD licenses
Easy to use high-level API (Application Programming Interface)
Good all-round performance (>48 kbps – a leading codec at 128 kbps)
Well written specs
Supported by most portable (Ogg) DAPs
Suitable for internet-streaming (via Icecast and other methods)
Fully gapless playback
High potential for further tuning
Structured to allow the design for a hybrid filterbank
Cons
Limited official development (third-party development is always encouraged)
Current implementations are more computationally intensive to decode than MP3