MP3 Format
Frequency masking
Temporal masking
MPEG audio divides the audio signal into frequency sub-bands that approximate
critical bands. Each sub-band is then quantized according to the audibility of
quantization noise within that band
MPEG Audio Bit Allocation
This process determines the number of code bits allocated to
each sub-band based on information from the psychoacoustic
model
Algorithm:
1. Compute the mask-to-noise ratio in dB: MNR = SNR - SMR
(SMR is the signal-to-mask ratio from the psychoacoustic model)
The standard provides tables that give estimates of the SNR resulting
from quantizing to a given number of quantizer levels
2. Get MNR for each sub-band
3. Search for sub-band with the lowest MNR
4. Allocate code bits to this sub-band.
Once a sub-band has been allocated more code bits, look up the new
SNR estimate for it and repeat from step 1 until no more code bits
can be allocated
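The greedy loop above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the allocation procedure from the standard: the function names `allocate_bits` and `approx_snr_db` are hypothetical, the SNR tables of the standard are replaced by a rough 6.02 dB-per-bit estimate, and each allocation step is assumed to cost exactly one bit from the budget (scale factors and per-frame sample counts are ignored).

```python
def approx_snr_db(bits):
    # ASSUMPTION: stand-in for the standard's SNR tables,
    # using roughly 6.02 dB of SNR per quantizer bit.
    return 6.02 * bits


def allocate_bits(smr_db, total_bits, max_bits_per_band=15):
    """Greedy bit allocation over sub-bands.

    smr_db      -- signal-to-mask ratio per sub-band (from a psychoacoustic model)
    total_bits  -- simplified bit budget (one unit per allocation step)
    """
    bits = [0] * len(smr_db)
    remaining = total_bits
    while remaining > 0:
        # Steps 1-2: MNR = SNR - SMR for every sub-band that can still grow.
        candidates = [
            (approx_snr_db(bits[i]) - smr_db[i], i)
            for i in range(len(smr_db))
            if bits[i] < max_bits_per_band
        ]
        if not candidates:
            break
        # Step 3: find the sub-band with the lowest MNR.
        _, worst = min(candidates)
        # Step 4: give it one more bit, then loop to recompute the MNRs.
        bits[worst] += 1
        remaining -= 1
    return bits


if __name__ == "__main__":
    # Three sub-bands: strongly audible, mildly audible, fully masked.
    print(allocate_bits([20.0, 5.0, -10.0], total_bits=5))
```

In this sketch the fully masked sub-band (negative SMR) receives no bits, because its MNR is already the highest.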
Audio Quality
Bitrate
If the bit rate is too low, compression artifacts become audible
Ringing
Pre-echo – an artifact of a sound is heard before the sound itself
occurs. It is most noticeable in impulsive sounds from percussion
instruments such as cymbals
Occurs in transform-based audio compression algorithms, where
quantization noise spreads over the whole transform block
Source: http://wiki.hydrogenaudio.org/images/e/ee/Mp3filestructure.jpg
MPEG Audio Comments
A precision of 16 bits per sample is needed to get
a good SNR
The noise we get is quantization noise from
the digitization process
Each added bit improves the SNR by about 6 dB
Masking effect means that we can raise the
noise floor around a strong sound because the
noise will be masked away
Raising the noise floor is the same as using fewer bits,
and using fewer bits is the same as compression
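The 6 dB-per-bit rule can be checked numerically. The sketch below compares the textbook estimate for a full-scale sine wave, SNR ≈ 6.02·N + 1.76 dB, with the SNR measured after uniformly quantizing a sine to N bits. The function names and the test-tone frequency are illustrative choices, not part of any standard.

```python
import math


def theoretical_snr_db(bits):
    # Textbook estimate for a full-scale sine and a uniform quantizer.
    return 6.02 * bits + 1.76


def measured_snr_db(bits, num_samples=50_000):
    """Quantize a full-scale sine to `bits` bits and measure the SNR."""
    step = 2.0 / (2 ** bits)  # quantizer step for the range [-1, 1]
    signal_power = 0.0
    noise_power = 0.0
    for n in range(num_samples):
        # Arbitrary test frequency (cycles per sample) so that many
        # different phases of the sine are visited.
        x = math.sin(2 * math.pi * 0.1234567 * n)
        q = round(x / step) * step  # uniform quantization
        signal_power += x * x
        noise_power += (x - q) ** 2
    return 10 * math.log10(signal_power / noise_power)


if __name__ == "__main__":
    for bits in (8, 12, 16):
        print(bits, round(theoretical_snr_db(bits), 2), round(measured_snr_db(bits), 2))
```

Reading the result the other way round gives the compression argument from the slide: if masking lets the noise floor rise by about 6 dB, one bit per sample can be dropped.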
Successor of MP3
Advanced Audio Coding (AAC) – now part
of MPEG-4 Audio
Supports up to 48 full-bandwidth audio
channels
Default audio format for iPhone, iPad,
Nintendo, PlayStation, Nokia, Android,
BlackBerry
Introduced in 1997 as MPEG-2 Part 7
Updated in 1999 and included in MPEG-4
AAC’s Improvements over MP3
More sampling frequencies (8 to 96 kHz)
Arbitrary bit rates and variable frame
length
Higher efficiency and simpler filterbank
Uses pure MDCT (modified discrete cosine
transform) instead of MP3's hybrid filterbank
MDCT is also used in Windows Media Audio
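To make the MDCT concrete, here is a minimal sketch of a direct (slow, O(N²)) MDCT and inverse MDCT with a sine window and 50% overlap-add. It is not AAC's filterbank: the block size, the sine window, and all function names are assumptions chosen for illustration. It only shows the key property that overlapping blocks of 2N samples produce N coefficients each, yet the signal is reconstructed exactly after overlap-add.

```python
import math


def sine_window(N):
    # Sine window of length 2N; satisfies w[n]^2 + w[n+N]^2 = 1.
    return [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]


def mdct(block):
    """Direct MDCT: 2N input samples -> N coefficients."""
    N = len(block) // 2
    return [
        sum(block[n] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for n in range(2 * N))
        for k in range(N)
    ]


def imdct(coeffs):
    """Inverse MDCT: N coefficients -> 2N time-aliased samples.

    The 2/N scaling assumes the window is applied both before the
    MDCT and after the inverse MDCT.
    """
    N = len(coeffs)
    return [
        (2.0 / N) * sum(
            coeffs[k] * math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))
            for k in range(N))
        for n in range(2 * N)
    ]


def analyze_and_resynthesize(signal, N):
    """Window, transform, inverse-transform, and overlap-add with hop size N."""
    w = sine_window(N)
    out = [0.0] * len(signal)
    for start in range(0, len(signal) - 2 * N + 1, N):
        block = [signal[start + n] * w[n] for n in range(2 * N)]
        rebuilt = imdct(mdct(block))
        for n in range(2 * N):
            out[start + n] += rebuilt[n] * w[n]
    return out


if __name__ == "__main__":
    N = 8
    signal = [math.sin(0.3 * n) for n in range(64)]
    out = analyze_and_resynthesize(signal, N)
    # The first and last N samples are covered by only one block.
    print(max(abs(out[n] - signal[n]) for n in range(N, len(signal) - N)))
```

In a real codec the coefficients would be quantized between `mdct` and `imdct`; the sketch skips that step, so the only error left is floating-point rounding.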
MPEG-4 Audio
Variety of applications
General audio signals
Speech signals
Synthetic audio
Synthesized speech (structured audio)
MPEG-4 Part 3 (Audio)
Includes a variety of audio coding technologies
Lossy speech coding (e.g., CELP)
CELP – code-excited linear prediction – speech
coding
General audio coding (AAC)
Lossless audio coding
Text-to-Speech interface
Structured Audio (e.g., MIDI)
MPEG-4 Part 14
Commonly called MP4, with file extension .mp4
Multimedia container format
Stores digital video and audio streams and
allows streaming over the Internet
Container or wrapper format
Meta-file format whose specification describes how
different data elements and metadata coexist
in a computer file
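As an illustration of what "container" means in practice: an MP4 file is a sequence of boxes, each starting with a 4-byte big-endian size and a 4-character type code. The sketch below lists the top-level boxes in a byte string. The function name `list_top_level_boxes` is hypothetical, and the sketch does not descend into nested boxes or interpret their payloads.

```python
import struct


def list_top_level_boxes(data):
    """Return (type, size) for each top-level box in an MP4 byte string."""
    boxes = []
    offset = 0
    while offset + 8 <= len(data):
        size, box_type = struct.unpack_from(">I4s", data, offset)
        header = 8
        if size == 1:
            # A size of 1 means a 64-bit size follows the type field.
            if offset + 16 > len(data):
                break
            (size,) = struct.unpack_from(">Q", data, offset + 8)
            header = 16
        elif size == 0:
            # A size of 0 means the box runs to the end of the data.
            size = len(data) - offset
        if size < header:
            break  # malformed box; stop rather than loop forever
        boxes.append((box_type.decode("latin-1"), size))
        offset += size
    return boxes


if __name__ == "__main__":
    # Tiny synthetic example: a 16-byte 'ftyp' box followed by an empty 'free' box.
    sample = (struct.pack(">I4s", 16, b"ftyp") + b"isom" + b"\x00" * 4
              + struct.pack(">I4s", 8, b"free"))
    print(list_top_level_boxes(sample))
```

The audio and video streams and the metadata that describes them live in different boxes of this kind, which is how the different data elements share one file.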
MPEG-4 Audio
Bit rates from 2 to 64 kbps
Scalable for variable rates
MPEG-4 defines a set of coders
Parametric Coding Techniques: low bit rates of 2-6 kbps,
8 kHz sampling frequency
Code Excited Linear Prediction: medium bit rates of
6-24 kbps, 8 and 16 kHz sampling rates
Time Frequency Techniques: high-quality audio at 16
kbps and higher bit rates, sampling rate > 7 kHz
Conclusion
MPEG Audio is an integral part of the
MPEG standard to be considered together
with video
MPEG-4 Audio represents a major
extension in terms of capabilities over
MPEG-1 Audio