MPEG-1 Audio: 18-899 Special Topics in Signal Processing

1. The document discusses MPEG-1 Audio, the first high-quality audio compression standard that could provide CD-quality two-channel audio at 256 kbits/s.
2. It describes the key aspects of MPEG-1 Audio, including psychoacoustics, subband coding, and its three layers (Layer I, II, and III), which provide increasing quality and compression ratios.
3. The document outlines the encoder and decoder block diagrams, how filterbanks and quantization are used, and the new features of Layer III, including MDCT, nonuniform quantization, and entropy coding.


Prof. Tsuhan Chen


tsuhan@ece.cmu.edu
18-899 Special Topics in Signal Processing
Multimedia Communications:
Coding, Systems, and Networking
Lecture 8
MPEG-1 Audio
18-899/Spring 1998/Chen
MPEG-1 Audio
Outline
Background
Psychoacoustics
Subband coding
Layer I and II
Layer III
Frame structure and packetization
MPEG-1 Audio
ISO/IEC 11172-3 (1988~1991)
First high quality audio compression standard
CD quality two-channel audio at 256 kbits/s
CD: 44.1 kHz × 16 bits × 2 channels = 1.411 Mbits/s
Signal              Frequency Band (Hz)   Sampling Rate (kHz)   Bits per Sample   Raw Bitrate (kbits/s)
Telephone Speech    300~3400              8                     8                 64
Wideband Speech     50~7000               16                    8                 128
Mediumband Audio    10~11000              24                    16                384
Wideband Audio      10~22000              48                    16                768
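The raw bitrates in the table follow from sampling rate × bits per sample (× number of channels). A minimal sketch that reproduces them and the CD figure; the function name `raw_bitrate_kbps` and the `TABLE` list are illustrative names, not part of the lecture:

```python
# Raw (uncompressed) PCM bitrate = sampling rate x bits per sample x channels.
def raw_bitrate_kbps(sampling_rate_khz, bits_per_sample, channels=1):
    return sampling_rate_khz * bits_per_sample * channels

# Rows of the table: (signal, sampling rate in kHz, bits per sample).
TABLE = [
    ("Telephone speech", 8, 8),
    ("Wideband speech", 16, 8),
    ("Mediumband audio", 24, 16),
    ("Wideband audio", 48, 16),
]

for name, fs_khz, bits in TABLE:
    print(f"{name:18s} {raw_bitrate_kbps(fs_khz, bits):5d} kbits/s")

# CD audio: 44.1 kHz, 16 bits, two channels.
cd_kbps = raw_bitrate_kbps(44.1, 16, channels=2)
print(f"CD stereo: {cd_kbps:.1f} kbits/s")
```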
Quality Demonstration
MPEG-1 Audio (Layer II)
Stereo 44.1 kHz at 64 kbits/s
Stereo 44.1 kHz at 128 kbits/s
Stereo 44.1 kHz at 192 kbits/s
Stereo 44.1 kHz at 256 kbits/s
Psychoacoustics
Threshold in quiet
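The slide shows the threshold-in-quiet curve only as a figure. As a rough illustration, the sketch below evaluates a widely used analytic fit to that curve (Terhardt's approximation). The formula is an assumption brought in for illustration; it is not taken from the lecture, and the standard's psychoacoustic models use tabulated values rather than this expression.

```python
import math

def threshold_in_quiet_db(f_hz):
    """Approximate threshold in quiet (dB SPL) at frequency f_hz.

    Assumption: Terhardt's analytic fit, not the tables in the standard.
    """
    f = f_hz / 1000.0  # frequency in kHz
    return (3.64 * f ** -0.8
            - 6.5 * math.exp(-0.6 * (f - 3.3) ** 2)
            + 1e-3 * f ** 4)

for f_hz in (100, 1000, 3300, 10000, 16000):
    print(f"{f_hz:6d} Hz: {threshold_in_quiet_db(f_hz):6.1f} dB SPL")
```

The curve is lowest around 3 to 4 kHz, where hearing is most sensitive, and rises steeply at both ends of the audible range; components below the curve need not be coded.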
Frequency Masking
Temporal Masking
Post-Masking: 50~200ms
Also Pre-Masking (much shorter)
Encoder Block Diagram
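[Figure: ISO/IEC 11172-3 encoder block diagram. PCM audio samples (32, 44.1, or 48 kHz) pass through mapping, quantizer and coding, and frame packing to produce the encoded bitstream; a psychoacoustic model controls the quantizer and coding stage, and ancillary data is added at frame packing.]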
Decoder Block Diagram
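[Figure: ISO/IEC 11172-3 decoder block diagram. The encoded bitstream passes through frame unpacking, reconstruction, and inverse mapping to produce PCM audio samples (32, 44.1, or 48 kHz); ancillary data is extracted at frame unpacking.]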
Mapping: Subband Coding
[Figure: M-channel subband coder. Analysis filterbank H_1(z), H_2(z), ..., H_M(z), each followed by downsampling by M and a quantizer Q; synthesis filterbank with upsampling by M followed by F_1(z), F_2(z), ..., F_M(z).]
Critical downsampling
Q should be based on signal-to-masking ratio (SMR)
The ear's critical bands are not uniform, but logarithmic
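One way to base the quantizers on the SMR is a greedy loop that keeps giving one more bit to the subband whose quantization noise is furthest above its masking threshold. The sketch below is a simplification under stated assumptions: each bit is taken to improve SNR by about 6.02 dB, the SMR values are made up, and the function name `allocate_bits` is hypothetical. The standard's allocation uses tabulated quantizer SNRs and also budgets for scalefactor bits.

```python
def allocate_bits(smr_db, bit_pool, max_bits=15, db_per_bit=6.02):
    """Greedy allocation: repeatedly give one bit to the subband whose
    quantization noise is furthest above its masking threshold."""
    bits = [0] * len(smr_db)
    for _ in range(bit_pool):
        # Noise-to-mask ratio: NMR = SMR - SNR, assuming SNR ~ 6.02 dB per bit.
        nmr = [smr - db_per_bit * b if b < max_bits else float("-inf")
               for smr, b in zip(smr_db, bits)]
        worst = max(range(len(nmr)), key=nmr.__getitem__)
        if nmr[worst] == float("-inf"):
            break  # every subband is already at max_bits
        bits[worst] += 1
    return bits

# Hypothetical SMR values (dB) for four subbands.
smr_db = [20.0, 12.0, 5.0, -3.0]
print(allocate_bits(smr_db, bit_pool=6))  # [3, 2, 1, 0]
```

Subbands whose signal already lies below the masking threshold (negative SMR) receive bits last, which is where the coding gain comes from.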
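Polyphase Filterbank
Alias cancellation and perfect reconstruction
[Figure: polyphase implementation. A chain of z^-1 delays, each branch downsampled by M, feeds the analysis polyphase matrix E(z); the synthesis polyphase matrix R(z) is followed by upsampling by M in each branch and a chain of z elements that recombines the branches.]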
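In the polyphase structure, perfect reconstruction holds when R(z)E(z) reduces to a (possibly delayed and scaled) identity. The sketch below shows the simplest special case: E is a constant orthonormal matrix (a DCT-II matrix, chosen here purely for illustration) and R is its transpose, so the filterbank degenerates into a block transform. This is not the filterbank used by the standard; M = 4 and all function names are assumptions.

```python
import math

M = 4  # number of subbands (hypothetical small value for illustration)

# Orthonormal DCT-II matrix used as a constant polyphase matrix E.
E = [[math.sqrt((1 if k == 0 else 2) / M) * math.cos(math.pi * (n + 0.5) * k / M)
      for n in range(M)] for k in range(M)]
# For an orthonormal E, the synthesis matrix R = E^T gives R E = I.
R = [[E[k][n] for k in range(M)] for n in range(M)]

def matvec(A, v):
    return [sum(a * x for a, x in zip(row, v)) for row in A]

def analysis(x):
    """Split x into blocks of M samples (delay chain + downsample by M),
    then apply E to each block: one sample per subband per block."""
    return [matvec(E, x[i:i + M]) for i in range(0, len(x), M)]

def synthesis(subbands):
    """Apply R to each block and interleave (upsample by M + recombine)."""
    out = []
    for block in subbands:
        out.extend(matvec(R, block))
    return out

x = [math.sin(0.3 * n) + 0.5 * math.cos(1.1 * n) for n in range(16)]
y = synthesis(analysis(x))
max_err = max(abs(a - b) for a, b in zip(x, y))
print(f"max reconstruction error: {max_err:.2e}")
```

The analysis stage produces exactly as many subband samples as there are input samples (critical downsampling), and without quantization the output matches the input up to rounding error.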
Layers
Increasing complexity, delay, and quality
Layer I
~384 kbits/s for perceptually lossless quality (4:1)
Layer II
~192 kbits/s for perceptually lossless quality (8:1)
Layer III
~128 kbits/s for perceptually lossless quality (12:1)
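The quoted ratios can be checked against the raw PCM rate. In the sketch below, the 4:1, 8:1, and 12:1 figures come out exactly for a 48 kHz, 16-bit stereo source; that choice of reference is an inference from the arithmetic, not something the slide states. For a 44.1 kHz CD source the ratios are slightly lower.

```python
def compression_ratio(fs_hz, bits, channels, coded_kbps):
    """Raw PCM bitrate divided by the coded bitrate."""
    raw_kbps = fs_hz * bits * channels / 1000.0
    return raw_kbps / coded_kbps

# Bitrates for perceptually lossless stereo, from the slide.
LAYER_KBPS = {"Layer I": 384, "Layer II": 192, "Layer III": 128}

for fs_hz in (44100, 48000):
    for layer, kbps in LAYER_KBPS.items():
        ratio = compression_ratio(fs_hz, 16, 2, kbps)
        print(f"{fs_hz} Hz, {layer:9s} {kbps:3d} kbits/s -> {ratio:4.1f}:1")
```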
Layer I and II Encoder
[Figure: Layer I and II encoder. Blocks: analysis filterbank (32 subbands), scaler & quantizer, mux; side chain: FFT, masking threshold generator, dynamic bit allocator, coder.]
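Block-Based Coding
[Figure: analysis filterbank output grouped into blocks of 12 samples per subband; Layer I codes one block of 12 samples, Layer II codes three consecutive blocks (12 + 12 + 12) together.]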
12 samples for Layer I, 36 samples for Layer II
Block companding: Each block normalized by scalefactor
For Layer II, up to 3 scalefactors, with 2-bit scalefactor select
Each block receives one bit allocation
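Block companding can be sketched as follows: find a scalefactor for the block, normalize the 12 samples by it, and quantize the normalized values uniformly with the number of bits allocated to that block. This is a simplified illustration. The scalefactor here is just the block peak (the standard picks it from a fixed table), and the function names and sample values are hypothetical.

```python
def encode_block(samples, n_bits):
    """Normalize a block by its scalefactor, then quantize uniformly.

    Assumes n_bits >= 2; uses 2**n_bits - 1 levels centered on zero.
    """
    scalefactor = max(abs(s) for s in samples) or 1.0  # simplification: block peak
    half = ((1 << n_bits) - 1) // 2                    # largest code magnitude
    codes = [round(s / scalefactor * half) for s in samples]
    return scalefactor, codes

def decode_block(scalefactor, codes, n_bits):
    """Rescale the integer codes back to sample values."""
    half = ((1 << n_bits) - 1) // 2
    return [c / half * scalefactor for c in codes]

# One Layer I block: 12 consecutive samples from a single subband (made-up values).
block = [0.02 * ((-1) ** n) * (n + 1) for n in range(12)]
sf, codes = encode_block(block, n_bits=4)
recon = decode_block(sf, codes, n_bits=4)
max_err = max(abs(a - b) for a, b in zip(block, recon))
print(sf, codes, f"max error {max_err:.4f}")
```

Because the samples are normalized first, the quantization error scales with the block's own level: quiet blocks get proportionally small errors from the same quantizer.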
Layer III Encoder
[Figure: Layer III encoder. Blocks: analysis filterbank, MDCT (6 or 18, with overlap), scaler & quantizer, Huffman coding, mux; side chain: FFT, masking threshold generator, coding.]
New Features in Layer III
Modified DCT (MDCT)
DCT with overlap
Long/short window switching
Short for better temporal resolution (to prevent pre-echoes)
Long for better frequency resolution
Nonuniform quantization
Entropy coding
Run-length and Huffman coding
Bit reservoir (buffer)
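"DCT with overlap" means each MDCT block spans 2N input samples but yields only N coefficients; the time-domain aliasing this introduces cancels when neighboring blocks are overlap-added. The sketch below demonstrates that cancellation with a sine window. N = 6 mirrors the smaller block size on the encoder slide; the sine window, the scaling convention, and the function names are assumptions, and the real Layer III transform operates on subband samples with window switching.

```python
import math

N = 6  # coefficients per block; each block spans 2N samples with 50% overlap

# Sine window: satisfies w[n]^2 + w[n+N]^2 = 1, needed for alias cancellation.
window = [math.sin(math.pi * (n + 0.5) / (2 * N)) for n in range(2 * N)]

def _basis(n, k):
    return math.cos(math.pi / N * (n + 0.5 + N / 2) * (k + 0.5))

def mdct(frame):
    """2N input samples -> N coefficients (analysis window applied here)."""
    return [sum(window[n] * frame[n] * _basis(n, k) for n in range(2 * N))
            for k in range(N)]

def imdct(coeffs):
    """N coefficients -> 2N windowed, time-aliased output samples."""
    return [window[n] * (2.0 / N) * sum(coeffs[k] * _basis(n, k) for k in range(N))
            for n in range(2 * N)]

x = [math.sin(0.2 * n) + 0.3 * math.cos(0.9 * n) for n in range(8 * N)]
y = [0.0] * len(x)
for start in range(0, len(x) - 2 * N + 1, N):  # hop size N
    out = imdct(mdct(x[start:start + 2 * N]))
    for n in range(2 * N):
        y[start + n] += out[n]                  # overlap-add

# The first and last N samples are covered by only one block, so compare the interior.
max_err = max(abs(a - b) for a, b in zip(x[N:-N], y[N:-N]))
print(f"max interior error: {max_err:.2e}")
```

A shorter N gives finer time resolution, which is why short windows limit pre-echo, while a longer N gives finer frequency resolution.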
Frame Structure
[Figure: frame layout: Header Info | Side Info | Subband Samples | Aux Data]
Header info: sync bits, system info, CRC (cyclic redundancy code)
Side info: bit allocation, scalefactor (and scalefactor select for Layer II and III)
Subband samples: 32 × 12 for Layer I, 32 × 36 for Layer II and III
Packetization: 4-byte header, 184-byte payload
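The frame sizes and packet sizes above fix how long a frame lasts and roughly how many packets it occupies. A back-of-the-envelope sketch, assuming a 48 kHz source and the per-layer bitrates quoted earlier in the lecture; the packet count is only a lower bound for a frame carried on its own, since real streams do not need to align frames to packet boundaries. The function name `frame_stats` is hypothetical.

```python
import math

# Samples per frame (per channel): 32 subbands x 12 or 36 samples, from the slide.
SAMPLES_PER_FRAME = {"Layer I": 32 * 12, "Layer II": 32 * 36, "Layer III": 32 * 36}
PAYLOAD_BYTES = 184  # payload per packet, from the slide

def frame_stats(layer, bitrate_bps, fs_hz):
    """Return (frame duration in ms, average frame size in bytes, packets needed)."""
    samples = SAMPLES_PER_FRAME[layer]
    duration_ms = 1000.0 * samples / fs_hz
    frame_bytes = samples * bitrate_bps / (8 * fs_hz)  # average; may be fractional
    packets = math.ceil(frame_bytes / PAYLOAD_BYTES)
    return duration_ms, frame_bytes, packets

for layer, bitrate in (("Layer I", 384000), ("Layer II", 192000), ("Layer III", 128000)):
    ms, nbytes, packets = frame_stats(layer, bitrate, fs_hz=48000)
    print(f"{layer:9s} {ms:5.1f} ms  {nbytes:6.1f} bytes  >= {packets} packets")
```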
Stereo Redundancy Coding
Four modes: mono, stereo, dual channel (two separate channels), joint stereo
In joint stereo mode
Human stereo perception > 2 kHz is based on envelope
Intensity stereo coding > 2 kHz
Encode (L + R)
Assign independent left- and right-channel scalefactors
Layer III supports (L+R) and (L-R) coding
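Sum/difference coding replaces the left and right channels with their sum and difference. When the channels are similar, the difference signal carries little energy and costs few bits. A minimal sketch, assuming an orthonormal 1/sqrt(2) scaling so that the same operation inverts the transform; the sample values and function names are made up for illustration.

```python
import math

SQRT2 = math.sqrt(2.0)

def ms_encode(left, right):
    """Left/right -> mid (sum) and side (difference) signals."""
    mid = [(l + r) / SQRT2 for l, r in zip(left, right)]
    side = [(l - r) / SQRT2 for l, r in zip(left, right)]
    return mid, side

def ms_decode(mid, side):
    """Mid/side -> left/right; exact inverse of ms_encode."""
    left = [(m + s) / SQRT2 for m, s in zip(mid, side)]
    right = [(m - s) / SQRT2 for m, s in zip(mid, side)]
    return left, right

# Two highly correlated channels (hypothetical values).
left = [0.50, 0.40, -0.30, 0.10]
right = [0.48, 0.41, -0.28, 0.12]

mid, side = ms_encode(left, right)
left2, right2 = ms_decode(mid, side)

mid_energy = sum(m * m for m in mid)
side_energy = sum(s * s for s in side)
print(f"mid energy {mid_energy:.4f}, side energy {side_energy:.6f}")
```

Intensity stereo goes further: above about 2 kHz only the sum is coded, and the left and right scalefactors restore the per-channel envelope.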
