Video Compression: Dereje Teferi (PHD) Dereje - Teferi@Aau - Edu.Et
Introduction …
Typically, 30 frames are displayed on the screen every second.
Much of the information is repeated in consecutive frames.
For example, if a tree is displayed for one second, then 30 frames contain that tree.
This repetition can be exploited for compression: a frame can be defined in terms of previous frames.
A frame can then carry instructions such as "move this part of the tree to this place".
Introduction…
Frames can be compressed using only the information in that frame (intra-frame coding) or using information in other frames as well (inter-frame coding).
Intra-frame coding allows random-access operations such as fast forward and provides fault tolerance.
If part of a frame is lost, the next intra-frame and the frames after it can still be displayed, because those frames depend only on that intra-frame.
Introduction …
Every color can be represented as a combination of red, green, and blue.
Images can also be represented in this color space, called RGB.
However, RGB is not well suited to compression because it does not take human perception into account.
The YUV color space, in which Y alone gives the grayscale (luminance) image and U and V carry the color (chrominance) information, is more convenient for compression, because the human eye is more sensitive to changes in Y.
Luminance/chrominance color spaces of this kind are also used by the NTSC, PAL, and SECAM composite color TV standards.
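As a rough illustration of the RGB-to-YUV split described above, the sketch below converts a single pixel using the commonly quoted BT.601-style weights. The function name `rgb_to_yuv` and the sample pixels are illustrative assumptions, not part of the slides.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel (components in 0..255) to Y, U, V.

    Assumption: BT.601-style luma weights and analog U/V scaling.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b  # luminance (grayscale value)
    u = 0.492 * (b - y)                    # blue colour difference
    v = 0.877 * (r - y)                    # red colour difference
    return y, u, v


# A neutral gray pixel carries no colour information: U and V are ~0.
gray = rgb_to_yuv(128, 128, 128)
# A pure red pixel has low luminance and a strongly positive V.
red = rgb_to_yuv(255, 0, 0)
print(gray, red)
```

Because Y alone reproduces the grayscale image, a codec can keep Y at full resolution and store U and V more coarsely.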
Introduction …
The compression ratio is the ratio of the size of the original video to the size of the compressed video.
To achieve better compression ratios, pixels are predicted from other pixels.
In spatial prediction, the prediction of a pixel is obtained from pixels of the same image/frame.
In temporal prediction, the prediction of a pixel is obtained from a previously transmitted image.
Motion compensation establishes a correspondence between elements of nearby images in the video sequence.
The main application of motion compensation is to provide a useful prediction for a given image from a reference image.
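The compression ratio defined above is plain arithmetic. The sketch below works it through for one hypothetical frame; the frame dimensions, the 1.5 bytes per pixel (full-resolution luminance plus two quarter-resolution chrominance planes), and the compressed size are all assumptions chosen for illustration.

```python
def compression_ratio(original_bytes, compressed_bytes):
    """Return original size divided by compressed size."""
    if compressed_bytes <= 0:
        raise ValueError("compressed size must be positive")
    return original_bytes / compressed_bytes


# Hypothetical 352x240 frame stored at 1.5 bytes per pixel.
original_size = int(352 * 240 * 1.5)   # 126720 bytes
compressed_size = 18000                # assumed size after coding
ratio = compression_ratio(original_size, compressed_size)
print(f"{ratio:.2f}:1")
```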
MPEG General Information
MPEG = Moving Picture Experts Group
Goal: data compression to about 1.5 Mbps
MPEG defines video coding, audio coding, and system data streams with synchronization
Each image is divided into macro-blocks
Macro-block: 16x16 pixels for luminance; 8x8 pixels for each chrominance component
Macro-blocks are useful for motion estimation
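The division into macro-blocks can be sketched as a walk over the top-left corners of 16x16 tiles. The helper `macroblock_origins` and the 352x240 frame size are illustrative assumptions; a frame whose dimensions are not multiples of 16 would need padding, which this sketch does not handle.

```python
def macroblock_origins(width, height, size=16):
    """Yield the (left, top) corner of each macro-block in raster order.

    Assumes width and height are multiples of `size`.
    """
    for top in range(0, height, size):
        for left in range(0, width, size):
            yield left, top


# Hypothetical 352x240 luminance plane: 22 columns x 15 rows of blocks.
origins = list(macroblock_origins(352, 240))
print(len(origins), origins[0], origins[-1])
```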
Temporal Redundancy
• Temporal redundancy between pixels of adjacent frames of a video sequence is exploited for compression
• The pixel differences at the same spatial location in consecutive frames are typically very small
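A minimal sketch of how small frame-to-frame differences can be used: the encoder sends the difference (residual) between the current and the previous frame, and the decoder adds it back. The tiny 3x4 frames and the function names are made up for illustration.

```python
def frame_difference(previous, current):
    """Per-pixel residual: current minus previous."""
    return [[c - p for p, c in zip(prev_row, cur_row)]
            for prev_row, cur_row in zip(previous, current)]


def reconstruct(previous, residual):
    """Rebuild the current frame from the previous frame and the residual."""
    return [[p + d for p, d in zip(prev_row, res_row)]
            for prev_row, res_row in zip(previous, residual)]


previous_frame = [[50, 50, 50, 50],
                  [50, 80, 80, 50],
                  [50, 50, 50, 50]]
current_frame = [[50, 50, 50, 50],
                 [50, 80, 82, 50],   # only one pixel changed
                 [50, 50, 50, 50]]

residual = frame_difference(previous_frame, current_frame)
nonzero = sum(1 for row in residual for d in row if d != 0)
rebuilt = reconstruct(previous_frame, residual)
print(residual, nonzero)
```

A residual that is mostly zeros is much cheaper to encode than the full frame.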
Inter-frame Encoder
Conditional Replacement
MPEG Video Processing …
P-frames are coded with respect to a past I-frame
or P-frame, resulting in a smaller encoded frame
size than the I-frames.
I1 B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8 I2
Selecting I, P, or B Frames
Heuristics (rules …)
A scene change should generate an I-frame
Limit the number of B- and P-frames between I-frames
B-frames are computationally intensive
Frame type   Size    Compression ratio
I            18K     7:1
P            6K      20:1
B            2.5K    50:1
Intra-coded images
I-frames – points of random access in MPEG stream
No quantization table for all DCT coefficients, only quantization factor
MPEG Video P-Frames
Motion Estimation Method
Motion Computation for P Frames
Predictive search
Look for the match window within a given search window
Match window – a macro-block
Search window – arbitrary window size, depending on how far away we are willing to look
The displacement between the two match windows is expressed by a motion vector
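The search described above can be sketched as an exhaustive (full) search: every candidate position inside the search window is compared with the match window, and the best one defines the motion vector. This is only one possible search strategy; the function names, the tiny 8x8 frames, the 2x2 block size, and the sign convention (the vector points from the block in the current frame to its match in the reference frame) are assumptions made for illustration.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(abs(a - b)
               for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def get_block(frame, left, top, size):
    """Cut a size x size block out of a frame stored as a list of rows."""
    return [row[left:left + size] for row in frame[top:top + size]]


def find_motion_vector(reference, current, left, top, size, search_range):
    """Full search for the best match of a current-frame block in the reference."""
    height, width = len(reference), len(reference[0])
    target = get_block(current, left, top, size)
    best_vector = (0, 0)
    best_cost = sad(target, get_block(reference, left, top, size))
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            cand_left, cand_top = left + dx, top + dy
            # Skip candidates that fall outside the reference frame.
            if (cand_left < 0 or cand_top < 0 or
                    cand_left + size > width or cand_top + size > height):
                continue
            cost = sad(target, get_block(reference, cand_left, cand_top, size))
            if cost < best_cost:
                best_vector, best_cost = (dx, dy), cost
    return best_vector, best_cost


# Reference frame with distinct values; the current frame is the same
# content moved 2 pixels right and 1 pixel down (uncovered area set to 0).
reference = [[10 * r + c for c in range(8)] for r in range(8)]
current = [[reference[r - 1][c - 2] if r >= 1 and c >= 2 else 0
            for c in range(8)] for r in range(8)]

vector, cost = find_motion_vector(reference, current,
                                  left=3, top=3, size=2, search_range=3)
print(vector, cost)
```

A larger `search_range` finds larger displacements but costs more comparisons, which is the trade-off behind "how far away we are willing to look".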
Block-Based Motion Estimation
Matching Methods
SSD metric (Sum of Squared Differences):
$$\mathrm{SSD} = \sum_{i=0}^{N-1} (x_i - y_i)^2$$
SAD metric (Sum of Absolute Differences):
$$\mathrm{SAD} = \sum_{i=0}^{N-1} |x_i - y_i|$$
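The two matching metrics can be written directly from their definitions. In this sketch `x` and `y` are the pixel values of the two blocks being compared, flattened into sequences of equal length N; the sample values are made up.

```python
def ssd(x, y):
    """Sum of squared differences over two equal-length sequences."""
    return sum((xi - yi) ** 2 for xi, yi in zip(x, y))


def sad(x, y):
    """Sum of absolute differences over two equal-length sequences."""
    return sum(abs(xi - yi) for xi, yi in zip(x, y))


x = [10, 12, 15, 20]
y = [11, 12, 13, 24]
ssd_value = ssd(x, y)   # differences are -1, 0, 2, -4
sad_value = sad(x, y)
print(ssd_value, sad_value)
```

SSD penalizes large individual differences more heavily, while SAD avoids the multiplications.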
MPEG Video B Frames
Bi-directionally Predictive-coded frames
MPEG Video Decoding
Display Order
I1 B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8 I2
Decoding Order
I1 P1 B1 B2 P2 B3 B4 P3 B5 B6 I2 B7 B8
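The reordering above follows from the fact that a B-frame cannot be decoded until the reference frame that follows it in display order is available. A minimal sketch, assuming frames are labelled by strings starting with their type letter (the function name is hypothetical):

```python
def display_to_decoding_order(display_order):
    """Move each I/P reference frame ahead of the B-frames that precede it."""
    decoding_order = []
    pending_b = []  # B-frames waiting for their next reference frame
    for frame in display_order:
        if frame.startswith("B"):
            pending_b.append(frame)
        else:
            # Reference frame (I or P): emit it first, then the waiting B-frames.
            decoding_order.append(frame)
            decoding_order.extend(pending_b)
            pending_b = []
    # Any trailing B-frames are appended unchanged in this sketch.
    decoding_order.extend(pending_b)
    return decoding_order


display = "I1 B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8 I2".split()
decoding = display_to_decoding_order(display)
print(" ".join(decoding))
```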
MPEG Audio Encoding
Characteristics
Precision: 16 bits
Sampling frequency: 32 kHz, 44.1 kHz, 48 kHz
3 compression layers: Layer 1, Layer 2, Layer 3 (MP3)
Layer 3: 32-320 kbps, target 64 kbps
Layer 2: 32-384 kbps, target 128 kbps
Layer 1: 32-448 kbps, target 192 kbps
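To put these bit rates in perspective, the sketch below computes the uncompressed PCM bit rate for 16-bit samples and compares it with a compressed rate. The choice of 44.1 kHz, two channels, and a total compressed rate of 128 kbps is an assumption for illustration; the slides do not state how the listed targets map to channels.

```python
def pcm_bitrate_kbps(sample_rate_hz, bits_per_sample, channels):
    """Uncompressed PCM bit rate in kilobits per second."""
    return sample_rate_hz * bits_per_sample * channels / 1000


# Assumed source: 44.1 kHz, 16 bits per sample, stereo.
uncompressed_kbps = pcm_bitrate_kbps(44100, 16, 2)
compressed_kbps = 128  # assumed total rate of the compressed stream
audio_ratio = uncompressed_kbps / compressed_kbps
print(uncompressed_kbps, round(audio_ratio, 3))
```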
MPEG Audio Encoding Steps
MPEG Audio Psycho-acoustic Model
MPEG audio compresses by removing acoustically irrelevant parts of audio signals
It takes advantage of the human auditory system's inability to hear quantization noise under auditory masking
Auditory masking occurs whenever the presence of a strong audio signal makes a temporal or spectral neighborhood of weaker audio signals imperceptible
MPEG Audio Comments
A precision of 16 bits per sample is needed to get a good signal-to-noise ratio (SNR)
The noise in question is quantization noise from the digitization process
The masking effect means that we can raise the noise floor around a strong sound, because the noise will be masked
Raising the noise floor is the same as using fewer bits, and using fewer bits is the same as compression
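The link between bits and noise floor can be made concrete with the textbook approximation for uniform quantization of a full-scale sine wave, SNR ≈ 6.02·bits + 1.76 dB. The sketch below uses that approximation as an assumption; it is not a model of the MPEG psycho-acoustic analysis itself.

```python
import math


def quantization_snr_db(bits):
    """Approximate SNR of an ideal uniform quantizer for a full-scale sine."""
    return 20 * math.log10(2) * bits + 10 * math.log10(1.5)


snr_16 = quantization_snr_db(16)
snr_15 = quantization_snr_db(15)
# Dropping one bit raises the noise floor by about 6 dB.
per_bit_db = snr_16 - snr_15
print(round(snr_16, 2), round(per_bit_db, 2))
```

Where masking hides, say, 12 dB of extra noise, roughly two bits per sample could be dropped in that region under this approximation.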