Video Compression: Dereje Teferi (PhD), dereje.teferi@aau.edu.et

Video is a sequence of images called frames displayed in order. Consecutive frames contain redundant information that can be compressed. MPEG video compression uses intra-frame coding for individual frames and inter-frame coding to remove redundancy between frames using motion estimation. It divides frames into I, P, and B frames and exploits the fact that human vision is less sensitive to small changes over time to achieve high compression ratios with some loss of visual quality.


Video Compression

Dereje Teferi (PhD)


dereje.teferi@aau.edu.et
Introduction
• Video is a sequence of images displayed in order.
• Each of these images is called a frame.
• Viewers cannot notice small changes between frames, such as a slight difference in color, so video compression standards do not encode every detail in the video; some detail is discarded. This is called lossy compression.
• Lossy compression makes very high compression ratios possible.

Intro …
• Typically, 30 frames are displayed on the screen every second.
• Much of the information is repeated across consecutive frames.
• For example, if a tree is displayed for one second, then 30 frames contain that tree.
• Compression can exploit this repetition by defining frames in terms of previous frames.
• A frame can then carry instructions such as "move this part of the tree to this place".
Introduction…
• When it comes to compression:
  • frames can be compressed using only the information in that frame (intra-frame), or
  • using information in other frames as well (inter-frame).
• Intra-frame coding allows random access operations like fast forward and provides fault tolerance.
• If part of a frame is lost, the next intra-frame and the frames after it can still be displayed because they depend only on that intra-frame.
Intro …
• Every color can be represented as a combination of red, green and blue.
• Images can also be represented in this color space.
• However, this color space, called RGB, is not well suited to compression because it does not take human perception into account.
• The YUV color space, in which Y alone gives the grayscale image, is more convenient for compression because the human eye is more sensitive to changes in Y.
• YUV and closely related luminance/chrominance spaces are also used by the NTSC, PAL and SECAM composite color TV standards.
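The separation of grayscale (Y) from color (U, V) can be sketched in a few lines of Python. The slides do not say which YUV variant is meant, so the BT.601 luma weights used here are an assumption, and `rgb_to_yuv` is a hypothetical helper name.

```python
def rgb_to_yuv(r, g, b):
    """Convert one RGB pixel (components in 0..255) to analog-style YUV.

    Coefficients are the BT.601 luma weights; this choice is an assumption,
    since the slides do not say which YUV variant they mean.
    """
    y = 0.299 * r + 0.587 * g + 0.114 * b   # luminance (grayscale value)
    u = 0.492 * (b - y)                     # blue color difference
    v = 0.877 * (r - y)                     # red color difference
    return y, u, v


# A neutral gray pixel carries all of its information in Y.
y, u, v = rgb_to_yuv(128, 128, 128)
print(round(y, 3), round(u, 3), round(v, 3))
```

For a gray pixel both color-difference components come out at essentially zero, which is why Y alone is enough to show the grayscale image.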
Introduction …
• Compression ratio is the ratio of the size of the original video to the size of the compressed video.
• To get better compression ratios, pixels are predicted from other pixels.
  • In spatial prediction, a pixel is predicted from pixels of the same image/frame.
  • In temporal prediction, a pixel is predicted from a previously transmitted image.
• Motion compensation establishes a correspondence between elements of nearby images in the video sequence.
• The main application of motion compensation is providing a useful prediction for a given image from a reference image.
MPEG General Information
• MPEG = Moving Picture Experts Group
• Goal: compress data to a bit rate of about 1.5 Mbps
• MPEG defines video coding, audio coding and system data streams with synchronization
• Each image is divided into macro-blocks
• Macro-block: 16x16 pixels for luminance; 8x8 pixels for each chrominance component
• Macro-blocks are useful for motion estimation
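As a rough illustration of the macro-block layout, the sketch below counts the 16x16 luminance macro-blocks needed to cover a frame and lists their top-left corners. The 352x240 frame size and the padding behaviour are assumptions made for the example, and the function names are hypothetical. An 8x8 chrominance block covering the same 16x16 area implies that each chrominance component is subsampled by two in both directions.

```python
import math

MB_SIZE = 16  # luminance macro-block edge, from the slide


def macroblock_grid(width, height, mb_size=MB_SIZE):
    """Return (columns, rows) of macro-blocks needed to cover a frame.

    Frames whose dimensions are not multiples of mb_size are assumed to be
    padded up to the next multiple (an assumption of this sketch).
    """
    cols = math.ceil(width / mb_size)
    rows = math.ceil(height / mb_size)
    return cols, rows


def macroblock_origins(width, height, mb_size=MB_SIZE):
    """Yield the top-left (x, y) luminance coordinate of every macro-block."""
    cols, rows = macroblock_grid(width, height, mb_size)
    for row in range(rows):
        for col in range(cols):
            yield col * mb_size, row * mb_size


cols, rows = macroblock_grid(352, 240)  # hypothetical frame size
print(cols, rows, cols * rows)
```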

Temporal Redundancy
• Temporal redundancy between pixels of adjacent frames of a video sequence is used for compression.
• The pixel differences at the same spatial location between consecutive frames are typically very small.

Inter-frame Encoder
• Conditional Replacement
• Inter-frame coders produce differential signals. This difference is
  • effectively zero in the non-changing parts of the picture, and
  • non-zero only in the moving areas
• Thus, we can transmit the differential signal values only for the moving areas of the picture
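A minimal sketch of this idea, assuming frames are small 2-D lists of pixel values and that "moving areas" are detected per block with a simple threshold. The block size, the threshold and both function names are illustrative assumptions, not part of any MPEG standard.

```python
def changed_blocks(prev, curr, block=2, threshold=0):
    """Return {(row, col): difference_block} for blocks that changed.

    prev and curr are equally sized 2-D lists of pixel values. A block is
    reported only if some absolute pixel difference exceeds threshold.
    """
    height, width = len(curr), len(curr[0])
    updates = {}
    for by in range(0, height, block):
        for bx in range(0, width, block):
            diff = [
                [curr[y][x] - prev[y][x]
                 for x in range(bx, min(bx + block, width))]
                for y in range(by, min(by + block, height))
            ]
            if any(abs(d) > threshold for row in diff for d in row):
                updates[(by, bx)] = diff
    return updates


def apply_updates(prev, updates):
    """Rebuild the current frame from the previous frame plus the updates."""
    frame = [row[:] for row in prev]
    for (by, bx), diff in updates.items():
        for dy, row in enumerate(diff):
            for dx, d in enumerate(row):
                frame[by + dy][bx + dx] += d
    return frame


prev = [[10, 10, 10, 10],
        [10, 10, 10, 10],
        [10, 10, 10, 10],
        [10, 10, 10, 10]]
curr = [row[:] for row in prev]
curr[3][3] = 50  # only the bottom-right corner "moves"

updates = changed_blocks(prev, curr)
print(updates)
```

Only one of the four blocks is reported, and `apply_updates(prev, updates)` reproduces the current frame from it.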
MPEG Video Processing
• All MPEG frames are encoded in one of three different ways: Intra-coded (I-frames), Predictive-coded (P-frames), or Bidirectionally-predictive-coded (B-frames).
• I-frames are encoded as discrete frames, independent of adjacent frames.
• Thus, they provide randomly accessible points within the video stream.
• Because they are coded without reference to other frames, I-frames have the worst compression ratio of the three frame types.

MPEG Video Processing …
• P-frames are coded with respect to a past I-frame or P-frame, resulting in a smaller encoded frame size than the I-frames.
• B-frames require a preceding and a future frame, each of which may be either an I-frame or a P-frame, in order to be decoded.
• B-frames offer the highest degree of compression.


MPEG Video Processing …
• Intra frames (compressed using JPEG)
  • typically about 12 frames between I frames
• Predictive frames
  • encoded from the previous I or P reference frame
• Bi-directional frames
  • encoded from previous and future I or P frames

I1 B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8 I2

Selecting I, P, or B Frames
• Heuristics (rules of thumb)
  • a change of scene should generate an I frame
  • limit the number of B and P frames between I frames
  • B frames are computationally intensive

Type   Size    Compression
I      18K     7:1
P      6K      20:1
B      2.5K    50:1
Avg    4.8K    27:1
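The average row can be roughly reproduced from the per-type sizes. The sketch below assumes a group of 12 frames with the pattern I B B P B B P B B P B B, as in the slide sequence I1 B1 B2 P1 ... B8, and infers the uncompressed frame size from the I row (18K at 7:1). Both assumptions are mine rather than stated on the slide.

```python
# Encoded sizes per frame type, taken from the table above (units as given).
SIZE = {"I": 18.0, "P": 6.0, "B": 2.5}

# Assumption: one group of 12 frames as in the slide sequence
# I1 B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8 (the next I frame starts a new group).
GOP = "IBBPBBPBBPBB"

# Assumption: uncompressed frame size inferred from the I row (18K at 7:1).
ORIGINAL_SIZE = 18.0 * 7


def average_frame_size(gop, size=SIZE):
    """Mean encoded size per frame for a frame-type pattern."""
    return sum(size[t] for t in gop) / len(gop)


avg = average_frame_size(GOP)
ratio = ORIGINAL_SIZE / avg
print(f"average size: {avg:.2f}K, compression: {ratio:.1f}:1")
```

Under these assumptions the average comes to about 4.7K per frame and about 27:1 overall, close to the Avg row of the table.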


MPEG Video I-Frames (JPEG)

• Intra-coded images
• I-frames are points of random access in the MPEG stream
• I-frames use 8x8 blocks defined within a macro-block
• No quantization table for all DCT coefficients, only a quantization factor
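To make the 8x8 DCT and the single quantization factor concrete, here is a small pure-Python sketch. It uses the textbook orthonormal DCT-II and divides every coefficient by one factor, as the slide describes; the factor value 16 and the function names are hypothetical, and a real encoder would use a fast transform rather than this direct form.

```python
import math

N = 8  # block edge used by I-frames, from the slide


def dct_2d(block):
    """Orthonormal 2-D DCT-II of an N x N block (direct O(N^4) form)."""
    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0

    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            total = 0.0
            for x in range(N):
                for y in range(N):
                    total += (block[x][y]
                              * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                              * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = (2.0 / N) * c(u) * c(v) * total
    return out


def quantize(coeffs, factor):
    """Divide every coefficient by one factor and round to an integer."""
    return [[round(value / factor) for value in row] for row in coeffs]


flat = [[100] * N for _ in range(N)]        # a uniform gray block
levels = quantize(dct_2d(flat), factor=16)  # factor is a hypothetical value
print(levels[0][0], sum(abs(v) for row in levels for v in row))
```

A uniform block ends up with a single non-zero value (the DC coefficient), which is what makes smooth image areas cheap to encode.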

MPEG Video P-Frames
Motion Estimation Method

• Predictive-coded frames require information from the previous I-frame and/or previous P-frame for encoding/decoding
• To exploit temporal redundancy, we search the last P or I frame for the block that is most similar to the block under consideration

Motion Computation for P Frames
• Predictive search
  • Look for the match window within a given search window
  • Match window: the macro-block
  • Search window: arbitrary size, depending on how far away we are willing to look
• The displacement between the two match windows is expressed by a motion vector

Block-Based Motion Estimation

16x16 – Macro block


Full search algorithm
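A full search tests every candidate position inside the search window and keeps the one with the smallest matching error. The sketch below is one possible reading of that idea, using the sum of absolute differences as the error and a 2x2 block instead of a 16x16 macro-block to keep the toy data small. The function names, the `radius` parameter and the data are all illustrative assumptions.

```python
def sad_at(ref, block, top, left):
    """Sum of absolute differences between block and ref at (top, left)."""
    return sum(
        abs(block[y][x] - ref[top + y][left + x])
        for y in range(len(block))
        for x in range(len(block[0]))
    )


def full_search(ref, block, top, left, radius):
    """Test every displacement within +/- radius and keep the best one.

    (top, left) is the block's position in the current frame. Returns
    ((dy, dx), error), where (dy, dx) is the motion vector into ref.
    """
    bh, bw = len(block), len(block[0])
    best_vector, best_error = None, None
    for dy in range(-radius, radius + 1):
        for dx in range(-radius, radius + 1):
            y, x = top + dy, left + dx
            # Skip candidates that fall outside the reference frame.
            if y < 0 or x < 0 or y + bh > len(ref) or x + bw > len(ref[0]):
                continue
            error = sad_at(ref, block, y, x)
            if best_error is None or error < best_error:
                best_vector, best_error = (dy, dx), error
    return best_vector, best_error


# Toy data: a bright 2x2 patch at row 1, column 1 of the reference frame
# appears at row 2, column 3 of the current frame.
ref = [[0] * 6 for _ in range(6)]
for y in (1, 2):
    for x in (1, 2):
        ref[y][x] = 200
block = [[200, 200], [200, 200]]

vector, error = full_search(ref, block, top=2, left=3, radius=2)
print(vector, error)
```

The cost grows with the square of the search radius, which is one reason practical encoders often replace the exhaustive scan with faster search patterns.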

Matching Methods
N 1
 SSD metric
(Sum of Squared Difference)
SSD   ( xi  yi ) 2
i 0

N 1
 SAD metric SAD   | xi  yi |
(Sum of Absolute Difference)
i 0

 Minimum error represents best match


 must be below a specified threshold
 error and perceptual similarity not always
correlated
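The two metrics can rank the same candidates differently, because SSD penalizes a single large error more heavily than SAD does. The following sketch illustrates this on flattened four-pixel blocks; the data, the threshold values and the `best_match` helper are hypothetical.

```python
def ssd(x, y):
    """Sum of squared differences between two equal-length pixel sequences."""
    return sum((a - b) ** 2 for a, b in zip(x, y))


def sad(x, y):
    """Sum of absolute differences between two equal-length pixel sequences."""
    return sum(abs(a - b) for a, b in zip(x, y))


def best_match(target, candidates, metric, threshold):
    """Index of the lowest-error candidate, or None if none is good enough."""
    errors = [metric(target, c) for c in candidates]
    best = min(range(len(errors)), key=errors.__getitem__)
    return best if errors[best] < threshold else None


target = [10, 10, 10, 10]
candidates = [
    [11, 11, 11, 11],  # small error spread over every pixel
    [10, 10, 10, 13],  # one larger error in a single pixel
]
print(sad(target, candidates[0]), sad(target, candidates[1]))
print(ssd(target, candidates[0]), ssd(target, candidates[1]))
```

Here SAD prefers the second candidate while SSD prefers the first, and a strict enough threshold rejects both, in which case an encoder could fall back to coding the block without prediction.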

MPEG Video B Frames
Bi-directionally Predictive-coded frames

MPEG Video Decoding
Display Order

I1 B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8 I2
Decoding Order

I1 P1 B1 B2 P2 B3 B4 P3 B5 B6 I2 B7 B8
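The reordering follows from the rule that a B frame can only be decoded once both of its reference frames are available. A minimal sketch, assuming frame labels start with their type letter and that each run of B frames depends on the next I or P frame in display order (`decoding_order` is a hypothetical helper name):

```python
def decoding_order(display):
    """Reorder frame labels so every B frame follows both of its references.

    Frame type is read from the first letter of each label ("I", "P", "B").
    """
    order, pending_b = [], []
    for frame in display:
        if frame[0] == "B":
            pending_b.append(frame)   # wait for the future reference
        else:
            order.append(frame)       # I or P: decode the reference first
            order.extend(pending_b)   # then the B frames that needed it
            pending_b = []
    return order + pending_b          # trailing B frames, if any


display = "I1 B1 B2 P1 B3 B4 P2 B5 B6 P3 B7 B8 I2".split()
print(" ".join(decoding_order(display)))
```

Applied to the display order above, this reproduces the decoding order shown on the slide.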
MPEG Audio Encoding
• Characteristics
  • Precision: 16 bits
  • Sampling frequency: 32 kHz, 44.1 kHz, 48 kHz
  • 3 compression layers: Layer 1, Layer 2, Layer 3 (MP3)
    • Layer 3: 32-320 kbps, target 64 kbps
    • Layer 2: 32-384 kbps, target 128 kbps
    • Layer 1: 32-448 kbps, target 192 kbps
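These figures can be put in perspective by comparing them with the bit rate of uncompressed audio. The sketch below assumes the target rates apply to a single channel sampled at 44.1 kHz; the slide does not state this, so treat the resulting ratios as illustrative.

```python
BITS_PER_SAMPLE = 16  # precision from the slide

# Target bit rates in kbps, from the slide.
TARGET_KBPS = {"Layer 1": 192, "Layer 2": 128, "Layer 3": 64}


def pcm_kbps(sample_rate_hz, channels=1):
    """Bit rate of uncompressed PCM audio in kbps."""
    return sample_rate_hz * BITS_PER_SAMPLE * channels / 1000


def compression_ratio(layer, sample_rate_hz=44100, channels=1):
    """Uncompressed rate divided by the layer's target rate.

    Assumption: the target rates apply to a single channel.
    """
    return pcm_kbps(sample_rate_hz, channels) / TARGET_KBPS[layer]


for layer in TARGET_KBPS:
    print(layer, round(compression_ratio(layer), 2))
```

Uncompressed 16-bit audio at 44.1 kHz needs 705.6 kbps per channel, so under these assumptions the targets correspond to ratios of roughly 4:1, 6:1 and 11:1 for Layers 1, 2 and 3.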

MPEG Audio Encoding Steps

• Filter bank divides the input into 32 equal frequency sub-bands
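As a quick worked example, the width of each sub-band follows from the sampling rate. The sketch assumes the 32 bands evenly span 0 Hz up to the Nyquist frequency (half the sampling rate); `subband_edges` is a hypothetical helper.

```python
NUM_SUBBANDS = 32  # from the slide


def subband_edges(sample_rate_hz, bands=NUM_SUBBANDS):
    """Return (low, high) frequency edges in Hz for each equal-width band.

    The bands are assumed to span 0 Hz up to the Nyquist frequency
    (half the sampling rate).
    """
    width = (sample_rate_hz / 2) / bands
    return [(k * width, (k + 1) * width) for k in range(bands)]


edges = subband_edges(48000)
print(edges[0], edges[-1])
```

At 48 kHz each sub-band is 750 Hz wide under this assumption.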

MPEG Audio Psycho-acoustic Model
• MPEG audio compresses by removing acoustically irrelevant parts of audio signals
• It takes advantage of the human auditory system's inability to hear quantization noise under auditory masking
• Auditory masking occurs whenever the presence of a strong audio signal makes a temporal or spectral neighborhood of weaker audio signals imperceptible.
MPEG Audio Comments
• A precision of 16 bits per sample is needed to get a good signal-to-noise ratio (SNR)
• The noise we get is quantization noise from the digitization process
• The masking effect means that we can raise the noise floor around a strong sound because the noise will be masked
• Raising the noise floor is the same as using fewer bits, and using fewer bits is the same as compression
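The link between bits and noise floor can be made concrete with the textbook approximation for an ideal uniform quantizer driven by a full-scale sine wave, SNR ≈ 6.02 × bits + 1.76 dB. The sketch below evaluates it; the formula is a standard idealization, not something stated on the slides.

```python
import math


def sqnr_db(bits):
    """Theoretical SNR in dB of an ideal uniform quantizer.

    Assumes a full-scale sine-wave input; this is the textbook
    approximation 6.02 * bits + 1.76 dB.
    """
    return 20 * math.log10(2 ** bits) + 10 * math.log10(1.5)


for bits in (16, 12, 8):
    print(bits, round(sqnr_db(bits), 2))
```

Each bit dropped raises the noise floor by about 6 dB, so wherever masking hides that extra noise, the bit can be saved.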
