Video Coding

The document discusses video coding and compression techniques. It begins with an introduction on why video compression is important and the differences between video and image compression. It then covers key concepts in video coding including coding standards, quality evaluation, and open issues. Specific techniques discussed include motion estimation, motion compensation, block matching algorithms, and the group of picture structure in the MPEG-1 standard.

Uploaded by

Cong-Son Tran

Video Coding

Associate Prof. Nguyen Chan Hung


Head of Research and Development of Multimedia Technology Laboratory (RDLAB)
Hanoi University of Science and Technology

Agenda

Coding process
Video coding standards
Quality evaluation
Open issues

Introduction (1/2)
Why is video compression important?
One movie video without compression
◦ 720 x 480 pixels per frame
◦ 30 frames per second
◦ Total 90 minutes
◦ Full color
◦ Total uncompressed size = 167.96 GB
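The size quoted above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes "full color" means 3 bytes (24 bits) per pixel and that 1 GB is 10^9 bytes; neither assumption is stated on the slide, and the function name is only illustrative.

```python
# Raw (uncompressed) size of the movie described on the slide.
WIDTH, HEIGHT = 720, 480   # pixels per frame
FPS = 30                   # frames per second
MINUTES = 90               # total duration
BYTES_PER_PIXEL = 3        # assumption: "full color" = 24-bit color


def raw_video_bytes(width, height, fps, minutes, bytes_per_pixel):
    """Return the number of bytes needed to store every pixel of every frame."""
    frames = fps * minutes * 60
    return width * height * bytes_per_pixel * frames


total = raw_video_bytes(WIDTH, HEIGHT, FPS, MINUTES, BYTES_PER_PIXEL)
print(total)                   # 167961600000 bytes
print(round(total / 1e9, 2))   # 167.96 (GB, with 1 GB = 10**9 bytes)
```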

Introduction (2/2)
What is the difference between video
compression and image compression?
◦ Temporal Redundancy
Coding method to remove redundancy
◦ Intraframe Coding
Remove spatial redundancy
◦ Interframe Coding
Remove temporal redundancy

Desired Features
Better compression
Improved quality
Interactivity and Manipulation of Content
Error Resilience
Processing of content in the compressed
domain
Identification and selective
coding/decoding of the object of interest
Facilitate Search / Indexing (MPEG-7)

Time table
[Figure: timeline of coding standards. Standards shown: H.261, H.263, H.26L, H.264, JPEG, MPEG1, MPEG2/H.262, MPEG4. Horizontal axis: year, 1986 to 2004.]

Where is MPEG used?
Most probably.
◦ MPEG-1
Video-CD
Usually .mpg or .mpeg files are MPEG-1
DAB Digital Radio is MP2 (MPEG-1 Layer 2)
MP3 files (MPEG-1 Layer 3)
◦ MPEG-2:
.vob, .m2v, rarely .mpg files
Anything to do with DVD
Camcorders, DVD players, DVD recorders, TiVo
Digital TV
◦ MPEG-4:
High Quality AVI files
Video Phones
DivX
Some advanced audio players support MPEG-4 Advanced Audio Coding (AAC)
◦ NetMeeting and similar video-chat
H.263/+/++
◦ H.264
Some content has appeared recently, mainly trailers

R-D Performance of MPEG Codecs
[Figure: rate-distortion curves for MPEG-1, MPEG-2, MPEG-4 and H.264. Vertical axis: PSNR (Y), 32 to 50. Horizontal axis: bit rate (kbps), 350 to 1050.]
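The vertical axis of the plot is the PSNR of the luma (Y) component. As a reminder of how that number is usually obtained, here is a minimal sketch of the standard PSNR definition for 8-bit samples. The function name and the flat-list representation of a frame are illustrative assumptions, not something taken from the slides.

```python
import math


def psnr(reference, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equally sized sample lists."""
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error over all samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)


# Every sample is off by one, so the mean squared error is 1.
print(psnr([100, 100, 100, 100], [101, 99, 101, 99]))
```

A higher PSNR at the same bit rate means less distortion, which is how the curves of the four codecs are compared.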

CODEC Design

The most intuitive method to remove temporal redundancy

3-Dimensional DCT
◦ Remove spatiotemporal correlation
◦ Good for low motion video
◦ Bad for high motion video

$$
F(u, v, w) = \frac{2}{N}\, C(u)\, C(v)\, C(w) \sum_{t=0}^{N-1} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} \Psi(x, y, t)\, \cos\left[\frac{\pi (2x+1) u}{2N}\right] \cos\left[\frac{\pi (2y+1) v}{2N}\right] \cos\left[\frac{\pi (2t+1) w}{2N}\right]
$$

for $u = 0, \dots, N-1$, $v = 0, \dots, N-1$ and $w = 0, \dots, N-1$,

where $N = 8$ and $C(k) = \begin{cases} 1/\sqrt{2} & \text{for } k = 0 \\ 1 & \text{otherwise} \end{cases}$
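To make the "good for low motion video" remark concrete, the sketch below evaluates the 3-D DCT directly from the formula, read with the output indexed as F(u, v, w), the 2/N prefactor and C(0) = 1/sqrt(2). That reading of the slide's formula, the function names and the tiny N = 4 test clip are all assumptions made for illustration; the direct six-fold loop is far too slow for real use.

```python
import math


def c(k):
    """Normalisation factor C(k) from the formula."""
    return 1 / math.sqrt(2) if k == 0 else 1.0


def dct3(psi):
    """Direct 3-D DCT of an N x N x N cube indexed as psi[t][x][y]."""
    n = len(psi)
    # cos_table[k][i] = cos(pi * (2i + 1) * k / (2N))
    cos_table = [[math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for i in range(n)]
                 for k in range(n)]
    out = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for u in range(n):
        for v in range(n):
            for w in range(n):
                acc = 0.0
                for t in range(n):
                    for x in range(n):
                        for y in range(n):
                            acc += (psi[t][x][y] * cos_table[u][x]
                                    * cos_table[v][y] * cos_table[w][t])
                out[u][v][w] = (2 / n) * c(u) * c(v) * c(w) * acc
    return out


# A static clip: the same frame repeated N times (no motion at all).
N = 4
frame = [[x + 2 * y for y in range(N)] for x in range(N)]
clip = [frame for _ in range(N)]
coeffs = dct3(clip)

# All coefficients with temporal frequency w > 0 vanish for a static clip.
temporal_peak = max(abs(coeffs[u][v][w])
                    for u in range(N) for v in range(N) for w in range(1, N))
print(temporal_peak < 1e-9)
```

For a clip without motion, everything is packed into the w = 0 plane, which is why the transform compacts low-motion video well; with strong motion the energy spreads over many temporal frequencies.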

[Figure, from Princeton EE330 S'01 by B. Liu. Panels: "Horse ride" frame; pixel-wise difference w/o motion compensation; motion estimation; residue after motion compensation.]

Motion Estimation
Helps understand the content of an image sequence
◦ For surveillance
Helps reduce the temporal redundancy of video
◦ For compression
Stabilizes video by detecting and removing small, noisy global motions
◦ For building the stabilizer in a camcorder

Motion Compensation

It aims to reduce the data transmitted by detecting the motion of objects
◦ Use the previous frame as reference
◦ In steps:
Split the current frame into blocks. For each one:
Find the best-matching block in the reference frame
Code and transmit the displacement to that best-matching block (the motion vector) together with the remaining difference (the residue)
◦ The next frame can be used as a reference too
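A minimal sketch of the per-block step described above: given a motion vector, the encoder forms a prediction from the reference frame and keeps only the difference, and the decoder adds that difference back. Frames are plain lists of rows, all function names are hypothetical, and transform and quantization of the residue are left out, so the round trip here is lossless.

```python
def predict_block(ref, bx, by, mv, size):
    """Copy the size x size block of `ref` displaced by mv = (dx, dy)."""
    dx, dy = mv
    return [[ref[by + dy + j][bx + dx + i] for i in range(size)]
            for j in range(size)]


def residual_block(cur, ref, bx, by, mv, size):
    """Encoder side: current block minus its motion-compensated prediction."""
    pred = predict_block(ref, bx, by, mv, size)
    return [[cur[by + j][bx + i] - pred[j][i] for i in range(size)]
            for j in range(size)]


def reconstruct_block(ref, bx, by, mv, residual, size):
    """Decoder side: prediction plus the transmitted residual."""
    pred = predict_block(ref, bx, by, mv, size)
    return [[pred[j][i] + residual[j][i] for i in range(size)]
            for j in range(size)]


# Reference frame, and a current frame whose content moved one pixel right.
ref = [[10 * y + x for x in range(8)] for y in range(8)]
cur = [[ref[y][max(x - 1, 0)] for x in range(8)] for y in range(8)]

mv = (-1, 0)  # the matching block lies one pixel to the left in `ref`
res = residual_block(cur, ref, 4, 4, mv, 2)
print(res)  # all zeros: the motion vector explains the block completely
print(reconstruct_block(ref, 4, 4, mv, res, 2) == [row[4:6] for row in cur[4:6]])
```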

Hybrid MC-DCT Video Encoder

• Intra-frame: encoded without prediction
• Inter-frame: predictively encoded => use quantized frames as ref for residue

The Exhaustive Block-Matching
Algorithm
Computationally intensive
◦ Not suitable for practical implementation
◦ A fast algorithm is necessary
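To show where the cost comes from, here is a sketch of an exhaustive search that tries every displacement in a square window and keeps the one with the smallest matching error. The sum of absolute differences (SAD) as the error measure, the ±7 window, the synthetic test frames and all function names are assumptions for illustration; the slides do not specify them.

```python
def sad(cur, ref, bx, by, dx, dy, size):
    """Sum of absolute differences between the block at (bx, by) in `cur`
    and the block displaced by (dx, dy) in `ref`."""
    return sum(abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
               for j in range(size) for i in range(size))


def full_search(cur, ref, bx, by, size, search_range):
    """Try every displacement in [-search_range, search_range]^2."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, size)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            # Skip candidates that fall outside the reference frame.
            if x0 < 0 or y0 < 0 or x0 + size > width or y0 + size > height:
                continue
            cost = sad(cur, ref, bx, by, dx, dy, size)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost


# Synthetic 32 x 32 texture; the current frame is a shifted copy of it.
SIZE = 32
ref = [[(x * x + 3 * y * y + x * y) % 251 for x in range(SIZE)] for y in range(SIZE)]
cur = [[ref[(y + 2) % SIZE][(x - 3) % SIZE] for x in range(SIZE)] for y in range(SIZE)]

print(full_search(cur, ref, 12, 12, 8, 7))  # ((-3, 2), 0)
```

With a ±7 window the search evaluates (2·7 + 1)² = 225 candidates per block, each costing one SAD over the whole block, which is the computation the fast algorithms try to avoid.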


Fast Algorithms for Block Matching


Basic ideas
◦ Matching errors near the best match are generally smaller than far away
◦ Skip candidates that are unlikely to give good match

Fast Block-Matching Algorithms
The characteristics of fast algorithms
◦ Less accurate than the exhaustive method
◦ Save a large amount of computation
Two famous fast algorithms
◦ Coarse-to-fine Three Steps Search Method
◦ 2-D Logarithmic Search Method


Three Steps Search Method (TSS)

Introduced by Koga et al. in 1981.
◦ Very popular because of its simplicity, robustness and near-optimal performance.
◦ Searches for the best motion vectors in a coarse-to-fine search pattern.

The algorithm:
Step 1: An initial step size is picked. Eight blocks at a distance of step size from the centre (around the centre block) are picked for comparison.
Step 2: The step size is halved. The centre is moved to the point with the minimum distortion.
Steps 1 and 2 are repeated till the step size becomes smaller than 1.
[Figure: a particular path for the convergence of this algorithm]
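The two steps can be sketched as a short loop. In this sketch the matching error is passed in as a function `cost(dx, dy)`; that interface, the function names and the toy bowl-shaped cost used in the example are assumptions for illustration. In an encoder the cost would be a block matching error such as the SAD, and on an error surface with several local minima the search may stop at a non-optimal vector.

```python
def three_step_search(cost, step=4):
    """Coarse-to-fine search: test the 8 neighbours at distance `step`,
    move the centre to the best point, halve the step, stop when step < 1."""
    cx, cy = 0, 0
    best_cost = cost(cx, cy)
    while step >= 1:
        best = (cx, cy)
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue  # the centre was already evaluated
                candidate_cost = cost(cx + dx, cy + dy)
                if candidate_cost < best_cost:
                    best_cost, best = candidate_cost, (cx + dx, cy + dy)
        cx, cy = best
        step //= 2
    return (cx, cy), best_cost


# Toy error surface with a single minimum at displacement (5, -3).
evaluations = 0


def toy_cost(dx, dy):
    global evaluations
    evaluations += 1
    return (dx - 5) ** 2 + (dy + 3) ** 2


print(three_step_search(toy_cost))  # ((5, -3), 0)
print(evaluations)                  # 25 = 1 + 3 steps x 8 candidates
```

Starting with step 4, the step sizes 4, 2, 1 cover displacements up to ±7 with 25 evaluations, compared with 225 for an exhaustive search over the same window.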

2-D Logarithmic Search Method (TDL)
Introduced by Jain & Jain. It requires more computation but is more accurate, especially when the search window is large.
Step 1: Pick an initial step size s. Look at the block at the centre of the search area and the four blocks at a distance of s from it on the X and Y axes (the five positions form a + sign).
Step 2: If the position of best match is at the centre, halve the step size. If, however, one of the other four points is the best match, then it becomes the centre and step 1 is repeated.
Step 3: When the step size becomes 1, all the nine blocks around the centre are chosen for the search and the best among them is picked as the required block.
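A sketch of the three steps, again with the matching error supplied as a function `cost(dx, dy)`. The interface, the ±7 search range, the tie-breaking in favour of the centre and the toy cost are assumptions made for this example, not details from the slides.

```python
def two_d_log_search(cost, step=4, search_range=7):
    """2-D logarithmic search over displacements within +/- search_range."""

    def inside(x, y):
        return abs(x) <= search_range and abs(y) <= search_range

    cx, cy = 0, 0
    while step > 1:
        # Step 1: the centre plus four points forming a "+" at distance `step`.
        plus = [(cx, cy), (cx + step, cy), (cx - step, cy),
                (cx, cy + step), (cx, cy - step)]
        plus = [p for p in plus if inside(*p)]
        best = min(plus, key=lambda p: cost(*p))  # ties keep the centre
        # Step 2: halve the step if the centre wins, otherwise move the centre.
        if best == (cx, cy):
            step //= 2
        else:
            cx, cy = best
    # Step 3: step size 1, examine all nine positions around the centre.
    final = [(cx + dx, cy + dy) for dy in (-1, 0, 1) for dx in (-1, 0, 1)
             if inside(cx + dx, cy + dy)]
    best = min(final, key=lambda p: cost(*p))
    return best, cost(*best)


def toy_cost(dx, dy):
    """Toy error surface with a single minimum at displacement (5, -3)."""
    return (dx - 5) ** 2 + (dy + 3) ** 2


print(two_d_log_search(toy_cost))  # ((5, -3), 0)
```

Because the centre only moves to a strictly better point, the loop always terminates; the number of evaluations depends on how far the best match is from the starting position.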

The MPEG-1 Standard
Group of Pictures
Motion Estimation
Motion Compensation
Differential Coding
DCT
Quantization
Entropy Coding

Group of Pictures (1/2)
I-frame (Intra-coded frame)
◦ Coded within a single frame, e.g. with the DCT.
◦ This type of frame does not need a previous frame

P-frame (Predictive frame)
◦ One-directional motion prediction from a previous frame
The reference can be either an I-frame or a P-frame
◦ Generally referred to as inter-frame

B-frame (Bi-directional predictive frame)
◦ Bi-directional motion prediction from a previous and/or future frame
The reference can be either an I-frame or a P-frame
◦ Generally referred to as inter-frame


Group of Pictures (2/2)

The distance between two nearest anchor frames (two P-frames, or a P-frame and an I-frame)
◦ denoted by M
The distance between two nearest I-frames
◦ denoted by N
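As a small illustration of the two parameters, the sketch below prints the frame types of one group of pictures in display order for given N and M. The function name and the example values N = 12, M = 3 are assumptions chosen for illustration.

```python
def gop_pattern(n, m):
    """Frame types of one GOP in display order.

    n: distance between two nearest I-frames (GOP length)
    m: distance between two nearest anchor frames (I or P)
    """
    types = []
    for i in range(n):
        if i == 0:
            types.append("I")   # each GOP starts with an I-frame
        elif i % m == 0:
            types.append("P")   # every m-th frame is an anchor
        else:
            types.append("B")   # frames between anchors
    return "".join(types)


print(gop_pattern(12, 3))  # IBBPBBPBBPBB
print(gop_pattern(6, 1))   # IPPPPP (no B-frames when M = 1)
```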

MPEG-1 = JPEG + Motion Prediction + Rate Control
Early motivation: to encode motion video at 1.5 Mbit/s for transport over T1 data circuits and for replay from CD-ROM
Defines the decoder but not the encoder
Frames (pictures)
◦ Intra-coded using JPEG
◦ Inter-coded using (interpolated) motion estimation & compensation and JPEG for the residuals
Predicted and Bi-directional
MacroBlocks (MBs)
◦ 16×16 pixels block
Rate control
◦ buffer at each end
◦ Test Model 5 (TM5)


MPEG-1 – Motion Prediction
Motion prediction = motion estimation + error compensation

MPEG-2 = MPEG-1 + …

Improvements
◦ Color space: could support 4:2:2 and 4:4:4 coding
◦ Quantization: could have 9- or 10- bit precision for DC coefficients
◦ Concealment motion vectors: used when an intra-MB is lost
◦ Pan and Scan: supports display of different aspect ratios, e.g., 16:9
Profiles and levels
◦ Profiles: define the tools or syntactical elements
◦ Levels: define the permissible ranges of parameters
Interlace tools
Scalable coding profiles
System layer: define two bit stream constructs
◦ Program stream (PS): modeled on MPEG-1 (backward compatibility)
◦ Transport stream (TS): more robust, does not need a common time base,
designed for use in error-prone environment.


The MPEG-2 Standard
The main encoder structure is similar to
that of the MPEG-1 standard
Field/frame DCT coding
Field/frame prediction mode selection
Alternative scan order
Various picture sampling formats
User defined quantization matrix

MPEG – Scalable Coding (SC)
Non-scalable coding
◦ To optimize video quality at a given bit rate.
Base and enhancement layer SC
◦ To optimize video quality at two given bit rates.
◦ SNR SC (different quantization accuracy)
◦ Temporal SC (different frame rates)
◦ Spatial SC (different spatial resolution)
Fine granularity scalability (FGS)
◦ To optimize the video quality over a given bit rate range
◦ Also has base layer and enhancement layer
◦ Enhancement layer uses bit-plane coding
Bit-plane coding considers each quantized DCT coefficient as a binary integer of several bits instead of a decimal integer of a certain value
Frequency weighting and selective enhancement
[Figure: 2-layer SNR scalable coder]
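The bit-plane idea behind the FGS enhancement layer can be sketched in a few lines: the coefficient magnitudes are sliced into planes from the most significant bit down, and a decoder that receives only the first few planes still gets a coarser version of every coefficient. The function names and the sample values are illustrative assumptions; sign handling and the entropy coding of each plane are left out.

```python
def to_bit_planes(magnitudes, nbits):
    """Split non-negative integers into bit planes, most significant first."""
    return [[(m >> b) & 1 for m in magnitudes]
            for b in range(nbits - 1, -1, -1)]


def from_bit_planes(planes, nbits):
    """Rebuild magnitudes from the first len(planes) planes (at least one)."""
    values = [0] * len(planes[0])
    for k, plane in enumerate(planes):
        weight = 1 << (nbits - 1 - k)  # value of one bit in this plane
        values = [v + bit * weight for v, bit in zip(values, plane)]
    return values


coeffs = [5, 3, 0, 1]               # quantized coefficient magnitudes
planes = to_bit_planes(coeffs, 3)
print(planes)                       # [[1, 0, 0, 0], [0, 1, 0, 0], [1, 1, 0, 1]]
print(from_bit_planes(planes[:2], 3))  # [4, 2, 0, 0]  (truncated stream)
print(from_bit_planes(planes, 3))      # [5, 3, 0, 1]  (all planes received)
```

Cutting the stream after any plane still yields a usable approximation, which is what lets the quality follow the available bit rate over a range rather than at two fixed points.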

Field/Frame DCT Coding


The field type DCT
◦ Fast motion video
The frame type DCT
◦ Slow motion video

Alternative Scan Order
Zigzag scan order
◦ Frame DCT
Alternative scan order
◦ Field DCT
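For reference, the classic zigzag order of an 8×8 block can be generated by walking the anti-diagonals and alternating direction. The sketch below does only that; the alternative scan used with field DCT follows a fixed table defined by the standard and is not reproduced here. The function name is an illustrative assumption.

```python
def zigzag_order(n=8):
    """Return (row, col) positions of an n x n block in zigzag scan order."""
    positions = [(r, c) for r in range(n) for c in range(n)]
    # Sort by anti-diagonal (r + c). On odd diagonals the row increases,
    # on even diagonals the row decreases.
    return sorted(positions,
                  key=lambda rc: (rc[0] + rc[1],
                                  rc[0] if (rc[0] + rc[1]) % 2 else -rc[0]))


order = zigzag_order()
# Positions expressed as raster indices (row * 8 + col).
print([r * 8 + c for r, c in order[:10]])  # [0, 1, 8, 16, 9, 2, 3, 10, 17, 24]
```

The scan visits low-frequency coefficients first, so the long runs of zeros among the high-frequency coefficients end up together at the end of the sequence.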


The MPEG-2 Encoder (1/2)
Base Layer
◦ Basic quality requirement
◦ For SDTV
Enhanced Layer
◦ High quality service
◦ For HDTV

The MPEG-2 Encoder (2/2)
Quantization
◦ User can change the quantization if necessary
◦ Quantization matrix
Various picture sampling formats
◦ 4:4:4
◦ 4:2:2
◦ 4:2:0

$$
Q_{\text{intra}} =
\begin{pmatrix}
8 & 16 & 19 & 22 & 26 & 27 & 29 & 34 \\
16 & 16 & 22 & 24 & 27 & 29 & 34 & 37 \\
19 & 22 & 26 & 27 & 29 & 34 & 34 & 38 \\
22 & 22 & 26 & 27 & 29 & 34 & 37 & 40 \\
22 & 26 & 27 & 29 & 32 & 35 & 40 & 48 \\
26 & 27 & 29 & 32 & 35 & 40 & 48 & 58 \\
26 & 27 & 29 & 34 & 38 & 46 & 56 & 69 \\
27 & 29 & 35 & 38 & 46 & 56 & 69 & 83
\end{pmatrix}
$$


MPEG-4 = MPEG-2 + Objects + Other Enhancements

Objects (optional)
◦ Video (texture+shape), image, audio, speech, text, etc.
◦ Encoded using different techniques
◦ Transmitted independently
◦ Composited at the decoder using Binary Format for Scenes (BIFS)
Improvements in MPEG-4 version 2
◦ Global motion compensation (GMC)
◦ Quarter pixel motion compensation
◦ Shape-adaptive DCT
Why is MPEG-4 not as successful as MPEG-2?
◦ Not substantially better than MPEG-2
◦ Issue of licensing

MPEG-4 – Error Resilience Tools
Video packet resynchronization
◦ Previous coding standards: resynchronization markers are fixed at the beginning of each row of MBs
◦ MPEG-4: resynchronization markers are inserted every K bits
Data partitioning
◦ Partitions the data in a video packet into a motion part and a texture part separated by a motion boundary marker (MBM)
Reversible variable length codes (RVLC)
◦ The decoder finds the next resynchronization marker and decodes backwards
Header extension code (HEC)
◦ The header information is repeated after the 1-bit HEC
Unequal error protection technique (UEP)
[Figure: video packet layouts. I-VOP packet: VP header, DC DCT data, AC DCT data. P-VOP packet: VP header, motion data, texture data. A video packet: resync. marker, MB No., QP, HEC, repeated header info., motion data, MBM, DCT data. Labels under the packet: use, discard, use.]


Advanced Video Coding / ITU-T Recommendation H.264 / ISO/IEC MPEG-4 (Part 10)

H.264 structure
◦ Video coding layer (VCL)
◦ Network abstraction layer (NAL)
Possible applications of H.264
◦ Conversational services operated below 1 Mbps with low latency.
◦ Entertainment services operated between 1-8+ Mbps with moderate latency such as 0.5-2 s in modified MPEG-2/H.222.0 systems.
Broadcast via satellite, cable, terrestrial or DSL
DVD for standard and high-definition video
Video-on-demand via various channels
◦ Streaming services operated at 50-1500kbps with 2s or more of latency.

New Features of H.264
Multi-mode, multi-reference MC
Motion vector can point out of image border
1/4-, 1/8-pixel motion vector precision
B-frame prediction weighting
4×4 integer transform
Multi-mode intra-prediction
In-loop de-blocking filter
UVLC (Universal Variable Length Coding)
NAL (Network Abstraction Layer)
SP-slices
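Of the features listed above, the 4×4 integer transform is easy to show in code. The sketch applies the forward core transform matrix commonly given for H.264 as Y = C·X·Cᵀ using integer arithmetic only. The matrix is not on the slide and is quoted here from general knowledge; the scaling that would make the transform orthonormal is folded into the quantizer in H.264 and is omitted in this sketch, and the function names are illustrative.

```python
# Forward core transform matrix of the H.264 4x4 integer transform
# (assumption: quoted from general knowledge, not from the slides).
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Product of two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_core_transform(block):
    """Y = CF * X * CF^T, computed exactly in integers (no scaling applied)."""
    return matmul(matmul(CF, block), transpose(CF))


flat = [[1] * 4 for _ in range(4)]   # a constant residual block
print(forward_core_transform(flat))  # only the DC coefficient is non-zero
```

Because every entry of the matrix is ±1 or ±2, the transform needs only additions, subtractions and shifts, and encoder and decoder compute identical results without floating-point drift.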

Profiles and Levels


Profiles: Baseline, Main, and X
◦ Baseline: Progressive, Videoconferencing & Wireless
◦ Main: esp. Broadcast
◦ X: Mobile network
Baseline profile is the minimum implementation
◦ Without CABAC, 1/8 MC, B-frame, SP-slices
11 levels
◦ Resolution, capability, bit rate, buffer, reference #
◦ Built to match popular international production and
emission formats
◦ From QCIF to D-Cinema

Basic Macroblock Coding Structure
[Figure: encoder block diagram. Blocks: input video signal, split into macroblocks (16x16 pixels), coder control, transform/scal./quant., decoder (scaling & inv. transform, de-blocking filter), intra-frame prediction, motion compensation, intra/inter switch, motion estimation, entropy coding, output video signal. Signals sent to entropy coding: control data, quant. transf. coeffs, motion data.]

Variable block size


A fixed block size may not be suitable for all moving objects
◦ Improve the flexibility of comparison
◦ Reduce the error of comparison
7 types of blocks for selection
◦ 16×16, 16×8, 8×16, 8×8, 8×4, 4×8, 4×4

Motion Compensation
[Figure: encoder block diagram annotated with the partitions used for motion compensation. MB types: 16x16, 16x8, 8x16, 8x8. 8x8 types: 8x8, 8x4, 4x8, 4x4. Caption: various block sizes and shapes.]

Multiple Reference Frames


The neighboring frames are not the most similar ones in some cases
A B-frame can be used as a reference frame
◦ A B-frame is close to the target frame in many situations

Multiple Reference Frames
[Figure: encoder block diagram annotated "Multiple reference frames for motion compensation".]

B-frame Prediction Weighting

[Figure: frames I0 B1 B2 B3 P4 B5 B6 along the time axis]

Playback order: I0 B1 B2 B3 P4 B5 B6 ...
Bitstream order: I0 P4 B1 B3 B2 P8 B5 ...

Intra Prediction
Predict each block from neighboring pixels in the same frame, exploiting their similarity, and then use transform coding to remove the remaining redundancy.


Intra-Coded Macroblocks
H.264 compared with MPEG-1/2/4, H.261/3
Prediction in space domain
◦ H.264: spatial prediction; encode the prediction modes (use predictive coding if 4x4 modes are used)
◦ MPEG-1/2/4, H.261/3: no spatial prediction
Transform
◦ H.264: integer transform of residue
◦ MPEG-1/2/4, H.261/3: 8x8 Discrete Cosine Transform (DCT) for pixel values
Quantization
◦ H.264: quantization including scaling
◦ MPEG-1/2/4, H.261/3: quantization
Prediction in frequency domain
◦ H.264: no coefficient prediction
◦ MPEG-1/2/4, H.261/3: coefficient prediction (for DC values in MPEG-2 and AC values in the first row and column in MPEG-4)

Spatial Prediction for Intra-Coded MBs

luma
- 4x4: 9 modes
[Figure: 4x4 block with neighboring samples A to H above, I to L on the left and M at the corner; one of the modes shown is Mean (A-D, I-M)]
- 16x16: 4 modes
[Figure: modes shown include H, V and Mean (H, V)]
chroma
- 8x8: 4 modes
[Figure: modes shown include H, V and Mean (H, V)]

- The same prediction mode is always applied to both chroma blocks
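A hedged sketch of three of the nine 4×4 luma modes (vertical, horizontal and mean/DC) and of picking the mode with the smallest prediction error. Here `above` holds the four samples A-D and `left` the four samples I-L; the DC rounding (sum + 4) >> 3 follows common descriptions of H.264, and the function names, the SAD criterion and the sample values are assumptions for illustration. The six directional modes are not included.

```python
def predict_4x4(mode, above, left):
    """Predict a 4x4 block from the 4 samples above and the 4 samples left."""
    if mode == "vertical":      # copy the row above downwards
        return [list(above) for _ in range(4)]
    if mode == "horizontal":    # copy the left column across
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":            # rounded mean of the eight neighbours
        mean = (sum(above) + sum(left) + 4) >> 3
        return [[mean] * 4 for _ in range(4)]
    raise ValueError("unsupported mode: " + mode)


def sad(block, pred):
    """Sum of absolute differences between a block and its prediction."""
    return sum(abs(block[r][c] - pred[r][c]) for r in range(4) for c in range(4))


def best_mode(block, above, left):
    """Choose the mode whose prediction is closest to the block."""
    modes = ("vertical", "horizontal", "dc")
    return min(modes, key=lambda m: sad(block, predict_4x4(m, above, left)))


above = [10, 20, 30, 40]
left = [10, 10, 10, 10]
block = [[10, 20, 30, 40] for _ in range(4)]   # vertical stripes

print(best_mode(block, above, left))           # vertical
print(predict_4x4("dc", above, left)[0])       # [18, 18, 18, 18]
```

Only the chosen mode and the (here all-zero) residue need to be coded, which is the point of predicting in the space domain before the transform.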
