Video Coding
Agenda
Coding process
Video coding standards
Quality evaluation
Open issues
Introduction (1/2)
Why is video compression important?
One movie video without compression
◦ 720 x 480 pixels per frame
◦ 30 frames per second
◦ Total 90 minutes
◦ Full color
◦ Total raw data ≈ 167.96 GB (at 3 bytes per pixel)
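The figure above can be checked with a few lines of arithmetic. This is a minimal sketch that assumes "full color" means 3 bytes (24 bits) per pixel and that a gigabyte is 10^9 bytes; the function name `raw_video_bytes` is only illustrative.

```python
# Raw (uncompressed) size of the example movie.
WIDTH, HEIGHT = 720, 480     # pixels per frame
FPS = 30                     # frames per second
DURATION_S = 90 * 60         # 90 minutes, in seconds
BYTES_PER_PIXEL = 3          # assumption: "full color" means 24-bit colour


def raw_video_bytes(width, height, fps, seconds, bytes_per_pixel):
    """Number of bytes needed to store the sequence without compression."""
    return width * height * bytes_per_pixel * fps * seconds


total = raw_video_bytes(WIDTH, HEIGHT, FPS, DURATION_S, BYTES_PER_PIXEL)
print(f"{total / 1e9:.2f} GB")  # decimal gigabytes (10**9 bytes)
```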
Introduction (2/2)
What is the difference between video
compression and image compression?
◦ Temporal Redundancy
Coding method to remove redundancy
◦ Intraframe Coding
Remove spatial redundancy
◦ Interframe Coding
Remove temporal redundancy
Desired Features
Better compression
Improved quality
Interactivity and Manipulation of Content
Error Resilience
Processing of content in the compressed
domain
Identification and selective
coding/decoding of the object of interest
Facilitate Search / Indexing (MPEG-7)
Time table
[Figure: timeline of coding standards from 1986 to 2004, showing JPEG, MPEG1, MPEG2/H.262, MPEG4, H.261, H.263 and H.26L/H.264]
Where is MPEG used?
Most probably in the following places:
◦ MPEG-1
Video-CD
Usually .mpg or .mpeg files are MPEG-1
DAB Digital Radio is MP2 (MPEG-1 Layer 2)
MP3 files (MPEG-1 Layer 3)
◦ MPEG-2:
.vob, .m2v, rarely .mpg files
Anything to do with DVD
Camcorders, DVD players, DVD recorders, TiVo
Digital TV
◦ MPEG-4:
High Quality AVI files
Video Phones
DivX
Some advanced audio players support MPEG-4 Advanced Audio Coding (AAC)
◦ NetMeeting and similar video-chat
H.263/+/++
◦ H.264
Some content has appeared recently, mainly trailers
[Figure: PSNR (Y), axis from 32 to 50, versus bit rate (kbps), axis from 350 to 1050, with curves labelled H264, MPEG-4, MPEG-2 and MPEG-1]
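PSNR, the quality measure on the vertical axis of this comparison, can be computed as follows. This is a minimal sketch assuming 8-bit samples (peak value 255) and luma (Y) planes passed as flat sequences; the function name `psnr` is illustrative.

```python
import math


def psnr(original, decoded, peak=255):
    """Peak signal-to-noise ratio in dB between two equal-length sample lists."""
    if len(original) != len(decoded):
        raise ValueError("both inputs must contain the same number of samples")
    # Mean squared error over all samples.
    mse = sum((a - b) ** 2 for a, b in zip(original, decoded)) / len(original)
    if mse == 0:
        return math.inf  # identical signals
    return 10 * math.log10(peak ** 2 / mse)


print(round(psnr([100, 100], [101, 99]), 2))
```

A higher PSNR at the same bit rate means the decoded luma is closer to the original.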
CODEC Design
3-Dimensional DCT
◦ Remove spatiotemporal correlation
◦ Good for low motion video
◦ Bad for high motion video
$$F(u,v,w) = \frac{2}{N}\, C(u)\, C(v)\, C(w) \sum_{t=0}^{N-1} \sum_{x=0}^{N-1} \sum_{y=0}^{N-1} \Psi(x,y,t)\, \cos\frac{\pi(2x+1)u}{2N}\, \cos\frac{\pi(2y+1)v}{2N}\, \cos\frac{\pi(2t+1)w}{2N}$$
for $u = 0,\dots,N-1$, $v = 0,\dots,N-1$ and $w = 0,\dots,N-1$,
where $N = 8$ and $C(k) = 1/\sqrt{2}$ for $k = 0$, $C(k) = 1$ otherwise.
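The 3-D DCT formula can be evaluated directly, as in the sketch below. It keeps the 2/N factor shown on the slide and assumes C(0) = 1/√2, the usual DCT convention; the cube is indexed as `block[t][x][y]` and the name `dct_3d` is illustrative. The direct form costs N^6 multiplications and is meant only to show the definition.

```python
import math


def dct_3d(block):
    """Forward 3-D DCT of an N x N x N cube indexed as block[t][x][y]."""
    n = len(block)

    def c(k):
        return 1 / math.sqrt(2) if k == 0 else 1.0  # assumed C(0) = 1/sqrt(2)

    # cos_tab[i][k] = cos(pi * (2i + 1) * k / (2N))
    cos_tab = [[math.cos(math.pi * (2 * i + 1) * k / (2 * n)) for k in range(n)]
               for i in range(n)]
    scale = 2 / n  # factor as written on the slide; (2/N)**1.5 would be orthonormal
    out = [[[0.0] * n for _ in range(n)] for _ in range(n)]
    for w in range(n):
        for u in range(n):
            for v in range(n):
                acc = 0.0
                for t in range(n):
                    for x in range(n):
                        for y in range(n):
                            acc += (block[t][x][y] * cos_tab[x][u]
                                    * cos_tab[y][v] * cos_tab[t][w])
                out[w][u][v] = scale * c(u) * c(v) * c(w) * acc
    return out  # indexed as out[w][u][v]


# A static, flat cube (no motion, no texture) has energy only in the DC term.
flat = [[[10] * 4 for _ in range(4)] for _ in range(4)]
coeffs = dct_3d(flat)
print(round(coeffs[0][0][0], 3), round(coeffs[1][0][0], 3))
```

With no change along t, all coefficients with w > 0 vanish, which is why the method suits low-motion video.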
(From Princeton EE330 S’01 by B.Liu)
Motion Estimation
Helps in understanding the content of an image sequence
◦ For surveillance
Motion Compensation
Hybrid MC-DCT Video Encoder
The Exhaustive Block-Matching
Algorithm
Computationally intensive
◦ Not practical to implement as is
◦ A fast algorithm is necessary
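To make the cost concrete, here is a minimal sketch of exhaustive block matching. It assumes frames are lists of rows of luma values and uses the sum of absolute differences (SAD) as the matching criterion; the slides do not name a criterion, and `sad` and `full_search` are illustrative names.

```python
def sad(cur, ref, bx, by, dx, dy, bsize):
    """SAD between the current block at (bx, by) and the reference block
    displaced by (dx, dy)."""
    total = 0
    for j in range(bsize):
        for i in range(bsize):
            total += abs(cur[by + j][bx + i] - ref[by + dy + j][bx + dx + i])
    return total


def full_search(cur, ref, bx, by, bsize=16, search_range=7):
    """Test every displacement in [-search_range, +search_range]^2."""
    height, width = len(ref), len(ref[0])
    best_mv, best_cost = (0, 0), sad(cur, ref, bx, by, 0, 0, bsize)
    for dy in range(-search_range, search_range + 1):
        for dx in range(-search_range, search_range + 1):
            x0, y0 = bx + dx, by + dy
            if x0 < 0 or y0 < 0 or x0 + bsize > width or y0 + bsize > height:
                continue  # candidate block would leave the reference frame
            cost = sad(cur, ref, bx, by, dx, dy, bsize)
            if cost < best_cost:
                best_mv, best_cost = (dx, dy), cost
    return best_mv, best_cost
```

With a 16×16 block and a ±7 range this evaluates up to 225 candidates of 256 pixel differences each, for every macroblock of every frame.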
Fast Block-Matching Algorithms
Characteristics of fast algorithms
◦ Less accurate than the exhaustive
method
◦ Save a large amount of computation
Two well-known fast algorithms
◦ Coarse-Fine Three Steps Search Method
◦ 2-D Logarithmic Search Method
The three-step search algorithm:
Step 1: An initial step size is picked.
Eight blocks at a distance of step size
from the centre (around the centre
block) are picked for comparison.
Step 2: The step size is halved. The
centre is moved to the point with the
minimum distortion.
Steps 1 and 2 are repeated till the
step size becomes smaller than 1. A
particular path for the convergence of
this algorithm is shown below:
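The two steps above can be sketched as follows. The search takes a `cost(dx, dy)` callable that returns the distortion for a displacement; `sad_cost` builds one from two frames using SAD, which is an assumption since the slides only say "minimum distortion". Both names are illustrative.

```python
import math


def sad_cost(cur, ref, bx, by, bsize):
    """Return cost(dx, dy): SAD between the current block at (bx, by) and the
    reference block displaced by (dx, dy); infinite if it leaves the frame."""
    height, width = len(ref), len(ref[0])

    def cost(dx, dy):
        x0, y0 = bx + dx, by + dy
        if x0 < 0 or y0 < 0 or x0 + bsize > width or y0 + bsize > height:
            return math.inf
        return sum(abs(cur[by + j][bx + i] - ref[y0 + j][x0 + i])
                   for j in range(bsize) for i in range(bsize))

    return cost


def three_step_search(cost, step=4):
    """Coarse-to-fine search: compare the eight neighbours at the current step
    size, move the centre to the best point, halve the step, and repeat."""
    cx, cy = 0, 0
    best = cost(cx, cy)
    while step >= 1:
        nx, ny = cx, cy
        for dy in (-step, 0, step):
            for dx in (-step, 0, step):
                if dx == 0 and dy == 0:
                    continue  # centre already evaluated
                c = cost(cx + dx, cy + dy)
                if c < best:
                    best, nx, ny = c, cx + dx, cy + dy
        cx, cy = nx, ny   # move the centre to the minimum-distortion point
        step //= 2        # halve the step; the loop ends once it drops below 1
    return (cx, cy), best


# Synthetic bowl-shaped distortion with its minimum at displacement (5, -3).
print(three_step_search(lambda dx, dy: (dx - 5) ** 2 + (dy + 3) ** 2))
```

With an initial step of 4 the search covers ±7 pixels using at most 1 + 3 × 8 = 25 evaluations, compared with 225 for the exhaustive method.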
2-D Logarithmic Search Method (TDL)
Introduced by Jain & Jain. It requires
more computation but is more accurate,
especially when the search window is
large.
Step 1: Pick an initial step size s. Look at
the block at the centre of the search area
and the four blocks at a distance of s
from it on the X and Y axes (the five
positions form a + sign).
Step 2 : If the position of best match
is at the centre, halve the step size. If
however, one of the other four points
is the best match, then it becomes the
centre and step 1 is repeated.
Step 3: When the step size becomes
1, all the nine blocks around the centre
are chosen for the search and the best
among them is picked as the required
block.
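The three TDL steps can be sketched in the same style. The function takes a hypothetical `cost(dx, dy)` callable returning the distortion for a displacement (for example a SAD over the block); the callable is expected to return infinity outside the search window so that the search stays bounded. The name `tdl_search` is illustrative.

```python
def tdl_search(cost, step=4):
    """2-D logarithmic search over displacements, starting at (0, 0)."""
    cx, cy = 0, 0
    best = cost(cx, cy)
    while step > 1:
        # Step 1: centre plus the four points of the + pattern.
        nx, ny = cx, cy
        for dx, dy in ((step, 0), (-step, 0), (0, step), (0, -step)):
            c = cost(cx + dx, cy + dy)
            if c < best:
                best, nx, ny = c, cx + dx, cy + dy
        # Step 2: halve the step if the centre won, otherwise recentre.
        if (nx, ny) == (cx, cy):
            step //= 2
        else:
            cx, cy = nx, ny
    # Step 3: step size is 1, so check all nine positions around the centre.
    nx, ny = cx, cy
    for dy in (-1, 0, 1):
        for dx in (-1, 0, 1):
            c = cost(cx + dx, cy + dy)
            if c < best:
                best, nx, ny = c, cx + dx, cy + dy
    return (nx, ny), best


# Synthetic bowl-shaped distortion with its minimum at displacement (5, -3).
print(tdl_search(lambda dx, dy: (dx - 5) ** 2 + (dy + 3) ** 2))
```

Unlike the three-step search, the step is only halved when the centre is already the best point, so the number of evaluations depends on the content.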
The MPEG-1 Standard
Group of Pictures
Motion Estimation
Motion Compensation
Differential Coding
DCT
Quantization
Entropy Coding
Group of Pictures (1/2)
I-frame (Intracoded Frame)
◦ Coded within a single frame, using techniques such as the DCT.
◦ This type of frame does not need a previous frame
MPEG-1 = JPEG + Motion Prediction + Rate Control
Early motivation: to encode motion video at 1.5Mbits/s for transport over T1
data circuits and for replay from CD-ROM
Defines the decoder but not the encoder
Frames (pictures)
◦ Intra-coded using JPEG
◦ Inter-coded using (interpolated)
motion estimation & compensation
and JPEG for the residuals
Predicted and Bi-directional
MacroBlocks (MBs)
◦ 16×16 pixels block
Rate control
◦ buffer at each end
◦ Test Model 5 (TM5)
MPEG-2 = MPEG-1 + …
Improvements
◦ Color space: could support 4:2:2 and 4:4:4 coding
◦ Quantization: could have 9- or 10- bit precision for DC coefficients
◦ Concealment motion vectors: used when an intra-MB is lost
◦ Pan and Scan: supports display of different aspect ratios, e.g., 16:9
Profiles and levels
◦ Profiles: define the tools or syntactical elements
◦ Levels: define the permissible ranges of parameters
Interlace tools
Scalable coding profiles
System layer: defines two bit stream constructs
◦ Program stream (PS): modeled on MPEG-1 (backward compatibility)
◦ Transport stream (TS): more robust, does not need a common time base,
designed for use in error-prone environment.
The MPEG-2 Standard
The main encoder structure is similar to
that of the MPEG-1 standard
Field/frame DCT coding
Field/frame prediction mode selection
Alternative scan order
Various picture sampling formats
User defined quantization matrix
MPEG – Scalable Coding (SC)
Non-scalable coding
◦ To optimize video quality at a given
bit rate.
Base and enhancement layer SC
◦ To optimize video quality at two given
bit rates.
◦ SNR SC (different quantization accuracy)
◦ Temporal SC (different frame rates)
◦ Spatial SC (different spatial resolution)
Fine granularity scalability (FGS)
◦ To optimize the video quality over a given bit rate
range
◦ Also has base layer and enhancement layer
◦ Enhancement layer uses bit-plane coding
Bit-plane coding considers each quantized DCT
coefficient as a binary integer of several bits
instead of a decimal integer of a certain value
Frequency weighting and selective enhancement
2-layer SNR scalable coder
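Bit-plane coding, as used in the FGS enhancement layer, can be illustrated with a short sketch. It assumes the quantized coefficients are plain integers, sends magnitudes most-significant plane first, and keeps signs separately; `bit_planes` and `reconstruct` are illustrative names, and the entropy coding of each plane is omitted.

```python
def bit_planes(coeffs):
    """Split coefficient magnitudes into bit-planes, most significant first."""
    mags = [abs(c) for c in coeffs]
    nplanes = max(mags, default=0).bit_length()
    planes = [[(m >> b) & 1 for m in mags] for b in range(nplanes - 1, -1, -1)]
    signs = [1 if c < 0 else 0 for c in coeffs]
    return planes, signs


def reconstruct(planes, signs, received):
    """Rebuild coefficients from only the first `received` bit-planes."""
    total = len(planes)
    mags = [0] * len(signs)
    for p in range(min(received, total)):
        shift = total - 1 - p          # weight of this plane
        for i, bit in enumerate(planes[p]):
            mags[i] |= bit << shift
    return [-m if s else m for m, s in zip(mags, signs)]


planes, signs = bit_planes([10, -6, 3, 0, 1])
print(reconstruct(planes, signs, 2))   # enhancement layer truncated after 2 planes
print(reconstruct(planes, signs, 4))   # all planes received
```

Because the stream can be cut after any plane, quality degrades gradually over a range of bit rates instead of at two fixed operating points.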
Alternative Scan Order
Zigzag scan order
◦ Frame DCT
Alternative scan order
◦ Field DCT
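The zigzag order can be generated by walking the anti-diagonals of the block in alternating directions, as in this minimal sketch (positions are `(row, column)` pairs; `zigzag_order` and `scan` are illustrative names). The alternative scan used with field DCT is a fixed table defined by the standard and is not reproduced here.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    order = []
    for s in range(2 * n - 1):                      # s = row + col
        diag = [(r, s - r) for r in range(n) if 0 <= s - r < n]
        # Even diagonals run from bottom-left to top-right, odd ones the reverse.
        order.extend(reversed(diag) if s % 2 == 0 else diag)
    return order


def scan(block):
    """Flatten a square block of coefficients into zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


print(zigzag_order()[:6])
```

The scan places low-frequency coefficients first, so the trailing run of zeros after quantization is as long as possible.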
The MPEG-2 Encoder (1/2)
Base Layer
◦ Basic quality requirement
◦ For SDTV
Enhanced Layer
◦ High quality service
◦ For HDTV
The MPEG-2 Encoder (2/2)
Quantization
◦ User can change the quantization if necessary
◦ Quantization matrix
Various picture sampling formats
◦ 4:4:4
◦ 4:2:2
◦ 4:2:0
Qintra =
 8 16 19 22 26 27 29 34
16 16 22 24 27 29 34 37
19 22 26 27 29 34 34 38
22 22 26 27 29 34 37 40
22 26 27 29 32 35 40 48
26 27 29 32 35 40 48 58
26 27 29 34 38 46 56 69
27 29 35 38 46 56 69 83
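The effect of the intra quantization matrix can be shown with a simplified sketch. It assumes a level of round(16·F / (W·qscale)) and a reconstruction of level·W·qscale / 16; the special handling of the DC coefficient and the exact rounding rules of the standard are left out, and `quantize`, `dequantize` and `qscale` are illustrative names.

```python
# Intra quantization matrix as listed on the slide.
Q_INTRA = [
    [8, 16, 19, 22, 26, 27, 29, 34],
    [16, 16, 22, 24, 27, 29, 34, 37],
    [19, 22, 26, 27, 29, 34, 34, 38],
    [22, 22, 26, 27, 29, 34, 37, 40],
    [22, 26, 27, 29, 32, 35, 40, 48],
    [26, 27, 29, 32, 35, 40, 48, 58],
    [26, 27, 29, 34, 38, 46, 56, 69],
    [27, 29, 35, 38, 46, 56, 69, 83],
]


def _round_half_away(x):
    """Round to the nearest integer, with halves rounded away from zero."""
    magnitude = int(abs(x) + 0.5)
    return -magnitude if x < 0 else magnitude


def quantize(coeffs, qscale):
    """Simplified intra quantization of an 8x8 block of DCT coefficients."""
    return [[_round_half_away(16 * coeffs[r][c] / (Q_INTRA[r][c] * qscale))
             for c in range(8)] for r in range(8)]


def dequantize(levels, qscale):
    """Simplified inverse quantization."""
    return [[levels[r][c] * Q_INTRA[r][c] * qscale / 16
             for c in range(8)] for r in range(8)]


block = [[0] * 8 for _ in range(8)]
block[0][1] = 100   # low-frequency coefficient, weight 16
block[7][7] = 100   # high-frequency coefficient, weight 83
levels = quantize(block, qscale=8)
recon = dequantize(levels, qscale=8)
print(levels[0][1], recon[0][1])
print(levels[7][7], recon[7][7])
```

The same input value is reconstructed more coarsely at the high-frequency position, where the matrix weight is larger.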
MPEG-4 = MPEG-2 + Objects + Other Enhancements
Objects (optional)
◦ Video (texture+shape), image, audio, speech, text, etc.
◦ Encoded using different techniques
◦ Transmitted independently
◦ Composited at the decoder using BInary Format for Scenes (BIFS)
Improvements in MPEG-4 version2
◦ Global motion compensation (GMC)
◦ Quarter pixel motion compensation
◦ Shape-adaptive DCT
Why is MPEG-4 not as successful as MPEG-2?
◦ Not substantially better than MPEG-2
◦ Issue of licensing
MPEG-4 – Error Resilience Tools
Video packet resynchronization
◦ Previous coding standards: Resynchronization markers are fixed at the beginning of each
row of MBs
◦ MPEG-4: Resynchronization markers are inserted at every K bits
Data partitioning
◦ Partitions the data in a video packet into a motion part and a texture part separated by a
motion boundary marker (MBM)
Reversible variable length codes (RVLC)
◦ After an error, the decoder finds the next resynchronization marker and decodes backwards from it
Header extension code (HEC)
◦ The header information is repeated after the 1-bit HEC
Unequal error protection technique (UEP)
[Figure: video packet layouts.
I-VOP: VP header | DC DCT data | AC DCT data
P-VOP: VP header | Motion data | Texture data
A video packet: Resync. marker | MB No. | QP | HEC | Repeated header info. | Motion data | MBM | DCT data
Annotations under the DCT data: use, discard, use]
H.264 structure
◦ Video coding layer (VCL)
◦ Network abstraction layer (NAL)
Possible applications of H.264
◦ Conversational services operated
below 1Mbps with low latency.
◦ Entertainment services operated between 1-8+ Mbps with moderate latency
such as 0.5-2s in modified MPEG-2/H.222.0 systems.
Broadcast via satellite, cable, terrestrial or DSL
DVD for standard and high-definition video
Video-on-demand via various channels
◦ Streaming services operated at 50-1500kbps with 2s or more of latency.
New Features of H.264
Multi-mode, multi-reference MC
Motion vector can point out of image border
1/4-, 1/8-pixel motion vector precision
B-frame prediction weighting
4×4 integer transform
Multi-mode intra-prediction
In-loop de-blocking filter
UVLC (Universal Variable Length Coding)
NAL (Network Abstraction Layer)
SP-slices
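The 4×4 integer transform in the list above can be sketched with integer arithmetic only. The matrix `CF` below is the forward core transform matrix of H.264; the scaling that makes the transform orthonormal is folded into quantization in the standard and is omitted here. The helper names are illustrative.

```python
# Forward core transform matrix of the H.264 4x4 integer transform.
CF = [
    [1, 1, 1, 1],
    [2, 1, -1, -2],
    [1, -1, -1, 1],
    [1, -2, 2, -1],
]


def matmul(a, b):
    """Product of two 4x4 integer matrices."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]


def transpose(m):
    return [list(row) for row in zip(*m)]


def forward_4x4(x):
    """Core transform Y = CF * X * CF^T, without the post-scaling step."""
    return matmul(matmul(CF, x), transpose(CF))


flat = [[1] * 4 for _ in range(4)]
print(forward_4x4(flat))
```

Since every entry is a small integer, encoder and decoder compute exactly the same result, which avoids the mismatch that floating-point DCT implementations can introduce.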
Basic Macroblock Coding Structure
[Figure: encoder block diagram. The input video signal is split into macroblocks of 16x16 pixels. Blocks: Coder Control, Transform/Scal./Quant., Decoder (Scaling & Inv. Transform), De-blocking Filter, Intra-frame Prediction, Motion Compensation, Intra/Inter selection, Motion Estimation, Entropy Coding. Signals: control data, quantized transform coefficients, motion data, output video signal]
Motion Compensation
[Figure: the encoder block diagram from the Basic Macroblock Coding Structure slide, with the motion compensation stage highlighted. Caption: various block sizes and shapes. MB types: 16x16, 16x8, 8x16, 8x8. 8x8 types: 8x8, 8x4, 4x8, 4x4]
Multiple Reference Frames
[Figure: the encoder block diagram from the Basic Macroblock Coding Structure slide. Caption: multiple reference frames for motion compensation. Frame sequence along the time axis: I0 B1 B2 B3 P4 B5 B6]
Intra Prediction
Predict each block from neighbouring
pixels in the same frame, then use
transform coding on what remains to
remove the spatial redundancy.
Intra-Coded Macroblocks
[Figure: intra-coded macroblocks in H.264 compared with MPEG-1/2/4 and H.261/3]
Spatial Prediction for Intra-Coded MBs
[Figure: spatial prediction from neighbouring pixels.
luma
- 4x4: 9 modes, predicted from neighbours A to H, I to L and M, including a Mean (A-D, I-M) mode
- 16x16: 4 modes, using neighbours labelled H and V, including a Mean (H, V) mode
chroma
- 8x8: 4 modes, using neighbours labelled H and V, including a Mean (H, V) mode]
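Three of the nine 4×4 luma modes are simple enough to sketch: vertical, horizontal and mean (DC). The sketch assumes `above` holds pixels A to D and `left` holds pixels I to L, takes the DC value as the rounded mean of those eight pixels, and picks the mode with the smallest SAD, which is only one possible encoder decision rule. The function names are illustrative.

```python
def predict_4x4(mode, above, left):
    """Build a 4x4 prediction from the row above (A-D) and column left (I-L)."""
    if mode == "vertical":        # each column repeats the pixel above it
        return [list(above) for _ in range(4)]
    if mode == "horizontal":      # each row repeats the pixel to its left
        return [[left[r]] * 4 for r in range(4)]
    if mode == "dc":              # rounded mean of the eight neighbours
        dc = (sum(above) + sum(left) + 4) >> 3
        return [[dc] * 4 for _ in range(4)]
    raise ValueError(f"unknown mode: {mode}")


def choose_mode(block, above, left, modes=("vertical", "horizontal", "dc")):
    """Pick the mode whose prediction has the smallest SAD against the block."""
    def sad(pred):
        return sum(abs(block[r][c] - pred[r][c])
                   for r in range(4) for c in range(4))
    return min(modes, key=lambda m: sad(predict_4x4(m, above, left)))


above = [10, 20, 30, 40]
left = [12, 12, 12, 12]
block = [[10, 20, 30, 40] for _ in range(4)]   # vertical stripes
print(choose_mode(block, above, left))
```

Only the difference between the block and the chosen prediction is then transformed and coded.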