Multimedia Communications Lecture 10: Video Standards H.261/H.263
Multimedia Communications
Lecture 10: Video Standards
Part I. Videophone and video
conferencing: H.261/H.263
Dr. Tian-Sheuan Chang
tschang@twins.ee.nctu.edu.tw
Dept. Electronics Engineering
National Chiao-Tung University
Adapted from Prof. Hang's slides
Observation
Significant amount of statistical and subjective redundancy within and
between frames
Statistical redundancy
Lossless compression
e.g. 000000000000000 -> run-length coding, arithmetic coding, Huffman coding
Subjective redundancy
Lossy compression
Exploit characteristics of the Human Visual System
Not sensitive to high-frequency components
Spatial redundancy
DCT, with quantization of the high-frequency components
Temporal redundancy
Motion estimation
[Figure: video coding chain: Pre-Processing, Encoding, Decoding, Post-Processing & Error Recovery, Destination, with the scope of the standard marked]
[Figure: timeline, 1950s to 1990s]
DPCM 1952-1980
Motion Compensated Prediction 1972-1989
H.261 1984-1990
JPEG 1984-1992
MPEG-1 1988-1992
MPEG-2 1991-1994
H.263
MPEG-4
ITU/MPEG Standards
H.261
ITU H.261
Optimized for CIF at 384 Kbps, focus on videophone over ISDN
First design (1990) embodying the typical structure that still dominates today:
16x16 macroblock motion compensation, 8x8 DCT, scalar quantization, and
variable-length coding
MPEG-1
ISO/IEC 11172
1993 IS, design focus on VHS quality (352x240)@1.5Mbps
MPEG-2
ISO/IEC 13818
1994 IS, Optimized at NTSC quality CCIR601 video@6-10Mbps
H.263
ITU H.263
Focus on video phone over phone lines/wireless
MPEG-4
officially ISO/IEC 14496
Part 2, Video: 2001 IS, content-based video coding, interactive video
Part 10, Advanced Video Coding (AVC), also ITU H.264
2004 IS, about 50% bit-rate reduction relative to earlier video standards
H.261
Dept. Electronics Engineering, National Chiao Tung University
Institute of Electronics, National Chiao Tung University
H.324 Terminal
[Figure: block arrangement within a macroblock (Y blocks 1-4, Cb, Cr) and macroblock numbering within a group of blocks, in rows of eleven (..., 12-22, 23-33)]
Syntax
Macroblocks are of size 16x16 pixels for the Y frame, and 8x8 for the Cb
and Cr frames, since 4:2:0 chroma subsampling is employed. A
macroblock consists of four Y, one Cb, and one Cr 8x8 blocks.
A DCT is applied to each 8x8 block; the DCT coefficients
then go through quantization, zigzag scan, and entropy coding.
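
The block pipeline described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the helper names (`split_luma`, `dct_2d`, `quantize`, `zigzag_scan`) and the step size are hypothetical, the quantizer is a plain rounding quantizer rather than the rule of the standard, and entropy coding is left out.

```python
import math

N = 8  # the DCT operates on 8x8 blocks

def split_luma(mb):
    """Split a 16x16 luma macroblock (list of 16 rows) into four 8x8 blocks."""
    return [[row[c:c + N] for row in mb[r:r + N]]
            for r in (0, N) for c in (0, N)]

def dct_2d(block):
    """Direct (slow) 8x8 forward DCT-II of a block given as 8 rows of 8 numbers."""
    def c(k):
        return math.sqrt(0.5) if k == 0 else 1.0
    out = [[0.0] * N for _ in range(N)]
    for u in range(N):
        for v in range(N):
            acc = 0.0
            for x in range(N):
                for y in range(N):
                    acc += (block[x][y]
                            * math.cos((2 * x + 1) * u * math.pi / (2 * N))
                            * math.cos((2 * y + 1) * v * math.pi / (2 * N)))
            out[u][v] = 0.25 * c(u) * c(v) * acc
    return out

def quantize(coeffs, step):
    """Simplified quantizer (assumption): round each coefficient to a multiple of step."""
    return [[round(v / step) for v in row] for row in coeffs]

def zigzag_order(n=N):
    """(row, col) positions in zigzag order: (0,0), (0,1), (1,0), (2,0), (1,1), ..."""
    cells = [(r, c) for r in range(n) for c in range(n)]
    # Walk the anti-diagonals; odd diagonals run top-to-bottom, even ones bottom-to-top.
    return sorted(cells, key=lambda p: (p[0] + p[1],
                                        p[0] if (p[0] + p[1]) % 2 else -p[0]))

def zigzag_scan(block):
    """Read a square block into a 1-D list in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]

# Usage: a flat grey macroblock yields only a DC coefficient in each 8x8 block.
macroblock = [[128] * 16 for _ in range(16)]
y_blocks = split_luma(macroblock)
scan = zigzag_scan(quantize(dct_2d(y_blocks[0]), 16))
```

The zigzag scan places low-frequency coefficients first, so the zeros produced by quantizing the high frequencies cluster at the end of the scan.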
Loop filter
Quantization
Deadzone:
To avoid coding too many small coefficients, which are typically due to noise
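
The effect of a dead zone can be seen by comparing two toy quantizers. This is a generic sketch, not the exact quantization rule of H.261; the function names and the step size of 8 are assumptions chosen for illustration.

```python
def quantize_deadzone(coef, step):
    """Truncate |coef| / step toward zero, so every value in (-step, step) maps to 0."""
    level = int(abs(coef) // step)
    return level if coef >= 0 else -level

def quantize_rounding(coef, step):
    """Round to the nearest level, so only values in (-step/2, step/2) map to 0."""
    level = int(abs(coef) / step + 0.5)
    return level if coef >= 0 else -level

# Small, noise-like coefficients survive rounding but fall into the dead zone.
noisy = [5, -7, 3, 8, -17]
with_deadzone = [quantize_deadzone(c, 8) for c in noisy]
with_rounding = [quantize_rounding(c, 8) for c in noisy]
```

With the dead zone, the zero bin is twice as wide as with rounding, so more small coefficients become zero and the zero runs seen by the entropy coder get longer.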
DCT coefficients are converted into run-length representations and then coded
using VLC (Huffman coding of each symbol pair)
Symbol: (zero run-length, non-zero value)
[Table: VLC code length in bits as a function of zero run-length (Run, 0 to 63) and non-zero level; the cell layout is not recoverable from the extracted text]
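
The conversion of a zigzag-scanned block into (zero run-length, non-zero value) symbols can be sketched as follows. The function name `run_level_pairs` is hypothetical, and the VLC lookup that would follow is omitted.

```python
def run_level_pairs(scan):
    """Turn a 1-D list of quantized coefficients into (run, level) pairs.

    run   = number of zeros preceding a non-zero coefficient
    level = the non-zero coefficient itself
    Trailing zeros produce no pair; the decoder learns that the block
    has ended from an end-of-block code.
    """
    pairs = []
    run = 0
    for value in scan:
        if value == 0:
            run += 1
        else:
            pairs.append((run, value))
            run = 0
    return pairs

# Usage: three non-zero coefficients become three symbols.
symbols = run_level_pairs([12, 0, 0, -3, 1, 0, 0, 0])
```

Each pair is then mapped to one variable-length codeword, with shorter codewords for the pairs that occur most often (short runs, small levels).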
H.263
ITU-T Study Group (SG) 15/16: Very low Bit-Rate Visual Telephony
(LBC)
History:
Sept. 1993: Started new work item.
Near-term: Improving H.261
Nov. 1995 H.263 decided
Jan. 1998 H.263+ (H.263 Ver.2) decided
2000 Finished H.263++ (H.263 Ver.3)
Long-term: Draft H.26L
H.264 (2003)
Different from H.261/H.263
In collaboration with MPEG-4 (JVT; AVC)
Goal: Improved quality at lower rates
Result: Significantly better quality at lower rates
Better video at 18-24 Kbps than H.261 at 64 Kbps
Enables videophone over regular phone lines (28.8 Kbps) or wireless
modems
3-D VLC
Last  Run  Level  Bits  VLC Code
0     0    1      3     10s
0     0    2      5     1111s
0     0    3      7     0101 01s
1     0    1      5     0111s
1     0    2      10    0000 1100 1s
1     0    3      12    0000 0000 101s
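
A minimal sketch of 3-D VLC coding, using only the six (Last, Run, Level) entries listed on this slide. The names `to_events` and `encode_events` are hypothetical, and the sketch assumes the usual convention that the sign bit s is 0 for positive and 1 for negative levels. Events outside this small subset raise a `KeyError`, since the full table is not reproduced here.

```python
# Subset of the 3-D VLC table from the slide: (last, run, |level|) -> code without sign bit.
VLC_SUBSET = {
    (0, 0, 1): "10",
    (0, 0, 2): "1111",
    (0, 0, 3): "010101",
    (1, 0, 1): "0111",
    (1, 0, 2): "000011001",
    (1, 0, 3): "00000000101",
}

def to_events(scan):
    """Convert a zigzag-scanned block into (last, run, level) events.

    last = 1 only for the final non-zero coefficient of the block,
    so no separate end-of-block symbol is needed.
    """
    positions = [i for i, v in enumerate(scan) if v != 0]
    events = []
    prev = -1
    for n, i in enumerate(positions):
        last = 1 if n == len(positions) - 1 else 0
        events.append((last, i - prev - 1, scan[i]))
        prev = i
    return events

def encode_events(events):
    """Concatenate the codewords, appending the sign bit s to each one."""
    bits = []
    for last, run, level in events:
        code = VLC_SUBSET[(last, run, abs(level))]
        bits.append(code + ("0" if level > 0 else "1"))  # assumed: s = 1 means negative
    return "".join(bits)

# Usage: three leading non-zero coefficients, the rest of the block is zero.
events = to_events([2, -1, 3] + [0] * 61)
bitstream = encode_events(events)
```

Folding the "last coefficient" flag into the symbol is what makes the code three-dimensional, compared with the two-dimensional (run, level) symbols plus end-of-block code used in H.261.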
$$p(i,j) = \bigl( q(i,j)\,H_0(i,j) + r(i,j)\,H_r(i,j) + s(i,j)\,H_s(i,j) + 4 \bigr) / 8$$
where q(i, j) is the pel displaced by the current MV, MV_0;
r(i, j) is the pel displaced by MV_r (the MV of the top or the
bottom block); and s(i, j) is the pel displaced by MV_s (the MV of the
left or the right block).
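
The weighted sum above can be sketched in Python as follows. The function names are hypothetical, and the 2x2 weight tables are illustrative values chosen only so that the three weights sum to 8 at every position; they are not the 8x8 weighting matrices defined in the standard.

```python
def obmc_pel(q, r, s, h0, hr, hs):
    """One overlapped-MC pel: (q*H0 + r*Hr + s*Hs + 4) / 8 with integer division."""
    return (q * h0 + r * hr + s * hs + 4) // 8

def obmc_block(q_blk, r_blk, s_blk, H0, Hr, Hs):
    """Apply obmc_pel at every (i, j) of equally sized 2-D lists."""
    rows, cols = len(q_blk), len(q_blk[0])
    return [[obmc_pel(q_blk[i][j], r_blk[i][j], s_blk[i][j],
                      H0[i][j], Hr[i][j], Hs[i][j])
             for j in range(cols)]
            for i in range(rows)]

# Illustrative weights (assumption, NOT the standard's matrices); each position sums to 8.
H0 = [[4, 5], [5, 6]]
Hr = [[2, 2], [1, 1]]
Hs = [[2, 1], [2, 1]]

q_blk = [[100, 100], [100, 100]]  # prediction using the current block's MV
r_blk = [[120, 120], [120, 120]]  # prediction using the top/bottom neighbour's MV
s_blk = [[80, 80], [80, 80]]      # prediction using the left/right neighbour's MV
predicted = obmc_block(q_blk, r_blk, s_blk, H0, Hr, Hs)
```

Because the weights sum to 8, the "+ 4" term rounds the result to the nearest integer, and three identical predictions reproduce the input unchanged.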
Overlapped MC (cont.)
PB-picture mode codes two pictures as a group. The second picture (P) is coded first; then the first picture (B) is
coded using both the P-picture and the previously coded picture. This avoids the reordering of pictures
required in the normal B-mode, but it still incurs more coding delay than using P-frames only.
In a B-block, forward prediction (from the previous frame) can be used for all pixels; backward
prediction (from the future frame) is used only for those pels that the backward motion vector aligns
with pels of the current MB. Pixels in the white area of the figure use only forward prediction.
Under large motion, PB-frames do not compress as well as B-frames. An improved PB-frame mode,
defined in H.263+, removes this restriction.
Advantages of Options
H.263+ (cont.)
Error robustness:
Slice structured mode
Reference picture selection mode
Independently segmented decoding mode
Enhanced communication:
Temporal, SNR, and spatial scalability mode
Reduced-resolution update mode