
Multimedia Communications Lecture 10: Video Standards H.261/H.263

H.261, developed from the mid-1980s and approved in December 1990, was the first video coding standard to achieve widespread practical use. It was designed for videophone and videoconferencing over ISDN lines at bit rates of 64-2048 kbps. H.261 uses 16x16-pixel macroblocks, an 8x8 DCT, scalar quantization, and variable-length coding. It defines intra-frames (I-frames), which are compressed spatially, and inter-frames (P-frames), which use motion compensation from previous frames for temporal compression. Because it introduced this core structure, H.261 became the basis for later video coding standards.


Dept. Electronics Engineering, National Chiao Tung University

Multimedia Communications
Lecture 10: Video Standards
Part I. Videophone and video conferencing: H.261/H.263
Dr. Tian-Sheuan Chang
tschang@twins.ee.nctu.edu.tw
Dept. Electronics Engineering
National Chiao-Tung University
Adapted from Prof. Hang's slides

Introduction to Video Standard



Behind the Scene: Why Can We Do Compression?
Institute of Electronics, National Chiao Tung University

Observation
Significant amount of statistical and subjective redundancy within and between frames

Statistical redundancy
Lossless compression
e.g. 000000000000000 -> run-length coding, arithmetic coding, Huffman coding

Subjective redundancy
Lossy compression
Exploit characteristics of the Human Visual System
Not sensitive to high-frequency components
Spatial redundancy
DCT transform, quantized high-frequency components
Temporal redundancy
Motion estimation
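The run-length idea mentioned under statistical redundancy can be sketched in a few lines. This is a minimal illustration only: the function names are hypothetical, and real coders combine run lengths with Huffman or arithmetic coding rather than storing the pairs directly.

```python
from itertools import groupby


def run_length_encode(symbols):
    """Collapse each run of identical symbols into a (symbol, count) pair."""
    return [(sym, len(list(group))) for sym, group in groupby(symbols)]


def run_length_decode(pairs):
    """Expand (symbol, count) pairs back into the original string."""
    return "".join(sym * count for sym, count in pairs)


# A run of fifteen zeros collapses to a single pair.
encoded = run_length_encode("0" * 15)
print(encoded)
print(run_length_decode(encoded) == "0" * 15)
```

The coding is lossless: decoding the pairs always returns the original input.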

The Scope of Picture and Video Coding Standardization

Only the Syntax and Decoder are standardized:
Permits optimization beyond the obvious
Permits complexity reduction for implementability
Provides no guarantees of quality
[Figure: processing chain Source -> Pre-Processing -> Encoding -> Decoding -> Post-Processing & Error Recovery -> Destination, with the scope of the standard marked on the chain]

Development of Coding Tools and Standards


[Figure: timeline from the 1950s to the 1990s]
Entropy Coding 1949-1976
DPCM 1952-1980
Transform Coding 1965-1980
Motion Compensated Prediction 1972-1989
H.261 1984-1990
JPEG 1984-1992
MPEG-1 1988-1992
MPEG-2 1991-1994
H.263, MPEG-4 (no dates shown)

ITU/MPEG Standards

H.261
ITU H.261
Optimized for CIF@384 kbps, focus on video phone over ISDN
First design (late 1990) embodying the typical structure that dominates today
16x16 macroblock motion compensation, 8x8 DCT, scalar quantization, and variable-length coding

MPEG-1
ISO/IEC 11172
1993 IS, design focus on VHS quality (352x240)@1.5 Mbps

MPEG-2
ISO/IEC 13818
1994 IS, optimized at NTSC quality CCIR601 video@6-10 Mbps

H.263
ITU H.263
Focus on video phone over phone lines/wireless

MPEG-4
Officially ISO/IEC 14496
Part 2, Video: 2001 IS, content-based video coding, interactive video
Part 10, Advanced Video Coding (AVC), ITU H.264
2004 IS, about 50% bit-rate reduction compared with earlier video standards

H.261

ITU-T Video Standard: H.261 - History


CCITT Study Group (SG) XV: Videophone and videoconferencing at bit rates of ~40 kb/s to 2 Mb/s
Defines only the decoder; a reference encoder model was developed to test the decoder.
History:
Dec. 1984: The specialists group established.
1984~1988: Algorithm developed for n x 384 kb/s, n = 1, ..., 5.
1989: Modified for p x 64 kb/s, p = 1, ..., 30.
Dec. 1990: Standard approved

ITU-T Multimedia Communications Standards

H.324 Terminal (multimedia communication over PSTN)

H.261 Overall Codec System



Quick View of H.261


ITU-T H.261: The basis of modern video compression
The first widespread practical success
First design (late 1990) embodying the typical structure that dominates today
16x16 macroblock motion compensation, 8x8 DCT, scalar quantization, and variable-length coding

Other key aspects
Loop filter, integer-pel motion compensation accuracy, 2-D VLC for coefficients

Operated at 64-2048 kbps

Still in use
Although mostly as a backward-compatibility feature
Overtaken by H.263

Picture Partition (1)


Picture size: CIF, QCIF

Macroblock (MB): Contains six 8x8 blocks (motion compensation, quantizer adjustment, ...)
[Figure: block layout of a macroblock: four Y blocks numbered 1-4 (1 2 over 3 4), one Cb block, one Cr block]

Group of Blocks (GOB): Contains 33 MBs (synchronization, quantizer adjustment, ...)
[Figure: MB numbering within a GOB, three rows of eleven]
1 2 3 4 5 6 7 8 9 10 11
12 13 14 15 16 17 18 19 20 21 22
23 24 25 26 27 28 29 30 31 32 33

Picture Partition (2)


Picture: Contains 3 (QCIF) or 12 (CIF) GOBs (picture sync., time reference, ...)
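The partition counts can be checked with a little arithmetic. The sketch below assumes the usual luma sizes of 352x288 for CIF and 176x144 for QCIF, which are not stated on the slides; the function and dictionary names are hypothetical.

```python
MB_SIZE = 16        # a macroblock covers 16x16 luma pixels
BLOCKS_PER_MB = 6   # four Y blocks, one Cb block, one Cr block
MBS_PER_GOB = 33    # one GOB holds 33 macroblocks

# Assumed luma dimensions (width, height); not given in the slides.
FORMATS = {"CIF": (352, 288), "QCIF": (176, 144)}


def partition(width, height):
    """Count macroblocks, GOBs and 8x8 blocks for one picture."""
    macroblocks = (width // MB_SIZE) * (height // MB_SIZE)
    return {
        "macroblocks": macroblocks,
        "gobs": macroblocks // MBS_PER_GOB,
        "blocks": macroblocks * BLOCKS_PER_MB,
    }


for name, (width, height) in FORMATS.items():
    print(name, partition(width, height))
```

Under these assumptions CIF yields 396 macroblocks in 12 GOBs and QCIF yields 99 macroblocks in 3 GOBs, matching the "3 or 12 GOBs" on the slide.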

Syntax
Picture Layer: Picture Start Code (PSC: 20 bits), Temporal Reference (TR: 5 bits), Picture Type (PTYPE), Picture Extra Insertion (PEI), Picture Spare (PSPARE)

GOB Layer: Group Start Code (GBSC), Group Number (GN), Quantizer (GQUANT), Extra Insertion (GEI), ...

Syntax (cont.)
Macroblock (MB) Layer: MB Address differential (MBA), MB Type (MTYPE), Quantizer (MQUANT), Motion Vector Data differential (MVD), Coded Block Pattern (CBP)

Syntax (cont.)
Block Layer: (DCT) Transform Coefficients (TCOEFF), End of Block (EOB: 10)

H.261 Frame Sequence


Two types of image frames are defined: Intra-frames (I-frames) and Inter-frames (P-frames):
I-frames are treated as independent images. A transform coding method similar to JPEG is applied within each I-frame, hence "intra".
P-frames are not independent: they are coded by a forward predictive coding method (prediction from a previous P-frame is allowed, not just from a previous I-frame).
Temporal redundancy removal is included in P-frame coding, whereas I-frame coding performs only spatial redundancy removal.
To avoid propagation of coding errors, an I-frame is usually sent a couple of times in each second of the video (intra refresh).

Motion vectors in H.261 are always measured in units of full pixels and they have a limited range of ±15 pixels, i.e., p = 15.

Intra-frame (I-frame) Coding

Macroblocks are of size 16x16 pixels for the Y frame, and 8x8 for the Cb and Cr frames, since 4:2:0 chroma subsampling is employed. A macroblock consists of four Y, one Cb, and one Cr 8x8 blocks.
For each 8x8 block a DCT transform is applied; the DCT coefficients then go through quantization, zigzag scan, and entropy coding.
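The zigzag step of this pipeline can be sketched as follows. This is an illustrative reconstruction of the conventional zigzag order (walking anti-diagonals and alternating direction), not code from the standard; the function names are hypothetical.

```python
def zigzag_order(n=8):
    """Return the (row, col) positions of an n x n block in zigzag order."""
    positions = [(r, c) for r in range(n) for c in range(n)]
    # Positions on one anti-diagonal share r + c. Odd diagonals are walked
    # top-to-bottom (increasing row), even ones bottom-to-top (increasing col).
    return sorted(
        positions,
        key=lambda rc: (rc[0] + rc[1],
                        rc[0] if (rc[0] + rc[1]) % 2 else rc[1]),
    )


def zigzag_scan(block):
    """Read a square block of quantized coefficients in zigzag order."""
    return [block[r][c] for r, c in zigzag_order(len(block))]


print(zigzag_order(8)[:6])
```

The scan places low-frequency coefficients first, so the zeros produced by quantizing high frequencies cluster at the end of the sequence.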

P-Frame (Inter-frame) Coding


The P-frame coding encodes the difference macroblock (not the Target macroblock itself).
Sometimes, a good match cannot be found, i.e., the prediction error exceeds a certain acceptable level.
The MB itself is then encoded (treated as an Intra MB) and in this case it is termed a non-motion compensated MB.

For the motion vector, the difference MVD is sent for entropy coding:
$\mathrm{MVD} = \mathrm{MV}_{\mathrm{Preceding}} - \mathrm{MV}_{\mathrm{Current}}$
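Differential motion-vector coding can be illustrated with a short round trip. The sketch follows the slide's sign convention (preceding minus current) and assumes the predictor starts at (0, 0) for the first macroblock; that starting value and the function names are assumptions for illustration.

```python
def encode_mvds(mvs):
    """Turn a list of (x, y) motion vectors into differences (MVDs)."""
    preceding = (0, 0)  # assumed initial predictor
    mvds = []
    for mv in mvs:
        # Slide convention: MVD = MV_preceding - MV_current, per component.
        mvds.append((preceding[0] - mv[0], preceding[1] - mv[1]))
        preceding = mv
    return mvds


def decode_mvds(mvds):
    """Invert encode_mvds: MV_current = MV_preceding - MVD."""
    preceding = (0, 0)
    mvs = []
    for mvd in mvds:
        mv = (preceding[0] - mvd[0], preceding[1] - mvd[1])
        mvs.append(mv)
        preceding = mv
    return mvs


vectors = [(3, -2), (4, -2), (4, -1)]
print(encode_mvds(vectors))
```

Neighboring macroblocks tend to move together, so the differences are mostly small values that receive short variable-length codes.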

H.261 Encoder (Nonstandard)

[Figure: encoder block diagram, including the loop filter]

H.261 Decoder (Standard)



A Glance at Syntax of H.261 Video Bitstream



Parameter Selection and Rate Control


MTYPE (intra vs. inter, zero vs. non-zero MV in inter, loop filter on/off)
CBP (which blocks in a MB have non-zero DCT coefficients)
MQUANT (allows the quantizer step size to change at the MB level)
Should be varied to satisfy the rate constraint

MV (ideally should be determined not only by the prediction error but also by the total bits used for coding the MV and the DCT coefficients of the prediction error)

Quantization
8x8 DCT; zig-zag scan

Uniform quantizer with a dead-zone:
$$
\mathrm{REC} =
\begin{cases}
\mathrm{QUANT}\,(2\,\mathrm{level} + 1), & \mathrm{level} > 0,\ \mathrm{QUANT\ odd} \\
\mathrm{QUANT}\,(2\,\mathrm{level} - 1), & \mathrm{level} < 0,\ \mathrm{QUANT\ odd} \\
\mathrm{QUANT}\,(2\,\mathrm{level} + 1) - 1, & \mathrm{level} > 0,\ \mathrm{QUANT\ even} \\
\mathrm{QUANT}\,(2\,\mathrm{level} - 1) + 1, & \mathrm{level} < 0,\ \mathrm{QUANT\ even} \\
0, & \mathrm{level} = 0
\end{cases}
$$

QUANT value: 1-31 (5 bits); may be changed for every MB and/or GOB
Exception: Intra-block DC coeff. step size = 8 (fixed) and no dead-zone
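The reconstruction rule above translates directly into code. This is a minimal sketch of the odd/even QUANT cases for coefficients other than the intra DC term; the function name is hypothetical and any clipping of the result is left out.

```python
def reconstruct(level, quant):
    """Rebuild a DCT coefficient from its quantized level.

    Covers every coefficient except the intra DC term, which uses a
    fixed step size of 8 and no dead-zone.
    """
    if not 1 <= quant <= 31:
        raise ValueError("QUANT must be in the range 1..31")
    if level == 0:
        return 0
    sign = 1 if level > 0 else -1
    rec = quant * (2 * level + sign)   # QUANT(2*level + 1) or QUANT(2*level - 1)
    if quant % 2 == 0:
        rec -= sign                    # even QUANT: pull the result one step toward zero
    return rec


print(reconstruct(2, 3), reconstruct(-2, 3), reconstruct(2, 4), reconstruct(-2, 4))
```

The even-QUANT adjustment keeps every non-zero reconstructed value odd, and the result is symmetric in the sign of the level.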

Quantization
The quantization in H.261 uses a constant step size for all DCT coefficients within a macroblock.
If we use DCT and QDCT to denote the DCT coefficients before and after quantization, then for DC coefficients in Intra mode:
$$QDCT = \mathrm{round}\left(\frac{DCT}{\mathrm{step\_size}}\right) = \mathrm{round}\left(\frac{DCT}{8}\right)$$

For other coefficients (floor function for center dead-zone):
$$QDCT = \left\lfloor \frac{DCT}{\mathrm{step\_size}} \right\rfloor = \left\lfloor \frac{DCT}{2 \times scale} \right\rfloor$$

scale: an integer in the range [1, 31].
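A forward quantizer along these lines might look like the sketch below. It assumes a step size of 8 for the intra DC coefficient and 2 x scale for all other coefficients, and it applies the floor to the magnitude so that the dead-zone stays centered on zero for negative inputs; that sign handling and the function names are assumptions, not taken from the slides.

```python
import math


def quantize_intra_dc(dct):
    """Intra DC coefficient: fixed step size of 8, rounded, no dead-zone."""
    return math.floor(dct / 8 + 0.5)


def quantize_other(dct, scale):
    """All other coefficients: step size 2 * scale with a centered dead-zone."""
    if not 1 <= scale <= 31:
        raise ValueError("scale must be in the range 1..31")
    step = 2 * scale
    magnitude = int(abs(dct) // step)   # floor applied to the magnitude (assumption)
    return -magnitude if dct < 0 else magnitude


print(quantize_intra_dc(100), quantize_other(45, 4), quantize_other(-7, 4))
```

With scale = 4, every input strictly between -8 and 8 maps to zero, which is the dead-zone that suppresses small, noise-like coefficients.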

DCT Coefficient Quantization


Deadzone:
To avoid coding too many small coefficients, which are typically due to noise

Variable Length Coding

DCT coefficients are converted into run-length representations and then coded using VLC (Huffman coding for each pair of symbols)
Symbol: (zero run-length, non-zero value range)

Other information is also coded using VLC (Huffman coding)


[Table: VLC code length in bits as a function of run (0 to 63) and absolute level (1 to 128). Small run/level pairs get short codes, starting at 2 (3) bits for run 0, level 1; all remaining combinations use a 20-bit fixed-length code: Escape (6 bits) + Run (6) + Level (8)]
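The conversion from a zigzag-scanned block to (run, level) symbols can be sketched as below. The VLC table itself is not reproduced; only the 20-bit escape layout from the slide is modeled. Function and constant names are hypothetical.

```python
ESCAPE_BITS, RUN_BITS, LEVEL_BITS = 6, 6, 8   # fixed-length fallback from the slide


def run_level_pairs(scanned):
    """Convert zigzag-ordered coefficients into (zero run, level) symbols."""
    pairs = []
    run = 0
    for coeff in scanned:
        if coeff == 0:
            run += 1                  # extend the current run of zeros
        else:
            pairs.append((run, coeff))
            run = 0
    return pairs                      # trailing zeros are implied by End of Block


def escape_length():
    """Bits used when a (run, level) pair has no short VLC entry."""
    return ESCAPE_BITS + RUN_BITS + LEVEL_BITS


print(run_level_pairs([12, 0, 0, -3, 1, 0, 0, 0]), escape_length())
```

Each pair is then looked up in the VLC table, and the block is terminated by the EOB code instead of coding the final run of zeros.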

Motion Estimation and Compensation


Integer-pel accuracy in the range [-15, 15]

Methods for generating the MVs are not specified in the standard
(Standards only define the bitstream syntax, or the decoder operation)

MVs coded differentially (DMV)

Encoder and decoder use the decoded MVs to perform motion compensation
Loop filtering can be applied to suppress temporal propagation of coding noise
Separable filter [1/4, 1/2, 1/4]
Loop filter can be turned on or off
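A separable [1/4, 1/2, 1/4] filter can be sketched by running the same 1-D filter over rows and then over columns. The handling of block edges (edge samples copied unchanged) and the rounding to integers are assumptions made for this illustration, and the function names are hypothetical.

```python
def filter_1d(samples):
    """Apply [1/4, 1/2, 1/4]; the two end samples are copied unchanged."""
    out = list(samples)
    for i in range(1, len(samples) - 1):
        out[i] = (samples[i - 1] + 2 * samples[i] + samples[i + 1]) / 4
    return out


def loop_filter(block):
    """Filter a square block of non-negative pels horizontally, then vertically."""
    rows = [filter_1d(row) for row in block]
    cols = [filter_1d(col) for col in zip(*rows)]       # filter each column
    filtered = [list(row) for row in zip(*cols)]        # transpose back
    return [[int(value + 0.5) for value in row] for row in filtered]


spike = [[0] * 8 for _ in range(8)]
spike[4][4] = 16
smoothed = loop_filter(spike)
print(smoothed[3][3:6], smoothed[4][3:6], smoothed[5][3:6])
```

An isolated spike is spread over its 3x3 neighborhood, which is how the filter attenuates coding noise in the prediction before it propagates to later frames.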

H.263

Very Low Bit Rate Coding

ITU-T Study Group (SG) 15/16: Very Low Bit-Rate Visual Telephony (LBC)
History:
Sept. 1993: Started new work item.
Near-term: Improving H.261
Nov. 1995: H.263 decided
Jan. 1998: H.263+ (H.263 Ver. 2) decided
2000: Finished H.263++ (H.263 Ver. 3)
Long-term: Draft H.26L
H.264 (2003)
Different from H.261 (H.263)
Collaboration with MPEG-4 (JVT = AVC)
Goal: Improved quality at lower rates
Result: Significantly better quality at lower rates
Better video at 18-24 kbps than H.261 at 64 kbps
Enables video phone over regular phone lines (28.8 kbps) or wireless modems

Different From H.261


A combination of H.261 and MPEG
Various picture formats such as sub-QCIF, 4CIF, etc.
Half-pel motion compensation (~MPEG)
No loop filter
No macroblock addressing (included in MB header)
Quantizer step size: 5-bit in picture and GOB layers; differential MQUANT step size: 2-bit in MB layer
3-D VLC for transform coeffs.
Four negotiable options

Video Format and Picture Partition in H.263


Wider application range from sub-QCIF to 16CIF

H.263 Typical Encoder (Nonstandard)


A general source coder model

DCT, Quantization and 3-D VLC


DCT and zig-zag scan: same as H.261 (JPEG)
Inverse quantization: same as that of H.261
At the MB layer, the QUANT value can only be increased or decreased by 1 or 2
3-D VLC: An event (symbol) is made of (Last, Run, Level).
Last = 1 indicates the last non-zero coeff. in the block

3-D VLC
Last  Run  Level  Bits  VLC Code
0     0    1      3     10s
0     0    2      5     1111s
0     0    3      7     0101 01s
1     0    1      5     0111s
1     0    2      10    0000 1100 1s
1     0    3      12    0000 0000 101s
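Forming (Last, Run, Level) events and looking them up can be sketched as follows. Only the six table entries listed on the slide are included, and the sign bit `s` is assumed to be 0 for positive and 1 for negative levels; names such as `SLIDE_VLC` are hypothetical.

```python
# The six codes shown on the slide, without the trailing sign bit "s".
SLIDE_VLC = {
    (0, 0, 1): "10",
    (0, 0, 2): "1111",
    (0, 0, 3): "010101",
    (1, 0, 1): "0111",
    (1, 0, 2): "000011001",
    (1, 0, 3): "00000000101",
}


def last_run_level_events(scanned):
    """Convert zigzag-ordered coefficients into (last, run, level) events."""
    events = []
    run = 0
    for coeff in scanned:
        if coeff == 0:
            run += 1
        else:
            events.append([0, run, coeff])
            run = 0
    if events:
        events[-1][0] = 1             # flag the final non-zero coefficient
    return [tuple(event) for event in events]


def encode_event(event):
    """Look up one event; the sign convention is an assumption (0 = positive)."""
    last, run, level = event
    return SLIDE_VLC[(last, run, abs(level))] + ("0" if level > 0 else "1")


print(last_run_level_events([12, 0, 0, -3, 1, 0, 0, 0]))
print(encode_event((0, 0, 1)), encode_event((1, 0, -2)))
```

Because the Last flag marks the end of the block, no separate End-of-Block symbol is needed, unlike the 2-D (run, level) scheme of H.261.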

Motion Estimation: Median Prediction for MV


1. Horizontal and vertical components are calculated separately
2. The difference between the MV and the predictor is VLC-coded
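The two steps can be sketched as a component-wise median over three candidate vectors followed by a subtraction. Which neighboring macroblocks supply the candidates is shown only in the slide's figure, so the three inputs here are generic placeholders and the function names are hypothetical.

```python
def median3(a, b, c):
    """Middle value of three numbers."""
    return sorted((a, b, c))[1]


def predict_mv(mv1, mv2, mv3):
    """Median predictor, computed separately for x and y."""
    return (median3(mv1[0], mv2[0], mv3[0]),
            median3(mv1[1], mv2[1], mv3[1]))


def mv_difference(mv, predictor):
    """The value that is VLC-coded: current MV minus the predictor."""
    return (mv[0] - predictor[0], mv[1] - predictor[1])


predictor = predict_mv((2, -1), (4, 3), (3, 0))
print(predictor, mv_difference((3.5, 0), predictor))
```

Because each component is handled on its own, the predictor need not equal any single candidate vector.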

Motion Estimation: Half-Pel Precision


Half-pixel prediction to reduce the prediction error
The default range of the MV components (u, v) is now [-16, 15.5]
Half pels are generated by bilinear interpolation
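Bilinear interpolation at half-pel positions can be sketched with integer arithmetic. Coordinates are given in half-pel units so that odd values fall between samples; the rounding offsets (+1 before halving, +2 before dividing by four) and the function name are assumptions for this illustration, and bounds checking is omitted.

```python
def sample_half_pel(ref, x2, y2):
    """Return the reference sample at (x2 / 2, y2 / 2).

    ref is a 2-D list of integer pels indexed as ref[row][col];
    x2 and y2 are non-negative coordinates in half-pel units.
    """
    x, y = x2 // 2, y2 // 2
    half_x, half_y = x2 % 2, y2 % 2
    a = ref[y][x]
    if not half_x and not half_y:
        return a                                   # integer position
    if half_x and not half_y:
        return (a + ref[y][x + 1] + 1) // 2        # between two horizontal pels
    if half_y and not half_x:
        return (a + ref[y + 1][x] + 1) // 2        # between two vertical pels
    return (a + ref[y][x + 1] + ref[y + 1][x] + ref[y + 1][x + 1] + 2) // 4


ref = [[10, 20], [30, 40]]
print([sample_half_pel(ref, x2, y2) for y2 in (0, 1) for x2 in (0, 1)])
```

A motion vector of (0.5, 0.5) therefore predicts each pel from the average of its four integer-position neighbors in the reference frame.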

H.263 Negotiable Options


-- Negotiable between encoder and decoder
Unrestricted motion vectors (UMV) mode: Motion vectors are allowed to point outside the picture
Syntax-based arithmetic coding (SAC) mode: VLC is replaced by arithmetic coding
Advanced prediction (AP) mode: One MV for each 8x8 block
PB-frame (PB) mode: Introduces a constrained version of the (MPEG) B-frame

Advanced Prediction Mode


Four MVs can be used in a MB: the 1st (differential) MV is MVD and the rest are MVD2-4
The MV predictor for each 8x8 block is formed by using 3 nearby MVs
[Figure: candidate MV predictors for each 8x8 block]

AP Mode: Overlapped Motion Compensation

Each pel in the current 8x8 luminance block is predicted using the weighted sum of the pels of three previous-frame predictors: current, left (or right), top (or bottom).
For example, the upper-left 4x4 corner uses the current, top, and left predictors; the upper-right 4x4 corner uses the current, top, and right predictors; etc.
The current predictor is the previous-frame pel displaced using the current MV, the left predictor is displaced using the left block MV, etc.
Four MVs enable a more accurate MV for each block.
Overlapped compensation achieves a smooth transition between nearby blocks.

Overlapped Motion Compensation


$$p(i,j) = \bigl( q(i,j)\,H_0(i,j) + r(i,j)\,H_r(i,j) + s(i,j)\,H_s(i,j) + 4 \bigr) / 8$$
where $q(i,j)$ is the pel displaced by the current MV, $MV_0$; $r(i,j)$ is the pel displaced by $MV_r$ (MV of the top or the bottom block); and $s(i,j)$ is the pel displaced by $MV_s$ (MV of the left or the right block).
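The weighted sum for a single pel can be sketched as below. The actual weight matrices H0, Hr, and Hs are not listed in the slides, so the weights are passed in as plain numbers; the check that they sum to 8 is an assumption implied by the division by 8, and the function name is hypothetical.

```python
def obmc_pel(q, r, s, h0, hr, hs):
    """Overlapped prediction of one pel from three displaced predictors.

    q, r, s are the pels fetched with the current, vertical-neighbor and
    horizontal-neighbor MVs; h0, hr, hs are their integer weights.
    """
    if h0 + hr + hs != 8:
        raise ValueError("weights are assumed to sum to 8")
    return (q * h0 + r * hr + s * hs + 4) // 8    # +4 rounds to nearest


print(obmc_pel(100, 100, 100, 4, 2, 2), obmc_pel(100, 80, 120, 5, 2, 1))
```

When all three predictors agree, the result equals their common value; where they differ, the blend softens the transition across block boundaries.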

Overlapped MC (cont.)

Motion Estimation: PB-Picture Mode


PB-picture mode codes two pictures as a group. The second picture (P) is coded first, then the first picture (B) is coded using both the P-picture and the previously coded picture. This avoids the picture reordering required in the normal B-mode, but it still incurs more coding delay than using P-frames only.
In a B-block, forward prediction (from the previous frame) can be used for all pixels; backward prediction (from the future frame) is used only for those pels where the backward motion vector aligns with pels of the current MB. Pixels in the white area of the figure use only forward prediction.
Under large motion, PB-frames do not compress as well as B-frames. An improved PB-frame mode, defined in H.263+, removes this restriction.

Performance of H.261 and H.263


[Figure: performance curves for the Foreman sequence, QCIF, 12.5 Hz. Curve labels: OBMC, 4 MVs, etc.; Half-pel MC, +/- 32; Integer MC, +/- 32; Integer MC, +/- 16; Integer MC, +/- 16, loop filter]

Advantages of Options
(Girod et al., "Performance of the H.263 Video Compression Standard," VLSI Signal Proc., 1997)

At 64 kbps, QCIF pictures, ~12.5 frames/sec
H.261 vs. H.263: (1) w/o options ~2 dB PSNR improvement; (2) with all options ~3 dB.
Key factor: Half-pel motion estimation.
H.263 SAC option: 0.2 dB improvement (vs. w/o)
H.263 AP option: 1.2 dB (vs. w/o)
H.263 PB option: P-pic PSNR is higher but B-pic PSNR is lower; better subjective quality

H.263+ (H.263 v2)


Enhance H.263 with additional options (Draft 20, Sept. 97)
Coding efficiency:
Advanced intra coding mode
Deblocking filter mode
Improved PB-frames mode
Reference picture resampling mode
Alternative inter VLC mode
Modified quantization mode

H.263+ (cont.)

Error robustness:
Slice structured mode
Reference picture selection mode
Independently segmented decoding mode
Enhanced communication:
Temporal, SNR, and spatial scalability mode
Reduced-resolution update mode
