20250320121146-Module-3 MMC Notes
MODULE – 3
Notes (as per Syllabus 2022)
TEXT BOOK:
1. Multimedia Communications: Applications, Networks, Protocols
and Standards, Fred Halsall, Pearson Education, Asia, Second Indian
reprint 2002.
REFERENCE BOOKS:
1. Multimedia Information Networking, Nalin K. Sharda, PHI, 2003.
2. “Multimedia Fundamentals: Vol 1 - Media Coding and Content
Processing”, Ralf Steinmetz, Klara Nahrstedt, Pearson Education,
2004.
3. “Multimedia Systems Design”, Prabhat K. Andleigh, Kiran
Thakrar, PHI, 2004.
Introduction
Compression is used just about everywhere. All the images you get on the web
are compressed, typically in the JPEG or GIF formats, most modems use compression,
HDTV will be compressed using MPEG-2, and several file systems automatically
compress files when stored, and the rest of us do it by hand. The neat thing about
compression, as with the other topics we will cover in this course, is that the algorithms
used in the real world make heavy use of a wide set of algorithmic tools, including
sorting, hash tables, tries, and FFTs. Furthermore, algorithms with strong theoretical
foundations play a critical role in real-world applications.
In this chapter we will use the generic term message for the objects we want to
compress, which could be either files or messages. The task of compression consists of
two components, an encoding algorithm that takes a message and generates a
“compressed” representation (hopefully with fewer bits), and a decoding algorithm that
reconstructs the original message or some approximation of it from the compressed
representation. These two components are typically intricately tied together since they
both have to understand the shared compressed representation.
Consider, for example, a system that rewords sentences into a more standard form, or
replaces words with synonyms so that the file can be better compressed. Technically
the compression would be lossy since the text has changed, but the “meaning” and
clarity of the message might be fully maintained, or even improved. In fact Strunk and
White might argue that good writing is the art of lossy text compression.
Because one can’t hope to compress everything, all compression algorithms must
assume that there is some bias on the input messages so that some inputs are more
likely than others, i.e. that there is some unbalanced probability distribution over the
possible messages. Most compression algorithms base this “bias” on the structure of the
messages – i.e., an assumption that repeated characters are more likely than random
characters, or that large white patches occur in “typical” images. Compression is
therefore all about probability.
The coder component then takes advantage of the probability biases generated in the
model to generate codes. It does this by effectively lengthening low probability
messages and shortening high-probability messages. A model, for example, might have
a generic “understanding” of human faces knowing that some “faces” are more likely
than others (e.g., a teapot would not be a very likely face). The coder would then be able
to send shorter messages for objects that look like faces. This could work well for
compressing teleconference calls. The models in most current real-world compression
algorithms, however, are not so sophisticated, and use more mundane measures such as
repeated patterns in text. Although there are many different ways to design the model
component of compression algorithms and a huge range of levels of sophistication, the
coder components tend to be quite generic—current algorithms are almost
exclusively based on either Huffman or arithmetic codes. Lest we try to make too fine
a distinction here, it should be pointed out that the line between model and coder
components of algorithms is not always well defined. It turns out that information
theory is the glue that ties the model and coder components together. In particular it
gives a very nice theory about how probabilities are related to information content and
code length. As we will see, this theory matches practice almost perfectly, and we can
achieve code lengths almost identical to what the theory predicts.
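As a small illustration of how probability relates to code length (a Python sketch with a made-up symbol distribution, not an example from the text), the information content of a symbol with probability p is -log2(p) bits, the entropy of a source is the average of this quantity, and a Huffman code comes within one bit per symbol of that lower bound; for the dyadic distribution below it matches it exactly.

import math

# Hypothetical symbol probabilities (illustrative only).
probs = {"A": 0.5, "B": 0.25, "C": 0.125, "D": 0.125}

# Information content of each symbol: -log2(p) bits.
for sym, p in probs.items():
    print(sym, "carries", -math.log2(p), "bits")

# Entropy = average information content = lower bound on average code length.
entropy = -sum(p * math.log2(p) for p in probs.values())
print("entropy =", entropy, "bits/symbol")   # 1.75

# A Huffman code for this source (A=0, B=10, C=110, D=111) achieves exactly
# the entropy: 0.5*1 + 0.25*2 + 0.125*3 + 0.125*3 = 1.75 bits/symbol on average.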
Another question about compression algorithms is how one judges the quality of
one versus another. In the case of lossless compression there are several criteria:
the time to compress, the time to reconstruct, the size of the compressed
messages, and the generality—i.e., does it work only on Shakespeare or does it do Byron
too. In the case of lossy compression the judgement is further complicated since we also
have to worry about how good the lossy approximation is. There are typically tradeoffs
between the amount of compression, the runtime, and the quality of the reconstruction.
Depending on the application, one criterion might be more important than another, and
one would want to pick the algorithm appropriately. Perhaps the best attempt to
systematically compare lossless compression algorithms is the Archive Comparison
Test (ACT) by Jeff Gilchrist. It reports times and compression ratios for 100s of
compression algorithms over many databases. It also gives a score based on a weighted
average of runtime and the compression ratio.
Compression principles:
Compression in Multimedia Data:
Compression basically exploits redundancy in the data:
- Temporal — in 1D data, 1D signals, audio, etc.
- Spatial — correlation between neighbouring pixels or data items (see the sketch below).
- Spectral — correlation between colour or luminance components. This uses the
frequency domain to exploit relationships between the frequency of change in data.
- Psycho-visual — exploits perceptual properties of the human visual system.
Basic reason for using lossy compression: the compression ratio of lossless methods
(e.g., Huffman coding, arithmetic coding, LZW) is not high enough.
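As a tiny illustration of spatial redundancy (a Python sketch with made-up pixel values), predicting each pixel from its left neighbour turns a slowly varying scan line into small, repetitive differences that an entropy coder can represent in far fewer bits than the raw values:

# Illustrative sketch: exploiting spatial redundancy with differential coding.
row = [100, 101, 101, 102, 104, 104, 103, 103]   # assumed slowly varying scan line

# Keep the first pixel, then code each pixel as the difference from its left neighbour.
deltas = [row[0]] + [row[i] - row[i - 1] for i in range(1, len(row))]
print(deltas)   # [100, 1, 0, 1, 2, 0, -1, 0] -- small values with a very skewed distribution

# The skewed delta distribution is exactly the kind of "bias" an entropy coder
# (Huffman or arithmetic) needs in order to shorten the encoded output.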
A top-down approach (Shannon-Fano coding)
1. Sort symbols (Tree Sort) according to their frequencies/probabilities, e.g., ABCDE.
2. Recursively divide into two parts, each with approximately the same total count,
assigning 0 to one part and 1 to the other (see the sketch below).
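A minimal sketch of this top-down construction (the symbol counts for A-E below are assumed for illustration): at each step the sorted symbols are split into two groups of roughly equal total count, one group receiving a 0 and the other a 1.

# Top-down (Shannon-Fano) code construction -- illustrative sketch with assumed counts.
def shannon_fano(symbols, prefix=""):
    # symbols: list of (symbol, count), sorted by decreasing count
    if len(symbols) == 1:
        return {symbols[0][0]: prefix or "0"}
    total = sum(c for _, c in symbols)
    # choose the split point that makes the two parts' counts as equal as possible
    best_split, best_diff, running = 1, None, 0
    for i in range(len(symbols) - 1):
        running += symbols[i][1]
        diff = abs(2 * running - total)
        if best_diff is None or diff < best_diff:
            best_split, best_diff = i + 1, diff
    codes = {}
    codes.update(shannon_fano(symbols[:best_split], prefix + "0"))   # first part gets 0
    codes.update(shannon_fano(symbols[best_split:], prefix + "1"))   # second part gets 1
    return codes

counts = [("A", 15), ("B", 7), ("C", 6), ("D", 6), ("E", 5)]   # assumed counts
print(shannon_fano(counts))
# {'A': '00', 'B': '01', 'C': '10', 'D': '110', 'E': '111'}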
Huffman Coding
- Based on the frequency of occurrence of a data item (pixels or small blocks of pixels in
images).
- Uses fewer bits to encode more frequent data.
- Codes are stored in a Code Book—as for the Shannon-Fano approach above.
- A code book is constructed for each image or a set of images.
- The code book plus encoded data must be transmitted to enable decoding (see the sketch below).
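The sketch below shows the bottom-up Huffman construction in Python, using the symbol frequencies from question 8 of the question bank at the end of this module (0.3, 0.3, 0.2, 0.1, 0.1): the two least frequent subtrees are merged repeatedly, so frequent symbols end up with short codes and rare symbols with longer ones.

import heapq
from itertools import count

# Bottom-up Huffman code construction -- illustrative sketch.
def huffman_codes(freqs):
    tie = count()   # tie-breaker so heapq never has to compare the dicts
    # each heap entry: (total frequency, tie, {symbol: code-so-far})
    heap = [(f, next(tie), {s: ""}) for s, f in freqs.items()]
    heapq.heapify(heap)
    while len(heap) > 1:
        f1, _, c1 = heapq.heappop(heap)   # two least frequent subtrees
        f2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + code for s, code in c1.items()}
        merged.update({s: "1" + code for s, code in c2.items()})
        heapq.heappush(heap, (f1 + f2, next(tie), merged))
    return heap[0][2]

freqs = {"A": 0.3, "B": 0.3, "C": 0.2, "D": 0.1, "E": 0.1}   # from question 8 below
print(huffman_codes(freqs))
# e.g. {'C': '00', 'D': '010', 'E': '011', 'A': '10', 'B': '11'} -- average length 2.2 bits/symbol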
JPEG
JPEG refers to a wide variety of possible image compression approaches that have been
collected into a single standard. In this section we attempt to describe JPEG in a
somewhat general but comprehensible way. A very complete description of the JPEG
standard has been presented in Pennebaker and Mitchell [1993]. As mentioned in the
preceding, the JPEG standard has both lossless and lossy components. In addition, the
entropy coding employed by JPEG can be either Huffman coding or binary arithmetic
coding. Figures 5.1 and 5.2 present very general image compression models that help
describe the JPEG standard. In Figure 5.1 the compression process is broken into two
basic functions: modeling the image data and entropy coding the description provided
by a particular model. As the figure indicates, the modeling and entropy coding are
separate. Hence whether Huffman or arithmetic entropy codes are used is irrelevant to
the modeling. Any standard application-specific or image-specific coding tables can be
used for entropy coding. The reverse process is illustrated in Figure 5.2. The modes of
operation for JPEG are depicted in Figure 5.3. Two basic functional modes exist:
nonhierarchical and hierarchical. Within the nonhierarchical modes are the sequential
lossless and the lossy DCT-based sequential and progressive modes. The sequential
modes progress through an image segment in a strict left-to-right, top-to-bottom pass.
The progressive modes allow several refinements through an image segment, with
increasing quality after each refinement. The hierarchical mode allows coding with increasing
resolution, coding of difference images, and multiple frames per image (the
nonhierarchical modes allow only a single frame per image).
DCT-Based Image Compression
The basis for JPEG's lossy compression is the two-dimensional DCT. An image is
broken into 8 x 8 blocks on which the transform is computed. The transform allows
different two-dimensional frequency components to be coded separately. Image
compression is obtained through quantization of these DCT coefficients to a relatively
small set of finite values. These values (or some representation of them) are entropy
coded and stored as a compressed version of the image.
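A rough numerical sketch of this pipeline for a single 8 x 8 block, using an explicit orthonormal DCT-II basis matrix built with NumPy; the synthetic block and the single uniform quantizer step are assumptions for illustration (JPEG itself uses an 8 x 8 table of per-frequency step sizes and zig-zag ordering before entropy coding).

import numpy as np

# Orthonormal 8x8 DCT-II basis matrix C, so the 2-D transform is F = C @ block @ C.T
N = 8
C = np.array([[np.sqrt((1 if k == 0 else 2) / N) *
               np.cos(np.pi * (2 * n + 1) * k / (2 * N))
               for n in range(N)] for k in range(N)])

# Synthetic 8x8 block of pixel values (assumed; smooth like typical image data).
cols = np.arange(N)
block = np.tile(128 + 30 * np.cos(np.pi * cols / 8), (N, 1))

F = C @ block @ C.T        # forward 2-D DCT: energy packs into a few low-frequency coefficients
Q = 16                     # single uniform quantizer step, for illustration only
Fq = np.round(F / Q)       # quantized coefficients: most become zero and entropy-code cheaply
block_hat = C.T @ (Fq * Q) @ C   # dequantize and apply the inverse 2-D DCT

print(int(np.count_nonzero(Fq)), "of 64 coefficients are nonzero after quantization")
print("max reconstruction error:", float(np.abs(block - block_hat).max()))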
RECOMMENDED QUESTIONS:
1. Differentiate between entropy & source encoding?[08]
2. Explain with the block diagrams DCT, spatial frequency, horizontal and vertical
frequency components?[10]
3. Encode using static Huffman coding for following.
0.25, 0.25, 0.14, 0.14, 0.055, 0.055, 0.055, 0.055. Write the code word generation &
construct Huffman code tree.[10]
4. Develop Dynamic Huffman coding tree for following sentence. “This is”
5. Find the code word for following text “went” using Arithmetic coding technique.
e=0.3, n= 0.3, t=0.2, w=0.1, .=0.1 & explain the coding operation in detail.[07]
6. Explain the Dictionary models used for text compression with an example.[05]
7. Explain in detail the concepts & types of animation.[05]
8. Using Huffman coding, derive code set for following. 0.3, 0.3, 0.2, 0.1,0.1.[05]
9. Define additive & subtractive color mixing.[04]
10. What is dynamic Huffman coding used in text?[06]
11. What is sprite Animation?[03]
12. Define luminance & chrominance.[05]
13. What is discrete cosine transform?[05]
14. Define text & image.[04]
15. Define audio & video.[04]
16. Compare formatted & unformatted text.[08]
17. What is rendering & clip art?[06]
18. What is flicker & frame refresh rate?[05]
19. What is NTSC & PAL?[05]
20. What is sample & hold, Quantizer?[05]
21. Define aspect ratio & pixel depth.[05]
22. What is composite video signal?[05]
23. Define Run-length encoding & statistical encoding.[06]
24. What is synchronization?[02]
25. Explain the principle of operation of the LZW compression algorithm & how this
is different from the LZ algorithm?[10]
26. Describe the principles of TIFF algorithm & its application domains.[10]
27. Discriminate between a one dimensional coding scheme & a two dimensional
(MMR) scheme?[10]