MMC MQP 1
Data Networks
• Designed to provide basic data communication services such as email and general file transfer
• Most widely deployed networks: the X.25 network (low bit rate data, not suitable for multimedia)
and the Internet (Interconnected Networks)
• Communication protocol: set of rules (defines the sequence and syntax of the messages) that are
adhered to by all communicating parties for the exchange of information/data
• Packet: container for a block of data; at its head is the address of the intended recipient
computer, which is used to route the packet through the network
• Open Systems Interconnection (OSI): a standard description or "reference model" for how
messages should be transmitted between any two points in a telecommunication network
• Access to homes is through an Internet Service provider (ISP)
• Access through PSTN or ISDN (high-bit rate)
• Business users obtain access either through site network or through an enterprise-wide private
network (multiple sites)
• Universities with a single campus use a network known as a Local Area Network (LAN);
bigger universities with more than one campus use an enterprise-wide network
• If the communication protocols of the computers on the network are the same as the Internet
protocols, then the network is known as an intranet (e.g. large companies and universities)
• All types of network are connected using a gateway (router) to the internet backbone network
• Router - a router is a device or, in some cases, software in a computer, that determines the next
network point to which a packet should be forwarded toward its destination
• Packet mode – Operates by transfer of packets as defined earlier
• This mode of operation is chosen because normally the data associated with data applications is
in discrete block format.
• With the new multimedia PCs, packet-mode networks are used to support, in addition to data
communication applications, a range of multimedia applications involving audio, video and speech
• ISDN started to develop in the early 1980s to give PSTN users the capability to have additional
services
• Integrated Services Digital Network (ISDN) in concept is the integration of analogue voice
data together with digital data over the same network
ISDN is a set of ITU standards for digital transmission over ordinary telephone copper wire as well as
over other media. Home and business users who install an ISDN adapter (in place of a modem) can see
highly-graphic Web pages arriving very quickly (up to 128 Kbps). ISDN requires adapters at both ends of
the transmission, so your access provider also needs an ISDN adapter. ISDN is generally available from
your phone company.
• DSL (Digital Subscriber Line) is a technology for bringing high-bandwidth information to homes
and small businesses over ordinary copper telephone lines.
• Assuming your home or small business is close enough to a telephone company central office that
offers DSL service, you may be able to receive continuous transmission of motion video, audio,
and even 3-D effects.
• Typically, individual connections provide between 512 Kbps and 1.544 Mbps downstream and
about 128 Kbps upstream. A DSL line can carry both data and voice signals, and the data part of
the line is continuously connected.
• ISDN provides an access circuit that allows users either two different telephone calls
simultaneously, or a telephone call and a data network connection
• ISDN supports two 64 kbps channels that can be used independently or as a single combined
128 kbps channel (with an additional box of electronics). This is known as the aggregation function
• The connection utilizes only a variable portion of the bandwidth of each link and is known as a
virtual circuit (VC)
• To set up a VC, the source terminal sends a call request control packet to the local PSE which,
in addition to the source and destination addresses, holds a short identifier known as the virtual
circuit identifier (VCI)
• Each PSE maintains a table that specifies the outgoing link to use to reach the network address
• On receipt of the call request the PSE uses the destination address within the packet to determine
the outgoing link
• The next free identifier (VCI) for this link is selected and two entries are made in the routing
table
Connectionless
• In a connectionless network, the establishment of a connection is not required; packets are
routed as and when they arrive
• Each packet must carry the full source and destination address in its header in order for each PSE
to route the packet onto the appropriate outgoing link (router term used rather than PSE)
• In both types each packet is stored in a memory buffer and a check is performed to determine if
any transmission errors are present in the received message (i.e. a 0 received instead of a 1, or vice versa)
• As packets may need to use the same link to transfer information an operation known as store-
and-forward is used.
• The sum of the store and forward delays in each PSE/router contributes to the overall transfer
delay of the packets and the mean of this delay is known as the mean packet transfer delay.
• The variation about the mean are known as the delay variation or jitter
• Example of connectionless mode – Internet
(c)
• The entertainment applications require higher quality / resolution for video and audio since wide-
screen televisions and stereophonic sound are often used
• Normally the subscriber terminal comprises a television with a selection device for interaction
purposes
• The user interactions are relayed to the server through a set-top-box (STB) which contains a high
speed modem
• By means of the menu the user can browse through the movies/videos and initiate the showing of
a selected movie. This is known as Movie-on-demand or Video-on-demand.
• Key features of MOD
• - Subscribers can initiate the showing of a movie from a library of movies at any time of the day
or night
• Issues associated with MOD
• - The server must be capable of playing out simultaneously a large number of video streams
equal to the number of subscribers at any one time
• - This will require high speed information flow from the server (multi-movies + multi-copies)
• To avoid this heavy load, another mode of operation is used in which requests are queued until
the start of the next playout time.
(ii) Decentralized Mode
• The decentralized mode is used with packet-switched networks that support multicast
communications
• E.g – LAN, Intranet, Internet
• The output of each terminal is received by all the other members of the conference/multicast
group
• Hence a conference server is not required and it is the responsibility of each terminal to manage
the information streams that they receive from the other members
(iii) Hybrid Mode
• This type of mode is used when the terminals are connected to different network types
• In this mode the server determines the output stream to be sent to each terminal
2. (c)
Module 2
3. (a) PCM Speech CODEC
PCM is a digitization process, defined in ITU-T Recommendation G.711. A PCM codec consists of an
encoder and a decoder.
It also contains a compressor and an expander. Earlier systems used linear quantization, which
gives the same noise level for both loud and quiet signals.
As the ear is more sensitive to noise on quiet signals than on loud signals, a PCM system uses
non-linear quantization, with narrow intervals for quiet signals obtained through the compressor;
at the destination an expander performs the inverse operation.
The overall operation is called companding. Before sampling and the ADC, the signal is first
passed through the compressor and then quantized. At the receiver, the codeword is passed first
to the DAC and then to the expander.
Two compressor characteristics are in use: A-law and mu-law
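As an illustrative sketch of companding (assuming the standard mu-law characteristic with mu = 255; the function names `compress` and `expand` are our own, not from G.711):

```python
import math

MU = 255  # North American mu-law parameter; A-law (A = 87.6) is the European variant


def compress(x, mu=MU):
    """Mu-law compressor: maps x in [-1, 1] non-linearly, boosting quiet signals
    so that they span more quantization intervals."""
    return math.copysign(math.log(1 + mu * abs(x)) / math.log(1 + mu), x)


def expand(y, mu=MU):
    """Mu-law expander: the inverse of compress(), applied at the destination."""
    return math.copysign((1 + mu) ** abs(y) - 1, y) / mu


# Round trip: compression followed by expansion recovers the sample,
# while quiet samples are expanded before quantization.
for x in (0.01, 0.5):
    y = compress(x)
    assert abs(expand(y) - x) < 1e-9
```

In a real codec the compressed value is quantized between `compress` and `expand`; the sketch omits that step to show only the companding law itself.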
3. (b) Interlaced Scanning
• It is necessary to use a minimum refresh rate of 50 times per second to avoid flicker
• Field: the first comprises only the odd scan lines and the second the even scan lines. The two
fields are then integrated together in the television receiver using a technique known as
interlaced scanning
3.(c)
4. (a)
• This is the ratio of the screen width to the screen height (television tubes and PC monitors have
an aspect ratio of 4:3; wide-screen television is 16:9)
• Unformatted text: Known as plain text; enables pages to be created which comprise strings of
fixed-sized characters from a limited character set
• Formatted Text: Known as rich text; enables pages to be created which comprise strings of
characters of different styles, sizes and shapes, with tables, graphics, and images inserted at
appropriate points
• Hypertext: Enables an integrated set of documents (Each comprising formatted text) to be created
which have defined linkages between them
• Unformatted Text – The basic ASCII character set
• Control characters
(Back space, escape, delete, form feed etc)
• Printable characters
(alphabetic, numeric, and punctuation)
• The American Standard Code for Information Interchange is one of the most widely used
character sets and the table includes the binary codewords used to represent each character (7 bit
binary code)
The characters in columns 010/011 and 110/111 are replaced with the set of mosaic characters;
and then used, together with the various uppercase characters illustrated, to create relatively
simple graphical images
• Although in practice the total page is made up of a matrix of symbols and characters which all
have the same size, some simple graphical symbols and text of larger sizes can be constructed by
the use of groups of the basic symbols
• Formatted Text
• It is produced by most word processing packages and used extensively in the publishing sector
for the preparation of papers, books, magazines, journals and so on..
• Documents of mixed type (characters, different styles, fonts, shape etc) possible.
• Format control characters are used
• Hypertext – Electronic Document in hypertext
• Hypertext can be used to create an electronic version of documents with the index, descriptions of
departments, courses on offer, library, and other facilities all written in hypertext as pages with
various defined hyperlinks
• An example of a hypertext language is HTML, used to describe how the contents of a document
are presented on a printer or a display; other mark-up languages include PostScript, SGML
(Standard Generalized Mark-up Language), TeX, and LaTeX.
4. (c)
Module 3
5. (a)
• The JPEG (Joint Photographic Experts Group) standard forms the basis of most video compression algorithms
• 2-D matrix is required to store the required set of 8-bit grey-level values that represent the
image
• For the colour image if a CLUT is used then a single matrix of values is required
• If the Y, Cr, Cb format is used then the matrix size for the chrominance components is smaller
than the Y matrix ( Reduced representation)
• Once the image format is selected then the values in each matrix are compressed separately using
the DCT
• In order to make the transformation more efficient a second step known as block preparation is
carried out before DCT
• In block preparation each global matrix is divided into a set of smaller 8X8 submatrices (block)
which are fed sequentially to the DCT
• Once the source image format has been selected and prepared (four alternative forms of
representation), the set of values in each matrix is compressed separately using the DCT
• Block preparation is necessary since computing the transformed value for each position in a
matrix requires the values in all the locations to be processed
• Each pixel value is quantized using 8 bits which produces a value in the range 0 to 255 for the R,
G, B or Y and a value in the range –128 to 127 for the two chrominance values Cb and Cr
• If the input matrix is P[x,y] and the transformed matrix is F[i,j] then the DCT for the 8X8 block
is computed using the expression:
F[i,j] = (1/4) C(i) C(j) Σ(x=0 to 7) Σ(y=0 to 7) P[x,y] cos[(2x+1)iπ/16] cos[(2y+1)jπ/16]

where C(i), C(j) = 1/√2 for i, j = 0 and C(i), C(j) = 1 otherwise
All 64 values in the input matrix P[x,y] contribute to each entry in the transformed matrix F[i,j]
• For i = j = 0 the two cosine terms are both 1 (since cos 0 = 1) and hence the value in the
location F[0,0] of the transformed matrix is simply a function of the summation of all the values
in the input matrix
• This is the mean of all 64 values in the matrix and is known as the DC coefficient
• Since the values in all the other locations of the transformed matrix have a frequency coefficient
associated with them they are known as AC coefficients
• for j = 0 only the horizontal frequency coefficients are present
• for i = 0 only the vertical frequency components are present
• For all the other locations both the horizontal and vertical frequency coefficients are present
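The 2-D DCT expression above can be sketched in plain Python (a slow reference implementation for clarity, not an optimized one; the function name `dct_8x8` is our own):

```python
import math


def dct_8x8(P):
    """Forward 2-D DCT of an 8x8 block P whose values are centred around zero."""
    F = [[0.0] * 8 for _ in range(8)]
    for i in range(8):
        for j in range(8):
            ci = 1 / math.sqrt(2) if i == 0 else 1.0
            cj = 1 / math.sqrt(2) if j == 0 else 1.0
            s = sum(P[x][y]
                    * math.cos((2 * x + 1) * i * math.pi / 16)
                    * math.cos((2 * y + 1) * j * math.pi / 16)
                    for x in range(8) for y in range(8))
            F[i][j] = 0.25 * ci * cj * s
    return F


# For a uniform block only the DC coefficient F[0][0] is non-zero:
flat = [[100 - 128] * 8 for _ in range(8)]  # centre by subtracting 128
F = dct_8x8(flat)
```

For the uniform block above, F[0][0] equals 8 times the (centred) pixel value and every AC coefficient is zero, illustrating why F[0,0] is called the DC coefficient.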
• The values are first centred around zero by subtracting 128 from each intensity/luminance value
• There is very little loss of information during the DCT phase itself; the small losses that do
occur are due to the use of fixed-point arithmetic
• The main source of information loss occurs during the quantization and entropy encoding stages
where the compression takes place
• The human eye responds primarily to the DC coefficient and the lower frequency coefficients
(The higher frequency coefficients below a certain threshold will not be detected by the human
eye)
• This property is exploited by dropping the spatial frequency coefficients in the transformed
matrix (dropped coefficients cannot be retrieved during decoding)
• In addition to classifying the spatial frequency components the quantization process aims to
reduce the size of the DC and AC coefficients so that less bandwidth is required for their
transmission (by using a divisor)
• The sensitivity of the eye varies with spatial frequency and hence the amplitude threshold below
which the eye will detect a particular frequency also varies
• The threshold values vary for each of the 64 DCT coefficients and these are held in a 2-D matrix
known as the quantization table with the threshold value to be used with a particular DCT
coefficient in the corresponding position in the matrix
• The choice of threshold value is a compromise between the level of compression that is required
and the resulting amount of information loss that is acceptable
• JPEG standard has two quantization tables for the luminance and the chrominance coefficients.
However, customized tables are allowed and can be sent with the compressed image
• From the quantization table and the DCT and quantization coefficients a number of observations
can be made:
- The computation of the quantized coefficients involves rounding the quotients to the nearest
integer value
- The threshold values used increase in magnitude with increasing spatial frequency
- The DC coefficient in the transformed matrix is largest
- Many of the higher frequency coefficients are zero
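The quantization step described above can be sketched as follows (the threshold table here is illustrative only, not the default JPEG luminance table):

```python
# Illustrative quantization table: thresholds grow with spatial frequency (i + j)
QUANT = [[16 + 2 * (i + j) for j in range(8)] for i in range(8)]


def quantize(F, table):
    """Divide each DCT coefficient by its threshold and round to the nearest integer."""
    return [[round(F[i][j] / table[i][j]) for j in range(8)] for i in range(8)]


def dequantize(Q, table):
    """Decoder multiplies back by the table; the rounding loss is irrecoverable."""
    return [[Q[i][j] * table[i][j] for j in range(8)] for i in range(8)]
```

Small high-frequency coefficients divide to zero, which is exactly what produces the long runs of zeros exploited by the entropy encoder in the next stage.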
• Entropy encoding consists of four stages
Vectoring – The entropy encoding operates on a one-dimensional string of values (vector).
However the output of the quantization is a 2-D matrix and hence this has to be represented
in a 1-D form. This is known as vectoring
Differential encoding – In this stage only the difference in magnitude of the DC coefficient
in a quantized block relative to the value in the preceding block is encoded. This reduces
the number of bits required to encode the relatively large DC magnitudes
The difference values are then encoded in the form (SSS, value), where SSS indicates the number
of bits needed and value holds the actual bits that represent the difference
e.g: if the sequence of DC coefficients in consecutive quantized blocks was: 12, 13, 11, 11,
10, --- the difference values will be 12, 1, -2, 0, -1
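The differential encoding step can be sketched as below (function names `diff_encode` and `sss` are our own labels for the operations described above):

```python
def diff_encode(dc_coeffs):
    """Encode each DC coefficient as its difference from the preceding block's value."""
    prev = 0  # the first block is encoded relative to zero
    out = []
    for dc in dc_coeffs:
        out.append(dc - prev)
        prev = dc
    return out


def sss(value):
    """SSS field: number of bits needed to represent the magnitude of the difference."""
    return abs(value).bit_length()


diff_encode([12, 13, 11, 11, 10])  # → [12, 1, -2, 0, -1], as in the example above
```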
• In order to exploit the presence of the large number of zeros in the quantized matrix, a zig-zag
scan of the matrix is used
• The remaining 63 values in the vector are the AC coefficients
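The zig-zag scan can be sketched as follows (a plain-Python sketch; `zigzag` is our own helper name). Entries are visited along anti-diagonals so that low-frequency coefficients come first and the zero-valued high-frequency coefficients cluster at the end of the vector:

```python
def zigzag(M):
    """Scan an 8x8 matrix in JPEG zig-zag order, returning a 64-element vector."""
    # Sort positions by anti-diagonal d = i + j; alternate the direction of
    # traversal on odd and even diagonals to produce the zig-zag path.
    order = sorted(((i, j) for i in range(8) for j in range(8)),
                   key=lambda p: (p[0] + p[1],
                                  p[0] if (p[0] + p[1]) % 2 else p[1]))
    return [M[i][j] for i, j in order]
```

The first element of the resulting vector is the DC coefficient; the remaining 63 are the AC coefficients mentioned above.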
• Because of the large number of 0’s in the AC coefficients they are encoded as string of pairs of
values
• Each pair is made up of (skip, value) where skip is the number of zeros in the run and value is the
next non-zero coefficient
• For example, the vector 6, 7, 3, 3, 3, 2, 2, 2, 2, 0, 0, … will be encoded as
(0,6) (0,7) (0,3)(0,3)(0,3) (0,2)(0,2)(0,2)(0,2)(0,0)
Final pair indicates the end of the string for this block
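The run-length (skip, value) encoding of the AC coefficients can be sketched as below (`runlength_encode` is our own helper name):

```python
def runlength_encode(ac):
    """Encode AC coefficients as (skip, value) pairs, where skip counts the zeros
    preceding the next non-zero coefficient; (0, 0) terminates the block."""
    pairs, skip = [], 0
    for v in ac:
        if v == 0:
            skip += 1
        else:
            pairs.append((skip, v))
            skip = 0
    pairs.append((0, 0))  # end-of-block marker; trailing zeros need not be sent
    return pairs


runlength_encode([6, 7, 3, 3, 3, 2, 2, 2, 2] + [0] * 54)
# → [(0,6), (0,7), (0,3), (0,3), (0,3), (0,2), (0,2), (0,2), (0,2), (0,0)]
```

Note how the 54 trailing zeros of the vector cost nothing: they are absorbed by the final (0, 0) pair.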
• Significant levels of compression can be obtained by replacing long strings of binary digits by a
string of much shorter codewords
• The length of each codeword is a function of its relative frequency of occurrence
• Normally, a table of codewords is used with the set of codewords precomputed using the
Huffman coding algorithm
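A minimal sketch of building such a codeword table with the Huffman algorithm (`huffman_code` is our own helper; real JPEG uses precomputed tables rather than building one per image):

```python
import heapq
from collections import Counter


def huffman_code(symbols):
    """Build a prefix code whose codeword lengths reflect symbol frequencies:
    frequent symbols receive shorter codewords."""
    freq = Counter(symbols)
    if len(freq) == 1:
        return {next(iter(freq)): "0"}
    # Heap entries: (weight, tiebreak, {symbol: partial codeword})
    heap = [(w, i, {s: ""}) for i, (s, w) in enumerate(freq.items())]
    heapq.heapify(heap)
    count = len(heap)
    while len(heap) > 1:
        w1, _, c1 = heapq.heappop(heap)  # two least-frequent subtrees
        w2, _, c2 = heapq.heappop(heap)
        merged = {s: "0" + c for s, c in c1.items()}
        merged.update({s: "1" + c for s, c in c2.items()})
        count += 1
        heapq.heappush(heap, (w1 + w2, count, merged))
    return heap[0][2]


code = huffman_code("aaaabbc")
# 'a' occurs most often, so len(code['a']) < len(code['c'])
```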
• In order for the remote computer to interpret all the different fields and tables that make up the
bitstream it is necessary to delimit each field and set of table values in a defined way
• The JPEG standard includes a definition of the structure of the total bitstream relating to a
particular image/picture. This is known as a frame
• The role of the frame builder is to encapsulate all the information relating to an encoded
image/picture
5. (c) CPU Management in Multimedia Operating Systems
6. (a)
6.(b) LZW Compression
• The principle of the Lempel-Ziv-Welch coding algorithm is for the encoder and decoder to build
the contents of the dictionary dynamically as the text is being transferred
• Initially the encoder and decoder have only the character set – e.g. ASCII. The remaining
entries in the dictionary are built dynamically by the encoder and decoder
• Initially the encoder sends the index of the four characters T, H, I, S and sends the space character
which will be detected as a non alphanumeric character
• It therefore transmits the character using its index as before but in addition interprets it as
terminating the first word and this will be stored in the next free location in the dictionary
• In applications with 128 characters, the dictionary initially starts with 8-bit indices and 256
entries: 128 for the characters and the remaining 128 for words
• A key issue in determining the level of compression that is achieved, is the number of entries in
the dictionary since this determines the number of bits that are required for the index
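The encoder side of LZW can be sketched as below (a simplified sketch that emits integer indices and never limits dictionary growth; `lzw_encode` is our own name):

```python
def lzw_encode(text):
    """LZW encoder: the dictionary starts with the single-character set and grows
    with one new phrase entry each time an unseen phrase is met."""
    dictionary = {chr(i): i for i in range(128)}  # initial 7-bit ASCII entries
    next_code = 128  # new words occupy the remaining indices
    phrase, out = "", []
    for ch in text:
        if phrase + ch in dictionary:
            phrase += ch  # keep extending the current phrase
        else:
            out.append(dictionary[phrase])       # transmit index of known phrase
            dictionary[phrase + ch] = next_code  # store the new phrase
            next_code += 1
            phrase = ch
    out.append(dictionary[phrase])
    return out


lzw_encode("THIS IS ")
```

On "THIS IS " the second "IS" is sent as a single dictionary index (130) rather than two characters, which is where the compression comes from; the decoder rebuilds the identical dictionary from the index stream alone.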
DSP circuits help in analyzing the signal based on the required (perceptual) features, which are then quantized
These are used with a proper model of the vocal tract to produce synthesized speech
• After analyzing the audio waveform, the perceptual features are quantized and sent, and the
destination uses them, together with a sound synthesizer, to regenerate a sound that is
perceptually comparable with the source audio signal. This is the LPC technique.
• Three features which determine the perception of a signal by the ear are its:
– Pitch
– Period
– Loudness
Segment: a block of sampled signals is analyzed to define the perceptual parameters of the speech
The speech signal generated by the vocal tract model in the decoder is the present output signal
of the speech synthesizer, formed as a linear combination of the previous set of model coefficients
The encoder determines and sends a new set of coefficients for each quantized segment
The output of the encoder is a set of frames; each frame consists of fields for pitch and loudness
Bit rates as low as 2.4 or 1.2 kbps are possible. The sound generated at these rates is very
synthetic, and LPC encoders are therefore used mainly in military applications, where bandwidth is important
7. (b)
• Frame type
– I-frame- Intracoded
• I-frames are encoded without reference to any other frames
– P-frame:intercoded
• The number of P-frames between I-frames is limited since any errors present in
the first P-frame will be propagated to the next
-B-frame: their contents are predicted using search regions in both past and future
frames
-PB-frame: this does not refer to a new frame type as such, but rather to the way two neighboring P- and B-
frames are encoded as if they were a single frame
-D-frame: only used in a specific type of application. It has been defined for use in movie/video-on-
demand applications
8. (a) DPCM Encoder and Decoder
For most audio signals, the range of the differences in amplitude between successive samples of the audio
waveform is less than the range of the actual sample amplitudes
The previous digitized sample value is held in register R
The decoder adds the received DPCM difference to the previously computed signal held in its register
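A minimal sketch of the register-based operation just described (this omits the quantizer that a real DPCM codec places in the feedback loop, so here the round trip is lossless; the function names are our own):

```python
def dpcm_encode(samples):
    """Transmit the difference between each sample and the value held in register R."""
    r, out = 0, []
    for s in samples:
        out.append(s - r)  # differences have a smaller range than the samples
        r = s              # register R now holds the previous digitized value
    return out


def dpcm_decode(diffs):
    """Decoder adds each received difference to the value held in its register."""
    r, out = 0, []
    for d in diffs:
        r += d
        out.append(r)
    return out


assert dpcm_decode(dpcm_encode([100, 104, 103, 99])) == [100, 104, 103, 99]
```

Because successive audio samples differ by small amounts, the difference stream (here 100, 4, -1, -4) needs fewer bits per value than the raw samples.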
8. (b)
Prediction span: the number of frames between a P-frame and the immediately preceding I- or P-frame.
Motion compensation uses the knowledge of object motion so obtained to achieve data compression
(iv) Motion Estimation
Motion estimation examines the movement of objects in an image sequence to try to obtain vectors
representing the estimated motion.
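A block-matching sketch of motion estimation (an exhaustive search using the sum of absolute differences as the matching cost; names and window size are our own illustrative choices):

```python
def sad(block_a, block_b):
    """Sum of absolute differences: the usual block-matching cost."""
    return sum(abs(a - b) for row_a, row_b in zip(block_a, block_b)
               for a, b in zip(row_a, row_b))


def best_motion_vector(ref, cur, bx, by, n=8, search=4):
    """Exhaustively search a +/-search window in the reference frame for the
    best match to the n x n block of the current frame at (bx, by)."""
    target = [row[by:by + n] for row in cur[bx:bx + n]]
    best, best_cost = (0, 0), None
    for dx in range(-search, search + 1):
        for dy in range(-search, search + 1):
            x, y = bx + dx, by + dy
            if 0 <= x and x + n <= len(ref) and 0 <= y and y + n <= len(ref[0]):
                cand = [row[y:y + n] for row in ref[x:x + n]]
                cost = sad(target, cand)
                if best_cost is None or cost < best_cost:
                    best_cost, best = cost, (dx, dy)
    return best  # the estimated motion vector (dx, dy)
```

Only the motion vector and the (small) prediction residual need be encoded, which is how motion compensation turns the estimated motion into data compression.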
– When the ear hears a loud sound, it takes a short but finite time before it can hear a
quieter sound
Module 5
A key challenge in multimedia applications is how to deliver multimedia streams to users with
minimal playback jitter
• Application Layer(top)
• Compression Layer
• Transport Layer
• Transmission Layer
Two techniques to reduce the impact of Jitter on Video Quality:
Traffic Pattern is shaped with desired characteristics such as maximal delay bounds, peak
rate etc.
The bit stream from the coder is fed into a buffer at a rate R’(t) and served at some rate µ(t), so
that the output R(t) meets the specified behavior
The bit stream is smoothed by the buffer whenever the service rate is below the input rate
Traffic shaping and SRC together find an appropriate bit stream description such that the
output R(t) will meet the required specification
Initialization:
• Subtracting the bit counts of the first frame from the total bit counts
Pre-encoded stage:
• Target bit estimation, adjustment of target bit based on buffer status for each QP and VO
Encoding Stage:
Encoding the video frame, recording all actual bit rates and activating the MB layer rate control
Post-encoding stage:
(i) Channel errors
• A buffer can be used to absorb an instantaneous traffic peak to some extent, but the buffer may
overflow in case of congestion.
• In the case of network congestion or buffer overflow, the network congestion control protocol
will drop cells.
• Cell discarding can occur on the transmitting side if the number of cells generated is in excess
of the allocated capacity, or on the receiving side if a cell has not arrived within the delay
time of the buffer memory.
• In such cases the sender could be informed by the network traffic control protocol to reduce the
traffic flow or to switch to a lower grade service mode by sub sampling and interlacing.
• ATM networks have cells as multiplexing units, which are shorter than a full application frame.
• Network framing is used to detect and to possibly correct lost and corrupted multiplexing units
• Errors and losses can be identified by a CRC on the application frame after reassembly.
• It is important that frame length is known a priori because the frame length of a faulty frame is
uncertain.
• The delay requirements might not allow retransmission because it adds another round-trip delay,
which violates the end-to-end delay requirements.
• Loss caused by multiplexing overload is likely to be correlated, as traffic bursts can cause
more losses than the code can correct.