21ECE72_Coding and Cryptography Module 1
Text Book: Bose, Ranjan. Information Theory, Coding and Cryptography, 3rd Edition, Tata McGraw-Hill Education, 2015, ISBN: 978-9332901257.
The entropy of a source is H = Σ pi logb(1/pi), where pi is the probability of occurrence of character number i from a given stream of characters and b is the base of the logarithm used. Hence, this is also called Shannon's entropy.
Conditional Entropy: The amount of uncertainty remaining about the channel input after observing the channel output is called conditional entropy.
It is denoted by H(X|Y), where X is the channel input and Y is the channel output.
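As a numerical illustration (not taken from the textbook), the short Python sketch below computes H(X|Y) for an assumed joint distribution p(x, y) of a binary channel; the probability values are made up purely to demonstrate the averaging H(X|Y) = Σ p(x, y) log2(1/p(x|y)).

import math

# Assumed (hypothetical) joint distribution p(x, y) of channel input X and output Y.
# x and y each take values in {0, 1}; the four probabilities sum to 1.
p_xy = {(0, 0): 0.40, (0, 1): 0.10,
        (1, 0): 0.05, (1, 1): 0.45}

# Marginal distribution of the output Y: p(y) = sum over x of p(x, y).
p_y = {}
for (x, y), p in p_xy.items():
    p_y[y] = p_y.get(y, 0.0) + p

# H(X|Y) = sum over (x, y) of p(x, y) * log2( p(y) / p(x, y) ),
# i.e. the average uncertainty about the input X after the output Y is observed.
H_X_given_Y = sum(p * math.log2(p_y[y] / p) for (x, y), p in p_xy.items())
print(round(H_X_given_Y, 3))   # ≈ 0.603 bit for this assumed distribution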
Example:
Consider a diskette storing a data file consisting of 100,000 binary digits (binits), i.e., a total of 100,000 “0”s and “1”s. If the binits 0 and 1 occur with probabilities of ¼ and ¾ respectively, then binit 0 conveys an amount of information equal to log2(4/1) = 2 bits, while binit 1 conveys information amounting to log2(4/3) = 0.42 bit.
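The two self-information values above can be checked quickly; the snippet below (an illustrative sketch only) evaluates log2(1/p) for p = 1/4 and p = 3/4.

import math

# Self-information I = log2(1/p) of each binit in the diskette example.
for binit, p in (("0", 1/4), ("1", 3/4)):
    print(binit, round(math.log2(1 / p), 2), "bits")
# binit 0 -> 2.0 bits, binit 1 -> 0.42 bit (log2(4/3) ≈ 0.415)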
The quantity H is called the entropy of a discrete memory-less source. It is a measure of the average
information content per source symbol. It may be noted that the entropy H depends on the probabilities of
the symbols in the alphabet of the source.
Example
Consider a discrete memory-less source with source alphabet {s0, s1, s2} with probabilities p0 = 1/4, p1 = 1/4
and p2=1/2. Find the entropy of the source.
Solution
The entropy of the given source is
H = p0log2(1/p0) + p1log2(1/p1) + p2log2(1/p2)
= ¼log2(4) + ¼log2(4) + ½log2(2)
= 2/4 + 2/4 + 1/2
= 1.5 bits
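The same computation can be verified numerically; the sketch below (illustrative, not from the textbook) evaluates H = Σ pk log2(1/pk) for the probabilities given in the example.

import math

# Source probabilities from the example: p0 = p1 = 1/4, p2 = 1/2.
probs = [1/4, 1/4, 1/2]

# Entropy H = sum of p * log2(1/p) over all symbols of the alphabet.
H = sum(p * math.log2(1 / p) for p in probs)
print(H)   # 1.5 bits/symbol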
For a discrete memory-less source with a fixed alphabet:
• H=0, if and only if the probability pk=1 for some k, and the remaining probabilities in
the set are all zero. This lower bound on the entropy corresponds to ‘no uncertainty’.
• H=log2(K), if and only if pk=1/K for all k (i.e. all the symbols in the alphabet are
equiprobable). This upper bound on the entropy corresponds to ‘maximum
uncertainty’.
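These two bounds are easy to check numerically. The sketch below is illustrative only, with an assumed alphabet size K = 4: it evaluates the entropy of a deterministic source (lower bound) and of an equiprobable source (upper bound).

import math

def entropy(probs):
    """Entropy H = sum of p * log2(1/p), skipping zero-probability symbols."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

K = 4                                    # assumed alphabet size for illustration
print(entropy([1, 0, 0, 0]))             # 0.0        -> no uncertainty (H = 0)
print(entropy([1/K] * K), math.log2(K))  # 2.0  2.0   -> maximum uncertainty (H = log2 K)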
• In Case I, it is very easy to guess whether the message s0 with a probability of 0.01 will occur or the message s1 with a probability of 0.99 will occur (most of the time message s1 will occur). Thus, in this case, the uncertainty is less.
• In Case II, it is somewhat difficult to guess whether s0 will occur or s1 will occur, as their probabilities are nearly equal. Thus, in this case, the uncertainty is more.
• In Case III, it is extremely difficult to guess whether s0 or s1 will occur, as their probabilities are equal. Thus, in this case, the uncertainty is maximum.
Entropy is less when uncertainty is less.
Entropy is more when uncertainty is more.
Thus, we can say that entropy is a measure of uncertainty.
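The three cases can be compared numerically. In the sketch below, Case I uses the probabilities 0.01/0.99 quoted above; the pair 0.45/0.55 for Case II is an assumed "nearly equal" choice (the notes do not give exact values), and Case III is the equiprobable pair 0.5/0.5.

import math

def entropy(probs):
    # H = sum of p * log2(1/p) over the two messages s0 and s1.
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

cases = {"Case I":   [0.01, 0.99],   # very unequal -> low uncertainty
         "Case II":  [0.45, 0.55],   # nearly equal -> higher uncertainty (assumed values)
         "Case III": [0.50, 0.50]}   # equal        -> maximum uncertainty

for name, probs in cases.items():
    print(name, round(entropy(probs), 3), "bits")
# Case I: 0.081 bits < Case II: 0.993 bits < Case III: 1.0 bit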
An analog signal is band-limited to B Hz, sampled at the Nyquist rate, and the samples are quantized into 4 levels. The quantization levels Q1, Q2, Q3, and Q4 (messages) are assumed independent and occur with probabilities P1 = P4 = 1/8 and P2 = P3 = 3/8. Find the information rate of the source.
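A worked sketch of this problem is given below. The Nyquist rate gives r = 2B messages per second, so the answer stays a multiple of B; the probabilities 1/8 and 3/8 are as stated in the problem.

import math

# Quantization level probabilities: P1 = P4 = 1/8, P2 = P3 = 3/8.
probs = [1/8, 3/8, 3/8, 1/8]

# Entropy per message (bits per sample).
H = sum(p * math.log2(1 / p) for p in probs)
print(round(H, 3))                       # ≈ 1.811 bits per message

# Information rate R = r * H with r = 2B messages/s, so R ≈ 3.62B bits/s.
print("R ≈", round(2 * H, 2), "* B bits/s")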
Relation between Entropy and Mutual Information
Mutual Information: quantifies the amount of information that knowing one random variable Y gives about
another random variable X. It is a measure of how much the uncertainty in X is reduced by knowing Y.
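As a numerical illustration of this relation, the sketch below reuses the assumed joint distribution from the conditional-entropy example earlier and verifies that I(X;Y) = H(X) - H(X|Y) equals H(Y) - H(Y|X); the probability values are hypothetical.

import math

# Assumed joint distribution p(x, y) (illustrative values only).
p_xy = {(0, 0): 0.40, (0, 1): 0.10,
        (1, 0): 0.05, (1, 1): 0.45}

# Marginal distributions of X and Y.
p_x, p_y = {}, {}
for (x, y), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

def entropy(dist):
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

H_X, H_Y, H_XY = entropy(p_x), entropy(p_y), entropy(p_xy)

# I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X), using H(X|Y) = H(X,Y) - H(Y).
I_1 = H_X - (H_XY - H_Y)
I_2 = H_Y - (H_XY - H_X)
print(round(I_1, 4), round(I_2, 4))   # both ≈ 0.3973 bit for these values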
SHANNON-FANO CODING:
Lempel-Ziv-Welch Coding
A drawback of the Huffman code is that it requires knowledge of a probabilistic model of the source; unfortunately, in practice, source statistics are not always known a priori, thereby compromising the efficiency of the code. To overcome these practical limitations, we may use the Lempel-Ziv algorithm, which is intrinsically adaptive and simpler to implement than Huffman coding.
A key to file data compression is to have repetitive patterns of data so that patterns seen once can then be encoded into a compact code symbol, which is then used to represent the pattern whenever it reappears in the file. For example, in images, consecutive scan lines (rows) of the image may be identical. They can then be encoded with a simple code character that represents the lines. In text processing, repetitive words, phrases, and sentences may also be recognized and represented as a code.

A typical file data compression algorithm is known as LZW (Lempel-Ziv-Welch) encoding. Variants of this algorithm are used in many file compression schemes, such as GIF files. These are lossless compression algorithms in which no data is lost, and the original file can be entirely reconstructed from the encoded message file. The LZW algorithm is a greedy algorithm in that it tries to recognize increasingly longer phrases that are repetitive, and encode them. Each phrase is defined to have a prefix that is equal to a previously encoded phrase plus one additional character in the alphabet. Note that “alphabet” means the set of legal characters in the file. For a normal text file, this is the ASCII character set. For a gray-level image with 256 gray levels, it is an 8-bit number that represents the pixel’s gray level.

In many texts certain sequences of characters occur with high frequency. In English, for example, the word “the” occurs more often than any other sequence of three letters, with “and”, “ion”, and “ing” close behind. If we include the space character, there are other very common sequences, including longer ones like “of the”. Although it is impossible to improve on Huffman encoding with any method that assigns a fixed encoding to each character, we can do better by encoding entire sequences of characters with just a few bits. The method of this section takes advantage of frequently occurring character sequences of any length. It typically produces an even smaller representation than is possible with Huffman trees, and unlike basic Huffman encoding it 1) reads through the text only once and 2) requires no extra space for overhead in the compressed representation.

The algorithm makes use of a dictionary that stores character sequences chosen dynamically from the text. With each character sequence the dictionary associates a number; if s is a character sequence, we use codeword(s) to denote the number assigned to s by the dictionary. The number codeword(s) is called the code or code number of s. All codes have the same length in bits; a typical code size is twelve bits, which permits a maximum dictionary size of 2^12 = 4096 character sequences.
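To make the description concrete, here is a minimal LZW encoder sketch in Python. It illustrates the greedy, prefix-based dictionary growth described above, with the dictionary initialized from the ASCII alphabet and capped at 4096 entries to match the 12-bit code size; it is an illustrative sketch, not the exact scheme used in GIF, and the decoder is omitted.

def lzw_encode(text, max_codes=2**12):
    """Greedy LZW encoding: emit the code of the longest known prefix,
    then add that prefix plus one extra character as a new dictionary entry."""
    # Start with one codeword per single character of the alphabet (here: ASCII).
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    codes = []
    phrase = ""
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            phrase = candidate            # keep extending the current phrase
        else:
            codes.append(dictionary[phrase])
            if next_code < max_codes:     # 12-bit codes -> at most 4096 entries
                dictionary[candidate] = next_code
                next_code += 1
            phrase = ch                   # restart from the unmatched character
    if phrase:
        codes.append(dictionary[phrase])
    return codes

# Repetitive input compresses well: repeated phrases reuse dictionary entries.
print(lzw_encode("the cat and the hat and the bat"))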