21ECE72_Coding and Cryptography Module 1
Text Book: Bose, Ranjan. Information Theory, Coding and Cryptography, 3rd Edition, Tata McGraw-Hill Education, 2015, ISBN: 978-9332901257.
The entropy of a source is H = Σ pi logb(1/pi), where pi is the probability of occurrence of character number i from a given stream of characters and b is the base of the logarithm used. Hence, this is also called Shannon's entropy.
Conditional Entropy: The amount of uncertainty remaining about the channel input after observing the channel output is called conditional entropy.
It is denoted by H(X|Y), where X is the channel input and Y is the channel output.
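As a numerical illustration (not taken from the textbook), the short Python sketch below computes H(X|Y) for an assumed joint distribution p(x, y) of a binary channel; the probability values are made up purely to demonstrate the averaging H(X|Y) = Σ p(x, y) log2(1/p(x|y)).

import math

# Assumed (hypothetical) joint distribution p(x, y) of channel input X and output Y.
# x and y each take values in {0, 1}; the four probabilities sum to 1.
p_xy = {(0, 0): 0.40, (0, 1): 0.10,
        (1, 0): 0.05, (1, 1): 0.45}

# Marginal distribution of the output Y: p(y) = sum over x of p(x, y).
p_y = {}
for (x, y), p in p_xy.items():
    p_y[y] = p_y.get(y, 0.0) + p

# H(X|Y) = sum over (x, y) of p(x, y) * log2( p(y) / p(x, y) ),
# i.e. the average uncertainty about the input X after the output Y is observed.
H_X_given_Y = sum(p * math.log2(p_y[y] / p) for (x, y), p in p_xy.items())
print(round(H_X_given_Y, 3))   # ≈ 0.603 bit for this assumed distribution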
Example:
Consider a diskette storing a data file consisting of 100,000 binary digits (binits), i.e., a total of 100,000 “0”s and “1”s. If the binits 0 and 1 occur with probabilities of ¼ and ¾ respectively, then binit 0 conveys an amount of information equal to log2(4/1) = 2 bits, while binit 1 conveys information amounting to log2(4/3) = 0.42 bit.
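The two self-information values above can be checked quickly; the snippet below (an illustrative sketch only) evaluates log2(1/p) for p = 1/4 and p = 3/4.

import math

# Self-information I = log2(1/p) of each binit in the diskette example.
for binit, p in (("0", 1/4), ("1", 3/4)):
    print(binit, round(math.log2(1 / p), 2), "bits")
# binit 0 -> 2.0 bits, binit 1 -> 0.42 bit (log2(4/3) ≈ 0.415)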
The quantity H is called the entropy of a discrete memory-less source. It is a measure of the average
information content per source symbol. It may be noted that the entropy H depends on the probabilities of
the symbols in the alphabet of the source.
Example
Consider a discrete memory-less source with source alphabet {s0, s1, s2} with probabilities p0 = 1/4, p1 = 1/4
and p2=1/2. Find the entropy of the source.
Solution
The entropy of the given source is
H = p0log2(1/p0) + p1log2(1/p1) + p2log2(1/p2)
= ¼log2(4) + ¼log2(4) + ½log2(2)
= 2/4 + 2/4 + 1/2
= 1.5 bits
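The same computation can be verified numerically; the sketch below (illustrative, not from the textbook) evaluates H = Σ pk log2(1/pk) for the probabilities given in the example.

import math

# Source probabilities from the example: p0 = p1 = 1/4, p2 = 1/2.
probs = [1/4, 1/4, 1/2]

# Entropy H = sum of p * log2(1/p) over all symbols of the alphabet.
H = sum(p * math.log2(1 / p) for p in probs)
print(H)   # 1.5 bits/symbol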
For a discrete memory-less source with a fixed alphabet:
• H=0, if and only if the probability pk=1 for some k, and the remaining probabilities in
the set are all zero. This lower bound on the entropy corresponds to ‘no uncertainty’.
• H=log2(K), if and only if pk=1/K for all k (i.e. all the symbols in the alphabet are
equiprobable). This upper bound on the entropy corresponds to ‘maximum
uncertainty’.
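These two bounds are easy to check numerically. The sketch below is illustrative only, with an assumed alphabet size K = 4: it evaluates the entropy of a deterministic source (lower bound) and of an equiprobable source (upper bound).

import math

def entropy(probs):
    """Entropy H = sum of p * log2(1/p), skipping zero-probability symbols."""
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

K = 4                                    # assumed alphabet size for illustration
print(entropy([1, 0, 0, 0]))             # 0.0        -> no uncertainty (H = 0)
print(entropy([1/K] * K), math.log2(K))  # 2.0  2.0   -> maximum uncertainty (H = log2 K)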
• In Case I, it is very easy to guess whether the message s0 with a probability of 0.01 will occur or the message s1 with a probability of 0.99 will occur (most of the time message s1 will occur). Thus, in this case, the uncertainty is less.
• In Case II, it is somewhat difficult to guess whether s0 will occur or s1 will occur, as their probabilities are nearly equal. Thus, in this case, the uncertainty is more.
• In Case III, it is extremely difficult to guess whether s0 or s1 will occur, as their probabilities are equal. Thus, in this case, the uncertainty is maximum.
Entropy is less when uncertainty is less.
Entropy is more when uncertainty is more.
Thus, we can say that entropy is a measure of uncertainty.
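The three cases can be compared numerically. In the sketch below, Case I uses the probabilities 0.01/0.99 quoted above; the pair 0.45/0.55 for Case II is an assumed "nearly equal" choice (the notes do not give exact values), and Case III is the equiprobable pair 0.5/0.5.

import math

def entropy(probs):
    # H = sum of p * log2(1/p) over the two messages s0 and s1.
    return sum(p * math.log2(1 / p) for p in probs if p > 0)

cases = {"Case I":   [0.01, 0.99],   # very unequal -> low uncertainty
         "Case II":  [0.45, 0.55],   # nearly equal -> higher uncertainty (assumed values)
         "Case III": [0.50, 0.50]}   # equal        -> maximum uncertainty

for name, probs in cases.items():
    print(name, round(entropy(probs), 3), "bits")
# Case I: 0.081 bits < Case II: 0.993 bits < Case III: 1.0 bit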
An analog signal is band-limited to B Hz, sampled at the Nyquist rate, and the samples are quantized into 4 levels. The quantization levels Q1, Q2, Q3, and Q4 (messages) are assumed independent and occur with probabilities P1 = P4 = 1/8 and P2 = P3 = 3/8. Find the information rate of the source.
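A worked sketch of this problem is given below. The Nyquist rate gives r = 2B messages per second, so the answer stays a multiple of B; the probabilities 1/8 and 3/8 are as stated in the problem.

import math

# Quantization level probabilities: P1 = P4 = 1/8, P2 = P3 = 3/8.
probs = [1/8, 3/8, 3/8, 1/8]

# Entropy per message (bits per sample).
H = sum(p * math.log2(1 / p) for p in probs)
print(round(H, 3))                       # ≈ 1.811 bits per message

# Information rate R = r * H with r = 2B messages/s, so R ≈ 3.62B bits/s.
print("R ≈", round(2 * H, 2), "* B bits/s")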
Relation between Entropy and Mutual Information
Mutual Information: quantifies the amount of information that knowing one random variable Y gives about
another random variable X. It is a measure of how much the uncertainty in X is reduced by knowing Y.
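As a numerical illustration of this relation, the sketch below reuses the assumed joint distribution from the conditional-entropy example earlier and verifies that I(X;Y) = H(X) - H(X|Y) equals H(Y) - H(Y|X); the probability values are hypothetical.

import math

# Assumed joint distribution p(x, y) (illustrative values only).
p_xy = {(0, 0): 0.40, (0, 1): 0.10,
        (1, 0): 0.05, (1, 1): 0.45}

# Marginal distributions of X and Y.
p_x, p_y = {}, {}
for (x, y), p in p_xy.items():
    p_x[x] = p_x.get(x, 0.0) + p
    p_y[y] = p_y.get(y, 0.0) + p

def entropy(dist):
    return sum(p * math.log2(1 / p) for p in dist.values() if p > 0)

H_X, H_Y, H_XY = entropy(p_x), entropy(p_y), entropy(p_xy)

# I(X;Y) = H(X) - H(X|Y) = H(Y) - H(Y|X), using H(X|Y) = H(X,Y) - H(Y).
I_1 = H_X - (H_XY - H_Y)
I_2 = H_Y - (H_XY - H_X)
print(round(I_1, 4), round(I_2, 4))   # both ≈ 0.3973 bit for these values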
SHANNON-FANO CODING:
Lempel-Ziv-Welch Coding
A drawback of the Huffman code is that it requires knowledge of a probabilistic model of the source; unfortunately, in practice, source statistics are not always known a priori, thereby compromising the efficiency of the code. To overcome these practical limitations, we may use the Lempel-Ziv algorithm, which is intrinsically adaptive and simpler to implement than Huffman coding.
A key to file data compression is to have repetitive patterns of data so that patterns seen once can then be encoded into a compact code symbol, which is then used to represent the pattern whenever it reappears in the file. For example, in images, consecutive scan lines (rows) of the image may be identical. They can then be encoded with a simple code character that represents the lines. In text processing, repetitive words, phrases, and sentences may also be recognized and represented as a code.

A typical file data compression algorithm is known as LZW (Lempel-Ziv-Welch) encoding. Variants of this algorithm are used in many file compression schemes, such as GIF files. These are lossless compression algorithms in which no data is lost, and the original file can be entirely reconstructed from the encoded message file. The LZW algorithm is a greedy algorithm in that it tries to recognize increasingly longer phrases that are repetitive, and encode them. Each phrase is defined to have a prefix that is equal to a previously encoded phrase plus one additional character in the alphabet. Note that “alphabet” means the set of legal characters in the file. For a normal text file, this is the ASCII character set. For a gray-level image with 256 gray levels, it is an 8-bit number that represents the pixel’s gray level.

In many texts certain sequences of characters occur with high frequency. In English, for example, the word “the” occurs more often than any other sequence of three letters, with “and”, “ion”, and “ing” close behind. If we include the space character, there are other very common sequences, including longer ones like “of the”. Although it is impossible to improve on Huffman encoding with any method that assigns a fixed encoding to each character, we can do better by encoding entire sequences of characters with just a few bits. The method of this section takes advantage of frequently occurring character sequences of any length. It typically produces an even smaller representation than is possible with Huffman trees, and unlike basic Huffman encoding it 1) reads through the text only once and 2) requires no extra space for overhead in the compressed representation.

The algorithm makes use of a dictionary that stores character sequences chosen dynamically from the text. With each character sequence the dictionary associates a number; if s is a character sequence, we use codeword(s) to denote the number assigned to s by the dictionary. The number codeword(s) is called the code or code number of s. All codes have the same length in bits; a typical code size is twelve bits, which permits a maximum dictionary size of 2^12 = 4096 character sequences.
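To make the description concrete, here is a minimal LZW encoder sketch in Python. It illustrates the greedy, prefix-based dictionary growth described above, with the dictionary initialized from the ASCII alphabet and capped at 4096 entries to match the 12-bit code size; it is an illustrative sketch, not the exact scheme used in GIF, and the decoder is omitted.

def lzw_encode(text, max_codes=2**12):
    """Greedy LZW encoding: emit the code of the longest known prefix,
    then add that prefix plus one extra character as a new dictionary entry."""
    # Start with one codeword per single character of the alphabet (here: ASCII).
    dictionary = {chr(i): i for i in range(256)}
    next_code = 256
    codes = []
    phrase = ""
    for ch in text:
        candidate = phrase + ch
        if candidate in dictionary:
            phrase = candidate            # keep extending the current phrase
        else:
            codes.append(dictionary[phrase])
            if next_code < max_codes:     # 12-bit codes -> at most 4096 entries
                dictionary[candidate] = next_code
                next_code += 1
            phrase = ch                   # restart from the unmatched character
    if phrase:
        codes.append(dictionary[phrase])
    return codes

# Repetitive input compresses well: repeated phrases reuse dictionary entries.
print(lzw_encode("the cat and the hat and the bat"))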