Lec 2
In this chapter, we look at the “source encoder” part of the system. This
part removes redundancy from the message stream or sequence. We will
focus only on binary source coding.
2.1. The material in this chapter is based on [C & T Ch 2, 4, and 5].
2.5. It is estimated that we may only need about one bit per character in
English text.
Definition 2.6. Discrete Memoryless Sources (DMS): Let us be more
specific about the information source.
• The message that the information source produces can be represented
by a vector of characters X1 , X2 , . . . , Xn .
◦ A perpetual message source would produce a never-ending sequence
of characters X1 , X2 , . . ..
• These Xk ’s are random variables (at least from the perspective of the
decoder; otherwise, there would be no need for communication).
• For simplicity, we will assume our source to be discrete and memoryless.
◦ Assuming a discrete source means that the random variables are
all discrete; that is, they have supports which are countable.
∗ Recall that “countable” means “finite” or “countably infinite”.
∗ We will further assume that they all share the same support
and that the support is finite.
· This support is called the source alphabet.
· See Example 2.7 for some examples.
◦ Assuming a memoryless source means that there is no depen-
dency among the characters in the sequence.
∗ More specifically,
    pX1,X2,...,Xn(x1, x2, . . . , xn) = pX1(x1) × pX2(x2) × · · · × pXn(xn).    (1)
∗ We will further assume that all of the random variables share
the same probability mass function (pmf)5 . We denote this
shared pmf by pX (x).
    pX1,X2,...,Xn(x1, x2, . . . , xn) = pX(x1) × pX(x2) × · · · × pX(xn).    (2)
5
We often use the term “distribution” interchangeably with pmf and pdf; that is, instead of saying “pmf
of X”, we may say “distribution of X”.
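To make the memoryless model concrete, here is a minimal MATLAB sketch (all variable names and the pmf below are our own, not from the notes) that draws an i.i.d. message of length n from a pmf pX on the alphabet {1, 2, 3, 4}. The memoryless assumption is exactly what allows each character to be drawn independently from the same pX.

    pX = [0.5 0.25 0.125 0.125];   % an assumed pmf on the source alphabet {1,2,3,4}
    n  = 20;                       % message length
    u  = rand(1, n);               % i.i.d. Uniform(0,1) draws
    X  = zeros(1, n);
    for k = 1:n
        X(k) = find(u(k) < cumsum(pX), 1);   % inverse-cdf sampling of one character
    end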
Definition 2.8. An encoder c(·) is a function that maps each character
in the source alphabet into a corresponding (binary) codeword.
• In particular, the codeword corresponding to a source character x is
denoted by c(x) ∈ {0, 1}∗, where
    {0, 1}∗ = {ε, 0, 1, 00, 01, 10, 11, 000, 001, 010, 011, . . .}
is the set of all finite-length binary strings (ε is the empty string).
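As a minimal MATLAB sketch of this definition (the codebook c below is an assumed example, not one from the notes): the encoder is simply a lookup table from source characters to binary strings, and a source string is encoded by concatenating the codewords.

    c   = {'0', '10', '110', '111'};   % c{x} is the (binary) codeword for source character x
    msg = [1 3 2 1 4];                 % a sample source string
    encoded = [c{msg}]                 % concatenate the codewords -> '0110100111'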
Example 2.10. Suppose the message is a sequence of basic English words
which occur with the probabilities provided in the table below.
Definition 2.11. The expected length of a code c(·) for (a DMS which
is characterized by) a random variable X with probability mass function
pX (x) is given by
    E [ℓ(X)] = Σ_{x∈SX} pX(x) ℓ(x),
where ℓ(x) is the length (number of bits) of the codeword c(x).
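A quick numeric illustration (the pmf and codebook below are assumed examples, not from the notes): cellfun gives the codeword lengths, and the expected length is the pmf-weighted sum.

    pX  = [0.4 0.3 0.2 0.1];           % assumed pmf on {1,2,3,4}
    c   = {'10', '110', '0', '111'};   % assumed codewords c(x)
    ell = cellfun(@length, c);         % codeword lengths ell(x) = [2 3 1 3]
    EL  = pX * ell.'                   % E[ell(X)] = sum_x pX(x) ell(x)  ->  2.2 bits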
• Short sequences represent frequent letters (e.g., a single dot represents
E) and long sequences represent infrequent letters (e.g., Q is represented
by “dash,dash,dot,dash”).
Example 2.14. Thought experiment: Let’s consider the following code
x    p(x)    Codeword c(x)    ℓ(x)
1    4%      0                1
2    3%      1                1
3    90%     0                1
4    3%      1                1
This code is bad because we have ambiguity at the decoder. When a
codeword “0” is received, we don’t know whether to decode it as the source
symbol “1” or the source symbol “3”. If we want to have lossless source
coding, this ambiguity is not allowed.
Definition 2.15. A code is nonsingular if every source symbol in the
source alphabet has a different codeword.
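Nonsingularity is easy to check by machine: every source character must receive a distinct codeword. A one-line MATLAB check for the code of Example 2.14:

    c = {'0', '1', '0', '1'};                      % the codewords from Example 2.14
    nonsingular = numel(unique(c)) == numel(c)     % -> false (the code is singular)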
(a) use a fixed-length code (as in Example 2.10), or
(b) use a variable-length code and
(i) add a special symbol (a “comma” or a “space”) between any two
codewords, or
(ii) use uniquely decodable codes.
Definition 2.18. A code is called uniquely decodable (UD) if any en-
coded string has only one possible source string producing it.
Example 2.19. The code used in Example 2.16 is not uniquely decodable
because source string “2”, source string “34”, and source string “13” share
the same code string “010”.
2.20. It may not be easy to check unique decodability of a code. (See
Example 2.28.) Also, even when a code is uniquely decodable, one may
have to look at the entire string to determine even the first symbol in the
corresponding source string. Therefore, we focus on a subset of uniquely
decodable codes called prefix codes.
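For reference (this is not covered in these notes), unique decodability can also be tested mechanically with the Sardinas-Patterson procedure. Below is a minimal MATLAB sketch; the function name isUD and all variable names are ours, and it assumes a nonsingular codebook given as a cell array of binary strings (save as isUD.m).

    function ud = isUD(C)
    % ISUD  Sardinas-Patterson test (sketch): true if the nonsingular code C,
    % given as a cell array of codeword strings, is uniquely decodable.
    S = dangling(C, C);                 % dangling suffixes within the code itself
    seen = cell(1, 0);                  % suffixes already examined
    while ~isempty(S)
        if any(ismember(S, C))          % a dangling suffix is itself a codeword
            ud = false; return
        end
        seen = [seen, S];
        S = [dangling(C, S), dangling(S, C)];
        if ~isempty(S)
            S = S(~ismember(S, seen));  % keep only suffixes not examined before
        end
    end
    ud = true;
    end

    function W = dangling(A, B)
    % Nonempty suffixes w such that b = [a w] for some a in A and b in B.
    W = cell(1, 0);
    for i = 1:numel(A)
        for j = 1:numel(B)
            a = A{i}; b = B{j};
            if length(b) > length(a) && strncmp(a, b, length(a))
                W{end+1} = b(length(a)+1:end); %#ok<AGROW>
            end
        end
    end
    end

For instance, isUD({'1','10','100','1000'}) (the code of Example 2.27) and isUD({'10','00','11','110'}) (the code of Example 2.28) both return true, while isUD({'0','010','01','10'}), a nonsingular code that is not uniquely decodable, returns false.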
Definition 2.21. A code is called a prefix code if no codeword is a prefix7
of any other codeword.
• Equivalently, a code is called a prefix code if you can put all the
codewords into a binary tree where all of them are leaves.
• A more appropriate name would be “prefix-free” code.
• The codeword corresponding to a symbol is the string of labels on the
path from the root to the corresponding leaf.
Example 2.22.
x Codeword c(x)
1 10
2 110
3 0
4 111
7
String s1 is a prefix of string s2 if there exists a string s3 , possibly empty, such that s2 = s1 s3 .
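To see the instantaneous decoding in action, here is a minimal MATLAB sketch using the code of Example 2.22 (variable names are ours). Because no codeword is a prefix of another, at most one codeword can match the front of the remaining string, so each symbol is emitted without any look-ahead.

    c = {'10', '110', '0', '111'};    % the prefix code of Example 2.22
    s = '0101100111';                 % encoding of the source string 3 1 2 3 4
    decoded = [];
    while ~isempty(s)
        matched = false;
        for x = 1:numel(c)
            if strncmp(c{x}, s, length(c{x}))   % codeword c{x} matches the front of s
                decoded(end+1) = x;  %#ok<AGROW>
                s = s(length(c{x})+1:end);      % consume it and continue
                matched = true;
                break
            end
        end
        if ~matched, error('not a valid encoding under c'); end
    end
    decoded                           % -> 3 1 2 3 4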
Example 2.23. The code used in Example 2.12 is a prefix code.
x Codeword c(x)
1 01
2 001
3 1
4 0001
[Figure: nested classes of codes: prefix codes ⊂ UD codes ⊂ nonsingular codes ⊂ all codes]
Example 2.27.
x Codeword c(x)
1 1
2 10
3 100
4 1000
Try to decode 10010001110100111
Example 2.28. [5, p 106–107]
x Codeword c(x)
1 10
2 00
3 11
4 110
This code is not a prefix code because codeword “11” is a prefix of code-
word “110”.
This code is uniquely decodable. To see that it is uniquely decodable,
take any code string and start from the beginning.
• If the first two bits are 00 or 10, they can be decoded immediately.
• If the first two bits are 11, we must look at the following bit(s).
◦ If the next bit is a 1, the first source symbol is a 3.
◦ If the next bit is a 0, we need to count how many 0s there are
before the next 1 appears (or the string ends).
∗ If the length of this run of 0s immediately following the 11
is even, the first source symbol is a 3.
∗ If the length of this run of 0s is odd, the first codeword must
be 110 and the first source symbol must be 4.
By repeating this argument, we can see that this code is uniquely decodable.
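The first-symbol rule described above can be written directly in MATLAB (a sketch; the function name is ours; it assumes the input is a complete, valid encoding under this code; save as firstSymbol.m).

    function x1 = firstSymbol(s)
    % FIRSTSYMBOL  First source symbol of a valid encoding s under the code
    % {10, 00, 11, 110} of Example 2.28, using the rule described above.
    if strncmp(s, '10', 2), x1 = 1; return; end
    if strncmp(s, '00', 2), x1 = 2; return; end
    % s starts with 11: count the run of 0s that immediately follows it
    r = s(3:end);
    z = find(r == '1', 1) - 1;           % number of 0s before the next 1
    if isempty(z), z = length(r); end    % no further 1: the run extends to the end
    if mod(z, 2) == 0
        x1 = 3;                          % even run: 11 followed by pairs of 00
    else
        x1 = 4;                          % odd run: the first codeword must be 110
    end
    end

For instance, firstSymbol('110010') returns 3 (parse 11, 00, 10), while firstSymbol('110110') returns 4 (parse 110, 110).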
2.29. For our present purposes, a better code is one that is uniquely de-
codable and has a shorter expected length than other uniquely decodable
codes. We do not consider other issues of encoding/decoding complexity or
of the relative advantages of block codes or variable length codes. [6, p 57]
9
The class was the first ever in the area of information theory and was taught by Robert Fano at MIT
in 1951.
◦ Huffman wrote a term paper in lieu of taking a final examination.
◦ It should be noted that in the late 1940s, Fano himself (and independently, also Claude Shannon)
had developed a similar, but suboptimal, algorithm known today as the Shannon-Fano method. The
difference between the two algorithms is that the Shannon-Fano code tree is built from the top down,
while the Huffman code tree is constructed from the bottom up.
• By construction, a Huffman code is a prefix code.
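The bottom-up construction can also be checked in MATLAB. Below is a minimal sketch (the function name huffmanCode and all variable names are ours; save as huffmanCode.m); ties may be broken differently from an in-class construction, but any resulting code is optimal (see 2.35 and 2.37). It can be used to verify the tables in Examples 2.31-2.34.

    function code = huffmanCode(p)
    % HUFFMANCODE  Minimal sketch of binary Huffman coding for the pmf p.
    % code{i} is the codeword assigned to source symbol i.
    n      = numel(p);
    code   = repmat({''}, 1, n);     % codeword of each symbol, grown by prepending bits
    prob   = p(:).';                 % probabilities of the remaining tree nodes
    groups = num2cell(1:n);          % the source symbols contained in each node
    while numel(prob) > 1
        [prob, order] = sort(prob);  % bring the two least probable nodes to the front
        groups = groups(order);
        for k = groups{1}, code{k} = ['0' code{k}]; end   % label one branch 0 ...
        for k = groups{2}, code{k} = ['1' code{k}]; end   % ... and the other branch 1
        prob   = [prob(1) + prob(2), prob(3:end)];        % merge the two nodes
        groups = [{[groups{1}, groups{2}]}, groups(3:end)];
    end
    end

For the pmf of Example 2.31, huffmanCode([0.5 0.25 0.125 0.125]) returns codewords of lengths 1, 2, 3, 3 (the particular 0/1 labels depend on tie-breaking), so E [ℓ(X)] = 0.5·1 + 0.25·2 + 0.125·3 + 0.125·3 = 1.75 bits per symbol.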
Example 2.31.
x    pX(x)    Codeword c(x)    ℓ(x)
A 0.5
B 0.25
C 0.125
D 0.125
E [ℓ(X)] =

Note that for this particular example, the values of 2^ℓ(x) from the Huffman
encoding are inversely proportional to pX (x):
    pX(x) = 1 / 2^ℓ(x) .
In other words,
    ℓ(x) = log2 (1 / pX(x)) = − log2 (pX (x)).
Therefore,
    E [ℓ(X)] = Σ_x pX(x) ℓ(x) =
Example 2.32.
x      pX(x)    Codeword c(x)    ℓ(x)
‘a’    0.4
‘b’    0.3
‘c’    0.1
‘d’    0.1
‘e’    0.06
‘f’    0.04

E [ℓ(X)] =
Example 2.33.
x    pX(x)    Codeword c(x)    ℓ(x)
1    0.25
2    0.25
3    0.2
4    0.15
5    0.15

E [ℓ(X)] =
Example 2.34.
x    pX(x)    Codeword c(x)    ℓ(x)
     1/3
     1/3
     1/4
     1/12

E [ℓ(X)] =

E [ℓ(X)] =
2.35. The set of codeword lengths for Huffman encoding is not unique.
There may be more than one set of lengths but all of them will give the
same value of expected length.
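For instance, two different Huffman trees for the pmf of Example 2.34 (p = [1/3 1/3 1/4 1/12]) give the codeword-length sets {1, 2, 3, 3} and {2, 2, 2, 2}; a quick MATLAB check that both give the same expected length:

    p = [1/3 1/3 1/4 1/12];
    p * [1 2 3 3].'      % -> 2 (bits per symbol)
    p * [2 2 2 2].'      % -> 2 (bits per symbol)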
Definition 2.36. A code is optimal for a given source (with known pmf) if
it is uniquely decodable and its corresponding expected length is the shortest
among all possible uniquely decodable codes for that source.
2.37. The Huffman code is optimal.
2.3 Source Extension (Extension Coding)
2.38. One can usually (not always) do better in terms of expected length
(per source symbol) by encoding blocks of several source symbols.
Definition 2.39. In n-th extension coding, n successive source symbols
are grouped into blocks and the encoder operates on the blocks rather
than on individual symbols. [4, p. 777]
Example 2.40.
x        pX(x)    Codeword c(x)    ℓ(x)
Y(es)    0.9
N(o)     0.1

E [ℓ(X)] =

YNNYYYNYYNNN...

E [ℓ(X1 , X2 )] =

E [ℓ(X1 , X2 , X3 )] =
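A MATLAB sketch that carries out this thought experiment numerically, reusing the huffmanCode sketch from earlier: kron builds the joint pmf of a block of n independent characters, the block is Huffman-coded, and the expected length is normalized by n.

    pX = [0.9 0.1];                     % the pmf of Example 2.40
    for n = 1:3
        pBlock = 1;
        for k = 1:n, pBlock = kron(pBlock, pX); end    % joint pmf of an i.i.d. block
        L = cellfun(@length, huffmanCode(pBlock));     % Huffman codeword lengths for the blocks
        fprintf('n = %d: %.4f bits per source symbol\n', n, (pBlock * L.') / n);
    end
    % prints 1.0000, 0.6450, and 0.5327 bits per source symbol,
    % already approaching the entropy H(X) of about 0.469 bits (see Section 2.4).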
2.4 (Shannon) Entropy for Discrete Random Variables
Entropy is a measure of uncertainty of a random variable [5, p 13]. For a
discrete random variable X with support SX and pmf pX (x), it is defined by
    H (X) = − Σ_{x∈SX} pX(x) log2 pX(x).
• The log is to the base 2 and entropy is expressed in bits (per symbol).
◦ The base of the logarithm used in defining H can be chosen to be
any convenient real number b > 1, but if b ≠ 2 the unit will not be
in bits.
◦ If the base of the logarithm is e, the entropy is measured in nats.
◦ Unless otherwise specified, base 2 is our default base.
• Based on continuity arguments, we shall assume that 0 ln 0 = 0.
Example 2.42. The entropy of the random variable X in Example 2.31 is
1.75 bits (per symbol).
Example 2.43. The entropy of a fair coin toss is 1 bit (per toss).
HX = -pX*(log2(pX))’.
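The expression above becomes runnable once the pmf is defined; a minimal sketch for the pmf of Example 2.31, with the 0 log 0 = 0 convention handled explicitly in case some probabilities are zero:

    pX = [0.5 0.25 0.125 0.125];   % the pmf of Example 2.31
    HX = -pX*(log2(pX))'           % -> 1.75 (bits per symbol), as in Example 2.42
    % If some pX(x) = 0, apply the convention 0 log 0 = 0 explicitly:
    t = pX .* log2(pX);  t(pX == 0) = 0;
    HX = -sum(t)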
Definition 2.47. Binary Entropy Function : We define hb (p), h (p) or
H(p) to be −p log2 p − (1 − p) log2 (1 − p), whose plot is shown in Figure 3.
[Figure 3: Binary Entropy Function]
• Logarithmic bounds: (ln p)(ln q) ≤ H(p)/ log2 e ≤ (ln p)(ln q)/ ln 2, where q = 1 − p.
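A short MATLAB sketch that reproduces the plot of Figure 3 (variable names are ours):

    p  = linspace(1e-6, 1 - 1e-6, 1000);            % avoid the endpoints, where 0 log 0 = 0
    hb = -p .* log2(p) - (1 - p) .* log2(1 - p);    % binary entropy function
    plot(p, hb), grid on
    xlabel('p'), ylabel('H(p)')                     % maximum value 1 bit at p = 1/2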
2.48. Two important facts about entropy:
(a) H (X) ≤ log2 |SX | with equality if and only if X is a uniform random
variable.
(b) H (X) ≥ 0 with equality if and only if X is not random.
In summary,
    0 ≤ H (X) ≤ log2 |SX | ,
where the left equality holds for a deterministic X and the right equality holds for a uniform X.
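A short numeric check of these two facts for the pmf of Example 2.31:

    pX = [0.5 0.25 0.125 0.125];
    HX = -sum(pX .* log2(pX));                    % 1.75 bits
    [0 <= HX, HX <= log2(numel(pX))]              % -> [1 1] (both facts hold; log2(4) = 2)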
Theorem 2.49. The expected length E [ℓ(X)] of any uniquely decodable
binary code for a random variable X is greater than or equal to the entropy
H (X); that is,
    E [ℓ(X)] ≥ H (X).
2.51. Given a random variable X, let cHuffman be the Huffman code for this
X. Then, from the optimality of the Huffman code mentioned in 2.37,
    L∗ (X) = L(cHuffman , X),
where L(c, X) denotes the expected length of a code c for X and L∗ (X)
denotes the minimum expected length among all uniquely decodable codes for X.
Theorem 2.52. The optimal code for a random variable X has an expected
length less than H(X) + 1:
L∗ (X) < H(X) + 1.
2.53. Combining Theorem 2.49 and Theorem 2.52, we have
H(X) ≤ L∗ (X) < H(X) + 1. (3)
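A quick numeric check of (3), reusing the huffmanCode sketch from earlier on an assumed pmf (not one from the examples):

    pX = [0.4 0.3 0.2 0.1];                           % assumed pmf
    HX = -sum(pX .* log2(pX));                        % entropy, about 1.8464 bits
    L  = pX * cellfun(@length, huffmanCode(pX)).';    % optimal expected length, 1.9 bits
    [HX <= L, L < HX + 1]                             % -> [1 1]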
Definition 2.54. Let L∗n (X) be the minimum expected codeword length
per symbol when the random variable X is encoded with n-th extension
uniquely decodable coding. Of course, this can be achieved by using n-th
extension Huffman coding.
2.55. An extension of (3):
    H(X) ≤ L∗n (X) < H(X) + 1/n.    (4)
In particular,
    lim_{n→∞} L∗n (X) = H(X).
In other words, by using a large block length, we can achieve an expected
length per source symbol that is arbitrarily close to the value of the entropy.
2.56. Operational meaning of entropy: Entropy of a random variable is the
average length of its shortest description.
2.57. References
• Section 16.1 in Carlson and Crilly [4]
• Chapters 2 and 5 in Cover and Thomas [5]
• Chapter 4 in Fine [6]
• Chapter 14 in Johnson, Sethares, and Klein [8]
• Section 11.2 in Ziemer and Tranter [18]