Ch. 6: An Introduction to Information Theory
CO6019-∗: Digital Communications
Instructor: Jia-Chin Lin
Outline
• Mathematical Models for Information Sources
Mathematical Models for Information Sources
• A discrete source emits letters from a finite alphabet {x1, x2, . . . , xL} with probabilities
$$p_k = P[X = x_k], \qquad 1 \le k \le L$$
• where
$$\sum_{k=1}^{L} p_k = 1$$
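As a numerical companion (a minimal sketch, not from the slides), the snippet below builds such a source model with an assumed alphabet and letter probabilities, verifies that they form a valid pmf, and evaluates the source entropy H(X) = −∑ₖ p_k log₂ p_k, a quantity used repeatedly later in the chapter.

```python
# A minimal sketch (not from the slides): a discrete source model with an
# assumed alphabet and letter probabilities, checked for validity and used to
# compute the source entropy H(X) = -sum_k p_k log2 p_k in bits.
import numpy as np

alphabet = ["x1", "x2", "x3", "x4"]          # assumed L = 4 source letters
p = np.array([0.5, 0.25, 0.125, 0.125])      # assumed letter probabilities

assert np.all(p >= 0) and np.isclose(p.sum(), 1.0)   # valid pmf: p_k >= 0, sum = 1

H = -np.sum(p * np.log2(p))                  # source entropy in bits/letter
print(f"H(X) = {H:.3f} bits per source letter")      # 1.750 for this pmf
```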
• The information between the outcomes X = x and Y = y is
$$I(x; y) = \log \frac{P[x|y]}{P[x]}$$
• The mutual information between random variables X and Y is defined as the average of I(x; y):
$$I(X; Y) = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} P[X = x, Y = y]\, I(x; y) = \sum_{x \in \mathcal{X}} \sum_{y \in \mathcal{Y}} P[X = x, Y = y] \log \frac{P[x|y]}{P[x]}$$
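A minimal sketch (assumed joint pmf, not from the slides) that evaluates I(X; Y) exactly as defined above, by averaging log₂(P[x|y]/P[x]) over the joint distribution:

```python
# A minimal sketch (not from the slides): compute I(X; Y) directly from an
# assumed joint pmf P[X = x, Y = y] by averaging log2(P[x|y] / P[x]).
import numpy as np

# Assumed joint pmf, rows indexed by x, columns by y
Pxy = np.array([[0.30, 0.10],
                [0.15, 0.45]])
Px = Pxy.sum(axis=1, keepdims=True)          # marginal P[X = x]
Py = Pxy.sum(axis=0, keepdims=True)          # marginal P[Y = y]
Px_given_y = Pxy / Py                        # P[x | y] = P[x, y] / P[y]

I = np.sum(Pxy * np.log2(Px_given_y / Px))   # average of log2(P[x|y] / P[x])
print(f"I(X;Y) = {I:.4f} bits")
```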
• The conditional entropy of Y given X is
$$H(Y|X) = \sum_{x \in \mathcal{X}} P[X = x]\, H(Y \mid X = x) = -\sum_{(x,y) \in \mathcal{X} \times \mathcal{Y}} P[X = x, Y = y] \log P[Y = y \mid X = x]$$
• Then
$$H(X, Y) = H(X) + H(Y|X)$$
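The chain rule can be checked numerically; the sketch below (assumed joint pmf, not from the slides) computes H(X, Y), H(X), and H(Y|X) and confirms the identity:

```python
# A minimal numerical sketch (not from the slides): verify the chain rule
# H(X, Y) = H(X) + H(Y|X) on an assumed joint pmf.
import numpy as np

def entropy(p):
    """Entropy in bits of a pmf given as an array (zero entries ignored)."""
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

# Assumed joint pmf, rows indexed by x, columns by y
Pxy = np.array([[0.30, 0.10],
                [0.15, 0.45]])
Px = Pxy.sum(axis=1)

H_joint = entropy(Pxy.ravel())                     # H(X, Y)
H_x = entropy(Px)                                  # H(X)
Py_given_x = Pxy / Px[:, None]                     # P[y | x]
H_y_given_x = -np.sum(Pxy * np.log2(Py_given_x))   # H(Y|X)

print(f"H(X,Y) = {H_joint:.4f}, H(X) + H(Y|X) = {H_x + H_y_given_x:.4f} bits")
```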
• Joint entropy
$$H(X_1, X_2, \ldots, X_n) = -\sum_{x_1, x_2, \ldots, x_n} P[X_1 = x_1, \ldots, X_n = x_n] \log P[X_1 = x_1, \ldots, X_n = x_n]$$
Lossless Source Coding
• Since the number of occurrences of a_i in x is roughly np_i and the source is memoryless,
$$\log P[\mathbf{X} = \mathbf{x}] \approx \log \prod_{i=1}^{N} p_i^{\,np_i} = \sum_{i=1}^{N} np_i \log p_i = -nH(X)$$
• Hence,
$$P[\mathbf{X} = \mathbf{x}] \approx 2^{-nH(X)}$$
• This states that all typical sequences have roughly the same probability, and this common probability is $2^{-nH(X)}$.
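A quick empirical illustration of the typical-sequence property (assumed source probabilities, not from the slides): for long i.i.d. sequences, −(1/n) log₂ P[X = x] concentrates around H(X), i.e., P[X = x] ≈ 2^(−nH(X)):

```python
# A minimal sketch (not from the slides): draw i.i.d. sequences from an assumed
# pmf and compare -(1/n) log2 P[X = x] of each drawn sequence with H(X).
import numpy as np

rng = np.random.default_rng(0)
p = np.array([0.5, 0.25, 0.125, 0.125])      # assumed letter probabilities
H = -np.sum(p * np.log2(p))                  # H(X) = 1.75 bits
n = 10_000                                   # sequence length

for trial in range(3):
    seq = rng.choice(len(p), size=n, p=p)    # one source output sequence
    log_prob = np.sum(np.log2(p[seq]))       # log2 P[X = x] for a memoryless source
    print(f"-(1/n) log2 P = {-log_prob / n:.4f} bits  vs  H(X) = {H:.4f}")
```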
• Define
$$H_k(X) \triangleq \frac{1}{k} H(X_1 X_2 \cdots X_k) \ge H(X_k \mid X_1 X_2 \cdots X_{k-1})$$
• For a stationary source and any j ≥ 0,
$$H_{k+j}(X) \le \frac{1}{k+j} H(X_1 X_2 \cdots X_{k-1}) + \frac{j+1}{k+j} H(X_k \mid X_1 X_2 \cdots X_{k-1})$$
• As j → ∞,
$$H_\infty(X) \le H(X_k \mid X_1 X_2 \cdots X_{k-1})$$
• This remains valid as k → ∞.
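For concreteness, the entropy rate can be evaluated in closed form for a simple stationary source. The sketch below (an assumed two-state Markov source, not from the slides) uses H(X₁ ⋯ X_k) = H(X₁) + (k−1)H(X₂|X₁) to show H_k(X) decreasing toward H_∞(X) = H(X₂|X₁):

```python
# A minimal sketch (not from the slides): for an assumed two-state stationary
# Markov source, H(X1 ... Xk) = H(X1) + (k-1) H(X2|X1), so the per-letter
# entropy Hk(X) = (1/k) H(X1 ... Xk) can be computed exactly and seen to
# decrease toward the entropy rate H_inf(X) = H(X2|X1).
import numpy as np

def H(p):
    p = np.asarray(p, dtype=float)
    p = p[p > 0]
    return -np.sum(p * np.log2(p))

P = np.array([[0.9, 0.1],        # assumed transition matrix P[x_k | x_{k-1}]
              [0.2, 0.8]])
pi = np.array([2/3, 1/3])        # stationary distribution (pi P = pi)

H1 = H(pi)                                           # H(X1)
Hcond = np.sum(pi * np.array([H(P[0]), H(P[1])]))    # H(X2|X1) = entropy rate

for k in [1, 2, 5, 10, 100, 1000]:
    Hk = (H1 + (k - 1) * Hcond) / k
    print(f"k = {k:5d}:  Hk(X) = {Hk:.4f} bits")
print(f"entropy rate H_inf(X) = {Hcond:.4f} bits")
```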
Entropy and Mutual Information for Continuous Random Variables
• Mutual information
$$I(X; Y) = \int_{-\infty}^{\infty} \int_{-\infty}^{\infty} p(x)\, p(y|x) \log \frac{p(y|x)\, p(x)}{p(x)\, p(y)} \, dx\, dy$$
• Properties
– I(X; Y) = I(Y; X)
– I(X; Y) ≥ 0
– I(X; Y) = H(X) − H(X|Y) = H(Y) − H(Y|X)
• Conditional differential entropy
$$H(X|Y) = -\int_{-\infty}^{\infty} \int_{-\infty}^{\infty} p(x, y) \log p(x|y) \, dx\, dy$$
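As a numerical check of the continuous-variable definition (assumed jointly Gaussian X and Y with correlation ρ, not from the slides), mutual information estimated by Monte Carlo can be compared with the closed form I(X; Y) = −½ log₂(1 − ρ²):

```python
# A minimal numerical sketch (not from the slides): estimate the mutual
# information of a jointly Gaussian pair (X, Y) with correlation rho by Monte
# Carlo averaging of log2[p(x, y) / (p(x) p(y))], and compare with the closed
# form I(X; Y) = -0.5 * log2(1 - rho**2).
import numpy as np

rng = np.random.default_rng(0)
rho = 0.8                      # assumed correlation coefficient
n = 200_000                    # number of Monte Carlo samples

cov = np.array([[1.0, rho], [rho, 1.0]])
xy = rng.multivariate_normal(mean=[0.0, 0.0], cov=cov, size=n)
x, y = xy[:, 0], xy[:, 1]

def gauss_pdf(z, var=1.0):
    return np.exp(-z**2 / (2 * var)) / np.sqrt(2 * np.pi * var)

det = 1 - rho**2               # determinant of the covariance matrix
joint = np.exp(-(x**2 - 2 * rho * x * y + y**2) / (2 * det)) / (2 * np.pi * np.sqrt(det))

mi_mc = np.mean(np.log2(joint / (gauss_pdf(x) * gauss_pdf(y))))
mi_exact = -0.5 * np.log2(1 - rho**2)
print(f"Monte Carlo: {mi_mc:.4f} bits, closed form: {mi_exact:.4f} bits")
```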
Now suppose the random variable X is discrete with possible outcomes xi, i = 1, 2, . . . , n, and Y is continuous; X and Y are in general statistically dependent. The marginal pdf p(y) can be expressed as
$$p(y) = \sum_{i=1}^{n} p(y|x_i)\, P[x_i]$$
The information between the outcome X = xi and the observation Y = y is
$$I(x_i; y) = \log \frac{p(y|x_i)\, P[x_i]}{p(y)\, P[x_i]} = \log \frac{p(y|x_i)}{p(y)}$$
Then the average mutual information between X and Y is
$$I(X; Y) = \sum_{i=1}^{n} \int_{-\infty}^{\infty} p(y|x_i)\, P[x_i] \log \frac{p(y|x_i)}{p(y)} \, dy$$
Ex. 6.4-1. Suppose that X is a discrete random variable with two equally probable outcomes x1 = A and x2 = −A. Let the conditional pdfs p(y|xi), i = 1, 2, be Gaussian with mean xi and variance σ²:
$$p(y|A) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(y - A)^2}{2\sigma^2}\right), \qquad p(y|{-A}) = \frac{1}{\sqrt{2\pi}\,\sigma} \exp\!\left(-\frac{(y + A)^2}{2\sigma^2}\right)$$
The average mutual information is
$$I(X; Y) = \frac{1}{2} \int_{-\infty}^{\infty} \left[ p(y|A) \log \frac{p(y|A)}{p(y)} + p(y|{-A}) \log \frac{p(y|{-A})}{p(y)} \right] dy$$
where
$$p(y) = \frac{1}{2}\bigl[ p(y|A) + p(y|{-A}) \bigr]$$
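The integral in Ex. 6.4-1 has no simple closed form, but it is easy to evaluate numerically. The sketch below (assumed values of A and σ, not from the slides) approximates I(X; Y) on a fine grid; it approaches 1 bit for large A/σ and 0 for small A/σ:

```python
# A minimal numerical sketch (not from the slides): evaluate I(X; Y) of
# Example 6.4-1 on a fine grid for assumed values of A and sigma.
import numpy as np

A, sigma = 1.0, 1.0                       # assumed signal level and noise std
y = np.linspace(-10 * sigma - A, 10 * sigma + A, 200_001)
dy = y[1] - y[0]

def gauss(y, mean, sigma):
    return np.exp(-(y - mean) ** 2 / (2 * sigma ** 2)) / (np.sqrt(2 * np.pi) * sigma)

p_pos = gauss(y, +A, sigma)               # p(y | X = +A)
p_neg = gauss(y, -A, sigma)               # p(y | X = -A)
p_y = 0.5 * (p_pos + p_neg)               # marginal pdf of Y

# I(X;Y) = (1/2) integral of [p(y|A) log2(p(y|A)/p(y)) + p(y|-A) log2(p(y|-A)/p(y))] dy
integrand = p_pos * np.log2(p_pos / p_y) + p_neg * np.log2(p_neg / p_y)
I = 0.5 * np.sum(integrand) * dy
print(f"I(X;Y) = {I:.4f} bits for A/sigma = {A / sigma:.1f}")
```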
• A binary symmetric channel (BSC) with crossover probability p has transition probabilities
$$P[Y = 0 \mid X = 1] = P[Y = 1 \mid X = 0] = p$$
$$P[Y = 1 \mid X = 1] = P[Y = 0 \mid X = 0] = 1 - p$$
• The channel input is constrained in power,
$$E[X^2(t)] \le P$$
which for ergodic inputs results in an input power constraint of the form
$$\lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} x^2(t)\, dt \le P$$
• The channel is memoryless if
$$p(y_1, y_2, \ldots, y_N \mid x_1, x_2, \ldots, x_N) = \prod_{j=1}^{N} p(y_j \mid x_j)$$
for any N.
• With x(t) represented by 2WT values xj over an interval of duration T,
$$\lim_{T \to \infty} \frac{1}{T} \int_{-T/2}^{T/2} x^2(t)\, dt = \lim_{T \to \infty} \frac{1}{T} \sum_{j=1}^{2WT} x_j^2 = \lim_{T \to \infty} \frac{1}{T} \times 2WT\, E[X^2] = 2W\, E[X^2] \le P$$
i.e., $E[X^2] \le P/(2W)$.
Channel Capacity
• The entropy and the rate-distortion function provide the minimum required rate for compressing a discrete memoryless source, subject to the condition that it can be recovered either losslessly or to within a prescribed distortion.
• We now introduce a third fundamental quantity, called the channel capacity, that provides the maximum rate at which reliable communication over a channel is possible.
For an arbitrary DMC, the capacity is given by
$$C = \max_{\mathbf{p}} I(X; Y)$$
where the maximization is over all input distributions $\mathbf{p} = (p_1, p_2, \ldots, p_{|\mathcal{X}|})$ satisfying
$$p_i \ge 0, \quad i = 1, 2, \ldots, |\mathcal{X}|, \qquad \sum_{i=1}^{|\mathcal{X}|} p_i = 1$$
The units of C are bits per transmission, or bits per channel use.
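The maximization over p can be carried out numerically. The sketch below (not from the slides) implements the standard Blahut–Arimoto iteration for a DMC described by its transition matrix, with an assumed BSC used as a test case:

```python
# A minimal Blahut-Arimoto sketch (not from the slides): numerically maximize
# I(X; Y) over the input distribution p for a DMC with transition matrix
# W[x, y] = P[Y = y | X = x]. Assumes strictly positive W for simplicity.
import numpy as np

def dmc_capacity(W, iters=500):
    """Return (capacity in bits, optimizing input distribution)."""
    nx, _ = W.shape
    p = np.full(nx, 1.0 / nx)                 # start from the uniform input
    for _ in range(iters):
        q = p[:, None] * W                    # q[x, y] proportional to p(x) W(y|x)
        q /= q.sum(axis=0, keepdims=True)     # posterior P[X = x | Y = y]
        r = np.exp(np.sum(W * np.log(q), axis=1))   # p(x) update, unnormalized
        p = r / r.sum()
    py = p @ W                                # output distribution at the final p
    I = np.sum(p[:, None] * W * np.log2(W / py[None, :]))   # I(X; Y) in bits
    return I, p

# Assumed test case: binary symmetric channel with crossover probability 0.1
p_err = 0.1
W = np.array([[1 - p_err, p_err],
              [p_err, 1 - p_err]])
C, p_opt = dmc_capacity(W)
print(f"C = {C:.4f} bits/channel use, p* = {p_opt}")   # expect 1 - Hb(0.1) = 0.531
```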
Ex. 6.5-1. For a BSC, due to the symmetry of the channel, the capacity is achieved by a uniform input distribution, i.e., for P[X = 0] = P[X = 1] = 1/2. The maximum mutual information (the capacity) is then
$$C = 1 + p \log_2 p + (1 - p) \log_2(1 - p) = 1 - H_b(p)$$
where Hb(p) is the binary entropy function.
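A small numerical check (assumed crossover probability, not from the slides): sweep the input probability P[X = 1] = q, evaluate I(X; Y) = H(Y) − H(Y|X) = Hb(q(1−p) + (1−q)p) − Hb(p), and confirm the maximum occurs at q = 1/2 with value 1 − Hb(p):

```python
# A small check (not from the slides): for a BSC with assumed crossover
# probability p, sweep the input probability P[X = 1] = q and confirm the
# maximum of I(X; Y) occurs at q = 1/2 with value 1 - Hb(p).
import numpy as np

def Hb(u):
    """Binary entropy function in bits."""
    u = np.clip(u, 1e-15, 1 - 1e-15)
    return -u * np.log2(u) - (1 - u) * np.log2(1 - u)

p = 0.1                                    # assumed crossover probability
q = np.linspace(0.001, 0.999, 999)         # candidate input distributions
py1 = q * (1 - p) + (1 - q) * p            # P[Y = 1]
I = Hb(py1) - Hb(p)                        # I(X; Y) = H(Y) - H(Y|X)
print(f"argmax q = {q[np.argmax(I)]:.3f}, max I = {I.max():.4f} bits")
print(f"closed form 1 - Hb(p) = {1 - Hb(p):.4f} bits")
```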
Channel Capacity with Input Power Constraint
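As a hedged companion to this heading (parameter values assumed, not taken from the slides): the standard closed-form result for a band-limited AWGN channel with bandwidth W, one-sided noise PSD N0, and input power constraint P is C = W log₂(1 + P/(N0·W)); the sketch below simply evaluates it:

```python
# A hedged sketch (assumed parameter values, not taken from the slides): the
# standard capacity of a band-limited AWGN channel with input power constraint
# P, bandwidth W, and one-sided noise PSD N0 is C = W log2(1 + P / (N0 W)).
import numpy as np

W = 1.0e6        # assumed bandwidth in Hz
N0 = 1.0e-9      # assumed one-sided noise power spectral density in W/Hz
for P in [1e-4, 1e-3, 1e-2]:               # assumed input power levels in W
    snr = P / (N0 * W)
    C = W * np.log2(1 + snr)
    print(f"P = {P:.0e} W, SNR = {snr:8.1f}, C = {C / 1e6:.3f} Mbit/s")
```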