Communication Systems: Channel Capacity

S-72.2410 Channel Capacity (1). © Patric Östergård.

The document discusses channel capacity in communication systems. It defines the operational channel capacity as the maximum rate, in bits per channel use, at which information can be sent with arbitrarily low probability of error, and the information channel capacity of a discrete memoryless channel as the maximum mutual information between the input and the output over all possible input distributions. A noisy channel whose outputs do not overlap, so that the input can always be determined from the output, still achieves the capacity of the corresponding noiseless channel. For the noisy typewriter, where each input letter is either unchanged or shifted to the next letter, the capacity is the maximum output entropy minus one bit, that is, log 26 − 1 = log 13.


Communication Systems

A communication system is a means of transporting information from one party (A) to another (B). The transfer is subject to noise and imperfections of the signalling process.

We shall here find the number of distinguishable signals for n uses of a communication channel. It will turn out that the number grows exponentially with n, where the exponent is known as the channel capacity.

A communication system is depicted in [Cov, Fig. 8.1].

Discrete Channels

A discrete channel is a system consisting of an input alphabet X, an output alphabet Y, and a probability transition matrix p(y|x) that expresses the probability of observing the output symbol y given that we send the symbol x.

The channel is said to be memoryless if the probability distribution of the output depends only on the input at that time and is conditionally independent of previous channel inputs or outputs.

If the input depends on the past output, then we have a channel with feedback. Unless stated otherwise, we always consider channels without feedback.

Channel Capacity (1)

The information channel capacity of a discrete memoryless channel is

    C = max_{p(x)} I(X; Y),

where the maximum is taken over all possible input distributions p(x).

The operational channel capacity is the highest rate in bits per channel use at which information can be sent with arbitrarily low probability of error.

Channel Capacity (2)

Shannon: information channel capacity = operational channel capacity.

⇒ We may just talk about channel capacity.
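As a concrete illustration of the definition C = max_{p(x)} I(X; Y), the following sketch (not part of the original slides; the function names and the example channel are my own choices) computes I(X; Y) in bits for a given input distribution and transition matrix, and approximates the maximum for a binary-input channel by a simple grid search over p(x).

    import numpy as np

    def mutual_information(p_x, W):
        # I(X;Y) in bits for input distribution p_x and channel matrix W[x, y] = p(y|x).
        p_xy = p_x[:, None] * W               # joint distribution p(x, y)
        p_y = p_xy.sum(axis=0)                # output marginal p(y)
        mask = p_xy > 0
        ratio = p_xy[mask] / (p_x[:, None] * p_y[None, :])[mask]
        return float(np.sum(p_xy[mask] * np.log2(ratio)))

    # Example channel (assumed for illustration): a binary channel that flips
    # the transmitted bit with probability 0.1.
    W = np.array([[0.9, 0.1],
                  [0.1, 0.9]])

    # Approximate C = max over p(x) by a grid search over binary input distributions.
    grid = np.linspace(0.001, 0.999, 999)
    C = max(mutual_information(np.array([a, 1.0 - a]), W) for a in grid)
    print(C)  # ≈ 0.531 bits, attained at the uniform input p(x) = (1/2, 1/2)

For this example the maximizing input distribution is uniform, which anticipates the noiseless binary channel and the symmetric channels discussed later.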

Data Compression vs. Data Transmission

Data compression: Redundancy is removed to obtain the most compressed version possible.

Data transmission: Redundancy is added to combat errors in the channel.

We shall see that a general communication system (with both compression and error control) can be broken into two parts and that the problems of data compression and data transmission can be considered separately.

The Noiseless Binary Channel

For the noiseless binary channel, illustrated in [Cov, Fig. 8.2] (and earlier in [Cov, Fig. 1.3]), the binary input is reproduced exactly at the output. Any transmitted bit is therefore received without error; for each use of the channel, we can send 1 bit reliably to the receiver, and the capacity is 1 bit.

We can also calculate the capacity C = max I(X; Y) = 1 bit, which is achieved for p(x) = (1/2, 1/2).

Noisy Channels with Nonoverlapping Outputs

An example of a noisy channel with nonoverlapping outputs is depicted in [Cov, Fig. 8.3]. The channel appears to be noisy, but really is not, as the input can be determined from the output.

Hence, the channel in [Cov, Fig. 8.3] is equivalent to the noiseless binary channel, and its capacity is 1 bit. We can also calculate the information channel capacity C = max I(X; Y) = 1 bit, which is achieved for p(x) = (1/2, 1/2).

The Noisy Typewriter

For the noisy typewriter, shown in [Cov, Fig. 8.4], the input is unchanged at the output or transformed into the next letter, both with probability 1/2.

With 26 input symbols, the information channel capacity is

    C = max I(X; Y) = max [H(Y) − H(Y|X)] = max H(Y) − 1
      = log 26 − 1 = log 13.

To achieve this capacity, we may use every alternate input symbol and can then transmit 13 symbols without error with each transmission.
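A small numerical check of the noisy typewriter calculation above (my own sketch, not from the slides): build the 26 × 26 transition matrix in which each letter stays put or shifts cyclically to the next letter with probability 1/2, and evaluate I(X; Y) at the uniform input distribution.

    import numpy as np

    n = 26
    W = np.zeros((n, n))
    for x in range(n):
        W[x, x] = 0.5              # letter received unchanged
        W[x, (x + 1) % n] = 0.5    # or shifted to the next letter (cyclically)

    p_x = np.full(n, 1.0 / n)      # uniform input distribution
    p_xy = p_x[:, None] * W        # joint distribution p(x, y)
    p_y = p_xy.sum(axis=0)         # output marginal (also uniform here)
    mask = p_xy > 0
    I = np.sum(p_xy[mask] * np.log2(p_xy[mask] / (p_x[:, None] * p_y[None, :])[mask]))
    print(I, np.log2(13))          # both ≈ 3.7004 bits, i.e. log 26 − 1 = log 13

Restricting the input to every other letter gives 13 inputs whose output sets do not overlap, which is the error-free scheme mentioned above.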

The Binary Symmetric Channel

The binary symmetric channel (BSC), shown in [Cov, Fig. 8.5], is the simplest model of a channel with errors. For the mutual information, we get

    I(X; Y) = H(Y) − H(Y|X) = H(Y) − Σ_x p(x) H(Y|X = x)
            = H(Y) − Σ_x p(x) H(p, 1 − p) = H(Y) − H(p, 1 − p)
            ≤ 1 − H(p, 1 − p),

where the last inequality follows as Y is a binary random variable. As equality is achieved when the input distribution is uniform, the information channel capacity is

    C = 1 − H(p, 1 − p) bits.

The Binary Erasure Channel (1)

A binary channel where bits are lost (rather than corrupted) is called the binary erasure channel and is shown in [Cov, Fig. 8.6]. The receiver knows which bits have been erased. The mutual information is now

    I(X; Y) = H(Y) − H(Y|X) = H(Y) − H(α, 1 − α).

Clearly, H(Y) ≤ log 3, but this bound cannot be achieved. With Pr(X = 1) = a, we get Pr(Y = 0) = (1 − a)(1 − α), Pr(Y = e) = α, and Pr(Y = 1) = a(1 − α), so (after some calculation)

    H(Y) = H(α, 1 − α) + (1 − α) H(a, 1 − a).

The Binary Erasure Channel (2)

    C = max_{p(x)} H(Y) − H(α, 1 − α)
      = max_{p(x)} (H(α, 1 − α) + (1 − α) H(a, 1 − a)) − H(α, 1 − α)
      = max_a (1 − α) H(a, 1 − a) = 1 − α,

where capacity is achieved when a = 1/2. As with the BSC, it is not obvious how to achieve this rate.

In channels with feedback, the capacity can be achieved as follows: if a bit is lost, retransmit it until it gets through. It turns out that 1 − α is the best possible rate both with and without feedback.

⇒ Feedback does not increase the capacity of discrete memoryless channels (but often simplifies the coding).

Symmetric Channels (1)

Consider the channel with the transition matrix

    p(y|x) = [ 0.3  0.2  0.5 ]
             [ 0.5  0.3  0.2 ]
             [ 0.2  0.5  0.3 ],

where the entry in the xth row and the yth column denotes the conditional probability p(y|x) that y is received when x is sent. In this channel, all the rows of the probability transition matrix are permutations of each other, and so are the columns. Such a channel is said to be symmetric.
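The closed-form capacities derived above are easy to evaluate numerically. The sketch below is mine, with illustrative values p = 0.1 and α = 0.2: it computes C = 1 − H(p, 1 − p) for the BSC and C = 1 − α for the BEC, and also checks the decomposition H(Y) = H(α, 1 − α) + (1 − α) H(a, 1 − a) at a = 1/2.

    import numpy as np

    def H2(p):
        # Binary entropy H(p, 1 - p) in bits.
        return float(-sum(t * np.log2(t) for t in (p, 1.0 - p) if t > 0))

    p, alpha = 0.1, 0.2            # illustrative crossover and erasure probabilities

    C_bsc = 1.0 - H2(p)            # BSC capacity: 1 - H(p, 1-p) ≈ 0.531 bits
    C_bec = 1.0 - alpha            # BEC capacity: 1 - alpha = 0.8 bits
    print(C_bsc, C_bec)

    # Check of H(Y) = H(alpha, 1-alpha) + (1-alpha) H(a, 1-a) on the BEC at a = 1/2:
    a = 0.5
    p_y = np.array([(1 - a) * (1 - alpha), alpha, a * (1 - alpha)])  # Pr(Y=0), Pr(Y=e), Pr(Y=1)
    print(-np.sum(p_y * np.log2(p_y)), H2(alpha) + (1 - alpha) * H2(a))  # both ≈ 1.522 bits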

Symmetric Channels (2)

For a symmetric channel, we have

    I(X; Y) = H(Y) − H(Y|X) = H(Y) − H(r) ≤ log |Y| − H(r),

with equality if the output distribution is uniform (r is a row of the transition matrix). Indeed, it is not difficult to show that a uniform input distribution, p(x) = 1/|X|, leads to a uniform output distribution. For example, the channel described on the previous slide has capacity

    C = max_{p(x)} I(X; Y) = log 3 − H(0.5, 0.3, 0.2),

which is achieved with a uniform input distribution.

Symmetric Channels (3)

A channel is said to be symmetric if the rows of the channel transition matrix p(y|x) are permutations of each other, and the columns are permutations of each other. A channel is said to be weakly symmetric if every row of the transition matrix is a permutation of every other row, and all the column sums are equal.

Theorem 8.2.1. For a weakly symmetric channel,

    C = log |Y| − H(row of transition matrix),

and this is achieved by a uniform distribution on the input alphabet.

Properties of Channel Capacity

1. C ≥ 0, as I(X; Y) ≥ 0.
2. C ≤ log |X|, as C = max I(X; Y) ≤ max H(X) = log |X|.
3. C ≤ log |Y|, by an analogous argument.
4. I(X; Y) is a continuous function of p(x).
5. I(X; Y) is a concave function of p(x) (proved in [Cov, Theorem 2.7.4]).

In general, there is no closed-form solution for the capacity of an arbitrary channel. Since I(X; Y) is concave, a local maximum is a global maximum, and it can be found using a standard nonlinear optimization method (a numerical sketch of one such method is given below, after the preview of the channel coding theorem).

Preview of the Channel Coding Theorem

Basic idea: for large block lengths, every channel looks like the noisy typewriter channel ([Cov, Fig. 8.4]). For each (typical) input n-sequence, there are approximately 2^{nH(Y|X)} possible Y sequences, all of them equally likely; see [Cov, Fig. 8.7].

We wish to ensure that no two X sequences produce the same Y output sequence. The total number of possible (typical) Y sequences is ≈ 2^{nH(Y)}. This set has to be divided into sets of size 2^{nH(Y|X)} corresponding to the different input X sequences. Hence, the total number of disjoint sets is at most

    2^{nH(Y)} / 2^{nH(Y|X)} = 2^{n(H(Y) − H(Y|X))} = 2^{nI(X;Y)}.
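The slides note that the capacity of an arbitrary channel has no closed form but can be found by standard nonlinear optimization, since I(X; Y) is concave in p(x). One standard iterative method, not named in the slides, is the Blahut–Arimoto algorithm; the sketch below is a minimal version of it, assuming a strictly positive transition matrix, and is checked against the symmetric-channel value log 3 − H(0.5, 0.3, 0.2).

    import numpy as np

    def blahut_arimoto(W, iters=500):
        # Approximate C = max_{p(x)} I(X;Y) in bits for a channel matrix W[x, y] = p(y|x).
        # Minimal sketch; assumes all entries of W are strictly positive.
        n_x = W.shape[0]
        p = np.full(n_x, 1.0 / n_x)               # start from the uniform input distribution
        for _ in range(iters):
            q = p[:, None] * W                    # p(x) p(y|x)
            q /= q.sum(axis=0, keepdims=True)     # posterior q(x|y)
            r = np.exp(np.sum(W * np.log(q), axis=1))
            p = r / r.sum()                       # updated input distribution
        p_xy = p[:, None] * W
        p_y = p_xy.sum(axis=0)
        I = np.sum(p_xy * np.log2(p_xy / (p[:, None] * p_y[None, :])))
        return float(I), p

    W = np.array([[0.3, 0.2, 0.5],
                  [0.5, 0.3, 0.2],
                  [0.2, 0.5, 0.3]])
    C, p_opt = blahut_arimoto(W)
    print(C)      # ≈ log2(3) − H(0.5, 0.3, 0.2) ≈ 0.0995 bits
    print(p_opt)  # ≈ (1/3, 1/3, 1/3), the uniform input, as Theorem 8.2.1 predicts

For this symmetric channel the uniform input is optimal from the start, so the iteration converges immediately; for a general channel the updates climb to the global maximum because I(X; Y) is concave.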

Channel Codes (1)

We consider a communication system as shown in [Cov, Fig. 8.8]. A code for this channel consists of the following:

1. An index set I = {1, 2, ..., M}.
2. An encoding function X^n : I → X^n.
3. A decoding function g : Y^n → I.

We say that a code with these parameters is an (n, M) code. The conditional probability of error given that index i was sent is denoted by λ_i, that is,

    λ_i = Pr(Ŵ ≠ i | W = i).

Channel Codes (2)

The maximal probability of error, λ^{(n)}, for a channel code is defined as

    λ^{(n)} = max_{i ∈ I} λ_i.

The average probability of error, P_e^{(n)}, for a channel code is defined as

    P_e^{(n)} = (1/M) Σ_{i=1}^{M} λ_i.

If the index is chosen uniformly on the set I, we have

    P_e^{(n)} = Pr(Ŵ ≠ W).

Channel Codes (3)

The rate of an (n, M) code is

    R = (log M) / n  bits per transmission.

A rate R is said to be achievable if there exists a sequence of (n, ⌈2^{nR}⌉) codes (for simplicity, we write (n, 2^{nR}) in the sequel) such that the maximal probability of error λ^{(n)} tends to 0 as n → ∞.

The capacity of a discrete memoryless channel is the supremum of all achievable rates.

Jointly Typical Sequences (1)

The set A_ε^{(n)} of jointly typical sequences {(x^n, y^n)} with respect to the distribution p(x, y) is the set of n-sequences with empirical entropies ε-close to the true entropies, that is,

    A_ε^{(n)} = { (x^n, y^n) ∈ X^n × Y^n :
                  |−(1/n) log p(x^n) − H(X)| < ε,
                  |−(1/n) log p(y^n) − H(Y)| < ε,
                  |−(1/n) log p(x^n, y^n) − H(X, Y)| < ε }.
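The membership conditions defining A_ε^{(n)} translate directly into code. The sketch below is my own; the joint distribution (a uniform binary input passed through a channel that flips the bit with probability 0.1) is assumed purely for illustration. It checks the three empirical-entropy conditions for a pair of sequences drawn i.i.d. from p(x, y).

    import numpy as np

    rng = np.random.default_rng(0)

    # Assumed joint distribution p(x, y): uniform binary X, Y equal to X with
    # probability 0.9 and flipped with probability 0.1.
    p_xy = np.array([[0.45, 0.05],
                     [0.05, 0.45]])
    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
    H = lambda q: float(-np.sum(q * np.log2(q)))
    H_X, H_Y, H_XY = H(p_x), H(p_y), H(p_xy.ravel())

    def jointly_typical(xs, ys, eps=0.1):
        # The three conditions defining A_eps^(n) for the sequences xs, ys.
        n = len(xs)
        hx = -np.log2(p_x[xs]).sum() / n            # empirical -(1/n) log p(x^n)
        hy = -np.log2(p_y[ys]).sum() / n            # empirical -(1/n) log p(y^n)
        hxy = -np.log2(p_xy[xs, ys]).sum() / n      # empirical -(1/n) log p(x^n, y^n)
        return abs(hx - H_X) < eps and abs(hy - H_Y) < eps and abs(hxy - H_XY) < eps

    # Draw one pair (X^n, Y^n) i.i.d. from p(x, y).
    n = 1000
    flat = rng.choice(4, size=n, p=p_xy.ravel())
    xs, ys = flat // 2, flat % 2
    print(jointly_typical(xs, ys))                  # True with high probability for large n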

Jointly Typical Sequences (2)

We will decode a channel output Y^n as index i if the codeword X^n(i) is jointly typical with the received signal Y^n. For the joint AEP (asymptotic equipartition property), we have the following result.

Theorem 8.6.1. Let (X^n, Y^n) be sequences of length n drawn i.i.d. according to p(x^n, y^n) = Π_{i=1}^{n} p(x_i, y_i). Then

1. Pr((X^n, Y^n) ∈ A_ε^{(n)}) → 1 as n → ∞.

2. |A_ε^{(n)}| ≤ 2^{n(H(X,Y) + ε)}.

3. If (X̃^n, Ỹ^n) ∼ p(x^n) p(y^n), that is, X̃^n and Ỹ^n are independent with the same marginals as p(x^n, y^n), then

       Pr((X̃^n, Ỹ^n) ∈ A_ε^{(n)}) ≤ 2^{−n(I(X;Y) − 3ε)},

   and for sufficiently large n,

       Pr((X̃^n, Ỹ^n) ∈ A_ε^{(n)}) ≥ (1 − ε) 2^{−n(I(X;Y) + 3ε)}.

Jointly Typical Sequences (3)

The jointly typical sequences are illustrated in [Cov, Fig. 8.9]. There are about 2^{nH(X)} typical X sequences and about 2^{nH(Y)} typical Y sequences. However, since there are only 2^{nH(X,Y)} jointly typical sequences, not all pairs of typical X and Y sequences are also jointly typical.

The probability that any randomly chosen pair is jointly typical is about 2^{−nI(X;Y)}. Hence, for a fixed Y^n, we can consider about 2^{nI(X;Y)} such pairs before we are likely to come across a jointly typical pair. This suggests that there are at most about 2^{nI(X;Y)} distinguishable signals X^n.
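Part 3 of Theorem 8.6.1 bounds the probability that an independently drawn pair lands in A_ε^{(n)}. The Monte Carlo sketch below is mine, using the same illustrative joint distribution as in the previous sketch and a deliberately small n so that the rare event is observable; it estimates that probability and compares it with the bound 2^{−n(I(X;Y) − 3ε)}.

    import numpy as np

    rng = np.random.default_rng(1)

    # Same illustrative joint distribution as before.
    p_xy = np.array([[0.45, 0.05],
                     [0.05, 0.45]])
    p_x, p_y = p_xy.sum(axis=1), p_xy.sum(axis=0)
    H = lambda q: float(-np.sum(q * np.log2(q)))
    H_X, H_Y, H_XY = H(p_x), H(p_y), H(p_xy.ravel())
    I_XY = H_X + H_Y - H_XY                       # ≈ 0.531 bits

    n, eps, trials = 25, 0.1, 200_000

    # Draw independent sequences with the same marginals as p(x, y),
    # i.e. the pair (X̃^n, Ỹ^n) in part 3 of the theorem.
    xs = rng.choice(2, size=(trials, n), p=p_x)
    ys = rng.choice(2, size=(trials, n), p=p_y)
    hx = -np.log2(p_x[xs]).mean(axis=1)           # empirical -(1/n) log p(x^n)
    hy = -np.log2(p_y[ys]).mean(axis=1)
    hxy = -np.log2(p_xy[xs, ys]).mean(axis=1)
    typical = (np.abs(hx - H_X) < eps) & (np.abs(hy - H_Y) < eps) & (np.abs(hxy - H_XY) < eps)

    print(typical.mean())                 # on the order of 2^(-n I(X;Y)) ≈ 1e-4 here
    print(2 ** (-n * (I_XY - 3 * eps)))   # upper bound from Theorem 8.6.1(3), ≈ 0.018

In the random-coding argument this is exactly the event that a wrong codeword looks jointly typical with the received sequence, which is why roughly 2^{nI(X;Y)} codewords can be distinguished.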
