EE 533 Information Theory: Barış Nakiboğlu

This document provides an overview of information theory and probability concepts. It defines key terms like random variables, probability spaces, σ-algebras, measures, and convex sets/functions. Random variables can describe both discrete and continuous distributions using the Borel σ-algebra. Probability distributions of random variables can be viewed as elements of a vector space in information theory.


EE 533 Information Theory

Lecture 01

Barış Nakiboğlu

Department of Electrical and Electronics Engineering


Middle East Technical University

March 16, 2021


Figure: Relationship of Information Theory to other fields [CT06, Fig. 1.1]

- What is a discrete random variable?
- What is a continuous random variable?
- What is a random variable?
- What is a probability space? (Sample space? Set of events?)

Definition (σ-algebra)
A set F is a σ-algebra of subsets of Ω iff it satisfies the following three conditions:
i) Ω ∈ F.
ii) Ω \ E ∈ F for all E ∈ F.
iii) If $E_n \in F$ for all $n \in \mathbb{Z}^+$, then $\bigcup_{n=1}^{\infty} E_n \in F$.

Note that
- F is closed under countable unions, countable intersections, and complements.
- If the third condition holds only for finite collections of sets rather than countable ones, then F is an algebra.
- Every σ-algebra is an algebra, but not every algebra is a σ-algebra.
- The third condition concerns countable collections, not arbitrary collections.
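
The three conditions can be checked mechanically when Ω is finite. The following is a minimal sketch, not part of the original slides; the function name `is_sigma_algebra` and the example families are assumptions made for illustration. On a finite Ω every countable union reduces to a finite one, so checking pairwise unions is enough.

```python
from itertools import combinations

def is_sigma_algebra(omega, family):
    """Check the three defining conditions for a family of subsets of a finite set omega."""
    omega = frozenset(omega)
    family = {frozenset(e) for e in family}
    # i) the whole space must belong to the family
    if omega not in family:
        return False
    # ii) the family must be closed under complements
    if any(omega - e not in family for e in family):
        return False
    # iii) closed under unions; pairwise unions suffice for a finite family
    return all(a | b in family for a, b in combinations(family, 2))

omega = {1, 2, 3, 4}
good = [set(), {1, 2}, {3, 4}, omega]
bad = [set(), {1}, {2}, omega]  # the complements of {1} and {2} are missing

print(is_sigma_algebra(omega, good))  # True
print(is_sigma_algebra(omega, bad))   # False
```

The distinction between an algebra and a σ-algebra only shows up for infinite Ω, which this finite sketch cannot exhibit.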

Definition (Measurable Space)


An ordered pair (Ω, F) is a measurable space iff F is a σ-algebra of subsets of Ω.
Definition
i) A function µ : F → [0, ∞] is a measure iff µ is countably additive, i.e.
$$\mu\Big(\bigcup_{i=1}^{\infty} E_i\Big) = \sum_{i=1}^{\infty} \mu(E_i)$$
for any countable collection $\{E_i\}_{i \in \mathbb{Z}^+}$ of elements of F satisfying $E_i \cap E_j = \emptyset$ for all $i \neq j$.


ii) A measure µ is a probability measure iff µ(Ω) = 1.

Definition
(Ω, F, P) is a probability space iff F is a σ-algebra of subsets of Ω and P is a probability measure on (Ω, F).
- Ω: sample space, certain event.
- ω ∈ Ω: outcomes, sample points, elementary events.
- F: σ-algebra of events, set of events.

Axioms of Probability
i) P(E) ≥ 0 for all E ∈ F.
ii) P(Ω) = 1.
iii) If $E_i \cap E_j = \emptyset$ for all $i, j \in \mathbb{Z}^+$ s.t. $i \neq j$, then
$$P\Big(\bigcup_{i=1}^{\infty} E_i\Big) = \sum_{i=1}^{\infty} P(E_i).$$

Example (Two fair coin tosses)


Ω = {HH, HT, TH, TT}
F = $2^{\Omega}$, the power set of Ω, i.e. the set of all subsets of Ω.
P({ω}) = 1/4 for all ω ∈ Ω.
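
The two-toss example is small enough to enumerate in full. The sketch below is an illustration only; the helper names `power_set` and `prob` are assumptions, not notation from the slides. It builds the power set, assigns P(E) = |E|/4, and checks additivity on every pair of disjoint events.

```python
from fractions import Fraction
from itertools import combinations

omega = ["HH", "HT", "TH", "TT"]

def power_set(points):
    """All subsets of a finite set, returned as frozensets."""
    return [frozenset(c) for r in range(len(points) + 1)
            for c in combinations(points, r)]

def prob(event):
    """P(E) = |E| / 4, since every outcome has probability 1/4."""
    return Fraction(len(event), len(omega))

events = power_set(omega)
at_least_one_head = frozenset({"HH", "HT", "TH"})

# Additivity on every pair of disjoint events in the power set
additive = all(prob(a | b) == prob(a) + prob(b)
               for a in events for b in events if not a & b)

print(len(events))              # 16
print(prob(at_least_one_head))  # 3/4
print(additive)                 # True
```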
Example (Fair coin tosses until an H follows a T or a T follows an H)
Ω = {HT, TH, HHT, TTH, ...}
F = $2^{\Omega}$
P({ω}) = $2^{-\ell(\omega)}$, where $\ell(\omega)$ is the length of the string ω.
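
As a quick sanity check (not part of the original slides), the probabilities in this second example sum to one. For each length n ≥ 2 there are exactly two outcomes of length n, namely n − 1 heads followed by a tail and n − 1 tails followed by a head, so

$$\sum_{\omega \in \Omega} P(\{\omega\}) = \sum_{n=2}^{\infty} 2 \cdot 2^{-n} = \sum_{m=1}^{\infty} 2^{-m} = 1.$$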
Is it possible to use a single σ-algebra to describe both discrete and continuous random variables?
Definition
The σ-algebra generated by a collection S of subsets of a set Ω is the smallest σ-algebra of subsets of Ω that is a superset of S. Such a minimal σ-algebra always exists and is unique by [Bog07, Proposition 1.2.6].
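
For a finite Ω the generated σ-algebra can be computed directly by repeatedly closing the collection under complements and unions until nothing new appears. This is a sketch under that finiteness assumption (the general existence proof in [Bog07] does not proceed this way), and `generated_sigma_algebra` is a hypothetical helper name.

```python
def generated_sigma_algebra(omega, collection):
    """Smallest family containing `collection` that is closed under
    complements and unions, for a finite set omega."""
    omega = frozenset(omega)
    family = {frozenset(), omega} | {frozenset(s) for s in collection}
    while True:
        new = {omega - e for e in family}                # complements
        new |= {a | b for a in family for b in family}   # pairwise unions
        if new <= family:                                # nothing new: closed
            return family
        family |= new

sigma = generated_sigma_algebra({1, 2, 3, 4}, [{1}, {2}])
print(len(sigma))  # 8
```

Every set added by the loop must belong to any σ-algebra containing the collection, which is why the result is the smallest one. Here the atoms are {1}, {2}, and {3, 4}, so the generated family has 2^3 = 8 elements.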

Definition
The Borel σ-algebra on R, denoted by B(R), is the σ-algebra generated by the open intervals on the real line.
Since B(R) contains all open intervals, a probability measure on B(R) assigns a probability to every open interval, so we can describe continuous random variables using B(R).
On the other hand, {x} ∈ B(R) for all x ∈ R, because {x} is the intersection of the countably many open intervals with rational endpoints that contain it.
Thus we can also describe discrete random variables using B(R).

Definition
On a probability space (Ω, F, P), a function X : Ω → R is a random variable iff $X^{-1}(A) \in F$ for all A ∈ B(R), i.e. X is (F, B(R))-measurable.
- In plain English: for every event about X defined by open sets on the real line, I can find an event whose probability I can calculate under P.
Thus (Ω, F, P) and X induce a probability measure $\tilde{P}$ on (R, B(R)) via
$$\tilde{P}(A) := P\big(X^{-1}(A)\big) \qquad \forall A \in B(\mathbb{R}).$$
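
The induced measure can be illustrated on the two-toss probability space. In the sketch below, the choice of X as the number of heads and the helper names `induced_pmf` and `prob_X_in` are assumptions made for illustration. The second helper follows the definition literally: it forms the preimage of A under X and measures it with P.

```python
from fractions import Fraction
from collections import defaultdict

# Probability space for two fair coin tosses: P({w}) = 1/4
P = {w: Fraction(1, 4) for w in ["HH", "HT", "TH", "TT"]}

def X(w):
    """Assumed example random variable: the number of heads in the outcome."""
    return w.count("H")

def induced_pmf(P, X):
    """Collect the mass that P puts on each value taken by X."""
    dist = defaultdict(Fraction)
    for w, p in P.items():
        dist[X(w)] += p
    return dict(dist)

def prob_X_in(A, P, X):
    """Induced measure of A, computed as P of the preimage of A under X."""
    preimage = [w for w in P if X(w) in A]
    return sum((P[w] for w in preimage), Fraction(0))

print(prob_X_in({1, 2}, P, X))  # 3/4
```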
- If there exists a countable set $\mathcal{X} \subset \mathbb{R}$ s.t. $P(X \in \mathcal{X}) = 1$, then X is a discrete random variable, with probability mass function
$$P(X = x) = p_X(x) \qquad \forall x \in \mathcal{X}.$$
- If there exists a non-negative function $f_X$ s.t.
$$P(X \le \tau) = \int_{-\infty}^{\tau} f_X(x)\,dx \qquad \forall \tau \in \mathbb{R},$$
then X is an (absolutely) continuous random variable. We are content with the Riemann integral for this course, but the analysis works for the Lebesgue integral, which is more general.
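
The integral characterization of a continuous random variable can be checked numerically with a Riemann sum. The sketch below assumes, purely as an example, an exponential density with rate 1 (this density does not appear in the slides) and compares a midpoint Riemann sum with the known closed form 1 − exp(−τ).

```python
import math

def pdf(x):
    """Assumed example density: exponential with rate 1."""
    return math.exp(-x) if x >= 0 else 0.0

def cdf_riemann(tau, lower=0.0, n=100_000):
    """Midpoint Riemann sum for the integral of pdf over [lower, tau].
    The density vanishes below 0, so the lower limit -infinity is replaced by 0."""
    if tau <= lower:
        return 0.0
    h = (tau - lower) / n
    return h * sum(pdf(lower + (k + 0.5) * h) for k in range(n))

tau = 2.0
approx = cdf_riemann(tau)
exact = 1.0 - math.exp(-tau)
print(abs(approx - exact) < 1e-8)  # True
```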
Information Theory vs/in Probability Theory

- In probability theory, one is often interested in the random variables, and algebraic operations are performed on the random variables themselves.
- In information theory, one often works with the distributions of the random variables, and algebraic operations are often performed on the distributions themselves.
- Since we perform algebraic operations on the distributions of the random variables, we need to work with the associated vector space structure.
- Note that
  - For a discrete random variable X, its pmf evaluated at X, i.e. $p_X(X)$, is itself a discrete random variable.
  - For a continuous random variable X, its pdf evaluated at X, i.e. $f_X(X)$, is not necessarily a continuous random variable! (Consider a random variable that is uniformly distributed on [0, 1].)
  - How about the cdfs, i.e. $F_X(x) := P(X \le x)$? Can we make similar assertions?
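
The first remark can be made concrete with a small computation. In the sketch below, the pmf `p_X` and the helper `pmf_of_pmf` are assumptions chosen for illustration. The random variable Y = p_X(X) takes the value v with probability equal to the total mass of all x satisfying p_X(x) = v.

```python
from fractions import Fraction
from collections import defaultdict

# Assumed example pmf on a three-letter alphabet
p_X = {"a": Fraction(1, 2), "b": Fraction(1, 4), "c": Fraction(1, 4)}

def pmf_of_pmf(p):
    """pmf of Y = p_X(X): add up the mass of all x sharing the same value p_X(x)."""
    out = defaultdict(Fraction)
    for x, px in p.items():
        out[px] += px
    return dict(out)

p_Y = pmf_of_pmf(p_X)
print(p_Y[Fraction(1, 2)], p_Y[Fraction(1, 4)])  # 1/2 1/2
```

For the uniform distribution on [0, 1] mentioned in the second remark, the density equals 1 on the whole interval, so $f_X(X) = 1$ with probability one, which is a discrete (in fact constant) random variable.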

Vector Spaces

Figure: Axioms of Vector Space (wikipedia.org)


Definition (Convex Sets and Convex Functions)
A subset S of a vector space is a convex set iff
αx + (1 − α)y ∈ S ∀x, y ∈ S, α ∈ [0, 1].

A function f : S → R defined on a convex set S is convex iff
f(αx + (1 − α)y) ≤ αf(x) + (1 − α)f(y) ∀x, y ∈ S, α ∈ [0, 1].

A function f : S → R defined on a convex set S is strictly convex iff
f(αx + (1 − α)y) < αf(x) + (1 − α)f(y) ∀x, y ∈ S with x ≠ y, α ∈ (0, 1).

A function f : S → R defined on a convex set S is (strictly) concave iff −f is (strictly) convex.


Example
- The set of all probability mass functions on a countable set is a convex set.
- $x$, $\max\{1, x\}$, $x^2$, $\exp(x)$ are convex functions on R.
- $x^2$ and $\exp(x)$ are strictly convex functions on R; $x \log(x)$ is strictly convex on R+.
- $\log(x)$ and $\sqrt{x}$ are strictly concave functions on R+.
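
Two of these examples can be spot-checked numerically. The sketch below is illustrative only, and the helper names `mix`, `is_pmf`, and `violates_convexity` are assumptions. It confirms that a convex combination of two pmfs on a finite set is again a pmf, and tests the convexity inequality at a single triple (x, y, α). Such a spot check can refute convexity but cannot prove it.

```python
import math

def mix(p, q, alpha):
    """Convex combination alpha*p + (1-alpha)*q of two pmfs on the same finite set."""
    return {x: alpha * p[x] + (1 - alpha) * q[x] for x in p}

def is_pmf(p, tol=1e-12):
    """Non-negative masses that sum to one, up to floating-point tolerance."""
    return all(v >= 0 for v in p.values()) and abs(sum(p.values()) - 1) < tol

def violates_convexity(f, x, y, alpha, tol=1e-12):
    """True if f(ax + (1-a)y) > a f(x) + (1-a) f(y) at this one triple."""
    return f(alpha * x + (1 - alpha) * y) > alpha * f(x) + (1 - alpha) * f(y) + tol

def x_log_x(x):
    return x * math.log(x)

p = {"a": 0.5, "b": 0.5, "c": 0.0}
q = {"a": 0.1, "b": 0.2, "c": 0.7}

print(is_pmf(mix(p, q, 0.3)))                      # True
print(violates_convexity(x_log_x, 0.5, 2.0, 0.5))  # False
print(violates_convexity(math.log, 0.5, 2.0, 0.5)) # True
```

The last line shows the inequality failing for log, as expected for a concave function.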
[Bog07] Vladimir I. Bogachev. Measure Theory. Springer-Verlag, Berlin Heidelberg, 2007.
[CT06] Thomas M. Cover and Joy A. Thomas. Elements of Information Theory. Wiley-Interscience, New York, NY, 2nd edition, 2006.
[Dud02] Richard M. Dudley. Real Analysis and Probability, volume 74. Cambridge University Press, New York, NY, 2002.
