18.445 Introduction To Stochastic Processes

This document summarizes key concepts from a lecture on Markov chain mixing:
- The lecturer announces the midterm and final exam dates for the course.
- Total variation distance and coupling are introduced as ways to characterize convergence of a Markov chain to its stationary distribution.
- The convergence theorem states that, for an irreducible, aperiodic Markov chain, the total variation distance between the chain's distribution and the stationary distribution decays exponentially fast.
- Mixing time is defined as the number of steps needed for the total variation distance to drop below a given threshold, and methods for bounding it are discussed.


18.445 Introduction to Stochastic Processes

Lecture 4: Introduction to Markov chain mixing

Hao Wu

MIT

23 February 2015

Hao Wu (MIT) 18.445 23 February 2015 1/9

Announcement
Midterm: April 6th (in class).

Final: May 18th.

The tests are closed book, closed notes, no calculators.

Recall
If $(X_n)_n$ is an irreducible Markov chain with stationary distribution $\pi$, then
$$\lim_{n \to \infty} \frac{1}{n} \sum_{j=0}^{n} 1_{[X_j = x]} = \pi(x), \quad P_\mu\text{-a.s.}$$
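The statement recalled above can be illustrated by simulation. The following is a minimal sketch, not part of the lecture: the 3-state matrix `P`, its stationary vector `PI`, and the helper name `ergodic_averages` are all illustrative assumptions. It follows one trajectory and records the fraction of time spent in each state, which should be close to $\pi$ for a long run.

```python
import random

def ergodic_averages(P, x0, n_steps, seed=0):
    """Simulate one trajectory X_0, ..., X_n and return (1/n) * sum_j 1[X_j = x] for each x."""
    rng = random.Random(seed)          # fixed seed so the run is reproducible
    states = range(len(P))
    counts = [0] * len(P)
    x = x0
    for _ in range(n_steps + 1):       # j = 0, 1, ..., n_steps
        counts[x] += 1
        x = rng.choices(states, weights=P[x])[0]
    return [c / n_steps for c in counts]

# Hypothetical example chain (irreducible); PI is its stationary distribution.
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
PI = [0.25, 0.50, 0.25]

averages = ergodic_averages(P, x0=0, n_steps=100_000)
print(averages)
```

The averages depend on the starting state only through a vanishing correction, which is why the limit holds $P_\mu$-a.s. for every initial distribution $\mu$.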

Today's goal: We will show that $X_n$ converges to $\pi$ in a "strong sense".
total variation distance
the convergence theorem
mixing times


Three ways to characterize the total variation distance
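Let $\mu$ and $\nu$ be probability measures on $\Omega$. The total variation distance is
$$\|\mu - \nu\|_{TV} = \max_{A \subset \Omega} |\mu(A) - \nu(A)|.$$

Lemma

$$\|\mu - \nu\|_{TV} = \frac{1}{2} \sum_{x \in \Omega} |\mu(x) - \nu(x)|.$$

$$\|\mu - \nu\|_{TV} = \frac{1}{2} \sup\{\mu f - \nu f : f \text{ satisfying } \max_{x \in \Omega} |f(x)| \le 1\}.$$

$$\|\mu - \nu\|_{TV} = \inf\{P[X \ne Y] : (X, Y) \text{ is a coupling of } \mu, \nu\}.$$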

Definition
We call $(X, Y)$ an optimal coupling if $P[X \ne Y] = \|\mu - \nu\|_{TV}$.
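The characterizations of the total variation distance can be checked numerically on a small example. The sketch below is illustrative only: the function names and the two distributions `mu`, `nu` are assumptions, and `optimal_coupling` uses one standard construction (put mass $\min(\mu(x), \nu(x))$ on the diagonal and spread the remainder as a product), which is not necessarily the construction used in the lecture.

```python
from itertools import combinations

def tv_half_l1(mu, nu):
    """(1/2) * sum_x |mu(x) - nu(x)|."""
    return 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))

def tv_max_over_events(mu, nu):
    """max over all subsets A of |mu(A) - nu(A)| (brute force, small spaces only)."""
    n, best = len(mu), 0.0
    for k in range(n + 1):
        for A in combinations(range(n), k):
            best = max(best, abs(sum(mu[i] for i in A) - sum(nu[i] for i in A)))
    return best

def tv_test_function(mu, nu):
    """(1/2) * (mu f - nu f) for the maximizer f = sign(mu - nu), which satisfies |f| <= 1."""
    f = [1.0 if m >= v else -1.0 for m, v in zip(mu, nu)]
    return 0.5 * sum(fx * (m - v) for fx, m, v in zip(f, mu, nu))

def optimal_coupling(mu, nu):
    """Joint law q[x][y] with marginals mu and nu such that P[X != Y] equals the TV distance."""
    n = len(mu)
    common = [min(m, v) for m, v in zip(mu, nu)]
    q = [[0.0] * n for _ in range(n)]
    for x in range(n):
        q[x][x] = common[x]                      # X = Y on the shared mass
    rest = 1.0 - sum(common)                     # equals the TV distance
    if rest > 0:
        for x in range(n):
            for y in range(n):
                q[x][y] += (mu[x] - common[x]) * (nu[y] - common[y]) / rest
    return q

# Hypothetical example distributions on a 3-point space.
mu = [0.5, 0.3, 0.2]
nu = [0.2, 0.3, 0.5]
q = optimal_coupling(mu, nu)
off_diagonal_mass = sum(q[x][y] for x in range(3) for y in range(3) if x != y)
print(tv_half_l1(mu, nu), tv_max_over_events(mu, nu), tv_test_function(mu, nu), off_diagonal_mass)
```

All four printed numbers should agree up to floating-point rounding.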
The Convergence Theorem

Suppose that $(X_n)_n$ is a Markov chain with transition matrix $P$. Assume that $P$ is irreducible and aperiodic. Then
there exists $r$ such that $P^r(x, y) > 0$ for all $x, y \in \Omega$;
there exists a unique stationary distribution $\pi$, and $\pi(x) > 0$ for all $x \in \Omega$.

Theorem
Suppose that $P$ is irreducible and aperiodic, with stationary distribution $\pi$.
Then there exist constants $\alpha \in (0, 1)$ and $C > 0$ such that

$$\max_{x \in \Omega} \|P^n(x, \cdot) - \pi\|_{TV} \le C \alpha^n \quad \forall n \ge 1.$$
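To see the geometric decay concretely, here is a hedged sketch on a hypothetical 3-state chain. The constants are taken from one standard proof of the theorem, which is an assumption here since the slides do not give the proof: pick $r$ with $P^r > 0$ entrywise, let $\delta = \min_{x,y} P^r(x, y) / \pi(y)$ and $\theta = 1 - \delta$; then $C = 1/\theta$ and $\alpha = \theta^{1/r}$ work, provided $\delta < 1$. All function names are illustrative.

```python
def mat_mul(A, B):
    """Product of two square matrices given as lists of rows."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def tv(mu, nu):
    """Total variation distance between two distributions on the same finite space."""
    return 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))

def d_sequence(P, pi, n_max):
    """Return [d(1), ..., d(n_max)] with d(n) = max_x ||P^n(x, .) - pi||_TV."""
    out, Pn = [], P
    for _ in range(n_max):
        out.append(max(tv(row, pi) for row in Pn))
        Pn = mat_mul(Pn, P)
    return out

def proof_constants(P, pi):
    """Return (r, C, alpha) as in the proof sketch above; assumes delta < 1."""
    r, Pr = 1, P
    while min(min(row) for row in Pr) <= 0.0:   # find r with P^r > 0 entrywise
        r, Pr = r + 1, mat_mul(Pr, P)
    k = len(P)
    delta = min(Pr[x][y] / pi[y] for x in range(k) for y in range(k))
    theta = 1.0 - delta
    return r, 1.0 / theta, theta ** (1.0 / r)

# Hypothetical irreducible, aperiodic chain with stationary distribution PI.
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
PI = [0.25, 0.50, 0.25]

r, C, alpha = proof_constants(P, PI)
d = d_sequence(P, PI, 20)
for n, dn in enumerate(d, start=1):
    print(n, dn, C * alpha ** n)
```

The bound produced this way is usually far from tight; it only demonstrates that some pair $(C, \alpha)$ exists.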


Mixing time
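Definition
$$d(n) = \max_{x \in \Omega} \|P^n(x, \cdot) - \pi\|_{TV}$$

$$\bar{d}(n) = \max_{x, y \in \Omega} \|P^n(x, \cdot) - P^n(y, \cdot)\|_{TV}$$

Lemma
$$d(n) \le \bar{d}(n) \le 2 d(n)$$

Lemma
$$\bar{d}(m + n) \le \bar{d}(m) \cdot \bar{d}(n)$$

Corollary

$$\bar{d}(mn) \le \bar{d}(n)^m$$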
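The two lemmas and the corollary can be sanity-checked numerically. The sketch below is an illustration under assumptions: the 3-state matrix `P` is hypothetical, the stationary distribution is approximated by power iteration (valid here because the chain is irreducible and aperiodic), and all helper names are made up for this example.

```python
def step(mu, P):
    """One step of the chain applied to a row vector: mu -> mu P."""
    k = len(P)
    return [sum(mu[x] * P[x][y] for x in range(k)) for y in range(k)]

def tv(mu, nu):
    return 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))

def stationary(P, iters=2000):
    """Approximate pi by power iteration from the uniform distribution."""
    mu = [1.0 / len(P)] * len(P)
    for _ in range(iters):
        mu = step(mu, P)
    return mu

def d_and_dbar(P, n_max):
    """Return lists d, dbar indexed by n = 0..n_max."""
    k, pi = len(P), stationary(P)
    rows = [[1.0 if y == x else 0.0 for y in range(k)] for x in range(k)]  # P^0
    d, dbar = [], []
    for _ in range(n_max + 1):
        d.append(max(tv(row, pi) for row in rows))
        dbar.append(max(tv(a, b) for a in rows for b in rows))
        rows = [step(row, P) for row in rows]
    return d, dbar

# Hypothetical chain with all entries positive (hence irreducible and aperiodic).
P = [[0.1, 0.6, 0.3],
     [0.4, 0.2, 0.4],
     [0.5, 0.3, 0.2]]
d, dbar = d_and_dbar(P, 12)
for n in range(1, 7):
    print(n, d[n], dbar[n], 2 * d[n])
```

Submultiplicativity is the reason $\bar{d}$, rather than $d$, is the convenient quantity for iterating bounds.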
Mixing time

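Definition
$$t_{\mathrm{mix}} = \min\{n : d(n) \le 1/4\}, \qquad t_{\mathrm{mix}}(\epsilon) = \min\{n : d(n) \le \epsilon\}$$

Lemma
$$t_{\mathrm{mix}}(\epsilon) \le \log\left(\frac{1}{\epsilon}\right) \frac{t_{\mathrm{mix}}}{\log 2}$$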
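The lemma follows from the previous slide: for an integer $\ell$, $d(\ell\, t_{\mathrm{mix}}) \le \bar{d}(t_{\mathrm{mix}})^\ell \le (2 d(t_{\mathrm{mix}}))^\ell \le 2^{-\ell}$. As a hedged numerical check, the sketch below computes mixing times exactly for a hypothetical 2-state chain and compares them with the bound for a few values of $\epsilon \le 1/4$. The chain, its stationary vector `PI`, and the helper names are assumptions made for illustration.

```python
import math

def step(mu, P):
    """One step of the chain applied to a row vector: mu -> mu P."""
    k = len(P)
    return [sum(mu[x] * P[x][y] for x in range(k)) for y in range(k)]

def tv(mu, nu):
    return 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))

def mixing_time(P, pi, eps=0.25, n_cap=10_000):
    """Smallest n with d(n) <= eps, where d(n) = max_x ||P^n(x, .) - pi||_TV."""
    k = len(P)
    rows = [[1.0 if y == x else 0.0 for y in range(k)] for x in range(k)]  # P^0
    for n in range(n_cap + 1):
        if max(tv(row, pi) for row in rows) <= eps:
            return n
        rows = [step(row, P) for row in rows]
    raise ValueError("chain did not mix within n_cap steps")

# Hypothetical 2-state chain; PI = (2/3, 1/3) is its stationary distribution.
P = [[0.9, 0.1],
     [0.2, 0.8]]
PI = [2.0 / 3.0, 1.0 / 3.0]

t_mix = mixing_time(P, PI)
for eps in (0.2, 0.1, 0.01, 0.001):
    bound = math.log(1.0 / eps) / math.log(2.0) * t_mix
    print(eps, mixing_time(P, PI, eps), bound)
```

The practical message is that the choice of threshold $1/4$ is harmless: reaching accuracy $\epsilon$ costs only a factor of order $\log(1/\epsilon)$.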

Question: How long does it take for the Markov chain to get close to the stationary measure?
Lecture 5: Upper bounds on $t_{\mathrm{mix}}$; Lecture 6: Lower bounds on $t_{\mathrm{mix}}$;
Lecture 7: Interesting models.


Couple two Markov chains

Definition
A coupling of two Markov chains with transition matrix $P$ is a process $(X_n, Y_n)_{n \ge 0}$ with the following two properties.
Both $(X_n)$ and $(Y_n)$ are Markov chains with transition matrix $P$.
They stay together after they first meet: if $X_s = Y_s$, then $X_n = Y_n$ for all $n \ge s$.

Notation: If $(X_n)_{n \ge 0}$ and $(Y_n)_{n \ge 0}$ are coupled Markov chains with $X_0 = x$, $Y_0 = y$, then we denote by $P_{x,y}$ the law of $(X_n, Y_n)_{n \ge 0}$.


Couple two Markov chains

Theorem
Suppose that $P$ is irreducible with stationary distribution $\pi$. Let $(X_n, Y_n)_{n \ge 0}$ be a coupling of Markov chains with transition matrix $P$ for which $X_0 = x$, $Y_0 = y$. Define $\tau$ to be their first meeting time:

$$\tau = \min\{n \ge 0 : X_n = Y_n\}.$$

Then
$$\|P^n(x, \cdot) - P^n(y, \cdot)\|_{TV} \le P_{x,y}[\tau > n].$$
In particular,
$$d(n) \le \max_{x,y} P_{x,y}[\tau > n].$$
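The coupling bound can be verified exactly on a small chain. The sketch below uses one particular coupling chosen for illustration, not taken from the lecture: the two chains move independently until they first meet and move together afterwards. For this coupling, $P_{x,y}[\tau > n]$ is the mass that the product chain keeps off the diagonal, which can be computed without simulation. The matrix `P` and all function names are assumptions.

```python
def step(mu, P):
    """One step of the chain applied to a row vector: mu -> mu P."""
    k = len(P)
    return [sum(mu[x] * P[x][y] for x in range(k)) for y in range(k)]

def tv(mu, nu):
    return 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))

def coupling_tail(P, x, y, n_max):
    """Exact P_{x,y}[tau > n] for n = 0..n_max under the independent-until-meeting coupling."""
    k = len(P)
    # mass[(a, b)] = P[X_n = a, Y_n = b, tau > n]; only pairs with a != b are kept.
    mass = {(x, y): 1.0} if x != y else {}
    tails = [sum(mass.values())]
    for _ in range(n_max):
        new = {}
        for (a, b), w in mass.items():
            for a2 in range(k):
                for b2 in range(k):
                    if a2 != b2:                 # the chains have still not met
                        new[(a2, b2)] = new.get((a2, b2), 0.0) + w * P[a][a2] * P[b][b2]
        mass = new
        tails.append(sum(mass.values()))
    return tails

def tv_between_starts(P, x, y, n_max):
    """||P^n(x, .) - P^n(y, .)||_TV for n = 0..n_max."""
    k = len(P)
    mu = [1.0 if z == x else 0.0 for z in range(k)]
    nu = [1.0 if z == y else 0.0 for z in range(k)]
    out = [tv(mu, nu)]
    for _ in range(n_max):
        mu, nu = step(mu, P), step(nu, P)
        out.append(tv(mu, nu))
    return out

# Hypothetical irreducible chain on 3 states.
P = [[0.50, 0.50, 0.00],
     [0.25, 0.50, 0.25],
     [0.00, 0.50, 0.50]]
tails = coupling_tail(P, 0, 2, 10)
dists = tv_between_starts(P, 0, 2, 10)
for n in range(11):
    print(n, dists[n], tails[n])
```

A cleverer coupling makes the chains meet sooner and therefore gives a sharper bound; the independent coupling is simply the easiest one to write down.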


Random walk on the $N$-cycle: upper bound on $t_{\mathrm{mix}}$

Lazy walk: it remains in its current position with probability 1/2, moves left with probability 1/4, and moves right with probability 1/4.
It is irreducible.
The stationary measure is the uniform measure.

Theorem
For the lazy walk on the $N$-cycle, we have

$$t_{\mathrm{mix}} \le N^2.$$
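For small $N$ the mixing time of the lazy walk can be computed exactly and compared with $N^2$. The following sketch is illustrative: the helper names are assumptions, and it relies on the symmetry of the cycle, so that $d(n)$ can be evaluated from a single starting vertex.

```python
def lazy_cycle(N):
    """Transition matrix of the lazy walk on the N-cycle: stay 1/2, left 1/4, right 1/4."""
    P = [[0.0] * N for _ in range(N)]
    for x in range(N):
        P[x][x] += 0.5
        P[x][(x - 1) % N] += 0.25
        P[x][(x + 1) % N] += 0.25
    return P

def step(mu, P):
    """One step of the chain applied to a row vector: mu -> mu P."""
    k = len(P)
    return [sum(mu[x] * P[x][y] for x in range(k)) for y in range(k)]

def tv(mu, nu):
    return 0.5 * sum(abs(m - v) for m, v in zip(mu, nu))

def cycle_mixing_time(N, eps=0.25):
    """Smallest n with ||P^n(0, .) - uniform||_TV <= eps; by symmetry this is t_mix(eps)."""
    P = lazy_cycle(N)
    uniform = [1.0 / N] * N
    mu = [1.0] + [0.0] * (N - 1)       # start at vertex 0
    n = 0
    while tv(mu, uniform) > eps:
        mu = step(mu, P)
        n += 1
    return n

for N in range(3, 13):
    print(N, cycle_mixing_time(N), N * N)
```

The exact values come out well below $N^2$, which is consistent with the theorem being an upper bound rather than a sharp estimate.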


MIT OpenCourseWare
http://ocw.mit.edu