18.445 Introduction To Stochastic Processes

This document summarizes key concepts from a lecture on Markov chain mixing, including:
- The lecturer announced midterm and final exam dates for the course.
- Total variation distance and coupling are introduced as ways to characterize convergence of a Markov chain to its stationary distribution.
- The convergence theorem states that for irreducible, aperiodic Markov chains, the total variation distance between the chain's distribution and the stationary distribution decreases exponentially fast.
- Mixing time is defined as the number of steps needed for the total variation distance to drop below a given threshold, and methods for bounding mixing time are discussed.


18.445 Introduction to Stochastic Processes


Lecture 4: Introduction to Markov chain mixing

Hao Wu

MIT

23 February 2015

Hao Wu (MIT) 18.445 23 February 2015 1/9


Announcement
Midterm: April 6th (in class).

Final: May 18th.


The tests are closed book, closed notes, no calculators.

Recall
If $(X_n)_n$ is an irreducible Markov chain with stationary distribution $\pi$, then
\[ \lim_{n\to\infty} \frac{1}{n} \sum_{j=0}^{n} 1_{[X_j = x]} = \pi(x), \quad P_\mu\text{-a.s.} \]
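
The recalled limit can be observed numerically. The following is a minimal sketch, not part of the lecture: the two-state transition matrix, the seed, and the helper name `occupation_frequency` are all illustrative assumptions. For this matrix the stationary distribution is (1/3, 2/3), so the fraction of time spent in state 0 should settle near 1/3.

```python
import random

def occupation_frequency(P, start, target, steps, rng):
    """Fraction of the first `steps` times at which the chain sits in `target`."""
    x, hits = start, 0
    for _ in range(steps):
        hits += (x == target)
        # Draw the next state from row x of the transition matrix.
        x = rng.choices(range(len(P)), weights=P[x])[0]
    return hits / steps

# Hypothetical two-state chain; its stationary distribution is (1/3, 2/3).
P = [[2 / 3, 1 / 3],
     [1 / 6, 5 / 6]]
rng = random.Random(0)  # fixed seed so the run is reproducible
freq = occupation_frequency(P, start=1, target=0, steps=100_000, rng=rng)
print(freq)  # should be close to 1/3
```

The starting state does not matter in the limit, which is the content of the statement for an arbitrary initial distribution.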

Today's goal: We will show that $X_n$ converges to $\pi$ in a strong sense.
- total variation distance
- the convergence theorem
- mixing times



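Three ways to characterize the total variation distance
Let $\mu$ and $\nu$ be probability measures on $\Omega$. The total variation distance is
\[ \|\mu - \nu\|_{TV} = \max_{A \subset \Omega} |\mu(A) - \nu(A)|. \]

Lemma
- \[ \|\mu - \nu\|_{TV} = \frac{1}{2} \sum_{x \in \Omega} |\mu(x) - \nu(x)|. \]
- \[ \|\mu - \nu\|_{TV} = \frac{1}{2} \sup\Big\{ \mu f - \nu f : f \text{ satisfying } \max_{x \in \Omega} |f(x)| \le 1 \Big\}, \] where $\mu f = \sum_{x \in \Omega} f(x)\mu(x)$.
- \[ \|\mu - \nu\|_{TV} = \inf\{ P[X \ne Y] : (X, Y) \text{ is a coupling of } \mu, \nu \}. \]

Definition
We call $(X, Y)$ an optimal coupling if $P[X \ne Y] = \|\mu - \nu\|_{TV}$.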
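
As a sanity check on these characterizations, the sketch below computes the distance three ways on a small example: by brute force over all events, by the half-L1 formula, and as the mismatch probability of the coupling that places mass min(µ(x), ν(x)) on the diagonal. The distributions `mu` and `nu` and all function names are illustrative assumptions, not taken from the lecture.

```python
from fractions import Fraction
from itertools import combinations

def tv_half_l1(mu, nu):
    """Half the L1 distance between two distributions given as dicts."""
    states = set(mu) | set(nu)
    return sum(abs(mu.get(x, 0) - nu.get(x, 0)) for x in states) / 2

def tv_max_over_events(mu, nu):
    """Brute-force max over all events A of |mu(A) - nu(A)| (small spaces only)."""
    states = sorted(set(mu) | set(nu))
    best = 0
    for k in range(len(states) + 1):
        for event in combinations(states, k):
            gap = abs(sum(mu.get(x, 0) - nu.get(x, 0) for x in event))
            best = max(best, gap)
    return best

def diagonal_coupling_mismatch(mu, nu):
    """P[X != Y] when the coupling puts mass min(mu(x), nu(x)) on each pair (x, x)."""
    states = set(mu) | set(nu)
    return 1 - sum(min(mu.get(x, 0), nu.get(x, 0)) for x in states)

# Hypothetical example distributions on a three-point space.
mu = {"a": Fraction(1, 2), "b": Fraction(1, 3), "c": Fraction(1, 6)}
nu = {"a": Fraction(1, 4), "b": Fraction(1, 4), "c": Fraction(1, 2)}

print(tv_half_l1(mu, nu), tv_max_over_events(mu, nu), diagonal_coupling_mismatch(mu, nu))
```

All three values agree on this example (each equals 1/3), and the third one shows why a coupling attaining the infimum exists: the two variables can be made equal with probability equal to the overlapping mass.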
The Convergence Theorem

Suppose that $(X_n)_n$ is a Markov chain with transition matrix $P$. If $P$ is irreducible and aperiodic, then
- there exists $r$ such that $P^r(x, y) > 0$ for all $x, y \in \Omega$;
- there exists a unique stationary distribution $\pi$, and $\pi(x) > 0$ for all $x \in \Omega$.

Theorem
Suppose that $P$ is irreducible and aperiodic, with stationary distribution $\pi$. Then there exist constants $\alpha \in (0, 1)$ and $C > 0$ such that
\[ \max_{x \in \Omega} \|P^n(x, \cdot) - \pi\|_{TV} \le C \alpha^n \quad \forall n \ge 1. \]
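
To see the geometric decay concretely, here is a small sketch on a two-state chain, where the maximum distance can be computed exactly with rational arithmetic. The chain (parameters `a` and `b`) is a hypothetical example chosen for illustration; for it the bound holds with equality for C = 2/3 and α = 1/2.

```python
from fractions import Fraction

def mat_mul(A, B):
    """Product of two square matrices stored as lists of rows."""
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def tv(p, q):
    """Total variation distance between two distributions given as lists."""
    return sum(abs(x - y) for x, y in zip(p, q)) / 2

def d_values(P, pi, n_max):
    """Return [d(1), ..., d(n_max)] with d(n) = max_x ||P^n(x, .) - pi||_TV."""
    out, Pn = [], P
    for _ in range(n_max):
        out.append(max(tv(row, pi) for row in Pn))
        Pn = mat_mul(Pn, P)
    return out

# Hypothetical two-state chain: leave state 0 w.p. a, leave state 1 w.p. b.
a, b = Fraction(1, 3), Fraction(1, 6)
P = [[1 - a, a], [b, 1 - b]]
pi = [b / (a + b), a / (a + b)]  # stationary distribution (1/3, 2/3)

decay = d_values(P, pi, 8)
for n, value in enumerate(decay, start=1):
    print(n, value)  # each value is half the previous one
```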



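Mixing time
Definition
\[ d(n) = \max_{x \in \Omega} \|P^n(x, \cdot) - \pi\|_{TV} \]
\[ \bar{d}(n) = \max_{x, y \in \Omega} \|P^n(x, \cdot) - P^n(y, \cdot)\|_{TV} \]

Lemma
\[ d(n) \le \bar{d}(n) \le 2 d(n) \]

Lemma
\[ \bar{d}(m + n) \le \bar{d}(m) \cdot \bar{d}(n) \]

Corollary
\[ \bar{d}(mn) \le \bar{d}(n)^m \]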
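
The two lemmas can be checked numerically on a small chain. The sketch below uses the lazy walk on a 5-cycle as a test case; the choice of chain, the horizon of 12 steps, and the helper names are assumptions made for illustration only.

```python
from fractions import Fraction

def lazy_cycle(N):
    """Lazy walk on the N-cycle: stay w.p. 1/2, step left or right w.p. 1/4 each."""
    P = [[Fraction(0)] * N for _ in range(N)]
    for x in range(N):
        P[x][x] += Fraction(1, 2)
        P[x][(x - 1) % N] += Fraction(1, 4)
        P[x][(x + 1) % N] += Fraction(1, 4)
    return P

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def tv(p, q):
    return sum(abs(x - y) for x, y in zip(p, q)) / 2

def distance_profiles(P, pi, n_max):
    """Return lists (d, dbar), both indexed by n = 0, ..., n_max."""
    N = len(P)
    Pn = [[Fraction(int(i == j)) for j in range(N)] for i in range(N)]  # P^0
    d, dbar = [], []
    for _ in range(n_max + 1):
        d.append(max(tv(row, pi) for row in Pn))
        dbar.append(max(tv(r1, r2) for r1 in Pn for r2 in Pn))
        Pn = mat_mul(Pn, P)
    return d, dbar

N = 5
P = lazy_cycle(N)
pi = [Fraction(1, N)] * N  # uniform stationary distribution
d, dbar = distance_profiles(P, pi, 12)

for n in range(1, 7):
    print(n, float(d[n]), float(dbar[n]))
```

Submultiplicativity is what makes one good value of the distance enough: once the distance between rows drops below 1 at some time n, it keeps shrinking by at least that factor every further n steps.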
Mixing time

Definition
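\[ t_{\mathrm{mix}} = \min\{n : d(n) \le 1/4\}, \qquad t_{\mathrm{mix}}(\varepsilon) = \min\{n : d(n) \le \varepsilon\} \]

Lemma
\[ t_{\mathrm{mix}}(\varepsilon) \le \log\!\left(\frac{1}{\varepsilon}\right) \frac{t_{\mathrm{mix}}}{\log 2} \]

Question: How long does it take the Markov chain to get close to the stationary measure?
Lecture 5: Upper bounds on $t_{\mathrm{mix}}$; Lecture 6: Lower bounds on $t_{\mathrm{mix}}$; Lecture 7: Interesting models.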
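
The mixing-time definition translates directly into a search over n. The following sketch computes the mixing time at several thresholds for the lazy walk on a 6-cycle and compares it with the lemma. Two assumptions are worth flagging: the chain is just an example, and the bound is evaluated in its integer form, with log(1/ε)/log 2 rounded up, since the argument via the submultiplicativity corollary works in whole multiples of the mixing time.

```python
import math
from fractions import Fraction

def lazy_cycle(N):
    """Lazy walk on the N-cycle: stay w.p. 1/2, step left or right w.p. 1/4 each."""
    P = [[Fraction(0)] * N for _ in range(N)]
    for x in range(N):
        P[x][x] += Fraction(1, 2)
        P[x][(x - 1) % N] += Fraction(1, 4)
        P[x][(x + 1) % N] += Fraction(1, 4)
    return P

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def tv(p, q):
    return sum(abs(x - y) for x, y in zip(p, q)) / 2

def mixing_time(P, pi, eps, n_cap=10_000):
    """Smallest n with d(n) <= eps, found by stepping through n = 0, 1, 2, ..."""
    N = len(P)
    Pn = [[Fraction(int(i == j)) for j in range(N)] for i in range(N)]  # P^0
    for n in range(n_cap + 1):
        if max(tv(row, pi) for row in Pn) <= eps:
            return n
        Pn = mat_mul(Pn, P)
    raise ValueError("threshold not reached within n_cap steps")

N = 6
P = lazy_cycle(N)
pi = [Fraction(1, N)] * N
t_mix = mixing_time(P, pi, Fraction(1, 4))

results = {}
for eps in (Fraction(1, 8), Fraction(1, 100), Fraction(1, 1000)):
    # Integer form of the lemma's bound (rounding up is an assumption of this sketch).
    bound = math.ceil(math.log2(float(1 / eps))) * t_mix
    results[eps] = (mixing_time(P, pi, eps), bound)
    print(eps, results[eps])
```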



Couple two Markov chains

Definition
A coupling of two Markov chains with transition matrix $P$ is a process $(X_n, Y_n)_{n \ge 0}$ with the following two properties.
- Both $(X_n)$ and $(Y_n)$ are Markov chains with transition matrix $P$.
- They stay together after they first meet: if $X_s = Y_s$, then $X_t = Y_t$ for all $t \ge s$.

Notation: If $(X_n)_{n \ge 0}$ and $(Y_n)_{n \ge 0}$ are coupled Markov chains with $X_0 = x$, $Y_0 = y$, then we denote by $P_{x,y}$ the law of $(X_n, Y_n)_{n \ge 0}$.



Couple two Markov chains

Theorem
Suppose that $P$ is irreducible with stationary distribution $\pi$. Let $(X_n, Y_n)_{n \ge 0}$ be a coupling of Markov chains with transition matrix $P$ for which $X_0 = x$, $Y_0 = y$. Define $\tau$ to be their first meeting time:
\[ \tau = \min\{n \ge 0 : X_n = Y_n\}. \]
Then
\[ \|P^n(x, \cdot) - P^n(y, \cdot)\|_{TV} \le P_{x,y}[\tau > n]. \]
In particular,
\[ d(n) \le \max_{x,y} P_{x,y}[\tau > n]. \]
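
The coupling bound can be checked exactly on a small example. The sketch below uses one particular coupling of two lazy walks on a 7-cycle, chosen here for illustration and not necessarily the one used in the course: at each step a fair coin picks which walker moves, and that walker steps left or right with equal probability; after meeting, the two move together. Each walker on its own is then a lazy walk, and until they meet the difference of their positions performs a simple random walk on the cycle, so the probability that they have not yet met can be computed by tracking that difference.

```python
from fractions import Fraction

def lazy_cycle(N):
    """Lazy walk on the N-cycle: stay w.p. 1/2, step left or right w.p. 1/4 each."""
    P = [[Fraction(0)] * N for _ in range(N)]
    for x in range(N):
        P[x][x] += Fraction(1, 2)
        P[x][(x - 1) % N] += Fraction(1, 4)
        P[x][(x + 1) % N] += Fraction(1, 4)
    return P

def mat_mul(A, B):
    n = len(A)
    return [[sum(A[i][k] * B[k][j] for k in range(n)) for j in range(n)]
            for i in range(n)]

def tv(p, q):
    return sum(abs(x - y) for x, y in zip(p, q)) / 2

def not_met_probability(N, k, n_max):
    """P[tau > n] for n = 0..n_max when the walkers start k apart (mod N)."""
    dist = [Fraction(0)] * N   # law of the difference X_n - Y_n, absorbed at 0
    dist[k % N] = Fraction(1)
    out = []
    for _ in range(n_max + 1):
        out.append(1 - dist[0])
        new = [Fraction(0)] * N
        new[0] = dist[0]                       # once met, they stay together
        for s in range(1, N):
            new[(s + 1) % N] += dist[s] / 2    # difference goes up by one
            new[(s - 1) % N] += dist[s] / 2    # difference goes down by one
        dist = new
    return out

N, n_max = 7, 30
P = lazy_cycle(N)
tails = [not_met_probability(N, k, n_max) for k in range(N)]

# Compare ||P^n(k, .) - P^n(0, .)||_TV with P[tau > n] for every offset k.
bound_holds = True
Pn = [[Fraction(int(i == j)) for j in range(N)] for i in range(N)]  # P^0
for n in range(n_max + 1):
    for k in range(N):
        bound_holds = bound_holds and tv(Pn[k], Pn[0]) <= tails[k][n]
    Pn = mat_mul(Pn, P)
print(bound_holds)
```

By the rotational symmetry of the cycle, comparing each state with state 0 covers all pairs of starting points.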



Random walk on the N-cycle: Upper bound on $t_{\mathrm{mix}}$

Lazy walk: it remains in its current position with probability 1/2, moves left with probability 1/4, and moves right with probability 1/4.
- It is irreducible.
- The stationary measure is the uniform measure.

Theorem
For the lazy walk on the N-cycle, we have
\[ t_{\mathrm{mix}} \le N^2. \]
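
For small cycles the mixing time can be computed exactly and compared with the bound. This is a minimal sketch under stated assumptions: it evolves the distribution from a single starting point, which suffices because the cycle is rotationally symmetric and so every starting point gives the same distance to uniform. The range of N and the function names are illustrative.

```python
from fractions import Fraction

def lazy_step(dist):
    """One step of the lazy walk applied to a distribution on the cycle."""
    N = len(dist)
    new = [Fraction(0)] * N
    for x, mass in enumerate(dist):
        new[x] += mass / 2
        new[(x - 1) % N] += mass / 4
        new[(x + 1) % N] += mass / 4
    return new

def cycle_mixing_time(N, eps=Fraction(1, 4)):
    """Smallest n with ||P^n(0, .) - uniform||_TV <= eps for the lazy walk."""
    dist = [Fraction(0)] * N
    dist[0] = Fraction(1)
    uniform = Fraction(1, N)
    n = 0
    while sum(abs(p - uniform) for p in dist) / 2 > eps:
        dist = lazy_step(dist)
        n += 1
    return n

mixing_times = {N: cycle_mixing_time(N) for N in range(3, 13)}
for N, t in mixing_times.items():
    print(N, t, N * N)  # exact mixing time next to the bound N^2
```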



MIT OpenCourseWare
http://ocw.mit.edu

18.445 Introduction to Stochastic Processes


Spring 2015

For information about citing these materials or our Terms of Use, visit: http://ocw.mit.edu/terms.
