Chapter 3
for all $t > 0$, $s > s_n > \cdots > s_1 \ge 0$ and $i, j, i_k \in S$. This is the obvious analogue of the Markov property when the discrete time variable $l$ is replaced by a continuous parameter $t$. We refer to equation (1.1.1) as the Markov property and to the quantities $P[X_{s+t} = j \mid X_s = i]$ as transition probabilities or matrices.
We represent the transition probabilities $P[X_{s+t} = j \mid X_s = i]$ by a possibly infinite matrix $P^s_{s+t}$. Making the time homogeneity assumption, as in the case of Markov chains, we deduce that the matrix $P^s_{s+t}$ depends only on the difference $(s+t) - s = t$, and therefore we simply write $P_t$ instead of $P^s_{s+t}$. Thus for a continuous time Markov chain, the family of matrices $P_t$ (generally infinite matrices) replaces the single transition matrix $P$ of a Markov chain.
In the case of Markov chains the matrix of transition probabilities after $l$ units of time is given by $P^l$. The analogous statement for a continuous time Markov chain is
$$P_{s+t} = P_t P_s. \qquad (1.1.2)$$
This equation is known as the semi-group property. As usual we write $P^{(t)}_{ij}$ for the $(i,j)^{th}$ entry of the matrix $P_t$. The proof of (1.1.2) is similar to that of the analogous statement for Markov chains, viz., that the matrix of transition probabilities after $l$ units of time is given by $P^l$. Here the transition probability from state $i$ to state $j$ after $t+s$ units of time is given by
$$\sum_k P^{(t)}_{ik} P^{(s)}_{kj} = P^{(t+s)}_{ij},$$
where the sum is over all states $k$ in the state space $S$; in matrix form,
$$P_s P_t = P_{s+t} = P_t P_s.$$
We also impose the additional requirement of right continuity on the paths $\omega \in \Omega$, in the form
$$\lim_{t \to 0^+} P_t = I. \qquad (1.1.3)$$
The limit here is meant entrywise for the matrix $P_t$. While no requirement of uniformity relative to the different entries of the matrix $P_t$ is imposed, we use this limit also in the sense that for any vector $v$ (in the appropriate function space) we have $\lim_{t\to 0^+} vP_t = v$. We define the infinitesimal generator of the continuous time Markov chain as the one-sided derivative
$$A = \lim_{h\to 0^+} \frac{P_h - I}{h}.$$
$A$ is a real matrix independent of $t$. For the time being, in a rather cavalier manner, we ignore the problem of the existence of this limit and proceed as if the matrix $A$ exists and has finite entries. Thus we define the derivative of $P_t$ at time $t$ as
$$\frac{dP_t}{dt} = \lim_{h\to 0^+} \frac{P_{t+h} - P_t}{h},$$
where the derivative is taken entrywise. The semi-group property implies that we can factor $P_t$ out of the right hand side of the equation. We have two choices, namely factoring $P_t$ out on the left or on the right. Therefore we get the equations
$$\frac{dP_t}{dt} = AP_t, \qquad \frac{dP_t}{dt} = P_t A. \qquad (1.1.4)$$
These differential equations are known as the Kolmogorov backward and for-
ward equations respectively. They have remarkable consequences some of
which we will gradually investigate.
The (possibly infinite) matrices $P_t$ are Markov or stochastic in the sense that their entries are non-negative and their row sums are 1. Similarly the matrix $A$ is not arbitrary. In fact, differentiating the relation that the row sums of $P_t$ equal 1 shows that the row sums of $A$ are 0; its off-diagonal entries are non-negative and its diagonal entries are non-positive. Given the infinitesimal generator $A$, we can in principle construct $P_t$ easily. The idea is to explicitly solve the Kolmogorov (forward or backward) equation. In fact if we replace the matrices $P_t$ and $A$ by scalars, we get the differential equation $\frac{dp}{dt} = ap$, which is easily solved by $p(t) = Ce^{at}$. Therefore we surmise the solution $P_t = Ce^{tA}$ for the Kolmogorov equations (with $C = P_0 = I$), where we have defined the exponential of a matrix $B$ as the infinite series
$$e^B = \sum_j \frac{B^j}{j!}. \qquad (1.1.5)$$
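For a finite state space this recipe is easy to carry out numerically. The following sketch (our own illustration; the two-state rates are arbitrary choices) computes $P_t = e^{tA}$ and checks the semi-group property (1.1.2) and both Kolmogorov equations (1.1.4) by finite differences:

```python
# A minimal numerical sketch: P_t = e^{tA} for a two-state chain.
# The rates (alpha, beta) are illustrative choices, not from the text.
import numpy as np
from scipy.linalg import expm

alpha, beta = 2.0, 3.0                       # rates 0 -> 1 and 1 -> 0
A = np.array([[-alpha, alpha],
              [beta, -beta]])                # rows sum to 0

def P(t):
    return expm(t * A)                       # matrix exponential e^{tA}

s, t = 0.4, 0.7
assert np.allclose(P(s) @ P(t), P(s + t))    # semi-group property (1.1.2)
assert np.allclose(P(s + t).sum(axis=1), 1)  # rows of P_t sum to 1

# Kolmogorov equations (1.1.4), checked with a finite difference derivative.
h = 1e-6
dP = (P(t + h) - P(t)) / h
assert np.allclose(dP, A @ P(t), atol=1e-4)  # backward equation
assert np.allclose(dP, P(t) @ A, atol=1e-4)  # forward equation
```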
Nothing in the definition of a continuous time Markov chain ensures the existence of the infinitesimal generator $A$. In fact it is possible to construct continuous time Markov chains with diagonal entries of $A$ being $-\infty$. Intuitively this means the transition out of a state may be instantaneous. Many Markov chains appearing in the analysis of problems of interest do not allow instantaneous transitions. We eliminate this possibility by the requirement
$$P[X_s = i \ \text{for} \ 0 \le s \le t \mid X_0 = i] = e^{-\lambda_i t}, \quad \text{with } 0 \le \lambda_i < \infty. \qquad (1.1.8)$$
Since the expected value of an exponential random variable with parameter $\lambda$ is $\frac{1}{\lambda}$, it is intuitively clear that if the $\lambda_i$'s increase sufficiently fast we should expect infinitely many transitions in
a finite interval. In order to analyze this issue more closely we consider a
family $T_1, T_2, \ldots$ of independent exponential random variables with $T_k$ having parameter $\lambda_k$. Then we consider the infinite sum $\sum_k T_k$ and the events
$$\Big[\sum_k T_k < \infty\Big] \quad \text{and} \quad \Big[\sum_k T_k = \infty\Big].$$
The first event means there are infinitely many transitions in a finite interval of time, and the second is its complement. It is intuitively clear that if the rates $\lambda_k$ increase sufficiently rapidly we should expect infinitely many transitions in a finite interval, and conversely, if the rates do not increase too fast then only finitely many transitions are possible in finite time. More precisely, we have the following lemma: $\sum_k T_k = \infty$ with probability 1 if and only if $\sum_k \frac{1}{\lambda_k} = \infty$.
Proof - We have
$$E\Big[\sum_k T_k\Big] = \sum_k \frac{1}{\lambda_k},$$
so that if $\sum_k \frac{1}{\lambda_k} < \infty$ then $\sum_k T_k < \infty$ with probability 1. Now assume $\sum_k \frac{1}{\lambda_k} = \infty$. We have
$$E[e^{-T_k}] = \int_0^1 P[e^{-T_k} > s]\, ds = \int_0^1 P[T_k < -\log s]\, ds = \frac{\lambda_k}{1 + \lambda_k}.$$
Therefore, by a standard theorem on infinite products$^1$,
$$E\big[e^{-\sum_k T_k}\big] = \prod_k \frac{1}{1 + \frac{1}{\lambda_k}} = 0.$$
Since $e^{-\sum_k T_k}$ is a non-negative random variable, its expectation can be 0 only if $\sum_k T_k = \infty$ with probability 1. ♣
Remark 1.1.1 It may appear that the Kolmogorov forward and backward
equations are one and the same equation. This is not the case. While A and
Pt formally commute, the domains of definition of the operators APt and
Pt A are not necessarily identical. The difference between the forward and
backward equations becomes significant, for instance, when dealing with cer-
tain boundary conditions where there is instantaneous return from boundary
points (or points at infinity) to another state. However if the infinitesimal
generator A has the property that the absolute values of the diagonal entries
satisfy a uniform bound |Aii | < c, then the forward and backward equations
have the same solution Pt with P◦ = I. In general, the backward equation
has more solutions than the forward equation and its minimal solution is also
the solution of the forward equation. Roughly speaking, this is due to the
fact that A can be an unbounded operator, while Pt has a smoothing effect.
An analysis of such matters demands more technical sophistication than we
are ready to invoke in this context. ♥
Remark 1.1.2 The fact that the series (1.1.5) converges is easy to show for finite matrices or under some boundedness assumption on the entries of the matrix $A$. If the entries $A_{jk}$ grow rapidly with $k, j$, then there will be convergence problems. In manipulating exponentials of (even finite) matrices one should be cognizant of the fact that if $AB \ne BA$ then in general $e^{A+B} \ne e^A e^B$. On the other hand if $AB = BA$ then $e^{A+B} = e^A e^B$ as in the scalar case. ♥
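This caveat is easy to witness numerically. A minimal check, with two arbitrarily chosen non-commuting matrices (a nilpotent upper and a nilpotent lower triangular one), is:

```python
# Illustrative check that e^{A+B} differs from e^A e^B when AB != BA.
import numpy as np
from scipy.linalg import expm

A = np.array([[0.0, 1.0], [0.0, 0.0]])
B = np.array([[0.0, 0.0], [1.0, 0.0]])
print(np.allclose(A @ B, B @ A))                   # False: no commuting
print(np.allclose(expm(A + B), expm(A) @ expm(B))) # False: exponentials differ

# For commuting matrices the identity holds, as in the scalar case.
C = np.diag([1.0, 2.0])
D = np.diag([3.0, -1.0])
print(np.allclose(expm(C + D), expm(C) @ expm(D))) # True
```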
Recall that the stationary distribution played an important role in the theory of Markov chains. For a continuous time Markov chain we similarly define the stationary distribution as a row vector $\pi = (\pi_1, \pi_2, \cdots)$ satisfying
$$\pi P_t = \pi \ \text{for all } t \ge 0, \qquad \sum_j \pi_j = 1, \qquad \pi_j \ge 0. \qquad (1.1.9)$$
$^1$Let $a_k$ be a sequence of positive numbers; then the infinite product $\prod (1 + a_k)^{-1}$ diverges to 0 if and only if $\sum a_k = \infty$. The proof is by taking logarithms and expanding the log, and can be found in many books treating infinite series and products, e.g., Titchmarsh, Theory of Functions, Chapter 1.
The following lemma re-interprets $\pi P_t = \pi$ in terms of the infinitesimal generator $A$: a distribution $\pi$ is stationary if and only if $\pi A = 0$.
This system is easily solved to yield
$$\pi_i = \Big(\frac{\lambda}{\mu}\Big)^i \pi_0.$$
For $\lambda < \mu$ we obtain
$$\pi_i = \Big(1 - \frac{\lambda}{\mu}\Big)\Big(\frac{\lambda}{\mu}\Big)^i$$
as the stationary distribution. ♠
The semi-group property (1.1.2) implies
$$P^{(t)}_{ii} \ge \Big(P^{(t/n)}_{ii}\Big)^n.$$
The continuity assumption (1.1.3) implies that for sufficiently large $n$, $P^{(t/n)}_{ii} > 0$, and consequently
$$P^{(t)}_{ii} > 0, \quad \text{for } t > 0.$$
More generally, we have

Lemma 1.1.3 The diagonal entries $P^{(t)}_{ii}$ are positive, and the off-diagonal entries $P^{(t)}_{ij}$, $i \ne j$, are either positive for all $t > 0$ or vanish identically. The entries of the matrix $P_t$ are right continuous as functions of $t$.
Proof - We already know that $P^{(t)}_{jj} > 0$ for all $t$. Now assume $P^{(t)}_{ij} = 0$ where $i \ne j$. Then for $\alpha, \beta > 0$, $\alpha + \beta = 1$ we have
$$P^{(t)}_{ij} \ge P^{(\alpha t)}_{ii} P^{(\beta t)}_{ij}.$$
Consequently $P^{(\beta t)}_{ij} = 0$ for all $0 < \beta < 1$. This means that if $P^{(t)}_{ij} = 0$, then $P^{(s)}_{ij} = 0$ for all $s \le t$. The conclusion that $P^{(s)}_{ij} = 0$ for all $s$ is proven later (see corollary 1.4.1). The continuity property (1.1.3) and
$$\lim_{h\to 0^+} P_{t+h} = P_t \lim_{h\to 0^+} P_h = P_t$$
imply right continuity of $P^{(t)}_{ij}$. ♣
Note that in the case of a finite state Markov chain $P_t$ has the convergent series representation $P_t = e^{tA}$, and consequently the entries $P^{(t)}_{ij}$ are analytic functions of $t$. An immediate consequence is
Corollary 1.1.1 If all the states of a continuous time Markov chain communicate, then the Markov chain has the property that $P^{(t)}_{ij} > 0$ for all $i, j \in S$ (and all $t > 0$). In particular, if $S$ is finite then all states are aperiodic and recurrent.
In view of the existence of periodic states in the discrete time case, this corollary stands in sharp contrast to the discrete time situation. The existence of the limiting value of $\lim_{l\to\infty} P^l$ for a finite state Markov chain and its implication regarding the long term behavior of the Markov chain was discussed in §1.4. The same result is valid here as well, and the absence of periodic states for continuous time Markov chains results in a stronger proposition. In fact, we have the following: for a finite state continuous time Markov chain all of whose states communicate, $\lim_{t\to\infty} P_t$ exists and is the rank one matrix each of whose rows is the stationary distribution.
Proof - It follows from the hypotheses that for some $t > 0$ all entries of $P_t$ are positive, and consequently for all $t > 0$ all entries of $P_t$ are positive. Fix $t > 0$ and let $Q = P_t$ be the transition matrix of a finite state (discrete time) Markov chain. $\lim_l Q^l$ is the rank one matrix each row of which is the stationary distribution of the Markov chain. This limit is independent of the choice of $t > 0$ since the matrices $P_s$ and $P_t$ commute and the stationary distribution $\pi$ satisfies $\pi P_s = \pi$ for every $s > 0$.
EXERCISES
Exercise 1.1.1 A hospital owns two identical and independent power generators. The time to breakdown for each is exponential with parameter $\lambda$ and the time for repair of a malfunctioning one is exponential with parameter $\mu$. Let $X(t)$ be the Markov process which is the number of operational generators at time $t \ge 0$. Assume $X(0) = 2$. Prove that the probability that both generators are functional at time $t > 0$ is
$$\frac{\mu^2}{(\lambda+\mu)^2} + \frac{\lambda^2 e^{-2(\lambda+\mu)t}}{(\lambda+\mu)^2} + \frac{2\lambda\mu e^{-(\lambda+\mu)t}}{(\lambda+\mu)^2}.$$
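A quick numerical sanity check of this formula is possible (a sketch with arbitrarily chosen rates; we assume each generator has its own repair facility, so that with both generators down repairs proceed at rate $2\mu$): lump the two-generator system into the three-state chain whose states count the operational generators.

```python
# Sketch: check the closed form against P_t = e^{tA} for the lumped chain.
import numpy as np
from scipy.linalg import expm

lam, mu, t = 1.3, 2.1, 0.9                  # illustrative rates and time

A = np.array([[-2 * mu, 2 * mu, 0.0],       # state 0: both down
              [lam, -(lam + mu), mu],       # state 1: one up, one down
              [0.0, 2 * lam, -2 * lam]])    # state 2: both up

p22 = expm(t * A)[2, 2]                     # P[both up at t | both up at 0]
closed = (mu**2 + lam**2 * np.exp(-2 * (lam + mu) * t)
          + 2 * lam * mu * np.exp(-(lam + mu) * t)) / (lam + mu)**2
print(np.isclose(p22, closed))              # True
```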
Exercise 1.1.2 Let $\alpha > 0$ and consider the random walk $X_n$ on the non-negative integers with a reflecting barrier at 0 defined by
$$p_{i\,i+1} = \frac{\alpha}{1+\alpha}, \qquad p_{i\,i-1} = \frac{1}{1+\alpha}, \quad \text{for } i \ge 1.$$
1. Find the stationary distribution of this Markov chain for $\alpha < 1$.
1.2 Inter-arrival Times and Poisson Processes
Poisson processes are perhaps the most basic examples of continuous time
Markov chains. In this subsection we establish their basic properties. To construct a Poisson process we consider a sequence $W_1, W_2, \ldots$ of iid exponential random variables with parameter $\mu$. The $W_j$'s are called inter-arrival times. Set $T_1 = W_1$, $T_2 = W_1 + W_2$, and $T_n = T_{n-1} + W_n$. The $T_j$'s are called arrival times. Now define the Poisson process $N_t$ with parameter $\mu$ as
$$N_t = \max\{n \mid W_1 + W_2 + \cdots + W_n \le t\}. \qquad (1.2.1)$$
Intuitively we can think of certain events taking place and every time the event occurs the counter $N_t$ is incremented by 1. We assume $N_0 = 0$ and that the times between consecutive events, i.e., the $W_j$'s, are iid exponentials with the same parameter $\mu$. Thus $N_t$ is the number of events that have taken place until time $t$. The validity of the Markov property follows from the construction of $N_t$ and the exponential nature of the inter-arrival times, so that the Poisson process is a continuous time Markov chain. It is clear that $N_t$ is stationary in the sense that $N_{s+t} - N_s$ has the same distribution as $N_t$.
The arrival and inter-arrival times can be recovered from $N_t$ by
$$T_n = \inf\{t > 0 \mid N_t = n\}, \qquad W_n = T_n - T_{n-1}, \qquad (1.2.2)$$
reflecting the fact that from state $n$ only a transition to state $n + 1$ is possible.
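The construction translates directly into a simulation recipe. The following sketch (rate and horizon are arbitrary illustrative choices) generates the arrival times and evaluates $N_t$:

```python
# Sketch: simulate a Poisson process path from exponential inter-arrivals.
import numpy as np

rng = np.random.default_rng(0)
mu, t_max = 2.0, 10.0

arrivals = []
T = rng.exponential(1.0 / mu)          # T_1 = W_1
while T <= t_max:
    arrivals.append(T)                 # record T_n
    T += rng.exponential(1.0 / mu)     # T_{n+1} = T_n + W_{n+1}

def N(t):
    # N_t = max{n : T_n <= t}, the number of arrivals up to time t
    return np.searchsorted(arrivals, t, side='right')

print(N(t_max), "arrivals; E[N_t] =", mu * t_max)
```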
To analyze Poisson processes we begin by calculating the density func-
tion for Tn . Recall that the distribution of a sum of independent exponential
random variables is computed by convolving the corresponding density func-
tions (or using Fourier transforms to convert convolution to multiplication.)
Thus it is a straightforward calculation to show that Tn = W1 + · · · + Wn has
density function
$$f_{(n,\mu)}(x) = \begin{cases} \frac{\mu e^{-\mu x}(\mu x)^{n-1}}{(n-1)!} & \text{for } x \ge 0; \\ 0 & \text{for } x < 0. \end{cases} \qquad (1.2.3)$$
One commonly refers to f(n,µ) as Γ density with parameters (n, µ), so that
Tn has Γ distribution with parameters (n, µ). From this we can calculate the
density function for Nt , for given t > 0. Clearly {Tn+1 ≤ t} ⊂ {Tn ≤ t} and
the event {Nt = n} is the complement of {Tn+1 ≤ t} in {Tn ≤ t}. Therefore
by (1.2.3) we have
$$P[N_t = n] = \int_0^t f_{(n,\mu)}(x)\,dx - \int_0^t f_{(n+1,\mu)}(x)\,dx = \frac{e^{-\mu t}(\mu t)^n}{n!}. \qquad (1.2.4)$$
Hence Nt is a Z+ -valued random variable whose distribution is Poisson with
parameter µt, hence the terminology Poisson process. This suggests that we
can interpret the Poisson process Nt as the number of arrivals at a server in
the interval of time [0, t] where the assumption is made that the number of
arrivals is a random variable whose distribution is Poisson with parameter
µt.
In addition to stationarity Poisson processes have another remarkable property. Let $0 \le t_1 < t_2 \le t_3 < t_4$; then the random variables $N_{t_2} - N_{t_1}$ and $N_{t_4} - N_{t_3}$ are independent. This property is called independence of increments of Poisson processes. The validity of this property can be understood intuitively without a formal argument. The essential point is that the inter-arrival times have the same exponential distribution and therefore the number of arrivals in the interval $(t_3, t_4)$ is independent of how many transitions have occurred up to time $t_3$, and in particular independent of the number of transitions in the interval $(t_1, t_2)$. A more formal proof will also follow from our analysis of Poisson processes.
To compute the infinitesimal generator of the Poisson process we note that in view of (1.2.4) for $h > 0$ small we have
$$P[N_{t+h} = n \mid N_t = n] = 1 - \mu h + o(h), \qquad P[N_{t+h} = n+1 \mid N_t = n] = \mu h + o(h),$$
and $P[N_{t+h} \ge n+2 \mid N_t = n] = o(h)$.
It follows that the infinitesimal generator of the Poisson process $N_t$ is
$$A = \begin{pmatrix} -\mu & \mu & 0 & 0 & 0 & \cdots \\ 0 & -\mu & \mu & 0 & 0 & \cdots \\ 0 & 0 & -\mu & \mu & 0 & \cdots \\ 0 & 0 & 0 & -\mu & \mu & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix} \qquad (1.2.5)$$
To compute the joint distribution of the arrival times, recall the change of variables formula for densities: if $(y_1, \cdots, y_m)$ is an invertible differentiable function of $(x_1, \cdots, x_m)$ and $f$ is the density of $(x_1, \cdots, x_m)$, then the density of $(y_1, \cdots, y_m)$ is
$$h(y_1, \cdots, y_m) = f(x_1(y_1, \cdots, y_m), \cdots, x_m(y_1, \cdots, y_m))\,\Big|\frac{\partial(x_1, \cdots, x_m)}{\partial(y_1, \cdots, y_m)}\Big|.$$
In particular, for a linear change of variables $x = Ay$ we obtain
$$h(y_1, \cdots, y_m) = |\det A|\, f\Big(\sum_i A_{1i} y_i, \cdots, \sum_i A_{mi} y_i\Big).$$
We apply this to the transformation
$$t_1 = w_1, \quad t_2 = w_1 + w_2, \quad \cdots, \quad t_m = w_1 + \cdots + w_m,$$
whose determinant is 1.
Therefore to calculate
$$P[A_m \mid N_t = m] = \frac{P[A_m,\ N_t = m]}{P[N_t = m]},$$
where $A_m$ denotes the event
$$A_m = \{0 < T_1 < t_1 < T_2 < t_2 < \cdots < t_{m-1} < T_m < t_m < t < T_{m+1}\},$$
we evaluate the numerator of the right hand side by noting that the condition $N_t = m$ is implied by the requirement $T_m < t_m < t < T_{m+1}$. Now
$$P[A_m] = \int_U \mu^{m+1} e^{-\mu s_{m+1}}\, ds_1 \cdots ds_{m+1},$$
$$U: (s_1, \cdots, s_{m+1}) \ \text{such that} \ 0 < s_1 < t_1 < s_2 < t_2 < \cdots < s_m < t_m < t < s_{m+1}.$$
Therefore
$$P[A_m \mid N_t = m] = \frac{m!}{t^m}\, t_1 (t_2 - t_1) \cdots (t_m - t_{m-1}). \qquad (1.2.6)$$
To obtain the conditional joint density of $T_1, \cdots, T_m$ given $N_t = m$ we apply the differential operator $\frac{\partial^m}{\partial t_1 \cdots \partial t_m}$ to (1.2.6) to obtain
$$f_{T|N}(t_1, \cdots, t_m) = \frac{m!}{t^m}, \quad 0 \le t_1 < t_2 < \cdots < t_m \le t. \qquad (1.2.7)$$
We deduce the following remarkable fact:
Proposition 1.2.1 With the above notation and hypotheses, the conditional
joint density of T1 , · · · , Tm given Nt = m is identical with that of the order
statistics of m uniform random variables from [0, t].
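Proposition 1.2.1 is easy to test by simulation. In the sketch below (all parameters are illustrative choices) we generate Poisson arrival times, condition on $N_t = m$ by rejection, and compare with sorted uniform samples:

```python
# Sketch: conditioned on N_t = m, arrival times behave like order
# statistics of m uniforms on [0, t].
import numpy as np

rng = np.random.default_rng(1)
mu, t, m = 1.0, 5.0, 4

samples = []
while len(samples) < 2000:
    T = np.cumsum(rng.exponential(1.0 / mu, size=m + 1))  # arrival times
    if T[m - 1] <= t < T[m]:           # rejection step: keep only N_t = m
        samples.append(T[:m])
samples = np.array(samples)

uniforms = np.sort(rng.uniform(0, t, size=(2000, m)), axis=1)
print(samples.mean(axis=0))            # ~ [1, 2, 3, 4] = k*t/(m+1)
print(uniforms.mean(axis=0))           # same means for uniform order stats
```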
Proposition 1.2.2 The Poisson process $N_t$ with parameter $\mu$ has the following properties:
1. For fixed $t > 0$, $N_t$ is a Poisson random variable with parameter $\mu t$;
2. $N_t$ is stationary (i.e., $N_{s+t} - N_s$ has the same distribution as $N_t$) and has independent increments;
3. The infinitesimal generator of $N_t$ is given by (1.2.5).
Property (3) of proposition 1.2.2 follows from the first two, which in fact characterize Poisson processes. From the infinitesimal generator (1.2.5) one can construct the transition probabilities $P_t = e^{tA}$.
There is a general procedure for constructing continuous time Markov
chains out of a Poisson process and a (discrete time) Markov chain. The
resulting Markov chains are often considerably easier to analyze and behave
somewhat like the finite state continuous time Markov chains. It is cus-
tomary to refer to these processes as Markov chains subordinated to Poisson
processes. Let Zn be a (discrete time) Markov chain with transition ma-
trix K, and Nt be a Poisson process with parameter µ. Let S be the state
space of Zn . We construct the continuous time Markov chain with state
space S by postulating that the number of transitions in an interval [s, s + t)
is given by Nt+s − Ns which has the same distribution as Nt . Given that
there are n transitions in the interval [s, s + t), we require the probability
P [Xs+t = j | Xs = i, Nt+s − Ns = n] to be
(n)
P [Xs+t = j | Xs = i, Nt+s − Ns = n] = Kij .
Let $K^{(0)} = I$; then the transition probability $P^{(t)}_{ij}$ for $X_t$ is given by
$$P^{(t)}_{ij} = \sum_{n=0}^{\infty} \frac{e^{-\mu t}(\mu t)^n}{n!}\, K^{(n)}_{ij}. \qquad (1.2.9)$$
The infinitesimal generator is easily computed by differentiating (1.2.9) at $t = 0$:
$$A = \mu(-I + K). \qquad (1.2.10)$$
Since $K$ is a Markov matrix, it follows easily that the infinite series expansion of $e^{tA}$ converges and therefore $P_t = e^{tA}$ is rigorously defined. The matrix $Q$ of lemma 1.4.1 can also be expressed in terms of the Markov matrix $K$. Assuming no state is absorbing we get (see corollary 1.4.2)
$$Q_{ij} = \begin{cases} 0, & \text{if } i = j; \\ \frac{K_{ij}}{1 - K_{ii}}, & \text{otherwise.} \end{cases} \qquad (1.2.11)$$
Note that, conversely, if (1.2.10) is satisfied, then from $A$ we obtain a continuous time Markov chain subordinated to a Poisson process.
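The series (1.2.9) and the matrix exponential of (1.2.10) can be checked against each other numerically; in the following sketch $K$ and $\mu$ are arbitrary illustrative choices:

```python
# Sketch: transition matrix of a subordinated chain in two ways.
import numpy as np
from scipy.linalg import expm
from math import factorial

K = np.array([[0.1, 0.9],
              [0.6, 0.4]])                 # Markov matrix of Z_n
mu, t = 1.5, 0.8

# Series (1.2.9): sum over the number of Poisson-driven transitions.
series = sum(np.exp(-mu * t) * (mu * t)**n / factorial(n)
             * np.linalg.matrix_power(K, n) for n in range(60))
direct = expm(t * mu * (K - np.eye(2)))    # A = mu(-I + K), (1.2.10)
print(np.allclose(series, direct))         # True
```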
EXERCISES

Exercise 1.2.1 For the two state Markov chain with transition matrix
$$K = \begin{pmatrix} p & q \\ p & q \end{pmatrix},$$
show that the continuous time Markov chain subordinated to the Poisson process of rate $\mu$ has transition matrix
$$P_t = \begin{pmatrix} p + q e^{-\mu t} & q - q e^{-\mu t} \\ p - p e^{-\mu t} & q + p e^{-\mu t} \end{pmatrix}.$$
1.3 Birth and Death Processes

The simplest birth process describes a population of organisms each of which, independently of the others, splits into two with probability $\lambda h + o(h)$ in any interval of time of length $h$, for $h > 0$ a small real number. This implies that the probability that a single organism splits more than once in $h$ units of time is $o(h)$. Now suppose that we have $n$ organisms; let $A_j$ denote the event that organism number $j$ splits (at least once) and $A$ be the event that in $h$ units of time there is exactly one split. Then
$$P[A] = \sum_j P[A_j] - \sum_{i<j} P[A_i \cap A_j] + \sum_{i<j<k} P[A_i \cap A_j \cap A_k] - \cdots = n\lambda h + o(h). \qquad (1.3.1)$$
Note that the exact value of the terms incorporated into $o(h)$ is quite complicated. We shall see that in spite of ignoring these complicated terms, we can recover exact information about our continuous time Markov chain by using the Kolmogorov forward equation. Let $B$ be the event that in $h$ units of time there are at least two splits, and $C$ the event that there are no splits among the $n$ organisms. Then
$$P[B] = o(h), \qquad P[C] = 1 - \lambda n h + o(h). \qquad (1.3.2)$$
Equations (1.3.1) and (1.3.2) imply that the infinitesimal generator $A$ of the continuous time Markov chain $X_t$ is
$$A = \begin{pmatrix} -\lambda & \lambda & 0 & 0 & 0 & \cdots \\ 0 & -2\lambda & 2\lambda & 0 & 0 & \cdots \\ 0 & 0 & -3\lambda & 3\lambda & 0 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$
For any initial distribution $\pi^{\circ} = (\pi^{\circ}_1, \pi^{\circ}_2, \cdots)$ let $q(t) = (q_1(t), q_2(t), \cdots)$ be the row vector $q(t) = \pi^{\circ} P_t$ which describes the distribution of states at time $t$. In fact, $q_k(t) = P[X_t = k]$ when $X_0$ is distributed according to $\pi^{\circ}$. Thus a basic problem about the Markov chain $X_t$ is the calculation of $E[X_t]$ or more generally of the generating function
$$F_X(t, \xi) = E[\xi^{X_t}] = \sum_{k=1}^{\infty} P[X_t = k]\, \xi^k.$$
The Kolmogorov forward equation $\frac{dq(t)}{dt} = q(t)A$ reads, componentwise,
$$\frac{dq_k(t)}{dt} = (k-1)\lambda\, q_{k-1}(t) - k\lambda\, q_k(t). \qquad (1.3.3)$$
Multiplying (1.3.3) by $\xi^k$ and summing over $k$, we obtain the linear first order partial differential equation
$$\frac{\partial F_X}{\partial t} = \lambda \xi (\xi - 1)\, \frac{\partial F_X}{\partial \xi}. \qquad (1.3.4)$$
The fact that $F_X$ satisfies a linear partial differential equation makes the calculation of $E[X_t]$ very simple. In fact, since $E[X_t] = \frac{\partial F_X}{\partial \xi}$ evaluated at $\xi = 1^-$, we differentiate both sides of (1.3.4) with respect to $\xi$, change the order of differentiation relative to $\xi$ and $t$ on the left side, and set $\xi = 1$ to obtain
$$\frac{dE[X_t]}{dt} = \lambda E[X_t]. \qquad (1.3.5)$$
The solution to this ordinary differential equation is $Ce^{\lambda t}$ and the constant $C$ is determined by the initial condition $X_0 = 1$ to yield
$$E[X_t] = e^{\lambda t}.$$
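This conclusion can be checked by simulating the birth process directly: since the diagonal entry of the generator in state $n$ is $-n\lambda$, the chain waits an exponential time of rate $n\lambda$ in state $n$ before jumping to $n+1$. A minimal sketch (parameters are illustrative):

```python
# Sketch: sample mean of X_t for the pure birth process vs. e^{lambda*t}.
import numpy as np

rng = np.random.default_rng(2)
lam, t, trials = 1.0, 1.5, 20_000

def birth(t_end):
    # X_0 = 1; holding time in state n is exponential with rate n*lam
    n, clock = 1, 0.0
    while True:
        clock += rng.exponential(1.0 / (n * lam))
        if clock > t_end:
            return n
        n += 1

mean = np.mean([birth(t) for _ in range(trials)])
print(mean, np.exp(lam * t))   # both approximately 4.48
```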
The partial differential equation (1.3.4) tells us considerably more than
just the expectation of Xt . The basic theory of a single linear first order
partial differential equation is well understood. Recall that the solution to a
first order ordinary differential equation is uniquely determined by specifying
one initial condition. Roughly speaking, the solution to a linear first order
partial differential equation in two variables is uniquely determined by spec-
ifying a function of one variable. Let us see how this works for our equation
(1.3.4). For a function $g(s)$ of a real variable $s$ we want to substitute for $s$ a function of $t$ and $\xi$ such that (1.3.4) is necessarily valid regardless of the choice of $g$. If for $s$ we substitute $\lambda t + \phi(\xi)$, then by the chain rule
$$\frac{\partial g(\lambda t + \phi(\xi))}{\partial t} = \lambda g'(\lambda t + \phi(\xi)), \qquad \frac{\partial g(\lambda t + \phi(\xi))}{\partial \xi} = \phi'(\xi)\, g'(\lambda t + \phi(\xi)),$$
where $g'$ denotes the derivative of the function $g$. Therefore if $\phi$ is such that $\phi'(\xi) = \frac{1}{\xi^2 - \xi}$, then, regardless of what function we take for $g$, equation (1.3.4) is satisfied by $g(\lambda t + \phi(\xi))$. There is an obvious choice for $\phi$, namely the function
$$\phi(\xi) = \int_{\frac{1}{2}}^{\xi} \frac{du}{u^2 - u} = \log \frac{1 - \xi}{\xi},$$
for $0 < \xi < 1$. (The lower limit of the integral is immaterial and $\frac{1}{2}$ is fixed for convenience.) Now we incorporate the initial condition $X_0 = 1$, which in terms of the generating function $F_X$ means $F_X(0, \xi) = \xi$. In terms of $g$ this translates into
$$g\Big(\log \frac{1-\xi}{\xi}\Big) = \xi.$$
That is, $g$ should be the inverse to the mapping $\xi \to \log \frac{1-\xi}{\xi}$. It is easy to see that
$$g(s) = \frac{e^{-s}}{1 + e^{-s}} = \frac{1}{1 + e^{s}}$$
is the required function. Thus we obtain the expression
$$F_X(t, \xi) = \frac{\xi}{\xi + (1 - \xi)e^{\lambda t}} \qquad (1.3.6)$$
for the probability generating function of $X_t$. If we change the initial condition to $X_0 = N$, then the generating function becomes
$$F_X(\xi, t) = \frac{\xi^N}{[\xi + (1 - \xi)e^{\lambda t}]^N}.$$
From this we deduce
$$P^{(t)}_{Nj} = \binom{j-1}{N-1}\, e^{-N\lambda t}\,(1 - e^{-\lambda t})^{j-N}$$
for the transition probabilities. The method of derivation of (1.3.6) is remarkable and instructive. We made essential use of the Kolmogorov forward
equation in obtaining a linear first order partial differential equation for the
probability generating function. This was possible because we have an infi-
nite number of states and the coefficients of $q_k(t)$ and $q_{k-1}(t)$ in (1.3.3) were linear in $k$. Had the dependence been quadratic in $k$, we would have obtained a partial differential equation of order two relative to the variable $\xi$, and the situation would have been more complex. The fact that
we have an explicit differential equation for the generating function gave us
a fundamental new tool for understanding it. In the exercises this method
will be further demonstrated.
Example 1.3.1 The birth process described above can be easily generalized to a birth-death process by introducing a positive parameter $\mu > 0$ and replacing equations (1.3.1) and (1.3.2) with the requirement
$$P[X_{t+h} = n + a \mid X_t = n] = \begin{cases} n\mu h + o(h), & \text{if } a = -1; \\ n\lambda h + o(h), & \text{if } a = 1; \\ o(h), & \text{if } |a| > 1. \end{cases} \qquad (1.3.7)$$
The probability generating function for $X_t$ can be calculated by an argument similar to that for the pure birth process given above and is relegated to exercise 1.3.5. It is shown there that for $\lambda \ne \mu$
$$F_X(\xi, t) = \left[\frac{\mu(1-\xi) - (\mu - \lambda\xi)e^{-t(\lambda-\mu)}}{\lambda(1-\xi) - (\mu - \lambda\xi)e^{-t(\lambda-\mu)}}\right]^N. \qquad (1.3.8)$$
EXERCISES
Exercise 1.3.1 (M/M/1 queue) A server can service only one customer at a time and the arriving customers form a queue according to order of arrival. Consider the continuous time Markov chain where the length of the queue is the state space, the time between consecutive arrivals is exponential with parameter $\mu$ and the time of service is exponential with parameter $\lambda$. Show that the matrix $Q = (Q_{ij})$ of lemma 1.4.1 is
$$Q_{ij} = \begin{cases} \frac{\mu}{\mu+\lambda}, & \text{if } j = i + 1; \\ \frac{\lambda}{\mu+\lambda}, & \text{if } j = i - 1; \\ 0, & \text{otherwise.} \end{cases}$$
Exercise 1.3.2 Let $X(t)$ be the continuous time Markov chain with $X(0) = m$ and infinitesimal generator
$$A = \begin{pmatrix} -\lambda & \lambda & 0 & 0 & 0 & \cdots \\ \mu & -(\lambda+\mu) & \lambda & 0 & 0 & \cdots \\ 0 & 2\mu & -(\lambda+2\mu) & \lambda & 0 & \cdots \\ \vdots & \vdots & \vdots & \vdots & \vdots & \ddots \end{pmatrix}$$
Let $F_X(\xi, t) = E(\xi^{X(t)})$ denote the generating function of the Markov process. Show that $F_X$ satisfies the differential equation
$$\frac{\partial F_X}{\partial t} = (1 - \xi)\Big[-\lambda F_X + \mu\, \frac{\partial F_X}{\partial \xi}\Big],$$
and deduce that
$$E[X(t)] = \frac{\lambda}{\mu}(1 - e^{-\mu t}) + m e^{-\mu t}.$$
Exercise 1.3.3 (Continuation of exercise 1.3.2) - With the same notation as exercise 1.3.2, show that the substitution
$$F_X(\xi, t) = e^{-\frac{\lambda(1-\xi)}{\mu}}\, G(\xi, t)$$
gets rid of the term involving $F_X$ on the right hand side of the differential equation for $F_X$. More precisely, it transforms the differential equation for $F_X$ into
$$\frac{\partial G}{\partial t} = \mu(1 - \xi)\, \frac{\partial G}{\partial \xi}.$$
Can you give a general approach for solving this differential equation? Verify that
$$F_X(\xi, t) = e^{-\lambda(1-\xi)(1-e^{-\mu t})/\mu}\, [1 - (1 - \xi)e^{-\mu t}]^m$$
satisfies the differential equation and the initial condition.
Exercise 1.3.5 Consider the birth-death process of example 1.3.1.
1. Show that the generating function $F_X(\xi, t) = E[\xi^{X_t}]$ satisfies the partial differential equation
$$\frac{\partial F_X}{\partial t} = (\lambda\xi - \mu)(\xi - 1)\, \frac{\partial F_X}{\partial \xi}.$$
3. Show that
$$\mathrm{Var}[X_t] = N\, \frac{\lambda + \mu}{\lambda - \mu}\, e^{(\lambda-\mu)t}\big(e^{(\lambda-\mu)t} - 1\big).$$
4. Let $N = 1$ and $Z$ denote the time of the extinction of the process. Show that for $\lambda = \mu$, $E[Z] = \infty$.
1.4 Discrete vs. Continuous Time Markov Chains
In this subsection we show how to assign a discrete time Markov chain to one
with continuous time, and how to construct continuous time Markov chains
from a discrete time one. We have already introduced the notions of Markov and stopping times for Markov chains, and we can easily extend them to continuous time Markov chains. Intuitively a Markov time for the (possibly continuous time) Markov chain is a random variable $T$ such that the event $[T \le t]$ does not depend on $X_s$ for $s > t$. Thus a Markov time $T$ has the property that if $T(\omega) = t$ then $T(\omega') = t$ for all paths $\omega'$ which are identical with $\omega$ up to time $t$. For instance, for a Markov chain $X_l$ with state space $\mathbb{Z}$ and $X_0 = 0$, let $T$ be the first hitting time of state $1 \in \mathbb{Z}$. Then $T$ is a Markov time. If $T$ is a Markov time for the continuous time Markov chain $X_t$, the fundamental property of Markov times, generally called the Strong Markov Property, is
$$P[X_{T+s} = j \mid X_t,\ t \le T] = P^{(s)}_{X_T\, j}. \qquad (1.4.1)$$
This reduces to the Markov property if we take T to be a constant. To
understand the meaning of equation (1.4.1), consider Ωu = {ω | T (ω) = u}
where u ∈ R+ is any fixed positive real number. Then the left hand side of
(1.4.1) is the conditional probability of the set of paths ω that after s units
of time are in state j given Ωu and Xt for t ≤ u = T (ω). The right hand
side states that the information $X_t$ for $t \le u$ is not relevant as long as we know
the states for which T (ω) = u, and this probability is the probability of the
paths which after s units of time are in state j assuming at time 0 they were
in a state determined by T = u. One can also loosely think of the strong
Markov property as allowing one to reparametrize paths so that all the paths
will satisfy T (ω) = u at the same constant time T and then the standard
Markov property will be applicable. Examples that we encounter will clarify
the meaning and significance of this concept. The validity of (1.4.1) is quite
intuitive, and one can be convinced of its validity by looking at the set of
paths with the required properties and using the Markov property. It is
sometimes useful to make use of a slightly more general version of the strong
Markov property where a function of the Markov time is introduced. Rather
than stating a general theorem, its validity in the context where it is used
will be clear.
The notation $E_i[Z]$, where the random variable $Z$ is a function of the continuous time Markov chain $X_t$, means that we are calculating the conditional expectation conditioned on $X_0 = i$. Naturally, one may replace the subscript
i by a random variable to accommodate a different conditional expectation.
Of course, instead of a subscript one may write the conditioning in the usual
manner $E[\cdot \mid \cdot]$. The strong Markov property in the context of conditional expectations implies
$$E[g(X_{T+s}) \mid X_u,\ u \le T] = E_{X_T}[g] = \sum_{j \in S} P^{(s)}_{X_T\, j}\, g(j). \qquad (1.4.2)$$
Let $Y = \inf\{t > 0 \mid X_t \ne X_0\}$ be the time of the first transition out of the initial state. Then $Y$ is a Markov time. The assumption (1.1.8) implies that except for a set of paths of probability 0, $Y(\omega) > 0$, and by the right continuity assumption, the infimum is actually achieved. The strong Markov property implies that the random variable $Y$ is memoryless in the sense that
$$P[Y \ge t + s \mid Y > s] = P[Y \ge t \mid X_0 = i].$$
This equation is compatible with (1.1.8). Note that for an absorbing state $i$ we have $\lambda_i = 0$.
From a continuous time Markov chain one can construct a (discrete time) Markov chain. Let us assume $X_0 = i \in S$. A simple and not so useful way is to define the transition matrix $P$ of the Markov chain as $P^{(1)}_{ij}$. A more useful approach is to let $T_n$ be the time of the $n^{th}$ transition. Thus $T_1(\omega) = s > 0$ means that there is $j \in S$, $j \ne i$, such that
$$\omega(t) = \begin{cases} i & \text{for } t < s; \\ j & \text{for } t = s. \end{cases}$$
$T_1$ is a stopping time if we assume that $i$ is not an absorbing state. We define $Q_{ij}$ to be the probability of the set of paths that at time 0 are in state $i$ and at the time of the first transition move to state $j$. Therefore
$$Q_{kk} = 0, \quad \text{and} \quad \sum_{j \ne i} Q_{ij} = 1.$$
Let $W_n = T_{n+1} - T_n$ denote the time elapsed between the $n^{th}$ and $(n+1)^{st}$ transitions. We define a Markov chain $Z_0 = X_0, Z_1, Z_2, \cdots$ by setting $Z_n = X_{T_n}$. Note that the strong Markov property for $X_t$ is used in ensuring that $Z_0, Z_1, Z_2, \cdots$ is a Markov chain, since transitions occur at different times on different paths. The following lemma clarifies the transition matrix of the Markov chain $Z_n$ and sheds light on the transition matrices $P_t$.
Lemma 1.4.1 With the above notation and hypotheses,
$$P[Z_{n+1} = j,\ W_n \ge u \mid Z_n = k] = e^{-\lambda_k u}\, Q_{kj}.$$
Proof - Clearly the left hand side of the equation can be written in the form
We have
$$P[X_{W_0} = j \mid W_0 > u,\ X_0 = k] = P[X_{W_0} = j \mid X_s = k \ \text{for } s \le u] = P[X_{u+W_0} = j \mid X_u = k] = P[X_{W_0} = j \mid X_0 = k].$$
The quantity $P[X_{W_0} = j \mid X_0 = k]$ is independent of $u$ and we denote it by $Q_{kj}$. Combining this with (1.4.3) (the exponential character of the elapsed time $W_n$ between consecutive transitions) we obtain the desired formula. The validity of the stated properties of $Q_{kj}$ is immediate. ♣
An immediate corollary of lemma 1.4.1 is that it allows us to fill in the gap in the proof of lemma 1.1.3.

Corollary 1.4.1 Let $i \ne j \in S$; then either $P^{(t)}_{ij} > 0$ for all $t > 0$ or it vanishes identically.

Proof - If $P^{(t)}_{ij} > 0$ for some $t$, then $Q_{ij} > 0$, and it follows that $P^{(t)}_{ij} > 0$ for all $t > 0$. ♣
This process of assigning a Markov chain to a continuous time Markov chain can be reversed to obtain (infinitely many) continuous time Markov chains from a discrete time one. In fact, for every $j \in S$ let $\lambda_j > 0$ be a positive real number. Now given a Markov chain $Z_n$ with state space $S$ and transition matrix $Q$, let $W_j$ be an exponential random variable with parameter $\lambda_j$. If $j$ is not an absorbing state, then the first transition out of $j$ happens at time $s > t$ with probability $e^{-\lambda_j t}$, and once the transition occurs the probability of hitting state $k$ is
$$\frac{Q_{jk}}{\sum_{i \ne j} Q_{ji}}.$$
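This recipe for building a continuous time chain from the pair $(Q, \{\lambda_j\})$ is straightforward to implement. The following sketch (with an arbitrary three-state $Q$ and arbitrary rates) generates sample paths by alternating exponential holding times and jumps of the chain $Z_n$:

```python
# Sketch: jump-chain / holding-time construction of a CTMC.
import numpy as np

rng = np.random.default_rng(3)
Q = np.array([[0.0, 0.7, 0.3],
              [0.5, 0.0, 0.5],
              [0.2, 0.8, 0.0]])            # zero diagonal, rows sum to 1
lam = np.array([1.0, 2.0, 0.5])            # holding rates lambda_j

def sample_path(i, t_max):
    """Return the jump times and visited states of X_t up to t_max."""
    times, states, t = [0.0], [i], 0.0
    while True:
        t += rng.exponential(1.0 / lam[states[-1]])  # exponential holding
        if t > t_max:
            return times, states
        nxt = rng.choice(3, p=Q[states[-1]])         # jump via Q
        times.append(t)
        states.append(nxt)

print(sample_path(0, 5.0))
```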
Lemma 1.4.1 does not give direct and adequate information about the
behavior of transition probabilities. However, combining it with the strong
Markov property yields an important integral equation satisfied by the tran-
sition probabilities.
Lemma 1.4.2 The transition probabilities $P^{(t)}_{ij}$ satisfy the integral equation
$$P^{(t)}_{ij} = e^{-\lambda_i t}\,\delta_{ij} + \int_0^t \lambda_i e^{-\lambda_i s} \sum_k Q_{ik}\, P^{(t-s)}_{kj}\, ds.$$
Proof - We may assume $i$ is not an absorbing state. Let $T_1$ be the time of the first transition. Then trivially
$$P^{(t)}_{ij} = P[X_t = j,\ T_1 > t \mid X_0 = i] + P[X_t = j,\ T_1 \le t \mid X_0 = i].$$
Corollary 1.4.2 With the notation of lemma 1.4.2, the infinitesimal generator of the continuous time Markov chain $X_t$ is
$$A_{ij} = \begin{cases} -\lambda_i, & \text{if } i = j; \\ \lambda_i Q_{ij}, & \text{if } i \ne j. \end{cases}$$
Proof - Making the change of variable $s = t - u$ in the integral in lemma 1.4.2 we obtain
$$P^{(t)}_{ij} = e^{-\lambda_i t}\,\delta_{ij} + \lambda_i e^{-\lambda_i t}\int_0^t e^{\lambda_i u} \sum_k Q_{ik}\, P^{(u)}_{kj}\, du.$$
$^3$This is like the connection between the density function and the distribution function of a random variable.
Differentiating with respect to $t$ yields
$$\frac{dP^{(t)}_{ij}}{dt} = -\lambda_i P^{(t)}_{ij} + \lambda_i \sum_k Q_{ik}\, P^{(t)}_{kj},$$
and setting $t = 0$ gives the asserted form of $A$. ♣
1.5 Brownian Motion

Brownian motion may be thought of as the limit of appropriately rescaled random walks. Let $S_l$ denote the simple symmetric random walk on $\mathbb{Z}$, and for $n = 1, 2, 3, \cdots$ consider the rescaled process $Z^{(n)}_t = \frac{1}{\sqrt{n}} S_{[nt]}$. If we follow the walk for $n$ units of time and take for instance $t = 1$, then a path $\omega$ between 0 and $n$ will be squeezed
in the horizontal (time) direction to the interval $[0, 1]$ and the values will be multiplied by $\frac{1}{\sqrt{n}}$. The resulting path will still consist of broken line segments where the points of nonlinearity (or non-differentiability) occur at $\frac{k}{n}$, $k = 1, 2, 3, \cdots$. At any rate since all the paths are continuous, we may surmise
that the path space for the limit as $n \to \infty$ is the space $\Omega = C_{x_0}[0, \infty)$ of continuous functions on $[0, \infty)$, and we may require $\omega(0) = x_0$, some fixed number. Since in the simple symmetric random walk a path is just as likely to move up as down, we expect the same to be true of the paths of Brownian motion. A differentiable
path on the other hand has a definite preference at each point, namely,
the direction of the tangent. Therefore it is reasonable to expect that with
probability 1 the paths of Brownian motion are nowhere differentiable, in spite of the
fact that we have not yet said anything about how probabilities should be
assigned to the appropriate subsets of Ω. The assignment of probabilities is
the key issue in defining Brownian motion.
Let $0 < t_1 < t_2 < \cdots < t_m$; we want to see what we can say about the joint distribution of $(Z^{(n)}_{t_1}, Z^{(n)}_{t_2} - Z^{(n)}_{t_1}, \cdots, Z^{(n)}_{t_m} - Z^{(n)}_{t_{m-1}})$. Note that these random variables are independent while $Z^{(n)}_{t_1}, Z^{(n)}_{t_2}, \cdots$ are not. By the central limit theorem, for $n$ sufficiently large,
$$Z^{(n)}_{t_k} - Z^{(n)}_{t_{k-1}} = \frac{1}{\sqrt{n}}\big(S_{[nt_k]} - S_{[nt_{k-1}]}\big)$$
is approximately normal with mean 0 and variance $(t_k - t_{k-1})\sigma^2$. We assume that in the limit $n \to \infty$ the processes $Z^{(n)}_t$ tend to a limit $Z_t$. Of course this requires specifying the sense in which convergence takes place, and a proof; but because of the applicability of the central limit theorem we assign probabilities to sets of paths accordingly, without going through a convergence argument. More precisely, to the set of paths, starting at 0 at time 0, which are in the open subset $B \subset \mathbb{R}$ at time $t$, it is natural to assign the probability
$$P[Z_t \in B] = \frac{1}{\sqrt{2\pi t}\,\sigma} \int_B e^{-\frac{u^2}{2t\sigma^2}}\, du. \qquad (1.5.2)$$
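The rescaling is easy to visualize numerically. A minimal sketch (with $n = 1000$ and $\sigma = 1$) producing one rescaled path is:

```python
# Sketch: one rescaled random walk path Z^{(n)}_t = S_{[nt]}/sqrt(n), t in [0,1].
import numpy as np

rng = np.random.default_rng(4)
n = 1000
steps = rng.choice([-1, 1], size=n)        # simple symmetric random walk
S = np.concatenate([[0], np.cumsum(steps)])

t = np.arange(n + 1) / n                   # time rescaled by 1/n
Z = S / np.sqrt(n)                         # space rescaled by 1/sqrt(n)
print(Z[-1])                               # by (1.5.2), roughly N(0, 1)
```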
In view of the independence of $Z_{t_1}$ and $Z_{t_2} - Z_{t_1}$, the probability that $Z_{t_1} \in (a_1, b_1)$ and $Z_{t_2} - Z_{t_1} \in (a_2, b_2)$ is
$$\frac{1}{2\pi\sigma^2\sqrt{t_1(t_2 - t_1)}} \int_{a_1}^{b_1}\int_{a_2}^{b_2} e^{-\frac{u_1^2}{2\sigma^2 t_1}}\, e^{-\frac{u_2^2}{2\sigma^2 (t_2 - t_1)}}\, du_1\, du_2.$$
Note that we are evaluating the probability of the event [Zt1 ∈ (a1 , b1 ), Zt2 −
Zt1 ∈ (a2 , b2 )] and not [Zt1 ∈ (a1 , b1 ), Zt2 ∈ (a2 , b2 )] since the random vari-
ables Zt1 and Zt2 are not independent. This formula extends to probability
of any finite number of increments. In fact, for 0 < t1 < t2 < · · · < tk the
joint density function for (Zt1 , Zt2 − Zt1 , · · · , Ztk − Ztk−1 ) is the product
$$\frac{1}{\sigma^k \sqrt{(2\pi)^k\, t_1 (t_2 - t_1) \cdots (t_k - t_{k-1})}}\; e^{-\frac{u_1^2}{2\sigma^2 t_1}}\, e^{-\frac{u_2^2}{2\sigma^2 (t_2 - t_1)}} \cdots e^{-\frac{u_k^2}{2\sigma^2 (t_k - t_{k-1})}}.$$
One refers to the property of independence of $(Z_{t_1}, Z_{t_2} - Z_{t_1}, \cdots, Z_{t_k} - Z_{t_{k-1}})$ as independence of increments. For future reference and economy of notation we introduce
$$p_t(x; \sigma) = \frac{1}{\sqrt{2\pi t}\,\sigma}\, e^{-\frac{x^2}{2t\sigma^2}}. \qquad (1.5.3)$$
For $\sigma = 1$ we simply write $p_t(x)$ for $p_t(x; \sigma)$.
For both discrete and continuous time Markov chains the transition probabilities were given by matrices $P_t$. Here the transition probabilities are encoded in the Gaussian density function $p_t(x; \sigma)$. It is easier to introduce the analogue of $P_t$ for Brownian motion if we look at the dual picture where the action of the semi-group $P_t$ on functions on the state space is described. Just as in the case of continuous time Markov chains we set
$$(P_t \psi)(x) = E[\psi(Z_t) \mid Z_0 = x] = \int_{-\infty}^{\infty} \psi(y)\, p_t(x - y; \sigma)\, dy. \qquad (1.5.4)$$
The semi-group property $P_s P_t = P_{s+t}$ now takes the form of the convolution identity
$$p_s(\cdot\,; \sigma) * p_t(\cdot\,; \sigma) = p_{s+t}(\cdot\,; \sigma). \qquad (1.5.5)$$
Perhaps the simplest way to see the validity of (1.5.5) is by making use of Fourier analysis, which transforms convolutions into products as explained earlier in subsection (XXXX). It is a straightforward calculation that
$$\int_{-\infty}^{\infty} e^{-i\lambda x}\, \frac{e^{-\frac{x^2}{2t\sigma^2}}}{\sqrt{2\pi t}\,\sigma}\, dx = e^{-\frac{\lambda^2 \sigma^2 t}{2}}. \qquad (1.5.6)$$
From (1.5.6) the desired relation (1.5.5) and the semi-group property follow. An important feature of continuous time Markov chains is that $P_t$ satisfies the Kolmogorov forward and backward equations. In view of the semi-group property the same is true for Brownian motion, and we will explain in example 1.5.2 below what the infinitesimal generator of Brownian motion is. With some of the fundamental definitions of Brownian motion in place we now calculate some quantities of interest.
Example 1.5.3 The notion of hitting time of a state played an important role in our discussion of Markov chains. In this example we calculate the density function for the hitting time $T_a$ of a state $a \in \mathbb{R}$ in Brownian motion. The trick is to look at the identity
$$P[Z_t > a] = P[Z_t > a \mid T_a \le t]\, P[T_a \le t] + P[Z_t > a \mid T_a > t]\, P[T_a > t].$$
Clearly the second term on the right hand side vanishes, and by symmetry
$$P[Z_t > a \mid T_a < t] = \frac{1}{2}.$$
Therefore
$$P[T_a < t] = 2 P[Z_t > a]. \qquad (1.5.9)$$
The right hand side is easily computable and we obtain
$$P[T_a < t] = \frac{2}{\sqrt{2\pi t}\,\sigma} \int_a^{\infty} e^{-\frac{x^2}{2t\sigma^2}}\, dx = \frac{2}{\sqrt{2\pi}} \int_{\frac{a}{\sqrt{t}\sigma}}^{\infty} e^{-\frac{u^2}{2}}\, du.$$
In particular one deduces that the hitting time has infinite expectation:
$$E[T_a] = \infty. \qquad (1.5.11)$$
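Identity (1.5.9) lends itself to a Monte Carlo check on discretized Brownian paths. In the sketch below ($\sigma = 1$; $a$, $t$, step count and trial count are arbitrary choices) the discretization slightly underestimates the hitting probability:

```python
# Sketch: P[T_a < t] = 2 P[Z_t > a] on discretized Brownian paths.
import numpy as np

rng = np.random.default_rng(5)
a, t, n, trials = 1.0, 1.0, 1000, 5000

dZ = rng.normal(0.0, np.sqrt(t / n), size=(trials, n))
Z = np.cumsum(dZ, axis=1)                # Brownian paths on [0, t], Z_0 = 0

hit = (Z.max(axis=1) >= a).mean()        # estimate of P[T_a < t]
tail = 2 * (Z[:, -1] > a).mean()         # estimate of 2 P[Z_t > a]
print(hit, tail)                         # both near 0.317 for a = t = 1
```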
Example 1.5.4 For Brownian motion with $Z_0 = 0$ the event that it crosses the line $-a$, where $a > 0$, between times 0 and $t$ is identical with the event $[T_{-a} < t]$, and by symmetry it has the same probability as $P[T_a < t]$. We calculated this latter quantity in example 1.5.3. Therefore the probability $P$ that the Brownian motion has at least one 0 in the interval $(t_1, t_2)$ can be written, by conditioning on $Z_{t_1} = a$, as an integral of $P[T_a < t_2 - t_1]$ against the density of $Z_{t_1}$.
Since a sum of independent Gaussian random variables is Gaussian, the $Y_{t;j}$'s are also Gaussian. Furthermore,
$$\mathrm{Cov}(Y_{t;j}, Y_{t;k}) = \sum_{l=1}^{m} A_{jl} A_{kl} = \delta_{jk}.$$
Proposition 1.5.1 Let $D \subset \mathbb{R}^m$ be a region whose boundary consists of the pieces $M$ and $N$, and for $x \in D$ let $p(x)$ denote the probability that Brownian motion starting at $x$ hits $N$ before $M$. Then $p$ satisfies the mean value property, with boundary values
$$p \equiv 1 \ \text{on } N; \qquad p \equiv 0 \ \text{on } M.$$
Proof - Let $\rho > 0$ be sufficiently small so that the sphere $S^{m-1}_\rho(x)$ of radius $\rho > 0$ centered at $x \in D$ is contained entirely in $D$. Let $T$ be the first hitting time of the sphere $S^{m-1}_\rho(x)$ given $Z_0 = x$. Then the distribution of the points $y$ defined by $Z_T = y$ is uniform on the sphere. Let $B_x$ be the event that starting at $x$ the Brownian motion hits $N$ before $M$. Consequently, in view of the Markov property (see remark 1.5.2 below), we have
$$P[B_x] = \int_{S^{m-1}_\rho(x)} P[B_y \mid Z_T = y]\, \frac{dv_S(y)}{\mathrm{vol}(S^{m-1}_\rho(x))},$$
where $dv_S(y)$ denotes the standard volume element on the sphere $S^{m-1}_\rho(x)$. Therefore we have
$$p(x) = \int_{S^{m-1}_\rho(x)} \frac{p(y)}{\mathrm{vol}(S^{m-1}_\rho(x))}\, dv_S(y), \qquad (1.5.14)$$
Remark 1.5.2 We are in fact using the strong Markov property of Brownian motion. This application of the property is sufficiently intuitive that we do not give any further justification. ♥
In polar and spherical coordinates the relevant Laplacians are
$$\Delta_2 = \frac{1}{r}\frac{\partial}{\partial r}\Big(r\frac{\partial}{\partial r}\Big) + \frac{1}{r^2}\frac{\partial^2}{\partial \theta^2},$$
and
$$\Delta_3 = \frac{1}{r^2}\frac{\partial}{\partial r}\Big(r^2\frac{\partial}{\partial r}\Big) + \frac{1}{r^2\sin\theta}\frac{\partial}{\partial \theta}\Big(\sin\theta\,\frac{\partial}{\partial \theta}\Big) + \frac{1}{r^2\sin^2\theta}\frac{\partial^2}{\partial \phi^2}.$$
Looking for spherically symmetric solutions $p_m$ (i.e., depending only on the variable $r$) the partial differential equations reduce to ordinary differential equations which we easily solve to obtain the solutions
$$p_2(x) = \frac{\log r - \log R}{\log 1 - \log R}, \quad \text{for } x = (r, \theta), \qquad (1.5.15)$$
and
$$p_3(x) = \frac{\frac{1}{r} - \frac{1}{R}}{1 - \frac{1}{R}}, \quad \text{for } x = (r, \theta, \phi), \qquad (1.5.16)$$
for the given boundary conditions. Now notice that
$$\lim_{R\to\infty} p_2(x) = 1, \qquad \lim_{R\to\infty} p_3(x) = \frac{1}{r}. \qquad (1.5.17)$$
The difference between the two cases is naturally interpreted as Brownian
motion being recurrent in dimension two but transient in dimensions ≥ 3. ♠
Remark 1.5.3 The function $u = \frac{1}{r^{m-2}}$ satisfies Laplace's equation in $\mathbb{R}^m$ for $m \ge 3$, and can be used to establish the analogue of example 1.5.5 in dimensions $\ge 4$. ♥
Brownian motion has the special property that the transition from a starting point $x$ to a set $A$ is determined by integrating the function $p_t(x - y; \sigma)$ with respect to $y$ over $A$. The fact that the integrand is a function of $y - x$ reflects a space homogeneity property which Brownian motion shares with random walks. On the other hand, Markov chains do not in general enjoy such a space homogeneity property. Naturally there are many processes that are Markovian in nature but do not have the space homogeneity property. We present some such examples constructed from Brownian motion.
Example 1.5.6 Consider Brownian motion $Z_t$ starting at $x > 0$, and let $\tilde{Z}_t$ denote the process absorbed at 0, i.e., stopped at the first hitting time of the origin. For $y \ge 0$ let $B_t(y)$ be the event that $Z_t > y$ and the path has not hit 0 up to time $t$, and $C_t(y)$ the event that $Z_t > y$ and the path has hit 0. We have
$$P[Z_t > y] = P[B_t(y)] + P[C_t(y)]. \qquad (1.5.18)$$
By the reflection principle
$$P[C_t(y)] = P[Z_t < -y].$$
Therefore
$$P[B_t(y)] = P[Z_t > y] - P[Z_t < -y] = P[Z_t > y - x \mid Z_0 = 0] - P[Z_t > x + y \mid Z_0 = 0] = \frac{1}{\sqrt{2\pi t}\,\sigma}\int_{y-x}^{y+x} e^{-\frac{u^2}{2t\sigma^2}}\, du = \int_y^{y+2x} p_t(u - x; \sigma)\, du.$$
Therefore for $\tilde{Z}_t$ we have
$$P[\tilde{Z}_t = 0 \mid \tilde{Z}_0 = x] = 1 - P[B_t(0)] = 1 - \int_{-x}^{x} p_t(u; \sigma)\, du = 2\int_0^{\infty} p_t(x + u; \sigma)\, du.$$
Similarly, for $0 < a < b$,
$$P[a < \tilde{Z}_t < b] = P[B_t(a)] - P[B_t(b)] = \int_a^b [p_t(u - x; \sigma) - p_t(u + x; \sigma)]\, du.$$
Thus we see that the transition probability has a discrete part $P[\tilde{Z}_t = 0 \mid \tilde{Z}_0 = x]$ and a continuous part $P[a < \tilde{Z}_t < b]$, and is not a function of $y - x$. ♠
Example 1.5.7 Let $Z_t = (Z_{t;1}, Z_{t;2})$ denote two dimensional Brownian motion with $Z_{0;i} = 0$, and define
$$R_t = \sqrt{Z_{t;1}^2 + Z_{t;2}^2}.$$
This process means that for every Brownian path ω we consider its dis-
tance from the origin. This is a one dimensional process on R+ , called radial
Brownian motion or a Bessel process and its Markovian property is intuitively
reasonable and will analytically follow from the calculation of transition prob-
abilities presented below. Let us compute the transition probabilities. We
have
$$P[R_t \le b \mid Z_0 = (x_1, x_2)] = \iint_{y_1^2 + y_2^2 \le b^2} \frac{1}{2\pi t\sigma^2}\, e^{-\frac{(y_1 - x_1)^2 + (y_2 - x_2)^2}{2t\sigma^2}}\, dy_1\, dy_2 = \frac{1}{2\pi t\sigma^2}\int_0^b\!\int_0^{2\pi} e^{-\frac{(r\cos\theta - x_1)^2 + (r\sin\theta - x_2)^2}{2t\sigma^2}}\, d\theta\, r\, dr = \int_0^b \frac{r}{2\pi t\sigma^2}\, e^{-\frac{r^2 + \rho^2}{2t\sigma^2}}\, I(r, x)\, dr,$$
where $(r, \theta)$ are polar coordinates in the $y_1 y_2$-plane, $\rho = \|x\|$ and
$$I(r, x) = \int_0^{2\pi} e^{\frac{r}{t\sigma^2}[x_1\cos\theta + x_2\sin\theta]}\, d\theta.$$
Setting $\cos\varphi = \frac{x_1}{\rho}$ and $\sin\varphi = \frac{x_2}{\rho}$, we obtain
$$I(r, x) = \int_0^{2\pi} e^{\frac{r\rho}{t\sigma^2}\cos\theta}\, d\theta.$$
The Bessel function $I_0$ is defined as
$$I_0(\alpha) = \frac{1}{2\pi}\int_0^{2\pi} e^{\alpha\cos\theta}\, d\theta.$$
Therefore the desired transition probability is
$$P[R_t \le b \mid Z_0 = (x_1, x_2)] = \int_0^b \tilde{p}_t(\rho, r; \sigma)\, dr, \qquad (1.5.19)$$
where
$$\tilde{p}_t(\rho, r; \sigma) = \frac{r}{t\sigma^2}\, e^{-\frac{r^2 + \rho^2}{2t\sigma^2}}\, I_0\Big(\frac{r\rho}{t\sigma^2}\Big).$$
The Markovian property of radial Brownian motion is a consequence of the expression for the transition probabilities since they depend only on $(\rho, r)$. From the fact that $I_0$ is a solution of the differential equation
$$\frac{d^2 u}{dz^2} + \frac{1}{z}\frac{du}{dz} - u = 0,$$
we obtain the partial differential equation satisfied by $\tilde{p}$:
$$\frac{\partial \tilde{p}}{\partial t} = \frac{\sigma^2}{2}\frac{\partial^2 \tilde{p}}{\partial r^2} + \frac{\sigma^2}{2r}\frac{\partial \tilde{p}}{\partial r},$$
which is the radial heat equation. ♠
Brownian motion with drift is the process
$$Z^{\mu}_t = Z_t + \mu t,$$
where $Z_t$ is Brownian motion and $\mu$ is a constant.
Example 1.5.8 Let $-b < 0 < a$, $x \in (-b, a)$ and let $p(x)$ be the probability that $Z^{\mu}_t$ hits $a$ before it hits $-b$. This is similar to proposition 1.5.1. Instead of using the mean value property of harmonic functions (which is no longer valid here) we directly use our knowledge of calculus to derive a differential equation for $p(x)$ which allows us to calculate it. The method of proof has other applications (see exercise 1.5.5). Let $h$ be a small number and $B$ denote the event that $Z^{\mu}_t$ hits $a$ before it hits $-b$. By conditioning on $Z^{\mu}_h$, and setting $Z^{\mu}_h = x + y$, we obtain
$$p(x) = \int_{-\infty}^{\infty} \frac{1}{\sqrt{2\pi h}\,\sigma}\, e^{-\frac{(y - \mu h)^2}{2h\sigma^2}}\, p(x + y)\, dy. \qquad (1.5.20)$$
The Taylor expansion of $p(x + y)$ gives
$$p(x + y) = p(x) + y\, p'(x) + \frac{1}{2} y^2 p''(x) + \cdots$$
Now $y = Z^{\mu}_h - Z^{\mu}_0$ and therefore
$$\int_{-\infty}^{\infty} y\, \frac{e^{-\frac{(y - \mu h)^2}{2h\sigma^2}}}{\sqrt{2\pi h}\,\sigma}\, dy = \mu h, \qquad \int_{-\infty}^{\infty} y^2\, \frac{e^{-\frac{(y - \mu h)^2}{2h\sigma^2}}}{\sqrt{2\pi h}\,\sigma}\, dy = \sigma^2 h + h^2\mu^2.$$
It is straightforward to check that the contribution of the terms of the Taylor expansion containing $y^k$, for $k \ge 3$, is $O(h^2)$. Substituting in (1.5.20), dividing by $h$ and letting $h \to 0$ we obtain
$$\frac{\sigma^2}{2}\frac{d^2 p}{dx^2} + \mu\frac{dp}{dx} = 0.$$
The solution with the required boundary conditions is
$$p(x) = \frac{e^{\frac{2\mu b}{\sigma^2}} - e^{-\frac{2\mu x}{\sigma^2}}}{e^{\frac{2\mu b}{\sigma^2}} - e^{-\frac{2\mu a}{\sigma^2}}}.$$
The method of solution is applicable to other problems. ♠
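The formula for $p(x)$ can also be compared against a direct Monte Carlo simulation of the drifted path (a sketch; all parameters are illustrative and the time discretization introduces a small bias):

```python
# Sketch: Monte Carlo estimate of P[hit a before -b] vs. the closed form.
import numpy as np

rng = np.random.default_rng(6)
mu, sigma, a, b, x = 0.5, 1.0, 1.0, 1.0, 0.2
dt, trials = 1e-3, 4000

hits_a = 0
for _ in range(trials):
    z = x
    while -b < z < a:                    # run until the boundary is hit
        z += mu * dt + sigma * np.sqrt(dt) * rng.standard_normal()
    hits_a += z >= a

c = 2 * mu / sigma**2
closed = (np.exp(c * b) - np.exp(-c * x)) / (np.exp(c * b) - np.exp(-c * a))
print(hits_a / trials, closed)           # closed form is about 0.81
```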
EXERCISES
Exercise 1.5.1 Formulate the analogue of the reflection principle for Brow-
nian motion and use it to give an alternative proof of (1.5.9).
Exercise 1.5.3 Generate ten paths for the simple symmetric random walk on $\mathbb{Z}$ for $n \le 1000$. Rescale the paths in the time direction by $\frac{1}{1000}$ and in the space direction by $\frac{1}{\sqrt{1000}}$, and display them as graphs.
Exercise 1.5.4 Display ten paths for two dimensional Brownian motion by
repeating the computer simulation of exercise 1.5.3 for each component. The
paths so generated are one dimensional curves in three dimensional space
(time + space). Display only their projections on the space variables.
Exercise 1.5.5 Consider Brownian motion with drift $Z^{\mu}_t$ and assume $\mu > 0$. Let $-b < 0 < a$, let $T$ be the first hitting time of the boundary of the interval $[-b, a]$, and assume $Z^{\mu}_0 = x \in (-b, a)$. Show that $E[T] < \infty$. Derive a differential equation for $E[T]$ and solve it for $\sigma = 1$.
Exercise 1.5.6 Consider Brownian motion with drift $Z^{\mu}_t$ and assume $\mu > 0$ and $a > 0$. Let $T^{\mu}_a$ be the first hitting time of the point $a$, and let $F_t(x) = P[T^{\mu}_a \le t \mid Z^{\mu}_0 = x]$. Using the method of example 1.5.8, derive a differential equation for $F_t(x)$.