
Chapter 4

Stochastic Dynamic Programming

The aim of this chapter is to extend the framework we introduced in Chapter 3 to include
uncertainty. To evaluate decisions, we use the well-known expected utility theory.¹ With
uncertainty we will face Bellman equations of the following form
$$V(x, z) = \sup_{x' \in \Gamma(x, z)} F(x, x', z) + \beta \, \mathbb{E}\left[V(x', z') \mid z\right], \qquad (4.1)$$
where z is a stochastic component, assumed to follow a (stationary) first-order Markov
process. A first-order Markov process is a sequence of random variables {z_t}_{t=0}^{∞} with
the property that the conditional distribution of the next realization depends only on the
last realization of the process; that is, if C is a set of possible values for z, then
$$\Pr\{z_{t+1} \in C \mid z_t, z_{t-1}, \ldots, z_0\} = \Pr\{z_{t+1} \in C \mid z_t\}.$$
To make the above statements formally meaningful we need to review some concepts of
Probability Theory.
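Before doing so, it may help to see the Markov property at work numerically. The sketch below is not part of the original notes: it simulates a two-state first-order Markov chain with an arbitrary illustrative transition matrix Π and checks empirically that the frequency of z_{t+1} = 1 conditional on (z_{t-1}, z_t) depends only on z_t; the helper cond_freq is a hypothetical name introduced here.

```python
# A minimal simulation sketch (illustrative transition matrix, not from the notes).
import numpy as np

rng = np.random.default_rng(0)
Pi = np.array([[0.9, 0.1],
               [0.4, 0.6]])          # Pi[i, j] = Pr{z' = j | z = i}

T = 200_000
z = np.empty(T, dtype=int)
z[0] = 0
for t in range(T - 1):
    z[t + 1] = rng.choice(2, p=Pi[z[t]])

def cond_freq(next_state, history):
    """Empirical Pr{z_{t+1} = next_state | the last len(history) states equal history}."""
    h = np.asarray(history)
    k = len(h)
    idx = np.arange(k - 1, T - 1)               # dates t at which a full history is available
    mask = np.ones_like(idx, dtype=bool)
    for j in range(k):                          # history[j] is the value of z_{t-(k-1)+j}
        mask &= z[idx - (k - 1) + j] == h[j]
    return np.mean(z[idx + 1][mask] == next_state)

print(cond_freq(1, [0, 1]))   # Pr{z'=1 | z_{t-1}=0, z_t=1}, roughly 0.6
print(cond_freq(1, [1, 1]))   # Pr{z'=1 | z_{t-1}=1, z_t=1}, roughly 0.6
print(cond_freq(1, [1]))      # Pr{z'=1 | z_t=1} = Pi[1, 1] = 0.6
```

Conditioning on the extra lag leaves the estimated probability essentially unchanged, which is exactly the Markov property stated above.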

4.1 The Axiomatic Approach to Probability: Basic Concepts of Measure Theory
I am sure you are all familiar with the expression Pr {zt+1 ∈ C | zt } for conditional proba-
bilities, and with the conditional expectation operator E [· | z] in (4.1). Probability theory
is a special case of the more general and very powerful Measure Theory, first formulated
in 1901 by Henri Léon Lebesgue.²
¹ For a review of the theories of decision under uncertainty see Machina (1987).
² This outstanding piece of work appears in Lebesgue's dissertation, Intégrale, Longueur, Aire, presented to the University of Nancy in 1902.


We first introduce a set Z which will be our sample space. Any subset E of Z will
be called an event. In this way, all results of set theory - unions, intersections,
complements, and so on - can be directly applied to events as subsets of Z. To each event we
also assign a "measure" µ(E) = Pr {E}, called the probability of the event. These values are
assigned according to the function µ, which by assumption has the following properties
(or axioms):

1. 0 ≤ µ(E) ≤ 1;

2. µ(Z) = 1;

3. For any finite or infinite sequence of disjoint sets (or mutually exclusive events)
E1, E2, ..., such that Ei ∩ Ej = ∅ for any i ≠ j, we have
$$\mu\left(\bigcup_{i=1}^{N} E_i\right) = \sum_{i=1}^{N} \mu(E_i), \qquad \text{where } N \text{ possibly equals } \infty.$$

All of properties 1-3 are very intuitive for probabilities. Moreover, we would intuitively like
to allow E to be any subset of Z. Well, if Z is a finite or countable set then E can literally
be any subset of Z. Unfortunately, when Z is an uncountably infinite set - such as the
interval [0, 1], for example - it might be impossible to find a function µ defined on all
possible subsets of Z that at the same time satisfies the three axioms we presented
above. Typically, what fails is the last axiom of additivity when N = ∞. Lebesgue
managed to keep property 3 above by defining the measure function µ only on the so-called
measurable sets (or events). This is not an important limitation, as virtually all
events of any practical interest turn out to be measurable. Actually, in applications
one typically considers only some class of possible events, a subset of the class of all
measurable sets.
The reference class of sets Z represents the set of possible events, and will constitute
a σ-algebra.³ Notice that Z is a set of sets, hence an event E is an element of Z, i.e. in
contrast to E ⊂ Z we will write E ∈ Z. The pair (Z, Z) constitutes a measurable space,
while the triple (µ, Z, Z) is denoted as a measure (or probability) space.
³ A family Z of subsets of Z is called a σ-algebra if: (i) both the empty set ∅ and Z belong to Z; (ii) if E ∈ Z then also its complement (with respect to Z), E^c = Z\E, belongs to Z; and (iii) for any sequence of sets such that En ∈ Z for all n = 1, 2, ..., we have (∪_{n=1}^{∞} En) ∈ Z. It is easy to show that whenever Z is a σ-algebra then (∩_{n=1}^{∞} En) ∈ Z as well. When Z is a set of real numbers, we can take as our set of possible events the Borel σ-algebra, which is the σ-algebra "generated" by the set of all open sets.

I am sure it is well known to you that the expectation operator E [·] in (4.1) is nothing
more than an integral, or a summation when z takes finitely or countably many values.
For example, assume pi is the probability that z = zi. The expectation of the function f
can be computed as follows
$$\mathbb{E}\left[f(z)\right] = \sum_{i=1}^{N} p_i f(z_i).$$
One of the advantages of the Lebesgue theory of integration is that it
includes both summations and the usual concept of (Riemann) integration in a unified
framework. We will be able to compute expectations⁴
$$\mathbb{E}\left[f(z)\right] = \int_Z f(z) \, d\mu(z)$$
no matter what Z is and no matter what the distribution µ of the events is. For example,
we can deal with situations where Z is the interval [0, 1] and the event z = 0 has a
positive probability µ({0}) = p0. Since the set of all measurable events Z does not include
all possible subsets of Z, we must restrict the set of functions f for which we can take
expectations (integrals) as well.
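To fix ideas, here is a minimal numerical sketch (not part of the original notes) of an expectation under exactly this kind of "mixed" distribution on Z = [0, 1]: an atom at z = 0 with probability p0 plus a uniform density on (0, 1] carrying the remaining mass. The Lebesgue integral ∫ f dµ is then the atom's contribution plus an ordinary Riemann integral; the function f and the value p0 below are arbitrary illustrative choices.

```python
# A sketch: E[f(z)] under mu = p0 * (point mass at 0) + (1 - p0) * (uniform density on (0, 1]).
# Then E[f(z)] = p0 * f(0) + (1 - p0) * \int_0^1 f(z) dz.
from scipy.integrate import quad

def expectation(f, p0=0.3):
    atom_part = p0 * f(0.0)                 # contribution of the point mass at 0
    density_part, _ = quad(f, 0.0, 1.0)     # ordinary integral of f over (0, 1]
    return atom_part + (1.0 - p0) * density_part

f = lambda z: z**2
print(expectation(f))                       # 0.3 * 0 + 0.7 * (1/3) ≈ 0.2333
```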

Definition 32 A real valued function f is measurable with respect to Z if for every real
number x the set
$$E_f^x = \{z \in Z : f(z) \geq x\}$$
belongs to the set of events Z.

Sometimes we perform a sort of inverse operation: we have in mind a class of real valued
functions F, each one defined over the sample space Z, and we define a σ-algebra Z_F such
that every f ∈ F is measurable; any such measurable function f ∈ F is called a random variable.

Definition 33 The Lebesgue integral of a measurable positive function f ≥ 0 is defined
as follows
$$\int_Z f(z)\, d\mu(z) = \sup_{0 \leq \phi \leq f} \int_Z \phi(z)\, d\mu(z) = \inf_{\phi \geq f \geq 0} \int_Z \phi(z)\, d\mu(z).$$

⁴ When the measure µ has a density at z, the notation dµ(z) corresponds to the more familiar fµ(z) dz. When µ does not admit a density, dµ(z) is just the notation we use for the analogous concept.

In the definition, φ is any simple (positive) function (in its standard representation),
that is, φ is a finite weighted sum of indicator functions⁵
$$\phi(z) = \sum_{i=1}^{n} a_i \, I_{E_i}(z), \qquad a_i \geq 0,$$
and its integral is
$$\int_Z \phi(z)\, d\mu(z) = \sum_{i=1}^{n} a_i \, \mu(E_i),$$
where Ei ∩ Ej = ∅ for each i ≠ j, and $\cup_{i=1}^{n} E_i = Z$.


The Lebesgue integral of f is hence (uniquely) defined as the supremum of the integrals
of nonnegative dominated simple functions φ, i.e. such that 0 ≤ φ(z) ≤ f(z) for all z, which
in turn coincides with the infimum over all the dominating simple functions φ ≥ f. We
do not have space here to discuss the implications of this definition;⁶ however, one should
recall from basic analysis that the Riemann integral, that is, the "usual" integral we saw
in our undergraduate studies, can be defined in a similar way, where instead of simple
functions one uses step functions. One can show that every function f which is Riemann
integrable is also Lebesgue integrable, and that there are simple examples where the
converse is false.⁷

⁵ The indicator function of a set E is defined as
$$I_E(z) = \begin{cases} 1 & \text{if } z \in E \\ 0 & \text{otherwise.} \end{cases}$$
⁶ See for example SLP, Ch. 7.
⁷ One typical counterexample is the function f : [0, 1] → [0, 1] defined as
$$f(z) = \begin{cases} 1 & \text{if } z \text{ is rational} \\ 0 & \text{otherwise.} \end{cases}$$
This function is Lebesgue integrable with ∫ f(x)dx = 0, but it is not Riemann integrable.
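As a small illustration of Definition 33 (not in the original notes), the sketch below approximates the Lebesgue integral of a bounded nonnegative f on Z = [0, 1] with Lebesgue measure by a dominated simple function built from the level sets {z : kh ≤ f(z) < (k+1)h}; shrinking the mesh h pushes the simple-function integral up toward ∫ f dµ. The choice f(z) = z² and the helper name lebesgue_lower_sum are illustrative assumptions.

```python
# A sketch (assumptions: Z = [0, 1] with Lebesgue measure, f bounded and nonnegative).
# phi(z) = sum_k (k*h) * I{ k*h <= f(z) < (k+1)*h } satisfies 0 <= phi <= f, and its
# integral is sum_i a_i * mu(E_i); mu(E_i) is approximated by a fraction of grid points.
import numpy as np

def lebesgue_lower_sum(f, n_levels=1000, grid_size=200_000):
    z = np.linspace(0.0, 1.0, grid_size)
    fz = f(z)
    h = fz.max() / n_levels              # height of each horizontal slice
    phi = np.floor(fz / h) * h           # phi(z): largest multiple of h not exceeding f(z)
    return phi.mean()                    # = sum over values a_i of a_i * (approx. mu(E_i))

print(lebesgue_lower_sum(lambda z: z**2))   # approaches 1/3 as n_levels grows
```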

4.2 Markov Chains and Markov Processes

Markov Chains We now analyze in some detail conditional expectations for the simple
case where Z is finite. So, assume that the stochastic component z can take finitely many
values, that is z ∈ Z ≡ {z1, z2, ..., zN}, with corresponding conditional probabilities
$$\pi_{ij} = \Pr\{z' = z_j \mid z = z_i\}, \qquad i, j = 1, 2, \ldots, N.$$

Since πij describes the probability that the system moves to state zj when the previous state
was zi, the πij are also called transition probabilities, and the stochastic process forms a
Markov chain. To be probabilities, the πij must satisfy
$$\pi_{ij} \geq 0, \qquad \text{and} \qquad \sum_{j=1}^{N} \pi_{ij} = 1 \quad \text{for } i = 1, 2, \ldots, N,$$
that is, each row of transition probabilities must belong to the (N − 1)-dimensional simplex ∆N.
It is typically convenient to arrange the transition probabilities in a square array as follows:
$$\Pi = \begin{pmatrix} \pi_{11} & \pi_{12} & \ldots & \pi_{1N} \\ \pi_{21} & \pi_{22} & \ldots & \ldots \\ \ldots & \ldots & \pi_{ij} & \ldots \\ \pi_{N1} & \ldots & \ldots & \pi_{NN} \end{pmatrix}$$

Such an array is called a transition matrix, Markov matrix, or stochastic matrix. If the
probability distribution over the states in period t is the row vector p^t = (p^t_1, p^t_2, ..., p^t_N),
the distribution over the states in period t + 1 is p^{t+1} = p^t Π, where
$$p_j^{t+1} = \sum_{i=1}^{N} p_i^t \, \pi_{ij}, \qquad j = 1, 2, \ldots, N.$$

For example, suppose we want to know the distribution of next period's states if
the current state is zi. This means that the initial distribution is a degenerate
one, namely p^t = e_i = (0, ..., 1, ..., 0). As a consequence, the probability distribution
over next period's states is the i-th row of Π: e_i Π = (πi1, πi2, ..., πiN). Similarly, if p^t
is the period-t distribution, then by the properties of matrix multiplication, p^t Π^n =
p^t (Π · Π · ... · Π) is the period t + n distribution p^{t+n} over the states. It is easy to see that
if Π is a Markov matrix then so is Π^n. A set of natural questions then arises. Is there a
stationary distribution, that is, a probability distribution p* with the property p* = p*Π?
Under what conditions can we be sure that if we start from any initial distribution p0,
the system converges to a unique limiting probability p* = lim_{n→∞} p0 Π^n?
The answer to the first question turns out to always be affirmative for Markov chains.

Theorem 18 Given a stochastic matrix Π, there always exists at least one stationary
distribution p* such that p* = p*Π, with p*_i ≥ 0 and $\sum_{i=1}^{N} p_i^* = 1$.

Proof. Notice that a solution to the system of equations p* = p*Π corresponds to
solving p*(I − Π) = 0, where I is the N-dimensional identity matrix. Transposing both
sides of the above equation gives
$$(I - \Pi')\, p^{*\prime} = 0.$$

So p* is a nonnegative eigenvector associated with a unit eigenvalue of Π′, normalized
to satisfy Σ_i p*_i = 1. So we can use linear algebra to show this result. Thanks to
Leontief's input-output analysis, during the 50s and 60s economics (re)discovered many
important theorems about matrices with nonnegative elements. Any matrix with nonnegative
elements has a Frobenius root λ ≥ 0 with an associated nonnegative eigenvector.
This existence result is the most difficult part of the proof and is due to Frobenius
(1912) (see also Takayama, 1996, Theorem 4.B.2, p. 375). Fisher (1965) and Takayama
(1960) showed that when the elements of each column of a matrix with nonnegative
elements sum to one, then its Frobenius root equals one, i.e. λ = 1 (Takayama,
1996, Theorem 4.C.11, p. 388). The proof of this last statement is simple: let p* ≥ 0 be
the eigenvector associated with λ. By definition λp* = Π′p*, that is, λ p*_i = Σ_j π′_ij p*_j for
i = 1, 2, ..., N. Summing over i, we obtain
$$\lambda \sum_{i=1}^{N} p_i^* = \sum_{i=1}^{N} \sum_{j=1}^{N} \pi'_{ij}\, p_j^* = \sum_{j=1}^{N} p_j^* \left( \sum_{i=1}^{N} \pi'_{ij} \right).$$
Since Π′ is the transpose of Π, $\sum_{i=1}^{N} \pi'_{ij} = \sum_{i=1}^{N} \pi_{ji} = 1$. Hence
$$\lambda = \frac{\sum_{j=1}^{N} p_j^*}{\sum_{i=1}^{N} p_i^*} = 1. \qquad \text{Q.E.D.}$$
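The linear-algebra argument translates directly into a computation. The sketch below is not part of the original notes and uses an arbitrary illustrative matrix Π: it finds p* as the eigenvector of Π′ associated with the unit eigenvalue and checks that iterating p → pΠ from an arbitrary initial distribution converges to the same vector.

```python
# A sketch: stationary distribution via the unit-eigenvalue eigenvector of Pi'
# (as in the proof of Theorem 18) and via forward iteration p -> p Pi.
import numpy as np

Pi = np.array([[0.7, 0.2, 0.1],
               [0.3, 0.5, 0.2],
               [0.2, 0.3, 0.5]])       # rows sum to one

# Eigenvector of Pi' associated with the eigenvalue closest to 1, normalized to sum to 1.
eigvals, eigvecs = np.linalg.eig(Pi.T)
v = np.real(eigvecs[:, np.argmin(np.abs(eigvals - 1.0))])
p_star = v / v.sum()

# Iterating the distribution forward from an arbitrary starting point.
p = np.array([1.0, 0.0, 0.0])
for _ in range(200):
    p = p @ Pi

print(p_star)   # stationary distribution: p* = p* Pi
print(p)        # converges to the same vector, since here all pi_ij > 0
```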
Consider now the second question. Can we say that p∗ is unique? Unfortunately, in
order to guarantee that the sequence of matrices converges to a unique matrix P ∗ with
identical rows p∗ , (so that for any p we have pP ∗ = p∗ ), we need some further assumptions,
as the next exercise shows.

Exercise 41 Assume that a Markov chain (with Z = {z1, z2}) is summarized by the
following transition matrix
$$\Pi = \begin{pmatrix} \pi_{11} & 1 - \pi_{11} \\ 1 - \pi_{22} & \pi_{22} \end{pmatrix}.$$
A stationary distribution is hence a vector (q*, 1 − q*) with 1 ≥ q* ≥ 0 such that
(q*, 1 − q*) · Π = (q*, 1 − q*). We know from above that at least one such q* must exist.
(a) Using simple algebra, show that q* solves (2 − π22 − π11) q* = 1 − π22, and discuss
conditions under which q* might take multiple values.
(b) Now set π11 = π22 = π and state conditions for q* to be unique.

Here is a set of sufficient conditions for uniqueness.



Theorem 19 Assume that πij > 0 for all i, j = 1, 2, ..., N. Then there exists a limiting
distribution p* such that
$$p_j^* = \lim_{n \to \infty} \pi_{ij}^{(n)},$$
where $\pi_{ij}^{(n)}$ is the (i, j) element of the matrix Π^n. Moreover, the p*_j are the unique nonnegative
solutions of the following system of equations:
$$p_j^* = \sum_{k=1}^{N} p_k^* \, \pi_{kj} \quad (\text{that is, } p^* = p^* \Pi), \qquad \text{and} \qquad \sum_{j=1}^{N} p_j^* = 1.$$

Proof. See below. Q.E.D.


The application of the transition matrix to a probability distribution p can be seen as
a mapping of the (N − 1)-dimensional simplex into itself. In fact, under some conditions,
the operator
$$T_\Pi : \Delta^N \to \Delta^N, \qquad T_\Pi \, p = p\,\Pi \qquad (4.2)$$
defines a contraction on the metric space (∆N, |·|N), where
$$|x|_N \equiv \sum_{i=1}^{N} |x_i|.$$

Exercise 42 (i) Show that (∆N, |·|N) is a complete metric space. (ii) Moreover, show
that if πij > 0 for i, j = 1, 2, ..., N, the mapping TΠ in (4.2) is a contraction of modulus
β = 1 − ε, where ε = Σ_{j=1}^{N} εj and εj = min_i πij > 0.

When some πij = 0, we might lose uniqueness. However, following the same line of
proof one can show that the stationary distribution is unique as long as ε = Σ_{j=1}^{N} εj > 0.
Could you explain intuitively why this is the case?
Moreover, from the contraction mapping theorem, it is easy to see that the above
proposition remains valid if the assumption πij > 0 is replaced with: there exists an n ≥ 1
such that $\pi_{ij}^{(n)} > 0$ for all i, j (see Corollary 2 of the Contraction Mapping Theorem, Th.
3.2, in SLP).
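As a numerical illustration (not in the original notes; the strictly positive stochastic matrix below is an arbitrary choice), the sketch iterates TΠ from two different initial distributions and prints the ℓ1 distance after each step, which shrinks at least by the factor 1 − ε with ε = Σ_j min_i πij.

```python
# A sketch: T_Pi p = p Pi is a contraction on the simplex in the l1 norm when all pi_ij > 0.
import numpy as np

Pi = np.array([[0.6, 0.3, 0.1],
               [0.2, 0.5, 0.3],
               [0.3, 0.3, 0.4]])

eps = Pi.min(axis=0).sum()          # eps = sum_j min_i pi_ij; contraction modulus is 1 - eps
p, q = np.array([1.0, 0.0, 0.0]), np.array([0.0, 0.0, 1.0])

for step in range(6):
    dist = np.abs(p - q).sum()
    print(f"step {step}: |p - q|_1 = {dist:.6f}")
    p, q = p @ Pi, q @ Pi           # one application of T_Pi to each distribution

print("contraction modulus bound 1 - eps =", 1 - eps)
```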
Notice that the sequence {Π^n}_{n=0}^{∞} might not always converge. For example, consider
$$\Pi = \begin{pmatrix} 0 & 1 \\ 1 & 0 \end{pmatrix}.$$
It is easy to verify that the sequence jumps between
$$\Pi^{2n} = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix} \qquad \text{and} \qquad \Pi^{2n+1} = \Pi.$$
However, the fact that in a Markov chain the state space is finite implies that
the long-run averages
$$\left\{ \frac{1}{T} \sum_{t=0}^{T-1} \Pi^t \right\}_{T=1}^{\infty}$$
do always converge to a stochastic matrix P*, and the sequence p^t = p^0 Π^t converges, in the
sense of long-run averages, to
$$\lim_{T \to \infty} \frac{1}{T} \sum_{t=0}^{T-1} p^t = p^0 P^*.$$
In the example we saw above one can easily verify that
$$\frac{1}{T} \sum_{t=0}^{T-1} \Pi^t \to P^* = \begin{pmatrix} 1/2 & 1/2 \\ 1/2 & 1/2 \end{pmatrix},$$
and the unique stationary distribution is p* = (1/2, 1/2).
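A quick check of this claim (not part of the original notes) for the periodic "flip" matrix above:

```python
# A sketch verifying that the Cesàro averages of Pi^t converge even though Pi^t does not.
import numpy as np

Pi = np.array([[0.0, 1.0],
               [1.0, 0.0]])

T = 1000
avg = np.zeros((2, 2))
power = np.eye(2)                   # Pi^0
for t in range(T):
    avg += power
    power = power @ Pi              # Pi^{t+1}
avg /= T

print(avg)                          # approaches [[0.5, 0.5], [0.5, 0.5]]
```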
In other cases, the rows of the limit matrix P* are not necessarily always identical to
each other. For example, consider now the transition matrix
$$\Pi = \begin{pmatrix} 1 & 0 \\ 0 & 1 \end{pmatrix}.$$
It is obvious that in this case P* = Π, which has two different rows. It is also clear that both
rows constitute a stationary distribution. This is true in general: any row of the limit matrix
P* is an invariant distribution for the transition matrix Π.
What is perhaps less obvious is that any convex combination of the rows of P* constitutes
a stationary distribution, and that all invariant distributions for Π can be derived
by making convex combinations of the rows of P*.
" #
1 0
Exercise 43 (i) Consider first the above example with P ∗ = Π = . Show that
0 1
any vector p∗λ = (λ, 1 − λ) obtained as a convex combination of the rows of P ∗ constitutes
a stationary distribution for Π. Provide an intuition for the result. (ii) Now consider the
general case, and let p∗ and p∗∗ two stationary distributions for a Markov chains defined
by a generic stochastic matrix Π. Show that any convex combination pλ of p∗ and p∗∗
constitute a stationary distribution for Π.

Markov Processes The more general concept corresponding to a Markov chain, where
Z can take countably or uncountably many values, is denoted as a Markov Process.
Similarly to the case where Z is finite, a Markov process is defined by a transition function
(or kernel) Q : Z ×Z → [0, 1] such that: (i) for each z ∈ Z Q(z, ·) is a probability measure;
and (ii) for each C ∈ Z Q(·, C) is a measurable function.
Given Q, one can compute conditional probabilities

Pr {zt+1 ∈ C | zt = c} = Q(c, C)

and conditional expectations in the usual way
$$\mathbb{E}\left[f \mid z\right] = \int_Z f(z') \, dQ(z, z').$$

Notice that Q can be used to map probability measures into probability measures, since
for any µ on (Z, Z) we get a new µ′ by assigning to each C ∈ Z the measure
$$(T_Q \mu)(C) = \mu'(C) = \int_Z Q(z, C) \, d\mu(z),$$
and TQ is denoted as the Markov operator.
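To make the kernel notation concrete, here is a small sketch (not from the original notes) for an AR(1) process z′ = ρz + σε with ε standard normal, so that Q(z, ·) is a normal distribution centered at ρz. The conditional expectation E[f | z] is an integral against that distribution, approximated here by Monte Carlo, and the Markov operator TQ maps a sample representing µ into a sample from µ′. All parameter values and helper names are illustrative assumptions.

```python
# A sketch (assumption: AR(1) transition z' = rho*z + sigma*eps, eps ~ N(0,1)),
# so Q(z, .) = N(rho*z, sigma^2).
import numpy as np

rng = np.random.default_rng(1)
rho, sigma = 0.9, 0.1

def draw_next(z, size=1):
    """Draws from Q(z, .)."""
    return rho * z + sigma * rng.standard_normal(size)

def cond_expectation(f, z, n=100_000):
    """E[f(z') | z] = \int f(z') dQ(z, z'), approximated by Monte Carlo."""
    return f(draw_next(z, n)).mean()

print(cond_expectation(lambda x: x, 1.0))      # approx. rho * 1.0 = 0.9

# Markov operator: mu'(C) = \int Q(z, C) dmu(z); with mu represented by a sample,
# applying T_Q amounts to drawing one successor for each point of the sample.
mu_sample = rng.standard_normal(50_000)        # some initial distribution mu
mu_next = draw_next(mu_sample, mu_sample.size) # a sample from T_Q mu
print(mu_sample.std(), mu_next.std())
```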


We now define a very useful property for Q.

Definition 34 Q has the Feller property if for any bounded and continuous function f
the function
$$g(z) = (P_Q f)(z) = \mathbb{E}\left[f \mid z\right] = \int_Z f(z') \, dQ(z, z') \qquad \text{for any } z$$
is still bounded and continuous.

The above definition first of all shows another way of looking at Q. It also defines an operator
(sometimes called the transition operator) that in general maps bounded and measurable
functions into bounded and measurable functions. When Q has the Feller property, the
operator PQ also preserves continuity.

Technical Digression (optional). It turns out that the Feller property characterizes
continuous Markov transitions. The rigorous idea is simple. Let M be the set of all
probability measures on the Borel sets Z of a metrizable space Z, and for each z let
Q(z, ·) be a member of M. The usual topology defined on the space of Borel measures is
the topology of convergence in distribution (or weak topology).⁸ It is now useful to make
pointwise considerations. For each z the probability measure Q(z, ·) can be seen as a
linear mapping from the set of bounded and measurable functions into the real numbers,
according to
$$x = \langle f, Q(z, \cdot) \rangle = \int f(z') \, dQ(z, z').$$
It turns out that a transition function Q : Z → M is continuous if and only if it has
the Feller property. The fact that a continuous Q has the Feller property is immediate: by
definition of the topology defined on M (weak topology), via the map Ff(µ) = ⟨f, µ⟩ each
continuous and bounded function f : Z → IR defines a continuous real valued function
Ff : M → IR.⁹ Now note that when µ is Q(z, ·) we have Ff(Q(z, ·)) = (PQ f)(z). Now,
continuity of Q means that as zn → z we have Q(zn, ·) → Q(z, ·) in M. Equivalently,
if we let µn(·) = Q(zn, ·) and use the usual topology on M, continuity of Q means that
Ff(Q(zn, ·)) → Ff(Q(z, ·)) (interpreted now as a sequence of real numbers). We have
hence established that (PQ f) = g is a continuous function on Z, i.e. that Q has the
Feller property. In order to show rigorously that the Feller property implies continuity -
although it is intuitive - one needs some more work.¹⁰

⁸ In this topology, a sequence {µn} in M converges to µ if and only if ∫ f dµn → ∫ f dµ for all continuous and bounded functions f.
⁹ Let xn = Ff(µn). By definition of the weak topology, if µn → µ then Ff(µn) → Ff(µ).
¹⁰ The interested reader can have a look at Aliprantis and Border (1994), Theorem 15.14, pages 531-2.
We can now study the issue of existence and uniqueness of a stationary distribution.
A stationary distribution for Q is a measure µ* on (Z, Z) such that for any C ∈ Z
$$\mu^*(C) = \int_Z Q(z, C) \, d\mu^*(z),$$
that is, µ* is a fixed point of the Markov operator TQ. There are many results establishing
existence and uniqueness of a stationary distribution. Here is a result which is among the
easiest to understand, and that uses the Feller property of Q.

Theorem 20 If Z is a compact set and Q has the Feller property, then there exists a
stationary distribution µ*: µ* = TQ µ*, where µ = λ if and only if ∫ f dµ = ∫ f dλ for each
continuous and bounded function f.

Proof. See SLP, Theorem 12.10, pages 376-77. The basic idea of the proof can also
be obtained as an application of one of the infinite dimensional extensions of the Brouwer
fixed point theorem (usually called the Brouwer-Schauder-Tychonoff fixed point theorem).
We saw above that whenever Q has the Feller property, the associated Markov operator TQ
is a continuous map from the compact convex (locally convex Hausdorff) space of distributions Λ
into itself. [See Aliprantis and Border (1994), Corollary 14.51, page 485.] Q.E.D.
Similarly to the finite state case, this invariant measure can be obtained by looking at
the sequence
$$\left\{ \frac{1}{T} \sum_{t=1}^{T-1} T_Q^t \lambda_0 \right\}_{T=1}^{\infty}$$
of T-period averages.
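As an illustration of this averaging idea (not in the original notes, and reusing the AR(1) kernel sketched above), one can approximate integrals against the invariant measure by long-run time averages along a simulated path; for the AR(1) with |ρ| < 1 the invariant distribution is N(0, σ²/(1 − ρ²)), which the simulation should match.

```python
# A sketch: long-run averages along a simulated path approximate the invariant measure
# of the AR(1) kernel Q(z, .) = N(rho*z, sigma^2).
import numpy as np

rng = np.random.default_rng(2)
rho, sigma = 0.9, 0.1
T = 200_000

z = np.empty(T)
z[0] = 0.0
for t in range(T - 1):
    z[t + 1] = rho * z[t] + sigma * rng.standard_normal()

print("simulated variance  :", z.var())
print("theoretical variance:", sigma**2 / (1 - rho**2))   # about 0.0526
```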
When the state space is not finite, we may define several different concepts of convergence
for distributions. The best known ones are weak convergence (commonly called convergence
in distribution) and strong convergence (or convergence in total variation norm,
also denoted as setwise convergence). We do not deal with these issues in these class
notes. The concept of weak convergence is in most cases all that we care about in the
context of describing the dynamics of an economic system. Theorem 20 deals with weak
convergence. The best known uniqueness results use some monotonicity conditions on
the Markov operator, together with some mixing conditions. For a quite general treat-
ment of monotonic Markov operators, with direct applications to economics and dynamic
programming, see Hopenhayn and Prescott (1992).
If we require strong convergence, one can guarantee uniqueness under conditions sim-
ilar to those of Theorem 19, using the contraction mapping theorem. See Chapter 11 in
SLP, especially Theorem 11.12.

4.3 Bellman Principle in the Stochastic Framework

The Finite Z case. When the shocks belong to a finite set all the results we saw
for the deterministic case are true for the stochastic environment as well. The Bellman
Principle of optimality remains true since both Lemmas 1 and 2 remain true. Expectations
are simply weighted sums of the continuation values. In this case Theorem 12 remains
true under the same conditions as in the deterministic case. From the proof of Theorem
13 and 14 it is easy to see that also the verification and sufficiency theorems can easily be
extended to the stochastic case with finite shocks. We just need to require boundedness
to be true for all z. Even the Theorems 15 and 16 are easily extended to the stochastic
case following the same lines of proof we proposed in Chapter 3.1. In order to show you
that there is practically no difference between the deterministic and the stochastic case
when Z is finite, let me be a bit boring and consider for example the stochastic extension
of Theorem 15. Assume w.l.o.g. that z may take N values, i.e. Z = (z1 , z2 , ..., zN ). We
can always consider our fixed point

$$V(x, z_i) = \sup_{x' \in \Gamma(x, z_i)} F(x, x', z_i) + \beta \sum_{j=1}^{N} \pi_{ij} V(x', z_j), \qquad \forall i,$$

in the space CN (X) of vectors of real valued functions:

V(x) = (V (x, z1 ), ..., V (x, zN )) = (V1 (x), ..., VN (x))



which are continuous and bounded in X, with the metric $d_\infty^N$,¹¹ where
$$d_\infty^N(\mathbf{V}, \mathbf{W}) = \sum_{i=1}^{N} d_\infty(V_i, W_i) = \sum_{i=1}^{N} \sup_x |V(x, z_i) - W(x, z_i)|.$$

One can easily show that such a metric space of functions is complete, and that the same
conditions for a contraction in the deterministic case can be used here to show that the
operator
$$T : C_N(X) \to C_N(X),$$
$$T\mathbf{V}(x) = \begin{pmatrix}
\sup_{x' \in \Gamma(x, z_1)} F(x, x', z_1) + \beta \sum_{j=1}^{N} \pi_{1j} V(x', z_j) \\
\sup_{x' \in \Gamma(x, z_2)} F(x, x', z_2) + \beta \sum_{j=1}^{N} \pi_{2j} V(x', z_j) \\
\vdots \\
\sup_{x' \in \Gamma(x, z_N)} F(x, x', z_N) + \beta \sum_{j=1}^{N} \pi_{Nj} V(x', z_j)
\end{pmatrix}$$

is a contraction with modulus β. It is easy to see that both boundedness and - by the
Theorem of the Maximum - continuity are preserved under T. Similarly, given that (conditional)
expectations are nothing more than convex combinations, concavity is preserved
under T, and the same conditions used for the deterministic case can be assumed here to
guarantee the stochastic analogue of Theorem 16.
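To make the finite-shock case concrete, here is a small value-function-iteration sketch (not part of the original notes). It applies the operator T above to a stochastic growth problem with F(k, k′, z) = ln(z kᵅ − k′), a two-state Markov shock, and a discretized capital grid; all parameter values, the grid, and the variable names are arbitrary illustrative choices.

```python
# A sketch of value function iteration for the operator T with finitely many shocks.
# Illustrative model: F(k, k', z) = ln(z * k**alpha - k'), full depreciation.
import numpy as np

alpha, beta = 0.36, 0.95
z_grid = np.array([0.9, 1.1])                     # shock values z_1, z_2
Pi = np.array([[0.8, 0.2],
               [0.2, 0.8]])                       # pi_ij = Pr{z' = z_j | z = z_i}

k_grid = np.linspace(0.05, 0.5, 200)              # grid for the endogenous state
nk, nz = k_grid.size, z_grid.size

# Period return F(k, k', z_i) for every combination; infeasible choices get -inf.
cons = z_grid[None, None, :] * k_grid[:, None, None]**alpha - k_grid[None, :, None]
F = np.where(cons > 0, np.log(np.maximum(cons, 1e-12)), -np.inf)

V = np.zeros((nk, nz))                            # initial guess V_0
for it in range(1000):
    EV = V @ Pi.T                                 # EV[k', i] = sum_j pi_ij V(k', z_j)
    TV = np.max(F + beta * EV[None, :, :], axis=1)    # sup over k' on the grid
    if np.max(np.abs(TV - V)) < 1e-8:             # contraction: sup-norm convergence
        break
    V = TV

EV = V @ Pi.T
policy = k_grid[np.argmax(F + beta * EV[None, :, :], axis=1)]
print("iterations:", it, " V(k_mid, z_low) =", V[nk // 2, 0])
```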

The General case When Z is continuous, we need to use measure theory. We need to
assume some additional technical restrictions to guarantee that the integrals involved in
the expectations and the limits inside those integrals are well defined.
Unfortunately, these technical complications prevent the possibility of having a result
along the lines of Theorem 12. The reason is that one cannot be sure that the true value
function is measurable. As a consequence, the typical results in this case take the form of
verification or sufficiency theorems. Before stating the results formally we need to
introduce some notation.

Definition 35 A plan π is an initial value π0 ∈ X and a sequence of (h^t-measurable)
functions¹²
$$\pi_t : H^t \to X$$
for all t ≥ 1, where H^t is the set of all length-t histories of shocks: h^t = (z0, z1, ..., zt), zt ∈ Z.

¹¹ Another possibility is to use the metric $d_\infty^{\max}$, defined as
$$d_\infty^{\max}(\mathbf{V}, \mathbf{W}) = \max_i \{ d_\infty(V_i, W_i) \} = \max_i \sup_x |V(x, z_i) - W(x, z_i)|.$$
¹² A function is said to be h^t-measurable when it is measurable with respect to the σ-algebra generated by the set of all possible h^t histories H^t.

That is, πt (ht ) is the value of the endogenous state xt+1 that is chosen in period t,
when the (partial) history up to this moment is ht . So, in a stochastic framework agents
are taking contingent plans. They are deciding what to do for any possible history, even
though some of these histories are never going to happen. Moreover, for any partial
history ht ∈ H t one can define a probability measure µt : µt (C) = Pr {ht ∈ C ⊆ H t }. In
this environment, feasibility is defined similarly to the deterministic case. We say that
the plan π is feasible, and write π ∈ Π(x0 , z0 ) if π0 ∈ Γ(x0 , z0 ) and for each t ≥ 1 and
ht we have πt (ht ) ∈ Γ(πt−1 (ht−1 ), zt ). We will always assume that F, Γ, β and µ are such
that Π(x0 , z0 ) is nonempty for any (x0 , z0 ) ∈ X × Z, and that the objective function
$$\mathcal{U}(\pi) = \lim_{T \to \infty} F(x_0, \pi_0, z_0) + \sum_{t=1}^{T} \beta^t \int_{H^t} F\left(\pi_{t-1}(h^{t-1}), \pi_t(h^t), z_t\right) d\mu^t(h^t)$$
$$\qquad\; = \lim_{T \to \infty} F(x_0, \pi_0, z_0) + \sum_{t=1}^{T} \beta^t \, \mathbb{E}_0\left[ F\left(\pi_{t-1}(h^{t-1}), \pi_t(h^t), z_t\right) \right]$$

is well defined for any π ∈ Π(x0 , z0 ) and (x0 , z0 ) . Similarly to the compact notation for
the deterministic case, the true value function V ∗ is defined as follows

$$V^*(x_0, z_0) = \sup_{\pi \in \Pi(x_0, z_0)} \mathcal{U}(\pi). \qquad (4.3)$$

Let me first state a verification theorem for the stochastic case.

Theorem 21 Assume that V (x, z) is a measurable function which satisfies the Bellman
equation (4.1). Moreover, assume that
 
$$\lim_{t \to \infty} \beta^{t+1}\, \mathbb{E}_0\left[ V(\pi_t(h^t), z_{t+1}) \right] = 0$$

for every possible contingent plan π ∈ Π(x0 , z0 ) for all (x0 , z0 ) ∈ X × Z; and that the
policy correspondence
$$G(x, z) = \left\{ x' \in \Gamma(x, z) : V(x, z) = F(x, x', z) + \beta \int_Z V(x', z')\, dQ(z, z') \right\} \qquad (4.4)$$

is non empty and permits a measurable selection. Then V = V ∗ and all plans generated
by G are optimal.

Proof. The idea of the proof follows very closely the lines of Theorems 13 and 14.
A plan that solves the Bellman equation and that does not have any left-over value at
infinity is optimal. Of course, we must impose a few additional technical conditions required
by measure theory.¹³ For details the reader can see Chapter 9 of SLP. Q.E.D.
In order to be able to recover Theorem 12 we need to make an assumption on the
endogenous V ∗ :

Theorem 22 Let F be bounded and measurable. Suppose that the value function V*(x0, z0) defined
in (4.3) is measurable and that the correspondence analogous to (4.4) admits
a measurable selection. Then V*(x0, z0) satisfies the functional equation (4.1) for all
(x0, z0), and any optimal plan π* (which solves (4.3)) also solves
$$V^*\left(\pi^*_{t-1}(h^{t-1}), z_t\right) = F\left(\pi^*_{t-1}(h^{t-1}), \pi^*_t(h^t), z_t\right) + \beta \int V^*\left(\pi^*_t(h^t), z_{t+1}\right) dQ(z_t, z_{t+1}),$$
µt(·)-almost surely, for all t and h^t emanating from z0.

Proof. The idea of the proof is similar to that of Theorem 12. For the several details,
however, the reader is referred to Theorem 9.4 in SLP. Q.E.D.
Let us finally state the analogues of Theorems 15 and 16 for the stochastic environment,
allowing for continuous shocks.

Theorem 23 Assume F is continuous and bounded; Γ is compact-valued and continuous;
Q possesses the Feller property; β ∈ [0, 1); and X is a closed and convex subset of IR^l.
Then the Bellman operator T,
$$(TW)(x, z) = \max_{x' \in \Gamma(x, z)} F(x, x', z) + \beta \int_Z W(x', z') \, dQ(z, z'),$$
has a unique fixed point V in the space of continuous and bounded functions.

Proof. Once we have noted that the Feller property of Q guarantees that if W is a
bounded and continuous function then ∫_Z W(x′, z′) dQ(z, z′) is also a bounded and continuous
function of (x′, z), we can apply basically line by line the proof of Theorem 15. Q.E.D.

Theorem 24 Assume F is concave, continuous and bounded; Γ is continuous and has a
convex graph; Q possesses the Feller property; β ∈ [0, 1); and X is a closed and convex
subset of IR^l. Then the Bellman operator has a unique fixed point V in the space of concave,
continuous and bounded functions.
¹³ For example, the policy correspondence G permits a measurable selection if there exists a measurable function h : X × Z → X such that h(x, z) ∈ G(x, z) for all (x, z) ∈ X × Z.

Proof. Again the proof is similar to the deterministic case. Once we have noted that
the linearity of the integral preserves concavity (since ∫_Z dQ(z, z′) = 1), we can basically
apply line by line the proof of Theorem 16. Q.E.D.
It is important to notice that whenever the conditions of Theorem 23 are met, the
boundedness of V and an application of the Maximum Theorem imply the conditions of
Theorem 21 are also satisfied, hence V = V ∗ which is a continuous function (hence mea-
surable). In this case the Bellman equation fully characterizes the optimization problem
also with uncountably many possible levels of the shock.

4.4 The Stochastic Model of Optimal Growth


Consider the stochastic version of the optimal growth model
$$V(k_0, z_0) = \sup_{\{k_{t+1}\}_{t=0}^{\infty}} \mathbb{E}_0\left[ \sum_{t=0}^{\infty} \beta^t u\left(f(z_t, k_t) - k_{t+1}\right) \right]$$
$$\text{s.t. } 0 \leq k_{t+1} \leq f(z_t, k_t) \text{ for all } t, \qquad k_0 \in X, \; z_0 \in Z,$$
where the expectation is over the sequence of shocks {zt}_{t=0}^{∞}. Assume that {zt}_{t=0}^{∞} is an
i.i.d. sequence of shocks, each drawn according to the probability measure µ on (Z, Z).

Exercise 44 Let u(c) = ln c and f (z, k) = zk α , 0 < α < 1 (so δ = 1). I tell you that the
optimal policy function takes the form kt+1 = αβzt ktα for any t and zt . (i) Use this fact
to calculate an expression for the optimal policy πt∗ (ht ) [recall that ht = (z0 , ...zt )] and
the value function V ∗ (k0 , z0 ) for any initial values (k0 , z0 ), and verify that V ∗ solves the
following Bellman equation

$$V(k, z) = \max_{0 \leq k' \leq z k^\alpha} \ln(z k^\alpha - k') + \beta\, \mathbb{E}\left[V(k', z')\right].$$
(ii) Now show that a solution to the above functional equation is
$$V(k, z) = A(z) + \frac{\alpha}{1 - \beta\alpha} \ln k,$$
and discuss the relationship between V* and V.
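As a complement to the exercise (not part of the original notes), the sketch below simply simulates the model under the stated policy kt+1 = αβ zt ktᵅ, drawing the i.i.d. shocks from a log-normal µ with E[ln z] = 0 (an arbitrary illustrative choice), and compares the long-run average of ln kt with its implied theoretical value.

```python
# A sketch simulating Exercise 44's model under the stated policy k_{t+1} = alpha*beta*z_t*k_t**alpha.
import numpy as np

rng = np.random.default_rng(3)
alpha, beta = 0.36, 0.95
T = 50_000

z = np.exp(0.1 * rng.standard_normal(T))          # i.i.d. shocks with E[ln z] = 0
k = np.empty(T)
k[0] = 0.1
for t in range(T - 1):
    k[t + 1] = alpha * beta * z[t] * k[t]**alpha  # the stated optimal policy

# ln k_{t+1} = ln(alpha*beta) + ln z_t + alpha * ln k_t is a stable AR(1) in ln k,
# so its long-run mean is (ln(alpha*beta) + E[ln z]) / (1 - alpha).
print("simulated mean of ln k :", np.log(k[1000:]).mean())
print("theoretical mean       :", np.log(alpha * beta) / (1 - alpha))
```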

This model can be extended in many directions. The version with persistent shocks
and elastic labor supply has been used in the Real Business Cycles literature to
study the effects of technological shocks on aggregate variables like consumption and
employment. This line of research started in the 80s, and for many macroeconomists it is
still the building block for any study of the aggregate real economy. RBC models will be the
next topic of these notes. Moreover, since most interesting economic problems do not have
closed-form solutions, you must first learn how to use numerical methods to approximate V
and perform simulations.
Bibliography

[1] Bertsekas, D. P. (1976), Dynamic Programming and Stochastic Control, Academic Press, New York.

[2] Bertsekas, D. P., and S. E. Shreve (1978), Stochastic Control: The Discrete Time Case, Academic Press, New York.

[3] Rudin, W., Principles of Mathematical Analysis, McGraw-Hill.

[4] Stokey, N., R. Lucas and E. Prescott (1991), Recursive Methods in Economic Dynamics, Harvard University Press.
