
MAE Macro II

Carlo Galli∗

∗ Department of Economics, Universidad Carlos III de Madrid. Email: cgalli@eco.uc3m.es

Last updated at 10:31 on Wednesday 26th March, 2025

Contents

1 Finite Horizon
  1.1 Sequential approach
  1.2 Dynamic Programming approach
  1.3 A more general formulation

2 Infinite Horizon
  2.1 Sequential problem
  2.2 Recursive problem
    2.2.1 Solving the recursive problem

3 Maths Preliminaries
  3.1 Metric spaces
  3.2 Completeness and convergence
  3.3 Contractions and fixed points
  3.4 Theorem of the Maximum

4 Dynamic Programming
  4.1 Principle of Optimality
  4.2 Bounded returns
  4.3 Unbounded returns

5 Stochastic Environments
  5.1 Markov chains
  5.2 Stochastic Dynamic Programming
  5.3 The McCall Job Search Model

6 Recursive Competitive Equilibrium
  6.1 RCE with Government
  6.2 RCE with Heterogeneity

7 Ordinary Differential Equations Review
  7.1 Homogeneous, Separable, Linear
  7.2 Non-Homogeneous

8 Dynamic Optimisation in Continuous Time
  8.1 Finite horizon
  8.2 Infinite horizon
  8.3 Consumption-savings model

9 Continuous Time Dynamic Programming
  9.1 Finite horizon
  9.2 Infinite horizon
    9.2.1 Consumption-savings model
    9.2.2 Neoclassical growth model
  9.3 Numerical solutions

10 Stochastic Dynamic Programming in Continuous Time
  10.1 Review of Poisson processes
  10.2 Stochastic HJB
  10.3 Stochastic Euler equation
  10.4 Kolmogorov forward equation

11 Real Business Cycle Theory
  11.1 Some stylised facts
  11.2 The basic RBC model
  11.3 Perturbation methods
    11.3.1 Method of undetermined coefficients
    11.3.2 Blanchard-Kahn method

Acknowledgements
Significant parts of these notes borrow from the following material:
• Recursive Methods in Economic Dynamics by Stokey, Lucas, with Prescott (SLP), Harvard
University Press (1989);
• Recursive Macroeconomic Theory by Lars Ljungqvist and Thomas Sargent (LS), 3rd edition, MIT Press (2012);
• lecture notes of Matthias Kredler’s lectures, redacted by Sergio Feijoo;
• lecture notes on Advanced Macroeconomic Theory by Nezih Guner;
• lecture notes on Macroeconomic Theory by Dirk Krueger;
• slides on continuous time macro and numerical methods by Benjamin Moll;
• lecture notes on continuous time macro by Pablo Kurlat;
• lecture notes on Macroeconomic Theory by Victor Rios-Rull, redacted by many of his
students.

Course Information
Please refer to the syllabus in Aula Global for details on course evaluation, problem set grading,
attendance and additional reference material.

1 Finite Horizon
Consider a single agent living until period T and receiving an exogenous endowment w_t (think of it as a wage).
The utility function is u(c) such that u' > 0, u'' < 0, \lim_{c \to 0} u'(c) = +\infty, and u is twice differentiable. We will maintain these assumptions essentially throughout.
The objective to be maximised is

\max_{\{c_t, a_{t+1}\}_{t=0}^{T}} \; \sum_{t=0}^{T} \beta^t u(c_t)

subject to the budget constraint

w_t + a_t = c_t + \frac{a_{t+1}}{R} \qquad \text{for } t = 0, 1, \dots, T

and the finite-horizon version of the no-Ponzi-game condition (who was Ponzi? and who was Madoff?), namely a_{T+1} \geq 0, which states that agents cannot die in debt (what would happen if they could?).

1.1 Sequential approach

Let's first look for the solution using the Lagrangian, which you probably already know.
The Lagrangian is

\mathcal{L}\left(\{c_t, a_{t+1}, \lambda_t\}_{t=0}^{T}, \mu_T\right) = \sum_{t=0}^{T} \beta^t \left\{ u(c_t) + \lambda_t \left[ w_t + a_t - c_t - \frac{a_{t+1}}{R} \right] \right\} + \beta^T \mu_T (a_{T+1} - 0)

and represents the value of the agent's lifetime utility under a certain plan \{c_t, a_{t+1}\}_{t=0}^{T} and given a certain value of the constraints (which will be zero by the complementary slackness condition).
The first-order conditions (FOCs), which are necessary but not sufficient for the solution to be an optimum of our problem, are

\frac{\partial \mathcal{L}}{\partial c_t} = \beta^t \left[ u'(c_t) - \lambda_t \right] = 0 \qquad \text{for } t \leq T
\frac{\partial \mathcal{L}}{\partial a_{t+1}} = -\beta^t \lambda_t \frac{1}{R} + \beta^{t+1} \lambda_{t+1} = 0 \qquad \text{for } t \leq T-1
\frac{\partial \mathcal{L}}{\partial a_{T+1}} = -\beta^T \lambda_T \frac{1}{R} + \beta^T \mu_T = 0.

Simplifying, we get the following. First,

u'(c_t) = \lambda_t

which states that, under the optimal consumption and savings path, the marginal utility (MU) of consumption must equal the shadow value of wealth. Why is λ_t the shadow value of wealth? Consider

\frac{\partial \mathcal{L}}{\partial w_t} = \beta^t \lambda_t

so λ_t represents the marginal change of the maximised value of our objective function \mathcal{L} corresponding to a unitary increase in wealth (w_t in this case, but there are other ways to increase wealth!).
Before we move to the other FOCs, consider the complementary slackness condition (also a necessary condition) for the no-Ponzi-game constraint

\mu_T (a_{T+1} - 0) = 0

which says that at least one of μ_T and a_{T+1} must equal zero. Since μ_T = \lambda_T / R and λ_T = u'(c_T) > 0, it must be that a_{T+1} = 0. Note that this condition is not the same as the no-Ponzi-game constraint: the constraint says you cannot die with negative wealth; this optimality condition says that the optimal thing to do is to die with zero wealth.
Combine the consumption and savings FOCs to get the Euler equation for savings

u'(c_t) = \beta R \, u'(c_{t+1}) \qquad \text{for } t < T.

The left-hand side (LHS) is the marginal (opportunity) cost of saving a unit of your wealth today: you could instead buy a unit of consumption with it, which gives you some MU. The right-hand side (RHS) is the marginal benefit of saving a unit of wealth today: you get R units tomorrow, which you can consume and get some MU tomorrow, which you then discount to today with the discount factor β. The Euler equation says that in equilibrium these two things must be equal, otherwise the consumption path \{c_t\}_{t=0}^{T} would not be optimal: the agent could improve upon it by saving more and consuming less, or vice versa.
Another way to look at the Euler equation is to express it in a "micro" way:

\frac{u'(c_t)}{\beta u'(c_{t+1})} = R \qquad \text{for } t < T

where the LHS is the marginal rate of substitution between two "goods", consumption today and consumption tomorrow (why is the β there?), and the RHS is the relative price of the two goods.
A solution of the problem is the optimal consumption path \{c_t\}_{t=0}^{T}, or equivalently the optimal savings path \{a_{t+1}\}_{t=0}^{T} (can you derive one from the other?). Is what we have so far enough to find a solution? We have T Euler equations (one for each t < T) and 1 terminal condition for a_{T+1}, and we need to find the values of the T + 1 variables \{a_{t+1}\}_{t=0}^{T}, so the answer is yes. It may however not be that easy to actually find these variables using our equilibrium conditions. Let's see some examples that help us with that.
Let's suppose that βR = 1. Then the Euler equation implies u'(c_t) = u'(c_{t+1}) for t < T, so c_t is constant for all t. We can use this! Let's first derive the present value budget constraint (PVBC). Start with the t = 0 and t = 1 budget constraints

a_0 = c_0 - w_0 + a_1/R
a_1 = c_1 - w_1 + a_2/R

and plug the second into the first:

a_0 = c_0 - w_0 + \frac{1}{R}\left[ c_1 - w_1 + a_2/R \right].

If you keep going all the way to T (and use the terminal condition a_{T+1} = 0), you get

a_0 = \sum_{t=0}^{T} \frac{c_t - w_t}{R^t}.
Now use the fact that consumption is constant and rearrange:

c = \frac{1-\beta}{1-\beta^{T+1}} \left[ a_0 + \sum_{t=0}^{T} \frac{w_t}{R^t} \right] \qquad (1)

where we used the properties of geometric sums and the fact that β = 1/R. Equation (1) is nice because it expresses the level of consumption as a function of the present value of wealth at t = 0 (the whole term in square brackets) and the discount factor. The term in the βs outside the brackets is called the marginal propensity to consume (MPC), and it represents the marginal change in per-period consumption due to a unit change in the PV of wealth.
Suppose now that βR ≠ 1 and that utility is CRRA: u(c) = \frac{c^{1-\gamma}}{1-\gamma}. Then the Euler equation becomes

c_t^{-\gamma} = \beta R \, c_{t+1}^{-\gamma}.

Taking logs of both sides we get

\log\left(\frac{c_{t+1}}{c_t}\right) = \frac{1}{\gamma}\left[\log(\beta) + \log(R)\right]

which shows how the growth rate of consumption (left-hand side)¹ depends on the discount factor, the rate of return, and the intertemporal elasticity of substitution (IES), given by 1/γ. The IES is called like that because it represents the elasticity of consumption growth with respect to R.

¹ To see this, note that \log(c_{t+1}/c_t) = \log(1 + g^c_{t,t+1}) \approx g^c_{t,t+1}, where g^c_{t,t+1} is the growth rate of consumption between periods t and t + 1.
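As a quick numerical illustration of equation (1), here is a minimal Python sketch; the horizon, endowment path and parameter values are made-up assumptions for the example. It computes the constant consumption level under βR = 1 and verifies that the implied asset path indeed ends with a_{T+1} = 0.

import numpy as np

# Illustrative parameters (assumptions for this example), with beta*R = 1
beta, T = 0.96, 40
R = 1 / beta
a0 = 1.0
w = np.ones(T + 1)  # flat endowment path w_t = 1

# Present value of wealth at t = 0 and the MPC from equation (1)
pv_wealth = a0 + sum(w[t] / R**t for t in range(T + 1))
mpc = (1 - beta) / (1 - beta**(T + 1))
c = mpc * pv_wealth  # constant consumption, since u'(c_t) = u'(c_{t+1})

# Check: roll the budget constraint a_{t+1} = R*(a_t + w_t - c) forward
a = a0
for t in range(T + 1):
    a = R * (a + w[t] - c)
print(f"c = {c:.4f}, a_(T+1) = {a:.2e}")  # a_(T+1) ~ 0 up to rounding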

1.2 Dynamic Programming approach

We now use an alternative approach, on which we will focus for most of the course this term. We do so because this approach is very useful for representing and solving dynamic problems, but we'll say more on this later.
Consider the problem sequentially, starting from the last period. We now introduce an object called the value function: let V_T(a) denote the (maximised) value (of lifetime utility) for the agent
entering period T with a units of the asset. The words written in parentheses are usually omitted
for brevity from the definition of the value function V , but they help understand what we’re
talking about. This value is given by

V_T(a) = \max_{a'} \; u(a + w_T - a'/R) = u(a + w_T) \qquad (2)

where the last equality is due to the fact that we already know it is optimal to set aT +1 = 0.
Note that
• We carry around income wT like that because it’s fully known and exogenous, so we treat
it as if it was a parameter basically.
• We are not using a time subscript for a (which represents aT ) or for a′ (which represents
aT +1 ): it’s not necessary, as VT is supposed to indicate the value of lifetime utility for
an agent with some wealth a, whatever that value is, as we don’t know that yet. And
a′ is some choice of future assets, and we can call it whatever we want as we already
know we are in period T. We will refer to a as a state variable, because it is some sort of pre-determined initial condition in the problem at hand, and to a′ as a control variable, because it is what we can actually choose in the problem at hand.
• In equation (2), we used the budget constraint to substitute out c: this approach may be useful sometimes, but it is by no means compulsory. You can also leave c in and impose the budget constraint as the constraint it is. We will see an example of that later when we deal with V_{T−2}.
Let's now go back one period, to T − 1:

V_{T-1}(a) = \max_{a'} \left\{ u(a + w_{T-1} - a'/R) + \beta V_T(a') \right\}.

Here a represents initial wealth at T − 1, and a′ represents the wealth choice at T − 1, but also
initial wealth at T (as we saw in the previous step). Now VT −1 denotes the (maximised, as the
agent is behaving optimally, that’s why there’s a max) value (of lifetime utility, as it includes
all remaining future periods) for the agent entering period T − 1 with a units of the asset.
Let's find the optimal value of a′: take the derivative of the RHS and set it to zero,

-u'(a + w_{T-1} - a'/R)\,\frac{1}{R} + \beta \frac{d}{da'} V_T(a') = 0. \qquad (3)

We can easily compute the derivative in the last term because we know the form of V_T from above:

\frac{d}{da} V_T(a) = u'(a + w_T).
What we've just done is to differentiate a value function (the one for the last period) with respect to its state variable. We'll refer to this derivative as the envelope condition. Equation (3) then becomes

u'(a + w_{T-1} - a'/R)\,\frac{1}{R} = \beta u'(a' + w_T)

which is one equation in one unknown (a′). Its solution gives us the optimal savings choice at T − 1 when initial wealth is a. We'll denote this as a function, a′ = g_{T-1}(a), and call it the policy function. Again, it doesn't matter whether we use a and a′ or a_{T-1} and a_T; it's just notation and doesn't change the substance. It should be clear that what we just derived is equivalent to

u'(c) = \beta R \, u'(c')

which is identical to the Euler equation we got earlier.


We won't solve explicitly for g_{T-1}, but let's use it to write the maximised value V_{T-1}:

V_{T-1}(a) = u\left(a + w_{T-1} - \frac{g_{T-1}(a)}{R}\right) + \beta V_T(g_{T-1}(a)).

Let's do one more step and go back one more period. The problem is

V_{T-2}(a) = \max_{c, a'} \left\{ u(c) + \beta V_{T-1}(a') \right\}
\text{s.t.} \quad c + \frac{a'}{R} = a + w_{T-2}.

Here we represented the problem with its constraint. The way to take FOCs with this approach is to build something equivalent to the Lagrangian in the sequential approach, i.e.

V_{T-2}(a) = \max_{c, a'} \left\{ u(c) + \beta V_{T-1}(a') + \lambda \left[ a + w_{T-2} - c - \frac{a'}{R} \right] \right\} \qquad (4)

where you can think of the objective function to be maximised (which now also includes the Lagrange multiplier and the budget constraint) as the Lagrangian

\mathcal{L}(a, c, a') = u(c) + \beta V_{T-1}(a') + \lambda \left[ a + w_{T-2} - c - \frac{a'}{R} \right].
The FOCs for c and a′ are

u'(c) = \lambda
\frac{\lambda}{R} = \beta \frac{d}{da'} V_{T-1}(a').

We need the envelope condition too:

\frac{d}{da} V_{T-1}(a) = u'\left(a + w_{T-1} - \frac{g_{T-1}(a)}{R}\right).

The reason why we call this the envelope condition is the envelope theorem, which is the reason why, when differentiating V_{T-2}(a) with respect to a, we did not worry about differentiating g_{T-1}(a) with respect to a. In fact, if we did that, we'd have

\frac{d}{da} V_{T-1}(a) = u'(c) - \frac{d g_{T-1}(a)}{da} \left[ \frac{u'(c)}{R} - \beta \frac{d}{da'} V_T(a') \right].

You can see that the term that multiplies \frac{d g_{T-1}(a)}{da} must equal zero, because it is exactly the FOC for the T − 1 problem we took in equation (3). This is what the envelope theorem states: you can ignore the response of a′ to a because a′ is already chosen optimally, and just focus on the direct effect of the state variable a on the value function V_t(a). We will see this more formally later on in the course.
You can see that in each step we found a policy function g_t and a value function V_t, and we could keep going. Once we finish going back in time, we'll have a whole sequence of value and policy functions

\{V_t(a), g_t(a)\}_{t=0}^{T}

which is the solution to the problem in the same way as the sequences \{c_t, a_{t+1}\}_{t=0}^{T} were the solution to the sequential problem. The nice property of the dynamic programming approach will be that, in infinite-horizon settings, we'll get rid of the t subscripts, so the solution will just be two functions, V and g, while the solution to the sequential problem will continue to be an infinite sequence of variables, which is a much less handy object to deal with.
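To make the backward recursion concrete, here is a minimal grid-based sketch in Python; the functional forms, horizon and parameter values are illustrative assumptions (log utility, a flat wage, no borrowing on the grid), not part of the notes. It computes the whole sequence {V_t, g_t} by iterating backwards from V_{T+1} = 0.

import numpy as np

# Illustrative assumptions: log utility, flat wages, short horizon
beta, R, T = 0.95, 1.04, 10
w = np.ones(T + 1)
grid = np.linspace(0.0, 10.0, 500)           # asset grid (no borrowing here)

V = np.empty((T + 2, grid.size))
g = np.empty((T + 1, grid.size), dtype=int)  # policy g_t(a), stored as a grid index
V[T + 1] = 0.0                               # no value after the last period

def u(c):
    # log utility; infeasible consumption levels get value -inf
    return np.where(c > 0, np.log(np.maximum(c, 1e-12)), -np.inf)

for t in range(T, -1, -1):                   # backward induction
    # consumption implied by each (a, a') pair: c = a + w_t - a'/R
    c = grid[:, None] + w[t] - grid[None, :] / R
    values = u(c) + beta * V[t + 1][None, :]
    g[t] = np.argmax(values, axis=1)         # g_t(a)
    V[t] = np.max(values, axis=1)            # V_t(a)

# e.g. the optimal a' at t = 0 for the state a = grid[100]:
print(grid[g[0, 100]])

Note that at t = T the recursion automatically picks a′ = 0, exactly as the complementary slackness argument above predicts.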

1.3 A more general formulation

Sequential Problem. A general version of our initial problem in sequential form is

V_0(x_0) = \sup_{\{x_{t+1}\}_{t=0}^{T}} \; \sum_{t=0}^{T} \beta^t F_t(x_t, x_{t+1})
\text{s.t.} \quad x_{t+1} \in \Gamma_t(x_t) \; \text{for } t = 0, 1, \dots, T, \quad x_0 \text{ given.} \qquad \text{(SP)}

We’ll call this SP for “Sequential Problem”.


A bit of notation now. Let
• X denote the set of all possible values that x_{t+1} can take at any period t, that is, x_{t+1} ∈ X for all t;
• Γ_t : X ⇒ X denote the correspondence (a one-to-many mapping) that gives the set of feasible actions x_{t+1} that can be taken in a given period, for a given value of the variable x_t in that period;
• F_t : X × X → R denote the per-period return function.
The ingredients of our problem are (X, F, β, Γ), where X, Γ are somehow related to the technology of the problem, while F, β are related to preferences. And V_0(x_0) is, again, the maximised lifetime value for an agent starting with initial condition x_0.

Recursive Problem. As we did in our example before, we can rewrite (SP) as a Functional Equation (FE henceforth)²

V_t(x) = \sup_{x' \in \Gamma_t(x)} \left\{ F_t(x, x') + \beta V_{t+1}(x') \right\} \quad \text{for } t = 0, 1, \dots, T. \qquad \text{(FE)}

In this formulation, V_t is a value function, x and t are the state variables, x′ is the control variable, Γ_t is the feasible set correspondence and F_t(x, x′) is the return function.
V_t is the value function at t: it denotes the maximised residual lifetime value for an agent that enters period t with state x.

² Once again, it's irrelevant whether you write variables with or without time subscripts, as here the only distinction that matters is current vs. future periods. So (x, x′), (x_t, x_{t+1}) or (x, x⁺) are all examples of valid notation pairs.
What is a state? It's something sufficient to summarise the problem at any point in time. It's the smallest set of variables at time t that allows us to:
1. determine the feasible set of controls Γ_t(x);
2. determine the current-period return F_t(x, x′) given a choice of controls x′;
3. determine the value tomorrow given a choice of controls.

These are the main criteria for deciding which variables need to belong to the set of state variables. What instead does not need to belong to such a set are the fixed objects of the problem, which do not really change with choices or shocks. Things like the discount factor and the utility function parameters are not state variables; they are just fixed parameters, so we do not need to carry them around as value function arguments. With respect to things like endowments or wages, if they are deterministic they also do not really need to be included in the set of states, because they are already fixed and known. But if you do include these last ones as state variables, it's not a mistake, so there is some flexibility.
The policy function associated with the problem is

g_t(x) = \arg\sup_{x' \in \Gamma_t(x)} \left\{ F_t(x, x') + \beta V_{t+1}(x') \right\}. \qquad (5)

As we will see later on, (FE) and (SP) are equivalent!

Analogy with the Consumption-Savings problem. In the consumption-savings problem:
• the state variable is a (again, think of w_t as a parameter, given that it's fully exogenous and deterministic);
• the control variable is a′;
• the return function is F_t(a, a′) = u(w_t + a − a′/R);
• the feasible set correspondence is Γ_t(a) = (−∞, (w_t + a)R] for t = 0, 1, ..., T − 1 (since c ≥ 0) and Γ_T(a) = [0, (w_T + a)R], since the no-Ponzi condition does not allow for borrowing in the last period;
• the value function is

V_t(a) = \max_{a' \in \Gamma_t(a)} \left\{ u(w_t + a - a'/R) + \beta V_{t+1}(a') \right\} \quad \text{for } t = 0, 1, \dots, T
V_{T+1}(a) = 0.

Reduced-Form representation. Note that we can get rid of the decision variable (consumption in the example above) and write the problem in reduced form only under some conditions. To see what they are, consider for a second a more general formulation where C is the domain of the control variable(s), D : X ⇒ C is the feasible control set correspondence, F : X × C → R is the return function, and q : X × C ⇒ X is the law of motion for the state variable(s) (in stochastic problems, it is a conditional probability). In the consumption-savings example, C = R, D_t(a) = [0, a + w_t], F(a, c) = u(c), and q_t(a, c) = (w_t + a − c)R.
We can replace the control variable(s) with the future state(s) and write down the problem in reduced form if
• the dimension of the action space C is the same as the dimension of the state space X;
• given state(s) x, the law of motion x′ = q(x, c) is a bijection (one-to-one).
Then from (x, x′) we can recover c = ξ(x, x′) by inverting the mapping q, we can write the feasible set correspondence in terms of state variables only as

\Gamma(x) = \{ q(x, c) : c \in D(x) \},

and the return function as

F(x, \xi(x, x')).

In the consumption-savings example, C = R has the same dimension as X = R, q is one-to-one, we can get c = ξ_t(a, a′) = w_t + a − a′/R by inverting the mapping q, and we can write the feasible set correspondence in terms of current and future state variables as Γ_t(a).

2 Infinite Horizon
We now move to the analysis of the neoclassical growth model (NGM) in infinite horizon. There is a single, representative agent in a production economy. There is a single good y_t that can be either consumed, c_t, or transformed into capital used for production. The production technology is y_t = H(k_t, n_t), where k_t ≥ 0 is capital and n_t ∈ [0, 1] is labour. We'll assume that H is concave, continuously differentiable and features constant returns to scale.³ The resource constraint of the economy is

c_t + i_t \leq y_t,

the capital law of motion (LOM) is

i_t = k_{t+1} - (1-\delta)k_t,

and initial capital k_0 is exogenously given.

The agent has preferences

\sum_{t=0}^{\infty} \beta^t u(c_t)

where u satisfies the usual assumptions. We assume labour supply is fixed (n_t = 1), so there is no disutility from labour, and we can define

f(k_t) = H(k_t, 1) + (1-\delta)k_t

so that the resource constraint becomes

c_t + k_{t+1} = f(k_t).

³ That is, it is homogeneous of degree one, which means that H(zk_t, zn_t) = zH(k_t, n_t).

2.1 Sequential problem

Using the new resource constraint to substitute out consumption we can write the Sequential
Problem:

X
V0 (k0 ) = max∞ β t u(f (kt ) − kt+1 )
{kt+1 }t=0
t=0
(SP-NGM)
s.t. kt+1 ∈ [0, f (kt )]
k0 given.

Let’s find the optimality conditions using the Lagrangian, and let’s assume for a second that
we’re in a finite horizon setting where the last period is T (then we’ll take the limit for T → ∞,

14
but this helps). The Lagrangian is

T
X
L ({kt+1 , µt }, ) = β t u(f (kt ) − kt+1 ) + µt (kt+1 − 0)
t=0

where we ignored the constraint c_t ≥ 0 (or equivalently k_{t+1} ≤ f(k_t)), given that it will always be satisfied since marginal utility goes to infinity as c → 0. The FOCs for capital are

\frac{\partial \mathcal{L}}{\partial k_{t+1}} = -\beta^t u'(c_t) + \beta^{t+1} f'(k_{t+1}) u'(c_{t+1}) + \mu_t = 0 \qquad \text{for } t = 0, 1, \dots, T-1
\frac{\partial \mathcal{L}}{\partial k_{T+1}} = -\beta^T u'(c_T) + \mu_T = 0,

and the complementary slackness condition for the non-negativity constraint is μ_t k_{t+1} = 0.
For all periods except the last one there is no danger that k_{t+1} = 0, because that would imply zero production and consumption in the following period, so μ_t = 0 for t < T and we get the usual Euler equation

u'(c_t) = \beta f'(k_{t+1}) u'(c_{t+1}).

In the last period, investment is a bad idea: there is no production tomorrow, and it reduces consumption today, so k_{T+1} = 0 and the non-negativity constraint binds (you would set k_{T+1} negative if you could, but you can't). You can also see that μ_T > 0 because u'(c_T) must be positive. Using the second FOC, the complementary slackness condition for the last period can be rewritten as

\beta^T u'(c_T) k_{T+1} = 0.

The intuition is simple: either the MU of consumption is zero, and then you don't care about having k_{T+1} positive because changes in consumption are irrelevant, or the MU of consumption is positive and then investing for no production tomorrow is a bad idea.
Now let's go back to infinite horizon: the equivalent of the complementary slackness condition becomes

\lim_{t \to \infty} \beta^t u'(c_t) k_{t+1} = 0.

This is what is called the transversality condition (TVC henceforth), and it is an important optimality condition in infinite-horizon models. Using the Euler equation, we can rewrite it as

\lim_{t \to \infty} \beta^t u'(c_t) f'(k_t) k_t = 0.

This says that the marginal value (in discounted utility terms) of "terminal" capital must be zero. This is the infinite-horizon equivalent of saying that it's not optimal to die with positive wealth in a finite-horizon model.
The Euler equation and the TVC below

u'(c_t) = \beta f'(k_{t+1}) u'(c_{t+1})
\lim_{t \to \infty} \beta^t u'(c_t) k_{t+1} = 0

are necessary and sufficient conditions for an optimum in this model (you will have to prove the sufficiency part in one of the problem sets, where you will see why the TVC is needed and why it has the form it has).
Now the question is, how do we find a solution of this problem from these conditions? In Macro I, you may have seen that in the special case with u(c) = log c, δ = 1 and f(k) = k^α, one can guess that k_{t+1} = γ k_t^α and, using the Euler equation, verify that γ = αβ, so that k_{t+1} = αβ k_t^α and in turn c_t = (1 − αβ) k_t^α. But it is very rare to be able to derive a solution by pen. When this is not possible, solving the problem boils down to solving a second-order difference equation in k_t, k_{t+1}, k_{t+2} (the Euler equation) with a terminal condition given by the TVC, which is not an easy task. Dynamic Programming is meant to give us the tools to solve this problem in a simpler and faster way, by hand (also rare) or with a computer.
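For the special case just mentioned (log utility, δ = 1, f(k) = k^α), the closed-form policy makes the dynamics easy to trace. A tiny Python sketch (the parameter values are arbitrary assumptions) iterates k_{t+1} = αβ k_t^α and shows convergence to the steady state k* = (αβ)^{1/(1−α)}:

alpha, beta = 0.3, 0.96

# Steady state solves k = alpha*beta*k^alpha  =>  k* = (alpha*beta)^(1/(1-alpha))
k_star = (alpha * beta) ** (1 / (1 - alpha))

k = 0.05  # arbitrary initial capital
for t in range(25):
    k = alpha * beta * k ** alpha  # closed-form policy from the guess above
print(f"k_25 = {k:.6f} vs steady state k* = {k_star:.6f}")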

2.2 Recursive problem

Going back to our general formulation, we now have:
• the return function F(x_t, x_{t+1}) given by u(f(k_t) − k_{t+1});
• the feasible set correspondence given by Γ(k_t) = [0, f(k_t)].
First, let's see how to go from the sequential to the recursive formulation. The sequential problem was

V_0(k_0) = \max_{\{k_{t+1}\}_{t=0}^{\infty}} \; \sum_{t=0}^{\infty} \beta^t u(f(k_t) - k_{t+1}), \quad k_0 \text{ given.} \qquad (6)
Using the properties of the max operator, we can rewrite it as (note the changes in the time subscripts!)

V_0(k_0) = \max_{k_1 \in \Gamma(k_0)} \left\{ u(f(k_0) - k_1) + \beta \max_{\{k_{t+1}\}_{t=1}^{\infty}} \sum_{t=1}^{\infty} \beta^{t-1} u(f(k_t) - k_{t+1}) \right\}

where k_0 is given in the outer problem and k_1 is given in the inner one. Let us do a change of variable in the last term (let s ≡ t − 1 and use x_s instead of k_t) to make things even clearer:

V_0(k_0) = \max_{k_1 \in \Gamma(k_0)} \left\{ u(f(k_0) - k_1) + \beta \max_{\{x_{s+1}\}_{s=0}^{\infty}} \sum_{s=0}^{\infty} \beta^{s} u(f(x_s) - x_{s+1}) \right\}

where x_0 = k_1 is given in the inner problem. Now finally note that the last object in the equation is equivalent to what equation (6) would be for V_0(x_0), which is the same as V_0(k_1) if we change the variable names back. Now let's get rid of the time subscript on V given that, as we just showed, the value function is timeless, because "time to death" is always infinity. In other words, at any point in time the problem is only defined by the state variable k and nothing else!
We can thus write

V(k_0) = \max_{k_1 \in \Gamma(k_0)} \left\{ u(f(k_0) - k_1) + \beta V(k_1) \right\}

but here the time subscripts in k0 and k1 only really denote “current” vs “future” capital, so
we can just use k, k ′ for that purpose. To conclude, we have shown that our value function is

V (k) = max

{u(f (k) − k ′ ) + βV (k ′ )} (7)
k ∈Γ(k)

and the associated policy function will have the form

g(k) = arg max



{u(f (k) − k ′ ) + βV (k ′ )} . (8)
k ∈Γ(k)

As said before, the DP (or recursive) approach boils down to finding the value and policy
functions, i.e. just two functions which can thus be applied to any value of the state variable k
to get the policy and the value of lifetime utility in that state, rather than finding the infinite
sequence {kt+1 }∞
t=0 that solves the problem.

2.2.1 Solving the recursive problem.

Let us take FOCs and derive once more the Euler equation from the recursive formulation of the neoclassical growth model in infinite horizon. In (7) we formulated the problem in reduced form, i.e. plugging in the resource constraint instead of c. As we saw with (4), there is an alternative, more flexible way to pose the problem, which is to set up something similar to a Lagrangian, as we were doing in the sequential problem:

V(k) = \max_{c, k'} \left\{ u(c) + \beta V(k') + \lambda \left[ f(k) - k' - c \right] \right\}.

The FOCs are

u'(c) = \lambda
\beta V'(k') = \lambda

and the envelope condition is V'(k) = u'(c) f'(k), which can be rolled forward by one period to give V'(k') = u'(c') f'(k'). Putting everything together we get our usual Euler equation

u'(c) = \beta f'(k') u'(c').

We've properly specified the neoclassical growth model in recursive form and we've shown that it looks equivalent (we'll be more formal on this later) to the sequential one. Now, how do we find a solution, though? How do we know there's only one solution, for example?
One option is to solve for the policy function in the Euler equation, but this is a similar idea to the way we solve the sequential problem: it's difficult and there is rarely an explicit solution.
Another option is to solve for the function V first. Mathematically, (7) is a specific type (called a Bellman equation) of what is known as a functional equation. Let T be an operator on the function V, given by

(TV)(k) = \max_{k' \in \Gamma(k)} \left\{ u(f(k) - k') + \beta V(k') \right\}. \qquad (9)

What T does to the function V (the last object you see inside the curly brackets) is the following (going from right to left in the right-hand side of equation (9)): it evaluates it at some point k′, it discounts it with β, it adds u(f(k) − k′) to it, and then it looks for the k′ that maximises this expression within some feasible set Γ(k). This is the meaning, in words, of applying the operator T to the function V and evaluating it at k. Note that, since we are taking a max with respect to k′, TV does not depend on k′ but just on k. Then, looking for the function V that solves our Bellman equation actually means we are solving for a fixed point (but really it's a "fixed function") of the operator T, i.e. any function V such that, once you apply T to it, returns V itself. Mathematically, we're looking for any solution to V = TV.
We already know what a fixed point is: for example, the function f(x) = x³ has three fixed points, i.e. three solutions to x = f(x), which are x ∈ {−1, 0, 1}.
Let us now look at some functional equations and their solutions.

Example 1. One example is

f(x) = x - y + f(y).

The RHS of the equation is an operator that takes the function f, evaluates it at some (any) point y, adds x and subtracts y, finally returning (Tf)(x) as output. That is, we are applying the operator T to the function f for whatever value of y, and evaluating it at the point x: (Tf)(x) = x − y + f(y). One solution (it may not be the only one) to this functional equation is f(x) = x + c for any c ∈ R (check it!).

Another example is

f(x)f(y) = f(x + y)

where one solution is f(x) = e^x (check this too!).

A last example:

f(x)f(y) = f(xy).

One solution of this FE is f(x) = x^n. But note that f(x) = 0 is also a solution for x = 0, while f(x) = 1 is also a solution for any x ≠ 0. This example is particularly instructive of the fact that the restrictions we impose on the solution, or on the domain and codomain of our variables, do matter.
So, how do we find a solution to our special functional equation, i.e. any dynamic macro problem expressed in a recursive way? The two most popular answers are Guess & Verify and Value Function Iteration. The former is a pretty quick method, but it is rarely applicable, as it requires you to have a good idea of what shape the value function will have, which may not always be the case; if you make the wrong guess, you will not find a solution. The latter is a slower but very robust method: you make some guess for the value function (any guess will work!) and then iterate on it until you converge to the solution. The powerful thing about this method is that it can easily be done with a computer and, under some conditions which we will see soon, it always yields the unique solution of the problem.
In general, this is how value function iteration (VFI) works (using the NGM as an example).
• Start with an initial guess V_0(k) (important: here the subscript denotes the iteration count, not time!). You can even guess that V_0(k) = 0 for all k, for example.
• Write down the first iteration

V_1(k) = \max_{k' \in \Gamma(k)} \left\{ u(f(k) - k') + \beta V_0(k') \right\}

and since we know the shape of V_0, we can take the FOC for k′, find the optimal policy g_0(k), and get

V_1(k) = u(f(k) - g_0(k)) + \beta V_0(g_0(k)).

This step consisted in applying the Bellman operator T once, i.e. V_1 = TV_0.
• You can keep going, and at the n-th step you will have

V_{n+1}(k) = \max_{k' \in \Gamma(k)} \left\{ u(f(k) - k') + \beta V_n(k') \right\}.

The goal is that, as you continue to iterate, you will eventually converge to the true value function V(k), that is, \lim_{n \to \infty} V_n(k) = V(k). The true V is in fact a fixed point of the Bellman operator, V = TV.
As mentioned above, the "magic" of dynamic programming is that, under some conditions, V_n will always converge to V as n → ∞, and that there exists only one solution V to the Bellman equation.
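Below is a minimal VFI sketch in Python for the log-utility, full-depreciation case; the grid bounds, grid size, tolerance and parameter values are illustrative assumptions. It iterates the Bellman operator starting from the guess V_0 = 0 and compares the resulting policy with the closed-form solution k′ = αβ k^α from Section 2.1.

import numpy as np

alpha, beta = 0.3, 0.96
grid = np.linspace(1e-3, 0.5, 400)       # capital grid (covers the steady state)
fk = grid ** alpha                       # f(k) = k^alpha with delta = 1

V = np.zeros(grid.size)                  # initial guess V_0(k) = 0
for it in range(2000):
    c = fk[:, None] - grid[None, :]      # c = f(k) - k' for every (k, k') pair
    values = (np.where(c > 0, np.log(np.maximum(c, 1e-12)), -np.inf)
              + beta * V[None, :])
    V_new = values.max(axis=1)           # apply the Bellman operator: V_{n+1} = T V_n
    if np.max(np.abs(V_new - V)) < 1e-8: # sup-norm stopping rule
        break
    V = V_new

g = grid[values.argmax(axis=1)]          # numerical policy g(k)
print(np.max(np.abs(g - alpha * beta * grid ** alpha)))  # error ~ grid spacing

The fact that the iteration converges from the guess V_0 = 0, and that the limit does not depend on the guess, is exactly what the contraction mapping theorem of Section 3 guarantees.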

Guess & Verify example. Let's take the NGM example with log utility and full depreciation. Our Bellman equation is

V(k) = \max_{k' \in [0, k^{\alpha}]} \left\{ \log(k^{\alpha} - k') + \beta V(k') \right\}. \qquad (10)

Let's guess that the value function has the shape V(k) = A + D log(k), where A and D are two constants that we must solve for. Our Bellman equation thus becomes

V(k) = \max_{k' \in [0, k^{\alpha}]} \left\{ \log(k^{\alpha} - k') + \beta \left[ A + D \log(k') \right] \right\}.

Taking the FOC for k′ yields k' = \frac{\beta D}{1 + \beta D} k^{\alpha}, which implies c = \frac{1}{1 + \beta D} k^{\alpha}. So our Bellman equation is now

V(k) = \log(k^{\alpha}) - \log(1 + \beta D) + \beta A + \beta D \log(k^{\alpha}) + \beta D \log\left( \frac{\beta D}{1 + \beta D} \right)

where we have expressed the whole RHS as a function of k and parameters only. Let us now plug our guess into the LHS as well:

A + D \log(k) = \alpha \log(k) - \log(1 + \beta D) + \beta A + \beta D \alpha \log(k) + \beta D \log\left( \frac{\beta D}{1 + \beta D} \right).

For this equality to hold, we need the constant terms on the LHS to equal those on the RHS, and the terms in log(k) on the LHS to equal those on the RHS. Let's start with the latter, which requires D = α(1 + βD), so we get D = \frac{\alpha}{1 - \alpha\beta}. Solving for the constant terms is a bit more tedious, and after a few lines of algebra we get

A = \frac{1}{1-\beta} \left[ \log(1 - \alpha\beta) + \frac{\alpha\beta}{1 - \alpha\beta} \log(\alpha\beta) \right].

We got the values of A and D as functions of parameters only, so we're done! We can write down the policy function too, which is g(k) = \alpha\beta k^{\alpha} (use D = \frac{\alpha}{1-\alpha\beta} so that \frac{\beta D}{1+\beta D} = \alpha\beta).
To conclude, let's check that the solution we found satisfies the transversality condition:

\lim_{t \to \infty} \beta^t u'(c_t) k_{t+1} = \lim_{t \to \infty} \beta^t \frac{1}{(1-\alpha\beta) k_t^{\alpha}} \, \alpha\beta k_t^{\alpha} = \lim_{t \to \infty} \beta^t \frac{\alpha\beta}{1-\alpha\beta} = 0.

3 Maths Preliminaries
As anticipated, we will now study the properties of the Bellman equation (BE) and find out when VFI will yield a solution, when such a solution is unique, and why that is so. Recall that the BE in its general form⁴ is

v(x) = \sup_{x' \in \Gamma(x)} \left\{ F(x, x') + \beta v(x') \right\}. \qquad \text{(BE)}

⁴ We now use a lowercase v to denote the value function.

In other words, the Bellman operator is

(Tv)(x) = \sup_{x' \in \Gamma(x)} \left\{ F(x, x') + \beta v(x') \right\}. \qquad \text{(BO)}

The questions we will ask are:
• When is (BE) well defined?
• When does (BE) have a solution?
• A solution to (BE) is a fixed point of the operator (BO): is Tv of the same "type" as v?
• When is the solution unique?
• How do we find the solution? Is it always the case that Tⁿv → v?
• Are (BE) and the sequential problem (SP) the same? Does v coincide with the solution of (SP)?
First, let us introduce some maths concepts. What we will do here will be an application of what you have seen with Juan Pablo.

3.1 Metric spaces.

Definition 1 (Metric). Pick any set X. A metric (or distance) on the set X is a function d : X × X → R₊ such that for all x, y, z ∈ X
1. d(x, y) ≥ 0
2. d(x, y) = 0 if and only if x = y
3. d(x, y) = d(y, x)
4. d(x, z) ≤ d(x, y) + d(y, z) (this is known as the triangle inequality)

Definition 2 (Metric Space). A set-distance pair (X, d) is called a metric space.

Examples of metric spaces:

• X = R and d(x, y) = |x − y|
• X = R² and d((x_1, y_1), (x_2, y_2)) = [(x_1 − x_2)² + (y_1 − y_2)²]^{1/2} (this distance is called d_2, the 2-norm, or the Euclidean/Pythagorean distance)
• X = R^k and d(x, y) = \left[ \sum_{i=1}^{k} |x_i − y_i|^n \right]^{1/n} (this distance is called d_n or the n-norm)
• X = R^k and d(x, y) = max(|x_1 − y_1|, ..., |x_k − y_k|) (this distance is called d_∞ or the sup-norm)
Metric spaces need not just be sets of points with distances between points. One can also have spaces of functions and metrics between functions.
Some examples of sets of functions are
• C(X), the set of all continuous functions;
• B(X), the set of all bounded functions⁵;
• {f : X → R s.t. f bounded and continuous}, the set of all bounded and continuous functions.
Some examples of metrics on sets of functions are:
• d_n(f, g) = \left[ \int_X [f(x) − g(x)]^n \, dx \right]^{1/n} (when n = 1, this is the area between the graphs of the two functions)
• d_∞(f, g) = \max_{x \in X} |f(x) − g(x)|

⁵ A function f : X → R is bounded if there exists some M ∈ R such that |f(x)| ≤ M for all x ∈ X.

3.2 Completeness and convergence.

Definition 3 (Convergence). A sequence \{x_n\}_{n=0}^{\infty} in a metric space X converges to a limit x ∈ X if for any ε > 0, there exists a number N_ε such that d(x_n, x) < ε for all n ≥ N_ε.

Definition 4 (Cauchy Sequence). A sequence \{x_n\}_{n=0}^{\infty} is a Cauchy sequence if for any ε > 0, there exists a number N_ε such that d(x_n, x_m) < ε for all n, m ≥ N_ε.

Both definitions basically ask that the elements of a sequence stay close to each other after a certain number of steps. A convergent sequence is always Cauchy, but a Cauchy sequence is not necessarily a convergent one. For example, consider x_n = 1/n in the metric space ((0, 1), |·|). The sequence is Cauchy in (0, 1), but it does not converge to any point inside the interval. This however is a somewhat fine point, and henceforth we will focus on spaces where the converse is also true.

Definition 5 (Completeness). A metric space (X, d) is complete if every Cauchy sequence \{x_n\}_{n=0}^{\infty} with x_n ∈ X for all n converges to some x ∈ X.

Some examples of metric spaces (X, d) that are (or are not) complete, taken from exercise 3.6 of Stokey, Lucas with Prescott (SLP) and its solutions by Irigoyen, Rossi-Hansberg and Wright:
• X is the set of all integers, d(x, y) = |x − y|.
The metric space is complete. Take a Cauchy sequence with x_n ∈ X for all n. Choose ε ∈ (0, 1). The sequence being Cauchy, there exists N_ε such that |x_n − x_m| < ε < 1 for all n, m ≥ N_ε. Hence, x_n = x_m = x ∈ X for all n, m ≥ N_ε.
• X is the set of integers, d(x, y) = 1(x ≠ y).
The metric space is complete. The reasoning is the same as above: we can pick an ε ∈ (0, 1) such that x_n = x_m = x ∈ X for all n, m ≥ N_ε.
• X is the set of functions that are C([a, b]) and strictly increasing, d(f, g) = \max_{a \leq x \leq b} |f(x) − g(x)|.
The metric space is not complete. Consider f_n(x) = 1 + x/n. This is a Cauchy sequence because f_n(x) and f_m(x) get arbitrarily close as n, m grow. But f_n → f as n → ∞ with f(x) = 1, which is not strictly increasing, and thus f ∉ X.
• X is the set of functions that are C([a, b]), with d(f, g) = \int_a^b |f(x) − g(x)| \, dx.
The metric space is not complete. Take f_n(x) = \left( \frac{x-a}{b-a} \right)^n. This is a Cauchy sequence whose limit is f(x) = 1 if x = b and f(x) = 0 if x ∈ [a, b), which is clearly not a continuous function, so f ∉ X.
• X is the set of functions that are C([a, b]), with d(f, g) = \sup_{a \leq x \leq b} |f(x) − g(x)|.
This metric space is complete. The reason why the previous counterexample does not apply to this case is that sequences of the form f_n(x) = x^n (take [a, b] = [0, 1]) are not Cauchy under the sup-norm metric. Why? For the sequence to be Cauchy, we need that for all ε we can find an N_ε such that \sup_x |x^n − x^m| < ε for all n, m ≥ N_ε. Under the sup-norm, no N works uniformly in x, so the sequence is not Cauchy.
The formal reasoning is the following. Suppose the N that satisfies the definition exists. Then it must be that \sup_x |x^N − x^m| < ε for all m ≥ N. Now note that x^N for a fixed N is continuous⁶, so for any η we can find a δ such that (take the definition of continuity and let x = 1) |f(1) − f(x')| = |1 − (x')^N| < η for |1 − x'| < δ. In other words, if x' > 1 − δ then (x')^N > 1 − η. Now, note that for x' ∈ (1 − δ, 1) there exists an m large enough that (x')^m < η. So at x' we have that

|(x')^N − (x')^m| = (x')^N − (x')^m > 1 − 2η.

Let 2η = 1 − ε. We finally get

\sup_x |x^N − x^m| \geq |(x')^N − (x')^m| \geq ε

which contradicts the definition of a Cauchy sequence.

⁶ A function f(x) is continuous at x if for any ε > 0 we can find a δ > 0 such that |f(x) − f(x')| < ε for |x − x'| < δ.
It will be useful to know that R^n, for any n, always forms a complete metric space (e.g. with the Euclidean metric).

Definition 6. Let (X, d) be a metric space.
• A set A ⊂ X is closed if a_n ∈ A for all n and a_n → a imply that a ∈ A.
• A set A ⊂ X is bounded if there exists a D such that d(a, a') ≤ D for all a, a' ∈ A.
• A set A ⊂ X is compact if it is closed and bounded.

The following is a useful theorem that we will use going forward.

Theorem 1 (Completeness of continuous bounded functions (SLP Theorem 3.1)). Let X ⊆ R^n, let C(X) denote the set of bounded and continuous functions f : X → R, and let d(f, g) = \sup_{x \in X} |f(x) − g(x)|. Then (C(X), d) is a complete metric space.

To clarify, completeness requires that every Cauchy sequence is convergent, and it is a property of metric spaces. Closedness requires that a set contains all its limit points, and it is a property of a set within a metric space. A metric space (X, d) can fail to be complete while X is a closed set within a (weakly) larger space (Y, d). One example is the metric space (X, d) = ((0, 1], |x − y|). This is not a complete metric space because 1/n is a Cauchy sequence but converges to 0, which does not belong to X. Within this metric space, the set (0, 1] is closed because it contains all its limit points within the space considered: 0 does not count as a missing limit point because it does not exist in X. Another example is the metric space (X, d) = (Q, |x − y|), where Q is the set of rational numbers. The set Q is closed within this space, because all sequences of rational numbers are rational, and irrational numbers don't exist in this space. By the opposite logic, Q is not closed within the space (R, |x − y|). And Q is not complete: just take any Cauchy sequence that converges to √2.

3.3 Contractions and fixed points.

Definition 7 (Contraction Mapping). Let (X, d) be a metric space. The operator T : X → X is a contraction mapping with contraction parameter (or modulus) β < 1 if and only if

d(T(x), T(y)) \leq \beta \, d(x, y)

for any pair (x, y) ∈ X × X.

Note that this definition also applies to sets of functions.

Let us look at some examples.
• T(x) = 0 is a contraction, since d(T(x), T(y)) = 0 while βd(x, y) ≥ 0 for any (x, y) ∈ X.
• Consider T : [a, b] → [a, b] where T is continuous, differentiable and with a slope that is uniformly less than β, i.e. \sup_{x \in (a,b)} |T'(x)| ≤ β < 1. Then T is a contraction. Why? By the mean-value theorem, if f is continuous and differentiable there exists an x ∈ (a, b) such that f'(x) = \frac{f(b) - f(a)}{b - a}. Then

|T(x) - T(y)| \leq \sup_{z \in (a,b)} |T'(z)| \, |x - y| \leq \beta |x - y|

and T satisfies the definition of a contraction.

Definition 8 (Fixed Point). A fixed point of a mapping T : X → X is some element x ∈ X such that T(x) = x.

Again, X can be a set of functions, in which case x will be some function in the set. Some examples of fixed points of functional equations were given in Section 2.2.1.

Example 2. Let us look at an example now. Consider the functional equation

v(x) = \sup_{y \in [0, 2x]} \left\{ y - 2x + \beta v(y) \right\}

where β ∈ (1/2, 1) (it will be clear soon why we make this assumption). Let us set up the problem with the feasible set constraints (with Lagrange multipliers λ and μ):

v(x) = \max_{y} \left\{ y - 2x + \beta v(y) + \lambda (y - 0) + \mu (2x - y) \right\}.

Assuming that v is differentiable, let us differentiate with respect to y and write down the FOC

1 + \beta v'(y) + \lambda - \mu = 0.

To find v'(x) we typically apply the envelope condition, but (as we will see formally later) the envelope theorem only applies to cases where the optimal policy is interior, which may not be the case here, so let us be careful and proceed in steps.
First, suppose that the optimal policy is indeed interior (i.e., the choice of y is not constrained by the feasible set [0, 2x]). The envelope condition is

v'(x) = -2 + 2\mu.

Since the FOC is independent of the state variable x, the policy must be interior at all states, so all Lagrange multipliers are zero, and the left-hand side of the FOC becomes 1 − 2β, which is strictly negative because of what we assumed for β. We have thus verified that the optimal policy cannot be interior.
Second, suppose that the optimal policy is at the left corner, i.e. y = 0 for any x, in which case the functional equation becomes

v(x) = -2x + \beta v(0).

Consider x = 0 (we did not restrict the domain of the state variable x, so we assume x ∈ R):

v(0) = \beta v(0)

which shows us that any such solution must satisfy v(0) = 0. We now know that v(0) = 0 and that v'(x) = −2, so we have enough to find one solution: v(x) = −2x.
Third, suppose that the optimal policy is at the other corner, y = 2x for all x. Then v(x) = βv(2x), whose only solution is a constant function independent of x. Since at x = 0 we still have that v(0) = βv(0), we have found another solution, which is v(x) = 0 for all x. Note that, had you "believed" the envelope condition we derived earlier, you would have only obtained the first solution and not this second one.
Finally, the last possibility is that the optimal policy is any value in the feasible set y ∈ [0, 2x], which is only possible if v'(y) = −1/β. Then let's guess v(x) = A − x/β for some constant A and see if we can find a third solution. We have

A - x/\beta = y - 2x + \beta (A - y/\beta).

Simplifying (the terms in y cancel), we get

A(1 - \beta) = x(1/\beta - 2)

which must hold for all values of x. This would be a solution only if A = 0 and β = 1/2, which is however ruled out by our restriction on β.

This was an example of a functional equation with multiple fixed points. We will come back to
this example to show why one of the two fixed points is “better” than the other.

Example 3. One more example. Consider the operator

(Tv)(x) = \sup_{x' \in \mathbb{R}} \left\{ \frac{x^2}{2} - x' + \beta v(x') \right\}.

Our functional equation is

v(x) = (Tv)(x) = \sup_{x' \in \mathbb{R}} \left\{ \frac{x^2}{2} - x' + \beta v(x') \right\}.

The FOC is −1 + βv'(x') = 0. Here we guess the solution is interior, so we can write down the envelope condition v'(x) = x; combined with the FOC, the optimal policy is x' = 1/β for any x. Plug it into our FE:

v(x) = \frac{x^2}{2} - \frac{1}{\beta} + \beta v(1/\beta).

This FE must hold for any value of x, so let us look at the case where x = 1/β:

v(1/\beta) = \frac{(1/\beta)^2}{2} - \frac{1}{\beta} + \beta v(1/\beta)

which gives v(1/\beta) = \frac{\frac{1}{\beta}\left(\frac{1}{2\beta} - 1\right)}{1-\beta}. We found the value of v(1/β)! But we still don't know the value of v(x) when x ≠ 1/β. It is

v(x) = \frac{x^2}{2} - \frac{1}{\beta} + \beta \, \frac{\frac{1}{\beta}\left(\frac{1}{2\beta} - 1\right)}{1-\beta}.

We now have all the elements to state the contraction mapping theorem.

Theorem 2 (Contraction Mapping Theorem, SLP Theorem 3.2). If
• (X, d) is a complete metric space,
• T : X → X is a contraction mapping with parameter β,
then
• T has exactly one fixed point v ∈ X (i.e. v = Tv) (this is the "existence & uniqueness" part of the theorem);
• for any v_0 ∈ X, we have that d(Tⁿv_0, v) ≤ βⁿ d(v_0, v) for any n = 0, 1, ... (this is the "convergence everywhere" part of the theorem).

This is a central theorem in dynamic programming and recursive methods in general, so we will go through the proof.

Proof. First, we prove the existence of a fixed point. Let \{v_n\}_{n=0}^{\infty} be defined by v_{n+1} = Tv_n. We know that

d(v_{n+1}, v_n) = d(Tv_n, Tv_{n-1}) \leq \beta d(v_n, v_{n-1}) = \beta d(Tv_{n-1}, Tv_{n-2})

where the inequality comes from the properties of a contraction. We can keep following the backwards iteration to get to

d(v_{n+1}, v_n) \leq \beta^n d(v_1, v_0) \qquad \text{for } n = 0, 1, \dots

Now let us verify that \{v_n\} is a Cauchy sequence. For m > n, first

d(v_m, v_n) \leq d(v_m, v_{m-1}) + d(v_{m-1}, v_n)

by the triangle inequality. We can keep going for all the indices between n and m:

d(v_m, v_n) \leq d(v_m, v_{m-1}) + d(v_{m-1}, v_{m-2}) + \dots + d(v_{n+1}, v_n)

and, using the result from above,

d(v_m, v_n) \leq \beta^{m-1} d(v_1, v_0) + \beta^{m-2} d(v_1, v_0) + \dots + \beta^n d(v_1, v_0)
= \beta^n \left( \beta^{m-1-n} + \dots + \beta + 1 \right) d(v_1, v_0)
\leq \frac{\beta^n}{1-\beta} \, d(v_1, v_0)

where the last line comes from the properties of geometric sums, and \frac{\beta^n}{1-\beta} d(v_1, v_0) \to 0 as n \to \infty. So we have a Cauchy sequence, because we can always pick an n that makes d(v_m, v_n) as small as we want. Given that (X, d) is a complete metric space by assumption, every Cauchy sequence has a limit inside X, that is, \{v_n\}_{n=0}^{\infty} \to v \in X. That is, the completeness assumption gives us the existence of v inside of X.
Now, let's show that the limit v is also a fixed point of the operator T:

d(Tv, v) \leq d(Tv, T^n v_0) + d(T^n v_0, v) \leq \beta d(v, T^{n-1} v_0) + d(T^n v_0, v) \to 0 \text{ as } n \to \infty.

The first inequality comes from the triangle inequality, the second comes from the properties of a contraction, and the final limit is what we proved in the previous paragraph. We have thus proved the existence of a fixed point.
Second, let's prove that the fixed point of T is unique. We do this by contradiction. Suppose there exists a v̂ such that Tv̂ = v̂ and v̂ ≠ v. Then it must be that d(v̂, v) = a > 0 for some a, and

a = d(v̂, v) = d(Tv̂, Tv) \leq \beta d(v̂, v) = \beta a < a

which is a contradiction. It must thus be that v is the unique fixed point of T.
Third, let's prove convergence everywhere. We do this by induction. For any v_0, the initial step

d(T^0 v_0, v) = d(v_0, v) \leq \beta^0 d(v_0, v)

holds trivially. Suppose the n-th step

d(T^n v_0, v) \leq \beta^n d(v_0, v)

holds; since d(T^{n+1} v_0, v) = d(T^{n+1} v_0, Tv) \leq \beta d(T^n v_0, v), then

d(T^{n+1} v_0, v) \leq \beta^{n+1} d(v_0, v)

which completes our proof. ■


So, to apply the Contraction Mapping Theorem (CMT) we need a complete metric space and a contraction operator. The former is typically available in the environments we consider, although not always, and we've seen that checking for it can be tough. The latter instead is something that must be verified case by case, since our Bellman operator will depend on the economic problem at hand. We can however find some conditions that are sufficient for an operator to be a contraction.
Theorem 3 (Blackwell Sufficient Conditions, SLP Theorem 3.3). Let
• X ⊆ R^L,
• B(X) denote the set of bounded functions f : X → R,
• d(f, g) = \sup_{x \in X} |f(x) − g(x)|.
If T : B(X) → B(X) is such that
1. for any f, g ∈ B(X) such that f(x) ≤ g(x) for all x ∈ X, it holds that (Tf)(x) ≤ (Tg)(x) for all x ∈ X (monotonicity);
2. there exists a β ∈ (0, 1) such that T(f + a)(x) ≤ (Tf)(x) + βa for any f ∈ B(X), a ≥ 0 and x ∈ X (discounting);
then T is a contraction with parameter β.

Proof. See SLP (Theorem 3.3). ■

Let us check whether the Bellman operator for the NGM satisfies the Blackwell conditions and is thus a contraction. Let's pick the required metric space, composed of the set of bounded functions on X = [0, ∞) and the sup-norm. Our operator is, once again,

(Tv)(k) = \max_{k' \in \Gamma(k)} \left\{ u(f(k) - k') + \beta v(k') \right\}

where we assume that u is a bounded function. First, let's show that T maps the space into itself. Since we assume that both u and v are bounded, Tv must be bounded, as it is the maximum of the sum of two bounded functions. Second, consider some function w(k) ≥ v(k) for all k. Then

(Tw)(k) = \max_{k' \in \Gamma(k)} \left\{ u(f(k) - k') + \beta w(k') \right\} \geq \max_{k' \in \Gamma(k)} \left\{ u(f(k) - k') + \beta v(k') \right\} = (Tv)(k)

which proves monotonicity. Third, let's check discounting:

T(v + a)(k) = \max_{k' \in \Gamma(k)} \left\{ u(f(k) - k') + \beta [v(k') + a] \right\} = \max_{k' \in \Gamma(k)} \left\{ u(f(k) - k') + \beta v(k') \right\} + \beta a = (Tv)(k) + \beta a

so the discounting property is satisfied. It follows that the Bellman operator associated with the NGM is a contraction, and therefore there exists a unique solution, which can be obtained by iterating on the operator from any initial guess. We'll see that this is true in both the analytical and computational parts of problem set 2.
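As a purely illustrative sanity check (functional forms and parameters are assumptions for the example, and on a finite grid everything is bounded, so the check makes sense), one can verify monotonicity and discounting numerically by applying a grid version of T to an arbitrary bounded v, to a function above it, and to v + a:

import numpy as np

alpha, beta = 0.3, 0.96
grid = np.linspace(1e-3, 0.5, 200)

def T(v):
    # Grid version of the NGM Bellman operator (log utility, delta = 1)
    c = grid[:, None] ** alpha - grid[None, :]
    vals = np.where(c > 0, np.log(np.maximum(c, 1e-12)), -np.inf) + beta * v[None, :]
    return vals.max(axis=1)

v = np.sin(grid)      # an arbitrary bounded function on the grid
w = v + 0.5           # w >= v pointwise
a = 2.0
print(np.all(T(w) >= T(v)))                     # monotonicity: True
print(np.allclose(T(v + a), T(v) + beta * a))   # discounting (with equality here): True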

Example 4 (Exercise 137 of Guner's notes). Let X = (1, ∞), d = |x − y| and f : X → R be the function given by

f(x) = \frac{1}{2}\left( x + \frac{a}{x} \right).

Show that f is a contraction for a ∈ (1, 3).
1. First, let's show that f maps X into itself. The first derivative is f'(x) = \frac{1}{2}\left( 1 - \frac{a}{x^2} \right), which has a stationary point f'(x^*) = 0 at x^* = \sqrt{a}. Since f''(x) = a/x^3 > 0, x^* is the unique global minimum, and f(x^*) = \sqrt{a}. Given that a > 1, we have \sqrt{a} > 1, so we have proved that f(x) ∈ (1, ∞) for any x > 0 (even though we just needed to prove it for x ∈ (1, ∞)).
2. Second, let's check that f is a contraction using the definition itself: we need to show that

d(f(x), f(y)) \leq \beta \, d(x, y).

Consider the LHS:

\left| \frac{1}{2}\left( x + \frac{a}{x} \right) - \frac{1}{2}\left( y + \frac{a}{y} \right) \right| = \frac{1}{2}\left| (x - y)\left( 1 - \frac{a}{xy} \right) \right| = \frac{1}{2}\left| 1 - \frac{a}{xy} \right| |x - y|

so going back to the inequality, we need \frac{1}{2}\left| 1 - \frac{a}{xy} \right| \leq \beta for some β < 1. Since x, y > 1 we have xy > 1, so a/(xy) ∈ (0, a) and 1 − a/(xy) ∈ (1 − a, 1). Hence

\frac{1}{2}\left| 1 - \frac{a}{xy} \right| \leq \frac{1}{2}\max(a - 1, 1) < 1 \qquad \text{for } a \in (1, 3)

(the bound is only approached as a → 3 and xy → 1, where \frac{1}{2}|1 − 3| = 1), which proves that for all a ∈ (1, 3) we have d(f(x), f(y)) ≤ βd(x, y) with modulus β = max(a − 1, 1)/2 < 1.
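Incidentally, the map in this exercise is the Babylonian (Newton) iteration for √a: its unique fixed point solves x = (x + a/x)/2, i.e. x = √a. By the CMT, iterating f from any starting point in X converges to it; here is a quick Python check (the values of a and the starting point are arbitrary):

a, x = 2.5, 1.7           # any a in (1, 3) and any starting x in (1, inf)
for _ in range(8):
    x = 0.5 * (x + a / x) # f(x) = (x + a/x)/2
print(x, a ** 0.5)        # both approximately 1.5811388...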

3.4 Theorem of the Maximum

We will now ask what we can say about the properties of the value function v and its associated policy function g, being as general as possible. We will focus on a particular type of operator, the Bellman operator, in general form

v(x) = \sup_{y \in \Gamma(x)} \left\{ F(x, y) + \beta v(y) \right\}.

Going forward, let us define

f(x, y) := F(x, y) + \beta v(y). \qquad (11)

To describe the feasible set, we use correspondences: Γ(x) maps each element of the set X into some subset of a set Y of feasible choices. First, we want to show when T is a self-mapping, e.g. that Tf is continuous when f is continuous. Let's define some properties of Γ which we will need later.

Definition 9. A correspondence Γ : X ⇒ Y is
• compact-valued if Γ(x) is a compact subset of Y for all x ∈ X;
• closed-valued if Γ(x) is a closed subset of Y for all x ∈ X;
• convex-valued if Γ(x) is a convex subset of Y for all x ∈ X.

Note that convex sets can be open, closed sets can be non-convex, and compact sets are closed and bounded.
We call the graph of the correspondence the set A = {(x, y) : y ∈ Γ(x)}; in words, the set of state-choice pairs such that the choice y is feasible given the state x.
We will need correspondences to satisfy some notion of continuity. We consider two types of
continuity for correspondences.

Definition 10. A correspondence Γ : X ⇒ Y is lower hemi-continuous (LHC) at x if
• Γ(x) is non-empty;
• for every y ∈ Γ(x) and every sequence x_n → x, there exist a number N and a sequence \{y_n\}_{n=N}^{\infty} such that y_n → y and y_n ∈ Γ(x_n) for all n ≥ N.

In words, a correspondence is LHC if any feasible choice y ∈ Γ(x) can be reached as the limit of some sequence y_n ∈ Γ(x_n) that is also feasible. LHC fails if a limit point y belongs to the correspondence but no feasible sequence converges to it.

Definition 11. A correspondence Γ : X ⇒ Y is upper hemi-continuous (UHC) at x if
• Γ(x) is non-empty;
• for every sequence x_n → x, every sequence y_n ∈ Γ(x_n) has a convergent subsequence with limit y ∈ Γ(x).

In words, a correspondence is UHC if any sequence y_n ∈ Γ(x_n) converges (along a subsequence) to some y ∈ Γ(x). UHC fails if a sequence belongs to the correspondence but its limit does not.

Definition 12. A correspondence is continuous if it is both UHC and LHC.

Please refer to your maths notes for some examples of LHC and UHC. Guner’s notes contain
two graphs (Fig. 20, 21) that provide some intuition.
We can now go back to our Bellman operator. When does the max of f (x, y) exist? If f is
continuous in y and Γ is non-empty and compact-valued, then maxy∈Γ(x) f (x, y) exists and we
need not use the sup notation. It follows that a general version of the policy function can be
written as
G(x) = arg max_{y∈Γ(x)} f(x, y) = {y ∈ Γ(x) : f(x, y) = v(x)}.

We can now say more about the properties of v and G.

Theorem 4 (Theorem of the Maximum (SLP Theorem 3.6)). Let


• X ⊆ R^L, Y ⊆ R^M
• f : X × Y → R be a continuous function
• Γ : X ⇒ Y be a compact-valued and continuous correspondence.
Then
• v : X → R is a continuous function
• G : X ⇒ Y is a non-empty, compact-valued and UHC correspondence.

Proof. See SLP. ■

This theorem has such a name because it essentially states that the maximum of the function f(x, y) (defined in (11)), i.e. max_{y∈Γ(x)} f(x, y), both exists and is a continuous function of the state variable x. Loosely speaking, existence of the maximum is due to the fact that we are maximising a continuous function on a compact set; continuity of the maximum (i.e. of the value function) comes from the fact that Γ is continuous, so the maximiser and the maximised value vary continuously with the state.

Corollary 1 (Convex Corollary). If additionally


• f is strictly concave in y for all x
• Γ is convex-valued
then G : X ⇒ Y is a single-valued and continuous function.

Let’s look at our NGM example:


• f(x, y) = u(f(k) − y) + βv(y), where u is continuous and f(k) − y = k^α + (1 − δ)k − y is continuous, so if v is continuous then T v must also be so.
• Γ(k) = [0, f(k)], where both 0 and f(k) are continuous functions of k, which implies that Γ is a continuous, non-empty and compact-valued correspondence.

4 Dynamic Programming

4.1 Principle of Optimality

The last part of our maths work concerns the equivalence between the sequential and recursive
formulations of the problem. The sequential problem in general form is

V*(x0) = sup_{{x_{t+1}}_{t=0}^∞} Σ_{t=0}^∞ β^t F(x_t, x_{t+1})    (SP)
    s.t. x_{t+1} ∈ Γ(x_t)
         x_0 given.

We will refer to V ∗ as the supremum function.


Our recursive problem instead is given by our functional equation

v(x) = sup_{x′∈Γ(x)} {F(x, x′) + βv(x′)}.    (FE)

We want to show the following:


• V ∗ solves (FE) (SP ⇒ F E)
• v evaluated at x0 solves (SP) (F E ⇒ SP )
• the solution to (FE) exists (proved it already)
• the sequence {x*_{t+1}}_{t=0}^∞ attains the maximum in (SP) if it satisfies

v(x*_t) = F(x*_t, x*_{t+1}) + βv(x*_{t+1}).

Some notation:
• x̃ := {x_{t+1}}_{t=0}^∞ denotes a plan, which will be feasible if x_{t+1} ∈ Γ(x_t) for all t ≥ 0;
• Π(x0) = {{x_{t+1}}_{t=0}^∞ : x_{t+1} ∈ Γ(x_t) ∀t ≥ 0} is the set of all feasible plans;
• for some feasible plan x̃, we define

u_n(x̃) = Σ_{t=0}^n β^t F(x_t, x_{t+1})
u(x̃) = Σ_{t=0}^∞ β^t F(x_t, x_{t+1}) = lim_{n→∞} u_n(x̃).

Assumptions we'll make:
• (A1): Γ(x) is non-empty for all x ∈ X
• (A2): u(x̃) = lim_{n→∞} Σ_{t=0}^n β^t F(x_t, x_{t+1}) exists for all x0 ∈ X and x̃ ∈ Π(x0), even though it may be plus or minus infinity.


For assumption A2 to hold, it is sufficient that F is bounded from one side, above or below.
These assumptions imply that the set of feasible plans is non-empty and that u(x̃) is well
defined, and so is the (SP) problem. Specifically, V*(x0) = sup_{x̃∈Π(x0)} u(x̃) will be uniquely defined, although there may exist multiple x̃ that attain it.
Our interest is in connecting the supremum function V ∗ and the solution(s) v to (FE). It is
important to remember that while V ∗ is always uniquely defined, we know much less about the
solutions to (FE), which may very well be zero, one or many.
Before we move to the theorem and its proof, it will be useful to be more precise on the
meaning of the statements “to satisfy” (FE) and (SP). We refer the reader to SLP for alternative
conditions for the case where V ∗ or v are not bounded.
• V ∗ (x0 ) is the supremum function if it satisfies

V ∗ (x0 ) ≥ u(x̃) for all x̃ ∈ Π(x0 ) (SP1)

V ∗ (x0 ) ≤ u(x̃) + ϵ for some x̃ ∈ Π(x0 ), any ϵ > 0. (SP2)

These conditions sound quite obvious but will be needed for the proof of the Principle of Optimality. To see what they mean in words, think of V*(x0) as some value which is not necessarily related to u(x̃). Such a value is the supremum if it is weakly larger than the value of every feasible sequence (SP1), and if it is approached arbitrarily closely by some feasible sequence (SP2).
• v satisfies (FE) if it satisfies

v(x0 ) ≥ F (x0 , y) + βv(y) for all y ∈ Γ(x0 ) (FE1)

and if
v(x0 ) ≤ F (x0 , y) + βv(y) + ϵ for some y ∈ Γ(x0 ), any ϵ > 0. (FE2)

The intuitive explanation is the same as that of the previous bullet point.

Theorem 5 (Principle of optimality (SLP Theorems 4.2-4.3)). Under assumptions (A1), (A2),

the following statements hold
• (FE ⇐ SP) the supremum function V*(x0) = sup_{x̃∈Π(x0)} u(x̃) satisfies (FE)
• (FE ⇒ SP) if v satisfies (FE) and

lim_{n→∞} β^n v(x_n) = 0 for all x0 ∈ X and all x̃ ∈ Π(x0)    (12)

then v = V ∗ .

Proof. The direction (F E ⇐ SP ) is not our primary objective so please refer to SLP Theorem
4.2 for its proof.
Let us prove the second statement, i.e. direction (F E ⇒ SP ), in two parts.
(FE1) ⇒ (SP1). If v satisfies (FE), then (FE1) holds. That implies

v(x0) ≥ F(x0, x1) + βv(x1) for all x1 ∈ Γ(x0)
      ≥ F(x0, x1) + βF(x1, x2) + β²v(x2) for all x1 ∈ Γ(x0), x2 ∈ Γ(x1)
      ...
      ≥ u_n(x̃) + β^{n+1} v(x_{n+1}) for any feasible x̃ and for all n ≥ 1.

Taking the limit for n → ∞ and using (12) we get

v(x0 ) ≥ u(x̃) for all x0 , x̃ ∈ Π(x0 )

which is condition (SP1).


(FE2) ⇒ (SP2). If v satisfies (FE), then (FE2) holds. That implies that, for a sequence of positive numbers {ϵ_t}, we can pick x_{t+1} ∈ Γ(x_t) such that

v(x_t) ≤ F(x_t, x_{t+1}) + βv(x_{t+1}) + ϵ_t    ∀t ≥ 0.

Starting from t = 0 and iterating forward for the resulting plan x̃

v(x0) ≤ Σ_{t=0}^n β^t F(x_t, x_{t+1}) + β^{n+1} v(x_{n+1}) + Σ_{t=0}^∞ β^t ϵ_t
      ≤ u_n(x̃) + β^{n+1} v(x_{n+1}) + ϵ̄

where ϵ̄ := Σ_{t=0}^∞ β^t ϵ_t can be made arbitrarily small.

Taking the limit for n → ∞, we know that the second-to-last term is equal to zero by (12), and we get

v(x0) ≤ u(x̃) + ϵ̄

which, since ϵ̄ can be chosen arbitrarily small, is condition (SP2). ■

Recall that, when we solved the (SP) using the Lagrangian, the necessary and sufficient condi-
tions for an optimum were the Euler equation and the TVC. In some sense, condition (12) is
for the recursive problem what the TVC is for the sequential problem.
As said before, we have characterised under what conditions the (unique) solution to the SP
problem is equivalent to one particular solution to the FE. Note that in this theorem we have not
said anything about the uniqueness of the solution to FE: there may be many, and Theorem 5
provides a sufficient condition to identify which of the possibly many solutions is related to
the sequential problem. More precisely, there can only be one solution to (FE) that satisfies
(12) (can you prove it?). Also note that condition (12) is a sufficient condition, but it is not
necessary. We now present an example that combines all these observations.

Example 5. Consider a simple problem where the rate of return on storage is equal to the
discount factor and utility is linear. The supremum function is

V*(x0) = sup Σ_{t=0}^∞ β^t c_t
    s.t. c_t + βx_{t+1} ≤ x_t
         c_t ≥ 0 for all t.

Note we are not imposing any no-Ponzi-game condition here, so the budget set is such that x_{t+1} ∈ Γ(x_t) = (−∞, x_t/β]. Given that, pick a plan where x_1 is arbitrarily negative (so that c_0 is arbitrarily large) and x_{t+1} = x_t/β (and c_t = 0) for all t ≥ 1. The constraints are satisfied in all periods, so such plans are feasible and V*(x0) = +∞.
Now consider the corresponding functional equation

v(x) = sup_{x′∈Γ(x)} {x − βx′ + βv(x′)}
    s.t. Γ(x) = (−∞, x/β].

This has two solutions. First, v(x) = +∞ is a valid solution here7 , and it obviously coincides
with the supremum function, which we know because v(x0 ) = V ∗ (x0 ). Note that this solution
for v does not satisfy (12), but that’s fine because that condition is sufficient but not necessary,
as we said before.
Second, ṽ(x) = x is also a solution associated with policy function g(x) = y for any y ∈ Γ(x),
i.e. with any feasible sequence (check it!). This second solution also does not satisfy condition
(12), and on top of that it is clearly different to the supremum function. To see why condition
(12) does not hold here, take the sequence x̃ = {x_t = x_{t−1}/β = β^{−t} x0}_{t=0}^∞. Then

lim_{n→∞} β^n v(x_n) = lim_{n→∞} β^n x_n = lim_{n→∞} β^n β^{−n} x0 = x0 ≠ 0.

So this is an example of a case where there exist multiple solutions to the functional equation,
some of them are not the supremum function V ∗ , and (12) does not help us in seeing which is
which.

Example 6. Consider again an example from earlier where the FE is

v(x) = sup_{y∈[0,2x]} {y − 2x + βv(y)}.

We know that there are two solutions, v(x) = 0 and v(x) = −2x. If we can prove that one of the
two satisfies (12), then we know that will be the one that coincides with the supremum function.
Clearly v(x) = 0 satisfies the condition since limn→∞ β n v(xn ) = 0 for any feasible sequence. We
cannot say the same for the second solution: pick the sequence x_t = 2x_{t−1}, which is feasible. Then

lim_{n→∞} β^n v(x_n) = lim_{n→∞} β^n (−2)(2^n) x0 = lim_{n→∞} (−2)(2β)^n x0 = −∞

given that β ∈ (1/2, 1).
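As a quick numerical sanity check (not in the notes; β = 0.7 and the grids are arbitrary choices), we can verify that both candidate functions are fixed points of the Bellman operator:

import numpy as np

beta = 0.7                               # any beta in (1/2, 1)

def T(v, x):
    # Bellman operator: (Tv)(x) = max over y in [0, 2x] of { y - 2x + beta*v(y) }
    ys = np.linspace(0.0, 2.0 * x, 401)
    return np.max(ys - 2.0 * x + beta * v(ys))

for v, name in [(lambda y: 0.0 * y, "v(x) = 0"), (lambda y: -2.0 * y, "v(x) = -2x")]:
    err = max(abs(T(v, x) - v(np.array([x]))[0]) for x in np.linspace(0.0, 5.0, 101))
    print(name, "max |Tv - v| =", err)   # both errors are (numerically) zero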

We have thus seen two examples where (FE) admits multiple solutions, and additional criteria
are useful to identify which of the solutions is the one that coincides with the supremum func-
tion. Clearly, when the Contraction Mapping Theorem applies, this machinery is not necessary because we always converge to the unique solution to (FE), which must thus coincide with the supremum function. The Principle of Optimality gives us additional tools we can use when the CMT cannot be used.

7 To see that, plug it in as a continuation value, and find that the optimal policy is always x′ = −∞, which implies v(x) = +∞.
We now consider a version of the Principle of Optimality that concerns policy rather than value
functions.

Theorem 6 (Optimal policy (SLP Theorems 4.4-4.5)). Under assumptions (A1), (A2), we
have that

1. if x̄ ∈ Π(x0 ) solves (SP), then

V ∗ (x̄t ) = F (x̄t , x̄t+1 ) + βV ∗ (x̄t+1 ) for all t ≥ 0.

2. if x̂ ∈ Π(x0 ) satisfies the functional equation

V ∗ (x̂t ) = F (x̂t , x̂t+1 ) + βV ∗ (x̂t+1 )

and is such that


lim sup_{t→∞} β^t V*(x̂_t) ≤ 0    (13)

then x̂ solves (SP).

The first part of the theorem says that any optimal plan from the sequential problem is optimal
in the recursive problem, that is, the plan satisfies the functional equation without the max
operator when the supremum function is the value function. The second part says that, given
the “right” value function that solves the functional equation, the associated policy function
generates a plan that solves the sequential problem as long as it satisfies the additional limit
condition.
Again, we’ll focus on the second part of the theorem because we are more interested in that
direction. This theorem may seem redundant: if we have a value function that satisfies (12),
condition (13) automatically holds because (12) must hold for any feasible plan and so it does
also for the optimal plan. The theorem is however not redundant because the converse may not
be true! That is, there may be cases where (13) holds but (12) does not.

Example 7. Consider the following modification of a previous example

V*(x0) = sup Σ_{t=0}^∞ β^t (x_t − βx_{t+1})
    s.t. 0 ≤ x_{t+1} ≤ x_t/β
         x0 given.

The associated FE is
v(x) = max_{0≤x′≤x/β} {x − βx′ + βv(x′)}.

To find the supremum function, iterate the objective function forward

V*(x0) = x0 − βx1 + β(x1 − βx2) + β²(x2 − βx3) + ... = x0 − lim_{t→∞} β^t x_t.

Since now xt ≥ 0, the limit cannot be negative. At the same time, under any optimal sequence
it cannot be positive, because that would mean “dying with positive savings”, which is clearly not
optimal. So we set it equal to zero, and find that V ∗ (x0 ) = x0 is the supremum function.8 As
the first part of Theorem 5 implies, we can see that V ∗ (x0 ) = x0 solves the functional equation
since v(x) = x is a solution of the FE. By the way, note that this solution does not satisfy
condition (12), because we can find a feasible plan (xt = x0 /β t ) where that condition fails. So,
if we did not know the supremum function directly, we couldn’t use the second part of Theorem 5
to conclude that v = V ∗ here.
We can however now apply the second part of Theorem 6 to find that certain plans are optimal,
while some other plans are not. Consider the plan {x̂_t} defined by x̂_1 = x0/β and x̂_t = 0 for all t > 1 (i.e. save everything in period 0 and consume everything in period 1). This plan is feasible, it satisfies (FE)⁹ and condition (13) since

lim_{t→∞} β^t V*(x̂_t) = lim_{t→∞} β^t V*(0) = 0.

We can thus conclude that by Theorem 6 the plan x̂_t solves (SP).

8 Note that without the no-Ponzi-game condition, we could not do this derivation, because lim_{t→∞} β^t x_t is not bounded below and is optimally set to go to −∞.
9 To prove it, look at the case t = 0 and at the case t > 0.

There are many other plans that are also optimal (which suggests that there is no unique solution here!): in fact, all plans such that all of x0 is consumed in finite time. But not all plans are optimal, and condition (13) is useful to detect them. Pick again x̂_t = x0/β^t, which is a plan where x0 is all saved and never consumed: it does satisfy (FE) but it does not satisfy (13) since

lim sup_{t→∞} β^t V*(x̂_t) = lim sup_{t→∞} β^t β^{−t} x0 = x0 > 0.

Hence we cannot claim that x̂_t is optimal.

4.2 Bounded returns

We now consider the subset of problems where the return function F (x, y) is bounded. This will
allow us to say more about the value function, the optimal policy and the Bellman operator.
We’ll make the following additional assumptions
• (B1): X is a convex subset of R^L and Γ : X ⇒ X is non-empty, compact-valued and continuous
• (B2): F : A → R (where again A = {(x, x′) : x ∈ X, x′ ∈ Γ(x)}) is bounded and continuous, and β ∈ (0, 1).
First, these assumptions imply that assumptions (A1) and (A2) are satisfied. Second, they
suggest that we will be looking for v in the space of bounded and continuous functions on X (let us call it C_B(X)) under the sup-norm metric. Let T be the Bellman operator on C_B(X) defined by

T v(x) = max_{x′∈Γ(x)} {F(x, x′) + βv(x′)}

and recall that the policy correspondence is defined as

G(x) = {y ∈ Γ(x) : v(x) = F(x, y) + βv(y)}.

We have the following theorem.

Theorem 7 (Bounded returns (SLP Theorem 4.6)). Under assumptions (B1) and (B2) the
Bellman operator T is such that

1. it maps C_B(X) into itself
2. it has a unique fixed point, i.e. ∃! v ∈ C_B(X) : T v = v
3. for all v0 ∈ C_B(X) we have that d(T^n v0, v) ≤ β^n d(v0, v)
4. the policy correspondence G associated with v is non-empty, compact-valued and UHC.

Proof (Sketch). The proof uses a lot of the results we have used so far. To prove 1), we must
show that T maps into itself because it preserves boundedness and continuity: the former is
straightforward, the latter follows from the Theorem of the Maximum. To prove 2) and 3), we
have to show that C B (X) with the sup-norm is a complete metric space, and that T satisfies the
Blackwell sufficient conditions. It then follows that T is a contraction, and so the fixed point
v = T v exists, is unique and we converge to it from any v0 ∈ C B (X). To prove 4), we use again
the Theorem of the Maximum. ■

Let us once again use our NGM example

v(k) = max_{k′∈[0,f(k)]} {u(f(k) − k′) + βv(k′)}

to check whether the assumptions of the theorem are satisfied:


• Let k̄ denote the highest maintainable capital stock, i.e. the level of capital at which, even setting c = 0, it is not possible to increase the stock: k̄ = f(k̄). If we let X = [0, k̄], then X is a convex subset of R.
• Γ(k) = [0, k^α + (1 − δ)k] is non-empty, compact-valued and continuous.
• For any utility function u that is continuous on [0, f(k̄)], u(c) is bounded because u(0) ≤ u(f(k) − k′) ≤ u(f(k̄)).
• β ∈ (0, 1).
So both assumptions (B1) and (B2) are satisfied and Theorem 7 applies.
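To see Theorem 7 at work, here is a minimal value function iteration sketch for this example. The parameter values, the grid and the bounded utility u(c) = √c are illustrative assumptions, not part of the notes:

import numpy as np

alpha, beta, delta = 0.3, 0.95, 0.1
f = lambda k: k**alpha + (1.0 - delta) * k

kgrid = np.linspace(1e-6, 5.0, 300)      # grid covering the steady state
c = f(kgrid)[:, None] - kgrid[None, :]   # consumption for each (k, k') pair
util = np.where(c >= 0.0, np.sqrt(np.maximum(c, 0.0)), -np.inf)  # infeasible k' excluded

v = np.zeros(kgrid.size)                 # any bounded initial guess works (property 3)
for it in range(5000):
    v_new = np.max(util + beta * v[None, :], axis=1)  # apply the Bellman operator T
    err = np.max(np.abs(v_new - v))                   # sup-norm distance shrinks at rate beta
    v = v_new
    if err < 1e-10:
        break

g = kgrid[np.argmax(util + beta * v[None, :], axis=1)]  # policy function k' = g(k)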
We now move on to discuss concavity and the single-valuedness of the policy function. Under
the following additional assumptions
• (B5): F(x, y) is strictly concave, i.e.

F(θx + (1 − θ)x′, θy + (1 − θ)y′) ≥ θF(x, y) + (1 − θ)F(x′, y′),

with strict inequality for (x, y) ≠ (x′, y′) and θ ∈ (0, 1), for all (x, y), (x′, y′) ∈ A where A was defined earlier (see Assumption (B2)).
• (B6): Γ is convex in the sense that, for all θ ∈ [0, 1] and x, x′ ∈ X, if y ∈ Γ(x) and
y ′ ∈ Γ(x′ ) then θy + (1 − θ)y ′ ∈ Γ(θx + (1 − θ)x′ ).
We can state the following theorem

Theorem 8 ((SLP Theorem 4.8)). Under assumptions (B1), (B2), (B5), (B6) we have that
• v is strictly concave
• G is a single-valued and continuous function.

We now move on to discuss a very important property: value function differentiability.

Theorem 9 (Benveniste and Scheinkman (SLP Theorem 4.10)). Let


• X ⊆ R^L be a convex set
• v : X → R be a concave function
• x0 ∈ int(X)¹⁰
• D be a neighbourhood of x0.
If there exists a function W : D → R that is concave, differentiable and such that W(x0) = v(x0) and W(x) ≤ v(x) for all x ∈ D, then v is differentiable at x0 and

∂v(x0)/∂x_i = ∂W(x0)/∂x_i

for all i = 1, . . . , L.

10 The notation int(X) refers to the interior of set X.

Adding one last additional assumption we can then apply this to dynamic programming
• (B7): F is continuously differentiable inside set A.
We can now state the following theorem.

Theorem 10 (Differentiability of v (SLP Theorem 4.11)). Under assumptions (B1), (B2),


(B5), (B6), (B7) we have that, if x0 ∈ int(X) and g(x0) ∈ int(Γ(x0)), then
• v is continuously differentiable at x0
• and
v ′ (x0 ) = F1 (x0 , g(x0 )).

What this theorem says in practice is that, under some conditions, we can disregard the response
of the control variable when we differentiate the value function with respect to the state variable.
Take our usual one-dimension FE

v(x) = max_{y∈Γ(x)} {F(x, y) + βv(y)}
and suppose you want to compute v ′ (x0 ), i.e., the derivative of the value function with respect to
the state variable, evaluated at x0 . By hypothesis, x0 and the optimal policy g(x0 ) are interior
points of X and Γ(X) respectively. Pick a function

W (x) = F (x, g(x0 )) + βv(g(x0 ))

that is, a function where you choose, for any state, the policy that is optimal when the state is x0. Clearly such a function will be weakly lower than v(x) when x ≠ x0 and will be equal to it when x = x0. Also, W is concave and differentiable since it is the sum of a concave and differentiable function (F) and a constant (βv(g(x0))).

W ′ (x) = F1 (x, g(x0 ))

and once we evaluate it at x0, we know it must be such that

W′(x0) = F1(x0, g(x0)) = v′(x0).

4.3 Unbounded returns

We now move on to consider the subset of dynamic programming problems where the return
function is not bounded. We maintain assumptions (A1) and (A2) that we introduced previously.
We present a theorem that is useful to go from the FE to the SP when the supremum function
V ∗ satisfies the FE (first part of Theorem 5) but the boundedness hypothesis (condition (12)
in the second part of Theorem 5) does not hold.

Theorem 11 (Principle of optimality when returns are unbounded (SLP Theorem 4.14)).
Under assumptions (A1) and (A2), if

1. there exists a function v̂ : X → R such that

(a) T v̂ ≤ v̂
(b) limn→∞ β n v̂(xn ) ≤ 0 for all x0 ∈ X and all x̃ ∈ Π(x0 )
(c) u(x̃) ≤ v̂(x0 ) for all x0 ∈ X and all x̃ ∈ Π(x0 )
2. the function v defined as
v(x) = lim_{n→∞} T^n v̂(x)

is a fixed point of T

then v is the supremum function (i.e. v = V ∗ ).

Essentially, this theorem replaces condition (12), which is great because then we don’t need to
check that the condition holds for all feasible sequences. Also, if the CMT applies, the theorem
assures us that the unique solution of the problem is the supremum function, without the need
to actually know what v is.
Let us look at an example¹¹ of the NGM with an unbounded return function. Consider

V*(k0) = max_{{k_{t+1}}_{t=0}^∞} Σ_{t=0}^∞ β^t log(k_t^α − k_{t+1})
    s.t. k_{t+1} ∈ [0, k_t^α]

for X = [0, ∞). Note that the choice of the return function and the feasible set implies that
returns are unbounded from below and from above (although we can prove that capital can
never go above 1 here).
Let us check whether we can use the latest theorem. Assumption (A1) is satisfied as the
feasible set correspondence is never empty. To check Assumption (A2), it is sufficient to show
that the PV of lifetime utility is bounded from either above or below.¹² Consider an (infeasible) plan where we both consume and save all available resources: c_t = k_{t+1} = k_t^α. Then capital will be such that log(k_{t+s}) = α^s log(k_t), and consumption such that log(c_{t+s}) = log(k_{t+s}^α) = α log(k_{t+s}) = α^{s+1} log(k_t). So the PV of lifetime utility under this plan is

Σ_{t=0}^∞ β^t log(c_t) = Σ_{t=0}^∞ β^t α^{t+1} log(k0) = α log(k0)/(1 − αβ).

Clearly this will be an upper bound for the PV of lifetime utility of any feasible plan, so (A2)
holds.
Now, let us look for our v̂ function. Let v̂(k0) = α log(k0)/(1 − αβ). This function satisfies requirements (1a) to (1c) of the theorem: T v̂ ≤ v̂, because the resource constraint does not allow us to consume and save everything; lim_{n→∞} β^n v̂(k_n) ≤ 0 for any feasible plan, since v̂ is bounded from above even when we follow the proposed infeasible plan; v̂(k0) ≥ u(x̃) for any feasible plan, because v̂

11 If you want to see further examples, please refer to the quadratic example in SLP Section 4.4.
12 Check the discussion in Section 4.1 of SLP for more details.
is the PV of lifetime utility of an infeasible plan. Finally, we know that lim_{n→∞} T^n v̂(x) = v(x) because T is a contraction mapping and so we converge to its fixed point from any initial guess, including v̂. Hence we proved that v = V*.

5 Stochastic Environments
We now consider recursive problems in infinite horizon and stochastic environments.
Consider a vector of shocks z_t ∈ R^n for t = 0, 1, .... Let us introduce some notation:
• superscripts denote histories z^t = {z_0, z_1, ..., z_t}, so z^t is a (t + 1) × n array;
• the probability of any given history is given by π(z^t);
• most things will depend on such histories, e.g. c_t(z^t) is consumption at period t when history z^t has realised.
We will consider iid processes, such that P(z_{t+1}|z^t) = P(z_{t+1}), and Markov processes, such that P(z_{t+1}|z^t) = P(z_{t+1}|z_t). The latter are the protagonists in a large share of modern dynamic stochastic macro.
Before going directly to consider how this new stochastic environment affects our dynamic
programming problem, let us review a bit of properties of Markov processes13 .

5.1 Markov chains

A stochastic process has the Markov property if P(z_{t+1}|z^t) = P(z_{t+1}|z_t). A Markov chain is a triplet

({e_i}_{i=1}^n, P, π_0)

where
• n is the number of possible event realisations;
• e_i is a "selector" vector with a 1 in the i-th position if event i has realised, and zeros elsewhere;
• P is an n × n transition matrix;
• π_0 is a vector that specifies the probabilities of the initial realisation of the process.
Any Markov chain must satisfy the following assumptions:
• P is a stochastic matrix, i.e. Σ_{j=1}^n P_{i,j} = 1 for all i. In words, all rows of P sum up to 1.
• Σ_{i=1}^n π_{0,i} = 1, i.e. the vector of initial probabilities must also sum up to 1.

13 Most of this section follows LS chapter 2.

We will define stochastic variable xt that is a vector operating on the chain ({ei }ni=1 , P , π 0 ).
That is, it is distributed according to a Markov chain, i.e. takes the value ei for some i according
to some initial distribution π 0 and some transition matrix P .

Conditional and unconditional probabilities. Elements of P denote conditional proba-


bilities. Each element of P indicates the probability of observing event j tomorrow conditional
on observing event i today: P i,j = P (xt+1 = ej |xt = ei ).
We can compute conditional probabilities of events further away in time too. For example

P(x_{t+2} = e_j | x_t = e_i) = Σ_{h=1}^n P(x_{t+2} = e_j | x_{t+1} = e_h) P(x_{t+1} = e_h | x_t = e_i) = Σ_{h=1}^n P_{i,h} P_{h,j} = (P²)_{i,j}

where the second-last step is the dot product of the i-th row of P with its j-th column.
The example shows that the probability of observing event j conditional on observing event i k periods before is given by

P(x_{t+k} = e_j | x_t = e_i) = (P^k)_{i,j}.

What we’re really interested in are unconditional distributions. That is, if I have a stochastic
model, what states are visited more often than others? Let π ′1 be the 1 × n vector denoting the
unconditional probability distribution at t = 1. It is computed as

π ′0 P = π ′1

so for example
π_{1,k} = Σ_{j=1}^n π_{0,j} P_{j,k}

that is, the probability of being in state k at t = 1 is given by the probability of being in j
at t = 0 times the probability of going from j to k at t = 1, summed over all possible j’s. It
follows that
π′_s = π′_0 P^s = π′_{s−1} P.
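A minimal numerical sketch of these formulas (the two-state transition matrix and initial distribution below are illustrative assumptions):

import numpy as np

P = np.array([[0.9, 0.1],
              [0.5, 0.5]])        # stochastic matrix: rows sum to 1
pi = np.array([1.0, 0.0])         # initial distribution pi_0

for s in range(50):
    pi = pi @ P                   # pi_s' = pi_{s-1}' P

print(pi)                         # unconditional distribution after 50 periods
print(np.linalg.matrix_power(P, 50)[0])  # row i of P^50 gives P(x_50 = . | x_0 = e_i)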

Stationarity. An unconditional distribution π is stationary if it is such that

π′ = π′P (14)

that is, if it is constant over time. Rearranging equation (14) we get

π ′ (I − P ) = 0

where I is the identity matrix. This says that the stationary distribution is the left eigenvector¹⁴ associated with the unit eigenvalue of matrix P.¹⁵

14 Alternatively, we can write (P′ − I)π = 0 and say that π is the right eigenvector associated with the unit eigenvalue of P′.
15 Recall that, given matrix A and Av = λv ⇔ (A − λI)v = 0, we say that v is the right eigenvector associated with the λ eigenvalue of matrix A.

Example 8. Consider the stochastic matrix


 
1 0 0
P = .2 .5 .3 .
 

0 0 1

What are the stationary distributions associated with it? Is there more than one? We can look
for the eigenvectors associated with the unit eigenvalue, solving

π ′ (I − P ) = 0.

In this example, I − P has rows [0 0 0], [−.2 .5 −.3] and [0 0 0], so

[π_1 π_2 π_3](I − P) = [−.2π_2   .5π_2   −.3π_2] = [0 0 0],

which requires π_2 = 0. It follows that any probability distribution [x  0  1 − x] is stationary: the transition matrix is such that if you start in position 1 or 3, you stay there forever, so any distribution that gives positive probability only to those states will be stationary.
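We can reproduce this computation numerically: stationary distributions are the (suitably normalised) eigenvectors of P′ associated with the unit eigenvalue. A minimal sketch:

import numpy as np

P = np.array([[1.0, 0.0, 0.0],
              [0.2, 0.5, 0.3],
              [0.0, 0.0, 1.0]])

vals, vecs = np.linalg.eig(P.T)           # right eigenvectors of P' = left eigenvectors of P
for i in np.where(np.isclose(vals, 1.0))[0]:
    v = np.real(vecs[:, i])
    if abs(v.sum()) > 1e-9:               # normalise to a probability distribution
        print(v / v.sum())                # here: [1, 0, 0] and [0, 0, 1] span the solutions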

This example is somewhat extreme in that, depending on the starting point of the stochas-
tic process, convergence happens immediately. What is more common is that some stochastic
system converges only eventually to a stationary distribution. In other words, is there a dis-
tribution π ∞ such that limt→∞ π t = π ∞ ? We will now consider the convergence properties of
Markov chains.
We first state the conditions for existence and uniqueness of such limiting stationary distribution.

Theorem 12 (LS.2 Theorem 1). If P_{i,j} > 0 for all i, j, then there exists a unique π_∞ such that lim_{t→∞} π_t = π_∞.

A looser version of this theorem is the following

Theorem 13 (LS.2 Theorem 2). If for some n we have that (P^n)_{i,j} > 0 for all i, j, then there exists a unique π_∞ such that lim_{t→∞} π_t = π_∞.

Expectations. Consider now a random variable y_t = y′x_t, where y = [y_1 y_2 ... y_n]′ is a vector that contains all possible realisations of some variable (e.g. GDP, productivity, and so on). When x_t = e_j, then y_t takes on the j-th value of vector y (y_t = y_j).
The conditional expectation of y_{t+1} is given by

E[y_{t+1} | x_t = e_i] = Σ_{j=1}^n P_{i,j} y_j = (P y)_i

which is the i-th element of the vector P y.


The unconditional expectation of yt , on the other hand, is given by

E[y_t] = π′_t y = π′_0 P^t y.

Invariance. Consider now a stationary Markov chain, that is a triplet ({e_i}_{i=1}^n, P, π) such that π′P = π′, and again a random variable y_t = y′x_t where x_t is a selector vector.¹⁶ We consider first a version of the law of large numbers (LLN): for any stationary Markov chain,

lim_{T→∞} (1/T) Σ_{t=1}^T y_t = E[y_∞ | x_0]

with probability one.

16 Henceforth we will refer to stationary Markov chains as (P, π) pairs, omitting the selector vector simply for brevity.


This is a somewhat weak version of the LLN, because the time average of yt only converges to
the conditional expectation of the limit of yt , not that of any yt . We will now get to a stronger
version of the LLN, after having defined some additional concepts.
We say that random variable yt = y ′ xt is invariant if yt = y0 for all t and all xt . That is, if yt
is constant over time.

Theorem 14 (LS2 Theorem 2.2.2). If a stationary Markov chain is such that

E[yt+1 |xt ] = yt (15)

then yt = y ′ xt is invariant.

Any stochastic process satisfying (15) is defined as a martingale. The theorem is saying that a
martingale that is a function of a discrete and finite Markov chains must actually be constant
over time, which is a special case of the martingale convergence theorem. Equation (15) can
actually be rewritten as
(P − I)y = 0 (16)

which says that any invariant (random variable that is a) function (yt ) of the state (xt ) has a
support y that is a right eigenvector associated to the unit eigenvalue of P .

Ergodicity.

Definition 13. A stationary Markov chain (P , π) is ergodic if all invariant functions are
constant with probability one, i.e. if y i = y j for all i, j such that π i > 0, π j > 0.

That is, if all vectors y that satisfy (16) are such that all positive probability elements are
identical, then (P , π) is an ergodic Markov chain.
We now get to why this is useful.

52
Theorem 15 ((LS2 Theorem 2.2.3)). If y_t is a random variable on an ergodic Markov chain (P, π), then

(1/T) Σ_{t=1}^T y_t → E[y_0]

with probability 1.

Consider the following examples.

Example 9. Let

P = [0 1; 1 0].

The associated stationary distribution is any vector π that solves

π′(P − I) = 0

which yields π = [1/2 1/2]′. The invariant functions of the state satisfy

(P − I)y = 0

which yields y = [x x]′ for any value of x. So since all invariant functions are constant, the stationary Markov chain (P, π) is ergodic.

Example 10. Let

P = [1 0; 0 1].

The associated stationary distribution is π = [p  1 − p]′ for any p ∈ [0, 1]. The invariant functions are y = [a b]′ for any (a, b) pair. It follows that when p ∈ (0, 1), the stationary Markov chain (P, π) is not ergodic because its invariant functions are not constant. The intuition is that the random variable y_t = y′x_t takes either value with some positive probability at t = 0, and then never moves away from that value. Since the other state is never visited, the sample average will converge to the value of y_0 that realises, and not to the unconditional expectation of y_t, which is given by π′y and is different from either a or b.
When instead p = 0 or p = 1, the stationary Markov chain (P, π) is ergodic because its invariant functions are constant in their positive-probability elements. The sample average will equal the first realisation of y_t, but since the initial distribution is degenerate, the sample average will coincide with the unconditional expectation of y_t.
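We can illustrate ergodicity by simulation (a minimal sketch; the support y = [3, 7] is an arbitrary choice). For the chain of Example 9, the sample average converges to the unconditional expectation π′y:

import numpy as np

rng = np.random.default_rng(0)
P = np.array([[0.0, 1.0], [1.0, 0.0]])   # the ergodic chain of Example 9
y = np.array([3.0, 7.0])                  # illustrative realisations
pi = np.array([0.5, 0.5])                 # its stationary distribution

state, T, total = 0, 100_000, 0.0
for t in range(T):
    total += y[state]
    state = rng.choice(2, p=P[state])

print(total / T, pi @ y)   # sample average ~ 5.0 = unconditional expectation

Running the same simulation with P equal to the identity matrix (Example 10, p ∈ (0, 1)) would instead leave the sample average stuck at the initial realisation, as the text explains.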

Continuous Markov chains. Let us now cover quickly the possibility that our random
variable of interest follows a continuous Markov chain, i.e. that it takes on a continuum of
values (while we continue to assume that time is discrete). Let s denote the random variable
and S denote its domain. Let π(s′|s) denote its transition density, which must be such that ∫_S π(s′|s) ds′ = 1, and let π0(s) denote the initial probability distribution, which must be such that ∫_S π0(s) ds = 1.
The unconditional distribution of s at period 1 is given by

π1(s1) = ∫_S π(s1|s0) π0(s0) ds0.

The unconditional distribution of s at period t is given by

πt(st) = ∫_S π(st|st−1) πt−1(st−1) dst−1.

A stationary distribution is defined as

π∞(s′) = ∫_S π(s′|s) π∞(s) ds.

A random variable y(s) is invariant if

∫_S y(s′) π(s′|s) ds′ = y(s).

A Markov chain is ergodic if all invariant functions y(s) are constant with probability 1 according to π∞, i.e. y(s) = y(s′) for all s, s′ such that π∞(s) > 0, π∞(s′) > 0.
Finally, if y(s) is a random variable on a stationary and ergodic Markov chain (π(s′|s), π(s)) and E[|y|] < ∞, then

(1/T) Σ_{t=1}^T y_t → E[y] = ∫_S y(s) π(s) ds

with probability 1.

5.2 Stochastic Dynamic Programming

Consider a general formulation of a Bellman equation. Let x denote the endogenous state
variable, and z denote the exogenous and stochastic state variable that follows a Markov chain
with transition probabilities P (z ′ |z). Let Γ(x, z) define the feasible set correspondence and
let a ∈ Γ(x, z) denote the control variable. Let F (x, z, a) denote the return function, and
x′ = q(x, z, a) denote the law of motion for the endogenous state variable.

v(x, z) = max_{a∈Γ(x,z)} {F(x, z, a) + βE[v(x′, z′)|z]}    (17)
    s.t. x′ = q(x, z, a)

where the expectation can be either E[v(x′, z′)|z] = Σ_{z′} P(z′|z) v(x′, z′) in the discrete case, or E[v(x′, z′)|z] = ∫ v(x′, z′) f(z′|z) dz′ in the continuous case.

Example 11. Let's apply this general notation to the NGM. One option is to continue to substitute out consumption by using the resource constraint. In this case, the endogenous state is k, the exogenous stochastic state is productivity z, the control variable is k′, the feasible set correspondence is Γ(k, z) = [0, zk^α + (1 − δ)k], the return function is F(k, z, k′) = u(zk^α + (1 − δ)k − k′), the law of motion for capital is k′ = k′ (trivial: it says how the future state depends on the current control variable, which here happen to coincide) and that for productivity is given by f(z′|z). The Bellman equation is

v(k, z) = max_{k′∈Γ(k,z)} { F(k, z, k′) + β ∫_Z v(k′, z′) f(z′|z) dz′ }.

Another option is to leave consumption in the problem and keep the resource constraint. Then we have that the endogenous state is k, the exogenous stochastic state is productivity z, the control variable is c, the feasible set correspondence is Γ(k, z) = [0, zk^α + (1 − δ)k], the return function is F(c) = u(c), the law of motion for capital is k′ = q(k, z, c) = zk^α + (1 − δ)k − c and that for productivity is given by f(z′|z). The Bellman equation is

v(k, z) = max_{c∈Γ(k,z), k′=q(k,z,c)} { F(c) + β ∫_Z v(k′, z′) f(z′|z) dz′ }.
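A minimal computational sketch of the first formulation, with the continuous density replaced by a two-state Markov chain P(z′|z). All parameter values and the log utility are illustrative assumptions:

import numpy as np

alpha, beta, delta = 0.3, 0.95, 0.1
z = np.array([0.9, 1.1])                 # productivity states
Pz = np.array([[0.8, 0.2],
               [0.2, 0.8]])              # P(z'|z)

k = np.linspace(1e-3, 6.0, 150)
# consumption for each (k, k', z) triple; infeasible choices get utility -inf
c = z[None, None, :] * k[:, None, None]**alpha \
    + (1.0 - delta) * k[:, None, None] - k[None, :, None]
util = np.where(c > 0.0, np.log(np.maximum(c, 1e-12)), -np.inf)

v = np.zeros((k.size, z.size))           # v(k, z) on the grid
for it in range(5000):
    Ev = v @ Pz.T                        # Ev[j, i] = E[v(k'_j, z') | z_i]
    v_new = np.max(util + beta * Ev[None, :, :], axis=1)   # max over k'
    if np.max(np.abs(v_new - v)) < 1e-8:
        break
    v = v_new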

Example 12. Consider another example: a consumption-savings model with an AR(2) earnings process given by w_t = ρ1 w_{t−1} + ρ2 w_{t−2} + ϵ_t, where ϵ_t ∼ N(0, σ²). The budget constraint is standard and given by

c_t + a_{t+1}/R ≤ a_t + w_t

together with the no-borrowing constraint a_{t+1} ≥ 0. The elements of our dynamic programming problem are the following:
• the endogenous state is a, the exogenous states are w, w_− (the latter denotes income in the previous period)
• the control variable is a′
• the feasible set correspondence is Γ(a, w) = [0, (a + w)R]
• the return function is F(w, a, a′) = u(a + w − a′/R)
• the laws of motion for the states are given by a′ = a′ for the endogenous state, and by w′ ∼ π(w′|w, w_−) and w_−′ = w for the exogenous stochastic states
• the Bellman equation is given by

v(a, w, w_−) = max_{a′∈Γ(a,w)} { F(w, a, a′) + β ∫ v(a′, w′, w) π(w′|w, w_−) dw′ }.

Let's derive the Euler equation to see how it looks in this stochastic environment. The FOC for a′ is given by

F_{a′}(w, a, a′) + β ∫ v_a(a′, w′, w) π(w′|w, w_−) dw′ = 0

where F_{a′}(w, a, a′) = −u′(c)/R. The envelope condition is given by

v_a(a, w, w_−) = F_a(w, a, a′)

where F_a(w, a, a′) = u′(c). Putting things together we get the Euler equation

−F_{a′}(w, a, a′) = β ∫ F_a(w′, a′, a″) π(w′|w, w_−) dw′

which becomes the well-known

u′(c) = βR ∫ u′(c′) π(w′|w, w_−) dw′.

5.3 The McCall Job Search Model

This is an example of a finite horizon stochastic problem, which we’ll use to get more acquainted
with stochastic dynamic programming.
Consider the problem of a single agent, a worker. She can be either employed or unemployed.
If she's unemployed, she receives a random wage offer w_t ∈ [w̲, w̄], where w_t is distributed according to some distribution F. The agent consumes her income y_t and has linear utility. The worker can accept the offer, in which case she gets income equal to the wage w_t for the rest of her life, or reject the offer and receive an unemployment benefit b, which is a constant inside the [w̲, w̄] interval. The ingredients of the problem of the unemployed worker are:
• the state variable is the wage offer w
• the choice variable is the discrete accept/reject choice c ∈ {a, r}
• the return function is F (w, c) which is equal to w if c = a and to b if c = r.

Two-period case. Consider first a worker who lives only two periods, t = 0, 1. Clearly, we
could also look at the problem in sequential form. The objective function would be E0 (y0 +βy1 ),
and we could derive the optimal worker’s policy and the value of her maximised utility. We’ll use
the dynamic programming approach to get used to that, but you can try to solve the problem
in sequential form as an exercise and check that you do get the same results.
Let’s solve the problem backwards. Let V1 denote the maximised value for the unemployed
worker of drawing offer w at t = 1

V_1(w) = max_{a,r} {V_1^a(w), V_1^r(w)}

where

V1a (w) = w
V1r (w) = b.

The optimal policy of the worker clearly is

g_1(w) = { a  if w ≥ b
           r  if w < b }

that is, the worker accepts if the offer is above a threshold which we call reservation wage and
we denote with ŵ1 = b.
At t = 0, let V_0 denote the maximised value for the unemployed worker of drawing offer w at t = 0:

V_0(w) = max_{a,r} {V_0^a(w), V_0^r(w)}

where

V_0^a(w) = w + βw
V_0^r(w) = b + βE_1[V_1(w′)] = b + β ∫_{w̲}^{w̄} V_1(w′) dF(w′) = b + β [ ∫_{w̲}^{ŵ_1} b dF(w′) + ∫_{ŵ_1}^{w̄} w′ dF(w′) ].

Here too the value of accepting is increasing in w and the value of rejecting is constant and
independent of w, so the worker will accept if and only if the wage offer is above a threshold
which is the reservation wage at t = 0, denoted ŵ_0. What can we say about ŵ_0? How does it change
with b? Is it larger or smaller than ŵ1 ?
By construction, ŵ_0 must be the wage offer at which the worker is indifferent between a and r. So let's write this with maths:

V_0^a(ŵ_0) = V_0^r(ŵ_0)
ŵ_0(1 + β) = b + β [ bF(b) + ∫_b^{w̄} w′ dF(w′) ]
ŵ_0(1 + β) = b(1 + β) + β ∫_b^{w̄} (w′ − b) dF(w′)
ŵ_0 = b + (β/(1 + β)) ∫_b^{w̄} (w′ − b) dF(w′).

This tells us a few things. First, ŵ_0 > b, because in this model there is an option value of waiting: if you reject the offer at t = 0, you consume b today but you have a chance of getting a very high draw tomorrow, so you're willing to risk it. Second, we can show that dŵ_0/db > 0 (try to show that at home!), because obviously you're less willing to accept a given offer if the alternative to that is to get a higher unemployment benefit.
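For a concrete illustration (an assumption, not in the notes), let F be uniform on [0, 1], so that the integral in the formula above has the closed form ∫_b^1 (w′ − b) dw′ = (1 − b)²/2:

beta = 0.95                       # illustrative discount factor

def w0(b):
    # reservation wage at t = 0 for w ~ U[0, 1]
    return b + beta / (1.0 + beta) * (1.0 - b)**2 / 2.0

for b in [0.1, 0.3, 0.5]:
    print(b, round(w0(b), 4))     # w0 > b always, and w0 is increasing in b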

Infinite horizon. Given the assumptions that employed workers stay on the job forever, the
infinite horizon version of this problem is actually extremely similar to the t = 0 version of
the previous problem, with the only difference that now the future lasts an infinite number of
periods rather than only one period.
The optimised value of lifetime utility for an employed agent with wage w is

V^e(w) = w/(1 − β).

The optimised value of lifetime utility for an unemployed agent with a wage offer w is

V^u(w) = max_{a∈{0,1}} { a V^e(w) + (1 − a)[b + βE[V^u(w′)]] }.

As in the finite-horizon environment, the optimal policy of the unemployed agent is to accept the
wage offer if it is higher than some reservation wage, which can be derived as we did previously.
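A minimal sketch of how the reservation wage can be computed numerically by iterating on the unemployed agent's Bellman equation. The values of β and b and the uniform offer distribution are illustrative assumptions:

import numpy as np

beta, b = 0.95, 0.3
w = np.linspace(0.0, 1.0, 1001)          # grid of offers, w ~ U[0, 1]
V = w / (1.0 - beta)                      # initial guess for V^u(w)

for it in range(10_000):
    V_new = np.maximum(w / (1.0 - beta), b + beta * V.mean())  # accept vs reject
    if np.max(np.abs(V_new - V)) < 1e-12:
        break
    V = V_new

w_hat = w[np.argmax(w / (1.0 - beta) >= b + beta * V.mean())]
print(w_hat)    # reservation wage: accept any offer w >= w_hat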

Extension: receive offer with probability ϕ. Consider the case where unemployed agents
only receive an employment offer with probability ϕ. The value of being employed is unchanged.
The value of being unemployed with no offer is

V n = b + β(1 − ϕ)V n + βϕE[V o (w′ )].

The value of being unemployed with an offer w is


 
V^o(w) = max_{a,r} { w/(1 − β), V^n }.

We can keep going by first solving for V^n and then writing it as a weighted average

V^n = a(ϕ) b/(1 − β) + [1 − a(ϕ)] E[V^o(w′)]

where a(ϕ) = (1 − β)/(1 − β(1 − ϕ)). With this we can for example compute the impact of changes in ϕ on the reservation wage.

Extension: receive multiple offers. Consider the case where an unemployed worker receives two wage offers in each period. Then the wage offer to be considered is y = max{w_1, w_2}. One can show that

P(y < y′) = P(max{w_1, w_2} < y′) = F(y′)².

So the value of being unemployed is the same as in the one-offer case, just that now

E[V^u(w′)] = ∫ V^u(y′) d[F(y′)²].
6 Recursive Competitive Equilibrium


So far we have seen either planner’s problems (the NGM), where there are no prices or markets
to clear, or partial equilibrium problems, where prices are taken as given and market clearing
is ignored. We will now move on to consider competitive equilibria (CE). So far you have
seen Arrow-Debreu time-0 equilibria as well as equilibria with sequential markets. Now we will
consider recursive competitive equilibria (RCE).
Let us get started by considering once again the NGM example. Let K, N respectively denote
aggregate capital and labour. Consider a continuum of measure 1 of households indexed by
i ∈ [0, 1]. Each household faces a budget constraint given by

c_t^i + k_{t+1}^i ≤ w_t n_t^i + (1 − δ + r_t) k_t^i    (18)

for all i ∈ [0, 1] and all t. Agents now rent their capital and labour in a competitive market
respectively at rental rate rt and wage wt . Note that these are equilibrium prices so they are
not agent-specific. We look for a symmetric equilibrium, since all agents have the same utility
function, face the same budget constraint and are therefore identical. Since there is a continuum
of households, each of them is infinitesimally small and takes prices as given17 . That is, this
is different from a setting where agents have market power (think of industrial organisation examples where firms choose quantities knowing that revenues are given by Q · P(Q)) and therefore internalise the effect of their decisions on equilibrium prices.

17 To gain intuition, it is useful to see the continuum assumption as the limit of a case where there is a finite number N of agents, each with weight λ_i = 1/N such that Σ_{i=1}^N λ_i = 1. Taking the limit for N → ∞, the weight of each individual goes to zero. Aggregate variables are given by X = Σ_{i=1}^N (1/N) x_i, so in the limit for N → ∞ it follows that ∂X/∂x_i → 0. By the way, this feature is what gives rise to externalities and coordination problems (but we won't cover these here).
Aggregate variables are determined by individual variables through the following
K_t = ∫_0^1 k_t^i di
N_t = ∫_0^1 n_t^i di.

Given agents are all identical, we will have that K = k^i and N = n^i for all i. Below we will sometimes drop the i superscript to make notation lighter, but we will always use lower-case letters to denote individual variables and capital letters to denote aggregate variables.
Let X = (Z, K) denote the aggregate state vector, where Z denotes aggregate productivity,
which is exogenous and follows some (for now) undefined stochastic process.
Before considering the household individual problem, let us analyse the firms’ problem. There
is a continuum of measure 1 of firms, that are perfectly competitive and subject to free entry,
so they make zero profits. Like households, firms also take prices as given, and firm f faces the
following static profit-maximisation problem

max_{k^f, n^f} ZF(k^f, n^f) − w n^f − r k^f

where F(k^f, n^f) is a standard Cobb-Douglas production function and k^f, n^f denote capital and labour demanded by firm f. The first-order conditions are

ZF_k(k^f, n^f) − r = 0
ZF_n(k^f, n^f) − w = 0.

Aggregation in the firms' sector works in the same way described above, so K = ∫_0^1 k^f df and N = ∫_0^1 n^f df. It follows that, since all firms are identical, aggregate capital and labour in the
firms’ sector is given by K = k f and N = nf for any f , and in a symmetric CE the prices r and
w are functions of the aggregate state X = (Z, K) only, via a time-invariant mapping which is
a crucial feature of a recursive equilibrium. It follows that firms’ capital and labour demand
will also be a time-invariant function of the aggregate states.
We now go back to look at the households. They face an intertemporal consumption-savings problem which is affected, for example, by the future rental rate of capital r(X′). To derive
optimal individual behaviour we thus need to specify the law of motion of prices. Since prices
only depend on the aggregate states, we need the law of motion for the aggregate states. Let us
assume for now that agents have a perceived (i.e., for now this can be arbitrary) law of motion
for capital given by K ′ = G(K): given current aggregate capital, agents postulate what future
capital will be. The Bellman equation for the household problem is

v(X; k) = max_{c,k′,n} {u(c, 1 − n) + βE[v(X′; k′)]}
    s.t. c + k′ ≤ w(X)n + (1 − δ + r(X))k
         K′ = G(X)
         c ≥ 0.

Solving the household problem will give us the usual Euler equation

u_c(c, 1 − n) = βE[u_c(c′, 1 − n′)(1 − δ + r(X′))].

Current and future prices depend on the current and future aggregate states, and the expected
future aggregate state depends on the perceived law of motion G. Hence the optimal individual
policy of any agent will be a function of the individual state as well as all these factors: let
us denote the policy functions for consumption, labour supply and investment with c(X; k, G),
n(X; k, G) and g(X; k, G).

Competitive equilibrium definition.

Definition 14. A Recursive Competitive Equilibrium (RCE) with arbitrary expectations G is


• a set of functions (v, c, n, g) for the individuals, (w, r) for prices and (k f , nf ) for firms
• perceived and actual laws of motion (G, H)
such that
• given price functions (r, w) and perceived law of motion G, the individual value and policy
functions (v, c, n, g) solve the household problem (i.e. the Bellman equation);
• the actual law of motion for aggregate capital is given by H and is such that

K ′ = H(X) = g(X, k = K, G);

• given price functions (r, w), the firm policy functions (k^f, n^f) solve the firm problem;
• (prices are such that) the markets for labour, capital and consumption (i.e. the resource constraint) clear:

∫ n^f(X) df = ∫ n(X; k, G) di
∫ k^f(X) df = ∫ k di
c(X; K, G) + g(X; K, G) = ZF(K, N) + (1 − δ)K.

A few things are worth noting. First, we use small-cap and big-cap notation here because in
a symmetric equilibrium individual and aggregate variables coincide. Second, we defined the
actual law of motion of capital by taking the individual investment policy function (g(X; k, G))
and plugging in aggregate capital (g(X; K, G)), which is sometime referred to as representative
agent condition. Third, the resource constraint can be obtained by combining the household
budget constraint with the firm optimality conditions

c + k ′ = ZFn (K, N )n + [ZFk (K, N )k + (1 − δ)]k,

replacing individual with aggregate variables, and using the fact that the production function
is assumed to be homogenous of degree 118 , which allows to go from the following

C + K ′ = Z[Fn (K, N )N + Fk (K, N )K] + (1 − δ)K

to the following
C + K ′ = ZF (K, N ) + (1 − δ)K.

Rational expectations. We have not however asked that the resulting law of motion, which
we labelled H(X), is consistent with G. When that is the case, we have a RCE with rational

expectations, and we have the extra requirement that

H(X) = G(X) = g(X; K, G)

i.e. that the realised law of motion of aggregate capital is consistent with the law of motion perceived by the individual agents.

18 Recall that a function F is homogeneous of degree n if, for a vector of inputs x and a scalar α, we have that F(αx) = α^n F(x).

Example 13. To see the meaning of the rational expectations condition, consider the following
toy example. Suppose that agents optimally choose some action a′ as a linear function of their
individual state a and the future value of the aggregate state A′ , that is

a′ = g0 + g1 a + g2 A′

where g0 , g1 and g2 are functions of the model parameters. Next suppose agents believe A′ follows
some function η(A) = η0 + η1 A. It follows that for any individual

a′ = g0 + g1 a + g2 (η0 + η1 A) = g(A; a, η).

Impose the representative agent condition (a = A) and obtain the actual law of motion of A
which is given by individual actions and beliefs according to

A′ = g(A; A, η) = (g0 + g2 η0 ) + (g1 + g2 η1 )A.

The rational expectations condition requires that agents' beliefs are consistent with the actual LOM, i.e. that

η0 = g0 + g2 η0 and η1 = g1 + g2 η1  ⇒  η0 = g0/(1 − g2) and η1 = g1/(1 − g2).
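A minimal sketch that finds the same fixed point by iterating on beliefs, for illustrative values of (g0, g1, g2) with |g2| < 1:

g0, g1, g2 = 1.0, 0.5, 0.3        # illustrative parameter values

eta0, eta1 = 0.0, 0.0             # arbitrary initial beliefs
for it in range(200):
    # actual LOM coefficients implied by the current beliefs
    eta0, eta1 = g0 + g2 * eta0, g1 + g2 * eta1

print(eta0, eta1)                  # -> g0/(1 - g2) ~ 1.4286 and g1/(1 - g2) ~ 0.7143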

6.1 RCE with Government

We now consider a variation of the previous problem which includes a government. The gov-
ernment taxes labour income using a proportional tax τ , and uses the proceeds of such tax to
buy “medals” M .

Wasteful spending and inelastic labour supply. For now, think of M as wasteful gov-
ernment spending, which we’ll assume to follow some exogenously given policy function or rule.

For simplicity we consider a deterministic setting without productivity shocks, with labour supply exogenously fixed at 1, and no utility from leisure.
The household problem is

v(K; k) = max_{c,k′} {u(c) + βv(K′; k′)}
    s.t. c + k′ ≤ w(K)(1 − τ) + (1 − δ + r(K))k
         K′ = G(K).

For brevity we do not solve the firm problem explicitly and just assume that wages and rental
rates are function of the aggregate state, but we know that’s true here.
The government budget constraint is given by

τ w(K) = M

where we don’t specify what medal policy is. But for any given medal policy (that is not a
linear function of w(K)), we will have that labour taxes must be a function of K as well, for
the government budget to be balanced.
Solving the household problem yields our usual Euler equation

u′(c) = β[1 − δ + r(K′)]u′(c′).

Will the equilibrium be efficient? To answer that let’s look at the Planner’s problem

v(K) = max_{C,M,K′} {u(C) + βv(K′)}
    s.t. C + K′ + M ≤ F(K, 1) + (1 − δ)K.

The Euler equation for this problem is

u′ (C) = β[1 − δ + FK (K ′ , 1)]u′ (C ′ )

which is identical to the one we derived just above and thus implies that the RCE in this
particular problem will be efficient. Let us define such RCE explicitly

Definition 15. A Recursive Competitive Equilibrium (RCE) with rational expectations (RE) is

• a set of functions (v, c, g) for the individuals, (w, r) for prices and (τ, M ) for the govern-
ment
• capital law of motion G
such that
• given price functions (r, w), capital law of motion G and government policy (τ, M ), the
individual value and policy functions (v, c, g) solve the household problem (i.e. the Bellman
equation);
• the law of motion for aggregate capital is consistent with individuals’ policy function

K ′ = G(K) = g(K, k = K, G);

• the factor price functions (r, w) equal the factor marginal product (equivalently, firms
behave optimally and the labour and capital market clear);
• government policy is feasible (i.e. the government budget constraint is satisfied)
• (prices are such that) the market for consumption (i.e. the resource constraint) clears

c(K; K, G) + g(K; K, G) + M (K) = F (K, 1) + (1 − δ)K.

Note that the last bullet point here is redundant, because the market for consumption clears by Walras' law when all other markets also do so, which is what we have ensured in the previous points.

Useful spending. Now let’s consider the case where households get utility from government
spending (medals) according to u(c, M ). Since government policy is taken as given by house-
holds, the Euler equation will not change much for the households

u_c(c, M) = β[1 − δ + r(K′)]u_c(c′, M′)

nor for the planner

u_c(C, M) = β[1 − δ + F_K(K′, 1)]u_c(C′, M′).

However, the planner problem will include one more condition, namely

uc (C, M ) = um (C, M ) (19)

i.e. the MRS between consumption of the private and the public good must equal the relative
price, which is 1 here.
It follows that an RCE with useful spending will be efficient if and only if government policy is
such that condition (19) is always satisfied.

Endogenous labour supply. Lastly, consider the case where households get utility from
leisure and have a labour/leisure choice. The household problem is

v(K; k) = max_{c,n,k′} {u(c, n) + βv(K′; k′)}
    s.t. c + k′ ≤ w(K)n(1 − τ) + (1 − δ + r(K))k
         K′ = G(K)

and the government budget constraint is given by

τ w(K)n = M.

What’s new here is that households will have a consumption-leisure optimality condition, typi-
cally defined as the labour supply equation, given by

un (c, n) = (1 − τ )w(K)uc (c, n).

In the planner’s problem, the equivalent of this condition is given by

un (C, N ) = Fn (K, N )uc (C, N )

and since w(K) = Fn (K, N ) we can clearly see that the RCE in this economy will not be efficient
because labour taxes create a distortion. The only case where the welfare theorems hold is that
where M = 0 always, which implies taxes are also zero and thus there is no distortion.

6.2 RCE with Heterogeneity

We now consider something different from the perfectly symmetric world we have analysed so
far.
Suppose there are two groups of agents, with a possibly different endowment of initial wealth

or capital. Aggregate capital in this economy is given by

K = µK1 + (1 − µ)K2

where µ is the measure of the first group, and Ki denotes the amount of capital held by group
i.
There is still a continuum of identical firms, so factor prices remain a function r(K), w(K) of
aggregate capital only.
The Bellman equation for an individual in group i is given by

v(K_1, K_2; k) = max_{c,k′} {u(c) + βv(K_1′, K_2′; k′)}
    s.t. c + k′ ≤ w(K) + (1 − δ + r(K))k
         (K_1′, K_2′) = G(K_1, K_2).

Note that the aggregate state is now given by the capital of each group. The reason is that,
while wages and capital rental rates are only a function of aggregate capital K, future values
of K ′ may depend on how K is distributed across groups, via each group’s saving function. To
see that, consider the Euler equation for an individual

u′ (c) = β[1 − δ + r(K ′ )]u′ (c′ )

whose solution will yield a policy function k ′ = g(k, K1 , K2 ). Applying the representative agent
condition we get that

K1′ = g(K1 , K1 , K2 )
K2′ = g(K2 , K1 , K2 ).

Imposing the rational expectation condition

K1′ = G(K1 , K2 ) = g(K1 , K1 , K2 )

and vice versa for K2′ . Therefore the law of motion for capital is given by

K ′ = µ g(K1 , K1 , K2 ) + (1 − µ) g(K2 , K1 , K2 )

which will be different from g(µK_1 + (1 − µ)K_2, K_1, K_2), unless g is linear in its first argument.
What can we say about the steady state? The Euler equation implies that aggregate capital is
pinned down by the usual condition

1 = β[1 − δ + Fk (K ss , 1)]

which is the same for any group i. How do we recover the group-specific levels of consumption
and investment? The household budget constraint

c_i^ss = w(K^ss) + k_i^ss [F_k(K^ss, 1) − δ]

pins down the combination of c_i^ss and k_i^ss, but not each of them separately. It follows that we know steady-state aggregate consumption and capital, but the individual levels are undetermined.

7 Ordinary Differential Equations Review


First order differential equations are of the form

F (t, y, ẏ) = 0

and y = f (t) is said to be a solution if

F (t, f (t), f ′ (t)) = 0 ∀t.

An ODE is homogeneous if it has the (manageable) form

ẏ = h(y, t).

A first-order ODE is separable if it has the form

ẏ = f (t)g(y).

In such a case, we can attempt to solve it as

∫ 1/g(y) dy = ∫ f(t) dt.

7.1 Homogeneous, Separable, Linear.

Consider
ẏ + p(t)y = 0.

It is separable, and both y and ẏ appear at the first power. The solution follows from

∫ (1/y) dy = −∫ p(t) dt

which yields

log y = −∫ p(t) dt + C

and

y = Ae^{−∫ p(t) dt}

where A := e^C. Given an initial condition y(t0) = y0 we can easily recover the value of A.

7.2 Non-Homogeneous.

It is of the form
ẏ + p(t)y = q(t)
y(t0 ) = y0 .
First, you find the general solution to the "associated homogeneous equation" ẏ + p(t)y = 0, which is y(t) = Ae^{P(t)} where P(t) = −∫ p(t) dt. Then, we make the guess that A = a(t) and therefore y(t) = a(t)h(t) where h(t) = e^{P(t)}.
Then the left-hand side of the non-homogenous equation is given by

ẏ + p(t)y =
= ȧ(t)h(t) + a(t)ḣ(t) + p(t)a(t)h(t) =
= a(t)[ḣ(t) + p(t)h(t)] + ȧ(t)h(t) =
= ȧ(t)h(t).

which means we must solve

ȧ(t)h(t) = q(t)

to get a particular solution of the non-homogeneous equation. That requires finding the antiderivative of ȧ(t) = q(t)/h(t). Since q and h are known functions of time, we'll get

a(t) = ∫ q(t)/h(t) dt.

There could be a constant, but we can set it to zero because we are solving for a particular
(rather than general) solution of the non-homogeneous equation. The final solution is given by
y(t) = A e^{P(t)} + a(t) e^{P(t)} = A h(t) + ( ∫ q(t)/h(t) dt ) h(t)    (20)

and A can again be obtained by solving y(t0) = y0.


When q(t) = κ and p(t) = ϕ are constant and y(0) = y0, the solution is given by

y(t) = y0 e^{−ϕt} + (κ/ϕ)(1 − e^{−ϕt})

which is a weighted average (with exponentially decaying weights) of the initial condition y0 and the steady state κ/ϕ (which can be obtained by setting ẏ = 0).
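A minimal numerical check of this closed form against a crude Euler discretisation of the ODE (the parameter values are arbitrary):

import numpy as np

phi, kappa, y0 = 0.5, 1.0, 3.0
dt, T = 1e-4, 10.0

y = y0
for _ in range(int(T / dt)):
    y += dt * (kappa - phi * y)    # Euler step on ydot = kappa - phi*y

analytic = y0 * np.exp(-phi * T) + (kappa / phi) * (1.0 - np.exp(-phi * T))
print(y, analytic)                  # both approach the steady state kappa/phi = 2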
Let us use this simple case to see why the general solution of a non-homogeneous ODE looks
like (20). The reason has a lot to do with the fact that we are solving indefinite integrals
and how we choose the undetermined constants. The solution of the associated homogeneous
equation is y(t) = Ae^{−ϕt}, which yields ẏ + ϕy = 0 for any A, and allows us to satisfy y(0) = y0
once A is appropriately chosen, but clearly does not solve the full ODE because it does not take
care of q(t). The particular solution is a(t)h(t) = (κ/ϕ) e^{ϕt} e^{−ϕt} = κ/ϕ, which solves the full ODE since
ẏ + ϕy becomes 0 + κ, which equals q(t) = κ, but does not satisfy the condition y(0) = y0. The
sum of the two solutions, y = Ae^{−ϕt} + κ/ϕ, instead works, because it satisfies both the full ODE
as well as the initial condition (once A is solved for).
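To see the decomposition at work numerically, here is a minimal sketch (Python) that checks the closed-form solution y(t) = y0 e^{−ϕt} + (κ/ϕ)(1 − e^{−ϕt}) against a numerical integration of ẏ = κ − ϕy; the parameter values are illustrative.

import numpy as np
from scipy.integrate import solve_ivp

phi, kappa, y0 = 0.5, 1.0, 3.0   # illustrative values
closed_form = lambda t: y0 * np.exp(-phi * t) + (kappa / phi) * (1 - np.exp(-phi * t))
# Integrate ydot = kappa - phi*y numerically and compare with the closed form
num = solve_ivp(lambda t, y: kappa - phi * y, (0, 10), [y0], dense_output=True)
grid = np.linspace(0, 10, 50)
print(np.max(np.abs(num.sol(grid)[0] - closed_form(grid))))  # ~0 up to solver tolerance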

8 Dynamic Optimisation in Continuous Time

8.1 Finite horizon

Our typical problem has the form


V(x0 , 0) = max_{x(t),a(t)} ∫_0^T r(t, x(t), a(t)) dt

subject to

ẋ(t) := dx(t)/dt = f(t, x(t), a(t))
x(0) = x0 given
g(x(T)) ≥ 0.

where x, a are time functions that map from [0, T ] into a set A to be specified, x(t) is the
state variable, a(t) is the control variable, r is the return function and g specifies the terminal
condition on the state variable (e.g. the no Ponzi game condition).
The continuous time equivalent of the Lagrangian is the Hamiltonian^{19}

H(t, x, a, λ) = r(t, x, a) + λf (t, x, a).

The necessary equilibrium conditions are the following:

Ha = 0 (maximum principle)
λ̇(t) = −Hx (adjoint equation)
ẋ(t) = f (t, x, a) (dynamics)
λ(T ) = ζ g ′ (x(T )) (transversality condition)

We now formally derive these conditions using the so-called “variational” approach.

19
We now start omitting the dependence on time to lighten up notation, but keep in mind that x, a, λ are all
functions of time, not constants.

Proof. Consider a function â(t) that achieves the optimum of our problem. Consider a function
that deviates from â in the following way

a(t, ϵ) = â(t) + ϵη(t)

where η is an arbitrary continuous function. Using a(t, ϵ) as our solution for the control variable
we’d get associated dynamics for the state variable of the form

ẋ(t, ϵ) = f (t, x(t, ϵ), a(t, ϵ)).

Let

Φ(ϵ) = ∫_0^T r(t, x(t, ϵ), a(t, ϵ)) dt.

If â is the solution, then we should have that Φ(0) ≥ Φ(ϵ) for any ϵ. Take the dynamics,
multiply by the co-state and integrate, and do the same (but without integration) for the
terminal condition
∫_0^T λ(t)[f(t, x(t, ϵ), a(t, ϵ)) − ẋ(t, ϵ)] dt + ζ g(x(T, ϵ)) = 0

which holds by construction. The multiplier ζ is just a number, since the constraint only applies
at period T. Since this whole expression is zero, we can add it to our previous equation
Φ(ϵ) = ∫_0^T { r(t, x(t, ϵ), a(t, ϵ)) + λ(t)[ f(t, x(t, ϵ), a(t, ϵ)) − ẋ(t, ϵ) ] } dt + ζ g(x(T, ϵ)).

Integrate the term λ(t)ẋ(t) by parts


∫_0^T λ(t)ẋ(t, ϵ) dt = λ(T)x(T, ϵ) − λ(0)x0 − ∫_0^T λ̇(t)x(t, ϵ) dt.

Plug it back into our main equation and drop some function dependences for simplicity (we
keep time dependence for the terminal condition to remember things are a function of T there)

Φ(ϵ) = ∫_0^T [ r(t, x, a) + λ(t)f(t, x, a) + λ̇(t)x ] dt − λ(T)x(T, ϵ) + λ(0)x0 + ζ g(x(T, ϵ)).

Now differentiate with respect to ϵ

Φ′(ϵ) = ∫_0^T [ (rx + λfx + λ̇)xϵ + (ra + λfa)aϵ ] dt − [λ(T) − ζ g′(x(T, ϵ))] xϵ(T, ϵ).

Evaluate this derivative at ϵ = 0, where recall â(t) = a(t, 0)

Φ′(0) = ∫_0^T [ (rx(t, x̂, â) + λfx(t, x̂, â) + λ̇)xϵ + (ra(t, x̂, â) + λfa(t, x̂, â))η ] dt − [λ(T) − ζ g′(x̂(T))] xϵ(T, 0).

Now let’s suppose that we picked the adjoint function λ(t) such that the term multiplying xϵ
goes away, i.e.
λ̇ = −rx (t, x̂, â) − λfx (t, x̂, â).

Then as long as the following conditions hold

ra (t, x̂, â) + λfa (t, x̂, â) = 0


λ(T ) = ζ g ′ (x̂(T ))

we are sure that Φ′ (0) = 0 for any arbitrary “deviation function” η.


The last three equations we have derived correspond to the optimality conditions we spelled
out earlier, together with the dynamics equation. ■

Remarks. First, the transversality condition (TVC), which is an optimality condition, relates
the multiplier of the terminal condition (which is instead a constraint) to the co-state λ evaluated
at the last period. Joining that condition with the complementary slackness condition of the
terminal constraint itself will yield a more familiar version of the TVC, as we will see in the
consumption-savings example.
Second, conditions (maximum principle)-(transversality condition) are necessary conditions for
an optimum. They are also sufficient if we make additional concavity assumptions which we
have not made here so far.
Third, the adjoint (or co-state) variable λ(t) is a time function that plays the same role as the
Lagrange multiplier in discrete time. Here it represents the “flow” value of relaxing the dynamics
equation (also called the flow constraint) by one unit, or the marginal value of incrementing the
state variable by one unit at time t in an optimal plan.

8.2 Infinite horizon

We now assume exponential discounting, return functions and state dynamics that do not
depend explicitly on time, and terminal conditions instead of terminal values.
The problem is

V(x0 , 0) = max_{x(t),a(t)} ∫_0^∞ e^{−ρt} r(x(t), a(t)) dt

subject to

ẋ(t) = f(x(t), a(t))
x(0) = x0 given
lim_{t→∞} b(t)x(t) ≥ 0

where b(t) is some exogenously defined function.


We define the present-value Hamiltonian as

H(t, x, a, λ) = e−ρt r(x, a) + λf (x, a).

The equilibrium conditions are still the maximum principle, the state dynamics, the adjoint
equation and the TVC

Ha = e^{−ρt} ra + λfa = 0
ẋ = f(x, a)
λ̇ = −Hx
lim_{t→∞} λ(t)x(t) = 0.

The intuitive (but informal) way to derive the TVC is to consider first a finite horizon setting
with terminal condition b(T )x(T ) ≥ 0. We get

ζ b(T ) = λ(T ).

Since the complementary slackness condition is ζ b(T)x(T) = 0, combining the two yields
λ(T)x(T) = 0; taking the limit for T → ∞ gives the infinite horizon TVC.

Now that we have discounting, the adjoint becomes

λ̇ = −Hx = −e^{−ρt} rx − λfx .

We can also use a different version of the Hamiltonian, called current-value because we discount
“forward” the costate. Define µ(t) = eρt λ(t). We have a Hamiltonian which is not an explicit
function of time any more, i.e. where there is no discounting

H cv (x, a, µ) = r(x, a) + µf (x, a).

The equilibrium conditions become

Ha^{cv} = ra + µfa = 0
ẋ = f(x, a)
µ̇ − ρµ = −Hx^{cv} = −rx − µfx
lim_{t→∞} e^{−ρt} µ(t)x(t) = 0.

The third equation follows from the fact that, since λ(t) = e^{−ρt} µ(t), we get that e^{ρt} λ̇(t) =
µ̇(t) − ρµ(t).

8.3 Consumption-savings model

The problem is

V(0, a0 ) = max_{a(t),c(t)} ∫_0^∞ e^{−ρt} u(c(t)) dt

such that

ȧ(t) = a(t)r − c(t)
a(0) = a0 given
lim_{t→∞} e^{−rt} a(t) ≥ 0.

Note that we keep the time index in the value function, even though it is an infinite horizon
problem, to denote the fact that V (0, a0 ) is a present-value function. We will show later how
we can get rid of the time argument and define a current-value function that is time invariant.

To see how to derive the flow budget constraint from its discrete time equivalent, start with it

at+1 = at (1 + r) − ct

then evaluate it not at t + 1 but at t + ∆ where ∆ is an arbitrary time interval

at+∆ = at (1 + r∆) − ct ∆

where we put a ∆ in front of the flow variables (consumption and interest) but not the stock
variables (the stock of savings). Divide through by ∆ and then take the limit

lim_{∆→0} (a_{t+∆} − a_t)/∆ = a_t r − c_t

which finally yields


ȧ(t) = a(t)r − c(t).

The current-value Hamiltonian is^{20}

H cv (a, c, µ) = u(c) + µ[ar − c]

Compute the maximum principle and the adjoint equation

Hc = u′ (c) − µ = 0
µ̇ − ρµ = −µr.

Differentiate Hc = 0 one more time

ċ u′′ (c) − µ̇ = 0.

20
The present-value Hamiltonian is

H^{pv}(t, a, c, λ) = e^{−ρt} u(c) + λ[ar − c].
Combine this with the adjoint equation and the first expression for Hc

ċ u′′ (c) = (ρ − r)µ = (ρ − r)u′ (c).

Rearranging

ċ/c = [u′(c) / (−c u′′(c))] (r − ρ).

The fraction on the RHS is the inverse of relative risk aversion, so with a CRRA utility function
it is equal to 1/γ, which yields the Euler equation in continuous time

ċ/c = (r − ρ)/γ.

Following the steps from above, the transversality condition is

lim_{t→∞} e^{−ρt} µ(t)a(t) = lim_{t→∞} e^{−ρt} u′(c(t))a(t) = 0.

The Euler equation is a homogeneous differential equation, which we can solve for c. These
are the steps

(dc(t)/dt) (1/c(t)) = (r − ρ)/γ
∫ dc(t)/c(t) = ∫ [(r − ρ)/γ] dt
c(t) = A e^{((r−ρ)/γ) t}
c(0) = A = c0
c(t) = c0 e^{((r−ρ)/γ) t}.

We can then plug this into the state dynamics

da(t)/dt = a(t) r − c0 e^{((r−ρ)/γ) t}.
This is a non-homogeneous differential equation. The solution to the homogeneous part is
∫ da(t)/a(t) = ∫ r dt
a(t) = e^{rt} B.

Guess a(t) = e^{rt} v(t) as a particular solution. Then ȧ(t) = r e^{rt} v(t) + e^{rt} v̇(t) = a(t)r + e^{rt} v̇(t).
Compare this with the state dynamics

a(t)r + e^{rt} v̇(t) = a(t)r − c(t)
e^{rt} v̇(t) = −c0 e^{((r−ρ)/γ) t}
∫ dv(t) = −c0 ∫ e^{(−r + (r−ρ)/γ) t} dt
v(t) = −c0 [γ/(r(1−γ) − ρ)] e^{((r(1−γ)−ρ)/γ) t} + D

and we can set D to zero as we only need one particular solution. Finally, our general solution
is the sum of the solution to the homogeneous equation and the particular solution

a(t) = e^{rt} v(t) + e^{rt} B
     = −c0 [γ/(r(1−γ) − ρ)] e^{((r−ρ)/γ) t} + e^{rt} B
a(0) = −c0 γ/(r(1−γ) − ρ) + B
B = a0 + c0 γ/(r(1−γ) − ρ).

So finally

a(t) = −c0 [γ/(r(1−γ) − ρ)] e^{((r−ρ)/γ) t} + e^{rt} [ a0 + c0 γ/(r(1−γ) − ρ) ]

which we can plug into the TVC to solve for c0 as a function of a0. It is easier to first do so in
some special cases.

Time preference equal to interest rate (ρ = r). First, let ρ = r (the rate of time
preference is equal to the interest rate on savings); consumption is then constant for any γ. We get

c(t) = c0
a(t) = c0/r + e^{rt} (a0 − c0/r).

To find the optimal value of c0 let us plug our solution in the TVC
lim_{t→∞} e^{−ρt} u′(c(t))a(t) = lim_{t→∞} e^{−ρt} c0^{−γ} [ c0/r + e^{rt}(a0 − c0/r) ] = c0^{−γ} (a0 − c0/r) = 0

which holds if and only if c0 = a0 r.


Consider the alternatives: if c0 > a0 r, then the agent would be consuming “too much” and
would have a terminal value of savings that is negative, thus violating the no Ponzi game
condition. If instead c0 < a0 r, then the agent would be consuming “too little”: the no Ponzi
game condition would not be violated, but the transversality condition above would not hold.
This makes the necessity part of the TVC clear: there are many paths c̃(t) with c̃0 < a0 r that
satisfy the constraints and the Euler equation, but are not optimal because there exists a plan
c(t) with c0 = a0 r that yields a higher utility.

Log utility (γ = 1). Then

c(t) = c0 e^{(r−ρ)t}
a(t) = (c0/ρ) e^{(r−ρ)t} + e^{rt} (a0 − c0/ρ).

Again, plugging this into the TVC we get

lim_{t→∞} e^{−ρt} c(t)^{−1} a(t) = lim_{t→∞} e^{−ρt} e^{(ρ−r)t} c0^{−1} [ (c0/ρ) e^{(r−ρ)t} + e^{rt}(a0 − c0/ρ) ]
= lim_{t→∞} e^{−ρt} (1/ρ) + (1/c0)(a0 − c0/ρ) = (1/c0)(a0 − c0/ρ) = 0

which implies c0 = a0 ρ.

General case. When no simplifying assumptions on parameters are made, we can still plug
our general solution into the TVC. After a bit of algebra we obtain

c0^{1−γ} [γ/(ρ − r(1−γ))] lim_{t→∞} e^{−((ρ−(1−γ)r)/γ) t} + c0^{−γ} [ a0 − c0 γ/(ρ − (1−γ)r) ] = 0.

Getting rid of the second term requires c0 = a0 (ρ − (1−γ)r)/γ. For this to make sense, we will then
need parameters to be such that (ρ − (1−γ)r)/γ > 0, so that c(t) > 0 and the first term converges
to zero as t → ∞, which is necessary for the TVC to be satisfied. The condition on parameters
can be rewritten as (r − ρ)/γ < r, which means that the growth rate of consumption implied by the
Euler equation must be smaller than the growth rate of savings implied by the homogeneous
part of the flow budget constraint. This makes intuitive sense, as the growth rate of savings
implied by the general solution for a(t) is a combination of these two objects.

9 Continuous Time Dynamic Programming

9.1 Finite horizon

Let’s go back to a finite horizon problem


V(0, x0 ) = max_{x(t),a(t)} ∫_0^T r(t, x(t), a(t)) dt
ẋ(t) = f(t, x(t), a(t))
x(0) = x0 given
g(x(T)) ≥ 0.

Consider the problem on a subset of the whole time interval, [τ, T] ⊂ [0, T]

V(τ, x(τ)) = max_{x(t),a(t)} ∫_τ^T r(t, x(t), a(t)) dt
ẋ(t) = f(t, x(t), a(t))
x(τ) given
g(x(T)) ≥ 0.

Now, consider an even smaller subset of time which ends before T

V(τ, x(τ)) = max_{x(t),a(t)} ∫_τ^{τ+h} r(t, x(t), a(t)) dt + V(τ + h, x(τ + h))
ẋ(t) = f(t, x(t), a(t))
x(τ) given.

Note that the terminal condition disappears because it’s only there at the end of time, i.e. when
τ + h = T.
We could in principle keep going and divide each V into smaller and smaller time intervals.
This is an illustration of Bellman’s principle of optimality: the continuation of an optimal plan
is itself optimal, so we can break up a problem in smaller pieces and proceed from the end
backwards, and we will always be looking for an optimal plan in the time interval at hand since
the future already unfolds according to an optimal plan.
Take the last expression, bring the LHS to the right and divide through by h
0 = max_{x(t),a(t)} { ∫_τ^{τ+h} [r(t, x(t), a(t))/h] dt + [V(τ + h, x(τ + h)) − V(τ, x(τ))]/h }.

Take the limit for h → 0 and we get

0 = max_{a(τ)} { r(τ, x(τ), a(τ)) + Vx(τ, x(τ)) f(τ, x(τ), a(τ)) + Vt(τ, x(τ)) }.    (21)

This is the Hamilton-Jacobi-Bellman (HJB) equation, the continuous time equivalent of the
Bellman equation. It is a partial differential equation (PDE), with final condition g(x(T )) ≥ 0.
If the function V , which we still do not know, is differentiable with respect to x and t, then
it must satisfy the HJB, the initial condition, and the terminal condition. Note that, since we
shrank the time interval to an infinitesimal step, we are not maximising with respect to the
state variable any more, because we are implicitly assuming that it cannot be changed in an
instant, while the control variable can.

9.2 Infinite horizon

Let’s go back to an infinite horizon problem starting from some period τ


V(τ, x(τ)) = max_{x(t),a(t)} ∫_τ^∞ e^{−ρt} r(x(t), a(t)) dt
ẋ(t) = f(x(t), a(t))
x(τ) given
lim_{t→∞} b(t)x(t) ≥ 0.

This is the present value function, because we are discounting everything to t = 0. We can
consider a rescaled version, the current value function
V(x(τ)) := e^{ρτ} V(τ, x(τ)) = max_{x(t),a(t)} ∫_τ^∞ e^{−ρ(t−τ)} r(x(t), a(t)) dt

which is no longer a function of time, as the distance between t = τ and t = 0 becomes
irrelevant. So we have defined a current, time-invariant value function which depends only on
the initial value of the state variable, and which is such that

V(x) = V(x, 0)
e^{−ρt} V(x) = V(x, t)
Vt(x, t) = −ρ e^{−ρt} V(x).

We can derive the HJB using the same reasoning as before. With the present value function
V(x(τ), τ) = max_{x(t),a(t)} ∫_τ^{τ+h} e^{−ρt} r(x(t), a(t)) dt + V(x(τ + h), τ + h)

and with the current value function

e^{−ρτ} V(x(τ)) = max_{x(t),a(t)} e^{−ρτ} ∫_τ^{τ+h} e^{−ρ(t−τ)} r(x(t), a(t)) dt + e^{−ρ(τ+h)} V(x(τ + h))

simplifying and rearranging

0 = max_{x(t),a(t)} e^{−ρτ} ∫_τ^{τ+h} e^{−ρ(t−τ)} r(x(t), a(t)) dt + e^{−ρ(τ+h)} V(x(τ + h)) − e^{−ρτ} V(x(τ)),

dividing by h

0 = max_{x(t),a(t)} { e^{−ρτ} ∫_τ^{τ+h} e^{−ρ(t−τ)} [r(x(t), a(t))/h] dt + [e^{−ρ(τ+h)} V(x(τ + h)) − e^{−ρτ} V(x(τ))]/h }

taking the limit for h → 0

0 = max_{a(t)} { e^{−ρt} r(x(t), a(t)) − ρ e^{−ρt} V(x(t)) + e^{−ρt} V′(x(t)) ẋ(t) }

where the last two terms come from the total differentiation of e−ρt V (x(t)) and the chain rule.
We get the HJB (omitting time dependence now)

ρV(x) = max_a { r(x, a) + V′(x) f(x, a) }.    (22)

This is an ODE! The LHS gives us the “flow” value; the term r(x, a) gives us the “current”
or instantaneous payoff, and the term V′(x)f(x, a) gives us the “capital gain”, i.e. the marginal
value of a change in the state variable.
To see the connection with the Hamiltonian, in the same way in which we have linked the
sequential and recursive approaches in discrete time, take the FOC with respect to a in (22)

ra (x, a) + V ′ (x)fa (x, a) = 0

Define µ(t) = V ′ (x(t)). First, note that once you replace V ′ (x) with µ, the last equation is the
maximum principle. Then, differentiate with respect to t

µ̇(t) = V ′′ (x(t))ẋ(t).

Now differentiate the HJB evaluated at the optimum, so we get rid of the max operator, with
respect to x (not t this time)

ρV ′ (x) = rx (x, a) + V ′′ (x)f (x, a) + V ′ (x)fx (x, a).

Plug the second- and third-last equations into the last one and rearrange to get

µ̇ − ρµ = −rx − µfx
which is exactly the adjoint equation. So we have established that there is a connection between
the Hamiltonian and the HJB, and that the derivative of the value function is equal to the co-
state variable, or the shadow value of the state variable.

9.2.1 Consumption-savings model

Let’s go back to our example and see that we can derive the same Euler equation from the HJB.
The HJB is
ρV(a) = max_c { u(c) + V′(a)[ar − c] }.

Take the FOC


u′ (c) = V ′ (a).

Differentiate the FOC with respect to time

u′′ (c)ċ = V ′′ (a)ȧ = V ′′ (a)[ar − c].

Differentiate the HJB with respect to a

ρV ′ (a) = V ′′ (a)[ar − c] + V ′ (a)r.

Get rid of the value function derivatives

ρu′ (c) = u′′ (c)ċ + u′ (c)r

which finally yields


ċ/c = [u′(c) / (−c u′′(c))] (r − ρ).
9.2.2 Neoclassical growth model

Variational approach. The time-0 problem is


V(k0 ) = max_{k(t),c(t)} ∫_0^∞ e^{−ρt} u(c(t)) dt
k̇(t) = f(k(t)) − c(t) − δk(t)
k(0) = k0 given
k(t) ≥ 0
lim_{t→∞} k(t) b(t) ≥ 0.

The current value Hamiltonian is

H^{cv}(k, c, µ) = u(c) + µ[f(k) − δk − c].

Maximum principle
u′ (c) = µ.

Differentiate with respect to time


u′′ (c)ċ = µ̇.

Adjoint equation
µ̇ − ρµ = −Hk = −µ[f ′ (k) − δ].

Put everything together


u′′ (c)ċ − ρu′ (c) = −u′ (c)[f ′ (k) − δ]

which yields

ċ/c = [u′(c) / (−c u′′(c))] (f′(k) − δ − ρ).

Dynamic programming approach. The HJB is

ρV(k) = max_c { u(c) + V′(k)[f(k) − c − δk] }.

Take the FOC

u′(c) = V′(k).
Differentiate the FOC with respect to time

u′′ (c)ċ = V ′′ (k)k̇ = V ′′ (k)[f (k) − δk − c].

Differentiate the HJB with respect to k

ρV ′ (k) = V ′′ (k)[f (k) − δk − c] + V ′ (k)[f ′ (k) − δ].

Get rid of the value function derivatives

ρu′ (c) = u′′ (c)ċ + u′ (c)[f ′ (k) − δ]

which finally yields

ċ/c = [u′(c) / (−c u′′(c))] (f′(k) − δ − ρ).
Notice that this is the same as the Euler equation for the consumption-savings model once you
replace the marginal rate of return on savings (r) with that on capital (f ′ (k) − δ).

Characterising the solution. With CRRA, the Euler equation becomes

ċ/c = (1/γ)(f′(k) − δ − ρ).

Together with the dynamics for the state

k̇ = f (k) − δk − c

we have a system of two ODEs in two unknowns (k, c). We can do a phase diagram, since it’s
easy to characterise when ċ and k̇ are positive, negative or zero.
From the HJB, however, we can also get a single DE (rather than a system) which we can then
solve directly, either by hand or with numerical methods. Consider the FOC for consumption

u′ (c) = V ′ (k)

which implies that


c = (u′)^{−1}(V′(k));

with log utility, this becomes

c = 1/V′(k).

Plug this inside the HJB

ρV (k) = − log V ′ (k) + V ′ (k)[f (k) − δk] − 1. (23)

This is a nonlinear (see the logs) differential equation for V (k). With numerical methods, this
can be solved very quickly, in a way that is similar in spirit (but more efficient) to the value
function iteration method we used in discrete time.

Remark. We will not go through it here, but there exist theorems that prove (i) existence and
uniqueness of a solution to the HJB equation, even when there are non-differentiabilities, under
some conditions; and (ii) under what conditions the solutions of the dynamic programming
approach (HJB) and the variational approach (Hamiltonian) coincide.

9.3 Numerical solutions

Plain ODEs. Simpler versions of the procedure above can be used to solve simple first-order
ODEs of the form

ẋ(t) = f (x(t), t)
x(0) = x0 .

The simplest method is the Forward Euler scheme. Our ODE is to be solved for t ∈ (0, T ], and
so we will look for the solution x(t) at discrete time points ti = i ∆ for i = 1, . . . , n, where
clearly ti+1 − ti = ∆. For simplicity, let xi := x(ti ). We will approximate ẋ(t) with a one-sided
forward difference
ẋ(ti) ≈ [x(ti+1) − x(ti)]/∆ = (xi+1 − xi)/∆.
The idea is that if you know xi then you can compute xi+1 from

(xi+1 − xi)/∆ = f(xi , ti)  ⇒  xi+1 = xi + ∆ f(xi , ti).
We have x0 , so we start with
x1 = f (x0 , t0 )∆ + x0

and then we keep going iteratively.
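A minimal implementation of the scheme (Python); the test equation ẋ = −x with x(0) = 1, whose exact solution is e^{−t}, is just an illustration.

import numpy as np

def forward_euler(f, x0, T, n):
    # Solve x'(t) = f(x, t) on [0, T] at the points t_i = i*Delta
    dt = T / n
    x = np.empty(n + 1)
    x[0] = x0
    for i in range(n):
        x[i + 1] = x[i] + dt * f(x[i], i * dt)  # x_{i+1} = x_i + Delta*f(x_i, t_i)
    return x

x = forward_euler(lambda x, t: -x, x0=1.0, T=5.0, n=500)
print(x[-1], np.exp(-5.0))  # close; the error shrinks as Delta gets smaller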

Bellman equations. As said before, equations like (22) are nonlinear differential equations
that can be efficiently solved with a computer. Convergence is typically achieved in well-behaved
problems, although there is no theorem equivalent to the CMT here that guarantees it. To solve
for v in (22), the procedure is as follows:
• construct a grid for capital k = [k1 , ..., kn ], where ∆k defines the distance between points
• construct an initial guess for v, i.e. a vector v = [v1 , ..., vn ]
• compute v′(k) (again, a vector v′ = [v′_1, ..., v′_n]) by taking finite differences of v. There are
different options here: one can take backward differences

v′_{i,B} = (v_i − v_{i−1})/∆k,

forward differences

v′_{i,F} = (v_{i+1} − v_i)/∆k,

or central differences

v′_{i,C} = (v_{i+1} − v_{i−1})/(2∆k).

We'll use different methods depending on which gives the lowest absolute drift in the state
variable (“upwind scheme”). If v is concave then we should get that v′_{i,F} < v′_{i,B}.
• use the FOC to get consumption c_i from v′_i

c_{i,j} = (u′)^{−1}(v′_{i,j})

for j ∈ {F, C, B}. Since utility is also concave, we'll have that c_{i,F} > c_{i,B}.
• compute the drift in capital using the state dynamics equation

k̇i,j = f (ki ) − δki − ci,j

again for j ∈ {F, C, B}. We’ll have that k̇i,F < k̇i,B .
• to decide which method to use to compute vi′ and ci
– if 0 < k̇i,F < k̇i,B , use forward differencing

– if k̇i,F < k̇i,B < 0, use backward differencing
– if k̇i,F < 0 < k̇i,B , assume we are in the steady state for capital, set ci = f (ki ) − δki
and vi′ = u′ (f (ki ) − δki ).
• compute the new value (iteration n + 1) for vi using the HJB and the recently obtained
values for ci and vi′ (from iteration n)


ρv_i^{n+1} = u(c_i^n) + v_i^{′n} k̇_i^n .

This method is the most intuitive, but such a “full” update of the value function creates
convergence problems. What works is to update the value function slowly, i.e. to set


v_i^{n+1} = v_i^n + λ[ u(c_i^n) + v_i^{′n} k̇_i^n − ρ v_i^n ],

choosing a small value for the step size λ; a sketch of the full scheme follows below.
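Here is a minimal sketch of the whole scheme for the neoclassical growth model with log utility (Python). The grid bounds, the initial guess, the damping step λ and the stopping rule are illustrative choices, not part of the algorithm itself; if the iteration diverges, lower λ.

import numpy as np

alpha, delta, rho = 1/3, 0.076, 0.024   # illustrative parameter values
f = lambda k: k**alpha
kss = (alpha / (rho + delta))**(1 / (1 - alpha))   # steady-state capital
n = 500
k = np.linspace(0.3 * kss, 2 * kss, n)
dk = k[1] - k[0]
net = f(k) - delta * k                  # net output f(k) - delta*k
v = np.log(net) / rho                   # guess: consume net output forever
lam = 0.1                               # small update step

for it in range(100_000):
    dvF, dvB = np.empty(n), np.empty(n)
    dvF[:-1] = (v[1:] - v[:-1]) / dk
    dvB[1:] = (v[1:] - v[:-1]) / dk
    dvF[-1] = 1 / net[-1]               # boundary values from kdot = 0
    dvB[0] = 1 / net[0]
    cF, cB = 1 / dvF, 1 / dvB           # FOC with log utility: c = 1/v'
    sF, sB = net - cF, net - cB         # capital drift under each difference
    # upwind choice: forward if drift > 0, backward if drift < 0, else kdot = 0
    dv = np.where(sF > 0, dvF, np.where(sB < 0, dvB, 1 / net))
    c = 1 / dv
    kdot = net - c
    v_new = v + lam * (np.log(c) + dv * kdot - rho * v)   # damped HJB update
    if np.max(np.abs(v_new - v)) < 1e-8:
        break
    v = v_new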

10 Stochastic Dynamic Programming in Continuous Time

10.1 Review of Poisson processes

We now consider the simplest form of randomness in CT, which is Poisson processes. Consider a
random variable zt , with a discrete support given by {z1 , . . . , zn }, that has a transition intensity
λij . That means that a jump of zt from value zi to zj is an event with a Poisson intensity λij .
Let's refresh the concepts of intensity and arrival probability. A point process is an increasing
sequence of random (time) points 0 < t1 < t2 < . . . < tn < . . ., each of which is a random
variable ti indicating the time at which the i-th occurrence of an event has happened. The
random counting function Nt indicates the number of events that have happened up to t. A point
process, or its counting function Nt, is a Poisson process if {Nt , t ≥ 0} is a process with stationary
independent increments. Independence means that the increments Nti − Nti−1 for all i are
independent; stationarity means that Nt − Ns depends only on t − s. If Nt is the counting
function of a Poisson process, then

P(Nt − Ns = k) = e^{−λ(t−s)} [λ(t − s)]^k / k!    for k ∈ N,

that is, the number of events taking place in a time interval follows a Poisson distribution with
rate parameter λ(t − s). The parameter λ is the intensity of the Poisson process, and it gives
the average number of events in a unit of time (t − s = 1):

E[Nt+1 − Nt ] = λ.

Now, let an event be a “jump” from zi to zj as mentioned earlier. The probability of observing
such a jump during a period of time of length h is given by

P(Nt+h − Nt = 1) = e−λh λh ≈ λh

where the approximation (a first-order Taylor expansion) holds when h is small. The probability
of observing zero jumps in such a period is e^{−λh}, which is approximately equal to 1 − λh up
to first order. The probability of observing two jumps in such a period is e^{−λh}(λh)²/2, which is
approximately zero up to first order.
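A quick simulation makes the approximation concrete (Python; λ = 0.5 and h = 0.01 are illustrative): drawing a jump with probability λh in each small interval reproduces, on average, λ events per unit of time.

import numpy as np

rng = np.random.default_rng(0)
lam, h, T = 0.5, 0.01, 10_000.0
n = int(T / h)
# In each interval of length h, a jump occurs with probability ~ lam*h
jumps = rng.random(n) < lam * h
print(jumps.sum() / T)   # ~ lam: the average number of jumps per unit of time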

10.2 Stochastic HJB

Consider now, to make things more transparent, the usual example of the consumption savings
model. Here the agent receives stochastic labour earnings w, where w ∈ {wl , wh } and the jumps
from one value to the other are Poisson with intensities λlh = λhl = λ. Let’s derive the stochastic
HJB in this case with the usual procedure.
The problem between t and t + h, conditional on w(t) = wi , is

e^{−ρt} V(a(t), wi) = max_{a(s),c(s)} [ e^{−ρt} ∫_t^{t+h} e^{−ρ(s−t)} u(c(s)) ds
    + e^{−ρ(t+h)} ((1 − λh) V(a(t + h), wi) + λh V(a(t + h), wj)) ]

where wj is the other realisation of w. Rearrange the expression:

0 = max_{a(s),c(s)} { e^{−ρt} ∫_t^{t+h} e^{−ρ(s−t)} u(c(s)) ds + e^{−ρ(t+h)} V(a(t + h), wi) − e^{−ρt} V(a(t), wi)
    + e^{−ρ(t+h)} λh [ V(a(t + h), wj) − V(a(t + h), wi) ] }
Dividing by h everywhere, taking the limit for h → 0, and omitting time subscripts, we get

ρV(a, wi) = max_c { u(c) + Va(a, wi)[ar − c + wi] + λ[V(a, wj) − V(a, wi)] }.

Clearly, there exists another HJB for wj , where the i and j wage subscripts are inverted. This
HJB is identical to the deterministic one, except for the fact that now we have the partial
derivative(s) of the value function with respect to the deterministic state(s), and that there is
an additional term that represents wage risk.

10.3 Stochastic Euler equation

How do we derive the Euler equation in this setting? First, let’s take the FOC for consumption

u′ (c) = Va (a, wi ).

Second let’s differentiate the HJB with respect to assets

ρVa (a, wi ) = Vaa (a, wi )[ar − c + wi ] + Va (a, wi )r + λ[Va (a, wj ) − Va (a, wi )]. (24)

Now we have to change our approach because we can’t differentiate the FOC with respect to
time and keep going, since all variables of interest (consumption and savings) follow stochastic
paths. The road to follow is to try to get an analogue of the discrete time Euler equation with
uncertainty, which relates the expected growth rate of marginal utility to the interest rate and
the discount factor
E[u′(c_{t+1})] / u′(c_t) = 1/(βR).
To do so, let’s define the infinitesimal generator A of some arbitrary function of the states
f (a, w) as the following expected time derivative:

Af(a(t), w(t)) = lim_{h→0} { Et[f(a(t + h), w(t + h))] − f(a(t), w(t)) } / h.

In words, we have to be careful in using time derivatives now, and we need expected time
derivatives rather than deterministic ones such as the ċ we have used earlier. In our current
example, we know that we can approximate that expectation as

E[f (a(t + h), w(t + h)) | w(t) = wi ] ≈ (1 − λh)f (a(t + h), wi ) + λhf (a(t + h), wj ).

Consider a first-order Taylor expansion

f (a(t + h), w) ≈ f (a(t), w) + fa (a(t), w)ȧh

and let’s use it in our expectation

E[f(a(t + h), w(t + h)) | w(t) = wi] ≈ (1 − λh)[f(a(t), wi) + fa(a(t), wi) ȧ h] + λh[f(a(t), wj) + fa(a(t), wj) ȧ h]
    ≈ f(a(t), wi) + λh[f(a(t), wj) − f(a(t), wi)] + fa(a(t), wi) ȧ h

where the last approximation comes from the fact that the terms in h² are of second order
and so we can ignore them because they will be very small.
So finally we have our operator definition reduced to

Af (a, wi ) = λ[f (a, wj ) − f (a, wi )] + ȧfa (a, wi ).

Now let us apply it to f (a, w) = Va (a, w):

AVa (a, wi ) = λ[Va (a, wj ) − Va (a, wi )] + ȧVaa (a, wi )

and let us plug it in (24)


(ρ − r) Va(a, wi) = AVa(a, wi)

which gives us

AVa(a, wi) / Va(a, wi) = ρ − r

or

Au′(c) / u′(c) = ρ − r.
That is, the expected percentage change in marginal utility is a function of the interest rate and
the rate of time preference, exactly as we saw in the discrete time case.

10.4 Kolmogorov forward equation

With idiosyncratic (i.e. individual) shocks, such as those we just saw, we have in hand a
heterogeneous agents model. Some agents will earn little, some a lot, so we’ll have a non-trivial
asset market clearing condition, and a non-trivial distribution of wealth across agents. With the
Kolmogorov forward (KF) equation, we can characterise such distribution and derive its law of
motion.
To derive the KF equation, we once again start from a discrete time approximation. Let

si (t) = at r − ct + wi

denote the savings function at time t of an individual with wage wi and wealth a (which we
omit to keep notation light). It follows that

at+h = at + hsi (t). (25)

Our goal is to derive the density of the wealth distribution gi (a, t), i.e. the mass of people with
wealth a and wage wi at moment t. Let us however first start with the CDF of such distribution

Gi (x, t) = P(at < x, wt = wi ).

Ignore for a second the possibility that wage earnings jump. Using the discrete dynamics in
(25) we can write

Gi (x, t + h) = P(at+h < x, wt+h = wi ) = P(at < x − hsi (t), wt+h = wi ) = Gi (x − hsi (t), t).

Reintroducing the stochastic earning process

Gi (x, t + h) = Gi (x − hsi (t), t)(1 − λh) + Gj (x − hsj (t), t)λh

subtract Gi (x, t) from both sides and divide through by h

[Gi(x, t + h) − Gi(x, t)]/h = [Gi(x − h si(t), t) − Gi(x, t)]/h + λ[Gj(x − h sj(t), t) − Gi(x − h si(t), t)]
take the limit for h → 0

∂Gi(x, t)/∂t = −si(t) ∂Gi(x, t)/∂x + λ[Gj(x, t) − Gi(x, t)].

This equation already has a clear intuition: the time change in the share of people with wage wi
and below a certain wealth level x (LHS) is given by inflow/outflows due to continuous changes
in wealth (first term on the RHS) and due to discrete jumps in wage earnings (second term on
the RHS). If we differentiate with respect to a, we can get the KF equation with the densities

∂gi(x, t)/∂t = −∂[si(t) gi(x, t)]/∂x + λ[gj(x, t) − gi(x, t)].

Finally, the stationary distribution gi (a) is that which, by construction, does not change over
time. We can write a KF equation for that too

0 = −d[si(a) gi(a)]/da + λ[gj(a) − gi(a)].

This is now a differential equation in wealth alone! If you can solve it, then you can derive the
stationary wealth distribution without having to simulate the model over time.
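To make this concrete, here is a minimal finite-difference sketch (Python) that solves the stationary KF equation on a wealth grid. In a full model the savings policies si(a) would come from the HJB; the linear policies below are purely illustrative assumptions, and outward flows at the grid boundaries are simply suppressed (a state constraint).

import numpy as np

na, da, lam = 200, 0.05, 0.5
a = np.arange(na) * da
s = np.vstack([0.5 - 0.1 * a,        # assumed savings policy s_1(a), low wage
               1.5 - 0.2 * a])       # assumed savings policy s_2(a), high wage

# Transposed generator acting on the stacked masses g = (g_1, g_2): upwind
# finite differences for the drift term -d[s_i(a) g_i(a)]/da, plus switching.
nn = 2 * na
A = np.zeros((nn, nn))
for i in range(2):
    for j in range(na):
        col = i * na + j
        drift = s[i, j]
        if drift > 0 and j < na - 1:     # mass flows up the grid
            A[col, col] -= drift / da
            A[col + 1, col] += drift / da
        elif drift < 0 and j > 0:        # mass flows down the grid
            A[col, col] += drift / da
            A[col - 1, col] -= drift / da
        A[col, col] -= lam               # Poisson jump to the other wage state
        A[(1 - i) * na + j, col] += lam

# The stationary distribution solves A g = 0; replace one equation with the
# normalisation sum(g) = 1 and solve the resulting linear system.
A[0, :] = 1.0
b = np.zeros(nn)
b[0] = 1.0
g = np.linalg.solve(A, b)                # stationary masses, stacked by wage state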

11 Real Business Cycle Theory


The behaviour of macroeconomic variables (GDP, consumption, hours worked, ...) is typically
decomposed into two components, a trend and a cycle. The study of the trend is growth theory,
which focuses on what determines the long-run behaviour of the economy. We do not cover
it here. The study of the cycle is RBC theory, which focuses on what determines short-run
fluctuations around the trend, and how policy can affect those. We use the adjective “real”
because we will study a model without money or any other nominal variables.
Before the so-called RBC revolution, the main doctrine was Keynesianism, which postulated
that investors were driven by non-rational “animal spirits”, and studied the aggregate effects of
these confidence swings on the economy, and especially the government role in correcting these
spirits, by stimulating or cooling off demand. On top of that, and this is the famous critique
by Lucas (1976), Keynesian models studied policy by taking the patterns of agents’ behaviour
as given (e.g. their savings or consumption function).

The main point of the RBC revolution (Lucas (1977), Kydland and Prescott (1982) and Long
and Plosser (1983)) was that, on the contrary, the behaviour of optimising agents does change
as a function of economic policy. These papers proposed macroeconomic models that had
three key components: (i) they were micro-founded, modelling behaviour at the individual
level; (ii) they assumed a single shock, technology, that would be the sole driver of short-run
fluctuations; and (iii) they were all “real”, in the sense that they did not consider the role
and behaviour of monetary variables. The important other feature of RBC theory is efficiency:
perfect competition and frictionless markets are assumed, so the welfare theorems do apply.
Of course RBC theory has competitors in claiming the ability to explain short-run fluctuations.
The most relevant alternatives are New-Keynesian models, which are monetary models where
price rigidity acts as the main mechanism of shock propagation; and sunspot theories, where
the existence of micro-founded multiple equilibria implies that cycles are oscillations between
them.
Much of modern macroeconomics uses the basic RBC model as a frictionless benchmark, on
which one can add all sort of frictions (financial, information, search, rationality) to improve
the model’s ability to explain recent (or less so) facts.

11.1 Some stylised facts

Kydland and Prescott (1982) paved the way by filtering the data, identifying cycles, and then
documenting a number of stylised facts, akin to the Kaldor facts you have studied with respect
to growth.
To de-trend variables, they used the Hodrick-Prescott (HP) filter. The filter works in the
following way
min_{ {ȳt}_{t=1}^T } Σ_{t=1}^T (yt − ȳt)²

subject to

Σ_{t=2}^{T−1} [ (ȳ_{t+1} − ȳt) − (ȳt − ȳ_{t−1}) ]² ≤ K.

Assuming that the constraint binds, we can write the following Lagrangian

L = Σ_{t=2}^{T−1} { (yt − ȳt)² + µ [ (ȳ_{t+1} − ȳt) − (ȳt − ȳ_{t−1}) ]² } + (y1 − ȳ1)² + (yT − ȳT)²
where we omit K and instead pick µ in order to choose the importance of the trend smoothness
(the constraint) relative to trend fit (the objective). The typical values used in the literature
are 1600 for quarterly data and 400 for annual data. Once one has derived a sequence for ȳt ,
the cycle can be obtained from yt − ȳt .
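Stacking the first-order conditions of the unconstrained problem gives a closed form for the trend: (I + µ D′D) ȳ = y, where D is the (T−2)×T second-difference matrix. A minimal implementation (Python):

import numpy as np

def hp_filter(y, mu=1600.0):
    # Trend solves (I + mu*D'D) ybar = y, with D the second-difference operator
    y = np.asarray(y, dtype=float)
    T = len(y)
    D = np.zeros((T - 2, T))
    for t in range(T - 2):
        D[t, t:t + 3] = (1.0, -2.0, 1.0)
    trend = np.linalg.solve(np.eye(T) + mu * (D.T @ D), y)
    return trend, y - trend   # trend and cycle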
Let σ̃x = σx/µx denote the coefficient of variation, so we can express standard deviations in
percentage terms of means. Kydland and Prescott's business cycle facts are the following.

1. σ̃c < σ̃y (consumption smoothing)


2. σ̃d > σ̃y (durable consumption is volatile)
3. σ̃i ≈ 3 σ̃y (investment is very volatile)
4. σ̃tb > σ̃y (trade balance, i.e. exports minus imports, is volatile)
5. σ̃n ≈ σ̃y (total hours worked are similar to output)
6. σ̃e ≈ σ̃y (employment is similar to output)
7. σ̃k << σ̃y (the capital stock is slow-moving)
8. σ̃w < σ̃y/n (output per hour is more volatile than real hourly wages).

A few more facts are worth stating

1. ρ(x, y) > 0 and high for x ∈ { investment, consumption, employment, hours worked }
2. all macro aggregates display fairly high serial correlation, including TFP.

11.2 The basic RBC model

We now introduce the framework we will use. We will follow a “cookbook”: first, we specify the
model we use, including its functional forms (utility, production, etc...); second, we calibrate
the model, i.e. we pick the parameters from external estimates; third, we solve the model,
which is typically done using numerical methods; fourth, we simulate the model and analyse its
outcomes.
We will use a stochastic version of the neoclassical growth model, with the quantitative goal of
“calibrating it” to try and match the facts we mentioned above.
The objective is

max_{ {ct ,nt ,lt ,it ,kt+1} } E0 Σ_{t=0}^∞ β^t u(ct , lt)
subject to

ct + it = zt F (kt , nt )
F(kt , nt) = kt^α nt^{1−α}
it = kt+1 − (1 − δ)kt
lt + nt = 1
log(zt+1 ) = ρ log(zt ) + ϵt+1 .

As said, the goal here is to specify functional forms, solve the model, pick some values for the
parameters, and examine the model’s quantitative performance. To pick parameters, we will
use here external sources, such as micro data or long-run trend data. Note that this is different
from moment matching, where parameters are picked in order to minimise the distance between
the moments generated by a simulation of the model and those measured in the data. We will
come back to this later.
The first thing we need to measure is the most important: the only shock in this economy,
total factor productivity (TFP). To measure it in the data, take the production function yt =
zt F (kt , nt ), perform total differentiation

dyt = F (kt , nt )dzt + zt Fk (kt , nt )dkt + zt Fn (kt , nt )dnt .

Then divide each term by yt and simplify some things

dyt/yt = dzt/zt + [zt Fk(kt , nt)/yt] dkt + [zt Fn(kt , nt)/yt] dnt .

With perfect competition, the firm’s FOCs imply

wt = zt Fn (kt , nt )
rtk = zt Fk (kt , nt )

so we get
dyt/yt = dzt/zt + (rt^k kt/yt)(dkt/kt) + (wt nt/yt)(dnt/nt).
Since F(k, n) = k^α n^{1−α} has constant returns to scale, we have that kFk + nFn = F, and the
factor shares in output are constant and given by rt^k kt/yt = α and wt nt/yt = 1 − α. Hence
= α and yt
= 1 − α. Hence

dzt/zt = dyt/yt − α dkt/kt − (1 − α) dnt/nt ,

and we can get the time series for TFP growth rates using the time series for output, capital
and labour, as well as their factor shares. Once we have the time series for zt , we can estimate
the AR(1) process it follows.
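In code, the measurement amounts to two lines of algebra plus an OLS regression. A minimal sketch (Python); the input series are assumed to be log output, log capital and log hours taken from the data.

import numpy as np

def solow_residual(log_y, log_k, log_n, alpha=1/3):
    # log z_t = log y_t - alpha*log k_t - (1 - alpha)*log n_t
    return log_y - alpha * log_k - (1 - alpha) * log_n

def estimate_ar1(log_z):
    # OLS of (demeaned) log z_{t+1} on log z_t gives rho; residuals give sigma_eps
    z = log_z - log_z.mean()
    rho = (z[:-1] @ z[1:]) / (z[:-1] @ z[:-1])
    eps = z[1:] - rho * z[:-1]
    return rho, eps.std()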
Now on to functional forms. For utility we pick

u(c, l) = [ (c^{1−θ} l^θ)^{1−σ} − 1 ] / (1 − σ).

Here θ is the relative importance of leisure vs consumption, and σ is the degree of relative risk
aversion (and its inverse is the intertemporal elasticity of substitution). The functional form
for production is a standard CRS function, as specified above.
Even when studying cycles, one may want to include growth factors. One example is to have pop-
ulation growth (e.g. at some constant rate η) as well as labour-augmenting technical progress,
which amounts to exponential growth in TFP of the form

zt = z0 (1 + γ)^{t(1−α)} e^{ωt}
ωt = ρ ω_{t−1} + ϵt .

With these assumptions, to study cycles we would have to de-trend the model by filtering out
the trend, which is deterministic in this case. Here for simplicity we will abstract from this and
consider a fully stationary economy where there is no trend growth but only fluctuations.
The equilibrium equations are the labour supply equation

(θ/(1 − θ)) (ct/lt) = wt

the Euler equation


uc(t) = β Et[ (zt+1 Fk(kt+1 , nt+1) + 1 − δ) uc(t + 1) ]

and the firm FOCs we stated earlier.

Calibration. Now, let’s see how we pick values for parameters.
The parameters of the TFP process are estimated on US data (for 1950-1990 in the original
paper) in the way we discussed earlier. We have ρ = 0.972 and σϵ2 = 0.0072.
Taking the law of motion for capital in steady state

1 = 1 − δ + iss/kss

we can use the long-run average iss/kss = 0.076 to pin down the value of δ.
We set α = 1/3 to match the long run capital share in production, which is fairly stable over
time.
To compute utility function parameters, we use the optimality conditions. The risk aversion
coefficient is a tricky parameter to estimate, even more so when (like here) it both represents
risk aversion and the IES.^{21} Estimates range from 0.5 to 5 approximately, with the asset pricing
literature typically using even higher values. We’ll assume log utility, for simplicity and because
it is sort of in the middle of the range. To pick β, we use the Euler equation in steady state
which yields
1/β = α (yss/kss) + 1 − δ.
Using a long run output/capital ratio of 0.3012 we get β = 0.976.
To pick θ, we use the labour supply equation in steady state after substituting out wages

(θ/(1 − θ)) (1 − lss)/lss = (1 − α) (yss/css).

We get the long-run value of the output/consumption ratio from the data, yss/css = 1.173, use
the long-run values for labour and leisure (respectively 1/3 and 2/3 of the day) and solve directly
for θ = 0.61. The value of θ is a controversial subject, because it is a parameter that is the object
of many studies in the labour literature. For instance, the parameter θ has a mapping into the
Frisch wage elasticity of labour supply, that is the wage elasticity of nt when we keep constant
the marginal utility of wealth (hence the marginal utility of consumption, in this model). With
the value we derived for θ, we would get a value of 2 for the Frisch elasticity (can you show it?),
which is very much at odds with micro studies that find a value close to zero for it.

21
There are other types of utility functions, such as Epstein-Zin, that separate the two objects.

We have now set all of the parameters. The next steps are

1. Solve the model, in order to get policy functions for consumption, investment, labour and
leisure. We will see ways to do that in the next subsection;
2. Simulate the model: start from some value for k0 , draw a long series of TFP shocks,
compute time series for all other variables using the model policy functions, compute
moments (means, standard deviations, correlations) and check how they compare to their
data counterparts.

11.3 Perturbation methods

The solution methods that can be used to solve the model are either global or local. Global
methods, such as guess and verify or value function iteration, yield exact (often non-linear)
solutions. Local methods are typically called perturbation methods: we approximate the system
around some point (usually the steady state), and then we solve the simpler, approximated
problem.
We now look at the latter. The way we approximate the system is through log-linearisation
around the steady state. We’ll use the following notation: xt denotes a variable, x denotes its
steady-state, and x̂t denotes its log-deviation from steady state, which is approximately equal
to its percentage deviation from steady state, since x̂t := log(xt/x) ≈ (xt − x)/x.^{22}
To approximate a generic power expression, we'll use the rule xt^α ≈ x^α(1 + α x̂t). This comes
from

xt^α = x^α e^{α log(xt/x)} = x^α e^{α x̂t}

which we then approximate around x̂t = 0 using the following first-order Taylor expansion

e^{α x̂t} ≈ e^{α·0} + α e^{α·0} (x̂t − 0) = 1 + α x̂t .

22
The approximation here comes from a first-order Taylor expansion:

log xt ≈ log x + (1/x)(xt − x)

which implies

log(xt/x) ≈ (xt − x)/x.
Let us approximate the production function, as an example. Start from

yt = zt kt^α nt^{1−α}

replace variables with their approximations

y(1 + ŷt) = z(1 + ẑt) k^α(1 + α k̂t) n^{1−α}(1 + (1 − α) n̂t)

simplify the steady-state terms, and then expand the cross products

1 + ŷt = 1 + ẑt + α k̂t + (1 − α) n̂t

where we got rid of all products of log deviations (e.g. x̂t ŷt ) because they are approximately
zero up to a first-order. We get the log-linearised version of the production function

ŷt = ẑt + αk̂t + (1 − α)n̂t . (26)

There is another, alternative approach. Take logs of the production function directly

log yt = log zt + α log kt + (1 − α) log nt .

Replace with first-order approximations around the steady state, using x̂t = log(xt/x)

log y + ŷt = log z + ẑt + α log k + αk̂t + (1 − α) log n + (1 − α)n̂t ,

then get rid of the steady state variables and get to the same conclusion as with the previous
method.
Let us now approximate all relevant equilibrium conditions: the resource constraint

(c/y) ĉt + (k/y) k̂t+1 = ẑt + α k̂t + (1 − α) n̂t + (1 − δ)(k/y) k̂t    (27)

the Euler equation

Et ĉt+1 − ĉt = βα (y/k) Et[ ẑt+1 + (1 − α) n̂t+1 + (α − 1) k̂t+1 ]    (28)
the labour supply equation
ĉt + (n/(1 − n)) n̂t = ẑt + α k̂t − α n̂t    (29)
and the law of motion for TFP
ẑt+1 = ρẑt + ϵt+1 . (30)

Now that we have a different, approximated model in front of us, we can look for its solution, i.e.,
policy functions for consumption, future capital and hours worked. We will see two methods:
the method of undetermined coefficients (Uhlig (2001)), and the Blanchard and Kahn (1980)
method.

11.3.1 Method of undetermined coefficients

This method consists in guessing that the controls are linear functions of the state variables,
with coefficients to be solved for. Guess

ĉt = γc k̂t + µc ẑt


n̂t = γn k̂t + µn ẑt
k̂t+1 = γk k̂t + µk ẑt .

In other words we’re guessing a linear system ât = H x̂t where ât is a 3 × 1 vector of controls,
H is a 3 × 2 matrix of coefficients, and x̂t is a 2 × 1 vector of states.
Plug the guesses into the resource constraint

(c/y)[γc k̂t + µc ẑt] + (k/y)[γk k̂t + µk ẑt] = ẑt + α k̂t + (1 − α)[γn k̂t + µn ẑt] + (1 − δ)(k/y) k̂t

Collecting terms

k̂t [ (c/y)γc + (k/y)γk − α − (1 − δ)(k/y) − (1 − α)γn ] = ẑt [ −(c/y)µc − (k/y)µk + 1 + (1 − α)µn ].

Since this equation must hold at all points of the system, i.e. for any combination of values for
k̂t and ẑt , both square brackets must equal zero, which gives us two equations in 6 unknown
parameters.

Repeat the same process for the Euler equation

Et[ γc(γk k̂t + µk ẑt) + µc(ρẑt + ϵt+1) − (γc k̂t + µc ẑt) ] =
= βα (y/k) Et[ (ρẑt + ϵt+1) + (1 − α)( γn(γk k̂t + µk ẑt) + µn(ρẑt + ϵt+1) ) + (α − 1)(γk k̂t + µk ẑt) ].

Use the fact that Et ϵt+1 = 0, get rid of expectations and collect terms

k̂t { γc γk − γc − βα(y/k)[ (1 − α)γn γk + (α − 1)γk ] } =
= ẑt { −γc µk − µc ρ + µc + βα(y/k)[ ρ + (1 − α)(γn µk + µn ρ) + (α − 1)µk ] }.

As before, since this equation must hold at all points of the system, the curly brackets on
both sides of the equation must equal zero, which gives us two more equations in 6 unknown
parameters.
We repeat the same process for the labour supply equation

γc k̂t + µc ẑt + (n/(1 − n))(γn k̂t + µn ẑt) = ẑt + α k̂t − α(γn k̂t + µn ẑt)

which becomes

k̂t [ γc + (n/(1 − n))γn − α + αγn ] = ẑt [ −µc − (n/(1 − n))µn + 1 − αµn ]

which again gives us two equations in four parameters (rather than six, there are no γk , µk here).
What we do now is to solve for the coefficients. Let’s simplify this example by assuming that
labour supply is inelastic, so we lose the labour supply equation and the parameters γn , µn .
Take the terms for k̂t in the resource constraint and Euler equation

(c/y) γc + (k/y) γk = α + (1 − δ)(k/y)
γc (γk − 1) = βα (y/k) (α − 1) γk .

Put them together to get

γk² − γk [ 1 + α(y/k) + 1 − δ + αβ(1 − α)(y/k)(c/k) ] + α(y/k) + (1 − δ) = 0
which is a second order equation in γk that admits two solutions. How do we know which
solution is the right one for our policy function parameter? For typical parametrisations, we
get that
0 < γk,1 < 1 < γk,2 .

Suppose you choose γk,2 . You’d get that

k̂t+1 = γk,2 k̂t + µk ẑt

iterating backwards

k̂t+1 = (γ_{k,2})^{t+1} k̂0 + Σ_{j=0}^{t} (γ_{k,2})^j µk ẑ_{t−j} .

Since we chose the solution larger than 1, this implies that capital will explode to infinity (unless
k̂0 = 0, which is unlikely). This cannot be a solution, and you can use the TVC to see this
formally. Thus we’ll pick γk,1 , and then derive all other undetermined coefficients.
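Numerically, with the calibration derived earlier, the two roots and the choice of the stable one look as follows (Python; c/k is backed out from the steady-state relations i/k = δ and y/k = 0.3012):

import numpy as np

alpha, beta, delta, yk = 1/3, 0.976, 0.076, 0.3012
ck = yk - delta                            # c/k = y/k - i/k, with i/k = delta
b1 = 1 + alpha * yk + 1 - delta + alpha * beta * (1 - alpha) * yk * ck
c0 = alpha * yk + 1 - delta                # the constant term equals 1/beta
roots = np.roots([1.0, -b1, c0])           # gamma_k**2 - b1*gamma_k + c0 = 0
gamma_k = roots[np.abs(roots) < 1].item()  # keep the stable root (~0.9 here)
print(sorted(roots), gamma_k)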
Note that once we have derived all of the coefficients, we can express the policy functions not
only in log-deviations (as we defined it) but also in levels, since

k̂t+1 = γk k̂t + µk ẑt

implies
kt+1 = γk kt + (1 − γk )k + µk k(zt − 1).

If we had the global solution too, we could then compare the exact and approximate solutions.
We’ll see that they are pretty close when kt is close to the steady state, and they diverge
significantly once we get further away from the steady state.

11.3.2 Blanchard-Kahn method

This is a method that is closely related to the one we just saw. Write down the system of equilibrium
conditions (Euler equation, resource constraint, labour supply equation) in matrix form

A [x̂t+1 ; ât+1] = B [x̂t ; ât] + D v̂t+1

(the semicolon stacks states over controls) where ât is a vector of controls, x̂t is a vector of states,
and v̂t+1 is a vector of errors (the TFP innovation for the exogenous state variable, and mean-zero
prediction errors that imply future control variables are not known with certainty as of today).
For example, consider our example with inelastic labour, ordering the vector as (k̂t , ẑt , ĉt)′. We'd have

A = [ k/y              0            0
      0                1            0
      αβ(y/k)(1 − α)   −αβ(y/k)     1 ],

B = [ α + (1 − δ)(k/y)   1    −c/y
      0                  ρ     0
      0                  0     1 ],

D = I3 , and v̂t+1 = (v̂t+1^k , ϵt+1 , v̂t+1^c)′.

Invert the LHS matrix and rewrite in reduced form


" # " #
x̂t+1 x̂t
=F + Gv̂ t+1
ât+1 ât

where F := A−1 B and G := A−1 D. Consider the Jordan decomposition of F = HJH −1 where
J is a diagonal matrix whose diagonal elements are the eigenvalues of F , which we denote with
λ1 , · · · , λn . We typically assume that the ordering of the system is such that the λs are sorted
from the smallest to the largest in absolute value.
In general (away from our example), let h denote the number of eigenvalues that are larger than
1 in absolute value, and let m denote the number of control variables. The BK conditions are
the following:
• if h = m, then the system has a unique, stable solution;
• if h > m, then there exists no solution;
• if h < m, then there exist infinitely many solutions.
We will now see the reason behind this result. After the Jordan decomposition, we have

[x̂t+1 ; ât+1] = HJH⁻¹ [x̂t ; ât] + G v̂t+1 .

Pre-multiply by H⁻¹ and get

[x̃t+1 ; ãt+1] = [J1 , 0 ; 0 , J2] [x̃t ; ãt] + G̃ v̂t+1

where [x̃t ; ãt] := H⁻¹ [x̂t ; ât] and G̃ := H⁻¹ G. Let us take expectations on both sides

Et [x̃t+1 ; ãt+1] = [J1 , 0 ; 0 , J2] [x̃t ; ãt].

Since J is diagonal, we can look at the system block by block, where the first and second block
are related to the state and control variables respectively. In the NGM example we only have
one control variable, so let us focus on the second block which is now the last row

Et ãt+1 = λ3 ãt .    (31)

Iterate forward to get

Et ãt+s = λ3^s ãt .

If the BK conditions are satisfied, |λ3| > 1 and |λ1| < |λ2| < 1, so in the limit it must be that
ãt = 0 for all t. This is equivalent to saying that we need to “kill” the unstable (eigenvalue
outside the unit circle) part of the system to rule out explosive paths, which would not give us
a stable system that converges to the steady state in the absence of shocks.
This yields
h31 k̂t + h32 ẑt + h33 ĉt = 0

where hij denotes the ij-th element of matrix H −1 . This becomes

ĉt = −(h31/h33) k̂t − (h32/h33) ẑt

which is clearly the policy function for consumption.
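The whole procedure fits in a few lines (Python): build A and B for the inelastic-labour example, form F, eigendecompose, and kill the unstable component. The calibrated values are the ones used above; the code is a sketch, not part of the original derivation.

import numpy as np

alpha, beta, delta, rho = 1/3, 0.976, 0.076, 0.972
yk = 0.3012
ky, cy = 1 / yk, 1 - delta / yk          # k/y, and c/y = 1 - i/y with i/y = delta*k/y
A = np.array([[ky, 0.0, 0.0],
              [0.0, 1.0, 0.0],
              [alpha * beta * yk * (1 - alpha), -alpha * beta * yk, 1.0]])
B = np.array([[alpha + (1 - delta) * ky, 1.0, -cy],
              [0.0, rho, 0.0],
              [0.0, 0.0, 1.0]])
F = np.linalg.solve(A, B)                # F = A^{-1} B
lam, H = np.linalg.eig(F)                # F = H diag(lam) H^{-1}
Hinv = np.linalg.inv(H)
i = np.argmax(np.abs(lam))               # the unstable eigenvalue, |lam| > 1
h = Hinv[i]                              # row of H^{-1}: h @ (khat, zhat, chat) = 0
gamma_c, mu_c = -h[0] / h[2], -h[1] / h[2]  # chat = gamma_c*khat + mu_c*zhat
print(np.abs(lam), gamma_c, mu_c)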


When the BK conditions are satisfied, we have m control variables and h = m eigenvalues
outside the unit circle. By ruling out such explosive paths, we’ll have m conditions relating
control variables to state variables. Once we have solved for the policy functions of the control
variables, it is simple to back out the implied policy functions for the endogenous state variables.
Finally, in our example the policy function for consumption also implies that the prediction
errors are such that G̃(3, :) v̂t+1 = 0, i.e. the prediction errors are only a function of the exogenous
shocks. In general, we’ll have m conditions pinning down the m prediction errors associated
with the control variables.

When m < h, we have too many explosive eigenvalues (or too few control variables), and no
solution exists that makes the system stable.
When m > h, there exist infinitely many stable solutions: the control variables are not uniquely pinned
down as a function of the state variables, and the prediction errors associated with the control
variables are also not pinned down uniquely. In that case, the path of the system depends
on self-fulfilling beliefs. To see this, consider equation (31) and suppose |λ3| ∈ (0, 1): we get
convergence to the steady state even if ãt ≠ 0, which means that

ãt = (1/λ3) Et ãt+1 ,

i.e., the current value of the control variable has multiple solutions and depends on the states
as well as the expectation of its future value. So agents' beliefs determine equilibrium outcomes,
and prediction errors are not independent of fundamental variables.

References
Blanchard, Olivier Jean and Charles M. Kahn, “The Solution of Linear Difference Models
under Rational Expectations,” Econometrica, 1980, 48 (5), 1305–1311.

Kydland, Finn E. and Edward C. Prescott, “Time to Build and Aggregate Fluctuations,”
Econometrica, 1982, 50 (6), 1345–1370.

Long, John B., Jr. and Charles I. Plosser, “Real Business Cycles,” Journal of Political
Economy, 1983, 91 (1), 39–69.

Lucas, Robert E., Jr., “Econometric Policy Evaluation: A Critique,” Carnegie-Rochester
Conference Series on Public Policy, 1976, 1 (1), 19–46.

Lucas, Robert E., Jr., “Understanding Business Cycles,” Carnegie-Rochester Conference
Series on Public Policy, 1977, 5, 7–29.

Uhlig, Harald, “A Toolkit for Analysing Nonlinear Dynamic Stochastic Models Easily,” in
Computational Methods for the Study of Dynamic Economies, Oxford University Press, 2001.
