Monash Estimation-1
Monash Macroeconomics
Summer School
Parametrizing models:
Estimation
Petr Sedláček
University of Bonn
February 2016
Goal
Parametrizing DSGE models, in particular via estimation
- discussion of alternatives
- Maximum Likelihood estimation
  - Kalman filter (based on Hamilton, 1994)
  - DSGE models in state-space form
  - singularity problem
- Bayesian estimation
  - priors
  - evaluating the posterior
  - Markov-Chain Monte-Carlo (MCMC) methods
  - practical issues
Sedláček Monash Macro
Outline
- Introduction
- How to parametrize a model
  - discussion of alternatives
  - calibration
  - moment matching
  - Maximum Likelihood and Bayesian
- Maximum Likelihood
- Bayesian estimation: the basics
- Summary
- calibration
- estimation
  - Maximum Likelihood
  - Bayesian
  - moment matching (GMM, SMM, indirect inference)
Calibration
- widespread methodology
- at least since Kydland and Prescott (1982)
Moment matching
- x_t is a vector of variables, Ψ are model parameters
- choose Ψ to minimize the weighted distance between data moments and model moments, where g(X, Ψ) stacks the moment gaps:

    min_Ψ  g(X, Ψ)′ Ω g(X, Ψ)

- Ω is a weighting matrix
- optimal weighting matrix: the inverse of the variance-covariance matrix of g(X, Ψ)
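As a concrete illustration, the objective above can be coded directly. This is only a sketch: the AR(1) "model", its moment formulas, and the grid ranges are illustrative assumptions, with g(X, Ψ) stacking the gaps between data and model moments and Ψ = (ρ, σ):

```python
import numpy as np

def model_moments(rho, sigma):
    # AR(1) model: x_t = rho*x_{t-1} + eps_t, eps_t ~ N(0, sigma^2)
    var = sigma**2 / (1.0 - rho**2)   # unconditional variance
    autocov1 = rho * var              # first-order autocovariance
    return np.array([var, autocov1])

def mm_objective(psi, data_moments, Omega):
    g = data_moments - model_moments(*psi)   # moment gaps g(X, Psi)
    return g @ Omega @ g                     # g' Omega g

# "data" moments, here generated from known parameters rho=0.9, sigma=1
data_moments = model_moments(0.9, 1.0)
Omega = np.eye(2)                            # identity weighting matrix

# crude grid search over Psi = (rho, sigma)
grid = [(r, s) for r in np.linspace(0.5, 0.99, 50)
               for s in np.linspace(0.5, 1.5, 51)]
best = min(grid, key=lambda psi: mm_objective(psi, data_moments, Omega))
```

With the identity weighting matrix, the grid search recovers the parameters that generated the "data" moments; replacing Ω with the inverse variance-covariance matrix of g would give the efficient estimator mentioned above.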
Indirect inference
- estimate an auxiliary model both on the data x_t and on model-simulated data z_t
- pick Ψ such that the auxiliary-model estimates coincide:

    δ(x_t) = δ(z_t, Ψ)
    y_t = H′ ζ_t + w_t,    E(w_t w_t′) = R ∀t
    ζ_{t+1} = F ζ_t + v_{t+1},    E(v_t v_t′) = Q ∀t
Some preliminaries
- notation: Y_t = (y_t′, y_{t−1}′, ..., y_1′)′
- linear projection:

    ẑ = Ê(z|x) = z̄ + Σ_{z,x} Σ_{x,x}^{−1} (x − x̄)

- → forecast ζ̂_{2|1}
1. Update step
- use linear projection to produce an update of ζ_t
  - conditional on expectations formed at t − 1
  - conditional on the observation of y_t

Applying the projection formula ẑ = Ê(z|x) = z̄ + Σ_{z,x} Σ_{x,x}^{−1} (x − x̄):

    ζ̂_{t|t} = Ê[ζ_t|Y_t]   (1)
             = ζ̂_{t|t−1} + E[(ζ_t − ζ̂_{t|t−1})(y_t − ŷ_{t|t−1})′] {E[(y_t − ŷ_{t|t−1})(y_t − ŷ_{t|t−1})′]}^{−1} (y_t − ŷ_{t|t−1})
2. Forecast step
- project the state forward using the law of motion:

    ζ̂_{t+1|t} = F ζ̂_{t|t}   (2)

covariance term:

    E[(ζ_t − ζ̂_{t|t−1})(y_t − ŷ_{t|t−1})′]   (3)
      = E[(ζ_t − ζ̂_{t|t−1})(H′(ζ_t − ζ̂_{t|t−1}) + w_t)′]
      = E[(ζ_t − ζ̂_{t|t−1})(ζ_t − ζ̂_{t|t−1})′] H
      = P_{t|t−1} H

variance term:

    E[(y_t − ŷ_{t|t−1})(y_t − ŷ_{t|t−1})′] = H′ P_{t|t−1} H + R   (4)

error term:

    y_t − ŷ_{t|t−1} = y_t − H′ ζ̂_{t|t−1}   (5)

Combining (1) and (2) and using (3) to (5) we can write

    P_{t+1|t} = E[(ζ_{t+1} − ζ̂_{t+1|t})(ζ_{t+1} − ζ̂_{t+1|t})′]   (7)
              = E[(F ζ_t + v_{t+1} − F ζ̂_{t|t})(F ζ_t + v_{t+1} − F ζ̂_{t|t})′]
              = F E[(ζ_t − ζ̂_{t|t})(ζ_t − ζ̂_{t|t})′] F′ + E[v_{t+1} v_{t+1}′]
              = F P_{t|t} F′ + Q

update:

    ζ̂_{t|t} = ζ̂_{t|t−1} + P_{t|t−1} H (H′ P_{t|t−1} H + R)^{−1} (y_t − H′ ζ̂_{t|t−1})

forecast:

    ζ̂_{t+1|t} = F ζ̂_{t|t}
    P_{t+1|t} = F P_{t|t} F′ + Q

combined specification:

    ζ̂_{t+1|t} = F ζ̂_{t|t−1} + F P_{t|t−1} H (H′ P_{t|t−1} H + R)^{−1} (y_t − H′ ζ̂_{t|t−1})
    P_{t+1|t} = F [P_{t|t−1} − P_{t|t−1} H (H′ P_{t|t−1} H + R)^{−1} H′ P_{t|t−1}] F′ + Q
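The update and forecast steps above map one-for-one into code. A minimal sketch (the AR(1) system, noise variances and sample size are illustrative assumptions, not from the lecture):

```python
import numpy as np

def kalman_filter(y, F, H, Q, R, zeta0, P0):
    """Kalman filter for y_t = H' zeta_t + w_t, zeta_{t+1} = F zeta_t + v_{t+1}.

    Returns filtered states and the Gaussian log-likelihood of the sample.
    """
    zeta, P = zeta0, P0                 # zeta_{1|0}, P_{1|0}
    loglik, filtered = 0.0, []
    for yt in y:
        err = yt - H.T @ zeta           # error term (5)
        S = H.T @ P @ H + R             # variance term (4)
        K = P @ H @ np.linalg.inv(S)    # gain built from covariance term (3)
        zeta_tt = zeta + K @ err        # update step (1)
        P_tt = P - K @ H.T @ P
        filtered.append(zeta_tt)
        loglik += -0.5 * (len(err) * np.log(2 * np.pi)
                          + np.log(np.linalg.det(S))
                          + err @ np.linalg.inv(S) @ err)
        zeta = F @ zeta_tt              # forecast step (2)
        P = F @ P_tt @ F.T + Q          # and (7)
    return np.array(filtered), loglik

# illustrative scalar AR(1) state observed with noise
rng = np.random.default_rng(0)
F = np.array([[0.9]]); H = np.array([[1.0]])
Q = np.array([[0.1]]); R = np.array([[0.1]])
states = [np.zeros(1)]
for _ in range(200):
    states.append(F @ states[-1] + rng.normal(0, np.sqrt(0.1), 1))
y = [H.T @ s + rng.normal(0, np.sqrt(0.1), 1) for s in states[1:]]
filt, ll = kalman_filter(y, F, H, Q, R, np.zeros(1), np.eye(1))
```

For DSGE estimation, F, H, Q and R would be functions of the structural parameters Ψ, and the accumulated log-likelihood is exactly the object that Maximum Likelihood (or the Bayesian posterior) evaluates.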
Estimating parameters
Preliminaries
- likelihood function of a single observation x from a normal distribution with mean µ and variance σ²:

    f(x | µ, σ) = (1 / (σ√(2π))) exp(−(x − µ)² / (2σ²))

- for an i.i.d. sample, the likelihood of the data is the product of these densities
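To see Maximum Likelihood at work in this simplest case, one can maximize the normal log-likelihood over a grid (the simulated data and grid ranges are illustrative assumptions):

```python
import numpy as np

def gauss_loglik(mu, sigma, x):
    # sum of log f(x_i | mu, sigma) over the sample
    return np.sum(-0.5 * np.log(2 * np.pi) - np.log(sigma)
                  - (x - mu) ** 2 / (2 * sigma ** 2))

rng = np.random.default_rng(1)
x = rng.normal(2.0, 1.5, size=2000)      # "data" with true mu=2, sigma=1.5

# maximize the log-likelihood over a grid of (mu, sigma)
mus = np.linspace(1.5, 2.5, 101)
sigmas = np.linspace(1.0, 2.0, 101)
ll = np.array([[gauss_loglik(m, s, x) for s in sigmas] for m in mus])
i, j = np.unravel_index(ll.argmax(), ll.shape)
mu_hat, sigma_hat = mus[i], sigmas[j]
```

As expected, the maximizers coincide (up to grid resolution) with the sample mean and standard deviation, the known closed-form ML estimates for this case.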
What next?
    c_t^{−ν} = E_t [β c_{t+1}^{−ν} (α z_{t+1} k_t^{α−1} + 1 − δ)]
    c_t + k_t = z_t k_{t−1}^α + (1 − δ) k_{t−1}
    z_t = 1 − ρ + ρ z_{t−1} + ε_t,    ε_t ∼ N(0, σ²)
Linearized version

    y_t = H′ ζ_t + w_t,    E(w_t w_t′) = R ∀t
    ζ_{t+1} = F ζ_t + v_{t+1},    E(v_t v_t′) = Q ∀t

    [ k_t − k̄     ]   [ a_kk  a_kz ] [ k_{t−1} − k̄ ]   [ 0       ]
    [ z_{t+1} − z̄ ] = [ 0     ρ    ] [ z_t − z̄     ] + [ ε_{t+1} ]

    k_{t−1} − k̄ = [ 1  0 ] [ k_{t−1} − k̄ ]
                            [ z_t − z̄     ] + 0
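Putting the pieces together, a state-space system of this form can be simulated and the persistence ρ estimated by maximizing the Kalman-filter likelihood. The coefficients a_kk, a_kz and the shock size below are illustrative placeholders, not the model's solved values:

```python
import numpy as np

# hypothetical linearization coefficients (not the solved a_kk, a_kz)
a_kk, a_kz, rho = 0.95, 0.1, 0.8
F = np.array([[a_kk, a_kz],
              [0.0,  rho ]])        # state transition
H = np.array([[1.0], [0.0]])        # observe (deviations of) capital only
Q = np.diag([0.0, 0.01])            # only z carries a structural shock
R = np.zeros((1, 1))                # no measurement error

def loglik(rho_try, y):
    """Gaussian log-likelihood of y under the state space, via the Kalman filter."""
    Ft = np.array([[a_kk, a_kz], [0.0, rho_try]])
    zeta, P = np.zeros(2), np.eye(2)
    ll = 0.0
    for yt in y:
        err = yt - H.T @ zeta                    # innovation
        S = H.T @ P @ H + R                      # innovation variance
        K = P @ H / S                            # Kalman gain
        zeta = Ft @ (zeta + (K * err).ravel())   # update, then forecast
        P = Ft @ (P - K @ H.T @ P) @ Ft.T + Q
        ll += -0.5 * (np.log(2 * np.pi) + np.log(S[0, 0]) + err[0] ** 2 / S[0, 0])
    return ll

# simulate data from the "true" rho = 0.8, then estimate it by grid search
rng = np.random.default_rng(2)
zeta = np.zeros(2); y = []
for _ in range(500):
    zeta = F @ zeta + np.array([0.0, rng.normal(0, 0.1)])
    y.append(H.T @ zeta)
rhos = np.linspace(0.5, 0.95, 46)
rho_hat = rhos[np.argmax([loglik(r, y) for r in rhos])]
```

Because capital is the only observable and ε is the only structural shock, there is one shock per observable and no singularity problem in this setup.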
Different observables?

    p_t − p̄ = [ α z̄ k̄^{α−1}   k̄^α ] [ k_{t−1} − k̄ ]
                                       [ z_t − z̄     ] + 0
Ways out?
- add an error term u_t to the observation equation
- if u_t is measurement error, this is OK from an econometric point of view
- but is it truly measurement error?
Singularity problem
- (stochastic) singularity: many endogenous variables ...
- ... driven by a smaller number of structural shocks

General rule
- you need at least as many shocks (or measurement errors) as observables

Note that:
- more shocks (measurement errors) than observables is OK
- the choice of observables for estimation is not innocent
- there are ways to choose observables carefully, see e.g. Canova, Ferroni, Matthes (2012)
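Stochastic singularity is easy to see in simulated data: if one structural shock drives two observables, their covariance matrix is singular, so a likelihood that treats both series as jointly stochastic is degenerate. A minimal sketch (the loadings 2 and 3 are arbitrary):

```python
import numpy as np

# one AR(1) shock z_t drives two observables y1 = 2 z_t, y2 = 3 z_t
rho = 0.9
rng = np.random.default_rng(3)
z = 0.0; Y = []
for _ in range(1000):
    z = rho * z + rng.normal()
    Y.append([2.0 * z, 3.0 * z])
Y = np.array(Y)

Sigma = np.cov(Y.T)            # covariance matrix of the two observables
det = np.linalg.det(Sigma)     # ~0: the observables are perfectly collinear
corr = np.corrcoef(Y.T)[0, 1]  # exactly 1 in population
```

Any two such observables are exact linear combinations of each other, which is why either a second shock or a measurement error must be added before both can be used in estimation.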
Frequentist view:
- inferences about Ψ are based on probabilities of particular realizations Y^T for given Ψ

Bayesian view:
- Ψ is treated as random; inference conditions on the observed data Y^T

Bayes' rule:

    P(Ψ|Y^T) = L(Y^T|Ψ) P(Ψ) / P(Y^T)

and expectations of functions of the parameters follow as

    E[g(Ψ)] = ∫ g(Ψ) P(Ψ|Y^T) dΨ / ∫ P(Ψ|Y^T) dΨ
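For a scalar parameter this ratio of integrals can be evaluated on a grid. A sketch with a normal likelihood and an N(0, 1) prior, where the posterior mean has a known closed form to check against (the data and grid are illustrative assumptions):

```python
import numpy as np

# data: x_i ~ N(psi, 1), prior psi ~ N(0, 1)
# conjugate result: posterior is N(n*xbar/(n+1), 1/(n+1))
rng = np.random.default_rng(4)
x = rng.normal(1.0, 1.0, size=50)
n, xbar = len(x), x.mean()

psi = np.linspace(-3, 3, 2001)                      # grid over the parameter
loglik = -0.5 * ((x[:, None] - psi) ** 2).sum(0)    # log L(Y|psi), up to a constant
prior = np.exp(-0.5 * psi ** 2)                     # N(0,1) prior, up to a constant
post = np.exp(loglik - loglik.max()) * prior        # unnormalized posterior

# E[g(psi)] with g(psi) = psi: ratio of (grid) integrals as in the formula above
Eg = (psi * post).sum() / post.sum()
exact = n * xbar / (n + 1)                          # conjugate closed form
```

Note that normalizing constants cancel in the ratio, which is exactly why only the unnormalized posterior L(Y^T|Ψ)P(Ψ) is ever needed.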
Special/simple case: when the posterior has a known analytical form, such expectations can be computed directly; in general they must be approximated by simulation.
Idea of priors

Priors
- normal
- beta: support ∈ [0, 1], e.g. for persistence parameters
- uniform: often (incorrectly) referred to as "uninformative"
Some terminology
- Jeffreys prior
- non-informative prior
Starting point

    E[g(Ψ)] = ∫ g(Ψ) P(Ψ|Y^T) dΨ / ∫ P(Ψ|Y^T) dΨ

Example: we want to simulate x, where
- x comes from a truncated normal with mean µ and variance σ²
- and a < x < b
Solution: importance sampling

Importance sampling
- draw from a stand-in density h(Ψ) and reweight the draws:

    E[g(Ψ)] = ∫ g(Ψ) [P(Ψ|Y^T)/h(Ψ)] h(Ψ) dΨ / ∫ [P(Ψ|Y^T)/h(Ψ)] h(Ψ) dΨ
            = ∫ g(Ψ) ω(Ψ) h(Ψ) dΨ / ∫ ω(Ψ) h(Ψ) dΨ

    where ω(Ψ) = P(Ψ|Y^T) / h(Ψ)
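Applied to the truncated-normal example above, the untruncated N(µ, σ²) can serve as the stand-in h, so that ω reduces to the indicator that a draw falls in (a, b) (the particular µ, σ, a, b are illustrative):

```python
import numpy as np
from math import erf, exp, pi, sqrt

# target: N(mu, sigma^2) truncated to a < x < b
mu, sigma, a, b = 0.0, 1.0, 0.5, 2.0
rng = np.random.default_rng(5)

draws = rng.normal(mu, sigma, size=200_000)   # stand-in h: untruncated normal
omega = (a < draws) & (draws < b)             # weight = target/h, up to a constant

# E[x | a < x < b] as a ratio of weighted sums
mean_is = (draws * omega).sum() / omega.sum()

# closed form for comparison: mu + sigma*(phi(alpha)-phi(beta))/(Phi(beta)-Phi(alpha))
phi = lambda t: exp(-t * t / 2) / sqrt(2 * pi)
Phi = lambda t: 0.5 * (1 + erf(t / sqrt(2)))
alpha, beta = (a - mu) / sigma, (b - mu) / sigma
mean_exact = mu + sigma * (phi(alpha) - phi(beta)) / (Phi(beta) - Phi(alpha))
```

The same logic carries over to posteriors: the weights ω(Ψ) correct for the gap between the stand-in density and the (unnormalized) posterior.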
Markov property: the distribution of the next draw depends only on the current draw
Transition kernel: the conditional distribution governing the move from the current draw to the next
Metropolis-Hastings algorithm
Main idea same as with importance sampling: draw from a stand-in distribution and correct for its difference from the posterior.

Acceptance probability, "Metropolis" (symmetric stand-in distribution):

    q(Ψ*_{i+1} | Ψ_i) = min[ 1, P(Ψ*_{i+1}|Y^T) / P(Ψ_i|Y^T) ]

Acceptance probability, "Metropolis-Hastings":

    q(Ψ*_{i+1} | Ψ_i) = min[ 1, ( P(Ψ*_{i+1}|Y^T) h(Ψ_i; θ) ) / ( P(Ψ_i|Y^T) h(Ψ*_{i+1}; θ) ) ]
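A random-walk Metropolis sketch for a stand-in posterior with known moments (the N(1, 0.5²) target and the proposal scale are illustrative assumptions, chosen so the chain's output can be verified):

```python
import numpy as np

def log_post(psi):
    # stand-in log posterior: N(1, 0.5^2), known so the chain can be checked
    return -0.5 * ((psi - 1.0) / 0.5) ** 2

rng = np.random.default_rng(6)
psi, chain = 0.0, []
for _ in range(50_000):
    prop = psi + rng.normal(0, 0.5)   # random-walk proposal (symmetric)
    # Metropolis acceptance probability min(1, P(prop|Y)/P(psi|Y))
    if np.log(rng.uniform()) < log_post(prop) - log_post(psi):
        psi = prop
    chain.append(psi)                 # on rejection the current draw is repeated
chain = np.array(chain[10_000:])      # drop burn-in
```

Because the proposal is symmetric, the correction term h(Ψ_i; θ)/h(Ψ*_{i+1}; θ) cancels and only the posterior ratio enters; an asymmetric (independence) proposal would require the full Metropolis-Hastings ratio.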
Convergence statistics
- between-segment variance of the chain (J segments of I draws each, with segment means Ψ̄_j and overall mean Ψ̄):

    B = (I/J) Σ_{j=1}^{J} (Ψ̄_j − Ψ̄)²
Geweke statistic
4. go back to step 2
- random-walk variant: adjust the stand-in distribution (re-center it at the current draw)
- independence variant: do not adjust the stand-in distribution
Summary
Bayesian inference
HPD intervals, Bayes’ factors
model comparisons
prior selection
“system priors”
What’s next?