Forecast

1. The document discusses forecasting time series data using linear predictors.
2. The minimum MSE forecast of $y_{t+h}$ based on the information $I_t$ available at time $t$ is the conditional expectation $E[y_{t+h} \mid I_t]$; if the errors are independent white noise, this forecast is a linear function of past errors.
3. Methods for computing the best linear forecasts are presented for AR(1) models, and for AR(p) models by putting them in state space form and iterating the forecasts forward $h$ periods. The forecast errors and their mean squared errors are also derived.


Forecasting

Let $\{y_t\}$ be a covariance stationary and ergodic process, e.g. an ARMA(p, q) process with Wold representation

$$y_t = \mu + \sum_{j=0}^{\infty} \psi_j \varepsilon_{t-j} = \mu + \varepsilon_t + \psi_1 \varepsilon_{t-1} + \psi_2 \varepsilon_{t-2} + \cdots, \quad \varepsilon_t \sim WN(0, \sigma^2)$$

Let $I_t = \{y_t, y_{t-1}, \ldots\}$ denote the information set available at time $t$. Recall,

$$E[y_t] = \mu, \qquad \mathrm{var}(y_t) = \sigma^2 \sum_{j=0}^{\infty} \psi_j^2$$

Goal: Using $I_t$, produce optimal forecasts of $y_{t+h}$ for $h = 1, 2, \ldots, s$.

Note:

$$y_{t+h} = \mu + \varepsilon_{t+h} + \psi_1 \varepsilon_{t+h-1} + \cdots + \psi_{h-1} \varepsilon_{t+1} + \psi_h \varepsilon_t + \psi_{h+1} \varepsilon_{t-1} + \cdots$$

Define $y_{t+h|t}$ as the forecast of $y_{t+h}$ based on $I_t$ and known parameters. The forecast error is

$$\varepsilon_{t+h|t} = y_{t+h} - y_{t+h|t}$$

and the mean squared error of the forecast is

$$MSE(\varepsilon_{t+h|t}) = E[\varepsilon_{t+h|t}^2] = E[(y_{t+h} - y_{t+h|t})^2]$$

Theorem: The minimum MSE forecast (best forecast) of $y_{t+h}$ based on $I_t$ is

$$y_{t+h|t} = E[y_{t+h} \mid I_t]$$

Proof: See Hamilton, pages 72-73.
Remarks

1. The computation of $E[y_{t+h} \mid I_t]$ depends on the distribution of $\{\varepsilon_t\}$ and may be a very complicated nonlinear function of the history of $\{\varepsilon_t\}$. Even if $\{\varepsilon_t\}$ is an uncorrelated process (e.g. white noise), it may be the case that $E[\varepsilon_{t+1} \mid I_t] \neq 0$.

2. If $\{\varepsilon_t\}$ is independent white noise, then $E[\varepsilon_{t+1} \mid I_t] = 0$ and $E[y_{t+h} \mid I_t]$ will be a simple linear function of $\{\varepsilon_t\}$:

$$y_{t+h|t} = \mu + \psi_h \varepsilon_t + \psi_{h+1} \varepsilon_{t-1} + \cdots$$

Linear Predictors

A linear predictor of $y_{t+h}$ is a linear function of the variables in $I_t$.

Theorem: The minimum MSE linear forecast (best linear predictor) of $y_{t+h}$ based on $I_t$ is

$$y_{t+h|t} = \mu + \psi_h \varepsilon_t + \psi_{h+1} \varepsilon_{t-1} + \cdots$$

Proof: See Hamilton, page 74.

The forecast error of the best linear predictor is

$$
\begin{aligned}
\varepsilon_{t+h|t} &= y_{t+h} - y_{t+h|t} \\
&= \mu + \varepsilon_{t+h} + \psi_1 \varepsilon_{t+h-1} + \cdots + \psi_{h-1} \varepsilon_{t+1} + \psi_h \varepsilon_t + \cdots \\
&\quad - (\mu + \psi_h \varepsilon_t + \psi_{h+1} \varepsilon_{t-1} + \cdots) \\
&= \varepsilon_{t+h} + \psi_1 \varepsilon_{t+h-1} + \cdots + \psi_{h-1} \varepsilon_{t+1}
\end{aligned}
$$

and the MSE of the forecast error is

$$MSE(\varepsilon_{t+h|t}) = \sigma^2 (1 + \psi_1^2 + \cdots + \psi_{h-1}^2)$$
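The BLP and its MSE translate directly into code. The sketch below is illustrative rather than from the notes: it assumes the Wold coefficients $\psi_1, \psi_2, \ldots$ and the recent error history are available as arrays, and the function name `blp_forecast` is hypothetical.

```python
import numpy as np

def blp_forecast(mu, psi, sigma2, eps_history, h):
    """h-step best linear predictor and its MSE from a truncated Wold form.

    psi holds [psi_1, psi_2, ...] and eps_history holds [eps_t, eps_{t-1}, ...]
    (most recent error first); both input conventions are assumptions made
    here for illustration.
    """
    psi = np.asarray(psi, dtype=float)
    eps_history = np.asarray(eps_history, dtype=float)
    # Point forecast: y_{t+h|t} = mu + psi_h eps_t + psi_{h+1} eps_{t-1} + ...
    weights = psi[h - 1:]                         # psi_h, psi_{h+1}, ...
    n = min(weights.size, eps_history.size)
    forecast = mu + weights[:n] @ eps_history[:n]
    # MSE(eps_{t+h|t}) = sigma^2 (1 + psi_1^2 + ... + psi_{h-1}^2)
    mse = sigma2 * (1.0 + np.sum(psi[: h - 1] ** 2))
    return forecast, mse
```

For an MA(1), setting `psi = [theta]` reproduces the forecasts derived in the next example.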
Remarks

1. $E[\varepsilon_{t+h|t}] = 0$
2. $\varepsilon_{t+h|t}$ is uncorrelated with any element in $I_t$
3. The form of $y_{t+h|t}$ is closely related to the IRF
4. $MSE(\varepsilon_{t+h|t}) = \mathrm{var}(\varepsilon_{t+h|t}) \leq \mathrm{var}(y_t)$
5. $\lim_{h \to \infty} y_{t+h|t} = \mu$
6. $\lim_{h \to \infty} MSE(\varepsilon_{t+h|t}) = \mathrm{var}(y_t)$

Example: BLP for MA(1) process

$$y_t = \mu + \varepsilon_t + \theta \varepsilon_{t-1}, \quad \varepsilon_t \sim WN(0, \sigma^2)$$

Here

$$\psi_1 = \theta, \qquad \psi_h = 0 \text{ for } h > 1$$

Therefore,

$$
\begin{aligned}
y_{t+1|t} &= \mu + \theta \varepsilon_t \\
y_{t+2|t} &= \mu \\
y_{t+h|t} &= \mu \text{ for } h > 1
\end{aligned}
$$

The forecast errors and MSEs are

$$
\begin{aligned}
\varepsilon_{t+1|t} &= \varepsilon_{t+1}, & MSE(\varepsilon_{t+1|t}) &= \sigma^2 \\
\varepsilon_{t+2|t} &= \varepsilon_{t+2} + \theta \varepsilon_{t+1}, & MSE(\varepsilon_{t+2|t}) &= \sigma^2(1 + \theta^2)
\end{aligned}
$$
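These MSE formulas are easy to verify by simulation. The following quick check is illustrative only (it is not part of the original notes), and assumes Gaussian errors with arbitrarily chosen parameter values:

```python
import numpy as np

# Simulation check of the MA(1) forecast MSEs derived above.
rng = np.random.default_rng(0)
mu, theta, sigma = 2.0, 0.5, 1.0
err1, err2 = [], []

for _ in range(100_000):
    eps = rng.normal(0.0, sigma, size=3)          # eps_t, eps_{t+1}, eps_{t+2}
    y1 = mu + eps[1] + theta * eps[0]             # y_{t+1}
    y2 = mu + eps[2] + theta * eps[1]             # y_{t+2}
    err1.append(y1 - (mu + theta * eps[0]))       # eps_{t+1|t} = eps_{t+1}
    err2.append(y2 - mu)                          # eps_{t+2|t} = eps_{t+2} + theta eps_{t+1}

print(np.var(err1))   # approx sigma^2 = 1.0
print(np.var(err2))   # approx sigma^2 (1 + theta^2) = 1.25
```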
Prediction Confidence Intervals

If $\{\varepsilon_t\}$ is Gaussian, then

$$y_{t+h} \mid I_t \sim N\left(y_{t+h|t}, \; \sigma^2(1 + \psi_1^2 + \cdots + \psi_{h-1}^2)\right)$$

A 95% confidence interval for the $h$-step prediction has the form

$$y_{t+h|t} \pm 1.96 \cdot \sqrt{\sigma^2(1 + \psi_1^2 + \cdots + \psi_{h-1}^2)}$$

Predictions with Estimated Parameters

Let $\hat{y}_{t+h|t}$ denote the BLP with estimated parameters:

$$\hat{y}_{t+h|t} = \hat{\mu} + \hat{\psi}_h \hat{\varepsilon}_t + \hat{\psi}_{h+1} \hat{\varepsilon}_{t-1} + \cdots$$

where $\hat{\varepsilon}_t$ is the estimated residual from the fitted model. The forecast error with estimated parameters is

$$
\begin{aligned}
\hat{\varepsilon}_{t+h|t} &= y_{t+h} - \hat{y}_{t+h|t} \\
&= (\mu - \hat{\mu}) + \varepsilon_{t+h} + \psi_1 \varepsilon_{t+h-1} + \cdots + \psi_{h-1} \varepsilon_{t+1} \\
&\quad + \left(\psi_h \varepsilon_t - \hat{\psi}_h \hat{\varepsilon}_t\right) + \left(\psi_{h+1} \varepsilon_{t-1} - \hat{\psi}_{h+1} \hat{\varepsilon}_{t-1}\right) + \cdots
\end{aligned}
$$

Obviously,

$$MSE(\hat{\varepsilon}_{t+h|t}) \neq MSE(\varepsilon_{t+h|t}) = \sigma^2(1 + \psi_1^2 + \cdots + \psi_{h-1}^2)$$

Note: Most software computes

$$\widehat{MSE}(\varepsilon_{t+h|t}) = \hat{\sigma}^2(1 + \hat{\psi}_1^2 + \cdots + \hat{\psi}_{h-1}^2)$$

which ignores the estimation error in the parameters.
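The Gaussian interval formula can be coded directly. This helper is a minimal sketch (the name `prediction_interval` and its interface are hypothetical), and, as just discussed, it understates the uncertainty when parameters are estimated:

```python
import numpy as np

def prediction_interval(forecast, sigma2, psi, h, z=1.96):
    """95% interval: y_{t+h|t} +/- z * sqrt(sigma^2 (1 + psi_1^2 + ... + psi_{h-1}^2)).

    Assumes Gaussian errors and known parameters; psi holds [psi_1, psi_2, ...].
    """
    psi = np.asarray(psi, dtype=float)
    se = np.sqrt(sigma2 * (1.0 + np.sum(psi[: h - 1] ** 2)))
    return forecast - z * se, forecast + z * se
```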
Computing the Best Linear Predictor

The BLP $y_{t+h|t}$ may be computed in many different but equivalent ways. The algorithm for computing $y_{t+h|t}$ from an AR(1) model is simple, and the methodology allows for the computation of forecasts for general ARMA models as well as multivariate models.

Example: AR(1) Model

$$y_t - \mu = \phi(y_{t-1} - \mu) + \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2)$$

with $\mu$, $\phi$, $\sigma^2$ known. In the Wold representation, $\psi_j = \phi^j$. Starting at $t$ and iterating forward $h$ periods gives

$$
\begin{aligned}
y_{t+h} &= \mu + \phi^h(y_t - \mu) + \varepsilon_{t+h} + \phi \varepsilon_{t+h-1} + \cdots + \phi^{h-1} \varepsilon_{t+1} \\
&= \mu + \phi^h(y_t - \mu) + \varepsilon_{t+h} + \psi_1 \varepsilon_{t+h-1} + \cdots + \psi_{h-1} \varepsilon_{t+1}
\end{aligned}
$$

The best linear forecasts of $y_{t+1}, y_{t+2}, \ldots, y_{t+h}$ are computed using the chain rule of forecasting (law of iterated projections):

$$
\begin{aligned}
y_{t+1|t} &= \mu + \phi(y_t - \mu) \\
y_{t+2|t} &= \mu + \phi(y_{t+1|t} - \mu) = \mu + \phi(\phi(y_t - \mu)) = \mu + \phi^2(y_t - \mu) \\
&\;\;\vdots \\
y_{t+h|t} &= \mu + \phi(y_{t+h-1|t} - \mu) = \mu + \phi^h(y_t - \mu)
\end{aligned}
$$

The corresponding forecast errors are

$$
\begin{aligned}
\varepsilon_{t+1|t} &= y_{t+1} - y_{t+1|t} = \varepsilon_{t+1} \\
\varepsilon_{t+2|t} &= y_{t+2} - y_{t+2|t} = \varepsilon_{t+2} + \phi \varepsilon_{t+1} = \varepsilon_{t+2} + \psi_1 \varepsilon_{t+1} \\
&\;\;\vdots \\
\varepsilon_{t+h|t} &= y_{t+h} - y_{t+h|t} = \varepsilon_{t+h} + \phi \varepsilon_{t+h-1} + \cdots + \phi^{h-1} \varepsilon_{t+1} \\
&= \varepsilon_{t+h} + \psi_1 \varepsilon_{t+h-1} + \cdots + \psi_{h-1} \varepsilon_{t+1}
\end{aligned}
$$

The forecast error variances are

$$
\begin{aligned}
\mathrm{var}(\varepsilon_{t+1|t}) &= \sigma^2 \\
\mathrm{var}(\varepsilon_{t+2|t}) &= \sigma^2(1 + \phi^2) = \sigma^2(1 + \psi_1^2) \\
&\;\;\vdots \\
\mathrm{var}(\varepsilon_{t+h|t}) &= \sigma^2(1 + \phi^2 + \cdots + \phi^{2(h-1)}) = \sigma^2 \frac{1 - \phi^{2h}}{1 - \phi^2} \\
&= \sigma^2(1 + \psi_1^2 + \cdots + \psi_{h-1}^2)
\end{aligned}
$$

Clearly,

$$\lim_{h \to \infty} y_{t+h|t} = \mu = E[y_t]$$

$$\lim_{h \to \infty} \mathrm{var}(\varepsilon_{t+h|t}) = \frac{\sigma^2}{1 - \phi^2} = \sigma^2 \sum_{h=0}^{\infty} \psi_h^2 = \mathrm{var}(y_t)$$
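These recursions transcribe directly into code. The following is an illustrative sketch for the known-parameter AR(1) case (the function name and interface are hypothetical):

```python
import numpy as np

def ar1_forecasts(y_t, mu, phi, sigma2, h_max):
    """Chain-rule forecasts and error variances for a known AR(1), |phi| < 1.

    Returns y_{t+h|t} = mu + phi^h (y_t - mu) and
    var(eps_{t+h|t}) = sigma^2 (1 - phi^{2h}) / (1 - phi^2) for h = 1..h_max.
    """
    h = np.arange(1, h_max + 1)
    forecasts = mu + phi**h * (y_t - mu)
    variances = sigma2 * (1.0 - phi**(2 * h)) / (1.0 - phi**2)
    return forecasts, variances

# The forecasts decay geometrically toward mu, and the variances rise toward
# var(y_t) = sigma^2 / (1 - phi^2), matching the limits derived above.
fc, v = ar1_forecasts(y_t=1.0, mu=0.0, phi=0.8, sigma2=1.0, h_max=5)
```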
AR(p) Models

Consider the AR(p) model

$$\phi(L)(y_t - \mu) = \varepsilon_t, \quad \varepsilon_t \sim WN(0, \sigma^2)$$
$$\phi(L) = 1 - \phi_1 L - \cdots - \phi_p L^p$$

The forecasting algorithm for AR(p) models is essentially the same as that for AR(1) models once we put the AR(p) model in state space form. Let $X_t = y_t - \mu$. The AR(p) in state space form is

$$
\begin{pmatrix} X_t \\ X_{t-1} \\ \vdots \\ X_{t-p+1} \end{pmatrix} =
\begin{pmatrix} \phi_1 & \phi_2 & \cdots & \phi_p \\ 1 & 0 & \cdots & 0 \\ & \ddots & & \vdots \\ 0 & & 1 & 0 \end{pmatrix}
\begin{pmatrix} X_{t-1} \\ X_{t-2} \\ \vdots \\ X_{t-p} \end{pmatrix} +
\begin{pmatrix} \varepsilon_t \\ 0 \\ \vdots \\ 0 \end{pmatrix}
$$

or

$$\xi_t = \mathbf{F} \xi_{t-1} + \mathbf{w}_t, \quad \mathrm{var}(\mathbf{w}_t) = \Sigma_w$$
Starting at $t$ and iterating forward $h$ periods gives

$$\xi_{t+h} = \mathbf{F}^h \xi_t + \mathbf{w}_{t+h} + \mathbf{F} \mathbf{w}_{t+h-1} + \cdots + \mathbf{F}^{h-1} \mathbf{w}_{t+1}$$

Then the best linear forecasts of $y_{t+1}, y_{t+2}, \ldots, y_{t+h}$ are computed using the chain rule of forecasting:

$$
\begin{aligned}
\xi_{t+1|t} &= \mathbf{F} \xi_t \\
\xi_{t+2|t} &= \mathbf{F} \xi_{t+1|t} = \mathbf{F}^2 \xi_t \\
&\;\;\vdots \\
\xi_{t+h|t} &= \mathbf{F} \xi_{t+h-1|t} = \mathbf{F}^h \xi_t
\end{aligned}
$$

The forecast for $y_{t+h}$ is given by $\mu$ plus the first row of $\xi_{t+h|t} = \mathbf{F}^h \xi_t$:

$$
\xi_{t+h|t} =
\begin{pmatrix} \phi_1 & \phi_2 & \cdots & \phi_p \\ 1 & 0 & \cdots & 0 \\ & \ddots & & \vdots \\ 0 & & 1 & 0 \end{pmatrix}^h
\begin{pmatrix} y_t - \mu \\ y_{t-1} - \mu \\ \vdots \\ y_{t-p+1} - \mu \end{pmatrix}
$$

The forecast errors are given by

$$
\begin{aligned}
\mathbf{w}_{t+1|t} &= \xi_{t+1} - \xi_{t+1|t} = \mathbf{w}_{t+1} \\
\mathbf{w}_{t+2|t} &= \xi_{t+2} - \xi_{t+2|t} = \mathbf{w}_{t+2} + \mathbf{F} \mathbf{w}_{t+1} \\
&\;\;\vdots \\
\mathbf{w}_{t+h|t} &= \xi_{t+h} - \xi_{t+h|t} = \mathbf{w}_{t+h} + \mathbf{F} \mathbf{w}_{t+h-1} + \cdots + \mathbf{F}^{h-1} \mathbf{w}_{t+1}
\end{aligned}
$$

and the corresponding forecast MSE matrices are

$$
\begin{aligned}
\mathrm{var}(\mathbf{w}_{t+1|t}) &= \mathrm{var}(\mathbf{w}_{t+1}) = \Sigma_w \\
\mathrm{var}(\mathbf{w}_{t+2|t}) &= \mathrm{var}(\mathbf{w}_{t+2}) + \mathbf{F} \, \mathrm{var}(\mathbf{w}_{t+1}) \, \mathbf{F}' = \Sigma_w + \mathbf{F} \Sigma_w \mathbf{F}' \\
&\;\;\vdots \\
\mathrm{var}(\mathbf{w}_{t+h|t}) &= \sum_{j=0}^{h-1} \mathbf{F}^j \Sigma_w (\mathbf{F}^j)'
\end{aligned}
$$

Notice that

$$\mathrm{var}(\mathbf{w}_{t+h|t}) = \Sigma_w + \mathbf{F} \, \mathrm{var}(\mathbf{w}_{t+h-1|t}) \, \mathbf{F}'$$
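The companion-form iteration is a few lines of code. This sketch assumes known parameters; the name `arp_forecasts` and the argument conventions are hypothetical:

```python
import numpy as np

def arp_forecasts(y_hist, mu, phi, h_max):
    """Companion-form (state space) forecasts for a known AR(p).

    y_hist holds [y_t, y_{t-1}, ..., y_{t-p+1}] and phi holds
    [phi_1, ..., phi_p]. Returns y_{t+h|t} for h = 1, ..., h_max.
    """
    p = len(phi)
    # Companion matrix F: phi's in the first row, ones on the subdiagonal.
    F = np.zeros((p, p))
    F[0, :] = phi
    F[1:, :-1] = np.eye(p - 1)
    xi = np.asarray(y_hist, dtype=float) - mu   # xi_t = (y_t - mu, ..., y_{t-p+1} - mu)'
    forecasts = []
    for _ in range(h_max):
        xi = F @ xi                             # xi_{t+h|t} = F xi_{t+h-1|t}
        forecasts.append(mu + xi[0])            # mu + first element gives y_{t+h|t}
    return np.array(forecasts)
```

The MSE matrices follow the same pattern via the recursion var(w_{t+h|t}) = Sigma_w + F var(w_{t+h-1|t}) F' noted above.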
Forecast Evaluation

Let $\{y_t\}$ denote the series to be forecast and let $y^1_{t+h|t}$ and $y^2_{t+h|t}$ denote two competing forecasts of $y_{t+h}$ based on $I_t$. For example, $y^1_{t+h|t}$ could be computed from an AR(p) model and $y^2_{t+h|t}$ could be computed from an ARMA(p, q) model. The forecast errors from the two models are

$$\varepsilon^1_{t+h|t} = y_{t+h} - y^1_{t+h|t}, \qquad \varepsilon^2_{t+h|t} = y_{t+h} - y^2_{t+h|t}$$

The $h$-step forecasts are assumed to be computed for $t = t_0, \ldots, T$, for a total of $T_0$ forecasts, giving

$$\{\varepsilon^1_{t+h|t}\}_{t_0}^{T}, \qquad \{\varepsilon^2_{t+h|t}\}_{t_0}^{T}$$

Because the $h$-step forecasts use overlapping data, the forecast errors in $\{\varepsilon^1_{t+h|t}\}_{t_0}^{T}$ and $\{\varepsilon^2_{t+h|t}\}_{t_0}^{T}$ will be serially correlated.
The accuracy of each forecast is measured by a particular loss function

$$L(y_{t+h}, y^i_{t+h|t}) = L(\varepsilon^i_{t+h|t}), \quad i = 1, 2$$

Some popular loss functions are:

$$L(\varepsilon^i_{t+h|t}) = \left(\varepsilon^i_{t+h|t}\right)^2 \text{ (squared error loss)}, \qquad L(\varepsilon^i_{t+h|t}) = \left|\varepsilon^i_{t+h|t}\right| \text{ (absolute value loss)}$$

To determine if one model predicts better than another, we may test the null hypothesis

$$H_0 : E[L(\varepsilon^1_{t+h|t})] = E[L(\varepsilon^2_{t+h|t})]$$

against the alternative

$$H_1 : E[L(\varepsilon^1_{t+h|t})] \neq E[L(\varepsilon^2_{t+h|t})]$$

Diebold-Mariano Test for Equal Predictive Accuracy

The Diebold-Mariano test is based on the loss differential

$$d_t = L(\varepsilon^1_{t+h|t}) - L(\varepsilon^2_{t+h|t})$$

The null of equal predictive accuracy is then

$$H_0 : E[d_t] = 0$$

The Diebold-Mariano test statistic is

$$S = \frac{\bar{d}}{\left(\widehat{\mathrm{avar}}(\bar{d})\right)^{1/2}} = \frac{\bar{d}}{\left(\widehat{LRV}_{\bar{d}} \, / \, T_0\right)^{1/2}}$$

where

$$\bar{d} = \frac{1}{T_0} \sum_{t=t_0}^{T} d_t, \qquad LRV_{\bar{d}} = \gamma_0 + 2 \sum_{j=1}^{\infty} \gamma_j, \quad \gamma_j = \mathrm{cov}(d_t, d_{t-j})$$

Note: The long-run variance is used in the statistic because the sample of loss differentials $\{d_t\}_{t_0}^{T}$ is serially correlated for $h > 1$.

Diebold and Mariano (1995) show that under the null of equal predictive accuracy,

$$S \stackrel{A}{\sim} N(0, 1)$$

So we reject the null of equal predictive accuracy at the 5% level if $|S| > 1.96$. One-sided tests may also be computed.
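A compact sketch of the test (not from the notes) is given below. It defaults to squared error loss; the LRV estimator shown is a simple truncated (rectangular) kernel using $h-1$ autocovariances, which is one common choice, and the helper name `diebold_mariano` is hypothetical.

```python
import numpy as np
from scipy import stats

def diebold_mariano(e1, e2, h, loss=np.square):
    """Diebold-Mariano statistic for equal predictive accuracy.

    e1, e2 are arrays of h-step forecast errors from the two competing models.
    The long-run variance of d_t uses h-1 autocovariances with a rectangular
    kernel; other kernels and small-sample corrections are common in practice.
    """
    d = loss(np.asarray(e1, dtype=float)) - loss(np.asarray(e2, dtype=float))
    T0 = d.size
    d_bar = d.mean()
    # gamma_j: sample autocovariance of the loss differential at lag j
    gamma = [np.mean((d[j:] - d_bar) * (d[: T0 - j] - d_bar)) for j in range(h)]
    lrv = gamma[0] + 2.0 * sum(gamma[1:])
    S = d_bar / np.sqrt(lrv / T0)
    p_value = 2.0 * (1.0 - stats.norm.cdf(abs(S)))  # two-sided, S ~ N(0,1) under H0
    return S, p_value
```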
