2003 EDITION
Luxembourg: Office for Official Publications of the
European Communities, 2003
ISBN 92-894-5392-3
ISSN 1725-4825
Cat. No. KS-AN-03-030-EN-N
Economic forecasting:
Models, indicators and
data needs
EUROPEAN COMMISSION
1. Introduction
Literature
Appendix
Economic forecasting: Models, indicators and data needs
1. Introduction
Economic forecasting is a difficult ‘art’, and good performance demands a balanced use of
different models, ad hoc indicators and a large amount of good data. The better the data available
and, in general, the more developed the models, the less need we have for ad hoc indicators.
However, economic theory should always play a significant role. On the other hand, if we rely only
on economic theory and theory-related empirical models, significant forecast mistakes are more
likely than if we combine model work with non-economic statistical work that draws on
high-frequency macroeconomic indicators. Furthermore, many other sources of information, such
as survey data, are relevant.
In this paper we look at the benefits and shortcomings of the traditional large-scale
macroeconomic model and at how these shortcomings can be reduced if such a model is
combined with non-economic statistical work.
Economic theory gives a good reference for developing large-scale macroeconomic models. The
Danish Economic Council has used such a model since 1973. The model, SMEC (the
Simulation Model of the Economic Council), has changed features several times as statistical
methods, economic theory and data made changes relevant.
The present version of the model, from 1999, is fully empirically based and describes the
Danish economy disaggregated into 8 sectors. The model contains some 600 equations and
1000 variables. In this model, as in most models of this type, production is determined by
demand in the short run, but in the long run production is mainly supply driven, particularly by
the supply of labour.
Key areas in the model are the input-output structure, wage formation and the determination
of the demand for inputs. The housing market and the consumption related to the housing
market also play a significant role in the Danish model.
Estimation of the various relations is based on national accounts data. The Danish model uses
annual data, whereas other models use quarterly data. In both cases it is relevant to update key
variables so that adjustments can take the most recent information into account. Several
exogenous variables, national as well as international, must also be fed into the model before the
forecast procedures can be finished. It is important to have the most up-to-date information
available, and the relevant question is how we gather or produce that information, or how we can
use other types of models, surveys or indicators to improve the estimates of key economic
variables relevant for policy makers.
3. Diffusion indexes
One of our latest advances in the forecast area, still labelled as (early) work in progress, is a
forecast model based on linear diffusion indexes. The setup we use for this model closely
follows the work of James H. Stock and Mark W. Watson on this topic. We want to give
them full credit for the framework, as we have adopted most of the model setup from them and
applied it to Danish data.
The diffusion index makes it possible to extract the information contained in a very large number
of high-frequency macroeconomic indicators in a systematic and consistent, but non-economic,
framework. One main advantage of forecast models using diffusion indexes is their ability
to use the information from up to several hundred macroeconomic time series to forecast
central macroeconomic variables. The model setup with diffusion indexes even allows
explanatory variables with different frequencies and different end periods to be incorporated.
As the model setup is purely statistical, it is of little use in understanding the interdependences
between macroeconomic variables. To be worthwhile, the model must therefore at least forecast
better than traditional indicator models. Evaluation of this model is therefore primarily
based on a comparison of the forecasts produced by different competing forecast models.
The model
In the following a short introduction to the model is given. A much more detailed description can
be found in Stock and Watson (1998). Our presentation will focus on the factor structure, the
forecast model, data and some results.
Factor structure
The factor structure captures the co-movement in macroeconomic time series and can be
represented in terms of a statistical factor model such as:
(1)  $X_t = \Lambda F_t + \mu_t$
where Xt is a matrix of N time series variables that contain useful information for forecasting the
variable of interest. Note that in general Xt will also contain the variable to be forecast.
Ft are the common factors, which correspond to the principal components of Xt, and Λ
contains the coefficients (factor loadings). In the present approach Λ is assumed to be time
invariant, but in a more general setting it can be time dependent. The disturbance µt is generally
correlated across time and series. Our data set contains 169 series at monthly frequency and 90 at
quarterly frequency (N=259).
The factor model explains the co-movement in Xt by means of a small number, k, of common
factors. Models of this type are, however, not new (see footnote 1) and have been used in
traditional indicator models (see footnote 2).
The EM (Expectation Maximization) algorithm is used to estimate the common factors (Ft) and
the factor loadings (Λ). Standard principal component analysis does not apply, or is at least
infeasible, as we are using an unbalanced data panel characterised by missing observations,
different frequencies and series that are available only over shorter time spans. The last
observation date usually differs among the time series because of different publication and
collection procedures. In practice this results in a data set where some time series only have
observations up to three or four months before the most recently updated variable, while some
variables, for example interest rates and exchange rates, are published very quickly. There is a
trade-off in setting the end period when estimating the common factors. In forecasting there is an
obvious desire to use all the newest information, but setting an end period for which only a very
limited number of variables have observations results in a weaker estimation of the common
factors. As a compromise, which deserves further analysis, we require that there are common
end-period observations for at least half of the variables.
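The idea of estimating factors from an unbalanced panel can be illustrated with a minimal sketch. It is not the exact algorithm of Stock and Watson (1998): it assumes a single-frequency panel in which missing cells are simply marked as NaN, and it alternates between principal components on the completed panel and re-filling the missing cells with their fitted values. The function name `em_factors` and its parameters are hypothetical.

```python
import numpy as np

def em_factors(X, k, n_iter=200, tol=1e-8):
    """Estimate k common factors from a T x N panel X with NaN for missing cells.

    Illustrative EM-style iteration (an assumption, not the authors' code):
    mixed frequencies and series standardisation are not handled here.
    """
    X = np.asarray(X, dtype=float)
    missing = np.isnan(X)
    # Initial guess: fill each gap with the observed mean of its series.
    col_means = np.nanmean(X, axis=0)
    filled = np.where(missing, col_means, X)
    prev_err = np.inf
    for _ in range(n_iter):
        mu = filled.mean(axis=0)
        Z = filled - mu
        # Principal components of the completed panel via the SVD.
        U, s, Vt = np.linalg.svd(Z, full_matrices=False)
        F = U[:, :k] * s[:k]      # T x k estimated factors
        Lam = Vt[:k].T            # N x k estimated loadings
        fitted = F @ Lam.T + mu
        # Replace only the missing cells by their fitted values.
        filled = np.where(missing, fitted, X)
        err = np.mean((X[~missing] - fitted[~missing]) ** 2)
        if abs(prev_err - err) < tol:
            break
        prev_err = err
    return F, Lam, filled
```

Observed cells are never overwritten, so series that end early are extended only by the values implied by the common factors.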
Forecast model
The statistical model is used to predict the growth rate of variables of interest, for example
unemployment, GDP or inflation. The variable of interest is denoted yt, and our goal is to estimate
the growth rate in yt 1, 3, 6 or 12 months ahead given the information contained in the common
factors Ft, allowing for autoregressive lags of yt. The forecast model has the following general
specification for a forecast horizon of 12 months:
(2)  $\log\left(\frac{y_{t+12}}{y_t}\right) = \delta_0 + \sum_{k} \varphi_k F_{k,t} + \sum_{i=0}^{j} \beta_i \log\left(\frac{y_{t-i}}{y_{t-1-i}}\right) + \varepsilon_t$
Footnote 1: Sargent and Sims (1977), Geweke (1977), Stock and Watson (1998).
Footnote 2: Stock and Watson (1989) discuss the NBER's index of coincident and leading
indicators using a model much like equation (1).
where $E(\varepsilon_{t+1} \mid \{y_{t-i}, X_{t-i}, F_{t-i}\}_{i=1-4}) = 0$. For the specific assumptions used in the
asymptotic analysis, see Stock and Watson (1998). A two-step procedure, based on information
criteria, is used to select the common factors and the lags of yt. First, we determine which
common factors should be included in the model; in this step no lags of the endogenous variable
are included in the estimation. Second, the number of lags is determined. Stock and Watson
provide sufficient conditions under which an information criterion such as BIC (footnote 3) or
AIC (footnote 4) consistently estimates the number of factors and produces an asymptotically
efficient forecast.
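As an illustration of how a regression of the form (2) can be set up, the following minimal sketch builds the left-hand side log(y_{t+h}/y_t) and the regressors (a constant, the factors at time t and lagged one-period log growth rates), and estimates the coefficients by ordinary least squares. The function names and the `n_lags` argument (equal to j+1 in equation (2)) are hypothetical, and the factor matrix F is assumed to be already estimated.

```python
import numpy as np

def build_regression(y, F, h=12, n_lags=1):
    """Return (target, regressors, t_idx) for a direct h-step regression.

    y : length-T array of positive levels; F : T x k array of factors.
    Illustrative helper, not taken from the original paper.
    """
    y = np.asarray(y, dtype=float)
    F = np.asarray(F, dtype=float)
    logy = np.log(y)
    dlog = np.diff(logy)                  # dlog[s-1] = log(y_s / y_{s-1})
    T = len(y)
    # Usable dates t: lags need y_{t-n_lags}, the target needs y_{t+h}.
    t_idx = np.arange(n_lags, T - h)
    target = logy[t_idx + h] - logy[t_idx]
    # Lag i is log(y_{t-i} / y_{t-1-i}) for i = 0, ..., n_lags - 1.
    lags = [dlog[t_idx - 1 - i] for i in range(n_lags)]
    Xreg = np.column_stack([np.ones(len(t_idx)), F[t_idx]] + lags)
    return target, Xreg, t_idx

def fit_ols(target, Xreg):
    """Ordinary least squares; returns coefficients and residuals."""
    coef, *_ = np.linalg.lstsq(Xreg, target, rcond=None)
    resid = target - Xreg @ coef
    return coef, resid
```

The columns of the regressor matrix are ordered as constant, factors, lags, so the first coefficient corresponds to δ0, the next k to the φ coefficients and the remainder to the β coefficients.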
The exact procedure used by Stock and Watson implies that the model is selected by recursive
evaluation of the information criterion, first for the first common factor, then for the second,
the third and so on. The resulting model then consists of a sequence of common factors
starting with the first common factor. In our setup we also use an information criterion to choose
the common factors, but we allow the model to consist of common factors that do not
necessarily form a consecutive sequence. Our approach is similar to a general-to-specific
procedure based on an information criterion (AIC). In practice this means hundreds of thousands
of regressions for each model selection. In the present model we restrict ourselves to testing only
the first twenty common factors.
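Selecting a non-consecutive set of factors by an information criterion can be sketched as follows. This is only an illustration under stated assumptions: it searches exhaustively over all subsets of a small number of candidate factors and uses the Gaussian form of the AIC, n·log(RSS/n) + 2p, whereas the text above describes a general-to-specific style procedure whose exact search path is not given. With twenty candidates an exhaustive search would cover 2^20 = 1,048,576 subsets, so the sketch caps the candidates with the hypothetical parameter `max_factors`.

```python
import itertools
import numpy as np

def aic(target, Xreg):
    """Gaussian AIC of an OLS regression: n*log(RSS/n) + 2*p."""
    n, p = Xreg.shape
    coef, *_ = np.linalg.lstsq(Xreg, target, rcond=None)
    rss = float(np.sum((target - Xreg @ coef) ** 2))
    return n * np.log(rss / n) + 2 * p

def select_factors(target, F, max_factors=8):
    """Return (best_aic, best_subset) over all subsets of the first factors.

    Exhaustive search is an assumption made for illustration only.
    """
    n = len(target)
    const = np.ones((n, 1))
    best = (aic(target, const), ())       # constant-only model as a start
    candidates = range(min(max_factors, F.shape[1]))
    for r in range(1, len(candidates) + 1):
        for subset in itertools.combinations(candidates, r):
            score = aic(target, np.hstack([const, F[:, list(subset)]]))
            if score < best[0]:
                best = (score, subset)
    return best
```

The returned subset holds zero-based column indexes of F, so a result such as (0, 3) would correspond to the first and fourth common factors.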
Data
A large number of Danish data series have been applied to the model. So far the model setup has
not been based on economic theory. However, economic theory is the guideline in the selection
of data categories. The applied categories follow Stock and Watson's setup. Some categories are,
however, added to reflect that this model is based on data from a small open economy. The data
set is grouped into nine categories.
Each group contains approximately 25 time series. All series are transformed to achieve
stationarity.
An analysis of the k common factors reveals that groups of variables such as real output,
employment, prices and investment are more or less represented by specific common factors. As
the common factors are only identified up to a k×k matrix, it is not warranted to push this
analysis too far. However, to give an idea of the character of the common factors, we show the
development in two common factors (footnote 5) together with the development in
unemployment and net consumer prices (the CPI net of taxes and duties). A more systematic
analysis of the interpretation of the common factors has not yet been done.
Footnote 3: Sawa's Bayesian information criterion.
Footnote 4: Akaike's information criterion.
Footnote 5: The presented common factors are linearly transformed in this presentation.
Figure 1. First common factor and unemployment (year/year), 1990-2000.
The forecast performance is measured here by the mean square error (MSE) relative to that of a
univariate autoregressive process. The univariate autoregressive forecast model has the following
specification, based on a 12-month horizon:
(3)  $\log\left(\frac{y_{t+12}}{y_t}\right) = \delta_0 + \sum_{i=0}^{j} \beta_i \log\left(\frac{y_{t-i}}{y_{t-1-i}}\right) + \varepsilon_t$
This model is used as a first benchmark against which to compare the forecast results from the
diffusion index model. It should be noted that other, more sophisticated, models should be
compared with the diffusion index model before any superiority can be claimed. Some results
from our model applied to Danish data are shown below. These results are based on forecast
models that all have a horizon of 12 months. In these models the diffusion index model reduces
the mean square error by approximately 20-45 per cent compared with a univariate autoregressive
forecast model. In other words, there seems to be valuable information in the common factors
when forecasting. Comparisons with forecast results from other models, such as leading
indicators and structural VARs, are however more appropriate for determining the gain from
using diffusion indexes. This work remains to be done with our data set. Stock and Watson show
some really impressive results in which the linear diffusion index model outperforms several
competing models.
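The relative MSE used as the performance measure is simply the ratio of two mean square errors. A minimal sketch, with hypothetical function names and made-up numbers used only to show the arithmetic:

```python
def mse(actual, forecast):
    """Mean square error between two equally long sequences."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def relative_mse(actual, model_forecast, benchmark_forecast):
    """MSE of the model divided by the MSE of the benchmark (e.g. an AR model)."""
    return mse(actual, model_forecast) / mse(actual, benchmark_forecast)

# Illustrative numbers only: the model errs by 0.1 and the benchmark by 0.2
# in every period, so the ratio of the MSEs is about 0.25.
ratio = relative_mse([1.0, 2.0, 3.0], [1.1, 1.9, 3.1], [1.2, 1.8, 3.2])
```

A ratio below one means the model beats the benchmark; a ratio between 0.55 and 0.80 corresponds to a reduction in the mean square error of 20-45 per cent.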
So far the presented results from the diffusion index model are based on in-sample estimation
of the growth rate in different variables of interest. As an example we now show the
procedure used in out-of-sample forecasting, and some results. The date of forecast is November
13, 2000. Our objective is to estimate the growth rate in net consumer prices up to 12 months
ahead. At this time the quarterly data set has observations up to the second quarter
of 2000. The monthly data set has a few observations for October, but most series have
September as the last observation date. September is chosen as the last observation date for
estimation of the common factors. The EM algorithm is used to fill in any missing observations.
This means that time series with end periods before September 2000 are assigned estimated
values by the EM algorithm. In the next step the common factors are estimated. To show the
development through the next 12 months, it is necessary to set up 12 forecast models with
horizons of 1 to 12 months. For each horizon, the previously described
selection method chooses the "best" model, that is, the relevant common factors and the number
of lags. The forecast model with a horizon of 12 months is shown in equation (2). This equation
estimates the yearly growth rate 12 months ahead. Table 2 shows the 12 models (one for each of
the 12 horizons) and the explanatory variables entering them. The models differ only
slightly in which explanatory variables are included. As an illustration, the models for
forecasting horizons of 1, 2, 11 and 12 months can be seen in the appendix. The
forecasts from the 12 forecast models, 1 to 12 months ahead, are
linked together and presented in figure 3. The diffusion index model predicts lower inflation,
measured by the net consumer price index, in the coming months. Note that before September
2000 the estimated in-sample values are derived from the model with a horizon of 12 months.
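The step of setting up one forecast model per horizon and linking the forecasts together can be sketched as below. This is a simplified illustration, not the procedure actually used: every horizon uses all supplied factors and no autoregressive lags, whereas the text selects factors and lags separately for each horizon. The function name `forecast_path` is hypothetical.

```python
import numpy as np

def forecast_path(y, F, horizons=range(1, 13)):
    """Direct forecasts of the level of y for each horizon h.

    For each h, regress log(y_{t+h}/y_t) on a constant and F_t by OLS,
    then apply the fitted model to the last available factors.
    Simplifying assumption: no model selection and no lags of y.
    """
    y = np.asarray(y, dtype=float)
    F = np.asarray(F, dtype=float)
    logy = np.log(y)
    T = len(y)
    path = {}
    for h in horizons:
        t_idx = np.arange(0, T - h)
        target = logy[t_idx + h] - logy[t_idx]
        Xreg = np.column_stack([np.ones(len(t_idx)), F[t_idx]])
        coef, *_ = np.linalg.lstsq(Xreg, target, rcond=None)
        x_last = np.concatenate([[1.0], F[-1]])
        # Predicted growth over h periods, converted back to a level.
        path[h] = float(y[-1] * np.exp(x_last @ coef))
    return path
```

The result maps each horizon to a predicted level, so the twelve entries together form the linked forecast path from 1 to 12 months ahead.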
Table 2. Forecast models for net consumer prices with 12 different horizons
Model with a horizon of # months:
Explanatory variables 1 2 3 4 5 6 7 8 9 10 11 12
F1 x x x x x x x x x x x x
F2 x x x x x x x x x x x
F3 x x x x x x x x x x
F4 x x x x x x x x x x x x
F5
F6 x x x x x x x x x
F7
F8 x x x x x x x x x x
F9 x x x x x x x x x x x x
F10
F11 x x
F12
F13
F14 x x x x x x x x x x x
F15 x
F16 x x x
F17 x x x x x x x x x x x x
F18
F19 x x x x x x x x x x
F20 x x x x x x
#Lag(y) 1 1
F<i> is common factor number <i>
4. Concluding remarks
The diffusion index models used here are still being worked on. There are still many open
questions about how to handle specific practical procedures in making forecasts within this
framework. The results so far, however, indicate that there is valuable information in using
diffusion indexes in forecasting macroeconomic variables.
There will never be one single or simple way to proceed when economic forecasting is performed.
A combination of large-scale models, small-scale models and various ways of using short-term
indicators, including surveys, will, in our opinion, always be around, and these will to a certain
degree compete with but also supplement each other.
The demand for data will increase, and at the same time there will be an increasing focus on
quality, comparability and accessibility. If the use of diffusion indexes becomes common, these
demands will become even more pressing.
Literature
Det Økonomiske Råds Sekretariat (The Secretariat of the Danish Economic Council). 1999.
SMEC (Simulation Model of the Danish Economic Council). Arbejdspapir 1999:7 (Working
Paper 1999:7)
Geweke, J. 1977. The Dynamic Factor Analysis of Economic Time Series. In D. J. Aigner and A.
S. Goldberger (eds.). Latent Variables in Socio-Economic Models. North-Holland, Amsterdam
Sargent, T. J. and C. A. Sims. 1977. Business Cycle Modeling without Pretending to have too
much a-priori Economic Theory. In C. Sims et al. (eds.). New Methods in Business Cycle Research.
Federal Reserve Bank of Minneapolis.
Stock, J. H. and M. W. Watson. 1998. Diffusion Indexes. NBER Working Paper 6702.
www.nber.org/paper/w6702
Watson, M. W. 2000. Macroeconomic Forecasting Using Many Predictors. Working Paper, July
2000 (Revised August 2000). www.wws.princeton.edu/~mwatson/html.wp
Lahiri, K. and G. H. Moore. 1991. Leading economic indicators. Cambridge University Press.
Appendix
Estimated forecast models for net consumer prices with 4 different horizons.
yt is the net consumer price index
f<i> is common factor number <i>
log(yt+12/yt)
= - 0.00455 * f1 + 0.00291 * f2 + 0.00196 * f3
(8.08153) (2.91884) (2.58279)
log(yt+11/yt)
= - 0.00446 * f1 + 0.00337 * f2 + 0.00143 * f3
(8.63210) (3.70531) (2.06621)
Model with a 2 months horizon
Ordinary Least Squares
MONTHLY data for 163 periods from MAR 1987 to SEP 2000
log(yt+2/yt)
= - 0.00093 * f1 + 0.00063 * f3 - 0.00123 * f4
(4.90774) (2.77840) (5.13800)
log(yt+1/yt)
= - 0.00067 * f1 + 0.00071 * f2 - 0.00076 * f4
(5.11757) (3.55288) (4.78190)