6 Arch-Garch
Bahadtin Rüzgar ∗
İsmet Kale ∗∗
Abstract: This paper presents the performance of 11 ARCH-type models, each with
four different distributions combined with ARMA specifications in the conditional
mean, in estimating and forecasting the volatility of the IMKB 100 stock index,
using daily data over a nine-year period. The results suggest that fractionally
integrated asymmetric models outperform their non-FI versions, and that the
skewed-t and Student-t distributions provide a better fit to the data for almost
every model in estimating volatility. In forecasting volatility, no clear
improvement is observed by altering a specific model component or distribution.
Keywords: GARCH; EGARCH; GJR; APARCH; IGARCH; FIGARCH; FIAPARCH;
FIEGARCH; HYGARCH; ARMA; GED; Skewed-t; Ox; G@RCH
1. Introduction
Until the 1980s, most analytical research focused on finding the relation between
factors and outcomes. For simplicity, it was commonly assumed that errors were
random with constant variance. The models that best represented these relations
were those that produced minimum errors. As a result, errors were minimized within
the models and remained outside the scope of quantitative prediction. In some
cases, however, it is precisely the magnitude of those errors whose prediction is
important. The world of finance was among the first to support this research,
realizing that this error can be interpreted as what we may call risk.
Generalized AutoRegressive Conditional Heteroskedasticity (GARCH) is a model
of errors. It is mostly used within other models to represent volatility. The models that
∗
Assist. Prof. Dr. Bahadtin Rüzgar is a faculty member of the Actuarial Science Department, School of
Banking and Insurance, Marmara Üniversitesi.
∗∗
İsmet Kale is a graduate (master's) student in the Banking Department, School of Banking and Insurance,
Marmara Üniversitesi.
The Use Of ARCH And GARCH Models…. 79
make use of GARCH vary from predicting the spread of toxic gases in the
atmosphere to simulating neural activity. But finance is still the leading area and
dominates the research on GARCH.
ARCH class models were first introduced by the Nobel prize laureate Engle (1982)
with the ARCH model. Since then, numerous extensions have been put forward, all
of them modelling the conditional variance as a function of past (squared) returns
and associated characteristics.
In recent years, the tremendous growth of trading activity and the trading losses of
financial institutions have led financial regulators and the supervisory committees of
banks to favor quantitative techniques which appraise the possible losses that these
institutions can incur. Value-at-Risk (VaR) has become one of the most sought-after
techniques. The computation of the VaR for a collection of returns requires the
computation of the empirical quantile at level α of the distribution of the returns of
the portfolio. Because quantiles are direct functions of the variance in parametric
models, ARCH class models immediately translate into conditional VaR models.
These conditional VaR models are important for characterizing short term risk for
intradaily or daily trading positions.
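As a sketch of how a conditional variance translates into a VaR figure, the snippet below assumes conditionally normal returns; the helper name `var_normal` and the volatility number are illustrative assumptions, not values from this paper.

```python
from statistics import NormalDist

def var_normal(sigma_t, alpha=0.01, mu=0.0):
    """One-day Value-at-Risk at level alpha for conditionally normal
    returns r_t ~ N(mu, sigma_t^2), reported as a positive loss."""
    z = NormalDist().inv_cdf(alpha)   # left-tail quantile, e.g. -2.33 at 1%
    return -(mu + sigma_t * z)

# A GARCH-type model supplies sigma_t; here we plug in a 2% daily volatility.
print(round(var_normal(0.02, alpha=0.01), 4))   # 0.0465
```

Because the quantile is a direct function of sigma_t, any of the conditional variance models below plugs straight into this formula.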
In this paper we investigate the estimating and forecasting capabilities of
GARCH models when applied to daily IMKB 100 index data. We furthermore aim
to understand whether IMKB data exhibits the common characteristics of financial
time series observed in developed countries. We thereby wish to contribute to the
risk management research in Turkey, the outcomes of which will be of crucial value
after the implementation of the Basel II regulations in 2007.
The rest of the paper is organized in the following way. In Section 2, we describe
ARMA and GARCH processes as the building blocks of analysed variance models.
These models are applied to daily stock index data in Section 3 where we assess
their performances and conclude.
GARCH models are designed to capture certain characteristics that are commonly as-
sociated with financial time series: fat tails, volatility clustering and leverage effects.
Probability distributions for asset returns often exhibit fatter tails than the
standard normal, or Gaussian, distribution. Time series that exhibit a fat tail
distribution are often referred to as leptokurtic. In addition, financial time series
usually exhibit a characteristic known as volatility clustering, in which large changes
tend to follow large changes, and small changes tend to follow small changes. In
either case, the changes from one period to the next are typically of unpredictable
sign. Large disturbances, positive or negative, become part of the information set
used to construct the variance forecast of the next period's disturbance. In this
manner, large shocks of either sign are allowed to persist, and can influence the
volatility forecasts for several periods. Volatility clustering, or persistence, suggests
a time-series model in which successive disturbances, although uncorrelated, are
nonetheless serially dependent.
Finally, certain classes of asymmetric GARCH models are also capable of captur-
ing the so-called leverage effect, in which asset returns are often observed to be
negatively correlated with changes in volatility.
A standard approach of time series analysis is to take a time series that exhibits com-
plicated behavior and try to convert it to a simpler form. Optimally, such simplification
would yield time series that were so simple that they could reasonably be modeled
as independent and identically distributed (IID). In practice, and especially in financial
applications, this is rarely possible. Stationarity is a condition similar to IID, but not
as strong. Two different forms of stationarity are defined:
i) A process is said to be strictly stationary if the unconditional joint distribution
of any segment (yt, yt+1, ..., yt+r) is identical to the unconditional joint distribution of
any other segment (yt+s, yt+s+1, ..., yt+s+r) of the same length.
ii) A process is said to be covariance stationary if the unconditional joint distribution
of any segment (yt, yt+1, ..., yt+r) has means, standard deviations and correlations that are
identical to the corresponding means, standard deviations and correlations of the uncon-
ditional joint distribution of any other segment (yt+s, yt+s+1, ..., yt+s+r) of equal length.
Correlations include autocorrelations and cross correlations.
Strict stationarity is appealing because it affords a form of homogeneity across terms
without requiring that they be independent. Covariance stationarity is the condition that
is more frequently assumed in GARCH models. It does require that all first and second
moments exist whereas strict stationarity does not. In this one respect, covariance sta-
tionarity is a stronger condition (Holton, 1996).
In this paper we are going to construct linear models combining mean and variance
equations that are either covariance or strictly stationary. We will use ARMA for the
mean and GARCH for the variance specification.
2.1 ARMA (R, S) And ARFIMA (R, D, S) Processes In The Conditional Mean
Box and Jenkins introduced a flexible family of time series models capable of ex-
pressing a variety of short-range serial relationships in terms of linear regression,
where the predictors are previous observations and previous residual errors.
An autoregressive model of order r, AR(r), is defined by

    y_t = \sum_{i=1}^{r} \phi_i y_{t-i} + \varepsilon_t    (1)

In equation 1, the current value yt is partly based on the values at times t − i (i ≤ r),
and partly based on a random variable ε, typically Gaussian noise. The influence of
prior values is usually assumed to decay over time, such that φ1 > φ2 > … > φr.
A second component in the Box-Jenkins framework is the moving average (MA)
process. In an MA process, the observation yt is dependent not on the previous val-
ues of yt, but rather on the values of the noise random variable ε. A moving average
model of order s, MA(s), is defined by
    y_t = \sum_{j=1}^{s} \theta_j \varepsilon_{t-j} + \varepsilon_t    (2)
where yt depends on the previous s errors εt-j (j≤s) and the current error εt.
Ooms and Doornik (1999) present the basic ARMA(r, s) model as

    y_t = \sum_{i=1}^{r} \phi_i y_{t-i} + \sum_{j=1}^{s} \theta_j \varepsilon_{t-j} + \varepsilon_t    (4)

where r is the order of the AR(r) part, φi its parameters, s the order of the MA(s)
part, θj its parameters, and εt a normally and identically distributed noise or
innovation process.
The family of ARMA models as defined by equation 4 is flexible and able to con-
cisely describe the serial dependencies of seemingly complex time series in terms of
the number of parameters (i.e., the order or history) of the AR and MA components,
and the values of these parameters.
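The serial dependence described above can be illustrated with a minimal ARMA(1,1) simulation; the parameter values and the function name `simulate_arma11` are arbitrary illustrations, not estimates from the paper.

```python
import random

def simulate_arma11(n, phi=0.5, theta=0.3, sigma=1.0, seed=42):
    """Simulate y_t = phi*y_{t-1} + theta*eps_{t-1} + eps_t with Gaussian noise."""
    rng = random.Random(seed)
    y_prev, eps_prev, series = 0.0, 0.0, []
    for _ in range(n):
        eps = rng.gauss(0.0, sigma)                 # current innovation eps_t
        y = phi * y_prev + theta * eps_prev + eps   # AR part + MA part
        series.append(y)
        y_prev, eps_prev = y, eps
    return series

series = simulate_arma11(500)
```

With |phi| < 1 the influence of past observations decays geometrically, matching the short-range dependence the Box-Jenkins framework is designed for.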
In fields such as physics and economics, phenomena that fluctuate over time of-
ten display long-range serial correlations. In order to correctly identify and parsimo-
niously describe processes that give rise to persistent serial correlations, traditional
ARMA time series models can be extended to allow for fractional integration to cap-
ture long-range correlations. The resulting ARFIMA models, popular in economet-
rics and hydrology, allow for simultaneous maximum likelihood estimation of the
parameters of both short-range and long-range processes.
Following the description of Laurent and Peters (2002), by using lag polynomials
and introducing a mean μ, equation 4 becomes
φ ( L)( yt − μ ) = θ ( L)ε t (5)
where L is the lag operator, μ is the unconditional mean of yt,

    \phi(L) = 1 - \sum_{i=1}^{r} \phi_i L^i    and    \theta(L) = 1 + \sum_{i=1}^{s} \theta_i L^i

are the autoregressive and moving average operators in the lag operator; they are
polynomials of order r and s respectively. With a fractional integration parameter d,
the ARFIMA(r, d, s) model is written as

    \phi(L)(1 - L)^d (y_t - \mu) = \theta(L)\varepsilon_t    (6)

The fractional differencing operator (1 − L)^d is a notation for the following infinite
polynomial:

    (1 - L)^d = \sum_{i=0}^{\infty} \frac{\Gamma(i - d)}{\Gamma(i + 1)\Gamma(-d)} L^i \equiv \sum_{i=0}^{\infty} \pi_i(d) L^i    (7)

where \pi_i(z) \equiv \Gamma(i - z) / [\Gamma(i + 1)\Gamma(-z)] and Γ(·) is the standard gamma
function. To ensure stationarity and invertibility of the process yt, d must lie between
−0.5 and 0.5. Given a data series yt, one can use conditional or exact likelihood
methods to specify the order and the parameters. The Ljung-Box statistic of the
residuals can be used to check the fit.
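The coefficients π_i(d) of equation 7 can be computed without gamma functions via the standard recursion π_0 = 1, π_i = π_{i−1}·(i − 1 − d)/i, as this short sketch shows:

```python
def frac_diff_weights(d, n):
    """First n weights pi_i(d) of the fractional differencing operator (1-L)^d,
    using the recursion pi_0 = 1, pi_i = pi_{i-1} * (i - 1 - d) / i."""
    w = [1.0]
    for i in range(1, n):
        w.append(w[-1] * (i - 1 - d) / i)
    return w

print([round(x, 4) for x in frac_diff_weights(0.4, 5)])
# [1.0, -0.4, -0.12, -0.064, -0.0416]
```

For d = 1 the recursion collapses to the ordinary first difference (weights 1, −1, 0, 0, …), while for fractional d the weights decay hyperbolically, which is the source of long memory.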
Bhardwaj and Swanson (2004) found that ARFIMA models perform better for
greater forecast horizons and that, under certain conditions, they provide significantly
better out-of-sample predictions than AR, MA, ARMA, GARCH, simple regime
switching, and related models.
Throughout the paper, the ARMA specification will only be used to model the mean
of returns. ARMA(0,0) implies a constant mean; ARMA(1,0) is simply AR(1). It is
also possible to make the conditional mean a function of the conditional variance. In
that case the conditional variance derived from the GARCH model becomes a variable
in the mean equation. This is the so-called ARCH-in-mean model, which we denote
in this paper with (-m) in naming our models.
If the value of the stock market index at time t is denoted Pt, the return of the index
at time t is given by yt = ln(Pt / Pt−1), where ln denotes the natural logarithm.
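This definition translates directly into code; the price series below is a toy example:

```python
import math

def log_returns(prices):
    """y_t = ln(P_t / P_{t-1}) for a price series."""
    return [math.log(p1 / p0) for p0, p1 in zip(prices, prices[1:])]

print([round(r, 4) for r in log_returns([100, 102, 101])])
# [0.0198, -0.0099]
```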
For the log return series yt, we assume its mean is modelled by an ARMA process;
let εt = yt − μt be the mean-corrected log return. Stock market index returns can then
be modelled with the help of the following equation:
yt = μ + εt, (8)
where μ is the mean value of the return, which is expected to be zero, and εt is a
random component of the model, not autocorrelated in time, with a zero mean value.
Sequence εt may be considered a stochastic process, expressed as:
εt = zt σt (9)
where zt is a sequence of independently and identically distributed random variables
with E(zt) = 0 and Var(zt) = 1. By definition εt is serially uncorrelated with a mean
equal to zero, but its conditional variance equals σt2 and therefore may change over
time, contrary to what is assumed in the standard regression model. The conditional
variance is the measure of our uncertainty about a variable given a model and an
information set.
Following Markowitz' definition of volatility as standard deviation of the ex-
pected return, σt is the volatility of log returns at time t, the changes of which will be
modelled by means of the following ARCH-type models.
    \sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \sigma_{t-i}^2 z_{t-i}^2    (13)
The ARCH model can describe volatility clustering through the following mechanism:
if εt−1 was large in absolute value, σt2 and thus εt is expected to be large in absolute
value as well. Even if the conditional variance of an ARCH model is time varying,
the unconditional variance of εt is constant provided that α0 > 0 and \sum_{i=1}^{q} \alpha_i < 1.
The conditional variance σt2 has to be positive for all t; sufficient conditions are
α0 > 0 and αi ≥ 0. Evidence has shown that a high ARCH order has to be selected to
catch the dynamics of the conditional variance, which involves the estimation of a
large number of parameters. The generalized ARCH (GARCH) model of Bollerslev
(1986) is based on an infinite ARCH specification and allows a reduction of the
number of estimated parameters by imposing nonlinear restrictions on them.
The GARCH model additionally assumes that forecasts of the time-varying variance
also depend on the lagged conditional variances of capital assets. An unexpected
increase or fall in the returns of an asset at time t will generate an increase in the
variability expected in the period to come.
Introduced by Engle (1982) and Bollerslev (1986), the most widely used GARCH(p, q)
models make σt2 a linear function of lagged conditional variances and squared past
residuals:
    \sigma_t^2 = \alpha_0 + \alpha_1 \varepsilon_{t-1}^2 + \ldots + \alpha_q \varepsilon_{t-q}^2 + \beta_1 \sigma_{t-1}^2 + \ldots + \beta_p \sigma_{t-p}^2    (14)

or, using the summation operator,

    \sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i \varepsilon_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2    (15)
where p is the order of the GARCH part and q the order of the ARCH part, with
α0 > 0, αi ≥ 0, βj ≥ 0. The covariance stationarity condition is
\sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j < 1. Since the equation expresses the
dependence of the variability of returns in the current period on data from previous
periods (i.e. the values of the variables εt−i2 and σt−j2), we denote this variability as
conditional.
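A compact simulation of the GARCH(1,1) case of equation 15 shows how the recursion produces volatility clustering while keeping σt2 positive; the parameter values are illustrative assumptions.

```python
import random

def simulate_garch11(n, a0=0.05, a1=0.10, b1=0.85, seed=7):
    """Simulate eps_t = sigma_t * z_t with
    sigma2_t = a0 + a1*eps_{t-1}^2 + b1*sigma2_{t-1}  (equation 15, p = q = 1)."""
    rng = random.Random(seed)
    sigma2 = a0 / (1.0 - a1 - b1)      # start at the unconditional variance
    eps, variances = [], []
    for _ in range(n):
        variances.append(sigma2)
        e = rng.gauss(0.0, 1.0) * sigma2 ** 0.5
        eps.append(e)
        sigma2 = a0 + a1 * e * e + b1 * sigma2
    return eps, variances

eps, s2 = simulate_garch11(2000)
print(all(v > 0 for v in s2))   # True: positivity holds since a0 > 0, a1, b1 >= 0
```

A large realized |eps| feeds back into the next period's sigma2, so quiet and turbulent stretches alternate exactly as described above.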
One can observe that an important feature of the GARCH(p, q) model is that it can
be regarded as an ARMA(r, s) process, where r is the larger of p and q. This result
allows econometricians to apply the analysis of ARMA processes to the GARCH model.
Using the lag operator, the GARCH(p, q) model can be rewritten as (ω = α0):

    \sigma_t^2 = \omega + \alpha(L)\varepsilon_t^2 + \beta(L)\sigma_t^2

where \alpha(L) = \alpha_1 L + \ldots + \alpha_q L^q and \beta(L) = \beta_1 L + \ldots + \beta_p L^p.
Increases and falls in the returns can be interpreted as good and bad news,
respectively. If a fall in returns is accompanied by an increase in volatility greater
than the volatility induced by an increase in returns, we may speak of a ‘leverage
effect’. The following classes of asymmetric GARCH models are capable of
capturing this effect.
In the GJR model, St is a dummy variable with St−i = 1 if εt−i < 0 and St−i = 0 if
εt−i ≥ 0. In this model, it is assumed that the impact of εt2 on the conditional
variance σt2 is different when εt is positive or negative. The TARCH model of
Zakoian (1994) is very similar to the GJR, except that it models the standard
deviation instead of the conditional variance. The basic variant is GJR(1,1), which
is expressed by:

    \sigma_t^2 = \omega + \alpha \varepsilon_{t-1}^2 + \gamma \sigma_{t-1}^2 + \beta \varepsilon_{t-1}^2 S_{t-1}    (21)
The model can be interpreted as follows: unexpected (unforeseen) changes in the
returns of the index yt, expressed in terms of εt, have different effects on the
conditional variance of stock market index returns. An unforeseen increase is good
news and contributes to the variance in the model through the multiplier α. An
unforeseen fall is bad news and generates an increase in volatility through the
multipliers α and β. The asymmetric nature of the returns is then given by a nonzero
value of the coefficient β, while a positive value of β indicates a ‘leverage effect’.
The covariance stationarity condition is \sum_{i=1}^{q} \alpha_i (1 + \gamma_i^2) + \sum_{j=1}^{p} \beta_j < 1.
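A one-step update of the GJR(1,1) recursion in equation 21 makes the asymmetry visible; the parameter values are illustrative, not estimates from the paper.

```python
def gjr_update(eps_prev, sigma2_prev, omega=0.02, alpha=0.05, gamma=0.9, beta=0.08):
    """Equation 21: the dummy S_{t-1} switches on the extra beta term
    only when the previous shock was negative."""
    s = 1.0 if eps_prev < 0 else 0.0
    return omega + alpha * eps_prev ** 2 + gamma * sigma2_prev + beta * eps_prev ** 2 * s

# A negative shock raises next-period variance more than a positive one of equal size:
print(gjr_update(-1.0, 1.0) > gjr_update(+1.0, 1.0))   # True
```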
In the EGARCH model, the term δ1zt adds the effect of the sign of εt, whereas
δ2{|zt| − E(|zt|)} adds its magnitude effect. E(|zt|) depends on the choice of the
distribution of the return series; for the normal distribution, E(|z_t|) = \sqrt{2/\pi}.
Its basic variant is EGARCH(1,1), which under the normal distribution is expressed by:

    \ln \sigma_t^2 = \omega + \alpha \left( \delta_1 \frac{\varepsilon_{t-1}}{\sigma_{t-1}} + \delta_2 \left[ \left| \frac{\varepsilon_{t-1}}{\sigma_{t-1}} \right| - \sqrt{\frac{2}{\pi}} \right] \right) + \beta \ln \sigma_{t-1}^2    (24)
The asymmetric nature of the returns is then given by a nonzero value of the
coefficient δ1, while a positive value of δ1 indicates a ‘leverage effect’.
The use of the ln transformation ensures that σt2 is always positive, and consequently
there are no restrictions on the sign of the parameters. Moreover, external unexpected
shocks will have a stronger influence on the predicted volatility than in the TARCH
or GJR models.
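The EGARCH(1,1) update of equation 24 can be sketched as follows (parameter values are illustrative); because the recursion is in logs, positivity needs no sign restrictions:

```python
import math

def egarch_update(eps_prev, sigma2_prev, omega=-0.1, alpha=1.0,
                  d1=-0.08, d2=0.15, beta=0.97):
    """Equation 24 with z = eps/sigma and E|z| = sqrt(2/pi) under normality."""
    z = eps_prev / math.sqrt(sigma2_prev)
    ln_s2 = (omega
             + alpha * (d1 * z + d2 * (abs(z) - math.sqrt(2.0 / math.pi)))
             + beta * math.log(sigma2_prev))
    return math.exp(ln_s2)   # exponentiating keeps sigma2 positive automatically

# Even an extreme negative shock cannot drive the variance negative:
print(egarch_update(-10.0, 1.0) > 0)   # True
```

With d1 < 0 as here, a negative shock of a given size raises next-period variance more than a positive one, reproducing the leverage effect.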
In general, the inclusion of a power term acts so as to emphasise the periods of
relative tranquillity and volatility by magnifying the outliers in the series. Squared
terms are therefore most often used in such models. If a data series is normally
distributed, then we are able to completely characterise its distribution by its first
two moments (McKenzie and Mitchell, 2001). If we accept that the data may have
a non-normal error distribution, other power transformations may be more appropriate.
Recognising the possibility that a squared power term may not necessarily be
optimal, Ding, Granger and Engle (1993) introduced a new class of ARCH model
called the Power ARCH (PARCH) model. Rather than imposing a structure on the
data, the Power ARCH class of models estimates the optimal power term.
Ding, Granger and Engle (1993) also specified a generalised asymmetric version
of the Power ARCH model (APARCH). The APARCH(p, q) model can be
expressed as:

    \sigma_t^\delta = \alpha_0 + \sum_{i=1}^{q} \alpha_i \left( |\varepsilon_{t-i}| - \gamma_i \varepsilon_{t-i} \right)^\delta + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^\delta    (25)
It reduces to GJR when δ = 2. It also includes four other ARCH extensions which
are not tested in this paper:
i) TARCH, when δ = 1;
ii) NARCH, when γi = 0 (i = 1, …, p) and βj = 0 (j = 1, …, p);
iii) Log-ARCH, when δ → 0; and
iv) Taylor/Schwert GARCH, when δ = 1 and γi = 0 (i = 1, …, p).
A stationary solution for the APARCH model exists; see Ding, Granger and Engle
(1993) for details.
In explaining the GARCH(p, q) model it was mentioned that GARCH may be
regarded as an ARCH(∞) process, since the conditional variance linearly depends on
all previous squared residuals. Moreover, it was stated that a GARCH process is
covariance stationary if and only if \sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j < 1. Strict
stationarity, however, does not require such a stringent restriction (that the
unconditional variance does not depend on t); in fact, we often find in estimation
that \sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j is close to 1.
Let us denote by h the time lag between the present shock and the future conditional
variance. A shock to the conditional variance σt2 then has a decaying impact on σt+h2.
When h increases this impact becomes negligible, indicating a short memory.
However, if \sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j \geq 1, the effect on σt+h2 does not die
out even for a very high h. This property is called persistence in the literature. In
many high frequency time series applications, the conditional variance estimated
using a GARCH(p, q) process exhibits a strong persistence.
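Persistence can be made concrete through the half-life of a variance shock in a GARCH(1,1): the impact decays as (α1 + β1)^h, so the half-life is ln(0.5)/ln(α1 + β1). The parameter values below are illustrative.

```python
import math

def shock_half_life(a1, b1):
    """Number of periods until a variance shock loses half its impact,
    assuming a covariance stationary GARCH(1,1) with persistence a1 + b1."""
    p = a1 + b1
    assert 0.0 < p < 1.0, "a finite half-life requires a1 + b1 < 1"
    return math.log(0.5) / math.log(p)

print(round(shock_half_life(0.10, 0.85), 1))   # 13.5 periods at persistence 0.95
```

As the persistence approaches 1, the half-life diverges, which is exactly the boundary case the integrated models below are built for.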
It was also mentioned that the GARCH(p, q) process can be seen as an ARMA
process. It is known that such an ARMA process has a unit root when
\sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j = 1. When the sum of all AR and MA
coefficients equals one, the ARMA process is integrated (ARIMA). Due to their
similarity to ARMA models, GARCH models are symmetric and have short memory.
A GARCH model that satisfies \sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j = 1 is an
integrated GARCH (IGARCH) process. Rewritten in its ARMA form, it satisfies

    \phi(L)(1 - L)\varepsilon_t^2 = \omega + [1 - \beta(L)](\varepsilon_t^2 - \sigma_t^2)    (29)

and its conditional variance can be expressed as

    \sigma_t^2 = \frac{\omega}{1 - \beta(L)} + \left\{ 1 - \phi(L)(1 - L)[1 - \beta(L)]^{-1} \right\} \varepsilon_t^2    (30)
As shown in Ding, Granger, and Engle (1993) among others, the effects of a shock
can take a considerable time to decay. Therefore, the distinction between I(0) and
I(1) processes seems to be far too restrictive. In an I(0) process the propagation of
shocks occurs at an exponential rate of decay so that it only captures the short-
memory, while for an I(1) process the persistence of shocks is infinite.
In the conditional mean, the ARFIMA specification has been proposed to fill the
gap between short and complete persistence, so that the short-run behavior of the
time-series is captured by the ARMA parameters, while the fractional differencing
parameter allows for modelling the long-run dependence.
The first long memory GARCH model was the fractionally integrated GARCH
(FIGARCH) introduced by Baillie, Bollerslev and Mikkelsen (1996). The
FIGARCH(p, d, q) model is a generalization of the IGARCH model, replacing the
operator (1 − L) of the IGARCH equation by (1 − L)^d, where d is the memory
parameter.
    \phi(L)(1 - L)^d \varepsilon_t^2 = \omega + [1 - \beta(L)]\nu_t, \qquad \nu_t = \varepsilon_t^2 - \sigma_t^2    (31)

    \sigma_t^2 = \frac{\omega}{1 - \beta(L)} + \left\{ 1 - \phi(L)(1 - L)^d [1 - \beta(L)]^{-1} \right\} \varepsilon_t^2    (32)
FIGARCH models exhibit long memory. They include GARCH models (for d = 0)
and IGARCH models (for d = 1). In contrast to ARFIMA models, where the memory
parameter satisfies −0.5 < d < 0.5, for FIGARCH d satisfies 0 < d < 1.
FIGARCH processes are non-stationary, like IGARCH processes. This shows that
the concept of unit roots can hardly be generalized from linear to nonlinear
processes. Furthermore, the interpretation of the memory parameter d is difficult in
the FIGARCH setup.
The related hyperbolic GARCH (HYGARCH) model, also estimated in this paper,
extends this specification with an additional weight parameter α on the fractional
operator:

    \sigma_t^2 = \frac{\omega}{1 - \beta(L)} + \left\{ 1 - [1 - \beta(L)]^{-1} \phi(L) \left\{ 1 + \alpha \left[ (1 - L)^d \right] \right\} \right\} \varepsilon_t^2    (33)
Chung (1999) underscores a drawback of the BBM model: there is a structural
problem in the BBM specification, since the direct implementation of the ARFIMA
framework, originally designed for the conditional mean equation, is not perfect for
use in the conditional variance equation, leading to difficult interpretation of the
estimated parameters.
Indeed the fractional differencing operator applies to the constant term in the
mean equation (ARFIMA) while it does not in the variance equation (FIGARCH).
Chung (1999) proposes a slightly different process:
    \sigma_t^2 = \sigma^2 + \left\{ 1 - [1 - \beta(L)]^{-1} \phi(L)(1 - L)^d \right\} (\varepsilon_t^2 - \sigma^2)    (34)

or

    \sigma_t^2 = \sigma^2 + \lambda(L)(\varepsilon_t^2 - \sigma^2)    (35)
λ(L) is an infinite summation which, in practice, has to be truncated. BBM propose
to truncate λ(L) at 1000 lags and to initialize the unobserved εt2 at their
unconditional moment. Contrary to BBM, Chung (1999) proposes to truncate λ(L)
at the size of the information set (t − 1) and to initialize the unobserved (εt2 − σ2)
at 0. In our analysis we follow the proposal of BBM and truncate at 1000 lags.
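The effect of the truncation can be illustrated on the weights of (1 − L)^d alone, the building block of λ(L); since for d > 0 the full infinite weights sum to zero, the absolute partial sum bounds the discarded mass. The value d = 0.4 below is an illustrative assumption.

```python
def frac_weights(d, trunc):
    """Weights pi_i(d) of (1-L)^d up to lag `trunc`, via the recursion
    pi_0 = 1, pi_i = pi_{i-1} * (i - 1 - d) / i."""
    w = [1.0]
    for i in range(1, trunc + 1):
        w.append(w[-1] * (i - 1 - d) / i)
    return w

tail_1000 = abs(sum(frac_weights(0.4, 1000)))   # mass beyond BBM's 1000 lags
tail_100 = abs(sum(frac_weights(0.4, 100)))
print(tail_1000 < tail_100)   # True: a longer truncation discards less mass
```

Because the weights decay only hyperbolically, the tail shrinks slowly with the truncation length, which is why BBM pick a horizon as long as 1000 lags.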
The idea of fractional integration has been extended to other GARCH types of mod-
els, including the Fractionally Integrated EGARCH (FIEGARCH) of Bollerslev and
Mikkelsen (1996) and the Fractionally Integrated APARCH (FIAPARCH) of Tse
(1998).
Similarly to the GARCH(p, q) process, the EGARCH(p, q) process can be extended
to account for long memory by factorizing the autoregressive polynomial
[1 − β(L)] = φ(L)(1 − L)^d, where all the roots of φ(z) = 0 lie outside the unit
circle. The FIEGARCH(p, d, q) model is specified as follows:

    \ln(\sigma_t^2) = \omega + \phi(L)^{-1} (1 - L)^{-d} [1 + \alpha(L)]\, s(z_{t-1})    (36)
The FIAPARCH(p, d, q) model of Tse (1998) is specified as:

    \sigma_t^\delta = \omega + \left\{ 1 - [1 - \beta(L)]^{-1} \phi(L)(1 - L)^d \right\} \left( |\varepsilon_t| - \gamma \varepsilon_t \right)^\delta    (37)
3. Empirical Applications
It is apparent that t distributions are to be preferred if one aims to obtain a better
representation of the existing data. Among the first 15 best basic estimating models
according to all five criteria, all were either Student-t or skewed-t. GED and Normal
distributions …
Table 3.1.1: First 20 And Last Ten Models With Constant Mean In
Estimating Performances According To Different Criteria

Criteria: Log-L and four penalized information criteria (lower is better):
Akaike = −2·LogL/n + 2k/n; Schwarz = −2·LogL/n + k·log(n)/n;
Shibata = −2·LogL/n + log((n + 2k)/n); Hannan-Quinn = −2·LogL/n + 2k·log(log(n))/n,
where k is the number of parameters and n the number of observations.

Models in rank order under each criterion:

Log-L               Akaike              Schwarz             Shibata             Hannan-Quinn
FIAparchCh11Skt     FIAparchBBM11St-t   FIgarchCh11St-t     FIAparchBBM11St-t   FIgarchCh11St-t
FIAparchBBM11Skt    FIAparchCh11Skt     FIgarchBBM11St-t    FIAparchCh11Skt     FIgarchBBM11St-t
FIAparchBBM11St-t   FIAparchBBM11Skt    HYGarch11St-t       FIAparchBBM11Skt    FIAparchBBM11St-t
FIAparchCh11St-t    FIAparchCh11St-t    FIgarchCh11Skt      FIAparchCh11St-t    FIAparchCh11St-t
HYGarch11Skt        FIgarchCh11St-t     FIGarchBBM11Skt     FIgarchCh11St-t     HYGarch11St-t
HYGarch11St-t       FIgarchBBM11St-t    FIAparchBBM11St-t   FIgarchBBM11St-t    FIgarchCh11Skt
FIgarchCh11Skt      HYGarch11St-t       Igarch11St-t        HYGarch11St-t       FIGarchBBM11Skt
FIGarchBBM11Skt     FIgarchCh11Skt      FIAparchCh11St-t    FIgarchCh11Skt      FIAparchCh11Skt
FIgarchCh11St-t     FIGarchBBM11Skt     Garch11St-t         FIGarchBBM11Skt     FIAparchBBM11Skt
FIgarchBBM11St-t    HYGarch11Skt        Gjr11St-t           HYGarch11Skt        HYGarch11Skt
Aparch11Skt         Gjr11St-t           HYGarch11Skt        Gjr11St-t           Gjr11St-t
Gjr11Skt            Gjr11Skt            FIAparchCh11Skt     Gjr11Skt            Gjr11Skt
Aparch11St-t        Aparch11St-t        FIAparchBBM11Skt    Aparch11St-t        Garch11St-t
Gjr11St-t           Aparch11Skt         Igarch11Skt         Aparch11Skt         Aparch11St-t
FIAparchCh11GED     Garch11St-t         Garch11Skt          Garch11St-t         Igarch11St-t
FIAparchBBM11GED    Garch11Skt          Gjr11Skt            Garch11Skt          Garch11Skt
For optimizing maximum likelihood, skewed-t performs better than Student-t for
all models. On the other hand, if we evaluate according to the other four criteria,
by which more complicated models are penalized for the inclusion of additional
parameters, skewed-t loses its apparent advantage, because it requires an additional
skewness parameter. The Hannan-Quinn criterion especially seems to judge
according to the distribution rather than the model specification, and prefers Student-t.
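For reference, the four penalized criteria in their standard forms (as implemented in G@RCH; lower is better) can be written as:

```python
import math

def akaike(logl, k, n):
    """Akaike: -2*LogL/n + 2k/n."""
    return -2.0 * logl / n + 2.0 * k / n

def schwarz(logl, k, n):
    """Schwarz: -2*LogL/n + k*log(n)/n."""
    return -2.0 * logl / n + k * math.log(n) / n

def shibata(logl, k, n):
    """Shibata: -2*LogL/n + log((n + 2k)/n)."""
    return -2.0 * logl / n + math.log((n + 2.0 * k) / n)

def hannan_quinn(logl, k, n):
    """Hannan-Quinn: -2*LogL/n + 2k*log(log(n))/n."""
    return -2.0 * logl / n + 2.0 * k * math.log(math.log(n)) / n
```

All four share the −2·LogL/n term and differ only in how heavily the k extra parameters are penalized, which is why an extra skewness parameter can reverse a ranking.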
We found that the choice of model is at least as important as the choice of
distribution, because the best performing models combined with both distributions
placed in the front ranks, mostly successively. For ranking models with constant
mean in estimating performance according to different criteria, the test statistics
illustrated in Table 3.1.5 are used.
… more than ARMA specifications alone. Together they improve more, but the
marginal benefit decreases.
Among the basic models, the fractionally integrated ones, especially the FIAPARCH
of BBM and that of Chung combined with Student-t and skewed-t distributions,
perform outstandingly based on all criteria. It is also worth noting that, based on
estimation power, the methods of BBM and Chung report only slight differences.
We can conclude that among models with the same distribution and the same mean
specification, a FI(1) model is a better estimator than its FI(0) counterpart. This is
a clear indicator that the IMKB 100 index shows strong persistence, and that the
effect of shocks influences future returns for long periods. That FI(1) performs
better than I(1) shows that the persistence is not completely permanent.
The best estimator in the test has Log-L = 4241.82. Note that the series and the
residuals are almost identical, implying a good reproduction of characteristics;
outliers are perfectly captured.
As we expected, the best models for estimation are not necessarily the best ones for
forecasting. The same is also true for the distributions. The model specification has
a clearer and more predictable effect on the Mincer-Zarnowitz regression R2.
As explained by Laurent and Peters (2002), the Mincer-Zarnowitz regression has
been largely used to evaluate forecasts of the conditional mean. For the conditional
variance, it is computed by regressing the actual variances on the forecasted
variances:
    \sigma_t^2 = \alpha + \beta \hat{\sigma}_t^2 + \upsilon_t    (38)
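A minimal ordinary-least-squares version of this regression is sketched below; an unbiased forecast yields α ≈ 0 and β ≈ 1, and the reported R2 is the regression fit. The data are illustrative.

```python
def mz_regression(actual, forecast):
    """OLS of actual variance on forecast variance: actual = alpha + beta*forecast."""
    n = len(actual)
    mx = sum(forecast) / n
    my = sum(actual) / n
    sxx = sum((x - mx) ** 2 for x in forecast)
    sxy = sum((x - mx) * (y - my) for x, y in zip(forecast, actual))
    beta = sxy / sxx
    alpha = my - beta * mx
    ss_tot = sum((y - my) ** 2 for y in actual)
    ss_res = sum((y - (alpha + beta * x)) ** 2 for y, x in zip(actual, forecast))
    r2 = 1.0 - ss_res / ss_tot
    return alpha, beta, r2

# Toy example: a slightly imperfect variance forecast
a, b, r2 = mz_regression([1.0, 2.0, 3.0], [1.1, 1.9, 3.0])
```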
The other criteria are error-minimizing measures and lead to difficult interpretation
and inconsistent rankings. Like maximum likelihood in estimation, R2 is in general
more consistent with the aggregated ranking results.
Graph 3.2.1: IGARCH(1,1) Skt is one of the integrated models that proved to be a
good forecast model based on R2. MSE tells it is the third worst.
Graph 3.2.2: EGARCH(1,1) GED converged only weakly after 208 BFGS
iterations in 52 seconds. However, it reached a record-level R2 and MSE.
Table 3.2.2: First 20 And Last Ten Models With Constant Mean In
Forecasting Performances According To Different Criteria

Models in rank order under each criterion; the number following each model in the
RMSE, MSE and MAE columns is that model's rank under R2.

R2                  RMSE                   MSE                    MAE
EGarch11GED         Garch11Skt 14          Gjr11GED 12            Garch11GED 17
Igarch11Skt         FIAparchBBM11Skt 19    Aparch11Ged 13         FIEgarch11Skt 40
Igarch11N           FIAparchCh11Skt 18     EGarch11GED 1          Garch11Skt 14
Igarch11GED         FIGarchBBM11Skt 25     FIEgarch11N 38         FIAparchBBM11Skt 19
Igarch11St-t        Igarch11Skt 2          Aparch11N 10           FIAparchCh11Skt 18
Aparch11Skt         HYGarch11Skt 21        Gjr11N 11              FIGarchBBM11Skt 25
Gjr11Skt            FIgarchCh11Skt 22      FIAparchBBM11GED 20    Igarch11Skt 2
Aparch11St-t        Aparch11Skt 6          FIAparchCh11GED 36     HYGarch11Skt 21
Gjr11St-t           Gjr11Skt 7             FIEgarch11GED 39       FIgarchCh11Skt 22
Aparch11N           HYGarch11St-t 32       Garch11GED 17          FIAparchBBM11St-t 23
Gjr11N              FIgarchCh11St-t 33     FIgarchCh11GED 28      FIAparchCh11St-t 37
Gjr11GED            FIEgarch11Skt 40       FIAparchCh11N 35       Igarch11St-t 5
Aparch11Ged         Igarch11N 3            Igarch11GED 4          Garch11St-t 16
Garch11Skt          FIAparchBBM11St-t 23   FIAparchBBM11N 24      Aparch11Skt 6
Garch11N            FIAparchCh11St-t 37    HYGarch11GED 26        Gjr11Skt 7
Garch11St-t         Igarch11St-t 5         FIgarchBBM11GED 31     HYGarch11St-t 32
Garch11GED          Garch11St-t 16         Aparch11St-t 8         FIgarchCh11St-t 33
FIAparchCh11Skt     FIgarchBBM11St-t 34    Gjr11St-t 9            FIgarchBBM11St-t 34
FIAparchBBM11Skt    Gjr11GED 12            Garch11N 15            FIgarchCh11GED 28
FIAparchBBM11GED    Aparch11Ged 13         FIgarchBBM11N 29       FIAparchCh11N 35
…
FIgarchBBM11GED     HYGarch11GED 26        FIgarchCh11St-t 33     FIAparchCh11GED 36
FIAparchCh11St-t    HYGarch11N 30          FIGarchBBM11Skt 25     Gjr11GED 12
FIEgarch11N         FIgarchCh11N 27        Igarch11Skt 2          Aparch11Ged 13
FIEgarch11GED       EGarch11GED 1          HYGarch11Skt 21        EGarch11GED 1
FIEgarch11Skt       FIEgarch11N 38         FIgarchCh11Skt 22      FIEgarch11N 38
It is also hard to draw conclusions from the model specification. Remarkable are the
forecasting performances of GJR, IGARCH and GARCH. R2 gave the worst
performances with FI(1) models, while the best performers were I(1) and
non-integrated ones. Therefore we can conclude that either a complete integration
or no integration is preferred to a fractional integration. In general, one obtains
better R2 results the simpler a model is specified. Increasing the number of
parameters through modifications in the mean or higher orders produces poor R2
results. Especially the order (2, 2) consistently outputs very poor R2. The order
(3, 3) can be either a good performer or a bad choice, but it is worth trying. The
results of the FI processes of BBM and Chung are again very similar: according to
the evaluation criteria they are either among the first or among the very last.
RMSE, MSE and MAE give very different rankings in cross comparison. In general,
however, mean specifications restrict the flexibility of all models, result in a general
trend, and cannot capture the outliers.
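For completeness, the three error measures behind these rankings, evaluated on a toy actual/forecast pair:

```python
import math

def mse(actual, forecast):
    """Mean Square Error."""
    return sum((a - f) ** 2 for a, f in zip(actual, forecast)) / len(actual)

def rmse(actual, forecast):
    """Root Mean Square Error."""
    return math.sqrt(mse(actual, forecast))

def mae(actual, forecast):
    """Mean Absolute Error."""
    return sum(abs(a - f) for a, f in zip(actual, forecast)) / len(actual)

actual, forecast = [1.0, 2.0, 4.0], [1.5, 2.0, 3.0]
print(round(mse(actual, forecast), 4),
      round(rmse(actual, forecast), 4),
      mae(actual, forecast))   # 0.4167 0.6455 0.5
```

Since the squared-error measures weight large misses far more heavily than MAE does, the three criteria can rank the same set of forecasts quite differently.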
The simple GARCH(1,1) performs generally well according to all criteria. GARCH
estimation outputs a sum of coefficients very close to 1, which explains its
forecasting success being close to that of IGARCH. The size of the sample is a
crucial factor affecting forecasting performance; we therefore believe that most
models would behave differently with different sample sizes, which could be the
topic of a separate research. Test statistics for the forecast evaluation measures of
some models are given in the following Table.
4. Conclusion
Most linear time series models for the prediction of returns descend from the
AutoRegressive Moving Average (ARMA) and Generalized AutoRegressive
Conditional Heteroskedastic (GARCH) models. Both concepts are useful in
volatility modeling, but less useful in return prediction.
Scientific prediction involves spotting past patterns or regularities and testing them
on recent observations. The data used to spot the patterns can therefore be called
the training data. Parametric models like GARCH use the training data to adjust
the parameters in such a way that the model best fits the data. As a consequence,
well structured models are able to model the data almost precisely. However, in
attempting to predict future values with the same model, one actually assumes that
the future results will follow the same characteristics, the same patterns. It is also
assumed that the reactions to factors not included in the model are similar in both
the past and the future. This is the reason why GARCH models, as parametric
specifications, operate best under relatively stable market conditions. Although
GARCH is explicitly designed to model time-varying conditional variances,
GARCH models can fail to predict highly irregular phenomena, including wild
market fluctuations (e.g., crashes and subsequent rebounds), and other highly
unanticipated events that
can lead to significant structural change. The choice of the optimal sample size and
of the orders p and q is still largely an art based on experience.
In this study, in order to compare the estimation and forecasting performances of
different GARCH processes, Ox 3.40 and its G@RCH 3.0 package by Laurent and
Peters, dedicated to GARCH models and many of their extensions, is used for
analyzing the models. Since the source code is open, specifications, processes or
graphics can be added or modified in the future.
The (p,q) = (1,1) variants of all models are systematically tested with four different
distributions, namely the Gaussian (normal), Student-t, Generalized Error Distribution
(GED) and skewed-t distributions. In addition, 94 higher-order models are tested
unsystematically to examine the effects on estimation and forecasting performance,
with an emphasis on maximum likelihood.
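Why the fat-tailed densities dominate the Gaussian in the likelihood rankings can be seen on synthetic data. The following SciPy sketch uses a simulated fat-tailed series as a stand-in for the IMKB 100 returns; it is illustrative only, not the actual data or the G@RCH estimation:

```python
import numpy as np
from scipy import stats

# Simulate fat-tailed "returns": Student-t with 5 degrees of freedom
r = stats.t.rvs(df=5, size=2000, random_state=42)

# Gaussian fit: the ML estimates are the sample mean and (biased) std
ll_norm = stats.norm.logpdf(r, loc=r.mean(), scale=r.std()).sum()

# Student-t fit: ML estimates of degrees of freedom, location and scale
df_hat, loc_hat, scale_hat = stats.t.fit(r)
ll_t = stats.t.logpdf(r, df_hat, loc_hat, scale_hat).sum()

print(ll_t > ll_norm)  # the t density attains the higher log-likelihood
```

The normal density has to "pay" for every extreme observation in the tails, while the t density accommodates them, which is exactly the pattern observed for daily equity returns.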
Estimation results are evaluated on the basis of ML, Akaike, Schwarz, Shibata
and Hannan-Quinn values, whereas forecasting results are ranked according to Min-
cer-Zarnowitz regression R2, Root Mean Square Error, Mean Square Error and Mean
Absolute Error criteria. In comparing the estimation powers of the models, we re-
strict our comments to the Student-t and skewed-t distributions. In maximizing the
likelihood, skewed-t performs better than Student-t for all models. On the other hand,
if we evaluate according to the other four criteria, by which more complicated mod-
els are penalized for the inclusion of additional parameters, skewed-t loses its ap-
parent advantage because it requires an additional skewness parameter. The
Hannan-Quinn criterion in particular seems to judge according to the distribution
rather than the model specification and prefers Student-t. We found that the choice
of model is at least as important as the choice of distribution, because the best
performing models combined with both distributions occupy the front ranks, mostly
in succession.
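The penalty mechanism behind these rankings can be made explicit. Each criterion is a function of the maximized log-likelihood L, the number of estimated parameters k and the sample size n; in one common per-observation normalization (as in Ox/PcGive-style output; the scaling does not affect rankings for fixed n) they read AIC = (-2L + 2k)/n, Schwarz = (-2L + k ln n)/n, Shibata = (-2L + n ln((n + 2k)/n))/n and Hannan-Quinn = (-2L + 2k ln ln n)/n. A small sketch with illustrative numbers, not values from the paper:

```python
import math

def criteria(loglik, k, n):
    """AIC, Schwarz, Shibata and Hannan-Quinn in per-observation form."""
    return {
        "AIC": (-2 * loglik + 2 * k) / n,
        "Schwarz": (-2 * loglik + k * math.log(n)) / n,
        "Shibata": (-2 * loglik + n * math.log((n + 2 * k) / n)) / n,
        "HQ": (-2 * loglik + 2 * k * math.log(math.log(n))) / n,
    }

# Hypothetical comparison: a skewed-t model must raise the log-likelihood
# by enough to pay for its extra skewness parameter.
student = criteria(loglik=5000.0, k=5, n=2245)
skewed = criteria(loglik=5003.0, k=6, n=2245)
```

With n = 2245, a likelihood gain of 3 points outweighs the AIC penalty of one extra parameter (2/n) but not the stricter Schwarz penalty (ln(n)/n), so in this hypothetical case AIC prefers the skewed-t model while Schwarz prefers the Student-t one, mirroring how the criteria can produce conflicting rankings.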
The log-likelihood results are consistent with the aggregate results. The normal
distribution estimates worst for all models. Higher orders alone improve the results
more than ARMA specifications alone; together they improve them further, but the
marginal benefit decreases. We can therefore conclude that, based on the maximum
likelihood, the more complex a model is, the better it fits the data.
Combined rankings allow the conclusion that maximum likelihood is a consistent
evaluation criterion. While Akaike, Shibata, Schwarz and Hannan-Quinn may
produce completely different rankings, the aggregate results are consistent with
those of the maximum likelihood.
If skewed-t or Student-t distributions are used, it is always possible to increase the
likelihood, implying a better fit to the data, by adding more tailored processes.
Özet: This study presents the performance of 11 different ARCH models, each tested
with four different distributions and extensible with ARMA specifications, in estimating
and forecasting the volatility of the IMKB 100 index based on 9 years of daily data.
According to the results, when the same distribution is used, fractionally integrated
asymmetric models estimate volatility better than their original non-integrated
versions, and using the skewed-t and Student-t distributions makes the models fit the
data better. In forecasting volatility, however, no clear improvement is observed from
using any particular model or distribution.
Keywords: GARCH; EGARCH; GJR; APARCH; IGARCH; FIGARCH;
FIAPARCH; FIEGARCH; HYGARCH; ARMA; GED; Skewed-t; Ox; G@RCH
References
Laurent, S. and Peters, J.-P. (2002): "A Tutorial for G@RCH 2.3: A Complete Ox Package
for Estimating and Forecasting ARCH Models," G@RCH 3.0 documentation.
Lambert, P. and Laurent, S. (2001): "Modelling Financial Time Series Using GARCH-Type
Models and a Skewed Student Density," Mimeo, Université de Liège.
http://www.core.ucl.ac.be/~laurent/pdf/Lambert-Laurent.pdf (accessed November 10, 2004).
McKenzie, M. and Mitchell, H. (2001): "Generalised Asymmetric Power ARCH Modelling of
Exchange Rate Volatility," Royal Melbourne Institute of Technology discussion paper.
http://mams.rmit.edu.au/mmt5alsrzfd2.pdf (accessed January 4, 2005).
Mina, J. and Xiao, J. Y. (2001): "Return to RiskMetrics: The Evolution of a Standard,"
RiskMetrics Group; update and restatement of the mathematical models in the 1996
RiskMetrics Technical Document. http://www.riskmetrics.com (accessed November 2, 2004).
Nelson, D. (1991): "Conditional Heteroskedasticity in Asset Returns: A New Approach,"
Econometrica, 59, 349-370. http://www.sciencedirect.com (accessed December 7, 2004).
Peters, J.-P. (2001): "Estimating and Forecasting Volatility of Stock Indices Using Asym-
metric GARCH Models and Skewed Student-t Densities," Working Paper, École d'Ad-
ministration des Affaires, University of Liège, Belgium.
http://www.panagora.com/2001crowell/2001cp_50.pdf (accessed January 3, 2005).
Tse, Y. (1998): "The Conditional Heteroscedasticity of the Yen-Dollar Exchange Rate,"
Journal of Applied Econometrics, 13, 49-55. http://qed.econ.queensu.ca:80/jae/1998-v13.1/
via http://ideas.repec.org (accessed November 10, 2004).
Zakoian, J. M. (1994): "Threshold Heteroscedastic Models," Journal of Economic Dynamics
and Control, 18, 931-955. http://papers.ssrn.com (accessed November 10, 2004).
Zemke, S. (2003): "Data Mining for Prediction: Financial Series Case," Doctoral Thesis,
Department of Computer and Systems Sciences, The Royal Institute of Technology.
http://szemke.math.univ.gda.pl/zemke2003PhD.pdf (accessed December 4, 2004).