Modeling Higher Moments
December 2017
Contents
1 Stylized facts
3 Non-Gaussian distributions
3.1 Skewed Student-t
3.1.1 The package fGarch
3.2 Skewed Generalized Error
3.2.1 The package fGarch
3.3 Johnson SU
3.3.1 The package SuppDists
3.4 Skewed Generalized-t
3.4.1 The package sgt
3.5 Generalized Hyperbolic Skew Student-t
3.5.1 The package SkewHyperbolic
1. Fat tails: The unconditional distribution of returns has fatter tails than that ex-
pected from a Normal distribution. This means that, if we use the Normal distribu-
tion to model financial returns, we will underestimate the number and magnitude of
crashes and booms.
3. Aggregated normality: As the frequency of the returns lengthens, the return distri-
bution gets closer to the Normal distribution.
Equation (1) decomposes the return at time t into a conditional mean, µt, and an error
term, εt. The dynamics of the conditional mean are given by (2). The standardized
innovation, zt = (xt − µt(θ))/σt(θ), has zero mean and unit variance. Equation (4) determines
the dynamics of volatility; this may be any specification, for instance a GARCH-type
model. Finally, equation (5) specifies that the standardized innovation follows a
conditional distribution f. The vector θ contains all the parameters associated with the
conditional mean, the conditional variance and the conditional distribution. The parameters
of the conditional distribution are the shape parameters. If the conditional distribution
is assumed to be Normal, N(0, 1), there are no shape parameters. In the more general case
considered here, the shape parameters capture the asymmetry and fat-tailedness of the
distribution.
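For reference, a framework matching this description can be sketched as follows. Only the roles of equations (1) to (5) are described in the text, so the generic functions µ(·) and σ²(·), the information set I_{t−1} and the shape-parameter vector η are assumptions made here for illustration, not the exact specification of the original equations.

```latex
\begin{align}
x_t &= \mu_t(\theta) + \varepsilon_t, \tag{1}\\
\mu_t(\theta) &= \mu(\theta \mid I_{t-1}), \tag{2}\\
\varepsilon_t &= \sigma_t(\theta)\, z_t, \tag{3}\\
\sigma_t^2(\theta) &= \sigma^2(\theta \mid I_{t-1}), \tag{4}\\
z_t &\sim f(z_t \mid \eta). \tag{5}
\end{align}
```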
3 Non-Gaussian distributions
In this chapter, we investigate an important issue for the modeling of asset returns: the
modeling of the entire density of returns, so that it incorporates some of the features
described previously, in particular the asymmetry and the fat-tailedness of the distribution.
In several problems in finance, it is crucial to recognize that the conditional distribution
of returns is non-Normal and to model it correctly. A precise knowledge of the conditional
distribution is required, for instance, for asset allocation or VaR (Value-at-Risk)
computation.
We therefore now consider the explicit modeling of higher moments of asset returns
and their distributions. Since the seminal work of Engle (1982), time-varying volatility
has been shown to produce fat tails in the unconditional distribution. But time-varying
volatility alone is not enough to explain all the tail fatness; volatility-filtered residuals
still have tails fatter than the Normal distribution. Instead of considering all the possible
distributions that may fit the returns data empirically, our strategy here is to focus on
the two stylized facts above and set out to find distributions that can capture these two
characteristics. Although there are many fat-tailed extensions to the Normal distribution
(such as the Pareto and the Student-t), not all of these distributions can capture
asymmetry. Hence, finding a distribution with a suitable asymmetry property will be a main
objective here. An important criterion in this search for a better alternative distribution
is to have as large a range of admissible skewness and kurtosis as possible. Ideally, the
only constraints on this domain of definition would be those ensuring that the distribution
is well defined.
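The claim that time-varying volatility produces fat tails can be illustrated with a short simulation. The sketch below assumes a GARCH(1,1) process with Normal innovations and purely illustrative parameter values (omega, alpha, beta are hypothetical, not estimates from the text). With simulated Normal innovations the filtered residuals are Normal by construction; the point made in the text is that real filtered residuals still show excess kurtosis, which is what motivates a non-Gaussian f.

```r
# Simulate a GARCH(1,1) process with Normal innovations (hypothetical parameters).
set.seed(123)
n     <- 50000
omega <- 0.05; alpha <- 0.15; beta <- 0.80   # illustrative values, alpha + beta < 1

z      <- rnorm(n)                       # standardized Normal innovations
sigma2 <- numeric(n)
eps    <- numeric(n)
sigma2[1] <- omega / (1 - alpha - beta)  # start at the unconditional variance
eps[1]    <- sqrt(sigma2[1]) * z[1]
for (t in 2:n) {
  sigma2[t] <- omega + alpha * eps[t - 1]^2 + beta * sigma2[t - 1]
  eps[t]    <- sqrt(sigma2[t]) * z[t]
}

# Sample kurtosis (equal to 3 for a Normal distribution)
kurt <- function(x) mean((x - mean(x))^4) / mean((x - mean(x))^2)^2

kurt_returns  <- kurt(eps)                 # unconditional distribution: above 3
kurt_filtered <- kurt(eps / sqrt(sigma2))  # volatility-filtered residuals: close to 3
```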
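Let
\[ u \sim \text{i.i.d. } g(0, 1) \]
and x, a Bernoulli process, with probability of success ξ²/(1 + ξ²). Let us consider the
following mixture
\[ \varepsilon = x\,\xi\,|u| - (1 - x)\,\frac{1}{\xi}\,|u| \]
One can show that the unconditional density f(ε|ξ) of ε is
\[ f(\varepsilon|\xi) = \frac{2}{\xi + \frac{1}{\xi}} \left\{ g(\xi\varepsilon)\, I_{(-\infty,0)}(\varepsilon) + g(\varepsilon/\xi)\, I_{[0,\infty)}(\varepsilon) \right\} \]
so that
\[ \frac{Pr(\varepsilon \ge 0|\xi)}{Pr(\varepsilon < 0|\xi)} = \xi^2 \]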
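The mixture construction can be checked by simulation. The sketch below takes g to be the standard Normal density purely for convenience (any symmetric unit-variance density would do) and verifies that the ratio of probability masses on either side of zero is close to ξ².

```r
set.seed(1)
n  <- 200000
xi <- sqrt(0.6 / 0.4)                       # skewness parameter, about 1.225

u <- rnorm(n)                               # symmetric base variable; g = N(0, 1) is an assumption
x <- rbinom(n, size = 1, prob = xi^2 / (1 + xi^2))   # Bernoulli indicator

# Mixture: scale |u| by xi on the positive side and by 1/xi on the negative side
eps <- x * xi * abs(u) - (1 - x) * abs(u) / xi

ratio <- mean(eps >= 0) / mean(eps < 0)     # should be close to xi^2 = 1.5
```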
Figure 1 shows two probability density functions of the skewed Student-t, one with
negative skewness and another one with positive skewness. The pdf with negative skewness
has 0.4 of the probability to the right of the mode (= 0) and 0.6 of the probability to
the left of the mode, i.e. it has a skewness parameter
\[ \xi = \sqrt{\frac{Pr(\varepsilon \ge 0|\xi)}{Pr(\varepsilon < 0|\xi)}} = \sqrt{\frac{0.4}{0.6}} = 0.816. \]
The pdf with positive skewness has 0.6 of the probability to the right of the mode (= 0)
and 0.4 of the probability to the left of the mode, i.e. it has a skewness parameter
\[ \xi = \sqrt{\frac{Pr(\varepsilon \ge 0|\xi)}{Pr(\varepsilon < 0|\xi)}} = \sqrt{\frac{0.6}{0.4}} = 1.225. \]
Both densities are compared with the Normal pdf.
Neither of the two skewed Student-t distributions is standardized. The SKST with negative
skewness has a mean equal to −0.3001 and a skewness equal to −0.8543, and the SKST
with positive skewness has a mean equal to 0.3001 and a skewness equal to 0.8543. Both
distributions have a variance equal to 1.0766 and a kurtosis equal to 10.3656.
In Figure 6 it is observed that both skewed Student-t distributions have the same mode
(= 0).
The main drawback of this density is that it is expressed in terms of the mode and the
dispersion. To keep it in the ARCH tradition, Lambert and Laurent (2001) expressed the
skewed Student-t density in terms of the mean and the variance.
According to Lambert and Laurent the innovation process zt is said to follow a (stan-
dardized) skewed Student-t distribution, SKST (0, 1, ξ, ν), if
\[ f(z|\xi,\nu) = \frac{2}{\xi + \frac{1}{\xi}}\, s \left\{ g[\xi(sz + m)|\nu]\, I_{(-\infty,0)}(z + m/s) + g[(sz + m)/\xi|\nu]\, I_{[0,\infty)}(z + m/s) \right\} \tag{6} \]
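where g(·|ν) is the symmetric (unit variance) Student-t density and ξ is the skewness
parameter¹; m and s² are, respectively, the mean and the variance of the non-standardized
skewed Student-t and are defined as
\[ E(\varepsilon|\xi,\nu) = \frac{\Gamma\left(\frac{\nu-1}{2}\right)\sqrt{\nu-2}}{\sqrt{\pi}\,\Gamma\left(\frac{\nu}{2}\right)} \left(\xi - \frac{1}{\xi}\right) \equiv m \]
\[ V(\varepsilon|\xi,\nu) = \left(\xi^2 + \frac{1}{\xi^2} - 1\right) - m^2 \equiv s^2 \]
where
\[ z_t = \frac{\varepsilon_t - m}{s} \]
is the standardized random variable with mean 0 and variance 1.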
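As a quick numerical check of the expressions for m and s², the sketch below evaluates them for the negative-skewness case ξ = 0.816 discussed earlier. The value ν = 5 is an assumption: it is not stated in the text, but it reproduces the mean, variance and mode quoted there.

```r
# Mean m and variance s^2 of the non-standardized skewed Student-t
skst_m <- function(xi, nu) {
  gamma((nu - 1) / 2) * sqrt(nu - 2) / (sqrt(pi) * gamma(nu / 2)) * (xi - 1 / xi)
}
skst_s2 <- function(xi, nu) {
  xi^2 + 1 / xi^2 - 1 - skst_m(xi, nu)^2
}

nu <- 5                      # assumption: chosen because it matches the quoted figures
xi <- sqrt(0.4 / 0.6)        # negative-skewness case, about 0.816

m  <- skst_m(xi, nu)         # about -0.3001
s2 <- skst_s2(xi, nu)        # about  1.0766
mode_std <- -m / sqrt(s2)    # mode of the standardized density, about 0.2892
```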
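\[ -\frac{1}{2} \sum_{t=1}^{T} \left\{ \ln(\sigma_t^2) + (1+\nu) \ln\left[ 1 + \frac{(s z_t + m)^2}{\nu - 2}\, \xi^{-2 I_t} \right] \right\} \]
where
\[ I_t = \begin{cases} 1 & \text{if } z_t \ge -\frac{m}{s} \\ -1 & \text{if } z_t < -\frac{m}{s} \end{cases} \]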
Notice also that the density f(zt|1/ξ, ν) is the mirror image of f(zt|ξ, ν) with respect to
the (zero) mean, i.e., f(zt|1/ξ, ν) = f(−zt|ξ, ν). Therefore, the sign of ln(ξ) indicates the
direction of the skewness: the third moment is positive (negative), and the density is skewed
to the right (left), if ln(ξ) > 0 (< 0).
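The standardized density and its mirror property can be checked numerically. The following sketch implements equation (6) in base R; the function names g_t and dskst and the values ξ = 1.5, ν = 8 are illustrative choices, not part of any package.

```r
# Unit-variance symmetric Student-t density g(.|nu), nu > 2
g_t <- function(x, nu) {
  k <- sqrt(nu / (nu - 2))
  k * dt(k * x, df = nu)
}

# Standardized skewed Student-t density, following equation (6)
dskst <- function(z, xi, nu) {
  m <- gamma((nu - 1) / 2) * sqrt(nu - 2) / (sqrt(pi) * gamma(nu / 2)) * (xi - 1 / xi)
  s <- sqrt(xi^2 + 1 / xi^2 - 1 - m^2)
  y <- s * z + m
  2 / (xi + 1 / xi) * s * ifelse(y < 0, g_t(xi * y, nu), g_t(y / xi, nu))
}

xi <- 1.5; nu <- 8   # illustrative values
total <- integrate(dskst, -Inf, Inf, xi = xi, nu = nu)$value            # about 1
mu1   <- integrate(function(z) z * dskst(z, xi, nu), -Inf, Inf)$value   # about 0
mu2   <- integrate(function(z) z^2 * dskst(z, xi, nu), -Inf, Inf)$value # about 1

# Mirror property: f(z | 1/xi, nu) = f(-z | xi, nu)
z <- seq(-4, 4, by = 0.5)
mirror_gap <- max(abs(dskst(z, 1 / xi, nu) - dskst(-z, xi, nu)))
```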
¹ The skewness parameter ξ > 0 is defined such that the ratio of probability masses above
and below the mean is Pr(z ≥ 0|ξ)/Pr(z < 0|ξ) = ξ².
Provided that the relevant moments are finite, we can easily obtain the empirical skewness
and kurtosis coefficients from the estimated standardized residuals ẑt. If ẑt is
normally distributed, Sk(ẑt) and K(ẑt) should not be significantly different from 0 and 3,
respectively².
Accordingly, and following Lambert and Laurent (2001), if ẑt ∼ SKST(0, 1, ξ, ν)³,
\[ Sk(\hat z_t|\xi,\nu) = \frac{E(\hat z^3|\xi,\nu) - 3E(\hat z|\xi,\nu)E(\hat z^2|\xi,\nu) + 2E(\hat z|\xi,\nu)^3}{V(\hat z|\xi,\nu)^{3/2}} \tag{7} \]
\[ K(\hat z_t|\xi,\nu) = \frac{E(\hat z^4|\xi,\nu) - 4E(\hat z|\xi,\nu)E(\hat z^3|\xi,\nu) + 6E(\hat z^2|\xi,\nu)E(\hat z|\xi,\nu)^2 - 3E(\hat z|\xi,\nu)^4}{V(\hat z|\xi,\nu)^2} \tag{8} \]
The skewed Student-t distribution leads to a finite rth order moment (r∈ R) if and
only if the corresponding moment of g(·) exists (i.e., for ξ = 1). In our case, g(·) is the
(standardized) Student-t probability density function, with number of degrees of freedom
ν > 2. In particular,
\[ E(z^r|\xi,\nu) = M_r\, \frac{\xi^{r+1} + \frac{(-1)^r}{\xi^{r+1}}}{\xi + \frac{1}{\xi}} \]
where
\[ M_r = \int_0^\infty 2 s^r g(s)\, ds \]
Mr is the rth order moment of g(·) truncated to the positive real values. For r ≤ −1,
the unimodality of g(·) implies that Mr = ∞. Thus, let us concentrate on positive integer
order moments. From Mr , the following properties can be shown to hold for noncen-
tered moments: For odd r, the rth order moment retains the same absolute value but
changes sign if we invert ξ, takes the value 0 only for ξ = 1, and is an increasing function
of ξ with limξ→∞ E(z r |ξ) = ∞. Even moments, on the other hand, are entirely unaf-
fected by inverting ξ and again increase without bounds in ξ for ξ > 1. Consequently,
minξ E(z r |ξ) = E(z r |ξ = 1) for even r. Expressions for centered moments are readily
available from Mr . In particular, the variance possesses all of the properties just men-
tioned for even noncentered moments.
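The moment formula can be used to reproduce the skewness and kurtosis quoted earlier for the non-standardized skewed Student-t with ξ = 0.816. In the sketch below, the closed form used for M_r is the standard absolute-moment formula of the Student-t rescaled to unit variance (it is not stated in the text), and ν = 5 is again an assumption.

```r
# Truncated moment M_r of the unit-variance Student-t (closed form, valid for r < nu)
M_r <- function(r, nu) {
  (nu - 2)^(r / 2) * gamma((r + 1) / 2) * gamma((nu - r) / 2) / (sqrt(pi) * gamma(nu / 2))
}

# Noncentered moment of order r of the non-standardized skewed Student-t
raw_moment <- function(r, xi, nu) {
  M_r(r, nu) * (xi^(r + 1) + (-1)^r / xi^(r + 1)) / (xi + 1 / xi)
}

nu <- 5; xi <- sqrt(0.4 / 0.6)   # nu = 5 is an assumption
e <- sapply(1:4, raw_moment, xi = xi, nu = nu)

# Centered moments built as in equations (7) and (8)
variance <- e[2] - e[1]^2
skewness <- (e[3] - 3 * e[1] * e[2] + 2 * e[1]^3) / variance^1.5                      # about -0.8543
kurtosis <- (e[4] - 4 * e[1] * e[3] + 6 * e[2] * e[1]^2 - 3 * e[1]^4) / variance^2    # about 10.3656
```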
As shown in Equations (7) and (8), both ξ and ν define skewness and kurtosis.
Furthermore, careful scrutiny of the algebra yielding (7) shows that skewness exists if ν > 3.
Last, kurtosis in (8) is well defined if ν > 4. Given these restrictions on the underlying
parameters, it is clear that the range of skewness and kurtosis will also be restricted to
a certain domain. Skewness is governed mainly by the ξ parameter and kurtosis mainly by ν.
Note that when ξ = 1 and ν = +∞, we get the skewness and the kurtosis of the Gaussian
density, and when ξ = 1 but ν > 2, we have the skewness and the kurtosis of the
(standardized) Student-t distribution.
² Lambert and Laurent (2001) and Giot and Laurent (2003a) have shown that for various daily
financial returns, it is realistic to assume that ẑt follows a skewed Student-t distribution.
³ There is a closed form of the kurtosis but it is not tractable.
In Figure 3 the two skewed Student-t probability density functions are mirror images of
each other with respect to the mean (= 0). The SKST(0,1,3,0.816) has a mode equal to
−m/s = 0.2892 and the SKST(0,1,3,1.225) has a mode equal to −m/s = −0.2892.
Figure 4: Probability density functions N (0, 1), ST (0, 1, ν) and SKST (0, 1, ν, ξ) with
ν = 5 and 15 and ξ = 1.3, 1.5 and 2.
Figure 5: Probability density functions N (0, 1), ST (0, 1, ν) and SKST (0, 1, ν, ξ) with
ν = 5 and 15 and ξ = 1/1.3, 1/1.5 and 1/2.
Lambert and Laurent (2001) show that the quantile function skst*α,ν,ξ of a non-standardized
skewed Student-t density is
\[ skst^{*}_{\alpha,\nu,\xi} = \begin{cases} \frac{1}{\xi}\, st_{\alpha,\nu}\left[\frac{\alpha}{2}(1+\xi^2)\right], & \text{if } \alpha < \frac{1}{1+\xi^2}, \\ -\xi\, st_{\alpha,\nu}\left[\frac{1-\alpha}{2}(1+\xi^{-2})\right], & \text{if } \alpha \ge \frac{1}{1+\xi^2}, \end{cases} \]
where stα,ν is the quantile function of the (unit variance) Student-t density. It is
straightforward to obtain the quantile function of the standardized skewed Student-t:
skstα,ν,ξ = (skst*α,ν,ξ − m)/s.
In the next example, we fit a skewed Student-t distribution to daily logarithmic returns
of the NIKKEI 225 stock index (2000-2015). We obtain the estimated parameters ν = 3.792
and ξ = 0.965. The first four moments implied by the estimation are: mean = 0.012,
standard deviation = 1.526, skewness = −0.303 and kurtosis = Inf. The kurtosis is infinite
because the estimated ν lies outside the domain for which kurtosis is defined: ν > 4 is
necessary for the kurtosis to exist. The sample statistics of the data series are:
mean = 0.00012, standard deviation = 1.499, skewness = −0.409 and kurtosis = 9.725.
We observe that the fit is good. The 1-day VaR1% is −4.155 with the skewed Student-t,
−4.041 with the Student-t (ξ = 1), −3.538 with the Normal (ξ = 1 and ν = ∞), and the
historical VaR is −4.088. We note that the VaR moves leftwards when negative skewness and
excess kurtosis are allowed for.
> library(fGarch)
> returns<- scan("rnikkei225.txt")
> fit<- sstdFit(returns)
> param_sstd<- fit$estimate
> mean<- param_sstd[1]
> sd<- param_sstd[2]
> nu<- param_sstd[3]
> xi<- param_sstd[4]
> objective<- fit$minimum
> VaR_sstd<- qsstd(0.01,mean,sd,nu,xi)
> sstdSlider(type="dist")
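The VaR figures quoted above can be approximately reproduced from the reported estimates without the data set. The sketch below implements the Lambert and Laurent quantile function in base R rather than calling fGarch; the helper names q_t and qskst are hypothetical, and small differences from the reported values are to be expected because the inputs are rounded.

```r
# Quantile of the unit-variance Student-t
q_t <- function(a, nu) qt(a, df = nu) * sqrt((nu - 2) / nu)

# Quantile of the standardized skewed Student-t (Lambert and Laurent form)
qskst <- function(a, nu, xi) {
  m <- gamma((nu - 1) / 2) * sqrt(nu - 2) / (sqrt(pi) * gamma(nu / 2)) * (xi - 1 / xi)
  s <- sqrt(xi^2 + 1 / xi^2 - 1 - m^2)
  q_star <- if (a < 1 / (1 + xi^2)) {
    q_t(a / 2 * (1 + xi^2), nu) / xi
  } else {
    -xi * q_t((1 - a) / 2 * (1 + 1 / xi^2), nu)
  }
  (q_star - m) / s
}

# Estimates quoted in the text for the NIKKEI 225 example
mu_hat <- 0.012; sd_hat <- 1.526; nu_hat <- 3.792; xi_hat <- 0.965

VaR_norm <- mu_hat + sd_hat * qnorm(0.01)                  # about -3.538
VaR_std  <- mu_hat + sd_hat * qskst(0.01, nu_hat, 1)       # compare with -4.041 in the text
VaR_skst <- mu_hat + sd_hat * qskst(0.01, nu_hat, xi_hat)  # compare with -4.155 in the text
```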
Figure 6: Example with sstdSlider of fGarch package.
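\[ Var(X) = \beta^2\, 2^{2/\kappa}\, \frac{\Gamma(3\kappa^{-1})}{\Gamma(\kappa^{-1})} \]
\[ K(X) = \frac{\Gamma(5\kappa^{-1})\,\Gamma(\kappa^{-1})}{\Gamma^2(3\kappa^{-1})} \]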
As κ increases the density gets flatter and flatter while in the limit as κ → ∞ the
distribution tends toward the uniform. Special cases are the Normal when κ = 2 and the
Laplace when κ = 1.
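Standardization is simple and involves rescaling the density to have unit standard
deviation,
\[ Var(X) = 1 \;\Rightarrow\; \beta = \sqrt{2^{-2/\kappa}\, \frac{\Gamma(\kappa^{-1})}{\Gamma(3\kappa^{-1})}} \]
Finally, substituting into the conditional density,
\[ f(z|\kappa) = \frac{\kappa\, \exp\left(-\frac{1}{2}\left|z/\beta\right|^{\kappa}\right)}{\beta\, 2^{1+\kappa^{-1}}\, \Gamma(\kappa^{-1})}, \qquad \beta = \sqrt{2^{-2/\kappa}\, \frac{\Gamma(\kappa^{-1})}{\Gamma(3\kappa^{-1})}} \]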
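The standardized GED density and the kurtosis formula can be verified numerically. The sketch below writes the density in terms of the unit-variance scale β (so the exponent involves z/β); the function names and the value κ = 1.3 are illustrative choices.

```r
# Scale beta that gives the GED unit variance
ged_beta <- function(kappa) sqrt(2^(-2 / kappa) * gamma(1 / kappa) / gamma(3 / kappa))

# Standardized GED density
dged_std <- function(z, kappa) {
  b <- ged_beta(kappa)
  kappa * exp(-0.5 * abs(z / b)^kappa) / (b * 2^(1 + 1 / kappa) * gamma(1 / kappa))
}

# Kurtosis as a function of the shape parameter
ged_kurtosis <- function(kappa) gamma(5 / kappa) * gamma(1 / kappa) / gamma(3 / kappa)^2

k_normal  <- ged_kurtosis(2)   # 3, the Normal case
k_laplace <- ged_kurtosis(1)   # 6, the Laplace case

kappa <- 1.3                   # illustrative value
area <- integrate(dged_std, -Inf, Inf, kappa = kappa)$value                 # about 1
v    <- integrate(function(z) z^2 * dged_std(z, kappa), -Inf, Inf)$value    # about 1

# For kappa = 2 the density coincides with the standard Normal pdf
normal_gap <- max(abs(dged_std(c(-2, 0, 1), 2) - dnorm(c(-2, 0, 1))))
```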
The skewed version proposed by Fernández and Steel is obtained from the GED probability
density function by replacing g(·) in equation (6) with the standardized GED density.
In Figure 7 the two skewed GED probability density functions are mirror images of each
other with respect to the mean (= 0). The SGED(0,1,2,0.816) has a mode equal to
−m/s = 0.3163 and the SGED(0,1,2,1.225) has a mode equal to −m/s = −0.3163. The skewness
of these distributions is −0.3118 and 0.3118, respectively, and the kurtosis is close to 3
because for κ = 2 we are in the Normal case.
> library(fGarch)
> returns<- scan("rnikkei225.txt")
> fit<- sgedFit(returns)
> param_sged<- fit$par
> mean<- param_sged[1]
> sd<- param_sged[2]
> kappa<- param_sged[3]
> xi<- param_sged[4]
> objective<- fit$objective
> VaR_sged<- qsged(0.01,mean,sd,kappa,xi)
> sgedSlider(type="dist")
3.3 Johnson SU
The Johnson SU distribution is one of the distributions derived by Johnson (1949) by
applying certain transformations to a Normal variable. Letting Z ∼ N(0, 1) be a standard
Normal variable, the random variable Y belongs to the Johnson system of frequency curves
if Z = γ + δg((Y − ξ)/λ). The form of the resulting distribution depends on the choice of
the function g. When g(u) = sinh⁻¹(u), the distribution is unbounded and is called the
Johnson SU distribution. The parameters of the distribution are ξ, λ > 0, γ and δ > 0.
We use a reparametrization⁴ of the original Johnson SU distribution, so that the parameters
ξ and λ are the mean and the standard deviation of the distribution. The parameter γ
determines the skewness of the distribution, with γ > 0 indicating positive skewness and
γ < 0 negative skewness. The parameter δ determines the kurtosis of the distribution; δ
should be positive and is most likely in the region above 1.
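The pdf of the Johnson SU, denoted here as JSU(ξ, λ, γ, δ), is defined by
\[ f_Y(y) = \frac{\delta}{c\lambda}\, \frac{1}{\sqrt{r^2+1}}\, \frac{1}{\sqrt{2\pi}} \exp\left(-\frac{1}{2} z^2\right) \]
where, in this parametrization,
\[ z = -\gamma + \delta \sinh^{-1}(r) = -\gamma + \delta \log\left[r + (r^2+1)^{1/2}\right] \]
whereas in the original parametrization
\[ z = \gamma + \delta \sinh^{-1}(r) = \gamma + \delta \log\left[r + (r^2+1)^{1/2}\right] \]
and
\[ r = (y - \xi)/\lambda \]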
Note that Z ∼ N(0, 1), y ∈ R, and φ is the probability density function of the standard
Normal distribution. The parameters of the JSU are (ξ, λ, γ, δ)′, which affect the
location, scale, skewness and kurtosis of the distribution, respectively. In the original
parametrization the distribution is positively or negatively skewed according as γ is
negative or positive. Holding γ fixed, increasing δ reduces the kurtosis. As δ → ∞ the
distribution approaches the Normal density function. The parameters are not themselves
the moments of the distribution. We give the first four moments of the Johnson SU
distribution. The mean and the variance are
\[ E(Y) = \xi - \lambda\, \omega^{1/2} \sinh\Omega \]
\[ Var(Y) = \sigma^2 = \frac{\lambda^2}{2} (\omega - 1)(\omega \cosh 2\Omega + 1) \]
⁴ This parametrization is used by the R package rugarch, which we use for estimating the
parameters of our models.
where ω = exp(δ⁻²) and Ω = γ/δ. Since there is not much simplification in the
expressions for skewness and kurtosis, we give the third and fourth central moments µ3
and µ4, respectively
\[ \mu_3 = -\frac{1}{4} \lambda^3 \omega^{1/2} (\omega - 1)^2 \left[\omega(\omega + 2)\sinh 3\Omega + 3\sinh\Omega\right] \]
\[ \mu_4 = \frac{1}{8} \lambda^4 (\omega - 1)^2 \left[\omega^2(\omega^4 + 2\omega^3 + 3\omega^2 - 3)\cosh 4\Omega + 4\omega^2(\omega + 2)\cosh 2\Omega + 3(2\omega + 1)\right] \]
From the transformation of the Normal distribution, the cumulative distribution func-
tion of the JSU distribution is shown below. If Y ∼ JSU (ξ, λ, γ, δ), FY (y) = Φ(γ +
δsinh−1 [(y − ξ)/λ]) where the function Φ(u) is the cumulative distribution function of the
standard Normal distribution.
From the equation above, the quantile function FY−1 can be directly derived as FY−1 =
ξ + λsinh[(Φ−1 (p) − γ)/δ] where the quantile function simply depends on the quantiles of
the standard Normal distribution Φ−1 (p).
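The cdf, the quantile function and the first moments can be checked against each other in a few lines of base R. The sketch below uses the original parametrization, Z = γ + δ sinh⁻¹((Y − ξ)/λ), with ω = exp(δ⁻²) and Ω = γ/δ; the function names pjsu and qjsu and the parameter values are hypothetical and do not refer to any package.

```r
# Johnson SU, original parametrization: Z = gam + delta * asinh((Y - xi) / lambda)
pjsu <- function(y, xi, lambda, gam, delta) pnorm(gam + delta * asinh((y - xi) / lambda))
qjsu <- function(p, xi, lambda, gam, delta) xi + lambda * sinh((qnorm(p) - gam) / delta)

xi <- 0; lambda <- 1; gam <- 0.5; delta <- 2   # illustrative values

# Round trip: the quantile function inverts the cdf
p <- c(0.01, 0.25, 0.5, 0.9)
round_trip <- pjsu(qjsu(p, xi, lambda, gam, delta), xi, lambda, gam, delta)

# Closed-form mean and variance
w     <- exp(delta^(-2))
Omega <- gam / delta
mean_cf <- xi - lambda * sqrt(w) * sinh(Omega)
var_cf  <- lambda^2 / 2 * (w - 1) * (w * cosh(2 * Omega) + 1)

# Numerical moments from Y = xi + lambda * sinh((Z - gam) / delta), Z ~ N(0, 1)
y_of_z   <- function(z) xi + lambda * sinh((z - gam) / delta)
mean_num <- integrate(function(z) y_of_z(z) * dnorm(z), -Inf, Inf)$value
var_num  <- integrate(function(z) (y_of_z(z) - mean_num)^2 * dnorm(z), -Inf, Inf)$value
mu3_num  <- integrate(function(z) (y_of_z(z) - mean_num)^3 * dnorm(z), -Inf, Inf)$value
skew_num <- mu3_num / var_num^1.5   # negative here, since gam > 0 in this parametrization
```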
Figure 9 shows the distribution’s authorized domain, i.e. the region of values of skew-
ness and kurtosis for which a density exists. This is known as the Hamburger moment
problem, which characterizes the maximum attainable skewness given a level of kurtosis
(see Widder, 1946). From the plot, it is clear that the skewed Student-t has the widest
possible combination of skewness and kurtosis for values of kurtosis less than ∼ 9, whilst
the Johnson SU distribution has the widest combination for values greater than ∼ 9.
Figure 9: Region for Skewness-Kurtosis for which Skewed Student-t and Johnson SU
distributions exist.
3.3.1 The package SuppDists
The package SuppDists has ten distributions supplementing those built into R, Inverse
Gauss, Kruskal-Wallis, Kendall’s Tau, Friedman’s chi-squared, Spearman’s rho, maximum
F ratio, the Pearson product moment correlation coefficient, Johnson distributions, nor-
mal scores and generalized hypergeometric distributions.
> library(SuppDists)
> returns<- scan("rnikkei225.txt")
> fit<- JohnsonFit(returns)
> xi<- fit$xi
> lambda<- fit$lambda
> gamma<- fit$gamma
> delta<- fit$delta
> VaR_jsu<- qJohnson(0.01,fit)
> hist(returns,freq=FALSE,breaks=50)
> plot(function(x)dJohnson(x,fit),-15,15,add=TRUE,col=2)
Figure 10 shows the fit of Johnson SU distribution for NIKKEI return series.
where
\[ m = \frac{2\nu\sigma\lambda\, q^{1/p}\, B\left(\frac{2}{p}, q - \frac{1}{p}\right)}{B\left(\frac{1}{p}, q\right)} \]
\[ \nu = q^{-1/p} \left[ (3\lambda^2 + 1)\, \frac{B\left(\frac{3}{p}, q - \frac{2}{p}\right)}{B\left(\frac{1}{p}, q\right)} - 4\lambda^2 \left( \frac{B\left(\frac{2}{p}, q - \frac{1}{p}\right)}{B\left(\frac{1}{p}, q\right)} \right)^2 \right]^{-1/2} \]
B(·) is the beta function, and µ, σ, λ, p and q are the location, scale, skewness, peakedness
and tail-thickness parameters, respectively. Note that the parameters have the following
restrictions: σ > 0, −1 < λ < 1, p > 0 and q > 0. The skewness parameter λ controls
the rate of descent of the density around x = 0. The parameters p and q control the
height and tails of the density, respectively. The parameter q has the degrees of freedom
interpretation in case λ = 0 and p = 2.
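The skewness, for pq > 3, is
\[ Sk(X) = \frac{2 q^{3/p} \lambda (\nu\sigma)^3}{B\left(\frac{1}{p}, q\right)^3} \left[ 8\lambda^2 B\left(\frac{2}{p}, q - \frac{1}{p}\right)^3 - 3(1 + 3\lambda^2)\, B\left(\frac{1}{p}, q\right) B\left(\frac{2}{p}, q - \frac{1}{p}\right) B\left(\frac{3}{p}, q - \frac{2}{p}\right) + 2(1 + \lambda^2)\, B\left(\frac{1}{p}, q\right)^2 B\left(\frac{4}{p}, q - \frac{3}{p}\right) \right] \]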
> library(sgt)
> start = list(lambda = 0, p = 2, q = 6)
> mu.f = mu ~ mean(returns)
> sigma.f = sigma ~ sd(returns)
> fit<- sgt.mle(X.f= ~ returns, mu.f=mu.f, sigma.f=sigma.f, start=start,
method="nlm", mean.cent = TRUE, var.adj = TRUE)
> summary(fit)
> param_sgt<- fit$estimate
> VaR_sgt<- qsgt(0.01, mu = mean(returns), sigma = sd(returns),
lambda = param_sgt[1], p = param_sgt[2],q = param_sgt[3])
> start1 = list(mu=0, sigma=1,lambda = 0, p = 2, q = 6)
> fit1<- sgt.mle(X.f=~ returns, start=start1)
> param1_sgt<- fit1$estimate
> VaR1_sgt<- qsgt(0.01, mu = param1_sgt[1], sigma = param1_sgt[2],
lambda = param1_sgt[3], p = param1_sgt[4], q = param1_sgt[5])
Figure 12 shows the flexibility of the SGT distribution. The black curve in each graph
has parameter values: µ = 0, σ = 1, λ = 0, p = 2 and q = 100. This approximates a
standard Normal pdf very closely. All other curves change just one parameter.
is the only subclass of the GH family of distributions having this property. This is an
alternative for modeling the empirical distribution of financial returns, which is often
skewed, having one heavy tail and one semi-heavy, or more Gaussian-like, tail. The skew
extensions to the Student-t distribution, like that of Fernández and Steel, have two tails
behaving as polynomials. This means that they fit heavy-tailed data well, but they do not
handle substantial skewness, since substantial skewness requires one heavy tail and one
non-heavy tail.
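The probability density function of the GH Skew Student-t is given by
\[ f_X(x) = \frac{2^{\frac{1-\nu}{2}}\, \delta^{\nu}\, |\beta|^{\frac{\nu+1}{2}}\, K_{\frac{\nu+1}{2}}\left(\sqrt{\beta^2(\delta^2 + (x-\mu)^2)}\right) \exp(\beta(x-\mu))}{\Gamma\left(\frac{\nu}{2}\right)\sqrt{\pi}\, \left(\sqrt{\delta^2 + (x-\mu)^2}\right)^{\frac{\nu+1}{2}}}, \qquad \beta \ne 0 \]
and
\[ f_X(x) = \frac{\Gamma\left(\frac{\nu+1}{2}\right)}{\sqrt{\pi}\,\delta\,\Gamma\left(\frac{\nu}{2}\right)} \left[1 + \frac{(x-\mu)^2}{\delta^2}\right]^{-(\nu+1)/2}, \qquad \beta = 0 \]
where Kν(x) is the modified Bessel function (Abramowitz and Stegun, 1972), which satisfies
\[ K_\nu(x) \sim \sqrt{\frac{\pi}{2x}}\, \exp(-x) \quad \text{for } x \to \infty, \]
and µ, δ, β and ν are the location, scale, skewness and shape parameters, respectively.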
The density fX(x) when β = 0 can be recognized as that of a location-scale Student-t
distribution with ν degrees of freedom, expectation µ and variance δ²/(ν − 2).
The mean and variance of a GH skew Student-t distributed random variate X are
\[ E(X) = \mu + \frac{\beta\delta^2}{\nu - 2} \]
and
\[ Var(X) = \frac{2\beta^2\delta^4}{(\nu-2)^2(\nu-4)} + \frac{\delta^2}{\nu - 2} \]
The variance is only finite when ν > 4, as opposed to the symmetric Student-t distribution,
which only requires ν > 2. The derivation of the skewness and kurtosis is relatively
straightforward (but cumbersome) due to the normal mixture structure of the distribution.
These are given by
\[ Sk(X) = \frac{2(\nu-4)^{1/2}\beta\delta}{\left[2\beta^2\delta^2 + (\nu-2)(\nu-4)\right]^{3/2}} \left[ 3(\nu-2) + \frac{8\beta^2\delta^2}{\nu-6} \right] \]
and
\[ K(X) = \frac{6}{\left[2\beta^2\delta^2 + (\nu-2)(\nu-4)\right]^2} \left[ (\nu-2)^2(\nu-4) + \frac{16\beta^2\delta^2(\nu-2)(\nu-4)}{\nu-6} + \frac{8\beta^4\delta^4(5\nu-22)}{(\nu-6)(\nu-8)} \right] \]
The skewness and kurtosis do not exist when ν ≤ 6 and ν ≤ 8, respectively.
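The normal mixture structure mentioned above also gives a simple way to simulate from the distribution and to check the mean and variance formulas. The sketch below assumes the usual variance-mean mixture representation X = µ + βW + √W·Z with W inverse-gamma distributed with shape ν/2 and rate δ²/2; this representation is not spelled out in the text, and the parameter values are illustrative.

```r
set.seed(42)
n <- 200000
mu <- 0; delta <- 2; beta <- 1; nu <- 12   # illustrative values; nu > 8 so four moments exist

# Mixing variable W ~ Inverse-Gamma(shape = nu / 2, rate = delta^2 / 2)
w <- 1 / rgamma(n, shape = nu / 2, rate = delta^2 / 2)

# Normal variance-mean mixture: X | W ~ N(mu + beta * W, W)
x <- mu + beta * w + sqrt(w) * rnorm(n)

# Closed-form mean and variance from the text
mean_cf <- mu + beta * delta^2 / (nu - 2)
var_cf  <- 2 * beta^2 * delta^4 / ((nu - 2)^2 * (nu - 4)) + delta^2 / (nu - 2)

mean_sim <- mean(x)   # close to mean_cf
var_sim  <- var(x)    # close to var_cf
```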
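Utilizing the properties of the modified Bessel function, it can be shown that in the
tails, the skew Student-t density behaves as
\[ f_X(x) \sim \text{const}\, |x|^{-\nu/2-1} \exp(-|\beta x| + \beta x) \quad \text{when } x \to \pm\infty \]
Hence the heaviest tail decays as
\[ f_X(x) \sim \text{const}\, |x|^{-\nu/2-1} \quad \text{when } \begin{cases} \beta < 0 \text{ and } x \to -\infty \\ \beta > 0 \text{ and } x \to +\infty \end{cases} \]
and the lightest as
\[ f_X(x) \sim \text{const}\, |x|^{-\nu/2-1} \exp(-2|\beta x|) \quad \text{when } \begin{cases} \beta < 0 \text{ and } x \to +\infty \\ \beta > 0 \text{ and } x \to -\infty \end{cases} \]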
The fitting of data to the skew hyperbolic Student-t distribution is accomplished by the
function skewhypFit. Suitable starting values can be determined with the routine
skewhypFitStart. The parameters are determined numerically by applying the ML principle. The
negative log-likelihood is minimized by employing either the general purpose optimizer
optim or the function nlm. For the former the user can use either the BFGS or Nelder-
Mead algorithm. The function skewhypFit returns an object of informal class skewhypFit.
For objects of this kind print, plot and summary methods are available. Goodness of fit
can be inspected graphically by means of a QQ and/or PP plot. The relevant functions
are termed qqskewhyp and ppskewhyp, respectively.
> library(SkewHyperbolic)
> returns<-scan("rnikkei225.txt")
> fit<-skewhypFit(returns)
> param_ghst<-fit$param
> mu<-param_ghst[1]
> delta<-param_ghst[2]
> beta<-param_ghst[3]
> nu<-param_ghst[4]
> skewhypMean(param=param_ghst)
> skewhypVar(param=param_ghst)
> skewhypSkew(param = param_ghst)
> skewhypKurt(param = param_ghst)
> mean<-mu+(delta^2*beta)/(nu-2)
> var<-2*beta^2*delta^4/((nu-2)^2*(nu-4))+delta^2/(nu-2)
> skewness<- 2*(nu-4)^(1/2)*beta*delta/(2*beta^2*delta^2+(nu-2)*(nu-4))^(3/2)*
(3*(nu-2)+8*beta^2*delta^2/(nu-6))
> kurtosis<- (6/((2*beta^2*delta^2+(nu-2)*(nu-4))^2))*((nu-2)^2*(nu-4)+
(16*beta^2*delta^2*(nu-2)*(nu-4))/(nu-6)+(8*beta^4*delta^4*(5*nu-22))/((nu-6)*(nu-8)))
> VaR_ghst<-qskewhyp(0.01,param=param_ghst)
# Right tail
> paramFit <- skewhypFit(returns, plots = FALSE)$param
> tailPlot(returns)
> skewhypTailPlotLine(returns, param = paramFit, col = "steelblue")
# Left tail
> tailPlot(returns, side = "l")
> skewhypTailPlotLine(returns, param = paramFit, side = "l",col = "steelblue")
Figure 13: Left and right tail plots for NIKKEI225 with Generalized Hyperbolic Skew
Student-t.
Figure 14: QQ-plot and PP-plot for NIKKEI225 with Generalized Hyperbolic Skew
Student-t.
Figure 15: Tail-distributions for NIKKEI225. SSTD is Skewed Student-t, SGED is the
Skewed Generalized Error, JSU is the Johnson SU and GHST is the Generalized Hyper-
bolic Skew Student-t distribution.
Distribution                             VaR1%
Historical                               -4.088
Normal                                   -3.538
Student-t                                -4.041
Skewed Student-t                         -4.155
Skewed Generalized Error                 -4.483
Johnson SU                               -4.101
Skewed Generalized-t                     -4.218
Generalized Hyperbolic Skew Student-t    -4.302
4 Asymmetric GARCH models
GARCH models are able to capture volatility clustering as well as some amount of fat-
tailedness. However, in these models, positive and negative past values have a symmetric
effect on the conditional variance. Yet, an abundant literature has documented that
negative returns (“bad news”) tend to be followed by larger increases in volatility than
equally large positive returns (“good news”).
Pagan and Schwert (1990) and Engle and Ng (1993) have defined the concept of the news
impact curve, which relates past return shocks (news) to current volatility. This curve
measures how new information is incorporated into volatility estimates. In the standard
GARCH model the curve is symmetric around zero, whereas in asymmetric models it is
designed to increase differently in the two directions.
Several parameterizations have been proposed to capture such asymmetry in the re-
sponse of volatility to shocks. To mention some of them, we have the Exponential GARCH
(EGARCH) model of Nelson (1991), the Threshold GARCH (TGARCH) of Zakoian
(1994), the GJR model of Glosten, Jagannathan and Runkle (1993), the Asymmetric
Power ARCH (APARCH) model of Ding, Granger and Engle (1993) and the Absolute
GARCH (AGARCH) model of Hentschel (1995). Some general expressions have been
designed to incorporate the most well-known asymmetric GARCH models (Higgins and
Bera, 1992, or Hentschel, 1995).
We focus on the APARCH model. This volatility model introduces a Box-Cox transformation
of the conditional standard deviation, and the free parameter (δ) determines
the shape of the transformation.
2. \( 0 \le \sum_{i=1}^{q} \alpha_i + \sum_{j=1}^{p} \beta_j \le 1 \)
The APARCH model is a general model with great flexibility, having as special cases:
i) the simple ARCH model of Engle (1982) when δ = 2, βj = 0 (j = 1, ..., p)
and γi = 0 (i = 1, ..., q); ii) the simple GARCH model of Bollerslev (1986) when δ = 2
and γi = 0 (i = 1, ..., q); iii) the Absolute Value GARCH (AVGARCH) model of Taylor
(1986) and Schwert (1990) when δ = 1 and γi = 0 (i = 1, ..., q); iv) the GJR-GARCH
model of Glosten et al. (1993) when δ = 2; v) the Threshold GARCH (TGARCH) model
of Zakoian (1994) when δ = 1; vi) the Non Linear ARCH model of Higgins and Bera (1992)
when βj = 0 (j = 1, ..., p) and γi = 0 (i = 1, ..., q); vii) the Log-ARCH model of Geweke
(1986) and Pantula (1986) when δ → 0.
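The role of γ and δ, and the GARCH special case, can be illustrated with a one-step news impact calculation. The sketch below assumes the usual Ding, Granger and Engle form for an APARCH(1,1), σt^δ = ω + α(|εt−1| − γεt−1)^δ + βσt−1^δ; the function name aparch_sigma and all parameter values are hypothetical.

```r
# One-step APARCH(1,1) update, returning sigma_t (assumed Ding-Granger-Engle form)
aparch_sigma <- function(eps_prev, sigma_prev, omega, alpha, gamma, beta, delta) {
  s_delta <- omega + alpha * (abs(eps_prev) - gamma * eps_prev)^delta + beta * sigma_prev^delta
  s_delta^(1 / delta)
}

# Hypothetical parameter values, chosen only for illustration
omega <- 0.05; alpha <- 0.08; beta <- 0.90; sigma_prev <- 1

shocks <- c(-2, 2)   # a negative and a positive shock of equal size
nic_aparch <- aparch_sigma(shocks, sigma_prev, omega, alpha, gamma = 0.5, beta, delta = 1.2)
nic_garch  <- aparch_sigma(shocks, sigma_prev, omega, alpha, gamma = 0,   beta, delta = 2)

# GARCH(1,1) written directly, for comparison with the delta = 2, gamma = 0 case
garch_direct <- sqrt(omega + alpha * shocks^2 + beta * sigma_prev^2)
```

With γ > 0 the negative shock raises volatility more than the positive one, while with δ = 2 and γ = 0 the response is symmetric and coincides with the GARCH(1,1) update.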
A typical work flow would start by specifying the kind of GARCH model with the
function ugarchspec. This is similar in design and purpose to the function garchSpec in
fGarch introduced earlier. It takes five arguments. The kind of the variance model is
determined by a list object for the variance.model argument, the mean equation is set
with the mean.model argument and the distribution of the error process with the
distribution.model argument. Starting values for the parameters according to these first three
arguments, as well as whether any of them should be kept fixed, can be controlled with
the start.pars and fixed.pars arguments. The function returns an object of formal class
uGARCHspec.
Objects of this class can then be used for fitting data to the chosen specification,
which is achieved by calling ugarchfit. Apart from a uGARCHspec object, which is passed as
argument spec to the function, the data set is passed to the body of the function as argument
data. This can be either a numeric vector, a matrix or data.frame object or one of the
specific time series class objects: zoo, xts, timeSeries or irts. The numerical solution of the
model can be determined by one of the optimizers nlminb, solnp or gosolnp, which is set
by the argument solver. Control arguments pertinent to these solvers can be passed down
by providing a list object for the argument solver.control. In addition, the conformance of
stationarity constraints, the calculation of standard errors in the case of fixed parameters
and/or whether the data is to be scaled prior to optimization can be set with the argument
fit.control. A feature of the function is the argument out.sample. Here the user can curtail
the size of the sample used for fitting, leaving the remaining n data points available for
pseudo ex ante forecasts.
Forecasts can be generated with the function ugarchforecast. Here, either a fitted
or specified model has to be provided. If the latter is used, a valid set of fixed parameters
must have been set and a data set must be passed to the function as argument data. The
number of forecast periods and/or whether the forecasts shall be derived from a rolling
window are determined by the arguments n.ahead and n.roll, respectively. If exogenous
regressors are specified in either the mean or the variance equation, values for these must
be supplied as a list object for external.forecasts. The forecast function returns object
uGARCHforecast for which data extraction, plotting and performance measure methods
are available. Confidence bands for the forecasts can be generated by means of bootstrap-
ping which is implemented as function ugarchboot.
In addition to using a fitted model for forecasting purposes, the function ugarchsim
enables the user to simulate data from it. Data extraction, show and plot methods are
implemented for the returned object of class uGARCHsim.
The function ugarchroll can be used to back-test the VaR of a return/loss series. The
user can swiftly produce VaR forecasts of a GARCH model and analyze the results of
the object returned, which is of class uGARCHroll, with the methods provided for data
extraction of the VaR numbers, plotting, reporting, testing of the forecast performance
and/or summarizing the backtest results. The backtest can be conducted either for a
recursively extending or a moving sample window.
The user can filter data for a given model specification by utilizing the function
ugarchfilter. The function returns object uGARCHfilter for which the same set of methods is
available as in the case of uGARCHfit objects. Hence, filtering could in principle be uti-
lized to assess the robustness of results, given differing parameter values, and to check
whether an alternative specification yields reasonable results.
> library(rugarch)
> datos<-scan("rnikkei225.txt")
> variance.model=list(model="apARCH",garchOrder=c(1,1),submodel=NULL,
external.regressors=NULL)
> mean.model=list(armaOrder=c(1,0),include.mean=T,arfima=F,external.regressors=NULL)
> spec=ugarchspec(variance.model=variance.model,mean.model=mean.model,
distribution.model="jsu")
> fit=ugarchfit(data=datos,spec=spec,out.sample=0,solver="hybrid")
> gamma<-coef(fit)["skew"]
> delta<-coef(fit)["shape"]
> w<-exp(delta^(-2))
> omega<- -gamma/delta
> omega1<- -omega
> lambda<- sqrt(2/((w-1)*(w*cosh(2*omega1)+1)))
> xi<- -lambda*w^(1/2)*sinh(omega1)
The AR(1) model is specified for the conditional mean return, which is sufficient to
produce serially uncorrelated innovations, as the Ljung-Box test shows. The Ljung-Box Q
statistics for nine/five/one lag(s) computed on the standardized residuals are
1.87/1.48/1.38 with p-values 0.75/0.45/0.24, which are not significant at the 1%
significance level. The APARCH(1,1) model is particularly successful in capturing the
heteroskedasticity exhibited by the data. The Ljung-Box Q statistics for nine/five/one
lag(s) computed on the squared standardized residuals are 1.87/1.48/1.39 with p-values
0.23/0.49/0.32, which are also not significant at the 1% significance level. The
autoregressive effect in volatility is strong, with β1 around 0.90, suggesting a strong
memory effect. The coefficient γ1 is positive and statistically significant for this
series, indicating the existence of a leverage effect for negative returns in the
conditional variance specification. Finally, δ takes the value 1.23, which is
significantly different from 2. This result suggests that, contrary to standard practice,
we should model the conditional standard deviation rather than the conditional variance.
The estimated parameters of the Johnson SU distribution suggest that negative asymmetry
should be incorporated in the distribution in order to model the standardized innovations
appropriately, i.e. the skew parameter (γ in the Johnson SU distribution) is negative and
statistically significant. We also observe that this distribution captures the kurtosis of
the standardized innovations: the estimated shape parameter δ is greater than 2 but finite
(the Normal distribution is recovered only as δ → ∞), so that the kurtosis is greater
than 3.
In summary, these results indicate the need for a model featuring a negative lever-
age effect in the conditional variance equation combined with an asymmetric distribution
for the underlying error term when representing stock market data like NIKKEI225.
Figure 16: Estimated parameters of model AR(1)-APARCH(1,1) with Johnson SU distri-
bution.
Figure 17 shows 1% 1-day VaR for NIKKEI 225 obtained with AR(1)-APARCH(1,1)-
JSU model.
Figure 17: Daily percentage returns and V aR1% obtained with AR(1)-APARCH(1,1)-JSU
model.
Table 2 shows that the mean VaR1% loss is lowest with the Normal distribution and
highest with the Generalized Hyperbolic Skew Student-t. We also observe that, for all
distributions, the minimum and maximum VaR occur on 28/10/2008 and 22/07/2005,
respectively. The maximum expected 1-day loss at the 99% confidence level occurs on
28/10/2008 (in the crisis period) and is highest with the Generalized Hyperbolic Skew
Student-t. Looking at the medians, which are more robust to outliers than the means
because the VaR series are not normally distributed, we observe that the Generalized
Hyperbolic Skew Student-t has a higher expected loss than the Normal distribution.
In a backtesting analysis, the Generalized Hyperbolic Skew Student-t distribution
sometimes tends to overestimate risk, and the Normal and Student-t tend to underestimate
risk. In general, the Skewed Generalized Error, the Johnson SU and the Skewed
Generalized-t are suitable for estimating VaR.
Table 2: Summary of conditional V aR1% obtained from different distributions with AR(1)-
APARCH(1,1) model.