Chapter 8 (Brooks) : Modelling Volatility and Correlation
1
Stylized Facts of Financial Data
• The linear structural and time series models assume constant error variance. This assumption is not realistic for financial time series data, especially at high frequency. Some of the stylized facts of financial returns data are as follows:
- Fat tails and leptokurtosis: The distribution of financial data has tails thicker than those of the normal distribution; the tails decay more slowly than $e^{-x^2}$. The kurtosis of financial data is almost always greater than 3 (the value for the normal distribution). Ignoring this may lead to underestimation of tail risk, which has obvious consequences for asset allocation and risk management.
- Volatility clustering: The tendency for volatility in financial markets to appear in bunches. Thus large returns (of either sign) are expected to follow large returns, and small returns (of either sign) to follow small returns. This is due to the information arrival process: information (news) arrives in bunches.
- Asymmetric volatility (leverage effects): The tendency of volatility to rise more following a large price fall than following a price rise of the same magnitude. A price fall increases the firm's debt-to-equity ratio (i.e. leverage), so the shareholders' residual claim is at greater risk.
2
A Sample Financial Asset Returns Time Series
[Figure: asset returns plotted against Date, 1/01/90 to 9/01/97; vertical axis from -0.08 to 0.04]
3
KSE-100 index returns
[Figure: daily KSE-100 index returns plotted against Day; vertical axis from -10 to 10]
4
KSE-100 index returns
[Figure: histogram of KSE index returns. Sample: Jan 1990 to July 1999. Observations: 2609]
5
Are KSE returns different from IID normal returns?
Simulated iid returns with the same mean and standard deviation as the KSE data for Jan 1990-Dec 1999
[Figure: simulated IID returns plotted against Day; vertical axis from -6 to 8]
6
KSE-100 index returns
[Figure: daily KSE-100 index returns plotted against Day; vertical axis from -10 to 10]
7
KSE-100 index returns
[Figure: histogram of KSE index returns. Sample: Jan 2000 to July 2008. Observations: 2217]
8
Daily Exchange rate (Pak Rs/US$) changes April 2006 to July 2008
[Figure: daily exchange rate log change (% exchange rate change) plotted against day]
9
Daily Exchange rate (Pak Rs/US$) changes April 2006 to July 2008
[Figure: histogram of daily exchange rate changes; horizontal axis from -2 to 2, vertical axis (frequency) from 0 to 400]
Series: Daily Exchange Rate
Sample: April 2006 to July 2008
Observations: 588
Mean          0.028107
Median        0.000000
Maximum       2.406344
Minimum      -1.902032
Std. Dev.     0.318125
Skewness      0.650558
Kurtosis      20.89349
Jarque-Bera   7885.813
Probability   0.000000
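The Jarque-Bera statistic in the table can be reproduced from the reported sample size, skewness and kurtosis. The sketch below assumes the standard formula $JB = \frac{n}{6}\left(S^2 + \frac{(K-3)^2}{4}\right)$ and a $\chi^2(2)$ reference distribution; the function names are illustrative and not part of the original slides.

```python
import math

def jarque_bera(n, skewness, kurtosis):
    """Jarque-Bera statistic from sample size, skewness and raw kurtosis."""
    return n / 6.0 * (skewness ** 2 + (kurtosis - 3.0) ** 2 / 4.0)

def chi2_sf_2df(x):
    """Upper-tail probability of a chi-squared(2) variable, which is exp(-x/2)."""
    return math.exp(-x / 2.0)

# Values copied from the exchange rate summary statistics above.
jb = jarque_bera(n=588, skewness=0.650558, kurtosis=20.89349)
p_value = chi2_sf_2df(jb)
print(jb, p_value)
```

A kurtosis of about 20.9 against the normal value of 3 dominates the statistic, which is why normality is rejected so decisively for this series.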
10
Monthly inflation rate (CPI-12 major cities %log change)
[Figure: monthly inflation rate (%) plotted against Month; vertical axis from -1.5 to 3]
11
Monthly inflation rate (CPI-12 major cities %log change)
[Figure: histogram of monthly inflation; horizontal axis from -0.5 to 2.0, vertical axis (frequency) from 0 to 14]
Series: INFLATION
Sample: 1995M01 to 2006M06
Observations: 126
Mean          0.512396
Median        0.471274
Maximum       2.366600
Minimum      -0.851813
Std. Dev.     0.599854
Skewness      0.381571
Kurtosis      3.255628
Jarque-Bera   3.400593
Probability   0.182629
12
Non-linear Models: A Definition
• Models with nonlinear $g(\cdot)$ are “non-linear in mean”, while those with nonlinear $\sigma^2(\cdot)$ are “non-linear in variance”. Our “traditional” structural model could be something like:
$y_t = \beta_1 + \beta_2 x_{2t} + \dots + \beta_k x_{kt} + u_t$, or more compactly $y = X\beta + u$.
• We also assumed $u_t \sim N(0, \sigma^2)$.
13
Heteroscedasticity Revisited
14
Autoregressive Conditionally Heteroscedastic
(ARCH) Models
• So use a model which does not assume that the variance is constant.
• Recall the definition of the variance of $u_t$:
$\sigma_t^2 = \mathrm{Var}(u_t \mid u_{t-1}, u_{t-2}, \dots) = E[(u_t - E(u_t))^2 \mid u_{t-1}, u_{t-2}, \dots]$
We usually assume that $E(u_t) = 0$,
so $\sigma_t^2 = \mathrm{Var}(u_t \mid u_{t-1}, u_{t-2}, \dots) = E[u_t^2 \mid u_{t-1}, u_{t-2}, \dots]$.
• What could the current value of the variance of the errors plausibly depend upon?
– Previous squared error terms.
• This leads to the autoregressive conditionally heteroscedastic model for the variance of the errors:
$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2$
• This is known as an ARCH(1) model.
15
Autoregressive Conditionally Heteroscedastic
(ARCH) Models (cont’d)
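• The full model would be
$y_t = \beta_1 + \beta_2 x_{2t} + \dots + \beta_k x_{kt} + u_t$, $u_t \sim N(0, \sigma_t^2)$
where $\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2$
• We can easily extend this to the general case where the error variance depends on q lags of squared errors:
$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \dots + \alpha_q u_{t-q}^2$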
16
Another Way of Writing ARCH Models
$u_t = v_t \sigma_t$, $\sigma_t = \sqrt{\alpha_0 + \alpha_1 u_{t-1}^2}$, $v_t \sim N(0,1)$
• The two are different ways of expressing exactly the same model. The first form is easier to understand, while the second form is required for simulation from an ARCH model.
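To illustrate why the second form is the one used for simulation, here is a minimal sketch that draws $v_t \sim N(0,1)$ and builds $u_t = v_t \sigma_t$ recursively. The parameter values, the zero starting value for the lagged error and the function name are assumptions made for illustration only.

```python
import math
import random

def simulate_arch1(alpha0, alpha1, n, seed=0):
    """Simulate u_t = v_t * sigma_t with sigma_t^2 = alpha0 + alpha1 * u_{t-1}^2."""
    rng = random.Random(seed)
    u_prev = 0.0  # assumed starting value for the lagged error
    u, sigma2 = [], []
    for _ in range(n):
        s2 = alpha0 + alpha1 * u_prev ** 2   # conditional variance given the past
        v = rng.gauss(0.0, 1.0)              # standard normal innovation v_t
        u_t = v * math.sqrt(s2)
        sigma2.append(s2)
        u.append(u_t)
        u_prev = u_t
    return u, sigma2

# Hypothetical parameters: alpha0 = 0.2, alpha1 = 0.5.
u, sigma2 = simulate_arch1(alpha0=0.2, alpha1=0.5, n=5000)
sample_var = sum(x * x for x in u) / len(u)
print(sample_var, 0.2 / (1 - 0.5))  # sample variance vs. alpha0 / (1 - alpha1)
```

Plotting `u` against time should show the bunching of large and small values described under volatility clustering.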
17
Testing for “ARCH Effects”
1. First, run any postulated linear regression of the form given in the equation above, e.g. $y_t = \beta_1 + \beta_2 x_{2t} + \dots + \beta_k x_{kt} + u_t$, saving the residuals, $\hat{u}_t$.
2. Then square the residuals, and regress them on q own lags to test for ARCH of order q, i.e. run the regression
$\hat{u}_t^2 = \gamma_0 + \gamma_1 \hat{u}_{t-1}^2 + \gamma_2 \hat{u}_{t-2}^2 + \dots + \gamma_q \hat{u}_{t-q}^2 + v_t$
where $v_t$ is iid.
Obtain $R^2$ from this regression.
18
Testing for “ARCH Effects” (cont’d)
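If the value of the test statistic ($TR^2$, the number of observations multiplied by the $R^2$ from the regression of squared residuals) is greater than the critical value from the $\chi^2(q)$ distribution, then reject the null hypothesis of no ARCH effects.
• Note that the ARCH test is also sometimes applied directly to returns instead of the residuals from Stage 1 above.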
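The two-stage test can be sketched in plain Python as follows. This is an illustrative implementation under stated assumptions: the residuals are simulated from a hypothetical ARCH(1) process rather than taken from a fitted regression, the helper names are invented, and 3.841 is the 5% critical value of $\chi^2(1)$ from standard tables.

```python
import math
import random

def solve_linear(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x

def arch_lm_stat(resid, q):
    """Return (T * R^2, T) from regressing squared residuals on q own lags."""
    sq = [e * e for e in resid]
    y = sq[q:]
    X = [[1.0] + [sq[t - i] for i in range(1, q + 1)] for t in range(q, len(sq))]
    k = q + 1
    xtx = [[sum(row[i] * row[j] for row in X) for j in range(k)] for i in range(k)]
    xty = [sum(row[i] * yt for row, yt in zip(X, y)) for i in range(k)]
    gamma = solve_linear(xtx, xty)           # OLS coefficients gamma_0..gamma_q
    fitted = [sum(g * x for g, x in zip(gamma, row)) for row in X]
    y_bar = sum(y) / len(y)
    rss = sum((yt - f) ** 2 for yt, f in zip(y, fitted))
    tss = sum((yt - y_bar) ** 2 for yt in y)
    r2 = 1.0 - rss / tss
    return len(y) * r2, len(y)

# Hypothetical residuals with ARCH(1) dynamics (alpha0 = 0.2, alpha1 = 0.5).
rng = random.Random(1)
u_prev, arch_resid = 0.0, []
for _ in range(2000):
    u_prev = rng.gauss(0.0, 1.0) * math.sqrt(0.2 + 0.5 * u_prev ** 2)
    arch_resid.append(u_prev)

stat, T = arch_lm_stat(arch_resid, q=1)
print(stat, stat > 3.841)  # reject "no ARCH" if the statistic exceeds the critical value
```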
19
Problems with ARCH(q) Models
• How do we decide on q?
• The required value of q might be very large
• Non-negativity constraints might be violated.
– When we estimate an ARCH model, we require $\alpha_i > 0 \;\forall\; i = 1, 2, \dots, q$ (since variance cannot be negative)
• A natural extension of an ARCH(q) model which gets around some of
these problems is a GARCH model.
20
Generalised ARCH (GARCH) Models
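• Due to Bollerslev (1986). Allows the conditional variance to depend upon its own previous lags.
• The variance equation is now
$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$   (1)
• This is a GARCH(1,1) model, which is like an ARMA(1,1) model for the variance equation.
• We could also write
$\sigma_{t-1}^2 = \alpha_0 + \alpha_1 u_{t-2}^2 + \beta \sigma_{t-2}^2$
$\sigma_{t-2}^2 = \alpha_0 + \alpha_1 u_{t-3}^2 + \beta \sigma_{t-3}^2$
• Substituting into (1) for $\sigma_{t-1}^2$:
$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta(\alpha_0 + \alpha_1 u_{t-2}^2 + \beta \sigma_{t-2}^2)$
$\quad = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_0 \beta + \alpha_1 \beta u_{t-2}^2 + \beta^2 \sigma_{t-2}^2$   (2)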
21
Generalised ARCH (GARCH) Models (cont’d)
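• Now substituting into (2) for $\sigma_{t-2}^2$:
$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_0 \beta + \alpha_1 \beta u_{t-2}^2 + \beta^2(\alpha_0 + \alpha_1 u_{t-3}^2 + \beta \sigma_{t-3}^2)$
$\quad = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_0 \beta + \alpha_1 \beta u_{t-2}^2 + \alpha_0 \beta^2 + \alpha_1 \beta^2 u_{t-3}^2 + \beta^3 \sigma_{t-3}^2$
$\quad = \alpha_0 (1 + \beta + \beta^2) + \alpha_1 u_{t-1}^2 (1 + \beta L + \beta^2 L^2) + \beta^3 \sigma_{t-3}^2$
• An infinite number of successive substitutions would yield
$\sigma_t^2 = \alpha_0 (1 + \beta + \beta^2 + \dots) + \alpha_1 u_{t-1}^2 (1 + \beta L + \beta^2 L^2 + \dots) + \beta^{\infty} \sigma_0^2$
• So the GARCH(1,1) model can be written as an infinite order ARCH model.
• We can again extend the GARCH(1,1) model to a GARCH(p,q):
$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \alpha_2 u_{t-2}^2 + \dots + \alpha_q u_{t-q}^2 + \beta_1 \sigma_{t-1}^2 + \beta_2 \sigma_{t-2}^2 + \dots + \beta_p \sigma_{t-p}^2$
$\sigma_t^2 = \alpha_0 + \sum_{i=1}^{q} \alpha_i u_{t-i}^2 + \sum_{j=1}^{p} \beta_j \sigma_{t-j}^2$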
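The equivalence between the recursive GARCH(1,1) form and the infinite order ARCH form can be checked numerically. The sketch below uses hypothetical parameter values and placeholder shocks, and truncates the infinite sums after a fixed number of lags; the neglected term is $\beta^k \sigma_{t-k}^2$, which is negligible for large k when $\beta < 1$.

```python
import random

alpha0, alpha1, beta = 0.1, 0.1, 0.8   # hypothetical parameter values
rng = random.Random(0)
u = [rng.gauss(0.0, 1.0) for _ in range(400)]  # placeholder shocks u_0..u_399

# Recursive GARCH(1,1) form, started from an assumed initial variance.
sigma2 = [alpha0 / (1 - alpha1 - beta)]
for t in range(1, len(u)):
    sigma2.append(alpha0 + alpha1 * u[t - 1] ** 2 + beta * sigma2[t - 1])

def arch_inf(t, lags):
    """ARCH(infinity) representation of sigma_t^2, truncated after `lags` terms."""
    const = alpha0 * sum(beta ** j for j in range(lags))
    arch = alpha1 * sum(beta ** j * u[t - 1 - j] ** 2 for j in range(lags))
    return const + arch

t = len(u) - 1
print(sigma2[t], arch_inf(t, lags=300))  # the two values should agree closely
```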
22
Generalised ARCH (GARCH) Models (cont’d)
23
The Unconditional Variance under the GARCH
Specification
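• The unconditional variance of $u_t$ is given by
$\mathrm{Var}(u_t) = \frac{\alpha_0}{1 - (\alpha_1 + \beta)}$ when $\alpha_1 + \beta < 1$
This is useful in volatility forecasting.
• $\alpha_1 + \beta \ge 1$ is termed “non-stationarity” in variance
• $\alpha_1 + \beta = 1$ is termed integrated GARCH (IGARCH)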
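As a small worked illustration of the formula $\alpha_0 / (1 - (\alpha_1 + \beta))$ and the stationarity condition, the sketch below uses hypothetical parameter values; the function names are illustrative.

```python
def is_variance_stationary(alpha1, beta):
    """True when alpha1 + beta < 1, so the unconditional variance exists."""
    return alpha1 + beta < 1.0

def unconditional_variance(alpha0, alpha1, beta):
    """Unconditional variance of a GARCH(1,1) error, alpha0 / (1 - alpha1 - beta)."""
    if not is_variance_stationary(alpha1, beta):
        raise ValueError("alpha1 + beta >= 1: non-stationary in variance")
    return alpha0 / (1.0 - alpha1 - beta)

# Hypothetical parameters: alpha0 = 0.1, alpha1 = 0.1, beta = 0.8.
print(unconditional_variance(0.1, 0.1, 0.8))
print(is_variance_stationary(0.3, 0.8))  # persistence above one
```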
24
Estimation of ARCH / GARCH Models
• Since the model is no longer of the usual linear form, we cannot use OLS.
• We use another technique known as maximum likelihood.
• The method works by finding the most likely values of the parameters given
the actual data.
• More specifically, we form a log-likelihood function and maximise it.
25
Estimation of ARCH / GARCH Models (cont’d)
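1. Specify the appropriate equations for the mean and the variance, e.g. an AR(1)-GARCH(1,1) model:
$y_t = \mu + \phi y_{t-1} + u_t$, $u_t \sim N(0, \sigma_t^2)$
$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$
2. Specify the log-likelihood function to maximise:
$L = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\log(\sigma_t^2) - \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \mu - \phi y_{t-1})^2}{\sigma_t^2}$
3. The software will maximise the function and give parameter values and their standard errors.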
26
Parameter Estimation using Maximum Likelihood
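• Consider the bivariate regression case with homoskedastic errors for simplicity: $y_t = \beta_1 + \beta_2 x_t + u_t$
• Assuming that $u_t \sim N(0, \sigma^2)$, then $y_t \sim N(\beta_1 + \beta_2 x_t, \sigma^2)$, so that the probability density function for a normally distributed random variable with this mean and variance is
(1) $f(y_t \mid \beta_1 + \beta_2 x_t, \sigma^2) = \frac{1}{\sigma\sqrt{2\pi}} \exp\left\{-\frac{1}{2}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2}\right\}$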
27
Parameter Estimation using Maximum Likelihood
(cont’d)
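• Then the joint pdf for all the y’s can be expressed as a product of the individual density functions
(2) $f(y_1, y_2, \dots, y_T \mid \beta_1 + \beta_2 x_t, \sigma^2) = f(y_1 \mid \beta_1 + \beta_2 x_1, \sigma^2)\, f(y_2 \mid \beta_1 + \beta_2 x_2, \sigma^2) \cdots f(y_T \mid \beta_1 + \beta_2 x_T, \sigma^2) = \prod_{t=1}^{T} f(y_t \mid \beta_1 + \beta_2 x_t, \sigma^2)$
• Substituting into equation (2) for every $y_t$ from equation (1),
(3) $f(y_1, y_2, \dots, y_T \mid \beta_1 + \beta_2 x_t, \sigma^2) = \frac{1}{\sigma^T (\sqrt{2\pi})^T} \exp\left\{-\frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2}\right\}$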
28
Parameter Estimation using Maximum Likelihood
(cont’d)
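• The typical situation we have is that the $x_t$ and $y_t$ are given and we want to estimate $\beta_1$, $\beta_2$, $\sigma^2$. If this is the case, then $f(\cdot)$ is known as the likelihood function, denoted $LF(\beta_1, \beta_2, \sigma^2)$, so we write
(4) $LF(\beta_1, \beta_2, \sigma^2) = \frac{1}{\sigma^T (\sqrt{2\pi})^T} \exp\left\{-\frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2}\right\}$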
29
Parameter Estimation using Maximum Likelihood
(cont’d)
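• Since the value of x that maximises $f(x)$ also maximises $\log(f(x))$, we can take logs of (4).
• Then, using the various laws for transforming functions containing logarithms, we obtain the log-likelihood function, LLF:
$LLF = -T\log\sigma - \frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2}$
• which is equivalent to
(5) $LLF = -\frac{T}{2}\log\sigma^2 - \frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \beta_1 - \beta_2 x_t)^2}{\sigma^2}$
• Differentiating (5) with respect to $\beta_1$, $\beta_2$ and $\sigma^2$ gives
(6) $\frac{\partial LLF}{\partial \beta_1} = \frac{1}{\sigma^2}\sum (y_t - \beta_1 - \beta_2 x_t)$
(7) $\frac{\partial LLF}{\partial \beta_2} = \frac{1}{\sigma^2}\sum (y_t - \beta_1 - \beta_2 x_t)\, x_t$
(8) $\frac{\partial LLF}{\partial \sigma^2} = -\frac{T}{2\sigma^2} + \frac{1}{2\sigma^4}\sum (y_t - \beta_1 - \beta_2 x_t)^2$
• Setting (6) to (8) to zero and placing hats on the parameters to denote the estimators, from (6),
$\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t) = 0$
$\sum y_t - T\hat{\beta}_1 - \hat{\beta}_2 \sum x_t = 0$
$\frac{1}{T}\sum y_t - \hat{\beta}_1 - \hat{\beta}_2 \frac{1}{T}\sum x_t = 0$
(9) $\hat{\beta}_1 = \bar{y} - \hat{\beta}_2 \bar{x}$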
31
Parameter Estimation using Maximum Likelihood
(cont’d)
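• From (7),
$\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t)\, x_t = 0$
$\sum y_t x_t - \hat{\beta}_1 \sum x_t - \hat{\beta}_2 \sum x_t^2 = 0$
$\hat{\beta}_2 \sum x_t^2 = \sum y_t x_t - (\bar{y} - \hat{\beta}_2 \bar{x}) \sum x_t$
$\hat{\beta}_2 \sum x_t^2 = \sum y_t x_t - T\bar{x}\bar{y} + \hat{\beta}_2 T\bar{x}^2$
$\hat{\beta}_2 \left(\sum x_t^2 - T\bar{x}^2\right) = \sum y_t x_t - T\bar{x}\bar{y}$
(10) $\hat{\beta}_2 = \frac{\sum y_t x_t - T\bar{x}\bar{y}}{\sum x_t^2 - T\bar{x}^2}$
• From (8),
$\frac{T}{\hat{\sigma}^2} = \frac{1}{\hat{\sigma}^4}\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t)^2$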
32
Parameter Estimation using Maximum Likelihood
(cont’d)
• Rearranging,
$\hat{\sigma}^2 = \frac{1}{T}\sum (y_t - \hat{\beta}_1 - \hat{\beta}_2 x_t)^2$
(11) $\hat{\sigma}^2 = \frac{1}{T}\sum \hat{u}_t^2$
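The closed-form estimators (9), (10) and (11) can be evaluated directly. The sketch below applies them to a small made-up data set; the function name and the data are illustrative assumptions.

```python
def ml_bivariate(x, y):
    """Maximum likelihood estimates for y_t = beta1 + beta2 * x_t + u_t, u_t ~ N(0, sigma^2)."""
    T = len(y)
    x_bar = sum(x) / T
    y_bar = sum(y) / T
    # Equation (10): slope estimate
    beta2 = (sum(xt * yt for xt, yt in zip(x, y)) - T * x_bar * y_bar) / (
        sum(xt * xt for xt in x) - T * x_bar ** 2)
    # Equation (9): intercept estimate
    beta1 = y_bar - beta2 * x_bar
    # Equation (11): variance estimate, dividing by T
    sigma2 = sum((yt - beta1 - beta2 * xt) ** 2 for xt, yt in zip(x, y)) / T
    return beta1, beta2, sigma2

# Made-up data for illustration.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 7.8, 10.1]
beta1_hat, beta2_hat, sigma2_hat = ml_bivariate(x, y)
print(beta1_hat, beta2_hat, sigma2_hat)
```

The coefficient estimates coincide with OLS, while the variance estimator divides by T rather than by T minus the number of estimated coefficients, so it is biased in small samples.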
33
Estimation of GARCH Models Using
Maximum Likelihood
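Now we have $y_t = \mu + \phi y_{t-1} + u_t$, $u_t \sim N(0, \sigma_t^2)$
$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$
$L = -\frac{T}{2}\log(2\pi) - \frac{1}{2}\sum_{t=1}^{T}\log(\sigma_t^2) - \frac{1}{2}\sum_{t=1}^{T}\frac{(y_t - \mu - \phi y_{t-1})^2}{\sigma_t^2}$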
• Unfortunately, the LLF for a model with time-varying variances cannot be maximised
analytically, except in the simplest of cases. So a numerical procedure is used to
maximise the log-likelihood function. A potential problem: local optima or
multimodalities in the likelihood surface.
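To make the numerical procedure concrete, the function below evaluates the Gaussian log-likelihood of an AR(1)-GARCH(1,1) model for a given parameter vector; a numerical optimiser would call such a function repeatedly. This is a sketch, not the routine used by any particular package: the initial variance is set to the sample variance of the residuals (an assumption, other choices are common), and the data and parameter values are made up.

```python
import math

def garch11_loglik(params, y):
    """Gaussian log-likelihood of an AR(1)-GARCH(1,1) model.

    params = (mu, phi, alpha0, alpha1, beta); y is the observed series.
    """
    mu, phi, alpha0, alpha1, beta = params
    # Residuals from the mean equation; the first observation is lost to the lag.
    u = [y[t] - mu - phi * y[t - 1] for t in range(1, len(y))]
    T = len(u)
    # Assumption: start the variance recursion at the sample variance of u.
    sigma2 = sum(e * e for e in u) / T
    llf = -0.5 * T * math.log(2.0 * math.pi)
    for t in range(T):
        if t > 0:
            sigma2 = alpha0 + alpha1 * u[t - 1] ** 2 + beta * sigma2
        llf -= 0.5 * (math.log(sigma2) + u[t] ** 2 / sigma2)
    return llf

# Made-up returns and two candidate parameter vectors to compare.
y = [0.3, -0.5, 1.2, -2.1, 1.8, 0.2, -0.1, 0.4, -0.3, 0.1, 2.5, -1.9]
print(garch11_loglik((0.0, 0.1, 0.2, 0.1, 0.8), y))
print(garch11_loglik((0.0, 0.1, 0.5, 0.3, 0.4), y))
```

Because the surface may have several local optima, trying more than one starting parameter vector is a common precaution.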
34
Non-Normality and Maximum Likelihood
35
QML Estimator
• Let the density for the error be normal.
36
Extensions to the Basic GARCH Model
37
The EGARCH Model
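The GJR Model
• Due to Glosten, Jagannathan and Runkle
$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2 + \gamma u_{t-1}^2 I_{t-1}$
where $I_{t-1} = 1$ if $u_{t-1} < 0$
$\quad\quad\;\, = 0$ otherwise
• For a leverage effect, we would see $\gamma > 0$.
• We require $\alpha_1 + \gamma \ge 0$ and $\alpha_1 \ge 0$ for non-negativity.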
39
An Example of the use of a GJR Model
40
News Impact Curves
The news impact curve plots the next period volatility (ht) that would arise from various
positive and negative values of ut-1, given an estimated model.
News Impact Curves for S&P 500 Returns using Coefficients from GARCH and GJR Model Estimates:
[Figure: value of conditional variance (vertical axis, 0 to 0.14) against value of lagged shock (horizontal axis, -1 to 1); two curves: GARCH and GJR]
41
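A news impact curve can be traced by holding the lagged conditional variance at a fixed level and varying the lagged shock. The sketch below does this for GARCH(1,1) and GJR variance equations. All coefficient values are hypothetical (they are not the S&P 500 estimates behind the figure), and fixing the lagged variance at a constant `sigma2_bar` is an assumption of this illustration.

```python
def nic_garch(shock, alpha0, alpha1, beta, sigma2_bar):
    """Next-period variance under GARCH(1,1) with lagged variance fixed at sigma2_bar."""
    return alpha0 + beta * sigma2_bar + alpha1 * shock ** 2

def nic_gjr(shock, alpha0, alpha1, beta, gamma, sigma2_bar):
    """Next-period variance under GJR: negative shocks get the extra gamma term."""
    indicator = 1.0 if shock < 0 else 0.0
    return alpha0 + beta * sigma2_bar + (alpha1 + gamma * indicator) * shock ** 2

# Hypothetical coefficients for illustration only.
a0, a1, b, g, s2_bar = 0.001, 0.05, 0.90, 0.08, 0.02

for shock in (-1.0, -0.5, 0.0, 0.5, 1.0):
    print(shock, nic_garch(shock, a0, a1, b, s2_bar), nic_gjr(shock, a0, a1, b, g, s2_bar))
```

The GARCH curve is symmetric about zero, while the GJR curve is steeper on the negative side whenever gamma is positive.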
GARCH-in Mean
• Engle, Lilien and Robins (1987) suggested the ARCH-M specification. A GARCH-M model would be
$y_t = \mu + \delta \sigma_{t-1} + u_t$, $u_t \sim N(0, \sigma_t^2)$
$\sigma_t^2 = \alpha_0 + \alpha_1 u_{t-1}^2 + \beta \sigma_{t-1}^2$
• GARCH can model the volatility clustering effect since the conditional
variance is autoregressive. Such models can be used to forecast volatility.
• We could show that
$\mathrm{Var}(y_t \mid y_{t-1}, y_{t-2}, \dots) = \mathrm{Var}(u_t \mid u_{t-1}, u_{t-2}, \dots)$
• So modelling $\sigma_t^2$ will give us models and forecasts for the variance of $y_t$ as well.
• Variance forecasts are additive over time.
43
Forecasting Variances using GARCH Models
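• What is needed is to generate forecasts of $\sigma_{T+1}^2 \mid \Omega_T$, $\sigma_{T+2}^2 \mid \Omega_T$, ..., $\sigma_{T+s}^2 \mid \Omega_T$, where $\Omega_T$ denotes all information available up to and including observation T.
• Adding one to each of the time subscripts of the above conditional variance equation, and then two, and then three would yield the following equations
$\sigma_{T+1}^2 = \alpha_0 + \alpha_1 u_T^2 + \beta \sigma_T^2$
$\sigma_{T+2}^2 = \alpha_0 + \alpha_1 u_{T+1}^2 + \beta \sigma_{T+1}^2$
$\sigma_{T+3}^2 = \alpha_0 + \alpha_1 u_{T+2}^2 + \beta \sigma_{T+2}^2$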
44
Forecasting Variances
using GARCH Models (Cont’d)
2
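• Let $\sigma_{1,T}^{f\,2}$ be the one-step-ahead forecast for $\sigma^2$ made at time T. This is easy to calculate since, at time T, the values of all the terms on the RHS are known.
• $\sigma_{1,T}^{f\,2}$ would be obtained by taking the conditional expectation of the first equation at the bottom of slide 44:
$\sigma_{1,T}^{f\,2} = \alpha_0 + \alpha_1 u_T^2 + \beta \sigma_T^2$
• Given $\sigma_{1,T}^{f\,2}$, how is $\sigma_{2,T}^{f\,2}$, the 2-step-ahead forecast for $\sigma^2$ made at time T, calculated? Taking the conditional expectation of the second equation at the bottom of slide 44:
$\sigma_{2,T}^{f\,2} = \alpha_0 + \alpha_1 E(u_{T+1}^2 \mid \Omega_T) + \beta \sigma_{1,T}^{f\,2}$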
45
Forecasting Variances
using GARCH Models (Cont’d)
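• We can write
$E(u_{T+1}^2 \mid \Omega_T) = \sigma_{T+1}^2$
• But $\sigma_{T+1}^2$ is not known at time T, so it is replaced with the forecast for it, $\sigma_{1,T}^{f\,2}$, so that the 2-step-ahead forecast is given by
$\sigma_{2,T}^{f\,2} = \alpha_0 + \alpha_1 \sigma_{1,T}^{f\,2} + \beta \sigma_{1,T}^{f\,2}$
$\sigma_{2,T}^{f\,2} = \alpha_0 + (\alpha_1 + \beta)\, \sigma_{1,T}^{f\,2}$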
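The one-step and two-step formulas extend naturally to longer horizons by repeating the same substitution: each further forecast is $\alpha_0 + (\alpha_1 + \beta)$ times the previous one. The sketch below iterates this recursion; the parameter values, the last shock `u_T` and the last variance `sigma2_T` are hypothetical.

```python
def garch11_forecasts(alpha0, alpha1, beta, u_T, sigma2_T, steps):
    """Return the 1- to `steps`-step-ahead variance forecasts made at time T."""
    # One step ahead: all right-hand-side terms are known at time T.
    forecasts = [alpha0 + alpha1 * u_T ** 2 + beta * sigma2_T]
    # Further steps: replace the unknown squared shock by its forecast variance.
    for _ in range(steps - 1):
        forecasts.append(alpha0 + (alpha1 + beta) * forecasts[-1])
    return forecasts

# Hypothetical inputs for illustration.
f = garch11_forecasts(alpha0=0.1, alpha1=0.1, beta=0.8, u_T=2.0, sigma2_T=1.5, steps=500)
print(f[0], f[1], f[-1])
```

When $\alpha_1 + \beta < 1$ the forecasts decay geometrically towards the unconditional variance $\alpha_0 / (1 - \alpha_1 - \beta)$ as the horizon grows.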
What Use Are Volatility Forecasts?
1. Option pricing
$C = f(S, X, \sigma^2, T, r_f)$
2. Conditional betas
$\beta_{i,t} = \frac{\sigma_{im,t}}{\sigma_{m,t}^2}$
3. Dynamic hedge ratios
The hedge ratio is the size of the futures position relative to the size of the underlying exposure, i.e. the number of futures contracts to buy or sell per unit of the spot good.
47
What Use Are Volatility Forecasts? (Cont’d)
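• What if the standard deviations and correlation are changing over time? Use
$h_t = p_t \frac{\sigma_{s,t}}{\sigma_{F,t}}$
where $p_t$ is the correlation between changes in the spot and futures prices, and $\sigma_{s,t}$ and $\sigma_{F,t}$ are their standard deviations at time t.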
48
Testing Non-linear Restrictions or
Testing Hypotheses about Non-linear Models
• Usual t- and F-tests are still valid in non-linear models, but they are
not flexible enough.
49
Likelihood Ratio Tests
50
Likelihood Ratio Tests (cont’d)
51
Modelling Volatility of KSE Returns
• Let’s consider daily KSE-100 index return data from Jan 1, 2000 to June 30, 2008. We hold back the last 25 observations for out-of-sample forecast comparison. Here is the plot of returns, which shows the typical pattern of financial data.
[Figure: daily KSE-100 index returns plotted against Day; vertical axis from -10 to 10]
52
Identifying GARCH order: Correlogram
of KSE-return
• There seem to be some significant
autocorrelations up to and including 10
lags. The mean equation may involve
too many parameters
53
Identifying GARCH order: Correlogram
of Squared KSE-return
• The PACF at the first four lags and at some higher lags is large, possibly indicating that a simple ARCH model may involve too many parameters.
54
Testing for ARCH Effects in Returns
• The ARCH LM test can be applied directly to KSE returns to see whether squared returns have an autoregressive structure
$H_0: \gamma_1 = \gamma_2 = \dots = \gamma_k = 0$, where the $\gamma_k$ are the coefficients on the lags of squared returns
• LM(11) = $TR^2$ = 2207*0.1484 = 327.5, p-value = @chisq(327.5,11) = 0.000
• As suggested in the literature, e.g. Enders (2004, p. 146), we start with the most parsimonious model for conditional volatility that has often been found to be satisfactory in developed stock markets, i.e. GARCH(1,1).
• If it fails to satisfy any diagnostics, more complicated models are needed.
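The LM statistic quoted above is simple arithmetic on the reported sample size and $R^2$. The sketch below reproduces it and compares it with 19.675, the 5% critical value of $\chi^2(11)$ taken from standard tables (the critical value is supplied here for illustration and does not appear in the slides).

```python
T, r_squared, q = 2207, 0.1484, 11   # values quoted in the LM(11) bullet above
lm_stat = T * r_squared              # T * R^2 from the regression on 11 lags
critical_5pct = 19.675               # chi-squared(11) 5% critical value (standard tables)

print(lm_stat, lm_stat > critical_5pct)  # a statistic far above the critical value rejects "no ARCH"
```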
55
Tentative Model ARMA(1,1)-GARCH
(1,1)
• Let’s examine the fit of the ARMA(1,1)-GARCH(1,1) model. All the coefficients in the mean and volatility equations are significant. The GARCH model appears to be covariance stationary, since the sum alpha + beta = 0.1852 + 0.7525 = 0.9377 < 1. However, the sum is close to 1, so the model is close to an IGARCH model.
• Shocks to volatility have a high degree of persistence.
• Unconditional mean of the KSE returns is 0.1423/(1-0.9198) = 1.77 (*360 = about 639% annually, which doesn’t look reasonable)
• Unconditional variance of kse-returns is
0.1517/(1-0.1852-0.7524)=2.43
56
Diagnostics: Ordinary and Standardized Residuals
• The estimated model’s residuals should be uncorrelated and the residuals should not contain any remaining conditional volatility.
[Figure: histogram of ordinary residuals; vertical axis (frequency) from 100 to 500]
Series: Ordinary RESIds
Sample: 1 2218
Observations: 2217
Mean         -0.037813
Median       -0.012862
Maximum       8.855997
Minimum      -8.863792
Std. Dev.     1.547812
Skewness     -0.169151
Kurtosis      7.038969
Jarque-Bera   1517.511
Probability   0.000000
58
Testing for Leverage Effects (Enders, p. 148)
• If there is no leverage effect, the regression of squared standardized residuals on their lagged levels should give an insignificant overall F-test and insignificant individual t-values.
• sresid(-1) and sresid(-2) are significant, as is the overall F-test. This indicates the presence of a leverage effect: the conditional variance of KSE returns is affected more by negative shocks.
59
EGARCH Estimates
• The coefficient on the term involving the lagged residual allows the sign of the residual to affect the conditional variance. If asymmetry is present, this coefficient (in this case c(6)) should be negative and significant. If this term is zero, there is no asymmetry. In this case the term is negative and significant, which confirms the previous diagnostic test.
60
EGARCH model
• The estimated EGARCH model gives
$\log\sigma_t^2 = -0.173 + 0.3173\left|\frac{u_{t-1}}{\sigma_{t-1}}\right| - 0.1103\,\frac{u_{t-1}}{\sigma_{t-1}} + 0.8997\,\log\sigma_{t-1}^2$
• A positive shock of one unit in last period's standardized error (good news) increases log volatility by -0.1103 + 0.3173 = 0.207 units.
• A negative shock of one unit in last period's standardized error (bad news) increases log volatility by -(-0.1103) + 0.3173 = 0.4276 units, which is more than double. Hence volatility responds to shocks in an asymmetric way.
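The asymmetry arithmetic can be written as a small function of the standardized shock $z = u_{t-1}/\sigma_{t-1}$, using the two shock coefficients quoted in the bullets above (0.3173 on $|z|$ and -0.1103 on $z$). The function name is illustrative.

```python
def egarch_shock_effect(z, abs_coef=0.3173, sign_coef=-0.1103):
    """Contribution of a standardized shock z to the log conditional variance."""
    return abs_coef * abs(z) + sign_coef * z

good_news = egarch_shock_effect(1.0)    # positive one-unit shock
bad_news = egarch_shock_effect(-1.0)    # negative one-unit shock
print(good_news, bad_news, bad_news / good_news)
```

Because the model is specified for the log of the variance, these contributions act multiplicatively on the variance itself.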
61
ARCH-in Mean
• Using either the standard deviation or the variance in the mean equation gives an insignificant estimate of the risk premium. Thus, in this case, ARCH-M is not a satisfactory description of KSE returns.
62
Model Selection
• The following table indicates that the log-likelihood of the EGARCH(1,1) model with t-distributed errors is the largest among the competing models. Compared with the normal case, the parameter estimates for EGARCH with t errors are slightly different. The Akaike and Schwarz criteria give similar results.
Model                          Log-Likelihood
GARCH(1,1), Normal errors      -3792.053
GARCH(1,1), t errors           -3690.913
GARCH(1,1), GED errors         -3685.821
EGARCH(1,1), Normal errors     -3787.982
EGARCH(1,1), t errors          -3677.53
EGARCH(1,1), GED errors        -3678.178
63
Estimated Volatility
• Volatility estimates from the EGARCH model resemble realized squared returns more closely.
Estimated Volatility and Squared KSE Returns
[Figure: volatility (vertical axis, 0 to 60) against Date (25 to 100); series: VGARCHT, VEGARCHT, KSE^2]
64