Chapter 8 (Brooks) : Modelling Volatility and Correlation

This document discusses modeling volatility and correlation in financial time series data. It notes that assumptions of constant error variance in linear models are often unrealistic for financial data, which tends to exhibit volatility clustering, fat tails, leverage effects, and time-varying conditional heteroscedasticity. Autoregressive conditional heteroscedasticity (ARCH) models allow the error variance to depend on prior squared errors, providing a way to model time-varying volatility without assuming a constant error variance.


Chapter 8 (Brooks)

Modelling volatility and correlation

1
Stylized Facts of Financial Data

• The linear structural and time series models assume a constant error variance. This assumption is not realistic for financial time series data, especially at high frequency. Some of the stylized facts of financial returns data are as follows:
- Fat tails and leptokurtosis: The distribution of financial data has tails thicker than the normal distribution; the tails decay more slowly than e^(-x²). The kurtosis of financial data is almost always greater than 3 (the value for the normal). Ignoring this may lead to underestimation of tail risk, which has obvious consequences for asset allocation and risk management.
- Volatility clustering: The tendency for volatility in financial markets to appear in bunches. Thus large returns (of either sign) are expected to follow large returns, and small returns (of either sign) to follow small returns. This is due to the information arrival process: information (news) arrives in bunches.
- Asymmetric volatility (leverage effects): The tendency of volatility to rise more following a large price fall than following a large price rise of the same magnitude. A price fall raises a firm's debt-to-equity ratio (i.e. its leverage), so the shareholders' residual claim is at greater risk.
2
A Sample Financial Asset Returns Time Series

Daily S&P 500 Returns for January 1990 – December 1999


[Figure: time-series plot of daily returns against date; values range between about -0.08 and +0.06]
3
KSE-100 index returns

KSE-100 Daily Returns (Jan 1990-Dec 1999)

[Figure: time-series plot of daily KSE-100 returns against day; values range between about -10% and +10%]

4
KSE-100 index returns

Histogram of KSE index returns with descriptive statistics:

Series: KSE index returns, Sample Jan 1990 - July 1999, Observations 2609
Mean         0.032780
Median       0.000000
Maximum     12.76223
Minimum    -13.21425
Std. Dev.    1.545728
Skewness    -0.265931
Kurtosis    12.29654
Jarque-Bera 9425.946 (Probability 0.000000)

5
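The Jarque-Bera statistic reported in tables like the one above can be reproduced from the skewness and kurtosis via JB = (T/6)[S² + (K - 3)²/4]; a minimal Python sketch using the figures from the table:

```python
# Jarque-Bera statistic: JB = T/6 * (S^2 + (K - 3)^2 / 4)
# using the sample moments reported for KSE returns, Jan 1990 - July 1999
T = 2609        # observations
S = -0.265931   # skewness (0 for a normal distribution)
K = 12.29654    # kurtosis (3 for a normal distribution)

JB = T / 6 * (S ** 2 + (K - 3) ** 2 / 4)
print(round(JB, 1))  # ~9425.9, matching the reported 9425.946
```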
Are KSE returns different from IID normal returns?
Simulated iid returns with the same mean and standard deviation as the KSE data for Jan 1990 - Dec 1999

[Figure: simulated IID returns against day; values stay between about -6 and +8 and show no volatility clustering]
6
KSE-100 index returns

KSE-100 index returns (Jan 1, 2000- June 30, 2008)

[Figure: time-series plot of daily KSE-100 returns against day; values range between about -10% and +10%]
7
KSE-100 index returns

Histogram of KSE index returns with descriptive statistics:

Series: KSE index returns, Sample Jan 2000 - July 2008, Observations 2217
Mean         0.097446
Median       0.095419
Maximum      8.507070
Minimum     -8.662332
Std. Dev.    1.553112
Skewness    -0.262416
Kurtosis     6.805081
Jarque-Bera 1362.909 (Probability 0.000000)

8
Daily Exchange rate (Pak Rs/US$)
changes April 2006 to July 2008
Daily Exchange Rate log change

[Figure: daily % exchange rate log change against day; values range between about -3% and +2%]
9
Daily Exchange rate (Pak Rs/US$)
changes April 2006 to July 2008
Histogram of daily exchange rate changes with descriptive statistics:

Series: Daily Exchange Rate, Sample April 2006 - July 2008, Observations 588
Mean         0.028107
Median       0.000000
Maximum      2.406344
Minimum     -1.902032
Std. Dev.    0.318125
Skewness     0.650558
Kurtosis    20.89349
Jarque-Bera 7885.813 (Probability 0.000000)

10
Monthly inflation rate (CPI-12 major
cities %log change)

Inflation Rate

[Figure: monthly inflation rate (%) against month; values range between about -1.5% and +3%]

11
Monthly inflation rate (CPI-12 major
cities %log change)

Histogram of monthly inflation with descriptive statistics:

Series: INFLATION, Sample 1995M01 - 2006M06, Observations 126
Mean         0.512396
Median       0.471274
Maximum      2.366600
Minimum     -0.851813
Std. Dev.    0.599854
Skewness     0.381571
Kurtosis     3.255628
Jarque-Bera  3.400593 (Probability 0.182629)

12
Non-linear Models: A Definition

• Campbell, Lo and MacKinlay (1997) define a non-linear data generating process as one that can be written
yt = f(ut, ut-1, ut-2, …)
where ut is an iid error term and f is a non-linear function.

• They also give a slightly more specific definition as
yt = g(ut-1, ut-2, …) + ut σ²(ut-1, ut-2, …)
where g is a function of past error terms only and σ² is a variance term.

• Models with non-linear g(•) are “non-linear in mean”, while those with non-linear σ²(•) are “non-linear in variance”. Our “traditional” structural model could be something like:
yt = β1 + β2x2t + ... + βkxkt + ut, or more compactly y = Xβ + u.
• We also assumed ut ~ N(0, σ²).

13
Heteroscedasticity Revisited

• An example of a structural model is
yt = β1 + β2x2t + β3x3t + β4x4t + ut with ut ~ N(0, σu²)
• The assumption that the variance of the errors is constant is known as homoskedasticity, i.e. Var(ut) = σu².

• What if the variance of the errors is not constant?
- heteroskedasticity
- would imply that standard error estimates could be wrong.

• Is the variance of the errors likely to be constant over time? Not for financial data.

14
Autoregressive Conditionally Heteroscedastic
(ARCH) Models

• So use a model which does not assume that the variance is constant.
• Recall the definition of the variance of ut:
σt² = Var(ut | ut-1, ut-2, ...) = E[(ut - E(ut))² | ut-1, ut-2, ...]
We usually assume that E(ut) = 0,
so σt² = Var(ut | ut-1, ut-2, ...) = E[ut² | ut-1, ut-2, ...].

• What could the current value of the variance of the errors plausibly depend upon?
– Previous squared error terms.
• This leads to the autoregressive conditionally heteroscedastic model for the variance of the errors:
σt² = α0 + α1u²t-1
• This is known as an ARCH(1) model.
15
Autoregressive Conditionally Heteroscedastic
(ARCH) Models (cont’d)
• The full model would be
yt = β1 + β2x2t + ... + βkxkt + ut , ut ~ N(0, σt²)
where σt² = α0 + α1u²t-1
• We can easily extend this to the general case where the error variance depends on q lags of squared errors:
σt² = α0 + α1u²t-1 + α2u²t-2 + ... + αqu²t-q

• This is an ARCH(q) model.

• Instead of calling the variance σt², in the literature it is usually called ht, so the model is
yt = β1 + β2x2t + ... + βkxkt + ut , ut ~ N(0, ht)
where ht = α0 + α1u²t-1 + α2u²t-2 + ... + αqu²t-q
16
Another Way of Writing ARCH Models

• For illustration, consider an ARCH(1). Instead of the above, we can write
yt = β1 + β2x2t + ... + βkxkt + ut , ut = vtσt
σt = √(α0 + α1u²t-1) , vt ~ N(0,1)

• The two are different ways of expressing exactly the same model. The first form is easier to understand, while the second form is required for simulation from an ARCH model.

17
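The second form is what one would use to simulate from an ARCH model; a minimal Python sketch (the parameter values are illustrative assumptions, not estimates from these slides):

```python
import random

random.seed(1)

# Simulate an ARCH(1) process: u_t = v_t * sigma_t, v_t ~ N(0,1),
# sigma_t^2 = alpha0 + alpha1 * u_{t-1}^2 (illustrative parameters)
alpha0, alpha1 = 0.2, 0.5
T = 5000

u = [0.0]  # u_0 = 0 as a starting value
for _ in range(T):
    sigma2 = alpha0 + alpha1 * u[-1] ** 2
    u.append(random.gauss(0.0, 1.0) * sigma2 ** 0.5)
u = u[1:]

# The unconditional variance of an ARCH(1) is alpha0 / (1 - alpha1) = 0.4;
# the sample variance of the simulated series should be close to that
sample_var = sum(x * x for x in u) / T
print(round(sample_var, 2))
```

The simulated series exhibits the volatility clustering and fat tails described on slide 2 even though each vt is standard normal.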
Testing for “ARCH Effects”

1. First, run any postulated linear regression of the form given in the equation above, e.g. yt = β1 + β2x2t + ... + βkxkt + ut
saving the residuals, ût.

2. Then square the residuals, and regress them on q own lags to test for ARCH of order q, i.e. run the regression
û²t = γ0 + γ1û²t-1 + γ2û²t-2 + ... + γqû²t-q + vt
where vt is iid.
Obtain R² from this regression.

3. The test statistic is defined as TR² (the number of observations multiplied by the coefficient of multiple correlation) from the last regression, and is distributed as a χ²(q).

18
Testing for “ARCH Effects” (cont’d)

4. The null and alternative hypotheses are
H0 : γ1 = 0 and γ2 = 0 and γ3 = 0 and ... and γq = 0
H1 : γ1 ≠ 0 or γ2 ≠ 0 or γ3 ≠ 0 or ... or γq ≠ 0

If the value of the test statistic is greater than the critical value from the χ² distribution, then reject the null hypothesis.

• Note that the ARCH test is also sometimes applied directly to returns instead of the residuals from Stage 1 above.

19
Problems with ARCH(q) Models

• How do we decide on q?
• The required value of q might be very large.
• Non-negativity constraints might be violated.
– When we estimate an ARCH model, we require αi ≥ 0 for i = 1, 2, ..., q (since a variance cannot be negative).

• A natural extension of an ARCH(q) model which gets around some of these problems is a GARCH model.

20
Generalised ARCH (GARCH) Models





• Due to Bollerslev (1986). Allows the conditional variance to be dependent upon previous own lags.
• The variance equation is now
σt² = α0 + α1u²t-1 + βσ²t-1   (1)

• This is a GARCH(1,1) model, which is like an ARMA(1,1) model for the variance equation.
• We could also write
σ²t-1 = α0 + α1u²t-2 + βσ²t-2
σ²t-2 = α0 + α1u²t-3 + βσ²t-3
• Substituting into (1) for σ²t-1 :
σt² = α0 + α1u²t-1 + β(α0 + α1u²t-2 + βσ²t-2)
    = α0 + α1u²t-1 + α0β + α1βu²t-2 + β²σ²t-2   (2)
21









Generalised ARCH (GARCH) Models (cont’d)













• Now substituting into (2) for σ²t-2 :
σt² = α0 + α1u²t-1 + α0β + α1βu²t-2 + β²(α0 + α1u²t-3 + βσ²t-3)
    = α0 + α1u²t-1 + α0β + α1βu²t-2 + α0β² + α1β²u²t-3 + β³σ²t-3
    = α0(1 + β + β²) + α1u²t-1(1 + βL + β²L²) + β³σ²t-3

• An infinite number of successive substitutions would yield
σt² = α0(1 + β + β² + ...) + α1u²t-1(1 + βL + β²L² + ...) + β^∞σ²0
where the last term vanishes for β < 1.

• So the GARCH(1,1) model can be written as an infinite order ARCH model.

• We can again extend the GARCH(1,1) model to a GARCH(p,q):
σt² = α0 + α1u²t-1 + α2u²t-2 + ... + αqu²t-q + β1σ²t-1 + β2σ²t-2 + ... + βpσ²t-p

σt² = α0 + Σ(i=1..q) αi u²t-i + Σ(j=1..p) βj σ²t-j
22
Generalised ARCH (GARCH) Models (cont’d)

• But in general a GARCH(1,1) model will be sufficient to capture the


volatility clustering in the data.
 
• Why is GARCH better than ARCH?
- more parsimonious - avoids overfitting
- less likely to breach non-negativity constraints
-The ARCH part explains volatility clustering
-The GARCH part indicates the temporal dependence in volatility

23
The Unconditional Variance under the GARCH
Specification

• The unconditional variance of ut is given by
Var(ut) = α0 / (1 - (α1 + β))
when α1 + β < 1

• A high sum of alpha and beta would indicate volatility persistence. This would be useful in volatility forecasting.

• α1 + β ≥ 1 is termed “non-stationarity” in variance

• α1 + β = 1 is termed integrated GARCH (IGARCH)

• For non-stationarity in variance, the conditional variance forecasts will not converge on their unconditional value as the horizon increases.

24
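As a quick numerical check of the formula, with hypothetical parameter values (assumptions for illustration, not estimates from these slides):

```python
def garch11_unconditional_variance(alpha0, alpha1, beta):
    """Var(u) = alpha0 / (1 - (alpha1 + beta)), valid when alpha1 + beta < 1."""
    persistence = alpha1 + beta
    if persistence >= 1:
        # alpha1 + beta = 1 is IGARCH; >= 1 means non-stationarity in variance
        raise ValueError("alpha1 + beta >= 1: unconditional variance undefined")
    return alpha0 / (1 - persistence)

# Illustrative values: persistence 0.95, so Var(u) = 0.1 / 0.05 = 2.0
print(garch11_unconditional_variance(0.1, 0.15, 0.8))
```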
Estimation of ARCH / GARCH Models

• Since the model is no longer of the usual linear form, we cannot use OLS.
 
• We use another technique known as maximum likelihood.
 
• The method works by finding the most likely values of the parameters given
the actual data.
 
• More specifically, we form a log-likelihood function and maximise it.
 
 
 
 

25
Estimation of ARCH / GARCH Models (cont’d)

• The steps involved in actually estimating an ARCH or GARCH model


are as follows
 





1. Specify the appropriate equations for the mean and the variance - e.g. an AR(1)-GARCH(1,1) model:
yt = μ + φyt-1 + ut , ut ~ N(0, σt²)
σt² = α0 + α1u²t-1 + βσ²t-1

2. Specify the log-likelihood function to maximise:
L = -(T/2) log(2π) - (1/2) Σ(t=1..T) log(σt²) - (1/2) Σ(t=1..T) (yt - μ - φyt-1)² / σt²
3. The software will maximise the function and give parameter values and
their standard errors
26
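Step 2 can be sketched numerically; the toy data, initialisation choice, and parameter values below are illustrative assumptions:

```python
import math

def garch11_loglik(params, y):
    """Log-likelihood of an AR(1)-GARCH(1,1) model:
    y_t = mu + phi*y_{t-1} + u_t,  sigma2_t = a0 + a1*u_{t-1}^2 + b*sigma2_{t-1}."""
    mu, phi, a0, a1, b = params
    T = len(y)
    ybar = sum(y) / T
    sigma2 = sum((v - ybar) ** 2 for v in y) / T  # initialise at the sample variance
    u_prev = 0.0
    ll = 0.0
    for t in range(1, T):
        sigma2 = a0 + a1 * u_prev ** 2 + b * sigma2
        u = y[t] - mu - phi * y[t - 1]
        ll += -0.5 * math.log(2 * math.pi) - 0.5 * math.log(sigma2) - 0.5 * u * u / sigma2
        u_prev = u
    return ll

y = [0.1, -0.3, 0.5, 0.2, -0.4, 0.7, -0.1, 0.05]  # toy "return" series
print(garch11_loglik((0.0, 0.1, 0.05, 0.1, 0.8), y))
```

The numerical optimiser in step 3 would then search over (μ, φ, α0, α1, β) to maximise this function.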
Parameter Estimation using Maximum Likelihood
 
• Consider the bivariate regression case with homoskedastic errors for simplicity: yt = β1 + β2xt + ut

• Assuming that ut ~ N(0, σ²), then yt ~ N(β1 + β2xt, σ²), so that the probability density function for a normally distributed random variable with this mean and variance is given by
f(yt | β1 + β2xt, σ²) = (1 / (σ√(2π))) exp{ -(1/2) (yt - β1 - β2xt)² / σ² }   (1)

• Successive values of yt would trace out the familiar bell-shaped curve.

• Assuming that the ut are iid, then the yt will also be iid.

27
Parameter Estimation using Maximum Likelihood
(cont’d)

• Then the joint pdf for all the y's can be expressed as a product of the individual density functions
f(y1, y2, ..., yT | β1 + β2Xt, σ²) = f(y1 | β1 + β2X1, σ²) f(y2 | β1 + β2X2, σ²) ... f(yT | β1 + β2XT, σ²)   (2)
= Π(t=1..T) f(yt | β1 + β2Xt, σ²)

• Substituting into equation (2) for every yt from equation (1),
f(y1, y2, ..., yT | β1 + β2xt, σ²) = (1 / (σ^T (2π)^(T/2))) exp{ -(1/2) Σ(t=1..T) (yt - β1 - β2xt)² / σ² }   (3)
28
Parameter Estimation using Maximum Likelihood
(cont’d)
• The typical situation we have is that the xt and yt are given and we want to estimate β1, β2, σ². If this is the case, then f(•) is known as the likelihood function, denoted LF(β1, β2, σ²), so we write
LF(β1, β2, σ²) = (1 / (σ^T (2π)^(T/2))) exp{ -(1/2) Σ(t=1..T) (yt - β1 - β2xt)² / σ² }   (4)

• Maximum likelihood estimation involves choosing parameter values (β1, β2, σ²) that maximise this function.

• We want to differentiate (4) w.r.t. β1, β2, σ², but (4) is a product containing T terms.
 

29
Parameter Estimation using Maximum Likelihood
(cont’d)
• Since maxx f(x) = maxx log(f(x)), we can take logs of (4).

• Then, using the various laws for transforming functions containing logarithms, we obtain the log-likelihood function, LLF:
LLF = -T log σ - (T/2) log(2π) - (1/2) Σ(t=1..T) (yt - β1 - β2xt)² / σ²

• which is equivalent to
LLF = -(T/2) log σ² - (T/2) log(2π) - (1/2) Σ(t=1..T) (yt - β1 - β2xt)² / σ²   (5)

• Differentiating (5) w.r.t. β1, β2, σ², we obtain
∂LLF/∂β1 = -(1/2) Σt (yt - β1 - β2xt) · 2 · (-1) / σ²   (6)
30
Parameter Estimation using Maximum Likelihood
(cont’d)
∂LLF/∂β2 = -(1/2) Σt (yt - β1 - β2xt) · 2 · (-xt) / σ²   (7)
∂LLF/∂σ² = -T/(2σ²) + (1/2) Σt (yt - β1 - β2xt)² / σ⁴   (8)

• Setting (6)-(8) to zero to maximise the function, and putting hats above the parameters to denote the maximum likelihood estimators,

• From (6), Σ(yt - β̂1 - β̂2xt) = 0
Σyt - Tβ̂1 - β̂2Σxt = 0
(1/T)Σyt - β̂1 - β̂2(1/T)Σxt = 0
β̂1 = ȳ - β̂2x̄   (9)
31
Parameter Estimation using Maximum Likelihood
(cont’d)
• From (7), Σ(yt - β̂1 - β̂2xt)xt = 0
Σytxt - β̂1Σxt - β̂2Σx²t = 0
Σytxt - (ȳ - β̂2x̄)Σxt - β̂2Σx²t = 0
β̂2Σx²t = Σytxt - Tx̄ȳ + β̂2Tx̄²
β̂2(Σx²t - Tx̄²) = Σytxt - Tx̄ȳ
β̂2 = (Σytxt - Tx̄ȳ) / (Σx²t - Tx̄²)   (10)

• From (8), T/σ̂² = (1/σ̂⁴) Σ(yt - β̂1 - β̂2xt)²
32
Parameter Estimation using Maximum Likelihood
(cont’d)
• Rearranging, σ̂² = (1/T) Σ(yt - β̂1 - β̂2xt)²
σ̂² = (1/T) Σû²t   (11)

• How do these formulae compare with the OLS estimators?
(9) & (10) are identical to OLS
(11) is different. The OLS estimator was
s² = (1/(T - k)) Σû²t

• Therefore the ML estimator of the variance of the disturbances is biased,


although it is consistent.
•  But how does this help us in estimating heteroskedastic models?

33
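Formulas (9)-(11) can be checked numerically against the degrees-of-freedom-corrected OLS variance estimator; the data below are made up for illustration:

```python
# Closed-form ML estimates for the bivariate regression, following (9)-(11)
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.1, 3.9, 6.2, 8.1, 9.8]  # toy data, roughly y = 2x
T, k = len(y), 2               # k = number of mean parameters (beta1, beta2)

xbar = sum(x) / T
ybar = sum(y) / T
# eqn (10): slope estimate, identical to OLS
beta2 = (sum(a * b for a, b in zip(x, y)) - T * xbar * ybar) / (
    sum(a * a for a in x) - T * xbar ** 2)
# eqn (9): intercept estimate, identical to OLS
beta1 = ybar - beta2 * xbar

resid = [b - beta1 - beta2 * a for a, b in zip(x, y)]
sigma2_ml = sum(e * e for e in resid) / T         # eqn (11): divides by T (biased)
sigma2_ols = sum(e * e for e in resid) / (T - k)  # OLS: divides by T - k

print(beta1, beta2, sigma2_ml, sigma2_ols)
```

The ML variance estimate is always smaller than the OLS one, illustrating the bias (which vanishes as T grows, so the estimator remains consistent).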
Estimation of GARCH Models Using
Maximum Likelihood

 
 




• Now we have yt = μ + φyt-1 + ut , ut ~ N(0, σt²)
σt² = α0 + α1u²t-1 + βσ²t-1
and the log-likelihood function
L = -(T/2) log(2π) - (1/2) Σ(t=1..T) log(σt²) - (1/2) Σ(t=1..T) (yt - μ - φyt-1)² / σt²
• Unfortunately, the LLF for a model with time-varying variances cannot be maximised
analytically, except in the simplest of cases. So a numerical procedure is used to
maximise the log-likelihood function. A potential problem: local optima or
multimodalities in the likelihood surface.

• The way we do the optimisation is:


  1. Set up LLF.
2. Use regression to get initial guesses for the mean parameters.
3. Choose some initial guesses for the conditional variance parameters.
4. Specify a convergence criterion - e.g. based on the change in the log-likelihood value or in the parameter values between iterations.

34
Non-Normality and Maximum Likelihood

• Recall that the conditional normality assumption for ut is essential.


 
• We can test for normality using the following representation
ut = vtσt , vt ~ N(0,1)
σt = √(α0 + α1u²t-1 + βσ²t-1) , vt = ut / σt

• The sample counterpart is v̂t = ût / σ̂t

• Are the v̂t normal? Typically they are still leptokurtic, although less so than the ût. Is this a problem? Not really, as we can use ML with a robust variance/covariance estimator. ML with robust standard errors is called Quasi-Maximum Likelihood, or QML. The robust standard errors were developed by Bollerslev and Wooldridge (1992).

35
QML Estimator
• Write the likelihood down as if the errors were normally distributed (using the normal log-likelihood of the previous slides), even when they are not.

• Maximising this normal likelihood gives the QML estimates. According to Bollerslev and Wooldridge (1992), these estimates are consistent, and robust standard errors can be computed for them.

36
Extensions to the Basic GARCH Model

• Since the GARCH model was developed, a huge number of extensions


and variants have been proposed. Three of the most important
examples are EGARCH, GJR, and GARCH-M models.
 
• Problems with GARCH(p,q) Models:
- Non-negativity constraints may still be violated
- GARCH models cannot account for leverage effects
 
• Possible solutions: the exponential GARCH (EGARCH) model or the
GJR model, which are asymmetric GARCH models.
 

37
The EGARCH Model

• Suggested by Nelson (1991). The variance equation is given by


 
u t 1  u 2
 
2 2 t 1
log( t )     log( t 1 )   
 t 1
2   t 1 2 
 

• Advantages of the model


- Since we model the log(t2), then even if the parameters are negative, t2
will be positive.
- We can account for the leverage effect: if the relationship between
volatility and returns is negative, , will be negative.
38
The GJR Model




• Due to Glosten, Jaganathan and Runkle (1993). The variance equation is
σt² = α0 + α1u²t-1 + βσ²t-1 + γu²t-1It-1
where It-1 = 1 if ut-1 < 0
         = 0 otherwise

• For a leverage effect, we would see γ > 0.

• We require α1 + γ ≥ 0 and α1 ≥ 0 for non-negativity.

39
An Example of the use of a GJR Model

• Using monthly S&P 500 returns, December 1979 - June 1998

• Estimating a GJR model, we obtain the following results (t-ratios in parentheses):
yt = 0.172
     (3.198)

σt² = 1.243 + 0.015u²t-1 + 0.498σ²t-1 + 0.604u²t-1It-1
     (16.372)  (0.437)     (14.999)     (5.772)

40
News Impact Curves

The news impact curve plots the next period volatility (ht) that would arise from various
positive and negative values of ut-1, given an estimated model.

News Impact Curves for S&P 500 Returns using Coefficients from GARCH and GJR
Model Estimates:
[Figure: news impact curves from the GARCH and GJR estimates, plotting the value of the conditional variance (vertical axis, roughly 0 to 0.14) against the value of the lagged shock ut-1 (horizontal axis, -1 to +1); the GARCH curve is symmetric, while the GJR curve rises more steeply for negative shocks]
41
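The GJR news impact curve can be reproduced from the slide-40 estimates; in this sketch, the level at which σ²t-1 is held fixed is an arbitrary choice for illustration:

```python
def gjr_nic(u_lag, a0=1.243, a1=0.015, beta=0.498, gamma=0.604, sigma2_lag=0.8):
    """Next-period conditional variance implied by the estimated GJR model,
    holding sigma^2_{t-1} fixed at an arbitrary level (sigma2_lag)."""
    leverage = gamma * u_lag ** 2 if u_lag < 0 else 0.0  # indicator I_{t-1}
    return a0 + a1 * u_lag ** 2 + beta * sigma2_lag + leverage

# Asymmetry: a negative shock raises next-period variance far more than
# a positive shock of the same magnitude
print(gjr_nic(-0.5), gjr_nic(0.5))
```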
GARCH-in Mean

• We expect a risk to be compensated by a higher return. So why not let


the return of a security be partly determined by its risk?
 





• Engle, Lilien and Robins (1987) suggested the ARCH-M specification.
A GARCH-M model would be
yt = μ + δσt-1 + ut , ut ~ N(0, σt²)
σt² = α0 + α1u²t-1 + βσ²t-1

• δ can be interpreted as a sort of risk premium.

• It is possible to combine all or some of these models together to get


more complex “hybrid” models - e.g. an ARMA-EGARCH(1,1)-M
model.
42
What Use Are GARCH-type Models?

• GARCH can model the volatility clustering effect since the conditional
variance is autoregressive. Such models can be used to forecast volatility.
  
• We could show that
Var(yt | yt-1, yt-2, ...) = Var(ut | ut-1, ut-2, ...)

• So modelling σt² will give us models and forecasts for the variance of yt as well.

• Variance forecasts are additive over time.

43
Forecasting Variances using GARCH Models

• Producing conditional variance forecasts from GARCH models uses a


very similar approach to producing forecasts from ARMA models.
• It is again an exercise in iterating with the conditional expectations
operator.
• Consider the following GARCH(1,1) model:
yt = μ + ut , ut ~ N(0, σt²) , σt² = α0 + α1u²t-1 + βσ²t-1
• What is needed is to generate forecasts of σ²T+1|ΩT, σ²T+2|ΩT, ..., σ²T+s|ΩT, where ΩT denotes all information available up to and including observation T.
• Adding one to each of the time subscripts of the above conditional variance equation, and then two, and then three, would yield the following equations:
σ²T+1 = α0 + α1u²T + βσ²T
σ²T+2 = α0 + α1u²T+1 + βσ²T+1
σ²T+3 = α0 + α1u²T+2 + βσ²T+2

44
Forecasting Variances
using GARCH Models (Cont’d)
• Let σ²f(1,T) be the one-step-ahead forecast for σ² made at time T. This is easy to calculate since, at time T, the values of all the terms on the RHS are known.
• σ²f(1,T) would be obtained by taking the conditional expectation of the first equation at the bottom of slide 44:
σ²f(1,T) = α0 + α1u²T + βσ²T
• Given σ²f(1,T), how is σ²f(2,T), the 2-step-ahead forecast for σ² made at time T, calculated? Taking the conditional expectation of the second equation at the bottom of slide 44:
σ²f(2,T) = α0 + α1E(u²T+1 | ΩT) + βσ²f(1,T)
• where E(u²T+1 | ΩT) is the expectation, made at time T, of u²T+1, which is the squared disturbance term.

45
Forecasting Variances
using GARCH Models (Cont’d)
• We can write
E(u²T+1 | ΩT) = σ²T+1
• But σ²T+1 is not known at time T, so it is replaced with the forecast for it, σ²f(1,T), so that the 2-step-ahead forecast is given by
σ²f(2,T) = α0 + α1σ²f(1,T) + βσ²f(1,T)
σ²f(2,T) = α0 + (α1 + β)σ²f(1,T)

• By similar arguments, the 3-step-ahead forecast will be given by
σ²f(3,T) = ET(α0 + α1u²T+2 + βσ²T+2)
= α0 + (α1 + β)σ²f(2,T)
= α0 + (α1 + β)[α0 + (α1 + β)σ²f(1,T)]
= α0 + α0(α1 + β) + (α1 + β)²σ²f(1,T)

• Any s-step-ahead forecast (s ≥ 2) would be produced by
hf(s,T) = α0 Σ(i=1..s-1) (α1 + β)^(i-1) + (α1 + β)^(s-1) hf(1,T)
46
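The recursion can be coded directly; the parameter values and initial conditions below are illustrative assumptions, not estimates from these slides:

```python
def garch11_forecast(a0, a1, beta, u_T, sigma2_T, s):
    """s-step-ahead conditional variance forecasts from a GARCH(1,1),
    following the recursion on this slide."""
    forecasts = [a0 + a1 * u_T ** 2 + beta * sigma2_T]  # 1-step ahead: all RHS known
    for _ in range(s - 1):
        # each further step: sigma2f_{k,T} = a0 + (a1 + beta) * sigma2f_{k-1,T}
        forecasts.append(a0 + (a1 + beta) * forecasts[-1])
    return forecasts

# Illustrative values: forecasts converge to a0 / (1 - a1 - beta) = 2.0
f = garch11_forecast(a0=0.1, a1=0.15, beta=0.8, u_T=1.5, sigma2_T=1.2, s=50)
print(f[0], f[-1])
```

Because α1 + β < 1 here, the forecasts converge geometrically to the unconditional variance, illustrating the stationarity point made on slide 24.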
What Use Are Volatility Forecasts?

1. Option pricing
C = f(S, X, σ², T, rf)

2. Conditional betas
βi,t = σim,t / σ²m,t

3. Dynamic hedge ratios
The hedge ratio - the size of the futures position relative to the size of the underlying exposure, i.e. the number of futures contracts to buy or sell per unit of the spot good.
 

47
What Use Are Volatility Forecasts? (Cont’d)

• What is the optimal value of the hedge ratio?
• Assuming that the objective of hedging is to minimise the variance of the hedged portfolio, the optimal hedge ratio will be given by
h = p (σS / σF)
where h = hedge ratio
p = correlation coefficient between the change in spot price (ΔS) and the change in futures price (ΔF)
σS = standard deviation of ΔS
σF = standard deviation of ΔF

• What if the standard deviations and correlation are changing over time?
Use ht = pt (σS,t / σF,t)
48
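The formula above as a small sketch (the input values are made up for illustration):

```python
def hedge_ratio(rho, sd_spot, sd_futures):
    """Minimum-variance hedge ratio: h = rho * (sigma_S / sigma_F)."""
    return rho * sd_spot / sd_futures

# Illustrative: correlation 0.9, spot changes twice as volatile as futures changes
print(hedge_ratio(0.9, 0.02, 0.01))  # 1.8
```

With time-varying GARCH forecasts of pt, σS,t and σF,t, the same function applied period by period gives a dynamic hedge ratio.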
Testing Non-linear Restrictions or
Testing Hypotheses about Non-linear Models

• Usual t- and F-tests are still valid in non-linear models, but they are
not flexible enough.

• There are three hypothesis testing procedures based on maximum


likelihood principles: Wald, Likelihood Ratio, Lagrange Multiplier.
 
• Consider a single parameter, θ, to be estimated. Denote the MLE as θ̂ and a restricted estimate as θ̃.

49
Likelihood Ratio Tests

• Estimate under the null hypothesis and under the alternative.


• Then compare the maximised values of the LLF.
• So we estimate the unconstrained model and achieve a given maximised
value of the LLF, denoted Lu
• Then estimate the model imposing the constraint(s) and get a new value of
the LLF denoted Lr.
• Which will be bigger?
• Lr ≤ Lu, comparable to RRSS ≥ URSS

• The LR test statistic is given by
LR = -2(Lr - Lu) ~ χ²(m)
where m = number of restrictions

50
Likelihood Ratio Tests (cont’d)

• Example: We estimate a GARCH model and obtain a maximised LLF of 66.85. We are interested in testing whether β = 0 in the following model:
yt = μ + φyt-1 + ut , ut ~ N(0, σt²)
σt² = α0 + α1u²t-1 + βσ²t-1

• We estimate the model imposing the restriction and observe that the maximised LLF falls to 64.54. Can we accept the restriction?
• LR = -2(64.54 - 66.85) = 4.62.
• The test statistic follows a χ²(1), whose 5% critical value is 3.84, so reject the null.

51
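The LR computation from this example as a small sketch (the 5% χ²(1) critical value, 3.84, is hard-coded):

```python
def lr_test(ll_restricted, ll_unrestricted, m):
    """Likelihood ratio statistic LR = -2(Lr - Lu), compared with chi2(m);
    only the 5% critical value for m = 1 (3.84) is hard-coded here."""
    lr = -2.0 * (ll_restricted - ll_unrestricted)
    reject = lr > 3.84 if m == 1 else None
    return lr, reject

# The slide's example: Lu = 66.85, Lr = 64.54, one restriction
lr, reject = lr_test(64.54, 66.85, 1)
print(lr, reject)  # ~4.62, reject the null
```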
Modelling Volatility of KSE Returns
• Let’s consider daily KSE-100 index return data from Jan 1, 2000 - June 30, 2008. We hold back the last 25 observations for out-of-sample forecast comparison. Here is the plot of the returns, which shows the typical pattern of financial data:

KSE-100 index returns (Jan 1, 2000 June 30, 2008)

[Figure: daily KSE-100 returns against day; values range between about -10% and +10%, with clear volatility clustering]

52
Identifying GARCH order: Correlogram
of KSE-return
• There seem to be some significant autocorrelations up to and including 10 lags, so the mean equation may involve many parameters.

53
Identifying GARCH order: Correlogram
of Squared KSE-return
• The PACF is large at the first four lags and at some higher lags, possibly indicating that a simple ARCH model would involve too many parameters.

54
Testing for ARCH Effects in Returns
• The ARCH LM test can be applied directly to KSE returns to see if the squared returns have an autoregressive structure:
H0 : α1 = α2 = ... = αk = 0, where the αk are the coefficients on the lags of squared returns
• LM(11) = TR² = 2207 × 0.1484 = 327.5, Pval = @chisq(327.5, 11) = 0.000

• As suggested in the literature, e.g. Enders (2004, p. 146), we start with the most parsimonious model for conditional volatility that has often been found satisfactory in developed stock markets, i.e. GARCH(1,1).
• If it fails to satisfy any diagnostics, more complicated models are needed.

55
Tentative Model ARMA(1,1)-GARCH
(1,1)
• Let’s examine the fit of an ARMA(1,1)-GARCH(1,1) model. All the coefficients in the mean and volatility equations are significant. The GARCH model also seems to be covariance stationary, as the sum alpha + beta = 0.1852 + 0.7525 = 0.938 < 1, although the sum is close enough to 1 that the model is nearly an IGARCH model.
• Shocks to volatility have a high degree of persistence.
• The unconditional mean of the KSE returns is 0.1423/(1 - 0.9198) = 1.89 (×360 = 680% annually!, which doesn’t look reasonable)
• The unconditional variance of KSE returns is 0.1517/(1 - 0.1852 - 0.7524) = 2.43

56
Diagnostics: Ordinary and Standardized Residuals
• The estimated model’s residuals should be uncorrelated, and the residuals should not contain any remaining conditional volatility.
• The kurtosis of the standardized residuals (residuals divided by the conditional standard deviation) is smaller than the kurtosis of the ordinary residuals, which should be the case if GARCH is a suitable model.

Ordinary residuals (Sample 1 2218, Observations 2217):
Mean -0.037813, Median -0.012862, Maximum 8.855997, Minimum -8.863792, Std. Dev. 1.547812, Skewness -0.169151, Kurtosis 7.038969, Jarque-Bera 1517.511 (Probability 0.000000)

Standardized residuals (Sample 1 2218, Observations 2217):
Mean -0.029514, Median -0.011098, Maximum 4.215838, Minimum -6.316449, Std. Dev. 0.999929, Skewness -0.510640, Kurtosis 5.679093, Jarque-Bera 759.3738 (Probability 0.000000)

57
Model Diagnostics: Standardized
Residuals
• Standardized residuals and
squared standardized residuals
do not give evidence of any
misspecification

58
Testing for Leverage Effects (Enders, p-
148)
• If there is no leverage, the regression of the squared standardized residuals on their lagged levels should give an overall F-test that is insignificant, with insignificant individual t-values.
• sresid(-1) and sresid(-2) are significant, as is the overall F-test. This indicates the presence of a leverage effect: the conditional variance of KSE returns is affected more by negative shocks.

59
EGARCH Estimates
• The coefficient on the term involving the lagged residual allows the sign of the residual to affect the conditional variance. If asymmetry is present, this coefficient (in this case c(6)) should be negative and significant; if the term is zero, there is no asymmetry. In this case the term is negative and significant, which confirms the previous diagnostic test.

60
EGARCH model
• The estimated EGARCH model gives
log σt² = -0.173 + 0.3173 |ut-1|/σt-1 - 0.1103 ut-1/σt-1 + 0.8997 log σ²t-1
• A positive shock in the error last period (good news) of one unit increases log volatility by -0.1103 + 0.3173 = 0.207 units.
• A negative shock in the error last period (bad news) of one unit increases log volatility by -(-0.1103) + 0.3173 = 0.4276 units, which is more than double; hence volatility responds to shocks in an asymmetric way.

61
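The asymmetry calculation above can be written as a small function of the standardised shock; only the news term γv + α|v| of the estimated equation is evaluated here (the constant and the lagged log-variance term are held aside):

```python
def egarch_news_response(shock, alpha=0.3173, gamma=-0.1103):
    """Contribution of a standardised shock v = u_{t-1}/sigma_{t-1} to
    log sigma_t^2 in the estimated EGARCH model: gamma*v + alpha*|v|."""
    return gamma * shock + alpha * abs(shock)

# Good news (+1) vs bad news (-1): 0.207 vs 0.4276, as on the slide
print(egarch_news_response(1.0), egarch_news_response(-1.0))
```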
ARCH-in Mean
• Using either the standard deviation or the variance in the mean equation gives an insignificant estimate of the risk premium. Thus, in this case, ARCH-M is not a satisfactory description of KSE returns.

62
Model Selection
• The following table indicates that the log-likelihood corresponding to EGARCH(1,1) with t-distributed errors is the maximum among the competing models. Compared to the normal case, the parameter estimates for EGARCH with t errors are slightly different. The Akaike and Schwarz criteria also give similar results.

Model                          Log-Likelihood
GARCH(1,1) - Normal Errors     -3792.053
GARCH(1,1) - T Errors          -3690.913
GARCH(1,1) - GED Errors        -3685.821
EGARCH(1,1) - Normal Errors    -3787.982
EGARCH(1,1) - T Errors         -3677.530
EGARCH(1,1) - GED Errors       -3678.178

63
Estimated Volatility
• Volatility estimates from the EGARCH model resemble realized squared returns more closely.

Estimated Volatility and Squared KSE Returns
[Figure: estimated volatility series VGARCHT and VEGARCHT plotted against realized squared returns KSE^2 over the hold-out period; volatility values range from 0 to about 60]
64
