Intro v220130311164400
Introduction
Carlo A. Favero and Massimo Guidolin
Dept. of Finance, Bocconi University
1. Introduction
Predicting the distribution of returns of financial assets is a task of primary importance for
identifying desirable investments, performing optimal asset allocation within a portfolio,
as well as measuring and managing portfolio risk. Optimal asset management depends
on the statistical properties of returns at different frequencies. Portfolio allocation, i.e.,
the choice of optimal weights to be attributed to the different (financial) assets in a
portfolio, is typically based on a long-horizon perspective, while the measurement of the risk
of a given portfolio typically takes a rather short-horizon perspective. This means that
a long-run investor decides her optimal portfolio allocation on the basis of the (joint)
distribution of the returns of the relevant (i.e., in some pertinent asset menu from which
to choose) financial assets at low frequency.[1] However, the monitoring of the daily risk of
a portfolio normally depends on the statistical properties of the distribution of returns at
high frequencies.
This book (project), in its characteristically applied nature, is designed to illustrate
the statistical techniques used to analyse time series of asset (often, financial)
returns at different frequencies and their application to asset management and performance
evaluation, portfolio allocation, and financial risk management.
The relevant concepts will be introduced and their application will be discussed by
using a set of MATLAB programmes specifically designed for each of the chapters. The
underlying teaching strategy is to start with the econometrics required by the simplest
portfolio allocation strategy or asset management application, and then to investigate whether
and how progressively more complicated techniques lead to different solutions to the same
problem. Draft MATLAB codes for the solutions of the exercises, which are designed to
allow the reader to understand how the different econometric techniques can be put to
work, are made available in advance on the book webpage. Students are expected to work
on them in the computer classes. MATLAB applications will be based on the database
STOCKINT.XLS containing daily, monthly and quarterly data on US, UK and German
stock market indexes, on US and German 10-year government bonds, and on 3-month
German and US Treasury bills.
[1] There is empirical evidence that females outperform males as professional portfolio managers. One
wonders whether this may be a reflection of the typical decision horizons, which may differ across
these two categories of investors.
This chapter is structured as follows. Section 2 introduces the possibility that both
information and noise may be present in observed asset returns, with the former
emerging as the horizon of the analysis lengthens. Section 3 transforms the log
price-dividend ratio model of Section 2 into a simple statistical framework to understand
why over short horizons returns are essentially unpredictable and dominated by noise,
while over longer investment horizons the opposite tends to occur. Section 4 offers a
first example of how predictability in asset returns may also derive from the existence
of contagion, and therefore offers a first definition of what contagion may be. Section 5
introduces a standard mean-variance portfolio problem, but stops short of dealing with
the key problem of providing forecasts for the vector of mean returns and the covariance
matrix required to operationalize the model in practice. However, this provides an occasion
to further discuss the implications of the data for optimal decisions. Section 6 presents the
simplest possible empirical model for how means, variances, and (possibly) covariances
of asset returns may be estimated starting from a sample of data, in view of performing
asset allocation calculations. Section 7 starts by briefly discussing a common problem
with mean-variance portfolio weights, their instability over time, and then proposes two popular
solutions, re-sampling portfolio weights and the Black-Litterman method, to incorporate
estimation uncertainty in portfolio solutions. Section 8 maps future extensions and also
emphasizes how our development of econometric methods aims at supporting not only
optimal portfolio decisions but also risk management calculations.
The distribution of (future) asset returns is the basic ingredient of the asset allocation
decisions of investors with different horizons. The different horizons matter because they
suggest different optimal econometric strategies for investors to distill and separate
"information" from "noise" in returns. In fact, this book project is motivated by a view that
returns are determined by a permanent information component in opposition to a
temporary noise component.[2] The noise component dominates the data at high frequency,
while the information component emerges when high-frequency observations are
aggregated over time to construct long-horizon returns. The different statistical properties of
the information and the noise components have specific implications for the econometric
modelling of returns.
At high frequency (such as when returns are sampled at intra-daily, daily,
or even weekly intervals), returns are dominated by the noise component and they are therefore
very close to being unpredictable, their volatility is strongly time-varying and persistent
(panic appears suddenly but disappears slowly), and their distribution tends to be poorly
approximated by the normal distribution because thick tails (e.g., large excess kurtosis)
and asymmetries (e.g., non-zero skewness) are regularly observed in the data.[3] Moreover,
at high frequency, the interdependence between returns in different markets is strong
but not constant over time: phenomena of changing interdependence, or contagion, are
frequently observed.
On the contrary, at low frequencies (such as when returns are sampled at monthly,
quarterly, or even annual intervals), as information comes to dominate, predictability also
emerges in that fundamentals become useful to predict returns, heteroskedasticity and
non-normality tend to disappear as the horizon lengthens, and interdependence across
different markets becomes more stable and often weaker.[4] Also, the long-range (e.g.,
annualized) volatility of returns decreases with the investment horizon.
These observations suggest that the econometric modelling of high- and low-frequency
returns should be conducted using different frameworks, and that different
methods may be appropriate at different frequencies. Long-horizon predictability of
returns could be used to improve the (realized, out-of-sample) performance of long-term
strategic asset allocation, while predictability of the volatility of returns at short horizons
could be used to estimate more precisely the risk associated with any given portfolio. An
econometric strategy that recognizes these features of the data should outperform the
strategies in which practical decisions (e.g., optimal weights or value-at-risk estimates)
are simply based on calculations that hinge upon simple historical moments to forecast
(or simply, estimate) future expected returns and their variance-covariance matrix.
[2] The use of the terms "noise" vs. "information" is inspired by chapter 3 of Taleb (2001).
[3] Examples of intra-daily returns are those calculated from transaction prices sampled every
5 minutes, which is becoming rather typical of the microstructure finance literature.
[4] Heteroskedasticity simply means that the variance of a process changes over time. This is one of the
main topics of the course that will be tackled in due time.
This view (that, as the horizon grows, useful information to predict features of the
joint distribution of asset returns may be inferred from historical data on asset returns
using appropriate statistical techniques) is very different from the (perfect, efficient market
hypothesis) view prevalent in the 1960s and 1970s on the behaviour of asset prices
and financial returns, which Cochrane (1999) summarized as follows: the CAPM is a good
measure of risk and thus a good explanation of why some stocks earn higher average
returns than others; returns are close to unpredictable: any predictability is a statistical
artifact or cannot be exploited after transaction costs; volatility is constant. Modern
financial econometrics shows that these conclusions (which are equivalent to stating that
asset returns contain only noise and no information useful to forecast their future
properties) are crucially dependent on the horizon of the analysis and, as a result, on the
specific methodologies employed. Section 2.1 makes these general points more concrete by
providing a first framework of analysis as well as some first practical examples.
The key variables for the econometrics of asset and risk management are returns. Returns
are defined over a given horizon. The statistical properties of returns at different horizons
are different. For concreteness, consider stock market returns. Our initial goal is to derive
a simple and yet powerful framework of analysis of stock returns. Such a framework,
starting from an accounting definition of asset returns, shows that the log price-dividend
ratio of any stock or risky portfolio measures the value of a long-term investment strategy
(buy and hold) which, apart from a constant, is equal to the stream of future dividend
growth discounted at the appropriate rate, which reflects the risk-free rate plus the risk
premium required to hold risky assets. This model is often referred to as the dynamic
dividend growth model, in its log-linearized form.
Start from one-period total holding returns in the stock market, which are defined as
follows:[5]
$$H^s_{t+1} \equiv \frac{P_{t+1} + D_{t+1}}{P_t} - 1 = \frac{P_{t+1} - P_t + D_{t+1}}{P_t} = \frac{\Delta P_{t+1}}{P_t} + \frac{D_{t+1}}{P_t}, \qquad (1)$$
where $P_t$ is the stock price at time t, $D_t$ is the (cash) dividend paid at time t, and the
superscript s denotes "stock". The last equality decomposes a discrete holding period
return as the sum of the percentage capital gain and of (a definition of) the dividend
yield, $D_{t+1}/P_t$. Given that one-period returns are usually small, it is sometimes convenient
to approximate them with logarithmic, continuously compounded returns, defined as:
$$r^s_{t+1} \equiv \log\left(1 + H^s_{t+1}\right) = \log\frac{P_{t+1} + D_{t+1}}{P_t} = \log(P_{t+1} + D_{t+1}) - \log(P_t). \qquad (2)$$
[5] The use of `$\equiv$' emphasizes that (1) provides a definition. Moreover, $\Delta X_{t+1}$ denotes the first difference
of a generic variable, or $\Delta X_{t+1} \equiv X_{t+1} - X_t$.
Interestingly, while linear returns are additive in the percentage capital gain and the
dividend yield components, log returns are not, as
$$\log\frac{P_{t+1} + D_{t+1}}{P_t} \neq \log\frac{P_{t+1}}{P_t} + \log\frac{D_{t+1}}{P_t}.$$
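The two return definitions and the additivity property can be checked numerically. The following is a minimal sketch with hypothetical prices and dividends (the numbers and function names are illustrative assumptions, not taken from the course data set):

```python
import math

def holding_return(p_now, p_next, d_next):
    """Simple one-period total return, as in (1)."""
    return (p_next + d_next) / p_now - 1.0

def log_return(p_now, p_next, d_next):
    """Continuously compounded total return, as in (2)."""
    return math.log(p_next + d_next) - math.log(p_now)

# Hypothetical inputs: price today, price tomorrow, dividend paid tomorrow.
p_now, p_next, d_next = 100.0, 103.0, 2.0

capital_gain = (p_next - p_now) / p_now      # percentage capital gain
dividend_yield = d_next / p_now              # dividend yield D_{t+1}/P_t
H = holding_return(p_now, p_next, d_next)
r = log_return(p_now, p_next, d_next)

# Linear returns split additively into the two components ...
additive_gap = abs(H - (capital_gain + dividend_yield))
# ... while the sum of the logs of the two ratios is a different object.
naive_log_sum = math.log(p_next / p_now) + math.log(d_next / p_now)

print(H, r, additive_gap, naive_log_sum)
```

With small one-period returns the log return stays close to the simple return, which is the sense in which (2) approximates (1).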
However, it is still possible to express log returns as a linear function of the log of the
price-dividend ratio and the (log) dividend growth. Dividing both sides of (1) by $1 + H^s_{t+1}$
and multiplying both sides by $P_t/D_t$ we have:
$$\frac{P_t}{D_t} = \frac{1}{1 + H^s_{t+1}} \frac{D_{t+1}}{D_t} \left(1 + \frac{P_{t+1}}{D_{t+1}}\right).$$
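Taking logs (denoted by lower case letters, i.e., $x_t \equiv \log X_t$ for a generic variable $X_t$), we
have:[6]
$$p_t - d_t = -r^s_{t+1} + \Delta d_{t+1} + \ln\left(1 + e^{p_{t+1} - d_{t+1}}\right) \qquad (3)$$
as $\log(D_{t+1}/D_t) = \log D_{t+1} - \log D_t = \Delta \log D_{t+1} = \Delta d_{t+1}$. Taking a first-order Taylor
expansion of the last term about the point $\overline{P/D} = e^{\bar{p} - \bar{d}}$ (where the bar denotes a sample
average), the logarithm term on the right-hand side can be approximated as:
$$\begin{aligned}
\ln\left(1 + e^{p_{t+1} - d_{t+1}}\right) &\simeq \ln\left(1 + e^{\bar{p} - \bar{d}}\right) + \frac{e^{\bar{p} - \bar{d}}}{1 + e^{\bar{p} - \bar{d}}}\left[(p_{t+1} - d_{t+1}) - (\bar{p} - \bar{d})\right] \\
&= -\ln(1 - \rho) - \rho \ln\left(\frac{1}{1 - \rho} - 1\right) + \rho\,(p_{t+1} - d_{t+1}) \\
&= \kappa + \rho\,(p_{t+1} - d_{t+1}),
\end{aligned}$$
where
$$\rho \equiv \frac{e^{\bar{p} - \bar{d}}}{1 + e^{\bar{p} - \bar{d}}} = \frac{\overline{P/D}}{1 + \overline{P/D}} < 1, \qquad \kappa \equiv -\ln(1 - \rho) - \rho \ln\left(\frac{1}{1 - \rho} - 1\right).$$
[6] The term $-r^s_{t+1}$ follows from
$$\log\frac{1}{1 + H^s_{t+1}} = \log 1 - \log\left(1 + H^s_{t+1}\right) = -\log\left(1 + H^s_{t+1}\right) = -r^s_{t+1},$$
based on our earlier definitions and the fact that log 1 = 0 for natural logs. Moreover, notice that
$$\frac{P_{t+1}}{D_{t+1}} = e^{\log(P_{t+1}/D_{t+1})} = e^{\log P_{t+1} - \log D_{t+1}} = e^{p_{t+1} - d_{t+1}}.$$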
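To get a feel for how accurate the first-order expansion is, the sketch below computes $\rho$ and $\kappa$ for an average price-dividend ratio of 25 (the value the text later calls plausible) and compares the exact term $\ln(1 + e^{p-d})$ with its linear approximation at a few hypothetical price-dividend levels. The function names and the evaluation points are assumptions made for illustration only.

```python
import math

def loglinear_constants(mean_pd_ratio):
    """Return (rho, kappa) implied by an average price-dividend ratio."""
    rho = mean_pd_ratio / (1.0 + mean_pd_ratio)
    kappa = -math.log(1.0 - rho) - rho * math.log(1.0 / (1.0 - rho) - 1.0)
    return rho, kappa

def exact_term(log_pd):
    """Exact nonlinear term ln(1 + exp(p - d))."""
    return math.log(1.0 + math.exp(log_pd))

def linear_term(log_pd, rho, kappa):
    """First-order approximation kappa + rho * (p - d)."""
    return kappa + rho * log_pd

rho, kappa = loglinear_constants(25.0)

# Approximation error at the expansion point and away from it.
errors = {}
for pd_level in (15.0, 25.0, 40.0):   # hypothetical P/D levels
    x = math.log(pd_level)
    errors[pd_level] = exact_term(x) - linear_term(x, rho, kappa)

print(rho, kappa, errors)
```

The error is zero at the expansion point and positive elsewhere, because $\ln(1 + e^{x})$ is convex and the approximation is its tangent line.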
Although $\rho \in (0, 1)$ is just a factor that depends on the average price-dividend ratio, in
what follows it will be used in a way that resembles a discount factor. At this point,
substituting the expression for the approximated term in (3), we obtain that the log
price-dividend ratio can be written as:[7]
$$p_t - d_t \simeq \kappa - r^s_{t+1} + \Delta d_{t+1} + \rho\,(p_{t+1} - d_{t+1}).$$
Re-arranging this expression shows that total stock market returns can be written as:
$$r^s_{t+1} = \kappa + \rho\,(p_{t+1} - d_{t+1}) + \Delta d_{t+1} - (p_t - d_t),$$
or a constant $\kappa$, plus the log dividend growth rate ($\Delta d_{t+1}$), plus the (discounted, at rate
$\rho$) change in the log price-dividend ratio, $\rho\,(p_{t+1} - d_{t+1}) - (p_t - d_t) = \Delta(p_{t+1} - d_{t+1}) - (1 - \rho)(p_{t+1} - d_{t+1})$. Moreover, by forward recursive substitution one obtains:
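$$\begin{aligned}
(p_t - d_t) &= \kappa - r^s_{t+1} + \Delta d_{t+1} + \rho\,(p_{t+1} - d_{t+1}) \\
&= \kappa - r^s_{t+1} + \Delta d_{t+1} + \rho\left(\kappa - r^s_{t+2} + \Delta d_{t+2} + \rho\,(p_{t+2} - d_{t+2})\right) \\
&= (\kappa + \rho\kappa) - (r^s_{t+1} + \rho r^s_{t+2}) + (\Delta d_{t+1} + \rho \Delta d_{t+2}) + \rho^2 (p_{t+2} - d_{t+2}) \\
&= (\kappa + \rho\kappa) - (r^s_{t+1} + \rho r^s_{t+2}) + (\Delta d_{t+1} + \rho \Delta d_{t+2}) \\
&\quad + \rho^2 \left(\kappa - r^s_{t+3} + \Delta d_{t+3} + \rho\,(p_{t+3} - d_{t+3})\right) \\
&= \kappa (1 + \rho + \rho^2) - (r^s_{t+1} + \rho r^s_{t+2} + \rho^2 r^s_{t+3}) + (\Delta d_{t+1} + \rho \Delta d_{t+2} + \rho^2 \Delta d_{t+3}) + \rho^3 (p_{t+3} - d_{t+3}) \\
&= \dots = \kappa \sum_{j=1}^{m} \rho^{j-1} + \sum_{j=1}^{m} \rho^{j-1} (\Delta d_{t+j} - r^s_{t+j}) + \rho^m (p_{t+m} - d_{t+m}).
\end{aligned}$$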
Under the assumption that there can be no rational bubbles, i.e., that[8]
$$\lim_{m \to \infty} \rho^m (p_{t+m} - d_{t+m}) = 0,$$
from
$$\lim_{m \to \infty} \sum_{j=1}^{m} \rho^{j-1} = \frac{1}{1 - \rho}$$
if $\rho \in (0, 1)$, we get
$$(p_t - d_t) = \frac{\kappa}{1 - \rho} + \sum_{j=1}^{m} \rho^{j-1} \left(\Delta d_{t+j} - r^s_{t+j}\right) \quad \text{as } m \to \infty.$$
[7] The approximation sign `$\simeq$' is used to emphasize that this expression is derived from an
application of a Taylor expansion.
[8] This assumption means that as the horizon grows without bounds, the log price-dividend ratio (hence,
the underlying price-dividend ratio) may grow without bounds, but this needs to happen at a speed that
is inferior to $1/\rho > 1$, so that when $p_{t+m} - d_{t+m}$ is discounted at the rate $\rho^m$, the limit of the quantity
$\rho^m (p_{t+m} - d_{t+m})$ is zero.
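The forward recursion is an accounting identity once returns are defined through the log-linear expression, so it can be verified on any simulated path. The sketch below is only an illustration under assumed dynamics: the log price-dividend ratio follows a hypothetical mean-reverting process, dividend growth is drawn at random, returns are then built from $r^s_{t+1} = \kappa + \rho\,(p_{t+1} - d_{t+1}) + \Delta d_{t+1} - (p_t - d_t)$, and the $m$-period decomposition is checked. All parameter values are assumptions.

```python
import math
import random

random.seed(0)

# Linearization constants for an average P/D of 25 (value quoted in the text).
rho = 25.0 / 26.0
kappa = -math.log(1.0 - rho) - rho * math.log(1.0 / (1.0 - rho) - 1.0)

m = 60                       # horizon in periods (hypothetical)
mean_pd = math.log(25.0)

# pd[j] is the log price-dividend ratio at t+j; dd[j] is dividend growth at t+j.
pd = [mean_pd]
dd = [None]
for _ in range(m):
    pd.append(mean_pd + 0.98 * (pd[-1] - mean_pd) + random.gauss(0.0, 0.05))
    dd.append(random.gauss(0.005, 0.02))

# Log returns implied by the linearized definition.
r = [None] + [kappa + rho * pd[j] + dd[j] - pd[j - 1] for j in range(1, m + 1)]

# Right-hand side of the m-period decomposition of p_t - d_t.
rhs = sum(rho ** (j - 1) * (kappa + dd[j] - r[j]) for j in range(1, m + 1))
rhs += rho ** m * pd[m]

gap = abs(pd[0] - rhs)
print(gap)
```

The gap is zero up to floating-point error whatever the simulated path, which is why the decomposition says nothing by itself about predictability: the economic content comes from the processes assumed for dividend growth and discount rates.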
This result shows that the log price-dividend ratio, $(p_t - d_t)$, measures the value of a very
long-term investment strategy (buy and hold) which, apart from a constant $\kappa/(1 - \rho)$,
is equal to the stream of future dividend growth discounted at the appropriate rate,
which reflects the risk-free rate plus the risk premium required to hold risky assets,
$r^s_{t+j} \equiv r^f + (r^s_{t+j} - r^f)$.[9] Therefore, for long investment horizons, econometric methods may hope
to infer from the data two different types of "information": information concerning the
forecasts of future (continuously compounded) dividend growth rates, i.e., $\Delta d_{t+1}$, $\Delta d_{t+2}$,
..., $\Delta d_{t+m}$ as $m \to \infty$, which are measures of the cash flows paid out by the risky assets
(e.g., how well a company will do); and information concerning future discount rates, and in
particular future risk premia, i.e., $(r^s_{t+1} - r^f)$, $(r^s_{t+2} - r^f)$, ..., $(r^s_{t+m} - r^f)$ as $m \to \infty$.
Log returns have time series properties that vary with the horizon at which they are
defined: noise dominates at high frequency while information becomes more and more
relevant as the horizon increases. At high frequency, returns are not predictable and are
non-normally distributed, with time-varying and persistent volatility, while they become
more predictable and close to being normally distributed with constant volatility at lower
frequencies.
Summary statistics of the data (as presented during the lectures) reveal that the mean
of 1-month stock returns (i.e., when the monthly series of stock returns is considered) over
the sample 1956-2012 is 0.61 percent (per month) with an associated standard deviation
of 3.7 percent. Interestingly, the minimum value observed over this sample has been
-22.5 percent. If the data were drawn from a normal distribution, a sample of 40 billion
observations (here, months) would be needed to generate this observation. This is
because this minimum observation is in excess of 7 standard deviations away
from the mean of the empirical distribution. Under a (standard) normal distribution,
observations that fall either below or above the mean by more than 7 standard deviations
receive such a small probability that they are expected to occur approximately in one
draw out of every 40 billion observations (equivalently, their probability is approximately
4e-10, which is practically zero).
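The tail calculation can be sketched with the standard library alone. The inputs below are the rounded summary statistics quoted in the text; with these rounded figures the standardized distance works out to a little over 6 standard deviations, so the sketch is meant to show how such a probability is computed rather than to reproduce the quoted orders of magnitude exactly. The helper name is an assumption.

```python
import math

def normal_cdf(z):
    """Standard normal CDF, written with erfc for accuracy in the far tails."""
    return 0.5 * math.erfc(-z / math.sqrt(2.0))

# Rounded monthly figures quoted in the text (in percent).
mean, std, worst = 0.61, 3.7, -22.5

z = (worst - mean) / std             # standardized distance of the minimum
p_lower = normal_cdf(z)              # probability of a draw at least this bad
expected_wait = 1.0 / p_lower        # average number of draws per such event

print(z, p_lower, expected_wait)
```

Whatever the exact inputs, the message of the paragraph survives: under normality, a month like the sample minimum should essentially never be observed in a sample of a few hundred observations.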
Deviations from normality become increasingly weaker as the horizon at which
returns are defined increases, and disappear when we consider 10-year annualized returns,
obtained as the sum of 120 monthly returns computed in correspondence of each month
in the sample:[10] their mean is about 10 percent, their standard deviation is 4.9 percent
and the minimum observed value is -3.5 percent, which can be observed with a relatively
large probability of 0.002. This is not surprising because 10-year stock returns are the
sum of 120 monthly returns and therefore a central limit theorem is likely to apply (for
instance, if returns were independently distributed).[11] One important caveat in
interpreting the behaviour of 10-year horizon returns sampled at monthly frequency is that
there is an important "overlapping-observation problem": when measured at monthly
frequency, 10-year returns at month t and at month t + 1 share 119 common
observations and are therefore strongly correlated. In order to have independent observations it
would be necessary to "decimate" the series and take one observation every 120,
computed avoiding any overlaps. Of course, in a sample of 50 years this procedure will leave
us with a small sample of only five independent observations on 10-year returns. The
overlapping observations problem is relevant to determine appropriate standard errors of
parameters in regressions involving aggregated data, as residuals of these regressions are
not independently distributed. We shall return to these issues later in this book.
[9] Here we have assumed that the risk-free interest rate is approximately constant. We shall see that,
at least as a first approximation, this is an assumption that holds in practice.
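The overlapping-observation problem is easy to see in code. The sketch below builds overlapping 120-month sums from 50 years of simulated monthly log returns and then "decimates" the series. The simulated data and the helper name are assumptions for illustration; the mean and standard deviation only loosely echo the monthly figures quoted earlier.

```python
import random

def rolling_sums(values, window):
    """Overlapping sums of `window` consecutive observations."""
    sums = [sum(values[:window])]
    for k in range(window, len(values)):
        # Slide the window by one month: add the newest, drop the oldest.
        sums.append(sums[-1] + values[k] - values[k - window])
    return sums

random.seed(0)
window = 120                                     # 10 years of months
monthly = [random.gauss(0.006, 0.037) for _ in range(50 * 12)]

overlapping = rolling_sums(monthly, window)      # one 10-year return per month
non_overlapping = overlapping[::window]          # keep one observation every 120
shared_months = window - 1                       # overlap between adjacent sums

print(len(overlapping), len(non_overlapping), shared_months)
```

Fifty years of monthly data yield 481 overlapping 10-year returns but only five that share no monthly observation, which is why standard errors in long-horizon regressions need special care.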
[Figure panels omitted: time-series plots spanning 1946-2008. The one surviving panel title reads "Squared US Stock Market Returns: 1-M".]
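To illustrate the role of fundamentals and noise in determining returns, consider the
following simple model in which we augment the linearized definition of returns obtained in
section 2.1 with an elementary and yet realistic specification of the process for dividend
growth and for the price-dividend ratio:
$$\begin{aligned}
r^s_{t+1} &= \kappa_0 + \rho\,(p_{t+1} - d_{t+1}) + \Delta d_{t+1} - (p_t - d_t) && (4) \\
\Delta d_{t+1} &= a_{\Delta d} + \sigma_1 \varepsilon_{1t+1} && (5) \\
p_{t+1} - d_{t+1} &= a_{dp} + \phi\,(p_t - d_t) + \sigma_{2t+1} \varepsilon_{2t+1} && (6) \\
\sigma^2_{2t+1} &= \omega + \beta \sigma^2_{2t} + \alpha \left[(p_t - d_t) - a_{dp} - \phi\,(p_{t-1} - d_{t-1})\right]^2. && (7)
\end{aligned}$$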
Equation (5) specifies the process for dividend growth as simple white noise, in the
sense that the standardized deviation of $\Delta d_{t+1}$ from its mean $a_{\Delta d}$, $(\Delta d_{t+1} - a_{\Delta d})/\sigma_1$,
simply equals $\varepsilon_{1t+1}$, a standard zero-mean, unit-variance IID Gaussian random variable
that measures the innovations to the real dividend growth process.[12] This simple
parameterization is fully consistent with the evidence of very little predictability in the dividend
growth rate (see for example, Cochrane 2008 and Menzly et al., 2004).
[12] IID is the acronym that means that a random variable is independently and identically distributed
over time. This implies that $Cov[f(\varepsilon_{1t}), g(\varepsilon_{1t+k})] = 0$ for $k \neq 0$.
Equation (6) specifies the process for the (log) price-dividend ratio as a first-order
autoregressive process in which the innovations have a time-varying variance. In fact,
notice that
$$E_t[p_{t+1} - d_{t+1}] = a_{dp} + \phi\,(p_t - d_t),$$
which depends on one lag of the log price-dividend ratio. Additionally,
$$Var_t[p_{t+1} - d_{t+1}] = \sigma^2_{2t+1},$$
which is clearly heteroskedastic as $\sigma^2_{2t+1}$ is characterized by a time index. Note that the
absolute value of $\phi$ must be less than 1 for the validity of the approximate linearized
definition of returns. However, a large $\phi < 1$, possibly close to 1, implies that the log
price-dividend ratio is highly predictable: as $\phi \to 1$, $E_t[p_{t+1} - d_{t+1}] \to a_{dp} + (p_t - d_t)$
and knowing $(p_t - d_t)$ may give a very precise forecast of the future price-dividend
ratio (how accurate will depend on how large $\sigma^2_{2t+1}$ tends to be). The term $\sigma_{2t+1} \varepsilon_{2t+1}$ captures the
noise component: the log price-dividend ratio converges slowly to a long-run mean
determined by fundamentals but it is dominated by noise over the short run. The conditional
volatility of this process is time-varying but the unconditional volatility is constant. Also,
consistently with the evidence reported in the following sections and chapters on squared
returns, conditional volatility is predictable. Noise is not necessarily normally
distributed.[13]
Using this simple model, the one-step ahead prediction for stock returns can be written
as:
$$\begin{aligned}
r^s_{t+1} &= \kappa_0 + \rho \left(a_{dp} + \phi\,(p_t - d_t) + \sigma_{2,t+1} \varepsilon_{2,t+1}\right) + a_{\Delta d} + \sigma_1 \varepsilon_{1,t+1} - (p_t - d_t) \\
&= \underbrace{\left[\kappa_0 + \rho a_{dp} + a_{\Delta d}\right]}_{\equiv \gamma_0} + (\rho\phi - 1)(p_t - d_t) + \underbrace{\left[\rho \sigma_{2,t+1} \varepsilon_{2,t+1} + \sigma_1 \varepsilon_{1,t+1}\right]}_{\equiv v_{t+1}} \\
&= \gamma_0 + (\rho\phi - 1)(p_t - d_t) + v_{t+1},
\end{aligned}$$
where, because of the properties of $\sigma_{2t+1} \varepsilon_{2t+1}$, it is clear that $v_{t+1} = \sigma_1 \varepsilon_{1t+1} + \rho \sigma_{2t+1} \varepsilon_{2t+1}$
inherits the heteroskedastic properties of $\sigma_{2t+1} \varepsilon_{2t+1}$. Therefore one-period ahead
forecasts of returns are dominated by the noise component $v_{t+1}$, which determines time-varying
volatility and very little predictability of future returns given current fundamentals.
Notice in fact that even though both $\rho$ and $\phi$ may be large and close to 1, because the
predictability coefficient of $r^s_{t+1}$ has structure $\rho\phi - 1$, this coefficient is likely to be small.
[13] The details of this specification and the technical content of a few of these claims will become
clear later on.
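A short simulation makes the one-step decomposition concrete. The sketch below simulates equations (4)-(7) with purely hypothetical parameter values and checks, period by period, that the realized return equals an intercept plus $(\rho\phi - 1)(p_t - d_t)$ plus the composite shock $v_{t+1}$. The variable names, the GARCH-type coefficients and the starting values are all assumptions; the intercept here also carries the mean dividend growth $a_{\Delta d}$, which follows from substituting (5) into (4).

```python
import math
import random

random.seed(1)

# Hypothetical parameter values, chosen only for illustration.
rho, phi = 25.0 / 26.0, 0.98
kappa0, a_d, sigma1 = 0.16, 0.005, 0.02
a_dp = (1.0 - phi) * math.log(25.0)
omega, alpha, beta = 1e-4, 0.10, 0.85      # variance recursion in (7)

slope = rho * phi - 1.0                    # predictability coefficient
intercept = kappa0 + rho * a_dp + a_d

pd = math.log(25.0)                        # log price-dividend ratio at t
var2 = omega / (1.0 - alpha - beta)        # start at the unconditional variance
max_gap = 0.0

for _ in range(500):
    e1, e2 = random.gauss(0.0, 1.0), random.gauss(0.0, 1.0)
    shock2 = math.sqrt(var2) * e2                       # sigma_{2,t+1} * eps_{2,t+1}
    dd_next = a_d + sigma1 * e1                         # equation (5)
    pd_next = a_dp + phi * pd + shock2                  # equation (6)
    r_next = kappa0 + rho * pd_next + dd_next - pd      # equation (4)
    v_next = sigma1 * e1 + rho * shock2                 # composite noise term
    max_gap = max(max_gap, abs(r_next - (intercept + slope * pd + v_next)))
    var2 = omega + beta * var2 + alpha * shock2 ** 2    # equation (7)
    pd = pd_next

print(slope, max_gap)
```

With these values the slope is roughly -0.06, so even a very persistent price-dividend ratio leaves little room for one-period predictability.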
Iterating equation (4) forward for m periods, we obtain a simple model for
forecasting long-run stock returns:
$$\sum_{j=1}^{m} \rho^{j-1} r^s_{t+j} = (d_t - p_t) + \sum_{j=1}^{m} \rho^{j-1} \Delta d_{t+j} + \kappa_0 \sum_{j=1}^{m} \rho^{j-1} + \rho^m (p_{t+m} - d_{t+m}) \qquad (8)$$
$$(p_{t+m} - d_{t+m}) = \phi^m (p_t - d_t) + a_{dp} \sum_{j=1}^{m} \phi^{j-1} + \sum_{j=1}^{m} \phi^{j-1} \sigma_{2t+m+1-j}\, \varepsilon_{2t+m+1-j}.$$
The model clearly shows that, in the absence of bubbles, i.e., when $\lim_{m \to \infty} \rho^m (p_{t+m} - d_{t+m}) = 0$,
expected long-run stock market returns, $E_t\left[\sum_{j=1}^{m} \rho^{j-1} r^s_{t+j}\right]$, depend on the current log
dividend-price ratio and on forecasts of future dividend growth rates, $E_t\left[\sum_{j=1}^{m} \rho^{j-1} \Delta d_{t+j}\right] = \sum_{j=1}^{m} \rho^{j-1} E_t[\Delta d_{t+j}]$. Moreover, note that the term
$$\sum_{j=1}^{m} \phi^{j-1} \sigma_{2t+m+1-j}\, \varepsilon_{2t+m+1-j}$$
[...] on the information available at time t (ignoring the constant $\kappa_0 \sum_{j=1}^{m} \rho^{j-1}$),
$$(d_t - p_t) + \sum_{j=1}^{m} \rho^{j-1} \Delta d_{t+j},$$
(almost surely). The process (8) also predicts that the residuals of such a predictive
regression will contain a moving average component that should be taken care of in estimation.
This is a well-known result (see for example, Valkanov, 2003). Interestingly, (8) also
predicts that the coefficient on the dividend-price ratio in the projection of long-horizon
returns on this variable should be increasing with the horizon. In this specification the
forecasting performance of the dividend-price ratio for stock market returns depends
crucially on the forecasting performance for dividend growth. Note that in the case in which
the dividend yield predicts expected dividend growth perfectly, the proposition that
returns are not predictable holds in the data. However, the available empirical evidence
tells us that the dividend yield does not predict dividend growth (see Cochrane,
2006), and this evidence is consistent with the parameterization that we have chosen, in
which the expected discounted dividend growth converges to a constant and fluctuations
in the dividend-price ratio are only related to fluctuations in discount rates. If variables
other than the dividend yield can predict dividend growth, then the combination of these
variables with the dividend yield delivers the best predicting model for the stock market
(see Lettau and Ludvigson, 2005). Figure 3 illustrates how well the main predictions of
the model hold in the data.
[14] When the average historical price-dividend ratio is sufficiently high, then $\rho \simeq 1$ so that
$\sum_{j=1}^{m} \rho^{j-1} r^s_{t+j} \simeq \sum_{j=1}^{m} r^s_{t+j}$, which are m-period returns. For instance, when $\overline{P/D} = 25$, which seems a plausible historical
value for many sectors, then $\rho = \frac{25}{1 + 25} \simeq 0.962$.
[Figure 3 omitted. Surviving panel labels: "Dividend Price vs 1-month Returns" and "10 Years Returns (t-10yrs,t)"; horizontal axis: Years.]
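The claim that the coefficient on the dividend-price ratio rises with the horizon can be made explicit under the assumptions of the model above. Combining the long-horizon return equation with the first-order autoregression for the price-dividend ratio, and assuming constant expected dividend growth, the time-t expectation of the discounted m-period return loads on $(d_t - p_t)$ with coefficient $1 - (\rho\phi)^m$. The sketch below evaluates this expression for a few horizons; the parameter values and the function name are hypothetical.

```python
def horizon_slope(rho, phi, m):
    """Implied coefficient on (d_t - p_t) for an m-period discounted return."""
    return 1.0 - (rho * phi) ** m

rho, phi = 25.0 / 26.0, 0.98     # hypothetical discount factor and persistence

# Horizons in months: 1 month, 1 year, 5 years, 10 years.
slopes = {m: horizon_slope(rho, phi, m) for m in (1, 12, 60, 120)}

for m, b in slopes.items():
    print(m, round(b, 4))
```

At m = 1 the coefficient equals $1 - \rho\phi$, the mirror image of the small one-period coefficient $\rho\phi - 1$ on $(p_t - d_t)$, and it approaches one as the horizon lengthens.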
4. Contagion Dynamics: A First Glance at the Joint Density of Returns
Contagion occurs when the degree of interdependence between two asset returns changes
over time and in particular when it changes on occasion of a crisis, i.e., when we record
a substantial surge in the volatility of returns in one asset. How do we model contagion?
Consider the following specification of the joint process of high-frequency returns on assets
i and j:
[...]
where we have boldfaced matrices. Note that the noise component of the two (say, stock)
markets displays a time-varying variance-covariance matrix, $\boldsymbol{\Sigma}_t$, as the diffusion of noise
is allowed to generate an interdependence between the two markets that is not constant
over time. In particular, the mean of $v^j_t$ conditional upon the realization of $v^i_t$ is:
$$E(v^j_t \mid v^i_t) = \frac{\sigma_{ij,t}}{\sigma^2_{i,t}}\, v^i_t,$$
which shows that the coefficient determining the interdependence between $r^j_t$ and $r^i_t$ could be
time-varying in the presence of a time-varying variance-covariance matrix. Interestingly,
the predictability of $\boldsymbol{\Sigma}_t$ could be used to forecast the future interdependence structure as well
as the exposure of market j to spikes in the volatility of market i.
To be precise, define interdependence as:
$$\frac{\sigma_{ij,t}}{\sigma^2_{i,t}} = \rho_{ij,t} \frac{\sigma_{jt}}{\sigma_{it}}.$$
Therefore constant interdependence implies that when the variance of market i increases
relative to the variance of j (i.e., a volatility "disease" strikes), the correlation between
the two asset returns increases. Therefore time-varying correlation is not necessarily an
indicator of contagion, i.e., of a structural change in interdependence in correspondence
of a volatility shock ("disease").
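The point that correlation can rise while interdependence stays constant is easy to verify numerically. The sketch below assumes a hypothetical linear link $r^j = b\, r^i + e$, with $e$ independent of $r^i$, keeps the coefficient $b$ and the idiosyncratic volatility fixed, and lets the volatility of market i grow. The function name and the numbers are illustrative assumptions.

```python
import math

def implied_correlation(b, sigma_i, sigma_e):
    """Correlation between r_i and r_j when r_j = b * r_i + e, e independent of r_i."""
    sigma_j = math.sqrt(b ** 2 * sigma_i ** 2 + sigma_e ** 2)
    return b * sigma_i / sigma_j

b, sigma_e = 0.5, 1.0                 # constant interdependence, fixed own noise
sigma_i_levels = (1.0, 2.0, 4.0)      # calm, stressed, crisis volatility in market i

corrs = [implied_correlation(b, s, sigma_e) for s in sigma_i_levels]

for s, c in zip(sigma_i_levels, corrs):
    print(s, round(c, 3))
```

The correlation climbs with the volatility of market i even though $b$ never changes, so a jump in measured correlation during a crisis is not by itself evidence of contagion.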
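To see this formally, let us introduce the notion of Cholesky decomposition. The Cholesky
algorithm implies that when we have a symmetric (positive definite) matrix,
$$\mathbf{M} = \begin{bmatrix} a & b \\ b & c \end{bmatrix},$$
we can write it as $\mathbf{M} = \mathbf{C}\mathbf{C}'$, where:
$$\mathbf{C} = \begin{bmatrix} \sqrt{a} & 0 \\ \dfrac{b}{\sqrt{a}} & \sqrt{c - \dfrac{b^2}{a}} \end{bmatrix}.$$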
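Using a Cholesky decomposition of the variance-covariance matrix $\boldsymbol{\Sigma}_t$, we can re-write
the simultaneous model for the returns in the two markets as a function of two shocks
orthogonal to each other. Then the measure of contagion described above is appropriate
in a data generating process (DGP) where return $r^i_t$ depends on an idiosyncratic structural shock only and return
$r^j_t$ depends on the shock to $r^i_t$ and on an idiosyncratic shock according to:
$$\begin{pmatrix} r^i_t \\ r^j_t \end{pmatrix} = \begin{pmatrix} \beta_{10} \\ \beta_{20} \end{pmatrix} + \begin{pmatrix} \beta_{11} & 0 \\ 0 & \beta_{21} \end{pmatrix} \begin{pmatrix} p^i_{t-1} - d^i_{t-1} \\ p^j_{t-1} - d^j_{t-1} \end{pmatrix} + \mathbf{H}_t \begin{pmatrix} \varepsilon_{it} \\ \varepsilon_{jt} \end{pmatrix} \qquad (9)$$
$$\mathbf{H}_t = \begin{bmatrix} \sigma_{it} & 0 \\ \rho_{ij,t}\, \sigma_{jt} & \sigma_{jt} \sqrt{1 - \rho^2_{ij,t}} \end{bmatrix}, \qquad \begin{pmatrix} \varepsilon_{it} \\ \varepsilon_{jt} \end{pmatrix} \sim D(\mathbf{0}, \mathbf{I}_2)$$
$$\boldsymbol{\Sigma}_t = \begin{bmatrix} \sigma^2_{it} & \rho_{ijt}\, \sigma_{it} \sigma_{jt} \\ \rho_{ijt}\, \sigma_{it} \sigma_{jt} & \sigma^2_{jt} \end{bmatrix}, \qquad \boldsymbol{\Sigma}_t = \mathbf{H}_t \mathbf{H}'_t.$$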
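This shows that a time-varying parameter predictive model for returns can be constructed
as a consequence of the time-varying structure of the covariance matrix. Interestingly
enough:
$$\mathbf{H}_t^{-1} = \begin{bmatrix} \dfrac{1}{\sigma_{it}} & 0 \\ -\dfrac{\rho_{ij,t}}{\sigma_{it} \sqrt{1 - \rho^2_{ij,t}}} & \dfrac{1}{\sigma_{jt} \sqrt{1 - \rho^2_{ij,t}}} \end{bmatrix}$$
and the contemporaneous impulse response (basically, the function that measures how a
variable reacts to a shock in another variable) of $r^j_t$ to $\varepsilon_{it}$ is the re-scaled time-varying
analogue of the OLS coefficient $\hat{b}_{ij,t} \equiv \sigma_{ij,t}/\sigma^2_{it}$:
$$\rho_{ij,t}\, \sigma_{jt} = \frac{\sigma_{ij,t}}{\sigma_{it} \sigma_{jt}}\, \sigma_{jt} = \frac{\sigma_{ij,t}}{\sigma_{it}} = \hat{b}_{ij,t}\, \sigma_{it}.$$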
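The algebra of the lower-triangular factor can be checked for one date with made-up numbers. The sketch below builds $\mathbf{H}_t$ from hypothetical volatilities and a hypothetical correlation, verifies that $\mathbf{H}_t \mathbf{H}'_t$ reproduces the covariance matrix, and confirms that the lower-left element equals the OLS-type coefficient times the volatility of market i. All names and values are assumptions.

```python
import math

def cholesky_factor(sigma_i, sigma_j, corr):
    """Lower-triangular factor H of the 2x2 covariance matrix."""
    return [[sigma_i, 0.0],
            [corr * sigma_j, sigma_j * math.sqrt(1.0 - corr ** 2)]]

def times_own_transpose(h):
    """Compute H H' for a 2x2 matrix stored as nested lists."""
    return [[sum(h[r][k] * h[c][k] for k in range(2)) for c in range(2)]
            for r in range(2)]

sigma_i, sigma_j, corr = 0.02, 0.03, 0.6     # hypothetical values for one date t

H = cholesky_factor(sigma_i, sigma_j, corr)
cov = times_own_transpose(H)

sigma_ij = corr * sigma_i * sigma_j          # covariance between the two returns
b_hat = sigma_ij / sigma_i ** 2              # OLS-type coefficient of r_j on r_i
impulse = H[1][0]                            # response of r_j to a unit shock eps_i

print(cov, b_hat, impulse)
```

Because the shock to market i enters market j only through the lower-left element, any forecast of the covariance matrix translates directly into a forecast of this exposure.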
Consider the case of an investor who adopts a buy-and-hold portfolio strategy for a single
period of any fixed length (the length is not a decision variable in the asset allocation)
from time t to time T. Let us denote by r the random vector of linear total returns
from a given menu of N risky assets over the interval [t, T], $r \sim D(\mu, \Sigma)$.[15]
The investor can also invest at time t in a security the price of which at T is known at t
(typically, a non-defaultable Aaa bond), called the risk-free security. Let $r^f$ be the discretely
compounded non-random return from this investment over every single period. Short
sales are admitted without any constraints. For simplicity, we ignore transaction costs
and any other frictions.
The investor's strategy is to invest in the riskfree bond and in the N risky assets
(stocks) at time t and then liquidate the investment at time T. The relative weights
invested in each of the risky assets are collected in the column vector w, while $(1 - w'e_N)$
is the relative amount invested in the riskfree security ($e_N$ is an $N \times 1$ column vector of
ones). Given a degree of risk aversion $\lambda$, a standard mean-variance description of this
allocation problem is the following:
$$\max_{w} \; (1 - w'e)\, r^f + w'\mu - \frac{1}{2} \lambda\, (w' \Sigma w)$$
where $E[r_p] = (1 - w'e)\, r^f + w'\mu = r^f + w'(\mu - r^f e)$ and $Var[r_p] = w' \Sigma w$ are the mean and
the variance of the portfolio return $r_p$.[16] In this mean-variance
setup the distribution of returns is fully described by the first two moments, and
the minimization of the variance of the portfolio implies that by ruling out big losses also
big gains are ruled out. The solution of this problem determines the portfolio weights in
terms of the preferences of the investor, as captured by the parameter $\lambda$, and the (known)
mean and covariance matrix describing the joint distribution of returns. Because
this is an unconstrained convex problem, the first-order conditions (FOCs) are necessary
and sufficient and define the following system of N linear equations in N unknowns, the
portfolio weights $w \in \mathbb{R}^N$:
$$(\mu - r^f e) - \lambda \Sigma w = 0,$$
where $\mu - r^f e$ defines the vector of risk premia for the N risky assets. Solving for the weights gives
$$\hat{w} = \frac{1}{\lambda} \Sigma^{-1} (\mu - r^f e). \qquad (10)$$
(10) defines the
solution to a standard mean-variance portfolio program and it is one of the most crucial
and commonly used results in all of financial economics. Of course, in order to make
this approach to portfolio allocation operational, knowledge of $\lambda$ needs to be paired with
estimates (better, forecasts of future values) of $\mu$ and $\Sigma$ (or $\mu - r^f e$, when more convenient
or appropriate).
[15] Notice that in $D(\mu, \Sigma)$, D is not necessarily multivariate normal.
[16] The presence of the 1/2 coefficient in the objective function simplifies the problem and has no material
effects on the qualitative findings.
Consider now the special case in which $\hat{w}'e = 1$, that is, no investment in the riskfree
bond is allowed. As you hopefully recall from your theory of finance class, this is the
famous tangency portfolio:
$$e'\hat{w} = e' \frac{1}{\lambda} \Sigma^{-1} (\mu - r^f e) = 1 \implies \lambda = e' \Sigma^{-1} (\mu - r^f e).$$
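For two risky assets the whole calculation fits in a few lines. The sketch below solves the first-order conditions for a given risk aversion and then rescales the same vector so that the weights sum to one, which is the tangency case. The expected returns, the covariance matrix, the risk-free rate and the risk aversion coefficient are hypothetical numbers, and the helper functions are illustrative rather than part of the course code.

```python
def inverse_2x2(m):
    """Inverse of a 2x2 matrix stored as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]

def matvec(m, v):
    """Matrix-vector product for nested lists."""
    return [sum(m[r][k] * v[k] for k in range(len(v))) for r in range(len(m))]

mu = [0.08, 0.05]                        # hypothetical expected returns
rf = 0.02                                # hypothetical risk-free rate
cov = [[0.04, 0.006], [0.006, 0.01]]     # hypothetical covariance matrix

premia = [x - rf for x in mu]                     # vector of risk premia
scaled = matvec(inverse_2x2(cov), premia)         # inverse covariance times premia

risk_aversion = 4.0                               # hypothetical lambda
weights = [s / risk_aversion for s in scaled]     # unconstrained optimum
riskfree_weight = 1.0 - sum(weights)

# First-order condition residual: premia - lambda * cov * w should be zero.
foc = [p - risk_aversion * x for p, x in zip(premia, matvec(cov, weights))]

lambda_tangency = sum(scaled)                     # risk aversion implying w'e = 1
tangency = [s / lambda_tangency for s in scaled]

print(weights, riskfree_weight, tangency)
```

Changing the risk aversion only rescales the risky weights: the composition of the risky part of the portfolio is the same for every investor, and risk aversion decides how much of it is held against the riskfree bond.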
The simplest approach to the problem of finding numerical counterparts
(better, estimates) for $\mu$ and $\Sigma$ is the use of historical moment estimators. Notice that the
underlying assumption is that of extreme stability of the environment, which would allow
us to use as forecasts of future means, variances, and covariances, estimates derived from
the past. Almost no complex econometrics is needed to this end.[17] This approach used
to be justified by a view prevalent in the 1960s and 1970s on the behaviour of asset prices
and financial returns that Cochrane (1999) summarized as follows:
The CAPM provides a good measure of risk and thus a good explanation for why
some stocks earn higher average returns than others according to the simple model
$\mu - r^f e = \beta \odot [(\mu_M - r^f)\, e]$;
Volatility and covariances are approximately constant over time, i.e., $\Sigma_t \equiv Var[r_{t+1} - r^f e \mid I_t]
= Var[r_{t+1} - r^f e] = \Sigma$.
[17] This is of course a matter of perspective: even simple sample-based estimators of means, variances
and covariances are estimators in a statistical sense, and hence (because these are functions of random
samples) they are random variables and they have both joint and marginal probability distributions
(densities). Yet, computing them is indeed straightforward.
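Historical moment estimators take only a few lines to compute. The sketch below estimates the mean vector and the covariance matrix from a tiny made-up sample of returns on two assets; the asset labels, the data and the function names are hypothetical and stand in for the series in the course database.

```python
def sample_mean(xs):
    """Arithmetic average of a list of returns."""
    return sum(xs) / len(xs)

def sample_cov(xs, ys):
    """Unbiased sample covariance (divides by T - 1)."""
    mx, my = sample_mean(xs), sample_mean(ys)
    return sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / (len(xs) - 1)

# Hypothetical return histories for two assets.
returns = {
    "A": [0.02, -0.01, 0.03, 0.00],
    "B": [0.01, 0.00, 0.02, -0.01],
}

names = sorted(returns)
mu_hat = [sample_mean(returns[n]) for n in names]
sigma_hat = [[sample_cov(returns[a], returns[b]) for b in names] for a in names]

print(mu_hat)
print(sigma_hat)
```

Using these estimates as forecasts is exactly the stability assumption discussed above: the future is taken to have the same means, variances and covariances as the sample.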
What is the econometric model specification that supports this traditional view from
the 1970s? Consider the following simultaneous equations linear regression model for a
sample of size T concerning observations on a vector of N returns:19
$$y^+ = X^+\beta^+ + u^+, \qquad (11)$$
where $y^+$ is an $(NT \times 1)$ vector, $X^+$ is an $NT \times \sum_{i=1}^{N} K_i$ matrix ($K_i$ is the number of regressors
available at each point in time), $\beta^+$ is a $\sum_{i=1}^{N} K_i \times 1$ vector of unknown parameters, and $u^+$
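18
As an example, consider the triangular matrix
$$A = \begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}$$
and the column vector $x = [-4 \;\; 4]'$. We know that normally, the product would give
$$\begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix}\begin{bmatrix} -4 \\ 4 \end{bmatrix} = \begin{bmatrix} 4 \\ 12 \end{bmatrix}.$$
In this case the product is not even emphasized in the notation, i.e., $Ax = A \times x$. Using the dot product,
we obtain instead
$$\begin{bmatrix} 1 & 2 \\ 0 & 3 \end{bmatrix} \odot \begin{bmatrix} -4 \\ 4 \end{bmatrix} = \begin{bmatrix} -4 & -8 \\ 0 & 12 \end{bmatrix},$$
i.e., each column of the matrix is multiplied element by element by the column vector.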
19
In the lecture notes, it is possible that the number of assets G may have been called N. To impose
some uniformity in notations, we simply set N = G here and the two carry no special meaning or
differentiation.
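is an $(NT \times 1)$ vector of residuals:
$$y^+ = \begin{bmatrix} y_1 \\ y_2 \\ \vdots \\ y_N \end{bmatrix}, \quad X^+ = \begin{bmatrix} X_1 & 0 & \cdots & 0 \\ 0 & X_2 & \cdots & 0 \\ \vdots & \vdots & \ddots & \vdots \\ 0 & 0 & \cdots & X_N \end{bmatrix}, \quad \beta^+ = \begin{bmatrix} \beta_1 \\ \beta_2 \\ \vdots \\ \beta_N \end{bmatrix}, \quad u^+ = \begin{bmatrix} u_1 \\ u_2 \\ \vdots \\ u_N \end{bmatrix}.$$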
Clearly, all vectors and matrices are simply stacking N times the T-long histories for each
of the return series, each of the explanatory variables, as well as the residuals. Also notice
the special diagonal structure of $X^+$ (the + indeed stands for "augmented", to emphasize
that the stacking operation has allowed us to capture in one single system of simultaneous
regression models, N such models), each with structure:
$$y_i = X_i\beta_i + u_i, \qquad i = 1, 2, ..., N.$$
(11) is then easy to use to derive inferences on means, variances and covariances:
$$E[y^+] = X^+\beta^+, \qquad Var[y^+] = Var[u^+],$$
assuming that the regressors collected in $X^+$ are predetermined (e.g., this will be the case
when $X^+$ simply collects past values of asset returns themselves). Because in this model
$\Sigma_t \equiv Var[y^+ \mid I_t] = Var[u^+]$, the third view is automatically enforced.
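If $u_i$ is assumed to have standard white noise properties, i.e., $E[u_i] = 0$ and $E[u_i u_j'] = \sigma_{ij} I_T$
(i.e., all residuals are not serially correlated although they can be contemporaneously
correlated) where $\sigma_{ii} = \sigma_i^2$, then the following properties hold for $u^+$:
$$E[u^+] = 0_{NT},$$
$$E[u^+(u^+)'] = \begin{bmatrix} E(u_1u_1') & E(u_1u_2') & \cdots & E(u_1u_N') \\ E(u_2u_1') & E(u_2u_2') & \cdots & \vdots \\ \vdots & \vdots & \ddots & \vdots \\ E(u_Nu_1') & \cdots & \cdots & E(u_Nu_N') \end{bmatrix} = \begin{bmatrix} \sigma_{11}I_T & \sigma_{12}I_T & \cdots & \sigma_{1N}I_T \\ \sigma_{21}I_T & \sigma_{22}I_T & \cdots & \vdots \\ \vdots & \vdots & \ddots & \vdots \\ \sigma_{N1}I_T & \cdots & \cdots & \sigma_{NN}I_T \end{bmatrix} = \Sigma \otimes I_T,$$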
where each block of the covariance matrix $E[u^+(u^+)']$ is $T \times T$ by construction. Here
$\otimes$ denotes a standard Kronecker product.20 $\Sigma$ is a non-singular covariance matrix. Notice
that $\Sigma$ will be a full matrix (i.e., it will not be simply a diagonal matrix) when the shocks
hitting the returns on different risky assets are potentially simultaneously correlated, so
that the off-diagonal elements of $\Sigma$ are non-zero. In this case, you shall remember that least
squares estimation of the system (11) must be performed by Generalized Least Squares
(GLS), yielding
$$\hat{\beta}^+ = [(X^+)'(\Sigma \otimes I_T)^{-1}X^+]^{-1}[(X^+)'(\Sigma \otimes I_T)^{-1}y^+].$$
The simplest model mentioned above, in which $E[r_{t+1} \mid I_t] = E[r_{t+1}] = \mu$, implies that
trivially no regression will be able to forecast or explain risky returns and as such
$X_1 = X_2 = ... = X_N = e_T$; it also assumes that all residuals are both contemporaneously and
serially uncorrelated, with diagonal covariance matrix $\Sigma_d \equiv diag\{\sigma_{11}, \sigma_{22}, ..., \sigma_{NN}\}$.21
Then, because of the diagonal structure of $\Sigma_d$, one can show that GLS estimation of $\beta^+$
20
For instance
$$\begin{bmatrix} a & b \\ c & d \end{bmatrix} \otimes \begin{bmatrix} h & i & l \\ m & n & o \\ p & q & r \end{bmatrix} = \begin{bmatrix} ah & ai & al & bh & bi & bl \\ am & an & ao & bm & bn & bo \\ ap & aq & ar & bp & bq & br \\ ch & ci & cl & dh & di & dl \\ cm & cn & co & dm & dn & do \\ cp & cq & cr & dp & dq & dr \end{bmatrix}.$$
If you contemplate the result for a while you understand the meaning of the diffusive operation. Notice
that the Kronecker product of an $N_r \times N_c$ matrix by an $M_r \times M_c$ matrix gives a new $N_r M_r \times N_c M_c$ matrix.
21
Alternatively, the first classical view would imply $X_i = r^f e + \beta_i[(\mu_M - r^f)e]$, where $\beta_i$ is the CAPM beta of
the ith asset. However, although not exactly equal to a vector of ones, this is also a constant and the estimation
of $\mu_i$ described below may be interpreted as an attempt to simply estimate $\mu_i = r^f + \beta_i(\mu_M - r^f)$ from the
data, assuming that the CAPM holds.
simply unwinds into application of classical OLS equation by equation. Consider for
example the observations on the ith return:
$$y_i = e_T\mu_i + u_i,$$
where
$$y_i = \begin{bmatrix} y_{i1} \\ y_{i2} \\ \vdots \\ y_{iT} \end{bmatrix}, \qquad X_i = e_T = \begin{bmatrix} 1 \\ 1 \\ \vdots \\ 1 \end{bmatrix}.$$
The OLS estimates of the relevant parameters are then simply
$$\hat{\mu}_i = \frac{1}{T}\sum_{t=1}^{T} r_{it} = \bar{r}_i, \qquad \hat{\sigma}_{ii} = \hat{\sigma}_i^2 = \frac{1}{T}\sum_{t=1}^{T}(r_{it} - \bar{r}_i)^2.$$
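The claim that GLS on the stacked system collapses to equation-by-equation OLS can be checked numerically. The MATLAB sketch below builds $y^+$ and $X^+$ with kron for simulated returns and compares the GLS estimate with the sample means. The data, the dimensions T and N, and all variable names are assumptions made for illustration.

```matlab
% Simulated data for illustration only (T, N and all values are assumptions).
rng(1);                          % fix the seed for reproducibility
T = 200;  N = 3;
R = 0.01 + 0.05*randn(T, N);     % T x N matrix of artificial returns

% Stack the N return histories: y+ is NT x 1, X+ is block diagonal
yplus = R(:);                    % [y_1; y_2; ...; y_N]
Xplus = kron(eye(N), ones(T,1)); % X_i = e_T for every asset

% GLS with a diagonal covariance matrix Sigma_d (x) I_T
Sigma_d = diag(var(R, 1));       % sample variances with divisor T
Omega   = kron(Sigma_d, eye(T)); % NT x NT covariance matrix of u+
W       = Omega \ Xplus;         % Omega^{-1} * X+
b_gls   = (Xplus' * W) \ (W' * yplus);

% Equation-by-equation OLS on a constant is the sample mean
b_ols   = mean(R)';

disp(max(abs(b_gls - b_ols)));   % difference is numerically negligible
```

Forming the full NT x NT matrix Omega is wasteful for large samples; it is done here only to mirror the notation of the text.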
The traditional, simple approach to portfolio allocation can lead to dramatic swings in
optimal portfolio weights for small changes in investment views and conditions, as given
by the estimates/forecasts of $\mu$ and $\Sigma$. There is a simple reason for this common finding:
too much sampling error in the estimation of the vector of expected returns and, due to
this, an asset allocation which is idiosyncratic to the specific estimation sample. This
result is easily understood by using regression analysis to obtain a confidence interval on
the estimates of the mean returns obtained within the simple econometric model presented
in Section 6: the standard error associated with the OLS estimate $\hat{\mu}_i = \bar{r}_i$ is typically large.
Luckily, there are solutions to this problem. There are two different approaches: the first
is to use methods that keep the simplest possible estimates of $\mu$ and $\Sigma$ but fully recognize
that the resulting estimates are simply realizations of sample estimators that may imply
considerable parameter (also called estimation) uncertainty; the second approach consists
of allowing for more complex econometric models of returns that are capable of exploiting
predictability. In this section we discuss both approaches, although as far as the second
solution is concerned we simply offer a rather common and powerful example (Black and
Litterman's framework), leaving it to the rest of this book project to show additional
avenues and models able to pin down any predictability in risky asset returns.
22
However, setting $\Sigma_d \equiv diag\{\sigma_{11}, \sigma_{22}, ..., \sigma_{NN}\}$ implies an important restriction: we are setting
$\sigma_{ij} = 0$ for all pairs $i \neq j$ even though the data may not support this condition.
7.1. The resampled optimal mean-variance portfolio
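$$\begin{aligned} r_{1t} &= \hat{\mu}_1 + \hat{u}_{1t} \\ r_{2t} &= \hat{\mu}_2 + \hat{u}_{2t} \\ &\;\;\vdots \\ r_{Nt} &= \hat{\mu}_N + \hat{u}_{Nt} \end{aligned}$$
$$[\hat{u}_{1t} \;\; \hat{u}_{2t} \;\; \cdots \;\; \hat{u}_{Nt}]' \sim N(0, \hat{\Sigma}).$$
Notice that the fact that each observed return can always be decomposed as $r_{it} = \hat{\mu}_i + \hat{u}_{it}$
obtains by definition. Moreover, standard least squares algebra shows that in this case
$E[\hat{u}_t] = 0$ as claimed. The idea of resampling the mean-variance portfolio solution is
not to stop at replacing $\mu$ and $\Sigma$ with $\hat{\mu} \equiv [\hat{\mu}_1 \; \hat{\mu}_2 \; ... \; \hat{\mu}_N]'$ and $\hat{\Sigma}$ in the classical formula
$\hat{w} = \frac{1}{\lambda}\hat{\Sigma}^{-1}(\hat{\mu} - r^f e)$ but to implement the following algorithm. Collect the residuals
from estimation in the following $T \times N$ matrix:
$$\hat{U} \equiv \begin{bmatrix} \hat{u}_{11} & \hat{u}_{21} & \cdots & \hat{u}_{N1} \\ \hat{u}_{12} & \hat{u}_{22} & \cdots & \hat{u}_{N2} \\ \vdots & \vdots & \cdots & \vdots \\ \hat{u}_{1T} & \hat{u}_{2T} & \cdots & \hat{u}_{NT} \end{bmatrix}.$$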
At this point, draw a new sample of size T of residuals by extracting randomly T rows
from $\hat{U}$. Extracting entire rows at random and with replacement is important because it
ensures that the contemporaneous covariance structure of the residuals is preserved. Given these new,
re-sampled residuals collected in a vector $\hat{u}_{1t}$ ($t = 1, 2, ..., T$) and the estimates $\hat{\mu}$, we proceed
to generate a new artificial sample of returns using
$$r_{1t} = \hat{\mu} + \hat{u}_{1t},$$
where the subscript "1" alludes to the fact that this represents the first iteration of the
algorithm. At this point, a new OLS estimation of the model is performed on this artificial
data, obtaining as an outcome a pair of new, bootstrapped estimates, $\hat{\mu}_1$ and $\hat{\Sigma}_1$ and,
using the classical formula, $\hat{w}_1$. At this point the algorithm is iterated a second time,
re-sampling a new set of residuals $\hat{u}_{2t}$ ($t = 1, 2, ..., T$) to obtain $\hat{\mu}_2$, $\hat{\Sigma}_2$, and $\hat{w}_2$. This
algorithm is then replicated B times, where B is in general a large number (let's say 5,000
or 10,000 times), using the fact that at the bth iteration one simply draws a new sample
of size T of residuals by extracting randomly T rows from $\hat{U}$, to generate a new artificial
sample of returns using
$$r_{bt} = \hat{\mu} + \hat{u}_{bt}.$$
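The resampling loop described above can be sketched in MATLAB as follows. This is a minimal illustration under stated assumptions: the returns are simulated, the values of T, N, B, rf and lambda are arbitrary, and the variable names are hypothetical. How the B weight vectors are subsequently combined is not shown in the text above, so the sketch stops at collecting them.

```matlab
% Artificial inputs for illustration (all numbers are assumptions).
rng(42);
T = 120;  N = 3;  B = 1000;  rf = 0;  lambda = 2.5;
R = 0.005 + 0.04*randn(T, N);         % stand-in for observed returns

mu_hat = mean(R);                     % 1 x N OLS estimates of the means
U_hat  = R - repmat(mu_hat, T, 1);    % T x N matrix of residuals

W = zeros(B, N);                      % one row of weights per replication
for b = 1:B
    rows  = randi(T, T, 1);           % T row indices, drawn with replacement
    R_b   = repmat(mu_hat, T, 1) + U_hat(rows, :);  % artificial sample
    mu_b  = mean(R_b)';               % bootstrapped mean vector
    Sig_b = cov(R_b, 1);              % bootstrapped covariance (divisor T)
    W(b, :) = ((1/lambda) * (Sig_b \ (mu_b - rf)))';  % classical formula
end

disp(std(W));   % dispersion of the weights across bootstrap samples
```

Indexing U_hat by whole rows is what keeps the cross-asset covariance of the residuals intact in each artificial sample.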
The traditional, simple mean-variance approach to portfolio allocation can lead to dramatic
swings in portfolio weights for small changes in the investment views as given by
the estimates of $\mu$ and $\Sigma$. Black and Litterman's model (see Black and Litterman,
1990, and Black and Litterman, 1991) was developed to provide a systematic solution
to this problem. The basic idea is that of using as a starting point the market portfolio
composition (i.e., the most general, value-weighted portfolio of all risky assets) to inform
one's ex-ante beliefs on what is "plausible" and express an investor's own views as departures
from this allocation. The main contribution of the method is to discipline the asset
manager's actions. A numerical specification of these views and of the confidence in such
views is also required to implement the approach. The Bayesian algorithm proposed by
Black and Litterman then ensures the most efficient implementation of the expressed
views into a vector of portfolio weights.
The starting intuition of Black and Litterman's framework is that we are not forced
to estimate (forecast) expected returns: given our knowledge of total capitalization of different
markets, it is possible to obtain expected returns by reverse engineering the optimal
portfolio allocation formula. In practice, given the knowledge of the market capitalization
and therefore of the market weights $w_{mkt}$ and some estimates of the variance-covariance
matrix of returns, we can use the optimal portfolio allocation condition to derive the
expected returns consistent with the market capitalization:
$$w_{mkt} = \frac{1}{\lambda}\Sigma^{-1}(\mu_{mkt} - r^f e) \;\Longrightarrow\; \mu_{mkt} = \lambda\Sigma w_{mkt} + r^f e.$$
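The reverse-engineering step can be illustrated with a few lines of MATLAB. The covariance matrix, market weights, risk aversion and risk-free rate below are invented numbers used only to show the mechanics; the round trip back to the weights confirms that the two expressions are consistent.

```matlab
% Hypothetical inputs (not from the text): three risky assets.
Sigma  = [0.040 0.010 0.005;
          0.010 0.030 0.008;
          0.005 0.008 0.020];    % covariance matrix of returns
w_mkt  = [0.5; 0.3; 0.2];        % market-capitalization weights
lambda = 2.5;                    % risk aversion coefficient
rf     = 0.02;                   % risk-free rate
e      = ones(3,1);

% Expected returns implied by the market weights
mu_mkt = lambda * Sigma * w_mkt + rf * e;

% Plugging mu_mkt back into the allocation formula returns w_mkt
w_check = (1/lambda) * (Sigma \ (mu_mkt - rf * e));
disp([w_mkt, w_check]);
```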
Assume now that the portfolio manager holds some (normally distributed, for simplicity)
views on a subset of size $Q \leq N$ of the N expected returns included in the market
portfolio:
$$P\,r \sim N_Q(v, \Gamma),$$
$$r \sim N(\mu_{mkt}, \tau\Sigma),$$
where $\Gamma$ denotes the covariance matrix of the views and $\tau$ is a scalar smaller than one (and
conventionally set to 1/3 by Black and Litterman and most of the subsequent literature) to filter out of the estimated covariance
matrix of returns the impact of their random variation (i.e., to take into account the effect
of noise in small samples).
Black and Litterman's approach aims then at generating a value for the expected
return vector $\mu_{BL}$ by optimally combining the distribution of returns implied in the
market capitalization and the subjective views of the portfolio manager. This is obtained
by solving the following optimization problem:
$$\mu_{BL} = \arg\min_{\mu}\; (\mu - \mu_{mkt})'(\tau\Sigma)^{-1}(\mu - \mu_{mkt}) + (P\mu - v)'\Gamma^{-1}(P\mu - v).$$
This is a weighted least squares problem, where the weights depend on the covariance matrices $\tau\Sigma$ and $\Gamma$.
When the diagonal elements of $\Gamma$ all approach zero, that is, when there is infinite confidence
in the subjective views by the investor, the problem becomes a constrained least
squares problem where the relevant constraint is $P\mu_{BL} = v$. On the other hand, when $\Gamma$
has diagonal elements diverging to infinity (no confidence in the views), the solution to
the problem is simply $\mu_{BL} = \mu_{mkt}$.
The first order conditions for the solution of the problem can be written as follows:
$$2(\tau\Sigma)^{-1}(\hat{\mu}_{BL} - \mu_{mkt}) + 2P'\Gamma^{-1}(P\hat{\mu}_{BL} - v) = 0.$$
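Solving the first order conditions for $\hat{\mu}_{BL}$ gives $\hat{\mu}_{BL} = [(\tau\Sigma)^{-1} + P'\Gamma^{-1}P]^{-1}[(\tau\Sigma)^{-1}\mu_{mkt} + P'\Gamma^{-1}v]$, which by a standard matrix inversion identity can also be written as $\mu_{mkt} + \tau\Sigma P'(\tau P\Sigma P' + \Gamma)^{-1}(v - P\mu_{mkt})$. The MATLAB sketch below checks numerically that the two forms agree; all inputs are hypothetical numbers chosen for illustration.

```matlab
% Hypothetical inputs (assumptions for illustration only).
Sigma  = [0.040 0.010 0.005;
          0.010 0.030 0.008;
          0.005 0.008 0.020];
mu_mkt = [0.06; 0.05; 0.03];     % returns implied by market weights
tau    = 1/3;
P      = [1 0 0; 0 1 0];         % views on the first two assets only
v      = [0.08; 0.04];           % expected returns stated in the views
Gamma  = diag([0.02^2, 0.03^2]); % covariance matrix of the views

% Form 1: solve the first-order condition directly
A  = inv(tau*Sigma) + P' * (Gamma \ P);
b  = (tau*Sigma) \ mu_mkt + P' * (Gamma \ v);
mu_BL1 = A \ b;

% Form 2: market returns plus a correction driven by the views
K  = tau*Sigma*P' / (tau*P*Sigma*P' + Gamma);
mu_BL2 = mu_mkt + K * (v - P*mu_mkt);

disp([mu_BL1, mu_BL2]);          % the two columns coincide
```

The second form makes the logic of the approach visible: the implied market returns are shifted only to the extent that the views v differ from what the market already implies, P*mu_mkt.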
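At this point, given $\hat{\mu}_{BL}$ the optimal BL portfolio weights are obtained by the usual
formula:
$$\hat{w}_{BL} = \frac{1}{\lambda}\hat{\Sigma}^{-1}(\hat{\mu}_{BL} - r^f e) \quad \text{or} \quad \hat{w}^T_{BL} = \frac{\hat{\Sigma}^{-1}(\hat{\mu}_{BL} - r^f e)}{e'\hat{\Sigma}^{-1}(\hat{\mu}_{BL} - r^f e)}.$$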
There is an important discrepancy between our view on the relative importance of noise
and information in determining financial returns and the simple asset allocation model
presented in Section 7. On the one hand, in the simple asset allocation model, the
econometric framework considered for returns is as follows:23
where $r_{t,t+k}$ denotes the return between time t and t+k of a risky asset (stock) or portfolio.
In the continuously compounded case, $r_{t,t+k} \equiv \sum_{j=1}^{k} r_{t+j}$, which we have called a long-horizon
return. On the other hand, a more general model has emerged from our discussion
of the Dynamic Dividend Growth model that can be written as follows:
$$r_{t,t+k} = \alpha + \beta'X_t + \sigma_{t,k}u_{t+k}, \qquad u_{t+k} \sim IID\; N(0, 1),$$
where $X_t$ is a set of predictors observed at time t. We know that as the horizon k increases,
predictability increases and therefore the uncertainty related to the unexpected
23
During the lectures, it is possible that the sum of IIDness of returns and of normality has also been
denoted as $u_{t+k} \sim n.i.d.(0, 1)$. Note that IID N(0, 1) and n.i.d.(0, 1) have identical meaning.
components of returns decreases (i.e., the annualized variance of returns is a downward
sloping function of the horizon). Moreover, as we have already discussed, the dependence
of $\sigma_{t,k}$ on time (i.e., its time-varying nature) declines and long-horizon returns can
be described as a (conditionally) normal homoskedastic process. In the short run noise
dominates and modelling returns on the basis of fundamentals is very difficult. However,
as the horizon increases fundamentals become more important to explain returns and the
risk associated with portfolio allocation based on econometric models is reduced. The statistical
model becomes more and more precise as k gets large. As a result, optimal portfolio
weights can be based on forecasting models that are more articulated than the simple one
considered so far and based on time-varying predictors. The following chapters/lectures
guide you through such more articulated models.
Having solved the portfolio problem and having committed to a given allocation described
by $\hat{w}$, there is a different role that econometrics can play at high frequencies: measuring
volatility and providing information on portfolio risk. In this subsection we briefly review some
basic notions related to the use of econometrics in risk management applications, by
recalling a few simple notions that become relevant later on. The role of econometrics
in applied risk management is best seen through a different statistical model of high
frequency returns. Let the return on the portfolio be distributed as
$$R_p \sim D(\mu_p, \sigma_p^2), \qquad \mu_p = \hat{w}'\mu, \qquad \sigma_p^2 = \hat{w}'\Sigma\hat{w}.$$
When k is small (i.e., when one is considering intra-daily, daily, weekly
or at most monthly returns) the following framework is normally referred to:
1. The distribution of returns is centered around a mean of zero, and the zero mean
model dominates any alternative model based on predictors.
2. The variance is time-varying and predictable, given the information set, $I_t$, available
at time t.
3. The distribution of returns at high frequency is not normal, i.e., D(0, 1) may often
differ from N(0, 1).
Given these features of the data, econometrics can still be used at high frequency
to assess the risk of a given portfolio. In particular, we shall investigate the role of
econometrics for deriving the Value-at-Risk (VaR) of a given portfolio. In the following,
we do not specify to what horizon the VaR refers, although for simplicity one may
simply refer to a horizon that is equal to the frequency at which the data are sampled.
The VaR is the percentage loss that is exceeded with a probability of at most $\alpha$ percent:
$$\Pr(R_p < -VaR_\alpha) = \alpha.$$
If portfolio returns are normally distributed, $R_p \sim N(\mu_p, \sigma_p^2)$, then
$$\Pr(R_p < -VaR_\alpha) = \alpha \iff \Pr\left(\frac{R_p - \mu_p}{\sigma_p} < -\frac{VaR_\alpha + \mu_p}{\sigma_p}\right) = \alpha \iff \Phi\left(-\frac{VaR_\alpha + \mu_p}{\sigma_p}\right) = \alpha,$$
where $\Phi(\cdot)$ is the cumulative distribution function of a standard normal.24 At this point, defining $\Phi^{-1}(\cdot)$
as the inverse CDF function of a standard normal, we have that
$$-\frac{VaR_\alpha + \mu_p}{\sigma_p} = \Phi^{-1}(\alpha) \iff VaR_\alpha = -\mu_p - \sigma_p\Phi^{-1}(\alpha).$$
This shows that under normality, the $\alpha$-percent $VaR_\alpha$ is given by an intercept, represented
by the opposite of the mean portfolio return over the relevant horizon (say, 1 month),
minus the product between the estimated (or predicted) volatility of the portfolio $\sigma_p$ and
the critical value under a standard normal CDF that leaves a total mass of $\alpha$ below such a
critical point, $\Phi^{-1}(\alpha)$. Notice that for $\alpha < 0.5$ (as it is normally the case), $\Phi^{-1}(\alpha) < 0$, so
that $VaR_\alpha$ is actually monotone increasing in $\sigma_p$ but monotone decreasing in $\mu_p$. This is
intuitively sensible: given $\alpha < 0.5$, a higher period expected return decreases risk for given
volatility, because this shifts the location of the distribution of future portfolio returns to
the right; a higher period volatility increases risk for given expected returns, because this
fattens the tails of the distribution of future portfolio returns for given location.
24
The notation $\iff$ emphasizes the equivalence across alternative expressions. In what follows we deal
with the generic $\alpha$-percent VaR in place of a p-percent VaR to avoid the confusion between p denoting
tail probabilities vs. p denoting variables that refer to a portfolio.
For instance, because $\Phi^{-1}(0.01) = -2.33$, we can easily obtain VaR if we have available
estimates of the first and second moments of the distribution of portfolio returns:
$$\widehat{VaR}_{0.01} = -\hat{\mu}_p + 2.33\hat{\sigma}_p.$$
As the best predictor for mean portfolio returns is approximately zero when the horizon
is sufficiently small (say at intra-daily or daily frequencies), then $\widehat{VaR}_{0.01} \simeq 2.33\hat{\sigma}_p$ and
computing such an estimate of the VaR for any given portfolio under the null of normality
essentially only means to model and forecast volatility. In fact we shall proceed to model
volatility in the case of normal distribution of returns as a first step in the course. We
shall then remove the hypothesis of normality and proceed to model VaR when volatility
is predictable and returns are not normally distributed. Multivariate models and time-varying
correlations will also be discussed in due course.
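The normal VaR formula is easy to evaluate numerically. The MATLAB sketch below uses made-up values for the portfolio mean and volatility, and obtains the standard normal quantile from the base function erfcinv, so that no toolbox is assumed to be installed.

```matlab
% Hypothetical inputs: monthly mean and volatility of portfolio returns.
mu_p    = 0.005;
sigma_p = 0.04;
alpha   = 0.01;

% Inverse standard normal CDF written with base MATLAB's erfcinv,
% so that no toolbox is required: Phi^{-1}(a) = -sqrt(2)*erfcinv(2a)
z_alpha = -sqrt(2) * erfcinv(2*alpha);   % about -2.33 for alpha = 0.01

VaR     = -mu_p - sigma_p * z_alpha;     % normal VaR with estimated mean
VaR0    = -sigma_p * z_alpha;            % zero-mean approximation

fprintf('z = %.4f, VaR = %.4f, VaR (zero mean) = %.4f\n', z_alpha, VaR, VaR0);
```

Because mu_p is positive in this example, the VaR that accounts for the mean is smaller than its zero-mean approximation, in line with the monotonicity argument made above.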
2. Assume you are a German investor. Compute from your German perspective (of a
consumer that pays in euros, i.e., please make sure to correctly use exchange rates!)
total monthly returns (i.e., including dividends) in excess of your appropriate
risk free rate (the German one) for the US, UK and German stock markets and for
the German 10-year bond. Use the yield on 3-month government bonds as the risk
free rate. Work with log returns, even if in the following points/questions
this choice will lead to some inaccuracies.
3. Compute and plot cumulative excess returns of all risky assets over the sample
period January 1978 - December 2003.
4. Assume you had wanted to invest your wealth among the risky assets listed above
over the period January 2004 - December 2007 (notice only the risky assets). Solve
the asset allocation problem using the historical sample January 1978 - December
2003. In order to compute weights, use the solution to the Markowitz mean-variance
optimization problem. Base the calculations required by this exercise on unconditional
moments.
5. Please comment on the weights delivered by your optimization exercise. Plot the
performance of your portfolio against the one of alternative portfolios based on a
buy and hold strategy for each of the risky assets over the investment period.
6. Re-estimate the weights by calculating the unconditional moments over the sample
January 2004 - December 2007. Are the weights equal to the ones computed in
question 4?
7. Assume that you have a subjective view, different from the unconditional moments,
on the expected excess returns of the stock and bond market indices under investigation
with reference to the investment period January 2004 - December 2007. How
would you now approach the asset allocation problem exploiting this subjective
view? Assume $\lambda = 2.5$.
SOLUTIONS
1. The solution to the first question is based on the following lines of code:
% select the Excel file interactively, then read its first sheet
[filename,pathname]=uigetfile('*.xls');
[data,textdata,raw] = xlsread(fullfile(pathname,filename),1);
You load data from the Excel spreadsheet into the Matlab workspace. You will now have
three objects in the workspace: the cell array raw (which shows time series, headers and dates
exactly as they appear in the Excel file), the matrix data (collecting the time series), and
the cell array textdata (collecting headers and dates). Dates are imported into Matlab as
text.
A useful command you may now use is datenum. This function allows you to transform
dates written as text into serial numbers:
date=datenum(textdata(3:end,1),'dd/mm/yyyy');
2. To answer this question you have to compute asset returns for each risky asset and
then subtract the riskfree rate in order to obtain excess returns. As a first step, you
can compute the monthly log riskfree rate:
lrf_m = log(1+(data(:,2)/(100*12)));
Bear in mind that yields are always reported as annualized interest rates: this is why
we are dividing by 12. We are also dividing by 100 as a matter of scaling (in the .xls file
1=1%).
We now turn to the German 10-year bond. The time series that we find in the .xls
file contains the yield of the bond. We therefore have to transform yields into returns.
This can be easily done by computing the duration of the bond and multiplying it by
variations in yields. As a first step, we calculate log yields for the bond:
ly_10_m = (log(1+(lag(data(:,1))/(100*12))));
Again, data have to be re-scaled and converted from annual to monthly rates. We now have to
approximate the bond durations; a way to do it is to use an approximation that can be derived by assuming
that coupons are equal to the yield to maturity (see Appendix A for further details):
dur = ((1-(1+(data(:,1)/(100))).^(-10)))./(1-(1+(data(:,1)/(100))).^(-1));
The excess return is then the difference between the monthly returns of the 10-year
bond and the monthly return of the risk free rate.
For what concerns stocks, log total returns are calculated including both log prices
and log dividends (here we just show an example for the German market). Of course,
annualized dividend yields need to be converted to a monthly frequency:
dy_ger_m = data(:,4)/(100*12);
lret_ger_m = log((p_ger./lag(p_ger))+dy_ger_m);
For the US and UK markets, returns have to be adjusted correcting for exchange rate
dynamics. We compute log changes in exchange rates as:
r_EUvsDOL = log((data(:,9))./lag(data(:,9)));
r_STRvsDOL = log((data(:,10))./lag(data(:,10)));
r_STRvsEU = r_STRvsDOL - r_EUvsDOL;
In our weird notation, we might not be taking thoroughly into account FX conventions.
We therefore specify that EUvsDOL means 1 € = #.#### $. For those of you who
master FX trading, we can also say that in this example the euro is the base currency.
We then compute returns in euro terms for the US and UK stock markets:
We get excess returns again by just subtracting the log monthly risk free rate.
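The currency adjustment can be illustrated with a small self-contained MATLAB sketch. It is not the course code: the index levels and exchange rates are invented, the variable names are hypothetical, and the only convention borrowed from the text is that EUvsDOL is the dollar price of one euro. Under that convention, the log return for a euro-based investor is the dollar log return minus the log change in EUvsDOL.

```matlab
% Hypothetical data: a US index in dollars and the dollar price of one euro.
p_us     = [100; 102; 101; 105];        % index level in USD
EUvsDOL  = [1.10; 1.12; 1.11; 1.15];    % USD per 1 EUR

lret_usd = diff(log(p_us));             % log returns in USD
r_fx     = diff(log(EUvsDOL));          % log change in USD per EUR

% Log return for a euro-based investor
lret_eur = lret_usd - r_fx;

% Cross-check: convert the index to EUR first, then take log returns
p_eur    = p_us ./ EUvsDOL;
disp(max(abs(lret_eur - diff(log(p_eur)))));
```

The same logic applies to the UK market once the sterling/euro rate change has been built from the two dollar rates; with log returns the adjustment is exactly additive, which is one reason for working in logs.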
3. In order to solve this question, you have to compute the cumulative sum (remember
we are working with log returns) of excess returns over the given sample. As a first
step, you have to select the starting and ending date of the sample and count the
number of observations in between them. Thus:
s_start = '01/01/1978';
s_end = '01/12/2003';
date_find=datenum([s_start; s_end],'dd/mm/yyyy');
ss=datefind(date_find(1,1),date);
se=datefind(date_find(2,1),date);
You can now select the relevant observations in the vectors of excess returns, and
collect them into a matrix:
Perf = cumsum(R);
This is the result you should get when plotting the time series:
figure(1);
plot(Perf);
title('Assets Excess Returns 01/1978:12/2003','fontname','Garamond','fontsize',14);
index=1:60:(se-ss+1);   % one tick every 60 months
set(gca,'fontname','garamond','fontsize',10);
set(gca,'xtick',index);
set(gca,'xticklabel',{'Jan1978','Jan1983','Jan1988','Jan1993','Jan1998','Jan2003'});
set(gca,'xlim',[1 (se-ss+1)]);
grid;
ylabel('Returns');
xlabel('Date');
h=legend('German LT Bond','German Equity','US Equity','UK Equity','Location','best');
A few brief comments: with the first command you open a new empty figure in Matlab,
and set it as figure number 1. In this way, if you open a new figure, you will not overwrite
your previous work. We then plot the matrix Perf into the figure. If we are interested in
raw results and do not like formatting, our work is done. However, here we discuss some
easy commands to improve the appearance of graphs in Matlab. A useful command in
this respect is the function set, which is used in order to modify the figure properties.
In the code, the first thing we do to format the plot is to assign a title and choose its
font and size. Then, we assign labels to the x-axis. First, we have to state how many
ticks we want on the axis and what is the distance between each two consecutive ticks.
This is why we build the linearly spaced vector index and assign it to the property 'xtick'. After
having chosen the number and position of the ticks, we can assign labels through the
property 'xticklabel'. We then define the limits for the x-axis, we plot a grid on the chart,
we assign axis names, and plot the legend.
4. We have to find the optimal weights to the risky assets, i.e., the structure of
Markowitz's tangency portfolio.25 Because we are considering a multi-period investment
horizon, we have to multiply both the unconditional mean vector and the
unconditional variance-covariance matrix by n, where n is the number of periods
(months) to be used. This is exactly what we do in the code:
muR = n*mean(R)';
SigmaR = n*cov(R);
Remember that muR is a vector of expected excess returns (also called risk premia).
As a result, the vector of weights is:
wMP = (SigmaR\muR)./(ones(1,4)*(SigmaR\muR));
5. We now plot the tangency weights and the performance of the portfolio derived
using the Markowitz model. As you can see, the portfolio is almost 100% invested
in the bond, and over the investment period dramatically underperforms all the
individual stock indices taken into consideration.
25
The formal solution to the Markowitz asset allocation problem can also be found in the handouts of
FINANCIAL ECONOMETRICS AND EMPIRICAL FINANCE - MODULE 1, chapter 8.
6. We plot the weights calculated using data from the sample Jan. 2004 - Dec. 2007.
As you can see, their values have slightly changed.
7. As a first step, we have to reverse-engineer market expected returns, starting from
the historical matrix $\Sigma$ (for the sample Jan. 1978 - Dec. 2003), the risk aversion coefficient
$\lambda$, and a vector of observed market weights of the assets in our portfolio.26
We make the assumption that $\lambda = 2.5$, as it is usually done in B&L implementations.
For the risky assets, we assume that for German investors the market is
equally allocated between the 4 investment opportunities; thus each asset will have
a hypothetical observed weight of 25% (note that this assumption does not affect
your final results very much; try and play around with the numbers to check this is
indeed the case). We can then compute the implied expected excess returns as:
$$\mu_{mkt} - r^f e = \lambda\Sigma w_{mkt}.$$
In the code:
As a next step, we define the views. Here we just say that we expect cumulated
excess returns over the whole investment period to be 40% for each stock index. For what
concerns the bond, we do not express any views. We therefore specify a selection matrix
P as follows:
$$P = \begin{bmatrix} 0 & 1 & 0 & 0 \\ 0 & 0 & 1 & 0 \\ 0 & 0 & 0 & 1 \end{bmatrix}$$
and a vector of expected values for the views v:
$$v = \begin{bmatrix} 0.4 \\ 0.4 \\ 0.4 \end{bmatrix}.$$
To express our confidence in our insights, we specify the covariance matrix of the views,
and call it $\Gamma$. It makes sense to assume that the matrix is diagonal, since it is difficult to
figure out correlation structures between absolute views on different assets. We choose a
relatively high value of the standard deviation, which is set to equal 0.2 (i.e., variance is
0.04) for each of the equity indices:
$$\Gamma = \begin{bmatrix} 0.04 & 0 & 0 \\ 0 & 0.04 & 0 \\ 0 & 0 & 0.04 \end{bmatrix}.$$
Finally, we set a parameter $\tau$ to express confidence in the estimates based on historical
data of the covariance matrix of returns, $\tau = 0.3$, which is again a value in line with what
is usually done in applications of Black and Litterman's approach.
26
For a review of how to integrate a subjective view on a certain asset and Markowitz's mean-variance
approach, see FINANCIAL ECONOMETRICS AND EMPIRICAL FINANCE - MODULE 1 lecture
notes, chapter 12.
In the code, we have:
K = tau*SigmaR*P'*(tau*P*SigmaR*P' + Gamma)^(-1);
% the scalar lambda only rescales sBL and cancels in the normalization below
sBL = lambda*((SigmaR^(-1))*(muMP_i) + (SigmaR^(-1))*K*(V-P*muMP_i));
wBL = sBL./(ones(4,1)'*sBL);
Appendix A: Yields-to-Maturity, Duration and Holding Period Returns
A1. Zero-Coupon Bonds
Define the relationship between price and yield to maturity of a zero-coupon bond as
follows:
$$P_{t,T} = \frac{1}{(1 + Y_{t,T})^{T-t}}, \qquad (12)$$
where $P_{t,T}$ is the price at time t of a bond maturing at time T, and $Y_{t,T}$ is the yield to maturity.
Taking logs of the left and the right-hand sides of the expression for $P_{t,T}$, and defining the
continuously compounded yield, $y_{t,T}$, as $\log(1 + Y_{t,T})$, we have the following relationship:
$$p_{t,T} = -(T-t)\,y_{t,T}, \qquad (13)$$
which clearly illustrates that the elasticity of the price of a zero-coupon bond with respect to its
(gross) yield to maturity is, in absolute value, the maturity of the security. Therefore the duration of the bond equals
maturity as no coupons are paid. The one-period uncertain holding-period return on a
bond maturing at time T, $r^T_{t,t+1}$, is then defined as follows:
$$r^T_{t,t+1} \equiv p_{t+1,T} - p_{t,T} = -(T-t-1)\,y_{t+1,T} + (T-t)\,y_{t,T} \qquad (14)$$
$$= y_{t,T} - (T-t-1)(y_{t+1,T} - y_{t,T}),$$
which means that yields and returns differ by a scaled measure of the change between
the yield at time t + 1, $y_{t+1,T}$, and the yield at time t, $y_{t,T}$.
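When the bond is selling at par, the yield to maturity is equal to the coupon rate. To
measure the length of time that a bondholder has invested money for we need to introduce
the concept of duration:
$$D^c_{t,T} = \frac{\dfrac{C}{(1+Y^c_{t,T})} + 2\dfrac{C}{(1+Y^c_{t,T})^2} + ... + (T-t)\dfrac{1+C}{(1+Y^c_{t,T})^{T-t}}}{P^c_{t,T}} = \frac{C\sum_{i=1}^{T-t}\dfrac{i}{(1+Y^c_{t,T})^i} + \dfrac{(T-t)}{(1+Y^c_{t,T})^{T-t}}}{P^c_{t,T}}.$$
When the bond sells at par, $P^c_{t,T} = 1$ and $C = Y^c_{t,T}$, so that
$$D^c_{t,T} = Y^c_{t,T}\sum_{i=1}^{T-t}\frac{i}{(1+Y^c_{t,T})^i} + \frac{(T-t)}{(1+Y^c_{t,T})^{T-t}}$$
$$= Y^c_{t,T}\,\frac{\left[(T-t)\dfrac{1}{1+Y^c_{t,T}} - (T-t) - 1\right]\dfrac{1}{(1+Y^c_{t,T})^{T-t+1}} + \dfrac{1}{1+Y^c_{t,T}}}{\left(1 - \dfrac{1}{1+Y^c_{t,T}}\right)^2} + \frac{(T-t)}{(1+Y^c_{t,T})^{T-t}}$$
$$= \frac{1 - (1+Y^c_{t,T})^{-(T-t)}}{1 - (1+Y^c_{t,T})^{-1}},$$
because when $|x| < 1$,
$$\sum_{k=0}^{n} kx^k = \frac{(nx - n - 1)x^{n+1} + x}{(1-x)^2}.$$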
Duration can be used to find approximate linear relationships between log-coupon
yields and holding period returns. Applying the log-linearization of one-period returns to
a coupon bond we have:
$$p_{c,t,T} - c = -r^c_{t+1} + k + \rho(p_{c,t+1,T} - c),$$
$$r^c_{t+1} = k + \rho p_{c,t+1,T} + (1-\rho)c - p_{c,t,T}.$$
When the bond is selling at par, $\rho = (1+C)^{-1} = (1+Y^c_{t,T})^{-1}$. Solving this expression
forward to maturity delivers:
$$p_{c,t,T} = \sum_{i=0}^{T-t-1}\rho^i\left[k + (1-\rho)c - r^c_{t+1+i}\right].$$
The log yield to maturity $y^c_{t,T}$ satisfies an expression with the same structure:
$$p_{c,t,T} = \sum_{i=0}^{T-t-1}\rho^i\left[k + (1-\rho)c - y^c_{t,T}\right] = \frac{1-\rho^n}{1-\rho}\left[k + (1-\rho)c - y^c_{t,T}\right] = D^c_{t,T}\left[k + (1-\rho)c - y^c_{t,T}\right],$$
where $n = T - t$. By substituting this expression back in the equation for linearized returns we have the
expression
$$r^c_{t+1} = D^c_{t,T}\,y^c_{t,T} - (D^c_{t,T} - 1)\,y^c_{t+1,T},$$
that illustrates the link between continuously compounded returns and duration.
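To see the last two results at work, the MATLAB sketch below computes the par-bond duration and the implied one-period log returns for a short, invented series of annual yields. The yields, the maturity n and the variable names are assumptions; the change in maturity from one period to the next is ignored, as in the approximation above.

```matlab
% Hypothetical annual yields (in percent) on a 10-year par bond.
Y = [5.0; 5.2; 4.9; 4.7];          % yield to maturity at t = 1,...,4
n = 10;                            % years to maturity

Yd  = Y/100;
y   = log(1 + Yd);                 % log yields
D   = (1 - (1 + Yd).^(-n)) ./ (1 - (1 + Yd).^(-1));   % duration at par

% r_{t+1} = D_t*y_t - (D_t - 1)*y_{t+1}, ignoring the change in maturity
r = D(1:end-1).*y(1:end-1) - (D(1:end-1) - 1).*y(2:end);

disp([D(1:end-1), r]);             % falling yields produce higher returns
```

With a 5% yield the duration of the 10-year par bond is roughly 8.1 years, well below its maturity because coupons are received along the way.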
References
[2] Campbell, John Y., and Robert Shiller, 1988, Stock Prices, Earnings, and Expected
Dividends, Journal of Finance, 43, 661-676.
[3] Campbell, John Y., and Robert Shiller, 1988, The Dividend-Price Ratio and Expectations
of Future Dividends and Discount Factors, Review of Financial Studies, 1, 195-228.
[4] Campbell, John Y., and Robert Shiller, 1998, Valuation Ratios and the Long-Run
Stock Market Outlook, Journal of Portfolio Management.
[5] Campbell, John Y., and Robert Shiller, 2001, Valuation Ratios and the Long-Run
Stock Market Outlook, an update, Cowles Foundation DP 1295.
[6] Campbell, John Y., and Tuomo Vuolteenaho, 2004, Inflation Illusion and Stock Prices,
NBER Working Paper 10263.
[7] Cochrane, J. H., 2008, The Dog that Did Not Bark: A Defense of Return Predictability,
Review of Financial Studies, 21, 4, 1533-1575.
[9] Davidson, R., and E. Flachaire, 2008, The Wild Bootstrap, Tamed at Last, Journal of
Econometrics, 146, 1, 162-169.
[10] Fama, Eugene, and Kenneth R. French, 1988, Dividend Yields and Expected Stock
Returns, Journal of Financial Economics, 22, 3-26.
[12] Lander, J., A. Orphanides, and M. Douvogiannis, 1997, Earning forecasts and the
predictability of stock returns: evidence from trading the S&P, Board of Governors of
the Federal Reserve System, http:www.bog.frb.fed.org
[13] Lettau, Martin, and Sydney Ludvigson, 2005, Expected Returns and Expected Dividend
Growth, Journal of Financial Economics, 76, 583-626.
[14] Lettau, Martin, and Stijn Van Nieuwerburgh, 2008, Reconciling the Return Predictability
Evidence, Review of Financial Studies, 21, 4, 1607-1652.
[16] Menzly, Lior, Tano Santos, and Pietro Veronesi, 2004, Understanding Predictability,
Journal of Political Economy, 112, 1, 1-47.
[17] Modigliani, Franco, and Richard Cohn, 1979, Inflation, Rational Valuation, and the
Market, Financial Analysts' Journal.
[18] Newey, W. K., and K. D. West, 1987, A Simple, Positive Semi-definite, Heteroskedasticity
and Autocorrelation Consistent Covariance Matrix, Econometrica, 55, 3, 703-708.
[19] Newey, W. K., and K. D. West, 1994, Automatic Lag Selection in Covariance Matrix
Estimation, Review of Economic Studies, 61, 631-653.
[20] Shiller, Robert J., 1981, Do Stock Prices Move Too Much to be Justified by Subsequent
Changes in Dividends?, American Economic Review, 71, 421-436.