ST339 CompleteNotes
For Academic Year 2022/23
University of Warwick

1
I would like to thank Martin Herdegen for sharing his notes and the students and teaching assistants Shiyao Bian, Nikolaos Constantinou, Chester Gan, Alia Hajji, Scott Hamilton, Kairav Hirani, Nazem Khan, Kevin Lam, Rahul Mathur, Noah Prasad, Anthony Shau, Osian Shelley, Haodong Sun, Anastasiya Tsyhanova, and Ben Windsor for spotting typos in previous versions. Any remaining mistakes and errors are of course my responsibility.

1.2 Trading strategies and arbitrage opportunities
1.3 Discounting
1.5 The Fundamental Theorem of Asset Pricing
2.7 The Markowitz tangency portfolio and the capital market line
3 Utility Theory
3.1 Measure theoretic preliminaries
4 Introduction to Risk Measures
4.1 Monetary measures of risk
4.2 Value at Risk and Expected Shortfall
5 Pricing and Hedging in Finite Discrete Time
5.1 Conditional expectations
5.2 Filtrations and martingales
5.3 Financial markets in finite discrete time
5.4 Self-financing strategies
5.8 Pricing and hedging in the binomial model

0 Introduction and Preliminaries
In this chapter, we briefly discuss what Mathematical Finance is about as a subject, give a short introduction to financial assets, and then review some key concepts from Probability Theory.

ods from probability theory and statistics. One main goal is to understand how financial assets such as stocks, bonds, and options are correctly priced taking into account
This module aims to give basic answers to the above questions. Along the way, we also need
This module focusses on key concepts and the probability side of Mathematical Finance.
This means that we assume throughout that all stochastic parameters (like the distribution
of assets) are known. In reality, of course, all parameters must be estimated from data. This
is a very interesting and challenging topic of its own and usually referred to as Financial
which are crucial for the financial industry. This is again a very interesting and challenging
topic of its own and usually referred to as Computational Finance. Finally, we will not address
any questions related to machine learning which is of course becoming increasingly important
One main distinction between financial assets is whether they are equity or debt. An equity security entitles the holder to a share of the profits of a business, usually paid out as dividends. The prime examples are stocks, and they are often traded at large exchanges such as the London Stock Exchange (LSE) or the New York Stock Exchange (NYSE). By contrast, a debt security entitles the holder to a fixed, predefined payment stream from a counterparty. Important examples are corporate or government bonds. They are often also traded at large exchanges like the LSE or NYSE. Usually equity is considered more risky than debt, because dividends of a business are uncertain whereas the payments of a bond are certain. Notwithstanding, bonds are not riskless because businesses (and also countries) may default.
Another main distinction between financial assets is whether they are primary or derivative. The payment structure of a derivative security (or just derivative) depends on some other, more basic, underlying variable or financial asset, whereas the payment structure of a primary asset does not. The prime examples of primary assets are stocks and bonds. The most important examples of derivative assets are options, futures and swaps.2 They are either traded at large exchanges or over the counter (OTC). Derivatives can either reduce or magnify risk, depending on how they are used.
If you buy a financial asset you pay today a certain amount of money to get the asset. You are then said to have a long position/to be long in the asset. By contrast, if you (short-)sell a financial asset that you do not own, you get some money today and have to deliver the asset in the future. You are then said to have a short position/to be short in the asset. More complex portfolios, held e.g. by hedge funds, often involve a mixture of both long and short positions.
2
For an excellent introduction to derivative securities, we refer to the textbook by Hull [4].

0.3 Fundamental concepts of Probability Theory
Probability Theory is a cornerstone of Mathematical Finance. Here, we review some of its key concepts. They are fundamental for the understanding of this module. For more details, we refer to the textbook Probability Essentials by Jacod and Protter [5].

Probability spaces
A sample space Ω is a (finite or infinite) set. Each ω ∈ Ω describes a possible state of the world.
Given a sample space Ω, a σ-algebra F on Ω is a collection of subsets of Ω such that
(a) Ω ∈ F;
(b) A ∈ F ⟹ A^c = Ω \ A ∈ F;
(c) A_1, A_2, . . . ∈ F ⟹ $\bigcup_{n \in \mathbb{N}} A_n \in \mathcal{F}$.
The pair (Ω, F) is called a measurable space and each A ∈ F is called an F-measurable event.

Example 0.1. (a) If Ω is finite (or countable), i.e., Ω = {ω_1, . . . , ω_N} (or Ω = {ω_1, ω_2, . . .}), then the usual choice for a σ-algebra on Ω is F := 2^Ω := {A : A ⊂ Ω}, the power set of Ω.
(b) If Ω = R, it turns out that the power set 2^Ω is too big to be chosen as σ-algebra.3 For this reason, one chooses instead F := B_R, the Borel σ-algebra on R, i.e., the smallest σ-algebra on R that contains all intervals of the form (a, b), (a, b], [a, b) and [a, b] for −∞ < a < b < ∞.
3
See [3, Theorem 1.5] for a precise formulation of this statement.

Given a measurable space (Ω, F), a probability measure P on (Ω, F) is a map F → [0, 1] such that
(a) P[∅] = 0 and P[Ω] = 1;
(b) A_1, A_2, . . . ∈ F with A_i ∩ A_j = ∅ for i ≠ j ⟹ $P\big[\bigcup_{n \in \mathbb{N}} A_n\big] = \sum_{n=1}^{\infty} P[A_n]$.
The triple (Ω, F, P) is called a probability space.
If Ω is finite (or countable) and F = 2^Ω, then a probability measure P is characterised by its values on the elementary events {ω}, i.e., we only need to specify P[{ω_n}] for n ∈ {1, . . . , N} (or n ∈ N).
If Ω = R and F = B_R, then a probability measure P is uniquely characterised by its (cumulative) distribution function F_P : R → [0, 1] defined by
$F_P(x) = P[(-\infty, x]].$

Random variables
A (real valued) random variable X on a measurable space (Ω, F) is a function Ω → R such that for any Borel set B ∈ B_R,
$\{X \in B\} := \{\omega \in \Omega : X(\omega) \in B\} \in \mathcal{F}.$
We then also say that X is F-measurable. Whilst this definition is nice from a theoretical perspective, it is almost useless for checking in practice that a given function X : Ω → R is F-measurable. But fortunately, one can show that it suffices to check this condition only for Borel sets B of the form B = (−∞, x] for x ∈ R, i.e., X is F-measurable if and only if for all x ∈ R,
$\{X \le x\} := \{\omega \in \Omega : X(\omega) \le x\} \in \mathcal{F}.$
If (Ω, F) = (R, B_R), a random variable is often also called a measurable function and denoted by f or g, etc. Note that all continuous functions on R are measurable.
The definition of a random variable does not mention any probability measure P at all. However, if we add a probability measure P to a measurable space (Ω, F), we get some further
notions: For B ∈ B_R we say that X ∈ B P-almost surely (P-a.s.) if P[X ∈ B] = 1. The distribution (or law) of X is defined by
One can check that P^X is again a probability measure on the measurable space (R, B_R).
$p_X(x) = P^X[\{x\}] = P[X = x], \quad x \in \mathbb{R}.$
Then p_X(x) = 0 for x ∈ B^c and $\sum_{x \in B} p_X(x) = 1$.
Example 0.2. Let (Ω, F, P) be a probability space and A ∈ F an F-measurable event. Then the indicator function 1_A defined by
A random variable X is called continuous4 if there exists a measurable (e.g. continuous)
4
A better name would be absolutely continuous. The CDF of a continuous random variable is absolutely continuous. The fact that the CDF is continuous is not sufficient for the corresponding random variable to be continuous.

Expectation
For a random variable X on a probability space (Ω, F, P), the expectation of X under P (also called the integral of X with respect to P), is defined as follows:
If X is a simple random variable, i.e., X = c_1 1_{A_1} + · · · + c_n 1_{A_n}, where c_i ∈ R and A_i ∈ F, one sets
quence of nonnegative simple random variables,5 one sets6
If X is general, one says that X is integrable or has finite expectation if E^P[|X|] < ∞. In this case one sets
$E^P[X] := E^P[X^+] - E^P[X^-],$
where X^+ = max{0, X} denotes the positive part and X^− = max{0, −X} denotes the negative part of X.7
5
For instance, one can set X_n(ω) := min(2^{−n} ⌊2^n X(ω)⌋, n), where ⌊·⌋ denotes the floor function.
6
Of course, one has to show that this is well defined, i.e., independent of the choice of the approximating sequence (X_n)_{n∈N}.
7
Note that both X^+ and X^− are nonnegative random variables, X = X^+ − X^−, and |X| = X^+ + X^−.
Indeed, setting c_n := X(ω_n) and A_n := {ω_n} for n ∈ {1, . . . , N}, we get
The following result lists two important properties of the expectation operator.
Lemma 0.6. Let X and Y be integrable random variables on some probability space (Ω, F, P).
(a) For a, b, c ∈ R, aX + bY + c is again an integrable random variable and
$E[aX + bY + c] = aE[X] + bE[Y] + c.$
(b) If X ≥ Y P-a.s., then
$E[X] \ge E[Y].$ (0.1)
Moreover, the inequality in (0.1) is an equality if and only if X = Y P-a.s.
Property (a) is referred to as linearity of the expectation and property (b) as monotonicity of the expectation.
The following result is very useful for calculating expectations (of functions of) discrete or

$u \cdot v \le \frac{\|u\|_2^2 + \lambda^2 \|v\|_2^2}{2\lambda}$ (0.2)
If u = λ_0 v then ||u||_2 = λ_0 ||v||_2 and we find that the maximising value of λ is λ_0. Then u · v = ||u||_2 ||v||_2. Conversely, if there is equality then we must have u = λ_0 v where λ_0 = ||u||_2 / ||v||_2.
Corollary 0.9. Suppose x and y are vectors in R^d and suppose Σ is a symmetric d × d invertible matrix. Then $x^\top \Sigma y \le (x^\top \Sigma x)^{1/2} (y^\top \Sigma y)^{1/2}$.
Proof. Let ∆ (a d × d matrix) be such that ∆^⊤ ∆ = Σ. Let u = ∆x and v = ∆y. Then
1 No-Arbitrage and the Fundamental Theorem of Asset Pricing
In this chapter, we develop a mathematical model for financial markets in one period, introduce the key concept of no-arbitrage, and formulate and prove the so-called Fundamental Theorem of Asset Pricing on the absence of arbitrage in this setting.

1.1 A mathematical model for a financial market in one period
We consider a financial market with 1 + d assets. We assume that the assets are priced at two times, at t = 0 (today) and at t = 1 (in one year). Asset prices today are known and given by the (usually positive) constants $S_0^0, S_0^1, \ldots, S_0^d \in \mathbb{R}$.9 Asset prices in one year, however, are usually not known today. So we model them as real-valued (usually positive) random variables $S_1^0(\omega), S_1^1(\omega), \ldots, S_1^d(\omega)$ on some probability space (Ω, F, P). Every ω ∈ Ω corresponds to a possible state of the world in one year, and $S_1^i(\omega)$ denotes the price of asset i if the state of the world in one year happens to be ω ∈ Ω.
It is convenient to identify each asset i with the stochastic process $S^i = (S_t^i)_{t \in \{0,1\}}$, i.e., with the collection of the two random variables $S_0^i$ and $S_1^i$ (where $S_0^i(\omega) := S_0^i$).
In most financial markets, not all asset prices in one year are unknown. Usually, there is a riskless asset, often also called bank account, which will pay a sure amount in one year.10 We will assume throughout that $S^0$ is riskless and satisfies
$S_0^0 = 1$ and $S_1^0(\omega) \equiv 1 + r,$
where r > −1 denotes the interest rate.11
In order to distinguish the riskless asset $S^0$ from the risky assets $S^1, \ldots, S^d$, we will use the notation12
$S_t = (S_t^1, \ldots, S_t^d)$ and $\bar{S}_t = (S_t^0, S_t), \quad t \in \{0, 1\},$
and call the $\mathbb{R}^d$-valued stochastic process $S = (S_t)_{t \in \{0,1\}}$ the risky assets.13
9
Note that some derivative assets such as swaps have zero initial value.
10
Government bonds are usually considered to be riskless in reality, in particular US government bonds. Notwithstanding, the bank account is a somewhat fictitious security, in particular in continuous time where it denotes the rollover of very short term riskless bonds.
11
Before the financial crisis of 2008, interest rates tended to be positive (or at least nonnegative).
12
Note that $S_0$ is an $\mathbb{R}^d$-valued vector and $S_1$ is an $\mathbb{R}^d$-valued random vector. Likewise, $\bar{S}_0$ is an $\mathbb{R}^{1+d}$-valued vector and $\bar{S}_1$ is an $\mathbb{R}^{1+d}$-valued random vector.
13
$\mathbb{R}^d$-valued stochastic process here means the collection of the two $\mathbb{R}^d$-valued random vectors $S_0$ and $S_1$ (where $S_0(\omega) := S_0$).

Example 1.1 (One-period Binomial model). Assume that d = 1, i.e., there is only one risky asset, and there are only two states of the world at time 1, i.e., Ω = {ω_1, ω_2}. We assume that $S_0^1 = 1$ and
$S_1^1(\omega_1) = 1 + u$ and $S_1^1(\omega_2) = 1 + d,$
where u > d > −1. Here, u and d are mnemonics for up and down, and it is often assumed that u > 0. The probabilities for up and down are given by
$P[\{\omega_1\}] = p_1$ and $P[\{\omega_2\}] = p_2,$
where p_1, p_2 ∈ (0, 1) and p_1 + p_2 = 1.14 One can nicely illustrate this model by the following trees, where the numbers beside the branches denote probabilities:
S^0:  1 --(1)--> 1 + r
S^1:  1 --(p_1)--> 1 + u
      1 --(p_2)--> 1 + d
14
To make the model mathematically rigorous, we also have to specify the σ-algebra F. This is as standard in models with finite (or countable) Ω given by F = 2^Ω, so that F = {∅, {ω_1}, {ω_2}, {ω_1, ω_2}}.

1.2 Trading strategies and arbitrage opportunities
We shall assume throughout that we have a frictionless market. This means that there are no transaction costs, i.e., assets can be bought and sold at the same price, and there are no constraints on the number of assets one holds. In particular, one can hold a negative amount of some asset, i.e., assets can be shorted and the price paid/received is linear in the quantity of assets bought/sold. Moreover, we shall assume that asset prices are exogenously given and not influenced by the trading activities of other market participants. Thus agents are viewed as price takers. All this is of course an idealisation of reality but one has to start with the simplest case before building more realistic and therefore more complex models.
Given a financial market $\bar{S} = (S_t^0, S_t)_{t \in \{0,1\}}$ as above, a trading strategy, often also called a portfolio, is a vector
$\bar{\vartheta} = (\vartheta^0, \vartheta) = (\vartheta^0, \vartheta^1, \ldots, \vartheta^d) \in \mathbb{R}^{1+d},$
where $\vartheta^i$ denotes the number of shares held in asset i. The price today for buying the trading strategy/portfolio $\bar{\vartheta}$ is
$\bar{\vartheta} \cdot \bar{S}_0 = \sum_{i=0}^{d} \vartheta^i S_0^i = \vartheta^0 + \vartheta \cdot S_0.$
In one year, i.e., at t = 1, the value of the trading strategy/portfolio will be
$\bar{\vartheta} \cdot \bar{S}_1(\omega) = \sum_{i=0}^{d} \vartheta^i S_1^i(\omega) = \vartheta^0 (1 + r) + \vartheta \cdot S_1(\omega),$
depending on the state of the world ω ∈ Ω.
The following definition is one of the cornerstones of Mathematical Finance:
Definition 1.2. A trading strategy $\bar{\vartheta} \in \mathbb{R}^{1+d}$ is called an arbitrage opportunity for $\bar{S}$ if
$\bar{\vartheta} \cdot \bar{S}_0 \le 0, \quad \bar{\vartheta} \cdot \bar{S}_1 \ge 0$ P-a.s. and $P[\bar{\vartheta} \cdot \bar{S}_1 > 0] > 0.$
The financial market $\bar{S}$ is called arbitrage-free if there are no arbitrage opportunities. In this case one also says that $\bar{S}$ satisfies NA.
An arbitrage opportunity gives something (a positive chance of strictly positive final wealth
because they are immediately exploited by so-called arbitrageurs (often hedge funds) and disappear. The reason that arbitrage opportunities disappear is that in real financial markets (unlike in our textbook setting) prices adjust to the trading activities of the market participants, i.e., prices of assets in high demand rise and prices of assets in low demand fall. Therefore, it is reasonable to assume that financial markets are arbitrage-free, which is indeed a key assumption of Mathematical Finance.

Example 1.3 (An arbitrage opportunity). Consider the financial market given by the following trees where the numbers beside the branches denote probabilities:
S^0:  1 --(1)--> 1.01
S^1:  200 --(0.8)--> 202
      200 --(0.1)--> 200
      200 --(0.1)--> 198
We claim that the market $\bar{S} = (S_t^0, S_t^1)_{t \in \{0,1\}}$ admits arbitrage. Indeed, the riskless asset always has in all states of the world the same or a higher return than the risky asset. So we short the risky asset and use it to buy the riskless asset. Mathematically, we set
$\bar{\vartheta} = (\vartheta^0, \vartheta^1) := (200, -1).$
Then at time 0,
$\bar{\vartheta} \cdot \bar{S}_0 = 200 \times 1 + (-1) \times 200 = 0,$
and at time 1,
$\bar{\vartheta} \cdot \bar{S}_1(\omega) = \begin{cases} 200 \times 1.01 + (-1) \times 202 = 0 & \text{if } \omega = \omega_1, \\ 200 \times 1.01 + (-1) \times 200 = 2 > 0 & \text{if } \omega = \omega_2, \\ 200 \times 1.01 + (-1) \times 198 = 4 > 0 & \text{if } \omega = \omega_3. \end{cases}$
Thus,
15
Here, we have implicitly assumed that Ω = {ω_1, ω_2, ω_3}, $S_1^1(\omega_1) = 202$, $S_1^1(\omega_2) = 200$ and $S_1^1(\omega_3) = 198$, F = 2^Ω and P[{ω_1}] = 0.8, P[{ω_2}] = 0.1, and P[{ω_3}] = 0.1.

Remark 1.4. If the market $\bar{S}$ admits arbitrage, there always exists an arbitrage opportunity $\bar{\vartheta}$ with $\bar{\vartheta} \cdot \bar{S}_0 = 0$. Indeed, if $\bar{\eta} = (\eta^0, \eta)$ is an arbitrage opportunity with $\bar{\eta} \cdot \bar{S}_0 < 0$, set $\bar{\vartheta} := (\vartheta^0, \vartheta) = (\eta^0 - \bar{\eta} \cdot \bar{S}_0, \eta)$. Then
$\bar{\vartheta} \cdot \bar{S}_0 = \vartheta^0 + \vartheta \cdot S_0 = \eta^0 - \bar{\eta} \cdot \bar{S}_0 + \eta \cdot S_0 = \eta^0 - \eta^0 - \eta \cdot S_0 + \eta \cdot S_0 = 0.$
Moreover, as $-\bar{\eta} \cdot \bar{S}_0 > 0$,
$\bar{\vartheta} \cdot \bar{S}_1 = \vartheta^0 (1 + r) + \vartheta \cdot S_1 = \eta^0 (1 + r) + (-\bar{\eta} \cdot \bar{S}_0)(1 + r) + \eta \cdot S_1$
$= \bar{\eta} \cdot \bar{S}_1 + (-\bar{\eta} \cdot \bar{S}_0)(1 + r) \ge (-\bar{\eta} \cdot \bar{S}_0)(1 + r) > 0$ P-a.s.
Thus, $\bar{\vartheta}$ is an arbitrage opportunity with $\bar{\vartheta} \cdot \bar{S}_0 = 0$.

1.3 Discounting
Our next aim is to give a necessary and sufficient condition on the market $\bar{S}$ to be arbitrage-
e.g. GBP or EUR. Notwithstanding, it is clear that prices (and values) are relative. So basic concepts of financial markets (like being arbitrage-free) should not and do not depend on the choice of unit. For this reason, we are free to change the unit, in particular if this makes the mathematics simpler. It turns out that a good choice is a unit which itself is a traded asset, and the canonical choice is to use the risk-free asset $S^0$. So we discount with $S^0$ or take $S^0$ as numéraire, and define the discounted assets $X^0, X^1, \ldots, X^d$ by
$X_t^i = \frac{S_t^i}{S_t^0}, \quad t \in \{0, 1\}, \; i \in \{0, 1, \ldots, d\}.$
Then $X^0 \equiv 1$ and $X = (X^1, \ldots, X^d)$ expresses the value of the risky assets in units of the
numéraire $S^0$.
Remark 1.5. The economic reason for taking a traded asset (as opposed to a standard currency like GBP) as numéraire is that a standard currency does not reflect the time value of money: one pound today is not the same as one pound in a year or a pound in 100 years; think of the (gigantic) value of one pound in the past. By discounting we make prices at
the dimension of the market from 1 + d to d.
16
This has been known and used by actuaries for centuries.

Example 1.6. Consider the one-period Binomial model from Example 1.1. Then the discounted risky asset $X^1$ is given by $X_0^1 = 1$ and
$X_1^1(\omega_1) = \frac{1 + u}{1 + r}$ and $X_1^1(\omega_2) = \frac{1 + d}{1 + r}.$
We can reformulate the notion of arbitrage in terms of the discounted risky assets X only.
Proposition 1.7. The following are equivalent:
$\mathbb{R}^d$ such that
$\vartheta \cdot (X_1 - X_0) \ge 0$ P-a.s. and $P[\vartheta \cdot (X_1 - X_0) > 0] > 0.$
17
We would call such a ϑ an arbitrage opportunity for X. This is of course a slight abuse of notation, but a very common one in Mathematical Finance.
Proof. We only prove the more difficult direction (a) ⇒ (b).
Seeking a contradiction, suppose that there exists an arbitrage opportunity $\vartheta \in \mathbb{R}^d$ for X satisfying
Multiplying (1.2) by $S_0^0 = 1$ gives
$\bar{\vartheta} \cdot \bar{S}_0 = 0.$ (1.3)
Next, as $X_1^0 - X_0^0 = 1 - 1 = 0$, we note that (1.1) is equivalent to
$\bar{\vartheta} \cdot (\bar{X}_1 - \bar{X}_0) \ge 0$ P-a.s. and $P[\bar{\vartheta} \cdot (\bar{X}_1 - \bar{X}_0) > 0] > 0.$ (1.4)
Then plugging (1.2) into (1.4) shows that
Now using that inequalities remain unchanged by multiplying by positive constants (here $S_1^0$), we obtain
$\bar{\vartheta} \cdot \bar{S}_1 \ge 0$ P-a.s. and $P[\bar{\vartheta} \cdot \bar{S}_1 > 0] > 0.$
This together with (1.3) shows that $\bar{\vartheta}$ is an arbitrage opportunity for $\bar{S}$, in contradiction to the hypothesis that $\bar{S}$ satisfies NA.

1.4 Equivalent Martingale Measures
The other concept is the notion of an equivalent martingale measure (EMM).
(Ω, F) are called equivalent (notation: P ≈ Q) if, for A ∈ F, Q[A] = 0 if and only if P[A] = 0.
Two probability measures are equivalent if they agree on which events will not happen, i.e., have probability zero. But they may still assign different probabilities to events that might happen.
Example 1.9. Let Ω = {ω_1, . . . , ω_N} be a finite sample space and F = 2^Ω. Let P be a probability measure on (Ω, F) with P[{ω_n}] > 0 for all n ∈ {1, . . . , N}. Then a probability
$E^Q[X_1^i] = X_0^i, \quad i \in \{1, \ldots, d\}.$
Remark 1.11. The terminology equivalent martingale measure stems from the fact that the $X^i$'s are martingales under the equivalent measure Q. Martingales will be studied in some detail in Chapter 5.
Alternatively, Q is also often called a risk-neutral measure.
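The two requirements on an EMM, equivalence to P and the martingale property $E^Q[X_1^i] = X_0^i$, are easy to check by hand on a finite sample space. The following is a minimal sketch for the one-period Binomial model, not part of the notes themselves. The helper names (`expectation`, `is_equivalent`, `is_emm`) and the numbers u = 0.05, r = 0.01, d = 0, p_1 = 2/5 are illustrative assumptions; exact fractions are used so that the equality test is not disturbed by rounding.

```python
from fractions import Fraction

def expectation(values, measure):
    """Expectation on a finite sample space; both arguments are dicts keyed by state."""
    return sum(values[w] * measure[w] for w in measure)

def is_equivalent(P, Q):
    """With F = 2^Omega, P and Q are equivalent iff they vanish on the same states."""
    return P.keys() == Q.keys() and all((P[w] == 0) == (Q[w] == 0) for w in P)

def is_emm(P, Q, X0, X1):
    """Q is an EMM for one discounted risky asset if Q ~ P and E^Q[X_1] = X_0."""
    return is_equivalent(P, Q) and expectation(X1, Q) == X0

# Illustrative (assumed) parameters of a one-period Binomial model.
u, r, d = Fraction(5, 100), Fraction(1, 100), Fraction(0)
P = {"w1": Fraction(2, 5), "w2": Fraction(3, 5)}

# Discounted risky asset: X_0 = 1, X_1 = S_1 / (1 + r).
X0 = Fraction(1)
X1 = {"w1": (1 + u) / (1 + r), "w2": (1 + d) / (1 + r)}

# Solve q1 * X1(w1) + (1 - q1) * X1(w2) = X0 for q1.
q1 = (X0 - X1["w2"]) / (X1["w1"] - X1["w2"])
Q = {"w1": q1, "w2": 1 - q1}

print(q1, is_emm(P, Q, X0, X1), is_emm(P, P, X0, X1))
```

With these assumed numbers the candidate measure puts weight 1/5 on the up state, whereas P itself fails the martingale condition, so the two measures agree on the null sets but not on the probabilities.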
1.5 The Fundamental Theorem of Asset Pricing
We have now all the tools to state and prove the so-called Fundamental Theorem of Asset Pricing, giving necessary and sufficient conditions for the absence of arbitrage. For multiperiod models, this was only established in 1990 by Dalang, Morton, and Willinger. For this reason,
(b) There exists an EMM for the discounted risky assets $X = S/S^0$.
Proof. We first establish the easy direction (b) ⇒ (a). So let Q ≈ P be an EMM. By Proposition 1.7, it suffices to show that X satisfies NA. Seeking a contradiction, suppose there is $\vartheta \in \mathbb{R}^d$ such that
$E^Q[\vartheta \cdot (X_1 - X_0)] > 0.$
But by linearity of the expectation operator (cf. Lemma 0.6 (a)) and the fact that Q is an EMM,
$E^Q[\vartheta \cdot (X_1 - X_0)] = \sum_{i=1}^{d} \vartheta^i E^Q[X_1^i - X_0^i] = \sum_{i=1}^{d} \vartheta^i \times 0 = 0,$
For the proof of the difficult direction (a) ⇒ (b), we only consider the special case that Ω = {ω_1, . . . , ω_N} is finite, F = 2^Ω and P[{ω_n}] > 0 for all n ∈ {1, . . . , N}.18 As is the case with many abstract existence theorems, the proof is not constructive. We are going to identify
18
For a proof with general Ω and F (which requires more measure theory), we refer to [2, Theorem 1.7].
$K := \{(\vartheta \cdot (X_1(\omega_1) - X_0), \ldots, \vartheta \cdot (X_1(\omega_N) - X_0)) : \vartheta \in \mathbb{R}^d\}$ (1.5)
Then K corresponds to the collection of all random variables of the form $\vartheta \cdot (X_1 - X_0)$ for $\vartheta \in \mathbb{R}^d$. Mathematically, K is an (at most d-dimensional) vector subspace of $\mathbb{R}^N$. By Proposition 1.7, X satisfies NA. In terms of K this means that
$K \cap \mathbb{R}^N_+ = \{0\},$ (1.6)
where $\mathbb{R}^N_+ = [0, \infty)^N$. Next, define the standard simplex of dimension N − 1 by
[Figure: the simplex $\Delta^{N-1}$ and the separating hyperplane a · x = λ]
As K and $\Delta^{N-1}$ are both nonempty and convex, K is a vector subspace and $\Delta^{N-1}$ is compact, it follows from the strict separating hyperplane theorem19 that there exists a vector $a \in \mathbb{R}^N \setminus \{0\}$ and λ > 0 such that
$a \cdot k = 0$ for all k ∈ K,
19
For a proof, we refer to [1, Proposition B.14].
As $\Delta^{N-1}$ contains all standard unit vectors $e_n$ in $\mathbb{R}^N$, it follows that
$a \cdot e_n = a_n > 0, \quad n \in \{1, \ldots, N\}.$
$Q[\{\omega_n\}] = \frac{a_n}{\sum_{k=1}^{N} a_k} > 0.$
Then Q ≈ P by Example 1.9. Moreover, for i ∈ {1, . . . , d}, set
$k^i = (e^i \cdot (X_1(\omega_1) - X_0), \ldots, e^i \cdot (X_1(\omega_N) - X_0)) \in K,$
where $e^i$ denotes the i-th unit vector in $\mathbb{R}^d$. Then
$E^Q[X_1^i - X_0^i] = \sum_{n=1}^{N} (X_1^i(\omega_n) - X_0^i)\, Q[\{\omega_n\}] = \frac{1}{\sum_{k=1}^{N} a_k} \sum_{n=1}^{N} a_n (X_1^i(\omega_n) - X_0^i)$
$= \frac{1}{\sum_{k=1}^{N} a_k} \sum_{n=1}^{N} a_n \, e^i \cdot (X_1(\omega_n) - X_0) = \frac{a \cdot k^i}{\sum_{k=1}^{N} a_k} = 0,$
where we have used in the last step that a · k = 0 for all k ∈ K.

Example 1.13. Consider the Binomial model from Example 1.1. Using the FTAP, we want to check when $\bar{S}$ satisfies NA. So let Q be a measure on (Ω, F), and set q_1 := Q[{ω_1}] and
$q_1 = \frac{r - d}{u - d}$ and $q_2 = \frac{u - r}{u - d}.$
The condition u > r > d is economically quite intuitive as it says that the risky asset must offer the chance of a higher return than the interest rate in one state of the world (u > r) but also have a lower return than the interest rate in another state of the world (d < r). Note that the EMM Q does not depend on the values of p_1 and p_2.

2 Mean-Variance Portfolio Selection and the CAPM
In this chapter, we seek to answer the question how to optimally invest in a financial market taking into account the mean and the variance of the return of a portfolio. We then deduce that if all market participants behave optimally in a mean-variance sense, the financial market has a special structure, which is described by the Capital Asset Pricing Model (CAPM).

2.1 The return of an asset and of a portfolio
Throughout this chapter, we consider a 1 + d-dimensional financial market $\bar{S} = (S_t^0, S_t)_{t \in \{0,1\}}$ on some probability space (Ω, F, P), where $S^0$ is a riskless bank account and satisfies
$S_0^0 = 1$ and $S_1^0 = 1 + r,$
with r > −1. We assume that all asset prices today are positive, i.e., $S_0^0, S_0^1, \ldots, S_0^d > 0$, and
20
For t = 0 (as well as i = 0), this integrability condition is of course trivially satisfied.
21
Otherwise, there is j ∈ {1, . . . , d} with $\vartheta^j \ne 0$ such that
$S_1^j = \sum_{i \ne j} \frac{-\vartheta^i}{\vartheta^j} S_1^i$ P-a.s.,
i.e., the risky asset j can be written as a linear combination of the other assets, and hence be omitted.
Set $\mu = (\mu^1, \ldots, \mu^d)$ and $\bar{\mu} = (\mu^0, \mu)$. Note that $R^0 \equiv r = \mu^0$, i.e., the return of asset 0 is deterministic and equals the interest rate r. For the risky assets $S^1, \ldots, S^d$, however, the return is stochastic, and we denote by $\Sigma = (\Sigma^{ij})_{1 \le i,j \le d} \in \mathbb{R}^{d \times d}$ the covariance matrix of the
return vector $R = (R^1, \ldots, R^d)$ of the risky assets, given by
$\Sigma^{ij} = \mathrm{Cov}[R^i, R^j] = E[(R^i - \mu^i)(R^j - \mu^j)], \quad i, j \in \{1, \ldots, d\}.$
One can show that the non-redundancy assumption (2.1) on S implies that Σ is positive
the expected return and the variance of the return of $\bar{\vartheta}$ by
$R_{\bar{\vartheta}} := \frac{\bar{\vartheta} \cdot \bar{S}_1 - \bar{\vartheta} \cdot \bar{S}_0}{\bar{\vartheta} \cdot \bar{S}_0},$
$\mu_{\bar{\vartheta}} := E[R_{\bar{\vartheta}}],$
$\sigma^2_{\bar{\vartheta}} := \mathrm{Var}[R_{\bar{\vartheta}}] = E[(R_{\bar{\vartheta}} - E[R_{\bar{\vartheta}}])^2].$
A portfolio $\bar{\vartheta}$ satisfying the budget constraint (2.2) is called x_0-feasible.
Similarly, a risk-only portfolio ϑ satisfying the budget constraint ϑ · S_0 = x_0 is called risk-only x_0-feasible.
In general, there are many x_0-feasible portfolios. So one would like to maximise the return $R_{\bar{\vartheta}}$ among all x_0-feasible portfolios $\bar{\vartheta}$. But since $R_{\bar{\vartheta}}$ is a random variable, it is not clear how to do this. Therefore, one might try to maximise the expected return $\mu_{\bar{\vartheta}}$ among all x_0-feasible portfolios $\bar{\vartheta}$. This is, however, not a good criterion, as can be seen by the following example.
22
If $\bar{\vartheta} \cdot \bar{S}_0 = 0$, the return of $\bar{\vartheta}$ is not defined.

Example 2.1. Consider a Binomial model with u = 0.05, r = 0.01, d = 0, p_1 = 2/5 and p_2 = 3/5. Then
$E[S_1^1] = \frac{2}{5} \times 1.05 + \frac{3}{5} \times 1 = 1.02.$
Let x_0 = 1000. Then $\bar{\vartheta} \in \mathbb{R}^2$ is x_0-feasible if and only if $\vartheta^0 = 1000 - \vartheta^1$. Moreover,
Let us briefly comment on what goes wrong in Example 2.1. Focussing on expected return only motivates buying huge amounts of the risky asset by borrowing money from the bank account. By doing this, one can achieve any expected return one likes. However, this completely ignores the risk inherent in the investment. To illustrate this point, suppose we choose $\vartheta^1 = 1{,}000{,}000$ and $\vartheta^0 = -999{,}000$. Then we have an expected return of $\mu_{\bar{\vartheta}} = 10.01 = 1001\%$, which sounds amazing. However,
23
Borrowing money to buy stocks was a popular investment strategy in the late 1920s. It was one of the key drivers of the Wall Street crash of October 1929, which led to the financial ruin of many investors.

2.3 The mean-variance problems
As we have seen, maximising the expected return alone is not a good criterion for portfolio choice since it does not control the risk inherent in an investment. Markowitz, in a seminal work in 1952 (for which he was awarded the Nobel Prize in Economics in 1990), proposed to consider the variance of the return as a measure of the risk of a portfolio and introduced what is now known as mean-variance portfolio selection.
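To see numerically why the expected return alone is a poor criterion, one can recompute the leveraged portfolio from Example 2.1 together with the variance of its return. The following is a minimal sketch using the parameters stated in Example 2.1 (u = 0.05, r = 0.01, d = 0, p_1 = 2/5, p_2 = 3/5, x_0 = 1000, ϑ^1 = 1,000,000). The variable and function names are our own, and exact fractions are used to avoid rounding.

```python
from fractions import Fraction

# Parameters of Example 2.1 (one-period Binomial model with S_0^0 = S_0^1 = 1).
u, r, d = Fraction(5, 100), Fraction(1, 100), Fraction(0)
p = {"up": Fraction(2, 5), "down": Fraction(3, 5)}
S1 = {"up": 1 + u, "down": 1 + d}

x0 = 1000                 # initial wealth
theta1 = 1_000_000        # shares of the risky asset
theta0 = x0 - theta1      # budget constraint: theta0 * 1 + theta1 * 1 = x0

def terminal_wealth(state):
    """Value of the portfolio at t = 1 in the given state."""
    return theta0 * (1 + r) + theta1 * S1[state]

def portfolio_return(state):
    """Return of the portfolio: (value at 1 - value at 0) / value at 0."""
    return (terminal_wealth(state) - x0) / x0

returns = {s: portfolio_return(s) for s in p}
mean = sum(p[s] * returns[s] for s in p)
variance = sum(p[s] * (returns[s] - mean) ** 2 for s in p)

print("expected return:", mean)
print("variance of return:", variance)
print("terminal wealth in the down state:", terminal_wealth("down"))
```

The expected return reproduces the 1001% from the text, but in the down state, which has probability 3/5, the terminal wealth is negative. The variance of the return makes this risk visible.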
There are two versions of the mean-variance problem, each of which has a formulation with risk-only and one with general portfolios, i.e., portfolios which also allow an investment in the riskless asset:
(1) Given an initial wealth x_0 > 0 and a minimal desired expected return µ_min > 0, minimise the variance of the return $\sigma^2_{\bar{\vartheta}}$ among all x_0-feasible portfolios $\bar{\vartheta} \in \mathbb{R}^{1+d}$ that satisfy $\mu_{\bar{\vartheta}} \ge \mu_{\min}$.
the expected return $\mu_{\bar{\vartheta}}$ among all x_0-feasible portfolios $\bar{\vartheta} \in \mathbb{R}^{1+d}$ that satisfy $\sigma^2_{\bar{\vartheta}} \le \sigma^2_{\max}$.
(2') Given an initial wealth x_0 > 0 and a maximal variance of the return $\sigma^2_{\max} \ge 0$, maximise the expected return $\mu_\vartheta$ among all risk-only x_0-feasible portfolios $\vartheta \in \mathbb{R}^d$ that satisfy $\sigma^2_\vartheta \le \sigma^2_{\max}$.
We shall see that problems (1) and (2) (and (1') and (2')) are two sides of the same coin.
The idea underlying the mean-variance problems is that (a high) expected return is desirable whereas (a high) variance of the return is undesirable. We say that an investor has
whenever $\mu_\vartheta \ge \mu_{\vartheta'}$ and $\sigma^2_\vartheta \le \sigma^2_{\vartheta'}$, with at least one inequality being strict.

2.4 Portfolios in fractions of wealth
In order to study the mean-variance problems, it is convenient to parametrise portfolios not in numbers of shares but in fractions of wealth. To this end, we first introduce some notation. For N ∈ N, set $1_N := (1, \ldots, 1) \in \mathbb{R}^N$ and denote by $H^{N-1}$ the unit hyperplane in $\mathbb{R}^N$ given by
$H^{N-1} := \{x \in \mathbb{R}^N : x \cdot 1_N = 1\}.$
If there is no danger of confusion, we write 1 instead of $1_N$.
If $\bar{\vartheta} \in \mathbb{R}^{1+d}$ is a portfolio parametrised in numbers of shares with $\bar{\vartheta} \cdot \bar{S}_0 > 0$, define
$\pi^i := \frac{\vartheta^i S_0^i}{\bar{\vartheta} \cdot \bar{S}_0}, \quad i \in \{0, \ldots, d\},$ (2.3)
and call $\pi^i$ the fraction of wealth invested in asset i ∈ {0, . . . , d}. We set $\bar{\pi} := (\pi^0, \pi) = (\pi^0, \pi^1, \ldots, \pi^d)$. Note that $\bar{\pi} \in H^{1+d-1}$. We call $\bar{\pi}$ a portfolio parametrised in fractions of
Similarly, if ϑ is a risk-only portfolio parametrised in numbers of shares with ϑ · S_0 > 0, define $\pi^i := \frac{\vartheta^i S_0^i}{\vartheta \cdot S_0}$ for i ∈ {1, . . . , d}, call $\pi^i$ the fraction of wealth invested in asset i ∈ {1, . . . , d}, and set $\pi = (\pi^1, \ldots, \pi^d)$. Note that in the risk-only case $\pi \in H^{d-1}$. We call π a risk-only portfolio parametrised in fractions of wealth.
Conversely, for $\bar{\pi} \in H^{1+d-1}$ and an initial wealth x_0 > 0, define the portfolio $\bar{\vartheta} \in \mathbb{R}^{1+d}$ (parametrised in numbers of shares) by
$\vartheta^i = \frac{\pi^i}{S_0^i} x_0, \quad i \in \{1, \ldots, d\}.$
Remark 2.2. Calling $\bar{\pi} \in H^{1+d-1}$ a portfolio is a slight abuse of notation because to recover the numbers of shares corresponding to $\bar{\pi}$, we also need to specify the initial wealth x_0. However, if $\bar{\pi} \in H^{1+d-1}$ is a portfolio parametrised in fractions of wealth and x_0, x_0' > 0 are different initial wealths with corresponding portfolios $\bar{\vartheta}$ and $\bar{\vartheta}'$ in numbers of shares, then $\frac{\bar{\vartheta}}{x_0} = \frac{\bar{\vartheta}'}{x_0'}$ and $R_{\bar{\vartheta}} = R_{\bar{\vartheta}'}$. So if we are only interested in returns, it suffices to consider portfolios
24
Parametrising portfolios in fractions of wealth is tailor-made for studying the mean-variance problems.

Lemma 2.3. Let $\bar{S} = (S_t^0, S_t)_{t \in \{0,1\}}$ be an arbitrage-free non-redundant 1 + d-dimensional market on some probability space (Ω, F, P). Assume that $\bar{S}$ has finite second moments and $S_0^0, \ldots, S_0^d > 0$. Denote by µ and Σ the mean vector and covariance matrix of the return vector R of the risky assets. Let x_0 > 0 be some fixed initial wealth. Then there is a one-to-one correspondence between x_0-feasible portfolios $\bar{\vartheta} \in \mathbb{R}^{1+d}$ parametrised in numbers of shares and portfolios $\bar{\pi} \in H^{1+d-1}$ parametrised in fractions of wealth. If $\bar{\pi} \in H^{1+d-1}$ denotes the fractions of wealth of an x_0-feasible portfolio $\bar{\vartheta} \in \mathbb{R}^{1+d}$, then
$R_{\bar{\vartheta}} = R_{\bar{\pi}} := \sum_{i=0}^{d} \pi^i R^i = \bar{\pi} \cdot \bar{R},$ (2.5)
$\mu_{\bar{\vartheta}} = \mu_{\bar{\pi}} := E[R_{\bar{\pi}}] = \sum_{i=0}^{d} \pi^i \mu^i = \bar{\pi} \cdot \bar{\mu},$ (2.6)
$\sigma^2_{\bar{\vartheta}} = \sigma^2_{\bar{\pi}} := \mathrm{Var}[R_{\bar{\pi}}] = \sum_{i,j=1}^{d} \pi^i \pi^j \Sigma^{ij} = \pi^\top \Sigma \pi.$ (2.7)
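Similarly, there is a one-to-one correspondence between risk-only $x_0$-feasible portfolios $\vartheta \in \mathbb{R}^d$ parametrised in numbers of shares and risk-only portfolios $\pi \in H^{d-1}$ parametrised in fractions of wealth. If $\pi \in H^{d-1}$ denotes the fractions of wealth of a risk-only $x_0$-feasible portfolio $\vartheta \in \mathbb{R}^d$, then
\[ R_\vartheta = R_\pi := \sum_{i=1}^d \pi^i R^i = \pi \cdot R, \tag{2.8} \]
\[ \mu_\vartheta = \mu_\pi := E[R_\pi] = \sum_{i=1}^d \pi^i \mu^i = \pi \cdot \mu, \tag{2.9} \]
\[ \sigma^2_\vartheta = \sigma^2_\pi := \mathrm{Var}[R_\pi] = \sum_{i,j=1}^d \pi^i \pi^j \Sigma^{ij} = \pi^\top \Sigma \pi. \tag{2.10} \]

Proof. We only prove the case of general portfolios; the risk-only case is completely analogous. The first statement follows from the fact that the map
\[ \Phi : \{\bar\vartheta \in \mathbb{R}^{1+d} : \bar\vartheta \cdot \bar S_0 = x_0\} \to H^{1+d-1}, \quad \Phi^i(\bar\vartheta) = \frac{\vartheta^i S_0^i}{x_0}, \quad i \in \{0, \ldots, d\}, \]
is bijective.$^{26}$ To prove the second statement, let $\bar\vartheta \in \mathbb{R}^{1+d}$ be an $x_0$-feasible portfolio parametrised in numbers of shares. The corresponding portfolio $\bar\pi \in H^{1+d-1}$ parametrised in fractions of wealth is given by (2.3). Thus,
\[ R_{\bar\vartheta} = \frac{\bar\vartheta \cdot \bar S_1 - \bar\vartheta \cdot \bar S_0}{\bar\vartheta \cdot \bar S_0} = \sum_{i=0}^d \frac{\vartheta^i}{\bar\vartheta \cdot \bar S_0}(S_1^i - S_0^i) = \sum_{i=0}^d \frac{\vartheta^i S_0^i}{\bar\vartheta \cdot \bar S_0} R^i = \sum_{i=0}^d \pi^i R^i = \bar\pi \cdot \bar R. \]
So we have (2.5), and (2.6) follows by linearity of the expectation. Finally, to establish (2.7), we use that $\mathrm{Cov}[R^i, R^0] = 0$ for $i \in \{0, \ldots, d\}$ because $R^0$ is deterministic, and obtain
\[ \sigma^2_{\bar\vartheta} = \mathrm{Var}[R_{\bar\vartheta}] = \mathrm{Var}[R_{\bar\pi}] = \mathrm{Var}\Big[\sum_{i=0}^d \pi^i R^i\Big] = \sum_{i,j=0}^d \pi^i \pi^j \mathrm{Cov}[R^i, R^j] = \sum_{i,j=1}^d \pi^i \pi^j \mathrm{Cov}[R^i, R^j] = \sum_{i,j=1}^d \pi^i \pi^j \Sigma^{ij} = \pi^\top \Sigma \pi. \]

2.5 The case without a riskless asset

We start by considering the risk-only case. First, we characterise the so-called minimum variance portfolio.

Lemma 2.4. Let $S = (S_t^0, S_t)_{t \in \{0,1\}}$ be an arbitrage-free non-redundant market on some probability space $(\Omega, \mathcal{F}, P)$. Assume that $S$ has finite second moments and $S_0^0, \ldots, S_0^d > 0$. Denote by $\mu$ and $\Sigma$ the mean vector and covariance matrix of the return vector $R$ of the risky assets. Then there exists a unique risk-only portfolio $\pi_{\min} \in H^{d-1}$, called the minimum variance portfolio, such that
\[ \sigma^2_{\pi_{\min}} \le \sigma^2_\pi \quad \text{for all } \pi \in H^{d-1}. \]
It is given by$^{27}$
\[ \pi_{\min} = \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}} \tag{2.11} \]
and satisfies
\[ \mu_{\pi_{\min}} = \frac{\mu^\top \Sigma^{-1} \mathbf{1}}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}} \quad \text{and} \quad \sigma^2_{\pi_{\min}} = \frac{1}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}}. \]

Proof. First, since $\mathbf{1} \cdot \pi_{\min} = \frac{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}} = 1$, it follows that $\pi_{\min} \in H^{d-1}$. Next, let $\pi \in H^{d-1}$ be arbitrary. Set $y := \pi - \pi_{\min}$ and note that $y \cdot \mathbf{1} = \pi \cdot \mathbf{1} - \pi_{\min} \cdot \mathbf{1} = 1 - 1 = 0$. Then by (2.10), the fact that $\Sigma$ is symmetric, the definition of $\pi_{\min}$ and the fact that $y$ and $\mathbf{1}$ are orthogonal,
\begin{align*}
\sigma^2_\pi = \pi^\top \Sigma \pi &= (\pi_{\min} + y)^\top \Sigma (\pi_{\min} + y) = \pi_{\min}^\top \Sigma \pi_{\min} + 2 y^\top \Sigma \pi_{\min} + y^\top \Sigma y \\
&= \pi_{\min}^\top \Sigma \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}} + 2 y^\top \Sigma \frac{\Sigma^{-1}\mathbf{1}}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}} + y^\top \Sigma y \\
&= \frac{\pi_{\min}^\top \mathbf{1}}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}} + 2 \frac{y^\top \mathbf{1}}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}} + y^\top \Sigma y \\
&= \frac{1}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}} + 2 \times 0 + y^\top \Sigma y.
\end{align*}
As $\Sigma$ is symmetric and positive definite, $y^\top \Sigma y \ge 0$ with equality if and only if $y = 0$. This shows both that $\pi_{\min}$ is the unique optimiser and yields $\sigma^2_{\pi_{\min}} = \frac{1}{\mathbf{1}^\top \Sigma^{-1} \mathbf{1}}$. The formula for $\mu_{\pi_{\min}}$ follows directly from (2.9).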
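The closed form (2.11) can be checked numerically. The following is a minimal sketch for two risky assets, using exact rational arithmetic; the covariance matrix `SIGMA`, the perturbation `y` and all helper names are illustrative assumptions and do not come from the notes.

```python
from fractions import Fraction as F


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    """Matrix-vector product."""
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def dot(u, v):
    """Euclidean inner product."""
    return sum(x * y for x, y in zip(u, v))


def variance(pi, sigma):
    """Portfolio variance pi' Sigma pi, cf. (2.10)."""
    return dot(pi, matvec(sigma, pi))


def min_variance_portfolio(sigma):
    """Return (pi_min, minimal variance) following (2.11)."""
    ones = [F(1), F(1)]
    w = matvec(inv2(sigma), ones)  # Sigma^{-1} 1
    a = dot(ones, w)               # 1' Sigma^{-1} 1
    return [x / a for x in w], 1 / a


# Hypothetical covariance matrix of two risky assets (illustrative numbers).
SIGMA = [[F(4, 100), F(1, 100)], [F(1, 100), F(9, 100)]]
pi_min, var_min = min_variance_portfolio(SIGMA)

# Any other risk-only portfolio has the form pi_min + y with y . 1 = 0.
y = [F(1, 10), F(-1, 10)]
perturbed = [p + t for p, t in zip(pi_min, y)]
```

In this sketch the variance of `perturbed` exceeds `var_min` by exactly `variance(y, SIGMA)`, which mirrors the decomposition used in the proof of Lemma 2.4.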
26 This uses that $x_0 > 0$ and $S_0^0, \ldots, S_0^d > 0$.
27 Note that as the market is non-redundant, it follows that $\Sigma$ is positive definite and hence invertible. Therefore, $\Sigma^{-1}$ is well defined and $\mathbf{1}^\top \Sigma^{-1} \mathbf{1} > 0$ by the fact that $\Sigma^{-1}$ is positive definite.

Next, we seek to find the risk-only portfolio which minimises the variance among all risk-
only portfolios with a given expected return $\mu_0$. To this end, note that if $\mu$ and $\mathbf{1}$ are collinear, then every risk-only portfolio has the same expected return. Indeed, if $\mu$ and $\mathbf{1}$ are collinear, then $\mu^1 = \ldots = \mu^d$ and hence, for any $\pi \in H^{d-1}$,
\[ \mu_\pi = \mu \cdot \pi = \mu^1 \, \mathbf{1} \cdot \pi = \mu^1. \]
So we assume in the sequel that $\mu$ and $\mathbf{1}$ are not collinear.

Lemma 2.5. Let $S = (S_t^0, S_t)_{t \in \{0,1\}}$ be an arbitrage-free non-redundant market on a probability space $(\Omega, \mathcal{F}, P)$. Assume that $S$ has finite second moments and $S_0^0, \ldots, S_0^d > 0$. Denote by $\mu$ and $\Sigma$ the mean vector and covariance matrix of the return vector $R$ of the risky assets. Assume that $\mu$ and $\mathbf{1}$ are not collinear. Let $\mu_0 \in \mathbb{R}$ be given. Then there exists a unique risk-only portfolio $\pi_{\mu_0} \in H^{d-1}$ such that $\mu_{\pi_{\mu_0}} = \mu_0$ and
\[ \sigma^2_{\pi_{\mu_0}} \le \sigma^2_\pi \quad \text{for all } \pi \in H^{d-1} \text{ with } \mu_\pi = \mu_0. \]
28 It is part of the assertion that $AC - B^2 \ne 0$; we even have $AC - B^2 > 0$.

\[ B^2 = \big(\langle \mathbf{1}, \mu \rangle_{\Sigma^{-1}}\big)^2 \le \langle \mathbf{1}, \mathbf{1} \rangle_{\Sigma^{-1}} \langle \mu, \mu \rangle_{\Sigma^{-1}} = AC, \]
where the inequality is an equality if and only if $\mu$ and $\mathbf{1}$ are collinear. As they are not, it follows that $AC - B^2 > 0$.
Next, we check that $\pi_{\mu_0}$ given by (2.12) is indeed in $H^{d-1}$ and has expected return $\mu_0$. By the definitions of $A$, $B$ and $C$ and (2.9),
\begin{align*}
\mathbf{1} \cdot \pi_{\mu_0} &= \frac{C - B\mu_0}{AC - B^2} \mathbf{1}^\top \Sigma^{-1} \mathbf{1} + \frac{A\mu_0 - B}{AC - B^2} \mathbf{1}^\top \Sigma^{-1} \mu = \frac{C - B\mu_0}{AC - B^2} A + \frac{A\mu_0 - B}{AC - B^2} B \\
&= \frac{AC - AB\mu_0 + AB\mu_0 - B^2}{AC - B^2} = 1, \\
\mu_{\pi_{\mu_0}} = \mu \cdot \pi_{\mu_0} &= \frac{C - B\mu_0}{AC - B^2} \mu^\top \Sigma^{-1} \mathbf{1} + \frac{A\mu_0 - B}{AC - B^2} \mu^\top \Sigma^{-1} \mu = \frac{C - B\mu_0}{AC - B^2} B + \frac{A\mu_0 - B}{AC - B^2} C \\
&= \frac{CB - B^2\mu_0 + AC\mu_0 - BC}{AC - B^2} = \mu_0.
\end{align*}
Now let $\pi \in H^{d-1}$ with $\mu_\pi = \mu_0$. Set
\[ y := \pi - \pi_{\mu_0}. \]
Note that $y \cdot \mathbf{1} = \pi \cdot \mathbf{1} - \pi_{\mu_0} \cdot \mathbf{1} = 1 - 1 = 0$ and

shows both that $\pi_{\mu_0}$ is the unique optimiser and yields the first equality in (2.13). The second equality in (2.13) follows by a simple rearrangement using that $\sigma^2_{\pi_{\min}} = \frac{1}{A}$ and $\mu_{\pi_{\min}} = \frac{B}{A}$.

For the following result, we introduce the key concept of a risk-only efficient portfolio.

Definition 2.6. A risk-only portfolio $\pi \in H^{d-1}$ is called risk-only efficient (in the mean-variance sense) if there does not exist another risk-only portfolio $\pi' \in H^{d-1}$ such that $\mu_{\pi'} \ge \mu_\pi$ and $\sigma^2_{\pi'} \le \sigma^2_\pi$ with at least one inequality being strict.
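The two checks in the proof of Lemma 2.5 (the weights of $\pi_{\mu_0}$ sum to one and its expected return equals $\mu_0$) can be reproduced numerically. The sketch below assumes the form of $\pi_{\mu_0}$ that the notes restate in Theorem 2.7(a), with $A = \mathbf{1}^\top\Sigma^{-1}\mathbf{1}$, $B = \mathbf{1}^\top\Sigma^{-1}\mu$, $C = \mu^\top\Sigma^{-1}\mu$. The inputs `MU`, `SIGMA`, `MU0` and all function names are hypothetical.

```python
from fractions import Fraction as F


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


def abc(mu, sigma):
    """Return A, B, C together with Sigma^{-1} 1 and Sigma^{-1} mu."""
    ones = [F(1)] * len(mu)
    inv = inv2(sigma)
    s1, sm = matvec(inv, ones), matvec(inv, mu)
    return dot(ones, s1), dot(ones, sm), dot(mu, sm), s1, sm


def frontier_portfolio(mu0, mu, sigma):
    """Risk-only portfolio with mean mu0 and minimal variance (Lemma 2.5)."""
    a, b, c, s1, sm = abc(mu, sigma)
    disc = a * c - b * b
    if disc <= 0:
        raise ValueError("mu and 1 must not be collinear")
    k1 = (c - b * mu0) / disc
    k2 = (a * mu0 - b) / disc
    return [k1 * u + k2 * v for u, v in zip(s1, sm)]


# Hypothetical market data (illustrative numbers only).
MU = [F(5, 100), F(10, 100)]
SIGMA = [[F(4, 100), F(1, 100)], [F(1, 100), F(9, 100)]]
MU0 = F(8, 100)

pi = frontier_portfolio(MU0, MU, SIGMA)
var_pi = dot(pi, matvec(SIGMA, pi))
```

With exact fractions, `sum(pi)` and `dot(MU, pi)` reproduce $1$ and $\mu_0$ without rounding error, and `var_pi` agrees with $(A\mu_0^2 - 2B\mu_0 + C)/(AC - B^2)$.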
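$C = \mu^\top \Sigma^{-1} \mu$. Define the risk-only efficient frontier by
\[ \mathcal{E} := \Big\{ (\sigma_0^2, \mu_0) \in \mathbb{R}^2 : \mu_0 \ge \frac{B}{A}, \; \sigma_0^2 = \frac{A\mu_0^2 - 2B\mu_0 + C}{AC - B^2} \Big\}. \]
(a) For each point $(\sigma_0^2, \mu_0) \in \mathcal{E}$, there exists exactly one risk-only portfolio $\pi \in H^{d-1}$ such that $(\sigma^2_\pi, \mu_\pi) = (\sigma_0^2, \mu_0)$. It is given by
\[ \pi = \pi_{\mu_0} = \frac{C - B\mu_0}{AC - B^2} \Sigma^{-1} \mathbf{1} + \frac{A\mu_0 - B}{AC - B^2} \Sigma^{-1} \mu. \]
(b) A risk-only portfolio $\pi \in H^{d-1}$ is risk-only efficient if and only if $(\sigma^2_\pi, \mu_\pi) \in \mathcal{E}$.

[Figure 2: A graphical illustration of the risk-only case. Horizontal axis $\sigma^2_\pi$, vertical axis $\mu_\pi$; shown are the feasible portfolios, the efficient frontier $\mathcal{E}$ and the minimum-variance portfolio at $(\sigma^2_{\pi_{\min}}, \mu_{\pi_{\min}})$.]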
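To see why the frontier only contains means $\mu_0 \ge B/A$, note that the minimal variance $(A\mu_0^2 - 2B\mu_0 + C)/(AC - B^2)$ is symmetric in $\mu_0$ around $B/A$. The sketch below illustrates this with hypothetical constants `A`, `B`, `C` (any values with $AC - B^2 > 0$ would do); `dominates` is an illustrative encoding of the comparison in Definition 2.6.

```python
from fractions import Fraction as F

# Hypothetical values of A, B, C with A*C - B^2 > 0 (illustrative only).
A, B, C = F(220, 7), F(2), F(3, 20)
D = A * C - B * B


def frontier_variance(mu0):
    """Minimal variance among risk-only portfolios with mean mu0."""
    return (A * mu0 ** 2 - 2 * B * mu0 + C) / D


def dominates(p, q):
    """True if point p = (variance, mean) is at least as good as q in both
    coordinates and strictly better in at least one."""
    (vp, mp), (vq, mq) = p, q
    return mp >= mq and vp <= vq and (mp > mq or vp < vq)


def on_efficient_frontier(var0, mu0):
    """Membership test for the set E of Theorem 2.7."""
    return mu0 >= B / A and var0 == frontier_variance(mu0)


mu_low = F(1, 20)                 # below B/A, so not on the frontier
mu_high = 2 * B / A - mu_low      # mirror image of mu_low around B/A
v = frontier_variance(mu_low)     # equals frontier_variance(mu_high)
```

Here the point `(v, mu_high)` dominates `(v, mu_low)`: both have the same variance, but the mirrored mean is higher, so the minimal-variance portfolio for `mu_low` is not risk-only efficient.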
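Proof. (a). Let $(\sigma_0^2, \mu_0) \in \mathcal{E}$. It follows from Lemma 2.5 that $\pi_{\mu_0}$ satisfies $(\sigma^2_{\pi_{\mu_0}}, \mu_{\pi_{\mu_0}}) = (\sigma_0^2, \mu_0)$. If $\pi' \in H^{d-1}$ is any other portfolio with $\mu_{\pi'} = \mu_0$ but $\pi' \ne \pi_{\mu_0}$, then $\sigma^2_{\pi'} > \sigma^2_{\pi_{\mu_0}} = \sigma_0^2$ again by Lemma 2.5. So we have both existence and uniqueness of $\pi$.
(b). First, assume that $\pi \in H^{d-1}$ is risk-only efficient. Then $\sigma^2_\pi \ge \sigma^2_{\pi_{\min}}$ by Lemma 2.4, and so $\mu_\pi \ge \mu_{\pi_{\min}} = \frac{B}{A}$ by the definition of efficiency. Set $\mu_0 := \mu_\pi$ and $\sigma_0^2 := \frac{A\mu_0^2 - 2B\mu_0 + C}{AC - B^2}$. Then $(\sigma_0^2, \mu_0) \in \mathcal{E}$ and by (a), $(\sigma^2_{\pi_{\mu_0}}, \mu_{\pi_{\mu_0}}) = (\sigma_0^2, \mu_0)$. Efficiency of $\pi$ together with $\mu_\pi = \mu_0 = \mu_{\pi_{\mu_0}}$ gives $\sigma^2_\pi \le \sigma^2_{\pi_{\mu_0}} = \sigma_0^2$. On the other hand, Lemma 2.5 gives $\sigma^2_\pi \ge \sigma^2_{\pi_{\mu_0}} = \sigma_0^2$, whence $\sigma^2_\pi = \sigma_0^2$,$^{29}$ and so $(\sigma^2_\pi, \mu_\pi) \in \mathcal{E}$.
Conversely, let $\pi \in H^{d-1}$ be such that $(\sigma^2_\pi, \mu_\pi) \in \mathcal{E}$. Set $\mu_0 := \mu_\pi$ and $\sigma_0^2 := \sigma^2_\pi$. Then $\pi = \pi_{\mu_0}$ by (a). Seeking a contradiction, suppose there is $\pi' \in H^{d-1}$ such that $\mu_{\pi'} \ge \mu_0$ and $\sigma^2_{\pi'} \le \sigma_0^2$ with at least one of the inequalities being strict. If $\mu_{\pi'} = \mu_0$, then $\sigma^2_{\pi'} < \sigma_0^2 = \sigma^2_{\pi_{\mu_0}}$, and by Lemma 2.5, we arrive at a contradiction. Otherwise, if $\mu_{\pi'} > \mu_0$, set $\mu_1 := \mu_{\pi'}$ and $\sigma_1^2 := \frac{A\mu_1^2 - 2B\mu_1 + C}{AC - B^2}$. Then $\sigma_1^2 > \sigma_0^2$ because the function $x \mapsto \frac{Ax^2 - 2Bx + C}{AC - B^2}$ is strictly increasing for $x \ge \frac{B}{A}$. This means that $\sigma^2_{\pi'} < \sigma_1^2$. But $\sigma_1^2 = \sigma^2_{\pi_{\mu_1}}$ by (a). Thus, $\mu_{\pi'} = \mu_1 = \mu_{\pi_{\mu_1}}$ and $\sigma^2_{\pi'} < \sigma_1^2 = \sigma^2_{\pi_{\mu_1}}$, and again by Lemma 2.5, we arrive at a contradiction.
29 By (a), it even follows that $\pi = \pi_{\mu_0}$.

With the help of Theorem 2.7, we can now fully solve the risk-only versions of the mean-variance problems.

Theorem 2.8. Let $S = (S_t^0, S_t)_{t \in \{0,1\}}$ be an arbitrage-free non-redundant market on a probability space $(\Omega, \mathcal{F}, P)$. Assume that $S$ has finite second moments and $S_0^0, \ldots, S_0^d > 0$. Denote by $\mu$ and $\Sigma$ the mean vector and covariance matrix of the return vector $R$ of the risky assets. Assume that $\mu$ and $\mathbf{1}$ are not collinear, and set $A = \mathbf{1}^\top \Sigma^{-1} \mathbf{1}$, $B = \mathbf{1}^\top \Sigma^{-1} \mu$, and $C = \mu^\top \Sigma^{-1} \mu$.
(1') Let $\mu_{\min} \ge \frac{B}{A}$ be given.$^{30}$ Then the risk-only mean-variance problem
\[ \operatorname*{argmin}_{\pi \in H^{d-1}} \sigma^2_\pi \quad \text{subject to } \mu_\pi \ge \mu_{\min} \]
has a unique solution $\pi_*$ given by
\[ \pi_* = \pi_{\mu_{\min}} = \frac{C - B\mu_{\min}}{AC - B^2} \Sigma^{-1} \mathbf{1} + \frac{A\mu_{\min} - B}{AC - B^2} \Sigma^{-1} \mu. \]
It is risk-only efficient and satisfies
\[ \mu_{\pi_*} = \mu_{\min} \quad \text{and} \quad \sigma^2_{\pi_*} = \frac{A\mu_{\min}^2 - 2B\mu_{\min} + C}{AC - B^2}. \]
(2') Let $\sigma^2_{\max} \ge \frac{1}{A}$ be given.$^{31}$ Then the risk-only mean-variance problem
\[ \operatorname*{argmax}_{\pi \in H^{d-1}} \mu_\pi \quad \text{subject to } \sigma^2_\pi \le \sigma^2_{\max} \tag{2.14} \]
has a unique solution $\pi_*$ given by
\[ \pi_* = \pi_{\mu_{\sigma^2_{\max}}} = \frac{C - B\mu_{\sigma^2_{\max}}}{AC - B^2} \Sigma^{-1} \mathbf{1} + \frac{A\mu_{\sigma^2_{\max}} - B}{AC - B^2} \Sigma^{-1} \mu, \]
where
\[ \mu_{\sigma^2_{\max}} = \frac{B}{A} + \frac{\sqrt{(AC - B^2)(A\sigma^2_{\max} - 1)}}{A}. \]
It is risk-only efficient and satisfies
\[ \mu_{\pi_*} = \mu_{\sigma^2_{\max}} = \frac{B}{A} + \frac{\sqrt{(AC - B^2)(A\sigma^2_{\max} - 1)}}{A} \quad \text{and} \quad \sigma^2_{\pi_*} = \sigma^2_{\max}. \]
30 Note that by Lemma 2.4, $\frac{B}{A} = \mu_{\pi_{\min}}$, the mean of the minimum-variance portfolio.
31 Note that by Lemma 2.4, $\frac{1}{A} = \sigma^2_{\pi_{\min}}$, the variance of the minimum-variance portfolio.

First, it follows from Theorem 2.7(a) and (b) that $\pi_* = \pi_{\mu_{\sigma^2_{\max}}}$ is risk-only efficient, and a straightforward calculation (using (2.13)) shows that indeed $\sigma^2_{\pi_*} = \sigma^2_{\max}$. So $\pi_*$ satisfies the constraint in (2.14) with equality. Let $\pi' \in H^{d-1}$ be any portfolio satisfying the constraint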
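in (2.14). Then by efficiency of $\pi_*$ it follows that $\mu_{\pi'} \le \mu_{\pi_*}$, which gives that $\pi_*$ is optimal for (2.14). To establish uniqueness, note that if $\mu_{\pi'} = \mu_{\pi_*}$, then $\sigma^2_{\pi'} \ge \sigma^2_{\pi_*}$ by efficiency of $\pi_*$ and hence by the constraint in (2.14), $\sigma^2_{\pi'} = \sigma^2_{\pi_*}$. But this then implies that $(\sigma^2_{\pi'}, \mu_{\pi'}) = (\sigma^2_{\pi_*}, \mu_{\pi_*}) \in \mathcal{E}$, and by Theorem 2.7(a), it follows that $\pi' = \pi_*$.

2.6 The case with a riskless asset

We proceed to study the case with a riskless asset. The analogue of Lemma 2.4 is trivial, because with a riskless asset, we can achieve zero risk by just investing in the riskless asset, i.e., by choosing the portfolio $\bar\pi_{\min,r} := (1, 0, \ldots, 0) \in H^{1+d-1}$. So we directly consider the analogue of Lemma 2.5. Since we can now also invest in the riskless asset, we do not need to assume that $\mu$ and $\mathbf{1}_d$ are not collinear but only that $\bar\mu$ and $\mathbf{1}_{1+d}$ are not collinear. By the

assets, and let $r$ be the interest rate. Let $\bar\pi = (\pi^0, \pi) \in H^{1+d-1}$. Then
\[ \mu_{\bar\pi} - r = (\mu - r\mathbf{1}_d) \cdot \pi. \]

We now seek to find the portfolio in $H^{1+d-1}$ which minimises the variance among all portfolios in $H^{1+d-1}$ with a given expected return $\mu_0$. Note that below $\mathbf{1}$ will always be $\mathbf{1}_d$.

Lemma 2.10. Let $S = (S_t^0, S_t)_{t \in \{0,1\}}$ be an arbitrage-free non-redundant market on a probability space $(\Omega, \mathcal{F}, P)$. Assume that $S$ has finite second moments and $S_0^0, \ldots, S_0^d > 0$. Denote by $\mu$ and $\Sigma$ the mean vector and covariance matrix of the return vector $R$ of the risky assets, and let $r$ be the interest rate. Assume that $\mu \ne r\mathbf{1}$. Let $\mu_0 \in \mathbb{R}$ be given. Then there exists a unique portfolio $\bar\pi_{\mu_0,r} \in H^{1+d-1}$ such that $\mu_{\bar\pi_{\mu_0,r}} = \mu_0$ and
\[ \sigma^2_{\bar\pi_{\mu_0,r}} \le \sigma^2_{\bar\pi} \quad \text{for all } \bar\pi \in H^{1+d-1} \text{ with } \mu_{\bar\pi} = \mu_0. \]
It is given by $\bar\pi_{\mu_0,r} = (1 - \pi_{\mu_0,r} \cdot \mathbf{1}, \pi_{\mu_0,r})$, where
\[ \pi_{\mu_0,r} = \frac{\mu_0 - r}{(\mu - r\mathbf{1})^\top \Sigma^{-1} (\mu - r\mathbf{1})} \Sigma^{-1} (\mu - r\mathbf{1}). \tag{2.15} \]
It satisfies
\[ \sigma^2_{\bar\pi_{\mu_0,r}} = \frac{(\mu_0 - r)^2}{(\mu - r\mathbf{1})^\top \Sigma^{-1} (\mu - r\mathbf{1})} = \frac{(\mu_0 - r)^2}{Ar^2 - 2Br + C}, \tag{2.16} \]
where $A = \mathbf{1}^\top \Sigma^{-1} \mathbf{1}$, $B = \mathbf{1}^\top \Sigma^{-1} \mu$, and $C = \mu^\top \Sigma^{-1} \mu$.

Proof. First, it follows from the condition $\mu \ne r\mathbf{1}$ and the fact that $\Sigma^{-1}$ is symmetric and positive definite that $(\mu - r\mathbf{1})^\top \Sigma^{-1} (\mu - r\mathbf{1}) > 0$. We proceed to check that $\bar\pi_{\mu_0,r} \in H^{1+d-1}$ and $\mu_{\bar\pi_{\mu_0,r}} = \mu_0$. By the definition of $\bar\pi_{\mu_0,r}$ and Proposition 2.9,

\[ \bar y := \bar\pi - \bar\pi_{\mu_0,r}. \]
Proposition 2.9 gives

the fact that $y$ is orthogonal to $\mu - r\mathbf{1}$,
\begin{align*}
\sigma^2_{\bar\pi} = \pi^\top \Sigma \pi &= (\pi_{\mu_0,r} + y)^\top \Sigma (\pi_{\mu_0,r} + y) = \pi_{\mu_0,r}^\top \Sigma \pi_{\mu_0,r} + 2 y^\top \Sigma \pi_{\mu_0,r} + y^\top \Sigma y \\
&= (\mu_0 - r) \frac{\pi_{\mu_0,r}^\top (\mu - r\mathbf{1})}{(\mu - r\mathbf{1})^\top \Sigma^{-1} (\mu - r\mathbf{1})} + 2(\mu_0 - r) \frac{y^\top (\mu - r\mathbf{1})}{(\mu - r\mathbf{1})^\top \Sigma^{-1} (\mu - r\mathbf{1})} + y^\top \Sigma y \\
&= (\mu_0 - r) \frac{(\mu_0 - r)}{(\mu - r\mathbf{1})^\top \Sigma^{-1} (\mu - r\mathbf{1})} + 2(\mu_0 - r) \times 0 + y^\top \Sigma y \\
&= \frac{(\mu_0 - r)^2}{(\mu - r\mathbf{1})^\top \Sigma^{-1} (\mu - r\mathbf{1})} + y^\top \Sigma y. \tag{2.17}
\end{align*}
As $\Sigma$ is symmetric and positive definite, $y^\top \Sigma y \ge 0$ with equality if and only if $y = 0$. Moreover,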
since $\bar\pi, \bar\pi_{\mu_0,r} \in H^{1+d-1}$, it follows that
\[ \sum_{k=0}^d \bar y^k = \bar y \cdot \mathbf{1}_{1+d} = \bar\pi \cdot \mathbf{1}_{1+d} - \bar\pi_{\mu_0,r} \cdot \mathbf{1}_{1+d} = 1 - 1 = 0, \]
which implies in particular that $\bar y \ne 0$ if and only if $y \ne 0$. Hence, (2.17) shows both that $\bar\pi_{\mu_0,r}$ is the unique optimiser and yields the first equality in (2.16). The second equality in (2.16) follows by expanding the denominator and using the definitions of $A$, $B$ and $C$.

We proceed to formulate the analogue of Theorem 2.7. To this end, we also need to consider the notion of efficiency for general portfolios.

Definition 2.11. A portfolio $\bar\pi \in H^{1+d-1}$ is called efficient (in the mean-variance sense) if there does not exist another portfolio $\bar\pi' \in H^{1+d-1}$ such that $\mu_{\bar\pi'} \ge \mu_{\bar\pi}$ and $\sigma^2_{\bar\pi'} \le \sigma^2_{\bar\pi}$ with at least one inequality being strict.

Theorem 2.12. Let $S = (S_t^0, S_t)_{t \in \{0,1\}}$ be an arbitrage-free non-redundant market on some probability space $(\Omega, \mathcal{F}, P)$. Assume that $S$ has finite second moments and $S_0^0, \ldots, S_0^d > 0$. Denote by $\mu$ and $\Sigma$ the mean vector and covariance matrix of the return vector $R$ of the risky assets, and let $r$ be the interest rate. Assume that $\mu - r\mathbf{1} \ne 0$, and set $A = \mathbf{1}^\top \Sigma^{-1} \mathbf{1}$, $B = \mathbf{1}^\top \Sigma^{-1} \mu$, and $C = \mu^\top \Sigma^{-1} \mu$. Define the efficient frontier by
$(\mu_0 - r)^2$

Corollary 2.13. $\bar\pi \in H^{1+d-1}$ is efficient if and only if $\pi = \gamma \Sigma^{-1} (\mu - r\mathbf{1})$ for some $\gamma \in (0, \infty)$ and $\bar\pi = (1 - \pi \cdot \mathbf{1}, \pi)$.

We proceed to formulate the analogue of Theorem 2.8, the solution to the mean-variance problems with a riskless asset. The proof is very similar to the proof of Theorem 2.8 and hence omitted.

Theorem 2.14. Let $S = (S_t^0, S_t)_{t \in \{0,1\}}$ be an arbitrage-free non-redundant market on a probability space $(\Omega, \mathcal{F}, P)$. Assume that $S$ has finite second moments and $S_0^0, \ldots, S_0^d > 0$. Denote by $\mu$ and $\Sigma$ the mean vector and covariance matrix of the return vector $R$ of the risky assets, and let $r$ be the interest rate. Assume that $\mu \ne r\mathbf{1}$. Set $A = \mathbf{1}^\top \Sigma^{-1} \mathbf{1}$, $B = \mathbf{1}^\top \Sigma^{-1} \mu$, and $C = \mu^\top \Sigma^{-1} \mu$.
(1) Let $\mu_{\min} \ge r$ be given. Then the mean-variance problem with a riskless asset
\[ \operatorname*{argmin}_{\bar\pi \in H^{1+d-1}} \sigma^2_{\bar\pi} \quad \text{subject to } \mu_{\bar\pi} \ge \mu_{\min} \]
has a unique solution $\bar\pi_*$ given by $\bar\pi_* = (1 - \pi_{\mu_{\min},r} \cdot \mathbf{1}, \pi_{\mu_{\min},r})$, where
\[ \pi_{\mu_{\min},r} = \frac{\mu_{\min} - r}{Ar^2 - 2Br + C} \Sigma^{-1} (\mu - r\mathbf{1}). \]
It is efficient and satisfies
\[ \mu_{\bar\pi_*} = \mu_{\min} \quad \text{and} \quad \sigma^2_{\bar\pi_*} = \frac{(\mu_{\min} - r)^2}{Ar^2 - 2Br + C}. \]
(2) Let $\sigma^2_{\max} \ge 0$ be given. Then the mean-variance problem with a riskless asset
\[ \operatorname*{argmax} \mu_{\bar\pi} \quad \text{subject to } \sigma^2_{\bar\pi} \le \sigma^2_{\max} \]

2.7 The Markowitz tangency portfolio and the capital market line

Next, we study the relationship between (general) efficient and risk-only efficient portfolios.

Theorem 2.15. Let $S = (S_t^0, S_t)_{t \in \{0,1\}}$ be an arbitrage-free non-redundant market on some probability space $(\Omega, \mathcal{F}, P)$. Assume that $S$ has finite second moments and $S_0^0, \ldots, S_0^d > 0$. Denote by $\mu$ and $\Sigma$ the mean vector and covariance matrix of the return vector $R$ of the
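risky assets. Let $r$ be the interest rate, and assume that $r < \frac{B}{A}$,$^{32}$ where $A = \mathbf{1}^\top \Sigma^{-1} \mathbf{1}$ and $B = \mathbf{1}^\top \Sigma^{-1} \mu$.
(a) There exists a unique efficient portfolio $\bar\pi_{\tan}$, called the Markowitz tangency portfolio, that is risk-only, i.e., $\bar\pi_{\tan} = (0, \pi_{\tan})$. It satisfies$^{33}$
\[ \pi_{\tan} = \frac{1}{B - rA} \Sigma^{-1} (\mu - r\mathbf{1}) \]

\[ \bar\pi = \lambda \bar\pi_{\tan} + (1 - \lambda) \bar\pi_{\min,r} \]
32 Note that the condition $r < \frac{B}{A}$ implies that $\mu \ne r\mathbf{1}$. Indeed, if $\mu = r\mathbf{1}$, then $B = rA$ and so $r = \frac{B}{A}$.
33 Note that $B - rA > 0$ because $r < \frac{B}{A}$ so that $\pi_{\tan}$ is well defined. Moreover, note that if $\mu$ and $\mathbf{1}$ are collinear, then $\pi_{\tan} = \pi_{\min}$.
34 We even have $\mu_{\tan} \ge \frac{B}{A}$. This can be seen as follows: By Cauchy-Schwarz, it follows that $AC - B^2 \ge 0$ (note that we do not assume here that $\mu$ and $\mathbf{1}$ are not collinear). This together with $B - rA > 0$ gives:
\[ \mu_{\tan} = \frac{1}{A}\Big(Ar + \frac{A^2r^2 - 2ABr + AC}{B - rA}\Big) \ge \frac{1}{A}\Big(Ar + \frac{A^2r^2 - 2ABr + B^2}{B - rA}\Big) = \frac{1}{A}\Big(Ar + \frac{(B - rA)^2}{B - rA}\Big) = \frac{1}{A}\big(Ar + (B - rA)\big) = \frac{B}{A}. \]

[Figure: the Markowitz tangency portfolio, the capital market line and the (reparametrised) risk-only efficient frontier $\mathcal{E}$; vertical axis $\mu_\pi$.]

\[ \pi_{\mu_0,r} \cdot \mathbf{1} = \frac{\mu_0 - r}{Ar^2 - 2Br + C} \mathbf{1}^\top \Sigma^{-1} (\mu - r\mathbf{1}) = \frac{\mu_0 - r}{Ar^2 - 2Br + C} (B - rA). \]

where $\mu_\lambda := \lambda \mu_{\tan} + (1 - \lambda) r$. Thus, $\pi = \pi_{\mu_\lambda,r}$, and then also $\bar\pi = \bar\pi_{\mu_\lambda,r}$. Since $\mu_\lambda \ge r$ (this uses that $\mu_{\tan} > r$ and $\lambda \ge 0$), it follows from Corollary 2.13 that $\bar\pi$ is efficient.
Conversely, suppose that $\bar\pi \in H^{1+d-1}$ is efficient. By Theorem 2.12, there is $\mu_0 \ge r$ such that $\bar\pi = \bar\pi_{\mu_0,r}$. Set
\[ \lambda := \frac{\mu_0 - r}{\mu_{\tan} - r}, \]
where $\mu_{\tan}$ is defined as above. Then $\lambda \ge 0$ (because $\mu_0 \ge r$ and $\mu_{\tan} > r$) and $\mu_0 = \lambda \mu_{\tan} + (1 - \lambda) r$. The same calculation as in (2.19) shows that $\pi = \lambda \pi_{\tan} + (1 - \lambda) \pi_{\min,r}$ and then also $\bar\pi = \lambda \bar\pi_{\tan} + (1 - \lambda) \bar\pi_{\min,r}$.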
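The relation between the tangency portfolio and the portfolios $\pi_{\mu_0,r}$ of (2.15) can be illustrated numerically. The following sketch, for two risky assets and exact rational arithmetic, computes $\pi_{\tan} = \Sigma^{-1}(\mu - r\mathbf{1})/(B - rA)$ and checks that the risky part of $\bar\pi_{\mu_0,r}$ is $\lambda \pi_{\tan}$ with $\lambda = (\mu_0 - r)/(\mu_{\tan} - r)$. The market data `MU`, `SIGMA`, `R`, `MU0` and the helper names are assumptions made for illustration.

```python
from fractions import Fraction as F


def inv2(m):
    """Inverse of a 2x2 matrix given as nested lists."""
    (a, b), (c, d) = m
    det = a * d - b * c
    return [[d / det, -b / det], [-c / det, a / det]]


def matvec(m, v):
    return [sum(row[j] * v[j] for j in range(len(v))) for row in m]


def dot(u, v):
    return sum(x * y for x, y in zip(u, v))


# Hypothetical market data (illustrative numbers only).
MU = [F(5, 100), F(10, 100)]
SIGMA = [[F(4, 100), F(1, 100)], [F(1, 100), F(9, 100)]]
R = F(2, 100)  # interest rate, chosen so that r < B/A

ones = [F(1), F(1)]
inv = inv2(SIGMA)
A = dot(ones, matvec(inv, ones))
B = dot(ones, matvec(inv, MU))
C = dot(MU, matvec(inv, MU))

excess = [m - R for m in MU]
s_ex = matvec(inv, excess)                   # Sigma^{-1} (mu - r 1)
pi_tan = [x / (B - R * A) for x in s_ex]     # tangency portfolio
mu_tan = dot(MU, pi_tan)


def risky_part(mu0):
    """Risky weights pi_{mu0,r} from (2.15), written with A r^2 - 2 B r + C."""
    q = A * R ** 2 - 2 * B * R + C
    return [(mu0 - R) / q * x for x in s_ex]


MU0 = F(6, 100)
lam = (MU0 - R) / (mu_tan - R)   # weight on the tangency portfolio
riskless_weight = 1 - lam        # weight on the riskless asset
```

In this example `lam` lies between 0 and 1, so the efficient portfolio with target mean `MU0` holds a positive amount of both the tangency portfolio and the riskless asset.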
Remark 2.16. (a) Theorem 2.15(b) is usually referred to as a mutual fund theorem because it states that every efficient portfolio is a combination of the (efficient) mutual funds $\bar\pi_{\tan}$ and $\bar\pi_{\min,r}$, the first one containing only risky assets, the second one containing only the riskless asset.
(b) In the setting of Theorem 2.15, define the capital market line (CML) by
\[ \mathrm{CML} = \{ (\lambda \mu_{\pi_{\tan}} + (1 - \lambda) r, \lambda \sigma_{\pi_{\tan}}) : \lambda \ge 0 \}. \]
Then it follows from Theorem 2.15 that a portfolio $\bar\pi$ is efficient if and only if it lies on the capital market line in the sense that $(\mu_{\bar\pi}, \sigma_{\bar\pi}) \in \mathrm{CML}$.
(c) Note that the capital market line CML is just a reparametrisation of the efficient frontier $\mathcal{E}$.

2.8 On mean-variance equilibria

We proceed to study what happens if all agents investing in the financial market $S$ are mean-variance optimisers in the sense that they solve the mean-variance problem (1) or (2) (for some choice of $\mu_{\min}$ or $\sigma^2_{\max}$) and hold the corresponding optimal portfolio.
For the arguments that follow, one must be very careful not to run into circular reasoning. So far, we have always assumed that a financial market $S$ is given exogenously, i.e., prices are not influenced by the investment decisions of the market participants. We have then derived optimal trading strategies for agents that are mean-variance optimisers. Now, we want to use the form of these optimal trading strategies to draw conclusions on the structure of the financial market $S$. To avoid circular reasoning, we therefore have to assume a priori that the structure of $S$ is consistent with the derived mean-variance optimal strategies. This is a big assumption. In economic terms, we have to assume a priori that there exists a mean-variance equilibrium.
For the following definition, we need the notion of shares outstanding: For a stock $S^i$, the shares outstanding denotes the total number $\eta^i$ of shares of that stock held by all market participants together.$^{35}$ The shares outstanding times the market value of the stock, $\eta^i S_0^i$, gives the market capitalisation of the stock.
35 In addition to the shares outstanding (which have legal ownership rights), there are also treasury shares, which are shares held by the corporation issuing the shares itself and have no exercisable rights. The issued shares is the sum of the shares outstanding and the treasury shares.

called a mean-variance equilibrium, if there exist market participants $1, \ldots, K$ with portfolios $\bar\vartheta_1, \ldots, \bar\vartheta_K$ (parametrised in numbers of shares) such that
(1) Each individual portfolio $\bar\vartheta_k$ is mean-variance optimal.$^{36}$
(2) The stock markets clear:
\[ \sum_{k=1}^K \vartheta_k = \eta. \]
36 This means that the corresponding portfolio $\bar\pi_k$ in fractions of wealth is mean-variance optimal, i.e., a solution to the mean-variance problem (1) or (2) (for some choice of $\mu_{\min}$ or $\sigma^2_{\max}$).

Remark 2.18. (a) Property (1) in Definition 2.17 is usually referred to as individual optimality, and property (2) as market clearing. Both requirements together are at the core of the concept of equilibrium, which extends beyond mean-variance preferences. Note that the market clearing condition (2) consists in fact of $d$ conditions, one for each stock $S^1, \ldots, S^d$.
(b) We have not specified how many shares of the bank account $S^0$ are outstanding nor have we required market clearing for the bank account. It is usually assumed that there are 0 shares outstanding of the bank account, and in this case one says that the bank account is in zero net supply. If we made this assumption, we would also have to require that $\sum_{k=1}^K \vartheta_k^0 = 0$. However, in the context of the CAPM, this does not really matter.
(c) In the context of equilibrium, it is often assumed (w.l.o.g.) that $\eta = \mathbf{1}$, in which case one says that the stocks are in unit net supply.

2.9 The Capital Asset Pricing Model (CAPM)

Assuming that a mean-variance equilibrium exists, we can now deduce the Capital Asset Pricing Model (CAPM). This model was developed in the 1960s by Treynor, Sharpe (who was awarded the Nobel prize in Economics in 1990), Lintner and Mossin.
If $S = (S_t^0, S_t)_{t \in \{0,1\}}$ is a market and $\eta \in \mathbb{R}^d_{++}$ the number of risky shares outstanding, then the (risk-only) portfolio in fractions of wealth corresponding to $\eta$ is called the market portfolio and denoted by $\pi_m$. To wit,
\[ \pi_m^i = \frac{\eta^i S_0^i}{\eta \cdot S_0}, \quad i \in \{1, \ldots, d\}. \]
We denote the return of the market portfolio by $R_m$, i.e., $R_m := R_{\pi_m}$.

mean-variance equilibrium.
(a) The market portfolio $\pi_m$ equals the Markowitz tangency portfolio $\pi_{\tan}$.
(b) For all $\bar\pi \in H^{1+d-1}$, we have the CAPM formula:

\[ \sum_{k=1}^K \vartheta_k = \eta. \]
Denote by $\bar\pi_k \in H^{1+d-1}$ the portfolio parametrised in fractions of wealth corresponding to $\bar\vartheta_k$. It follows from Theorem 2.14 and the fact that $\bar\vartheta_k$ is mean-variance optimal, that each $\bar\pi_k$ is efficient. Hence, for each $k \in \{1, \ldots, K\}$, Theorem 2.15(b) gives $\lambda_k \ge 0$ such that
\[ \pi_k = \lambda_k \pi_{\tan}, \]

\[ \pi_m^i = \frac{\eta^i S_0^i}{\eta \cdot S_0} = \frac{\sum_{k=1}^K x_k \lambda_k \pi_{\tan}^i}{\sum_{k=1}^K x_k \lambda_k \pi_{\tan} \cdot \mathbf{1}} = \pi_{\tan}^i, \quad i \in \{1, \ldots, d\}. \]
(b). Fix $\bar\pi \in H^{1+d-1}$. Then by (a), a similar calculation as in Lemma 2.3, Theorem 2.15(a) and Proposition 2.9, we obtain

2.10 Criticism of mean-variance portfolio selection and the CAPM

Mean-variance portfolio selection and the CAPM are very appealing for their tractability, and so they have rightly become cornerstones of modern financial theory. Notwithstanding, there are important points which one can and should criticise from a conceptual and empirical perspective:$^{37}$
The mean-variance criterion assumes that the variance of the return of a portfolio is a good measure of the risk related to the portfolio. However, if returns are not normally

linear regression version of Remark 2.20(c) has been extended to multi-factor models like the three-factor model by Fama and French or the four-factor model by Carhart.
37 Here, we do not consider the criticism that mean-variance portfolio selection and the CAPM ignore frictions, trading constraints, etc. Moreover, we do not consider the criticism that markets might be inefficient or that market participants might act in a non-rational or only partially rational way (which is studied in the field of Behavioural Finance).
38 This is the book value of a company (calculated via balance sheet considerations) divided by its market capitalisation.
2.11 Factor models and CAPM In general the right -hand side of (2.24) does not vanish.
39 it is assumed that the returns on stocks depend on the values of a small (b) Writing Ri as in (2.21), it follows from (2.23) that
In a factor model
uncorrelated with both the factors and the residuals for other securities). d
X d
X d
X
Rπ = π · R = π 0 r + πi · Ri = π 0 r + πir + π i β i (Rm − r) + π i εi
Typically in a factor model K is small and d is large.
i=1 i=1 i=1
If we combine a single factor model with CAPM so that K = 1 and F = F 1 = Rm −E[Rm ]
= r + βπ (Rm − r) + επ ,
then
Remark 2.20. (a) The single factor model with CAPM that we have written down does not,
Var[Rπ ] = βπ2 σπ2m + Var[επ ]. (2.25)
However, it does not follow that Cov[εi , εj ] = 0 for i ̸= j as would be required for a model Thus, it follows from (2.25) that Var[επ ] = 0.41
with a factor structure. Indeed, (c) Without the assumptions of the CAPM, one can always solve the (multiple) linear
regression
41 42
where the $\varepsilon^i$ are mean-zero and uncorrelated (this is an assumption!) error terms. This model is also called the single index model (SIM).$^{42}$ The constant term/intercept $\alpha^i$ in this regression is usually referred to as Jensen's alpha or just alpha and called the abnormal return. Note that in the CAPM setting, the alpha is zero.
42 In its original version, the $\varepsilon^i$ are also assumed to follow a multivariate normal distribution.

3 Utility Theory

In this chapter, we seek to systematically describe preferences of an investor who has to compare random outcomes like the future payoff of a financial asset or the return of a portfolio. To this end, we will follow the axiomatic approach proposed by von Neumann and Morgenstern.

level of probability distributions than on the level of random variables (even though the latter might be more intuitive from an economic perspective). For this reason, we will throughout
\[ P^X[B] := P[X \in B]. \]
$\nu$-integrable, we set
\[ \int h(x) \, \nu(dx) := E^\nu[h] \]
Lemma 3.1. Let $(\Omega, \mathcal{F}, P)$ be a probability space and $D \subset \mathbb{R}$ a nonempty interval. Moreover, let $X$ be a $D$-valued random variable with distribution $\nu := P^X$ and $h : D \to \mathbb{R}$ a measurable function. Then $h(X)$ is $P$-integrable if and only if $h$ is $\nu$-integrable and in this case
\[ E[h(X)] = \int h(x) \, \nu(dx). \]
43 The main examples are $D = \mathbb{R}$, $D = [0, \infty)$ and $D = (0, \infty)$.
44 Note that usually the expectation is defined via the integral (and not vice versa).

\[ \delta_x(B) := \begin{cases} 1 & \text{if } x \in B, \\ 0 & \text{if } x \notin B. \end{cases} \]
The Dirac measure represents the distribution of a $D$-valued random variable $X$ that takes
the value x with probability 1, i.e., X = x P -a.s. Note that for any measurable function (a) Completeness: For all ν1 , ν2 ∈ M, either ν1 ⪰ ν2 or ν2 ⪰ ν1 or both are true.
h : D → R, Z
(b) Transitivity: If ν1 ⪰ ν2 and ν2 ⪰ ν3 , then also ν1 ⪰ ν3 .
h(y) δx (dy) = h(x).
If ν1 ⪰ ν2 , we say that ν1 is weakly preferred over ν2 .
In the sequel, we also need the notion of the mixture of two distributions. If ν1 and ν2
are probability distributions on (D, BD ) and α ∈ [0, 1], then the distribution αν1 + (1 − α)ν2 If ν1 ⪰ ν2 and ν2 ⪰ ν1 , we say to be indierent between ν1 and ν2 , and write ν1 ∼ ν2 . By
is called the mixture of ν1 and ν2 with weights α and (1 − α).45 If (Ω, F, P ) is a probability contrast, if ν1 ⪰ ν2 and ν2 ⪰̸ ν1 , we say that ν1 is strictly preferred over ν2 and write ν1 ≻ ν2 .
space, then a random variable X has distribution αν1 + (1 − α)ν2 , if and only if there is a
Example 3.4. Let M be the (convex) set of all probability measures on (R, B) with nite
Bernoulli random variable Y with P [Y = 1] = α and P [Y = 0] = (1 − α) such that X has R 2 ν( dx)
second moment, i.e., ν ∈M if and only if
Rx < ∞. For ν ∈M denote by µν :=
conditional distribution ν1 given that Y = 1 and conditional distribution ν2 given that Y = 0.46
mean σν2 := 2 variance ν .47
R R
R x ν( dx) the of ν and by
R (x − µν ) ν( dx) the of
Warning: If X1 is a random variable with distribution ν1 and X2 is a random variable
with distribution ν2 , then in general αX1 +(1−α)X2 does not have distribution αν1 +(1−α)ν2 .
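The warning can be made concrete with two Dirac measures. The sketch below represents a finitely supported distribution as a dictionary from values to probabilities; this representation, the function names and the additional assumption that $X_1$ and $X_2$ are independent are illustrative choices, not part of the notes.

```python
from collections import defaultdict
from fractions import Fraction as F


def mixture(alpha, nu1, nu2):
    """Distribution alpha*nu1 + (1 - alpha)*nu2 (mixture of the two laws)."""
    out = defaultdict(F)  # F() is the fraction 0
    for x, p in nu1.items():
        out[x] += alpha * p
    for x, p in nu2.items():
        out[x] += (1 - alpha) * p
    return dict(out)


def law_of_combination(alpha, nu1, nu2):
    """Law of alpha*X1 + (1 - alpha)*X2 for independent X1 ~ nu1, X2 ~ nu2."""
    out = defaultdict(F)
    for x1, p1 in nu1.items():
        for x2, p2 in nu2.items():
            out[alpha * x1 + (1 - alpha) * x2] += p1 * p2
    return dict(out)


delta_0 = {F(0): F(1)}   # Dirac measure in 0
delta_1 = {F(1): F(1)}   # Dirac measure in 1
half = F(1, 2)

mixed = mixture(half, delta_0, delta_1)                # fair coin on {0, 1}
combined = law_of_combination(half, delta_0, delta_1)  # point mass in 1/2
```

The mixture puts mass 1/2 on each of 0 and 1, whereas the convex combination of the random variables is constant equal to 1/2, so the two distributions differ.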
(a) The binary relation ⪰ dened by
R
The following result states that the integral h(x) ν(dx) is linear not only in the integrand
h but also in the integrator ν. ν1 ⪰ ν2 if and only if µν1 ≥ µν2 ,
Lemma 3.2. Let D ⊂ R be a nonempty interval, ν1 , ν2 probability measures on (D, BD ) and is a preference order.
α ∈ [0, 1]. Then αν1 + (1 − α)ν2 is again a probability measure on (D, BD ). Moreover, if
(b) The binary relation ⪰ dened by
h : D → R is a measurable function that is integrable with respect to ν1 and ν2 , then h is
also integrable with respect to αν1 + (1 − α)ν2 , and we have ν1 ⪰ ν2 if and only if µν1 ≥ µν2 and σν21 ≤ σν22
Z Z Z
h(x) αν1 + (1 − α)ν2 (dx) = α h(x) ν1 (dx) + (1 − α) h(x) ν2 (dx). is not a preference order because it fails to be complete.
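Let $D \subset \mathbb{R}$ be a nonempty interval. A probability measure $\nu$ on $(D, \mathcal{B}_D)$ is also called a lottery
\[ \nu_1 \succeq \nu_2 \quad \text{if and only if} \quad \mu_{\nu_1} - \frac{\gamma}{2} \sigma^2_{\nu_1} \ge \mu_{\nu_2} - \frac{\gamma}{2} \sigma^2_{\nu_2}, \]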
(on D). It is called a simple lottery (on D) if it is a mixture of nitely many Dirac measures,
Denition 3.3. Let D⊂R be a nonempty interval and M a nonempty convex subset of all another description of preference orders that encodes preferences by a single mathematical
probability measures on (D, BD ). A preference order on M is a binary relation ⪰ with the object.
following properties: In a seminal paper in 1944, von Neumann and Morgenstern showed that many preference
45 46
Denition 3.5. Let D⊂R be a nonempty interval and M a nonempty convex subset of all Now consider the following lotteries:
prefer the slightly riskier lottery ν2′ because it has the higher expectation (µν ′
1
= 816 and
Remark 3.6. Linearity of the expectation and the fact that inequalities remain unchanged by µ ν2′ = 825).
multiplication with positive constants imply that a von Neumann-Morgenstern representation However, the choice ν1 ≻ ν2 together with the choice ν2′ ≻ ν1′ violates the independence
can only be unique up to a positive ane transformation, i.e., if U describes a preference axiom.
order, then aU + b, where a>0 and b ∈ R, describe the same preference order. Indeed, if the independence axiom were satised, then
Our goal is now to nd axioms for preference orders that together imply a von Neumann- 1 1 1 1
ν1 + ν2′ ≻ ν2 + ν2′ and
1 ′ 1 1 1
ν + ν2 ≻ ν1′ + ν2
Morgenstern representation. Surprisingly, essentially only two axioms are needed to ensure a 2 2 2 2 2 2 2 2 2
von Neumann-Morgenstern representation Transitivity of ≻ yields
48
Denition 3.9. Let D ⊂ R be an nonempty interval and M a nonempty convex subset of all
for all lotteries ν3 ∈ M and all α ∈ (0, 1).
probability measures on (D, BD ). A preference order ⪰ on M is said to satisfy the continuity
The independence axiom says that if we strictly prefer lottery ν1 over lottery ν2 , then axiom if for any triple ν1 ≻ ν2 ≻ ν3 , there is α ∈ (0, 1) such that
we should also strictly prefer the mixed lottery αν1 + (1 − α)ν3 over the mixed lottery
αν2 + (1 − α)ν3 . From a normative perspective, this is quite reasonable: Comparing the αν1 + (1 − α)ν3 ∼ ν2 .
mixed lotteries, with probability α, we have to choose between ν1 and ν2 , and with probabil-
The continuity axiom says that if a lottery ν2 lies preference-wise strictly in between two
ity (1−α), we do not have to make any choice because we get the lottery ν3 . The independence
other lotteries ν1 and ν3 , then there is a convex combination of ν1 and ν3 such that one is
axiom says that the conditional and the unconditional choice should coincide.
indierent between ν2 and this convex combination.
Even though the independence axiom has a good theoretical foundation, it is not clear if
For simple lotteries, the independence and the continuity axiom together imply a von
it reflects people's preferences in practice.
Neumann-Morgenstern representation. For a proof of the following result, we refer to [2, Section
Example 3.8 (Allais' Paradox). First, consider the following lotteries: 2.2].
49
ν1 := δ2400 , Theorem 3.10. Let M denote the collection of all simple lotteries on (D, BD ), where D ⊂ R
is a nonempty interval. Let ⪰ be a preference order on M satisfying the independence and
ν2 := 0.33δ2500 + 0.66δ2400 + 0.01δ0 .
48
⪰ implies transitivity of ≻. Indeed, if ν1 ≻ ν2 and ν2 ≻ ν3 , then a fortiori ν1 ⪰ ν2
Note that transitivity of
and ν2 ⪰ ν3 . So transitivity of ⪰ gives ν1 ⪰ ν3 . Seeking a contradiction, suppose that ν1 ⊁ ν3 . Then ν3 ⪰ ν1 ,
Which one would you choose? Empirical studies show that most people would prefer the sure and as ν1 ⪰ ν2 , transitivity of ⪰ gives ν3 ⪰ ν2 . But this is in contradiction to ν2 ≻ ν3 .
Note that the result is wrong for general lotteries on (D, BD ). For the general case, one needs stronger
49
amount and so choose lottery ν1 .
continuity properties of ⪰; see [2, Theorems 2.27 and 2.29].
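The violation of the independence axiom in Allais' paradox (Example 3.8) rests on the fact that two half-half mixtures coincide as lotteries. The sketch below checks this with exact fractions. The lotteries `nu1` and `nu2` are as in the example; the primed lotteries `nu1p` and `nu2p` are an assumption, chosen so that their expectations match the values 816 and 825 quoted in the text.

```python
from fractions import Fraction as F


def mix(alpha, nu_a, nu_b):
    """Mixture alpha*nu_a + (1 - alpha)*nu_b of two simple lotteries."""
    out = {}
    for nu, w in ((nu_a, alpha), (nu_b, 1 - alpha)):
        for x, p in nu.items():
            out[x] = out.get(x, 0) + w * p
    return out


def mean(nu):
    """Expectation of a simple lottery given as {payoff: probability}."""
    return sum(x * p for x, p in nu.items())


nu1 = {2400: F(1)}
nu2 = {2500: F(33, 100), 2400: F(66, 100), 0: F(1, 100)}
# Assumed forms of the second pair of lotteries (hypothetical, see lead-in).
nu1p = {2400: F(34, 100), 0: F(66, 100)}
nu2p = {2500: F(33, 100), 0: F(67, 100)}

half = F(1, 2)
lhs = mix(half, nu1, nu2p)   # (1/2) nu1  + (1/2) nu2'
rhs = mix(half, nu1p, nu2)   # (1/2) nu1' + (1/2) nu2
```

Under these assumptions `lhs` and `rhs` are the same lottery, so a strict preference of one over the other, as the independence axiom and transitivity would force, is impossible.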
47 48
U (x) suppose that E [|U (X)|] < ∞. Then
E [U (X)] ≤ U (E [X]) .
Moreover, the inequality is strict when U is strictly concave and X is not P -a.s. constant.
Proof. First, using the definition of concavity, one can show that for each a ∈ D, there is
0 x1 x2 x b∈R such that
Z Z
U (X) ≤ U (a) + b(X − a).
ν1 ⪰ ν2 ⇔ U (x) ν1 (dx) ≥ U (x) ν2 (dx),
and
where the measurable function U : D → R is unique up to a positive ane transformation.
P [U (X) < U (a) + b(X − a)] = P [X ̸= a] > 0,
3.4 Concave functions and Jensen's inequality if U is strictly concave and X is not P -a.s. constant. Thus, by monotonicity and linearity of
where the inequality is strict if U is strictly concave and X is not P -a.s. constant.
We proceed to state and prove the fundamental inequality for concave functions. monotone if δx ≻ δy for x > y , x, y ∈ D,
Lemma 3.13 (Jensen's inequality). Let (Ω, F, P ) be a probability space and X an integrable 51
If U is twice continuously differentiable, the (weak) inequality (3.2) can be easily derived as follows: Fix
a ∈ D and set b := U ′ (a). By a Taylor expansion of U in a of order 1 with Lagrange remainder term, we
random variable with values in a non-empty interval D ⊂ R. Let U : D → R be concave and obtain for xed x∈D
1 ′′
50
The converse is not true: For example, the function U : R → R, x 7→ −x4 is strictly concave, but U ′′ (0) = 0. U (x) = U (a) + b(x − a) + U (ξ)(x − a)2 ,
2
where ξ lies in the interval with the endpoints x and a. Since U ′′ ≤ 0 by concavity of U, (3.2) follows.
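Jensen's inequality (Lemma 3.13) is easy to check on a small discrete example. The sketch below uses the strictly concave function $U = \log$ and a two-point random variable; the distribution `X` and the helper `expectation` are hypothetical and serve only as an illustration. An affine function is included to show the case of equality.

```python
import math


def expectation(f, dist):
    """E[f(X)] for a discrete X given as a list of (value, probability)."""
    return sum(p * f(x) for x, p in dist)


def identity(x):
    return x


def affine(x):
    """Concave but not strictly concave."""
    return 2.0 * x + 1.0


# Hypothetical two-point distribution: X = 1 or X = 4 with probability 1/2.
X = [(1.0, 0.5), (4.0, 0.5)]

mean_x = expectation(identity, X)       # E[X]
lhs_log = expectation(math.log, X)      # E[U(X)] for U = log
rhs_log = math.log(mean_x)              # U(E[X])
lhs_affine = expectation(affine, X)
rhs_affine = affine(mean_x)
```

For the logarithm the inequality is strict because `X` is not constant, while for the affine function both sides agree.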
49 50
risk averse if δ_{µ_ν} ≻ ν for ν ∈ M unless ν = δ_{µ_ν}, where µ_ν = ∫ x ν(dx).

Monotonicity of a preference order means that we strictly prefer a higher sure amount over a lower sure amount, i.e., we strictly prefer more to less. This is a very natural assumption both from a conceptual and an empirical perspective.

Risk-aversion of a preference order means that for a lottery ν ∈ M, we strictly prefer to receive the actuarially fair value µ_ν over the lottery ν itself, unless, of course, the lottery is deterministic and there is no difference between the two. While risk aversion is a natural requirement from a normative perspective, in reality, there are persons who are not risk averse but even risk-seeking, i.e., they strictly prefer a lottery over the actuarially fair value.52

The following result characterises monotonicity and risk aversion for preference orders that admit a von-Neumann-Morgenstern representation.

Lemma 3.15. Let D ⊂ R be a nonempty interval and M a nonempty convex subset of all probability measures on (D, B) having a finite expectation, and assume that M contains all Dirac measures δ_x for x ∈ D. Suppose a preference order ⪰ on M has a von Neumann-Morgenstern representation
\[ \nu_1 \succeq \nu_2 \;\Leftrightarrow\; \int U(x)\,\nu_1(dx) \ge \int U(x)\,\nu_2(dx), \]

\[ U(x) = \int U(z)\,\delta_x(dz). \tag{3.3} \]

… integrator, and (3.3), we obtain
\[ U(\alpha x + (1-\alpha) y) = \int U(z)\,\delta_{\alpha x + (1-\alpha) y}(dz) > \int U(z)\,(\alpha\delta_x + (1-\alpha)\delta_y)(dz) = \alpha \int U(z)\,\delta_x(dz) + (1-\alpha) \int U(z)\,\delta_y(dz) = \alpha U(x) + (1-\alpha) U(y). \]
Thus, it follows that U is strictly concave.53

Conversely assume that U is strictly concave. Let ν ∈ M with ν ≠ δ_{µ_ν}. Then by Jensen's inequality (Lemma 3.13), it follows that
\[ \int U(x)\,\delta_{\mu_\nu}(dx) = U(\mu_\nu) = U\Big(\int x\,\nu(dx)\Big) > \int U(x)\,\nu(dx). \]
Thus, we may deduce that δ_{µ_ν} ≻ ν.

Definition 3.16. Let D ⊂ R be a nonempty interval. A function U : D → R is called a utility function if U is continuous, strictly increasing and strictly concave.54

Definition 3.17. Let D ⊂ R be a nonempty interval and M be a nonempty convex subset
U (x) = − exp(−γx)
Logarithmic utility. Let D = (0, ∞). Then the function U : D → R given by
\[ U(x) = \log(x) \]
is called logarithmic utility.

Remark 3.18. Logarithmic utility can be seen as power utility with parameter γ = 1. Indeed,

Agents having preferences admitting an expected utility representation are risk averse by Lemma 3.15. In this section, we try to quantify in terms of their utility function how risk

i.e., agents are indifferent between receiving the sure amount c_U(ν) or the lottery ν. For this reason, c_U(ν) is called the certainty equivalent of ν for the utility function U. Moreover,
\[ \rho_U(\nu) := \mu_\nu - c_U(\nu) \]
is called the risk premium of ν for the utility function U, where µ_ν := ∫ x ν(dx) denotes the actuarially fair value of ν. By risk aversion and monotonicity of ⪰ it follows that ρ_U(ν) ≥ 0, where the inequality is strict unless ν = δ_{µ_ν}.

We proceed to study how the risk premium depends on the utility function U. To this end, we argue heuristically and also assume that U is twice continuously differentiable and ν has second moments. First, a Taylor expansion of order 1 on the left-hand side of (3.4) gives
\[ U(c_U(\nu)) \approx U(\mu_\nu) + U'(\mu_\nu)(c_U(\nu) - \mu_\nu) = U(\mu_\nu) - U'(\mu_\nu)\,\rho_U(\nu). \tag{3.5} \]
Next, a Taylor expansion under the integral of order 2 on the right-hand side of (3.4) and linearity of the integral gives
\[ \int U(x)\,\nu(dx) \approx \int \Big( U(\mu_\nu) + U'(\mu_\nu)(x - \mu_\nu) + \tfrac{1}{2} U''(\mu_\nu)(x - \mu_\nu)^2 \Big)\,\nu(dx) = U(\mu_\nu) + U'(\mu_\nu)(\mu_\nu - \mu_\nu) + \tfrac{1}{2} U''(\mu_\nu)\,\sigma_\nu^2 = U(\mu_\nu) + \tfrac{1}{2} U''(\mu_\nu)\,\sigma_\nu^2. \tag{3.6} \]

… coefficient −U″(µ_ν)/U′(µ_ν). For this reason, for a utility function U : D → R that is twice continuously differentiable in the interior of D, the function A_U : D° → (0, ∞) defined by

where σ_ν²/µ_ν² = ∫ ((x − µ_ν)/µ_ν)² ν(dx) denotes the relative variance of ν. So the relative risk premium of ν is approximately 1/2 times the relative variance of ν times the coefficient −µ_ν U″(µ_ν)/U′(µ_ν). For this reason, for a utility function U : D → R that is twice continuously differentiable in the interior of D, the function R_U : D° → (0, ∞), defined by
\[ R_U(x) = x A_U(x) := -\frac{x\,U''(x)}{U'(x)} \]
is called the Arrow-Pratt coefficient of relative risk aversion of U. The higher the relative risk aversion R_U of a utility function U, the more risk averse an agent is.

Example 3.19. (a). Let U : R → R be an exponential utility function with parameter γ > 0, i.e., U(x) = −exp(−γx). Then
\[ A_U(x) = -\frac{U''(x)}{U'(x)} = -\frac{-\gamma^2 \exp(-\gamma x)}{\gamma \exp(-\gamma x)} = \gamma. \]
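The Arrow-Pratt coefficients can also be checked numerically. The following is a minimal sketch, not part of the original notes: the helper name `arrow_pratt`, the step size `h` and the choice γ = 2 are illustrative assumptions. It approximates U′ and U″ by central finite differences and evaluates A_U(x) = −U″(x)/U′(x) and R_U(x) = x A_U(x) for exponential utility and for power utility U(x) = x^{1−γ}/(1−γ).

```python
import math

def arrow_pratt(U, x, h=1e-4):
    """Return finite-difference estimates (A_U(x), R_U(x)).

    A_U(x) = -U''(x)/U'(x) is the absolute, R_U(x) = x * A_U(x) the
    relative Arrow-Pratt coefficient. h is an assumed step size.
    """
    d1 = (U(x + h) - U(x - h)) / (2 * h)            # approximates U'(x)
    d2 = (U(x + h) - 2 * U(x) + U(x - h)) / h ** 2  # approximates U''(x)
    a = -d2 / d1
    return a, x * a

gamma = 2.0  # illustrative risk-aversion parameter

def exp_utility(x):
    return -math.exp(-gamma * x)

def power_utility(x):
    return x ** (1 - gamma) / (1 - gamma)

for x in (0.5, 1.0, 3.0):
    a_exp, _ = arrow_pratt(exp_utility, x)
    _, r_pow = arrow_pratt(power_utility, x)
    # Both columns should be close to gamma, independently of x.
    print(x, round(a_exp, 4), round(r_pow, 4))
```

Up to discretisation error, the absolute coefficient of the exponential utility and the relative coefficient of the power utility do not depend on x, which is the content of the acronyms CARA and CRRA.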
So for exponential utility, the absolute risk aversion is constant and equal to the parameter γ. For this reason, exponential utility is also called CARA utility, where CARA is an acronym for constant absolute risk aversion.

(b). Let U : D → R be a power utility function with parameter γ ∈ (0, ∞) \ {1}, where D = [0, ∞) for γ ∈ (0, 1) and D = (0, ∞) for γ ∈ (1, ∞), i.e., U(x) = x^{1−γ}/(1−γ). Then
\[ R_U(x) = -\frac{x(-\gamma x^{-\gamma-1})}{x^{-\gamma}} = \gamma. \]
So for power (and logarithmic) utility, the relative risk aversion is constant and equal to the parameter γ (1 for the logarithmic case). For this reason, power utility is also called CRRA utility, where CRRA is an acronym for constant relative risk aversion.

which is an unconstrained optimisation problem.56 Finally, if D is invariant by multiplication with positive constants, we can define the function Ũ : D → R by
\[ \tilde U(x) = U((1 + r)x), \quad x \in D. \]
Then Ũ is again a utility function, and (3.9) is equivalent to
\[ E\big[\tilde U\big(x_0 + \vartheta \cdot (X_1 - X_0)\big)\big] \to \max_{\vartheta \in \mathbb{R}^d}!\,{}^{57} \tag{3.10} \]
For the following theorem, we need one key result from Measure Theory. For a proof and more information, we refer to [5, Chapter 28].
3.7 A primer on utility maximisation

We now consider an investor who wants to invest in a financial market S̄ = (S_t^0, S_t)_{t∈{0,1}} on some probability space (Ω, F, P). Recall that S_0^0 = 1 and S_1^0 = 1 + r, where r > −1 denotes the interest rate. The investor has initial wealth x_0 > 0 and wants to maximise her final wealth ϑ̄ · S̄_1 among all feasible strategies ϑ̄ ∈ R^{1+d} which satisfy ϑ̄ · S̄_0 = x_0. We assume that the preferences of the investor are described by a utility function U : D → R. So she solves
\[ E\big[U\big(\bar\vartheta \cdot \bar S_1\big)\big] \to \max_{\bar\vartheta \in \mathbb{R}^{d+1}}! \quad \text{subject to} \quad \bar\vartheta \cdot \bar S_0 = x_0. \tag{3.8} \]
Of course, we also need to make sure that the expectation in (3.8) is well defined; in particular we need to check that ϑ̄ · S̄_1 ∈ D P-a.s.

The problem (3.8) is surprisingly difficult, and closed-form solutions are available only in very special cases. So we proceed to study existence and uniqueness of solutions to (3.8). To this end, we reformulate (3.8) in terms of the discounted risky assets X = S/S^0. If ϑ̄ ∈ R^{d+1} is such that

Theorem 3.20 (Radon-Nikodým). Let (Ω, F, P) be a probability space. A probability measure Q on (Ω, F) is equivalent to P if and only if there exists an F-measurable P-integrable random variable Z > 0 P-a.s. such that
\[ Q[A] = E^P[Z 1_A] \]
for all A ∈ F. Moreover, if it exists, Z is P-a.s. unique.

If Q ≈ P and Z is as in Theorem 3.20, we often write Z = dQ/dP and call Z or dQ/dP the Radon-Nikodým derivative of Q with respect to P. Note that E^P[Z] = Q[Ω] = 1. We also have the following important corollary.

Corollary 3.21. Let (Ω, F, P) be a probability space, Q ≈ P an equivalent probability measure and Z = dQ/dP the corresponding Radon-Nikodým derivative. Then a random variable X is Q-integrable if and only if ZX is P-integrable, in which case
\[ E^Q[X] = E^P[ZX]. \]
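On a finite sample space, Theorem 3.20 and Corollary 3.21 reduce to elementary arithmetic, which the following sketch illustrates. The three-point measures `p` and `q`, the random variable `x` and the helper names are illustrative assumptions, not taken from the notes; the only requirement is that every state has positive probability under both measures.

```python
def rn_derivative(p, q):
    """Density Z = dQ/dP on a finite sample space with P[{w}] > 0 for all w."""
    return [qi / pi for pi, qi in zip(p, q)]

def expectation(weights, values):
    """Expectation of a random variable under a measure given by its weights."""
    return sum(w * v for w, v in zip(weights, values))

p = [0.5, 0.3, 0.2]    # assumed reference measure P
q = [0.2, 0.3, 0.5]    # assumed equivalent measure Q
x = [-1.0, 0.0, 2.0]   # an arbitrary random variable X

z = rn_derivative(p, q)

# Corollary 3.21: E^Q[X] = E^P[Z X]
e_q = expectation(q, x)
e_p_zx = expectation(p, [zi * xi for zi, xi in zip(z, x)])

# Theorem 3.20 for the event A = {first state, third state}
indicator_a = [1.0, 0.0, 1.0]
q_of_a = expectation(q, indicator_a)
e_p_z1a = expectation(p, [zi * ai for zi, ai in zip(z, indicator_a)])

print(e_q, e_p_zx)               # the two values agree up to rounding
print(q_of_a, e_p_z1a)           # likewise
print(expectation(p, z))         # E^P[Z] = Q[Omega] = 1 up to rounding
```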
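So (3.8) is equivalent to
\[ E\big[U\big((1 + r)(x_0 + \vartheta \cdot (X_1 - X_0))\big)\big] \to \max_{\vartheta \in \mathbb{R}^d}!, \tag{3.9} \]

55 Note that for ϑ ∈ R^d, setting ϑ̄ := (ϑ^0, ϑ) := (x_0 − ϑ · X_0, ϑ) gives
\[ \bar\vartheta \cdot \bar X_0 = \vartheta^0 \times 1 + \vartheta \cdot X_0 = x_0 - \vartheta \cdot X_0 + \vartheta \cdot X_0 = x_0. \]
56 Of course, we still have the constraint that (1 + r)(x_0 + ϑ · (X_1 − X_0)) ∈ D P-a.s.
57 We still have the constraint that x_0 + ϑ · (X_1 − X_0) ∈ D P-a.s.

Let A ∈ F. Then
\[ Q[A] = \sum_{\{n:\,\omega_n \in A\}} Q[\{\omega_n\}] = \sum_{\{n:\,\omega_n \in A\}} \frac{Q[\{\omega_n\}]}{P[\{\omega_n\}]}\, P[\{\omega_n\}] = \sum_{\{n:\,\omega_n \in A\}} Z(\omega_n)\, P[\{\omega_n\}] = \sum_{n=1}^{N} Z(\omega_n)\, 1_A(\omega_n)\, P[\{\omega_n\}] = E^P[Z 1_A], \]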
and so Z = dQ/dP.

The following result gives, for the domain D = [0, ∞) and under relatively weak assumptions, existence, uniqueness and further properties of the expected utility maximisation problem in its simplified version (3.10), where we write again U instead of Ũ.

Theorem 3.23. Let S̄ = (S_t^0, S_t)_{t∈{0,1}} be a non-redundant market on some probability space (Ω, F, P), and assume that S̄ has finite first moments, i.e., E[|S_t^i|] < ∞ for i ∈ {0, . . . , d} and t ∈ {0, 1}. Let U : [0, ∞) → R be a utility function that is continuously differentiable on (0, ∞) and satisfies U(0) = 0 and lim_{x→∞} U(x) = +∞. Fix x > 0 and set
\[ A(x) := \{\vartheta \in \mathbb{R}^d : x + \vartheta \cdot (X_1 - X_0) \ge 0 \ P\text{-a.s.}\}, \]
\[ u(x) := \sup_{\vartheta \in A(x)} E\big[U\big(x + \vartheta \cdot (X_1 - X_0)\big)\big], \]
where X = S/S^0 denotes the discounted risky assets. Then:
(a) The set A(x) is convex. It is compact if and only if S̄ satisfies NA.
(b) We have u(x) < ∞ if and only if S̄ satisfies NA.

Sketch of part (c) and (d) of Theorem 3.23. (c). First, we establish existence of ϑ*. By part (b), u(x) < ∞. So there is a sequence (ϑ_n)_{n∈N} in A(x) such that
\[ \lim_{n\to\infty} E[U(x + \vartheta_n \cdot (X_1 - X_0))] = u(x) < \infty. \]
Since A(x) is compact by part (a), by the Bolzano-Weierstraß theorem, there exists a subsequence, denoted again by (ϑ_n)_{n∈N}, converging to some ϑ* ∈ A(x). Assuming that we can interchange expectation and limits and using continuity of U, we obtain
\[ u(x) = \lim_{n\to\infty} E[U(x + \vartheta_n \cdot (X_1 - X_0))] = E\Big[\lim_{n\to\infty} U(x + \vartheta_n \cdot (X_1 - X_0))\Big] = E[U(x + \vartheta^* \cdot (X_1 - X_0))]. \]
Next, we establish uniqueness of ϑ*. Seeking a contradiction, let ϑ̃* ∈ A(x) with ϑ̃* ≠ ϑ* be such that E[U(x + ϑ̃* · (X_1 − X_0))] = u(x). Set ϑ̂* := ½ϑ* + ½ϑ̃*. Then ϑ̂* ∈ A(x) by convexity of A(x). By strict concavity of U and nonredundancy of S̄,
\[ U\big(x + \hat\vartheta^* \cdot (X_1 - X_0)\big) \ge \tfrac{1}{2}\, U\big(x + \vartheta^* \cdot (X_1 - X_0)\big) + \tfrac{1}{2}\, U\big(x + \tilde\vartheta^* \cdot (X_1 - X_0)\big), \]
where the inequality is strict with positive probability. Taking expectation gives
\[ E\big[U\big(x + \hat\vartheta^* \cdot (X_1 - X_0)\big)\big] > \tfrac{1}{2}\, u(x) + \tfrac{1}{2}\, u(x) = u(x), \]
which is a contradiction.
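The existence and uniqueness argument, together with the measure Q appearing in part (d) of Theorem 3.23, can be illustrated in the simplest possible setting. The sketch below is an assumption-laden toy example rather than the notes' construction: it takes d = 1, a two-point Ω on which X_1 − X_0 equals a > 0 or b < 0, and U(x) = √x (the choice suggested in Remark 3.24(c)). Because the objective is strictly concave, its derivative in ϑ is strictly decreasing, so the unique maximiser can be found by bisection on the first-order condition. The function name and all parameter values are hypothetical.

```python
import math

def maximise_sqrt_utility(x, a, b, p1, tol=1e-12):
    """Maximise p1*sqrt(x + t*a) + (1 - p1)*sqrt(x + t*b) over t in A(x).

    Assumes b < 0 < a, so that A(x) = [-x/a, -x/b] is compact (NA holds).
    """
    p2 = 1.0 - p1

    def derivative(t):
        # Derivative of the expected utility with respect to t.
        return 0.5 * (p1 * a / math.sqrt(x + t * a) + p2 * b / math.sqrt(x + t * b))

    lo, hi = -x / a, -x / b
    eps = 1e-9 * (hi - lo)          # stay strictly inside A(x)
    lo, hi = lo + eps, hi - eps
    while hi - lo > tol:            # bisection: derivative is decreasing in t
        mid = 0.5 * (lo + hi)
        if derivative(mid) > 0:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Illustrative parameters: initial wealth x, up/down increments a and b of X.
x, a, b, p1 = 1.0, 0.2, -0.1, 0.5
theta = maximise_sqrt_utility(x, a, b, p1)

# dQ/dP = U'(x + theta*(X1 - X0)) / E[U'(x + theta*(X1 - X0))]
wealth = [x + theta * a, x + theta * b]
marginal = [0.5 / math.sqrt(w) for w in wealth]
norm = p1 * marginal[0] + (1.0 - p1) * marginal[1]
q = [p1 * marginal[0] / norm, (1.0 - p1) * marginal[1] / norm]

print(theta)                  # close to 5 for these parameters
print(q)                      # close to [1/3, 2/3]
print(q[0] * a + q[1] * b)    # E^Q[X1 - X0], close to 0
```

For these parameters the first-order condition can be solved by hand (ϑ* = 5), and the resulting Q assigns probability −b/(a − b) = 1/3 to the up-move, which is exactly the measure under which the discounted asset has zero expected increment.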
\[ \frac{dQ}{dP} = \frac{U'\big(x + \vartheta^* \cdot (X_1 - X_0)\big)}{E\big[U'\big(x + \vartheta^* \cdot (X_1 - X_0)\big)\big]} \]

Remark 3.24. (a) The set A(x) is called the set of admissible strategies for initial wealth x, and the function u is called the indirect utility function. If S̄ satisfies NA, one can show that u is again a utility function.
(b) If Ω is finite and U satisfies the Inada condition lim_{x→0} U′(x) = +∞, it is not difficult to check that ϑ* lies in the interior of A(x).58
(c) For finite Ω, Theorem 3.23(d) can be seen as a constructive version of the Fundamental Theorem of Asset Pricing. Indeed, choose U(x) = √x.

58 If Ω is infinite, even with the Inada condition, ϑ* does in general not lie in the interior of A(x) and the assertion in (b) of the remark is false.

(d). Fix i ∈ {1, . . . , d}. Since ϑ* is an interior point of A(x), for all sufficiently small ε > 0, ϑ* ± εe_i ∈ A(x), where e_i denotes the ith unit vector in R^d. Maximality of ϑ* then gives
\[ E\big[U\big(x + (\vartheta^* \pm \varepsilon e_i) \cdot (X_1 - X_0)\big) - U\big(x + \vartheta^* \cdot (X_1 - X_0)\big)\big] \le 0. \]
Dividing by ε, letting ε → 0 and assuming that we can interchange limits and expectations, we obtain
\[ E\big[U'\big(x + \vartheta^* \cdot (X_1 - X_0)\big)(X_1^i - X_0^i)\big] \le 0, \]
\[ E\big[-U'\big(x + \vartheta^* \cdot (X_1 - X_0)\big)(X_1^i - X_0^i)\big] \le 0, \]
and hence
\[ E\big[U'\big(x + \vartheta^* \cdot (X_1 - X_0)\big)(X_1^i - X_0^i)\big] = 0. \]
Assuming that U′(x + ϑ* · (X_1 − X_0)) is P-integrable, and defining the measure Q ≈ P on F by
\[ \frac{dQ}{dP} = \frac{U'(x + \vartheta^* \cdot (X_1 - X_0))}{E[U'(x + \vartheta^* \cdot (X_1 - X_0))]}, \]
we obtain by Corollary 3.21,
\[ E^Q\big[X_1^i - X_0^i\big] = E^P\Big[\frac{dQ}{dP}(X_1^i - X_0^i)\Big] = E^P\Big[\frac{U'(x + \vartheta^* \cdot (X_1 - X_0))}{E^P[U'(x + \vartheta^* \cdot (X_1 - X_0))]}(X_1^i - X_0^i)\Big] = \frac{1}{E^P[U'(x + \vartheta^* \cdot (X_1 - X_0))]}\, E^P\big[U'\big(x + \vartheta^* \cdot (X_1 - X_0)\big)(X_1^i - X_0^i)\big] = 0. \]

4 Introduction to Risk Measures

In this short chapter, we briefly discuss how to quantify the downside risk of a financial position. To this end, we follow the axiomatic approach initiated by Artzner, Delbaen, Eber and Heath in a seminal paper in 1999.

4.1 Monetary measures of risk

Throughout this chapter, we consider a linear subspace 𝒳 of all real-valued random variables on a measurable space (Ω, F), containing the constants. Each X ∈ 𝒳 is interpreted as a possible financial position of a large company, the prime examples being banks and insurance companies. We aim to measure the downside risk related to the position X.
space (Ω, F), containing the constants. A map ρ:X →R is called a monetary measure of
risk if it has the following properties:
Normalisation : ρ(0) = 0.
Monotonicity : If X1 ≤ X2 , then ρ(X1 ) ≥ ρ(X2 ).
translation invariance. The financial meaning of monotonicity is that the downside risk of a
We proceed to show that the variance even transformed in an appropriate way is not a
good measure of risk because it fails to be monotone, unless we restrict ourselves to the case
Example 4.2. Let (Ω, F, P) be a probability space and 𝒳 the set of all real-valued random variables having finite second moment, i.e., X ∈ 𝒳 if and only if E[X²] < ∞. Moreover, let 𝒳_N be a linear subspace of 𝒳 such that all X ∈ 𝒳_N are normally distributed, where we agree that a normal distribution with variance zero and mean µ ∈ R is the Dirac distribution δ_µ. Define the map ρ : 𝒳 → R by
\[ \rho(X) = \sqrt{\mathrm{Var}[X]} - E[X]. \]
Then ρ is normalised. It is also cash-invariant on 𝒳 (and on 𝒳_N). Indeed, let X ∈ 𝒳 and m ∈ R. Then
\[ \rho(X + m) = \sqrt{\mathrm{Var}[X + m]} - E[X + m] = \sqrt{\mathrm{Var}[X]} - E[X] - m = \rho(X) - m. \]
However, ρ is not monotone on 𝒳. Indeed, let X_1 = 1 and X_2 be a Pareto distributed random variable with parameter 3, i.e., the cdf of X_2 is given by
\[ F_{X_2}(x) = \begin{cases} 1 - x^{-3} & \text{if } x \ge 1, \\ 0 & \text{if } x < 1. \end{cases} \tag{4.1} \]
Then X_2 ≥ 1 = X_1.59 Using the formula for the mean and the variance of a Pareto distribution, we obtain E[X_2] = 3/2 and Var[X_2] = 3/4, so that ρ(X_1) = −1 < √3/2 − 3/2 = ρ(X_2), which contradicts monotonicity.

As X_1 ≥ X_2, it follows that
\[ F_{X_1}(x) \le F_{X_2}(x) \;\Leftrightarrow\; \Phi\Big(\frac{x - \mu_1}{\sigma_1}\Big) \le \Phi\Big(\frac{x - \mu_2}{\sigma_2}\Big), \quad x \in \mathbb{R}. \]

59 We may assume without loss of generality that this equation holds not only P-a.s. but even for all ω.
60 Note that X_1 ≥ X_2 implies that either both X_1 and X_2 are Dirac measures (in which case the claim is trivial) or both X_1 and X_2 have nonvanishing variance.

As Φ is strictly increasing, it follows that
\[ \frac{x - \mu_1}{\sigma_1} \le \frac{x - \mu_2}{\sigma_2}. \tag{4.2} \]
Dividing (4.2) by x and letting x → ±∞, we may deduce that σ_1² = σ_2². Together with µ_1 ≥ µ_2, we obtain
\[ \rho(X_1) = \sigma_1 - \mu_1 \le \sigma_2 - \mu_2 = \rho(X_2). \]
This shows that ρ is monotone on 𝒳_N.

Remark 4.3. One can show that if one replaces in Example 4.2 the variance Var[X] by the semivariance SVar[X], which is defined by

Convexity of a risk measure formalises the idea that diversification should reduce the risk. This is best seen in the equivalent formulation as subadditivity (assuming also positive homogeneity): If a large company has different product lines or desks, the total risk of the aggregate position is bounded by the sum of the individual risks related to each product line or desk. Apart from being a reasonable requirement from an economic perspective, this property is quite useful from a management perspective.

Positive homogeneity of a risk measure means that risk grows in a linear way. Even though this is convenient from a mathematical perspective, it is less clear, if this is a good assumption
4.2 Value at Risk and Expected Shortfall

We proceed to discuss the two most important risk measures used in practice. The first example is value at risk which was invented in 1993 as part of the seminal G-30 report and widely propagated by the RiskMetrics of JP Morgan launched in 1994.61

Definition 4.6. Let (Ω, F, P) be a probability space and 𝒳 the set of all random variables. Let α ∈ (0, 1) be a confidence level. For X ∈ 𝒳, the Value at Risk of X at level α is given by
\[ \mathrm{VaR}_\alpha(X) = \inf\{m \in \mathbb{R} : P[m + X < 0] \le \alpha\}.{}^{62} \]
The Value at Risk at level α is the smallest amount of capital which, if added to X, keeps the probability of a negative outcome below or equal to α. Typical values for α are 0.05, 0.01 or 0.001.

Value at Risk is probably the most widely used risk measure in practice. One can easily check that it is normalised, monotone, cash-invariant and positively homogeneous. However, Value at Risk fails to be convex, i.e., it may punish diversification instead of encouraging it.

Example 4.7. Let X_1 and X_2 be two independent identically distributed random variables
\[ X := \tfrac{1}{2} X_1 + \tfrac{1}{2} X_2. \]

61 However, the use of value at risk (without the name) goes back to the early decades of the 20th century.
62 Note that in part of the literature α is replaced by 1 − α.

Then the distribution of X satisfies
\[ P[X = -100] = (0.01)^2 = 0.0001, \]
\[ P[X = -5] = 2 \times 0.01 \times 0.99 = 0.0198, \]
\[ P[X = 90] = (0.99)^2 = 0.9801 \]
and we have
\[ P[X + m < 0] = \begin{cases} 1 & \text{if } m \in (-\infty, -90), \\ 0.0199 & \text{if } m \in [-90, 5), \\ 0.0001 & \text{if } m \in [5, 100), \\ 0 & \text{if } m \in [100, \infty). \end{cases} \tag{4.4} \]
Hence,
\[ \mathrm{VaR}_{0.01}(X) = 5, \]
and the diversified position X = ½X_1 + ½X_2 is no longer acceptable.

Let us briefly comment on what goes wrong in Example 4.7. The probability of a loss is higher for X than for X_i. Indeed,

Definition 4.8. Let (Ω, F, P) be a probability space and 𝒳 the set of all real-valued random variables having finite first moments. Let α ∈ (0, 1) be a confidence level. For X ∈ 𝒳, the Expected Shortfall of X at level α is given by
\[ \mathrm{ES}_\alpha(X) = \frac{1}{\alpha} \int_0^\alpha \mathrm{VaR}_u(X)\, du. \]
Since VaR_u is nonincreasing in u, it follows that ES_α(X) ≥ VaR_α(X) for all α and all
X. Moreover, one can show that unlike Value at Risk, Expected Shortfall is a coherent risk measure, i.e., it is in particular a convex risk measure; see [2, Theorem 4.52]. One can even show that it is optimal in the sense that it is the smallest law-invariant convex risk measure that is more conservative than Value at Risk, see [2, Theorem 4.67].

We state an alternative characterisation of Expected Shortfall for continuous distributions, which shows that Expected Shortfall takes care of the size of the loss given default. For a

Lemma 4.9. Let X be an integrable random variable on a probability space (Ω, F, P). Suppose that the distribution of X is continuous. Then for α ∈ (0, 1),
\[ \mathrm{ES}_\alpha(X) = E[-X \mid -X \ge \mathrm{VaR}_\alpha(X)]. \tag{4.5} \]

The following example shows that Expected Shortfall encourages diversification and also demonstrates that Lemma 4.9 is false without the assumption of a continuous distribution.

Example 4.10. Consider the setup of Example 4.7. Using (4.3), we obtain for i ∈ {1, 2},
\[ \mathrm{VaR}_u(X_i) = \begin{cases} 100 & \text{if } u \in (0, 0.01), \\ -90 & \text{if } u \in [0.01, 1), \end{cases} \]
\[ \mathrm{VaR}_u(X) = \begin{cases} 100 & \text{if } u \in (0, 0.0001), \\ 5 & \text{if } u \in [0.0001, 0.0199), \\ -90 & \text{if } u \in [0.0199, 1). \end{cases} \]
This gives
\[ \mathrm{ES}_{0.01}(X_i) = \frac{1}{0.01} \int_0^{0.01} 100\, du = 100, \qquad \mathrm{ES}_{0.01}(X) = \frac{1}{0.01}\big(100 \times 0.0001 + 5 \times 0.0099\big) = 5.95. \]
So the Expected Shortfall of the diversified position X is significantly lower than the Expected Shortfall of the individual positions X_i.

Moreover, note that
\[ E[-X \mid -X \ge \mathrm{VaR}_{0.01}(X)] = E[-X \mid -X \ge 5] = \frac{100 \times 0.0001 + 5 \times 0.0198}{0.0199} \approx 5.48 < 5.95 = \mathrm{ES}_{0.01}(X). \]
This shows that (4.5) is not true in this example.
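The numbers in Examples 4.7 and 4.10 can be reproduced with a few lines of code. The sketch below is illustrative only: it assumes, as the probabilities displayed in Example 4.7 suggest, that each X_i takes the value −100 with probability 0.01 and 90 with probability 0.99, and it implements Definitions 4.6 and 4.8 directly for finite distributions. The function names and the dictionary representation of a distribution are hypothetical choices.

```python
from itertools import product

def value_at_risk(dist, alpha, tol=1e-12):
    """VaR_alpha(X) = inf{m : P[X + m < 0] <= alpha} for a finite distribution.

    dist maps outcomes to probabilities. The infimum is attained at m = -x
    for the largest outcome x with P[X < x] <= alpha.
    """
    below = 0.0          # running value of P[X < x]
    var = None
    for x in sorted(dist):
        if below <= alpha + tol:
            var = -x
        below += dist[x]
    return var

def expected_shortfall(dist, alpha):
    """ES_alpha(X) = (1/alpha) * integral of VaR_u(X) over u in (0, alpha)."""
    total, lower = 0.0, 0.0
    for x in sorted(dist):
        upper = min(lower + dist[x], alpha)
        if upper > lower:
            total += -x * (upper - lower)   # VaR_u = -x for u in [lower, lower + p)
        lower += dist[x]
    return total / alpha

single = {-100.0: 0.01, 90.0: 0.99}   # assumed law of X_1 and of X_2

# Law of X = X_1/2 + X_2/2 for independent X_1, X_2.
diversified = {}
for (x1, p1), (x2, p2) in product(single.items(), repeat=2):
    outcome = 0.5 * x1 + 0.5 * x2
    diversified[outcome] = diversified.get(outcome, 0.0) + p1 * p2

print(value_at_risk(single, 0.01), value_at_risk(diversified, 0.01))
print(expected_shortfall(single, 0.01), expected_shortfall(diversified, 0.01))
```

Under this assumption the Value at Risk rises from −90 to 5 when diversifying, whereas the Expected Shortfall falls from 100 to 5.95, in line with the examples.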
5 Pricing and Hedging in Finite Discrete Time

The goal of this chapter is to describe pricing and hedging of derivative contracts like call or put options of financial markets in finite discrete time.

them in the next section, we first need to introduce the measure theoretic version of a conditional expectation.

Before giving the precise definition, let us explain the underlying idea. Consider a random variable X on some probability space (Ω, F, P). If we have full information, i.e., we know F, for each state of the world ω ∈ Ω, we know if ω ∈ A or ω ∉ A for all A ∈ F. In particular, as {X = x} ∈ F for all x ∈ R, we can fully observe X. But if we have only partial information, i.e., we are given a sub-σ-algebra G ⊂ F, for each state of the world ω ∈ Ω, we know if ω ∈ A or ω ∉ A only for all A ∈ G. In particular, as {X = x} may not be in G, we can no longer fully observe X. The extreme case is that we have trivial information, i.e., G = {∅, Ω}, where we can only assert that ω ∈ Ω. So how can we make a best forecast for X if we have only partial information (i.e., if we are given a sub-σ-algebra G)? For trivial information, i.e., G = {∅, Ω}, this best prognosis is the expectation E[X]. The conditional expectation generalises the concept of expectation to general partial information.

Definition 5.1. Let (Ω, F, P) be a probability space, G ⊂ F a sub-σ-algebra, and X an F-measurable integrable random variable. Any integrable random variable Y satisfying

The random variable Y in Definition 5.1 is to be interpreted as the best forecast for X given the information G. The measurability property (1) ensures that Y only uses the information given in G, and the averaging property (2) ensures that Y is indeed the best forecast.

Example 5.2. Let (Ω, F, P) be a probability space and A_1, . . . , A_N an F-measurable partition of Ω, i.e., A_n ∈ F for n ∈ {1, . . . , N}, ⋃_{n=1}^N A_n = Ω and A_i ∩ A_j = ∅ for i ≠ j. Let
\[ G := \sigma(A_1, \ldots, A_N), \]
i.e., G is the smallest σ-algebra containing A_1, . . . , A_N. One can check that A ∈ G if and only if there are indices n_1, . . . , n_K ∈ {1, . . . , N} such that A = ⋃_{k=1}^K A_{n_k}. One can then show that a random variable Y : Ω → R is G-measurable if and only if it can be written as
\[ Y = \sum_{n=1}^{N} y_n 1_{A_n}, \tag{5.1} \]

\[ Y := \sum_{n=1}^{N} E[X \mid A_n]\, 1_{A_n} \tag{5.2} \]
is (a version of) the conditional expectation of X given G, where for A ∈ F, the elementary conditional expectation of X given A is defined by
\[ E[X \mid A] := \begin{cases} \dfrac{E[X 1_A]}{P[A]} & \text{if } P[A] > 0, \\ 0 & \text{if } P[A] = 0. \end{cases} \tag{5.3} \]
Indeed, it follows from (5.1) that Y satisfies the measurability property (1) of a conditional expectation. To check that Y also satisfies the averaging property (2), let A ∈ G. Then there are indices n_1, . . . , n_K ∈ {1, . . . , N} such that A = ⋃_{k=1}^K A_{n_k}. Note that by definition of Y and (5.3), for each k ∈ {1, . . . , K},
\[ E\big[Y 1_{A_{n_k}}\big] = E\big[E[X \mid A_{n_k}]\, 1_{A_{n_k}}\big] = E\Big[\frac{E[X 1_{A_{n_k}}]}{P(A_{n_k})}\, 1_{A_{n_k}}\Big] = \frac{E[X 1_{A_{n_k}}]}{P(A_{n_k})}\, E\big[1_{A_{n_k}}\big] = E\big[X 1_{A_{n_k}}\big]. \]

The following result gives existence, uniqueness and further properties of conditional expectations.63

63 For a proof, we refer to [5, Chapter 23].

Theorem 5.3. Let (Ω, F, P) be a probability space, G ⊂ F a sub-σ-algebra, and X an F-measurable integrable random variable. Then the conditional expectation E[X | G] exists and is P-a.s. unique, i.e., if Y and Y′ are random variables satisfying properties (1) and (2) in Definition 5.1, then Y = Y′ P-a.s. Moreover, we have the following properties:
(a) E[E[X | G]] = E[X].
(b) If X is G-measurable (e.g. a constant), then E[X | G] = X P-a.s.
(c) If G = {∅, Ω}, then E[X | G] = E[X].
(d) If X_1 and X_2 are integrable random variables and a, b ∈ R, then
\[ E[aX_1 + bX_2 \mid G] = aE[X_1 \mid G] + bE[X_2 \mid G] \quad P\text{-a.s.} \]
(e) If X_1 and X_2 are integrable random variables with X_1 ≥ X_2 P-a.s., then
\[ E[X_1 \mid G] \ge E[X_2 \mid G] \quad P\text{-a.s.} \]
If in addition P[X_1 > X_2] > 0, then P[E[X_1 | G] > E[X_2 | G]] > 0.
(f) If H ⊂ G is a sub-σ-algebra of G, then
\[ E[E[X \mid G] \mid H] = E[X \mid H] \quad P\text{-a.s.} \]
(g) If Z is G-measurable and ZX is integrable, then
\[ E[ZX \mid G] = Z E[X \mid G] \quad P\text{-a.s.} \]
(h) If X is independent of G, i.e., P[{X ∈ B} ∩ A] = P[X ∈ B]P[A] for all B ∈ B_R and A ∈ G, then
\[ E[X \mid G] = E[X] \quad P\text{-a.s.} \]
(i) If U : R → R is convex64 and E[|U(X)|] < ∞, then
\[ E[U(X) \mid G] \ge U(E[X \mid G]) \quad P\text{-a.s.} \]

In the above theorem, property (d) is referred to as linearity of conditional expectations, property (e) is referred to as monotonicity of conditional expectations, property (f) is referred to as tower property of conditional expectations, property (g) is referred to as pull-out property of conditional expectations, property (h) is referred to as independence property of conditional expectations, and property (i) is referred to as Jensen's inequality for conditional expectations.

We proceed to study stochastic processes on a probability space (Ω, F, P). By definition, a stochastic process is simply a family of random variables indexed by time, where in our context, the index set is either {0, . . . , T} or {1, . . . , T}.

64 Note that unlike in Lemma 3.13, we consider a convex instead of concave function, so the direction of the inequality is changed.

The basic definition of a stochastic process does not say anything about the flow of information. To model the latter, we assume that the information available at time t is described by the σ-algebra F_t. As information increases over time, it is natural to assume that
\[ F_0 \subset F_1 \subset \cdots \subset F_T \subset F, \tag{5.4} \]
and we call any increasing sequence of σ-algebras (F_t)_{t∈{0,...,T}} satisfying (5.4) a filtration and (Ω, F, (F_t)_{t∈{0,...,T}}, P) a filtered probability space. To simplify the presentation, we always assume that F_0 = {∅, Ω} (trivial information) and F_T = F (full information). With regard to a filtration (F_t)_{t∈{0,...,T}}, there are two important notions for stochastic processes.

Definition 5.4. Let (Ω, F, (F_t)_{t∈{0,...,T}}, P) be a filtered probability space.
A stochastic process X = (X_t)_{t∈{0,...,T}} is called adapted (to the filtration (F_t)_{t∈{0,...,T}}), if each X_t is F_t-measurable.
A stochastic process Y = (Y_t)_{t∈{1,...,T}} is called predictable65 (for the filtration (F_t)_{t∈{0,...,T}}), if each Y_t is F_{t−1}-measurable.

If a stochastic process X = (X_t)_{t∈{0,...,T}} is adapted, then at each time t, we are given the information F_t and so can fully observe X_t (and also X_0, . . . , X_{t−1}). By contrast, we may not be able to fully observe X_{t+1}, . . . , X_T. If a stochastic process Y = (Y_t)_{t∈{1,...,T}} is predictable, Y_t (and also Y_1, . . . , Y_{t−1}) can not only be fully observed at time t but already at time t − 1, i.e., one period ahead. So we can accurately predict Y_t already at time t − 1.

For an adapted process X = (X_t)_{t∈{0,...,T}}, knowledge about X_t does not give any knowledge about X_{t+1} in general. The special case that X_t gives the best available information about X_{t+1}, i.e., X_t = E[X_{t+1} | F_t] P-a.s. leads to the concept of a martingale.

Definition 5.5. Let M = (M_t)_{t∈{0,...,T}} be a real-valued stochastic process on some filtered probability space (Ω, F, (F_t)_{t∈{0,...,T}}, P). Then M is called a martingale (with respect to P and (F_t)_{t∈{0,...,T}}) if
(1) M is adapted to (F_t)_{t∈{0,...,T}};
(2) M is P-integrable, i.e., each M_t is P-integrable;
(3) E[M_t | F_s] = M_s P-a.s. for all 0 ≤ s ≤ t ≤ T.

Remark 5.6. (a) In Definition 5.5, property (1) is referred to as adaptedness, property (2) is referred to as integrability, and property (3), which is the crucial property, is referred to as martingale property.

65 Note that in our definition, predictable processes start at t = 1.
(b) The martingale property (3) in Definition 5.5 is equivalent to the formally weaker property
\[ E[M_t \mid F_{t-1}] = M_{t-1} \quad P\text{-a.s. for } t \in \{1, \ldots, T\} \tag{5.5} \]
(pick s = t − 1). Otherwise, if (5.5) is satisfied, let 0 ≤ s ≤ t ≤ T. If s = t, then E[M_t | F_s] = E[M_t | F_t] = M_t = M_s P-a.s. by adaptedness of M and the pull-out property of conditional expectations. Otherwise, there is n ∈ {1, . . . , T} such that s = t − n. The tower property of conditional expectations and (5.5) give
\[ E[M_t \mid F_s] = E[M_t \mid F_{t-n}] = E[E[M_t \mid F_{t-1}] \mid F_{t-n}] = E[M_{t-1} \mid F_{t-n}] = E[E[M_{t-1} \mid F_{t-2}] \mid F_{t-n}] = E[M_{t-2} \mid F_{t-n}] = \cdots = E[M_{t-(n-1)} \mid F_{t-n}] = M_{t-n} = M_s \quad P\text{-a.s.} \]
(c) If = in property (3) of Definition 5.5 is replaced by ≥, M is called a submartingale, and if it is replaced by ≤, M is called a supermartingale.

There are countless examples of martingales; here we just consider two.

Example 5.7. (a) Let (Ω, F, P) be a probability space and X_1, . . . , X_T independent integrable random variables with mean 0. Also assume that F = σ(X_1, . . . , X_T). Set F_0 := {∅, Ω} and
\[ F_t := \sigma(X_1, \ldots, X_t), \quad t \in \{1, \ldots, T\}, \]
i.e., F_t is the smallest σ-algebra for which X_1, . . . , X_t are measurable. Define the process M = (M_t)_{t∈{0,...,T}} by
\[ M_t := \sum_{i=1}^{t} X_i. \]
adaptedness and integrability of M. To check the martingale property, we use the alternative characterisation (5.5). So let t ∈ {1, . . . , T}. By linearity, the pull-out property, the independence property of conditional expectations (using that X_t is independent of F_{t−1}), and fact
\[ E[M_t \mid F_{t-1}] = E[M_{t-1} + X_t \mid F_{t-1}] = E[M_{t-1} \mid F_{t-1}] + E[X_t \mid F_{t-1}] \]

(b) Let (Ω, F, (F_t)_{t∈{0,...,T}}, P) be a filtered probability space and Z an F-measurable integrable random variable. Then the process M = (M_t)_{t∈{0,...,T}} defined by
\[ M_t := E[Z \mid F_t] \]
is a martingale with M_T = Z P-a.s. Indeed, adaptedness and integrability of M follow from the definition of conditional expectations. The martingale property of M follows from the tower property of conditional expectations, and M_T = Z P-a.s. follows from the pull-out property of conditional expectations and our standing assumption that F_T = F.

5.3 Financial markets in finite discrete time

We now consider a financial market with 1 + d assets as in Chapter 2, but now assume that the assets are priced at times t = 0, 1, . . . , T for some finite time horizon T ∈ N. We work on some filtered probability space (Ω, F, (F_t)_{t∈{0,...,T}}, P), where the filtration (F_t)_{t∈{0,...,T}} describes the flow of information. We always assume without further mentioning that F_0 = {∅, Ω} and F_T = F. We model the assets as adapted stochastic processes (S_t^i)_{t∈{0,...,T}}, i ∈ {0, . . . , d}. We assume that S^0 is locally riskless in the sense that S_t^0 is already known one period ahead, i.e., S_t^0 is F_{t−1}-measurable for t ∈ {1, . . . , T}. More precisely, we assume that
\[ S_t^0 := \prod_{k=1}^{t} (1 + r_k), \]
where r_k > −1 P-a.s. is F_{k−1}-measurable and denotes the interest rate in period k, i.e., from k − 1 to k. So the process (r_t)_{t∈{1,...,T}} is predictable. We also refer to S^0 as bank account.

We set as in Chapter 2,66
\[ S_t = (S_t^1, \ldots, S_t^d) \quad \text{and} \quad \bar S_t = (S_t^0, S_t), \quad t \in \{0, \ldots, T\}, \]

Example 5.8 (Binomial model).67 Assume that d = 1, i.e., there is only one risky asset. Let u > r > d > −1. Assume that the bank account is given by

Moreover, assume that the risky asset S^1 = (S_t^1)_{t∈{0,...,T}} is given by
where S_0^1 > 0 and Y_1, . . . , Y_T are i.i.d. random variables on some suitable probability space (Ω, F, P) satisfying
\[ P[Y_i = 1 + u] = p_1 \quad \text{and} \quad P[Y_i = 1 + d] = p_2, \]
where p_1, p_2 > 0 and p_1 + p_2 = 1. We assume that the filtration (F_t)_{t∈{0,...,T}} is given by
\[ F_t := \sigma(S_0^1, \ldots, S_t^1), \]
i.e., F_t is the smallest σ-algebra with respect to which S_0^1, . . . , S_t^1 are measurable. One can check that F_t = σ(Y_1, . . . , Y_t) for t ∈ {1, . . . , T}, i.e., F_t is also the smallest σ-algebra generated by Y_1, . . . , Y_t.

For a small number of T, e.g. T = 3, the above model can be nicely illustrated by the following trees, where the numbers beside the branches denote transition probabilities. For convenience, we assume that S_0^1 = 1.

[Tree for S^0; each transition has probability 1]
S^0: 1 → 1 + r → (1 + r)^2 → (1 + r)^3

[Tree for S^1; from each node an up-move has probability p_1 and a down-move has probability p_2]
t = 0: 1
t = 1: 1 + u, 1 + d
t = 2: (1 + u)^2, (1 + u)(1 + d), (1 + d)^2
t = 3: (1 + u)^3, (1 + u)^2(1 + d), (1 + u)(1 + d)^2, (1 + d)^3

Let us finally describe how to give a rigorous description of the binomial model. This is for instance important for implementing the binomial model on a computer. For Ω, we take the path space
\[ \Omega := \{1, 2\}^T = \{\omega = (x_1, \ldots, x_T) : x_1, \ldots, x_T \in \{1, 2\}\}, \]
i.e., each ω = (x_1, . . . , x_T) describes one path in the tree corresponding to the model; e.g. ω = (1, . . . , 1) describes the state of the world that the stock goes up in each period. We set F := 2^Ω, define the random variables Y_1, . . . , Y_T by
\[ Y_t(\omega) = Y_t((x_1, \ldots, x_T)) = \begin{cases} 1 + u & \text{if } x_t = 1, \\ 1 + d & \text{if } x_t = 2, \end{cases} \]
and let the probability measure P be given by
\[ P[\{\omega\}] := P[\{(x_1, \ldots, x_T)\}] = \prod_{t=1}^{T} p_{x_t}. \]
Next, we set for t ∈ {1, . . . , T} and x_1, . . . , x_t ∈ {1, 2},
\[ A_{(x_1, \ldots, x_t)} := \{\omega = (y_1, \ldots, y_T) \in \Omega : y_1 = x_1, \ldots, y_t = x_t\}. \]
Then A_{(x_1,...,x_t)} denotes all states of the world with path up to time t given by (x_1, . . . , x_t). It is not difficult to check that
\[ F_t = \sigma\big(A_{(x_1, \ldots, x_t)} : x_1, \ldots, x_t \in \{1, 2\}\big), \quad t \in \{1, \ldots, T\}, \]
so for a state of the world ω = (x_1, . . . , x_T) ∈ Ω given the information in F_t, we can determine the values of x_1, . . . , x_t but not the values of x_{t+1}, . . . , x_T, i.e., we can say if the stock went up or down in period 1, . . . , t, but we cannot say if the stock will go up or down in periods t + 1, . . . , T.

Remark 5.9. Note that for the binomial model, the tree for S^1 is recombining, so that the number of nodes only grows linearly in time. For non-recombining trees, the number of nodes grows exponentially in time. This difference is very important from a computational/numerical perspective.

As in Chapter 2, we discount with S^0 (or take S^0 as numéraire) and define the discounted assets X^0, . . . , X^d by
\[ X_t^i := \frac{S_t^i}{S_t^0}, \quad t \in \{0, \ldots, T\},\ i \in \{0, \ldots, d\}. \]
Then X^0 ≡ 1, and X = (X_t^1, . . . , X_t^d)_{t∈{0,...,T}} expresses the value of the risky assets in units of the numéraire S^0.

5.4 Self-financing strategies

Describing trading in the multiperiod market S̄ = (S_t^0, S_t)_{t∈{0,...,T}} is more complicated than in the one-period setting in Chapter 2. We have to describe for each stock i and for each trading period t, the number ϑ_t^i of shares held in asset i in period t, i.e., from t − 1 to t. This quantity is in general no longer a number but a random variable. However, as we cannot look into the future, ϑ_t^i can only use the information available at the beginning of period t, i.e., at time t − 1, so it must be F_{t−1}-measurable.

Definition 5.10. Let S̄ = (S_t^0, S_t)_{t∈{0,...,T}} be a financial market on some filtered probability
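space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$. A trading strategy is an $\mathbb{R}^{1+d}$-valued stochastic process $\bar\vartheta = (\vartheta^0_t, \vartheta_t)_{t \in \{1,\ldots,T\}}$ that is predictable.

For each asset $i$, the position $\vartheta^i_t$ held in period $t$ is worth $\vartheta^i_t S^i_{t-1}$ at time $t-1$ (after trading), and $\vartheta^i_t S^i_t$ is the resulting value at time $t$ (before trading). So for all assets together, the resulting value at time $t$ (before trading) is

$$\bar\vartheta_t \cdot \bar S_t = \sum_{i=0}^d \vartheta^i_t S^i_t.$$

So $\bar\vartheta_t \cdot \bar S_t$ is the pre-trading value of $\bar\vartheta$ at time $t$ and $\bar\vartheta_{t+1} \cdot \bar S_t$ is the post-trading value of $\bar\vartheta$ at time $t$. If we assume that we neither withdraw nor inject money at time $t$, we must have the book-keeping identity $\bar\vartheta_t \cdot \bar S_t = \bar\vartheta_{t+1} \cdot \bar S_t$.

Definition 5.11. Let $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ be a financial market on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$. A trading strategy $\bar\vartheta = (\vartheta^0_t, \vartheta_t)_{t \in \{1,\ldots,T\}}$ is called a self-financing strategy if

$$\bar\vartheta_t \cdot \bar S_t = \bar\vartheta_{t+1} \cdot \bar S_t \quad P\text{-a.s. for } t \in \{1, \ldots, T-1\}. \tag{5.6}$$

The self-financing condition (5.6) is extremely natural from an economic perspective. From a mathematical perspective, however, it is a rather inconvenient constraint. For this reason, we seek to find an alternative characterisation of self-financing strategies.

Definition 5.12. Let $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ be a financial market on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$. For a trading strategy $\bar\vartheta = (\vartheta^0_t, \vartheta_t)_{t \in \{1,\ldots,T\}}$, define the (discounted) value process $(V_t(\bar\vartheta))_{t \in \{0,\ldots,T\}}$ by

$$V_0(\bar\vartheta) := \bar\vartheta_1 \cdot \bar X_0 \quad \text{and} \quad V_t(\bar\vartheta) := \bar\vartheta_t \cdot \bar X_t \ \text{ for } t \in \{1, \ldots, T\},$$

and the (discounted) gains process $(G_t(\vartheta))_{t \in \{0,\ldots,T\}}$ by

$$G_0(\vartheta) := 0 \quad \text{and} \quad G_t(\vartheta) := \sum_{k=1}^t \vartheta_k \cdot (X_k - X_{k-1}) \ \text{ for } t \in \{1, \ldots, T\}.$$

The name value process for $V(\bar\vartheta)$ comes from the fact that $V_t(\bar\vartheta)$ denotes the (discounted) value of the strategy $\bar\vartheta$ at time $t$ (after trading for $t = 0$ and before trading for $t \in \{1, \ldots, T\}$). The name gains process for $G(\vartheta)$ comes from the fact that, since $X^0_t - X^0_{t-1} = 1 - 1 = 0$, $G_t(\vartheta)$ denotes the (discounted) gains from trading up to time $t$.⁶⁸

⁶⁸ Note that the value process $V(\bar\vartheta)$ depends on all $1+d$ coordinates of $\bar\vartheta$, whereas the gains process $G(\vartheta)$ only depends on the last $d$ coordinates $\vartheta$ of $\bar\vartheta$.

The following result provides an equivalent characterisation of self-financing strategies.

Proposition 5.13. Let $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ be a financial market on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$ and $\bar\vartheta = (\vartheta^0_t, \vartheta_t)_{t \in \{1,\ldots,T\}}$ a trading strategy. Then the following are equivalent:

(a) $\bar\vartheta$ is self-financing.

(b) $\bar\vartheta_t \cdot \bar X_t = \bar\vartheta_{t+1} \cdot \bar X_t$ $P$-a.s. for $t \in \{1, \ldots, T-1\}$.

(c) $V_t(\bar\vartheta) = V_0(\bar\vartheta) + G_t(\vartheta)$ $P$-a.s. for $t \in \{0, \ldots, T\}$.

Proof. Dividing (5.6) by $S^0_t$ shows that (a) is equivalent to (b). Next, (b) is equivalent to

$$\bar\vartheta_{t+1} \cdot \bar X_{t+1} - \bar\vartheta_t \cdot \bar X_t = \bar\vartheta_{t+1} \cdot \bar X_{t+1} - \bar\vartheta_{t+1} \cdot \bar X_t = \bar\vartheta_{t+1} \cdot (\bar X_{t+1} - \bar X_t) = \vartheta_{t+1} \cdot (X_{t+1} - X_t) \quad P\text{-a.s.},\ t \in \{1, \ldots, T-1\}, \tag{5.7}$$

where we have used in the last step that $X^0_{t+1} - X^0_t = 1 - 1 = 0$. Rewriting (5.7) by using the definition of the value and the gains process, it follows that (b) is equivalent to

$$V_{t+1}(\bar\vartheta) - V_t(\bar\vartheta) = G_{t+1}(\vartheta) - G_t(\vartheta) \quad P\text{-a.s.},\ t \in \{1, \ldots, T-1\}. \tag{5.8}$$

Moreover, the definition of the value and the gains process and the fact that $X^0_1 - X^0_0 = 1 - 1 = 0$ give

$$V_1(\bar\vartheta) - V_0(\bar\vartheta) = \bar\vartheta_1 \cdot (\bar X_1 - \bar X_0) = \vartheta_1 \cdot (X_1 - X_0) = G_1(\vartheta) - G_0(\vartheta). \tag{5.9}$$

Now assuming (b), summing over (5.8) and then adding (5.9) gives (c), and assuming (c) and subtracting (c) for $t$ from (c) for $t+1$ gives (5.8), which in turn is equivalent to (b).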
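The book-keeping behind the value and gains processes can be checked numerically. The sketch below works along a single path with one risky asset and $X^0 \equiv 1$. It picks the bank-account holdings as $\vartheta^0_t = V_0 + G_{t-1}(\vartheta) - \vartheta_t X_{t-1}$, which is one convenient choice that makes the strategy self-financing, and then compares pre- and post-trading values and the relation $V_t = V_0 + G_t$. All function names and numbers are hypothetical.

```python
def gains(theta, X):
    """G_0 = 0 and G_t = sum_{k<=t} theta_k * (X_k - X_{k-1}); theta[k-1] stores theta_k."""
    G = [0.0]
    for k in range(1, len(X)):
        G.append(G[-1] + theta[k - 1] * (X[k] - X[k - 1]))
    return G


def bank_holdings(V0, theta, X):
    """theta^0_t = V0 + G_{t-1} - theta_t * X_{t-1}; only time t-1 information is used."""
    G = gains(theta, X)
    return [V0 + G[t - 1] - theta[t - 1] * X[t - 1] for t in range(1, len(X))]


def value(theta0, theta, X):
    """V_0 = theta^0_1 + theta_1 X_0 and V_t = theta^0_t + theta_t X_t for t >= 1."""
    V = [theta0[0] + theta[0] * X[0]]
    for t in range(1, len(X)):
        V.append(theta0[t - 1] + theta[t - 1] * X[t])
    return V


# Hypothetical discounted prices X_0..X_3 along one path and holdings theta_1..theta_3.
X = [100.0, 110.0, 99.0, 108.9]
theta = [0.5, -1.0, 2.0]
V0 = 10.0

theta0 = bank_holdings(V0, theta, X)
G = gains(theta, X)
V = value(theta0, theta, X)

# Pre-trading value theta_t . X_t and post-trading value theta_{t+1} . X_t, t = 1..T-1.
pre = [theta0[t - 1] + theta[t - 1] * X[t] for t in range(1, len(X) - 1)]
post = [theta0[t] + theta[t] * X[t] for t in range(1, len(X) - 1)]
```

In a full model the same computation would be carried out path by path, with $\vartheta_t$ allowed to depend only on the first $t-1$ moves.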
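The equivalence of (a) and (c) in Proposition 5.13 has a very important consequence. Any pair $(V_0, \vartheta)$, where $V_0 \in \mathbb{R}$ and $\vartheta = (\vartheta_t)_{t \in \{1,\ldots,T\}}$ is an $\mathbb{R}^d$-valued predictable process, can be identified with the self-financing strategy $\bar\vartheta = (\vartheta^0_t, \vartheta_t)_{t \in \{1,\ldots,T\}}$ whose value process satisfies

$$V_t(\bar\vartheta) = V_0 + G_t(\vartheta), \quad t \in \{0, \ldots, T\}. \tag{5.10}$$

More precisely, define the process $(\vartheta^0_t)_{t \in \{1,\ldots,T\}}$ by

$$\vartheta^0_t := V_0 + G_t(\vartheta) - \vartheta_t \cdot X_t = V_0 + G_{t-1}(\vartheta) - \vartheta_t \cdot X_{t-1}, \quad t \in \{1, \ldots, T\}, \tag{5.11}$$

and set $\bar\vartheta := (\vartheta^0, \vartheta)$. It follows from the second equality in (5.11) that $\vartheta^0$ and hence also $\bar\vartheta$ are predictable. Moreover, by the definition of the value process and (5.11), we obtain

$$V_0(\bar\vartheta) = \vartheta^0_1 + \vartheta_1 \cdot X_0 = V_0 + G_0(\vartheta) - \vartheta_1 \cdot X_0 + \vartheta_1 \cdot X_0 = V_0,$$
$$V_t(\bar\vartheta) = \vartheta^0_t + \vartheta_t \cdot X_t = V_0 + G_t(\vartheta) - \vartheta_t \cdot X_t + \vartheta_t \cdot X_t = V_0 + G_t(\vartheta), \quad t \in \{1, \ldots, T\}.$$

We will make use of the identification (5.10) throughout the rest of the chapter. To this end, we introduce the shorthand notation

$$\bar\vartheta \mathrel{\hat=} (V_0, \vartheta).$$

5.5 The Fundamental Theorem of Asset Pricing revisited

We proceed to extend the definition of an arbitrage opportunity from Chapter 2 to the multiperiod setup.

Definition 5.14. Let $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ be a financial market on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$. A self-financing strategy $\bar\vartheta = (\vartheta^0_t, \vartheta_t)_{t \in \{1,\ldots,T\}}$ is called an arbitrage opportunity if

$$\bar\vartheta_1 \cdot \bar S_0 \le 0, \quad \bar\vartheta_T \cdot \bar S_T \ge 0 \ P\text{-a.s.} \quad \text{and} \quad P[\bar\vartheta_T \cdot \bar S_T > 0] > 0.$$

The market $\bar S$ is said to satisfy NA if there are no arbitrage opportunities. If the market $\bar S$ admits arbitrage, there always exists an arbitrage opportunity with $\bar\vartheta_1 \cdot \bar S_0 = 0$.

The following result is the multiperiod version of Proposition 1.7. Its proof is left as an exercise.

Proposition 5.16. Let $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ be a financial market on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$. The following are equivalent:

(a) The market $\bar S$ satisfies NA.

(b) There does not exist a predictable process $\vartheta = (\vartheta^1_t, \ldots, \vartheta^d_t)_{t \in \{1,\ldots,T\}}$ such that $G_T(\vartheta) \ge 0$ $P$-a.s. and $P[G_T(\vartheta) > 0] > 0$.

We next extend the definition of an equivalent martingale measure to the multiperiod setting.

Definition 5.17. Let $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ be a financial market on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$. Denote by $X := S/S^0$ the discounted risky assets. A measure $Q$ on $(\Omega, \mathcal{F})$ is called an equivalent martingale measure (EMM) for $X$ if $Q \approx P$ and each $X^i$ is a $Q$-martingale, i.e., a martingale under the measure $Q$.

The following result on equivalent martingale measures is a version of Doob's famous systems theorem for martingales.

Theorem 5.18. Let $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ be a financial market on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$. Denote by $X := S/S^0$ the discounted risky assets. Let $Q \approx P$ be an equivalent measure. Then the following are equivalent:

(a) $Q$ is an EMM for $X$.

(b) For all self-financing strategies $\bar\vartheta = (\vartheta^0_t, \vartheta_t)_{t \in \{1,\ldots,T\}}$ for which $\vartheta$ is bounded, the value process $V(\bar\vartheta)$ is a $Q$-martingale.⁶⁹

(c) For all self-financing strategies $\bar\vartheta = (\vartheta^0_t, \vartheta_t)_{t \in \{1,\ldots,T\}}$ with $V_T(\bar\vartheta) \ge 0$ $Q$-a.s., the value process $V(\bar\vartheta)$ is a (nonnegative) $Q$-martingale.

Proof. We only establish (a) ⇒ (b).⁷⁰ So let $Q$ be an EMM for $X$ and $\bar\vartheta = (\vartheta^0_t, \vartheta_t)_{t \in \{1,\ldots,T\}}$ a self-financing strategy for which $\vartheta$ is bounded. Proposition 5.13 gives

$$V_t(\bar\vartheta) = V_0(\bar\vartheta) + G_t(\vartheta), \quad t \in \{0, \ldots, T\}. \tag{5.12}$$

As $\vartheta$ is bounded, there is a constant $c > 0$ such that $|\vartheta^i_t| \le c$ for all $i \in \{1, \ldots, d\}$ and all $t \in \{1, \ldots, T\}$. Then by (5.12),

$$|V_t(\bar\vartheta)| = \Big| V_0(\bar\vartheta) + \sum_{k=1}^t \sum_{i=1}^d \vartheta^i_k (X^i_k - X^i_{k-1}) \Big| \le |V_0(\bar\vartheta)| + \sum_{k=1}^t \sum_{i=1}^d c \big( |X^i_k| + |X^i_{k-1}| \big).$$

⁶⁹ Note that $(\vartheta^0_t)_{t \in \{1,\ldots,T\}}$ might be unbounded.

⁷⁰ For a proof of the more difficult directions (b) ⇒ (c) and (c) ⇒ (a), we refer to [2, Theorem 5.14].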
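As $|V_0(\bar\vartheta)|$ is a constant and each $|X^i_k|$ is $Q$-integrable by the fact that each $X^i$ is a $Q$-martingale, it follows that $|V_t(\bar\vartheta)|$ and hence also $V_t(\bar\vartheta)$ is $Q$-integrable.

To check the $Q$-martingale property of $V(\bar\vartheta)$ (in the form of (5.5)), fix $t \in \{1, \ldots, T\}$. Then by (5.12), the fact that $V_{t-1}(\bar\vartheta)$ and $\vartheta_t$ are $\mathcal{F}_{t-1}$-measurable, linearity and the pull-out property of conditional expectations, and the $Q$-martingale property of each $X^i$, we obtain

$$\begin{aligned}
E^Q\big[V_t(\bar\vartheta) \,\big|\, \mathcal{F}_{t-1}\big] &= E^Q\big[V_{t-1}(\bar\vartheta) + \vartheta_t \cdot (X_t - X_{t-1}) \,\big|\, \mathcal{F}_{t-1}\big] \\
&= V_{t-1}(\bar\vartheta) + \sum_{i=1}^d E^Q\big[\vartheta^i_t (X^i_t - X^i_{t-1}) \,\big|\, \mathcal{F}_{t-1}\big] \\
&= V_{t-1}(\bar\vartheta) + \sum_{i=1}^d \vartheta^i_t \, E^Q\big[X^i_t - X^i_{t-1} \,\big|\, \mathcal{F}_{t-1}\big] \\
&= V_{t-1}(\bar\vartheta) + \sum_{i=1}^d \vartheta^i_t (X^i_{t-1} - X^i_{t-1}) \\
&= V_{t-1}(\bar\vartheta) \quad Q\text{-a.s.}
\end{aligned}$$

Thus $V(\bar\vartheta)$ is a $Q$-martingale.

Theorem 5.19 (Fundamental Theorem of Asset Pricing). Let $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ be a financial market on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$. The following are equivalent:

(a) The market $\bar S$ satisfies NA.

(b) There exists an EMM $Q \approx P$ for $X = S/S^0$.

Proof. We only establish the easy direction (b) ⇒ (a). So let $Q \approx P$ be an EMM and $\vartheta = (\vartheta^1_t, \ldots, \vartheta^d_t)_{t \in \{1,\ldots,T\}}$ a predictable process with $G_T(\vartheta) \ge 0$ $P$-a.s. By Proposition 5.16, it suffices to show that $G_T(\vartheta) = 0$ $P$-a.s. As $Q \approx P$, we have $G_T(\vartheta) \ge 0$ $Q$-a.s. and it suffices to show that $G_T(\vartheta) = 0$ $Q$-a.s.

For $t = 1, 2, \ldots, T$ set $\vartheta^0_t = G_{t-1}(\vartheta) - \vartheta_t \cdot X_{t-1}$. Then $\vartheta^0$ is predictable. Set $\bar\vartheta = (\vartheta^0, \vartheta)$. Then $V_0(\bar\vartheta) = \vartheta^0_1 + \vartheta_1 \cdot X_0 = G_0(\vartheta) - \vartheta_1 \cdot X_0 + \vartheta_1 \cdot X_0 = 0$ and

$$\begin{aligned}
\bar\vartheta_t \cdot \bar X_t &= \vartheta^0_t + \vartheta_t \cdot X_t \\
&= G_{t-1}(\vartheta) - \vartheta_t \cdot X_{t-1} + \vartheta_t \cdot X_t \\
&= G_{t-1}(\vartheta) + \vartheta_t \cdot (X_t - X_{t-1}) \\
&= G_t(\vartheta) \\
&= G_t(\vartheta) - \vartheta_{t+1} \cdot X_t + \vartheta_{t+1} \cdot X_t \\
&= \vartheta^0_{t+1} + \vartheta_{t+1} \cdot X_t = \bar\vartheta_{t+1} \cdot \bar X_t;
\end{aligned}$$

in particular by Proposition 5.13, $\bar\vartheta$ is self-financing.

It follows from Proposition 5.13 that $V_T(\bar\vartheta) = G_T(\vartheta)$ and then by Theorem 5.18(c), $V(\bar\vartheta)$ is a $Q$-martingale so

$$0 = V_0(\bar\vartheta) = E^Q\big[V_T(\bar\vartheta) \,\big|\, \mathcal{F}_0\big] = E^Q\big[V_T(\bar\vartheta)\big].$$

Since $V_T(\bar\vartheta) \ge 0$ $Q$-a.s. we must have $V_T(\bar\vartheta) = 0$ $Q$-a.s. and then $G_T(\vartheta) = 0$ $Q$-a.s. as required.

$$q_{x_t \mid (x_1, \ldots, x_{t-1})} := Q[Y_t = y_{x_t} \mid Y_1 = y_{x_1}, \ldots, Y_{t-1} = y_{x_{t-1}}], \quad t \in \{2, \ldots, T\}.$$

By Example 1.9, $Q \approx P$ if and only if $q_{x_1}, \ldots, q_{x_T \mid (x_1, \ldots, x_{T-1})} > 0$ for all $x_1, \ldots, x_T \in \{1, 2\}$. Moreover, if $Q \approx P$, then by the equivalent characterisation (5.5) of the martingale property, $Q$ is an EMM for $X^1$ if and only if

$$E^Q\big[X^1_t \,\big|\, \mathcal{F}_{t-1}\big] = X^1_{t-1} \quad Q\text{-a.s.},\ t \in \{1, \ldots, T\}.$$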
Since $X^1_t = X^1_{t-1} Y_t / (1+r)$ and $X^1_{t-1} > 0$ is $\mathcal{F}_{t-1}$-measurable, it follows from the pull-out property of conditional expectations that $X^1$ is a $Q$-martingale if and only if

$$E^Q\Big[\frac{Y_t}{1+r} \,\Big|\, \mathcal{F}_{t-1}\Big] = 1 \ Q\text{-a.s.} \quad \Leftrightarrow \quad E^Q[Y_t \mid \mathcal{F}_{t-1}] = 1 + r \ Q\text{-a.s.}$$

It follows from the properties of conditional expectations, the fact that $\mathcal{F}_{t-1} = \sigma(A_{(x_1,\ldots,x_{t-1})} : x_1, \ldots, x_{t-1} \in \{1,2\})$ and Example 5.2 that

$$E^Q\big[Y_t \,\big|\, A_{(x_1,\ldots,x_{t-1})}\big] = (1+u)\, q_{1 \mid (x_1,\ldots,x_{t-1})} + (1+d)\, q_{2 \mid (x_1,\ldots,x_{t-1})}.$$

So $X^1$ is a $Q$-martingale if and only if

$$q_1 (1+u) + q_2 (1+d) = 1 + r,$$
$$(1+u)\, q_{1 \mid (x_1,\ldots,x_{t-1})} + (1+d)\, q_{2 \mid (x_1,\ldots,x_{t-1})} = 1 + r, \quad x_1, \ldots, x_{t-1} \in \{1,2\},\ t \in \{2, \ldots, T\}.$$

Using that $q_1 + q_2 = 1$ and $q_{1 \mid (x_1,\ldots,x_{t-1})} + q_{2 \mid (x_1,\ldots,x_{t-1})} = 1$, it follows that $X^1$ is a $Q$-martingale if and only if

$$q_1 = \frac{r-d}{u-d} \quad \text{and} \quad q_2 = \frac{u-r}{u-d},$$
$$q_{1 \mid (x_1,\ldots,x_{t-1})} = \frac{r-d}{u-d} \quad \text{and} \quad q_{2 \mid (x_1,\ldots,x_{t-1})} = \frac{u-r}{u-d}, \quad x_1, \ldots, x_{t-1} \in \{1,2\},\ t \in \{2, \ldots, T\}.$$

Note that $q_{1 \mid (x_1,\ldots,x_{t-1})}$ and $q_{2 \mid (x_1,\ldots,x_{t-1})}$ do not depend on $x_1, \ldots, x_{t-1}$. This implies that the $Y_k$ are also independent under $Q$. Clearly, $q_1, q_2, q_{1 \mid (x_1,\ldots,x_{t-1})}, q_{2 \mid (x_1,\ldots,x_{t-1})} > 0$ if and only if $d < r < u$. So, in conclusion, the multi-period binomial model is arbitrage-free if and only if $d < r < u$, and in this case there exists a unique EMM for $X^1$ satisfying

$$Q[\{(x_1, \ldots, x_T)\}] = \prod_{t=1}^T q_{x_t},$$

where $q_1 := \frac{r-d}{u-d}$ and $q_2 := \frac{u-r}{u-d}$.

5.6 Valuation of contingent claims

So far, we have fully characterised those models for financial markets in finite discrete time that are reasonable in the sense that they are arbitrage-free. We proceed to study what we can say about the price or value of a derivative asset like a European call or put option in an arbitrage-free market. More precisely, we assume that the market without the derivative asset is arbitrage-free and want to price the derivative asset in such a way that there are no arbitrage opportunities in the extended market.

Definition 5.21. Let $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ be a financial market on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$. A contingent claim (with maturity $T$) is a nonnegative $\mathcal{F}_T$-measurable random variable $C$.

Example 5.22. (a) The owner of a European call option on asset $i \in \{1, \ldots, d\}$ with strike $K > 0$ and maturity $T$ has the right but not the obligation to buy the asset $i$ at time $T$ for the price $K$. Of course, any rational person will exercise (i.e. make use of) the right if and only if $S^i_T(\omega) > K$, and in this case the net payoff of the option is $S^i_T(\omega) - K$. Otherwise, i.e., if $S^i_T(\omega) \le K$, the option is worthless. So the value of the option at time $T$ is given by the contingent claim

$$C = (S^i_T - K)^+,$$

where $x^+ := \max(0, x)$ for $x \in \mathbb{R}$.

(b) The owner of a European put option on asset $i \in \{1, \ldots, d\}$ with strike $K > 0$ and maturity $T$ has the right but not the obligation to sell the asset $i$ at time $T$ for the price $K$. Of course, any rational person will exercise (i.e. make use of) the right if and only if $S^i_T(\omega) < K$, and in this case the net payoff of the option is $K - S^i_T(\omega)$. Otherwise, i.e., if $S^i_T(\omega) \ge K$, the option is worthless. So the value of the option at time $T$ is given by the contingent claim

$$C = (K - S^i_T)^+.$$

(c) Let $A$ denote the event of an extreme weather situation like hail at time $T$. It is natural to assume that $A$ is $\mathcal{F}_T$-measurable but independent of the market $\bar S$. A toy example of a weather derivative is a contract that pays one unit of money at time $T$ if the extreme event $A$ happens and zero otherwise. The corresponding contingent claim is given by

$$C = 1_A.$$

Unlike the call or put option, this is not a derivative security.

Remark 5.23. (a) The (somewhat confusing) qualifier European signifies that the contingent claim may be exercised only at one date, i.e., at maturity. By contrast, so-called American
contingent claims can be exercised at any time up to and including maturity. Whereas in reality most contingent claims are American, we only consider European ones in the sequel because the theory for them is somewhat easier.

(b) The notion of a European contingent claim can be extended to contracts with maturity $t < T$. A contingent claim $C$ with maturity $t < T$ is just a nonnegative $\mathcal{F}_t$-measurable random variable.

We proceed to study the question how we can assign to a contingent claim $C$ a value at times $t < T$, in particular at $t = 0$. Assuming that the underlying market $\bar S$ satisfies NA, we want to do this in such a way that we do not create any new arbitrage opportunities. The following example illustrates that this is not as straightforward as one might think, and naively approaching the problem does not work.

Example 5.24. Consider the one-period model $\bar S = (S^0_t, S^1_t)_{t \in \{0,1\}}$ described by the following trees, where the numbers beside the branches denote probabilities.

$S^0$: 1 → 1
$S^1$: 100 → 110 in state $\omega_1$ (probability 0.98), 100 in state $\omega_2$ (probability 0.01), 90 in state $\omega_3$ (probability 0.01)

so that $S^1 = X^1$. It is not difficult to check that the market $\bar S$ satisfies NA. Now consider a call option on $S^1$ with strike $K = 100$ and maturity 1, i.e., the contingent claim

$$C = (S^1_1 - 100)^+.$$

If we agree that $v^C \in \mathbb{R}$ is a fair value at time 0, we can represent this new asset $S^2$ by the tree

$S^2$: $v^C$ → 10 in state $\omega_1$ (probability 0.98), 0 in state $\omega_2$ (probability 0.01), 0 in state $\omega_3$ (probability 0.01)

But what is a fair value for $v^C$?

A first guess might be to take the expectation $v^C = E[C] = 9.8$.⁷² However, this is not a good idea. Consider the strategy $\bar\vartheta \mathrel{\hat=} (0, (0.1, -1))$ for the extended market $(S^0, S^1, S^2)$. Then the extended gains process $G((0.1, -1))$ satisfies

$$G_1((0.1, -1))(\omega) = 0.1 \times (X^1_1 - X^1_0)(\omega) - 1 \times (X^2_1 - X^2_0)(\omega) = \begin{cases} 0.1 \times (110 - 100) - 1 \times (10 - 9.8) = 0.8 > 0 & \text{if } \omega = \omega_1, \\ 0.1 \times (100 - 100) - 1 \times (0 - 9.8) = 9.8 > 0 & \text{if } \omega = \omega_2, \\ 0.1 \times (90 - 100) - 1 \times (0 - 9.8) = 8.8 > 0 & \text{if } \omega = \omega_3. \end{cases}$$

So by Proposition 5.16, the extended market $(S^0, S^1, S^2)$ admits arbitrage.

⁷² Note that there is no need to discount as the interest rate is 0.

As the above arbitrage strategy involves shortselling the option, it follows that the option was overpriced, i.e., we need a smaller value for $v^C$. One might think that $v^C = 5$ (which is almost half of the expectation of $C$) is small enough. However, this is not the case. Consider the strategy $\bar\vartheta \mathrel{\hat=} (0, (0.5, -1))$ for the extended market $(S^0, S^1, S^2)$. Then the extended gains process $G((0.5, -1))$ satisfies

$$G_1((0.5, -1))(\omega) = \begin{cases} 0.5 \times (110 - 100) - 1 \times (10 - 5) = 0 & \text{if } \omega = \omega_1, \\ 0.5 \times (100 - 100) - 1 \times (0 - 5) = 5 > 0 & \text{if } \omega = \omega_2, \\ 0.5 \times (90 - 100) - 1 \times (0 - 5) = 0 & \text{if } \omega = \omega_3, \end{cases}$$

so that $G_1 \ge 0$ and $P[G_1 > 0] = 0.01 > 0$. By Proposition 5.16, the extended market again admits arbitrage.

We proceed to describe in a systematic way how to find fair, i.e., arbitrage-free, values for a contingent claim. We first consider the ideal case.

Definition 5.25. Let $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ be a financial market on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$. A contingent claim $C$ is called attainable or replicable if there exists a self-financing trading strategy $\bar\vartheta$ such that

$$\bar\vartheta_T \cdot \bar S_T = C \quad P\text{-a.s.}$$

In this case $\bar\vartheta$ is called a replication strategy or (perfect) hedge for $C$.

A contingent claim $C$ is attainable if and only if the corresponding discounted contingent claim

$$H := \frac{C}{S^0_T}$$
satisfies

$$H = V_T(\bar\vartheta) \quad P\text{-a.s.}$$

for some self-financing strategy $\bar\vartheta$, and in this case, we say that the discounted contingent claim $H$ is attainable and call $\bar\vartheta$ a replication strategy for $H$.

The following result shows that for arbitrage-free markets, the value process of an attainable contingent claim is unique and can be easily computed if one knows at least one EMM.

Theorem 5.26. Let $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ be a financial market on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$ and $H$ an attainable discounted contingent claim. Assume that $\bar S$ satisfies NA and denote by $\mathcal{P}$ the set of all EMMs for $X = S/S^0$. Then

(a) $H$ is $Q$-integrable for all $Q \in \mathcal{P}$.

(b) $E^{Q_1}[H \mid \mathcal{F}_t] = E^{Q_2}[H \mid \mathcal{F}_t]$ $P$-a.s. for all $Q_1, Q_2 \in \mathcal{P}$ and all $t \in \{0, \ldots, T\}$.

(c) There exists a $P$-a.s. unique adapted process $(V^H_t)_{t \in \{0,\ldots,T\}}$ with $V^H_T = H$ $P$-a.s. such that the extended $(1+d+1)$-dimensional market $(\bar S, S^0 V^H) = (S^0_t, S^1_t, \ldots, S^d_t, S^0_t V^H_t)_{t \in \{0,\ldots,T\}}$ satisfies NA. It is given by

$$V^H_t = E^Q[H \mid \mathcal{F}_t] \quad P\text{-a.s.},\ t \in \{0, \ldots, T\}, \text{ for all } Q \in \mathcal{P}.$$

(d) If $\bar\vartheta = (\vartheta^0_t, \vartheta_t)_{t \in \{1,\ldots,T\}}$ is any replication strategy for $H$, then $V^H_t = V_t(\bar\vartheta)$ $P$-a.s., for $t \in \{0, \ldots, T\}$.

Proof. As $H$ is attainable, there exists a self-financing strategy $\bar\vartheta = (\vartheta^0_t, \vartheta_t)_{t \in \{1,\ldots,T\}}$ such that

$$H = V_T(\bar\vartheta) \quad P\text{-a.s.}$$

Let $Q \in \mathcal{P}$ and note that $\mathcal{P} \ne \emptyset$ by the fundamental theorem of asset pricing (Theorem 5.19). As $H = V_T(\bar\vartheta) \ge 0$ $P$-a.s. and hence also $Q$-a.s., it follows from Theorem 5.18 that $V(\bar\vartheta)$ is a $Q$-martingale. This implies in particular that $H = V_T(\bar\vartheta)$ is $Q$-integrable, and so we have (a).

Next, fix $Q_1 \in \mathcal{P}$ and define the process $(V^H_t)_{t \in \{0,\ldots,T\}}$ by

$$V^H_t = E^{Q_1}[H \mid \mathcal{F}_t], \quad t \in \{0, \ldots, T\}. \tag{5.13}$$

Then the $Q_1$-martingale property of $V(\bar\vartheta)$, the fact that $H = V_T(\bar\vartheta)$ and the fact that $Q_1 \approx P$ imply that

$$V^H_t = E^{Q_1}[H \mid \mathcal{F}_t] = E^{Q_1}\big[V_T(\bar\vartheta) \,\big|\, \mathcal{F}_t\big] = V_t(\bar\vartheta) \quad P\text{-a.s.},\ t \in \{0, \ldots, T\}. \tag{5.14}$$

As the left-hand side of (5.14) does not depend on $\bar\vartheta$, we have (d).

Now, let $Q \in \mathcal{P}$ be arbitrary. Then by the $Q$-martingale property of $V(\bar\vartheta)$, the fact that $H = V_T(\bar\vartheta)$ $Q$-a.s. and (5.14), we obtain

$$E^Q[H \mid \mathcal{F}_t] = E^Q\big[V_T(\bar\vartheta) \,\big|\, \mathcal{F}_t\big] = V_t(\bar\vartheta) = V^H_t \quad P\text{-a.s.},\ t \in \{0, \ldots, T\}. \tag{5.15}$$

As the right-hand side of (5.15) does not depend on $Q$, we have (b).

Finally, if $(V_t)_{t \in \{0,\ldots,T\}}$ is an adapted process with $V_T = H$, by the Fundamental Theorem of Asset Pricing, the extended market $(\bar S, S^0 V) = (S^0_t, S^1_t, \ldots, S^d_t, S^0_t V_t)_{t \in \{0,\ldots,T\}}$ satisfies NA if and only if there exists an EMM for $(X, V)$, i.e., if and only if there is $Q \in \mathcal{P}$ such that $V$ is a $Q$-martingale. If there exists $Q \in \mathcal{P}$ such that $V$ is a $Q$-martingale, then by the fact that $H = V_T$ $Q$-a.s. (this uses that $Q \approx P$), the martingale property and (5.15), we obtain

$$V_t = E^Q[V_T \mid \mathcal{F}_t] = E^Q[H \mid \mathcal{F}_t] = V^H_t \quad P\text{-a.s.},\ t \in \{0, \ldots, T\}.$$

On the other hand, $V^H$ is a $Q$-martingale for any $Q \in \mathcal{P}$ by (5.15) and Example 5.7(b). So we have (c).

Theorem 5.26 completely answers all questions concerning the valuation of a (discounted) contingent claim $H$ provided that it is attainable. However, it does not give any criterion to determine whether a contingent claim is attainable or not, nor does it provide any guidance concerning the valuation of non-attainable contingent claims. Not surprisingly, this is more difficult.

The next result provides a necessary and sufficient criterion to decide whether a contingent claim is attainable or not. Moreover, it gives a full description of all arbitrage-free prices at time 0 for a non-attainable contingent claim.⁷³ For a proof, we refer to [2, Theorems 5.29 and 5.32].

⁷³ One can also give a full characterisation of all arbitrage-free values at intermediate times $t \in \{1, \ldots, T-1\}$ for a non-attainable contingent claim. However, this is rather complicated.

Theorem 5.27. Let $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ be a financial market on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$ and $H$ a discounted contingent claim. Assume that $\bar S$ satisfies NA and denote by $\mathcal{P}$ the set of all EMMs for $X = S/S^0$. Then the set of arbitrage-free prices for $H$ is non-empty and given by

$$\Pi(H) = \big\{ E^Q[H] : Q \in \mathcal{P} \text{ and } E^Q[H] < \infty \big\}.$$

Moreover:

(a) $H$ is attainable if and only if $\Pi(H)$ consists of a single element. In this case,

$$\Pi(H) = \{V^H_0\},$$
where $V^H$ is as in Theorem 5.26.

(b) $H$ is not attainable if and only if $\Pi(H)$ is an open interval,

$$\Pi(H) = \big(\pi_{\inf}(H), \pi_{\sup}(H)\big),$$

where $\pi_{\inf}(H) < \pi_{\sup}(H)$ and

$$\pi_{\inf}(H) := \inf_{Q \in \mathcal{P}} E^Q[H] \in [0, \infty) \quad \text{and} \quad \pi_{\sup}(H) := \sup_{Q \in \mathcal{P}} E^Q[H] \in (0, \infty].$$

With the help of Theorem 5.27, we can complete Example 5.24.

Example 5.28. Consider the setup of Example 5.24. If we describe an equivalent measure $Q \approx P$ by the probability vector $(q_1, q_2, q_3)$, where $q_i = Q[\{\omega_i\}]$, $i \in \{1, 2, 3\}$, one can check that the set $\mathcal{P}$ of equivalent martingale measures for $X^1$ is given by

$$\mathcal{P} = \{Q_\lambda : \lambda \in (0, 1)\},$$

where $Q_\lambda$ is described by the probability vector

$$\big(\tfrac{1}{2}\lambda,\ 1 - \lambda,\ \tfrac{1}{2}\lambda\big).$$

So by Theorem 5.27 and the fact that $H$ is bounded, the set of arbitrage-free prices for $H = C$ (because $S^0_1 = 1$) is given by

$$\Pi(H) = \big\{ E^{Q_\lambda}[H] : \lambda \in (0, 1) \big\} = \big\{ \tfrac{1}{2}\lambda \times 10 : \lambda \in (0, 1) \big\} = (0, 5).$$

As $\Pi(H)$ contains more than one element, it follows that $H$ is not attainable and any number in $(0, 5)$ is a fair value for $v^C$.

Let us summarise the steps to value a (discounted) contingent claim $H$ in an arbitrage-free financial market $\bar S$ in finite discrete time.

(1) Find the set $\mathcal{P}$ of all EMMs $Q$ for $X = S/S^0$.

(2) Compute $E^Q[H]$ for all $Q \in \mathcal{P}$.

(3a) If $Q \mapsto E^Q[H]$ is constant, then $H$ is attainable, and its unique arbitrage-free value process is given by

$$V^H_t = E^Q[H \mid \mathcal{F}_t], \quad t \in \{0, \ldots, T\},$$

where $Q$ is any EMM.

(3b) If $Q \mapsto E^Q[H]$ is not constant, then $H$ is not attainable and $E^Q[H]$ is a fair price at time 0 for every $Q \in \mathcal{P}$ with $E^Q[H] < \infty$. Which $Q \in \mathcal{P}$ one should choose exactly is an ongoing debate in the literature, and there is no easy answer to this question.

Warning: In large parts of the literature, in particular in credit risk and in more applied and computational settings, one often just fixes one nice equivalent martingale measure $Q$ (often referred to as the risk-neutral measure) and calls

$$V^{H,Q}_t := E^Q[H \mid \mathcal{F}_t]$$

the risk-neutral price of $H$ at time $t$. However, if $H$ is not attainable, $V^{H,Q}_t$ crucially depends on $Q$, so that one should at least think very carefully which $Q \in \mathcal{P}$ one chooses. Otherwise, there might not be much economic warrant for the resulting prices. The key problem is that unlike in the complete case, those prices are not linked to any hedging strategy, and so if one sells a derivative product for a certain price, it is unclear how to hedge the risk involved with selling the product.

5.7 Complete markets

Valuation of contingent claims is much nicer for attainable than for non-attainable contingent claims. The best possible case is if all contingent claims are attainable.

Definition 5.29. A financial market $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ on a filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$ is called complete if each contingent claim $C$ is attainable. Otherwise, it is called incomplete.

The next result shows that there is a very simple criterion in terms of EMMs to decide whether an arbitrage-free financial market is complete or incomplete.

Theorem 5.30. Let $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ be a financial market on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$. Assume that $\bar S$ satisfies NA. Then $\bar S$ is complete if and only if there exists a unique EMM for $X = S/S^0$.

Proof. Denote by $\mathcal{P}$ the set of all EMMs for $X$. As $\bar S$ satisfies NA, it is nonempty by the fundamental theorem of asset pricing (Theorem 5.19). First assume that $\mathcal{P} = \{Q\}$ consists of a single element. Then by Theorem 5.27, for each discounted contingent claim $H$,

$$\Pi(H) = \{E^Q[H]\},$$

and so $H$ is attainable. As an (undiscounted) contingent claim $C$ is attainable if and only if the corresponding discounted contingent claim $H = C/S^0_T$ is attainable, it follows that $\bar S$ is complete.
Conversely, assume that $\bar S$ is complete. Let $Q_1, Q_2 \in \mathcal{P}$. Then for $A \in \mathcal{F} = \mathcal{F}_T$, applying Theorem 5.26(b) for $t = 0$ to the discounted contingent claim $H := 1_A$, it follows that

$$Q_1[A] = E^{Q_1}[H] = E^{Q_1}[H \mid \mathcal{F}_0] = E^{Q_2}[H \mid \mathcal{F}_0] = E^{Q_2}[H] = Q_2[A].$$

As $A \in \mathcal{F}$ was arbitrary, it follows that $Q_1 = Q_2$.

Theorem 5.30 is sometimes called the second fundamental theorem of asset pricing. Together with the first fundamental theorem of asset pricing (Theorem 5.19) it gives a very beautiful and conclusive description of financial markets in finite discrete time:

Existence of an EMM is equivalent to the market being arbitrage-free.

Uniqueness of an EMM is equivalent to the market being complete (and arbitrage-free).

Both results can be extended to continuous or infinite discrete time. However, the precise formulations become more subtle and the proofs far more difficult.

Remark 5.31. One can show that if a financial market $\bar S$ is complete, then necessarily $\mathcal{F} = \mathcal{F}_T$ is finite. More precisely, it may have at most $(1+d)^T$ atoms; see [2, Theorem 5.37]. This shows that even though it makes things nice and simple, completeness is a very restrictive assumption.

As an application, we establish the so-called put-call parity for complete markets.⁷⁴

⁷⁴ Put-call parity also holds in incomplete markets. However, the precise formulation is a bit subtle.

Theorem 5.32. Let $\bar S = (S^0_t, S_t)_{t \in \{0,\ldots,T\}}$ be a complete and arbitrage-free financial market on some filtered probability space $(\Omega, \mathcal{F}, (\mathcal{F}_t)_{t \in \{0,\ldots,T\}}, P)$. Assume that $S^0_t = (1+r)^t$ for some constant interest rate $r > -1$, $t \in \{0, \ldots, T\}$. Fix $i \in \{1, \ldots, d\}$ and $K > 0$. Denote by $V^{H^{\mathrm{call}}}$ the (discounted) value process corresponding to the (undiscounted) call option

$$C^{\mathrm{call}} = (S^i_T - K)^+$$

on asset $i$ with (undiscounted) strike $K$ and maturity $T$ and by $V^{H^{\mathrm{put}}}$ the (discounted) value process corresponding to the (undiscounted) put option

$$C^{\mathrm{put}} = (K - S^i_T)^+$$

on asset $i$ with (undiscounted) strike $K$ and maturity $T$. Then we have put-call parity:

$$V^{H^{\mathrm{call}}}_t - V^{H^{\mathrm{put}}}_t = X^i_t - \frac{K}{(1+r)^T} \quad P\text{-a.s.},\ t \in \{0, \ldots, T\}.$$

Proof. The (discounted) contingent claims corresponding to the put and the call option are given by

$$H^{\mathrm{call}} = \frac{(S^i_T - K)^+}{S^0_T} = \Big( X^i_T - \frac{K}{(1+r)^T} \Big)^+, \qquad H^{\mathrm{put}} = \frac{(K - S^i_T)^+}{S^0_T} = \Big( \frac{K}{(1+r)^T} - X^i_T \Big)^+.$$

Note that

$$H^{\mathrm{call}} - H^{\mathrm{put}} = X^i_T - \frac{K}{(1+r)^T}. \tag{5.16}$$

Since $\bar S$ is arbitrage-free and complete, both the call and the put option are attainable, and there exists a unique EMM $Q$ for the discounted risky assets $X = S/S^0$ by Theorems 5.19 and 5.30. Moreover, by Theorem 5.26(c),

$$V^{H^{\mathrm{call}}}_t = E^Q\big[H^{\mathrm{call}} \,\big|\, \mathcal{F}_t\big] \ P\text{-a.s.} \quad \text{and} \quad V^{H^{\mathrm{put}}}_t = E^Q\big[H^{\mathrm{put}} \,\big|\, \mathcal{F}_t\big] \ P\text{-a.s.}$$

Thus, by linearity of conditional expectations, (5.16) and the $Q$-martingale property of $X^i$, we obtain

$$V^{H^{\mathrm{call}}}_t - V^{H^{\mathrm{put}}}_t = E^Q\big[H^{\mathrm{call}} - H^{\mathrm{put}} \,\big|\, \mathcal{F}_t\big] = E^Q\Big[ X^i_T - \frac{K}{(1+r)^T} \,\Big|\, \mathcal{F}_t \Big] = X^i_t - \frac{K}{(1+r)^T} \quad P\text{-a.s.},\ t \in \{0, \ldots, T\}.$$

5.8 Pricing and hedging in the binomial model

We conclude this chapter by briefly outlining how the preceding theory can be applied in the special case of a binomial model $\bar S = (S^0_t, S^1_t)_{t \in \{0,\ldots,T\}}$, where we always assume that $u > r > d > -1$ so that $\bar S$ is arbitrage-free; cf. Examples 5.8 and 5.20.

It follows from Example 5.20 that $\bar S$ admits a unique EMM $Q$ parametrised by $q_1 = \frac{r-d}{u-d}$ and $q_2 = \frac{u-r}{u-d}$. Hence the model is arbitrage-free and complete by Theorems 5.19 and 5.30.

If $H$ is a discounted contingent claim, it is attainable by completeness of $\bar S$. So by Theorem 5.26, it admits a unique arbitrage-free price process $(V^H_t)_{t \in \{0,\ldots,T\}}$ given by

$$V^H_t := E^Q[H \mid \mathcal{F}_t], \quad t \in \{0, \ldots, T\}.$$

By the $Q$-martingale property of $V^H$, we can calculate the latter by the recursive algorithm

$$V^H_T := H \quad \text{and} \quad V^H_{t-1} := E^Q\big[V^H_t \,\big|\, \mathcal{F}_{t-1}\big], \quad t \in \{1, \ldots, T\}.$$

Using that $\mathcal{F}_t = \sigma(A_{(x_1,\ldots,x_t)} : x_1, \ldots, x_t \in \{1, 2\})$ for $t \in \{1, \ldots, T\}$, it follows from Examples
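5.2 and 5.20 that

$$V^H_t = \sum_{x_1, \ldots, x_t \in \{1,2\}} v^H_t(x_1, \ldots, x_t)\, 1_{A_{(x_1,\ldots,x_t)}}, \quad t \in \{1, \ldots, T\}, \qquad V^H_0 = v^H_0,$$

where the functions $v^H_t : \{1,2\}^t \to \mathbb{R}$, $t \in \{1, \ldots, T\}$, and the number $v^H_0$ can be calculated recursively by

$$v^H_{t-1}(x_1, \ldots, x_{t-1}) = q_1\, v^H_t(x_1, \ldots, x_{t-1}, 1) + q_2\, v^H_t(x_1, \ldots, x_{t-1}, 2), \quad t \in \{1, \ldots, T\},$$

starting from the terminal values $v^H_T(x_1, \ldots, x_T)$ of $H$ on the paths $(x_1, \ldots, x_T)$.

If $\bar\vartheta^H \mathrel{\hat=} (V_0(\bar\vartheta^H), \vartheta^{H,1})$ is a replication strategy for $H$, it follows from Theorem 5.26(d) that

$$V_t(\bar\vartheta^H) = V^H_t, \quad t \in \{0, \ldots, T\}.$$

So by Proposition 5.13,

$$V^H_t - V^H_{t-1} = V_t(\bar\vartheta^H) - V_{t-1}(\bar\vartheta^H) = G_t(\vartheta^H) - G_{t-1}(\vartheta^H) = \vartheta^{H,1}_t (X^1_t - X^1_{t-1}), \quad t \in \{1, \ldots, T\}.$$

Rearranging yields

$$\vartheta^{H,1}_t = \frac{V^H_t - V^H_{t-1}}{X^1_t - X^1_{t-1}} =: \frac{\Delta V^H_t}{\Delta X^1_t}, \quad t \in \{1, \ldots, T\}. \tag{5.17}$$

As $\vartheta^{H,1}_t$ is $\mathcal{F}_{t-1}$-measurable and $\mathcal{F}_{t-1} = \sigma(A_{(x_1,\ldots,x_{t-1})} : x_1, \ldots, x_{t-1} \in \{1, 2\})$, it follows that

$$\vartheta^{H,1}_t = \sum_{x_1, \ldots, x_{t-1} \in \{1,2\}} \zeta^H_t(x_1, \ldots, x_{t-1})\, 1_{A_{(x_1,\ldots,x_{t-1})}}, \quad t \in \{2, \ldots, T\}, \qquad \vartheta^{H,1}_1 = \zeta^H_1,$$

where the functions $\zeta^H_t : \{1,2\}^{t-1} \to \mathbb{R}$, $t \in \{2, \ldots, T\}$, and the number $\zeta^H_1$ in $\mathbb{R}$ can be calculated as

$$\zeta^H_t(x_1, \ldots, x_{t-1}) = \frac{v^H_t(x_1, \ldots, x_{t-1}, 1) - v^H_{t-1}(x_1, \ldots, x_{t-1})}{\xi^1_t(x_1, \ldots, x_{t-1}, 1) - \xi^1_{t-1}(x_1, \ldots, x_{t-1})}, \qquad \zeta^H_1 = \frac{v^H_1(1) - v^H_0}{\xi^1_1(1) - \xi^1_0}$$

(the same value is obtained with $x_t = 2$ in place of $x_t = 1$), where $\xi^1_t(x_1, \ldots, x_t) := S^1_0 \prod_{k=1}^t \frac{y_{x_k}}{1+r}$, $t \in \{1, \ldots, T\}$, and $\xi^1_0 := S^1_0$.

Remark 5.33. (a) Formula (5.17) can be seen as the discrete-time version of the Delta-Hedge, i.e., the derivative of the value process with respect to the underlying.

(b) Note that the value process $V^H$ and the (risky part of the) hedging strategy $\vartheta^{H,1}$ can both be computed recursively.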
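The backward recursion for the value process and the discrete delta (5.17) translate directly into code. The following Python sketch is one possible implementation under stated assumptions: paths are tuples with entries in {1, 2}, the claim depends only on the terminal stock price, and the names `xi`, `price_and_hedge` and all parameter values are hypothetical.

```python
from itertools import product


def xi(path, s0, u, d, r):
    """Discounted stock price s0 * prod_k y_{x_k} / (1 + r) along a (partial) path."""
    x = s0
    for step in path:
        x *= ((1 + u) if step == 1 else (1 + d)) / (1 + r)
    return x


def price_and_hedge(payoff, T, s0, u, d, r):
    """Return discounted values v_t and hedge ratios zeta_t, indexed by partial paths.

    payoff maps the terminal stock price S^1_T to the undiscounted claim C.
    """
    q1 = (r - d) / (u - d)   # risk-neutral probability of an up move
    q2 = (u - r) / (u - d)   # risk-neutral probability of a down move
    # Terminal condition: v_T = H = C / (1 + r)^T on every full path.
    values = {T: {p: payoff(xi(p, s0, u, d, r) * (1 + r) ** T) / (1 + r) ** T
                  for p in product((1, 2), repeat=T)}}
    hedge = {}
    for t in range(T, 0, -1):
        values[t - 1], hedge[t] = {}, {}
        for p in product((1, 2), repeat=t - 1):
            v_up, v_down = values[t][p + (1,)], values[t][p + (2,)]
            v_prev = q1 * v_up + q2 * v_down          # one-step conditional expectation
            values[t - 1][p] = v_prev
            # Discrete delta (5.17), evaluated on the up branch.
            hedge[t][p] = (v_up - v_prev) / (xi(p + (1,), s0, u, d, r) - xi(p, s0, u, d, r))
    return values, hedge


# Illustrative parameters (assumptions): three periods, call option with strike 100.
T, s0, u, d, r, K = 3, 100.0, 0.2, -0.1, 0.05, 100.0
values, hedge = price_and_hedge(lambda s: max(s - K, 0.0), T, s0, u, d, r)
v0 = values[0][()]   # discounted time-0 price
```

A natural check of such an implementation is that $v_0$ plus the accumulated gains of the hedge reproduces the discounted payoff on every path. The sketch stores values for all $2^t$ partial paths, which is simple but exponential in $T$; for path-independent claims a recombining tree would be the more economical design.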
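As a numerical sanity check of put-call parity in the binomial model, the following Python sketch computes discounted call and put values as conditional expectations under the product measure with weights $q_1 = \frac{r-d}{u-d}$ and $q_2 = \frac{u-r}{u-d}$, and compares their difference with $X^1_t - K/(1+r)^T$ on every partial path. The helper names and all parameter values are hypothetical choices for illustration.

```python
from itertools import product


def cond_value(payoff, prefix, T, s0, u, d, r):
    """Discounted value E^Q[H | first len(prefix) moves], with H = payoff(S_T) / (1 + r)^T."""
    q = {1: (r - d) / (u - d), 2: (u - r) / (u - d)}
    y = {1: 1 + u, 2: 1 + d}
    total = 0.0
    for tail in product((1, 2), repeat=T - len(prefix)):
        prob, s_T = 1.0, s0
        for step in prefix:          # moves already observed
            s_T *= y[step]
        for step in tail:            # future moves, weighted by Q
            prob *= q[step]
            s_T *= y[step]
        total += prob * payoff(s_T) / (1 + r) ** T
    return total


# Illustrative parameters (assumptions, with d < r < u).
T, s0, u, d, r, K = 4, 100.0, 0.15, -0.1, 0.02, 105.0


def call(s):
    return max(s - K, 0.0)


def put(s):
    return max(K - s, 0.0)


def parity_gap(prefix):
    """(V^call_t - V^put_t) - (X^1_t - K / (1 + r)^T) on the event described by prefix."""
    x_t = s0
    for step in prefix:
        x_t *= ((1 + u) if step == 1 else (1 + d)) / (1 + r)
    lhs = cond_value(call, prefix, T, s0, u, d, r) - cond_value(put, prefix, T, s0, u, d, r)
    return lhs - (x_t - K / (1 + r) ** T)


# One gap per partial path of every length t = 0, ..., T.
gaps = [parity_gap(p) for t in range(T + 1) for p in product((1, 2), repeat=t)]
```

Up to floating-point error, every entry of `gaps` should vanish, in line with Theorem 5.32.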
References
[1] D. Bertsekas, Nonlinear programming, second ed., Athena Scientific, Belmont, MA, 1999.
[2] H. Föllmer and A. Schied, Stochastic Finance, 4th ext. ed., de Gruyter Studies in Mathematics, vol. 27, Walter de Gruyter & Co., Berlin, 2016.
[3] H.-O. Georgii, Stochastics, De Gruyter Textbook, Walter de Gruyter & Co., Berlin, 2008.
[4] J. C. Hull, Options, futures, and other derivatives, eighth ed., Pearson, Boston, MA, 2012.
[5] J. Jacod and P. Protter, Probability essentials, second ed., Universitext, Springer-Verlag, Berlin, 2003.
[6] S. Le Roy and J. Werner, Principles of Financial Economics, second ed., Cambridge University Press, New York, NY, 2014.
[7] A. McNeil, R. Frey, and P. Embrechts, Quantitative risk management, second revised ed., Princeton Series in Finance, Princeton University Press, Princeton, NJ, 2015.