
Problem 1 Nuclear reactor (20 credits)

A new design of a nuclear reactor is suggested for which it is claimed that a nuclear meltdown is impossible. In order
to verify this, a numerical code h (θ) has been developed which (perfectly) simulates the reactor under operating
conditions θ, where it is known that θ ∼ p (θ). The numerical code returns for each θ either 1 or 0 according to

h(θ) = { 1   if nuclear meltdown occurs
       { 0   if no nuclear meltdown occurs        (1.1)

Using this code, a Monte Carlo simulation with N = 10^8 samples has been run in order to estimate the probability q of a nuclear meltdown. This Monte Carlo estimate of the failure probability q was exactly zero.

We introduce the following two hypotheses:

• H1: nuclear meltdown is impossible, i.e. q = 0
• H2: nuclear meltdown is possible but rare, i.e. q = 10^-10

We assume that the hypotheses are a-priori equally probable.

a) Write down the Monte Carlo estimator used to obtain the numerical estimate.

The Monte Carlo estimator is given by

Î = (1/N) Σ_{i=1}^{N} h(θ^(i)),   θ^(i) ∼ p(θ)

with θ^(i) i.i.d. samples from p(θ) and N = 10^8.
b) Assess the plausibility of H1 compared to H2 within the Bayesian framework given the results obtained during the Monte Carlo study. Can you give preference to either of the hypotheses?

Due to the a-priori equal probabilities of hypotheses H1 and H2 it follows that

p(H1|D) / p(H2|D) = p(D|H1) / p(D|H2)

Note that for a Monte Carlo estimator to yield an estimate of exactly zero, all N = 10^8 numerical simulations must have predicted no occurrence of a nuclear meltdown; as such the data is given by D = {0, 0, ..., 0} with |D| = 10^8. We therefore obtain the Bayes factor

p(H1|D) / p(H2|D) = p(D|H1) / p(D|H2) = 1 / (1 − q)^(10^8) ≈ 1.0101

i.e. one can conclude that the simulations run for the Monte Carlo analysis do not offer any evidence for a fail-safe design vs. a small failure probability of q = 10^-10.
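This number is quickly reproduced numerically (a minimal sketch; the log1p form is used only to avoid floating-point issues when evaluating (1 − q)^N for large N):

% Hedged check of the Bayes factor 1/(1-q)^N for q = 1e-10, N = 1e8
q = 1e-10; N = 1e8;
BF = exp(-N * log1p(-q))   % = (1-q)^(-N) ≈ exp(N*q) ≈ 1.0101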

Problem 2 SIR Model (20 credits)
The SIR model is a simple epidemiological model for the spread of an infectious disease within a population of N people. It is given by a set of ordinary differential equations

dS/dt = −β·(I·S)/N          (2.1)
dI/dt = β·(I·S)/N − γ·I     (2.2)
dR/dt = γ·I                 (2.3)

where

• β > 0, γ > 0 are model parameters,
• S: number of susceptible people (i.e. people who can get infected),
• I: number of infected people,
• R: number of recovered people.
• Note that S + I + R = N (constant) for all t (i.e. d(S+I+R)/dt = 0).

The variables S, I, R depend on time and their values at t = 0 are known. A dataset D_K has been collected which contains the number of infected people I_1, ..., I_K at time instances t_1, ..., t_K, i.e. D_K = {(t_k, I_k)}_{k=1}^{K}.

A colleague of yours has already written:

• a Matlab function prior which takes as input (β, γ) and returns the corresponding value of the prior p(β, γ)
• a Matlab function likelihood which takes as input (β, γ) and returns the likelihood of the data D_K.

You are asked to:

a) complete the function PandemicRisk which takes as inputs the function handles prior and likelihood and returns an estimate of the probability q = Pr(β/γ > 1). (The ratio R_0 = β/γ is known as the basic reproduction number; when R_0 > 1 the number of infected people grows at an exponential rate.)

Note: Both the likelihood and prior implementations return probability zero if either β ≤ 0 or γ ≤ 0. Any implementation that runs in finite time is acceptable - efficiency is no consideration.
1 function q = PandemicRisk(prior, likelihood)
2
3
4
5 % the first part of the problem consists of writing a standard MCMC
6 % sampler (random walk Metropolis) as introduced in the lecture and / or exercise
7
8 N = 10^8; stepsize = 1;
9 beta = 1; gamma = 1;
10 p = prior(beta,gamma) * likelihood(beta,gamma);
11
12 posterior = @(beta, gamma) prior(beta,gamma)*likelihood(beta,gamma);
13
14 beta_samples = zeros(1,N);
15 gamma_samples = zeros(1,N);
16
17 for n=1:N
18
19     beta_proposed = beta + stepsize*randn();
20     gamma_proposed = gamma + stepsize*randn();
21     p_proposed = posterior(beta_proposed, gamma_proposed);
22
23     if rand() < p_proposed / p   % Metropolis accept-reject
24         p = p_proposed;
25         beta = beta_proposed;
26         gamma = gamma_proposed;
27     end
28
29     beta_samples(n) = beta; gamma_samples(n) = gamma;
30
31 end
32
33 % the second part of the problem implies a Monte Carlo estimator
34 % using the samples created by MCMC (note the element-wise division ./)
35 q = mean((beta_samples ./ gamma_samples) > 1);
36
37
38 end

b) describe or mark within your code of the PandemicRisk function where the numerical expense of solving the ordinary differential equations (i.e. Eqs. (2.1) - (2.3)) occurs

The numerical burden of solving the ODEs occurs in Line 21, where the posterior and thus the likelihood of the proposed β, γ values is evaluated; every likelihood evaluation requires a forward solve of the ODE system.

Problem 3 Linear latent variable model (15 credits)

A set of random vectors x^(i) ∈ R^4 is generated as follows:

x^(i) = L y^(i) + γ,   y^(i) ∼ N(0, Σ),   i = 1, ..., N   (3.1)

where:

• y^(i) ∈ R^4 and Σ = diag(β1^-1, β2^-1, β3^-1, β4^-1) with β1 < β2 < β3 < β4 (given).
• L ∈ R^{4×4} (given) and orthogonal, i.e. L L^T = I
• γ ∈ R^4 (given)

Given the data D = {x^(i)}_{i=1}^{N} we use an Expectation-Maximization algorithm to find the maximum likelihood estimate of the parameters W, b, σ^2 of a linear latent variable model

x = W z + b + ε,   ε ∼ N(0, σ^2 I)   (3.2)

where

• z ∈ R^2 and z ∼ N(0, I).
• W ∈ R^{4×2}, b ∈ R^4

As N → ∞ (i.e. as the number of data increases), what would be the maximum likelihood estimates of the following parameters (briefly justify / explain your answer)?

a) Maximum likelihood estimate of parameter b ∈ R^4

With E[x] = γ we will obtain b = γ.

b) Maximum likelihood estimate of parameter W ∈ R^{4×2}

Since L defines a rotation, it is clear based on the ordering of the eigenvalues that the principal directions, i.e. span(W), will be given by the span of the first two column vectors of L (these carry the largest variances β1^-1, β2^-1). Given the spherical prior on z and the rotation-invariance in the latent space, no further statements can be made.
c) Maximum likelihood estimate of parameter σ^2 ∈ R_+

The MLE of σ^2 depends on the smallest eigenvalues of Σ, since the eigenvalues of the covariance of x are unaffected by L and γ, which correspond to a rotation and a shift. Given the ordering of the β_i, σ^2_MLE therefore only depends on β3 and β4, which define the largest precision values, i.e.

σ^2_MLE = (1/(4−2)) (β3^-1 + β4^-1)
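This can be illustrated numerically (a minimal sketch; the β values and the random orthogonal matrix are arbitrary stand-ins for the given Σ and L):

% Hedged sketch: the noise MLE equals the average of the two discarded eigenvalues
betas = [1 2 4 8];               % arbitrary example with beta1 < beta2 < beta3 < beta4
Sigma = diag(1 ./ betas);
[Lmat, ~] = qr(randn(4));        % a random orthogonal matrix in place of the given L
Cx = Lmat * Sigma * Lmat';       % covariance of x (gamma only shifts the mean)
ev = sort(eig(Cx), 'descend');
sigma2_mle = mean(ev(3:4))       % = (1/beta3 + 1/beta4) / 2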

Problem 1 Auxiliary Variable Markov Chain Monte Carlo (15 credits)

We want to make use of Markov Chain Monte Carlo to sample from a posterior distribution π(x), with x ∈ R^D. To this end we consider an MCMC method where the proposal mechanism makes use of an auxiliary variable u ∼ U[0, 1], as explained in the code below. The function takes as input a function handle posterior (which returns π(x)) and the dimension D = dim(x).

Implement the accept-reject step (lines 20 - 38) to obtain a valid Metropolis-Hastings algorithm. Do not alter the code outside these lines, and do not alter the way in which proposals are generated.
1 function X = MCMC(posterior, D)
2
3 % The array X needs to contain valid samples from the posterior when returned
4 N = 10^8; X = zeros(N,D); x = ones(D,1);
5 % function handle takes D-dimensional vectors and returns posterior probability
6 p = posterior(x);
7
8 for n=1:N
9
10     % propose (using auxiliary variable u)
11     u = rand();
12     if u < 0.75
13         y = x + randn(D,1);
14     else
15         y = x + sqrt(10)*randn(D,1);
16     end
17
18     % ====================== IMPLEMENT ACCEPT-REJECT STEP ============================
19     % Implement an accept-reject step that is valid for the proposal used above
20
21     % q(y|x) is symmetric (a mixture of two zero-mean Gaussian random-walk steps),
22     % and therefore the solution is the default Metropolis accept-reject step
23     if rand() < posterior(y) / posterior(x)
24         x = y;
25     end
26
37     % ===============================================================================
38     X(n,:) = x;
39 end

Reminder: rand() returns a sample from U (0, 1), randn() returns a sample from N (0, 1).

Problem 2 Buckling Mode (10 credits)

We consider a beam under vertical and horizontal loading (vertical force F_V, horizontal force F_H). Buckling is assumed to occur if the horizontal force F_H exceeds 10% of the vertical force F_V, which we simplify to be given by the event F_H > 0.1·F_V. It is known that F_H and F_V are jointly Gaussian

[F_H; F_V] ∼ N(µ, Σ)

where the parameters µ and Σ are fully defined by the marginal distributions

F_H ∼ N(1, 0.5),   F_V ∼ N(10, 2)

and the correlation coefficient ρ ∈ [0, 1] of F_H and F_V.

Derive the probability q for buckling to occur.

Note: You may express your answer w.r.t. the CDF Φ(·) (or its inverse) of the distribution N(0, 1).

The probability q is found to be independent of ρ, since

q = p(F_H/F_V > 0.1) = p(F_H > 0.1·F_V) = p(F_H − 0.1·F_V > 0) = 1/2

The last step follows by noting that Z := F_H − 0.1·F_V is zero-mean Gaussian, i.e. E[F_H − 0.1·F_V] = 1 − 0.1·10 = 0.
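A quick Monte Carlo cross-check (a sketch; the ρ values are arbitrary, and sampling goes through a Cholesky factor so no toolbox is required):

% Hedged sketch: the buckling probability stays at 1/2 for any correlation rho
Nmc = 1e6;
for rho = [0, 0.5, 0.9]
    Sigma = [0.5, rho*sqrt(0.5*2); rho*sqrt(0.5*2), 2];
    A = chol(Sigma, 'lower');
    F = [1; 10] + A*randn(2, Nmc);      % row 1: F_H, row 2: F_V
    fprintf('rho = %.1f : q_hat = %.4f\n', rho, mean(F(1,:) > 0.1*F(2,:)));
end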
Problem 3 Statistical Independence (5 credits)

Consider a random vector x ∼ N(0, K) where K is:

K = [ 1    0.5  0    0    0
      0.5  1    0.5  0    0
      0    0.5  1    0.5  0
      0    0    0.5  1    0.5
      0    0    0    0.5  1 ]

Are x1, x5 statistically independent? Justify your answer.

The marginal p(x1, x5) is given by a Gaussian

[x1; x5] ∼ N(0, I)

(the relevant entries of K are K_11 = K_55 = 1 and K_15 = K_51 = 0) and therefore can be factorized as p(x1, x5) = p(x1) p(x5), implying x1 ⊥ x5.

Problem 4 Linear Kernel (5 credits)

We wish to use Gaussian Process regression to infer a mapping implied by the dataset {(x_n, f_n)}_{n=1}^{5}, where f_n = f(x_n). For this purpose we introduce a zero-mean Gaussian Process GP(0, C) with a linear covariance function defined by

C(x_i, x_j) = x_i^T x_j + δ_ij σ^2

with σ^2 ∈ R_+. We are interested in predicting f* = f(x*).

Specify in which circumstance you will obtain p(f*|x*, D) = p(f*|x*), i.e. the posterior is equal to the prior.

This is the case when the covariance between the observed data points and f* is zero. We therefore require x*^T x_n = 0 ∀n ∈ {1, 2, 3, 4, 5}, i.e. all x_n in the dataset need to be orthogonal to x* (leading to exclusively zero off-diagonal entries and no coupling).

Problem 5 Linear elastic bar (21 credits)

We consider a linear elastic bar of length l = 1 and cross-sectional area A_cs = 1.

The displacement of the linear elastic bar shown is given by

u(x) = θx

where θ is the reciprocal of the elastic modulus E > 0, i.e. θ = 1/E. Suppose θ is unknown and we try to determine it by measuring the displacement at x = 1. Suppose we obtain a noisy measurement u1 which is assumed to relate to u(x = 1) as follows:

u1 = u(x = 1) + ε,   ε ∼ N(0, σ^2)

We assume that a priori θ ≥ 0 follows an exponential distribution, i.e. p(θ) = λ exp(−λθ), where λ > 0 and σ^2 > 0 are given.

a) write the likelihood p(u1|θ)

With u1 = u(x = 1) + ε = θ + ε the likelihood follows as

p(u1|θ) = N(u1 | θ, σ^2)

alternatively, written out,

p(u1|θ) = (1/√(2πσ^2)) exp(−(u1 − θ)^2 / (2σ^2))

b) write the posterior (up to a multiplicative constant)

Noting that the prior p(θ) only has support for θ ≥ 0:

p(θ|u1) ∝ p(u1|θ) p(θ) ∝ exp(−(1/(2σ^2))(u1 − θ)^2) exp(−λθ) · I_{θ≥0}(θ)

with the indicator function

I_{θ≥0}(θ) = { 1 if θ ≥ 0
             { 0 else
c) determine the MAP estimate of θ (for σ^2 = 1, λ = 1/2)

p(θ|u1) ∝ exp(−(1/2)(u1 − θ)^2 − (1/2)θ) · I_{θ≥0}(θ)
        = exp(−(1/2)(u1^2 − 2θu1 + θ^2 + θ)) · I_{θ≥0}(θ)
        ∝ exp(−(1/2)(θ^2 − 2θ(u1 − 1/2))) · I_{θ≥0}(θ)
        ∝ exp(−(1/2)(θ − (u1 − 1/2))^2) · I_{θ≥0}(θ)

Therefore θ_MAP = max(u1 − 0.5, 0), i.e.

θ_MAP = { u1 − 0.5  if u1 > 0.5
        { 0         else

d) determine the Laplace approximation of the posterior if u1 = 1.5 (and σ^2 = 1, λ = 1/2)

From the result of the previous problem it follows directly that the Laplace approximation (for u1 > 0.5) is given by N(θ | u1 − 0.5, 1), i.e. N(θ | 1, 1). Alternatively one may obtain the result by taking the second derivative at the mode.

e) comment on the accuracy of the Laplace approximation

Required answer: The Laplace approximation introduces an error, since it places non-zero probability mass on θ < 0 compared to the posterior.
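This error can be visualized with a short numerical sketch (the grid is arbitrary; normpdf/normcdf denote the usual Matlab PDF/CDF of a Gaussian):

% Hedged sketch: Laplace approximation N(1,1) vs. the true truncated posterior
theta = linspace(0, 6, 601);
unnorm = exp(-0.5*(theta - 1).^2);       % posterior shape on theta >= 0 (u1 = 1.5)
post = unnorm / trapz(theta, unnorm);    % numerically normalized posterior
lap = normpdf(theta, 1, 1);              % Laplace approximation
mass_below_zero = normcdf(0, 1, 1)       % ≈ 0.159: Laplace mass placed at theta < 0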

Problem 6 Monte Carlo estimator (24 credits)

We wish to estimate the value of the integral

I = ∫_{−∞}^{+∞} x^3 e^{−2x^2} dx

with a Monte Carlo estimator of the form

Î = (1/N) Σ_{i=1}^{N} h(x^i),   x^i ∼ p(x^i)

where the x^i are independent and identically distributed random numbers drawn from a PDF p(x^i).

a) specify a valid choice for the form of h(·) and the distribution p(x^i)

With σ = 1/2:

I = ∫ x^3 exp(−2x^2) dx = ∫ x^3 √(2πσ^2) · (1/√(2πσ^2)) exp(−x^2/(2σ^2)) dx
  = ∫ x^3 √(2πσ^2) N(x | 0, 0.25) dx = E_{N(0,0.25)}[x^3 √(2πσ^2)]

this implies

h(x) = x^3 √(0.5π),   p(x^i) = N(0, 0.25)

b) explain why for the choices above the resulting Monte Carlo estimator will converge to I

The explanation should contain:

• E_{p(x^i)}[h(x^i)] = I and Var[Î] → 0 as N → ∞
• a reference to the law of large numbers or the central limit theorem


c) implement the function MonteCarlo which returns the estimate Î (using your previous choice of h(·), p(x^i)).

function I_hat = MonteCarlo()

% number of Monte Carlo samples
N = 10^8;

I_hat = 0;

for n = 1:N
    % sample x_i ~ N(0, 0.25), i.e. 0.5 times a standard normal
    x_i = 0.5*randn();
    % the factor sqrt(0.5*pi) accounts for the normalization of N(0, 0.25)
    I_hat = I_hat + (x_i^3)*sqrt(0.5*pi);
end

I_hat = I_hat / N;

end

Reminder: rand() returns a sample from U (0, 1), randn() returns a sample from N (0, 1).
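The same estimator can also be written in vectorized form (a sketch; N is reduced here so that the sample vector fits in memory):

% Hedged alternative: vectorized version of the loop above
N = 1e6;
I_hat = mean(sqrt(0.5*pi) * (0.5*randn(N,1)).^3);   % true value is I = 0 (odd integrand)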

d) consider an alternative estimator (for I) of the form

Ĩ = (1/N) Σ_{i=1}^{N} [h(x^i) + a(x^i − µ)]

where µ = E_{p(x^i)}[x^i] and x^i are again i.i.d. samples from p(x^i).

For which value of a will the estimator above converge the fastest? Provide a numerical value.

• You may use: the first four moments of a zero-mean Gaussian Y ∼ N(0, σ^2) are given by
  E[Y] = 0,   E[Y^2] = σ^2,   E[Y^3] = 0,   E[Y^4] = 3σ^4
• You can reuse known results from the lecture or exercise to expedite the solution

This is a Monte Carlo estimator making use of a control variate. As discussed e.g. in problem sheet 6, the optimal value is given by

a* = −Cov[h, x] / Var[x]

Since E[x] = E[h] = 0 (h is an odd function), we get

a* = −E[x · h] / Var[x]

Note that E[x · h] = √(2πσ^2) E_{N(0,0.25)}[x^4] = √(2πσ^2) · 3σ^4 and Var[x] = σ^2; therefore (with σ = 0.5)

a* = −(√(2πσ^2) · 3σ^4) / σ^2 = −√(2π) · 3σ^3 ≈ −0.94
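The variance reduction can be checked empirically (a sketch; the sample size and repetition count are arbitrary):

% Hedged sketch: control variate with a* ≈ -0.94 vs. the plain estimator
N = 1e4; reps = 200; a = -sqrt(2*pi)*3*0.5^3;
I_plain = zeros(reps,1); I_cv = zeros(reps,1);
for r = 1:reps
    x = 0.5*randn(N,1);
    h = sqrt(0.5*pi)*x.^3;
    I_plain(r) = mean(h);
    I_cv(r)    = mean(h + a*x);    % mu = E[x] = 0
end
fprintf('var plain: %.2e, var CV: %.2e\n', var(I_plain), var(I_cv));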

Problem 1 Monte Carlo (12 credits)

A rod in a truss system experiences significant loading on any given day with probability q = 0.1 due to wind. If the truss is under significant loading due to wind, the maximum stress it experiences on this day can be modeled as a Gaussian (in MPa)

X ∼ N(µ = 40, σ^2 = 9)


Complete the Matlab function to return a numerical estimate of the expected number of days within a year (365 days) for which the stress exceeds 60 MPa.
function D = MonteCarlo()
% D: expected number of days for which stress exceeds 60 MPa (in a year)

N = 1e6;
mu = 40;
sigma = 3;

I = 0;   % running total of exceedance days over all simulated years

for n=1:N
    days_threshold_exceeded = 0;
    % number of days with significant wind loading ~ Binomial(365, 0.1)
    for d=1:binornd(365, 0.1)
        S = mu + sigma*randn();
        if S > 60
            days_threshold_exceeded = days_threshold_exceeded + 1;
        end
    end
    I = I + days_threshold_exceeded;
end

D = I / N;

end

Reminder: rand() returns a sample from U (0, 1), randn() returns a sample from N (0, 1), binornd(n,p) returns a
sample from Binomial(n, p).
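As a hedged analytic aside (not part of the required answer): E[D] = 365 · 0.1 · P(X > 60), which is so small that the estimator above will almost surely return zero for N = 10^6 simulated years:

% Analytic expectation of the number of exceedance days per year
D_exact = 365 * 0.1 * (1 - normcdf(60, 40, 3))   % ≈ 5e-10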

Problem 2 Model Selection (14 credits)

We consider a bar of length ℓ = 1 and cross-sectional area A_cs = 1 subjected to a force F.

The following two models which predict the displacement of the bar along the x-direction are proposed:

M1: u(x) = α√x,   M2: u(x) = βx

with the priors on the parameters α, β

p(α) = N(1, 0.01),   p(β) = N(1, 0.01)

Suppose we obtain a noisy measurement û of the displacement at x = 0.64 as follows:

û = u(x = 0.64) + ε,   ε ∼ N(0, 0.01)

Find the evidence ratio p(M1|û) / p(M2|û), if a-priori both models are considered equally plausible and û = 0.70. Provide a numerical value of the evidence ratio for û = 0.70.

Hint: there is a way to do this without solving an integral.
With p(M1) = p(M2) = 0.5,

R = p(M1|û) / p(M2|û) = [p(û|M1) p(M1)] / [p(û|M2) p(M2)] = p(û|M1) / p(û|M2)

The remaining terms involve a (tractable) integral:

p(û|M1) = ∫ p(û|α, M1) p(α) dα,   p(û|M2) = ∫ p(û|β, M2) p(β) dβ

Note however that we can sidestep this integration because Gaussians are closed under linear transformations and addition. With x = 0.64 we obtain for

Model 1: û = α√x + ε

E[û] = E[α√x + ε] = √0.64 · E[α] + E[ε] = 0.8
Var[û] = Var[α√x + ε] = x · Var[α] + Var[ε] = 0.64 · 0.01 + 0.01 = 0.0164

Model 2: û = βx + ε

E[û] = E[βx + ε] = x E[β] + E[ε] = 0.64
Var[û] = Var[βx + ε] = x^2 Var[β] + Var[ε] = 0.64^2 · 0.01 + 0.01 = 0.014096

From which follows p(û|M1) = N(0.80, 0.0164) and p(û|M2) = N(0.64, 0.014096). For û = 0.70 this yields

R ≈ 2.2966 / 2.9574 ≈ 0.7766
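The final ratio is reproduced in one line (a sketch; note that Matlab's normpdf takes the standard deviation, not the variance):

% Hedged check of the evidence ratio
R = normpdf(0.70, 0.80, sqrt(0.0164)) / normpdf(0.70, 0.64, sqrt(0.014096))   % ≈ 0.7766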

Problem 3 Probability Density Function (20 credits)

We consider the following PDF

f_X(x) = { 0                          for x < a
         { 2(x−a) / ((b−a)(c−a))     for a ≤ x ≤ c
         { 2(b−x) / ((b−a)(b−c))     for c < x ≤ b
         { 0                          for x > b

with parameters a = 0, b = 2, c = 1.

a) Find the variance Var[X]. Provide a numerical value.

For the given parameters, f_X(x) defines a simple triangular distribution with mean E[X] = 1 (by symmetry):

f_X(x) = { 0     for x < 0
         { x     for 0 ≤ x ≤ 1
         { 2−x   for 1 < x ≤ 2
         { 0     for x > 2

Making use of Var[X] = E[X^2] − E^2[X]:

E[X^2] = ∫_{−∞}^{+∞} x^2 f_X(x) dx = ∫_0^1 x^3 dx + ∫_1^2 x^2 (2−x) dx
       = [x^4/4]_0^1 + [(2/3)x^3 − x^4/4]_1^2
       = 1/4 + (16/3 − 4 − 2/3 + 1/4) = 7/6

And hence we find Var[X] = 7/6 − 1 = 1/6.
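A one-line numeric confirmation (a sketch; min(x, 2−x) reproduces the triangular PDF on [0, 2]):

% Hedged check: Var[X] = E[X^2] - E[X]^2 for the triangular PDF
Ex2 = integral(@(x) x.^2 .* min(x, 2-x), 0, 2);
VarX = Ex2 - 1^2   % = 1/6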


b) Complete the Matlab function sample() to generate and return a single sample from the conditional distribution f_{X|X≤1}(x|x ≤ 1). The only random number generator you are allowed to use is rand(), which returns a sample from the uniform distribution U(0, 1). Provide a short derivation / justification for your solution.

function x = sample()

u = rand();

% solution: inverse-CDF method, F_{X|X<=1}(x) = x^2 on [0,1]
x = sqrt(u);

end

Space for notes / derivations:

With the PDF symmetric around 1, it follows that p(X ≤ 1) = 0.5. Therefore the conditional PDF follows immediately as

f_{X|X≤1}(x) = { 2x   for 0 ≤ x ≤ 1
              { 0    else

With F_{X|X≤1}(x) = x^2 (for 0 ≤ x ≤ 1), it follows that u ∼ U(0, 1) will yield the desired distribution if mapped via x = F^{-1}(u) = √u, since the resulting samples have density d/dx F(x) = 2x (again, for 0 ≤ x ≤ 1).
Problem 4 Elastic Rod (10 credits)

A rod is made of a material with yield strength Y, modeled with a Gaussian (in MPa)

Y ∼ N(100, 5)

and is subjected to a stress only known stochastically as (in MPa)

X ∼ N(70, 10)

Obtain the probability that the yield strength will be exceeded.

Note: Express your answer w.r.t. the CDF Φ(·) of the standard Normal distribution N(0, 1).

We are interested in the probability that X exceeds Y, i.e.

p(X > Y) = p(X − Y > 0) = 1 − p(X − Y ≤ 0)

Let D := X − Y with D ∼ N(70 − 100, 10 + 5) = N(−30, 15); then with Z ∼ N(0, 1)

p(X > Y) = 1 − p(D ≤ 0) = 1 − p(Z ≤ 30/√15) = 1 − Φ(30/√15) ≈ 1 − Φ(7.746)

Or equivalently (starting from Y − X > 0, or by the symmetry of Φ(·))

p(X > Y) = Φ(−30/√15) ≈ Φ(−7.746)
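For reference, the numerical value (a sketch; normcdf is Matlab's Φ):

% Hedged check: P(X > Y) = Phi(-30/sqrt(15))
p_exceed = normcdf(-30/sqrt(15))   % ≈ 4.8e-15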
Problem 5 Bayesian Inference (20 credits)

The measured displacements Y1, Y2 of a material at two different locations are assumed to relate to an externally applied force X as follows:

Y1 = 2X + W1
Y2 = X + W2

with W1 ∼ N(0, 2) and W2 ∼ N(0, 4) assumed independent, and the a-priori belief X ∼ N(0, 1).

We have observed the data

Y1 = 5,   Y2 = 4

a) Derive the maximum likelihood estimate of X given Y1, Y2

The likelihood factorizes:

L = p(y1, y2|x) = N(y1|2x, 2) · N(y2|x, 4)

Working with the log-likelihood

log L = −(1/(2·2))(y1 − 2x)^2 − (1/(2·4))(y2 − x)^2 + const.

we take the derivative and set it to zero:

d/dx log L = (y1 − 2x) + (1/4)(y2 − x) = 6 − (9/4)x = 0

From this follows the maximum likelihood estimate x* = 24/9 ≈ 2.67.
b) Complete the MCMC function below such that it returns N = 5000 samples from the a-posteriori distribution of X using a Random Walk Metropolis-Hastings algorithm.

function X = MCMC()

% init
N = 5000;
X = zeros(1,N);
y1 = 5; y2 = 4;
x = 0;

% note that normpdf takes sigma, not sigma^2 as input
posterior = @(x) normpdf(y1, 2*x, sqrt(2)) * normpdf(y2, x, 2) * normpdf(x, 0, 1);

for n=1:N

    % propose
    x_proposed = x + randn();

    % Metropolis accept-reject
    if rand() < posterior(x_proposed) / posterior(x)
        x = x_proposed;
    end

    X(n) = x;

end

end

Space for notes

Identifying the posterior follows straightforwardly from Bayes' theorem:

p(x|y1, y2) ∝ p(y1, y2|x) · p(x) = N(y1|2x, 2) · N(y2|x, 4) · N(x|0, 1)

Alternatively, p(y1, y2|x) can of course also be implemented using the joint distribution. This is identical to the exercise; merely the posterior needs to be adapted to the specific case.
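Since everything here is Gaussian, the posterior is also available in closed form and can serve as a check on the MCMC output (a hedged aside, not part of the required answer): collecting the terms quadratic and linear in x gives precision 2^2/2 + 1^2/4 + 1 = 13/4 and mean (y1 + y2/4) / (13/4) = 24/13 ≈ 1.85, i.e. p(x|y1, y2) = N(24/13, 4/13).

% Hedged check: MCMC statistics vs. the analytic posterior N(24/13, 4/13)
X = MCMC();
fprintf('mean %.3f (exact %.3f), var %.3f (exact %.3f)\n', ...
        mean(X), 24/13, var(X), 4/13);   % agreement up to MC / burn-in error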

Reminder: rand() returns a sample from U (0, 1), randn() returns a sample from N (0, 1)

Problem 6 Gaussian Process Regression (14 credits)

We wish to solve a regression problem using the Gaussian Process prior f ∼ GP(0, C), where we assume a linear kernel

C(x_i, x_j) = x_i x_j

The observed values {y_n}_{n=1}^{N} are corrupted by additive Gaussian noise

y_n = f(x_n) + ε_n,   ε_n ∼ N(0, 1)

Our dataset D consists of two pairs (N = 2)

(x1, y1) = (0, 1),   (x2, y2) = (1, 1)

What can you say about the value of the function f at x3 = 2 based on this data? Provide numerical values for the parameters of the PDF.

Let f3 = f(x3). The vector z = [y1, y2, f3] is jointly Gaussian with zero mean and covariance matrix following from C_ij = x_i x_j and, for y1 and y2, the additive Gaussian noise (+δ_ij):

C = [ 1 0 0
      0 2 2
      0 2 4 ]

which implies that we can disregard the observed value y1 due to independence. We can instead consider

[y2; f3] ∼ N([0; 0], [2 2; 2 4])

From the lecture / exercise it is known that the conditional distribution follows as

p(f3|D) = N(µ, σ^2)

with

µ = 2 · 2^{-1} · (1 − 0) = 1.0,   σ^2 = 4 − 2 · 2^{-1} · 2 = 2
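The same numbers fall out of the general GP conditioning formulas (a minimal sketch using the full 2×2 data covariance rather than the independence shortcut above):

% Hedged check: GP posterior for f3 at x3 = 2 under the linear kernel
x = [0; 1]; y = [1; 1]; x3 = 2;
K  = x*x' + eye(2);              % kernel matrix of the observations plus unit noise
k3 = x*x3;                       % covariance between f3 and the observations
mu3 = k3' * (K \ y)              % = 1
s23 = x3*x3 - k3' * (K \ k3)     % = 2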


