CS1 ASSIGNMENTS & SOLUTIONS ACTUATORS EDUCATIONAL INSTITUTE

CS1
ACTUARIAL STATISTICS
ASSIGNMENTS & SOLUTIONS
FOR 2023

CA PRAVEEN PATWARI 1 JAI SHREE RAM



INDEX

CHAPTERS
BASICS ASSIGNMENT
ASSIGNMENT 1
ASSIGNMENT 2
ASSIGNMENT 3
ASSIGNMENT 4
ASSIGNMENT 5
SOLUTIONS
BASICS ASSIGNMENT
ASSIGNMENT 1
ASSIGNMENT 2
ASSIGNMENT 3
ASSIGNMENT 4
ASSIGNMENT 5


BASICS ASSIGNMENT

CHAPTER

SUMMARISING DATA
QUESTION 1.

The sizes of ten car claims received by an insurance company were:

£1,500 £1,820 £840 £260 £2,100

£790 £530 £1,360 £1,780 £1,650

Find the mean car insurance claim amount.
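As a quick check on the arithmetic, the sample mean is the sum of the claims divided by the number of claims. The sketch below is illustrative only; the variable names are not part of the assignment.

```python
# Claim amounts in pounds, copied from the question.
claims = [1500, 1820, 840, 260, 2100, 790, 530, 1360, 1780, 1650]

# Sample mean: total claim amount divided by the number of claims.
mean_claim = sum(claims) / len(claims)
print(f"Mean claim: £{mean_claim:,.2f}")
```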

QUESTION 2.

The frequency table shows the number of claims made on 100 car insurance policies in the last year. Calculate the mean number of claims per policy:

Number of claims per policy   0    1    2    3

Frequency                     74   19   5    2

QUESTION 3.

(i) The mean age of death of 12 assurance policyholders was 72. What was the total age of the 12 policyholders?

(ii) The mean of the following list of investment returns is 4.2%, 5%, 4.75%, 3.6%, x%, 3.25%

Find the value of x.


(iii) A small department employs ten actuaries; their mean salary is £48,000. When an eleventh actuary joins the department the mean salary of all the actuaries drops to £45,800. Find the salary of the new employee.
(iv) The mean sum assured on 12 term assurances was £50,000 whereas the mean sum assured on 8 endowment assurances was £30,000. Calculate the mean sum assured on all 20 policies.

QUESTION 4.
A list of the age last birthday at death of 30 male policyholders who held life assurance policies with a particular company is given:
57 68 75 66 72 86 80 81 70 78 76 72 88 84 69
77 83 90 48 63 74 81 94 51 73 96 81 66 77 101
Find the median age last birthday at death of these policyholders.
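With an even number of observations the median is the average of the two middle values once the data are sorted. A minimal sketch using Python's standard library (the variable names are illustrative):

```python
from statistics import median

# Ages last birthday at death, copied from the question.
ages = [57, 68, 75, 66, 72, 86, 80, 81, 70, 78, 76, 72, 88, 84, 69,
        77, 83, 90, 48, 63, 74, 81, 94, 51, 73, 96, 81, 66, 77, 101]

# For 30 values, the median averages the 15th and 16th sorted values.
ordered = sorted(ages)
median_age = median(ages)
print(ordered[14], ordered[15], median_age)
```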

QUESTION 5.
The frequency table shows the number of claims (of a particular type) made each week in the last year.

Number of claims per week 0 1 2 3 4 5

Frequency 5 7 15 12 9 4

(i) Calculate the median number of claims per week.


(ii) In the first two weeks of the following year, 3 and 5 claims were made. Add these values to the frequency distribution and find the new median of these 54 results.

QUESTION 6.
Here are the salaries of 7 individuals in a company (in £000’s):
18 21 25 25 25 25 30
Find the second order sample moment of this data set.

QUESTION 7.
Here are the salaries of 7 individuals in a company (in £000’s):
18 21 24 25 25 25 30

Find the third order central sample moment of this data set.


QUESTION 8.

Find the third central moment of this data set:

0 2 3 3 4 4 4 5

Hence, comment on the skewness.

QUESTION 9.

What would the position of the mean, mode and median be for a negatively skew distribution?

QUESTION 10.

Given that:

$N = 100, \quad \sum (x_i - \bar{x})^2 = 856{,}934.91, \quad \sum (x_i - \bar{x})^3 = 11{,}949{,}848.3946$

Calculate the:

(a) skewness (b) coefficient of skewness.

QUESTION 11.

This frequency table shows the number of claims per policy made to a car insurance company in the last
year. Calculate the mean and the standard deviation of the number of claims per policy:

Number of claims per policy 0 1 2 3

Frequency 74 19 5 2
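For grouped data the mean is $\sum f x / \sum f$ and the variance follows from $\sum f x^2$. The sketch below computes both the divisor-$n$ and the divisor-$(n-1)$ versions, since the question does not say which standard deviation is intended; that choice is an assumption left to the reader.

```python
from math import sqrt

# Frequency table copied from the question.
claims_per_policy = [0, 1, 2, 3]
frequency = [74, 19, 5, 2]

n = sum(frequency)
mean = sum(x * f for x, f in zip(claims_per_policy, frequency)) / n
sum_sq = sum(f * x * x for x, f in zip(claims_per_policy, frequency))

# Divisor n (population-style) and divisor n - 1 (sample) variances.
var_pop = sum_sq / n - mean ** 2
var_sample = (sum_sq - n * mean ** 2) / (n - 1)

print(mean, sqrt(var_pop), sqrt(var_sample))
```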


QUESTION 12.

(i) A group of 12 actuaries are weighed. Their weights have a mean of 78 kg and a standard deviation of 4 kg. Find the sum of squares (i.e. $\sum x^2$) of their weights.

(ii) The temperature over the previous 6 days had a mean of 19 °C and a standard deviation of 5 °C. Today's temperature was 16 °C. Calculate the mean and standard deviation of the temperature over all 7 days.

(iii) The ages at which a group of 10 male policyholders died had a mean of 72 years and a standard deviation of 7 years. The ages at which a group of 8 female policyholders died had a mean of 78 years and a standard deviation of 9 years.

Calculate the mean and standard deviation of all 18 policyholders.

QUESTION 13.

The number of yawns made by 5 students during a tutorial were:

3 8 0 2 4

Find the sample variance of the number of yawns.


CHAPTER

BASIC PROBABILITY
QUESTION 14.

A pile of 15 scripts contains two CT3’s, three CT4’s, four CA1’s and six ST5’s. A marker picks a script from
the pile at random.

What is the probability that the marker picks a CT Series script?

QUESTION 15.

In a CT3 tutorial there are 11 students, of which 6 are female. Three of the women and 2 of the men are also taking CT4. What is the probability that a student picked at random:

(i) is a male who is also taking CT4

(ii) is not taking CT4?

QUESTION 16.

The probability that a car claim to a certain company is in excess of £1,000 is 0.6. What is the probability
that a claim is not in excess of £1,000?

QUESTION 17.

In a portfolio of 50 car insurance policyholders, 6 have “4 years no claims bonus”, 15 have “3 years no claims bonus”, 18 have “2 years no claims bonus”, 7 have just “one year no claims bonus” and the rest have none. A policyholder is picked at random. What is the probability that they have:

(i) 3 or 4 years no claims bonus

(ii) 0 or 1 years no claims bonus

(iii) neither 2 nor 3 years no claims bonus?


QUESTION 18.

On a Friday night the probability that a driver has been drinking is 0.2. If a driver has been drinking the
probability that they have an accident is 0.05; otherwise it is 0.0001.

(i) Calculate the probability that a driver chosen at random on a Friday night has an accident.

(ii) At an accident on a Friday night the police carry out a breath test. What is the probability that the driver
has been drinking?
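Part (i) is an application of the law of total probability and part (ii) of Bayes' theorem. A minimal sketch of the two calculations, with illustrative variable names:

```python
# Probabilities given in the question.
p_drink = 0.2
p_acc_given_drink = 0.05
p_acc_given_sober = 0.0001

# (i) Law of total probability over the two driver states.
p_acc = p_drink * p_acc_given_drink + (1 - p_drink) * p_acc_given_sober

# (ii) Bayes' theorem: P(drinking | accident).
p_drink_given_acc = p_drink * p_acc_given_drink / p_acc

print(round(p_acc, 5), round(p_drink_given_acc, 4))
```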

QUESTION 19.

A blood test for a particular type of cancer is 95% accurate for a patient with the cancer and 98% accurate
for a healthy patient. If only 6% of those actually tested have the cancer, calculate the probability that:

(i) a patient tests positive

(ii) the wrong result is given

(iii) a patient who gets a positive result actually has the cancer.

QUESTION 20.

The probability that a car accident is due to faulty brakes is 0.02, the probability that a car accident is correctly attributed to faulty brakes is 0.95, and the probability that a car accident is incorrectly attributed to faulty brakes is 0.01.

Calculate the probability that a car accident, which is attributed to faulty brakes, was due to faulty brakes.

QUESTION 21.

An insurance company insured 6000 scooter-drivers, 3000 car-drivers and 9000 truck-drivers. The probability of an accident involving a scooter, a car and a truck is 0.02, 0.06 and 0.30 respectively. One of the insured persons meets with an accident. Find the probability that he is a car-driver.

QUESTION 22.

One card from a pack of 52 cards is lost. Two cards are drawn from this pack and found to be both diamonds. Find the probability that the lost card is a diamond.


CHAPTER

RANDOM VARIABLE
QUESTION 23.

Write down the probability function for the number of heads obtained when flipping two coins.

QUESTION 24.

The probability function of a random variable X is given by:

$P(X = x) = cx^2, \quad x = 1, 2, 3 \text{ or } 4$

Find the value of c.

QUESTION 25.

The random variable W has a probability function:

W 2 4 5

P(W=w) 0.2 0.5 0.3

Give the cumulative distribution function for W.

QUESTION 26.

The cumulative distribution function of a random variable V is given by:

$$F_V(v) = \begin{cases} 0 & v < 1 \\ 0.216 & 1 \le v < 2 \\ 0.648 & 2 \le v < 3 \\ 0.936 & 3 \le v < 4 \\ 1 & 4 \le v \end{cases}$$


Find:

(i) P(V = 2) (ii) P(V >1) (iii) P(V <3)

QUESTION 27.

A random variable x has the following probability distribution:

X 4 6 7 10

P(X=x) 0.2 0.4 0.3 0.1

Find the expectation and standard deviation of x.
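For a discrete distribution, $E(X) = \sum x\,P(X = x)$ and $\text{var}(X) = E(X^2) - [E(X)]^2$. A short sketch of that calculation for the table above (names are illustrative):

```python
from math import sqrt

# Probability distribution copied from the question.
values = [4, 6, 7, 10]
probs = [0.2, 0.4, 0.3, 0.1]

mean = sum(x * p for x, p in zip(values, probs))
second_moment = sum(x * x * p for x, p in zip(values, probs))
variance = second_moment - mean ** 2
std_dev = sqrt(variance)

print(round(mean, 4), round(variance, 4), round(std_dev, 4))
```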

QUESTION 28.

The random variable W has a probability function:

W 2 4 5

P(W=w) 0.2 0.5 0.3

Calculate:

(i) $E(W^2)$ (ii) $E(5W - 2)$ (iii) $E(1/W)$

QUESTION 29.

A random variable x has the following probability function:

Value of x   0   1   2    3    4    5       6        7

P(x)         0   k   2k   2k   3k   $k^2$   $2k^2$   $7k^2 + k$

Find the value of k and then evaluate $P(x < 6)$, $P(x \ge 6)$ and $P(0 < x < 5)$.


QUESTION 30.

The probability density function of a continuous random variable x is given by

f(x) = kx(2 – x), 0<x<2

= 0, elsewhere

Calculate the value of the constant k, E(x) and V(x)

QUESTION 31.

The probability density function of a continuous random variable x is given by

f(x) = 2x(1 – x), 0<x<1

= 0, elsewhere

Find P (x < 0.2), P( 0.2< x < 0.3) and P( x > 0.2)


CHAPTER

PROBABILITY DISTRIBUTION
QUESTION 32.

Eight coins are tossed. Find the probability of getting (i) two heads (ii) no head and (iii) at least two heads.
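Assuming fair coins, the number of heads is Bin(8, 0.5), so each part is a direct use of the binomial probability function. A minimal sketch (the helper `binom_pmf` is a hypothetical name, not part of the assignment):

```python
from math import comb

def binom_pmf(k: int, n: int, p: float) -> float:
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p ** k * (1 - p) ** (n - k)

n, p = 8, 0.5  # eight tosses; fair coins are an assumption

p_two = binom_pmf(2, n, p)
p_none = binom_pmf(0, n, p)
# "At least two" is the complement of "zero or one".
p_at_least_two = 1 - p_none - binom_pmf(1, n, p)

print(p_two, p_none, p_at_least_two)
```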

QUESTION 33.

For a binomial distribution, the mean is 3 and variance is 2. Find the values of n and p. Hence find the probability that X is 5.

QUESTION 34.

A local electrical appliances shop has found from experience that the demand for tube lights is distributed
as Poisson with mean of 4 tube lights per week. If the shop keeps 6 tubes during a particular week, what is
the probability that the demand will exceed supply during that week?
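Demand exceeds supply when more than 6 tube lights are requested, so the required probability is $1 - P(X \le 6)$ for a Poisson variable with mean 4. A sketch of the calculation (the helper `poisson_pmf` is an illustrative name):

```python
from math import exp, factorial

def poisson_pmf(k: int, mu: float) -> float:
    """P(X = k) for X ~ Poisson(mu)."""
    return exp(-mu) * mu ** k / factorial(k)

mu = 4      # mean weekly demand
stock = 6   # tube lights held during the week

# P(demand > stock) = 1 - P(demand <= stock)
p_exceed = 1 - sum(poisson_pmf(k, mu) for k in range(stock + 1))
print(round(p_exceed, 4))
```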

QUESTION 35.

If 5% of the electric bulbs manufactured by a company are defective, use Poisson distribution to find the
probability that in a sample of 100 bulbs (i) none is defective, (ii) 5 bulbs will be defective.

QUESTION 36.

A sample of 100 dry battery cells, tested to find the length of life, has mean 12 hours and standard deviation 3 hours. Assuming the data are normally distributed, what percentage of battery cells are expected to
have life (i) more than 15 hours, (ii) less than 6 hours and (iii) between 10 and 14 hours?
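Each part reduces to evaluating the normal distribution function at standardised values. The sketch below uses `statistics.NormalDist` from the Python standard library rather than printed tables, so the last decimal place may differ slightly from a table-based answer.

```python
from statistics import NormalDist

# Battery life modelled as N(12, 3^2), as stated in the question.
life = NormalDist(mu=12, sigma=3)

p_more_than_15 = 1 - life.cdf(15)
p_less_than_6 = life.cdf(6)
p_between_10_and_14 = life.cdf(14) - life.cdf(10)

for label, prob in [(">15h", p_more_than_15),
                    ("<6h", p_less_than_6),
                    ("10-14h", p_between_10_and_14)]:
    print(f"{label}: {100 * prob:.2f}%")
```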

QUESTION 37.

A company has a portfolio of 50 high-risk car insurance policies. The number of claims per policy in a 3-month period has a Poisson distribution with mean 0.5. It is assumed that all of the policies in the portfolio are independent.


Calculate the probability that:

(a) there are a total of 30 claims in a 3-month period for the whole portfolio
(b) there is a wait of more than half a month before a claim is made across the whole portfolio.

QUESTION 38.
Claim amounts for a particular type of medical negligence are lognormally distributed with mean £15,000
and standard deviation £8,000. Calculate the probability that the next claim exceeds £20,000

QUESTION 39.
If the random variable X has a $\chi^2$ distribution with five degrees of freedom, calculate:

(a) P(X > 6.5)

(b) P(X < 118).

QUESTION 40.
A small voting district has 101 female voters and 95 male voters. A random sample of 10 voters is drawn.
What is the probability exactly 7 of the voters will be female?
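Sampling without replacement from a finite population gives a hypergeometric count. The sketch below assumes the sample of 10 is drawn without replacement; the helper name `hypergeom_pmf` is illustrative.

```python
from math import comb

females, males, sample_size = 101, 95, 10

def hypergeom_pmf(k: int) -> float:
    """P(exactly k females in the sample), sampling without replacement."""
    return (comb(females, k) * comb(males, sample_size - k)
            / comb(females + males, sample_size))

p_seven = hypergeom_pmf(7)
print(round(p_seven, 4))
```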

QUESTION 41.
An oil company conducts a geological study that indicates that an exploratory oil well should have a 20%
chance of striking oil. What is the probability that the first strike comes on the third well drilled?

QUESTION 42.
Calculate P(X < 8) if:
(i) X is the number of claims reported in a year by 20 policyholders. Each policyholder makes claims at the rate of 0.2 per year, independently of the other policyholders.
(ii) X is the number of claims examined up to and including the fourth claim that exceeds £20,000. The probability that any claim received exceeds £20,000 is 0.3, independently of any other claim.
(iii) X is the number of deaths amongst a group of 500 policyholders. Each policyholder has a 0.01 probability of dying in the coming year, independently of any other policyholder.
(iv) X is the number of phone calls made before an agent makes the first sale. The probability that any phone call leads to a sale is 0.01, independently of any other call.


ASSIGNMENT 1
CHAPTER

GENERATING FUNCTIONS
QUESTION 1.

Write down the value of $M_X(0)$.

QUESTION 2.

Derive the MGF of the random variable X with probability density function

$f(x) = \tfrac{1}{2}(1 + x), \quad -1 < x < 1$

QUESTION 3.

Calculate the mean and variance of a random variable, X, with MGF given by:

$M_X(t) = \left(1 - \frac{t}{5}\right)^{-1}, \quad t < 5$

QUESTION 4.

State the CGF of X where $X \sim \text{Gamma}(\alpha, \lambda)$. Hence prove that $E(X) = \alpha/\lambda$, $\text{var}(X) = \alpha/\lambda^2$ and $\text{skew}(X) = 2\alpha/\lambda^3$.

QUESTION 5.

If X follows the gamma distribution with parameters $\alpha = 2$ and $\lambda = 0.4$, calculate $P(X > 10)$ using direct integration.


QUESTION 6.

(i) Determine the moment generating function of the two-parameter exponential random variable X, defined by the probability density function:

$f(x) = \lambda e^{-\lambda(x - \alpha)}, \quad x > \alpha$, where $\alpha, \lambda > 0$

(ii) Hence, or otherwise, determine the mean and variance of the random variable X.

QUESTION 7.

Suppose that the moment generating function of X is $M_X(t)$.

(i) Derive an expression for the moment generating function of 2X + 3 in terms of $M_X(t)$.

Now suppose that X is normally distributed with mean $\mu$ and variance $\sigma^2$.

(ii) Derive the distribution of 2X+ 3.
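One way to approach both parts is through the general rule for the MGF of a linear transformation. The outline below is a sketch of that standard argument, not the book's own solution; it uses the usual normal MGF $M_X(t) = \exp(\mu t + \tfrac{1}{2}\sigma^2 t^2)$.

```latex
\begin{align*}
M_{2X+3}(t) &= E\!\left[e^{t(2X+3)}\right]
             = e^{3t}\,E\!\left[e^{(2t)X}\right]
             = e^{3t}\,M_X(2t) \\[4pt]
\text{If } X \sim N(\mu,\sigma^2):\quad
M_{2X+3}(t) &= e^{3t}\exp\!\left(2\mu t + \tfrac{1}{2}\sigma^2 (2t)^2\right)
             = \exp\!\left((2\mu+3)\,t + \tfrac{1}{2}\,(4\sigma^2)\,t^2\right)
\end{align*}
```

The final expression has the form of a normal MGF with mean $2\mu + 3$ and variance $4\sigma^2$, and the uniqueness property of MGFs then identifies the distribution.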

QUESTION 8.

The moment generating function, $M_Y(t)$, of a random variable, Y, is given by

$M_Y(t) = (1 - 4t)^{-2}, \quad t < 0.25$

Calculate:

(i) E(Y)

(ii) the standard deviation of Y

(iii) $E(Y^6)$.

QUESTION 9.

The random variable U has a geometric distribution with probability function

$P(U = u) = p q^{u-1}, \quad u = 1, 2, 3, \ldots$ where $p + q = 1$


(i) Derive the moment generating function of U.

(ii) Write down the CGF of U, and hence show that E(U)= 1/p .

QUESTION 10.

A random variable X has probability density function:

$f(x) = k e^{-2x}, \quad x > R$

where R and k are positive constants.

(i) (a) Derive a formula for the moment generating function of X.

(b) State the values of t for which the formula in part (i)(a) is valid.

(ii) Hence determine the value of the constant k in terms of R.

QUESTION 11.

(i) Derive, from first principles, the moment generating function of a $\text{Gamma}(\alpha, \lambda)$ random variable.

(ii) Show, using the moment generating function, that the mean and variance of a $\text{Gamma}(\alpha, \lambda)$ random variable are $\alpha/\lambda$ and $\alpha/\lambda^2$, respectively.

QUESTION 12.

The claim amount X in units of £1,000 for a certain type of industrial policy is modelled as a gamma variable with parameters $\alpha = 3$ and $\lambda = 1/4$.

(i) Use moment generating functions to show that $\tfrac{1}{2}X \sim \chi^2_6$.

(ii) Calculate the probability that a randomly chosen claim amount exceeds £20,000.


CHAPTER

JOINT DISTRIBUTION
QUESTION 13.

              M
         1      2      3      4

    1   2/35   4/35   6/35   8/35

N   2   1/35   2/35   3/35   4/35

    3   1/70   1/35   3/70   2/35

Use the table of probabilities given above to calculate:

(i) P(M= 3, N = 1 or 2)

(ii) P(N = 3)

(iii) P(M = 2|N = 3).

QUESTION 14.

The continuous random variables U and V have joint probability density function:

$f_{U,V}(u, v) = \dfrac{2u + v}{3{,}000}$, where $10 < u < 20$ and $-5 < v < 5$

Calculate P(10 < U < 15, V > 0).


QUESTION 15.
Determine the marginal probability density functions for U and V, where:

$f_{U,V}(u, v) = \dfrac{2u + v}{3{,}000}$, for $10 < u < 20$ and $-5 < v < 5$

QUESTION 16.
Let X and Y have joint density function:

$f(x, y) = \tfrac{1}{16}(x + 3y), \quad 0 < x < 2,\ 0 < y < 2$

Determine the conditional density function of X given Y = y.

QUESTION 17.
N1
Calculate the expected value of , where the joint distribution of M and N is:
M

              M
         1      2      3      4

    1   2/35   4/35   6/35   8/35

N   2   1/35   2/35   3/35   4/35

    3   1/70   1/35   3/70   2/35

i.e. $P(M = m, N = n) = \dfrac{m}{35 \times 2^{\,n-2}}$.

QUESTION 18.
U and V have joint density function:

$f_{U,V}(u, v) = \dfrac{2u + v}{3{,}000}$, where $10 < u < 20$ and $-5 < v < 5$

(i) Calculate E(U) and E(V):

(a) using $f_{U,V}(u, v)$

(b) using $f_U(u)$ and $f_V(v)$

(ii) Comment on your answers.

QUESTION 19.
Calculate the covariance of the random variables X and Y whose joint distribution is as follows:

              X
         0     1     2

    1   0.1   0.1   0

Y   2   0.1   0.1   0.2

    3   0.2   0.1   0.1

QUESTION 20.
Calculate the correlation coefficient of U and V, where:

$f_{U,V}(u, v) = \dfrac{2u + v}{3{,}000}$, where $10 < u < 20$ and $-5 < v < 5$

You are given that $E(U) = \dfrac{140}{9}$ and $E(V) = \dfrac{5}{18}$.

QUESTION 21.
If $X \sim \text{Poi}(\lambda)$ and $Y \sim \text{Poi}(\mu)$ are independent random variables, obtain the probability function of Z = X + Y.

QUESTION 22.
The random variables X, Y, and Z have means and variances $\mu_X = 4$, $\mu_Y = 5$, $\mu_Z = 6$, $\sigma_X^2 = 1$, $\sigma_Y^2 = 4$ and $\sigma_Z^2 = 3$.
The covariances are as follows:
$\text{cov}(X, Y) = 3 \qquad \text{cov}(X, Z) = 2 \qquad \text{cov}(Y, Z) = 1$

Calculate the mean and variance of W = X – 2Y+ 3Z .


QUESTION 23.

A company has three telephone lines coming into its switchboard. The first line rings on average 3.5 times
per half-hour, the second rings on average 3.9 times per half-hour, and the third line rings on average 2.1
times per half-hour. Assuming that the numbers of calls are independent random variables having Poisson
distributions, calculate the probability that in half an hour the switchboard will receive:

(i) at least 5 calls

(ii) exactly 7 calls.
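Because the sum of independent Poisson variables is itself Poisson with the rates added, the total number of calls in half an hour has mean 3.5 + 3.9 + 2.1 = 9.5. A sketch of the two probabilities (the helper `poisson_pmf` is an illustrative name):

```python
from math import exp, factorial

def poisson_pmf(k: int, mu: float) -> float:
    """P(X = k) for X ~ Poisson(mu)."""
    return exp(-mu) * mu ** k / factorial(k)

# Half-hourly rates for the three lines, from the question.
rates = [3.5, 3.9, 2.1]
total_rate = sum(rates)  # rate of the combined Poisson count

# (i) at least 5 calls = 1 - P(0, 1, 2, 3 or 4 calls)
p_at_least_5 = 1 - sum(poisson_pmf(k, total_rate) for k in range(5))
# (ii) exactly 7 calls
p_exactly_7 = poisson_pmf(7, total_rate)

print(round(p_at_least_5, 4), round(p_exactly_7, 4))
```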

QUESTION 24.

If the number of minutes it takes for a mechanic to check a tyre is a random variable having an exponential
distribution with mean 5, obtain the probability that the mechanic will take:

(i) more than eight minutes to check two tyres

(ii) at least fifteen minutes to check three tyres.

QUESTION 25.

Let X and Y have joint density function given by:

$f(x, y) = c(x + 3y), \quad 0 < x < 2,\ 0 < y < 2$

(i) Calculate the value of c .

(ii) Hence, calculate P(X < 1, Y > 0.5).

QUESTION 26.

The continuous random variables X, Y have the bivariate PDF:

$f(x, y) = 2, \quad x + y < 1,\ x > 0,\ y > 0$

(i) Derive the marginal PDF of Y.

(ii) Derive the conditional PDF of X given Y = y using the result from part (i)


QUESTION 27.

The continuous random variables X and Y have joint PDF:

$f(x, y) = \tfrac{1}{6}(x^2 + xy), \quad 0 < y < x < 2$

(i) Determine the PDF of the conditional distribution X|Y = y.

(ii) Calculate the conditional probability $P(1 < X < 1.5 \mid Y = 1)$.

QUESTION 28.

Let X and Y have joint density function:

$f_{X,Y}(x, y) = \tfrac{4}{5}(3x^2 + xy), \quad 0 < x < 1,\ 0 < y < 1$

Determine:

(i) the marginal density function of X

(ii) the conditional density function of Y given X = x

(iii) the covariance of X and Y.

QUESTION 29.

Calculate the correlation coefficient of X and Y, where X and Y have the joint distribution:

              X
         0     1     2

    1   0.1   0.1   0

Y   2   0.1   0.1   0.2

    3   0.2   0.1   0.1


QUESTION 30.

Claim sizes on a home insurance policy are normally distributed about a mean of £800 and with a standard
deviation of £100. Claims sizes on a car insurance policy are normally distributed about a mean of £1,200
and with a standard deviation of £300. All claim sizes are assumed to be independent.

To date, there have already been home claims amounting to £800, but no car claims.

Calculate the probability that after the next 4 home claims and 3 car claims the total size of car claims exceeds the total size of the home claims.

QUESTION 31.

Let X be a random variable with mean 3 and standard deviation 2, and let Y be a random variable with mean 4 and standard deviation 1. X and Y have a correlation coefficient of –0.3. Let Z = X + Y.

Calculate:

(i) cov(X, Z)

(ii) var(Z).
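Both parts follow from the bilinearity of covariance: cov(X, X + Y) = var(X) + cov(X, Y) and var(X + Y) = var(X) + var(Y) + 2 cov(X, Y), with cov(X, Y) recovered from the correlation coefficient. A short numerical sketch with illustrative names:

```python
# Values stated in the question.
sd_x, sd_y, rho = 2, 1, -0.3

# Covariance from the correlation coefficient.
cov_xy = rho * sd_x * sd_y

# (i) cov(X, Z) with Z = X + Y
cov_xz = sd_x ** 2 + cov_xy
# (ii) var(Z)
var_z = sd_x ** 2 + sd_y ** 2 + 2 * cov_xy

print(round(cov_xz, 4), round(var_z, 4))
```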

QUESTION 32.

X has a Poisson distribution with mean 5 and Y has a Poisson distribution with mean 10. If cov(X, Y) = –12,
calculate the variance of Z where Z = X – 2Y + 3.

QUESTION 33.

For a certain company, claim sizes on car policies are normally distributed about a mean of £1,800 and with
standard deviation £300, whereas claim sizes on home policies are normally distributed about a mean of
£1,200 and with standard deviation £500. Claim sizes are assumed to be independent.

Calculate the probability that a car claim is at least twice the size of a home claim.


CHAPTER

CONDITIONAL EXPECTATION
QUESTION 34.

Two random variables X and Y have the following discrete joint distribution

              Y
         10    20    30

X   1   0.2   0.2   0.1

    2   0.2   0.3   0

Calculate E(Y | X = 1)
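A conditional expectation for a discrete joint distribution is found by restricting to the relevant row and dividing by the marginal probability of the conditioning event. The sketch below assumes the table's rows are the values of X and its columns the values of Y; the function name `cond_exp_y` is hypothetical.

```python
# Joint probabilities keyed by (x, y); rows are X, columns are Y (assumed layout).
joint = {
    (1, 10): 0.2, (1, 20): 0.2, (1, 30): 0.1,
    (2, 10): 0.2, (2, 20): 0.3, (2, 30): 0.0,
}

def cond_exp_y(x: int) -> float:
    """E(Y | X = x): sum of y * P(X = x, Y = y), divided by P(X = x)."""
    p_x = sum(p for (xi, _), p in joint.items() if xi == x)
    return sum(y * p for (xi, y), p in joint.items() if xi == x) / p_x

print(round(cond_exp_y(1), 4))
```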

QUESTION 35.

Suppose X and Y have joint density function given by:

$f(x, y) = \tfrac{3}{5}x(x + y), \quad 0 < x < 1,\ 0 < y < 2$

Determine the conditional expectation E[Y | X = x].

QUESTION 36.

Let X and Y have joint density function given by:

$f_{X,Y}(x, y) = \tfrac{1}{6}(x^2 + xy), \quad 0 < y < x < 2$

Determine the conditional expectation E[Y | X = x]


QUESTION 37.

(i) Calculate E(Y) from first principles given that the joint density function of X and Y is:

$f(x, y) = \tfrac{3}{5}x(x + y), \quad 0 < x < 1,\ 0 < y < 2$

(ii) Given that $E(Y \mid X = x) = \dfrac{3x + 4}{3(x + 1)}$, calculate $E[E(Y \mid X)]$.

(iii) Hence, confirm that $E(Y) = E[E(Y \mid X)]$ for this distribution.

QUESTION 38.

Evaluate $\text{var}(Y \mid X = 1)$ given the joint distribution:

              Y
         10    20    30

X   1   0.2   0.2   0.1

    2   0.2   0.3   0

QUESTION 39.

Suppose that X and Y are continuous random variables.

(i) Prove from first principles that:

$E(Y) = E[E(Y \mid X)]$

The random variable X follows the gamma distribution with parameters $\alpha = 3$ and $\lambda = 2$. Y is a related variable with conditional mean and variance of:

$E(Y \mid X = x) = 3x + 1 \qquad \text{var}(Y \mid X = x) = 2x^2 + 5$

(ii) Calculate the unconditional mean and standard deviation of Y.


QUESTION 40.
Suppose that X is a standard normal random variable, and the conditional distribution of a Poisson random variable Y, given the value of X = x, has expectation $x^2 + 1$.

Determine E(Y) and var(Y).

QUESTION 41.
Two discrete random variables, X and Y, have the following joint probability function:

              X
         1     2     3      4

Y   1   0.2   0     0.05   0.15

    2   0     0.3   0.1    0.2

(i) Determine var(X |Y =2).

Let U and V have joint density function:

$f_{U,V}(u, v) = 6(2uv - u^2), \quad 0 < u < v < 1$

(ii) Determine E(U | V= v).


ASSIGNMENT 2
CHAPTER

THE CENTRAL LIMIT THEOREM


QUESTION 1.
It is assumed that the number of claims arriving at an insurance company per working day has a mean of 40
and a standard deviation of 12. A survey is to be conducted over 50 working days. Calculate the probability
that the sample mean number of claims arriving per working day is less than 35.
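By the central limit theorem the sample mean of 50 daily counts is approximately normal with mean 40 and standard deviation $12/\sqrt{50}$. The sketch below evaluates the resulting probability with `statistics.NormalDist`; it assumes the daily counts are independent and identically distributed, which the question implies but does not state.

```python
from math import sqrt
from statistics import NormalDist

mu, sigma, n = 40, 12, 50

# Approximate sampling distribution of the mean (CLT).
sample_mean = NormalDist(mu=mu, sigma=sigma / sqrt(n))

p_below_35 = sample_mean.cdf(35)
print(round(p_below_35, 4))
```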

QUESTION 2.
The cost of repairing a vehicle following an accident has mean $6,200 and standard deviation $650. A study
was carried out into 65 vehicles that had been involved in accidents. Calculate the probability that the total
repair bill for the vehicles exceeded $400,000.

QUESTION 3.
Let X be a Poisson variable with parameter 20. Use the normal approximation to obtain a value for $P(X \le 15)$ and use tables to compare with the exact value.

QUESTION 4.
The average number of calls received per hour by an insurance company's switchboard is 5. Calculate the
probability that in a working day of eight hours, the number of telephone calls received will be:

(i) exactly 36

(ii) between 42 and 45 inclusive.

Assuming that the number of calls has a Poisson distribution, calculate the exact probabilities and also the
approximate probabilities using a normal approximation.
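Over eight hours the number of calls is Poisson with mean 5 × 8 = 40, which can be approximated by N(40, 40) with a continuity correction. The sketch below computes the exact and approximate values side by side; the helper and variable names are illustrative.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

mu = 5 * 8  # expected calls in an eight-hour day

def poisson_pmf(k: int) -> float:
    """Exact P(X = k) for X ~ Poisson(mu)."""
    return exp(-mu) * mu ** k / factorial(k)

approx = NormalDist(mu=mu, sigma=sqrt(mu))

# (i) exactly 36 calls; continuity correction uses the interval (35.5, 36.5)
exact_36 = poisson_pmf(36)
approx_36 = approx.cdf(36.5) - approx.cdf(35.5)

# (ii) between 42 and 45 inclusive; the interval becomes (41.5, 45.5)
exact_42_45 = sum(poisson_pmf(k) for k in range(42, 46))
approx_42_45 = approx.cdf(45.5) - approx.cdf(41.5)

print(round(exact_36, 4), round(approx_36, 4))
print(round(exact_42_45, 4), round(approx_42_45, 4))
```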

QUESTION 5.

Use a normal approximation to calculate an approximate value for the probability that an observation from
a Gamma(25, 50) random variable falls between 0.4 and 0.8.

QUESTION 6.

Calculate the approximate probability that the mean of a sample of 10 observations from a Beta(10,10)
random variable falls between 0.48 and 0.52.

QUESTION 7.

If X follows the gamma distribution with parameters $\alpha = 10$ and $\lambda = 0.2$, calculate the probability that X exceeds 80:

(a) using a normal distribution

(b) using a chi-squared distribution.

QUESTION 8.

The probability of any given policy in a portfolio of term assurance policies lapsing before it expires is considered to be 0.15. Consider a random sample of 100 such policies.

Calculate the approximate probability that more than 20 policies will lapse before they expire.

QUESTION 9.

A company issues questionnaires to clients to obtain feedback on the clarity of their brochure. It is thought
that 5% of clients do not find the brochure helpful.

Let N denote the number of clients who do not find the brochure helpful in a sample of 1,000 responses.

Calculate the approximate probability that 40 < N < 70.

QUESTION 10.

In a certain large population 45% of people have blood group A. A random sample of 300 individuals is
chosen from this population.

Calculate an approximate value for the probability that more than 115 of the sample have blood group A.

QUESTION 11.
Consider a random sample of size 16 taken from a normal distribution with mean $\mu = 25$ and variance $\sigma^2 = 4$. Let the sample mean be denoted $\bar{X}$.

State the distribution of $\bar{X}$ and hence calculate the probability that $\bar{X}$ assumes a value greater than 26.

QUESTION 12.
Suppose that the sums assured under policies of a certain type are modelled by a distribution with mean
£8,000 and standard deviation £3,000. Consider a group of 100 independent policies of this type.
Calculate the approximate probability that the total sum assured under this group of policies exceeds
£845,000.

QUESTION 13.
A computer routine selects one of the integers 1, 2, 3, 4, 5 at random and replicates the process a total of
100 times. Let S denote the sum of the 100 numbers selected.
Calculate the approximate probability that S assumes a value between 280 and 320 inclusive.


CHAPTER

SAMPLING AND STATISTICAL INFERENCE
QUESTION 14.
Calculate the mean and variance of the sample mean for samples of size 110 from a parent population which is Pareto with parameters $\alpha = 5$ and $\lambda = 3{,}000$.

QUESTION 15.


Calculate the probability that, for a random sample of 5 values taken from a $N(100, 25^2)$ population:
(i) $\bar{X}$ will be between 80 and 120

(ii) S will exceed 41.7.

QUESTION 16.
Determine:

(i) $P(F_{9,10} > 3.779)$

(ii) $P(F_{12,14} < 3.8)$

(iii) $P(F_{11,8} < 0.3392)$

(iv) the value of p such that $P(F_{14,6} < p) = 0.01$.

QUESTION 17.
For random samples of size 10 and 25 from two normal populations with equal variances, use the F distribution to determine the values of $\alpha$ and $\beta$ such that $P(S_1^2/S_2^2 > \alpha) = 0.05$ and $P(S_1^2/S_2^2 < \beta) = 0.05$, where subscript 1 represents the sample of size 10 and subscript 2 represents the sample of size 25.


QUESTION 18.

A random sample of 10 observations is drawn from the normal distribution with mean $\mu$ and standard deviation 15. Independently, a random sample of 25 observations is drawn from the normal distribution with mean $\mu$ and standard deviation 12. Let $\bar{X}$ and $\bar{Y}$ denote the respective sample means.

Evaluate $P(\bar{X} - \bar{Y} > 3)$.

QUESTION 19.

Evaluate C such that:

(a) P  F2,15  c   97.5%

(b) P  F8,5  c   5%

QUESTION 20.


Let $(X_1, X_2, \ldots, X_9)$ be a random sample from a $N(0, \sigma^2)$ distribution. Let $\bar{X}$ and $S^2$ denote the sample mean and variance respectively.

Calculate the approximate value of $P(\bar{X} > S)$ by referring to an appropriate statistical table.

QUESTION 21.

House prices in region X are normally distributed with a mean of £100,000 and a standard deviation of £10,000. House prices in region Y are normally distributed with a mean of £90,000 and a standard deviation of £5,000. A random sample of 10 houses is taken from region X and a random sample of 5 houses from region Y.

Calculate the probability that:

(i) the region X sample mean is greater than the region Y sample mean

(ii) the difference between the sample means is less than £5,000

(iii) the region X sample variance is less than the region Y sample variance

(iv) the region X sample standard deviation is more than four times greater than the region Y sample
standard deviation

QUESTION 22.

The time taken to process simple home insurance claims has a mean of 20 mins and a standard deviation of 5 mins.

Calculate, stating any assumptions, the probability that:

(i) the sample mean of the times to process 5 claims is less than 15 mins

(ii) the sample mean of the times to process 50 claims is greater than 22 mins

(iii) the sample variance of the time to process 5 claims is greater than 6.65 mins

(iv) the sample standard deviation of the time to process 30 claims is less than 7 mins

(v) both (i) and (iii) occur for the same sample of 5 claims.

QUESTION 23.

A statistician suggests that, since a t variable with k degrees of freedom is symmetrical with mean 0 and variance $\frac{k}{k-2}$ for $k > 2$, one can approximate the distribution using the normal variable $N\left(0, \frac{k}{k-2}\right)$.

(i) Use this to obtain an approximation for the upper 5% percentage points for a t variable with:

(a) 4 degrees of freedom, and

(b) 40 degrees of freedom.

(ii) Compare your answers with the exact values from the statistical tables and comment briefly on the results.
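The suggested approximation can be checked numerically. The sketch below is only an illustration: the function name `approx_t_upper_point` is hypothetical, and it simply scales the standard normal upper 5% point by the square root of k/(k-2). The exact t values still have to be read from the tables for the comparison.

```python
from math import sqrt
from statistics import NormalDist


def approx_t_upper_point(k, tail=0.05):
    """Upper `tail` point of N(0, k/(k-2)), used as a stand-in for t with k d.f."""
    if k <= 2:
        raise ValueError("the variance k/(k-2) only exists for k > 2")
    z = NormalDist().inv_cdf(1 - tail)  # standard normal upper point
    return z * sqrt(k / (k - 2))        # rescale to variance k/(k-2)


approx_4 = approx_t_upper_point(4)
approx_40 = approx_t_upper_point(40)
print(round(approx_4, 3), round(approx_40, 3))
```

The scaling factor shrinks towards 1 as k grows, which is why the approximation would be expected to behave better for 40 degrees of freedom than for 4.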


ASSIGNMENT 3
CHAPTER

POINT ESTIMATION
QUESTION 1.
A random sample from an $Exp(\lambda)$ distribution is as follows:
14.84, 0.19, 11.75, 1.18, 2.44, 0.53
Calculate the method of moments estimate for $\lambda$.
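As a rough numerical check, and assuming the exponential distribution is parameterised by its rate (so that the population mean is one over the parameter), the method of moments equates the sample mean to that population mean. The variable names below are illustrative only.

```python
sample = [14.84, 0.19, 11.75, 1.18, 2.44, 0.53]

# Equate the sample mean to the population mean 1/lambda and solve for lambda.
sample_mean = sum(sample) / len(sample)
lambda_hat = 1 / sample_mean

print(round(sample_mean, 3), round(lambda_hat, 4))
```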

QUESTION 2.
The random sample :
2.6, 1.9, 3.8, –4.1, –0.2, –0.7, 1.1, 6.9
is taken from a U  ,  distribution.

By equating the sample and population variances, calculate an estimate for  .

QUESTION 3.
A random sample from a Bin(n, p) distribution yields the following values:
4, 2, 7, 4, 1, 4, 5, 4
Calculate method of moments estimates of n and p.

QUESTION 4.
A random sample of size 10 from a Type 2 negative binomial distribution with parameters k and p is as
follows:
1, 1, 0, 1, 1, 1, 3, 2, 0, 5

Calculate method of moments estimates of k and p.


QUESTION 5.
A random sample of size n (ie $x_1, x_2, \ldots, x_n$) is taken from a $Poi(\lambda)$ distribution.

(i) Derive the maximum likelihood estimator of $\lambda$.

(ii) The sum of a sample of 10 observations from a $Poi(\lambda)$ distribution is 24. Calculate the maximum
likelihood estimate, $\hat{\lambda}$.

QUESTION 6.
Claims (in £000s) on a particular policy have a distribution with PDF given by:

$f(x) = 2cxe^{-cx^2}, \quad x > 0$

Seven of the last ten claims are given below:

1.05, 3.38, 3.26, 3.22, 2.71, 2.37, 1.85

The three remaining claims were known to be greater than £6,000. Calculate the maximum likelihood
estimate of c.

QUESTION 7.
The number of claims in a year on a pet insurance policy is distributed as follows:

No. of claims, n 0 1 2 3

P(N = n) 5 3  1  9

Information from the claims file for a particular year showed that there were 60 policies with 1 claim, 24
policies with 2 claims and 16 policies with 3 or more claims. There was no information about the number of
policies with no claims.
Calculate the maximum likelihood estimate of  .

QUESTION 8.
The number of claims, X, per year arising from a low-risk policy has a Poisson distribution with mean  .
The number of claims, Y, per year arising from a high-risk policy has a Poisson distribution with mean 2 .

A sample of 15 low-risk policies had a total of 48 claims in a year and a sample of 10 high-risk policies had a
total of 59 claims in a year. Determine the maximum likelihood estimate of  based on this information.


QUESTION 9.

The estimator $\hat{\sigma}^2$ is used to estimate the variance of a $N(\mu, \sigma^2)$ distribution based on a random sample of n
observations:

$\hat{\sigma}^2 = \frac{1}{n} \sum_{i=1}^{n} (X_i - \bar{X})^2$

(i) Determine the mean square error of $\hat{\sigma}^2$.

(ii) Determine whether $\hat{\sigma}^2$ is consistent.

QUESTION 10.

(i) Show that the CRLB for unbiased estimators of $\mu$, based on a random sample of n observations from a
$N(\mu, \sigma^2)$ distribution with known variance $\sigma^2$, is given by $\frac{\sigma^2}{n}$.

(ii) Show that the maximum likelihood estimator $\hat{\mu} = \bar{X}$ attains the CRLB.

QUESTION 11.

Waiting times in a post office queue have an $Exp(\lambda)$ distribution. Ten people had waiting times (in
minutes) of:

1.6 0.9 1.1 2.1 0.7 1.5 2.3 1.7 3.0 3.4

A further six people had waiting times of more than 4 minutes.

Calculate the maximum likelihood estimate of $\lambda$ based on these data.
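A minimal sketch of the censored-data calculation, assuming the exponential distribution is parameterised by its rate: each fully observed time contributes a density term and each censored time contributes a survival probability, so the log-likelihood is (number observed) x log(rate) minus rate x (total time at risk). Variable names are illustrative.

```python
observed = [1.6, 0.9, 1.1, 2.1, 0.7, 1.5, 2.3, 1.7, 3.0, 3.4]
n_censored = 6        # people known only to have waited longer than 4 minutes
censor_point = 4.0

# log L = d * log(lam) - lam * (sum of observed times + n_censored * censor_point)
total_time_at_risk = sum(observed) + n_censored * censor_point
lambda_hat = len(observed) / total_time_at_risk   # root of d(log L)/d(lam) = 0

print(round(total_time_at_risk, 1), round(lambda_hat, 4))
```

Ignoring the six censored values would overstate the rate, because the longest waits would be dropped from the denominator.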

QUESTION 12.

The number of claims arising in a year on a certain type of insurance policy has a Poisson distribution with
parameter $\lambda$.


The insurer's claim file shows that claims were made on 238 policies during the last year with the following
frequency distribution for the number of claims:

Number of claims    Frequency
1                   174
2                   50
3                   10
4                   4
5                   0

No information is available from the policy file, that is, only data concerning those policies on which claims
were made can be used in the estimation of the claim rate $\lambda$. (This is why there is no entry for the number
of claims being 0 in the table.)

(i) Show that the truncated probability function is given by:

$P(X = x) = \frac{\lambda^x e^{-\lambda}}{x!\,(1 - e^{-\lambda})}, \quad x = 1, 2, 3, \ldots$

(ii) Show that both the method of moments estimate and the MLE of $\lambda$ satisfy the equation

$\lambda = \bar{x}(1 - e^{-\lambda})$, where $\bar{x}$ is the mean number of claims for policies that have at least one claim.

(iii) Solve this equation, by any means, for the given data and calculate the resulting estimate of $\lambda$ to two
decimal places.

(iv) Hence, estimate the percentage of all policies with no claims during the year.
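One possible way to tackle parts (iii) and (iv) numerically is bisection on the equation from part (ii), written as (claim rate) = (truncated sample mean) x (1 - exp(-claim rate)). This is only a sketch: the bracketing interval [0.1, 2] is an assumption chosen to exclude the trivial root at zero, and the names are illustrative.

```python
from math import exp

freq = {1: 174, 2: 50, 3: 10, 4: 4}              # claims -> number of policies
n_policies = sum(freq.values())
xbar = sum(x * f for x, f in freq.items()) / n_policies


def g(lam):
    """Zero of g is the non-trivial solution of lam = xbar * (1 - exp(-lam))."""
    return xbar * (1 - exp(-lam)) - lam


lo, hi = 0.1, 2.0                                # g(lo) > 0 > g(hi)
for _ in range(60):
    mid = (lo + hi) / 2
    if g(mid) > 0:
        lo = mid
    else:
        hi = mid

lambda_hat = (lo + hi) / 2
p_zero = exp(-lambda_hat)                        # untruncated Poisson P(X = 0)
print(round(lambda_hat, 2), round(100 * p_zero, 1))
```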

QUESTION 13.

Suppose that unbiased estimators $X_1$ and $X_2$ of a parameter $\theta$ have been determined by two independent
methods, and suppose that $var(X_1) = \sigma^2$ and that $var(X_2) = \phi\sigma^2$, where $\phi > 0$.

Let Y be the combination given by $Y = \alpha X_1 + \beta X_2$, where $\alpha$ and $\beta$ denote non-negative weights.

(i) Derive the relationship satisfied by $\alpha$ and $\beta$ so that Y is also an unbiased estimator of $\theta$.

(ii) Determine the variance of Y in terms of $\phi$ and $\sigma^2$ if, additionally, the weights are chosen such that the
variance of Y is a minimum.

QUESTION 14.

A random sample $x_1, x_2, \ldots, x_n$ is taken from a population, which has the probability distribution function
F(x) and the density function f(x). The values in the sample are arranged in order and the minimum and
maximum values $x_{MIN}$ and $x_{MAX}$ are recorded.

(i) Show that the distribution function of $X_{MAX}$ is $[F(x)]^n$, and find a corresponding formula for the
distribution function of $X_{MIN}$.

The original distribution is now believed to be a $Pareto(\alpha, 1)$ distribution, ie the probability density
function is:

$f(x) = \frac{\alpha}{(1 + x)^{\alpha + 1}}, \quad x > 0$

(ii) Determine the distribution function of X, and hence determine the distribution function of $X_{MAX}$.

(iii) Show that the probability density function for the distribution of $X_{MIN}$ is:

$f_{X_{MIN}}(x) = \frac{n\alpha}{(1 + x)^{n\alpha + 1}}, \quad x > 0$

A random sample of 25 values gives a sample value for $x_{MIN}$ of 23.

(iv) Obtain a maximum likelihood estimate of $\alpha$ using the distribution of $X_{MIN}$.

The same random sample gives a value of $x_{MAX}$ of 770.

(v) Obtain an equation for the maximum likelihood estimator of $\alpha$ using $x_{MAX}$. Comment on the difficulty
of solving this equation.

(vi) Outline what further information you would need here in order to obtain a method of moments
estimate of $\alpha$.
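For part (iv), a single observation of the sample minimum has log-likelihood log(n a) - (n a + 1) log(1 + x), writing a for the Pareto shape parameter. Setting the derivative with respect to a to zero gives a = 1 / (n log(1 + x)). The sketch below just evaluates that closed form; the variable names are illustrative.

```python
from math import log

n = 25
x_min = 23

# d/da [ log(n*a) - (n*a + 1) * log(1 + x) ] = 1/a - n * log(1 + x) = 0
alpha_hat = 1 / (n * log(1 + x_min))

print(round(alpha_hat, 4))
```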

QUESTION 15.
A discrete random variable has a probability function given by:

x             2                          4                          5
P(X = x)      $\frac{1}{8} + 2\alpha$    $\frac{1}{2} - 3\alpha$    $\frac{3}{8} + \alpha$

(i) Give the range of possible values for the unknown parameter $\alpha$.

A random sample of 30 observations gave respective frequencies of 7, 6 and 17.

(ii) Calculate the method of moments estimate of $\alpha$.

(iii) Write down an expression for the likelihood of these data and hence show that the maximum likelihood
estimate $\hat{\alpha}$ satisfies the quadratic equation:

$180\hat{\alpha}^2 + \frac{111}{8}\hat{\alpha} - \frac{91}{32} = 0$

(iv) Hence determine the maximum likelihood estimate and explain why one root is rejected as a possible
estimate of $\alpha$.
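The quadratic in part (iii) can be solved directly. The sketch below assumes the three probabilities are 1/8 + 2a, 1/2 - 3a and 3/8 + a (the sign pattern that sums to one and reproduces the quadratic with coefficients 180, 111/8 and -91/32), so the admissible range for a runs from -1/16 to 1/6. Names are illustrative.

```python
from math import sqrt

# 180 a^2 + (111/8) a - 91/32 = 0
qa, qb, qc = 180.0, 111 / 8, -91 / 32
disc = qb * qb - 4 * qa * qc
roots = sorted([(-qb - sqrt(disc)) / (2 * qa), (-qb + sqrt(disc)) / (2 * qa)])

# Every probability must lie in [0, 1]: 1/8 + 2a >= 0 and 1/2 - 3a >= 0 bind.
lower, upper = -1 / 16, 1 / 6
admissible = [r for r in roots if lower <= r <= upper]

print([round(r, 4) for r in roots], [round(r, 4) for r in admissible])
```

The negative root falls below -1/16, where the first probability would be negative, which is the reason it cannot be an estimate.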

QUESTION 16.
A motor insurance portfolio produces claim incidence data for 100,000 policies over one year. The table
below shows the observed number of policyholders making 0, 1, 2, 3, 4, 5, and 6 or more claims in a year.

No. of claims    No. of policies
0                87,889
1                11,000
2                1,000
3                100
4                10
5                1
6 or more        -
Total            100,000

(i) (a) Estimate the parameter of the Poisson distribution to fit the above data using the method of
moments.

(b) Hence calculate the expected number of policies giving rise to the different numbers of claims
assuming the Poisson model.

(ii) Show that the estimate of the Poisson parameter calculated from the above data using the method of
moments is also the maximum likelihood estimate of this parameter.

(iii) (a) Estimate the two parameters of the Type 2 negative binomial distribution to fit the above data
using the method of moments.

(b) Hence calculate the expected number of policies giving rise to the different numbers of claims
assuming a negative binomial model.

You may use the relationship:

$P(X = x) = \frac{k + x - 1}{x} \, q \, P(X = x - 1)$

for the negative binomial distribution.

(iv) Explain briefly why you would expect a negative binomial distribution to fit the above data better than
a Poisson distribution.


CHAPTER

CONFIDENCE INTERVALS
QUESTION 17.

The average IQ of a random sample of 50 university students is found to be 132. Calculate a symmetrical
95% confidence interval for the average IQ of university students, assuming that IQs are normally
distributed. It is known from previous studies that the standard deviation of IQs among students is
approximately 20.
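With the standard deviation treated as known, the interval is the sample mean plus or minus a standard normal percentage point times sigma over root n. A minimal sketch, with the hypothetical helper name `normal_mean_ci`:

```python
from math import sqrt
from statistics import NormalDist


def normal_mean_ci(xbar, sigma, n, level=0.95):
    """Symmetrical CI for a normal mean when sigma is treated as known."""
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width


ci_95 = normal_mean_ci(132, 20, 50)
print(tuple(round(limit, 2) for limit in ci_95))
```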

QUESTION 18.

The average IQ of a random sample of 50 university students is found to be 132. Calculate a symmetrical 99%
prediction interval for the average IQ of university students, assuming that IQs are normally distributed. It is
known from previous studies that the standard deviation of IQs among students is approximately 20.

QUESTION 19.

Calculate:

(i) an equal-tailed 95% confidence interval and

(ii) a 95% confidence interval of the form (0, L)

for the standard deviation of the heights of the children in the population, based on the information given in
Question 20.

QUESTION 20.

The heights of 10-year-old children are normally distributed. The heights of a random sample of five
children (in cm) are: 124cm, 122cm, 130cm, 125cm and 132cm.

Calculate a 90% confidence interval for the predicted height of a 10-year-old child based on these data
values.


QUESTION 21.
We have obtained a value of 1 from the binomial distribution with parameters n = 20 and  .

Construct a 95% symmetrical confidence interval for  .

QUESTION 22.
In a one-year mortality investigation, 45 of the 250 ninety-year-olds present at the start of the investigation
died before the end of the year. Assuming that the number of deaths has a binomial distribution with
parameters n = 250 and q, calculate a symmetrical 90% confidence interval for the unknown mortality rate q.
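With 250 lives a normal approximation to the binomial is a common choice here. The sketch below assumes that approach, estimating the standard error from the observed proportion; the helper name `approx_proportion_ci` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist


def approx_proportion_ci(deaths, n, level=0.90):
    """Normal-approximation CI for a binomial proportion."""
    q_hat = deaths / n
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * sqrt(q_hat * (1 - q_hat) / n)
    return q_hat - half_width, q_hat + half_width


ci_90 = approx_proportion_ci(45, 250)
print(tuple(round(limit, 3) for limit in ci_90))
```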

QUESTION 23.
In a one-year investigation of claim frequencies for a particular category of motorists, the total number of
claims made under 5,000 policies was 800. Assuming that the number of claims made by individual motorists
has a $Poi(\lambda)$ distribution, calculate a symmetrical 90% confidence interval for the unknown average
claim frequency $\lambda$.

QUESTION 24.
A motor company runs tests to investigate the fuel consumption of cars using a newly developed fuel
additive. Sixteen cars of the same make and age are used, eight with the new additive and eight as controls. The
results, in miles per gallon over a test track under regulated conditions, are as follows:

Control 27.0 32.2 30.4 28.0 26.5 25.5 29.6 27.2

Additive 31.4 29.9 33.2 34.4 32.0 28.7 26.1 30.3

Calculate a 95% confidence interval for the increase in miles per gallon achieved by cars with the additive.
State clearly any assumptions required for this analysis.

QUESTION 25.
In a one-year investigation of claim frequencies for a particular category of motorists, there were 150
claims from the 500 policyholders aged under 25 and 650 claims from the 4,500 remaining policyholders.
Assuming that the numbers of claims made by the individual motorists in each category have independent
Poisson distributions, calculate a 99% confidence interval for the difference between the two Poisson
parameters.


QUESTION 26.

A survey was carried out to find out the number of hours that actuarial students spend watching television
per week. It was discovered that for a sample of 10 students, the following times were spent watching
television:

8, 4, 7, 5, 9, 7, 6, 9, 5, 7

(i) (a) Calculate a symmetrical 95% confidence interval for the mean time an actuarial student spends
watching television per week.

(b) Write down the assumptions needed to calculate the confidence interval in part (a).

(ii) Calculate a symmetrical 95% prediction interval for the time an actuarial student spends watching
television per week.

(iii) (a) Describe the limiting case of the formulae for the intervals in parts (i)(a) and (ii) as n tends to
infinity.

(b) Explain which of the two intervals calculated will be more sensitive to the assumptions in part
(i)(b).

QUESTION 27.

A researcher investigating attitudes to Sunday shopping reports that, in a sample of 8 interviewees, 7 were
in favour of more opportunities to shop on Sunday.

Calculate an exact 95% confidence interval for the underlying proportion in favour of this idea using the
binomial distribution.
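An exact interval of this kind can be found by solving two tail-probability equations in p. The sketch below assumes the usual equal-tailed construction (lower limit: P(X >= x) = 2.5%; upper limit: P(X <= x) = 2.5%) and solves each by bisection; all function names are hypothetical.

```python
from math import comb


def upper_tail(k, n, p):
    """P(X >= k) for X ~ Bin(n, p); this is increasing in p."""
    return sum(comb(n, j) * p**j * (1 - p) ** (n - j) for j in range(k, n + 1))


def solve_increasing(h, target, iters=60):
    """Bisection for h(p) = target on [0, 1], with h increasing in p."""
    lo, hi = 0.0, 1.0
    for _ in range(iters):
        mid = (lo + hi) / 2
        if h(mid) < target:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def exact_binomial_ci(x, n, level=0.95):
    alpha = 1 - level
    # Lower limit: P(X >= x | p) = alpha/2
    lower = 0.0 if x == 0 else solve_increasing(lambda p: upper_tail(x, n, p), alpha / 2)
    # Upper limit: P(X <= x | p) = alpha/2, ie P(X >= x + 1 | p) = 1 - alpha/2
    upper = 1.0 if x == n else solve_increasing(lambda p: upper_tail(x + 1, n, p), 1 - alpha / 2)
    return lower, upper


exact_ci = exact_binomial_ci(7, 8)
print(tuple(round(limit, 3) for limit in exact_ci))
```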

QUESTION 28.

Two inspectors carry out property valuations for an estate agency. Over a particular week they each go out
to similar properties. The table below shows their valuations (in £000s):

A 102 98 93 86 92 94 89 97

B 86 88 92 95 98 97 94 92 91


The dotplots for these two inspectors are as follows: [dotplots not reproduced]

(i) (a) Comment on the possible assumption of normality and equal variances for the two underlying
populations using the diagrams.

(b) Calculate a 95% confidence interval for this common variance using the equal variance assumption
from part (a).

(c) Calculate a 95% confidence interval for the mean difference between the valuations by A and B,
commenting briefly on the result.

The estate agency employing the inspectors decides to test their valuations by sending them each to
the same set of eight houses, independently and without knowledge that the other is going. The resulting
valuations (in £000s) follow:

Property    1     2     3     4     5     6     7     8
A           94    98    102   132   118   121   106   123
B           92    96    111   129   111   122   101   118

(ii) Calculate a 90% confidence interval for the mean difference between valuations by A and B, commenting
briefly on the result.


QUESTION 29.
The ordered remission times (in weeks) of 20 leukaemia patients are given in the table:

1 1 2 2 3

4 4 5 5 8

8 8 11 11 12

12 15 17 22 23

Suppose the remission times can be regarded as a random sample from an exponential distribution with
density:

$f(x; \theta) = \theta e^{-\theta x}, \quad x > 0$

(i) (a) Determine the maximum likelihood estimator $\hat{\theta}$ of $\theta$.

(b) Calculate the large-sample approximate variance of $\hat{\theta}$.

(c) Hence calculate an approximate 95% confidence interval for $\theta$.

(ii) (a) Calculate an exact 95% confidence interval for $\theta$ using the fact that $2n\theta\bar{X}$ has a $\chi^2_{2n}$ distribution.

(b) Comment briefly on how it compares with your interval in (i)(c).
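A sketch of the part (i) calculations, assuming the rate parameterisation of the exponential density: the MLE is one over the sample mean, the Cramér-Rao lower bound (rate squared over n) evaluated at the estimate serves as the large-sample variance, and a normal approximation gives the interval. The exact interval in part (ii) needs chi-square percentage points from the tables and is not attempted here.

```python
from math import sqrt
from statistics import NormalDist

times = [1, 1, 2, 2, 3, 4, 4, 5, 5, 8, 8, 8, 11, 11, 12, 12, 15, 17, 22, 23]
n = len(times)

theta_hat = n / sum(times)        # MLE: reciprocal of the sample mean
var_hat = theta_hat**2 / n        # CRLB theta^2 / n evaluated at theta_hat
z = NormalDist().inv_cdf(0.975)
approx_ci = (theta_hat - z * sqrt(var_hat), theta_hat + z * sqrt(var_hat))

print(round(theta_hat, 4), tuple(round(limit, 4) for limit in approx_ci))
```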

QUESTION 30.
Heights of males with classic congenital adrenal hyperplasia (CAH) are assumed to be normally distributed.
(i) Determine the minimum sample size to ensure that a 95% confidence interval for the mean height has
a maximum width of 10cm, if:
(a) a previous sample has a standard deviation of 8.4 cm
(b) the population standard deviation is 8.4 cm.
(ii) Determine the minimum sample size to ensure that a 95% prediction interval for the height of a male
with CAH has a maximum width of 38cm, if:
(a) a previous sample has a standard deviation of 8.4 cm
(b) the population standard deviation is 8.4 cm.
(iii) Comment on the difference in sample size required for parts (i) and (ii).

QUESTION 31.

The amounts of individual claims arising under a certain type of general insurance policy are known from
past experience to conform to a lognormal distribution in which the standard deviation is 1.8 times the
mean. An actuary has found that the lower and upper limits of a 95% confidence interval for the mean
claim amount are £4,250 and £4,750.

Evaluate the lower and upper limits of a 95% confidence interval for the lognormal parameter $\mu$.
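One way to approach this: for a lognormal distribution the mean is exp(mu + sigma^2/2) and the coefficient of variation is sqrt(exp(sigma^2) - 1). A coefficient of variation of 1.8 therefore fixes sigma^2, and since the mean is an increasing function of mu, the limits for the mean map directly to limits for mu. The sketch below assumes that reading; names are illustrative.

```python
from math import log

cv = 1.8                                  # standard deviation / mean
sigma_sq = log(1 + cv**2)                 # from sd/mean = sqrt(exp(sigma^2) - 1)

mean_limits = (4250, 4750)
# mean = exp(mu + sigma^2 / 2)  =>  mu = log(mean) - sigma^2 / 2
mu_limits = tuple(log(m) - sigma_sq / 2 for m in mean_limits)

print(round(sigma_sq, 4), tuple(round(limit, 3) for limit in mu_limits))
```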

QUESTION 32.

A general insurance company is debating introducing a new screening programme to reduce the claim
amounts that it needs to pay out. The programme consists of a much more detailed application form that
takes longer for the new client department to process. The screening is applied to a test group of clients as
a trial whilst other clients continue to fill in the old application form. It can be assumed that claim payments
follow a normal distribution.

The claim payments data for samples of the two groups of clients are (in £100 per year)

Without screening 24.5 21.7 35.2 15.9 23.7 34.2 29.3 21.1 23.5 28.3

With screening 22.4 21.2 36.3 15.7 21.5 7.3 12.8 21.2 23.9 18.4

(i) (a) Calculate a 95% confidence interval for the difference between the mean claim amounts.

(b) Comment on your answer.

(ii) (a) Calculate a 95% confidence interval for the ratio of the population variances

(b) Hence, comment on the assumption of equal variances required in part (i).

Assume that the sample sizes taken from the clients with and without screening are always equal to
keep processing easy.

(iii) Calculate the minimum sample size so that the width of a 95% confidence interval for the difference
between mean claim amounts is less than 10, assuming that the samples have the same variances as in
part (i).


CHAPTER

HYPOTHESIS TESTING
QUESTION 33.
A random variable X is believed to follow an $Exp(\lambda)$ distribution. In order to test the null hypothesis $\mu = 20$
against the alternative hypothesis $\mu = 30$, where $\mu = 1/\lambda$, a single value is observed from the
distribution. If this value is less than 28, $H_0$ is not rejected, otherwise $H_0$ is rejected.

Calculate the probabilities of:

(i) a Type I error

(ii) a Type II error.
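Both error probabilities follow from the exponential survival function, P(X > x) = exp(-x / mean). A minimal sketch, assuming the hypotheses concern the mean (20 under the null, 30 under the alternative) and that the null is rejected when the observation is 28 or more; `exp_survival` is a hypothetical helper.

```python
from math import exp


def exp_survival(x, mean):
    """P(X > x) for an exponential distribution with the given mean."""
    return exp(-x / mean)


critical_value = 28
type_1 = exp_survival(critical_value, 20)        # reject H0 when H0 is true
type_2 = 1 - exp_survival(critical_value, 30)    # keep H0 when H1 is true

print(round(type_1, 4), round(type_2, 4))
```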

QUESTION 34.
A short screening test has just been developed for depression. An independent blind comparison was made
with a gold-standard test for diagnosis of depression among 200 psychiatric outpatients.

Among the 50 outpatients found to be depressed according to the gold-standard test, 35 patients tested
positive under the new short test. Among 150 patients found not to be depressed according to the gold-
standard test, 30 patients tested positive under the new short test.

Calculate the sensitivity and specificity of the short screening test, assuming that the gold-standard test
correctly classifies each individual.

QUESTION 35.
A random variable X is believed to follow an $Exp(\lambda)$ distribution. In order to test the null hypothesis $\mu = 20$
against the alternative hypothesis $\mu = 30$, where $\mu = 1/\lambda$, a single value is observed from the distribution. If
this value of X is less than k, $H_0$ is not rejected, otherwise $H_0$ is rejected.

(i) Calculate the value of k that gives a test of size 5%.

(ii) Determine the probability of a Type II error in this case.



QUESTION 36.

The average IQ of a sample of 50 university students was found to be 105. Carry out a statistical test to
conclude whether the average IQ of university students is greater than 100, assuming that IQs are normally
distributed. It is known from previous studies that the standard deviation of IQs among students is
approximately 20.
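Treating the standard deviation as known, this is a one-sided z test. The sketch below computes the test statistic and its probability value under that assumption; the variable names are illustrative.

```python
from math import sqrt
from statistics import NormalDist

xbar, mu_0, sigma, n = 105, 100, 20, 50

z_stat = (xbar - mu_0) / (sigma / sqrt(n))   # standardised sample mean under H0
p_value = 1 - NormalDist().cdf(z_stat)       # one-sided: H1 is mean > 100

print(round(z_stat, 3), round(p_value, 4))
```

A probability value just under 4% would lead to rejection at the 5% level but not at the 1% level.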

QUESTION 37.

The annual rainfall in centimetres at a certain weather station over the last ten years has been as follows:

17.2 28.1 25.3 26.2 30.7 19.2 23.4 27.5 29.5 31.6

Scientists at the weather station wish to test whether the average annual rainfall has increased from its
former long-term value of 22 cm. Test this hypothesis at the 5% level, stating any assumptions that you make.

QUESTION 38.

A new gene has been identified that makes carriers of it particularly susceptible to a particular
degenerative disease. In a random sample of 250 adult males born in the UK, 8 were found to be carriers of the
disease. Test whether the proportion of adult males born in the UK carrying the gene is less than 10%.

QUESTION 39.

In a one-year investigation of claim frequencies for a particular category of motorists, the total number of
claims made under 5,000 policies was 800. Assuming that the number of claims made by individual motorists
has a $Poi(\lambda)$ distribution, test at the 1% level whether the unknown average claim frequency $\lambda$ is less
than 0.175.

QUESTION 40.

The average blood pressure for a control group C of 10 patients was 77.0 mmHg. The average blood
pressure in a similar group T of 10 patients on a special diet was 75.0 mmHg. Carry out a statistical test to
assess whether patients on the special diet have lower blood pressure.

You are given that $\sum_{i=1}^{10} c_i^2 = 59,420$ and $\sum_{i=1}^{10} t_i^2 = 56,390$.


QUESTION 41.
A car manufacturer runs tests to investigate the fuel consumption of cars using a newly developed fuel
additive. Sixteen cars of the same make and age are used, eight with the new additive and eight as controls.
The results, in miles per gallon over a test track under regulated conditions, are as follows:
Control 27.0 32.2 30.4 28.0 26.5 25.5 29.6 27.2
Additive 31.4 29.9 33.2 34.4 32.0 28.7 26.1 30.3
If $\mu_C$ is the mean number of miles per gallon achieved by cars in the control group, and $\mu_A$ is the mean
number of miles per gallon achieved by cars in the group with fuel additive, test:
(i) H0 :  A   C  0 vs H1 :  A   C  0

(ii) H0 :  A   C  6 vs H1 :  A   C  6

QUESTION 42.
The average blood pressure for a control group C of 10 patients was 77.0 mmHg. The average blood
pressure in a similar group T of 10 patients on a special diet was 75.0 mmHg. Test whether the variances in the
two populations can be considered to be equal.

You are given that $\sum_{i=1}^{10} c_i^2 = 59,420$ and $\sum_{i=1}^{10} t_i^2 = 56,390$.

QUESTION 43.
A sample of 100 claims on household policies made during the year just ended showed that 62 were due to
burglary. A sample of 200 claims made during the previous year had 115 due to burglary.
Test the hypothesis that the underlying proportion of claims that are due to burglary is higher in the second
year than in the first.

QUESTION 44.
In order to increase the efficiency with which employees in a certain organisation can carry out a task, 5
employees are sent on a training course. The time in seconds to carry out the task both before and after the
training course is given below for the 5 employees:

A B C D E

Before 42 51 37 43 45

After 38 37 32 40 48

Test whether the training course has had the desired effect.


QUESTION 45.

In testing whether a die is fair, a suitable model is:

$P(X = i) = \frac{1}{6}$, i = 1, 2, 3, 4, 5, 6, where X is the number thrown

and the hypotheses may be:

H 0 : Number thrown has the distribution specified in the model

H 1 : Number thrown does not have the distribution specified in the model

If the die is thrown 300 times, with the following results,

x: 1 2 3 4 5 6

fi : 43 56 54 47 41 59

Carry out a $\chi^2$ test to assess whether the data come from a fair die.
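The goodness-of-fit statistic is the sum of (observed - expected)^2 / expected over the six faces, with 300/6 = 50 expected in each cell and 6 - 1 = 5 degrees of freedom because no parameters are estimated. In the sketch below the 5% critical value of 11.07 is taken from chi-square tables rather than computed; names are illustrative.

```python
observed = [43, 56, 54, 47, 41, 59]
expected = [sum(observed) / len(observed)] * len(observed)   # 50 per face

chi_sq = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
df = len(observed) - 1

critical_5pct = 11.07   # upper 5% point of chi-square with 5 d.f., from tables
reject_h0 = chi_sq > critical_5pct

print(round(chi_sq, 2), df, reject_h0)
```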

QUESTION 46.

The table below shows the causes of death in elderly men derived from a study in the 1970s. Carry out a
chi-square test to determine whether these percentages can still be considered to provide an accurate
description of causes of death in 2000.

Cause of death Proportion of deaths in 1975 Number of deaths in 2000

Cancer 8% 286

Heart disease 22% 805

Other circulatory disease 40% 1,548

Respiratory diseases 19% 755

Other causes 11% 464


QUESTION 47.
The numbers of claims made last year by individual motor insurance policyholders were:

Number of claims 0 1 2 3 4+

Number of policyholders 2,962 382 47 25 4

Carry out a chi-square test to determine whether these frequencies can be considered to conform to a
Poisson distribution.

QUESTION 48.
On a particular run of a process which bottles a drink, it is thought that the cleansing process of the bottles
has partially failed. The bottles have been boxed into crates, each containing six bottles. It is thought that
each bottle, independently of all others, has the same chance of containing impurities.

A survey has been conducted, and each bottle in a random sample of 200 crates has been tested for
impurities. The table below gives the numbers of crates in the sample which had the respective number of bottles
which contained impurities:

Number of impure bottles: 0 1 2 3 4 5 6

Number of crates: 38 70 58 25 6 2 1

Test the goodness of fit of a binomial distribution to these observations.

QUESTION 49.
For each of three insurance companies, A, B, and C, a random sample of non-life policies of a particular kind
is examined. It turns out that a claim (or claims) have arisen in the past year in 23% of the sampled policies
for A, in 28% of those for B, and in 20% of those for C.

Test for differences in the underlying proportions of policies of this kind which have given rise to claims in
the past year among the three companies in the two situations:

(a) the sample sizes were 100, 100, and 200 respectively

(b) the sample sizes were 300, 300, and 600 respectively.

Comment briefly on your results.



QUESTION 50.

In an investigation into the effectiveness of car seat belts, 292 accident victims were classified according to
the severity of their injuries and whether they were wearing a seat belt at the time of the accident. The
results were as follows:

Wearing a seatbelt Not wearing a seatbelt

Death 3 47

Severe injury 78 32

Minor injury 103 29

Determine whether the severity of injuries sustained is dependent on whether the victims are wearing a
seat belt.
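For a contingency table, the expected count in each cell under independence is (row total x column total) / grand total, and the statistic has (rows - 1)(columns - 1) degrees of freedom. A minimal sketch with illustrative names; the 5% critical value of 5.991 for 2 degrees of freedom is taken from tables.

```python
# Rows: death, severe injury, minor injury. Columns: seat belt worn, not worn.
observed = [[3, 47], [78, 32], [103, 29]]

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

chi_sq = 0.0
for i, row in enumerate(observed):
    for j, obs in enumerate(row):
        exp_count = row_totals[i] * col_totals[j] / grand_total
        chi_sq += (obs - exp_count) ** 2 / exp_count

df = (len(row_totals) - 1) * (len(col_totals) - 1)
critical_5pct = 5.991   # upper 5% point of chi-square with 2 d.f., from tables

print(round(chi_sq, 2), df, chi_sq > critical_5pct)
```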

QUESTION 51.

The table below shows the numbers of births during one month at a particular hospital classified according
to whether a particular medical characteristic was or wasn't present during childbirth.

Age of mother <20 21-25 26-30 31-35 36+ Total

Characteristic present 10 12 9 4 3 38

Characteristic absent 5 51 38 25 5 124

Total 15 63 47 29 8 162

Assess whether the presence of this characteristic is dependent on the age of the mother.

QUESTION 52.

A statistical test is used to determine whether or not an anti-smoking campaign carried out 5 years ago has
led to a significant reduction in the mean number of smoking related illnesses. The probability value of the
test statistic is 7%.

Determine the conclusion for a test of size:

(i) 10%

(ii) 5%.

QUESTION 53.

A random sample, $x_1, \ldots, x_{10}$, from a normal population gives the following values:

9.5 18.2 4.69 3.76 14.2 17.13 15.69 13.9 15.7 7.42

$\sum x_i = 120.19 \qquad \sum x_i^2 = 1,693.6331$

(i) Test at the 5% level whether the mean of the whole population is 15 if the variance is:

(a) unknown

(b) 20.

(ii) Test at the 5% level whether the population variance is 20.

QUESTION 54.
A professional gambler has said: 'Flipping a coin into the air is fair, since the coin rotates about a horizontal
axis, and it is equally likely to be either way up when it first clips the ground. So a flicked coin is equally
likely to land showing heads or tails. However, spinning a coin on a table is not fair, since the coin rotates
about a vertical axis, and there is a systematic bias causing it to tilt towards the side where the embossed
pattern is heavier. In fact, when a new coin is spun, it is more than twice as likely to land showing tails as it
is to land showing heads.'

After hearing this, an experiment was carried out, spinning a new coin 25 times on a polished table; the
coin showed tails 18 times.

Comment on whether the results of the experiment support the gambler’s claims about the probability when a
coin is spun.

QUESTION 55.
A blood test has been used on 1,000 people to detect whether they have a particular condition. Of the 427
people who had a positive result, 369 of them had the condition. Of the 573 people who had a negative
result, 15 of them had the condition.

(i) (a) Calculate the sensitivity of the blood test.

(b) Calculate the specificity of the blood test.

A second blood test is used on 1,000 people which has a sensitivity of 80% and a specificity of 60%.
For this blood test, 544 people had a positive result.

(ii) (a) Calculate the number of true positives.

(b) Calculate the number of false positives.
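Sensitivity is the proportion of people with the condition who test positive, and specificity is the proportion without the condition who test negative. For part (ii), the number with the condition, D, can be backed out from: positives = sensitivity x D + (1 - specificity) x (N - D). The sketch below follows that reasoning with illustrative variable names.

```python
# Part (i): first blood test.
true_pos, false_neg = 369, 15
false_pos = 427 - true_pos            # positive result but no condition
true_neg = 573 - false_neg            # negative result and no condition

sensitivity = true_pos / (true_pos + false_neg)
specificity = true_neg / (true_neg + false_pos)

# Part (ii): second blood test, working back from the number of positives.
n_total, n_positive = 1000, 544
sens_2, spec_2 = 0.80, 0.60
with_condition = (n_positive - (1 - spec_2) * n_total) / (sens_2 - (1 - spec_2))
true_pos_2 = sens_2 * with_condition
false_pos_2 = n_positive - true_pos_2

print(round(sensitivity, 4), round(specificity, 4),
      round(true_pos_2), round(false_pos_2))
```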

QUESTION 56.

The lengths of a random sample of 12 worms of a particular species have a mean of 8.54 cm and standard
deviation of 2.97 cm. Let  denote the mean length of a worm of this species. It is required to test:

H0 :   7cm vs H1 :   7cm

The lengths of worms are assumed to be normally distributed.

Calculate the probability-value of these sample results.

QUESTION 57.

A general insurance company is debating introducing a new screening programme to reduce the claim
amounts that it needs to pay out. The programme consists of a much more detailed application form that
takes longer for the new client department to process. The screening is applied to a test group of clients as
a trial whilst other clients continue to fill in the old application form. It can be assumed that claim payments
follow a normal distribution.

The claim payments data for samples of the two groups of clients are (in £100 per year)

Without screening 24.5 21.7 45.2 15.9 23.7 34.2 29.3 21.1 23.5 28.3

With screening 22.4 21.2 36.3 15.7 21.5 7.3 12.8 21.2 23.9 18.4

(i) Test the hypothesis that the new screening programme reduces the mean claim amount.

(ii) Test the assumption of equal variances required in part (i).


QUESTION 58.
The total claim amounts (in £m) for home and car insurance over a year for similar sized companies are
collected by an independent advisor:

Home 13.3 19.2 12.9 15.8 17.6

Car 14.3 21.0 12.8 17.4 22.8

(i) Test whether the mean home and car claims are equal. State clearly your probability value.
It was subsequently discovered that the results were actually 5 consecutive years from the same
company.
(ii) Carry out an appropriate test of whether the mean home and car claims are equal.

QUESTION 59.
In an investigation into a patient's red corpuscle count, the number of such corpuscles appearing in each of
400 cells of a haemocytometer was counted. The results were as follows:

No. of red blood corpuscles 0 1 2 3 4 5 6 7 8

No. of cells 40 66 93 94 62 25 14 5 1

It is thought that a Poisson distribution with mean  provides an appropriate model for this situation.
(a) Estimate  .
(b) Test the fit of the Poisson model.
For a healthy person, the mean count per cell is known to be equal to 3. For a patient with certain types of
anaemia, the number of red blood corpuscles is known to be lower than this.
(c) Test whether this patient has one of these types of anaemia.

QUESTION 60.
In a recent study investigating a possible genetic link between individuals' susceptibility to developing
symptoms of AIDS, 549 men who had been diagnosed HIV positive were classified according to whether
they carried two particular alleles (DRB1*0702 and DQA1*0201). The results were as follows:

Condition of individual Free of symptoms Early symptoms Suffering from AIDS Total

Alleles present 24 7 17 48

Alleles absent 98 93 310 501

Total 122 100 327 549

Test whether there is an association between the presence of the alleles and the classification into the three
AIDS statuses using these results.

QUESTION 61.
A politician has said: 'A recent study in a particular area showed that 25% of the 400 teenagers who were
living in single-parent families had been in trouble with the police, compared with only 20% of the 1,200
teenagers who were living in two-parent families. Our aim is to reduce the number of single-parent families
in order to reduce the crime rates during the next decade.'
(i) Carry out a contingency table test at the 5% significance level to assess whether there is a significant
association between living in a single-parent family and getting into trouble with the police.
(ii) Comment on the politician's statement.

QUESTION 62.
A particular area in a town suffers a high burglary rate. A sample of 100 streets is taken, and in each of the
sampled streets, a sample of six similar houses is taken. The table below shows the number of sampled
houses, which have had burglaries during the last six months.

No. of houses burgled x 0 1 2 3 4 5 6

No. of streets f 39 38 18 4 0 1 0

(i) (a) State any assumptions needed to justify the use of a binomial model for the number of houses
per street which have been burgled during the last six months.
(b) Derive the maximum likelihood estimate of p, the probability that a house of the type sampled
has been burgled during the last six months.
(c) Determine the probabilities for the binomial model using your estimate of p.
(d) Comment on the fit without doing a formal test.
An insurance company works on the basis that the probability of a house being burgled over a six
month period is 0.18.

(ii) Carry out a test to investigate whether the binomial model with this value of p provides a good fit for
the data.

QUESTION 63.
It is desired to investigate the level of premium charged by two companies for contents policies for houses
in a certain area. Random samples of 10 houses insured by Company A are compared with 10 similar houses
insured by Company B. The premiums charged in each case are as follows:

Company A 117 154 166 189 190 202 233 263 289 331

Company B 142 160 166 188 221 241 276 279 284 302

The line plots below show the sample values for the two companies:

(i) Comment briefly on the validity of the assumptions required for a two-sample t test for the premiums
of these two companies using the plots.

For these data: $\sum A = 2{,}134$, $\sum A^2 = 494{,}126$, $\sum B = 2{,}259$, $\sum B^2 = 541{,}463$.

(ii) Carry out a formal test to check that it is appropriate to apply a two-sample t test to these data, as-
suming that the premiums are normally distributed.
(iii) Test whether the level of premiums charged by Company B was significantly higher than that charged
by Company A, stating the p value and conclusion clearly.
(iv) (a) Calculate a 95% confidence interval for the difference between the proportions of premiums of
each company that are in excess of £200.
(b) Comment briefly on your result to part (iv)(a).
The average premium charged by Company A in the previous year was £170.
(v) Test whether Company A appears to have increased its premiums since the previous year.

ASSIGNMENT 4
CHAPTER

CORRELATION
QUESTION 1.
A sample of ten claims and corresponding payments on settlement for household policies is taken from the
business of an insurance company.

The amounts, in units of £100, are as follows:

Claim x 2.10 2.40 2.50 3.20 3.60 3.80 4.10 4.20 4.50 5.00

Payment y 2.18 2.06 2.54 2.61 3.67 3.25 4.02 3.71 4.38 4.45

Draw a scatterplot and comment on the relationship between claims and payments.

QUESTION 2.
The rate of interest of borrowing, over the next five years, for ten companies is compared to each company's
leverage ratio (its debt to equity ratio).

The data is as follows:

Leverage ratio, x 0.1 0.4 0.5 0.8 1.0 1.8 2.0 2.5 2.8 3.0

Interest rate (%), y 2.8 3.4 3.5 3.6 4.6 6.3 10.2 19.7 31.3 42.9

Draw a scatterplot and comment on the relationship between company borrowing (leverage) and interest
rate. Hence apply a transformation to obtain a linear relationship.

QUESTION 3.
Show that:

$$S_{xx} = \sum (x_i - \bar{x})^2 = \sum x_i^2 - \frac{\left(\sum x_i\right)^2}{n} = \sum x_i^2 - n\bar{x}^2$$

QUESTION 4.
For the claims settlement data, we have:

Claim (£100's)x 2.10 2.40 2.50 3.20 3.60 3.80 4.10 4.20 4.50 5.00

Payment (£100's) y 2.18 2.06 2.54 2.61 3.67 3.25 4.02 3.71 4.38 4.45

Number of pairs of observations n = 10.


$\sum x = 35.4$, $\sum x^2 = 133.76$, $\sum y = 32.87$, $\sum y^2 = 115.2025$, $\sum xy = 123.81$

Calculate Pearson's correlation coefficient for the claims settlement data and comment on its value.
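
As a check on the arithmetic, the following minimal Python sketch computes the corrected sums of squares and Pearson's coefficient from the summary statistics quoted in the question. The variable names are illustrative only and are not part of the original assignment.

```python
import math

# Summary statistics quoted in the question (claims x, payments y, in units of £100)
n = 10
sum_x, sum_x2 = 35.4, 133.76
sum_y, sum_y2 = 32.87, 115.2025
sum_xy = 123.81

# Corrected sums of squares and cross-products
s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Pearson's sample correlation coefficient r = Sxy / sqrt(Sxx * Syy)
r = s_xy / math.sqrt(s_xx * s_yy)
print(f"Sxx = {s_xx:.4f}, Syy = {s_yy:.4f}, Sxy = {s_xy:.4f}, r = {r:.4f}")
```

The same three quantities (Sxx, Syy, Sxy) are reused in the later regression questions on these data.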

QUESTION 5.
For the original borrowing rate data:

Leverage ratio, x 0.1 0.4 0.5 0.8 1.0 1.8 2.0 2.5 2.8 3.0

Interest rate y 0.028 0.034 0.035 0.036 0.046 0.063 0.102 0.197 0.313 0.429

Number of pairs of observations n = 10.


$\sum x = 14.9$, $\sum x^2 = 32.39$, $\sum y = 1.283$, $\sum y^2 = 0.341769$, $\sum xy = 3.082$

Calculate Pearson's correlation coefficient for the borrowing rate data.

QUESTION 6.
Calculate Spearman's rank correlation coefficient for the claims settlement data and comment.

Claim (£100's)x 2.10 2.40 2.50 3.20 3.60 3.80 4.10 4.20 4.50 5.00

Payment (£100's) y 2.18 2.06 2.54 2.61 3.67 3.25 4.02 3.71 4.38 4.45
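
One way to check a hand calculation here is sketched below in Python. It assumes there are no tied values (true for these data), so the simplified formula based on the squared rank differences applies. The helper `ranks` is a hypothetical name introduced for illustration.

```python
claims = [2.10, 2.40, 2.50, 3.20, 3.60, 3.80, 4.10, 4.20, 4.50, 5.00]
payments = [2.18, 2.06, 2.54, 2.61, 3.67, 3.25, 4.02, 3.71, 4.38, 4.45]


def ranks(values):
    """Rank from 1 (smallest) upwards; assumes no tied values."""
    ordered = sorted(values)
    return [ordered.index(v) + 1 for v in values]


n = len(claims)

# Sum of squared differences between the two sets of ranks
d_squared = sum((rx - ry) ** 2 for rx, ry in zip(ranks(claims), ranks(payments)))

# Simplified Spearman formula, valid when there are no ties
r_s = 1 - 6 * d_squared / (n * (n ** 2 - 1))
print(d_squared, round(r_s, 4))
```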

QUESTION 7.
Calculate Spearman's rank correlation coefficient for the original borrowing rate data and comment.

Leverage ratio, x 0.1 0.4 0.5 0.8 1.0 1.8 2.0 2.5 2.8 3.0

Interest rate (%), y 2.8 3.4 3.5 3.6 4.6 6.3 10.2 19.7 31.3 42.9

QUESTION 8.
Calculate Kendall's rank correlation coefficient for the claims settlement data and comment.

Claim (£100's)x 2.10 2.40 2.50 3.20 3.60 3.80 4.10 4.20 4.50 5.00

Payment (£100's) y 2.18 2.06 2.54 2.61 3.67 3.25 4.02 3.71 4.38 4.45
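
The counting of concordant and discordant pairs can be sketched as follows. This is an illustrative Python check, assuming no ties in either variable; the names `n_c` and `n_d` follow the notation used later in this assignment.

```python
from itertools import combinations

claims = [2.10, 2.40, 2.50, 3.20, 3.60, 3.80, 4.10, 4.20, 4.50, 5.00]
payments = [2.18, 2.06, 2.54, 2.61, 3.67, 3.25, 4.02, 3.71, 4.38, 4.45]

n_c = 0  # concordant pairs: x and y move in the same direction
n_d = 0  # discordant pairs: x and y move in opposite directions
for (x1, y1), (x2, y2) in combinations(zip(claims, payments), 2):
    direction = (x2 - x1) * (y2 - y1)
    if direction > 0:
        n_c += 1
    elif direction < 0:
        n_d += 1

n = len(claims)
total_pairs = n * (n - 1) // 2

# Kendall's tau: (concordant - discordant) / total number of pairs
tau = (n_c - n_d) / total_pairs
print(n_c, n_d, round(tau, 4))
```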

QUESTION 9.
Calculate Kendall's rank correlation coefficient for the original borrowing rate data and comment on its
value.

Leverage ratio, x 0.1 0.4 0.5 0.8 1.0 1.8 2.0 2.5 2.8 3.0

Interest rate (%), y 2.8 3.4 3.5 3.6 4.6 6.3 10.2 19.7 31.3 42.9

QUESTION 10.

Test $H_0: \rho = 0$ vs $H_1: \rho > 0$ for the claims settlement data.

Claim (£100's)x 2.10 2.40 2.50 3.20 3.60 3.80 4.10 4.20 4.50 5.00

Payment (£100's) y 2.18 2.06 2.54 2.61 3.67 3.25 4.02 3.71 4.38 4.45
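
For a test of zero correlation based on Pearson's coefficient, the usual statistic is $t = r\sqrt{n-2}/\sqrt{1-r^2}$, referred to the t distribution with n - 2 degrees of freedom. The sketch below only evaluates the statistic from the summary sums given earlier for these data; the comparison with tabulated t values is left to the reader.

```python
import math

# Summary sums for the claims settlement data (n = 10 pairs)
n = 10
sum_x, sum_x2 = 35.4, 133.76
sum_y, sum_y2 = 32.87, 115.2025
sum_xy = 123.81

s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n
r = s_xy / math.sqrt(s_xx * s_yy)

# Test statistic for H0: rho = 0, compared with t on n - 2 degrees of freedom
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)
print(f"r = {r:.4f}, t = {t_stat:.3f} on {n - 2} degrees of freedom")
```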

QUESTION 11.
Considering the data on claims and settlements, carry out the test:

$H_0: \rho = 0.9$ vs $H_1: \rho > 0.9$

for the population of all claims/payments of this type.

Claim (£100's)x 2.10 2.40 2.50 3.20 3.60 3.80 4.10 4.20 4.50 5.00

Payment (£100's) y 2.18 2.06 2.54 2.61 3.67 3.25 4.02 3.71 4.38 4.45
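
When the hypothesised correlation is not zero, Fisher's transformation is the usual route: $\tanh^{-1}(r)$ is approximately normal with mean $\tanh^{-1}(\rho)$ and variance $1/(n-3)$. The Python sketch below evaluates the standardised statistic for a hypothesised value of 0.9 and the corresponding upper-tail normal probability. It is an illustrative check under those standard assumptions, not a model solution.

```python
import math

# Summary sums for the claims settlement data (n = 10 pairs)
n = 10
sum_x, sum_x2 = 35.4, 133.76
sum_y, sum_y2 = 32.87, 115.2025
sum_xy = 123.81

s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n
r = s_xy / math.sqrt(s_xx * s_yy)

rho_0 = 0.9  # hypothesised correlation

# Fisher's z-transform of the sample and hypothesised correlations
z_r = math.atanh(r)
z_0 = math.atanh(rho_0)

# Approximately N(0, 1) under the null hypothesis
z_stat = (z_r - z_0) * math.sqrt(n - 3)

# Upper-tail standard normal probability P(Z > z_stat)
p_upper = 0.5 * math.erfc(z_stat / math.sqrt(2))
print(f"z_r = {z_r:.4f}, statistic = {z_stat:.3f}, upper tail = {p_upper:.3f}")
```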

QUESTION 12.
An actuary wants to investigate if there is any correlation between students' scores in the CS1 mock exam
and the CS2 mock exam. Data values from 22 students were collected and the results are:

Student 1 2 3 4 5 6 7 8 9 10 11

CS1 mock score 51 43 39 80 56 57 26 68 54 75 72

CS2 mock score 52 42 58 56 47 72 16 63 48 80 68

Student 12 13 14 15 16 17 18 19 20 21 22

CS1 mock score 85 48 27 63 76 64 55 78 82 52 60

CS2 mock score 82 54 38 57 71 50 45 60 59 49 61

You are given that $\sum d^2 = 494$, $n_c = 174$ and $n_d = 57$.

Test $H_0: \rho = 0$ vs $H_1: \rho \neq 0$ for the mock score data using Spearman's rank correlation coefficient and
Kendall's rank correlation coefficient.
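
Both coefficients follow directly from the quantities given. The sketch below also forms standardised statistics using the commonly quoted large-sample normal approximations under the null hypothesis, namely variance $1/(n-1)$ for Spearman's coefficient and $2(2n+5)/(9n(n-1))$ for Kendall's. Treat those approximations as assumptions of this illustration.

```python
import math

n = 22
sum_d_squared = 494   # sum of squared rank differences (given)
n_c, n_d = 174, 57    # concordant and discordant pairs (given)

# Spearman's rank correlation from the squared rank differences
r_s = 1 - 6 * sum_d_squared / (n * (n ** 2 - 1))

# Kendall's tau from the pair counts; total pairs = n(n - 1)/2
tau = (n_c - n_d) / (n * (n - 1) / 2)

# Standardised statistics under the large-sample normal approximations
z_spearman = r_s * math.sqrt(n - 1)
z_kendall = tau / math.sqrt(2 * (2 * n + 5) / (9 * n * (n - 1)))
print(f"r_s = {r_s:.4f}, z = {z_spearman:.3f}")
print(f"tau = {tau:.4f}, z = {z_kendall:.3f}")
```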

QUESTION 13.
A new computerised ultrasound scanning technique has enabled doctors to monitor the weights of unborn
babies. The table below shows the estimated weights for one particular baby at fortnightly intervals during
the pregnancy.

Gestation period (weeks) 30 32 34 36 38 40

Estimated baby weight (kg) 1.6 1.7 2.5 2.8 3.2 3.5

$\sum x = 210$, $\sum x^2 = 7{,}420$, $\sum y = 15.3$, $\sum y^2 = 42.03$, $\sum xy = 549.8$

(i) Show that $S_{xx} = 70$, $S_{yy} = 3.015$ and $S_{xy} = 14.3$.

(ii) Show that the (Pearson's) linear correlation coefficient is equal to 0.984 and comment.

(iii) Explain why the Spearman's and Kendall's rank correlation coefficients are both equal to 1.

(iv) Carry out a test of $H_0: \rho = 0$ vs $H_1: \rho > 0$ using Pearson's correlation coefficient and:

(a) the t test

(b) Fisher's transformation.

(v) Test whether Pearson's sample correlation coefficient supports the hypothesis that the true correlation
parameter is greater than 0.9.

QUESTION 14.

A schoolteacher is investigating the claim that class size does not affect GCSE results. His observations of
nine GCSE classes are as follows:

Class X1 X2 X3 X4 Y1 Y2 Y3 Y4 Y5

Students in class (c) 35 32 27 21 34 30 28 24 7

Average GCSE point score for class (p) 5.9 4.1 2.4 1.7 6.3 5.3 3.5 2.6 1.6

$\sum c = 238$, $\sum c^2 = 6{,}884$, $\sum p = 33.4$, $\sum p^2 = 149.62$, $\sum cp = 983$

(i) (a) Calculate Pearson's, Spearman's and Kendall’s correlation coefficients.

(b) Use Pearson's correlation coefficient to test whether or not the data agree with the claim that
class size does not affect GCSE results.

Following his investigation, the teacher concludes, 'bigger class sizes improve GCSE results'.

(ii) Comment on this statement.

QUESTION 15.
A university wishes to analyse the performance of its students on a particular degree course. It records the
scores obtained by a sample of 12 students at entry to the course, and the scores obtained in their final
examinations by the same students. The results are as follows:

Student A B C D E F G H I J K L

Entrance exam score x (%) 86 53 71 60 62 79 66 84 90 55 58 72

Finals paper score y (%) 75 60 74 68 70 75 78 90 85 60 62 70

$\sum x = 836$, $\sum x^2 = 60{,}016$, $\sum y = 867$, $\sum y^2 = 63{,}603$, $\sum (x - \bar{x})(y - \bar{y}) = 1{,}122$

(i) (a) Explain why Spearman's and Kendall's rank correlation coefficients cannot be calculated here
using the simplified formula.
(b) Calculate the Pearson's correlation coefficient.
(ii) Test whether this data comes from a population with Pearson's correlation coefficient equal to 0.75.

CHAPTER

LINEAR REGRESSION
QUESTION 16.
A sample of ten claims and corresponding payments on settlement for household policies is taken from the
business of an insurance company.

The amounts, in units of £100, are as follows:

Claim x 2.10 2.40 2.50 3.20 3.60 3.80 4.10 4.20 4.50 5.00

Payment y 2.18 2.06 2.54 2.61 3.67 3.25 4.02 3.71 4.38 4.45

Discuss whether a linear regression model is appropriate.

QUESTION 17.
Explain how to transform the relationship $Y = ab^x$ to a linear form.

QUESTION 18.

Show that the fitted line $\hat{y} = \hat{\alpha} + \hat{\beta}x$ passes through the point $(\bar{x}, \bar{y})$.

QUESTION 19.

Claim x 2.10 2.40 2.50 3.20 3.60 3.80 4.10 4.20 4.50 5.00

Payment y 2.18 2.06 2.54 2.61 3.67 3.25 4.02 3.71 4.38 4.45

The sample of ten claims and payments above (in units of £100) has the following summations:
$\sum x = 35.4$, $\sum x^2 = 133.76$, $\sum y = 32.87$, $\sum y^2 = 115.2025$, $\sum xy = 123.81$

Calculate the fitted regression line and the estimated error variance.
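
A minimal sketch of the least squares calculation from the summations is given below in Python. It uses the standard results $\hat{\beta} = S_{xy}/S_{xx}$, $\hat{\alpha} = \bar{y} - \hat{\beta}\bar{x}$ and $\hat{\sigma}^2 = (S_{yy} - S_{xy}^2/S_{xx})/(n-2)$. Variable names are illustrative.

```python
# Summations quoted in the question (units of £100)
n = 10
sum_x, sum_x2 = 35.4, 133.76
sum_y, sum_y2 = 32.87, 115.2025
sum_xy = 123.81

s_xx = sum_x2 - sum_x ** 2 / n
s_yy = sum_y2 - sum_y ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Least squares estimates of slope and intercept
beta_hat = s_xy / s_xx
alpha_hat = sum_y / n - beta_hat * sum_x / n

# Unbiased estimate of the error variance (n - 2 degrees of freedom)
sigma2_hat = (s_yy - s_xy ** 2 / s_xx) / (n - 2)

print(f"fitted line: y = {alpha_hat:.3f} + {beta_hat:.5f} x")
print(f"estimated error variance: {sigma2_hat:.4f}")
```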

QUESTION 20.
For the claims settlement question above, calculate the expected payment on settlement for a claim of £350.

QUESTION 21.
Determine the split of total variation in the claims and payments model between the residual sum of
squares and the regression sum of squares.

Claim x 2.10 2.40 2.50 3.20 3.60 3.80 4.10 4.20 4.50 5.00

Payment y 2.18 2.06 2.54 2.61 3.67 3.25 4.02 3.71 4.38 4.45

Recall that: $n = 10$, $\sum x = 35.4$, $\sum y = 32.87$, $S_{xx} = 8.444$, $S_{yy} = 7.1588$, $S_{xy} = 7.4502$

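
The split of the total variation can be sketched as follows, using the identities $SS_{TOT} = S_{yy}$, $SS_{REG} = S_{xy}^2/S_{xx}$ and $SS_{RES} = SS_{TOT} - SS_{REG}$. This is an illustrative check in Python based on the values recalled in the question.

```python
# Values recalled in the question
s_xx, s_yy, s_xy = 8.444, 7.1588, 7.4502

ss_tot = s_yy                  # total sum of squares
ss_reg = s_xy ** 2 / s_xx      # variation explained by the regression
ss_res = ss_tot - ss_reg       # unexplained (residual) variation

# Proportion of the total variation explained by the fitted line
proportion_explained = ss_reg / ss_tot
print(f"SSTOT = {ss_tot:.4f}, SSREG = {ss_reg:.4f}, SSRES = {ss_res:.4f}")
print(f"proportion explained = {proportion_explained:.4f}")
```
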
QUESTION 22.
Calculate the coefficient of determination for the claims and payments model and comment on it.
Recall that: $SS_{TOT} = 7.1588$, $SS_{REG} = 6.5734$, $SS_{RES} = 0.5854$

QUESTION 23.
Calculate the correlation coefficient for the claims and payment data by using the coefficient of determination
from the previous question.

QUESTION 24.
For the claims/settlements data:
(a) calculate a two-sided 95% confidence interval for $\beta$, the slope of the true regression line

(b) test the hypothesis $H_0: \beta = 1$ vs $H_1: \beta \neq 1$.

Recall that: $S_{xx} = 8.444$, $S_{yy} = 7.1588$, $S_{xy} = 7.4502$, $\hat{\alpha} = 0.164$, $\hat{\beta} = 0.88231$, $\hat{\sigma}^2 = 0.0732$
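
Inference on the slope rests on $(\hat{\beta} - \beta)/\sqrt{\hat{\sigma}^2/S_{xx}}$ having a t distribution with n - 2 degrees of freedom. The Python sketch below evaluates the interval and the statistic for a hypothesised slope of 1. The value 2.306 is the upper 2.5% point of the t distribution with 8 degrees of freedom, taken from standard tables rather than computed.

```python
import math

# Values recalled in the question
n = 10
s_xx = 8.444
beta_hat = 0.88231
sigma2_hat = 0.0732

# Upper 2.5% point of t with n - 2 = 8 degrees of freedom (from tables)
t_crit = 2.306

# Standard error of the slope estimate
se_beta = math.sqrt(sigma2_hat / s_xx)

# Two-sided 95% confidence interval for the slope
ci_lower = beta_hat - t_crit * se_beta
ci_upper = beta_hat + t_crit * se_beta

# Test statistic for a hypothesised slope of 1
t_stat = (beta_hat - 1) / se_beta
print(f"se = {se_beta:.4f}, 95% CI = ({ci_lower:.3f}, {ci_upper:.3f}), t = {t_stat:.3f}")
```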

QUESTION 25.
For the data set of 10 claims and their settlement payments, we had:
$SS_{TOT} = 7.1588$, $SS_{REG} = 6.5734$, $SS_{RES} = 0.5854$

Construct the ANOVA table and carry out an F test to assess whether $\beta = 0$.

QUESTION 26.
Consider again the claims/settlements data.
Calculate:
(a) a 95% confidence interval for the expected payments on claims of £460.

(b) a 95% confidence interval for the predicted actual payments on claims of £460.

Recall that: $\hat{\alpha} = 0.164$, $\hat{\beta} = 0.88231$, $\hat{\sigma}^2 = 0.0732$ and $S_{xx} = 8.444$
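
The two intervals differ only in the variance used: the mean response has variance factor $1/n + (x_0 - \bar{x})^2/S_{xx}$, while an individual response adds 1 to that factor. The sketch below works in units of £100, so a claim of £460 is $x_0 = 4.6$. It assumes n = 10 and $\sum x = 35.4$ from the earlier questions and takes the t value 2.306 (8 degrees of freedom) from tables.

```python
import math

# Values recalled in the question, plus n and the mean of x from earlier questions
n = 10
x_bar = 35.4 / n
s_xx = 8.444
alpha_hat, beta_hat, sigma2_hat = 0.164, 0.88231, 0.0732

x_0 = 4.6          # claim of £460 in units of £100
t_crit = 2.306     # upper 2.5% point of t with 8 degrees of freedom (from tables)

# Point estimate, the same for both intervals
y_hat = alpha_hat + beta_hat * x_0

# Variance factor for the mean response at x_0
mean_factor = 1 / n + (x_0 - x_bar) ** 2 / s_xx

# Standard errors: mean response, and individual response (extra sigma^2 term)
se_mean = math.sqrt(sigma2_hat * mean_factor)
se_individual = math.sqrt(sigma2_hat * (1 + mean_factor))

ci_mean = (y_hat - t_crit * se_mean, y_hat + t_crit * se_mean)
ci_individual = (y_hat - t_crit * se_individual, y_hat + t_crit * se_individual)
print(f"estimate = {y_hat:.3f}")
print(f"mean response CI = ({ci_mean[0]:.3f}, {ci_mean[1]:.3f})")
print(f"individual response CI = ({ci_individual[0]:.3f}, {ci_individual[1]:.3f})")
```

Multiplying the results by 100 converts them back to pounds.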

QUESTION 27.

The claims/settlement data values were as follows:

Claim x 2.10 2.40 2.50 3.20 3.60 3.80 4.10 4.20 4.50 5.00

Payment y 2.18 2.06 2.54 2.61 3.67 3.25 4.02 3.71 4.38 4.45

Calculate the residuals for the fitted regression model ŷ = 0.164 + 0.8823x.
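
Each residual is the observed payment minus the fitted payment. A short Python sketch, using the rounded coefficients quoted in the question, is given below; because of that rounding the residuals sum to a value close to, but not exactly, zero.

```python
claims = [2.10, 2.40, 2.50, 3.20, 3.60, 3.80, 4.10, 4.20, 4.50, 5.00]
payments = [2.18, 2.06, 2.54, 2.61, 3.67, 3.25, 4.02, 3.71, 4.38, 4.45]

# Rounded coefficients of the fitted line quoted in the question
alpha_hat, beta_hat = 0.164, 0.8823

# Residual = observed value minus fitted value
residuals = [y - (alpha_hat + beta_hat * x) for x, y in zip(claims, payments)]

for x, e in zip(claims, residuals):
    print(f"x = {x:.2f}  residual = {e:+.3f}")
print(f"sum of residuals = {sum(residuals):+.4f}")
```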

QUESTION 28.

A senior actuary wants to analyse the salaries of the 50 actuarial students employed by her company, using
a linear model based on number of exam passes and years of experience. Express this model and the avail-
able data.

QUESTION 29.
A new computerised ultrasound scanning technique has enabled doctors to monitor the weights of unborn
babies. The table below shows the estimated weights for one particular baby at fortnightly intervals during
the pregnancy.

Gestation period (weeks) 30 32 34 36 38 40

Estimated baby weight (kg) 1.6 1.7 2.5 2.8 3.2 3.5

$\sum x = 210$, $\sum x^2 = 7{,}420$, $\sum y = 15.3$, $\sum y^2 = 42.03$, $\sum xy = 549.8$

(i) Show that:

(a) $S_{xx} = 70$, $S_{yy} = 3.015$ and $S_{xy} = 14.3$.

(b) the fitted regression line is ŷ = –4.60 + 0.2043x.

(c) $\hat{\sigma}^2 = 0.0234$.

(ii) Calculate the baby's expected weight at 42 weeks (assuming it hasn't been born by then).

(iii) (a) Calculate the residual sum of squares and the regression sum of squares for these data.

(b) Calculate the coefficient of determination, $R^2$, and comment on its value.

(iv) Carry out a test of $H_0: \beta = 0$ vs $H_1: \beta > 0$, assuming a linear model is appropriate.

(v) Construct an ANOVA table for the sum of squares from part (iii)(a) and carry out an F-test stating the
conclusion clearly.

(vi) (a) Estimate the mean weight of a baby at 33 weeks.

(b) Calculate the variance of this mean predicted response.

(c) Hence, calculate a 90% confidence interval for the mean weight of a baby at 33 weeks.

(vii) (a) Estimate the actual weight of an individual baby at 33 weeks.

(b) Calculate the variance of this individual predicted response.

(c) Hence, calculate a 90% confidence interval for the weight of an individual baby at 33 weeks.

The table below shows some of the residuals:

Gestation period (weeks) 30 32 34 36 38 40

Residual 0.07 0.05 0.04 –0.07

(viii) (a) Calculate the missing residuals.

Two plots of the residuals are as follows:

(b) Comment on the first dotplot of the residuals.

(c) Comment on the fit of the model using the plot of the residuals against the x values.

(d) Comment on the Q-Q plot of the residuals given below:

Normal Q-Q Plot

QUESTION 30.
An analysis using the simple linear regression model based on 19 data points gave:
$s_{xx} = 12.2$, $s_{yy} = 10.6$, $s_{xy} = 8.1$

(i) (a) Calculate $\hat{\beta}$.

(b) Test whether $\beta$ is significantly different from zero.

(ii) (a) Calculate r.

(b) Test whether $\rho$ is significantly different from zero.

(iii) Comment on the results of the tests in parts (i) and (ii).
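
A quick numerical comparison of the two test statistics may help with part (iii). The Python sketch below computes the slope statistic $\hat{\beta}/\sqrt{\hat{\sigma}^2/S_{xx}}$ and the correlation statistic $r\sqrt{n-2}/\sqrt{1-r^2}$ from the given summary values; both are referred to the t distribution with 17 degrees of freedom.

```python
import math

n = 19
s_xx, s_yy, s_xy = 12.2, 10.6, 8.1

# Slope estimate and its t statistic for a hypothesised slope of zero
beta_hat = s_xy / s_xx
sigma2_hat = (s_yy - s_xy ** 2 / s_xx) / (n - 2)
t_slope = beta_hat / math.sqrt(sigma2_hat / s_xx)

# Correlation coefficient and its t statistic for a hypothesised correlation of zero
r = s_xy / math.sqrt(s_xx * s_yy)
t_corr = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

print(f"beta_hat = {beta_hat:.4f}, t = {t_slope:.4f}")
print(f"r = {r:.4f}, t = {t_corr:.4f}")
```

The two statistics agree, which is the algebraic equivalence of the two tests in simple linear regression.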

QUESTION 31.
The sums of the squares of the errors in a regression analysis are found to be:

$SS_{REG} = \sum (\hat{y}_i - \bar{y})^2 = 6.4$, $SS_{RES} = \sum (y_i - \hat{y}_i)^2 = 3.6$, $SS_{TOT} = \sum (y_i - \bar{y})^2 = 10.0$

Calculate the coefficient of determination and explain what this represents.

QUESTION 32.
Explain how to transform the following models to linear form:
(i) $y_i = a + bx_i^2 + e_i$

(ii) $y_i = ae^{bx_i}$

QUESTION 33.
A university wishes to analyse the performance of its students on a particular degree course. It records the
scores obtained by a sample of 12 students at entry to the course, and the scores obtained in their final
examinations by the same students. The results are as follows:

Student A B C D E F G H I J K L

Entrance exam score x (%) 86 53 71 60 62 79 66 84 90 55 58 72

Finals paper score y (%) 75 60 74 68 70 75 78 90 85 60 62 70

$\sum x = 836$, $\sum x^2 = 60{,}016$, $\sum y = 867$, $\sum y^2 = 63{,}603$, $\sum (x - \bar{x})(y - \bar{y}) = 1{,}122$

(i) Calculate the fitted linear regression equation of y on x.

Now assume that the full normal model holds.


(ii) (a) Calculate an estimate of the error variance $\sigma^2$.
(b) Hence, obtain a 90% confidence interval for $\sigma^2$.

(iii) Test whether the data are positively correlated by considering the slope parameter.

(iv) Calculate a 95% confidence interval for the mean finals paper score corresponding to an individual
entrance score of 53.

(v) (a) Calculate the proportion of variation explained by the model

(b) Hence, comment on the fit of the model.

QUESTION 34.
The share price, in pence, of a certain company is monitored over an 8-year period. The results are shown
in the table below:

Time (years) 0 1 2 3 4 5 6 7 8

Price 100 131 183 247 330 454 601 819 1,095

$\sum (x_i - \bar{x})^2 = 60$, $\sum (y_i - \bar{y})^2 = 925{,}262$, $\sum (x_i - \bar{x})(y_i - \bar{y}) = 7{,}087$

An actuary fits the following simple linear regression model to the data:

$y_i = \alpha + \beta x_i + e_i$, $i = 0, 1, \ldots, 8$

where $\{e_i\}$ are independent normal random variables with mean zero and variance $\sigma^2$.

(i) Determine the fitted regression line in which the price is modelled as the response and the time as an
explanatory variable.

(ii) Calculate a 99% confidence interval for:

(a) $\beta$, the true underlying slope parameter

(b) $\sigma^2$, the true underlying error variance.

(iii) (a) State the 'total sum of squares' and calculate its partition into the 'regression sum of squares' and
the 'residual sum of squares'.

(b) Calculate the 'proportion of variability explained by the model' using the values in part (iii)(a).

(c) Comment on the result in part (iii)(b).

(iv) The actuary decides to check the fit of the model by calculating the residuals.

(a) Complete the table of residuals (rounding to the nearest integer):

Time (years) 0 1 2 3 4 5 6 7 8

Residual 132 –21 –75 –104 –75 25

A dotplot of the residuals is shown below:

(b) Comment on the assumption of normality using the dotplot.

A plot of the residuals against time is given below:

(c) Comment on the appropriateness of the linear model by referring to the plot of the residuals
against time.

QUESTION 35.
A schoolteacher is investigating the claim that class size does not affect GCSE results. His observations of
nine GCSE classes are as follows:

Class X1 X2 X3 X4 Y1 Y2 Y3 Y4 Y5

Students in class (c) 35 32 27 21 34 30 28 24 7

Average GCSE point score for class (p) 5.9 4.1 2.4 1.7 6.3 5.3 3.5 2.6 1.6

$\sum c = 238$, $\sum c^2 = 6{,}884$, $\sum p = 33.4$, $\sum p^2 = 149.62$, $\sum cp = 983$

(i) Determine the fitted regression line for p on c.

Class X5 was not included in the results above and contains 15 students.

(ii) (a) Calculate an estimate of the average GCSE point score for this individual class

(b) Calculate the standard error for the estimate in part (ii)(a) assuming the full normal model.

QUESTION 36.
An actuary is fitting the following linear regression model through the origin:

$Y_i = \beta x_i + e_i$, $e_i \sim N(0, \sigma^2)$, $i = 1, 2, \ldots, n$

(i) Show that the least squares estimator of $\beta$ is given by:

$$\hat{\beta} = \frac{\sum x_i Y_i}{\sum x_i^2}$$

(ii) Derive the bias and mean square error of $\hat{\beta}$ under this model.

QUESTION 37.
A life assurance company is examining the force of mortality, $\mu_x$, of a particular group of policyholders. It is
thought that it is related to the age, x, of the policyholders by the formula:

$$\mu_x = Bc^x$$

It is decided to analyse this assumption by using the linear regression model:


$Y_i = \alpha + \beta x_i + \varepsilon_i$ where $\varepsilon_i \sim N(0, \sigma^2)$ are independently distributed

The summary results for eight ages were as follows:

Age, x 30 32 34 36 38 40 42 44

Force of mortality, $\mu_x$ ($\times 10^{-4}$) 5.84 6.10 6.48 7.05 7.87 9.03 10.56 12.66

$\ln \mu_x$ (3 sf) –7.45 –7.40 –7.34 –7.26 –7.15 –7.01 –6.85 –6.67

$\sum x_i = 296$, $\sum x_i^2 = 11{,}120$, $\sum \ln \mu_{x_i} = -57.129$, $\sum (\ln \mu_{x_i})^2 = 408.50$, $\sum x_i \ln \mu_{x_i} = -2{,}104.5$

(i) (a) Apply a transformation to the original formula, $\mu_x = Bc^x$, to make it suitable for analysis by linear
regression.

(b) Write down expressions for Y, $\alpha$ and $\beta$ in terms of $\mu_x$, B and c using the transformation given in
part (i)(a).

The graph of $\ln \mu_x$ against the age of the policyholder, x, is shown below.

(ii) Comment on the suitability of the regression model and state how this supports the transformation in
part (i)(a).

(iii) Use the data to calculate least squares estimates of B and c in the original formula.

(iv) (a) Calculate the coefficient of determination between $\ln \mu_x$ and x.

(b) Hence comment on the fit of the model to the data.


(c) Complete the table of residuals below.
(d) Comment on the fit by considering the residuals.

Age, x 30 32 34 36 38 40 42 44

Residual, $\hat{e}_i$ 0.08 –0.03 –0.06 0.02 0.09

(v) (a) Calculate a 95% confidence interval for the mean predicted response $\ln \mu_{35}$.

(b) Hence obtain a 95% confidence interval for the mean predicted value of $\mu_{35}$.
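
For part (iii), one possible route is to fit a straight line to $\ln \mu_x$ against x and transform the coefficients back. The Python sketch below assumes the log transformation $\ln \mu_x = \ln B + x \ln c$, so that the intercept estimates $\ln B$ and the slope estimates $\ln c$. Variable names are illustrative.

```python
import math

# Summary statistics from the question, with y = ln(mu_x)
n = 8
sum_x, sum_x2 = 296, 11120
sum_y = -57.129
sum_xy = -2104.5

s_xx = sum_x2 - sum_x ** 2 / n
s_xy = sum_xy - sum_x * sum_y / n

# Least squares fit of ln(mu_x) on x
beta_hat = s_xy / s_xx
alpha_hat = sum_y / n - beta_hat * sum_x / n

# Back-transform: intercept = ln B and slope = ln c under the assumed transformation
B_hat = math.exp(alpha_hat)
c_hat = math.exp(beta_hat)
print(f"alpha_hat = {alpha_hat:.4f}, beta_hat = {beta_hat:.5f}")
print(f"B = {B_hat:.4e}, c = {c_hat:.4f}")
```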

QUESTION 38.
The government of a country suffering from hyperinflation has sponsored an economist to monitor the
price of a ‘basket’ of items in the population's staple diet over a one-year period. As part of his study, the
economist selected six days during the year and on each of these days visited a single nightclub, where he
recorded the price of a pint of lager. His report showed the following prices:

Day(i) 8 29 57 92 141 148

Price ($P_i$) 15 17 22 51 88 95

$\ln P_i$ 2.7081 2.8332 3.0910 3.9318 4.4773 4.5539

$\sum i = 475$, $\sum i^2 = 54{,}403$, $\sum \ln P_i = 21.5953$, $\sum (\ln P_i)^2 = 81.1584$, $\sum i \ln P_i = 1{,}947.020$

The economist believes that the price of a pint of lager in a given bar on day i can be modelled by:

$\ln P_i = a + bi + e_i$

where a and b are constants and the $e_i$'s are uncorrelated $N(0, \sigma^2)$ random variables.

(i) Estimate the values of a, b and $\sigma^2$.

(ii) Calculate the linear correlation coefficient r.

(iii) Calculate a 99% confidence interval for b.

(iv) Determine a 95% confidence interval for the average price of a pint of lager on day 365:

(a) in the country as a whole

(b) in a randomly selected bar.

QUESTION 39.

(i) Show that the maximum likelihood estimates (MLEs) of $\alpha$ and $\beta$ in the simple linear regression model
are identical to the least squares estimates.

(ii) Show that the MLE of $\sigma^2$ has a different denominator from the least squares estimate.

QUESTION 40.

The effectiveness of a tablet containing $x_1$ mg of drug 1 and $x_2$ mg of drug 2 is being tested. In trials the
following results are obtained:

% effectiveness, y x1 x2

92.5 50.9 20.8

94.9 54.1 16.9

89.3 47.3 25.2

94.1 45.1 49.7

98.9 37.6 95.2

$\sum y = 469.7$, $\sum x_1 = 235$, $\sum x_2 = 207.8$, $\sum x_1^2 = 11{,}202.68$, $\sum x_2^2 = 12{,}886.42$

$\sum yx_1 = 22{,}028.78$, $\sum yx_2 = 19{,}870.22$, $\sum x_1 x_2 = 8{,}985.96$

(i) Using the multiple linear least square regression model:

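$$y = \alpha + \beta_1 x_1 + \beta_2 x_2 + e$$

(a) Show that the least squares estimates of $\alpha$, $\beta_1$ and $\beta_2$ satisfy:

$$\sum y_i = n\alpha + \beta_1 \sum x_{i1} + \beta_2 \sum x_{i2}$$
$$\sum y_i x_{i1} = \alpha \sum x_{i1} + \beta_1 \sum x_{i1}^2 + \beta_2 \sum x_{i2} x_{i1}$$
$$\sum y_i x_{i2} = \alpha \sum x_{i2} + \beta_1 \sum x_{i1} x_{i2} + \beta_2 \sum x_{i2}^2$$

(b) Hence, using the above data, show that the fitted model is:

$$\hat{y} = 25.31 + 1.194x_1 + 0.3015x_2$$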

(ii) Comment on the significance of the parameters by considering the following output from R for this
model.

Coefficients:
            Estimate  Std. Error  t value  Pr(>|t|)
(Intercept) 25.308441 2.002062 12.64 0.006200 **
drug_1 1.193671 0.036592 32.62 0.000938 ***
drug_2 0.301468 0.007048 42.77 0.000546 ***
The coefficient of determination for the fitted model is $R^2 = 0.9992$.
(iii) Calculate the adjusted $R^2$.
The ANOVA table for the model is

Source of variation Degrees of Freedom Sum of Squares Mean Sum of Squares

Regression 2 49.1137 *

Residual 2 0.0383 *

Total 4 49.152

(iv) Calculate the missing values and the F statistic, then carry out the F test, stating the conclusion clearly.

(v) Calculate the percentage effectiveness for a tablet containing 51.3 mg of drug $x_1$ and 18.3 mg of drug $x_2$.

The plot of the residuals against the fitted values and the Q-Q plot of the residuals are given below.

(vi) Comment on the fit of the model, making reference to the plots given above.
It is thought that the two drugs might have an interactive effect.
(vii) (a) Explain what this means.
(b) Write down the formula for the regression model that has the two drugs as main effects and also
their interaction.
The model in part (vii)(b) has an adjusted $R^2$ of 0.9969.
(c) Comment on whether the new model is an improvement.
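
The numerical parts of this question can be checked with the Python sketch below. It solves the three normal equations of part (i)(a) by Gaussian elimination, then derives the mean squares, the F statistic and the adjusted $R^2$ from the ANOVA table. The helper `solve_linear_system` is a hypothetical function written for this illustration, and the adjusted $R^2$ uses the common definition $1 - MS_{RES}/(SS_{TOT}/(n-1))$.

```python
def solve_linear_system(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    size = len(b)
    # Augmented matrix, so row operations act on both sides
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(size):
        # Bring the row with the largest pivot into position
        pivot = max(range(col, size), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for row in range(col + 1, size):
            factor = m[row][col] / m[col][col]
            for k in range(col, size + 1):
                m[row][k] -= factor * m[col][k]
    # Back substitution
    x = [0.0] * size
    for row in range(size - 1, -1, -1):
        tail = sum(m[row][k] * x[k] for k in range(row + 1, size))
        x[row] = (m[row][size] - tail) / m[row][row]
    return x


# Summations given in the question (n = 5 trials)
n = 5
sum_y, sum_x1, sum_x2 = 469.7, 235.0, 207.8
sum_x1_sq, sum_x2_sq = 11202.68, 12886.42
sum_y_x1, sum_y_x2, sum_x1_x2 = 22028.78, 19870.22, 8985.96

# Normal equations written as a 3 x 3 linear system
coefficients = [
    [n, sum_x1, sum_x2],
    [sum_x1, sum_x1_sq, sum_x1_x2],
    [sum_x2, sum_x1_x2, sum_x2_sq],
]
right_hand_side = [sum_y, sum_y_x1, sum_y_x2]
alpha_hat, beta1_hat, beta2_hat = solve_linear_system(coefficients, right_hand_side)
print(f"y = {alpha_hat:.2f} + {beta1_hat:.4f} x1 + {beta2_hat:.4f} x2")

# ANOVA table values given in the question
k = 2                              # number of explanatory variables
ss_reg, ss_res = 49.1137, 0.0383
ss_tot = ss_reg + ss_res

ms_reg = ss_reg / k                # regression mean square
ms_res = ss_res / (n - k - 1)      # residual mean square
f_stat = ms_reg / ms_res           # compared with F on (k, n - k - 1) degrees of freedom

r_squared = ss_reg / ss_tot
adjusted_r_squared = 1 - ms_res / (ss_tot / (n - 1))
print(f"F = {f_stat:.1f}, R^2 = {r_squared:.4f}, adjusted R^2 = {adjusted_r_squared:.4f}")
```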

CHAPTER

GENERALISED LINEAR MODELS


QUESTION 41.
The table below shows the value of the linear predictor    1  2   1x1  2x2 for different values of
age ($x_1$) and duration ($x_2$).

Linear predictor, $\eta$ Age ($x_1$) Duration ($x_2$)

35 20 0

37 20 1

45 30 0

55 30 5

Show that it is impossible to individually estimate all the parameters in the linear predictor.

QUESTION 42.
In UK motor insurance business, vehicle-rating group is also used as a factor. Vehicles are divided into
twenty categories numbered 1 to 20, with group 20 including those vehicles that are most expensive to repair.
Suppose that we have a three-factor model specified as age*(sex + vehicle group). Determine the linear
predictor for a model of this type.

QUESTION 43.
Claim amounts for medical insurance claims for hamsters are believed to have an exponential distribution
with mean $\mu_i$:

$$f(y_i) = \frac{1}{\mu_i} e^{-y_i/\mu_i} = \exp\left\{-\frac{y_i}{\mu_i} - \log \mu_i\right\}$$

We have the following data for hamsters’ medical claims, using the model above:

age $x_i$ (months) 4 8 10 11 17

claim amount (£) 50 52 119 41 163

The insurer believes that a linear function of age affects the claim amount:

$$\eta_i = \alpha + \beta x_i$$

Using the canonical link function, write down (but do not try to solve) the equations satisfied by the maximum
likelihood estimates for $\alpha$ and $\beta$, based on the above data.

QUESTION 44.

Claim amounts for medical insurance claims for hamsters are believed to have an exponential distribution
with mean $\mu_i$:

$$f(y_i) = \frac{1}{\mu_i} e^{-y_i/\mu_i} = \exp\left\{-\frac{y_i}{\mu_i} - \log \mu_i\right\}$$

We are given the following data for hamsters' medical claims, using the model above:

age $x_i$ (months) 4 8 10 11 17

claim amount (£) 50 52 119 41 163

The insurer believes that a model with 5 categories for age is sufficiently accurate:

$$\eta_i = \alpha_i, \quad i = 1, 2, 3, 4, 5$$

Using the canonical link function, show that the fitted values $\{\hat{\mu}_i\}$ are the observed claim amounts, $y_i$.

QUESTION 45.

Explain the difference between the two types of covariate: a variable and a factor

QUESTION 46.

A random variable Y has density of exponential family form:

$$f(y) = \exp\left\{\frac{y\theta - b(\theta)}{a(\varphi)} + c(y, \varphi)\right\}$$

(i) State the mean and variance of Y in terms of $b(\theta)$ and its derivatives and $a(\varphi)$.

(ii) (a) Show that an exponentially distributed random variable with mean $\mu$ has a density that can be
written in the above form.
(b) Determine the natural parameter and the variance function.

QUESTION 47.
An insurer wishes to use a generalised linear model to analyse the claim numbers on its motor portfolio. It
has collected the following data on claim numbers $y_i$, $i = 1, 2, \ldots, 35$, from three different classes of policy:

Class I 1 2 0 2 1 0 0 2 2 1

Class II 1 0 1 1 0

Class III 0 0 0 0 0 1 0 1 0 0

1 0 1 0 0 0 0 0 0 0

For these data values:


$\sum_{i=1}^{10} y_i = 11$, $\sum_{i=11}^{15} y_i = 3$, $\sum_{i=16}^{35} y_i = 4$

The company wishes to use a Poisson model to analyse these data.


(i) Show that the Poisson distribution is a member of the exponential family of distributions.
The insurer decides to use a model (Model A) for which:

$$\log \mu_i = \begin{cases} \alpha & i = 1, 2, \ldots, 10 \\ \beta & i = 11, 12, \ldots, 15 \\ \gamma & i = 16, 17, \ldots, 35 \end{cases}$$

where $\mu_i$ is the mean of the relevant Poisson distribution.

(ii) Derive the likelihood function for this model, and hence find the maximum likelihood estimates for
$\alpha$, $\beta$ and $\gamma$.

The insurer now analyses the simpler model $\log \mu_i = \alpha$, for all policies.

(iii) Calculate the maximum likelihood estimate for $\alpha$ under this model (Model B).

(iv) (a) Show that the scaled deviance for Model A is 24.93.

(b) Calculate the scaled deviance for Model B.

It can be assumed that f(y) = y log y is equal to zero when y = 0.

(v) Compare Model A directly with Model B, by calculating an appropriate test statistic.
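
The deviance calculations in parts (iv) and (v) can be checked with the Python sketch below. It assumes the usual Poisson scaled deviance $2\sum\{y_i \log(y_i/\hat{\mu}_i) - (y_i - \hat{\mu}_i)\}$, with $y \log y$ taken as zero when y = 0 as stated in the question, and fitted means equal to the class averages (Model A) or the overall average (Model B). The function name is illustrative.

```python
import math

# Claim numbers by class, as tabulated in the question
class_1 = [1, 2, 0, 2, 1, 0, 0, 2, 2, 1]
class_2 = [1, 0, 1, 1, 0]
class_3 = [0, 0, 0, 0, 0, 1, 0, 1, 0, 0,
           1, 0, 1, 0, 0, 0, 0, 0, 0, 0]
observed = class_1 + class_2 + class_3


def poisson_scaled_deviance(y_values, fitted_means):
    """Poisson scaled deviance, with y*log(y) treated as 0 when y = 0."""
    total = 0.0
    for y, mu in zip(y_values, fitted_means):
        log_term = y * math.log(y / mu) if y > 0 else 0.0
        total += log_term - (y - mu)
    return 2 * total


# Model A: a separate Poisson mean for each class (fitted by the class average)
fitted_a = []
for group in (class_1, class_2, class_3):
    fitted_a += [sum(group) / len(group)] * len(group)

# Model B: a single Poisson mean for all policies
overall_mean = sum(observed) / len(observed)
fitted_b = [overall_mean] * len(observed)

deviance_a = poisson_scaled_deviance(observed, fitted_a)
deviance_b = poisson_scaled_deviance(observed, fitted_b)

# Model A has two more parameters than Model B, so the difference in
# deviances is compared with chi-squared on 2 degrees of freedom
difference = deviance_b - deviance_a
print(f"Model A: {deviance_a:.2f}, Model B: {deviance_b:.2f}, difference: {difference:.2f}")
```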

QUESTION 48.

In the context of generalised linear models, consider the exponential distribution with density function f(x),
where:

$$f(x) = \frac{1}{\mu} e^{-x/\mu}, \quad x > 0$$

(i) Show that f(x) can be written in the form of the exponential family of distributions.

(ii) Show that the canonical link function, $\theta$, is given by $\theta = \frac{1}{\mu}$.

(iii) Determine the variance function and the dispersion parameter.

QUESTION 49.

The random variable $Z_i$ has a binomial distribution with parameters n and $\mu_i$, where $0 < \mu_i < 1$.

A second random variable, $Y_i$, is defined as $Y_i = Z_i / n$.

(i) Show that the distribution of $Y_i$ is a member of the exponential family, stating clearly the natural and
scale parameters and their functions $a(\varphi)$, $b(\theta)$ and $c(y, \varphi)$.

(ii) Determine the variance function of $Y_i$.

QUESTION 50.
A statistical distribution is said to be a member of the exponential family if its probability function or probability density function can be expressed in the form:

$$f_Y(y; \theta, \phi) = \exp\left[\frac{y\theta - b(\theta)}{a(\phi)} + c(y, \phi)\right]$$

(i) Show that the mean of such a distribution is $b'(\theta)$ and derive the corresponding formula for the variance by differentiating the following expression with respect to $\theta$:

$$\int_y f(y)\,dy = 1$$

(ii) Use this method to determine formulae for the mean and variance of the gamma distribution with density function:

$$f(x) = \frac{\alpha^{\alpha}}{\mu^{\alpha}\,\Gamma(\alpha)}\, x^{\alpha - 1} e^{-\alpha x/\mu}, \quad x > 0$$

QUESTION 51.
Independent claim amounts $Y_1, Y_2, \ldots, Y_n$ are modelled as exponential random variables with $E(Y_i) = \mu_i$, $i = 1, 2, \ldots, n$. The fitted values for a particular model are denoted by $\hat{\mu}_i$.

Derive an expression for the scaled deviance.

QUESTION 52.
A small insurer wishes to model its claim costs for motor insurance using a simple generalised linear model based on the three factors:

$YO_i$: $i = 1$ for 'young' drivers, $i = 0$ for 'old' drivers
$FS_j$: $j = 1$ for 'fast' cars, $j = 0$ for 'slow' cars
$TC_k$: $k = 1$ for 'town' areas, $k = 0$ for 'country' areas


The insurer is considering three possible models for the linear predictor:
Model 1: YO + FS + TC
Model 2: YO + FS + YO.FS + TC
Model 3: YO*FS*TC

(i) Write each of these models in parameterised form, stating how many non-zero parameter values are present in each model.

(ii) Explain why Model 1 might not be appropriate and why the insurer may wish to avoid using Model 3.

The student fitting the models has said 'We are assuming a normal error structure and we are using the canonical link function.'

(iii) Explain what this means.

The table below shows the student's calculated values of the scaled deviance for these three models and the constant model.

Model                    Scaled deviance   Degrees of freedom
1                        50                7
YO + FS + TC             10
YO + FS + YO.FS + TC     5
YO*FS*TC                 0

(iv) (a) Complete the table by filling in the missing entries in the degrees of freedom column.
     (b) Carry out the calculations necessary to determine which model would be the most appropriate.

QUESTION 53.
The following study was carried out into the mortality of leukaemia sufferers. A white blood cell count was taken from each of 17 patients and their survival times were recorded.

Suppose that $Y_i$ represents the survival time (in weeks) of the ith patient and $x_i$ represents the logarithm (to the base 10) of the ith patient's initial white blood cell count (i = 1, 2, ..., 17).

The response variables $Y_i$ are assumed to be exponentially distributed. A possible specification for $E(Y_i)$ is $E(Y_i) = \exp(\alpha + \beta x_i)$. This will ensure that $E(Y_i)$ is non-negative for all values of $x_i$.

(i) Write down the natural link function associated with the linear predictor $\eta_i = \alpha + \beta x_i$.


(ii) Use this link function and linear predictor to derive the equations that must be solved in order to obtain the maximum likelihood estimates of $\alpha$ and $\beta$.

The maximum likelihood estimate of $\alpha$ derived from the experimental data is $\hat{\alpha} = 8.477$, with estimated standard error 1.655.

(iii) Construct an approximate 95% confidence interval for $\alpha$ and interpret this result.

The following two models are now to be compared:

Model 1: $E(Y_i) = \alpha$

Model 2: $E(Y_i) = \alpha + \beta x_i$

The scaled deviance for Model 1 is found to be 26.282 and the scaled deviance for Model 2 is 19.457.

(iv) Test the null hypothesis that $\beta = 0$ against the alternative hypothesis that $\beta \neq 0$, stating any conclusions clearly.
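
The comparison in part (iv) can be sketched numerically. This is only an illustration, assuming the usual approximation that the difference in scaled deviances between nested models follows a chi-squared distribution with degrees of freedom equal to the number of extra parameters (here 1). The helper name `chi2_sf_1df` is hypothetical, and the 5% critical value 3.841 is taken from standard tables.

```python
import math

def chi2_sf_1df(x: float) -> float:
    """Upper-tail probability P(X > x) for a chi-squared variable with 1 degree of freedom."""
    # If Z ~ N(0, 1) then Z^2 ~ chi-squared(1), so P(Z^2 > x) = erfc(sqrt(x / 2)).
    return math.erfc(math.sqrt(x / 2.0))

deviance_model_1 = 26.282   # scaled deviance quoted in the question
deviance_model_2 = 19.457
test_statistic = deviance_model_1 - deviance_model_2   # Model 2 has one extra parameter
p_value = chi2_sf_1df(test_statistic)
critical_5pct = 3.841       # upper 5% point of chi-squared(1), from standard tables

print(f"test statistic = {test_statistic:.3f}, p-value = {p_value:.4f}")
print("reject H0 at 5%" if test_statistic > critical_5pct else "do not reject H0 at 5%")
```

A test statistic above the critical value would point towards the white blood cell count being a useful explanatory variable, subject to the approximation stated above.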


ASSIGNMENT 5
CHAPTER

BAYESIAN STATISTICS
QUESTION 1.

Three manufacturers supply clothing to a retailer. 60% of the stock comes from Manufacturer 1, 30% from
Manufacturer 2 and 10% from Manufacturer 3. 10% of the clothing from Manufacturer 1 is faulty, 5% from
Manufacturer 2 is faulty and 15% from Manufacturer 3 is faulty.

What is the probability that a faulty garment comes from Manufacturer 3?
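
A calculation of this kind is a direct application of Bayes' theorem with a discrete prior. The sketch below is illustrative only; the function name `posterior` is hypothetical, and the inputs are the supply shares and fault rates quoted in the question.

```python
def posterior(priors, likelihoods):
    """Posterior probabilities from prior probabilities and likelihoods (Bayes' theorem)."""
    joint = [p * l for p, l in zip(priors, likelihoods)]
    total = sum(joint)   # overall probability of the observed event
    return [j / total for j in joint]

shares = [0.60, 0.30, 0.10]        # proportion of stock from Manufacturers 1, 2, 3
fault_rates = [0.10, 0.05, 0.15]   # P(faulty | manufacturer)
post = posterior(shares, fault_rates)

for k, prob in enumerate(post, start=1):
    print(f"P(Manufacturer {k} | faulty) = {prob:.4f}")
```

The same function applies to any problem with a finite set of hypotheses, since only the prior and likelihood lists change.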

QUESTION 2.

The annual number of claims arising from a particular group of policies follows a Poisson distribution with mean $\lambda$. The prior distribution of $\lambda$ is exponential with mean 30.

In the previous two years, the numbers of claims arising from the group were 28 and 26, respectively.

Determine the posterior distribution of $\lambda$.

QUESTION 3.

Suppose that $X_1, X_2, \ldots, X_n$ is a random sample from a Type 1 geometric distribution with parameter p, where p is a random variable.

Determine a family of distributions for p that would result in conjugate prior and posterior distributions.


QUESTION 4.

The number of claims received per week from a certain portfolio has a Poisson distribution with mean $\lambda$. The prior distribution of $\lambda$ is as follows:

$\lambda$            1     2     3
Prior probability    0.3   0.5   0.2

Given that 3 claims were received last week, determine the posterior distribution of $\lambda$.
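
With a discrete prior, the posterior is proportional to prior probability times Poisson likelihood for each candidate mean. The following is a minimal sketch of that calculation using the figures from the question; the helper `poisson_pmf` is a hypothetical name introduced for illustration.

```python
import math

def poisson_pmf(k: int, mean: float) -> float:
    """Poisson probability of observing k events when the mean is `mean`."""
    return math.exp(-mean) * mean**k / math.factorial(k)

means = [1, 2, 3]         # candidate values of the Poisson mean
prior = [0.3, 0.5, 0.2]   # prior probabilities from the table
observed = 3              # claims received last week

weights = [p * poisson_pmf(observed, m) for p, m in zip(prior, means)]
total = sum(weights)
post = [w / total for w in weights]

for m, prob in zip(means, post):
    print(f"P(mean = {m} | 3 claims) = {prob:.4f}")
```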

QUESTION 5.

A random sample of size 10 from a Poisson distribution with mean $\lambda$ yields the following data values:

3, 4, 3, 1, 5, 5, 2, 3, 3, 2

The prior distribution of $\lambda$ is Gamma(5,2).

Calculate the Bayesian estimate of $\lambda$ under

(i) squared error loss

(ii) absolute error loss

(iii) all-or-nothing loss
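
The three loss functions correspond to the posterior mean, median and mode respectively. The sketch below assumes that Gamma(5,2) means shape 5 and rate 2 (the parameterisation used in the actuarial Tables), so that the posterior is gamma with shape 5 plus the sum of the data and rate 2 plus the sample size. Because the posterior shape is an integer, the median can be found from the closed-form Erlang CDF by bisection; the helper names are hypothetical.

```python
import math

def erlang_cdf(x: float, shape: int, rate: float) -> float:
    """CDF of a gamma distribution with integer shape (an Erlang distribution)."""
    if x <= 0:
        return 0.0
    term = math.exp(-rate * x)   # j = 0 term of the Poisson sum
    total = term
    for j in range(1, shape):
        term *= rate * x / j     # builds exp(-rate*x) * (rate*x)^j / j!
        total += term
    return 1.0 - total

def median_by_bisection(shape: int, rate: float, lo=0.0, hi=100.0, tol=1e-10) -> float:
    """Solve erlang_cdf(x) = 0.5 by bisection on [lo, hi]."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if erlang_cdf(mid, shape, rate) < 0.5:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2

data = [3, 4, 3, 1, 5, 5, 2, 3, 3, 2]
prior_shape, prior_rate = 5, 2           # assumed shape/rate reading of Gamma(5,2)
post_shape = prior_shape + sum(data)     # gamma-Poisson conjugate update
post_rate = prior_rate + len(data)

post_mean = post_shape / post_rate                          # squared error loss
post_median = median_by_bisection(post_shape, post_rate)    # absolute error loss
post_mode = (post_shape - 1) / post_rate                    # all-or-nothing loss

print(post_shape, post_rate, post_mean, post_median, post_mode)
```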

QUESTION 6.

A random sample of size 15 from a normal distribution with mean $\mu$ and standard deviation 3 yields the following data values:

10.75 –0.29 5.37 6.68 8.77 1.69 7.12 4.89 6.45 4.27 9.37 5.68 3.87 7.70 6.98

The prior distribution of $\mu$ is $N(5, 2^2)$.

Calculate an equal-tailed 95% Bayesian credible interval for $\mu$ based on these data values. You are given that the posterior distribution of $\mu$ is $N(5.83, 0.722^2)$.
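
As a check on the quoted posterior, the standard normal-normal conjugate update (known sampling variance, normal prior on the mean) can be reproduced numerically and the equal-tailed interval read off from the posterior quantiles. This is a sketch under those assumptions, not a model solution; variable names are illustrative.

```python
from statistics import NormalDist, fmean

data = [10.75, -0.29, 5.37, 6.68, 8.77, 1.69, 7.12, 4.89,
        6.45, 4.27, 9.37, 5.68, 3.87, 7.70, 6.98]
sigma = 3.0                       # known sampling standard deviation
prior_mean, prior_sd = 5.0, 2.0   # prior N(5, 2^2)

n = len(data)
post_precision = n / sigma**2 + 1 / prior_sd**2
post_mean = (n * fmean(data) / sigma**2 + prior_mean / prior_sd**2) / post_precision
post_sd = post_precision ** -0.5

posterior = NormalDist(post_mean, post_sd)
lower, upper = posterior.inv_cdf(0.025), posterior.inv_cdf(0.975)
print(f"posterior N({post_mean:.2f}, {post_sd:.3f}^2), 95% interval ({lower:.2f}, {upper:.2f})")
```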


QUESTION 7.

The punctuality of trains has been investigated by considering a number of train journeys. In the sample,
60% of trains had a destination of Manchester, 20% Edinburgh and 20% Birmingham. The probabilities of
a train arriving late in Manchester, Edinburgh or Birmingham are 30%, 20% and 25%, respectively.

A late train is picked at random from the group under consideration.


Calculate the probability that it terminated in Manchester.

QUESTION 8.
A random variable X has a Poisson distribution with mean $\lambda$, which is initially assumed to have a chi-squared distribution with 4 degrees of freedom.

Determine the posterior distribution of $\lambda$ after observing a single value x of the random variable X.

QUESTION 9.
The number of claims in a week arising from a certain group of insurance policies has a Poisson distribution with mean $\lambda$. Seven claims were incurred in the last week.

The prior distribution of $\lambda$ is uniform on the integers 8, 10 and 12.

(i) Determine the posterior distribution of $\lambda$.

(ii) Calculate the Bayesian estimate of $\lambda$ under squared error loss.

QUESTION 10.
For the estimation of a population proportion p, a sample of n is taken and yields x successes. A suitable
prior distribution for p is beta with parameters 4 and 4.
(i) Show that the posterior distribution of p given x is beta and specify its parameters.
11 successes are observed in a sample of size 25.
(ii) Calculate the Bayesian estimate under all-or-nothing (0/1) loss.

QUESTION 11.
The annual number of claims from a particular risk has a Poisson distribution with mean $\mu$. The prior distribution for $\mu$ has a gamma distribution with $\alpha = 2$ and $\lambda = 5$.


Claim numbers $x_1, \ldots, x_n$ over the last n years have been recorded.

(i) Show that the posterior distribution is gamma and determine its parameters.

Now suppose that $n = 8$ and $\sum_{i=1}^{8} x_i = 5$.

(ii) Determine the Bayesian estimate for $\mu$ under:

(a) squared-error loss

(b) all-or-nothing loss

(c) absolute error loss.

(iii) Calculate a 95% equal-tailed credible interval for $\mu$.

QUESTION 12.
A single observation, x, is drawn from a distribution with the probability density function:

$$f(x \mid \theta) = \begin{cases} \dfrac{1}{\theta} & 0 < x < \theta \\ 0 & \text{otherwise} \end{cases}$$

The prior PDF of $\theta$ is given by:

$$f(\theta) = \theta \exp(-\theta), \quad \theta > 0$$

Derive an expression in terms of x for the Bayesian estimate of $\theta$ under absolute error loss.

QUESTION 13.
A proportion p of packets of a rather dull breakfast cereal contain an exciting toy (independently from packet to packet). An actuary has been persuaded by his children to begin buying packets of this cereal. His prior beliefs about p before opening any packets are given by a uniform distribution on the interval [0,1]. It turns out the first toy is found in the $n_1$th packet of cereal.

(i) Determine the posterior distribution of p after the first toy is found.

A further toy was found after opening another $n_2$ packets, another toy after opening another $n_3$ packets and so on until the fifth toy was found after opening a grand total of $n_1 + n_2 + n_3 + n_4 + n_5$ packets.

(ii) Determine the posterior distribution of p after the fifth toy is found.


(iii) Show that the Bayes' estimate of p under quadratic loss is not the same as the maximum likelihood estimate and comment on this result.

QUESTION 14.
An actuary has a tendency to be late for work. If he gets up late then he arrives at work X minutes late
where X is exponentially distributed with mean 15. If he gets up on time then he arrives at work Y minutes
late where Y is uniformly distributed on [0,25]. The office manager believes that the actuary gets up late
one third of the time.
Calculate the posterior probability that the actuary did in fact get up late given that he arrives more than 20
minutes late at work.
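
The structure of this calculation is Bayes' theorem with two hypotheses, where each likelihood is a tail probability: an exponential survival function for a late start and a uniform tail for an on-time start. A minimal sketch, using only the figures stated in the question:

```python
import math

p_late = 1 / 3                               # manager's prior belief that he got up late
p_over_20_given_late = math.exp(-20 / 15)    # P(X > 20) for an exponential with mean 15
p_over_20_given_on_time = (25 - 20) / 25     # P(Y > 20) for a uniform on [0, 25]

numerator = p_late * p_over_20_given_late
evidence = numerator + (1 - p_late) * p_over_20_given_on_time
posterior_late = numerator / evidence

print(f"P(got up late | more than 20 minutes late) = {posterior_late:.4f}")
```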


CHAPTER

CREDIBILITY THEORY
QUESTION 15.
A specialist insurer that provides insurance against breakdown of photocopying equipment calculates its premiums using a credibility formula. Based on the company's recent experience of all models of copiers, the premium for this year should be £100 per machine. The company's experience for a new model of copier, which is considered to be more reliable, indicates that the premium should be £60 per machine.

Given that the credibility factor is 0.75, calculate the premium that should be charged for insuring the new
model.

QUESTION 16.
An insurer is setting the premium rate for the buildings in an industrial estate. Past experience for the estate indicates that a premium rate of £3 per £1,000 sum insured should be charged. The past experience of other similar estates for which the insurer provides cover indicates a premium rate of £5 per £1,000 sum insured. The insurer uses a credibility factor of 75% for this risk.

Calculate the premium rate per £1,000 sum insured.
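
Both of the preceding questions use the basic credibility formula, in which the credibility factor Z weights the risk's own experience and 1 - Z weights the collective (or collateral) experience. The sketch below shows that weighting; the function name `credibility_premium` is hypothetical.

```python
def credibility_premium(z: float, own_experience: float, collective: float) -> float:
    """Credibility-weighted premium: Z * own experience + (1 - Z) * collective experience."""
    if not 0.0 <= z <= 1.0:
        raise ValueError("credibility factor must lie between 0 and 1")
    return z * own_experience + (1.0 - z) * collective

# Photocopier question: the new model's own experience suggests £60, all models suggest £100.
copier_premium = credibility_premium(0.75, 60.0, 100.0)

# Industrial estate question: own experience £3 per £1,000, similar estates £5 per £1,000.
estate_rate = credibility_premium(0.75, 3.0, 5.0)

print(copier_premium, estate_rate)
```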

QUESTION 17.
Claim amounts on a portfolio of insurance policies have an unknown mean $\mu$. Prior beliefs about $\mu$ are described by a distribution with mean $\mu_0$ and variance $\sigma_0^2$. Data are collected from n claims with mean claim amount $\bar{x}$ and variance $s^2$. A credibility estimate of $\mu$ is to be made, of the form:

$$Z\bar{x} + (1 - Z)\mu_0$$

Suggestions for the choice of Z are:

A. $\dfrac{n\sigma_0^2}{n\sigma_0^2 + s^2}$

B. $\dfrac{n\sigma_0^2}{n\sigma_0^2 + n}$

C. $\dfrac{\sigma_0^2}{n + \sigma_0^2}$

Explain whether each suggestion is an appropriate choice for Z.

QUESTION 18.
The total claim amount per annum on a particular insurance policy follows a normal distribution with unknown mean $\mu$ and variance $200^2$. Prior beliefs about $\mu$ are described by a normal distribution with mean 600 and variance $50^2$. Claim amounts $x_1, x_2, \ldots, x_n$ are observed over n years.

(i) State the posterior distribution of $\mu$.

(ii) Show that the mean of the posterior distribution of $\mu$ can be written in the form of a credibility estimate.

Now suppose that n = 5 and that total claims over the five years were 3,400.

(iii) Calculate the posterior probability that $\mu$ is greater than 600.

QUESTION 19.
A statistician wishes to obtain a Bayesian estimate of the mean of an exponential distribution with density function $f(x) = \frac{1}{\mu} e^{-x/\mu}$. He is proposing to use a prior distribution with PDF:

$$f(\mu) = \frac{\theta^{\alpha} e^{-\theta/\mu}}{\mu^{\alpha+1}\,\Gamma(\alpha)}, \quad \mu > 0$$

The mean of this distribution is $\theta / (\alpha - 1)$.

(i) Write down the likelihood function for $\mu$, based on observations $x_1, \ldots, x_n$ from an exponential distribution.

(ii) (a) Determine the form of the posterior PDF for $\mu$.


(b) Hence show that an expression for the Bayesian estimate for $\mu$ under squared error loss is:

$$\hat{\mu} = \frac{\theta + \sum x_i}{n + \alpha - 1}$$

(iii) Show that the Bayesian estimate for $\mu$ can be written in the form of a credibility estimate, giving a formula for the credibility factor.

The statistician decides to use a prior distribution of this form with parameters $\theta = 40$ and $\alpha = 1.5$. You are given the following summary statistics from the sample data:

$$n = 100, \quad \sum x_i = 9{,}826 \quad \text{and} \quad \sum x_i^2 = 1{,}200{,}000$$

(iv) Calculate the Bayesian estimate of $\mu$ and the value of the credibility factor.

(v) Comment on the results obtained in part (iv).

QUESTION 20.
Let $\theta$ denote the proportion of insurance policies in a certain portfolio on which a claim is made. Prior beliefs about $\theta$ are described by a beta distribution with parameters $\alpha$ and $\beta$.

Underwriters are able to estimate the mean $\mu$ and variance $\sigma^2$ of $\theta$.

(i) Express $\alpha$ and $\beta$ in terms of $\mu$ and $\sigma$.

A random sample of n policies is taken and it is observed that claims had arisen on d of them.

(ii) (a) Determine the posterior distribution of $\theta$.

(b) Show that the mean of the posterior distribution can be written in the form of a credibility estimate.

(iii) Show that the credibility factor increases as $\sigma$ increases.

(iv) Comment on the result in part (iii).


CHAPTER

EMPIRICAL BAYES
CREDIBILITY THEORY
QUESTION 21.
The table below shows the aggregate claim amounts (in £m) for an international insurer's fire portfolio for
a 5-year period, together with some summary statistics.

Aggregate claim amount (£m), by country i and year j:

Country i   Year 1   Year 2   Year 3   Year 4   Year 5   $\bar{x}_i$   $\frac{1}{4}\sum_{j=1}^{5}(x_{ij} - \bar{x}_i)^2$
1           48       53       42       50       59       50.4          39.3
2           64       71       64       73       70       68.4          17.3
3           85       54       76       65       90       74.0          215.5
4           44       52       69       55       71       ?             ?

(i) Fill in the missing entries in the last row of the table.

(ii) Estimate the values of $E[m(\theta)]$, $E[s^2(\theta)]$ and $\text{var}[m(\theta)]$ using EBCT Model 1, and hence estimate the credibility factor, Z.

(iii) Calculate the credibility premium for each country using EBCT Model 1.
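
The EBCT Model 1 calculations for this table can be sketched as follows. The sketch assumes the usual Model 1 estimators: the overall mean for E[m(θ)], the average of the within-risk sample variances for E[s²(θ)], and the sample variance of the risk means less E[s²(θ)]/n for var[m(θ)]. Variable names are illustrative only.

```python
from statistics import fmean, variance

# Aggregate claims (£m) by country, years 1 to 5, from the table in the question.
claims = {
    1: [48, 53, 42, 50, 59],
    2: [64, 71, 64, 73, 70],
    3: [85, 54, 76, 65, 90],
    4: [44, 52, 69, 55, 71],
}
n = 5   # years of data per country

risk_means = {i: fmean(x) for i, x in claims.items()}
risk_vars = {i: variance(x) for i, x in claims.items()}   # divisor n - 1

e_m = fmean(list(risk_means.values()))                    # estimate of E[m(theta)]
e_s2 = fmean(list(risk_vars.values()))                    # estimate of E[s^2(theta)]
var_m = variance(list(risk_means.values())) - e_s2 / n    # estimate of var[m(theta)]

z = n / (n + e_s2 / var_m)                                # credibility factor
premiums = {i: z * m + (1 - z) * e_m for i, m in risk_means.items()}

print(risk_means[4], risk_vars[4])
print(e_m, e_s2, var_m, z)
print(premiums)
```

Because Z is the same for every country under Model 1, each premium is pulled towards the overall mean by the same proportion.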

QUESTION 22.
The figures given in the table below are the aggregate claims (in £000s) for each of four risks over a period
of four years.


Year 1 Year 2 Year 3 Year 4

Risk 1 1,892 1,975 2,309 2,278

Risk 2 2,356 2,876 3,002 3,378

Risk 3 2,890 2,489 2,424 2,551

Risk 4 1,662 1,408 1,697 2,034

(i) Assuming that the data satisfy the assumptions of EBCT Model 1, estimate the aggregate claim amount
for Risk 1 in Year 5.

You are now given the following risk volumes:

Year 1 Year 2 Year 3 Year 4 Year 5

Risk 1 300 329 334 346 370

Risk 2 410 425 446 470 461

Risk 3 468 405 397 422 437

Risk 4 227 206 236 259 268

(ii) Use EBCT Model 2 to estimate the aggregate claim amount for Risk 1 in Year 5. The corresponding figure for Risk 2 is 3,050.8.

QUESTION 23.
Consider the following statements made about EBCT Model 1.

(a) $\theta$ represents the 'true' risk premium for a given risk.

(b) The variance of $X_j \mid \theta$ doesn't depend on $\theta$.

(c) None of the random variables or parameters in the model are assumed to have a normal distribution.

Explain whether each of these statements is true or false.


QUESTION 24.
The table below shows the aggregate claim amounts (in £m) for an international insurer's fire portfolio for
a 5-year period, together with some summary statistics.

Aggregate claim amount (£m), by country i and year j:

Country i   Year 1   Year 2   Year 3   Year 4   Year 5
1           48       53       42       50       59
2           64       71       64       73       70
3           85       54       76       65       90
4           44       52       69       55       71

The volumes of business for each country for the insurer are as follows:

Country i   Year 1   Year 2   Year 3   Year 4   Year 5   Year 6
1           12       15       13       16       10       20
2           20       14       22       15       30       25
3           5        8        6        12       4        10
4           22       35       30       16       10       12

Calculate the credibility premium for each country in Year 6 using EBCT Model 2.

QUESTION 25.
An actuary has, for three years, recorded the volume of unsolicited advertising that he receives. He believes that the number of items that he receives follows a Poisson distribution with a mean which varies according to which quarter of the year it is. He has recorded $Y_{ij}$, the number of items received in the ith quarter of the jth year (i = 1,2,3,4 and j = 1,2,3). The actuary wishes to estimate the number of items that he will receive in the first quarter of year four. He has recorded the following data:

Quarter   $Y_{i1}$   $Y_{i2}$   $Y_{i3}$   $\bar{Y}_i = \frac{1}{3}\sum_j Y_{ij}$   $\sum_j (Y_{ij} - \bar{Y}_i)^2$
i = 1     98         117        124        113                                      362
i = 2     82         102        95         93                                       206
i = 3     75         83         88         82                                       86
i = 4     132        152        148        144                                      224

(i) Estimate $Y_{1,4}$, the number of items that the actuary expects to receive in the first quarter of year four, using the assumptions of EBCT Model 1.

The actuary believes that, in fact, the volume of items has been increasing at the rate of 10% per annum.

(ii) Suggest how the approach in (i) can be adjusted to produce a revised estimate taking this growth into account.

(iii) Calculate the maximum likelihood estimate of $Y_{1,4}$ (based on the quarter one data already observed and the 10% pa increase described above).

(iv) Compare the assumptions underlying the approach in (i) and (ii) with those underlying the approach in (iii).

QUESTION 26.
An actuary wishes to analyse the amounts paid by a group of insurers on their respective portfolios of
commercial property insurance policies using the models of Empirical Bayes Credibility Theory.
The actuary obtains the following information about the amounts of claim payments made and the number
of policies sold for each of three different insurers. The data obtained are as follows

                             Year 1    Year 2    Year 3    Year 4
Insurer A   Claims paid      £14.2m    £15.8m    £22.7m    £19.0m
            Policies sold    163       189       252       199
Insurer B   Claims paid      £58.6m    £63.1m    £81.0m    £64.2m
            Policies sold    4,435     4,761     5,576     4,581
Insurer C   Claims paid      £123m     £132m     £161m     £133m
            Policies sold    16,184    17,443    20,102    18,000

(i) Calculate the expected total claim payment to be made by Insurer B in the coming year under EBCT Model 1.

(ii) Calculate the expected payout amount for Insurer B in the coming year using EBCT Model 2, assuming that the expected number of policies sold for the coming year for Insurer B is 4,800.

You may use the summary statistics given below, which have been calculated using the formulae and notation given in the Tables, again working in millions of pounds. Subscripts 1, 2 and 3 refer to Insurers A, B and C respectively.

$$\sum_j P_{1j}(x_{1j} - \bar{x}_1)^2 = 0.014667 \qquad \sum_j P_{1j}(x_{1j} - \bar{x})^2 = 5.106461$$

$$\sum_j P_{2j}(x_{2j} - \bar{x}_2)^2 = 0.006103 \qquad \sum_j P_{2j}(x_{2j} - \bar{x})^2 = 0.336408$$

$$\sum_j P_{3j}(x_{3j} - \bar{x}_3)^2 = 0.003979 \qquad \sum_j P_{3j}(x_{3j} - \bar{x})^2 = 0.292641$$

(iii) Comment on your results.

QUESTION 27.
An actuarial student is using Empirical Bayes Credibility Theory Model 2 to calculate credibility premiums for a group of insurers. The student has analysed the data for six different insurers, using 10 years of past data for each insurer and has obtained the following figures:

$$\sum_{i=1}^{6}\sum_{j=1}^{10} P_{ij} = 1{,}498 \qquad P^* = 18.24$$

The estimated values of $E[m(\theta)]$, $E[s^2(\theta)]$ and $\text{var}[m(\theta)]$ based on the data from the six insurers are 4.00, 62.8 and 42.1, respectively.

The student has just received the following information relating to a seventh insurer (Insurer I), and he wishes to update the estimates of $E[m(\theta)]$, $E[s^2(\theta)]$ and $\text{var}[m(\theta)]$ using the claims data for Insurer I given in the table below:

Year, j                              1     2     3     4     5     6     7     8     9     10
Aggregate claim amount, $y_{ij}$     100   85    90    102   109   106   128   132   150   131
Risk volume, $P_{ij}$                22    24    26    20    25    30    29    35    40    36

(i) (a) Calculate the updated estimates for $E[m(\theta)]$, $E[s^2(\theta)]$ and $\text{var}[m(\theta)]$.

(b) Hence calculate the credibility premium for Insurer I for the coming year, given that Insurer I is
expected to have a risk volume figure for the coming year of 38.

The student also needs a credibility estimate for Insurer K, one of the six insurers included in the original analysis. He knows that, for Insurer K:

$$\sum_{j=1}^{10} y_{Kj} = 986 \quad \text{and} \quad \sum_{j=1}^{10} P_{Kj} = 327$$

(ii) Explain whether the credibility premium for Insurer K (based on the full analysis of the seven insurers) will be greater or less than the corresponding figure for Insurer I (per unit of risk volume).


BASICS ASSIGNMENT
SOLUTIONS
SUMMARISING DATA
ANSWER 1.
£1,263

ANSWER 2.
0.35

ANSWER 3.
(i) 864 years
(ii) 4.4%
(iii) £23,800
(iv) £42,000

ANSWER 4.
76.5 years

ANSWER 5.
(i) 2 claims per week
(ii) 2.5 claims per week

ANSWER 6.
595

ANSWER 7.
3.428571 × 10^9

ANSWER 8.
–2.92
Negatively skewed

ANSWER 9.

ANSWER 10.
(a) –119.498.484
(b) –0.15064

ANSWER 11.
Mean = 0.35
Standard Deviation = 0.67232

ANSWER 12.
(i) 73,184
(ii) 18.57, 4.70
(iii) 74.66 years, 8.29 years

ANSWER 13.
8.8

BASIC PROBABILITY
ANSWER 14.
1/3

ANSWER 15.
(i) 2/11
(ii) 6/11

ANSWER 16.
0.4

ANSWER 17.
(i) 21/20
(ii) 11/50
(iii) 17/50

ANSWER 18.
(i) 0.01008
(ii) 0.992

ANSWER 19.
(i) 0.0758
(ii) 0.0218
(iii) 0.752

ANSWER 20.
0.660

ANSWER 21.
0.06

ANSWER 22.
11/15


RANDOM VARIABLE
ANSWER 23.
Probability function, P(X = x), is:
P(X = 0) = 0.25
P(X = 1) = 0.5
P(X = 2) = 0.25

ANSWER 24.
1/30

ANSWER 25.
$$F_W(w) = \begin{cases} 0 & w < 2 \\ 0.2 & 2 \le w < 4 \\ 0.7 & 4 \le w < 5 \\ 1 & w \ge 5 \end{cases}$$

ANSWER 26.
(i) P(V = 2) = 0.432
(ii) P(V > 1) = 0.784
(iii) P(V < 3) = 0.648

ANSWER 27.
Expectation = 6.3
Standard Deviation = 1.62

ANSWER 28.
(i) 16.3
(ii) 17.5
(iii) 0.285

ANSWER 29.
k = 1/10
P(x < 6) = 0.81
P(x ≥ 6) = 0.19
P(0 < x < 5) = 0.8

ANSWER 30.
¾, 1, 0.2

ANSWER 31.
0.035, 0.037, 0.965

PROBABILITY DISTRIBUTION
ANSWER 32.
(i) 7/64
(ii) 1/256
(iii) 247/256

ANSWER 33.
9, 1/3, 224/2187

ANSWER 34.
0.1114

ANSWER 35.
(i) 0.007
(ii) 0.182

ANSWER 36.
(i) 15.87%
(ii) 2.28%
(iii) 49.72%

ANSWER 37.
(a) 0.0454
(b) 0.0155

ANSWER 38.
0.20469

ANSWER 39.
(a) 0.2606
(b) 0.9622

ANSWER 40.
0.130

ANSWER 41.
0.128

ANSWER 42.
(i) 0.94887
(ii) 0.12604
(iii) 0.86768
(iv) 0.07726


ASSIGNMENT 1
SOLUTIONS
GENERATING FUNCTIONS
ANSWER 1. ANSWER 7.
1 (i) e M X  2t 
3t

ANSWER 2. (ii) 
N 2  3,4
2
 distribution.
1 t 1 t 1 t
 e  2e  2e
t 2t 2t ANSWER 8.
(i) E(Y) = 8
ANSWER 3.
(ii) Sd Dev = 5.6569
1/5, 1/25
(iii) 20643840
ANSWER 4.
ANSWER 9.
Proof t
pe
(i) MU  t   t
ANSWER 5. 1  qe

0.09158 qe
t
(ii) CGF  1  t
1  qe
ANSWER 6.
 t ANSWER 10.
(i) e
t   2 t R
Ke
(i) (a)
(ii) E  X  
1
 2  t 

(b) t<2
1
(iii) V  X   2 2R
 (ii) k  2e


ANSWER 11. ANSWER 12.


 3
 t (i) 1  2t 
(i) 1  
 
(ii) 0.1247
(ii) Proof

JOINT DISTRIBUTION
ANSWER 13. ANSWER 19.
(i) 9/35 0.02

(ii) 1/7 ANSWER 20.


(iii) 1/5 –0.019

ANSWER 14. ANSWER 21.


0.229 Poi      distribution

ANSWER 15. ANSWER 22.


30  v E(W) = 32
fV  v   , –5 < v < 5
300
V(W) = 32
ANSWER 16. ANSWER 23.
x  3y (i) 0.95974
0 x 2
2 1  3y 
(ii) 0.10371

ANSWER 17. ANSWER 24.


36/35 (i) 0.525

ANSWER 18. (ii) 0.4232

(i) E(U) = 140/9 ANSWER 25.


E(V) = 5/18 (i) 1/16

(ii) Both methods are equivalent (ii) 0.398


ANSWER 26. ANSWER 29.


(i) 2(1 – y) 0.0322
(ii) 1/(1 – y)
ANSWER 30.
ANSWER 27. 0.236
2
x  xy ANSWER 31.
(i) 3
, 0 x 2
8 5y
 2y  (i) 3.4
3 6
(ii) 3.8
(ii) 0.3696
ANSWER 32.
ANSWER 28.
93
4 2 1 
(i)  3x  x  for 0  x  1
5 2  ANSWER 33.
3x  y 0.283
(ii) 1
for 0  y  1
3x  2

(iii) –1/450

CONDITIONAL EXPECTATION
ANSWER 34. ANSWER 38.
18 56

ANSWER 35. ANSWER 39.


3x  4 (i) Proof
3 x  1 (ii) E(Y) = 5.5, Std Dev = 4.21

ANSWER 36. ANSWER 40.


5 E(Y) = 2, V(Y) = 4
x 0 x 2
9
ANSWER 41.
ANSWER 37. (i) 0.80556
(i) 1.2 (ii) (5/8) v
(ii) 1.2


ASSIGNMENT 2
SOLUTIONS
THE CENTRAL LIMIT THEOREM
ANSWER 1.
0.00161

ANSWER 2.
0.71634

ANSWER 3.
0.15721

ANSWER 4.
Exact prob – (i) 0.0539
(ii) 0.2064
Normal Approx – (i) 0.0518
(ii) 0.2141

ANSWER 5.
0.840

ANSWER 6.
0.43794

ANSWER 7.
(a) 0.02889
(b) 0.0433

ANSWER 8.
0.06178

ANSWER 9.
0.91356

ANSWER 10.
0.988

ANSWER 11.
0.02275

ANSWER 12.
0.06681

ANSWER 13.
0.85282


SAMPLING AND STATISTICAL INFERENCE


ANSWER 14.
Exp = 750
Var = 8522.7

ANSWER 15.
(i) 0.926
(ii) 0.0253

ANSWER 16.
(i) 0.025
(ii) 0.99
(iii) 0.05
(iv) 0.2244

ANSWER 17.
alpha = 2.3
beta = 0.345

ANSWER 18.
0.28638

ANSWER 19.
(a) 4.765
(b) 0.2711

ANSWER 20.
0.89%

ANSWER 21.
(i) 0.995
(ii) 0.0983
(iii) 4.2%
(iv) 10%

ANSWER 22.
(i) 0.0127
(ii) 0.00234
(iii) 0.9
(iv) 0.998
(v) 0.0114


ASSIGNMENT 3
SOLUTIONS
POINT ESTIMATION
ANSWER 1. ANSWER 8.

0.1940 3.057

ANSWER 2. ANSWER 9.
5.67  2n  1  4
(i)  2 
 n 
ANSWER 3.
(ii) Estimator is consistent
p = 0.2621

n = 14.78 ANSWER 10.


2
ANSWER 4. 
(i)
n
p = 0.7317
(ii) PROOF
k = 4.091
ANSWER 11.
ANSWER 5.
0.2364
(i) ˆ  x

(ii) 2.4 ANSWER 12.

(i) PROOF
ANSWER 6.
(ii) PROOF
0.0443
(iii) 0.624
ANSWER 7.
(iv) 54%
0.102


ANSWER 13.
(i)   1

 2
(ii) 
1

ANSWER 14.
(i) PROOF

 
n n

(ii) FX MAX
 x   F  x    1  1  x 

(iii) PROOF

(iv) 0.01259

1 771 log771
(v)  log771  24  
0
 1  771

This equation cannot be solved algebraically. A numerical method will be needed to solve it.

(vi) We cannot use the usual method of moments approach unless we know all the individual sample values (or at least the mean of the sample). So we do not have sufficient information to use the method of moments approach here.

ANSWER 15.

1 1
(i) Since 0  P  X  x   1 , using this for each of the probabilities gives lower bounds for  of  ,  and
16 6
3 1 7 1 5
 . Hence,    . We also obtain upper bounds for or of  of , and .
8 16 16 6 8

1
Hence   .
6

(ii) 0.0083

(iii) PROOF

(iv) 0.0929


ANSWER 16.

(i) (a) 0.13345

(b) X=0, 87507

X = 1, 11678

X = 2, 779

X = 3, 35

X = 4, 1

X = 5, 0

X > = 6, 0

(ii) PROOF

(iii) (a) p=0.93295, k=1.8569

(b) X=0, 87909

X = 1, 10945

X = 2, 1048

X = 3, 90

X = 4, 7

X = 5, 1

X> = 6, 0

(iv) For a Poisson distribution, the mean and variance are the same. Since the sample mean and variance (which, for a sample as large as this, should be very close to the true values) are 0.13345 and 0.14304, which differ significantly, this suggests that the Poisson distribution may not be a suitable model here. The negative binomial distribution has more flexibility and can accommodate different values for the mean and variance (provided the variance exceeds the mean).


CONFIDENCE INTERVAL
ANSWER 17.
(126.5, 137.5)

ANSWER 18.
(80.0, 184.0)

ANSWER 19.
(i) (2.53, 12.1)
(ii) (0, 10.0)

ANSWER 20.
(116.7, 136.5)

ANSWER 21.
(0.00127, 0.249)

ANSWER 22.
(0.140, 0.220)

ANSWER 23.
(0.151, 0.169)

ANSWER 24.
(–0.168, 5.068)

ANSWER 25.
(0.0908, 0.2203)

ANSWER 26.

(i) (a) (5.48,7.92)

(b) We have assumed that the numbers of hours that actuarial students spend watching television
has a normal distribution.

(ii) (2.66, 10.7)

(iii) (a) For large samples, the confidence interval for the mean will eventually converge on the sample mean, which should be equal to the true mean, whereas the prediction interval will not converge to a single value but down to an interval of the distribution.

(b) Unlike confidence intervals for the mean, which are concerned with the centre of the distribution, prediction intervals take account of the tails as well as the centre. Hence, prediction intervals have greater sensitivity to the assumption of normality.


ANSWER 27.
(0.4735, 0.9968)

ANSWER 28.
(i) (a) B appears to have a slightly smaller spread (but it is hard to tell with so few data points).

The difference in the spread doesn't appear to be significant, so the assumption of equal variances can be allowed to stand.

There are no outliers and so there is nothing to suggest non-normality.

(b) (11.2, 49.0)

(c) (–3.37, 6.00)

(ii) (–1.59, 5.09)

Since this interval contains zero there is insufficient evidence to suggest that A and B give different
valuations.

ANSWER 29.
(i) (a) 0.11494

(b) 0.000661

(c) (0.06457, 0.1653)

(ii) (a) (0.07020, 0.1705)

(b) This confidence interval is narrower as it is based upon the exact result, whereas in part (i) (c) it was based on a relatively small sample of 20. A larger sample would have given a narrower interval.

ANSWER 30.
(i) (a) 14

(b) 11

(ii) (a) 13

(b) 4


(iii) Comment

For the confidence intervals the sample sizes are similar, but larger in the case where less information is known. In general, prediction intervals are wider than confidence intervals and so a larger sample is needed to get the same width. However, in this case, the prediction intervals vary due to the vast difference in the tails of the t distribution.

ANSWER 31.

(7.632, 7.744)

ANSWER 32.

(i) (a) (-0.793, 12.1)

(b) Since the confidence interval contains the value 0, there is insufficient evidence to conclude that
the new screening programme significantly reduces the mean claim amount.

(ii) (a) (0.154, 2.49)

(b) Since the confidence interval contains 1, this means that we are reasonably confident that the
population variances are the same.

(iii) 16

HYPOTHESIS TESTING
ANSWER 33.
(i) 0.2466
(ii) 0.6068

ANSWER 34.
Sensitivity = 70%
Specificity = 80%

ANSWER 35.
(i) 59.9
(ii) 0.864


ANSWER 36.
The average IQ of university students is greater than 100

ANSWER 37.
The long term average annual rainfall has increased from its former level.

ANSWER 38.
the proportion of male carriers in the population is less than 10%.

ANSWER 39.
The true claim frequency is less than 0.175.

ANSWER 40.
The patients on the special diet have the same blood pressure as patients on the normal diet.

ANSWER 41.
(i) the mean performance is greater with the additive than without

(ii) the difference in the means is not equal to 6

ANSWER 42.
There is no difference in the variances of the two populations.

ANSWER 43.
the proportion of claims due to burglaries in the year just ended is not greater than the proportion in the
previous year.

ANSWER 44.
the training course does not increase employees’ efficiency.

ANSWER 45.
We have no evidence that the die is not fair.

ANSWER 46.
there has been no change in the pattern of causes of death.

ANSWER 47.
A Poisson model does not provide a good model for the number of claims.

ANSWER 48.
The underlying distribution is binomial.

ANSWER 49.
(a) No differences among the population proportions have been detected.

(b) The population proportions are not all equal

ANSWER 50.
The level of injury is almost certainly dependent on whether the victim is wearing a seatbelt.

ANSWER 51.
The characteristic is not dependent on the mother’s age.

ANSWER 52.
(i) Sufficient evidence at the 10% level to reject H0

(ii) Insufficient evidence at the 5% level to reject H0

ANSWER 53.
(i) (a) It is reasonable to conclude that $\mu = 15$.

(b) It is reasonable to conclude that $\mu \neq 15$.

(ii) It is reasonable to conclude that $\sigma^2 = 20$.

ANSWER 54.
It is reasonable to conclude that p = 2/3

ANSWER 55.
(i) (a) Sensitivity = 96.1%
(b) Specificity = 90.6%
(ii) (a) 288
(b) 256

ANSWER 56.
10%

ANSWER 57.

(i) We conclude that  2  1

2 2
(ii) We conclude that 1  2

ANSWER 58.
(i) We conclude that  H   C

(ii) We conclude that  D  0

ANSWER 59.
(i) (a) 2.585
(b) conclude that the model is a good fit
(ii) We conclude that   3, ie the patient does have anaemia.

ANSWER 60.
The classification into the three AIDS statuses is not independent of the presence or absence of the alleles.

ANSWER 61.
(i) There is an association between single parent families and being in trouble with the police.
(ii) However, the presence of an association does not justify the politician's assumption that single parents cause crime. There may be some other underlying causes (eg education levels, poverty) that influence family circumstances and crime rates together.

ANSWER 62.
(i) (a) Each house independently must have the same probability of being burgled

(b) 91/600

(c) 37.3, 40.0, 17.9, 4.3, 0.6, 0.0, 0.0

(d) These are very similar to the observed frequencies, which implies that the model is a good fit.

(ii) Conclude that the model is a good fit

ANSWER 63.
(i) There is perhaps some very slight evidence of concentration at the centre of the distribution for A, but
the sample sizes are small and it is difficult to tell whether an assumption of normality is reasonable.
The variance of the data from Company B looks slightly smaller than that from Company A. However,
it is unlikely that such a small difference is significant. There are no outliers in either distribution.
(ii) It is reasonable to conclude that $\sigma_A^2 = \sigma_B^2$.

(iii) The level of premiums charged by Company B is the same as that charged by Company A

(iv) (a) (–0.53, 0.33)

(b) Since this confidence interval contains zero, we cannot conclude that the proportions of premiums in excess of 200 are different for the two companies.

(v) The company has increased its premiums since the previous year.

ASSIGNMENT 4
SOLUTIONS
CORRELATION
ANSWER 1.

Here we can see that there appears to be a strong positive linear relationship. The plotted data points lie roughly in a straight line.

ANSWER 2.

It can clearly be seen that the data display a non-linear relationship, since the rate of change in the interest rate increases with the leverage ratio.

ANSWER 3.
Proof

ANSWER 4.
r = 0.95824

As expected, this is high (close to +1), and indicates a strong positive linear relationship.

ANSWER 5.
r = 0.87108

ANSWER 6.
rs = 0.9636

As expected, the Spearman’s rank correlation coefficient is very high, since it is known from the calculation
of the Pearson’s correlation coefficient that there is a strong positive linear relationship (hence a strong
monotonically increasing relationship).

ANSWER 7.
For the corporate borrowing data, the ranks of the two variables are exactly equal, hence Spearman’s rank correlation coefficient is trivially equal to 1.

The reason that this is materially higher than the equivalent Pearson coefficient is because the non-linearity
of the relationship does not feature in the calculation, only the fact that it is monotonically increasing.

ANSWER 8.
$\tau = 0.8667$

The relatively high value demonstrates the strong correlation between the variables.

ANSWER 9.
For the corporate borrowing data, clearly all the pairs are concordant, and so $\tau$ is trivially equal to 1.

ANSWER 10.
Test statistic = 9.478

ANSWER 11.
p-value = 0.12

ANSWER 12.
Spearman’s rank correlation

Test statistic = 4.654

Kendall’s rank correlation

Test statistic = 3.299

ANSWER 13.
(i) $S_{xx} = 70$, $S_{yy} = 3.015$ and $S_{xy} = 14.3$

(ii) r = 0.984336

There is a strong linear association between gestation period and foetal weight.

(iii) The ranks of the two variables (gestation period and weight) are exactly equal, hence Spearman's
rank correlation coefficient is equal to 1.

This means that all the pairs are concordant, and so $\tau$ is also equal to 1.

(iv) (a) Test statistic = 11.17

(b) Test statistic = 4.193

(v) Test statistic = 1.643
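
The figures in Answer 13 can be cross-checked numerically. The sketch below is illustrative only: it assumes n = 6 observations (the six gestation periods tabulated in Answer 29), recomputes r from the sums of squares in part (i), and then forms the usual t statistic r√(n – 2)/√(1 – r²) and the Fisher transformation statistic. The variable names are chosen here for illustration.

```python
import math

# Sums of squares from part (i); n = 6 is an assumption (six gestation periods).
sxx, syy, sxy = 70.0, 3.015, 14.3
n = 6

# Pearson correlation coefficient
r = sxy / math.sqrt(sxx * syy)

# t statistic for H0: rho = 0, on n - 2 degrees of freedom
t_stat = r * math.sqrt(n - 2) / math.sqrt(1 - r ** 2)

# Fisher transformation: atanh(r) is approximately N(atanh(rho), 1/(n - 3)),
# so under H0: rho = 0 the standardised statistic is atanh(r) * sqrt(n - 3)
z_stat = math.atanh(r) * math.sqrt(n - 3)

print(round(r, 6), round(t_stat, 2), round(z_stat, 3))
```

Under these assumptions the three values should agree with parts (ii), (iv)(a) and (iv)(b) to the precision quoted.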

ANSWER 14.
(i) (a) Pearson correlation coefficient: r = 0.81045

Spearman rank correlation coefficient: rs = 0.95

Kendall rank correlation coefficient: $\tau = 0.83$

(b) Test statistic = 3.660

(ii) There is strong positive correlation between class size and GCSE results (ie bigger classes have better
GCSE results).

However, correlation does not necessarily imply causation, ie whilst bigger classes have better results,
it is not necessarily the class size that causes the improvement.

ANSWER 15.
(i) (a) Since we have tied ranks we cannot use the simplified formula for Spearman or Kendall.

(b) r = 0.85860

(ii) Test statistic = 0.945

LINEAR REGRESSION
ANSWER 16.
There appears to be a strong positive linear relationship and so fitting a linear regression model is appropriate.

ANSWER 17.
If we take logs, the relationship becomes:

$\log Y = \log a + x \log b$

So if we work in terms of the variable $Y' = \log Y$, we have a linear relationship:

$Y' = \log a + x \log b$

ANSWER 18.

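Substituting $\bar{x}$ into the RHS of $\hat{y} = \hat{\alpha} + \hat{\beta} x$ gives:

$\hat{y} = \hat{\alpha} + \hat{\beta} \bar{x}$

Since $\hat{\alpha} = \bar{y} - \hat{\beta} \bar{x}$, it follows that:

$\hat{y} = (\bar{y} - \hat{\beta} \bar{x}) + \hat{\beta} \bar{x} = \bar{y}$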

ANSWER 19.

The fitted regression line is: $\hat{y} = 0.164 + 0.8823x$

Estimated error variance = 0.0732

ANSWER 20.
£325

ANSWER 21.

$SS_{TOT} = 7.1588$

$SS_{REG} = 6.5734$

$SS_{RES} = 0.5854$

ANSWER 22.
0.918

ANSWER 23.
±0.958

ANSWER 24.
(a) (0.668, 1.10)

(b) The 95% two-sided confidence interval in (a) contains the value ‘1’, so the two-sided test in (b) conducted at the 5% level results in H0 not being rejected.

ANSWER 25.
Test statistic = 89.8
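
The quantities in Answers 19 to 25 are linked through the analysis of variance decomposition. The sketch below recomputes them from the sums of squares in Answer 21. It assumes n = 10 data points (the number of x values listed in Answer 27), and the variable names are illustrative.

```python
import math

# Sums of squares quoted in Answer 21; n = 10 is an assumption
# (the number of x values listed in Answer 27).
ss_tot, ss_reg, ss_res = 7.1588, 6.5734, 0.5854
n = 10

r_squared = ss_reg / ss_tot        # coefficient of determination
abs_r = math.sqrt(r_squared)       # magnitude of r; its sign follows the slope
sigma2_hat = ss_res / (n - 2)      # estimated error variance
f_stat = ss_reg / sigma2_hat       # F statistic on (1, n - 2) degrees of freedom

print(round(r_squared, 3), round(abs_r, 3), round(sigma2_hat, 4), round(f_stat, 1))
```

On this reading, the value 89.8 quoted in Answer 25 matches the F ratio of the regression mean square to the residual mean square.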

ANSWER 26.
(a) (£392, £452)

(b) (£353, £492)

ANSWER 27.
The residuals $\hat{e}_i = y_i - \hat{y}_i$ are given in the table below:

x_i     2.10    2.40    2.50    3.20    3.60    3.80    4.10    4.20    4.50    5.00
ê_i     0.163  –0.221   0.171  –0.377   0.330  –0.266   0.239  –0.159   0.246  –0.125

ANSWER 28.
The basic model is:

$E[Y \mid x_1, x_2] = \alpha + \beta_1 x_1 + \beta_2 x_2$

Here $x_1$ represents the number of exam passes, $x_2$ represents the number of years' experience and Y would represent the corresponding salary.

$\alpha$, $\beta_1$ and $\beta_2$ are constants where:

• $\alpha$ reflects the average salary for a new student (with no exam passes or experience)

• $\beta_1$ and $\beta_2$ reflect the changes in pay associated with an extra exam pass and an extra year's experience, respectively.

Since the data relates to 50 (= n) students, we need to introduce an extra subscript i corresponding to the ith student. So the actual salary for the ith student will be:

$Y_i = \alpha + \beta_1 x_{i1} + \beta_2 x_{i2} + e_i$

where $e_i$ is the difference between the student's actual salary and the theoretical salary for someone with the same number of exam passes and experience.
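
As a rough illustration of how such a two-covariate model could be fitted, the sketch below solves the least squares normal equations directly. Everything in it is hypothetical: the data are invented so that salary = 20 + 2 × passes + 1.5 × years holds exactly, and the names alpha, beta1 and beta2 for the intercept and the two coefficients, along with the helper solve_3x3, are chosen here for illustration.

```python
# Hypothetical data, built so that salary = 20 + 2*passes + 1.5*years exactly.
passes = [0, 1, 2, 3, 4, 5]
years = [1, 1, 3, 2, 5, 4]
salary = [20 + 2 * p + 1.5 * y for p, y in zip(passes, years)]


def solve_3x3(a, b):
    """Solve a 3x3 linear system by Gaussian elimination with partial pivoting."""
    m = [row[:] + [rhs] for row, rhs in zip(a, b)]
    for col in range(3):
        pivot = max(range(col, 3), key=lambda k: abs(m[k][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for k in range(col + 1, 3):
            factor = m[k][col] / m[col][col]
            for c in range(col, 4):
                m[k][c] -= factor * m[col][c]
    x = [0.0, 0.0, 0.0]
    for i in (2, 1, 0):
        s = sum(m[i][j] * x[j] for j in range(i + 1, 3))
        x[i] = (m[i][3] - s) / m[i][i]
    return x


n = len(salary)
s1, s2 = sum(passes), sum(years)
s11 = sum(p * p for p in passes)
s22 = sum(y * y for y in years)
s12 = sum(p * y for p, y in zip(passes, years))
sy = sum(salary)
sy1 = sum(v * p for v, p in zip(salary, passes))
sy2 = sum(v * y for v, y in zip(salary, years))

# Normal equations: one row each for the intercept and the two coefficients.
lhs = [[n, s1, s2], [s1, s11, s12], [s2, s12, s22]]
rhs = [sy, sy1, sy2]
alpha, beta1, beta2 = solve_3x3(lhs, rhs)

print(round(alpha, 4), round(beta1, 4), round(beta2, 4))
```

Because the invented data contain no error term, the solver should recover the three coefficients used to build them; with real salary data the fitted values would only approximate the observations.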

ANSWER 29.

(i) (a) $S_{xx} = 70$, $S_{yy} = 3.015$ and $S_{xy} = 14.3$.

(b) The fitted regression line is $\hat{y} = -4.60 + 0.2043x$.

(c) $\hat{\sigma}^2 = 0.0234$.

(ii) 3.98 kg

(iii) (a) $SS_{TOT} = 3.015$

$SS_{REG} = 2.921$

$SS_{RES} = 0.094$

(b) 96.9%

(iv) Test statistic = 11.2

(v) Test statistic = 124.7

(vi) (a) 2.141 kg

(b) (1.99,2.30)

(vii) (a) 2.141 kg

(b) 0.0287

(c) (1.78, 2.50)


(viii) (a) The completed table is:

Gestation period (weeks)     30      32      34      36      38      40
Residual                    0.07   –0.24    0.15    0.05    0.04   –0.07

(b) All values are between $\pm 3\hat{\sigma} = \pm 3\sqrt{0.0234} = \pm 0.46$ so there appear to be no outliers.

There may be possible skewness but it’s difficult to tell with such a small dataset.

(c) The plot appears to be patternless which implies a good fit.

(d) One of the values is way off the diagonal line which indicates that the data set may be non-normal and hence the full normal linear regression model may not be appropriate.
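
Most of the numbers in Answer 29 follow from the three sums of squares in part (i)(a). The sketch below reproduces several of them. Two inputs are assumptions inferred from the answers rather than stated in them: x0 = 33 weeks as the point at which the prediction in parts (vi) and (vii) is made, and a mean gestation period of 35 weeks (the average of 30, 32, ..., 40).

```python
import math

# Sums of squares from part (i)(a); n = 6 matches the six gestation periods in (viii).
sxx, syy, sxy = 70.0, 3.015, 14.3
n = 6

slope = sxy / sxx                    # fitted slope, part (i)(b)
ss_reg = sxy ** 2 / sxx              # regression sum of squares
ss_res = syy - ss_reg                # residual sum of squares
sigma2_hat = ss_res / (n - 2)        # error variance estimate, part (i)(c)
r_squared = ss_reg / syy             # proportion of variation explained
f_stat = ss_reg / sigma2_hat         # ratio of regression to residual mean square

# Assumptions: prediction point x0 = 33 weeks, mean gestation period 35 weeks.
x0, x_bar = 33.0, 35.0
var_mean_response = sigma2_hat * (1 / n + (x0 - x_bar) ** 2 / sxx)
var_individual = sigma2_hat + var_mean_response   # variance for an individual response

outlier_bound = 3 * math.sqrt(sigma2_hat)         # the three-sigma bound in (viii)(b)

print(round(slope, 4), round(sigma2_hat, 4), round(f_stat, 1))
print(round(var_individual, 4), round(outlier_bound, 2))
```

Under these assumptions the individual-response variance should agree with part (vii)(b) and the bound with part (viii)(b).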

ANSWER 30.
(i) (a) 0.66393

(b) Test statistic = 4.184

(ii) (a) 0.71228

(b) 4.184

(iii) These tests are equivalent. Testing whether there is any correlation is equivalent to testing if the slope
is not zero (ie it is sloping upwards and there is positive correlation or it is sloping downwards and
there is negative correlation). So the tests give the same statistic and p-value.
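
The equivalence can also be seen algebraically. The short derivation below is a sketch in standard simple linear regression notation, with $S_{xx}$, $S_{yy}$, $S_{xy}$ the usual sums of squares and $r = S_{xy}/\sqrt{S_{xx} S_{yy}}$. It shows that the t statistic for a zero slope reduces to the t statistic for zero correlation.

```latex
\begin{align*}
\hat{\beta} &= \frac{S_{xy}}{S_{xx}}, \qquad
\hat{\sigma}^2 = \frac{1}{n-2}\left(S_{yy} - \frac{S_{xy}^2}{S_{xx}}\right)
             = \frac{S_{yy}\,(1-r^2)}{n-2}, \\
\frac{\hat{\beta}}{\sqrt{\hat{\sigma}^2 / S_{xx}}}
  &= \frac{S_{xy}}{\sqrt{S_{xx} S_{yy}}} \cdot \frac{\sqrt{n-2}}{\sqrt{1-r^2}}
   = \frac{r\sqrt{n-2}}{\sqrt{1-r^2}}.
\end{align*}
```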

ANSWER 31.
$R^2 = 0.64$

ANSWER 32.
(i) Let $Y_i = y_i$ and $X_i = x_i^2$.

Then the model becomes $Y_i = a + bX_i + e_i$.

(ii) Taking logs gives:

$\ln y_i = \ln a + b x_i$

Let $Y_i = \ln y_i$ and $X_i = x_i$.

Then the model becomes $Y_i = \alpha + \beta X_i$ where $\alpha = \ln a$ and $\beta = b$.

ANSWER 33.

(i) The fitted regression equation of y on x is y = 28.205 + 0.63223x.

(ii) (a) 25.289

(b) (13.8,64.2)

(iii) Test statistic = 5.296

(iv) (56.2,67.2)

(v) (a) 73.7%

(b) 73.7% of the variation is explained by the model, which indicates that the fit is fairly good. It still
might be worthwhile to examine the residuals to double check that a linear model is appropriate

ANSWER 34.

(i) The regression line is:

ŷ  32.47  118.117x

(ii) (a) (67.4,169)

(b) (4350,89100)

(iii) (a) $SS_{TOT} = 925,262$

$SS_{REG} = 837,093$

$SS_{RES} = 88,169$

(b) $R^2 = 90.5\%$

(c) This tells us that 90.5% of the variation in the prices is explained by the model. Since this leaves
only 9.5% from other non-model sources, it would appear that the model is a very good fit to the
data.

(iv) (a) x 1  eˆ  45

x4  eˆ  110

x 8  eˆ  183


(b) Since $e_i \sim N(0, \sigma^2)$ we would expect the dotplot to be normally distributed about zero. This does not appear to be the case, but it is difficult to tell with such a small data set.
(c) Clearly this is not patternless. The residuals are not independent of the time. This means that the
linear model is definitely missing something and is not appropriate to these data.

ANSWER 35.
(i) The fitted regression line is: p  0.16901c  0.75836

(ii) (a) 1.78 (b) 1.29518

ANSWER 36.

(i) $\hat{\beta} = \dfrac{\sum x_i Y_i}{\sum x_i^2}$

(ii) $\text{Bias}(\hat{\beta}) = 0$; $\text{MSE}(\hat{\beta}) = \dfrac{\sigma^2}{\sum x_i^2}$

ANSWER 37.
(i) (a) $\ln \mu_x = \ln B + x \ln c$

(b) $Y = \ln \mu_x$, $\alpha = \ln B$, $\beta = \ln c$

(ii) The graph appears to show an approximately linear relationship and this supports the transformation
in part (i)(a). However, it does appear to have a slight curve and this would warrant closer inspection
of the model to see if it is appropriate for the data.

(iii) B = 0.000103, c = 1.06


(iv) (a) $R^2 = 95.7\%$

(b) This tells us that 95.7% of the variation in the data can be explained by the model and so indicates an extremely good overall fit of the model.

(c)

Age, X              30      32      34      36      38      40      42      44
Residual, ê_i      0.08    0.02   –0.03   –0.06   –0.06   –0.03    0.02    0.09

(d) The residuals should be patternless when plotted against X, however it is clear to see that some
pattern exists - this indicates that the linear model is not a good fit and that there is some other
variable at work here.

(v) (a) (–7.309, –7.193)

(b) (0.000669, 0.000752)
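
The two intervals in part (v) appear to be related by a simple back-transformation: the interval in (a) is on the log scale used for the linearised model, and exponentiating its endpoints gives the interval in (b). A minimal sketch of that step, with illustrative variable names:

```python
import math

# Interval on the log scale, as quoted in part (v)(a)
log_lower, log_upper = -7.309, -7.193

# exp is an increasing function, so the endpoints map directly
# to an interval on the original scale.
lower, upper = math.exp(log_lower), math.exp(log_upper)

print(f"({lower:.6f}, {upper:.6f})")
```

The same monotonicity argument is what allows any confidence interval for a transformed response to be converted back to the original scale.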

ANSWER 38.

bˆ  0.01413; aˆ  2.4805; ˆ  0.01940


2
(i)

(ii) 0.989

(iii) (0.00918, 0.0191)

(iv) (a) (870,4940)

(b) (800,5360)

ANSWER 39.
(i) Proof
(ii) Proof

ANSWER 40.
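(i) (a) $\sum y_i = n\alpha + \beta_1 \sum x_{i1} + \beta_2 \sum x_{i2}$

$\sum y_i x_{i1} = \alpha \sum x_{i1} + \beta_1 \sum x_{i1}^2 + \beta_2 \sum x_{i2} x_{i1}$

$\sum y_i x_{i2} = \alpha \sum x_{i2} + \beta_1 \sum x_{i1} x_{i2} + \beta_2 \sum x_{i2}^2$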

(b) 1  1.19367, 2  0.301468,   25.3084

Regression line is:

y  25.31  1.194x1  0.3015x 2

(ii) The p-values for all the parameters are less than 0.05 and so they are all significantly different from
zero.
(iii) 0.9984
(iv) Test statistic = 1,280
(v) 92.1%
(vi) The first plot appears to be random and there is no discernible increase in the variance - so this would
imply that the model meets these assumptions. Point 1 (92.5%) does appear to be an outlier. But it is
difficult to tell with such a small dataset.
With the exception of point (1) the rest of the values lie along the diagonal line thus implying a normal
distribution is appropriate.
(vii) (a) If there is interaction between the two drugs then there is an additional effect caused when both
are present compared with what would be expected if they were each administered singly.

(b) The formula is $Y = \alpha + \beta_1 X_1 + \beta_2 X_2 + \beta_{12} X_1 X_2$.

(c) The model with just the two drugs as main effects had an adjusted $R^2$ of 0.9984 in part (iii) whereas the new model with the interactive effect has an adjusted $R^2$ of 0.9969.

Since there is a decrease in the value of the adjusted $R^2$ the previous model would be considered the 'best' model as the interaction term does not improve the fit enough to justify the extra parameter.

GENERALISED LINEAR MODELS


ANSWER 41.

Proof

ANSWER 42.

  i  j   i x   jx

ANSWER 43.

1
 y i   0
ˆ  ˆ x i

xi
 x i y i   0
ˆ  ˆ x i

ANSWER 44.

Proof

ANSWER 45.

A variable is a type of covariate (eg age) whose actual numerical value enters the linear predictor directly,
and a factor is a type of covariate (eg sex) that takes categorical values.

ANSWER 46.

(i) $E(Y) = b'(\theta)$; $\text{var}(Y) = a(\phi)\, b''(\theta)$

 1 y
(ii) (a) f  y   exp In  
  

(b) The natural parameter is $\theta$, so here the natural parameter is:

1

The variance function is (by definition) $b''(\theta)$, so here we find:

1 1
b'      b"    
2
2

 

ANSWER 47.

(i) $f(y) = \exp\left\{ \dfrac{y \log \mu - \mu}{1} - \log y! \right\}$

(ii) ̂ = log 1.1 = 0.09531

̂ = log 0.6 = –0.51083

̂ = log 0.2 = –1.60944

(iii) –0.66498

(iv) (a) 24.93 (b) 35.03

(v) We can use the chi-squared distribution to compare Model A with Model B. We calculate the difference in the scaled deviances (which is just $2(\log L_A - \log L_B)$):

35.03 – 24.93 = 10.10

This should have a chi-squared distribution with 3 – 1 = 2 degrees of freedom, which has a critical value at the upper 5% level of 5.991. Our value is significant here, since 10.10 > 5.991, so this suggests that Model A is a significant improvement over Model B. We prefer Model A here.
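
The comparison in part (v) can be checked without tables, because a chi-squared distribution with exactly 2 degrees of freedom has the closed-form survival function exp(–x/2). The sketch below uses that special case only. It reads 24.93 as the scaled deviance of Model A and 35.03 as that of Model B, which is an interpretation of part (iv) rather than something stated there.

```python
import math

# Scaled deviances from part (iv), read as Model A = 24.93 and Model B = 35.03.
deviance_a, deviance_b = 24.93, 35.03
extra_parameters = 3 - 1           # Model A has two more parameters than Model B

statistic = deviance_b - deviance_a

# With exactly 2 degrees of freedom, P(X > x) = exp(-x / 2), so both the
# upper 5% point and the p-value are available in closed form.
assert extra_parameters == 2
critical_value = -2 * math.log(0.05)
p_value = math.exp(-statistic / 2)

print(round(statistic, 2), round(critical_value, 3), round(p_value, 4))
```

For other degrees of freedom the critical value would have to come from tables or a statistics library instead.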

ANSWER 48.

(i) $f(y) = \exp\left\{ -\dfrac{y}{\mu} - \log \mu \right\}$

(ii) $\theta = -1/\mu$

(iii) Variance function is $\mu^2$

(iv) The dispersion parameter or scale parameter is $\phi = 1$.

ANSWER 49.

(i) $f(y_i) = \exp\left\{ \dfrac{y_i \ln\left(\dfrac{\mu_i}{1-\mu_i}\right) + \ln(1-\mu_i)}{1/n} + \ln \dbinom{n}{n y_i} \right\}$

(ii) $b''(\theta) = \mu_i (1 - \mu_i)$

ANSWER 50.

(i) $E(Y) = b'(\theta)$; $\text{var}(Y) = a(\phi)\, b''(\theta)$

2

(ii) E  X   ; var  X  

ANSWER 51.


  ˆ  yi 

2 In  i  1  
  yi
  ˆ
 
i

ANSWER 52.
(i) Model 1: i  j   k

Model 2:  ij   k

Model 3:  ijk

(ii) Model 1 does not allow for the possibility that there may be interactions (correlations) between some
of the factors. For example, it may be the case that young drivers tend to drive fast cars and to live in
towns.

With Model 3, which is a saturated model, it would be possible to fit the average values for each group
exactly ie there are no degrees of freedom left. This defeats the purpose of applying a statistical model,
as it would not 'smooth' out any anomalous results.

(iii) Normal error structure means that the randomness present in the observed values in each category (eg young/fast/town) is assumed to follow a normal distribution.

The link function connects the mean response to the linear predictor; its inverse is applied to the linear predictor to obtain the predicted values. Associated with each type of error structure is a 'canonical' or 'natural' link function. In the case of a normal error structure, the canonical link function is the identity function.

(iv) (a) The completed table, together with the differences in the scaled deviance and degrees of freedom, is shown below.

Model                            Scaled deviance    DF    Δ Scaled deviance    Δ DF
Constant: 1                      50                 7
Model 1: YO + FS + TC            10                 4     40                   3
Model 2: YO + FS + YO.FS + TC    5                  3     5                    1
Model 3: YO*FS*TC                0                  0     5                    3

(b) Model 1 is a significant improvement over the constant model.

Model 2 is a significant improvement over Model 1.

Model 3 is not a significant improvement over Model 2.
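
The conclusions in part (iv)(b) follow from comparing each drop in scaled deviance with the upper 5% point of a chi-squared distribution on the corresponding number of extra parameters. The sketch below walks through the table in that way. The critical values are the standard tabulated 5% points, and the 5% significance level itself is an assumption, since the answer does not state one.

```python
# Upper 5% points of the chi-squared distribution, from standard tables.
CHI2_5PCT = {1: 3.841, 2: 5.991, 3: 7.815}

# (name, scaled deviance, residual degrees of freedom) from the table in (iv)(a)
models = [
    ("Constant", 50, 7),
    ("Model 1", 10, 4),
    ("Model 2", 5, 3),
    ("Model 3", 0, 0),
]

results = []
for (prev_name, prev_dev, prev_df), (name, dev, df) in zip(models, models[1:]):
    drop = prev_dev - dev            # reduction in scaled deviance
    extra = prev_df - df             # number of extra parameters fitted
    significant = drop > CHI2_5PCT[extra]
    results.append((name, drop, extra, significant))
    print(f"{name} vs {prev_name}: drop {drop} on {extra} df, significant: {significant}")
```

Each model is compared only with the one immediately before it, which is valid here because the models are nested.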

ANSWER 53.
(i) In  i is the natural link function.

17
  x i 
(ii)  y ie  17
i 1

17 17
  x i 
 x i y ie   xi
i 1 i 1

(iii) (5.233,11.721)

(iv) Test statistic = 6.825

ASSIGNMENT 5
SOLUTIONS
BAYESIAN STATISTICS
ANSWER 1.
0.167

ANSWER 2.
$\text{Gamma}\left(55, \dfrac{61}{30}\right)$

ANSWER 3.
p must have a beta distribution.

ANSWER 4.
$P(\theta = 1 \mid X = 3) = 0.11989$

$P(\theta = 2 \mid X = 3) = 0.58806$

$P(\theta = 3 \mid X = 3) = 0.29205$

ANSWER 5.
(i) 3

(ii) 2.972

(iii) 2.917

ANSWER 6.
(4.41, 7.24)

ANSWER 7.
2/3

ANSWER 8.
$\text{Gamma}\left(x + 2, \dfrac{3}{2}\right)$

ANSWER 9.
(i) $P(\theta = 8 \mid X = 7) = 0.51066$

$P(\theta = 10 \mid X = 7) = 0.32954$

$P(\theta = 12 \mid X = 7) = 0.15980$

(ii) 9.29830

ANSWER 10.
(i) Beta(x + 4, n – x + 4)

(ii) 0.45161
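
The value in Answer 9(ii) appears to be the mean of the discrete posterior distribution in part (i), which is the Bayesian estimate under quadratic loss. A small sketch of that calculation, using only the three probabilities quoted there (the dictionary layout is illustrative):

```python
# Posterior probabilities quoted in Answer 9(i), keyed by the parameter value.
posterior = {8: 0.51066, 10: 0.32954, 12: 0.15980}

# A valid posterior should sum to 1, up to rounding of the quoted figures.
total = sum(posterior.values())

# Posterior mean: each parameter value weighted by its posterior probability.
posterior_mean = sum(value * prob for value, prob in posterior.items())

print(round(total, 5), round(posterior_mean, 4))
```

The small difference from the quoted 9.29830 is consistent with the probabilities having been rounded to five decimal places.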

ANSWER 11.
(i) $\text{Gamma}\left(2 + \sum x_i,\ n + 5\right)$

(ii) (a) 0.538

(b) 0.462

(c) 0.513

(iii) (0.217, 1.00)

ANSWER 12.
x + log 2

ANSWER 13.
(i) $\text{Beta}(2, n_1)$

(ii) $\text{Beta}\left(6, \sum_{i=1}^{5} n_i - 4\right)$

(iii) The two estimates are different. The Bayesian estimate of p under quadratic loss is the value of g that minimises the expected posterior loss:

$\int_0^1 (g - p)^2 f_{post}(p)\, dp$

The maximum likelihood estimate of p is the value of p that maximises the likelihood function.

ANSWER 14.

0.39722

CREDIBILITY THEORY
ANSWER 15.
£70

ANSWER 16.
(i) £3.50

ANSWER 17.
First, let's consider what happens when n increases. In A, Z increases. In B, Z is unaffected by n. In C, Z decreases. In practice, we want Z to increase as n increases, so B and C are inappropriate.

Now consider what happens when $\sigma_0^2$ (the variance of the prior distribution) increases. In all cases Z increases. In practice, we want Z to increase, so all expressions are appropriate in this respect.

Finally, consider what happens when $s^2$ (the sample variance) increases. In A, Z decreases. In B and C, Z is unaffected by $s^2$. In practice, we want Z to decrease, so B and C are inappropriate.

Overall, therefore, only the expression in A is an appropriate choice for Z.

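ANSWER 18.

(i) $N\left( \dfrac{\dfrac{n\bar{x}}{200^2} + \dfrac{600}{50^2}}{\dfrac{n}{200^2} + \dfrac{1}{50^2}},\ \dfrac{1}{\dfrac{n}{200^2} + \dfrac{1}{50^2}} \right)$

(ii) $\dfrac{n/200^2}{n/200^2 + 1/50^2}\,\bar{x} + \dfrac{1/50^2}{n/200^2 + 1/50^2} \times 600$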

(iii) 0.669

ANSWER 19.

e 
 x / i

(i) n

    x i  /
e
(ii) (a) n 1

   xi
(b)
n   1

n
(iii) Z 
n   1

(iv) 0.9950
(v) The value of Z is very close to 1. So the credibility estimate is very close to the sample mean (98.26),
and takes little account of the prior mean (80). This is because n is much bigger than  .

ANSWER 20.

 1   
2
(i)  2


  1    
   2
 1  1   
  

(ii) (a) $\text{Beta}(\alpha + d,\ \beta + n - d)$

 n  d     
(b)     
     n  n      n   

(iii) Increasing  will reduce the denominator, and hence increase the value of Z.

(iv) Increasing the standard deviation of the prior distribution means that there is greater uncertainty as-
sociated with the prior distribution, and hence it is less reliable. So, when estimating  , we should put
less weight on the prior mean and more weight on the maximum likelihood estimate of  (which is
calculated from the sample data alone). To achieve this the credibility factor, Z, must increase. The
formula in (iii) illustrates this.

EMPIRICAL BAYES CREDIBILITY THEORY


ANSWER 21.
(i) $\bar{x}_4 = 58.2$

$\dfrac{1}{4} \sum_{j=1}^{5} (x_{4j} - \bar{x}_4)^2 = 132.7$

(ii) Z = 0.81695

(iii) Country 1: 52.66

Country 2: 67.37

Country 3: 71.94

Country 4: 59.03
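
The premiums in part (iii) are of the usual credibility form: Z times the risk's own mean plus (1 – Z) times the collective mean. The overall mean across countries is not quoted in this answer, so the sketch below backs it out from the Country 4 figures instead. The function name and the implied value are illustrative, and rounding in the quoted inputs means the result is only approximate.

```python
def credibility_premium(z, own_mean, collective_mean):
    """Credibility estimate: Z * (risk's own mean) + (1 - Z) * (collective mean)."""
    return z * own_mean + (1 - z) * collective_mean


# Figures quoted for Country 4 in parts (i) to (iii)
z, own_mean, premium = 0.81695, 58.2, 59.03

# The collective mean is not quoted above, so solve the formula for it.
implied_collective_mean = (premium - z * own_mean) / (1 - z)

print(round(implied_collective_mean, 2))
print(round(credibility_premium(z, own_mean, implied_collective_mean), 2))
```

Because Z is well above one half, the premium for Country 4 sits much closer to its own mean than to the implied collective mean.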

ANSWER 22.

(i) £2,129,300

(ii) £2,397,400

ANSWER 23.

(a) This is false. $\theta$ is just a risk parameter that reflects the likelihood of claims. The true risk premium for a given risk is $E[m(\theta) \mid X]$.

(b) This is also false. The variance of $X_j \mid \theta$ is $s^2(\theta)$, which is a function of $\theta$.

(c) This is true. In fact, none of the quantities in the model are assumed to have any specific type of distri-
bution.

ANSWER 24.

Country    Estimated credibility factor    Risk premium per unit volume    EBCT premium
1          0.8048                          3.851                           77.0
2          0.8632                          3.468                           86.7
3          0.6862                          8.504                           85.0
4          0.8759                          2.750                           33.0

ANSWER 25.

(i) 112.75

(ii) We could use EBCT Model 2, with risk volumes $P_{i,j}$ for Quarter i of Year j of:

$P_{1,4} = 1 \qquad P_{1,3} = \dfrac{1}{1.1} \qquad P_{1,2} = \dfrac{1}{1.1^2} \qquad P_{1,1} = \dfrac{1}{1.1^3}$

(iii) 102.417

(iv) In the EBCT approach of parts (i) and (ii), we are not assuming any particular distribution for the random variables $Y_{i,j}$. In the maximum likelihood approach in part (iii), we are explicitly assuming that each $Y_{i,j}$ follows a Poisson distribution.

Also, the EBCT approach assumes that the data from all 4 quarters provide us with information about
Quarter 1, whereas the maximum likelihood approach only considers the data from Quarter 1.

ANSWER 26.
(i) £66.79m

(ii) £66.18m

(iii) The two models give fairly similar results. The estimate in Model 2 will depend on the prediction of
risk volume for the coming year.

In both cases we have used a very high value for the credibility factor. So we are effectively ignoring the
data from the other insurers, and are basing our estimate almost entirely on the data from Insurer B.

This seems sensible, given that both the volume figures and the average claim amounts appear to be
quite variable between the three different insurers. This suggests that we should not place too much
emphasis on the data from Insurers A and C, and focus on the information that we have for Insurer B.

ANSWER 27.

(i) (a) $E[m(\theta)] = 3.9916$; $E[s^2(\theta)] = 54.881$; $\text{var}[m(\theta)] = 32.70533$

(b) 150.024

(ii) The estimated credibility factor for Insurer K is 0.99489.

So we place more emphasis on the mean of the direct data for Insurer K, and this reduces the credibility estimate. As a result, the credibility premium per unit of volume will be lower for Insurer K than for Insurer I.
