A Concise Course in A-Level Statistics - Crawshaw.J
A-LEVEL STATISTICS
Second Edition
J CRAWSHAW BSc
Head of Mathematics Department
Clifton High School, Bristol
J CHAMBERS MA
Head of Mathematics Department
Sutton High School GPDST, Surrey
All rights reserved. No part of this publication may be reproduced or transmitted in any form
or by any means, electronic or mechanical, including photocopy, recording, or any information
storage and retrieval system, without permission in writing from the publisher or under licence
from the Copyright Licensing Agency Limited. Further details of such licences (for reprographic
reproduction) may be obtained from the Copyright Licensing Agency Limited, of 90 Tottenham
Court Road, London W1P 9HE.
Crawshaw, J.
A concise course in A-level statistics. With worked examples
2nd ed.
1. Statistics
I. Title II. Chambers, J.
519.5
ISBN 0-7487-0455-8
PREFACE
This text is intended primarily for use by students and teachers of
the statistics section of A-level Pure Mathematics with Statistics, an
increasingly popular course.
Points of theory are presented concisely and illustrated by suitable
worked examples, many taken from previous A-level papers. These
are then supported by very carefully graded exercises which serve to
consolidate the theory, link it with previous work and build up the
confidence of the reader. There are frequent summaries of main
points and miscellaneous exercises containing mainly A-level
questions.
Throughout the text we have aimed to provide the reader with a
mathematical structure and a logical framework within which to
work. We have given special attention to topics which, in our
experience, cause great difficulty. These include probability theory,
the theory of continuous random variables and significance testing.
The text covers the main theory required by all the major examining
boards. We are very grateful to the following for permission to
reproduce questions:
University of Cambridge Local Examinations Syndicate (C)
The Southern Universities’ Joint Board (SUJB)
Joint Matriculation Board (JMB)
University of London (L)
Oxford and Cambridge School Examinations Board (O & C)
incorporating School Mathematics Project (SMP)
Mathematics in Education and Industry (MEI)
The Associated Examining Board (AEB)
Oxford Delegacy of Local Examinations (O)
A-level questions are followed by the name of the board. Questions
from Additional Mathematics papers are indicated by the word
Additional, and (P) indicates a part-question.
We are particularly indebted to The Associated Examining Board
and The Southern Universities’ Joint Board for allowing us to use
some of their questions as worked examples, and would stress that
they are in no way involved in, or responsible for, this working.
We extend our thanks to our families, colleagues and students for
all their encouragement and support, in particular to Audrey
Shepherd and Jane Ziesler.
J Crawshaw
J Chambers
PREFACE TO THE
SECOND EDITION
In order to give a fully comprehensive coverage of the present
A-level syllabuses the following material has been added:
Chapter 4 — The use of binomial and Poisson cumulative
probability tables. The geometric distribution
Chapter 5 — The negative exponential distribution
Chapter 6 — The use of the standard normal cumulative tables
Φ(z) (with the use of tables giving Q(z) retained in
the Appendix)
Chapter 7 — Random sampling and the use of random number
tables
Chapter 9 — Significance testing relating to the binomial and
Poisson distributions
Chapter 11 — A fuller treatment of correlation and linear regression,
including significance testing relating to
Spearman’s and Kendall’s coefficients of correlation.
Numerous recent A-level questions taken from all the major
examining boards have been added, together with worked examples
from the University of London Schools Examination Board which
we would stress is in no way responsible for these solutions.
J Crawshaw
J Chambers
1990
DESCRIPTIVE STATISTICS
DISCRETE DATA
These are the marks obtained by 30 pupils in a test:
[Frequency table of the marks; total frequency 30]
CONTINUOUS DATA
These are the heights of 20 children in a school. The heights have
been measured correct to the nearest cm.
Continuous data cannot assume exact values, but can be given only
within a certain range or measured to a certain degree of accuracy,
for example
144 cm (correct to the nearest cm) could have arisen from any
value in the range 143.5 cm ≤ h < 144.5 cm.
Other examples of continuous data are
the speeds of vehicles passing a particular point,
the masses of cooking apples from a tree,
the time taken by each of a class of children to perform a task.
FREQUENCY DISTRIBUTIONS
The values 119.5, 124.5, 129.5, ..., are called the class boundaries.
NOTE: the upper class boundary (u.c.b.) of one interval is the lower
class boundary (l.c.b.) of the next interval.
The width of an interval is u.c.b. − l.c.b.; for the first interval,
124.5 − 119.5 = 5
In fact, in this example, each of the classes has been chosen so that
the width is 5.
To group the heights into the following classes it helps to use a
‘tally’ column, entering the numbers in the first row, then the
second row, and so on.
Height (cm)
119.5 ≤ h < 124.5
124.5 ≤ h < 129.5
129.5 ≤ h < 134.5
134.5 ≤ h < 139.5
139.5 ≤ h < 144.5
Example 1.1 The following table gives the diameters of 40 ball-bearings, each
measured in cm correct to 2 decimal places (d.p.). Form a frequency
distribution by taking classes of width 0.02 cm.
Solution 1.1 The smallest value in the table is 3.93 and the largest value is 4.04.
As measurements have been taken in cm correct to 2 d.p., the lowest
class boundary is 3.925cm. As the class width is 0.02 cm, the first
interval must have an upper class boundary of 3.945cm.
So we take as class boundaries 3.925, 3.945, 3.965, ..., 4.045.
The frequency distribution is as follows:
Diameter (cm)
Diameter (cm)
3.93-3.94
3.95-3.96
3.97-3.98
and so on
The interval ‘27-31’ means 26.5 mm ≤ length < 31.5 mm.
The class boundaries are 26.5, 31.5, 36.5, 46.5, 51.5
The class widths are 5, 5, 10, 5
Length of call (min) 0- 3- 6- 9- 12- 18-
The interval ‘3-’ means 3 minutes ≤ time < 6 minutes, so any time
including 3 minutes and up to (but not including) 6 minutes comes
into this interval.
The class boundaries are 0, 3, 6, 9, 12, 18
The class widths are 3, 3, 3, 3, 6
The interval ‘-250’ means 100 g < mass ≤ 250 g; so any mass over
100 grams up to and including 250 grams comes into this interval.
The class boundaries are 0, 100, 250, 500, 800
The class widths are 100, 150, 250, 300
As the ages are in completed years (not to the nearest year) then
‘21-24’ means 21 ≤ age < 25. Someone who is 24 years and 11
months would come into this category. Sometimes this interval is
written ‘21-’ and the next is ‘25-’, etc.
The class boundaries are 21, 25, 29, 33, 41, 53
The class widths are 4, 4, 4, 8, 12
HISTOGRAMS
Solution 1.2 The class boundaries are 9.5, 14.5, 19.5, 24.5, 29.5
The class widths are 5, 5, 5, 5
When all the class intervals are of equal width the frequency can
be used for the height of each rectangle.
[Histogram, with frequency on the vertical axis]
Solution 1.3 The class boundaries are 5.5, 8.5, 11.5, 17.5, 20.5, 29.5
The class widths are 3, 3, 6, 3, 9
As the class widths are not equal we cannot make the height of each
rectangle equal to the frequency.
So we choose a convenient width as a ‘standard’ and adjust the
heights of the rectangles accordingly, as follows.
If we choose a class width of 3 as standard, then the first two
rectangles can be 4 and 6 units high respectively. However, as the
third interval is twice the standard width we must make the height
of the rectangle equal to half the frequency.
Similarly, as the last interval is 3 × standard we must make the height
of the rectangle equal to one-third of the frequency.
Class boundaries (kg)   Class width    Frequency   Height of rectangle (standard frequency)
5.5-8.5                 standard       4           4
8.5-11.5                standard       6           6
11.5-17.5               2 × standard   10          ½ × 10 = 5
17.5-20.5               standard       3           3
20.5-29.5               3 × standard   12          ⅓ × 12 = 4
[Histogram: standard frequency against mass (kg)]
Example 1.4 The following table gives the distribution of the interest paid to 460
investors in a particular year.
Solution 1.4 The class boundaries are 25, 30, 40, 60, 80, 110
The class widths are 5, 10, 20, 20, 30
We will choose a class width of 10 as the standard width.
Interest (£)   Class width    Frequency   Standard frequency
25-30          ½ × standard   17          2 × 17 = 34
30-40          standard       55          55
40-60          2 × standard   142         ½ × 142 = 71
60-80          2 × standard   153         ½ × 153 = 76.5
80-110         3 × standard   93          ⅓ × 93 = 31
[Histogram: standard frequency against interest (£)]
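The scaling used in Solution 1.4 can be checked with a short Python sketch. The variable names are illustrative and not part of the text; the frequencies and class boundaries are those of the example, and the standard width is taken as 10.

```python
# Standard frequency = frequency x (standard width / class width).
boundaries = [25, 30, 40, 60, 80, 110]   # class boundaries in pounds
frequencies = [17, 55, 142, 153, 93]     # number of investors in each class
standard_width = 10                      # the width chosen as 'standard'

heights = []
for lower, upper, f in zip(boundaries, boundaries[1:], frequencies):
    width = upper - lower
    heights.append(f * standard_width / width)

print(heights)
```

With these heights the area of each rectangle stays proportional to the frequency of its class.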
Example 1.5 The following table gives the distribution of marks of 60 pupils in a
test. Draw a histogram to illustrate the data.
[Table of class widths and standard frequencies, followed by a histogram of standard frequency against marks]
Exercise 1a
Number of 3
L636) 215 02" Oeeoae O
viewers
3. The masses (measured to the nearest g) of Draw a histogram to illustrate the data.
washers are recorded in the table. Draw a
histogram to illustrate the data. 38 children solved a simple problem and
the time taken by each was noted.
Mass (g) 0-2 3-5 6-11 12-14 15-17
Time
20 20 eae
iieencalie 10
4. 100 people were asked to record how
many television programmes they watched Draw a histogram to illustrate this informa-
in a week. The results were as follows: tion.
Table A
Length of
6-25 26-60 61-80 81-105 106-115 116-150 151-200 201-300
stay (min)
Table B
Table C
The sales (in thousands of litres) of petrol from four petrol stations
A,B,C and D are noted for the first week of March, and are shown
in the table:
Solution 1.6 The total angle of 360° at the centre of a circle is divided according
to the sales at each of the stations.
The total sales (thousands of litres) = 90+140+30+20 = 280
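As a rough check on the sector angles, the division of 360° in proportion to the sales can be sketched in Python. The dictionary layout and names are assumptions made for illustration only.

```python
# Each sector angle is (sales / total sales) x 360 degrees.
sales = {"A": 90, "B": 140, "C": 30, "D": 20}   # thousands of litres
total = sum(sales.values())

angles = {station: 360 * amount / total for station, amount in sales.items()}

for station, angle in angles.items():
    print(f"Station {station}: {angle:.1f} degrees")
```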
Petrol station
[Pie diagram: Station A (90), Station B (140), Station C (30), Station D (20)]
Example 1.7 The following agricultural statistics refer to the land use, in hectares,
of three parishes. Draw three pie diagrams to compare these data.
$r_1 : r_2 : r_3 = \sqrt{4020} : \sqrt{1200} : \sqrt{630}$
$= 63.40 : 34.64 : 25.10$
$= 3.17 : 1.73 : 1.255$
For convenience, we take $r_1 = 3.2$ cm, $r_2 = 1.7$ cm and $r_3 = 1.3$ cm.
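When pie diagrams are used to compare totals, the areas are made proportional to the totals, so the radii are proportional to the square roots of the totals. A minimal Python sketch follows; the divisor 20 is an assumption, chosen only to give radii of a convenient size in cm.

```python
import math

# Total land use (hectares) for each parish.
totals = {"Appleford": 4020, "Burnford": 1200, "Carnford": 630}
scale = 20  # assumed divisor giving radii of a convenient size

# Area is proportional to the total, so radius is proportional to sqrt(total).
radii = {parish: math.sqrt(total) / scale for parish, total in totals.items()}

for parish, r in radii.items():
    print(f"{parish}: r = {r:.1f} cm")
```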
The angles in the pie diagrams are calculated as shown in the table:
Carnford: (320/630)(360) = 182.9°, (160/630)(360) = 91.4°, (150/630)(360) = 85.7°
[Pie diagrams. APPLEFORD: Barley (1830), Wheat (1640), Woodland (550). BURNFORD: Barley (645). CARNFORD: Barley (320).]
Exercise 1b
1. Construct a pie diagram to illustrate the scores obtained when a
die is thrown 120 times.
2. The results of the voting in an election were as follows:
2045 votes
4238 votes
8605 votes
12012 votes
Represent this information on a pie diagram.
3. The pie chart, which is not drawn to scale, shows the distribution
of various types of land and water in a certain county. Calculate
(i) the area of woodland,
(ii) the angle of the urban sector,
(iii) the total area of the county.
(C Additional) P
5. Five companies form a group. The sales of each company during the
year ending 5th April, 1978, are shown in the table below.
Company A B C D E
Sales (in £1000’s) 20 35 860 oP tae
Draw a pie chart of radius 5 cm to illustrate this information.
For the year ending 5th April, 1979, the total sales of the group
increased by 20%, and this growth was maintained for the year ending
5th April, 1980.
If pie charts were drawn to compare the total sales for each of these
years with the total sales for the year ending 5th April, 1978, what
would be the radius of each of these pie charts?
If the sales of company E for the year ending 5th April, 1980, were
again £60 000, what would be the angle of the sector representing
them? (C Additional)
6. Mr Williams worked out how much it had cost him to run his car for
each of 3 consecutive years. The results were as follows:
Tax and
insurance
fem
1200 km?
Item A
[atin[Amen[Asia[Bape
diagram drawn to illustrate Mrs N’s replies
Year | in which the circle representing the total
amount had a radius of 5cm, the sector
1972 8.4 12.2 15.6 23.8 representing the amount spent on item A
1973 5.5 6.7 13.2 19.6 had an angle of 7 2° and the amount spent
on item B was £4.00. Find the amount
Draw two pie charts which allow the total spent on item C by Mrs N and draw a pie
annual sales to be compared. (C Additional) diagram to illustrate her expenditure.
FREQUENCY POLYGONS
A frequency distribution may be displayed as a frequency polygon.
Number of flaws 0 1 2 3 4 5
Solution 1.8 Points are plotted, with the number of flaws on the horizontal axis
and the frequency on the vertical axis.
[Frequency polygon: frequency against number of flaws]
Example 1.9 Construct a frequency polygon for the data given in Example 1.3.
[Frequency polygon: standard frequency against mass (kg)]
FREQUENCY CURVES
If the number of intervals is large, then the frequency polygon will
consist of a large number of line segments. The frequency polygon
approaches a smooth curve, known as a frequency curve.
Frequency curve
Exercise 1c
1. Ina competition to grow the tallest holly- Draw a histogram and superimpose the
hock, the heights recorded by 50 com- frequency polygon.
petitors were as follows. Heights were
measured to the nearest cm (see Table A
below).
TableA
2. (a) The following table shows the weekly 3. The table shows the duration, in minutes,
sales of television sets in a department of 64 telephone calls made from a high
store in one year. street call box in one day.
Number of sets Length of
5-13 14-22 23-31 32-40 41-49
sold/week call (min)
Number of weeks Frequency 3 1 22 20 6 6 0
Draw a frequency polygon to illustrate this Draw a frequency polygon to illustrate the
information. information.
(6) The following year the sales were as 4. The table shows the ages (in completed
follows: years) of women who gave birth to a child
at Anytown Maternity Hospital during a
Number of sets particular year.
5-13 14-22 23-31 32-40 41-49
sold/week
Number of weeks 3 16 20 12 A Age (years) 16- 20- 25- 30- 35- 45-
Number of births 70 470 535 280 118 0O
Draw a frequency polygon to show the
sales in the second year, on the same grid Draw a frequency polygon to illustrate this
as part (a). information. Do not draw a histogram first.
CUMULATIVE FREQUENCY
Example 1.10 The marks of 40 pupils in a test are shown in the table. Construct a
cumulative frequency distribution.
Mark       4  5  6  7   8  9  10
Frequency  2  5  8  10  7  5  3
Solution 1.10 The cumulative frequency distribution for the marks is as follows.
Up to and including 4 2
Up to and including 5 2+5=7
Up to and including 6 2+5+8=15
Up to and including 7 2+5+8+10=25
Up to and including 8 2+5+8+10+7=32
Up to and including 9 2+5+8+10+7+5=37
Up to and including 10 2+5+8+10+7+5+3=40
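The running totals in Solution 1.10 can be reproduced with a few lines of Python. This is only an illustrative sketch; the list names are not from the text, and the frequencies are the ones that appear in the sums above.

```python
from itertools import accumulate

marks = [4, 5, 6, 7, 8, 9, 10]
frequencies = [2, 5, 8, 10, 7, 5, 3]

# accumulate() yields the running total after each frequency is added.
cumulative = list(accumulate(frequencies))

for mark, total in zip(marks, cumulative):
    print(f"Up to and including {mark}: {total}")
```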
Example 1.11 The heights of 30 broad bean plants were measured, correct to the
nearest cm, 6 weeks after planting. The frequency distribution is
given below. Construct the cumulative frequency table.
Example 1.12 (a) Construct a cumulative frequency curve for the data in Example
1.11.
(b) Estimate from the curve
(i) the number of plants that were less than 10 cm tall;
(ii) the value of x, if 10% of the plants were of height x cm or
more.
[Cumulative frequency curve: cumulative frequency against height (cm)]
(b)
(i) To find how many plants were less than 10 cm tall, find the
height, 10 cm on the horizontal axis. Draw a vertical line to meet
the curve and then draw a horizontal line to meet the cumulative
frequency axis.
Example 1.13 Pupils were asked how long it took them to walk to school on a
particular morning. A cumulative frequency distribution was
formed:
Time taken (minutes)   <5  <10  <15  <20  <25  <30  <35  <40  <45
Cumulative frequency   28  45   81   143  280  349  374  395  400
Solution 1.13 (a) Cumulative frequency curve to show the times taken to walk to
school
[Cumulative frequency curve: cumulative frequency against time (minutes)]
(b) From the graph we estimate that 114 pupils took less than 18
minutes.
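Therefore,
6% of the pupils took 36 minutes or longer.
The frequency in each class is the difference between successive
cumulative frequencies:
Time (minutes)   Frequency
0 ≤ t < 5        28
5 ≤ t < 10       45 − 28 = 17
10 ≤ t < 15      81 − 45 = 36
15 ≤ t < 20      143 − 81 = 62
20 ≤ t < 25      280 − 143 = 137
25 ≤ t < 30      349 − 280 = 69
30 ≤ t < 35      374 − 349 = 25
35 ≤ t < 40      395 − 374 = 21
40 ≤ t < 45      400 − 395 = 5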
Total = 400
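Recovering the class frequencies from a cumulative distribution is a matter of taking successive differences. The sketch below, with illustrative names, uses the cumulative frequencies of Example 1.13.

```python
# Cumulative frequencies for times < 5, < 10, ..., < 45 minutes.
cumulative = [28, 45, 81, 143, 280, 349, 374, 395, 400]

# Pair each cumulative frequency with the one before it (0 before the first).
frequencies = [current - previous
               for previous, current in zip([0] + cumulative, cumulative)]

print(frequencies)
print("Total =", sum(frequencies))
```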
[Histogram: frequency against time (minutes)]
Exercise 1d
ee ) Table A below gives the distribution of (ii) the range of marks gained by all
~~ marks of candidates in an examination. candidates except the top 10% and the
(a) Construct a cumulative frequency dis- bottom 10%.
tribution and draw a cumulative frequency
curve. (2, The cumulative frequency curve overleaf
(6) Use your curve to estimate has been drawn from information about
(i) the percentage of candidates who the amount of time spent by 50 people in
passed, if the pass mark was 45; ; a supermarket on a particular day.
Table A
0-9 10-19 20-29 30-39 40-49 50-59 60-69 70-79 80-89 90-99
Table B
Table C
|pHvalue |< 4.8 <5.2 <5.6 <6.0 <64,/<68 <7.21 <76 =<S0 52
Cumulative
THE MEDIAN
Example 1.15 The table shows the number of children in the family for 35 families
in a certain area. Find the median number of children per family.
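Number of children  0  1  2   3  4  5
Frequency           3  5  12  9  4  2
There are 35 values, so the median is the ½(35 + 1)th value, i.e. the
18th value.
Number of children         Cumulative frequency
Up to and including 0      3
Up to and including 1      8
Up to and including 2      20
Up to and including 3      29
Up to and including 4      33
Up to and including 5      35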
We could have written out all the values in order from the frequency
table, thus 0, 0,0,1,1,1,1,1, 2, 2,.... However, we can see from
the cumulative frequency table that the 18th value will be 2, as the
first 8 values are 0 or 1 and the first 20 values are 0 or 1 or 2.
Therefore the median number of children per family is 2.
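The same search through the cumulative frequencies can be sketched in Python. The frequencies used are the differences of the cumulative frequencies in the table (3, 8, 20, 29, 33, 35); the names are illustrative, and the sketch assumes an odd number of values so that the median is a single item.

```python
values = [0, 1, 2, 3, 4, 5]          # number of children
frequencies = [3, 5, 12, 9, 4, 2]    # number of families

n = sum(frequencies)                 # 35 families
position = (n + 1) // 2              # the 18th value when n is odd

running = 0
median = None
for value, f in zip(values, frequencies):
    running += f                     # cumulative frequency so far
    if running >= position:
        median = value               # first value whose cumulative
        break                        # frequency reaches the position

print(median)
```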
1. Find the median of each of the following sets of numbers:
(a) 4, 6, 18, 25, 9, 16, 22, 5, 20, 4, 8
(b) 192, 217, 189, 210, 214, 204
(c) 1267, 1896, 895, 3457, 2164
(d) 0.7, 0.4, 0.65, 0.78, 0.45, 0.32, 1.9, 0.0078
3. Find the median of each of the following frequency distributions:
(a)
(b)
Example 1.16 The masses, measured to the nearest kg, of 49 boys are noted and
the distribution formed. Estimate the median mass.
The median is the $\frac{1}{2}(49 + 1)$th value, i.e. the 25th value.
[Diagram: 20 items lie below 74.5 kg, the 25th item lies in the class 74.5-79.5, and 34 items lie below 79.5 kg]
There are 14 items in the class 74.5-79.5 and from the diagram the
median is $\frac{5}{14}$ of the interval of 5 kg from 74.5 to 79.5.
Estimate of the median mass $= 74.5 + \frac{5}{14}(5)$
$= 76.3$ kg (1 d.p.)
Therefore we estimate the median to be 76.3 kg (1 d.p.)
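The linear interpolation used here can be written as a small Python function. This is a sketch only; the function and parameter names are hypothetical, and the figures are those of Example 1.16 (20 items below 74.5 kg, 14 items in the class, class width 5 kg, median at the 25th value).

```python
def interpolated_median(lower_boundary, class_width,
                        cum_before, class_freq, position):
    """Estimate the value at `position` by linear interpolation
    within the class that contains it."""
    fraction = (position - cum_before) / class_freq
    return lower_boundary + fraction * class_width

median = interpolated_median(lower_boundary=74.5, class_width=5,
                             cum_before=20, class_freq=14, position=25)
print(round(median, 1))
```

The interpolation assumes the items are spread evenly through the class, which is why the result is only an estimate.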
[Cumulative frequency curve: cumulative frequency against mass (kg), with the median marked]
Frequency
13.5 15.6 16.3 12.3 13.1 14.2 12.4 11.3 14.0 14.6
19.G5 14.8 12.7 10.9 11.0 11.4 15.0 10.1 15.4 11.3
10% 14.6 13.5 15.1 12.1 12.0 14.2 11.4 15.0 13.3
13.2 9.1 16.9 14.2 15.0 13.6 14.8 11.4 14.8 15.7
13.5 13.5 12.9 13.8 13.7 16.2 11.6 13.8 14.2 10.7
10.0-10.9
11.0-11.9
12.0-12.9
13.0-13.9
14.0-14.9
15.0-15.9
16.0-16.9
(c) The class boundaries are 8.95, 9.95, 10.95, 11.95, ... , 16.95.
The class widths are each equal to 1. As the class widths are equal
we can label the vertical axis ‘frequency’.
Consider the class 13.0-13.9; there are 18 units of area to the left
of the lower class boundary point of 12.95, so we need another
7 units. If we find P such that AP = 7 then AP: AC = 7:12 and, by
similar triangles, PQ: CD = 7:12. Hence AM: AB = 7:12,
[Histogram: frequency against haemoglobin level, with class boundaries 8.95, 9.95, 10.95, 11.95, 12.95, 13.95, 14.95, 15.95, 16.95]
So there are 7 units of area in the class 13.0-13.9 to the left of the
line QM.
From the histogram, an estimate of the median is 13.45.
(d) In the sample the median is the $\frac{1}{2}(50 + 1)$th value, i.e. the
25½th value.
Now there are 18 readings as far as a haemoglobin level of 12.95.
Arranging the items in the class 13.0-13.9 gives
19th 20th 21st 22nd 23rd 24th 25th 26th
13.1) «13.2en 93:3) 9135. aldsbaeAd.5 i
median
Exercise 1f
1. Estimate the median of the following frequency distribution
(a) by calculation,
(b) from a cumulative frequency curve,
(c) from a histogram.
The frequency distribution shows the times taken by 55 pupils to do
their mathematics homework. Times have been measured to the nearest
minute.
Time (min)  5-14  15-24  25-34  35-44  45-54
Frequency   5     7      19     17     7
2. Eggs laid at Hill Farm are weighed and the results grouped as shown:
Mass (gm) -50 -54 -58 -62 -66 -70 -74
Frequency S222 IER AQ M10 PG. a9
Construct a cumulative frequency table and draw a cumulative frequency
curve. Use the curve to estimate the median mass.
3. The table shows the frequency distribution of the speeds of cars
passing along a marked stretch of road of length 1 kilometre.
Estimate the median speed.
Speed (km/h) 40- 60- 80- 100-
Table B
10-39 40-49 50-54 55-59 60-69 70-79 80-89
120 ih ot AGe be oe) 17
TableC
Length of life (h) 650-669 670-679 680-689 690-699 700-719
and so on.
Example 1.18 Find the semi-interquartile range of the following set of numbers:
2,3,3,9,6,6,12,11,8, 2, 3,5, 7,5,4,4,5,12,9
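Arranged in order, with the median and the quartiles shown in brackets:
2, 2, 3, 3, (3), 4, 4, 5, 5, (5), 6, 6, 7, 8, (9), 9, 11, 12, 12
There are 19 numbers.
The median is the 10th value, the lower quartile Q1 is the 5th value
and the upper quartile Q3 is the 15th value.
Q1 = 3, Q3 = 9
Semi-interquartile range $= \frac{1}{2}(Q_3 - Q_1)$
$= \frac{1}{2}(9 - 3)$
$= 3$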
The semi-interquartile range of the set of numbers is 3.
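A short Python sketch of this calculation is given below. It assumes, as in this example, that (n + 1)/4 is a whole number so that each quartile is a single item of the ordered list; the helper name is hypothetical.

```python
data = [2, 3, 3, 9, 6, 6, 12, 11, 8, 2, 3, 5, 7, 5, 4, 4, 5, 12, 9]
ordered = sorted(data)
n = len(ordered)                      # 19 numbers

def value_at(position):
    """Return the item at a 1-based position in the ordered list."""
    return ordered[position - 1]

q1 = value_at((n + 1) // 4)           # 5th value
median = value_at((n + 1) // 2)       # 10th value
q3 = value_at(3 * (n + 1) // 4)       # 15th value

semi_interquartile_range = (q3 - q1) / 2
print(q1, median, q3, semi_interquartile_range)
```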
When data has been grouped into classes, the values of the quartiles
and percentiles may be obtained in the same way as the median.
Example 1.19 The table gives the cumulative distribution of the heights (in cm) of
400 children in a certain school:
Height (cm) <100 <110 <120 <130 <140 <150 <160 <170
[Cumulative frequency curve: cumulative frequency against height (cm)]
The middle half of the readings, that is the interquartile range, has a
range of 16cm.
Exercise 1g
1. Find (a) the median, (b) the lower quar- Draw a cumulative frequency curve. Use _
tile Q;, (c) the upper quartile Q3 for each this curve to estimate the median and the
of the following sets of data: quartiles of the distribution. , (O&C)
(i) Test marks of 11 students:
52, 61, 78, 49, 47, 79, 54, 58, 62, 73, 72 @/ From the soil of an English garden 100
earthworms were collected. Their lengths
(ii) were recorded to the nearest millimetre
Number of and grouped as shown in Table C below.
peas per pod
Write down the cumulative frequency
Frequency LO 1S se 24 22519548) 85 table and draw a cumulative frequency
curve to illustrate this information.
2. The marks scored by 63 pupils in a test Estimate
are shown in the frequency distribution. (i) the median length of worm,
Calculate (a) the median, (6) the inter- (ii) the semi-interquartile range,
quartile range for the set of marks. (iii) the percentage of worms which are
Mark Ope t C2235 4 5s Ga 18 BLO over 180 mm in length. (C Additional)
Frequency
| 2.2 3 4 6 11 15 10 6 3 i “~~
Table B
Number of
0-2 3-5 6-8 9-11 12-14 15-17 18-20 21-23
rivets missing
Length (mm) 95-109 110-124 125-139 140-154 155-169 170-184 185-199 200-214
Number of
2 8 17 26 24 16 6 al
worms
2806-85 -90 -95. -100 -105 -110 -115 -120 -125 over 125
0 6 12Z60°227) 231 FTO a + 2 1 0
te 30 specimens of sheet steel are tested for centage of persons who performed the
tensile strength, measured in kN m7. task in forty minutes or less. (JMB)
Table E below gives the distribution of =A
the measurements. 10 The frequency distribution, given in
Draw a cumulative frequency diagram of the table, refers to the heights, in cm, of
this distribution. 50 men, corrected to the nearest 10 cm.
Estimate the median and the 10th and Height (em) |140 150 160 170 180 190
90th percentiles. (O&C)
The figure below shows the cumulative (a) State the least possible height of the
frequency diagram for the distribution of one man whose height is recorded in the
the number of marks, N, in the range 0 to table as 140 cm.
99 inclusive, obtained by 120 candidates
(b) Draw on graph paper a histogram to
in an examination. From the diagram,
illustrate the data of the table, drawing five
estimate
columns, with the first column represent-
(a) the median mark,
ing the seven shortest men. Label the axes
(b) the inter-quartile range,
carefully and explain clearly how fre-
(c) the number of candidates who scored
quency has been represented on your
more than 59 marks.
histogram.
State why the diagram has to be read at (c) Draw a cumulative frequency diagram |
N = 9.5, 19.5, ...if a grouped frequency on graph paper for the data given in the
table showing how many candidates are table. From your diagram, estimate the
in the class intervals 0-9, 10-19, ... is upper and lower quartiles, the median
to be found. height and the interquartile range.
Draw up such a table and illustrate it by (L Additional)
drawing a histogram. Mark on your dia- The following data concern a random
gram the median mark. (L Additional) sample of 1000 men with heights in
The distribution of the times taken when the given ranges.
a certain task was performed by each of a
large number of people was such that its
twentieth percentile was 25 minutes, its
fortieth percentile was 50 minutes, its 180-
sixtieth percentile was 64 minutes and its 182-
eightieth percentile was 74 minutes. Use 184-
186-
linear interpolation to estimate (i) the
188-
median of the distribution, (ii) the upper
190-192
quartile of the distribution, (iii) the per-
Table E
Number of 4 3 6 10 5 2
specimens
PT
TN
frequency
Cumulative
80 90 100
MarksNV
Draw a cumulative frequency diagram to between the fortieth and seventieth per-
illustrate these data. Use your diagram to centiles,
estimate (c) the number of men in the sample
(a) the median height, with heights of at least 183 cm.
(L Additional)
(b) the range of heights for men who are
MEASURES OF LOCATION
There are three main statistical measures which attempt to locate a
‘typical’ value. These are
the median (which we have already investigated, p. 22),
the mode,
the arithmetic mean.
THE MODE
The mode is the value that occurs most often.
The mode has the advantage that it is easy to calculate and it
eliminates the effects of extreme values, but it is generally unsuitable
for further calculation and it is not used widely.
Example 1.21 Estimate the mode of the following frequency distribution which
shows the marks of 330 candidates in an examination.
Marks 11-20 21-30 31-40 41-50 51-60 61-70 71-80 81-90 91-100
Frequency 20 40 80 100 50 20 10 10 0
[Histogram: frequency against marks, with class boundaries 10.5, 20.5, ..., 90.5 and the estimate of the mode marked at about 43]
Now the modal class contains 20 more than the class below and 50
more than the class above. So the mode is likely to divide the modal
class in the ratio 20:50 = 2:5.
Estimate of mode $= 40.5 + \frac{20}{20 + 50}(10)$
$= 40.5 + \frac{2}{7}(10)$
$= 43.4$ (1 d.p.)
[Diagram: a is the l.c.b. of the modal class and c is the class width]
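The division of the modal class in the ratio of the two differences can be sketched in Python. The function name and the symbols d1 and d2 (the excess of the modal frequency over the class below and the class above) are assumptions made for this illustration; only the modal class and its two neighbours from Example 1.21 are needed.

```python
def estimate_mode(lower_boundary, class_width, d1, d2):
    """Divide the modal class in the ratio d1 : d2, measured
    from its lower class boundary."""
    return lower_boundary + d1 / (d1 + d2) * class_width

below, modal, above = 80, 100, 50     # frequencies of 31-40, 41-50, 51-60
mode = estimate_mode(lower_boundary=40.5, class_width=10,
                     d1=modal - below, d2=modal - above)
print(round(mode, 1))
```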
Estimate of the mode from the cumulative frequency curve
The rate of increase of a cumulative frequency curve is greatest at
the point corresponding to the mode. Therefore, at the mode there
is a point of inflexion on the cumulative frequency curve. To
estimate where this occurs, place a ruler along the curve and find
where the curve has its maximum gradient.
[Sketch of a cumulative frequency curve, with the point of inflexion marked above the mode]
Example 1.22 Draw a cumulative frequency curve for the data in Example 1.21
and use it to estimate the mode.
Solution 1.22 The upper class boundaries are 20.5, 30.5, 40.5, ..., 90.5. The
lower boundary of the first class is 10.5.
The cumulative frequency distribution is as follows:
Marks                  <10.5 <20.5 <30.5 <40.5 <50.5 <60.5 <70.5 <80.5 <90.5
Cumulative frequency   0     20    60    140   240   290   310   320   330
[Cumulative frequency curve: cumulative frequency against marks, with the mode marked]
We see from the curve that the point of inflexion occurs when the
mark is 44.5 (approximately).
Therefore an estimate of the mode is 44.5 marks.
Exercise 1h
1. Find the mode of each of the following The age recorded for each man is the
sets of numbers: number of completed years lived.
(Q)e2im 22 a Zon 2a eee ao (a) Construct the cumulative frequency
(b) 412, 426, 435, 412, 427, 428, 485 table and draw the cumulative frequency
(c) 4, 6, 4, 8, 9, 2,4, 2, 6, 7, 8, 6, 5, 5, curve.
4,6 (b) From the cumulative frequency curve,
(d) 101; 106, 99, 108, 76, 87, 102, 93 estimate the mode.
(c) Drawa histogram, and use it to estimate
2. Find the mode of the following frequency the mode.
distribution:
12 14 26 35 23 5 1
Table B
Life (in hours) - 660-669 670-679 680-689 690-699 700-709 710-719 720-729 730-739
No. of bulbs
Therefore $\bar{x} = \dfrac{\sum x}{n} = \dfrac{694}{10} = 69.4$
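$y_1 = x_1 - x_a$
$y_2 = x_2 - x_a$
$\vdots$
$y_n = x_n - x_a$
Summing $\sum y_i = \sum x_i - n x_a$, for $i = 1, 2, \ldots, n$
So $\dfrac{\sum y_i}{n} = \dfrac{\sum x_i}{n} - x_a$
Therefore $\bar{y} = \bar{x} - x_a$
Rearranging $\bar{x} = x_a + \bar{y}$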
Example 1.24 Find the mean of the set of numbers given in Example 1.23, using
an assumed mean $x_a$ of 70.
Now $y = x - 70$
Therefore $\bar{y} = \bar{x} - 70$
so $\bar{x} = 70 + \bar{y}$
$= 70 + \dfrac{\sum y}{n}$
$= 70 + \dfrac{-6}{10}$
$= 69.4$
Therefore the mean, $\bar{x} = 69.4$, as before.
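The assumed-mean method is easy to check numerically. In the sketch below the five values are hypothetical (they are not the data of Example 1.23) and the function name is illustrative; the point is only that the assumed mean plus the mean of the deviations agrees with the direct calculation.

```python
def mean_with_assumed_mean(values, assumed_mean):
    """Mean = assumed mean + mean of the deviations y = x - assumed mean."""
    deviations = [x - assumed_mean for x in values]
    return assumed_mean + sum(deviations) / len(values)

sample = [66, 68, 69, 71, 73]          # hypothetical data for illustration
direct = sum(sample) / len(sample)
via_assumed = mean_with_assumed_mean(sample, assumed_mean=70)

print(direct, via_assumed)
```

Any assumed mean gives the same answer; a value near the middle of the data simply keeps the deviations small.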
Exercise 1i
1. Find the mean for each of the following sets of numbers (a) without
using an assumed mean, (b) using an assumed mean.
(i) 5, 6, 6, 8, 8, 9, 11, 13, 14, 17
(ii) bee 153, bees oe pee
aaa 1 gd ggd 541 cod 541 551 564
(iii) 445, 475, 485, 515, 525, 545, 555, 565
(iv) 1769, 1771, 1772, 1775, 1778, 1781,
rs Ae bani aed Bil toe
Vv . 9 . > . r} ° > na > -
2. If the mean of the following numbers is 17, find the value of c:
12, 18, 21, c, 13
3. The mean of 10 numbers is 8. If an eleventh number is now included
in the results, the mean becomes 9. What is the value of the
eleventh number?
4. The mean of 4 numbers is 5, and the mean of 3 different numbers is
12. What is the mean of the 7 numbers together?
5. The mean of n numbers is 5. If the number 13 is now included with
the n numbers, the new mean is 6. Find the value of n.
Example 1.25 The 30 members of an orchestra were asked how many instruments
each could play. The results are set out in the frequency distribution.
Calculate the mean number of instruments played:
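Number of instruments, x   1   2   3   4   5
Frequency, f               11  10  5   3   1
Solution 1.25
x    f    fx
1    11   11
2    10   20
3    5    15
4    3    12
5    1    5
     $\sum f = 30$   $\sum fx = 63$
$\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{63}{30} = 2.1$
If an assumed mean $x_a$ is used, $\bar{x} = x_a + \bar{y}$, where $\bar{y} = \dfrac{\sum fy}{\sum f}$ and $y = x - x_a$.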
Example 1.26 For the data of Example 1.25, find the mean using an assumed
mean of 2.
Solution 1.26
so $\bar{x} = 2 + \dfrac{\sum fy}{\sum f} = 2 + \dfrac{3}{30}$
$= 2.1$
The mean number of instruments played is 2.1, as before.
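Both routes to the mean of a frequency distribution can be compared in a few lines of Python. The frequencies 11, 10, 5, 3, 1 are taken to be those of Example 1.25 (they total 30 and give a sum of fx equal to 63); the variable names are illustrative.

```python
x = [1, 2, 3, 4, 5]        # number of instruments
f = [11, 10, 5, 3, 1]      # number of players

# Direct method: mean = sum(fx) / sum(f).
mean_direct = sum(fi * xi for fi, xi in zip(f, x)) / sum(f)

# Assumed-mean method with assumed mean 2: y = x - 2.
assumed = 2
sum_fy = sum(fi * (xi - assumed) for fi, xi in zip(f, x))
mean_assumed = assumed + sum_fy / sum(f)

print(mean_direct, mean_assumed)
```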
Example 1.27 The lengths of 40 bean pods were measured to the nearest cm and
grouped as shown. Find the mean length, giving the answer to 1 d.p.
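Mid-point, x   f    fx
6              2    12
11             4    44
16             7    112
21             14   294
26             8    208
31             5    155
               $\sum f = 40$   $\sum fx = 825$
$\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{825}{40}$
$= 20.6$ (1 d.p.)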
Therefore the mean length of the bean pods is 20.6 cm (1 d.p.).
x    f    y = x − 21   fy
6    2    −15          −30
11   4    −10          −40
16   7    −5           −35
21   14   0            0
26   8    5            40
31   5    10           50
$\sum fy = -105 + 90 = -15$
$\bar{x} = 21 + \bar{y}$ where $\bar{y} = \dfrac{\sum fy}{\sum f}$
$= 21 + \dfrac{-15}{40}$
$= 20.6$ (1 d.p.)
Therefore the mean length of the bean pods is 20.6 cm (1 d.p.), as
before.
Exercise 1j
1./ Find the mean for each of the following (a) Calculate the mean maximum day time
frequency distributions (a) without using temperature for February. (b) Find the
an assumed mean, (b) using an assumed mode. (c) Find the median.
mean. 5.
A sample of 100 boxes of matches was
o
follows:
Number of matches
pens 47 48 49 50 51
ai 14.25 32 23. 6
; Frequency 4
Table A
Example 1.28 Find the mean of the set of numbers 5693, 5700, 5714, 5721,
5735.
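Solution 1.28 Choose as assumed mean $x_a = 5714$. However, when we write out
the values of $x - x_a$ we note that all the numbers are multiples of 7.
So, we introduce a further column $y = \dfrac{x - 5714}{7}$
x      x − 5714   y
5693   −21        −3
5700   −14        −2
5714   0          0
5721   7          1
5735   21         3
                  $\sum y = -1$
$\sum x = (5)(5714) + \sum 7y$
$= (5)(5714) + 7\sum y$
So $\dfrac{\sum x}{5} = 5714 + 7\,\dfrac{\sum y}{5}$
Therefore $\bar{x} = 5714 + 7\left(\dfrac{-1}{5}\right)$
$= 5712.6$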
Proof Let $y_i = \dfrac{x_i - x_a}{b}$
Rearranging $x_i = x_a + b y_i$
Summing $\sum x_i = n x_a + b \sum y_i$
so $\dfrac{\sum x_i}{n} = x_a + b\,\dfrac{\sum y_i}{n}$
i.e. $\bar{x} = x_a + b\bar{y}$
This method is particularly useful when the data is in the form of a
frequency distribution and the intervals are of equal width.
In this case $\bar{y} = \dfrac{\sum f_i y_i}{\sum f_i}$
For data grouped into classes of equal width:
(a) use the mid-point of each interval to represent the class,
(b) choose a central value as assumed mean, $x_a$,
(c) divide by the class width, b.
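The coding method can be sketched in Python using the numbers of Example 1.28, with assumed mean 5714 and scale factor 7. The function and parameter names are hypothetical and chosen only for this illustration.

```python
def coded_mean(values, assumed_mean, scale):
    """Code each value as y = (x - assumed_mean) / scale, then
    decode the mean of y: mean of x = assumed_mean + scale * mean of y."""
    coded = [(x - assumed_mean) / scale for x in values]
    return assumed_mean + scale * sum(coded) / len(coded)

numbers = [5693, 5700, 5714, 5721, 5735]
result = coded_mean(numbers, assumed_mean=5714, scale=7)
print(round(result, 1))
```

For grouped data the same idea applies with the class mid-points in place of the individual values and each coded value weighted by its frequency.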
Example 1.29 A girl measured her waiting time (in minutes, to the nearest minute)
for the school bus on 30 mornings, and obtained the following
results:
Solution 1.29 The mid-points are 2.5, 6.5, 10.5, 14.5, 18.5. Choose a central value
of 10.5, say, as assumed mean; so $x_a = 10.5$.
The class widths are each equal to 4 so take $b = 4$.
So we use $x_a = 10.5$ and $b = 4$,
i.e. $x = 10.5 + 4y$
Waiting
time (min)
0
T
8
$\sum fy = -12 + 15 = 3$
Now $\bar{x} = x_a + b\bar{y}$ where $\bar{y} = \dfrac{\sum fy}{\sum f}$
So $\bar{x} = 10.5 + 4\left(\dfrac{3}{30}\right)$
$= 10.5 + 0.4$
$= 10.9$
The mean waiting time is 10.9 minutes.
Exercise 1k
1. Find the mean, X, of the set of numbers Find the mean, using a method of coding.
10, 20, 30, 40, 50, 60 using the coding
_ x—A40
02: 4, Find the mean, using a method of coding,
for each of the following frequency dis-
2. Find the mean, x, of the numbers 217, tributions:
222, 227, 237,242,252 using the coding
Skanes& (a) | Interval |15-21 22-28 29-35 36-42 43-49
Ere Frequency 2 18 23 17 9
3. The table shows the masses of a group of (b) | Interval | 0- 10- 20- 30- 40- 50- 60-
male students at a college. Measurements Frequency |10 15 23 32 18 2 0
have been taken to the nearest kg.
60-64 65-69 70-74 75-79 80-84 85-89 (c) 1-2 3-4 5-6 7-8 9-10 11-12 13-14
Mass (kg)
2 42 60 35 12 p mccemsiomnid | 18 «6 2
Frequency 4
5. Ina practical class students timed how long feed the animals. The results were as
it took for a sample of their saliva to break shown:
down a 2% starch solution. The times, to
the nearest second, are shown in Table A |Time(min) [-15 -20 -25 -30 -35 -40 -45 -50
below. Find the mean time, using a method
of coding.
6. Each morning for a month the owner of a Calculate the mean time taken to feed the
smallholding timed how long it took to animals, using a method of coding.
Table A
MEASURES OF DISPERSION
THE RANGE
The range is the difference between the highest and the lowest
value. It is based entirely on the extreme values.
The mean deviation from the mean makes use of all the observations:
mean deviation from the mean $= \dfrac{\sum |x - \bar{x}|}{n}$
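As a rough illustration of these two measures, the sketch below computes the range and the mean deviation from the mean. The function names and the list `masses` are hypothetical, made up for this illustration; they are not the data of any example in the text.

```python
def data_range(values):
    """Difference between the highest and the lowest value."""
    return max(values) - min(values)


def mean_deviation(values):
    """Average absolute deviation from the mean: sum|x - mean| / n."""
    n = len(values)
    mean = sum(values) / n
    return sum(abs(x - mean) for x in values) / n


# Hypothetical packet masses in grams (not from the text)
masses = [196, 198, 199, 200, 200, 201, 202, 204]
print(data_range(masses), mean_deviation(masses))
```

The range depends only on the two extreme values, whereas every observation contributes to the mean deviation.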
Example 1.30 Two machines, A and B, are used to pack biscuits. A sample of 10
packets was taken from each machine and the mass of each packet,
measured to the nearest gram, was noted. Find the mean deviation
from the mean of the masses of the packets taken in the sample
for each machine. Comment on your answer.
_| Machine A
(mass in g)
Machine B
(mass in g)
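Machine A: $\bar{x} = \dfrac{\sum x}{n} = \dfrac{2000}{10} = 200$        Machine B: $\bar{x} = \dfrac{\sum x}{n} = \dfrac{2000}{10} = 200$
Mean deviation from the mean $= \dfrac{\sum |x - \bar{x}|}{n}$:
Machine A: $= 1.8$        Machine B: $= 4.2$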
The larger number for machine B indicates that the masses are more
widely spread than those from machine A.
Therefore machine A is more reliable.
It is possible to find the mean deviation from the median and the
mean deviation from the mode. However, the mean deviation is not
widely used.
THE STANDARD DEVIATION, s
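The square of the deviation from the mean is considered for each
value of x.
$s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$
THE VARIANCE
The variance is the square of the standard deviation: $s^2 = \dfrac{\sum (x - \bar{x})^2}{n}$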
Example 1.31 For the data given in Example 1.30, calculate the standard deviation
of each machine, given $\bar{x} = 200$ g in each case.
Machine A: $s^2 = \dfrac{\sum (x - 200)^2}{10} = \dfrac{56}{10} = 5.6$, so $s = \sqrt{5.6} = 2.37$ (2 d.p.)
Machine B: $s^2 = \dfrac{\sum (x - 200)^2}{10} = \dfrac{240}{10} = 24$, so $s = \sqrt{24} = 4.90$ (2 d.p.)
The standard deviation for machine A is 2.37 g and the standard
deviation for machine B is 4.90 g, once again indicating that machine
A is more reliable.
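$s^2 = \dfrac{1}{n}\sum (x_i - \bar{x})^2$
$= \dfrac{1}{n}\sum (x_i^2 - 2\bar{x}x_i + \bar{x}^2)$
$= \dfrac{1}{n}\left(\sum x_i^2 - 2\bar{x}\sum x_i + n\bar{x}^2\right)$
$= \dfrac{\sum x_i^2}{n} - 2\bar{x}\,\dfrac{\sum x_i}{n} + \dfrac{n\bar{x}^2}{n}$
$= \dfrac{\sum x_i^2}{n} - 2\bar{x}^2 + \bar{x}^2$
$= \dfrac{\sum x_i^2}{n} - \bar{x}^2$
So we have $s = \sqrt{\dfrac{\sum x^2}{n} - \bar{x}^2}$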
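The two forms of the standard deviation formula (both dividing by n, as in the text) can be compared numerically. The following is a minimal sketch; the function names are assumptions made for illustration, and the data are the numbers 2, 3, 5, 6, 8 of Example 1.32.

```python
import math


def sd_definition(values):
    """s = sqrt( sum (x - mean)^2 / n )."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum((x - mean) ** 2 for x in values) / n)


def sd_alternative(values):
    """s = sqrt( sum x^2 / n - mean^2 )."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum(x * x for x in values) / n - mean ** 2)


data = [2, 3, 5, 6, 8]
s1 = sd_definition(data)
s2 = sd_alternative(data)
print(round(s1, 2), round(s2, 2))
```

Both forms give the same value up to floating-point rounding. The second form needs only the totals $\sum x$ and $\sum x^2$, which is why it usually involves less working by hand.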
Example 1.32 Find the mean and the standard deviation of the set of numbers
2,3,5,6,8
Solution 1.32
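Mean: $\bar{x} = \dfrac{\sum x}{n} = \dfrac{24}{5} = 4.8$
Standard deviation:
Method 1, using $s = \sqrt{\dfrac{\sum (x - \bar{x})^2}{n}}$
$s^2 = \dfrac{\sum (x - 4.8)^2}{5} = \dfrac{22.8}{5} = 4.56$
$s = \sqrt{4.56} = 2.14$ (2 d.p.)
Method 2, using $s = \sqrt{\dfrac{\sum x^2}{n} - \bar{x}^2}$
$s^2 = \dfrac{138}{5} - (4.8)^2 = 4.56$
$s = \sqrt{4.56} = 2.14$ (2 d.p.)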
Therefore the standard deviation of the set of numbers is 2.14
(2 d.p.).
NOTE: in this case there is far less working involved in method 2.
Exercise 1l
Find the mean and the standard deviation For a set of 9 numbers ©(x— X)* = 234.
of the following sets of numbers. For Find the standard deviation of the num-
questions (a), (6) and (c) try using both bers.
forms of the formula for the standard
deviation. Use whichever you wish for For a set of 9 numbers ©(x—X)? = 60
parts (d), (e) and (f). Do not use the and Dx” = 285. Find the mean of the
programmed functions on your cal- numbers.
culator.
(a) 2, 4,5, 6,8 The numbers a, b,8,5,7 havea mean of
(bo) 6,8,9,11 6 and a variance of 2. Find the values of
(c) 11, 14,17, 28, 29 aand b,ifa>b.
(d) 5,13, 7,9, 16,15
Find the mean and the standard deviation
(e) 4.6, 2.7, 3.1, 0.5, 6.2
of the set of integers 1,2,3,..., 20.
(f) 200, 203, 206, 207, 209
Find the mean and the standard deviation
of the first n integers.
The mean of the numbers 3,6, 7,a,14 is
8. Find the standard deviation of the set You may use
of numbers.
>a - = 1ann td);
9. From the information given about each of 10. Calculate the mean and the standard
the following sets of data, work out the deviation of the four numbers
Te[3 | Be
missing values in the table:
25 .Ono
The following example has been done using two types of calculator,
and you should consult your calculator instructions if yours does
not appear to follow one of the patterns.
Example 1.33 Find the mean and standard deviation of the numbers
33, 28, 26, 35, 38
Solution 1.33
On either type of calculator, pressing the key for the mean gives $\bar{x} = 32$.
Exercise 1m
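Example 1.34 For the set of numbers 3, 6, 7, 9, 10 the mean is 7 and the standard
deviation is $\sqrt{6}$. If each number in the set is increased by 3, find the
new mean and standard deviation. Comment on your answers.
Solution 1.34 The new set of numbers is 6, 9, 10, 12, 13.
$\bar{x} = \dfrac{\sum x}{n} = \dfrac{50}{5} = 10$
$\sum (x - \bar{x})^2 = 30$
$s = \sqrt{\dfrac{30}{5}} = \sqrt{6}$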
Therefore, if each member of the set of numbers is increased by 3,
then the mean is increased by 3 but the standard deviation remains
unaltered.
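In general, consider the set of n numbers $x_1, x_2, \ldots, x_n$ with mean
$\bar{x}$ and standard deviation $s_1$. If each number is increased by a constant c, the new numbers are $y_i = x_i + c$, with mean $\bar{y} = \bar{x} + c$ and standard deviation $s_2$, where
$s_2^2 = \dfrac{\sum (y_i - \bar{y})^2}{n}$
$= \dfrac{\sum \left(x_i + c - (\bar{x} + c)\right)^2}{n}$
$= \dfrac{\sum (x_i - \bar{x})^2}{n}$
$= s_1^2$
Therefore $s_2 = s_1$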
Exercise 1n
1. By considering the set of numbers 3, 6, 7, 9, 10, with mean 7 and standard deviation $\sqrt{6}$, investigate the effect on the mean and on the standard deviation of multiplying each term by 3.
2. The set of numbers $x_1, x_2, \ldots, x_n$ has mean $\bar{x}$ and standard deviation $s_1$. Each of the numbers is multiplied by a constant term k. Show that the new mean is $k\bar{x}$ and the new standard deviation $s_2 = ks_1$.
3. (a) Find the mean and the standard deviation of the set of numbers 4, 6, 9, 3, 5, 6, 9.
(b) Deduce the mean and the standard deviation of the set of numbers 514, 516, 519, 513, 515, 516, 519.
(c) Deduce the mean and the standard deviation of the set of numbers 52, 78, 117, 39, 65, 78, 117.
4. (a) Find the mean and the variance of the ordered set of numbers
A = {1, 2, 3, 4, 5, 6, 7}.
Hence find the mean and the variance of the following ordered sets
B = {4, 5, 6, 7, 8, 9, 10}
C = {10, 20, 30, 40, 50, 60, 70}
D = {13, 23, 33, 43, 53, 63, 73}
(b) If $a_i$ is the ith member of A and $d_i$ is the ith member of D, find a relationship between $a_i$ and $d_i$ in the form $d_i = la_i + m$ where l and m are constants.
(c) The two ordered sets X and Y each have n elements and $y_i = px_i + q$ where p and q are constants. If the mean and the variance of X are $\bar{x}$ and $s_x^2$, show that the mean and the variance of Y are $p\bar{x} + q$ and $p^2 s_x^2$. (L Additional)
5. A set of values of a variable X has mean 5 and standard deviation 2. Values of a new variable are obtained by using the formula Y = 4X − 3. Find the mean and the standard deviation of the set of values of Y.
Example 1.35 A set of marks has a mean of 40 and a standard deviation of 5. The
marks are to be scaled so that the mean becomes 50 and the stand-
ard deviation becomes 8. If the equation of the transformation is
y = ax + b, find the values of the constants a and b. Find also the
scaled mark which corresponds to a mark of 45 in the original set.
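Solution 1.35 If there are n marks, then $y_i = ax_i + b$ for each $i = 1, 2, \ldots, n$.
Summing $\sum y_i = a\sum x_i + nb$
$\dfrac{\sum y_i}{n} = a\,\dfrac{\sum x_i}{n} + b$
So $\bar{y} = a\bar{x} + b$
Hence $50 = a(40) + b$, i.e. $40a + b = 50$ (i)
Let $s_x$ be the original standard deviation, and $s_y$ the new standard
deviation.
Then $s_y^2 = \dfrac{\sum (y_i - \bar{y})^2}{n}$
$= \dfrac{\sum \left(ax_i + b - (a\bar{x} + b)\right)^2}{n}$
$= \dfrac{a^2 \sum (x_i - \bar{x})^2}{n}$
$= a^2 s_x^2$
or $s_y = as_x$
So $8 = 5a$, i.e. $a = \dfrac{8}{5}$ (ii)
Substituting in (i): $64 + b = 50$, so $b = -14$.
The transformation is $y = 1.6x - 14$, and when $x = 45$, $y = 1.6(45) - 14 = 58$.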
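The two conditions $\bar{y} = a\bar{x} + b$ and $s_y = as_x$ determine a and b directly. The sketch below is an illustration only; the function name `linear_scaling` is an assumption made here, and it takes a > 0 so that the order of the marks is preserved. It is applied to the figures of Example 1.35.

```python
def linear_scaling(mean_old, sd_old, mean_new, sd_new):
    """Return (a, b) for y = a*x + b, assuming a > 0.

    Uses s_y = a * s_x and y-bar = a * x-bar + b.
    """
    a = sd_new / sd_old
    b = mean_new - a * mean_old
    return a, b


# Example 1.35: mean 40, s.d. 5 scaled to mean 50, s.d. 8
a, b = linear_scaling(mean_old=40, sd_old=5, mean_new=50, sd_new=8)
scaled_45 = a * 45 + b  # scaled mark for an original mark of 45
print(a, b, scaled_45)
```

Solving for a first and then b mirrors equations (ii) and (i) of the worked solution.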
Exercise 1o
1. It is proposed to convert a set of marks whose mean is 52 and standard deviation is 4 to a set of marks with mean 61 and standard deviation 3. The equation for the transformation necessary to convert the marks is y = ax + b. Find (i) the values of a and b, (ii) the value of the scaled mark which corresponds to a mark of 64 in the original data, (iii) the value in the original data if the scaled mark is 79.
2. The marks of 5 students in a mathematics test were 27, 31, 35, 47, 50.
(i) Calculate the mean mark and the standard deviation.
(ii) The marks are scaled so that the mean and standard deviation become 50 and 20 respectively. Calculate, to the nearest whole number, the new marks corresponding to the original marks of 31 and 50. (C Additional)
3. In order to compare the performances of candidates in two schools a test was given. The mean mark at school A was 45, and the mean mark at school B was 31 with a standard deviation of 5. The marks of school A are scaled so that the mean and standard deviation are the same as school B and a mark of 85 at school A becomes 63. Find the values of a and b if the transformation used is y = ax + b. Find also the original standard deviation of the marks from school A.
4. Given that the mean and standard deviation of a set of figures are μ and σ respectively, write down the new values of the mean and standard deviation when
(i) each figure is increased by a constant c,
(ii) each figure is multiplied by a constant k.
5. A group of students sat two examinations, one in algebra and one in biology. In order to compare the results the algebra marks were scaled linearly (that is, a mark of x became a mark of ax + b where a and b are constants) so that the means and standard deviations of the marks in both examinations became the same. The original means and standard deviations are shown in the table.
Mean mark 48
Standard deviation 12
Find a and b.
The original marks of a particular student are 36 in algebra, 48 in biology. In what sense, if any, has he done better in algebra than in biology? (C Additional)
6. A linear function f(x) = ax + b transforms X = {…} into a set Y, so that f(5) = 13 and f(1) = 5.
(a) Find f.
(b) Calculate the mean and the variance of X.
(c) Hence calculate the mean and the variance of Y.
An element, k, is added to X forming a set Z. Given that the mean of Z is three greater than the mean of X, find
(d) the value of k,
(e) the variance of Z. (L Additional)
7. Show that the standard deviation of the integers
1, 2, 3, 4, 5, 6, 7
is 2.
Using this result find the standard deviation of the numbers
(a) 101, 102, 103, 104, 105, 106, 107.
(b) 100, 200, 300, 400, 500, 600, 700.
(c) 2.01, 3.02, 4.03, 5.04, 6.05, 7.06, 8.07.
(d) Write down seven integers which have mean 5 and standard deviation 6.
(L Additional)
Example 1.36 A set of 12 numbers has a mean of 4 and a standard deviation of 2.
A second set of 20 numbers has a mean of 5 and a standard devia-
tion of 3.
Find the mean and the standard deviation of the combined set of
32 numbers.
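Mean of the combined set: $\bar{x} = \dfrac{(12)(4) + (20)(5)}{32} = \dfrac{148}{32} = 4.625$
For the first set, $s_1^2 = \dfrac{\sum x^2}{n_1} - \bar{x}_1^2$
so $\sum_{i=1}^{n_1} x^2 = n_1(\bar{x}_1^2 + s_1^2) = (12)(4^2 + 2^2) = 240$
and, for the second set, $\sum_{i=1}^{n_2} x^2 = n_2(\bar{x}_2^2 + s_2^2) = (20)(5^2 + 3^2) = 680$
For all x, $\sum x^2 = 240 + 680 = 920$
So $s^2 = \dfrac{920}{32} - (4.625)^2 = 7.359$
So $s = \sqrt{7.359}$
$= 2.71$ (2 d.p.)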
Therefore the mean of the combined set of numbers is 4.625 and
the standard deviation is 2.71 (2d.p.).
$\bar{x} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$
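The procedure for combining two sets can be sketched as follows. The function name `combine_sets` is an assumption made for illustration; the method is the one used in Example 1.36: recover each $\sum x^2$ from $n(\bar{x}^2 + s^2)$, add the totals, and apply $s^2 = \sum x^2 / n - \bar{x}^2$ to the combined set.

```python
import math


def combine_sets(n1, mean1, sd1, n2, mean2, sd2):
    """Mean and standard deviation of two sets combined (divisor n)."""
    n = n1 + n2
    mean = (n1 * mean1 + n2 * mean2) / n
    # Sum of squares of each set: n * (mean^2 + sd^2)
    sum_sq = n1 * (mean1 ** 2 + sd1 ** 2) + n2 * (mean2 ** 2 + sd2 ** 2)
    variance = sum_sq / n - mean ** 2
    return mean, math.sqrt(variance)


# Example 1.36: 12 numbers (mean 4, s.d. 2) and 20 numbers (mean 5, s.d. 3)
combined_mean, combined_sd = combine_sets(12, 4, 2, 20, 5, 3)
print(combined_mean, round(combined_sd, 2))
```

Note that the combined standard deviation is not a weighted average of the two standard deviations; the difference between the two means also contributes to the spread.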
Example 1.37
                Number    Mean     Standard deviation
'management'     350      12.4%    2.1%
'union'          237      10.7%    1.8%
$\bar{x} = \dfrac{n_1\bar{x}_1 + n_2\bar{x}_2}{n_1 + n_2}$
$= \dfrac{(350)(12.4) + (237)(10.7)}{587}$
$= 11.7$ (1 d.p.)
Therefore the mean percentage level for the 587 workers is 11.7%
(1 d.p.).
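For the combined variance $s^2 = \dfrac{\sum x^2}{n} - \bar{x}^2$
Management: $\sum x^2 = n_1(s_1^2 + \bar{x}_1^2) = 350(2.1^2 + 12.4^2) = 55\,359.5$
Union: $\sum x^2 = n_2(s_2^2 + \bar{x}_2^2) = 237(1.8^2 + 10.7^2) = 27\,902.0$
$s^2 = \dfrac{55\,359.5 + 27\,902.0}{587} - (11.7)^2$
$= 4.95$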
( T) For each of the following sets of data, find and a standard deviation 1.9 yr. Calculate
~ the mean and the standard deviation of the the mean and standard deviation of the
combined set. ages of all the 565 pupils. (AEB 1976)
a)rny = 125 x; = 6, = 2 Se
te) ar 5. Be ma 51 = ‘4. Suppose that the values of a random
Derek? | 10, s2.= 3 sample taken from some population are
(b) ny = 30, X, = 27, s; = 5.6 X1,%2,...,Xy,. Prove the formula
nz = 40, X2 — 33, 2 6.4 n 2 n
Example 1.38 The table shows the number of children per family for a group of
20 families. The mean number of children per family is 2.9. Find
the standard deviation.
Frequency, f
Solution 1.38 Method 1, using $s = \sqrt{\dfrac{\sum f(x - \bar{x})^2}{\sum f}}$
$s^2 = \dfrac{\sum f(x - 2.9)^2}{\sum f}$
$= \dfrac{29.80}{20}$
$= 1.49$
So $s = \sqrt{1.49}$
$= 1.22$ (2 d.p.)
Method 2, using $s = \sqrt{\dfrac{\sum fx^2}{\sum f} - \bar{x}^2}$
$s^2 = \dfrac{198}{20} - (2.9)^2$
$= 1.49$
So $s = \sqrt{1.49}$
$= 1.22$ (2 d.p.)
With the calculator set to SD mode, either type of calculator gives
$\bar{x} = 2.9$, $s = 1.22$ (2 d.p.), $\sum f = 20$, $\sum fx = 58$, $\sum fx^2 = 198$.
Example 1.39 The lengths of 32 leaves were measured correct to the nearest mm.
Find the mean length and the standard deviation.
21
24
27
30
33
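$\sum fx^2 = 23\,805$
Now $\bar{x} = \dfrac{\sum fx}{\sum f} = \dfrac{867}{32} = 27.1$ (1 d.p.)
and $s^2 = \dfrac{\sum fx^2}{\sum f} - \bar{x}^2 = \dfrac{23\,805}{32} - \left(\dfrac{867}{32}\right)^2 = 9.835$
$s = \sqrt{9.835} = 3.14$ (2 d.p.)
The mean length of the leaves is 27.1 mm and the standard deviation
is 3.14 mm (2 d.p.).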
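For a frequency distribution, the mean and standard deviation depend only on the three totals $\sum f$, $\sum fx$ and $\sum fx^2$. The sketch below, with the hypothetical function name `stats_from_totals`, applies this to the totals quoted in Examples 1.38 and 1.39.

```python
import math


def stats_from_totals(sum_f, sum_fx, sum_fx2):
    """Return (mean, variance, sd) from the totals of a frequency table."""
    mean = sum_fx / sum_f
    variance = sum_fx2 / sum_f - mean ** 2
    return mean, variance, math.sqrt(variance)


# Example 1.38: sum f = 20, sum fx = 58, sum fx^2 = 198
mean_38, var_38, sd_38 = stats_from_totals(20, 58, 198)

# Example 1.39: sum f = 32, sum fx = 867, sum fx^2 = 23805
mean_39, var_39, sd_39 = stats_from_totals(32, 867, 23805)

print(round(mean_38, 1), round(sd_38, 2))
print(round(mean_39, 1), round(sd_39, 2))
```

Using the unrounded mean inside the variance calculation, as here, avoids the rounding error that creeps in if $\bar{x}$ is rounded first.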
_ Exercise 1q
A
Frequency, f | 2 b OC 2) 39) 6 4 2
4. Fora particular set of observations Lf = 20,
Dfx? = 16143, Dfx = 563. Find the values
2. The scores in an IQ test for 60 candidates of the mean and the standard deviation
are shown in the table. Find the mean 7
score and the standard deviation.
5. For a given frequency distribution
100-106 107-113 114-120 121-127 128-134
3 7. i = ; mh
> =e
—x)2= 182.3, Dfx 2= 1025, 2f= = 30.
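Let $y_i = \dfrac{x_i - x_a}{b}$ where $x_a$ is the assumed mean and
b is a suitable constant.
Now $x_i = x_a + by_i$
and $\bar{x} = x_a + b\bar{y}$ (as proved earlier)
Also $s^2 = \dfrac{\sum (x_i - \bar{x})^2}{n}$
$= \dfrac{\sum \left(x_a + by_i - (x_a + b\bar{y})\right)^2}{n}$
$= \dfrac{b^2 \sum (y_i - \bar{y})^2}{n}$
and so $s = b\sqrt{\dfrac{\sum (y_i - \bar{y})^2}{n}}$ for $i = 1, 2, \ldots, n$
or $s = b\sqrt{\dfrac{\sum y_i^2}{n} - \bar{y}^2}$
If the numbers occur with frequencies $f_1, f_2, \ldots, f_n$, then the
corresponding formulae for the standard deviation are:
$s = b\sqrt{\dfrac{\sum f(y - \bar{y})^2}{\sum f}}$ or $s = b\sqrt{\dfrac{\sum fy^2}{\sum f} - \bar{y}^2}$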
These formulae look very complicated, but in fact they are very
easy to use.
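A short numerical check may help here. The sketch below, with hypothetical function names, computes the standard deviation through the coding $y = (x - x_a)/b$ and compares it with the direct calculation. The data are the numbers of Example 1.28 with $x_a = 5714$ and $b = 7$, chosen only because that coding already appears in the text.

```python
import math


def sd_direct(values):
    """s = sqrt( sum x^2 / n - mean^2 )."""
    n = len(values)
    mean = sum(values) / n
    return math.sqrt(sum(x * x for x in values) / n - mean ** 2)


def sd_coded(values, assumed_mean, scale):
    """s = b * sqrt( sum y^2 / n - y-bar^2 ), where y = (x - x_a) / b."""
    n = len(values)
    ys = [(x - assumed_mean) / scale for x in values]
    y_bar = sum(ys) / n
    return scale * math.sqrt(sum(y * y for y in ys) / n - y_bar ** 2)


numbers = [5693, 5700, 5714, 5721, 5735]
s_coded = sd_coded(numbers, assumed_mean=5714, scale=7)
s_plain = sd_direct(numbers)
print(round(s_coded, 2), round(s_plain, 2))
```

Subtracting the assumed mean has no effect on the spread; only the scale factor b reappears in the final answer.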
Solution 1.40 We will use the coding $y = \dfrac{x - x_a}{b}$ where $x_a = 342$, $b = 5$,
so $y = \dfrac{x - 342}{5}$
Now $s^2 = b^2\left(\dfrac{\sum y^2}{n} - \bar{y}^2\right)$
$= 25[3.6 - (-0.4)^2]$
$= 86$
So $s = \sqrt{86}$
$= 9.27$ (2 d.p.)
The standard deviation of the set of numbers is 9.27 (2 d.p.).
Example 1.41 For the data given in Example 1.39, find the standard deviation,
using a method of coding.
Solution 1.41 We note that the class widths are each equal to 3 and a central
value is 27. Therefore, we choose the coding $y = \dfrac{x - 27}{3}$
Now $\bar{y} = \dfrac{\sum fy}{\sum f} = \dfrac{1}{32}$
We have $s^2 = b^2\left(\dfrac{\sum fy^2}{\sum f} - \bar{y}^2\right)$ where $b = 3$
$= 9\left(\dfrac{35}{32} - \left(\dfrac{1}{32}\right)^2\right)$
$= 9.835$
So $s = \sqrt{9.835}$
$= 3.14$ (2 d.p.)
The standard deviation of the length of the leaves is 3.14 mm, as
before.
Exercise 1r
1. Find the mean and the standard deviation Find the mean mark and the standard
of the following sets of data, using a deviation by using an assumed mean
method of coding: between 50 and 60 and a coding factor
of 10.
(a) |x |304 308 312 316 320 324
ife es oN eAel ea 2 3. A farmer grows two different varieties of
(b) 10-19 20-29 30-39 40-49 50-59 60-69 potatoes, Desiree and Pentland Squire. A
sample of 50 potatoes of each variety is
f 3 ie 48. 2p te
taken and the potatoes are weighed. The
(c) |x |1250 1500 1750 2000 2250 2500 2750 results are shown in Table B below. Find
Ca ar ee ee the mean and the standard deviation for
each sample. Use a method of coding.
(@) i be LOS ls) 22) Gee G 0 4, The table shows the times taken on 30
consecutive days for a coach to complete
(e) ey(Oa Of 0:7 1.0 123° WS 1.9 22 one journey on a particular route. Times
ie eens 1579 fe. 3.1 have been given to the nearest minute.
Find the mean time for the journey and
(f) -200 -250 -300 -350 -400 -450 -600 the standard deviation.
i 8 80 ess 38 |25 14 28
Time (min) 60-63 64-67 68-71 72-75 76-79
2. The marks obtained in an examination by Frequency 1 3 12 10 4
190 students are recorded in Table A below.
Table A
Mark O- 10- 20- 30- 40- 50- 60- 70- 80- 90- 100-
|Frequeney| 3 5 5 6 25 33 49 40 15 9 0
TableB
Desiree 1 3 4 7 12 15 5 3
frequency
Pentland
17 12 3 1
Squire
frequency
Example 1.42 In order to estimate the mean length of leaves from a certain tree a
sample of 100 leaves was chosen and their lengths measured correct
to the nearest mm. A grouped frequency table was set up and the
results were as follows:
(b) Calculate estimates for the mean and standard deviation of the
leaf lengths using an assumed mean of 4.7 cm.
——4
20
Frequency
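(b) As the intervals each have a width of 0.5, and we are told to
use an assumed mean of 4.7, we use the coding
$y = \dfrac{x - 4.7}{0.5}$
so $x = 4.7 + 0.5y$
and $\bar{x} = 4.7 + 0.5\bar{y}$ where $\bar{y} = \dfrac{\sum fy}{\sum f}$
Therefore $\bar{x} = 4.7 + 0.5\left(\dfrac{-59}{100}\right)$
$= 4.405$
Now $s^2 = b^2\left(\dfrac{\sum fy^2}{\sum f} - \bar{y}^2\right)$
$= (0.25)\left(\dfrac{363}{100} - \left(\dfrac{-59}{100}\right)^2\right)$
$= 0.8205$
$s = \sqrt{0.8205}$
$= 0.91$ (2 d.p.)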
Therefore the mean length of the leaves is 4.405 cm and the stan-
dard deviation is 0.91 cm.
(c) Each class width is 0.5 cm, so the class with mid-point 3.7 has
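l.c.b. = 3.7 − 0.25 = 3.45
u.c.b. = 3.7 + 0.25 = 3.95
From the cumulative frequencies, 46 items are less than 4.45 and 70 items are less than 4.95, so the median (the 50.5th item) lies between 4.45 and 4.95.
An estimate of the median is $4.45 + \dfrac{4.5}{24}(0.5) = 4.54$ cm (2 d.p.).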
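The median estimate is a linear interpolation inside the class that contains the median item. A minimal sketch follows; the function name and parameter names are assumptions introduced for illustration. It uses the figures of this example: 46 items below 4.45, 70 items below 4.95, and the median taken as the 50.5th of 100 items.

```python
def interpolate_in_class(lower_boundary, class_width, cum_below, cum_upto, position):
    """Linear interpolation for the item at `position` within one class.

    cum_below: cumulative frequency at the lower class boundary
    cum_upto:  cumulative frequency at the upper class boundary
    """
    fraction = (position - cum_below) / (cum_upto - cum_below)
    return lower_boundary + fraction * class_width


median_estimate = interpolate_in_class(
    lower_boundary=4.45, class_width=0.5,
    cum_below=46, cum_upto=70, position=50.5,
)
print(round(median_estimate, 2))
```

The same function can estimate quartiles by changing `position`, provided the class boundaries and cumulative frequencies of the relevant class are supplied.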
Age (years) | 0-9  10-19  20-29  30-39  40-49  50-59  60-69  70-79  80-99
Frequency   | 44    56     64     78     60     40     36     18     4
Solution 1.43 Assuming that ages have been given in completed years then the
interval 80-99 means 80 < age < 100; someone who is 99 years and
11 months, for example, would come into this interval.
$\sum fy^2 = 1969$
Now $y = \dfrac{x - 45}{10}$
So $x = 45 + 10y$
and $\bar{x} = 45 + 10\bar{y}$ where $\bar{y} = \dfrac{\sum fy}{\sum f}$
$\bar{x} = 45 + 10\left(\dfrac{-366}{400}\right) = 35.9$ (1 d.p.)
$s^2 = 100\left(\dfrac{1969}{400} - \left(\dfrac{-366}{400}\right)^2\right)$
So $s = 20.2$ (1 d.p.)
Therefore the mean age of the population is 35.9 years and the
standard deviation is 20.2 years.
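Age (years)   Frequency      Age (years)   Cumulative frequency
0-9              44            < 10             44
10-19            56            < 20            100
20-29            64            < 30            164
30-39            78            < 40            242
40-49            60            < 50            302
50-59            40            < 60            342
60-69            36            < 70            378
70-79            18            < 80            396
80-99             4            < 100           400
[Figure: cumulative frequency curve; horizontal axis Age (years) from 0 to 100, vertical axis Cumulative frequency]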
We want to find the number of people with ages within one standard
deviation of the mean, i.e. the number of people with ages in
the interval 35.9 ± 20.2 = (15.7, 56.1).
From the graph, we estimate that
72 people have ages below 15.7 years,
330 people have ages below 56.1 years.
So 258 people have ages in the interval (15.7, 56.1),
i.e. the percentage of the population $= \dfrac{258}{400}(100)\%$
$= 64.5\%$
Therefore 64.5% of the population of the village have ages which
are within 1s.d. of the mean.
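The graph reading can be roughly cross-checked by linear interpolation between the points of the cumulative frequency table. The sketch below rests on stated assumptions: ages are in completed years (so class boundaries are 0, 10, 20, ..., 80, 100), the final class frequency is taken as 4 so that the total is 400, and the function name `cumulative_at` is hypothetical. Because it joins the points with straight lines rather than a smooth curve, it will not reproduce the 64.5% read from the graph exactly.

```python
def cumulative_at(x, boundaries, cumulative):
    """Estimate the cumulative frequency at x by linear interpolation."""
    for i in range(1, len(boundaries)):
        lo, hi = boundaries[i - 1], boundaries[i]
        if lo <= x <= hi:
            frac = (x - lo) / (hi - lo)
            return cumulative[i - 1] + frac * (cumulative[i] - cumulative[i - 1])
    raise ValueError("x lies outside the range of the table")


# Class boundaries and cumulative frequencies (final class frequency assumed to be 4)
boundaries = [0, 10, 20, 30, 40, 50, 60, 70, 80, 100]
cumulative = [0, 44, 100, 164, 242, 302, 342, 378, 396, 400]

below_low = cumulative_at(15.7, boundaries, cumulative)
below_high = cumulative_at(56.1, boundaries, cumulative)
percentage = 100 * (below_high - below_low) / cumulative[-1]
print(round(below_low, 1), round(below_high, 1), round(percentage, 1))
```

The straight-line estimate comes out a little below the value obtained from the hand-drawn curve, which is the kind of discrepancy to expect between a polygon and a smooth curve through the same points.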
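Example 1.44 A set of n values has mean $\mu$ and variance $s_1^2$. A second set of n values
has mean $a\mu$ and variance $s_2^2$. Given that s is the standard deviation
of the combined set of 2n values, show that
$s^2 = \tfrac{1}{2}(s_1^2 + s_2^2) + \tfrac{1}{4}(1 - a)^2\mu^2$
Solution 1.44
Mean of the combined set $= \dfrac{n\mu + na\mu}{2n}$
$= \dfrac{n\mu(a + 1)}{2n}$
$= \tfrac{1}{2}\mu(a + 1)$
For the first set, $s_1^2 = \dfrac{\sum x^2}{n} - \mu^2$, so
$\sum x^2 = n(s_1^2 + \mu^2)$
Similarly, for the second set, $\sum x^2 = n(s_2^2 + a^2\mu^2)$
For the combined set
$s^2 = \dfrac{n(s_1^2 + \mu^2) + n(s_2^2 + a^2\mu^2)}{2n} - \tfrac{1}{4}(1 + a)^2\mu^2$
$= \tfrac{1}{2}(s_1^2 + s_2^2) + \tfrac{1}{2}(1 + a^2)\mu^2 - \tfrac{1}{4}(1 + a)^2\mu^2$
$= \tfrac{1}{2}(s_1^2 + s_2^2) + \tfrac{1}{4}(1 - a)^2\mu^2$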
Miscellaneous Exercise 1s
Two hundred and fifty Army recruits Obtain an estimate of the mean and standard
have the following heights. deviation of the data. Estimate the median,
and the lower and upper quartiles. (O &C)
Height (cm) 165- 170- 175- 180- 185- 190-195
No. of recruits | 18 37 60 65 48 22 4, Table A below gives an analysis by num-
bers of employees of the size of UK
Plot the data in the form of a cumulative factories of less than 1000 employees
frequency curve. Use the curve to estimate manufacturing clothing and footwear.
(a) the median height, (b) the lower Calculate as accurately as the data allow
quartile height. the mean and the median of this distribu-
The tallest 40% of the recruits are to be tion, showing your working.
formed into a special squad. Estimate If 90% of the factories have less than N
(a) the median, (b) the upper quartile of employees, estimate N. (O&C)
the heights of the members of this squad.
(SUJB Additional) \ The numbers 4,6,12;4,10,12,3,x,y
' have a mean of 7 and a mode of 4. Find
Below are given the number n of hours (i) the values of the two numbers x and
worked in a week by 64 men. y, (ii) the median of this set of nine
numbers.
30.8 27.6 33.6 39.4 39.7
21.8 40.6 33.9 36.9 39.1 When two additional numbers 7+ 7 and
45.4 42.5 9.6 26.3 36.1 7— n are included the standard deviation
30.5 44.4 38.4 40.6 26.5 of all eleven numbers is found to be 4.
52.7 35.7 28.9 38.2 30.4
34.8 37.8 38.0 43.7 40.8 Write down the mean of these eleven
40.1 23.7 31.8 42.0 29.1 numbers and calculate the value of n.
37.3 28.4 39.6 22.9 35.2 (C Additional)
(i) Group the numbers into intervals of 654 The sum of 20 numbers is 320 and the
width 8 hours defined by 9.5 <n < 12.5, sum of their squares is 5840. Calculate
1285 in) <a ober the mean of the 20 numbers and the
(ii) Use the grouped data to calculate standard deviation.
estimates of the mean and standard (i) Another number is added to these
deviation of n. 20 so that the mean is unchanged. Show
(iii) Estimate the percentage of workmen that the standard deviation is decreased.
for whom nis within one standard devia- (ii) Another set of 10 numbers is such
tion of the mean. (MEI) that their sum is 130 and the sum of
their squares is 2380. This set is com-
The following table shows the durations bined with the original 20 numbers. Cal-
of 40 telephone calls from an office via culate the mean and standard deviation
the office switchboard. of all 30 numbers. (C Additional)
Duration A weather station recorded the number
S1 1-2/ 2-3) 3-5 5-10 210
in minutes
of hours of sunshine each day for 80
Number of
6 LO 45 5 4 0} days, with the results as shown in Table B
calls
below.
Table A
Number of
11-19 20-24 25-99 100-199 200-499 500-999
employees
Number of
1500 800 2800 70 0 400 100 5800
Table B
Hours of
O O-1 1-2 2-3 3-4 4-5 5-6 6-7 17-8 8-12 over 12
sunshine
Number
Bee ls 2 6 17 ade allah 5 3 9 2 0
[The grouping symbol 2-3, for example, nearest minute, the estimated mean and
denotes greater than 2 hours and less than standard deviation for the duration of all
or equal to 3 hours.| 100 journeys. (C)
State which is the modal group.
10.
~
Table D below gives the cumulative
Construct a cumulative frequency table frequency distribution of the masses x in
and draw the cumulative frequency curve. kilogrammes of a group of 200 eighteen-
Use your curve to estimate (i) the median, year-old boys.
(ii) the inter-quartile range, (iii) the per-
Draw a cumulative frequency graph and
centage of days for which more than 3h from this estimate the median.
hours of sunshine were recorded.
(C Additional) Compile a frequency distribution from
the data and hence estimate the mean and
8. (a) Sketch the expected frequency curves standard deviation of the sample. State a
‘for each of the following distributions: well known probability distribution which
(i) the number of light bulbs broken you would expect to fit such data.
in boxes containing 125 bulbs, id (JMB)
assuming that the modal number of iy
breakages is 0, 11. 100 pupils were tested to determine their
(ii) the age at marriage of females. intelligence quotient (I.Q.), and the
(b) State the assumption that is made in results were as follows:
obtaining measures of average and dis- Beals 55- 65- 75- 85- 95- 105- 115- 125-134
persion from grouped frequency tables. No, of pupils | 1 1° 2 #6 21 29 24 12 4
The table below shows the ages, at last
birthday, of the employees of a certain All 1.Q.’s are given to the nearest integer.
firm. (i) Calculate the mean, and the standard
deviation.
Age (last Less than 20 20- 25- 30- 40- 50 and over (ii) Draw a cumulative frequency graph,
birthday)
and estimate how many pupils have I.Q.’s
within 1s.d. on either side of the mean.
employees
(SUJB)
Without drawing a cumulative frequency
curve, estimate (i) the semi-interquartile 12. (a) Find the median, mean, and standard
range, (ii) the number of employees aged deviation of the set of numbers 3,5,12,1,
37 and over. (C Additional) 6, 3,12.
(bo) A set of digits consists of m zeros
9. Table C below shows the durations of 60 and n ones. Find the mean of this set and
journeys on the same route bya lorry, show that the standard deviation is
the variations in journey times being
caused by varying traffic conditions. V(mn
Can) (C Additional)
Calculate, to the nearest minute, estimates
of the mean and standard deviation for
the duration of the journeys. 13. (a) A set of values of a variable X has a
When the times for 40 other journeys mean WU and a standard deviation 0. State
were taken, it was found that the mean the new value of the mean and of the
_pf standard deviation for the times of standard deviation when each of the
these 40 journeys were 6h 24 min and variables is (i) increased by k, (ii) multi-
18 min, respectively. Find, also to the plied by p.
Table C
\
Table D
30 35 40 45 50 55 60 65 70 75 80 85 90 95
Number with
0 1 4 11 25 47 79 114 146 171 187 195 198 200
mass less
than x
Values of a new variable Y are obtained that the mean becomes 45 and the lower
by using the formula Y = 3X+ 5. Find quartile becomes 35.
the mean and the standard deviation of State, with reason, whether the quartiles
the set of values of Y. of the original marks will scale into the
(b) It is proposed to convert a set of quartiles of the scaled marks. (SUJB)
values of a variable X, whose mean and |
16: The table shows the yield, in litres, of
standard deviation are 20 and 5 respec-
/ milk produced by 131 cows at a certain
tively, to a set of values of a variable Y
farm on a given day.
whose mean and standard deviation are
42 and 8 respectively. If the conversion Yield (litres) }5-10 11-16 17-22 23-28 29-34 35-40
formula is Y = aX + b, calculate the value 26 18 7
of a and of b. (C Additional)
(a) State the modal class and estimate the
14. A set of numbers has mean pL and standard mode. (b) By calculation, estimate the
deviation o. A new set of numbers is median yield. (c) Draw a cumulative
obtained by subtracting uw from each frequency curve, and from it estimate
number and dividing the result by 0. Write the semi-interquartile range. (d) Calculate
down the mean and standard deviation the mean and the standard deviation of
of the new set of numbers. the distribution, using a method of
In an examination in Statistics the mean coding.
mark of a group of 120 students was 68
andathe=stantiard udeviatiouswassGain 17. In a certain industry, the numbers of
Algebra the mean mark of the group was thousands of employees in 1970 were as
62 and the standard deviation was 5. One shown in Table F below, by age groups.
student scored 76 in Statistics and 70 in Calculate the arithmetic mean, median,
Algebra. By scaling the marks for each variance and standard deviation of the
subject so that each set of marks has the ages of employees in the industry.
same mean and standard deviation ou Estimate the percentage of the employees
pare the performances of this student in whose ages lie within one standard
the two subjects. (C Additional) deviation of the arithmetic mean.
15. 200 candidates sat an examination and aS ee te
the distribution was obtained as shown in 18, PRO ee ae 4d ee eee
ble B below. \) each estimate the height of the top ofa
Table E
Marks (x) |10-19 20-29 30-39 40-49 50-59 60-69 70-79 80-89 90-99
10 18 20 30 49 46 20 5 2
Table F
Age last
AaeeOtee 20-24 25-29 30-34 35-39 40-44 45-49 50-54 55-59 60-64
Number of
thousands
66 65 56 50 42 37 35 30 24 22
(e) One member of the original class of (b) The average height of 20 boys is
12 revises his estimate and the new mean 160 cm, with a standard deviation of
for the 12 estimates is X+ 0.5. Find the 4cm. The average height of 30 girls is
increase in the estimate of this member. 155 cm, with a standard deviation of
(f) The teacher of the class makes an 3.5cm. Find the standard deviation of
estimate of the height of the church the whole group of 50 children. (SUJB)
tower and when her estimate is taken
with the original 12, the mean of all 13 21. (a) Sketch frequency curves for distribu-
estimates is X+ 0.5. Find the teacher’s tions which have one mode and for
estimate. which (i) the mode, median and mean
(g) Two extra children, different from coincide, (ii) the mode is less than the
those mentioned in (c), join the class and median, indicating on each sketch the
each make an estimate so that the mean positions of these measures.
of their two estimates and the original 12 (b) The mean of the set of numbers 3,1,
estimates is X+ 0.5. 7,2,1,1,7,x,y, where x and y are single
digit positive whole numbers, is known to
Find the sum of their two estimates.
be 4. Show that x+y = 14.
(L Additional)
Hence, or otherwise, find the mode of
this set of numbers when (i)x =y,
_\19. Ten values of a variable x are
(ii)x Fy.
822, 0.0, O71, 6225 0.4, (-9,,8.0, 8:3, 7.8, 8.1
If the standard deviation is 3/76 find x
Express each of these values in the form
and y, assuming that x Sy.
8+ 0.1y. Calculate the arithmetic mean
(C Additional)
and the variance of the ten values of y
and hence, or otherwise, deduce the 22. A random sample of 1000 surnames is
mean and the variance of the ten values drawn from a local telephone directory.
of x. The distribution of the lengths of the
Hence find the mean and the variance of names is as shown in Table G below.
the set of ten numbers Calculate the sample mean and sample
824, 804, 814, 824,844, standard deviation. Obtain the upper
794, 804, 834, 784,814 quartile.
A transformation of the form z = a+ bx, Represent graphically the data in the
where b > 0, is applied to the first set of table.
ten values of x so that the mean is Give a reason why the sample of names
increased by 0.9 and the standard devia- obtained in this way may not be truly
tion is doubled. Find the values of the representative of the population of Great
constants a and b. (L Additional) Britain. (JMB)
)
20. /Show, from the basic definition, why 23. In an agricultural experiment the gains in
/ the standard deviation of a set of obser- mass, in kilograms, of 100 pigs during a
“
4
vations x,X2,X3,---,*, with certain period were recorded as follows:
mean X may be found by evaluating
Gain in mass | 59 410-14 15-19 20-24 25-29 30-34
(kilograms)
Ex,70°.5sa.
freee
n Frequency 2 29 37 16 14 2
(a) Find, showing your working clearly Construct a histogram and a relative
and not using any pre-programmed cumulative frequency polygon of these
function on your calculator, the standard data. Obtain (i) the median and the semi-
deviation of the following frequency interquartile range, (ii) the mean and the
distribution: standard deviation.
Which of these pairs of statistics do you
27 28
consider more appropriate in this case,
15 11 and why? (AEB 1977)
Table G
24. Table H below gives the ages in completed (b) Draw a histogram and comment on
years of the 113 persons convicted of the shape of the distribution.
shop-lifting in a British town in 1986. (c) Using the frequency table estimate
Working in years and giving answers the mean and standard deviation of the
correct to 1 place of decimals, calculate marks.
(a) the mean age and standard deviation, (d) The marks are to be scaled linearly
(b) the coefficient of skewness given by
by the relation Y=atbX where X is
the old mark and Y the new mark. The
(mean — mode)/standard deviation, new mean and standard deviation are
(c) the median age. to be 50 and 10 respectively. Using your
estimates in (c) calculate suitable values
Which do you consider to be best as a
fora and b. (SUJB)
representative average of the distribution
—the mean, median or mode? Give
27. A travel agency has two shops, FR and S.
reasons for your choice.
The number of holidays purchased in a
Draw a histogram of the data with a class particular week and the mean and stand-
interval of 2 years. (SUJB) ard deviation of the costs of these holidays
at each shop are shown in the following
\ 25. A grouped frequency distribution of the table.
ages of 358 employees in a factory is
shown in Table I below. Estimate, to the Number of | Meancost | S.D.
nearest month, the mean and the standard holidays (£) (&)
deviation of the ages of these employees.
Shop R 32 190.35 10.4
Graphically, or otherwise, estimate
Shop S 24 202.25 15.5
(a) the median and the interquartile
range of the ages, each to the nearest Calculate the mean, and, to the nearest
month, : penny, the standard deviation of the costs
(b) the percentage, to one decimal place, of all the 56 holidays purchased. (L)P
of the employees who are over 27 years
old and under 55 years old. (L) 28. The following are the ignition times in
seconds (correct to the nearest 100th
26. The following is a set of 109 examination of a second) of samples of 80 uphol-
marks ordered for convenience. stery materials They are arranged in
numerical order by columns.
(a) Construct a grouped frequency distri- (a) Group these data into 8 equal classes
bution using a class width of 10 and commencing 1.00-2.49, 2.50-3.99, ...
starting with 0-9. and arrange them in a frequency table.
Table H
|Age| 12 13 14 15 16 17 18 19 20 21 22 23 24 25 26 27 28 29 30-49
Bee ee aa ae es 6
Table I
Age (last birthday) 16-20 21-25 26-30 31-35 36-40 41-45 46-50 51-60 61-
Number of employees 6 56 58 52 46 38 36 36 0
(b) Using the frequency table obtain litter. The contents of each bag are then
estimates for the mean time and standard weighed. A summary of the results is
deviation. shown in the table.
(c) Construct a frequency polygon for
the distribution and comment on its : Mean wt. S.D
Sample | Size %
shape.
(d) Chebychev’s Theorem states that, for al 50 11.8 0.5
any distribution, the proportion of the 2 30 2 1 0.9
population that lies outside k standard ao 20 ite. 7 eal
deviations from the mean is less than
1/k?. Verify this for the above distribu- Find, in kg to 2 decimal places, the mean
tion when k > 1.5. (SUJB) weight per bag and the standard deviation
arr for the 100 bags. (L)P |
20Xtn a borehole the thickness, in mm, of : : . s
Heros diriks aie cho warn thetable: 31. Referring to your projects if possible,
give an example of a graphical represen-
tation of
Thickness
(mm) Preis ast Oms hs AD 89% (a) a discrete frequency distribution,
(6) a grouped frequency distribution.
Number “5 9 ey 0
of strata Given the frequency distribution
30. Three random samples of 50, 30 and 20 Calculate also, to 2 decimal places, the
bags respectively are taken from the mean and the variance of the above
production line of ‘12 kg bags’ of cat distribution. (L)
Table J
Lifetime (to 720-729 730-739 740-744 745-749 750-754 755-759 760-769 770-788|
690-709 710-719
nearest hour)
Number of 3 7 15 38 41 35 21 16 14 10 |
discs
PROBABILITY
An experiment can result in several possible outcomes. For example
(a) One toss of a coin results in the outcomes (H, T). If the coin is
fair, then each outcome is equally likely.
(b) Two tosses of a coin result in the outcomes (HH, HT, TH, TT).
Again, if the coin is fair, then each outcome is equally likely.
(c) If a machine produces articles, some of which are defective, the
outcomes are (defective, not defective). In this case the outcomes
should not be equally likely.
(d) If a coin is tossed repeatedly until a head is obtained, the outcomes
are (H, TH, TTH, TTTH, TTTTH, ...).
(e) The outcomes of a race being run by A and B could be (A wins,
B wins, there is a dead heat). These outcomes may not be
equally likely.
Each possible outcome is called a sample point and the set of all
possible outcomes is the possibility space S.
If the possibility space has a finite number of sample points then we
denote the number of points in S by n(S).
Consider an event E which is a subset of S, then $n(E) \leq n(S)$.
For example, for one throw of an ordinary die the possibility
space S = (1, 2, 3, 4, 5, 6) and n(S) = 6.
Let $E_1$ be the event ‘the number is odd’, then $E_1$ = (1, 3, 5) and
$n(E_1) = 3$.
Let $E_2$ be the event ‘the number is less than 3’, then $E_2$ = (1, 2) and
$n(E_2) = 2$.
So, in the example above,
$$P(E_1) = \frac{n(E_1)}{n(S)} = \frac{3}{6} = \frac{1}{2}$$
$$P(E_2) = \frac{n(E_2)}{n(S)} = \frac{2}{6} = \frac{1}{3}$$
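The counting in this example can be checked by listing the sample points. The following is a minimal sketch, assuming equally likely outcomes; the helper name `probability` is illustrative and not from the text.

```python
from fractions import Fraction

# Possibility space for one throw of an ordinary die.
S = {1, 2, 3, 4, 5, 6}

def probability(event, space):
    """Return n(E)/n(S), assuming equally likely outcomes and E a subset of S."""
    assert event <= space, "an event must be a subset of the possibility space"
    return Fraction(len(event), len(space))

E1 = {x for x in S if x % 2 == 1}  # 'the number is odd'
E2 = {x for x in S if x < 3}       # 'the number is less than 3'

print(probability(E1, S))  # 1/2
print(probability(E2, S))  # 1/3
```

Exact fractions are used so that the results match the hand calculation rather than a rounded decimal.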
In order to investigate the rules which apply when considering
probabilities, we will consider the situation in the classical defini-
tion —that of a finite possibility space with equally likely outcomes.
However, the results apply in general and will be used in other
situations in the problems.
IMPORTANT RESULTS
Result 1 We have
n(A) n s
Paya ay ;
fs CD |
=>
- n
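Now, writing $\overline{A}$ for the event ‘A does not occur’,
$$P(\overline{A}) = \frac{n(\overline{A})}{n(S)} = \frac{n(S) - n(A)}{n(S)} = 1 - P(A)$$
or $P(A) + P(\overline{A}) = 1$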
Solution 2.1 The possibility space S = (the pack of 52 cards) and n(S) = 52. Let
A be the event ‘the card is a seven’, then n(A) = 4.
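$$P(A) = \frac{n(A)}{n(S)} = \frac{4}{52} = \frac{1}{13}$$
$$P(\overline{A}) = 1 - P(A) = 1 - \frac{1}{13} = \frac{12}{13}$$
Therefore the probability that the card drawn is not a seven is $\frac{12}{13}$.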
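Example 2.2 Compare the probabilities of scoring a 4 with one die and a total of
8 with two dice.
Let A be the event ‘a 4 is scored with one die’. Then n(S) = 6 and n(A) = 1.
So $P(A) = \frac{n(A)}{n(S)} = \frac{1}{6}$
The probability of scoring 4 with one die is $\frac{1}{6}$.
[Possibility space diagram for two dice: first die against second die.]
Let B be the event ‘a total of 8 is scored with two dice’. Then n(S) = 36 and
n(B) = 5, the sample points being (2, 6), (3, 5), (4, 4), (5, 3) and (6, 2).
So $P(B) = \frac{n(B)}{n(S)} = \frac{5}{36}$
The probability of obtaining a total of 8 with two dice is $\frac{5}{36}$.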
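The 36-point possibility space for two dice is small enough to enumerate. A short sketch, assuming fair dice (the variable names are illustrative):

```python
from fractions import Fraction
from itertools import product

# All ordered pairs (first die, second die).
two_dice = list(product(range(1, 7), repeat=2))

# Sample points giving a total of 8.
total_8 = [pair for pair in two_dice if sum(pair) == 8]

p_four_one_die = Fraction(1, 6)
p_total_8 = Fraction(len(total_8), len(two_dice))

print(total_8)    # [(2, 6), (3, 5), (4, 4), (5, 3), (6, 2)]
print(p_total_8)  # 5/36
print(p_four_one_die > p_total_8)  # True
```

So scoring a 4 with one die is slightly more likely than a total of 8 with two dice, since 6/36 exceeds 5/36.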
Example 2.3 Two fair coins are tossed. Illustrate the possible outcomes on a
possibility space diagram and find the probability that two heads
are obtained.
Solution 2.3 Each coin is equally likely to show a head or a tail. The possibility
space for the outcomes when two coins are tossed is (HH, HT, TH, TT), so n(S) = 4.
[Possibility space diagram: first coin against second coin.]
Let A be the event ‘two heads are obtained’.
From the diagram n(A) = 1.
Therefore $P(A) = \frac{n(A)}{n(S)} = \frac{1}{4}$
The probability that two heads are obtained when two fair coins
are tossed is $\frac{1}{4}$.
Exercise 2a
1. An ordinary die is thrown. Find the probability that the number obtained (a) is a multiple of 3, (b) is less than 7, (c) is a factor of 6.
2. A card is drawn at random from an ordinary pack containing 52 playing cards. Find the probability that the card drawn (a) is the four of spades, (b) is the four of spades or any diamond, (c) is not a picture card (Jack, Queen, King) of any suit.
3. From a set of cards numbered 1 to 20 a card is drawn at random. Find the probability that the number (a) is divisible by 4, (b) is greater than 15, (c) is divisible by 4 and greater than 15. If the card is divisible by 4 and it is not replaced, find the probability that (d) the second card drawn is even.
4. A counter is drawn from a box containing 10 red, 15 black, 5 green and 10 yellow counters. Find the probability that the counter is (a) black, (b) not green or yellow, (c) not yellow, (d) red or black or green, (e) not blue.
5. Two ordinary dice are thrown. Find the probability that (a) the sum on the two dice is 3, (b) the sum on the two dice exceeds 9, (c) the two dice show the same number, (d) the numbers on the two dice differ by more than 2, (e) the product of the two numbers is even.
6. The pupils in a class were asked how many brothers and sisters they had. Their answers are shown in the table:
[Table: number of brothers and sisters against number of pupils.]
If a child is chosen at random, find the probability that there are three children in his or her family.
7. If $\mathcal{E}$ = {x : x is an integer and $1 \leq x \leq 20$}, A = {x : x is a multiple of 3}, B = {x : x is a multiple of 4} and an integer is picked at random from $\mathcal{E}$, find the probability that (a) it is in A, (b) it is not in B.
8. A die is in the form of a tetrahedron and its faces are marked 1, 2, 3 and 4. The ‘score’ is the number on which the die lands. Find the probability that when a tetrahedral die is thrown the score is (a) an even number, (b) a prime number. (NOTE: 1 is not a prime number.) If two tetrahedral dice are thrown find the probability that (c) the sum of the two scores is 5, (d) the difference of the two scores is 1, (e) the product of the two scores is a multiple of 4.
9. An ordinary die and a fair coin are thrown together. Show the possible outcomes on a possibility space diagram and find the probability that (a) a head and a 2 is obtained, (b) a tail and a 7 is obtained, (c) a head and an even number is obtained.
10. An ordinary die and two coins are thrown together. Show the possible outcomes on a possibility space diagram and find the probability that (a) two heads and a number less than 3 is obtained, (b) the coins show different faces and a 4 is shown on the die, (c) the die shows an odd number and the coins show the same face, (d) a 6 and at least one head is obtained.
Result 3
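If A and B are any two events of the same experiment such that
$P(A) \neq 0$ and $P(B) \neq 0$ then
$$P(A \text{ or } B) = P(A) + P(B) - P(A \text{ and } B)$$
Note that ‘A or B’ means ‘A occurs, or B occurs, or both A and B
occur’.
Writing the result in set notation we have
$$P(A \cup B) = P(A) + P(B) - P(A \cap B)$$
To see this, note that adding n(A) and n(B) counts the sample points of $A \cap B$ twice, so
$$P(A \cup B) = \frac{n(A \cup B)}{n(S)} = \frac{n(A) + n(B) - n(A \cap B)}{n(S)} = \frac{n(A)}{n(S)} + \frac{n(B)}{n(S)} - \frac{n(A \cap B)}{n(S)}$$
$$= P(A) + P(B) - P(A \cap B)$$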
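Example 2.4 A coin and a die are thrown together. Draw a possibility space
diagram and find the probability of obtaining (a) a head, (b) a
number greater than 4, (c) a head and a number greater than 4,
(d) a head or a number greater than 4.
Solution 2.4 [Possibility space diagram: coin (H, T) against number on die (1 to 6).] n(S) = 12.
(a) Let A be the event ‘a head is obtained’. Then n(A) = 6 and
$$P(A) = \frac{n(A)}{n(S)} = \frac{6}{12} = \frac{1}{2}$$
The probability of obtaining a head is $\frac{1}{2}$.
(b) Let B be the event ‘a number greater than 4 is obtained’. Then n(B) = 4 and
$$P(B) = \frac{n(B)}{n(S)} = \frac{4}{12} = \frac{1}{3}$$
The probability of obtaining a number greater than 4 is $\frac{1}{3}$.
(c) The sample points in both A and B are (H, 5) and (H, 6), so
$$P(A \cap B) = \frac{n(A \cap B)}{n(S)} = \frac{2}{12} = \frac{1}{6}$$
The probability of obtaining a head and a number greater than 4 is $\frac{1}{6}$.
(d) There are 8 sample points in A or B, so
$$P(A \cup B) = \frac{n(A \cup B)}{n(S)} = \frac{8}{12} = \frac{2}{3}$$
The probability of obtaining a head or a number greater than 4 is $\frac{2}{3}$.
We now check that this satisfies $P(A \cup B) = P(A) + P(B) - P(A \cap B)$.
$$P(A) + P(B) - P(A \cap B) = \frac{1}{2} + \frac{1}{3} - \frac{1}{6} = \frac{2}{3}$$
Therefore left hand side = right hand side and
$P(A \cup B) = P(A) + P(B) - P(A \cap B)$.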
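The same check can be made by enumerating the 12 sample points and using set union and intersection. This is a sketch under the assumption of a fair coin and a fair die; the function name `P` is illustrative.

```python
from fractions import Fraction
from itertools import product

# Possibility space: (coin face, number on die).
S = set(product("HT", range(1, 7)))

A = {(coin, die) for (coin, die) in S if coin == "H"}  # a head
B = {(coin, die) for (coin, die) in S if die > 4}      # a number greater than 4

def P(event):
    """n(E)/n(S) for equally likely sample points."""
    return Fraction(len(event), len(S))

lhs = P(A | B)                # P(A or B), counted directly
rhs = P(A) + P(B) - P(A & B)  # addition law

print(P(A), P(B), P(A & B))  # 1/2 1/3 1/6
print(lhs, rhs)              # 2/3 2/3
```

Subtracting P(A and B) removes the double count of (H, 5) and (H, 6), which appear in both A and B.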
Example 25 Events A and B are such that P(A) = 8, P(B) = 2 and P(AUB) = 2.
Find P(ANMB).
Solution 25 Now P(AUB) = P(A) +P(B)—P(ANB)
sO fi = Lea. P(ANB
5 S0, 5 ( )
TOME T2224
PUA Be sate air
30 30 30
ae
30
Therefore $P(A \cap B) = \frac{7}{30}$.
Example 2.6 In a group of 20 adults, 4 out of the 7 women and 2 out of the 13
men wear glasses. What is the probability that a person chosen at
random from the group is a woman or someone who wears glasses?
Solution 2.6 Let W be the event ‘the person chosen is a woman’ and G be the
event ‘the person chosen wears glasses’.
Now
$$P(W) = \frac{7}{20}, \quad P(G) = \frac{6}{20}, \quad P(W \text{ and } G) = P(W \cap G) = \frac{4}{20}$$
$$P(W \text{ or } G) = P(W \cup G) = P(W) + P(G) - P(W \cap G) = \frac{7}{20} + \frac{6}{20} - \frac{4}{20} = \frac{9}{20}$$
Therefore the probability that the person is a woman or someone
who wears glasses is $\frac{9}{20}$.
Result 4
If an event A can occur or an event B can occur but not both A
and B can occur, then the two events A and B are said to be mutually
exclusive.
In this case $A \cap B = \varnothing$ and $P(A \cap B) = 0$, so
$$P(A \cup B) = P(A) + P(B)$$
This is known as the addition law for mutually exclusive events.
Example 2.7 In a race the probability that John wins is - the probability that
Paul wins is t and the probability that Mark wins is z. Find the
probability that (a) John or Mark wins, (b) neither John nor Paul
wins. Assume that there are no dead heats.
Solution 2.7 We assume that only one person can win, so the events are mutually
exclusive.
(b) $P(\text{neither John nor Paul wins}) = 1 - P(\text{John or Paul wins}) = 1 - \frac{7}{12} = \frac{5}{12}$
Solution 2.8 The possibility space S = (the pack of 52 cards) so n(S) = 52.
Let C be the event ‘a club is drawn’, D be the event ‘a diamond is
drawn’, K be the event ‘a king is drawn’.
(b) $P(\text{club}) = \frac{n(C)}{n(S)} = \frac{13}{52}$ and $P(\text{king}) = \frac{n(K)}{n(S)} = \frac{4}{52}$
Now $P(\text{king} \cap \text{club}) = P(\text{king of clubs}) = \frac{1}{52}$
The events C and K are not mutually exclusive as a card can be both
a king and a club.
Therefore
$$P(\text{club} \cup \text{king}) = P(\text{club}) + P(\text{king}) - P(\text{club} \cap \text{king}) = \frac{13}{52} + \frac{4}{52} - \frac{1}{52} = \frac{16}{52} = \frac{4}{13}$$
In this example we could have noted straight away that the event ‘a
club or a king is drawn’ has 16 sample points: the 13 clubs and the 3 other kings.
Exercise 2b
1. An ordinary die is thrown. Find the probability that the number obtained is (a) even, (b) prime, (c) even or prime.
2. In a group of 30 students all study at least one of the subjects physics and biology. 20 attend the physics class and 21 attend the biology class. Find the probability that a student chosen at random studies both physics and biology.
3. From an ordinary pack of 52 playing cards the seven of diamonds has been lost. A card is dealt from the well-shuffled pack. Find the probability that it is (a) a diamond, (b) a queen, (c) a diamond or a queen, (d) a diamond or a seven.
4. For events A and B it is known that P(A) = 3, $P(A \cup B)$ = 3 and $P(A \cap B)$ = 35. Find P(B).
5. In a street containing 20 houses, 3 households do not own a television set; 12 households have a black and white set and 7 households have a colour and a black and white set. Find the probability that a household chosen at random owns a colour television set.
6. For events A and B it is known that P(A) = P(B) and $P(A \cap B) = 0.1$ and $P(A \cup B) = 0.7$. Find P(A).
7. The probability that a boy in class 2 is in the football team is 0.4 and the probability that he is in the chess team is 0.5. If the probability that a boy in the class is in both teams is 0.2, find the probability that a boy chosen at random is in the football or the chess team.
8. Two ordinary dice are thrown. Find the probability that the sum of the scores obtained (a) is a multiple of 5, (b) is greater than 9, (c) is a multiple of 5 or is greater than 9, (d) is a multiple of 5 and is greater than 9.
9. Given that P(A) = 2, P(B) = 4 and $P(A \cap B)$ = +; find $P(A \cup B)$.
10. Two ordinary dice are thrown. Find the probability that (a) at least one 6 is thrown, (b) at least one 3 is thrown, (c) at least one 6 or at least one 3 is thrown.
EXHAUSTIVE EVENTS
Result 5
For example
(i) Let S = (1, 2, 3, 4, 5, 6, 7, 8, 9, 10).
If A = (1, 2, 3, 4, 5, 6) and B = (5, 6, 7, 8, 9, 10) then $A \cup B = S$
and A and B are exhaustive events.
Example 2.9 Events A and B are such that they are both mutually exclusive and
exhaustive. Find a relationship between A and B. Give an example
of such events.
Similarly Anan
Toss a coin. Let A be the event ‘a head is obtained’, B be the event
‘a tail is obtained’.
Now A and B are mutually exclusive, as the coin cannot show both
a head and a tail.
CONDITIONAL PROBABILITY
Result 6
If A and B are two events and $P(A) \neq 0$ and $P(B) \neq 0$, then the
probability of A, given that B has already occurred, is written
$P(A|B)$
and
$$P(A|B) = \frac{P(A \cap B)}{P(B)}$$
Illustrating this by means of the Venn diagram, the possibility space
is B, since we know that B has already occurred. Writing $n(A \cap B) = t$, $n(B) = s$ and $n(S) = n$,
$$P(A|B) = \frac{n(A \cap B)}{n(B)} = \frac{t}{s} = \frac{t/n}{s/n} = \frac{P(A \cap B)}{P(B)}$$
This result is often written
$$P(A \cap B) = P(A|B) \cdot P(B)$$
Example 2.10 Given that a heart is picked at random from a pack of 52 playing
cards, find the probability that it is a picture card.
Example 2.11 When a die is thrown, an odd number occurs. What is the proba-
bility that the number is prime?
Solution 2.11
$$P(\text{prime}|\text{odd}) = \frac{P(\text{prime} \cap \text{odd})}{P(\text{odd})} = \frac{2/6}{3/6} = \frac{2}{3}$$
(The odd prime numbers are 3 and 5)
The probability that the number is prime, given that it is odd, is $\frac{2}{3}$.
Result 7
Example 2.12 Two tetrahedral dice, with faces labelled 1, 2, 3 and 4, are thrown
and the number on which each lands is noted. The ‘score’ is the sum
of these two numbers. Find the probability that (a) the score is
even, given that at least one die lands on a 3, (b) at least one die
lands on a 3, given that the score is even.
Solution 2.12 There are 16 sample points in the possibility space S, as shown in
the diagram, so n(S) = 16.
Let A be the event ‘at least one die lands on a 3’ and let B be the
event ‘the score is even’.
[Possibility space diagram: first die against second die, with the sample points of A and B marked.]
The sample space of A is ((1, 3), (2, 3), (3, 3), (4, 3), (3, 1), (3, 2), (3, 4)),
so n(A) = 7 and $P(A) = \frac{n(A)}{n(S)} = \frac{7}{16}$.
The sample space of B is ((1, 1), (1, 3), (2, 2), (2, 4), (3, 1), (3, 3), (4, 2), (4, 4)),
so n(B) = 8 and $P(B) = \frac{n(B)}{n(S)} = \frac{8}{16}$.
(a) The sample points in both A and B are (1, 3), (3, 1) and (3, 3), so
$$P(A \cap B) = \frac{n(A \cap B)}{n(S)} = \frac{3}{16}$$
$$P(B|A) = \frac{P(A \cap B)}{P(A)} = \frac{3/16}{7/16} = \frac{3}{7}$$
Therefore the probability that the score is even, given that at least
one die lands on a 3, is $\frac{3}{7}$.
NOTE: this result could have been obtained directly from the
diagram. The possibility space has been reduced to the 7 sample
points in A. For 3 of these the event B occurs, so $P(B|A) = \frac{3}{7}$.
(b) Using $P(A|B) \cdot P(B) = P(B|A) \cdot P(A)$ we have
$$P(A|B) \cdot \frac{8}{16} = \left(\frac{3}{7}\right)\left(\frac{7}{16}\right)$$
$$P(A|B) = \frac{3}{8}$$
Therefore the probability that at least one die lands on a 3, given
that the score is even, is $\frac{3}{8}$.
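Both conditional probabilities can be confirmed by restricting the possibility space to the given event and counting. A minimal sketch, assuming fair tetrahedral dice; the helper `conditional` is an illustrative name.

```python
from fractions import Fraction
from itertools import product

# 16 sample points: (first die, second die), faces 1 to 4.
S = set(product(range(1, 5), repeat=2))

A = {pair for pair in S if 3 in pair}             # at least one die lands on a 3
B = {pair for pair in S if sum(pair) % 2 == 0}    # the score is even

def conditional(event, given):
    """P(event | given) = n(event and given) / n(given) for equally likely points."""
    return Fraction(len(event & given), len(given))

print(len(A), len(B), len(A & B))  # 7 8 3
print(conditional(B, A))           # 3/7
print(conditional(A, B))           # 3/8
```

The two answers differ because the reduced possibility space is A (7 points) in one case and B (8 points) in the other.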
Example 2.13 A bag contains 10 counters, of which 7 are green and 3 are white. A
counter is picked at random from the bag and its colour is noted.
The counter is not replaced. A second counter is then picked out.
Find the probability that (a) the first counter is green, (b) the first
counter is green and the second counter is white, (c) the counters
are of different colours.
Solution 2.13 (a) Let $G_1$ be the event ‘the first counter is green’.
$P(G_1) = \frac{7}{10}$ (as there are 10 counters, of which 7 are green)
The probability that the first counter is green is $\frac{7}{10}$.
(b) Let $W_2$ be the event ‘the second counter picked is white’. Now
$P(W_2|G_1) = \frac{3}{9} = \frac{1}{3}$ (as there are 9 counters in the bag, of which 3 are white)
We require
$$P(W_2 \cap G_1) = P(W_2|G_1) \cdot P(G_1) = \left(\frac{1}{3}\right)\left(\frac{7}{10}\right) = \frac{7}{30}$$
The probability that the first counter is green and the second
counter is white is $\frac{7}{30}$.
(c) With obvious notation, we require $P(W_2 \cap G_1) + P(G_2 \cap W_1)$.
Now $P(W_1) = \frac{3}{10}$ and $P(G_2|W_1) = \frac{7}{9}$. Therefore
$$P(G_2 \cap W_1) = P(G_2|W_1) \cdot P(W_1) = \left(\frac{7}{9}\right)\left(\frac{3}{10}\right) = \frac{7}{30}$$
So
$$P(W_2 \cap G_1) + P(G_2 \cap W_1) = \frac{7}{30} + \frac{7}{30} = \frac{7}{15}$$
Therefore the probability that the counters are of different colours
is $\frac{7}{15}$.
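Drawing without replacement can be modelled by listing every ordered pair of distinct counters, all equally likely. The sketch below assumes that model; `bag`, `draws` and `P` are illustrative names.

```python
from fractions import Fraction
from itertools import permutations

# 7 green and 3 white counters, labelled by position so each counter is distinct.
bag = ["G"] * 7 + ["W"] * 3

# Every ordered pair of different counters: 10 * 9 = 90 equally likely draws.
draws = list(permutations(range(len(bag)), 2))

def P(condition):
    """Probability that the colours (first, second) satisfy `condition`."""
    favourable = sum(1 for i, j in draws if condition(bag[i], bag[j]))
    return Fraction(favourable, len(draws))

print(P(lambda first, second: first == "G"))                    # 7/10
print(P(lambda first, second: first == "G" and second == "W"))  # 7/30
print(P(lambda first, second: first != second))                 # 7/15
```

The counts agree with the multiplication P(W2 | G1) x P(G1) used in the solution: 7 x 3 = 21 favourable ordered pairs out of 90.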
Exercise 2c
A card is picked at random from a pack known to be yellow, (c) if one is known
of 20 cards numbered 1, 2, 3,..., 20. Given to be yellow ticket numbered 1?
that the card shows an even number, find (SUJB Additional)
the probability that it is a multiple of 4.
A number is picked at random from the
digits 1, 2,...,9. Given that the number
If P(A|B) = 2, P(B) = }, P(A) = 3, find is a multiple of 3, find the probability
(a) P(B| A), (b) PAMB). that the number is (a) even, (b) amultiple
of 4.
Two digits are chosen at random from a
table of random numbers containing the Two tetrahedral dice are thrown; one is
digits 0,1,2,...,9. Find the probability red and the other is blue. The number on
that (a) the sum of the two numbers is which each lands is noted, the faces being
greater than 9, given that the first number marked 1, 2, 3 and 4. Find the probability
is 3, (b) the second number is 2, given that (a) the sum of the numbers on
that the sum of the two numbers is greater which the dice land is 6 given that the red
than 7, (c) the first number is 4, given die lands on an odd number, (6) the blue
that the difference between the two die lands on a 2 or a 3, given that the red
numbers is 4. die lands on a 2.
A bag contains 4 red counters and 6 black 10. If oo A and B are such that P(A) = 5,
counters. A counter is picked at random P(B) = # and P(A|B)= 0. (a) Find
from the bag and not replaced. A second P(A UB). (b) Are events A and B
counter is then picked. Find the proba- exhaustive? (Give a reason.)
bility that (a) the second counter is red,
given that the first counter isred, (b) both 11. A and Biare two events such that P(A) = 4,
counters are red, (c) the counters are of P(B) = 3 and P(AMB) =. Are A and B
different colours. ee events?
Two cards are drawn successively from 12. A and B are Sree events and it =
an ordinary pack of 52 playing cards and known that P(A|B)= i and PBy= z
kept out of the pack. Find the probability Find P(A).
that (a) both cards are hearts, (b) the
first card is a heart and the second card is 13. Give two examples of events which are
a spade, (c) the second card is a diamond, both mutually exclusive and exhaustive.
given that the first card is a club. 14. Two coins are tossed. A is the event ‘at
least one head is obtained’. Describe an
X and Y ae two events such that P(X) = z,
event B such that A and B are exhaustive
P(X|Y)=3and P(Y|X)=2.Find events.
(a) P(XNY), (b) P(Y), (c)(XU Y).
15. If
A box contains two yellow and two black & = { (x,y): x and y are positive integers }
tickets numbered 1 and 2. Two tickets
A =({(x,y):2<x<5and1<y<4}
are drawn from the box. Indicate the
sample space by listing all possible pairs = (x,y): 44 y= 5}
of results. C ={(x,y): y= 2}
What is the probability that both tickets Find the probability that a member of A
drawn will be yellow, (a) if nothing is chosen at random will also be a member
known about either of them, (0) if one is of (a) B, (b) C, (c) BNC, (d) BUC.
INDEPENDENT EVENTS
Example 2.14 A die is thrown twice. Find the probability of obtaining a 4 on the
first throw and an odd number on the second throw.
Solution 2.14 Let A be the event ‘a 4 is obtained on the first throw’, then $P(A) = \frac{1}{6}$.
Let B be the event ‘an odd number is obtained on the second throw’.
Now the result on the second throw is not affected in any way by
the result on the first throw. Therefore A and B are independent
events and $P(B) = \frac{3}{6} = \frac{1}{2}$.
As A and B are independent events
$$P(A \cap B) = P(A) \cdot P(B) = \left(\frac{1}{6}\right)\left(\frac{1}{2}\right) = \frac{1}{12}$$
The probability that the first throw results in a 4 and the second
throw results in an odd number is $\frac{1}{12}$.
Example 2.15 A bag contains 5 red counters and 7 black counters. A counter is
drawn from the bag, the colour is noted and the counter is replaced.
A second counter is then drawn. Find the probability that the first
counter is red and the second counter is black.
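Solution 2.15 Let $R_1$ be the event ‘the first counter is red’ and $B_2$ be the event
‘the second counter is black’. As the first counter is replaced, the two events are independent.
$P(R_1) = \frac{5}{12}$
Now $P(B_2) = \frac{7}{12}$
$$P(R_1 \cap B_2) = P(R_1) \cdot P(B_2) = \left(\frac{5}{12}\right)\left(\frac{7}{12}\right) = \frac{35}{144}$$
The probability that the first counter is red and the second counter
is black is $\frac{35}{144}$.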
Example 2.16 A fair die is thrown twice. Find the probability that (a) neither
throw results in a 4, (b) at least one throw results in a 4.
Solution 2.16 Let A be the event ‘the number on the first throw is 4’.
Let B be the event ‘the number on the second throw is 4’.
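Now $P(A) = \frac{1}{6}$, so $P(\overline{A}) = \frac{5}{6}$, where $\overline{A}$ is the event ‘the number on the
first throw is not a 4’.
Similarly $P(\overline{B}) = \frac{5}{6}$.
NOTE: $\overline{A}$ and $\overline{B}$ are independent events.
(a) $$P(\overline{A} \cap \overline{B}) = P(\overline{A}) \cdot P(\overline{B}) = \left(\frac{5}{6}\right)\left(\frac{5}{6}\right) = \frac{25}{36}$$
The probability that neither throw results in a 4 is $\frac{25}{36}$.
(b) $$P(\text{at least one 4}) = 1 - P(\text{neither throw results in a 4}) = 1 - \frac{25}{36} = \frac{11}{36}$$
The probability that at least one throw results in a 4 is $\frac{11}{36}$.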
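The complement argument used here can be compared with a direct count over the 36 outcomes of two throws. A brief sketch, assuming a fair die (names are illustrative):

```python
from fractions import Fraction
from itertools import product

throws = list(product(range(1, 7), repeat=2))  # (first throw, second throw)

neither_4 = [t for t in throws if 4 not in t]
at_least_one_4 = [t for t in throws if 4 in t]

p_neither = Fraction(len(neither_4), len(throws))
p_at_least_one = Fraction(len(at_least_one_4), len(throws))

print(p_neither)                        # 25/36
print(p_at_least_one)                   # 11/36
print(p_at_least_one == 1 - p_neither)  # True
```

Counting the complement (25 outcomes with no 4) is usually quicker than counting the 11 outcomes with at least one 4 directly.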
Example 2.17 Events A and B are such that P(A) = =and P(ANB)= = If A and
B are independent events, find (a) P(B), (b) P(AUB).
i
2
Therefore P(B)= fand P(A UB) =
Example 2.18 Two events A and B are such that P(A)= i P(A|B) = 5and
P(B|A) =
(a) Are A and B independent events? (b) Are A and B mutually
exclusive events? (c) Find P(AMB). (d) Find P(B).
Solution 2.18 (a) If A and B are independent events then P(A|B) = P(A).
Now P(A|B)= 5and P(A)=
Therefore $P(A|B) \neq P(A)$ and A and B are not independent events.
- (NC
ie
«6
Therefore P(AM B) = &.
sO ra
IG
$P(B) = ie
P(B)
ees(Bt= os
Example 2.19 The probability that a certain type of machine will break down in
the first month of operation is 0.1. If a firm has two such machines
which are installed at the same time, find the probability that, at
the end of the first month, just one has broken down.
Solution 2.19 We assume that the performances of the two machines are independent.
Let A be the event ‘machine 1 breaks down’ and let B be the event
‘machine 2 breaks down’.
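One way to finish this calculation, sketched under the stated independence assumption, is to add the probabilities of the two mutually exclusive cases ‘A occurs and B does not’ and ‘B occurs and A does not’. The variable names below are illustrative.

```python
from fractions import Fraction

p_break = Fraction(1, 10)   # P(A) = P(B) = 0.1, from the example
p_ok = 1 - p_break          # probability that a machine does not break down

# Just one machine breaks down: (A and not B) or (not A and B).
# Independence lets each case be written as a product.
p_just_one = p_break * p_ok + p_ok * p_break

print(p_just_one)         # 9/50
print(float(p_just_one))  # 0.18
```

Under these assumptions the probability that just one machine has broken down is 0.18.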
Exercise 2d
“1, A die is thrown twice. Find the proba- is 0.05. Find the probability that, on two
bility of obtaining a number less than 3 consecutive mornings, (a) I am late for
on both throws. work twice, (b) I am late for work once.
A card is picked from a pack containing Events A and B are such that P(A) = 2 and
52 playing cards. It is then replaced and a P(B) = 4. If A and B are independent
second card is picked. Find the probability
events, find (a2) PPANMB), (b) P(A OB),
that (a) both cards are the seven of
(c) PAN B).
diamonds, (b) the first card is a heart
and the second card is a spade, (c) one If events A and B are such that they are
card is from a black suit and the other is independent and P(A) = 0.3, P(B) = 0.5,
from a red suit, (d) at least one card is a find (a) PPAMB), (b) (AUB).
queen.
Are events A and B mutually exclusive?
A coin is tossed and a die is thrown. What
is the probability of obtaining a head on Events A and B are such that P(A) = 2
the coin and an even number on the die? P(A|B)= 3, P(B) = §. Find (a) P(B|A),
Two men fire at a target. The probability (b) (ANB).
that Alan hits the target is 5 and the In a group of 120 girls, each is either
probability that Bob does not hit the freckled or blonde or both; 80 are
target is . Alan fires at the target first, freckled and 60 are blonde. A girl is to
then Bob fires at the target. Find the be chosen at random from the group. A
probability that (a) both Alan and Bob is the event ‘a freckled girl is chosen’ and
hit the target, (b) only one hits the B is the event ‘a blonde girl is chosen’.
target, (c) neither hits the target. (a) Calculate P(AMB). (b) State, giving a
reason, if you think A and B are indepen-
The probability that I am late for work dent events. (L Additional)
10. A and B are independent events and 11. The probability that I have to wait at the
P(A) = L. P(B) = 3. Find.the probability ee Reed:
traffic lights on my way to school is a
Find the probability that, on two con-
that (a) both A and B occur, (b) only secutive mornings, I have to wait on at
one occurs. least one morning.
Result 9
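For events A and B we have $P(B) = P(B \cap A) + P(B \cap \overline{A})$.
Illustrating this by means of the Venn diagram, with n(S) = n:
$$P(B) = \frac{n(B)}{n} = \frac{n(B \cap A)}{n} + \frac{n(B \cap \overline{A})}{n} = P(B \cap A) + P(B \cap \overline{A})$$
This result is often written
$$P(B) = P(B|A) \cdot P(A) + P(B|\overline{A}) \cdot P(\overline{A})$$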
Solution 2.20 Let A be the event ‘it is sunny tomorrow’ and let B be the event
‘Susan plays tennis tomorrow’.
Then $\overline{A}$ is the event ‘it is not sunny tomorrow’.
P(A) = § and P(A) = 2; also P(B|A) =# and P(B| A) = 2.
We require $P(B) = P(B|A) \cdot P(A) + P(B|\overline{A}) \cdot P(\overline{A})$
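The rule P(B) = P(B|A)P(A) + P(B|not A)P(not A) can be sketched as a small function. The numerical values below are purely illustrative assumptions chosen for this sketch; they are not the values from the example.

```python
from fractions import Fraction

def total_probability(p_a, p_b_given_a, p_b_given_not_a):
    """P(B) = P(B|A) P(A) + P(B|not A) P(not A)."""
    p_not_a = 1 - p_a
    return p_b_given_a * p_a + p_b_given_not_a * p_not_a

# Hypothetical values, assumed only for illustration:
# P(A) = 7/10, P(B|A) = 9/10, P(B|not A) = 2/5.
p_b = total_probability(Fraction(7, 10), Fraction(9, 10), Fraction(2, 5))

print(p_b)  # 3/4
```

With these assumed values, the first term contributes 63/100 and the second 12/100, giving 3/4 in total.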
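Example 2.21 If events A and B are independent, show that events $\overline{A}$ and B are
independent.
Solution 2.21 As A and B are independent, $P(B \cap A) = P(B) \cdot P(A)$.
Now $P(B) = P(B \cap A) + P(B \cap \overline{A})$, so
$$P(B \cap \overline{A}) = P(B) - P(B \cap A) = P(B) - P(B) \cdot P(A) = P(B)(1 - P(A)) = P(B) \cdot P(\overline{A})$$
Therefore $P(B \cap \overline{A}) = P(B) \cdot P(\overline{A})$ and so $\overline{A}$ and B are independent.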
Exercise 2e
1. A bag contains 6 white counters and 4 blue buying none. Calculate the probability
counters. A counter is drawn, its colour is that the winning ticket will be bought by
noted and it is not put back into the bag. A a boy. (L Additional)
second counter is then drawn. Find the 7 ;
probability that the second counter drawn 5. P(X) = 5 and P(Y) = Z. Given that X and Y
is blue. are mutually exclusive, find (a) P(X UY),
2. Ina restaurant 40% of the customers (b) P(Y MX).
choose steak for their main course. If a
6. It is estimated that one-quarter of the
customer chooses steak, the probability
drivers on the road between 11 p.m. and
that he will choose ice cream to follow is
midnight have been drinking during the
0.6. If he does not have steak, the proba-
evening. If a driver has not been drinking,
bility that he will choose ice cream is 0.3.
the probability that he will have an accident
Find the probability that a customer
at that time of night is 0.004%; if he has
picked at random will choose (a) steak
been drinking, the probability of an
and ice cream, (b) ice cream,
accident goes up to 0.02%. What is the
3. Events C and D are such that P(C) = 4, probability that a car selected at random
P(CND) = . P(C|D) = 1% Find at that time of night will have an accident?
(a) (COD), (b) P(D), (c) (DIC). A policeman on the beat at 11.30 p.m. sees
a car run into a lamp-post, and jumps to
4. Exactly 60% of the members of a form are the conclusion that the driver has been
boys, and 90% of these boys and 75% of drinking. What is the probability that he
the girls each buy one raffle ticket, the rest is right? (SMP)
$0 \leq P(E) \leq 1$
$P(S) = 1$
$P(E) + P(\overline{E}) = 1$
$P(A|B) = \frac{P(A \cap B)}{P(B)}$ and $P(B|A) = \frac{P(B \cap A)}{P(A)}$
i.e. $P(A \cap B) = P(A|B) \cdot P(B)$ and $P(B \cap A) = P(B|A) \cdot P(A)$
so that $P(A|B) \cdot P(B) = P(B|A) \cdot P(A)$
If A and B are independent, $P(A|B) = P(A)$
and $P(A \cap B) = P(A) \cdot P(B)$ (multiplication law for independent events)
$P(A) = P(A \cap B) + P(A \cap \overline{B})$
or $P(A) = P(A|B) \cdot P(B) + P(A|\overline{B}) \cdot P(\overline{B})$
Miscellaneous Exercise 2f
Bag A contains 5 red and 4 white counters. A domino is drawn from the set. Let the
Bag B contains 6 red and 3 white counters. event A be ‘The domino is a double’,
A counter is picked at random from bag A event B be ‘The sum of the spots is 6’ and
and placed in bag B. A counter is now event C be ‘The number of spots at each
picked from bag B. Find the probability end differ by more than 3’. On graph paper
that this counter is white. draw a diagram to represent the possibility
space with, for example, the point (1, 2)
In a set of 28 dominoes each domino has representing the selection of the domino
from 0 to 6 spots at each end. Each domino shown in the figure. On your diagram mark
is different from every other and the ends clearly the set of elements associated with
are indistinguishable so that, for example, each of the events A, B and C. Using your
the two diagrams in the figure represent diagram find the probability that (a) both
the same domino. A and B occur, (b) both A and C occur,
(c) both B and C occur.
State a pair of events which are indepen-
dent and also a pair which are mutually
exclusive. Find the probability that A
A domino which has no spots at all or the
occurs and B does not occur.
same number of spots at each end is called (L Additional)
a ‘double’.
The probability that a person in a parti- (You may leave your answers as fractions
in their lowest terms.) (O &C)
cular evening class is left-handed is z. From
the class of 15 women and 5 men a person At a féte the vicar has a board in the
is chosen at random. Assuming that ‘left- shape of a circle, having sectors coloured
handedness’ is independent of the sex of a red and green, with an arrow which can be
person, find the probability that the spun above it: you have to try to guess
person chosen is a man or is left handed. the colour on which the arrow will come
to rest when it is next spun. It is made so
Two events A and B are such that that the results of successive spins are
P(A) = 0.2, P(A' NB) = 0.22, independent, and
P(ANMB) = 0.18.
P(the arrow rests on red) = 0.6
Evaluate (a) P(AMB’), (b) P(A|B). (JMB)
Find the probability of guessing correctly
(NOTE: B’ is the event ‘B does not occur’.) (i) if you always guess ‘green’;
(ii) if you toss a fair coin and guess ‘green’
In a group of 100 people, 40 own a cat, 25
if it comes down ‘head’ and ‘red’ other-
own a dog and 15 own a cat and a dog.
Find the probability that a person chosen wise;
(iii) if your guess is always the colour the
at random (a) owns a dog or a cat,
arrow is resting on before the spin. (SMP)
(b) owns a dog or a cat, but not both,
(c) owns a dog, given that he owns a cat, Two soldiers, Alan and Bill, are shooting at
(d) does not own acat, given that he owns a target with independent probabilities of
a dog.
2 and & respectively of hitting the bull
The two events A and B are such that with a single shot. If they each fire two
P(A) = 0.6, P(B) = 0.2, P(A|B) = 0.1. shots, copy and complete the tables which
Calculate the probabilities that (i) both show the possible outcomes, together with
of the events occur, (ii) at least one of the their associated probabilities.
events occur, (iii) exactly one of the Alan’s two shots at the target
events occurs, (iv) B occurs, given that A
row |[al
has occurred. (JMB) Number of bulls eas aria: ge
$$P(A \cup B \cup C) = P(A) + P(B) + P(C) - P(A \cap B) - P(B \cap C) - P(C \cap A) + P(A \cap B \cap C)$$
To illustrate this, consider the Venn diagram with the number of
elements in each part as shown:
Example 2.22 In the Good Grub Restaurant customers may (if they wish) order
any combination of chips, peas and salad to accompany the main
course. The probability that a customer chooses salad is 0.45,
peas and chips 0.19, salad and peas 0.15, salad and chips 0.25, salad
or peas 0.6, salad or chips 0.84, salad or chips or peas 0.9. Find the
probability that a customer chooses (a) peas, (b) chips, (c) all
three, (d) none of these.
Solution 2.22 Let A be the event ‘salad is chosen’, E the event ‘peas are chosen’
and C the event ‘chips are chosen’.
Then
$P(A) = 0.45$, $P(E \cap C) = 0.19$, $P(A \cap E) = 0.15$,
$P(A \cap C) = 0.25$, $P(A \cup E) = 0.6$, $P(A \cup C) = 0.84$,
$P(A \cup E \cup C) = 0.9$
[Venn diagram of A, E and C, showing $A \cup E \cup C$ and $A \cap E \cap C$.]
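The four answers follow from rearranging the addition laws for two and three events. The sketch below is one possible route, not necessarily the book's working; it uses exact fractions to avoid rounding error, and the variable names are illustrative.

```python
from fractions import Fraction

F = Fraction  # Fraction("0.45") parses the decimal exactly

p_A = F("0.45")
p_EC, p_AE, p_AC = F("0.19"), F("0.15"), F("0.25")
p_A_or_E, p_A_or_C = F("0.6"), F("0.84")
p_any = F("0.9")  # P(A or E or C)

# Two-event addition law rearranged: P(E) = P(A or E) - P(A) + P(A and E).
p_E = p_A_or_E - p_A + p_AE
p_C = p_A_or_C - p_A + p_AC

# Three-event law rearranged for P(A and E and C).
p_all = p_any - (p_A + p_E + p_C) + (p_AE + p_EC + p_AC)

# 'None of these' is the complement of 'salad or chips or peas'.
p_none = 1 - p_any

print(float(p_E), float(p_C), float(p_all), float(p_none))  # 0.3 0.64 0.1 0.1
```

On this reading, (a) P(peas) = 0.3, (b) P(chips) = 0.64, (c) P(all three) = 0.1 and (d) P(none) = 0.1.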
Example 2.23 Records in a music shop are classed in the following sections:
classical, popular, rock, folk and jazz. The respective probabilities
that a customer buying a record will choose from each section are
0.3, 0.4, 0.2, 0.05 and 0.05. Find the probability that a person
(a) will choose a record from the classical or the folk or the jazz
sections, (b) will not choose a record from the rock or folk or
classical sections.
Solution 2.23 A record cannot be classed in more than one section, so the events
are mutually exclusive.
(a) P(classical or folk or jazz) = P(classical) + P(folk) + P(jazz)
= 0.3 + 0.05 + 0.05
= 0.4
The probability that the record will be classical or folk or jazz is 0.4.
Example 2.24 A bag of sweets contains 4 red ‘fruities’ and 5 green ones. A child
picks out 3 fruities one after the other and eats them. Find the
probability that the first is red, the second is green and the third
is red.
Solution 2.24 Let R₁ be the event ‘the first fruitie is red’, G₂ be the event ‘the
second fruitie is green’, R₃ be the event ‘the third fruitie is red’.
We require P(R₁ ∩ G₂ ∩ R₃) = P(R₁)·P(G₂|R₁)·P(R₃|G₂ ∩ R₁).
Now P(R₁) = 4/9, P(G₂|R₁) = 5/8 and P(R₃|G₂ ∩ R₁) = 3/7.
So P(R₁ ∩ G₂ ∩ R₃) = (4/9)(5/8)(3/7)
= 5/42
Therefore the probability that the first is red, the second is green
and the third is red is 5/42.
Independent events
Example 2.25 A die is thrown four times. Find the probability that a 5 is obtained
each time.
Solution 2.25 Let 5₁ be the event ‘a 5 is obtained on the first throw’, 5₂ be the
event ‘a 5 is obtained on the second throw’ and so on.
The events are independent, so
P(5₁ ∩ 5₂ ∩ 5₃ ∩ 5₄) = P(5₁)·P(5₂)·P(5₃)·P(5₄)
= (1/6)(1/6)(1/6)(1/6)
= 1/1296
Therefore the probability that a 5 is obtained each time is 1/1296.
Example 2.26 Three men in an office decide to enter a marathon race. The respec-
tive probabilities that they will complete the marathon are 0.9, 0.7
and 0.6. Find the probability that at least two will complete the
marathon. Assume that the performance of each is independent of
the performances of the others.
Solution 2.26 Let A be the event ‘the first man completes the marathon’, then
P(A) = 0.9.
Let B be the event ‘the second man completes the marathon’, then
P(B) = 0.7.
Let C be the event ‘the third man completes the marathon’, then
P(C) = 0.6.
P(all complete the marathon) = P(A ∩ B ∩ C). We will abbreviate
P(A ∩ B ∩ C) as P(ABC). Then
P(all complete the marathon) = P(ABC)
= P(A)·P(B)·P(C) (independent events)
= (0.9)(0.7)(0.6)
= 0.378
P(two out of the three complete the marathon) = P(ABC̄) + P(AB̄C) + P(ĀBC)
= (0.9)(0.7)(0.4) + (0.9)(0.3)(0.6) + (0.1)(0.7)(0.6)
= 0.456
P(at least two complete the marathon) = 0.378 + 0.456
= 0.834
Therefore the probability that at least two complete the marathon
is 0.834.
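The case-by-case sum in Example 2.26 can be checked by listing every finish/fail pattern for the three men. This is a sketch only; the helper name `prob_at_least` is hypothetical, and independence is assumed as stated in the example.

```python
from itertools import product

# Completion probabilities from Example 2.26.
p_finish = [0.9, 0.7, 0.6]

def prob_at_least(k, probs):
    """Sum P(pattern) over all finish/fail patterns with at least k finishers."""
    total = 0.0
    for pattern in product([True, False], repeat=len(probs)):
        if sum(pattern) >= k:
            p = 1.0
            # Independence: multiply the individual probabilities.
            for finished, q in zip(pattern, probs):
                p *= q if finished else 1 - q
            total += p
    return total

print(round(prob_at_least(2, p_finish), 3))
```

Calling `prob_at_least(3, p_finish)` gives the ‘all complete’ probability in the same way.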
Exercise 2g
Three cards are drawn from a pack containing 52 playing cards. Find the
probability that they are a heart, club and spade, in that order, if
(a) the card is looked at and then replaced after each draw, (b) the
card is not replaced after each draw.

A die is thrown three times. What is the probability of scoring a 2 on
just one occasion?

A coin is tossed four times. Find the probability of obtaining less than
two heads.

A box contains 4 black, 6 white and 2 red balls. Balls are picked out of
the box without replacement. With obvious notation, find
(a) P(B₁ ∩ W₂), (b) P(W₂), (c) P(B₁ ∪ W₂), (d) P(B₁ ∩ W₂ ∩ R₃),
(e) P(the first three are different colours).

Of 24 boys in a class, 8 play rugby, 6 play hockey and 13 play soccer.
One boy plays both soccer and rugby. Every boy plays at least one game
but not one plays all three games. Two boys play both hockey and rugby.
A boy is to be picked at random from the group.
(a) Draw a possibility space diagram to illustrate the situation.
(b) Calculate the probability of a boy being selected who (i) only plays
hockey, (ii) plays both hockey and soccer.
(c) If S is the event ‘a boy is chosen who plays soccer’, H is the event
‘a boy is chosen who plays hockey’ and R is the event ‘a boy is chosen
who plays rugby’, state, giving reasons for your answers, (i) two events
which are independent, (ii) two events which are mutually exclusive.
(L Additional)
In a lucky draw a first prize is given, then a second, then a third
prize. 8 boys and 4 girls each buy one ticket. (a) Find the probability
that (i) a girl has the first prize, a boy the second and a girl the
third, (ii) the prizes go to 3 boys. (b) If the first and third prizes
go to members of one sex and the second to a member of the opposite sex,
find the probability that the second prize goes to a boy.

Three fair cubical dice are thrown. Find the probability that (i) the
sum of the scores is 18, (ii) the sum of the scores is 5, (iii) none of
the three dice shows a 6, (iv) the product of the scores is 90.
(AEB 1974)

(a) A, B and C represent 3 events. If A ∩ B is the event that both A and
B occur and P(B|A) is the probability that B occurs given that A has
already occurred, show that
P(A ∩ B) = P(A)·P(B|A)
Deduce, or show otherwise, that
P(A ∩ B ∩ C) = P(A)·P(B|A)·P(C|A ∩ B)
(b) An athlete aims to measure his fitness by subjecting himself to a
sequence of 3 physical tests, the completion of each test in a specified
time being classed by him as a ‘pass’. The probability that he passes
the first test in the sequence is p, but the probability of passing any
subsequent test is half the probability of passing the immediately
preceding test. Show that, if the probability of passing all 3 tests is
[…], the value of p is […]. Hence find the probabilities (i) that he
fails all the tests, (ii) that he passes exactly 2 of the 3 tests. (MEI)

A, B and C are three events and, for example, A ∪ B denotes the event
that ‘either A or B or both A and B occur’, A ∩ B denotes the event that
both A and B occur, Ā is the event complementary to the event A. Find
Pr(A ∩ B ∩ C) given that […], Pr(B ∩ C) = […], Pr(A|B ∩ C) = […].
(MEI)
PROBABILITY TREES
Example 2.27 A bag contains 8 white counters and 3 black counters. Two counters
are drawn, one after the other. Find the probability of drawing one
white and one black counter, in any order, (a) if the first counter is
replaced, (b) if the first counter is not replaced.
Solution 2.27 (a) The first counter is replaced.
[Tree diagram: 1st draw, 2nd draw]
P(W₁ ∩ W₂) = (8/11)(8/11) = 64/121
P(W₁ ∩ B₂) = (8/11)(3/11) = 24/121
P(B₁ ∩ W₂) = (3/11)(8/11) = 24/121
P(B₁ ∩ B₂) = (3/11)(3/11) = 9/121
NOTE: these events are mutually exclusive, so check that the sum
of the probabilities is 1.
P(drawing one white and one black counter) = P(W₁ ∩ B₂) + P(B₁ ∩ W₂)
= 24/121 + 24/121
= 48/121
The probability of drawing one black and one white counter if the
counter is replaced after the first draw is 48/121.
(b) The first counter is not replaced.
[Tree diagram: 1st draw, 2nd draw]
P(W₁ ∩ W₂) = (8/11)(7/10) = 56/110
P(W₁ ∩ B₂) = (8/11)(3/10) = 24/110
P(B₁ ∩ W₂) = (3/11)(8/10) = 24/110
P(B₁ ∩ B₂) = (3/11)(2/10) = 6/110
Total = 110/110 = 1
P(drawing one white and one black counter) = 24/110 + 24/110
= 48/110
= 24/55
The probability of drawing one black and one white counter if the
counter is not replaced after the first draw is 24/55.
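Both parts of Example 2.27 follow the same two branches of the tree, differing only in the size of the bag at the second draw. A minimal sketch, with a hypothetical helper `one_of_each`:

```python
from fractions import Fraction

def one_of_each(white, black, replace):
    """P(one white and one black counter, in either order) in two draws."""
    total = white + black
    # With replacement the bag is unchanged for the second draw;
    # without replacement it holds one counter fewer.
    second_total = total if replace else total - 1
    p_white_then_black = Fraction(white, total) * Fraction(black, second_total)
    p_black_then_white = Fraction(black, total) * Fraction(white, second_total)
    # The two orders are mutually exclusive, so their probabilities add.
    return p_white_then_black + p_black_then_white

print(one_of_each(8, 3, replace=True))
print(one_of_each(8, 3, replace=False))
```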
Example 2.28 The probability that a golfer hits the ball on to the green if it is
windy as he strikes the ball is 0.4, and the corresponding probability
if it is not windy as he strikes the ball is 0.7. The probability that
the wind will blow as he strikes the ball is 0.3.
Find the probability that (a) he hits the ball on to the green, (b) it
was not windy, given that he does not hit the ball on to the green.
Solution 2.28 Let W be the event ‘it is windy’, then P(W) = 0.3 and P(W̄) = 0.7.
Let H be the event ‘he hits the ball on to the green’.
Then P(H|W) = 0.4 and P(H|W̄) = 0.7.
(a) P(H) = P(H|W)·P(W) + P(H|W̄)·P(W̄)
= (0.4)(0.3) + (0.7)(0.7)
= 0.12 + 0.49
= 0.61
(b) We require P(W̄|H̄) = P(W̄ ∩ H̄)/P(H̄)
Now P(W̄ ∩ H̄) = P(H̄|W̄)·P(W̄) = (0.3)(0.7) = 0.21
and P(H̄) = 1 − 0.61 = 0.39
So P(W̄|H̄) = 0.21/0.39
= 0.54 (2 d.p.)
The probability that it was not windy, given that he does not hit
the ball on to the green, is 0.54 (2 d.p.).
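The two steps of Example 2.28 (total probability, then a conditional probability) can be sketched as follows. The variable names are illustrative only; ‘calm’ stands for ‘not windy’.

```python
# Data from Example 2.28.
p_windy = 0.3
p_hit_given_windy = 0.4
p_hit_given_calm = 0.7

p_calm = 1 - p_windy

# (a) Total probability over the two weather branches of the tree.
p_hit = p_hit_given_windy * p_windy + p_hit_given_calm * p_calm

# (b) P(calm | miss) = P(calm and miss) / P(miss).
p_miss = 1 - p_hit
p_calm_given_miss = (1 - p_hit_given_calm) * p_calm / p_miss

print(round(p_hit, 2), round(p_calm_given_miss, 2))
```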
Example 2.29 Events A and B are such that P(A) = 1 P(B|A) = @ and P(B|A)
Solution 2.29 If we draw atree diagram and put on it the information given, we have
. 3
(a) P(BIA) =F
1
(b) P(ANB) = 75
are y I
(c) P(B) = P(BNA)+P(BNA) =—+ ope
12 15 60
Example 2.30 A fair coin is tossed three times. What is the probability of obtaining
(a) exactly two heads, (b) at least two heads?
[Tree diagram: 1st toss, 2nd toss, 3rd toss; each branch has probability 1/2]
P(H₁ ∩ H₂ ∩ H₃) = 1/8
P(H₁ ∩ H₂ ∩ T₃) = 1/8
P(H₁ ∩ T₂ ∩ H₃) = 1/8
P(T₁ ∩ H₂ ∩ H₃) = 1/8
(a) P(exactly two heads) = 1/8 + 1/8 + 1/8 = 3/8
(b) P(at least two heads) = 3/8 + 1/8 = 1/2
Exercise 2h
The probability that a biased die falls showing a 6 is […]. This biased
die is thrown twice.
(a) Draw a tree diagram showing the possible outcomes and the
corresponding probabilities, considering the event ‘a six is thrown’.
(b) Find the probability that exactly one six will be obtained.
An unbiased die is now thrown.
(c) Extend the tree diagram to show the possible outcomes, again with
regard to whether or not a 6 is thrown.
(d) Find the probability that, in the three throws, exactly one 6 will
be obtained.

Draw a tree diagram to show all the possible total scores and their
respective probabilities after a player has completed two rounds.
Find the probability that a player has (a) a score of 4 after 2 rounds,
(b) an odd number score after 2 rounds. (L Additional)

6. Three bags, A, B and C contain counters as follows:
[Table: counters in bags A, B and C]
9. A bag contains 7 black and 3 white marbles. Three marbles are chosen
at random and in succession, each marble being replaced after it has
been taken out of the bag.
Draw a tree diagram to show all possible selections.
From your diagram, or otherwise, calculate, to 2 significant figures,
the probability of choosing (a) three black marbles, (b) a white marble,
a black marble and a white marble in that order, (c) two white marbles
and a black marble in any order, (d) at least one black marble.
State an event from this experiment which together with the event
described in (d) would be both exhaustive and mutually exclusive.
(L Additional)
BAYES’ THEOREM
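P(Aᵢ|B) = P(B|Aᵢ)·P(Aᵢ) / [P(B|A₁)·P(A₁) + P(B|A₂)·P(A₂) + ... + P(B|Aₙ)·P(Aₙ)]
for i = 1, 2, ..., n
Proof P(Aᵢ|B) = P(Aᵢ ∩ B)/P(B) = P(B|Aᵢ)·P(Aᵢ)/P(B)
Since A₁, A₂, ..., Aₙ are mutually exclusive and exhaustive,
P(B) = P(B ∩ A₁) + P(B ∩ A₂) + ... + P(B ∩ Aₙ)
= P(B|A₁)·P(A₁) + P(B|A₂)·P(A₂) + ... + P(B|Aₙ)·P(Aₙ)
so the result follows.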
Example 2.31 Three girls, Aileen, Barbara and Cathy, pack biscuits in a factory.
From the batch allotted to them Aileen packs 55%, Barbara 30%
and Cathy 15%. The probability that Aileen breaks some biscuits in
a packet is 0.7, and the respective probabilities for Barbara and
Cathy are 0.2 and 0.1. What is the probability that a packet with
broken biscuits found by the checker was packed by Aileen?
Solution 2.31 Let A be the event ‘the packet was packed by Aileen’, B be the
event ‘the packet was packed by Barbara’, C be the event ‘the
packet was packed by Cathy’, D be the event ‘the packet contains
broken biscuits’.
We are given P(A) = 0.55, P(B) = 0.3, P(C) = 0.15
and P(D|A) = 0.7, P(D|B) = 0.2, P(D|C) = 0.1.
We require P(A|D). So we use Bayes’ theorem to ‘reverse the conditions’:
P(A|D) = P(D|A)·P(A)/P(D)
Now P(D) = P(D|A)·P(A) + P(D|B)·P(B) + P(D|C)·P(C)
= (0.7)(0.55) + (0.2)(0.3) + (0.1)(0.15)
= 0.46
Therefore P(A|D) = 0.385/0.46
= 0.837 (3 d.p.)
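Bayes’ theorem as used in Example 2.31 amounts to weighting each packer’s share by her breakage probability and then normalising. A minimal sketch, assuming the shares 55%, 30% and 15% given in the example; the function name `posterior` is hypothetical.

```python
def posterior(priors, likelihoods):
    """Return P(A_i | B) for each i, given P(A_i) and P(B | A_i)."""
    # Denominator of Bayes' theorem: the total probability of B.
    evidence = sum(p * l for p, l in zip(priors, likelihoods))
    return [p * l / evidence for p, l in zip(priors, likelihoods)]

# Aileen, Barbara, Cathy: share of packets and P(broken | packer).
priors = [0.55, 0.30, 0.15]
likelihoods = [0.7, 0.2, 0.1]

post = posterior(priors, likelihoods)
print(round(post[0], 3))
```

The other two entries of `post` give the corresponding probabilities for Barbara and Cathy, and the three values sum to 1.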
Example 2.32 Three children, Catherine, Michael and David, have equal plots in a
circular patch of garden. The boundaries are marked out by pebbles.
Catherine has 80 red and 20 white flowers in her patch, Michael has
30 red and 40 white flowers and David has 10 red and 60 white
flowers. Their young sister, Mary, wants to pick a flower for her
teacher.
(a) Find the probability that she picks a red flower if she chooses
a flower at random from the garden, ignoring the boundaries.
(b) Find the probability that she picks a red flower if she first
chooses a plot at random.
(c) If she picks a red flower by the method described in (b), find
the probability that it came from Michael’s plot.
(b) A plot is chosen first. Each of the three plots is equally likely
to be chosen.
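Let C be the event ‘Catherine’s plot is chosen’, then P(C) = 1/3.
Similarly, for Michael’s and David’s plots, P(M) = P(D) = 1/3.
[Tree diagram: Catherine 80 red, 20 white; Michael 30 red, 40 white;
David 10 red, 60 white]
P(R) = P(R|C)·P(C) + P(R|M)·P(M) + P(R|D)·P(D)
= (4/5)(1/3) + (3/7)(1/3) + (1/7)(1/3)
= 16/35
The probability that Mary picks a red flower if she chooses a plot
at random first is 16/35.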
NOTE: the two different results for part (a) and part (b) are slightly
surprising. In the first case, there is one group of flowers and each
flower is equally likely to be chosen. In the second case, even
though each plot is equally likely to be chosen, the proportions of
red and white flowers within these plots are different.
Therefore P(M|R) = P(R|M)·P(M)/P(R)
= (1/7)/(16/35)
= 5/16
Given that Mary picks a red flower, the probability that it came
from Michael’s plot is 5/16.
Exercise 2i
Find the probability that (a) a customer B is an arbitrary event of S such that
buys grade A petrol, (b) a customer pays P(B) # 0, show that
by cheque, (c) a woman customer pays P(A)-P(B|A)
by cheque, (d) a customer who pays by P(A|B) =
cheque for grade A petrol is a man. P(A)-P(B|A) + P(A) -P(B |A)
: ties that aon boy
(ii) The probabilibicycle goes to
;
6,) A bag contains school by bus, or foot on a
8.) 10 counters of which 4 are
pink, 3»are-green and3 "are yellow. certain day are 0.2, 0.3 and 0.5 respec-
Counters are removed at random, one at tively. The probabilities of his being
a time and without replacement. Find late by these methods are 0.6, 0.3 and 0.1
on the first two branches. whee and write down the results for P(A,| B) and
(b) Calculate (i) QPANMB), (ii) QPANB), P(A3iB).
(iii) P(A UB). A factory has three machines 1,2 and 3,
(c) Draw a possibility space diagram to
illustrate both the given data and your producing a particular type of item. One
anowerd item is drawn at random from the factory’s
(d) Calculate (i) P(AIB), (ii) P( A\B). production. Let B denote the event that
(e) Use your answers to (d); to draw a the chosen item is defective and let A,
denote the event that the item was
fully labelled tree diagram with B
preceding A. (L Additional) produced on machine k where k = 1,2 or
38. Suppose that machines 1,2 and 3
t 10. ) a, s produce respectively 35%, 45% and 20% of
eee the total production of items and that
Gey P(B|A,) = 0.02, P(B| A2) = 0.01,
P(B|A3) = 0.08.
Given that an item chosen at random is
defective, find which machine was most
(i) By considering the diagram which repre- likely to have produced it.
sents the sample space S, for AU A, where (L Additional)
Example 2.33 (a) Find the probability of obtaining at least one 6 when 5 dice are
thrown.
(b) Find the probability of obtaining at least one 6 when n dice are
thrown.
(c) How many dice must be thrown so that the probability of
obtaining at least one 6 is at least 0.99?
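Solution 2.33 (a) P(at least one 6) = 1 − P(no 6)
= 1 − P(6̄ 6̄ 6̄ 6̄ 6̄)
= 1 − (5/6)⁵
= 0.598 (3 d.p.)
The probability of obtaining at least one 6 when 5 dice are thrown
is 0.598 (3 d.p.).
(b) P(at least one 6 when n dice are thrown) = 1 − (5/6)ⁿ
(c) We require 1 − (5/6)ⁿ ≥ 0.99
i.e. (5/6)ⁿ ≤ 0.01
Taking logs of both sides, n log (5/6) ≤ log 0.01
Dividing both sides by log (5/6) and reversing the inequality sign
since log (5/6) is negative, we have
n ≥ log 0.01/log (5/6) = 25.3 (1 d.p.)
So at least 26 dice must be thrown.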
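The three parts of Example 2.33 can be checked numerically. This sketch assumes fair dice, so that P(no 6 on one die) = 5/6; the names are illustrative.

```python
import math

def p_at_least_one_six(n):
    """P(at least one 6 in n throws) = 1 - P(no 6 in n throws)."""
    return 1 - (5 / 6) ** n

# (c) Smallest whole n with (5/6)^n <= 0.01. Dividing by log(5/6),
# which is negative, reverses the inequality: n >= log 0.01 / log(5/6).
n_min = math.ceil(math.log(0.01) / math.log(5 / 6))

print(round(p_at_least_one_six(5), 3), n_min)
```

Evaluating `p_at_least_one_six(n_min - 1)` and `p_at_least_one_six(n_min)` confirms that the probability first reaches 0.99 at `n_min` dice.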
Example 2.34 A, B and C, in that order, throw a tetrahedral die. The first one to
throw a 4 wins. The game is continued indefinitely until someone
wins. Find the probability that (a) A wins, (b) B wins, (c) C wins.
Solution 2.34 With a tetrahedral die the number ‘thrown’ is the number on which
the die lands. Therefore P(4 is thrown) = 1/4.
(a) Let A₁ be the event ‘A wins on his first throw’, A₂ be the event
‘A wins on his second throw’, and so on.
Now P(A₁) = 1/4
P(A₂) = P(Ā₁ B̄₁ C̄₁ A₂) = (3/4)³(1/4)
P(A₃) = P(Ā₁ B̄₁ C̄₁ Ā₂ B̄₂ C̄₂ A₃) = (3/4)⁶(1/4) and so on
Therefore
P(A wins) = P(A₁) + P(A₂) + P(A₃) + ... to infinity (mutually
exclusive events)
= 1/4 + (3/4)³(1/4) + (3/4)⁶(1/4) + ...
= (1/4)S where S is the sum of an infinite G.P.
with a = 1 and r = (3/4)³ = 27/64
= (1/4) × 1/(1 − 27/64)
= (1/4)(64/37)
= 16/37
The probability that A wins is 16/37.
(b)
P(B wins) = P(B₁) + P(B₂) + P(B₃) + ... to infinity (mutually
exclusive events)
Now P(B₁) = P(Ā₁ B₁) = (3/4)(1/4)
P(B₂) = P(Ā₁ B̄₁ C̄₁ Ā₂ B₂) = (3/4)⁴(1/4)
So
P(B wins) = (3/4)(1/4)[1 + (3/4)³ + (3/4)⁶ + ...]
= (3/16)(64/37)
= 12/37
Therefore the probability that B wins is 12/37.
(c)
P(C wins) = 1 − P(A wins) − P(B wins)
= 1 − 28/37
= 9/37
Therefore the probability that C wins is 9/37.
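The geometric-progression argument of Example 2.34 generalises to any number of players throwing in a fixed order. A sketch with a hypothetical helper `win_probabilities`, using exact fractions:

```python
from fractions import Fraction

def win_probabilities(p, players):
    """Players throw in a fixed cycle; each throw wins with probability p."""
    q = 1 - p
    # A whole cycle passes with no winner with probability q^players,
    # which is the common ratio of the G.P.
    ratio = q ** players
    # Player k (0-based) wins in a given cycle with probability q^k * p;
    # summing the G.P. over all cycles divides this by (1 - ratio).
    return [q ** k * p / (1 - ratio) for k in range(players)]

p_a, p_b, p_c = win_probabilities(Fraction(1, 4), 3)
print(p_a, p_b, p_c)
```

The three values sum to 1, which matches the shortcut used for C in part (c).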
Exercise 2j
1. A coin is biased so that the probability that it falls showing tails
is […].
(a) Find the probability of obtaining at least one head when the coin is
tossed five times.
(b) How many times must the coin be tossed so that the probability of
obtaining at least one head is greater than 0.98?

2. A missile is fired at a target and the probability that the target is
hit is 0.7.
(a) Find how many missiles should be fired so that the probability that
the target is hit at least once is greater than 0.995.
(b) Find how many missiles should be fired so that the probability that
the target is not hit is less than 0.001.

3. A die is biased so that the probability of obtaining a 3 is p. When
the die is thrown four times the probability that there is at least one
3 is 0.9375. Find the value of p. How many times should the die be
thrown so that the probability that there are no 3’s is less than 0.03?

4. On a safe there are four alarms which are arranged so that any one
will sound when someone tries to break into the safe. If the probability
that each alarm will function properly is 0.85, find the probability
that at least one alarm will sound when someone tries to break into the
safe.

5. Two people, A and B, play a game. An ordinary die is thrown and the
first person to throw a 4 wins. A and B take it in turns to throw the
die, starting with A. Find the probability that B wins.

6. A, B, C and D throw a coin, in turn, starting with A. The first to
throw a head wins. […] because the others have their first turn before
him. Compare the probability that D wins with the probability that A
wins.

7. A box contains five black balls and one white ball. Alan and Bill
take turns to draw a ball from the box, starting with Alan. The first
boy to draw the white ball wins the game. Assuming that they do not
replace the balls as they draw them out, find the probability that Bill
wins the game.
If the game is changed, so that, in the new game, they replace each ball
after it has been drawn out, find the probabilities that: […]
Show that these answers are terms in a Geometric Progression. Hence find
the probability that Alan wins the new game. (SUJB Additional)
ARRANGEMENTS
NOTE: n! = n(n − 1)(n − 2)...(3)(2)(1).
For example, consider the letters A, B, C, D.
These are
Example 2.35 How many different number plates can be formed if each is to con-
tain the three letters A, C and E followed by the three digits 4, 7, 8?
Solution 2.35 There are 3! ways of arranging the letters A, C and E, and 3! ways
of arranging the digits 4, 7 and 8.
Therefore the total number of different plates = (3!)(3!)
= 36
36 different number plates can be formed.
Example 2.36 (a) In how many ways can the letters of the word STATISTICS
be arranged?
(b) If the letters of the word MINIMUM are arranged in a line
at random, what is the probability that the three M’s are
together at the beginning of the arrangement?
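Solution 2.36 (a) The word STATISTICS has 10 letters, in which S occurs
3 times, T occurs 3 times and I occurs twice.
Therefore
number of ways = 10!/(3!3!2!)
= (10)(9)(8)(7)(6)(5)(4)(3)(2)(1)/((3)(2)(1)(3)(2)(1)(2)(1))
= 50 400
There are 50 400 ways of arranging the letters in the word
STATISTICS.
(b) The word MINIMUM has 7 letters, in which M occurs 3 times and I
occurs twice, so if S is the possibility space, n(S) = 7!/(3!2!) = 420.
Let E be the event ‘the three M’s are placed together at the beginning
of the arrangement’.
The remaining letters I, N, I, U can then be arranged in 4!/2! = 12
ways.
Therefore n(E) = 12
So P(E) = n(E)/n(S)
= 12/420
= 1/35
The probability that the three M’s are together at the beginning of
an arrangement is 1/35.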
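The counts in Example 2.36 use the rule ‘divide n! by the factorial of each repeat count’. A short sketch; the helper name `arrangements` is hypothetical.

```python
from collections import Counter
from fractions import Fraction
from math import factorial

def arrangements(word):
    """Number of distinct arrangements of the letters of word in a line."""
    count = factorial(len(word))
    # Divide by r! for every letter that appears r times.
    for repeats in Counter(word).values():
        count //= factorial(repeats)
    return count

n_statistics = arrangements("STATISTICS")

# (b) With MMM fixed at the front, only I, N, I, U remain to be arranged.
p_mmm_first = Fraction(arrangements("INIU"), arrangements("MINIMUM"))

print(n_statistics, p_mmm_first)
```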
Example 2.37 Ten pupils are placed at random in a line. What is the probability
that the two youngest pupils are separated?
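Now treat these two together as one ‘item’ and so we have 9 ‘items’
to arrange.
These can be arranged in 9! ways and the two youngest can be arranged
between themselves in 2! ways. If E is the event ‘the two youngest are
together’, then n(E) = 2!9!. The ten pupils can be arranged in 10! ways,
so P(E) = 2!9!/10! = 1/5.
Now Ē is the event ‘the two youngest are not together’.
So P(Ē) = 1 − P(E)
= 1 − 1/5
= 4/5
The probability that the two youngest are separated is 4/5.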
Example 2.38 If a four-digit number is formed from the digits 1, 2, 3 and 5 and
repetitions are not allowed, find the probability that the number
is divisible by 5.
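Example 2.38 is small enough to check by listing every four-digit number that can be formed. This brute-force sketch is an illustration only, not the counting argument a written solution would normally use.

```python
from fractions import Fraction
from itertools import permutations

digits = "1235"

# Every arrangement of the four digits, with no repetition.
numbers = [int("".join(p)) for p in permutations(digits)]

# A number is divisible by 5 exactly when its last digit is 5 (or 0).
favourable = [n for n in numbers if n % 5 == 0]

p_divisible_by_5 = Fraction(len(favourable), len(numbers))
print(len(numbers), len(favourable), p_divisible_by_5)
```

The same answer follows by noting that the last digit must be 5, leaving 3! arrangements of the other three digits out of 4! in total.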
[Diagrams: arrangements of the beads A, B, C and D on a ring]
Therefore the number of arrangements of 4 beads on a ring is 3!/2 = 3.
Example 2.39 Six bulbs are planted in a ring and two do not grow. What is the
probability that the two that do not grow are next to each other?
Let E be the event ‘the bulbs that do not grow are next to each
other’. Consider the two bulbs that do not grow as one ‘item’.
They can be arranged in 2! ways. There are now five ‘items’ to be
arranged in a ring and this can be done in 4! ways.
Therefore n(E) = 2!4!
Six bulbs can be arranged in a ring in 5! ways, so n(S) = 5!.
So P(E) = n(E)/n(S)
= 2!4!/5!
= 2/5
The probability that the bulbs that do not grow are next to each
other is 2/5.
Example 2.40 One white, one blue, one red and two yellow beads are threaded on
a ring to make a bracelet. Find the probability that the red and
white beads are next to each other.
n(S) = 4!/((2)(2!))
= 6
Let E be the event ‘the red and the white beads are next to each
other’.
The red and white beads can be arranged in 2! ways.
So n(E) = 3
P(E) = n(E)/n(S)
= 3/6
= 1/2
The probability that the red and white beads are next to each other
is 1/2.
[Diagrams: the six possible arrangements of the beads W, B, R, Y, Y on
the ring]
NOTE: in three of the six arrangements the red and white beads
are next to each other.
Exercise 2k
1. In how many ways can the letters of the word FACETIOUS be arranged in
a line? What is the probability that an arrangement begins with F and
ends with S?

2. (a) In how many ways can 7 people sit at a round table?
(b) What is the probability that a husband and wife sit together?

3. On a shelf there are 4 mathematics books and 8 English books.
(a) If the books are to be arranged so that the mathematics books are
together, in how many ways can this be done?
(b) What is the probability that all the mathematics books will not be
together?

4. If the letters of the word PROBABILITY are arranged at random, find
the probability that the two I’s are separated.

5. Nine children play a party game and hold hands in a circle.
(a) In how many different ways can this be done?
(b) What is the probability that Mary will be holding hands with her
friends Natalie and Sarah?

6. If the letters in the word ABSTEMIOUS are arranged at random, find
the probability that the vowels and consonants appear alternately.

7. (a) In how many different ways can the letters in the word
ARRANGEMENTS be arranged?
(b) Find the probability that an arrangement chosen at random begins
with the letters EE.
⁷P₃ = (7)(6)(5) = 210
Now (7)(6)(5) could be written (7)(6)(5)(4)(3)(2)(1)/((4)(3)(2)(1)) = 7!/4!
i.e. ⁷P₃ = 7!/(7 − 3)!
NOTE: the order in which the letters are arranged is important;
ABC is a different permutation from ACB.
ⁿPᵣ = n!/(n − r)!
NOTE: ⁿPₙ = n!/(n − n)! = n!/0!
But we know that the number of ways of arranging n unlike objects
is n!
So we must define 0! to be 1.
So 0! = 1
⁷C₃ = ⁷P₃/3!
= 7!/(3!4!)
= (7)(6)(5)(4)(3)(2)(1)/((3)(2)(1)(4)(3)(2)(1))
= 35
NOTE: ⁿCᵣ is sometimes written ₙCᵣ or as n above r in large brackets.
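The relations between permutations, combinations and factorials can be checked with the standard library. This sketch assumes Python 3.8 or later, where `math.perm` and `math.comb` are available.

```python
from math import comb, factorial, perm

# 7P3: ordered selections of 3 objects from 7.
p_7_3 = perm(7, 3)

# 7C3: unordered selections; each selection of 3 has 3! orderings.
c_7_3 = comb(7, 3)

# The factorial forms n!/(n-r)! and n!/(r!(n-r)!).
p_formula = factorial(7) // factorial(7 - 3)
c_formula = factorial(7) // (factorial(3) * factorial(7 - 3))

print(p_7_3, c_7_3, p_formula, c_formula)
```

Note that `factorial(0)` returns 1, in line with the definition 0! = 1.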
Example 2.41 In how many ways can a hand of 4 cards be dealt from an ordinary
pack of 52 playing cards?
Solution 2.41 We need to consider combinations, as the order in which the cards
are dealt is not important.
Now ⁵²C₄ = 52!/(4!48!)
= (52)(51)(50)(49)/((4)(3)(2)(1))
= 270 725 ways
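Example 2.42 Four letters are chosen at random from the word RANDOMLY.
Find the probability that all four letters chosen are consonants.
Solution 2.42 The word RANDOMLY has 8 letters, of which 6 are consonants.
If S is the possibility space, then n(S) = ⁸C₄ = 70.
Let E be the event ‘the four letters chosen are consonants’.
n(E) = ⁶C₄
= 6!/(4!2!)
= (6)(5)/((2)(1))
= 15
Now P(E) = n(E)/n(S)
= 15/70
= 3/14
The probability that the four letters chosen are consonants is 3/14.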
Solution 2.43 (a) (i) There are 11 people, from whom 4 are chosen. The order
in which they are chosen is not important.
Number of ways of choosing the team = ¹¹C₄
= 11!/(4!7!)
= (11)(10)(9)(8)/((4)(3)(2)(1))
= 330
If there are no restrictions, the team can be chosen in 330 ways.
(ii) If there are to be more boys than girls, then there must be 3
boys and 1 girl or 4 boys.
Number of ways of choosing 3 boys and 1 girl = (6!/(3!3!))(5!/(1!4!))
= (6)(5)(4)(5)/((3)(2)(1)(1))
= 100
Now n(E) = (6)(5)(4)/((2)(1))
= 60
So P(E) = n(E)/n(S)
= 60/330
= 2/11
Example 2.44 Four items are taken at random from a box of 12 items and
inspected. The box is rejected if more than 1 item is found to be
faulty. If there are 3 faulty items in the box, find the probability
that the box is accepted.
Solution 2.44 The box is accepted if (a) there are no faulty items in the sample
of 4, or (b) there is one faulty item in the sample of 4.
n(S) = ¹²C₄
= 12!/(4!8!)
= (12)(11)(10)(9)/((4)(3)(2)(1))
= 495
There are 9 items that are not faulty, so the number of ways of
choosing 4 items that are not faulty
= ⁹C₄
= 9!/(4!5!)
= (9)(8)(7)(6)/((4)(3)(2)(1))
= 126
The number of ways of choosing 1 faulty item and 3 good items
= (³C₁)(⁹C₃)
= (3!/(1!2!))(9!/(3!6!))
= (3)(9)(8)(7)/((3)(2)(1))
= 252
Let E be the event ‘the number of faulty items chosen is 0 or 1’.
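The acceptance probability in Example 2.44 follows by adding the counts for 0 and 1 faulty items and dividing by n(S). A sketch with a hypothetical helper `p_accept`, assuming every sample of 4 is equally likely:

```python
from fractions import Fraction
from math import comb

def p_accept(box, faulty, sample, max_faulty):
    """P(sample contains at most max_faulty faulty items)."""
    total = comb(box, sample)
    good = box - faulty
    # Choose k faulty items and (sample - k) good items, for each allowed k.
    favourable = sum(comb(faulty, k) * comb(good, sample - k)
                     for k in range(max_faulty + 1))
    return Fraction(favourable, total)

p = p_accept(box=12, faulty=3, sample=4, max_faulty=1)
print(p, round(float(p), 3))
```

With the figures of the example this gives n(E) = 126 + 252 = 378 and P(E) = 378/495 = 42/55.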
(5)(4)/2 = 10
Number of diagonals = n(n − 1)/2 − n
= (n² − n − 2n)/2
= n(n − 3)/2
The number of diagonals for a polygon with n sides is n(n − 3)/2.
Example 2.46 A certain family consists of Mother, Father and their ten sons.
(a) They are invited to send a group of four representatives to a
wedding. Evaluate the number of ways in which the group can
be formed, if it must contain (i) both parents; (ii) one and
only one parent; (iii) neither parent.
(b) On another occasion, the ten sons decide to play five-a-side
football. Evaluate the number of ways in which the teams can
be made up. Determine the probability that the two eldest
brothers are in the same team. (SUJB Additional)
¹⁰C₂ = (10)(9)/((2)(1))
= 45
If the group is to contain both parents, then it can be chosen in
45 ways.
¹⁰C₃ = (10)(9)(8)/((3)(2)(1))
= 120
Therefore number of ways to choose the group of 4 = (2)(120)
= 240
If the group is to contain one, and only one parent, then it can be
chosen in 240 ways.
¹⁰C₄ = (10)(9)(8)(7)/((4)(3)(2)(1))
= 210
If the group is to contain neither parent, then the number of ways
in which it can be chosen is 210.
Let E be the event ‘the two eldest are in the same team’, then
n(E) = 56.
If S is the possibility space, then n(S) = 126.
P(two eldest are in the same team) = P(E)
= n(E)/n(S)
= 56/126
= 4/9
The probability that the two eldest are in the same team is 4/9.
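The counts quoted in Example 2.46 can be reproduced with `math.comb`. The reasoning in the comments is one way of arriving at the figures 126 and 56 and is an assumption about the intended argument; the variable names are illustrative.

```python
from fractions import Fraction
from math import comb

# (a) Groups of four from Mother, Father and ten sons.
both_parents = comb(10, 2)        # both parents plus 2 of the 10 sons
one_parent = 2 * comb(10, 3)      # either parent plus 3 of the 10 sons
neither_parent = comb(10, 4)      # 4 of the 10 sons

# (b) Splitting ten sons into two teams of five: choosing one team fixes
# the other, and each split is counted twice by comb(10, 5).
team_splits = comb(10, 5) // 2

# Two eldest together: pick their 3 team-mates from the other 8 brothers.
eldest_together = comb(8, 3)
p_same_team = Fraction(eldest_together, team_splits)

print(both_parents, one_parent, neither_parent, team_splits, p_same_team)
```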
Exercise 2l
1. From a group of 10 boys and 8 girls, 2 pupils are chosen at random.
Find the probability that they are both girls.

2. From a group of 6 men and 8 women, 5 people are chosen at random.
Find the probability that there are more men chosen than women.

3. From a bag containing 6 white counters and 8 blue counters, 4
counters are chosen at random. Find the probability that 2 white
counters and 2 blue counters are chosen.

4. From a group of 10 people, 4 are to be chosen to serve on a
committee.
(a) In how many different ways can the committee be chosen?
(b) Among the 10 people there is one married couple. Find the
probability that both the husband and the wife will be chosen.
(c) Find the probability that the 3 youngest people will be chosen.

5. Four persons are chosen at random from a group of ten persons
consisting of four men and six women. Three of the women are sisters.
Calculate the probabilities that the four persons chosen will:
(i) consist of four women, (ii) consist of two women and two men,
(iii) include the three sisters. (JMB)

6. A touring party of 20 cricketers consists of 9 batsmen, 8 bowlers and
3 wicket keepers. A team of 11 players must have at least 5 batsmen,
4 bowlers and 1 wicket keeper. How many different teams can be selected,
(a) if all the players are available for selection, (b) if 2 batsmen and
1 bowler are injured and cannot play?

7. Find the number of ways in which 10 different books can be shared
between a boy and a girl if each is to receive an even number of books.

8. Four letters are picked from the word BREAKDOWN. What is the
probability that there is at least one vowel among the letters?

9. How many even numbers can be formed with the digits 3, 4, 5, 6, 7 by
using some or all of the numbers (repetitions are not allowed)?
Example 2.47 The events A and B are such that P(A) = 1/3, P(B) = 2/5 and
P(B|Ā) = 11/20.
Find
(a) P(A ∩ B),
(b) P(A ∪ B),
(c) P(Ā|B),
(d) P(A|B).
State whether A and B are (i) independent, (ii) mutually exclusive.
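[Venn diagram: n(S) = n, n(B) = s, n(A ∩ B) = t, showing the regions
B ∩ Ā and A ∩ B]
(a) Now P(A) = n(A)/n(S), but P(A) = 1/3 so n(A)/n = 1/3.
Similarly P(B) = s/n = 2/5.
Now P(B|Ā) = n(B ∩ Ā)/n(Ā) = (s/n − t/n)/(1 − n(A)/n)
i.e. 11/20 = (2/5 − t/n)/(1 − 1/3)
(2/3)(11/20) = 2/5 − t/n
t/n = 2/5 − 11/30 = 1/30
Now P(A ∩ B) = n(A ∩ B)/n(S)
= t/n
Therefore P(A ∩ B) = 1/30.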
(c) P(Ā|B) = P(Ā ∩ B)/P(B)
= n(Ā ∩ B)/n(B)
= (s − t)/s
= (s/n − t/n)/(s/n)
= (2/5 − 1/30)/(2/5)
= 11/12
Therefore P(Ā|B) = 11/12.
(d) P(A|B) = P(A ∩ B)/P(B)
= (1/30)/(2/5)
= 1/12
Therefore P(A|B) = 1/12.
Example 2.48 Tung-Pong and Ping-Ho play a game of table tennis. The score
reaches 20-20. The game continues until one player has scored two
more points than the other.
The probability that Tung-Pong wins each point is 0.6. What are
the probabilities that:
(a) Tung-Pong wins the game after 2 further points?
(b) Ping-Ho wins the game after 2 further points?
(c) The score is 21-21 after 2 further points?
(d) Tung-Pong wins the game after 3 further points?
(e) Tung-Pong wins the game after 4 further points?
(f) Tung-Pong wins the game after 6 further points?
If the game can continue indefinitely, for each player what is the
probability that he will ultimately win? (SUJB Additional)
Solution 2.48 Let W be the event ‘Tung wins a point’.
Then P(W) = 0.6 and P(W) = 0.4.
(a) P(Tung wins after 2 further points) = P(WW)
= (0.6)(0.6)
= 0.36
The probability that Tung wins after 2 further points is 0.36.
So
P(Tung wins after 4 further points) = P(W W̄ W W) + P(W̄ W W W)
= 2(0.6)³(0.4)
= 0.1728
Therefore
P(Ping wins) = 1 − 9/13
= 4/13
The probability that Tung wins is 9/13 and the probability that Ping
wins is 4/13.
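The pattern in Example 2.48 is that each pair of points either ends the game or returns the score to level, so the probabilities form a G.P. with common ratio P(level after two points). A sketch using exact fractions; the names are illustrative.

```python
from fractions import Fraction

p = Fraction(3, 5)      # P(Tung wins a point)
q = 1 - p               # P(Ping wins a point)
level = 2 * p * q       # P(one point each), i.e. the score is level again

def tung_wins_after(points):
    """P(Tung wins the game after exactly this many further points)."""
    if points % 2:
        # The lead can only reach two after an even number of points.
        return Fraction(0)
    # (points/2 - 1) level pairs, then two points in a row for Tung.
    return level ** (points // 2 - 1) * p * p

# Summing the G.P. to infinity: first term p^2 (or q^2), ratio `level`.
p_tung = p * p / (1 - level)
p_ping = q * q / (1 - level)

print(float(tung_wins_after(4)), p_tung, p_ping)
```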
Example 2.49 (a) A bag contains 5 red and 4 blue balls. 3 balls are picked out,
one at a time, and are not replaced. Find the probability that at
least 1 of the 3 balls is blue.
(b) One letter is selected from each of the names: SIMMS, SMITH,
THOMPSON. What is the probability that 2, and only 2 are the
same?
Solution 2.49 (a) P(3 balls are red) = (5/9)(4/8)(3/7)
= 5/42
So P(at least one ball is blue) = 1 − P(3 balls are red)
= 1 − 5/42
= 37/42
The probability that at least one of the 3 balls is blue is 37/42.
(b) Write S₁ for ‘S is selected from SIMMS’, S₂ for ‘S is selected from
SMITH’, S₃ for ‘S is selected from THOMPSON’, and so on. The outcomes in
which two, and only two, letters are the same are listed with their
respective probabilities:
P(S₁ S₂ S̄₃) = (2/5)(1/5)(7/8) = 0.07
P(S₁ S̄₂ S₃) = (2/5)(4/5)(1/8) = 0.04
P(S̄₁ S₂ S₃) = (3/5)(1/5)(1/8) = 0.015
P(I₁ I₂) = (1/5)(1/5)(1) = 0.04
P(M₁ M₂ M̄₃) = (2/5)(1/5)(7/8) = 0.07
P(M₁ M̄₂ M₃) = (2/5)(4/5)(1/8) = 0.04
P(M̄₁ M₂ M₃) = (3/5)(1/5)(1/8) = 0.015
P(H₂ H₃) = (1)(1/5)(1/8) = 0.025
P(T₂ T₃) = (1)(1/5)(1/8) = 0.025
Total 0.340
Therefore P(2 and only 2 letters are the same) = 0.34.
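Part (b) of Example 2.49 can be verified by listing all 5 × 5 × 8 equally likely selections of one letter from each name. This brute-force sketch is an illustration, not the method of the worked solution.

```python
from fractions import Fraction
from itertools import product

names = ["SIMMS", "SMITH", "THOMPSON"]

# One letter position from each name; repeated letters such as the two
# S's in SIMMS count as separate, equally likely choices.
triples = list(product(*names))

# Exactly two letters the same means exactly two distinct letters.
exactly_two = [t for t in triples if len(set(t)) == 2]

p_two_same = Fraction(len(exactly_two), len(triples))
print(len(triples), len(exactly_two), float(p_two_same))
```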
(c) Let K be the event ‘he knows the correct answer’, then
P(K) = 1/3
P(K ∩ M) = (1/3)(1) = 1/3
P(K ∩ M̄) = 0
We require P(K|M) = P(K ∩ M)/P(M)
Now P(M) = P(M ∩ K) + P(M ∩ K̄)
= 1/3 + 2/15
= 7/15
So P(K|M) = (1/3)/(7/15)
= 5/7
Example 2.50 (a) A bag contains a number of counters, alike in shape and size,
but x are red and y are green. Counters are to be chosen at ran-
dom from the bag. Prove that the probability that the second
counter chosen will be red is the same, whether the first counter
is replaced or not before the second is drawn.
(b) A three-figure number, not less than 100, is to be made up
using three digits selected at random from the digits 0, 1, 2, 3, 4,
5, 6, 7, 8, 9 WITHOUT using the same digit twice in any number.
Show that the total possible number of numbers is 648.
Calculate the probabilities: (i) that the number is even, (ii) that
the number is divisible by 5, (iii) that the number is greater
than 600, (iv) that the number is even and greater than 600.
What are the corresponding results for (i) and (ii) if the same
digit may be used two or three times in the same number?
(SUJB)
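Solution 2.50 (a) The bag contains x red counters and y green counters.
When the first counter is replaced
With obvious notation P(R₂) = x/(x + y)
When the first counter is not replaced
P(R₁ ∩ R₂) = x(x − 1)/((x + y)(x + y − 1))
P(G₁ ∩ R₂) = yx/((x + y)(x + y − 1))
So P(R₂) = x(x − 1)/((x + y)(x + y − 1)) + yx/((x + y)(x + y − 1))
= x(x − 1 + y)/((x + y)(x + y − 1))
= x/(x + y)
The probability that the second counter is red is therefore the same
in both cases.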
The number of ways in which the number is even and greater than
600 = 144.
P(number is even and greater than 600) = $\frac{144}{648} = \frac{2}{9}$
Now consider the case when the same digit may be used two or
three times:
We are concerned with the 3rd digit, which can be chosen in 5 ways.
If there was no restriction, this could be chosen in 10 ways.
P(number is even) = $\frac{5}{10} = \frac{1}{2}$
We are concerned with the 3rd digit which can be chosen in 2 ways.
P(number is divisible by 5) = $\frac{2}{10} = \frac{1}{5}$
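The counts in part (b) are small enough to confirm by listing every three-figure number. This is a brute-force sketch, not the counting argument used in the solution; the helper name `prob` is an illustrative assumption.

```python
from fractions import Fraction
from itertools import permutations, product

digits = range(10)

# Three-figure numbers with no digit used twice (first digit non-zero).
no_repeat = [100 * a + 10 * b + c
             for a, b, c in permutations(digits, 3) if a != 0]

# Three-figure numbers when a digit may be used two or three times.
repeat = [100 * a + 10 * b + c
          for a, b, c in product(digits, repeat=3) if a != 0]


def prob(numbers, condition):
    """Proportion of equally likely numbers that satisfy the condition."""
    favourable = sum(1 for n in numbers if condition(n))
    return Fraction(favourable, len(numbers))


print(len(no_repeat))                                      # 648
print(prob(no_repeat, lambda n: n % 2 == 0 and n > 600))   # 2/9
print(prob(repeat, lambda n: n % 2 == 0))                  # 1/2
print(prob(repeat, lambda n: n % 5 == 0))                  # 1/5
```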
Example 2.51 (a) A and B play a game as follows: an ordinary die is rolled and if
a six is obtained then A wins and if a one is obtained then B
wins. If neither a six nor a one is obtained then the die is rolled
again until a decision can be made. What is the probability that
A wins on (i) the first roll, (ii) the second roll, (iii) the rth
roll? What is the probability that A wins?
(b) A bag contains 4 red and 3 yellow balls and another bag contains
3 red and 4 yellow. A ball is taken from the first bag and
placed in the second, the second bag is shaken and a ball taken
from it and placed in the first bag. If a ball is now taken from
the first bag what is the probability that it is red?
(You are advised to draw a tree diagram.) (SUJB)
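Solution 2.51 (a) (i) P(A wins on 1st roll) = $\frac{1}{6}$
(ii) P(A wins on 2nd roll) = $\left(\frac{2}{3}\right)\left(\frac{1}{6}\right) = \frac{1}{9}$
(iii) P(A wins on rth roll) = $\left(\frac{2}{3}\right)^{r-1}\left(\frac{1}{6}\right)$
These are mutually exclusive events.
So P(A wins) = $\frac{1}{6} + \left(\frac{2}{3}\right)\left(\frac{1}{6}\right) + \left(\frac{2}{3}\right)^{2}\left(\frac{1}{6}\right) + \dots$ to infinity
$= \frac{1}{6}\left[1 + \left(\frac{2}{3}\right) + \left(\frac{2}{3}\right)^{2} + \dots\right]$
$= \frac{1}{6} \cdot \dfrac{1}{1 - \frac{2}{3}}$
$= \frac{1}{2}$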
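The geometric series in part (a) can be evaluated exactly. The sketch below assumes a fair die, as the question does; the variable names are illustrative.

```python
from fractions import Fraction

p_six = Fraction(1, 6)           # A wins on this roll
p_one = Fraction(1, 6)           # B wins on this roll
p_again = 1 - p_six - p_one      # no decision, roll again (2/3)


def p_a_wins_on_roll(r):
    """P(A wins on exactly the r-th roll): r - 1 undecided rolls, then a six."""
    return p_again ** (r - 1) * p_six


# Sum to infinity of a geometric series: first term / (1 - common ratio).
p_a_wins = p_six / (1 - p_again)

print(p_a_wins_on_roll(1), p_a_wins_on_roll(2), p_a_wins)  # 1/6 1/9 1/2
```

The answer of 1/2 also follows from symmetry, since A and B are equally likely to win on any roll.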
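(b) The tree diagram to show the possible outcomes and the
probabilities is as follows:
[Tree diagram. Stages: Draw from 1st bag | Draw from 2nd bag | Draw from 1st bag]
The four paths that end with a red ball from the 1st bag are
P(red, red, red) = $\left(\frac{4}{7}\right)\left(\frac{4}{8}\right)\left(\frac{4}{7}\right) = \frac{64}{392}$
P(red, yellow, red) = $\left(\frac{4}{7}\right)\left(\frac{4}{8}\right)\left(\frac{3}{7}\right) = \frac{48}{392}$
P(yellow, red, red) = $\left(\frac{3}{7}\right)\left(\frac{3}{8}\right)\left(\frac{5}{7}\right) = \frac{45}{392}$
P(yellow, yellow, red) = $\left(\frac{3}{7}\right)\left(\frac{5}{8}\right)\left(\frac{4}{7}\right) = \frac{60}{392}$
P(red from 1st bag) = $\frac{1}{392}(64 + 48 + 45 + 60) = \frac{217}{392}$ = 0.554 (3 d.p.)
The probability that the ball is red is 0.554 (3 d.p.).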
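The tree diagram for part (b) can be traced in code by following each transfer in turn. This is a sketch under the stated bag contents; the function name `p_red_final` and the (red, yellow) tuple layout are illustrative assumptions.

```python
from fractions import Fraction


def p_red_final(bag1=(4, 3), bag2=(3, 4)):
    """P(final ball drawn from bag 1 is red); bags are (red, yellow) counts."""
    r1, y1 = bag1
    r2, y2 = bag2
    total = Fraction(0)
    # First transfer: one ball moves from bag 1 to bag 2.
    for first, p_first in (("R", Fraction(r1, r1 + y1)),
                           ("Y", Fraction(y1, r1 + y1))):
        a_r, a_y = (r1 - 1, y1) if first == "R" else (r1, y1 - 1)
        b_r, b_y = (r2 + 1, y2) if first == "R" else (r2, y2 + 1)
        # Second transfer: one ball moves from bag 2 back to bag 1.
        for second, p_second in (("R", Fraction(b_r, b_r + b_y)),
                                 ("Y", Fraction(b_y, b_r + b_y))):
            c_r, c_y = (a_r + 1, a_y) if second == "R" else (a_r, a_y + 1)
            # Final draw from bag 1.
            total += p_first * p_second * Fraction(c_r, c_r + c_y)
    return total


p = p_red_final()
print(p, round(float(p), 3))  # 31/56 0.554
```

The exact value prints as 31/56, which is 217/392 in lowest terms.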
Example 2.52 (a) A pack of 52 playing cards is cut at random into three piles.
Find the probability that the top cards are all (i) black,
(ii) hearts, (iii) aces.
After the top cards have been examined and found not to be
picture cards, calculate the probability that the three bottom
cards are all queens.
(b) A bag contains eight black counters and two white ones. Each
of two players, A and B, draws one counter in turn, without
replacement, until one of them wins by drawing a white counter.
A draws first. Calculate his chance of winning. (AEB 1975)
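Solution 2.52 (a) (i) $P(B_1B_2B_3) = \left(\frac{26}{52}\right)\left(\frac{25}{51}\right)\left(\frac{24}{50}\right) = \frac{2}{17}$
The probability that the top cards are all black is $\frac{2}{17}$.
(ii) $P(H_1H_2H_3) = \left(\frac{13}{52}\right)\left(\frac{12}{51}\right)\left(\frac{11}{50}\right)$ (non-independent events)
$= \frac{11}{850}$
The probability that the top cards are all hearts is $\frac{11}{850}$.
Therefore $P(Q_1Q_2Q_3) = \left(\frac{4}{49}\right)\left(\frac{3}{48}\right)\left(\frac{2}{47}\right) = \frac{1}{4606}$
The probability that the three bottom cards are queens is $\frac{1}{4606}$.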
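Each answer in part (a) is a product of three conditional probabilities of the same shape, so one small helper covers them all. This is an illustrative sketch; the helper name `p_all_from` is not from the text, and the last call assumes the three bottom cards come from the 49 cards left once the three non-picture top cards are excluded.

```python
from fractions import Fraction


def p_all_from(group, deck, draws=3):
    """P(`draws` cards taken without replacement all belong to `group`)."""
    p = Fraction(1)
    for i in range(draws):
        p *= Fraction(group - i, deck - i)
    return p


print(p_all_from(26, 52))  # all black: 2/17
print(p_all_from(13, 52))  # all hearts: 11/850
print(p_all_from(4, 52))   # all aces: 1/5525
print(p_all_from(4, 49))   # all queens among the other 49 cards: 1/4606
```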
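(b) Let $A_1$ be the event ‘A wins on his first draw’,
$A_2$ be the event ‘A wins on his second draw’, and so on.
Then, with obvious notation for the black and white counters,
$P(A_1) = P(W) = \frac{2}{10} = \frac{1}{5}$
$P(A_2) = P(BBW) = \left(\frac{8}{10}\right)\left(\frac{7}{9}\right)\left(\frac{2}{8}\right) = \frac{7}{45}$
$P(A_3) = P(BBBBW) = \left(\frac{8}{10}\right)\left(\frac{7}{9}\right)\left(\frac{6}{8}\right)\left(\frac{5}{7}\right)\left(\frac{2}{6}\right) = \frac{1}{9}$
$P(A_4) = P(BBBBBBW) = \frac{1}{15}$
$P(A_5) = P(BBBBBBBBW) = \frac{1}{45}$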
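The probabilities $P(A_1), \dots, P(A_5)$ can be accumulated draw by draw and summed to give A's overall chance of winning. The sketch below is one way to do this with exact fractions; the function name and loop structure are illustrative assumptions rather than the book's method.

```python
from fractions import Fraction


def p_first_player_wins(black=8, white=2):
    """P(the player who draws first gets the first white counter)."""
    p_win = Fraction(0)
    p_no_white_yet = Fraction(1)   # P(every draw so far was black)
    draw = 1
    while p_no_white_yet > 0:
        remaining = black + white
        if draw % 2 == 1:          # odd-numbered draws belong to A
            p_win += p_no_white_yet * Fraction(white, remaining)
        # Extend the run of black counters by one more draw.
        p_no_white_yet *= Fraction(black, remaining)
        black -= 1
        draw += 1
    return p_win


print(p_first_player_wins())  # 5/9
```

The five terms added are 1/5, 7/45, 1/9, 1/15 and 1/45, whose sum is 25/45 = 5/9.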
Example 2.53 (a) Ruby Welloff, the daughter of a wealthy jeweller, is about to
get married. Her father decides that as a wedding present she
can select one of two similar boxes. Each box contains three
stones. In one box two of the stones are real diamonds, and the
other is a worthless imitation; and in the other box one is a real
diamond, and the other two are worthless imitations. She has
no idea which box is which. If the daughter were to choose
randomly between the two boxes, her chance of getting two
real diamonds would be $\frac{1}{2}$. Mr Welloff, being a sporting type,
allows his daughter to draw one stone from one of the boxes
and to examine it to see if it is a real diamond. The daughter
decides to take the box that the stone she tested came from if
the tested stone is real, and to take the other box otherwise.
Now what is the probability that the daughter will get two real
diamonds as her wedding present?
(b) A fair die is cast; then n fair coins are tossed, where n is the
number shown on the die. What is the probability of exactly
two heads?
(c) A fair die is thrown for as long as necessary for a 6 to turn up.
Given that 6 does not turn up at the first throw, what is the
probability that more than four throws will be necessary?
(AEB 1979)
Solution 2.53 (a) Let A be the event ‘she chooses the box with 2 diamonds’,
B be the event ‘she chooses the box with 1 diamond’,
D be the event ‘she chooses a diamond from the box’.
P(2 diamonds) = $P(A \cap D) + P(B \cap \overline{D})$
$= \left(\frac{1}{2}\right)\left(\frac{2}{3}\right) + \left(\frac{1}{2}\right)\left(\frac{2}{3}\right)$
$= \frac{2}{3}$
Therefore the probability that she has 2 diamonds for her wedding
present is $\frac{2}{3}$.
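The daughter's strategy in part (a) leads to the two-diamond box in exactly two ways, which the following sketch adds up. The variable names are illustrative and the box contents are taken from the question.

```python
from fractions import Fraction

p_pick_box = Fraction(1, 2)          # either box is equally likely to be tested

# Box A: 2 diamonds, 1 imitation. Box B: 1 diamond, 2 imitations.
p_diamond_given_a = Fraction(2, 3)
p_imitation_given_b = Fraction(2, 3)

# She finishes with box A if she tests A and sees a diamond (keeps A),
# or tests B and sees an imitation (switches to A).
p_two_diamonds = (p_pick_box * p_diamond_given_a
                  + p_pick_box * p_imitation_given_b)

print(p_two_diamonds)  # 2/3
```

Testing one stone raises her chance from 1/2 to 2/3.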
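So P(2H, 1T) = $\frac{3}{8}$
(iii) Four coins are tossed
The possibility space consists of $2^4$ equally likely outcomes.
Number of ways of arranging H, H, T, T = $\dfrac{4!}{2!\,2!} = 6$.
So P(2H, 2T) = $\frac{6}{16} = \frac{3}{8}$
So P(2H, 3T) = $\frac{10}{32} = \frac{5}{16}$
So P(2H, 4T) = $\frac{15}{64}$
Therefore
P(exactly 2 heads) = $\frac{1}{6}\left(\frac{1}{4} + \frac{3}{8} + \frac{3}{8} + \frac{5}{16} + \frac{15}{64}\right)$
$= \frac{33}{128}$
(c) We require $P(\overline{6}_2\,\overline{6}_3\,\overline{6}_4 \mid \overline{6}_1) = \dfrac{P(\overline{6}_1\,\overline{6}_2\,\overline{6}_3\,\overline{6}_4)}{P(\overline{6}_1)}$
$= \dfrac{\left(\frac{5}{6}\right)^{4}}{\frac{5}{6}}$
$= \left(\frac{5}{6}\right)^{3}$
$= \frac{125}{216}$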
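Parts (b) and (c) can both be checked with exact arithmetic. The sketch below conditions on the die score n and uses the binomial count of ways to place two heads among n coins; the function name is an illustrative assumption.

```python
from fractions import Fraction
from math import comb


def p_exactly_two_heads():
    """Average P(2 heads in n fair coins) over a fair die score n = 1..6."""
    total = Fraction(0)
    for n in range(1, 7):
        # comb(n, 2) arrangements out of 2**n equally likely outcomes.
        total += Fraction(1, 6) * Fraction(comb(n, 2), 2 ** n)
    return total


print(p_exactly_two_heads())  # 33/128

# Part (c): no six on throws 2, 3 and 4, given no six on throw 1.
p_more_than_four = Fraction(5, 6) ** 4 / Fraction(5, 6)
print(p_more_than_four)  # 125/216
```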
Example 2.54 (a) When a person needs a minicab, it is hired from one of three
firms, X, Y and Z. Of the hirings 40% are from X, 50% are from
Y and 10% are from Z. For cabs hired from X, 9% arrive late,
the corresponding percentages for cabs hired from firms Y and
Z being 6% and 20% respectively. Calculate the probability
that the next cab hired
(i) will be from X and will not arrive late,
(ii) will arrive late.
Given that a call is made for a minicab and that it arrives late,
find, to 3 decimal places, the probability that it came from Y.
(b) For a certain strain of wallflower, the probability that, when
sown, a seed produces a plant with yellow flowers is $\frac{1}{6}$. Find
the minimum number of seeds that should be sown in order
that the probability of obtaining at least one plant with yellow
flowers is greater than 0.98. (L)
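For part (a), the three requested probabilities follow from the multiplication law, the total probability of a late arrival, and Bayes' theorem. The sketch below is an illustration of that calculation using the percentages given in the question; the dictionary names `share` and `late` are assumptions made for the example.

```python
from fractions import Fraction

# Proportion of hirings from each firm, and P(late | firm).
share = {"X": Fraction(40, 100), "Y": Fraction(50, 100), "Z": Fraction(10, 100)}
late = {"X": Fraction(9, 100), "Y": Fraction(6, 100), "Z": Fraction(20, 100)}

# (i) From X and not late.
p_x_not_late = share["X"] * (1 - late["X"])

# (ii) Late, summed over the three mutually exclusive firms.
p_late = sum(share[firm] * late[firm] for firm in share)

# Given late, the probability that the cab came from Y.
p_y_given_late = share["Y"] * late["Y"] / p_late

print(float(p_x_not_late))               # 0.364
print(float(p_late))                     # 0.086
print(round(float(p_y_given_late), 3))   # 0.349
```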
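(b) When a seed is sown, P(yellow flower) = $\frac{1}{6}$
When n seeds are sown,
P(at least one yellow flower) = 1 − P(no yellow flowers)
$= 1 - \left(\frac{5}{6}\right)^{n}$
Now we need
P(at least one yellow flower) > 0.98
i.e. $1 - \left(\frac{5}{6}\right)^{n} > 0.98$, so $\left(\frac{5}{6}\right)^{n} < 0.02$
Taking logs of both sides, $n \log\left(\frac{5}{6}\right) < \log 0.02$
Dividing both sides by $\log\left(\frac{5}{6}\right)$ and reversing the inequality since
$\log\left(\frac{5}{6}\right)$ is negative, we have
$n > \dfrac{\log 0.02}{\log\left(\frac{5}{6}\right)}$
n > 21.45...
Therefore the minimum number of seeds that should be sown
is 22.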
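The bound can be confirmed both from the logarithm formula and by direct search. The sketch below assumes the per-seed probability is 1/6, the value consistent with the printed bound n > 21.45...; the variable names are illustrative.

```python
import math
from fractions import Fraction

p_yellow = Fraction(1, 6)      # assumed per-seed probability of a yellow flower
target = Fraction(98, 100)

# Direct search: smallest n with 1 - (5/6)^n > 0.98, using exact fractions.
n = 1
while 1 - (1 - p_yellow) ** n <= target:
    n += 1

# The same bound from the logarithm rearrangement used in the solution.
bound = math.log(0.02) / math.log(5 / 6)

print(n, round(bound, 2))  # 22 21.46
```

Both approaches agree that 21 seeds are not enough and 22 are.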
Miscellaneous Exercise 2m
(a) Two dice are thrown together, and The probability that it is red is 1.5 times
the scores added. What is the probability the probability that it is blue, and the
that (i) the total score exceeds 8? (ii) the probability that it is blue is twice the
total score is 9, or the individual scores probability that it is green. Find the
differ by 1, or both? probabilities that the counter is (a) red,
(b) A bag contains 3 red balls and 4 black (b) blue, (c) green.
ones. 3 balls are picked out, one at a time A counter is taken at random from the
and not replaced. What is the probability bag, its colour is noted and it is then
that there will be 2 red and 1 black in the replaced in the bag. The process continues
sample? until at least one of each colour has been
(c) A committee of 4 is to be chosen seen. Considering the order in which the
from 6 men and 5 women. One particular colours are first seen, find the proba-
man and one particular woman refuse to bilities that (d) red is seen before green,
serve if the other person is on the com- (e) the order is green, blue and finally
mittee. How many different committees red. (O &C)
may be formed? (SUJB)
There are eight girls and ten boys in the
In Camelot it never rains on Friday, upper sixth form of a small school. Six
Saturday, Sunday or Monday. The prefects are to be selected. In how many
probability that it rains on a given Tues- ways can this be done if (a) there must
day is :. For each of the remaining two be three girl and three boy prefects;
days, Wednesday and Thursday, the con- (b) there must be at least four boy
ditional probability that it rains, given prefects?
that it rained the previous day, is @, and Amongst the eighteen pupils, there is a
the conditional probability that it rains, pair of twins, one girl and one boy. The
given that it did not rain the previous Headmaster has decided that there must
day, is B. be three girl and three boy prefects. Find
(a) Show that the (unconditional) proba- the probability that both of the twins
bility of rain on a given Wednesday is are selected. (SUJB Additional)
eat 46), and find the probability of rain
The events A and B are such that
on a given Thursday.
(b) If X is the event that, in a randomly
chosen week, it rains on Thursday, Y is
(4) =— 1=3
P
(4'1B) == 3—1
the event that it rains on Tuesday, and Y
is the event that it does not rain on Tues- P(AB)
day, show that
3
P(X|Y)—P(X|Y) = (a—6) P(AUB) = 5
(c) Explain the implications of the
case a= 8B. (Cambridge) where A’ is the event ‘A does not occur’.
Using a Venn diagram, or otherwise,
Mass-produced glass bricks are inspected determine P(B| A’), P(BMA) and P(A|B’).
for defects. The probability that a brick The event C is independent of A and
has air bubbles is 0.002. If a brick has air P(ANC) = §. Determine P(C| A’).
bubbles the probability that it is also
cracked is 0.5 while the probability that a State, with a reason in each case, whether
brick free of air bubbles is cracked is (a) A and B are independent, (b) A and
0.005. What is the probability that a C are mutually exclusive. (Cambridge)
brick chosen at random is cracked? If p; and p2 are the probabilities of two
The probability that a brick is discoloured
independent events, show that the proba-
is 0.006. Given that discolouration occurs
bility of the simultaneous occurrence of
independently of the other two defects, these two events is pp.
find the probability that a brick chosen at
random has no defects. (O &C) In 18 games of chess between A and B,
A wins 8, B wins 6 and 4 are drawn.
A bag contains red, blue and green A and B play a tournament of 3 games.
counters of equal size and shape. A On the basis of the above data, estimate
counter is taken at random from the bag. the probability that: (a) A wins all three
games, (b) A and B win alternately, Queen’s and the: Royalty. Find the
(c) two games are drawn, (d) A wins at probability that (a) A and B meet, (b) B
least one game. (AEB 1972) and C meet, (c) A, B and C all meet,
(d) A, B and C all go to different places,
(a) From an ordinary pack of 52 cards (e) at least two meet. (C)
two are dealt face downwards on a table.
What is the probability that (i) the first 12. In a game, three cubical dice are thrown
card dealt is a heart, (ii) the second card by a player who attempts to throw the
dealt is a heart, (iii) both cards are hearts, same number on all three. What is the
(iv) at least one card is a heart? chance of the player
(6) Bag A contains 3 white counters and (a) throwing the same number on all
2 black counters whilst bag B contains three?
2 white and 3 black. One counter is (6) throwing the same number on just
removed from bag A and placed in bag two?
B without its colour being seen. What is If the first throw results in just two dice
the probability that a counter removed showing the same number, then the third
from bag B will be white? is thrown again. If no two dice show the
(c) A box of 24 eggs is known to con- same number, then all are thrown again.
tain 4 old and 20 new eggs. If 3 eggs are The player then comes to the end of his
picked at random determine the proba- turn. What is the chance of the player
bility that (i) 2 are new and the other succeeding in throwing three identical
old, (ii) they are all new. (SUJB) numbers in a complete turn?
The probabilities of A, B or C winning a What is the chance that all the numbers
game in which all three take part are 0.5, are different at the end of a turn?
0.3 and 0.2 respectively. A match is won (O &C)
by a player who first wins two games.
13. Alec and Bill frequently play each other
Find the probability that A will win a in a series of games of table tennis.
game involving all three players. Records of the outcomes of these games
When the players are joined by a fourth indicate that whenever they play a series
player, D, the probabilities of A, B or C of games, Alec has the probability 0.6 of
winning a game, in which all four take winning the first game and that in every
part, are reduced to 0.3, 0.2 and 0.1 subsequent game in the series, Alec’s
respectively. A match is played in which probability of winning the game is 0.7 if
all four players take part; again, the first he won the preceding game but only 0.5
player to win two games wins the match. if he lost the preceding game. A game
Find the probabilities that D wins in cannot be drawn. Find the probability
fewer than (i) four games, (ii) five that Alec will win the third game in the
games, (iii) six games. (JMB) next series he plays with Bill. (JMB)
10. A bag contains 5 red, 4 orange and 3 14. The events A and B are such that
yellow sweets. One after another 3 children
1
select and eat one sweet each. When the (A) = =és
P(A)
bag contains n sweets, the probability of
any one child choosing any particular dl
P(A or B but not both A and B) = ee
sweet is 1/n. What are the probabilities
that (a) they all choose red sweets,
(b) at least one orange sweet is chosen, PB) = =4
(c) each chooses a different colour,
(d) all choose the same colour? Answers Calculate P(ANB), P(A’ OB), P(A| B) and
may be left as fractions in their lowest P(B| A’), where A’ is the event ‘A does
terms. (O &C) not occur’. State, with reasons, whether
A and B are (a) independent, (6) mutu-
11. Three men, A, B and C agree to meet at ally exclusive. (C)
the theatre. The man A cannot remember
15. (a) Two men each have a set of 7 cards,
whether they agreed to meet at the
numbered 1 to 7. Each shows a card
Palace or the Queen’s and tosses a coin
to decide which theatre to go to. The man drawn at random. Find the probability
B also tosses a coin to decide between the that the total of the two numbers is
(i) even, (ii) odd, (iii) greater than 5.
Queen’s and the Royalty. The man Cc
(b) A signal consisting of 7 dots and/or
tosses a coin to decide whether to go to
dashes is to be given. The probability of a
the Palace or not and in this latter case
he tosses again to decide between the
dot in any position is 2/5 and of a dash is
3/5. Find the probability that, in asignal, 18. Six fuses, of which two are defective and
no two consecutive characters are the four are good, are to be tested one after
same. another in random order until both defec-
(c) A die is loaded so that the chance of tive fuses are identified. Find the proba-
throwing a one is x/4, the chance of a two bility that the number of fuses that will be
is 1/4 and the chance of a six is (1— x)/4. tested is
The chance of a three, four or five is 1/6. (a) three,
The die is thrown twice. (b) four or fewer. (L)P
Prove that the chance of throwing a total 19. In this question you may leave the answers
9x —9x?+ 10 as fractions. Your arguments must be
of7 is carefully explained in both parts.
72 :
Find the value of x which will make this (a) A pack of ten cards consists of two
chance a maximum, and find this maxi- marked with the letter A, three with E,
mum probability. (SUJB) four with S and one with T. The pack is
well shuffled and six cards are dealt.
Find the probability that (i) they form
the word ASSETS, the letters
16. 4 girls and 3 boys plan to meet together
appearing in that order; (ii) the letters
on the following Saturday. The ie pian
either form or can be made to form the
that each boy will be present is 2 indepen- word ASSETS.
dently of the other boys. Find the proba- (bo) A manufacturer of tea inserts one of
bility that (a) 0, (6) 1, (c) 2, (d) 3 boys five types of picture card into each
will be present. packet. Equal numbers of each type
The probability that each girl will be are distributed randomly. Estimate the
present is 4 independently of the other probability that a person buying three
packets will have (i) three cards of the
girls and of the boys.
same type, (ii) just two the same. If a
(e) Find the probability that the mnoee
person buys five packets, estimate the
of girls present will equal the number of
probability of obtaining five different
boys.
(f) Find the probability that both sexes types of card. (SUJB)
will be present. 20. A census of married couples showed that
(g) Afterwards it was reported that the 50% of the couples had no car, 40% had
gathering had included at least one boy one car and the remaining 10% had two
and at least one girl. What is the proba- cars. Three of the married couples are
bility that there were equal numbers of chosen at random.
boys and girls in the light of this addi- (a) Find the probability that one couple
tional information? had no car, one has one car and one has
(Answers may be left as fractions in their two cars.
lowest terms.) (OO &C) (6) Find the probability that the three
couples have a combined total of three
cars.
17. A sailing competition between two The census also showed that both the
boats, A and B, consists of a series of husband and the wife were in full-time
independent races, the competition being employment in 16% of those couples
won by the first boat to win three races. having no car, in 45% of those having one
Every race is won by either A or B, and car and in 60% of those having two cars.
their respective probabilities of winning (c) For a randomly chosen married
are influenced by the weather. In rough couple find the probability that both the
weather the probability that A will win husband and wife are in full-time employ-
is 0.9; in fine weather the probability ment.
that A will win is 0.4. For each race the (d) Given that a randomly chosen married
weather is either rough or fine, the couple is one where both the husband
probability of rough weather being 0.2. and wife are in full-time employment,
Show that the probability that A will find the conditional probability that the
win the first race is 0.5. couple has no car. (JMB)
Given that the first race was won by A, 21. Three machines A, B and C produce 25%,
determine the conditional probability 25% and 50% respectively of the output
that (a) the weather for the first race of a factory manufacturing a certain
was rough, (b) A will win the competi- article. A sample of 3 articles is selected
tion. (C) at random from the total output. Find
the probabilities that (a) they are all (a) the five observations include the
from C, (b) at least 2 are from B. largest and the least among the 12 ob-
If a second independent sample of 3 servations,
articles is selected, find the probability (b) the second largest and the second
that both samples have the same number smallest will be included,
of articles produced by A. (c) the five smallest observations are
included,
Of the articles produced by A, B and C, (d) at least three of the smallest five
1%, 2% and 5% respectively are defective. observations are included? (MET)
A single article is selected at random. If
D denotes the event ‘defective’ and C the 26. In a class of 30 pupils, 12 walk to school,
event ‘produced by machine C’, find 10 travel by bus, 6 cycle and 2 travel by
p(D) and p(C and D). car. If 4 pupils are picked at random,
obtain the probabilities that (a) they all
An article is examined and found to be travel by bus, (b) they all travel by the
defective. What is the probability that it same means.
was produced by C? (SMP)
If 2 are picked at random from the class,
(a) The events A and B are such that find the probability that they travel by
22.
P(A) = 0.6, P(B) = 0.25, P(A UB) =0.725. different means.
Show that the events A and B are neither In picking out pupils from the class, find
mutually exclusive nor independent. Cal- the probability that more than three
culate the values of P(AUB) and P(A|B). trials are needed before a pupil who
(b) One red card and two black cards are walks to school is selected. (JMB)
removed from a pack of cards. From the
27. Four ball-point pen refills are to be drawn
remainder, three cards are taken at
at random without replacement from a
random without replacement. Show that
bag containing ten refills, of which 5 are
the probability that they are all of the
red, 3 are green and 2 are blue. Find
same colour is 3 Assuming that this (a) the probability that both blue refills
event occurs, find the probability that a will be drawn,
fourth card drawn from the remaining 46 (b) the probability that at least one refill
cards will be of the same colour as the of each colour will be drawn. (JMB)
previous three. (L Additional)
28. At the ninth hole on a certain golf course
there is a pond. A golfer hits a grade B
23. Events A and B are such that P(A) = 3
ball into the pond. Including the golfer’s
P(A|B) = §, (ANB) = §- Find (a) P(B), ball there are then 6 grade C, 10 grade B
(b) P(A|B), (c) P(BIA), (d) (AUB). and 4 grade A balls in the pond. The
State whether events A and B are golfer uses a fishing net and ‘catches’ four
(a) mutually exclusive, (b) independent. balls. The events X, Y and-Z are defined
as follows:
24. The following are three of the classical X: the catch consists of two grade A
problems in probability. balls and two grade C balls
(a) Compare the probability of a total of Y: the catch consists of two grade B
9 with the probability of a total of 10 balls and two other balls
when three fair dice are tossed once Z: the catch includes the golfer’s own
(Galileo and Duke of Tuscany).
ball
(b) Compare the probability of at least
one six in four tosses of a fair die with Assuming that the catch is a random
the probability of at least one double-six selection from the balls in the pond,
in twenty-four tosses of two fair dice determine
(Chevalier de Mere). (a) P(X), (b) P(Y), (¢) P(Z), (d) P(ZI Y).
(c) Compare the probability of at least
For each of the pairs X and Y, Y and Z,
one 6 when six dice are tossed with the
state, with a brief reason, whether the
probability of at least two sixes when 12
two events are (i) mutually exclusive,
dice are tossed (Pepys to Newton). (ii) independent. (C)
Solve each of these problems. (AEB 1978)
29. A committee of 8 members consists of
one married couple together with 4 other
25. A set consists of 12 observations no two
men and 2 other women. From the
of which are equal. Five of the observa-
committee a working party of 4 persons
tions are selected at random. What are
is to be formed. Find the number of
the probabilities that
different working parties which can be score on the card and die noted. X
formed. denotes the event ‘Both dice are thrown’,
i : . and Y denotes the event ‘The score noted
CE eS tay is less than five.’ Calculate the proba-
party ae
(a) may not contain both the husband bilities
and his wife, (a) P(X), (b) (XNY), (ce) P(Y),
(b) must contain 2 men and 2 women, (d) P(Y|X), (e) P(XIY). (C)
(c) must contain at least one man and at 33. In a constituency containing many
LOS EO Co elderly inhabitants there are twice as
The 8 committee members sit round an many women as men. At an election
octagonal table, their positions being seven-eighths of the women and half the
decided by drawing lots. Find the proba- men cast a vote. Show that the proba-
bility of bility that an adult inhabitant (selected
(d) the man sitting next to his wife, at random) casts a vote is 3/4. For a
(e) the man sitting opposite to his wife, random group of four inhabitants, find
(f) the 3 women sitting together. (AEB) (a) the probability that just one of them
‘ votes:
30. In a game of chance, a player’s turn (b) the probability that two or more
starts by drawing a card at random from vote
a pack of playing cards. If he draws a black a :
card which is not an ace, his turn ends. If It is further found that for married
he draws a black ace he throws a black couples the probability that a man
die, and if he draws a red card he throws votes is z, the probability that a woman
a red die. After a die has been thrown, votes is g the probability that a woman
the card that was drawn is replaced in the
votes given that her husband votes is 2,
pack which is then shuffled and the
player draws again with the same con- and the probability that a man votes
ditions leading to the throwing of a die. given that his wife votes is 2 (you may
This continues until the player draws a assume that this information is con-
black card, which is not an ace, when his sistent). Find
turn ends. A player’s score in any turn is (c) the probability that a husband and
the sum of the scores thrown with the wife both vote;
red die plus three times the sum of the (d) the probability that a husband votes
scores thrown with the black die. Cal- and his wife does not vote;
culate the probability that in a turn a (e) the expected number of votes per
player will score (a) zero, (b) exactly married couple.* (SMP)
three. (L)
34. Inasingle round of a general knowledge
31. (a) Two cards are drawn at random contest, each competitor is first asked a
without replacement, from an ordinary question. If the competitor answers
pack of 52 cards. Find the probability correctly, then that competitor is asked
that they are: (i) of the same suit, another question. This continues until
(ii) of the same value (both aces, both either the competitor has answered five
kings, etc.), (iii) either of the same suit questions correctly, in which case the
or of the same value. competitor scores six points (including a
(6) Two cards are drawn at random, one bonus point), or until the competitor ‘
from each of two ordinary packs. Find answers a question incorrectly, in which
the probability that they are (i) of the case the competitor’s score in that round
same suit, (ii) of the same value, is equal to the number of correct answers
(iii) either of the same suit, the same given.
value, or both.
One of the competitors is named Smith.
(c) Three fair cubical dice are thrown. The probability that Smith answers a
Find the probability that the sum of the question correctly is p, independent of all
number of spots on the upper faces is a
previous answers. Determine the proba-
perfect square. (AEB 1976)
bility distribution of Smith’s score in a
32. A card disd single round, and show that Smith’s
is drawn from a full pack of 52 bxpected eoere is pat py pte op*)*.
playing cards. If the card drawn is an
Ace, King, Queen or Jack, two dice are At the start of the final round of the
thrown and the sum of the scores on the contest Smith is 3 points ahead of Jones,
dice noted. If any other card is drawn, and Smith and Jones are then the only
one die only is thrown and the sum of the *Expectation required — see p.171.
competitors who can win the contest. The probability that Jones
answers a question correctly is also p, independent of all previous
answers. Show that the probability that Jones wins the contest is
$p^{4}(1-p)(1+p^{2}+p^{3})$.
Given that Jones wins the contest, determine the probability that
he scored 6 in the final round. (C)
35. Two men are walking directly towards each other on a wide
pavement, along the same line. When they are six paces apart,
they realise that they are in danger of colliding. With each of his
next three steps forward therefore, each pedestrian adopts the
following strategy: if the two are still in line with each other each
independently steps half-left with probability p, or steps half-right
with probability p, or keeps straight on with probability 1 − 2p;
if they are not still in line, each keeps walking straight forward
(the diagram illustrates one possible version of the encounter).
Calculate the probability that they are still in line after each has
taken his first step, and deduce that the probability of a collision
is $(1-4p+6p^{2})^{3}$.
37. A company makes a certain type of fan heater (called an
X-heater) at each of its two factories $F_1$ and $F_2$. The factory $F_1$
produces one quarter and $F_2$ three quarters of the total output.
X-heaters are coloured either red or blue. One third of the
X-heaters produced at $F_1$ are red and seven-ninths of the
X-heaters produced at $F_2$ are red.
A customer goes into a shop and selects an X-heater at random.
Show the probability is $\frac{2}{3}$ that when he unpacks it he will find
that it is red.
Two shops A and B stock X-heaters. Shop A has four and shop B
has three. Find
(a) the probability that neither shop has a red X-heater;
(b) the probability that there are at least 3 red X-heaters in shop A;
(c) the probability that there are the same number of red
X-heaters in each shop;
(d) the probability that there are two red X-heaters in each shop,
given that all the X-heaters in shop A come from $F_1$ and that all
the X-heaters in shop B come from $F_2$.
(You may leave all your answers as fractions with powers of 3 as
denominators.) (SMP)
probabilities satisfy the conditions for a distribution.
If
Pr(at least one event) = 0.664,
Pr(at least two events) = 0.212,
and
Pr(at most two events) = 0.976,
find the probabilities of exactly 0, 1, 2 and 3 events respectively.
By considering a linear combination of Pr(one event) and
Pr(two events) and Pr(three events) find the value of
$p_1 + p_2 + p_3$. (MEI)
40. (a) Two cards are drawn from a well shuffled pack of 52 playing
cards. If Jacks, Queens and Kings count 10 points, aces count
1 point, and the rest count points equal to their face values, what
is the chance that the total points of the two cards will be 12?
(b) If three cards are drawn in succession from a complete pack,
what is the probability that the first two cards score 12 points and
the total points will be less than 21? (O & C)
41. A hand of 13 cards is dealt from a standard pack of 52 cards
(which consists of 4 suits, clubs, diamonds, hearts, and spades,
each of 13 cards).
(a) Write down, but do not calculate, an expression for the
probability that the hand consists of 3 spades, 4 hearts, and 6
cards from the other suits.
(b) Calculate, to 3 d.p., the conditional probability that the hand
contains exactly 3 diamonds, given that it contains exactly
3 spades and 4 hearts.
(c) Calculate, again to 3 d.p., the probability that the hand
contains at least 2 diamonds given the same conditions as in
part (b). (O)
42. Show that the total number of random samples of size r that
can be drawn from a population of size n is
$\dfrac{n!}{r!\,(n-r)!}$
In a class of 10 boys and 10 girls there are 5 children with blue
eyes. A random sample of 4 children is taken. Find the
probabilities that in this sample there are exactly (a) 2 boys,
(b) 2 children with blue eyes.
Half the children living in a big city are boys, and one quarter of
the children have blue eyes. A random sample of 4 children is
taken. Estimate the probabilities that in this sample there are
exactly (c) 2 boys, (d) 2 children with blue eyes. (MEI)
43. A committee has 22 members, of whom 7 have black hair, do
not smoke and do not wear glasses; 5 have white hair, do not
smoke and do not wear glasses; 4 have white hair, smoke and
wear glasses; 3 have black hair, smoke and do not wear glasses;
2 have white hair, do not smoke and wear glasses; 1 has black
hair, smokes and wears glasses.
(a) One committee member is chosen at random. Let W be the
event that this member has white hair, G be the event that this
member wears glasses and S the event that the member smokes.
Find (i) P(W), (ii) P(W|S), (iii) P(W|G), (iv) the probability that
this member has either white hair or glasses (but not both), given
that this member smokes. Are the events W and S independent?
Are the events W and G independent? Give a reason for each
answer.
(b) Two committee members are chosen at random. Let $W_2$ be
the event that both have white hair. Let $S_2$ be the event that both
smoke. Find (i) $P(W_2)$, (ii) $P(W_2 \mid S_2)$. (C)
44. (a) How many odd numbers can be formed from the figures
1, 2, 3 and 5 if repetitions are not allowed?
(b) See worked example, p. 135.
(c) Six different books lie on a table, and a boy is told that he can
take away as many as he likes but he must not leave empty
handed. How many different selections can he make? One of the
books is a Bible. How many of these selections will include this
Bible? (SUJB)P
45. (a) Of the households in Edinburgh, 35% have a freezer and
60% have a colour TV set. Given that 25% of the households
have both a freezer and a colour TV set, calculate the probability
that a household has either a freezer or a colour TV set but not
both.
State, with your reasons, whether the events of having a freezer
and of having a colour TV set are or are not independent.
(b) State in words the meaning of the symbol P(B|A), where A
and B are two events.
A shop stocks tinned cat food of two makes, A and B, and two
sizes, large and small. Of the stock, 70% is of brand A, 30% is of
brand B. Of the tins of brand A, 30% are small size whilst of the
tins of brand B, 40% are small size. Using a
tree diagram, or otherwise, find the probability that
(i) a tin chosen at random from the stock will be of small size,
(ii) a small tin chosen at random from the stock will be of brand A. (L)

46. During an epidemic of a certain disease a doctor is consulted by 110 people suffering from symptoms commonly associated with the disease. Of the 110 people, 45 are female of whom 20 actually have the disease and 25 do not. Fifteen males have the disease and the rest do not.
(a) A person is selected at random. The event that this person is female is denoted by A and the event that this person is suffering from the disease is denoted by B. Evaluate (i) P(A), (ii) P(A ∪ B), (iii) P(A ∩ B), (iv) P(A|B).
(b) If three different people are selected at random without replacement, what is the probability of (i) all three having the disease, (ii) exactly one of the three having the disease, (iii) one of the three being a female with the disease, one a male with the disease and one a female without the disease?
(c) Of people with the disease 96% react positively to a test for diagnosing the disease as do 8% of people without the disease. What is the probability of a person selected at random (i) reacting positively, (ii) having the disease given that he or she reacted positively? (AEB 1987)

47. In a simple model of the weather in October, each day is classified as either fine or rainy. The probability that a fine day is followed by a fine day is 0.8. The probability that a rainy day is followed by a fine day is 0.4. The probability that 1 October is fine is 0.75.
(a) Find the probability that 2 October is fine and the probability that 3 October is fine.
(b) Find the conditional probability that 3 October is rainy, given that 1 October is fine.
(c) Find the conditional probability that 1 October is fine, given that 3 October is rainy. (C)

48. Two archers A and B shoot alternately at a target until one of them hits the centre of the target and is declared the winner. Independently, A and B have probabilities of 3 and 4, respectively, of hitting the centre of the target on each occasion they shoot.
(a) Given that A shoots first, find (i) the probability that A wins on his second shot, (ii) the probability that A wins on his third shot, (iii) the probability that A wins.
(b) Given that the archers toss a fair coin to determine who shoots first, find the probability that A wins. (JMB)

49. (a) Explain in words the meaning of the symbol P(A|B) where A and B are two events. State the relationship between A and B when (i) P(A|B) = 0, (ii) P(A|B) = P(A).
When a car owner needs her car serviced she phones one of three garages, A, B, or C. Of her phone calls to them, 30% are to garage A, 10% to B and 60% to C. The percentages of occasions when the garage phoned can take the car in on the day of phoning are 20% for A, 6% for B and 9% for C. Find the probability that the garage phoned will not be able to take the car in on the day of phoning.
Given that the car owner phones a garage and the garage can take her car in on that day, find the probability that she phoned garage B.
(b) A shelf contains ten box files of which four are empty and six contain papers. Five files are chosen at random one after another from the shelf. Find, to 3 decimal places, the probability that exactly two of the chosen files will be empty when the files are chosen (i) with replacement, (ii) without replacement. (L)

50. Show that, for any two events E and F,
P(E ∪ F) = P(E) + P(F) − P(E ∩ F)
Express in words the meaning of P(E|F).
Given that E and F are independent events, express P(E ∩ F) in terms of P(E) and P(F), and show that E' and F are also independent.
In a college, 60 students are studying one or more of the three subjects Geography, French and English. Of these, 25 are studying Geography, 26 are studying French, 44 are studying English, 10 are studying Geography and French, 15 are studying French and English, and 16 are studying Geography and English. Write down the probability that a student chosen at random from those studying English is also studying French. Determine whether or not the events ‘studying Geography’ and ‘studying French’ are independent.
A student is chosen at random from all 60 students. Find the probability that the chosen student is studying all three subjects. (L)
51. Explain, by suitably defining events A view is wrongly transmitted to the appli-
and B, what is meant by ‘the probability cant as a morning interview with proba-
of A occurring given that B has occurred’. bility 0.1. Find the probability that an
A local greengrocer sells conventionally applicant arriving
grown and organically grown vegetables. (i) for a morning interview is expec-
Conventionally grown vegetables con- ted for a morning interview,
stitute 80% of his sales; carrots constitute (ii) for an afternoon interview is
12% of the conventional sales and 30% of expected for an afternoon interview.
the organic sales. (AEB 1988)
Display this information in an approp- 53. (a) A bag contains 4 red, 6 white and 5
riately and accurately labelled tree blue balls. If a random sample of 6 balls is
diagram. selected (without replacement) what is
One day a customer emerges from the the probability that there are two balls of
shop and is questioned about her pur- each colour?
chases. What is the probability that she (b) A number N consists of n digits each
bought of which can be 0 or 1. It is copied onto a
(a) conventionally grown carrots, _ Sheet of paper by A and the probability
(6) carrots? that A transcribes any digit wrongly is p.
Given that she did buy carrots, what is The sheet of paper is then passed to B
the probability that they were organically who copies the number onto another
grown? What assumptions have you made sheet of paper. The probability that B
in answering this question? (O) transcribes any digit wrongly is p’. What
is the probability that the number written
52. (a) In a group of 200 people, each by B contains no error?
individual is classified as either male or (c) An assembly plant receives 60% of its
female and according to whether or not resistors from supplier X and 40% from
he or she wears glasses. The numbers supplier Y. 5% of X’s resistors and 6% of
falling into each category are as tabulated. Y’s are defective. If a resistor is tested at
the plant and found to be defective, what
Not wearing | Wearing is the probability that it was supplied
glasses glasses by X? (SUJB)
Male 90 24 54. A and B are mutually exclusive and
Female 66 20 exhaustive events in a sample space S
and Cis any event in S for which
Suppose one of this group is chosen at P(C) #0 on ideri
random. Let A be the event that the See Or ye eee aie a
person chosen is male and B the event
that the person chosen is not wearing PrAIC Se P(C\A)*P(A)
glasses, P(C|A)*P(A)+P(C|B)*P(B)
(i) Define the events A’ and AUB’.
(ii) Calculate the probability of occur-
rence of each of the events in-(i)
Tee SU DRE eects carmen ey”
it classifies the client as either class A
(iii) Given that the person chosen is (eoad Tee) OU es oe
pie: Weis euinced Ocalcuin icine clients are class A. Records show that
PrOcebuteharihic persod ic malc. the probability of a client making a claim
(iv) Use the available data to deter- during any year is 0.08 for class A and
mine whether not wearing glasses is Coos ae ;
independent of sex within the group. (2) Mr Smith buys a policy and makes a
Give a practical interpretation to your claim during his first year. Calculate,
finding. each to 3 decimal places, the probability
(b) After advertising for an assistant, a that Mr Smith was originally classified A
manager decides to interview suitable or was originally classified B.
applicants. The interview of an applicant (6) Mrs Jones bought a policy two years
will take place during the morning or the ago and has not made a claim during that
afternoon with probabilities 0.45 and time. Show that she is more likely to be
0.55 respectively. Each applicant is in- class B.
formed by telephone and in each case a Show, also, that if she does not make a
message has to be left. A morning inter- claim for a further two years she is more
view is wrongly transmitted to the appli- likely to be class A than class B. What do
cant as an afternoon interview with you need to assume for your calculations
probability 0.2, and an afternoon inter- to be valid? (SUJB)
PROBABILITY DISTRIBUTIONS I: DISCRETE RANDOM VARIABLES
DISCRETE RANDOM VARIABLE
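If X can assume the values $x_1, x_2, \ldots, x_n$ with
$P(X = x_1) = p_1$
$P(X = x_2) = p_2$
$\;\vdots$
$P(X = x_n) = p_n$
then X is a discrete random variable if $p_1 + p_2 + \ldots + p_n = 1$,
i.e. $\sum p_i = 1$ for $i = 1, 2, \ldots, n$
or $\sum_{\text{all } x} P(X = x) = 1$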
Example 3.1 Let X be the discrete variable ‘the number of fours obtained when
two dice are thrown’. Show that X is a random variable.
Solution 3.1 With regard to the number of fours thrown, the outcome could be
one of the following: 0 fours, or 1 four, or 2 fours.
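$P(X = 0) = P(\text{no four on either die}) = \left(\tfrac{5}{6}\right)\left(\tfrac{5}{6}\right) = \tfrac{25}{36}$
$P(X = 1) = P(\text{a four on exactly one die}) = \left(\tfrac{1}{6}\right)\left(\tfrac{5}{6}\right) + \left(\tfrac{5}{6}\right)\left(\tfrac{1}{6}\right) = \tfrac{10}{36}$
$P(X = 2) = P(\text{a four on both dice}) = \left(\tfrac{1}{6}\right)\left(\tfrac{1}{6}\right) = \tfrac{1}{36}$
So $\sum_{\text{all } x} P(X = x) = \tfrac{25}{36} + \tfrac{10}{36} + \tfrac{1}{36} = \tfrac{36}{36} = 1$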
Therefore X is a discrete random variable.
Example 3.2 Two tetrahedral dice, each with faces labelled 1,2,3 and 4 are
thrown and the score noted, where the score is the sum of the
two numbers on which the dice land. If X is the r.v. ‘the score when
two tetrahedral dice are thrown’, find the p.d.f. of X.
Solution 3.2 The score for each possible outcome is shown in the table:
‘Score’                First die
                    1    2    3    4
Second die    1     2    3    4    5
              2     3    4    5    6
              3     4    5    6    7
              4     5    6    7    8
From the table we can see that X can assume the values 2, 3, 4, 5, 6, 7, 8 only.
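The table of scores can be turned into a probability distribution by counting how often each total occurs among the 16 equally likely outcomes. The following is a minimal sketch of that count; the names `faces`, `counts` and `pdf` are illustrative only and do not come from the text.

```python
from collections import Counter
from fractions import Fraction
from itertools import product

faces = [1, 2, 3, 4]  # faces of one tetrahedral die

# Count how many of the 16 equally likely (first, second) outcomes give each score.
counts = Counter(a + b for a, b in product(faces, repeat=2))

# p.d.f. of X: each outcome has probability 1/16.
pdf = {score: Fraction(n, 16) for score, n in sorted(counts.items())}

for score, p in pdf.items():
    print(f"P(X = {score}) = {p}")
```

The probabilities rise from 1/16 at a score of 2 to 4/16 at a score of 5 and fall symmetrically back to 1/16 at a score of 8 (the fractions print in lowest terms).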
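Example 3.3 The p.d.f. of a discrete random variable Y is given by $P(Y = y) = cy^2$,
for y = 0, 1, 2, 3, 4. Given that c is a constant, find the value of c.
Since Y is a random variable, $\sum_{\text{all } y} P(Y = y) = 1$, so
$c(0 + 1 + 4 + 9 + 16) = 1$
$30c = 1$
$c = \tfrac{1}{30}$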
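The constant in Example 3.3 can be checked numerically. This sketch assumes the p.d.f. is $P(Y = y) = cy^2$ on y = 0, 1, 2, 3, 4; the variable names are illustrative.

```python
from fractions import Fraction

values = range(5)  # y = 0, 1, 2, 3, 4

# The probabilities must sum to 1, so c is the reciprocal of the sum of y^2.
c = Fraction(1, sum(y**2 for y in values))

pdf = {y: c * y**2 for y in values}
print("c =", c)
print("total probability =", sum(pdf.values()))
```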
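Example 3.4 The p.d.f. of the discrete r.v. X is given by $P(X = x) = a\left(\tfrac{3}{4}\right)^x$ for
x = 0, 1, 2, 3, .... Find the value of the constant, a.
So $\sum_{\text{all } x} P(X = x) = a + a\left(\tfrac{3}{4}\right) + a\left(\tfrac{3}{4}\right)^2 + a\left(\tfrac{3}{4}\right)^3 + \ldots$
$= a \cdot \dfrac{1}{1 - \tfrac{3}{4}}$ (sum of an infinite geometric progression)
$= (a)(4)$
We have $4a = 1$
Therefore $a = \tfrac{1}{4}$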
Exercise 3a
1. For each of the following random variables write out the probability distributions. Check that the variables are random and for parts (b), (d) and (f) write the formula for the p.d.f.
(a) The number of heads obtained when two fair coins are tossed.
(b) The sum of the scores when two ordinary dice are thrown.
(c) The number of threes obtained when two tetrahedral dice are thrown.
(d) The numerical value of a digit chosen from a set of random number tables.
(e) The number of tails obtained when three fair coins are tossed.
(f) The difference between the numbers when two ordinary dice are thrown.
2. The probability density function of a discrete random variable X is given by P(X = x) = kx for x = 12, 13, 14. Find the value of the constant k.
3. The discrete random variable R has p.d.f. given by P(R = r) = c(3 − r) for r = 0, 1, 2, 3. Find the value of the constant c.
4. <A game consists of throwing tennis balls 5. A drawer contains 8 brown socks and 4
into a bucket from a given distance. The blue socks. A sock is taken from the
probability that William will get the tennis drawer at random, its colour is noted
ball in the bucket is 0.4. A ‘go’ consists of and it is then replaced. This procedure is
three attempts. (a) Construct the proba- performed twice more. If X is the r.v. ‘the
bility distribution for X, the number of number of brown socks taken’, find the
tennis balls that land in the bucket in a go. probability distribution for X.
William wins a prize if, at the end of his
go, there are two or more tennis balls in 6. Ther.v. X has p.d.f. P(X =x) = e(2)” for
the bucket. (b) What is the probability x =0,1,2,3,.... Find the value of the
that William does not win a prize? constant, c.
EXPECTATION, £(X)
Experimental approach
Suppose we throw an unbiased die 120 times and record the results:
Score, x 1 2 3 4 5 6
Frequency, f 1Oee225 Coe t eee bom. FOtalt ZO
Theoretical approach
The probability distribution for the r.v. X where X is ‘the number
on the die’ is as shown:
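Score, x      1     2     3     4     5     6
P(X = x)     1/6   1/6   1/6   1/6   1/6   1/6
expected mean $= 1\left(\tfrac{1}{6}\right) + 2\left(\tfrac{1}{6}\right) + 3\left(\tfrac{1}{6}\right) + 4\left(\tfrac{1}{6}\right) + 5\left(\tfrac{1}{6}\right) + 6\left(\tfrac{1}{6}\right)$
$= \tfrac{21}{6}$
$= 3.5$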
So, the expected mean = 3.5.
If we have a statistical experiment:
a practical approach results in a frequency distribution and a
mean value,
a theoretical approach results in a probability distribution and
an expected value.
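The theoretical calculation above is a probability-weighted sum of the scores. A minimal sketch, using a helper name `expectation` that is not from the text:

```python
from fractions import Fraction

# Probability distribution of the score on an unbiased die.
pdf = {x: Fraction(1, 6) for x in range(1, 7)}

def expectation(pdf):
    """Return the sum of x * P(X = x) over all values x."""
    return sum(x * p for x, p in pdf.items())

expected_mean = expectation(pdf)
print(expected_mean, float(expected_mean))
```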
$E(X) = \sum_{\text{all } x} x\,P(X = x)$
x           1     2     3     4     5
P(X = x)   0.1   0.2   0.4   0.2   0.1
It can be seen from the table that the distribution is symmetrical
about the central value X = 3, so E(X) = 3.
Check: $E(X) = \sum_{\text{all } x} x\,P(X = x) = 1(0.1) + 2(0.2) + 3(0.4) + 4(0.2) + 5(0.1) = 3$
(b) Consider the r.v. with p.d.f. P(X =x) = 5forx— lee, eae
Now the amount paid out by the fruit machine could be 100p,
80p, 50p, 40p or Op.
So considering the initial payment of 10p for a turn, X can assume
the values 90, 70, 40, 30,—10.
The probability distribution for X is
Example 3.7 (a) Three dice are thrown. If a 1 ora 6 turns up, you will be paid
1p, but, if neither a 1 nor a 6 turns up, you will pay 5p. How
much would you expect to lose in 9 games?
You are now given the opportunity to change the rule for
payment when a 1 ora 6 appears. To make the game worth-
while to yourself, what is the minimum amount in everyday
currency that you would suggest?
(b) Three coins are thrown. If one head turns up, 1p is paid. If two
heads turn up, 3p is paid, and if three heads turn up 5p is paid.
If the game is to be regarded as fair (i.e. neither the player nor
the bank should lose in the long run), what should be the
penalty if no heads turn up?
(c) A bag contains 3 red balls and 1 blue ball. A second bag contains
1 red ball and 1 blue ball. A ball is picked out of each bag
and is then placed in the other bag. What is the expected number
of red balls in the first bag? (SUJB)
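(a) Let X be the r.v. ‘the number of pence gained in one game’.
$P(X = 1) = P(\text{a 1 or a 6}) = 1 - \left(\tfrac{2}{3}\right)^3 = \tfrac{19}{27}$
$P(X = -5) = P(\text{neither a 1 nor a 6}) = \left(\tfrac{2}{3}\right)^3 = \tfrac{8}{27}$
$E(X) = 1\left(\tfrac{19}{27}\right) + (-5)\left(\tfrac{8}{27}\right) = -\tfrac{21}{27} = -\tfrac{7}{9}$
Therefore the expected loss after one game is $\tfrac{7}{9}$p, and the expected loss in 9 games is 7p.
If y pence is paid when a 1 or a 6 appears, we now have
$E(X) = -5\left(\tfrac{8}{27}\right) + y\left(\tfrac{19}{27}\right) = \dfrac{-40 + 19y}{27}$
To make the game worthwhile, we require E(X) > 0,
i.e. −40 + 19y > 0.
So $y > \tfrac{40}{19} \approx 2.1$, and the minimum payment in whole pence is 3p.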
(b) Three coins are thrown. Let X be the r.v. ‘the number of pence
paid in 1 game’.
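Then
$P(X = 1) = P(\text{1 head}) = \tfrac{3}{8}$
$P(X = 3) = P(\text{2 heads}) = \tfrac{3}{8}$
$P(X = 5) = P(\text{3 heads}) = \tfrac{1}{8}$
$P(X = -y) = P(\text{0 heads}) = P(TTT) = \tfrac{1}{8}$
So $E(X) = \sum_{\text{all } x} x\,P(X = x) = 1\left(\tfrac{3}{8}\right) + 3\left(\tfrac{3}{8}\right) + 5\left(\tfrac{1}{8}\right) - y\left(\tfrac{1}{8}\right) = \dfrac{17 - y}{8}$
Now, if the game is to be fair, the expected winnings must be zero.
So $\dfrac{17 - y}{8} = 0$
i.e. y = 17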
Therefore the penalty if 0 heads turn up should be 17p.
(c) Assume that the balls are taken from each bag simultaneously.
If a red ball is picked from each bag and placed in the other then
the number of red balls in the first bag is now 3, etc.
Let X be the r.v. ‘the final number of red balls in the first bag’.
Then X can assume the values 2, 3 or 4 only.
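P(X = 2) = P(red from first bag and blue from second bag)
$= P(R_1 B_2)$ with obvious notation
$= \left(\tfrac{3}{4}\right)\left(\tfrac{1}{2}\right) = \tfrac{3}{8}$
$P(X = 3) = P(R_1 R_2) + P(B_1 B_2) = \left(\tfrac{3}{4}\right)\left(\tfrac{1}{2}\right) + \left(\tfrac{1}{4}\right)\left(\tfrac{1}{2}\right) = \tfrac{1}{2}$
$P(X = 4) = P(B_1 R_2) = \left(\tfrac{1}{4}\right)\left(\tfrac{1}{2}\right) = \tfrac{1}{8}$
So $E(X) = \sum_{\text{all } x} x\,P(X = x) = 2\left(\tfrac{3}{8}\right) + 3\left(\tfrac{1}{2}\right) + 4\left(\tfrac{1}{8}\right) = 2\tfrac{3}{4}$
The expected number of red balls in the first bag after the exchange is
$2\tfrac{3}{4}$ balls.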
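Part (c) can also be checked by listing every equally likely pair of balls drawn. The sketch below assumes, as in the solution, that the two balls are drawn simultaneously; the names `bag1`, `bag2` and `dist` are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

bag1 = ["R", "R", "R", "B"]  # first bag: 3 red, 1 blue
bag2 = ["R", "B"]            # second bag: 1 red, 1 blue

dist = defaultdict(Fraction)  # final number of red balls in bag 1 -> probability
for ball1 in bag1:            # ball leaving the first bag
    for ball2 in bag2:        # ball leaving the second bag
        reds = bag1.count("R") - (ball1 == "R") + (ball2 == "R")
        dist[reds] += Fraction(1, len(bag1) * len(bag2))

expected_reds = sum(x * p for x, p in dist.items())
print(dict(dist), expected_reds)
```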
Exercise 3b
1. The probability distribution for the r.v. X is shown in the table:

If it lands on a face marked with a 2 or a 4, the player wins 5p and if it lands on a 3, the player wins 3p. Find the expected gain in one throw.

11. In a game a player tosses three fair coins. He wins £10 if 3 heads occur, £x if 2 heads occur, £3 if 1 head occurs and £2 if no heads occur. Express in terms of x his expected gain from each game.
Given that he pays £4.50 to play each game, calculate
(a) the value of x for which the game is fair,
(b) his expected gain or loss over 100 games if x = 4.90. (C Additional)

12. A committee of 3 is to be chosen from 4 girls and 7 boys. Find the expected number of girls on the committee, if the members of the committee are chosen at random.

13. The discrete r.v. X has p.d.f. given by P(X = x) = kx for x = 1, 2, 3, 4, 5 where k is constant. Find E(X).

14. In an examination a candidate is given the four answers to four questions but is not told which answer applies to which question. He is asked to write down each of the four answers next to its appropriate question.
(a) Calculate in how many different ways he could write down the four answers.
(b) Explain why it is impossible for him to have just three answers in the correct places and show that there are six ways of getting just two answers in the correct places.
(c) If a candidate guesses at random where the four answers are to go and X is the number of correct guesses he makes, draw up the probability distribution for X in tabular form.
(d) Calculate E(X). (L Additional)

(d) the probability that no red disc will be drawn,
(e) the most probable number of red discs that will be drawn,
(f) the expected number of red discs that will be drawn, and state the probability that this expected number of red discs will be drawn. (JMB)

17. A woman has 3 keys on a ring, just one of which opens the front door. As she approaches the front door she selects one key after another at random without replacement. Draw a tree diagram to illustrate the various selections before she finds the correct key. Use this diagram to calculate the expected number of keys that she will use before opening the door. (L Additional)

18. An urn containing 4 black balls and 8 white balls is used for two experiments. In experiment 1, two balls are to be drawn at random from the urn, one after the other, without replacement. In experiment 2, one ball is to be drawn at random from the 12 balls in the urn and replaced before a second ball is drawn at random. Copy and complete the following two tables, which give the probabilities for the different compound events in the two experiments.
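Example 3.8 In a game a turn consists of a tetrahedral die being thrown three
times. The faces on the die are marked 1, 2, 3, 4 and the number on
which the die falls is noted. A man wins £x² whenever x fours
occur in a turn. Find his average win per turn.
Solution 3.8 Let X be the r.v. ‘the number of fours obtained when the die is
thrown three times’. Then X can assume the values 0, 1, 2, 3 only.
We have
$P(X = 0) = \left(\tfrac{3}{4}\right)^3 = \tfrac{27}{64}$
$P(X = 1) = 3\left(\tfrac{1}{4}\right)\left(\tfrac{3}{4}\right)^2 = \tfrac{27}{64}$
$P(X = 2) = 3\left(\tfrac{1}{4}\right)^2\left(\tfrac{3}{4}\right) = \tfrac{9}{64}$
$P(X = 3) = \left(\tfrac{1}{4}\right)^3 = \tfrac{1}{64}$
Now
$E(X^2) = \sum_{\text{all } x} x^2 P(X = x) = 0\left(\tfrac{27}{64}\right) + 1\left(\tfrac{27}{64}\right) + 4\left(\tfrac{9}{64}\right) + 9\left(\tfrac{1}{64}\right) = \tfrac{72}{64} = 1.125$
Therefore his expected win per turn is £1.125.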
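The expected win in Example 3.8 is E(X²), where X counts the fours in three throws. A minimal sketch of the calculation follows; it builds the probabilities from the number of ways of placing x fours among three throws, and the names `p_four` and `average_win` are illustrative.

```python
from fractions import Fraction
from math import comb

p_four = Fraction(1, 4)  # probability of a four on one throw of the tetrahedral die

# P(X = x): choose which of the 3 throws show a four.
pdf = {x: comb(3, x) * p_four**x * (1 - p_four)**(3 - x) for x in range(4)}

# The man wins x^2 pounds when x fours occur, so the average win is E(X^2).
average_win = sum(x**2 * prob for x, prob in pdf.items())
print(average_win, float(average_win))
```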
E(3) = 3(0.1) + 3(0.6) + 3(0.3) = 3
E(X) = 1(0.1) + 2(0.6) + 3(0.3) = 2.2
E(5X) = 5(0.1) + 10(0.6) + 15(0.3) = 11
E(5X + 3) = 8(0.1) + 13(0.6) + 18(0.3) = 14
E(X²) = 1(0.1) + 4(0.6) + 9(0.3) = 5.2
E(4X² − 3) = 1(0.1) + 13(0.6) + 33(0.3) = 17.8
(h) 4E(X²) − 3 = 4(5.2) − 3 = 17.8
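We note that E(5X + 3) = 5E(X) + 3
and E(4X² − 3) = 4E(X²) − 3
$E(aX) = \sum_{\text{all } x} ax\,P(X = x) = a\sum_{\text{all } x} x\,P(X = x) = aE(X)$
$E(aX + b) = aE(X) + b$
Proof:
$E[f_1(X) + f_2(X)] = \sum_{\text{all } x} f_1(x)P(X = x) + \sum_{\text{all } x} f_2(x)P(X = x)$
$= E[f_1(X)] + E[f_2(X)]$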
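These results can be checked numerically for the distribution used in the calculations above, where X takes the values 1, 2, 3 with probabilities 0.1, 0.6, 0.3. In this sketch the helper `expect` is a hypothetical name; it computes E[g(X)] for any function g.

```python
from fractions import Fraction

# X takes the values 1, 2, 3 with probabilities 0.1, 0.6, 0.3.
pdf = {1: Fraction(1, 10), 2: Fraction(6, 10), 3: Fraction(3, 10)}

def expect(g):
    """Return E[g(X)], the sum of g(x) * P(X = x) over all x."""
    return sum(g(x) * p for x, p in pdf.items())

e_x = expect(lambda x: x)        # E(X)
e_x2 = expect(lambda x: x**2)    # E(X^2)

print(expect(lambda x: 5 * x + 3), 5 * e_x + 3)         # both equal 14
print(expect(lambda x: 4 * x**2 - 3), 4 * e_x2 - 3)     # both equal 89/5 = 17.8
```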
_ Exercise 3c
1. The discrete r.v. X has p.d.f. P(X = x) for 3. The discrete r.v. X has p.d.f. given by
x= 1,2,3. P(X =x) =# for x = 1,2,3,4,5,6.
Find (a) E(X), (b) E(X), (e) E(8X+ 4).
P(X=x) 10.2 03 0.5 Verify that .
E(2X*+ X—4) = 2E(X*) + E(X)—
Find (a) E(X), (b) E(X?). : oo ae aie
(c) Verify that E(x 1) = 38E(X)—1.
(d) Verify that E(2X?+ 4) = 2E(X)+4. 4. The discrete r.v. X has p.d.f. given by
2. The discrete r.v. X has p.d_f. P(X =x)= 3x + 1 f =
P(X = 0) = 0.05, P(X = 1) = 0:45, aire aa
P(X = 2) = 0.5. Verify that Find (a) E(X), (b) E(X?), (c) E(3X— 2),
E(5X?+ 2X —8) = 5E(X?)+ 2E(X)—3. (d) E(2X?+ 4X— 8).
5. A roulette wheel is divided into 6 sectors 6. Ther.v. X has p.d.f. P(X = x) as shown in
of unequal area, marked with the numbers the table:
1,2,3,4,5 and 6. The wheel is spun and
X is the r.v. ‘the number on which the
wheel stops’. The probability distribution
of X is as follows: P(X=x)|01 01 O08 O04 O12
VARIANCE, Var(X)
In the same way, we can find an alternative form for the formula
for Var(X). Now
$\mathrm{Var}(X) = E(X - \mu)^2$
$= E(X^2 - 2\mu X + \mu^2)$
$= E(X^2) - 2\mu E(X) + E(\mu^2)$
$= E(X^2) - 2\mu^2 + \mu^2$
$= E(X^2) - \mu^2$
So we have $\mathrm{Var}(X) = E(X^2) - \mu^2$
NOTE: $\mu^2 = [E(X)]^2$.
We write $[E(X)]^2$ as $E^2(X)$ in a similar way to the notation used in
trigonometry where $(\sin A)^2$ is written $\sin^2 A$.
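The two forms of the variance formula give the same value for any distribution. The sketch below checks this for the score on an unbiased die, a distribution chosen here only for illustration; the function names are not from the text.

```python
from fractions import Fraction

# Illustrative distribution: the score on an unbiased die.
pdf = {x: Fraction(1, 6) for x in range(1, 7)}

def mean(pdf):
    return sum(x * p for x, p in pdf.items())

def variance_definition(pdf):
    """Var(X) = E(X - mu)^2."""
    mu = mean(pdf)
    return sum((x - mu) ** 2 * p for x, p in pdf.items())

def variance_shortcut(pdf):
    """Var(X) = E(X^2) - mu^2."""
    mu = mean(pdf)
    return sum(x**2 * p for x, p in pdf.items()) - mu**2

print(variance_definition(pdf), variance_shortcut(pdf))
```

Both functions return 35/12 for the die, since E(X²) = 91/6 and μ² = 49/4.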
Example 3.10 The r.v. X has probability distribution as shown in the table:
Lie 2daSers aq 55
(0.3) 0120.5, 0.1
Find
(a) μ = E(X),
(b) Var(X), using the formula Var(X) = E(X − μ)²,
(c) E(X²),
(d) Var(X), using the formula Var(X) = E(X²) − μ².
eae 3 4 iad
—2 —- 1-8 Troae?
Usually we use the most convenient form of the formula for the
variance. *
Example 3.11 Two discs are drawn, without replacement, from a box containing
3 red discs and 4 white discs. The discs are drawn at random. If X
is the r.v. ‘the number of red discs drawn’, find (a) E(X), (b) the
standard deviation of X.
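$P(X = 0) = P(W_1 W_2) = \left(\tfrac{4}{7}\right)\left(\tfrac{3}{6}\right) = \tfrac{12}{42} = \tfrac{2}{7}$
$P(X = 1) = P(R_1 W_2) + P(W_1 R_2) = \left(\tfrac{3}{7}\right)\left(\tfrac{4}{6}\right) + \left(\tfrac{4}{7}\right)\left(\tfrac{3}{6}\right) = \tfrac{24}{42} = \tfrac{4}{7}$
$P(X = 2) = P(R_1 R_2) = \left(\tfrac{3}{7}\right)\left(\tfrac{2}{6}\right) = \tfrac{6}{42} = \tfrac{1}{7}$
(a) $E(X) = 0\left(\tfrac{2}{7}\right) + 1\left(\tfrac{4}{7}\right) + 2\left(\tfrac{1}{7}\right) = \tfrac{6}{7}$
So $E(X) = \tfrac{6}{7}$, or the expected number of red discs is $\tfrac{6}{7}$.
(b) $E(X^2) = 0\left(\tfrac{2}{7}\right) + 1\left(\tfrac{4}{7}\right) + 4\left(\tfrac{1}{7}\right) = \tfrac{8}{7}$
So $\mathrm{Var}(X) = \tfrac{8}{7} - \left(\tfrac{6}{7}\right)^2 = \tfrac{20}{49}$
= 0.408 (3 d.p.)
and the standard deviation of X is $\sqrt{0.408} = 0.639$ (3 d.p.).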
=0
NOTE: this is as expected, as a constant does not vary.
Proof:
Example 3.12 The discrete r.v. X has the probability distribution shown in the
table.
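$E(X) = \sum_{\text{all } x} x\,P(X = x) = \tfrac{19}{8}$
Now $\mathrm{Var}(X) = E(X^2) - E^2(X)$
We have $E(X^2) = \sum_{\text{all } x} x^2 P(X = x) = \tfrac{57}{8}$
So $\mathrm{Var}(X) = \tfrac{57}{8} - \left(\tfrac{19}{8}\right)^2 = \tfrac{95}{64}$
Now consider 2X + 3.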
ae Eee
(2x + 3)? 81 121
P(X =x)
We require
$E(2X + 3) = \sum_{\text{all } x} (2x + 3)P(X = x) = \tfrac{31}{4}$
$E[(2X + 3)^2] = \sum_{\text{all } x} (2x + 3)^2 P(X = x) = 66$
Therefore
$\mathrm{Var}(2X + 3) = E[(2X + 3)^2] - E^2(2X + 3)$
$= 66 - \left(\tfrac{31}{4}\right)^2$
$= \tfrac{95}{16}$
$= 4\left(\tfrac{95}{64}\right)$
$= 4\,\mathrm{Var}(X)$
Therefore Var(2X + 3) = 4Var(X).
Exercise 3d
'1./ The probability distribution for the r.v. 5. A team of 3 is to be chosen from 4 boys
X is as shown: and 5girls. If X is the r.v. ‘the number of
girls in the team’, find (a) E(X),
(b) E(X7), (c) Var(X).
eer
6. , The r.v. X has p.d.f. as shown:
(2) If X is the r.v. ‘the sum of the scores on (b) Verify that Var(2X— 1) = 4Var(X).
two tetrahedral dice’, where the ‘score’ is ‘ “
the number on which the die lands, find 7. \Two discs are drawn without replacement
(a) E(X), (b) Var(X), (c) Var(2X) from a box containing 3 red and 4 white
(d) Var( 9X+3) A ; discs. If X is the r.v. ‘the number of white
: discs drawn’, construct a probability
3. | Find Var(X) for each of the following distribution table.
__/ probability distributions: Find (a) E(X), (b) E(X?), (c) Var(X),
E(X—y)’ = E(X ae
(c)
4. If X is the r.v. ‘the number on a biased 9. Ten identically shaped discs are in a bag;
die’, and the p.d.f. of X is shown, two of them are black, the rest white.
Discs are drawn at random from the bag
in turn and not replaced.
Let X be the number of discs drawn up
to and including the first black one.
find (a),the value of y, (6) E(X), List the values of X and the associated
(c) E(X7), (d) Var(X), (e) Var(4X). theoretical probabilities.
Calculate the mean value of X and its 2,3. Find (a) the value of the constant k,
standard deviation. What is the most (b) E(X), (c) E(X?), (d) the standard
likely value of X? y deviation of X.
If instead each disc is replaced before , 44. The random variable X takes integer
the next is drawn, construct a similar | values only and has p.d.f.
list of values and point out the chief
differences between the two lists. (SUJB) Oe P(X 2x) ener x = 1,2,3,4,5
10. The discrete r.v. X has p.d.f. P(X=x) = k(10—x) x = 6,7,8,9
P(X=x) = klix| Find (a) the value of the constant k,
(b) E(X), (¢) Var(X), (d) E(2X— 3),
where x takes the values — 3,—2,—1,0,1, (e) Var(2X— 3).
Example 3.13 Find the cumulative distribution function for the r.v. X where X is
‘the score on an unbiased die’.
$F(1) = P(X \le 1) = \tfrac{1}{6}$
$F(2) = P(X \le 2) = P(X = 1) + P(X = 2) = \tfrac{2}{6}$
$F(3) = P(X \le 3) = P(X = 1) + P(X = 2) + P(X = 3) = \tfrac{3}{6}$
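Each value of the cumulative distribution function is a running total of the probabilities up to and including that score. A minimal sketch for the unbiased die, with illustrative names `scores`, `probs` and `cdf`:

```python
from fractions import Fraction
from itertools import accumulate

scores = range(1, 7)
probs = [Fraction(1, 6)] * 6  # P(X = x) for an unbiased die

# F(x) = P(X <= x) is the running total of the probabilities.
cdf = dict(zip(scores, accumulate(probs)))

for x, f in cdf.items():
    print(f"F({x}) = {f}")
```

The final value F(6) equals 1, as it must for any cumulative distribution function at the largest value of X.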
Example 3.14 The probability distribution for the r.v. X is shown in the table.
Construct the cumulative distribution table.
zea 2 ridlaiied
NOTE: it is not possible to write a formula for the cumulative
distribution function in Example 3.13.
Example 3.15 For a discrete r.v. X the cumulative distribution function F(x) is
as shown:
_ Exercise 3e
Gea es oa
(c) Write out the probability distribution
of X, (d) Find E(2X— 3).
P(Y=y) 10.05 0.25 0.3 0.15 0.25 For a discrete r.v. X the cumulative distri-
bution function is given by F(x) = kx,
Construct the cumulative distribution table. x =1,2,3.Find (a) the value of the con-
stant k, (b) P(X <3), (c) the probability
For a discrete r.v. R the cumulative distri- distribution of X, (d) the standard
bution function F(r) is as shown in the deviation of X.
table: The discrete r.v. X has distribution
Example 3.16 X is the r.v. ‘the score ona tetrahedral die’, Y is the r.v. ‘the number
of heads obtained when two coins are tossed’.
(a) Obtain the probability distributions of X and of Y.
(b) Find E(X) and E(Y).
(c) Find Var(X) and Var(Y).
(d) Obtain the probability distribution for the r.v. X + Y.
(e) Find E(X + Y) and Var(X + Y) using the probability distribution
for X + Y; comment on your results.
Solution 3.16
(a) The probability distributions are as follows:
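(c)
$E(X^2) = \sum_{\text{all } x} x^2 P(X = x) = 1\left(\tfrac{1}{4}\right) + 4\left(\tfrac{1}{4}\right) + 9\left(\tfrac{1}{4}\right) + 16\left(\tfrac{1}{4}\right) = 7\tfrac{1}{2}$
$E(Y^2) = \sum_{\text{all } y} y^2 P(Y = y) = 0\left(\tfrac{1}{4}\right) + 1\left(\tfrac{1}{2}\right) + 4\left(\tfrac{1}{4}\right) = 1\tfrac{1}{2}$
$\mathrm{Var}(X) = E(X^2) - E^2(X) = 7\tfrac{1}{2} - 6\tfrac{1}{4} = 1\tfrac{1}{4}$
$\mathrm{Var}(Y) = E(Y^2) - E^2(Y) = 1\tfrac{1}{2} - 1 = \tfrac{1}{2}$
So $\mathrm{Var}(X) = 1\tfrac{1}{4}$, $\mathrm{Var}(Y) = \tfrac{1}{2}$.
(d) Consider the r.v. X + Y.
X + Y can assume values 1, 2, 3, 4, 5 and 6.
$P(X + Y = 1) = P(\text{1 on die, 0 heads}) = \left(\tfrac{1}{4}\right)\left(\tfrac{1}{4}\right) = \tfrac{1}{16}$
$P(X + Y = 2) = P(\text{2 on die, 0 heads}) + P(\text{1 on die, 1 head}) = \tfrac{1}{16} + \tfrac{2}{16} = \tfrac{3}{16}$
$P(X + Y = 3) = P(\text{3 on die, 0 heads}) + P(\text{2 on die, 1 head}) + P(\text{1 on die, 2 heads}) = \tfrac{1}{16} + \tfrac{2}{16} + \tfrac{1}{16} = \tfrac{4}{16}$
$P(X + Y = 4) = P(\text{4 on die, 0 heads}) + P(\text{3 on die, 1 head}) + P(\text{2 on die, 2 heads}) = \tfrac{1}{16} + \tfrac{2}{16} + \tfrac{1}{16} = \tfrac{4}{16}$
$P(X + Y = 5) = P(\text{4 on die, 1 head}) + P(\text{3 on die, 2 heads}) = \left(\tfrac{1}{4}\right)\left(\tfrac{1}{2}\right) + \left(\tfrac{1}{4}\right)\left(\tfrac{1}{4}\right) = \tfrac{3}{16}$
$P(X + Y = 6) = P(\text{4 on die, 2 heads}) = \left(\tfrac{1}{4}\right)\left(\tfrac{1}{4}\right) = \tfrac{1}{16}$
The probability distribution is as follows:
x + y              1      2      3      4      5      6
P(X + Y = x + y)  1/16   3/16   4/16   4/16   3/16   1/16
(e) Now
$E(X + Y) = \tfrac{1}{16}[1(1) + 2(3) + 3(4) + 4(4) + 5(3) + 6(1)] = \tfrac{56}{16} = 3\tfrac{1}{2}$
$E[(X + Y)^2] = \tfrac{1}{16}[1(1) + 4(3) + 9(4) + 16(4) + 25(3) + 36(1)] = \tfrac{224}{16} = 14$
$\mathrm{Var}(X + Y) = 14 - 12\tfrac{1}{4} = 1\tfrac{3}{4}$
Therefore $\mathrm{Var}(X + Y) = 1\tfrac{3}{4}$.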
E(X − Y) = E(X) − E(Y)
Var(X − Y) = Var(X) + (−1)²Var(Y)
= Var(X) + Var(Y)
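The distribution of X + Y in Example 3.16 can be built by multiplying probabilities over every pair of values, which is valid because the die and the coins are independent. The sketch below does this and compares the mean and variance of the sum with those of X and Y separately; the helper names `mean` and `var` are illustrative.

```python
from collections import defaultdict
from fractions import Fraction

X = {x: Fraction(1, 4) for x in (1, 2, 3, 4)}                # score on a tetrahedral die
Y = {0: Fraction(1, 4), 1: Fraction(1, 2), 2: Fraction(1, 4)}  # heads with two coins

def mean(pdf):
    return sum(v * p for v, p in pdf.items())

def var(pdf):
    return sum(v**2 * p for v, p in pdf.items()) - mean(pdf) ** 2

# Distribution of X + Y, using independence: P(X = x and Y = y) = P(X = x) P(Y = y).
total = defaultdict(Fraction)
for x, px in X.items():
    for y, py in Y.items():
        total[x + y] += px * py

print(mean(total), mean(X) + mean(Y))  # both 7/2
print(var(total), var(X) + var(Y))     # both 7/4
```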
Example 3.17 X and Y are independent random variables with p.d.f. as shown:
Example 3.18 The r.v. X is such that E(X) = 2, Var(X) = 0.5; the r.v. Y is such
that E(Y) = 5, Var(Y) = 2; X and Y are independent.
Find (a) E(3X + 4Y), (b) Var(3X + 4Y), (c) Var(5X − 2Y).
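One way to approach Example 3.18 is with the standard results E(aX + bY) = aE(X) + bE(Y) and, for independent X and Y, Var(aX + bY) = a²Var(X) + b²Var(Y). The sketch below applies them; the function names `mean_linear` and `var_linear` are hypothetical helpers, not notation from the text.

```python
from fractions import Fraction

mean_x, var_x = 2, Fraction(1, 2)
mean_y, var_y = 5, 2

def mean_linear(a, b):
    """E(aX + bY) = a E(X) + b E(Y)."""
    return a * mean_x + b * mean_y

def var_linear(a, b):
    """Var(aX + bY) = a^2 Var(X) + b^2 Var(Y), assuming X and Y are independent."""
    return a**2 * var_x + b**2 * var_y

print(mean_linear(3, 4))        # (a) E(3X + 4Y)
print(float(var_linear(3, 4)))  # (b) Var(3X + 4Y)
print(float(var_linear(5, -2))) # (c) Var(5X - 2Y)
```

Note that the coefficient −2 in part (c) is squared, so the variances add even though the variables are subtracted.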
Example 3.19 The table gives the joint probability distribution of two random
variables X and Y:
By symmetry E(X) = 5:
$E(Y) = \sum_{\text{all } y} y\,P(Y = y)$
= 1(0.6) + 2(0.4)
= 1.4
Therefore E(Y) = 1.4.
THE DISTRIBUTION OF X; + X2
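$E(X) = 2(0.3) + 3(0.5) + 4(0.2) = 2.9$
$E(X^2) = \sum_{\text{all } x} x^2 P(X = x) = 4(0.3) + 9(0.5) + 16(0.2) = 8.9$
$\mathrm{Var}(X) = 8.9 - (2.9)^2 = 0.49$
X₁    X₂    X₁ + X₂    probability
2     2     4          (0.3)(0.3) = 0.09
2     3     5          (0.3)(0.5) = 0.15
2     4     6          (0.3)(0.2) = 0.06
3     2     5          (0.5)(0.3) = 0.15
3     3     6          (0.5)(0.5) = 0.25
3     4     7          (0.5)(0.2) = 0.1
4     2     6          (0.2)(0.3) = 0.06
4     3     7          (0.2)(0.5) = 0.1
4     4     8          (0.2)(0.2) = 0.04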
Example 3.21 Find the expectation and variance of the number of heads obtained
when 6 coins are tossed.
Solution 3.21 Let X be the r.v. ‘the number of heads when a coin is tossed’. Then
X can take the values 0, 1
Example 3.22 (a) A tetrahedral die is thrown and the number of the face on
which it lands is noted. Find the expectation and the variance.
(b) The ‘score’ is double the number on which it lands. Find the
expectation and the variance of the ‘score’.
(c) A new experiment is set up, where the ‘score’ is the sum of the
numbers obtained when the die is thrown twice. Find the
expectation and the variance of this new ‘score’.
Solution 3.22 (a) Let X be the r.v. ‘the number on which the die lands’.
$\mathrm{Var}(X) = E(X^2) - E^2(X)$
$= \sum_{\text{all } x} x^2 P(X = x) - E^2(X)$
$= \tfrac{1}{4}(1 + 4 + 9 + 16) - (2.5)^2$
$= 1.25$
So E(X) = 2.5, Var(X) = 1.25.
By symmetry E(R) = 5.
$\mathrm{Var}(R) = E(R^2) - E^2(R)$
$= \tfrac{1}{4}(4 + 16 + 36 + 64) - 25$
$= 5$
We note that E(R) = 2E(X) and Var(R) = 4Var(X).
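(c) Consider the r.v. S where S is the sum of the two numbers on
which the die lands when it is thrown twice. Therefore S = X₁ + X₂.
Now S can assume the values 2, 3, 4, 5, 6, 7, 8 and the outcomes (all
equally likely) are shown in the diagram:
                       First throw
                    1    2    3    4
Second throw  1     2    3    4    5
              2     3    4    5    6
              3     4    5    6    7
              4     5    6    7    8
By symmetry, E(S) = 5.
$\mathrm{Var}(S) = E(S^2) - E^2(S)$
$= \sum_{\text{all } s} s^2 P(S = s) - 25$
$= \tfrac{1}{16}[4(1) + 9(2) + 16(3) + 25(4) + 36(3) + 49(2) + 64(1)] - 25$
$= 2.5$
We note that E(S) = 2E(X) and Var(S) = 2Var(X).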
We can see that the distribution for R, double the number on which
the die lands, is very different from the distribution for S, the sum
of the numbers on which the die lands when it is thrown twice.
Although the means of the two distributions are the same, the
variances are not, with the r.v. ‘double the number’ having the
greater variance.
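The contrast between doubling one throw and adding two independent throws can be confirmed by building both distributions. This is a minimal sketch; `R` and `S` follow the text, while the helpers `mean` and `var` are illustrative names.

```python
from collections import defaultdict
from fractions import Fraction

X = {x: Fraction(1, 4) for x in (1, 2, 3, 4)}  # one throw of the tetrahedral die

def mean(pdf):
    return sum(v * p for v, p in pdf.items())

def var(pdf):
    return sum(v**2 * p for v, p in pdf.items()) - mean(pdf) ** 2

# R = 2X: double the number on a single throw.
R = {2 * x: p for x, p in X.items()}

# S = X1 + X2: the sum of two independent throws.
S = defaultdict(Fraction)
for x1, p1 in X.items():
    for x2, p2 in X.items():
        S[x1 + x2] += p1 * p2

print(mean(R), var(R))  # 5 and 5
print(mean(S), var(S))  # 5 and 5/2
```

Doubling multiplies the variance by 4, whereas adding two independent observations only doubles it.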
Summarising, we have
Exercise 3f
1. Independent random variables X and Y 6. Two ordinary dice are thrown, a red and a
have probability distributions as shown in green die. Let R be the r.v. ‘the score on
the tables: the red die’ and let G be the r.v. ‘the score
on the green die’.
(a) Construct the probability distribution
for R+ G, the r.v. ‘the sum of the two
scores’, and find (i) E(R+G),
(ii) Var(R + G).
(b) Construct the probability distribution
Oo 02s 0A for R—G and find (i) E(R—G),
(ii) Var(R— G).
(a) Find E(X), E(Y), Var(X), Var(Y). (c) Given that E(R) = 3.5 and
(b) Construct the probability distribution Var(R) = 3, comment on your answers.
forthe rv.xct Y.
(c) Verify that E(X+ Y) = E(X)+ E(Y).
(d) Verify that 7. X has probability distribution as shown:
Var(X+ Y) = Var(X) + Var(Y)
(e) Construct the probability distribution
for the r.v. X— Y.
(f) Verify that E(X— Y) = E(X)— E(Y).
(g) Verify that
Var(X— Y) = Var(X)+ Var(Y). (a) Find E(X) and Var(X).
(b) Find P(X,+ X2,=4) where X,, X2
are two independent observations of X.
2. Independent random variables X and Y (c) Find E(X,+X,) and Var(X;+ X2).
are such that E(X) = 4, E(Y)=5, (d) Find P(2X = 4).
Var(X) = 1, Var(Y) = 2. Find (e) Find E(2X) and Var(2X).
(a) E(4X+ 2Y), (b) E(5X—Y),
(c) Var(3X+ 2Y), (d) Var(5Y— 3X), 8. : Rods of length 2m or 3 mare selected
1 at
(e) Var(3X—5Y).
random with probabilities 0.4 and 0.6
respectively.
(a) Find the expectation and variance of
the length of a rod.
(b) Two lengths are now selected at
random. Find the expectation and
variance of the sum of the two lengths.
(c) Three lengths are now selected at
The above table gives the joint probability random. Show that the probability distri-
distribution of two random variables X bution of Y, the sum of the three lengths,
and Y. Calculate (a) P(Y = 1), ie
(b) P(XY = 2), (c) E(X+ Y).
(L Additional)
$F(t) = \sum_{x = x_1}^{t} P(X = x)$ where F(t) is the cumulative distribution function
$E(X) = \sum_{\text{all } x} x\,P(X = x)$
$\mathrm{Var}(X) = E(X^2) - E^2(X)$
$= \sum_{\text{all } x} x^2 P(X = x) - E^2(X)$
1. Two tetrahedral dice are thrown and the score is the product of the numbers on which the dice fall. What is the expected score for a throw?
2. A housewife removes the labels from three tins of peaches and a tin of baked beans in order to enter a competition and then puts the tins in a cupboard. She discovers that the tins are outwardly identical. Let X be the number of tins she now needs to open in order to have baked beans. List the values that X can take and determine the probabilities for each of these values of X. Calculate the expected value of X.
Her neighbour has five tins of peaches and two tins of baked beans, again outwardly identical once the labels are removed. This woman removes the labels and puts the tins away. Find the probability that this woman later requires to open at least three tins to have baked beans. (SUJB Additional)
3. On a long train journey, a statistician is invited by a gambler to play a dice game. The game uses two ordinary dice which the statistician is to throw. If the total score is 12, the statistician is paid £6 by the gambler. If the total score is 8, the statistician is paid £3 by the gambler. However if both or either dice show a 1, the statistician pays the gambler £2. Let £X be the amount paid to the statistician by the gambler after the dice are thrown once.
Determine the probability that (a) X = 6, (b) X = 3, (c) X = -2.
Find the expected value of X and show that, if the statistician played the game 100 times, his expected loss would be £2.78, to the nearest penny.
Find the amount, £a, that the £6 would have to be changed to in order to make the game unbiased. (SUJB)
4. A box contains nine numbered balls. Three balls are numbered 3, four balls are numbered 4 and two balls are numbered 5.
Each trial of an experiment consists of drawing two balls without replacement and recording the sum of the numbers on them, which is denoted by X. Show that the probability that X = 10 is 1/36, and find the probabilities of all other possible values of X.
Use your results to show that the mean of X is 7 7/9, and find the standard deviation of X.
Two trials are made. (The two balls in the first trial are replaced in the box before the second trial.) Find the probability that the second value of X is greater than or equal to the first value of X. (MET)
5. A man stakes £2 to play a game in which he rolls an ordinary (fair) die. If he scores 1 or 2 he wins £3 (plus his stake) and loses his stake if he scores 3, 4 or 5. If he scores a six he may roll the die once again, winning if he scores 1, 2 or 6, losing if he scores 3, 4 or 5. Find
(a) the probability that the man wins the game by rolling (i) once, (ii) twice,
(b) his expectation,
(c) the expected number of times he will roll the die.
If the rules are changed so that the winning scores are 1 and 2 but that every time he scores 6 he may roll the die again, find
(d) the probability that he wins on his rth roll of the die,
(e) the probability that he wins the game. (SUJB)
6. A and B each roll a fair die simultaneously. Construct a table for the difference in their scores showing the associated probabilities. Calculate the mean of the distribution. If the difference in scores is 1 or 2, A wins; if it is 3, 4 or 5, B wins and if it is zero, they roll their dice again. The game ends when one of the players has won. Calculate the probability that A wins on (a) the first, (b) the second, (c) the rth roll. What is the probability that A wins?
If B stakes £1 what should A stake for the game to be fair? (SUJB)
7. A gambler has 4 packs of cards each of which is well shuffled and has equal numbers of red, green and blue cards. For each turn he pays £2 and draws a card from each pack. He wins £3 if he gets 2 red cards, £5 if he gets 3 red cards and £10 if he gets 4 red cards.
(a) What are the probabilities of his drawing 0, 1, 2, 3, 4 red cards?
(b) What is the expectation of his winnings (to the nearest 10p)? (SUJB)
8. During winter a family requests 4 bottles of milk every day, and these are left on the door-step. Three of the bottles have silver tops and the fourth has a gold top. A thirsty blue-tit attempts to remove the tops from these bottles. The probability distribution of X, the number of silver tops removed by the blue-tit, is the same each day and is given by
P(X=0) =% P(X=1) = &,6 P(X=2) = 3, P(X=3)=%
The blue-tit finds the gold top particularly attractive, and the probability that this top is removed is 2 independent of the number of silver tops removed. Determine the expectation and variance of
(a) the number of silver tops removed in a day,
(b) the number of gold tops removed in a day,
(c) the total number of tops (silver and gold) removed in 7 days.
Find also the probability distribution of the total number of tops (silver and gold) removed in a day. (C)
9. The probability of there being X unusable matches in a full box of Surelite matches is given by P(X = 0) = 8k, P(X = 1) = 5k, P(X = 2) = P(X = 3) = k, P(X ≥ 4) = 0. Determine the constant k and the expectation and variance of X.
Two full boxes of Surelite matches are chosen at random and the total number Y of unusable matches is determined. Calculate P(Y > 4), and state the values of the expectation and variance of Y. (C)
10. A player throws a die whose faces are numbered 1 to 6 inclusive. If the player obtains a six he throws the die a second time, and in this case his score is the sum of 6 and the second number; otherwise his score is the number obtained. The player has no more than two throws.
Let X be the random variable denoting the player’s score. Write down the probability distribution of X, and determine the mean of X.
Show that the probability that the sum of two successive scores is 8 or more is 17/36. Determine the probability that the first of two successive scores is 7 or more, given that their sum is 8 or more. (C)
11. The faces of an ordinary die are re-numbered so that the faces are 1, 2, 2, 3, 3 and 3. This die and an ordinary, unaltered die are thrown at the same time. The score, X, is the sum of the numbers on the uppermost faces of the two dice. Show that the probability of X being 3 is 1/12 and of being 4 is 1/6.
List the values that X can take and determine their respective probabilities. Hence obtain the expected value of X, correct to 3 decimal places.
If the dice are thrown 3 times, determine the probability, correct to 3 significant figures, that none of the three values of X exceeds 3. (SUJB)
12. Alan and his younger brother Bill play a game each day. Alan throws three darts at a dartboard and for each dart that scores a bull (which happens with probability p) Bill gives him a penny, while for each dart which misses the bull (which happens with probability 1 - p) Alan gives Bill twopence. By considering all possible outcomes for the three throws, or otherwise, find the distribution of the number of pence (positive or negative) that Bill receives each day. Show that, when p = 1/3, the mean is 3 and the variance 6.
The game takes place on 150 days. What is the mean and standard deviation of Bill’s total winnings when p = 3? (O)
13. In a certain field, each puffball which is growing in one year gives rise to a number, X, of new puffballs in the following year. None of the original puffballs is present in the following year. The probability distribution of the random variable X is as follows:
P(X = 0) = P(X = 2) = 0.3, P(X = 1) = 0.4.
Find the probability distribution of Y, the number of puffballs resulting from there being two puffballs in the previous year, and show that the variance of Y is 1.2.
Hence, or otherwise, determine the probability distribution of the number, Z, of puffballs present in year 3, given that there was a single puffball present in year 1. Find also the mean and variance of Z. (C)
14. A discrete random variable X can take only the values 0, 1, 2 or 3, and its probability distribution is given by P(X = 0) = k, P(X = 1) = 8k, P(X = 2) = 4k, P(X = 3) = 5k, where k is a constant. Find
(a) the value of k,
(b) the mean and variance of X. (JMB)
15. A random variable R takes the integer value r with probability P(r) where
P(r) = kr’, ees
P(r) = 0, otherwise.
Find
(a) the value of k, and display the distribution on graph paper,
(b) the mean and the variance of the distribution,
(c) the mean and the variance of 5R - 3. (L)P
16. A gambling machine works in the following way. The player inserts a penny into one of five slots, which are coloured Blue, Red, Orange, Yellow and Green corresponding to five coloured light bulbs. The player can choose whichever coloured slot he likes. After the penny has been inserted one of the five bulbs lights up. If the bulb lit up is the same colour as the slot selected by the player, then the player wins and receives from the machine R pennies, where
Show that the expected value of X, the length of the selected rod, is 3 units and find the variance of X.
After a rod has been selected it is not replaced. The probabilities of selection for each of the three rods that remain are in the same ratio as they were before the first selection. A second rod is now selected from the bag. Defining Y to be the length of this rod and writing P1 = P(Y = 1 | X = 2), P2 = P(Y = 2 | X = 1), show that 16P1 = 9P2.
Show also that P(X + Y = 3) = 35, (C)
18. A game is played in which a complete throw consists of three fair coins being tossed once each and any which have landed tails being tossed a second time; no coin is tossed more than twice. The score for the complete throw is the total number of heads showing at the end of the throw.
(a) Find the respective probabilities that the score after a complete throw is (i) 0, (ii) 1, (iii) 2, (iv) 3.
(b) Show that the average score over a large number of complete throws is 9/4.
(You may leave your answers as fractions in their lowest terms.) (O & C)
19. The random variable X takes values -2,
P(X = 3)
|
P(X = 4) = =z be I —
|
gained by the player from a single try,
o
21. A random variable R takes the integer values 1, 2, ..., n each with probability 1/n. Find the mean and variance of R.
A pack of 15 cards bearing the numbers 1 to 15 is shuffled. Find the probability that the number on the top card is larger than that on the bottom card, giving reasons for your answer.
If the sum of these two numbers is S, find
(a) the probability that S < 4,
(b) the expected value of S.
(Answers may be left as fractions in their lowest terms.) (O & C)
22. A discrete random variable X has the distribution function
(a) Write down the probability distribution of X.
(b) Find the probability distribution of the sum of two independent observations from X and find the mean and variance of the distribution of this sum.
23. A random variable R takes the integer value r with probability P(r) defined by
P(r) = kr’, r = 1, 2, 3,
P(r) = k(7 - r), r = 4, 5, 6,
P(r) = 0, otherwise.
Find the value of k and the mean and variance of the probability distribution. Exhibit this distribution by a suitable diagram.
Determine the mean and the variance of the variable Y where Y = 4R - 2. (L)P
SPECIAL DISCRETE PROBABILITY DISTRIBUTIONS
THE BINOMIAL DISTRIBUTION
Consider an experiment which has two possible outcomes, one
which may be termed ‘success’ and the other ‘failure’. A binomial
situation arises when n independent trials of the experiment are
performed, for example
toss a coin 6 times; consider obtaining a head on a single toss as
a success, and obtaining a tail as a failure;
throw a die 10 times; consider obtaining a 6 on a single throw as
a success, and not obtaining a 6 as a failure.
Example 4.1 A coin is biased so that the probability of obtaining a head is2. The
coin is tossed four times. Find the probability of obtaining exactly
two heads.
But the result ‘two heads and two tails’ can be obtained in $\dfrac{4!}{2!\,2!}$ ways.
This is the number of ways of choosing the 2 places for the heads from the 4 places, i.e. ${}^{4}C_{2}$ ways. The arrangements are:
Tp
ere
Therefore P(2 heads exactly) = orf )
Example 4.2 An ordinary die is thrown seven times. Find the probability of
obtaining exactly three sixes.
P(exactly three sixes) = ${}^{7}C_{3}\left(\tfrac{5}{6}\right)^{4}\left(\tfrac{1}{6}\right)^{3}$ = 0.078 (3 d.p.)
The probability of obtaining exactly three sixes when a die is
thrown seven times is 0.078 (3 d.p.).
Example 4.3 The probability that a marksman hits a target is p and the probability that he misses is q, where q = 1 - p. Write an expression for the probability that, in 10 shots, he hits the target 6 times.
In general:
$P(X = x) = {}^{n}C_{x}\,q^{n-x}p^{x}, \quad x = 0, 1, 2, \ldots, n$
The values P(X = x) for x = 0, 1, ..., n can be obtained by considering the terms in the binomial expansion of $(q + p)^{n}$, noting that q + p = 1.
Example 45 The probability that a person supports Party A is 0.6. Find the
probability that in a randomly selected sample of 8 voters there are
(a) exactly 3 who support Party A, (b) more than 5 who support
Party A.
Solution 4.5 We will consider ‘supporting Party A’ as success. Then p = 0.6 and
q=1—p=0.4. Let X be the r.v. ‘the number of Party A
supporters’. Then X ~ Bin(n, p) with n = 8 and p = 0.6.
So X ~ Bin(8, 0.6)
and
$P(X = x) = {}^{8}C_{x}(0.4)^{8-x}(0.6)^{x}, \quad x = 0, 1, \ldots, 8$
(a) We require
$P(X = 3) = {}^{8}C_{3}(0.4)^{5}(0.6)^{3} = 0.124$ (3 d.p.)
The probability that there are exactly 3 Party A supporters is 0.124 (3 d.p.).
(b) We require
$P(X > 5) = P(X = 6) + P(X = 7) + P(X = 8)$
$= {}^{8}C_{6}(0.4)^{2}(0.6)^{6} + {}^{8}C_{7}(0.4)(0.6)^{7} + {}^{8}C_{8}(0.6)^{8}$
$= 28(0.4)^{2}(0.6)^{6} + 8(0.4)(0.6)^{7} + (0.6)^{8}$
$= (0.6)^{6}(4.48 + 1.92 + 0.36)$
$= 0.315$ (3 d.p.)
The probability that there are more than 5 Party A supporters is 0.315 (3 d.p.).
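The two answers in Solution 4.5 can be checked numerically. The following is a minimal sketch in Python; the helper name `binom_pmf` is an illustrative choice, not something from the text, and it simply evaluates the formula for P(X = x) term by term.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p), written as nCx * q^(n-x) * p^x."""
    q = 1 - p
    return comb(n, x) * q ** (n - x) * p ** x

n, p = 8, 0.6  # Example 4.5: 8 voters, P(supports Party A) = 0.6

exactly_three = binom_pmf(3, n, p)
more_than_five = sum(binom_pmf(x, n, p) for x in range(6, n + 1))

print(round(exactly_three, 3))   # 0.124
print(round(more_than_five, 3))  # 0.315
```

The same helper gives the answer to Example 4.2 as `binom_pmf(3, 7, 1/6)`.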
Example 4.6 A box contains a large number of red and yellow tulip bulbs in the ratio 1:3. Bulbs are picked at random from the box. How many bulbs must be picked so that the probability that there is at least one red tulip bulb among them is greater than 0.95?
Let X be the number of red bulbs among the n bulbs picked, so that X ~ Bin(n, 1/4).
$P(X \ge 1) = 1 - P(X = 0) = 1 - \left(\tfrac{3}{4}\right)^{n}$
So $1 - \left(\tfrac{3}{4}\right)^{n} > 0.95$
$0.05 > \left(\tfrac{3}{4}\right)^{n}$
$\log 0.05 > n\log 0.75$ (taking logs to base 10)
i.e. $n\log 0.75 < \log 0.05$
$n > \dfrac{-1.301}{-0.1249}$ (change inequality when dividing by a negative quantity)
$n > 10.4$
So at least 11 bulbs must be picked.
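The inequality in Example 4.6 can also be settled by direct search rather than logarithms. This is a small sketch, assuming independent picks with P(red) = 0.25; the function name `least_sample_size` is hypothetical.

```python
def least_sample_size(p_success, target):
    """Smallest n for which P(at least one success in n trials) > target."""
    n = 1
    # P(at least one success) = 1 - q^n, where q = 1 - p_success
    while 1 - (1 - p_success) ** n <= target:
        n += 1
    return n

bulbs_needed = least_sample_size(0.25, 0.95)
print(bulbs_needed)  # 11
```

The search agrees with the logarithm method: ten bulbs give a probability of about 0.944, and eleven give about 0.958.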
Exercise 4a
the probabilities of obtaining 0, 1, 2, 3 heads with the terms in the binomial expansion of $(q + p)^{3}$ where q = 1 - p.
(b) The coin is now tossed four times. Compare the probabilities P(X = x) for x = 0, 1, 2, 3, 4 given in the tree diagram with the binomial expansion of $(q + p)^{4}$. X is the r.v. ‘the number of heads obtained in four tosses’.
12. If X ~ Bin(n, 0.6) and P(X < 1) = 0.0256, find n.
13. 1% of a box of light bulbs are faulty. What is the largest sample size which can be taken if it is required that the probability that there are no faulty bulbs in the sample is greater than 0.5?
14. If X ~ Bin(n, 0.3) and P(X ≥ 1) > 0.8, find the least possible value of n.
15. The probability that a target is hit is 0.3. Find the least number of shots which should be fired if the probability that the target is hit at least once is greater than 0.95.
16. In a multiple choice test there are 10 questions and for each question there is a choice of 4 answers, only one of which is correct. If a student guesses at each of the answers, find the probability that he gets (a) none correct, (b) more than 7 correct. If he needs to obtain over half marks to pass, and the questions carry equal weight, find the probability that he passes.
17. Of the pupils in a school, 30% travel to school by bus. From a sample of 10 pupils chosen at random, find the probability that (a) only 3 travel by bus, (b) more than 8 travel by bus.
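Proof Now
$P(X = x) = {}^{n}C_{x}\,q^{n-x}p^{x}, \quad x = 0, 1, \ldots, n$
$E(X) = \sum_{\text{all } x} x\,P(X = x)$
$= (0)q^{n} + (1)nq^{n-1}p + (2)\dfrac{n(n-1)}{2!}q^{n-2}p^{2} + (3)\dfrac{n(n-1)(n-2)}{3!}q^{n-3}p^{3} + \ldots + np^{n}$
$= np\left[q^{n-1} + (n-1)q^{n-2}p + \dfrac{(n-1)(n-2)}{2!}q^{n-3}p^{2} + \ldots + p^{n-1}\right]$
$= np\left[(q + p)^{n-1}\right]$
$= np$ since q + p = 1
Therefore E(X) = np.
$E(X^{2}) = \sum_{\text{all } x} x^{2}P(X = x)$
$= (0)q^{n} + (1)nq^{n-1}p + (4)\dfrac{n(n-1)}{2!}q^{n-2}p^{2} + (9)\dfrac{n(n-1)(n-2)}{3!}q^{n-3}p^{3} + \ldots + n^{2}p^{n}$
$= np\left[q^{n-1} + 2(n-1)q^{n-2}p + \dfrac{3(n-1)(n-2)}{2!}q^{n-3}p^{2} + \ldots + np^{n-1}\right]$
$= np\Big[q^{n-1} + (n-1)q^{n-2}p + \dfrac{(n-1)(n-2)}{2!}q^{n-3}p^{2} + \ldots + p^{n-1}$
$\qquad + (n-1)q^{n-2}p + (n-1)(n-2)q^{n-3}p^{2} + \ldots + (n-1)p^{n-1}\Big]$
Now the first row of terms is, as before, the expansion of $(q + p)^{n-1}$.
So $E(X^{2}) = np\left\{(q + p)^{n-1} + (n-1)p\left[q^{n-2} + (n-2)q^{n-3}p + \ldots + p^{n-2}\right]\right\}$
$= np\left[1 + (n-1)p(q + p)^{n-2}\right]$
$= np\left[1 + (n-1)p\right]$
$= np(1 - p) + n^{2}p^{2}$
Therefore $\mathrm{Var}(X) = np(1 - p) + n^{2}p^{2} - (np)^{2}$
$= npq$ where q = 1 - p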
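The results E(X) = np and Var(X) = npq can be checked numerically for any particular n and p by summing over the whole distribution. The sketch below does this; the function name `binomial_mean_and_variance` and the trial values n = 25, p = 0.1 are illustrative choices only.

```python
from math import comb

def binomial_mean_and_variance(n, p):
    """Return (E(X), Var(X)) for X ~ Bin(n, p) by direct summation."""
    q = 1 - p
    probs = [comb(n, x) * q ** (n - x) * p ** x for x in range(n + 1)]
    mean = sum(x * px for x, px in enumerate(probs))
    mean_of_squares = sum(x * x * px for x, px in enumerate(probs))
    # Var(X) = E(X^2) - E^2(X)
    return mean, mean_of_squares - mean ** 2

mean, variance = binomial_mean_and_variance(25, 0.1)
print(mean, 25 * 0.1)            # both close to 2.5
print(variance, 25 * 0.1 * 0.9)  # both close to 2.25
```

This is a check for particular values, not a proof; the algebraic argument above is what establishes the result in general.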
Solution 4.7 Let ‘fine day’ be ‘success’. Then p = 0.4 and q = 0.6. Let X be the r.v. ‘the number of fine days in a week’.
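$\mathrm{Var}(X) = \dfrac{24}{13}$ so $npq = \dfrac{24}{13}$ (ii)
Dividing (ii) by (i) gives $q = \dfrac{12}{13}$
Therefore $p = 1 - q = 1 - \dfrac{12}{13} = \dfrac{1}{13}$
Now substituting for p in (i) we have
$\dfrac{n}{13} = 2$
n = 26
so $P(X = 2) = {}^{26}C_{2}\left(\dfrac{12}{13}\right)^{24}\left(\dfrac{1}{13}\right)^{2}$
$= \dfrac{(26)(25)(12)^{24}}{(1)(2)(13)^{26}}$
= 0.282 (3 d.p.)
Therefore P(X = 2) = 0.282 (3 d.p.).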
Exercise 4b
1. Of the articles from a certain production line, 10% are defective. If a sample of 25 articles is taken, find the expected number of defective articles and the standard deviation.
2. The probability that an apple, picked at random from a sack, is bad is 0.05. Find the standard deviation of the number of bad apples in a sample of 15 apples.
3. X is a r.v. such that X ~ Bin(n, p). Given that E(X) = 2.4 and p = 0.3, find n and the standard deviation of X.
4. In a group of people the expected number who wear glasses is 2 and the variance is 1.6. Find the probability that (a) a person chosen at random from the group wears glasses, (b) 6 people in the group wear glasses.
5. If the r.v. X is such that X ~ Bin(10, p) where p <$ and Var(X) = 14, find (a) p, (b) E(X), (c) P(X = 2).
6. A die is biased and the probability, p, of throwing a six is known to be less than é. An experiment consists of recording the number of sixes in 25 throws of the die. In a large number of experiments the standard deviation of the number of sixes is 1.5. Calculate the value of p and hence determine, to two places of decimals, the probability that exactly three sixes are recorded during a particular experiment. (C Additional)
7. (i) For each of the experiments described below, state, giving a reason, whether a binomial distribution is appropriate.
Experiment 1. A bag contains black, white and red marbles which are selected at random, one at a time with replacement. The colour of each marble is noted.
Experiment 2. This experiment is a repeat of Experiment 1 except that the bag contains black and white marbles only.
Experiment 3. This experiment is a repeat of Experiment 2 except that marbles are not replaced after selection.
(ii) On average 20% of the bolts produced by a machine in a factory are faulty. Samples of 10 bolts are to be selected at random each day. Each bolt will be selected and replaced in the set of bolts which have been produced on that day.
(a) Calculate, to 2 significant figures, the probability that, in any one sample, two bolts or less will be faulty.
(b) Find the expected value and the variance of the number of bolts in a sample which will not be faulty. (L Additional)
8. In two binomial distributions the ratio of the number of independent trials is 5:6, the ratio of the arithmetic means is 2:9 and the ratio of the variances is 32:45. For each distribution, find the probability of success.
Solution 4.9 (a) P(X < 4) = 0.9976 (directly from the tables)
In the tables values of p are given from 0.1 to 0.5. However we are still able to use them for p = 0.6, 0.7, 0.8 and 0.9 by using the fact that
$P(X = r \mid X \sim \mathrm{Bin}(n, p)) = P(X = n - r \mid X \sim \mathrm{Bin}(n, 1 - p))$
Consider again the probability distributions for X ~ Bin(5, 0.3) and X ~ Bin(5, 0.7).
X ~ Bin(5, 0.3)
X ~ Bin(5, 0.7)
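The relationship between Bin(n, p) and Bin(n, 1 - p) that lets the tables be used for p > 0.5 can be seen by listing both distributions side by side. The sketch below does this for n = 5; `binom_pmf` is an illustrative helper, not a name used in the text.

```python
from math import comb, isclose

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * (1 - p) ** (n - x) * p ** x

n = 5
for r in range(n + 1):
    left = binom_pmf(r, n, 0.7)       # P(X = r)     with X ~ Bin(5, 0.7)
    right = binom_pmf(n - r, n, 0.3)  # P(X = n - r) with X ~ Bin(5, 0.3)
    print(r, round(left, 5), round(right, 5))

# Every pair printed above should agree.
symmetric = all(
    isclose(binom_pmf(r, n, 0.7), binom_pmf(n - r, n, 0.3)) for r in range(n + 1)
)
```

In words: r successes with success probability 0.7 is the same event as n - r failures, and each failure has probability 0.3.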
Example 4.10 If X ~ Bin(5, 0.7) find (a) P(X >3), (b) P(X <4),
(c) P(X = 4).
Example 4.11 Given X ~ Bin(8, 0.2) write out the probability distribution of X.
Exercise 4c
So
$P(X = x + 1) = \dfrac{(n - x)p}{(x + 1)q}\,P(X = x)$
i.e. $p_{x+1} = \dfrac{(n - x)p}{(x + 1)q}\,p_{x}$ where $p_{x+1} = P(X = x + 1)$ and $p_{x} = P(X = x)$
Example 4.12 If X ~ Bin(8, 0.3) use the recurrence formula to calculate P(X <4).
When x = 1
$p_{2} = \dfrac{7(0.3)}{2(0.7)}\,p_{1} = 0.296\,475\,4$
When x = 2
$p_{3} = \dfrac{6(0.3)}{3(0.7)}\,p_{2} = 0.254\,121\,8$
When x = 3
$p_{4} = \dfrac{5(0.3)}{4(0.7)}\,p_{3} = 0.136\,136\,7$
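The recurrence formula lends itself to a short loop: start from p_0 = q^n and multiply repeatedly. The following is a sketch for Example 4.12; the function name `binomial_by_recurrence` is a hypothetical choice.

```python
def binomial_by_recurrence(n, p):
    """List [p_0, p_1, ..., p_n] for X ~ Bin(n, p), built from p_0 = q^n
    using p_{x+1} = (n - x)p / ((x + 1)q) * p_x."""
    q = 1 - p
    probs = [q ** n]
    for x in range(n):
        probs.append(probs[-1] * (n - x) * p / ((x + 1) * q))
    return probs

probs = binomial_by_recurrence(8, 0.3)
for x in range(5):
    print(x, round(probs[x], 6))

# Sum of the first five terms, p_0 to p_4
cumulative = sum(probs[:5])
print(round(cumulative, 3))
```

Each term reuses the previous one, so only one multiplication and one division are needed per step; this is the saving the recurrence formula offers over evaluating every term from scratch.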
Example 4.13 A pottery produces royal souvenir mugs. It is known that 6% are
defective. If 20 mugs are selected at random, find the probability
that the sample contains less than 5 defective mugs.
Method 1
$P(X = 0) = (0.94)^{20} = 0.290\,106$ (6 d.p.)
$P(X = 1) = 20(0.94)^{19}(0.06) = 0.370\,348$
$P(X = 2) = \dfrac{(20)(19)}{(2)(1)}(0.94)^{18}(0.06)^{2} = 0.224\,573$
$P(X = 3) = \dfrac{(20)(19)(18)}{(3)(2)(1)}(0.94)^{17}(0.06)^{3} = 0.086\,007$
$P(X = 4) = \dfrac{(20)(19)(18)(17)}{(4)(3)(2)(1)}(0.94)^{16}(0.06)^{4} = 0.023\,332$
Total = 0.994 366
So P(X < 5) = 0.994 (3 S.F.).
When x = 1
$p_{2} = \dfrac{19}{2}\left(\dfrac{0.06}{0.94}\right)p_{1}$
When x = 2
$p_{3} = \dfrac{18}{3}\left(\dfrac{0.06}{0.94}\right)p_{2}$
When x = 3
$p_{4} = \dfrac{17}{4}\left(\dfrac{0.06}{0.94}\right)p_{3}$
and so on.
Example 4.14 Of the inhabitants of a certain African village, 80% are known to
have a particular eye disorder. If 12 people are waiting to see the
nurse, what is the most likely number of them to have the eye
disorder?
Solution 4.14 Let X be the r.v. ‘the number of people with the eye disorder’.
Then X ~ Bin(n, p) with n = 12 and p = 0.8.
Therefore X ~ Bin(12, 0.8)
and $P(X = x) = {}^{12}C_{x}(0.2)^{12-x}(0.8)^{x}, \quad x = 0, 1, \ldots, 12$.
Using the recurrence formula
$p_{x+1} = \dfrac{(n - x)p}{(x + 1)q}\,p_{x}$
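The recurrence formula shows that the probabilities keep increasing for as long as the multiplier (n - x)p / ((x + 1)q) is greater than 1. A minimal sketch of that idea follows; the function name `most_likely_value` is hypothetical, and the sketch ignores the special case where the multiplier is exactly 1 (two equally likely values).

```python
def most_likely_value(n, p):
    """Most likely value of X ~ Bin(n, p), found by stepping x upwards
    while p_{x+1} > p_x, i.e. while (n - x)p > (x + 1)q."""
    q = 1 - p
    x = 0
    while x < n and (n - x) * p > (x + 1) * q:
        x += 1
    return x

mode = most_likely_value(12, 0.8)
print(mode)  # 10
```

For Example 4.14 this suggests that 10 of the 12 people is the most likely number to have the eye disorder.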
Exercise 4d
1. (a) If the r.v. X is such that X ~ Bin(6, 3), use the recurrence formula to find the most likely value of X.
(b) Now use the formula to find P(X = x) for x = 0, 1, ..., 6. Check whether your answer is consistent with your answer to part (a).
2. The r.v. X is such that X ~ Bin(9, 0.35). Find P(X < 6) (a) without using the recurrence formula, (b) using the recurrence formula. Compare your answers.
3. In a bag there are 6 red counters, 8 yellow counters and 6 green counters. A counter is drawn at random from the bag, its colour is noted and it is then replaced. This procedure is carried out ten times in all. Find
(a) the expected number of red counters drawn,
(b) the most likely number of green counters drawn,
(c) the probability that no more than 4 yellow counters are drawn.
4. The random variable X is distributed binomially with mean 2 and variance 1.6. Find (a) the most likely value of X, (b) P(X < 6).
5. The probability that a student is awarded a pass in the mathematics examination is 0.75. Find the probability that in a group of 10 students more than half pass the mathematics examination.
6. The random variable X is such that X ~ Bin(8, 0.4). Find (a) the most likely value of X, (b) P(X < 4), (c) P(X ≥ 4).
Number of heads
Frequency
(a) Find the probability of obtaining a head when the coin is tossed.
(b) Calculate the theoretical frequencies of 0, 1, 2, 3, 4 heads, using the associated theoretical binomial distribution.
Mean = $\dfrac{1300}{500}$ = 2.6
Let X be the r.v. ‘the number of heads obtained in 4 tosses’. Then
X ~ Bin(n, p) with n = 4. So the mean, E(X) = np.
Therefore np = 2.6
4p = 2.6
So p = 0.65
Therefore the probability that the coin will show heads is 0.65.
= 0.310 537 5
$p_{3} = \dfrac{2(0.65)}{3(0.35)}\,p_{2} = (1.238\,095\,2)(0.310\,537\,5)$
= 0.384 475
$p_{4} = \dfrac{1(0.65)}{4(0.35)}\,p_{3} = (0.464\,285\,7)(0.384\,475)$
= 0.178 506 25
Number of heads
NOTE: this compares reasonably well with the original frequency distribution.
A statistical test to compare the two sets of data is illustrated on p. 540 (chi-squared test).
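The fitting procedure can be sketched in a few lines: estimate p from the sample mean, then multiply each binomial probability by the total frequency. The figures 1300 and 500 are taken from the working above (the total number of heads and the number of sets of four tosses); the variable names are illustrative.

```python
from math import comb

total_frequency = 500   # number of sets of 4 tosses (from the working above)
total_heads = 1300      # sum of (number of heads x frequency)
n = 4                   # tosses per set

mean = total_heads / total_frequency   # 2.6
p = mean / n                           # since E(X) = np, p = 0.65
q = 1 - p

theoretical = [
    total_frequency * comb(n, x) * q ** (n - x) * p ** x for x in range(n + 1)
]
for x, frequency in enumerate(theoretical):
    print(x, round(frequency, 1))
```

The theoretical frequencies always add up to the total frequency, which is a useful check on the arithmetic.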
Exercise 4e
1. A biased die is thrown 3 times and the number of fours is noted. The procedure is performed 180 times in all and the results are shown in the table.
Number of 4’s: 0, 1, 2, 3
2. In a large batch of items from a production line the probability that an item is faulty is p. 400 samples, each of size 5, are taken and the number of faulty items in each batch is noted. From the frequency distribution below estimate p and work out the expected frequencies of 0, 1, 2, 3, 4, 5 faulty items per batch for a theoretical binomial distribution having the same mean.
4. Fit a theoretical binomial distribution to the following frequency distribution, given n = 4:
x: 0, 1, 2, 3, 4
f: 7, 20, 35, 30, 8
Find the theoretical frequencies of 0, 1, ..., 6 seeds germinating in a row, using the associated theoretical binomial distribution.
WORKED EXAMPLE
Example 4.16 70% of the passengers who travel on the 8.17 to London buy the
‘Daily Doom’ at the bookstall before boarding the train. The train is
full and each compartment holds eight passengers.
(a) What is the probability that all the passengers in a compart-
ment have bought the ‘Daily Doom’?
(b) What is the probability that none of the passengers in a com-
partment has bought the ‘Daily Doom’?
(c) What is the probability that exactly three of the passengers in
a compartment bought the ‘Daily Doom’?
(d) What is the most likely number of passengers in a compartment
to have bought the ‘Daily Doom’?
(e) If there are 40 compartments on the train in how many of
them would you expect there to be exactly three copies of the
‘Daily Doom’?
(f) The train is so full that in each carriage ten people are standing
in the corridor. What is the probability that the third passenger
I pass in the corridor of a carriage is the first I meet who has
bought the ‘Daily Doom’?
(g) What is the mean number of buyers of the ‘Daily Doom’
standing in a corridor? (SUJB)
Solution 4.16 Let ‘buying the Daily Doom’ be termed ‘success’. Therefore
p= 0.7 and q =1—p=0.3.
Let X be the r.v. ‘the number of passengers who have bought the Daily Doom’. Then X ~ Bin(n, p) where n = 8 and p = 0.7, i.e. X ~ Bin(8, 0.7).
(d) To find the most likely number of people who have bought the Daily Doom, use the recurrence formula to find the term with the highest probability.
Recurrence formula: $p_{x+1} = \dfrac{(n - x)p}{(x + 1)q}\,p_{x}$
Therefore $p_{6} > p_{5} > p_{4} > p_{3} > p_{2} > p_{1} > p_{0}$ but $p_{6} > p_{7} > p_{8}$.
(g) Let C be the r.v. ‘the number of people in the corridor to have bought the Daily Doom’. Then C ~ Bin(10, 0.7).
E(C) = (10)(0.7)
= 7
Therefore the expected number of buyers standing in a corridor is 7.
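The separate parts of Example 4.16 can be worked through numerically as a check. The sketch below assumes, as the solution does, that passengers buy independently with p = 0.7; the helper `binom_pmf` and the variable names are illustrative, and part (f) is treated as two non-buyers followed by a buyer.

```python
from math import comb

def binom_pmf(x, n, p):
    """P(X = x) for X ~ Bin(n, p)."""
    return comb(n, x) * (1 - p) ** (n - x) * p ** x

p = 0.7  # P(a passenger has bought the paper)

all_eight = binom_pmf(8, 8, p)                                  # (a)
none_of_eight = binom_pmf(0, 8, p)                              # (b)
exactly_three = binom_pmf(3, 8, p)                              # (c)
most_likely = max(range(9), key=lambda x: binom_pmf(x, 8, p))   # (d)
expected_compartments = 40 * exactly_three                      # (e)
third_is_first_buyer = (1 - p) ** 2 * p                         # (f)
mean_in_corridor = 10 * p                                       # (g)

print(round(all_eight, 4), round(none_of_eight, 6), round(exactly_three, 4))
print(most_likely, round(expected_compartments, 1))
print(round(third_is_first_buyer, 3), mean_in_corridor)
```

Part (d) here uses a direct search over all nine probabilities, which should give the same value as the comparison of terms made with the recurrence formula.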
Example 4.17 (a) State in words the meaning of P(E’) and of P(E|F) for two
events E and F.
(b) All the letters in a particular office are typed either by Pat, a
trainee typist, or by Lyn, who is a fully trained typist. The
probability that a letter typed by Pat will contain one or more
errors is 0.3. Find the probability that a random sample of 4
letters typed by Pat will include exactly one letter free from
error.
(c) The probability that a letter typed by Lyn will contain one or
more errors is 0.05. Using the tables provided, or otherwise,
find, to 3 decimal places, the probability that in a random
sample of 20 letters typed by Lyn, not more than 2 letters will
contain one or more errors.
(d) On any one day, 6% of the letters typed in the office are typed
by Pat. One letter is chosen at random from those typed on
that day. Show that the probability that it will contain one or
more errors is 0.065.
(e) Given that each of 2 letters chosen at random from the day’s
typing contains one or more errors, find, to 4 decimal places,
the probability that one was typed by Pat and the other by
Lyn. (L)
Solution 4.17 (a) P(E’) is the probability that event E does not occur.
P(E|F) is the probability that E occurs, given that F has occurred.
(b) Let X be the r.v. ‘the number of letters typed by Pat which are free from errors’. Then X ~ Bin(4, 0.7).
$P(X = 1) = {}^{4}C_{3}(0.3)^{3}(0.7)$
= 0.0756
Therefore the probability that a random sample of 4 letters typed by Pat will include exactly one letter free from error is 0.0756.
(e) Now
$P(\text{Pat} \mid E) = \dfrac{P(E \mid \text{Pat}) \cdot P(\text{Pat})}{P(E)} = \dfrac{(0.3)(0.06)}{0.065} = \dfrac{18}{65}$
$P(\text{Lyn} \mid E) = \dfrac{P(E \mid \text{Lyn}) \cdot P(\text{Lyn})}{P(E)} = \dfrac{(0.05)(0.94)}{0.065} = \dfrac{47}{65}$
P(1 typed by Lyn, 1 typed by Pat | 2 letters contained errors)
$= 2\left(\dfrac{18}{65}\right)\left(\dfrac{47}{65}\right)$
= 0.4005 (4 d.p.)
Therefore the probability that one was typed by Pat and one by Lyn, given that the two letters contained errors, is 0.4005 (4 d.p.).
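Parts (d) and (e) of Example 4.17 combine the total probability rule with conditional probability, and the arithmetic can be checked as follows. This is a sketch only; the variable names are illustrative, and the two letters are assumed to be chosen independently.

```python
p_pat = 0.06          # proportion of letters typed by Pat
p_lyn = 1 - p_pat     # the rest are typed by Lyn
p_error_given_pat = 0.3
p_error_given_lyn = 0.05

# (d) total probability that a randomly chosen letter contains errors
p_error = p_error_given_pat * p_pat + p_error_given_lyn * p_lyn

# (e) probability of each typist, given that the letter contains errors
p_pat_given_error = p_error_given_pat * p_pat / p_error
p_lyn_given_error = p_error_given_lyn * p_lyn / p_error

# one letter from each typist, in either order
one_of_each = 2 * p_pat_given_error * p_lyn_given_error

print(round(p_error, 3))      # 0.065
print(round(one_of_each, 4))  # 0.4005
```

The factor of 2 accounts for the two possible orders: Pat then Lyn, or Lyn then Pat.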
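Var(X) = npq
Recurrence formula:
$p_{x+1} = \dfrac{(n - x)p}{(x + 1)q}\,p_{x}$ where $p_{x+1} = P(X = x + 1)$
$P(X = r \mid X \sim \mathrm{Bin}(n, p)) = P(X = n - r \mid X \sim \mathrm{Bin}(n, 1 - p))$
$P(X > r \mid X \sim \mathrm{Bin}(n, p)) = P(X < n - r \mid X \sim \mathrm{Bin}(n, 1 - p))$
$P(X \le r \mid X \sim \mathrm{Bin}(n, p)) = P(X \ge n - r \mid X \sim \mathrm{Bin}(n, 1 - p))$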
__ Miscellaneous Exercise 4f
1. Find the probability of throwing three sixes twice in five throws of six dice.
2. In a large city 1 person in 5 is left handed.
(a) Find the probability that in a random sample of 10 people
(i) exactly 3 will be left handed,
(ii) more than half will be left-handed.
(b) Find the most likely number of left-handed people in a random sample of 12 people.
(c) Find the mean and the standard deviation of the number of left-handed people in a random sample of 25 people.
(d) How large must a random sample be if the probability that it contains at least one left-handed person is to be greater than 0.95?
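Now
$P(X = 3) = q^{2}p$
$P(X = 4) = q^{3}p$
X ~ Geo(p)
where q = 1 - p.
$E(X) = p + 2qp + 3q^{2}p + 4q^{3}p + \ldots$
$= p(1 + 2q + 3q^{2} + 4q^{3} + \ldots)$
$= p(1 - q)^{-2}$ since $(1 - q)^{-2} = 1 + 2q + 3q^{2} + 4q^{3} + \ldots$
$= \dfrac{p}{p^{2}}$
$= \dfrac{1}{p}$
Now
$E(X^{2}) = p + 4qp + 9q^{2}p + 16q^{3}p + \ldots$
$= p(1 + 4q + 9q^{2} + 16q^{3} + \ldots)$
$= p(1 + 2q + 3q^{2} + 4q^{3} + \ldots + 2q + 6q^{2} + 12q^{3} + \ldots)$
$= p\left((1 - q)^{-2} + 2q(1 + 3q + 6q^{2} + \ldots)\right)$
$= p\left[(1 - q)^{-2} + 2q(1 - q)^{-3}\right]$ since $(1 - q)^{-3} = 1 + 3q + 6q^{2} + \ldots$
$= p\left[\dfrac{1}{p^{2}} + \dfrac{2q}{p^{3}}\right]$
$= \dfrac{1}{p} + \dfrac{2q}{p^{2}}$
$\mathrm{Var}(X) = E(X^{2}) - E^{2}(X)$
$= \dfrac{1}{p} + \dfrac{2q}{p^{2}} - \dfrac{1}{p^{2}}$
$= \dfrac{p + 2q - 1}{p^{2}}$
$= \dfrac{q}{p^{2}}$ since p - 1 = -q
Therefore $E(X) = \dfrac{1}{p}$ and $\mathrm{Var}(X) = \dfrac{q}{p^{2}}$.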
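The two results for the geometric distribution can be checked numerically by adding up enough terms of the infinite series. The sketch below truncates the sums after 2000 terms, which is an arbitrary cut-off chosen so that the remaining terms are negligible for the value of p used; the function name is hypothetical.

```python
def geometric_mean_and_variance(p, terms=2000):
    """Approximate E(X) and Var(X) for X ~ Geo(p), where
    P(X = x) = q^(x-1) * p for x = 1, 2, 3, ..., by truncating the series."""
    q = 1 - p
    mean = sum(x * q ** (x - 1) * p for x in range(1, terms + 1))
    mean_of_squares = sum(x * x * q ** (x - 1) * p for x in range(1, terms + 1))
    return mean, mean_of_squares - mean ** 2

p = 0.4
mean, variance = geometric_mean_and_variance(p)
print(mean, 1 / p)                 # both close to 2.5
print(variance, (1 - p) / p ** 2)  # both close to 3.75
```

For very small p the series converges slowly, so more terms would be needed before the truncated sums agree with 1/p and q/p².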
Example 4.18 The probability that a marksman hits the bull’s eye is 0.4 for each
shot, and each shot is independent of all others. Find
(a) the probability that he hits the bull’s eye for the first time on
his fourth attempt,
(b) the mean number of throws needed to hit the bull’s eye, and
the standard deviation,
(c) the most common number of throws until he hits the bull’s
eye.
Solution 4.18 (a) P(hits bull’s eye on fourth attempt) = $(0.6)^{3}(0.4)$ = 0.0864.
Result 1
$P(X \le x) = p + qp + q^{2}p + \ldots + q^{x-1}p$
$= p(1 + q + q^{2} + \ldots + q^{x-1})$
$= \dfrac{p(1 - q^{x})}{1 - q}$
$= 1 - q^{x}$
Example 4.19 A coin is biased so that the probability of obtaining a head is 0.6.
If X is the r.v. the number of tosses up to and including the first
head, find
Solution 4.19 $P(X = x) = q^{x-1}p$, x = 1, 2, 3, ... with p = 0.6 and q = 0.4.
(a) $P(X > 4) = q^{4}$
(b) $P(X > 5) = q^{5}$
$= (0.4)^{5}$
= 0.010 24
(c) $P(X > 8 \mid X > 5) = \dfrac{P(X > 8 \cap X > 5)}{P(X > 5)}$
$= \dfrac{q^{8}}{q^{5}}$
$= (0.4)^{3}$
= 0.064
The probability that more than 8 tosses will be required given that more than 5 tosses are required is 0.064.
In general, $P(X > a + b \mid X > a) = \dfrac{q^{a+b}}{q^{a}} = q^{b} = P(X > b)$
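The general result P(X > a + b | X > a) = P(X > b) says that the geometric distribution has no memory of earlier failures. A short sketch using the figures from Example 4.19 (the function name `prob_more_than` is an illustrative choice):

```python
from math import isclose

def prob_more_than(k, p):
    """P(X > k) = q^k for X ~ Geo(p): the first k trials are all failures."""
    return (1 - p) ** k

p = 0.6  # probability of a head in Example 4.19

# P(X > 8 | X > 5) = P(X > 8) / P(X > 5), since X > 8 implies X > 5
conditional = prob_more_than(8, p) / prob_more_than(5, p)
unconditional = prob_more_than(3, p)

print(round(conditional, 3))    # 0.064
print(round(unconditional, 3))  # 0.064
```

Having already tossed five times without a head makes no difference to how many further tosses are likely to be needed.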
Example 4.20 In a particular board game a player can get out of jail only by
obtaining two heads when she tosses two coins.
(a) Find the probability that more than 6 attempts are needed to
get out of jail.
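$P(X > 6) = \left(\tfrac{3}{4}\right)^{6}$
= 0.178 (3 S.F.)
The probability that the player needs more than 6 attempts before getting out of jail is 0.178 (3 S.F.).
So $1 - \left(\tfrac{3}{4}\right)^{n} \ge 0.9$
$\left(\tfrac{3}{4}\right)^{n} \le 0.1$
$n \ge \dfrac{\log 0.1}{\log 0.75}$
$n \ge 8.0039$
So least value of n is 9.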
Exercise 4g
1. State conditions which give rise to a geometric distribution whose probability function is $P(X = r) = (1 - p)^{r-1}p$, r = 1, 2, 3, ..., where oe
Prove that $P(X \le r) = 1 - (1 - p)^{r}$.
Hence prove that, for any two positive integers s and t,
P(X > s + t | X > s) = P(X > t)
and explain in words the meaning of this result.
During the winter in Glen Shee, the probability that snow will fall on any given day is 0.1. Taking November 1st as the first day of winter and assuming independence from day to day, find, to 2 significant figures, the probability that the first snow of winter will fall in Glen Shee on the last day of November (30th).
Given that no snow has fallen at Glen Shee during the whole of November, a teacher decides not to wait any longer to book a skiing holiday. The teacher decides to book for the earliest date for which the probability that snow will have fallen on or before that date is at least 0.9. Find the date of the booking. (L)
2. In a sales campaign, a petrol company gives each motorist who buys their petrol a card with a picture of a film star on it. There are 10 different picture cards, one of each of ten different film stars, and any motorist who collects a complete set of all ten pictures gets a free gift. On any occasion when a motorist buys petrol, the card received is equally likely to carry any of the ten pictures in the set.
(a) Find the probability that the first four cards the motorist receives all carry different pictures.
(b) Find the probability that the first four cards received result in the motorist having exactly three different pictures.
(c) Two of the ten film stars in the set are X and Y. Find the probability that the first four cards received result in the motorist having a picture of X or of Y (or both).
(d) At a certain stage the motorist has collected nine of the ten pictures. Find the least value of n such that
P(at most n more cards are needed to complete the set) > 0.99. (C)
3. A marksman fires at a target. The probability of his hitting the bull’s-eye is p for each shot, and each shot is independent of all others. The random variable X denotes the number of shots previous to that on which the bull’s-eye is first hit. Show that
$\Pr(X = x) = q^{x}p$
where q = 1 - p. Find the mean of X and show that the variance is $q/p^{2}$. (O & C)
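A discrete random variable X with probability distribution
$P(X = x) = e^{-\lambda}\dfrac{\lambda^x}{x!}$ for x = 0, 1, 2, 3, ... to infinity,
where λ can take any positive value, is said to follow the Poisson distribution. We write X ~ Po(λ).
Now
$\sum_{\text{all } x} P(X = x) = \sum_{x=0}^{\infty} e^{-\lambda}\dfrac{\lambda^x}{x!} = e^{-\lambda}\left(1 + \lambda + \dfrac{\lambda^2}{2!} + \dfrac{\lambda^3}{3!} + \dots\right)$
But $e^{\lambda} = 1 + \lambda + \dfrac{\lambda^2}{2!} + \dfrac{\lambda^3}{3!} + \dots$
So $\sum_{\text{all } x} P(X = x) = (e^{-\lambda})(e^{\lambda}) = 1$
Therefore X is a random variable.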
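The result that the Poisson probabilities sum to 1 can be checked numerically. The sketch below is illustrative only: the function name `poisson_pmf`, the choice λ = 2 and the cut-off at x = 60 are assumptions made for the check, not part of the text.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam), i.e. e^(-lam) * lam^x / x!."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

# Partial sum of the probabilities for lam = 2 (an arbitrary illustrative value).
# Terms beyond x = 60 are far too small to affect the total.
total = sum(poisson_pmf(x, 2.0) for x in range(61))
print(total)
```

The printed total should agree with 1 to within floating-point rounding.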
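Example 4.22 If X ~ Po(λ) find (a) E(X), (b) E(X²), (c) Var(X).
Solution 4.22 $P(X = x) = e^{-\lambda}\dfrac{\lambda^x}{x!}$, x = 0, 1, 2, ...
(a) $E(X) = \sum x P(X = x) = \lambda e^{-\lambda} + 2\dfrac{\lambda^2}{2!}e^{-\lambda} + 3\dfrac{\lambda^3}{3!}e^{-\lambda} + \dots$
$= \lambda e^{-\lambda}\left[1 + \lambda + \dfrac{\lambda^2}{2!} + \dots\right]$
$= \lambda e^{-\lambda}(e^{\lambda})$
$= \lambda$
Therefore E(X) = λ.
(b) Now
$E(X^2) = \sum x^2 P(X = x) = \lambda e^{-\lambda} + 4\dfrac{\lambda^2}{2!}e^{-\lambda} + 9\dfrac{\lambda^3}{3!}e^{-\lambda} + \dots$
$= \lambda e^{-\lambda}\left[1 + 2\lambda + \dfrac{3\lambda^2}{2!} + \dfrac{4\lambda^3}{3!} + \dots\right]$
$= \lambda e^{-\lambda}\left[\left(1 + \lambda + \dfrac{\lambda^2}{2!} + \dfrac{\lambda^3}{3!} + \dots\right) + \lambda\left(1 + \lambda + \dfrac{\lambda^2}{2!} + \dots\right)\right]$
$= \lambda e^{-\lambda}(e^{\lambda} + \lambda e^{\lambda})$
$= \lambda + \lambda^2$
Therefore E(X²) = λ + λ².
(c) $\mathrm{Var}(X) = E(X^2) - [E(X)]^2 = \lambda + \lambda^2 - \lambda^2 = \lambda$
Therefore Var(X) = λ.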
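The mean and second moment derived above can be confirmed numerically by summing the series directly. This is a minimal sketch: the names `poisson_pmf` and `moments`, the value λ = 2.5 and the truncation at 100 terms are all assumptions chosen for illustration.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def moments(lam, terms=100):
    """Return (E(X), E(X^2), Var(X)) from truncated sums over x = 0..terms-1."""
    mean = sum(x * poisson_pmf(x, lam) for x in range(terms))
    second = sum(x * x * poisson_pmf(x, lam) for x in range(terms))
    return mean, second, second - mean ** 2

mean, second, variance = moments(2.5)
print(mean, second, variance)
```

For λ = 2.5 the three values should come out close to λ, λ + λ² and λ respectively.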
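Example 4.23 If X ~ Po(2), find (a) P(X = 4), (b) P(X ≥ 3).
Solution 4.23 (a) $P(X = 4) = e^{-2}\dfrac{2^4}{4!} = 0.0902$ (3 S.F.)
(b) $P(X \ge 3) = 1 - [P(X = 0) + P(X = 1) + P(X = 2)]$
$= 1 - e^{-2}\left(1 + 2 + \dfrac{2^2}{2!}\right)$
$= 1 - 5e^{-2}$
$= 0.323$ (3 S.F.)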
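A short computational check of this kind of calculation, for X ~ Po(2), is sketched below. The helper names `poisson_pmf` and `poisson_cdf` are illustrative assumptions; the complement rule is used for the upper tail.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Po(lam)."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

p_equals_4 = poisson_pmf(4, 2)           # P(X = 4)
p_at_least_3 = 1 - poisson_cdf(2, 2)     # P(X >= 3) = 1 - P(X <= 2)
print(round(p_equals_4, 4), round(p_at_least_3, 3))
```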
Exercise 4h
1. If X ~ Po(3.5), find (a) P(X = 0), (b) P(X = 1), (c) P(X = 2), (d) P(X = 8), (e) P(X < 8), (f) P(X ≥ 4).
2. If X ~ Po(1.8), find (a) P(X = 6), (b) P(X = 8), (c) P(X < 2), (d) P(X > 4).
3. If X ~ Po(2.4) and F(x) is the cumulative distribution, find (a) F(0), (b) F(1), (c) F(2), (d) F(8).
4. X ~ Po(λ) and P(X = 0) = 0.2019. Find (a) λ, (b) P(X < 4).
5. Find the first 5 terms of the Poisson distribution if (a) λ = 0.5, (b) λ = 2.8, (c) λ = 3.6.
6. If X ~ Po(λ) and E(X²) = 6, find (a) λ, (b) P(X = 2).
7. The random variable X follows a Poisson distribution with standard deviation 2. Find P(X ≤ 8).
Example 4.24 The mean number of bacteria per millilitre of a liquid is known to be 4. Assuming that the number of bacteria follows a Poisson distribution, find the probability that, in 1 ml of liquid, there will be
(a) no bacteria, (b) 4 bacteria, (c) less than 3 bacteria.
Solution 4.24 Let X be the r.v. ‘the number of bacteria in 1 ml of liquid’. Then X ~ Po(4).
(a) $P(X = 0) = e^{-4} = 0.0183$ (3 S.F.)
The probability that there will be no bacteria in 1 ml of liquid is 0.0183 (3 S.F.).
(b) $P(X = 4) = e^{-4}\dfrac{4^4}{4!} = 0.195$ (3 S.F.)
The probability that there will be 4 bacteria in 1 ml of liquid is 0.195 (3 S.F.).
(c) $P(X < 3) = e^{-4}\left(1 + 4 + \dfrac{4^2}{2!}\right) = 13e^{-4} = 0.238$ (3 S.F.)
The probability that there will be less than 3 bacteria in 1 ml of liquid is 0.238 (3 S.F.).
UNIT INTERVAL
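Example 4.25 Using the data of Example 4.24, find the probability that
(a) in 3 ml of liquid there will be less than 2 bacteria,
(b) in ½ ml of liquid there will be more than 2 bacteria.
Solution 4.25 (a) In 1 ml the mean number of bacteria is 4, so in 3 ml the mean is 12. Let Y be the r.v. ‘the number of bacteria in 3 ml of liquid’.
So Y ~ Po(12) and $P(Y = y) = e^{-12}\dfrac{12^y}{y!}$, y = 0, 1, 2, ...
We require
$P(Y < 2) = P(Y = 0) + P(Y = 1) = e^{-12}(1 + 12) = 13e^{-12} = 0.0000799$ (3 S.F.)
(b) In ½ ml the mean number of bacteria is 2. Let R be the r.v. ‘the number of bacteria in ½ ml of liquid’, so R ~ Po(2).
We require
$P(R > 2) = 1 - [P(R = 0) + P(R = 1) + P(R = 2)]$
$= 1 - \left(e^{-2} + 2e^{-2} + \dfrac{2^2}{2!}e^{-2}\right)$
$= 1 - e^{-2}(5)$
$= 0.323$ (3 S.F.)
The probability that there are more than 2 bacteria in ½ ml of liquid is 0.323 (3 S.F.).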
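The idea of changing the unit interval can be sketched in code: if the mean is 4 per ml, the mean for a volume of v ml is taken to be 4v. The helper names and the variable `rate_per_ml` are illustrative assumptions; the second calculation uses a volume of 0.5 ml, which reproduces the value 0.323 quoted above.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def poisson_cdf(x, lam):
    """P(X <= x) for X ~ Po(lam)."""
    return sum(poisson_pmf(k, lam) for k in range(x + 1))

rate_per_ml = 4.0  # mean number of bacteria in 1 ml

# Less than 2 bacteria in 3 ml: mean is 3 * 4 = 12
p_less_than_2_in_3ml = poisson_cdf(1, rate_per_ml * 3)

# More than 2 bacteria in 0.5 ml: mean is 0.5 * 4 = 2
p_more_than_2_in_half_ml = 1 - poisson_cdf(2, rate_per_ml * 0.5)

print(p_less_than_2_in_3ml, round(p_more_than_2_in_half_ml, 3))
```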
Exercise 4i
1. A book containing 750 pages has 500 misprints. Assuming that the misprints occur at random, find the probability that a particular page contains (a) no misprints, (b) exactly 4 misprints, (c) more than 2 misprints.
2. An insurance company receives on average 2 claims per week from a certain factory. Assuming that the number of claims follows a Poisson distribution, find the probability that (a) it receives more than 3 claims in a given week, (b) it receives more than 2 claims in a given fortnight, (c) it receives no claims on a given day, assuming that the factory operates on a 5 day week.
3. Cars arrive at a petrol station at an average rate of 30 per hour. Assuming that the number of cars arriving at the petrol station follows a Poisson distribution, find the probability that
(a) no cars arrive during a particular 5 minute interval,
(b) more than 3 cars arrive during a 5 minute interval,
(c) more than 5 cars arrive in a 15 minute interval,
(d) in a period of half an hour, 10 cars arrive,
(e) less than 3 cars arrive during a 10 minute interval.
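Consider the binomial expansion of $\left(1 + \dfrac{x}{n}\right)^n$:
$\left(1 + \dfrac{x}{n}\right)^n = 1 + n\left(\dfrac{x}{n}\right) + \dfrac{n(n-1)}{2!}\left(\dfrac{x}{n}\right)^2 + \dfrac{n(n-1)(n-2)}{3!}\left(\dfrac{x}{n}\right)^3 + \dots$
$= 1 + x + \dfrac{x^2}{2!}\left(1 - \dfrac{1}{n}\right) + \dfrac{x^3}{3!}\left(1 - \dfrac{1}{n}\right)\left(1 - \dfrac{2}{n}\right) + \dots$
Now, as n → ∞, $\left(1 - \dfrac{1}{n}\right) \to 1$, and
$\left(1 + \dfrac{x}{n}\right)^n \to 1 + x + \dfrac{x^2}{2!} + \dfrac{x^3}{3!} + \dots = e^x$
Similarly $\lim_{n \to \infty}\left(1 - \dfrac{\lambda}{n}\right)^n = e^{-\lambda}$ (a)
Now let $p = \dfrac{\lambda}{n}$, so that $q = 1 - \dfrac{\lambda}{n}$. Then
$(q + p)^n = \left(1 - \dfrac{\lambda}{n}\right)^n + n\left(1 - \dfrac{\lambda}{n}\right)^{n-1}\left(\dfrac{\lambda}{n}\right) + \dfrac{n(n-1)}{2!}\left(1 - \dfrac{\lambda}{n}\right)^{n-2}\left(\dfrac{\lambda}{n}\right)^2 + \dots$
$= \left(1 - \dfrac{\lambda}{n}\right)^n + \lambda\dfrac{\left(1 - \frac{\lambda}{n}\right)^n}{\left(1 - \frac{\lambda}{n}\right)} + \dfrac{\lambda^2}{2!}\left(1 - \dfrac{1}{n}\right)\dfrac{\left(1 - \frac{\lambda}{n}\right)^n}{\left(1 - \frac{\lambda}{n}\right)^2} + \dots$
As n → ∞ we have $\left(1 - \dfrac{\lambda}{n}\right)^n \to e^{-\lambda}$ from (a) and $\dfrac{\lambda}{n} \to 0$, so the successive terms of $(q + p)^n$ tend to $e^{-\lambda}$, $\lambda e^{-\lambda}$, $\dfrac{\lambda^2}{2!}e^{-\lambda}$, ..., which are the Poisson probabilities.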
Example 4.26 A factory packs bolts in boxes of 500. The probability that a bolt
is defective is 0.002. Find the probability that a box contains 2
defective bolts.
Solution 4.26 Let X be the r.v. ‘the number of defective bolts in a box’.
This is a binomial situation, with n = 500, p = 0.002.
So X ~ Bin(500, 0.002)
$P(X = 2) = \dfrac{(500)(499)}{(2)(1)}(0.998)^{498}(0.002)^2 = 0.184$ (3 d.p.)
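The exact binomial probability for the box of 500 bolts can be compared with the Poisson approximation using mean np = 1. This is a minimal sketch; the function names are illustrative assumptions and only the standard library is used.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

n, p = 500, 0.002
exact = binomial_pmf(2, n, p)      # binomial probability of 2 defective bolts
approx = poisson_pmf(2, n * p)     # Poisson approximation with mean np = 1
print(round(exact, 3), round(approx, 3))
```

With n large and p small the two values agree to 3 decimal places here, which is the situation in which the approximation is normally used.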
Example 4.27 Find the probability that at least two double 6’s are obtained when two dice are thrown 90 times.
Solution 4.27 Let X be the r.v. ‘the number of double 6’s obtained when two dice are thrown 90 times’, then X ~ Bin(90, 1/36) and np = (90)(1/36) = 2.5.
Exercise 4j
1. If X ~ Bin(100, 0.03), use (a) the binomial distribution, (b) the Poisson distribution to evaluate (i) P(X = 0), (ii) P(X = 2), (iii) P(X = 4).
2. If X ~ Bin(200, 0.006), use the Poisson distribution to find (a) P(X < 3), (b) P(X > 5).
3. On average one in 200 cars breaks down on a certain stretch of road per day. Find the probability that on a certain day (a) none of a sample of 250 cars breaks down, (b) more than 2 of a sample of 300 cars break down.
4. The probability that a particular make of light bulb is faulty is 0.01. The light bulbs are packed in boxes of 100.
(a) Find the probability that in a certain box there are (i) no faulty light bulbs, (ii) 2 faulty light bulbs, (iii) more than 3 faulty light bulbs.
(b) A buyer accepts a consignment of 50 boxes if, when he chooses two boxes at random, he finds that they contain no more than two faulty light bulbs altogether. Find the probability that he accepts the consignment.
5. Eggs are packed in boxes of 500. On average, 0.8% of the eggs are found to be broken when the eggs are unpacked.
(a) Find the probability that in a box of 500 eggs (i) exactly 3 will be broken, (ii) less than 2 will be broken.
(b) A hypermarket unpacks 100 boxes of eggs. What is the probability that there will be exactly 4 boxes containing no broken eggs?
6. The probability that there is a flaw in a metre length of cloth is 0.02. Find the probability that there are more than three flaws in 175 m of cloth.
7. An aircraft has 116 seats. The airline has found, from long experience, that on average 2.5% of people with tickets for a particular flight do not arrive for that flight. If the airline sells 120 seats for a particular flight determine, using a suitable approximation, the probability that more than 116 people arrive for that flight. Determine also the probability that there are empty seats on the flight. (C)
8. A firm selling electrical components packs them in boxes of 60. On average 2% of the components are faulty. What is the chance of getting more than 2 defective components in a box? (SUJB)P
Example 4.28 If X ~ Po(2.4) find (a) P(X ≤ 6), (b) P(X ≥ 3),
(c) P(X < 8), (d) P(X > 7), (e) P(X = 4).
Solution 4.28 (a) P(X ≤ 6) = 0.9884 (directly from the tables).
Notice that for small values of λ the distribution is very skew, but it becomes more symmetrical as λ increases.
[Figure: probability distribution of X ~ Po(10), plotted for x = 0 to 20]
The mode is the value which is most likely to occur, that is the one with the highest probability. Consider X ~ Po(λ). Considering the diagrams, we see that
when λ = 1, there are two modes, 0 and 1
when λ = 1.6, the mode is 1
when λ = 2.2, the mode is 2
when λ = 3.8, the mode is 3.
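The modes listed above can be found by direct search over the probabilities. The sketch below is illustrative: `poisson_modes` and its search range are assumptions, and ties are detected with a floating-point tolerance.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def poisson_modes(lam):
    """Return the value(s) of x with the highest probability for Po(lam)."""
    upper = int(lam) + 10  # assumed search range; the mode does not exceed lam
    probs = [poisson_pmf(x, lam) for x in range(upper + 1)]
    best = max(probs)
    # Keep every x whose probability ties with the maximum
    return [x for x, p in enumerate(probs) if math.isclose(p, best, rel_tol=1e-12)]

for lam in (1, 1.6, 2.2, 3.8):
    print(lam, poisson_modes(lam))
```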
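If X ~ Po(λ) then
$P(X = x) = e^{-\lambda}\dfrac{\lambda^x}{x!}$
so $P(X = x + 1) = e^{-\lambda}\dfrac{\lambda^{x+1}}{(x+1)!}$
Therefore
$\dfrac{P(X = x + 1)}{P(X = x)} = \dfrac{e^{-\lambda}\lambda^{x+1}\,x!}{(x+1)!\,e^{-\lambda}\lambda^x} = \dfrac{\lambda}{x + 1}$
So $P(X = x + 1) = \dfrac{\lambda}{x + 1}P(X = x)$
Writing $p_x$ for P(X = x), the recurrence formula is $p_{x+1} = \dfrac{\lambda}{x + 1}p_x$.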
Example 4.29 If X ~ Po(2.3), use the recurrence formula to find P(X ≥ 5).
Solution 4.29 $p_{x+1} = \dfrac{2.3}{x + 1}p_x$
Now $p_0 = e^{-2.3}$ (0.100 258 8)
$p_1 = \dfrac{2.3}{1}p_0$ (0.230 595 3)
$p_2 = \dfrac{2.3}{2}p_1$ (0.265 184 6)
$p_3 = \dfrac{2.3}{3}p_2$ (0.203 308 2)
$p_4 = \dfrac{2.3}{4}p_3$ (0.116 902 2)
We require $P(X \ge 5) = 1 - (p_0 + p_1 + p_2 + p_3 + p_4)$
= 1 − 0.916 249 2 (from memory store)
= 0.0838 (3 S.F.)
Therefore P(X ≥ 5) = 0.0838 (3 S.F.).
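The recurrence calculation can be sketched as a short loop that builds each probability from the previous one. The function name `poisson_terms` is an illustrative assumption; the loop applies the formula p(x+1) = λ/(x+1) × p(x) starting from p(0) = e^(−λ).

```python
import math

def poisson_terms(lam, count):
    """Return [p_0, ..., p_(count-1)] for Po(lam) using the recurrence formula."""
    terms = [math.exp(-lam)]                  # p_0 = e^(-lam)
    for x in range(count - 1):
        terms.append(terms[-1] * lam / (x + 1))  # p_(x+1) = lam / (x + 1) * p_x
    return terms

terms = poisson_terms(2.3, 5)                 # p_0 to p_4 for X ~ Po(2.3)
p_at_least_5 = 1 - sum(terms)                 # P(X >= 5)
print([round(t, 7) for t in terms], round(p_at_least_5, 4))
```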
The recurrence formula can be used to find the value of X which is
most likely to occur, i.e. the value of X with the highest probability.
Exercise 4k
1. Use (a) cumulative Poisson probability tables, (b) the recurrence formula, to find the first six terms of the following Poisson distributions. Sketch the distributions and verify that the mode m is the integer such that λ − 1 < m < λ.
(i) X ~ Po(1.8) (ii) X ~ Po(2.6) (iii) X ~ Po(4.5) (iv) X ~ Po(3.8).
Example 4.31 I recorded the number of phone calls I received over a period of 150 days:
[Table: observed frequencies of calls per day; values not recoverable]
Expected frequency = $150p_x$
$p_3 = \dfrac{1.04}{3}p_2 = 0.066\,2647$
$p_4 = \dfrac{1.04}{4}p_3 = 0.017\,2288$, expected frequency 3
Total 150
Exercise 4l
1. For each of the following sets of data, fit a theoretical Poisson distribution:
[Tables (a) and (b): frequency data not recoverable]
[...] yields on average £3 net per day, estimate the annual gain or loss of income (excluding the capital outlay) were the owner to have the room built. It may be assumed that during the rest of the year fewer than five rooms are let. (SUJB)
3. A firm investigated the number of employees suffering injuries whilst at work. The results recorded below were obtained for a 52-week period:
[Table: number of employees injured in a week against number of weeks; values not recoverable]
[...] Poisson distribution of mean m and show that its variance is also m.
Tests for defects are carried out in a textile factory on a lot comprising 400 pieces of cloth. The results of the tests are shown in Table A below.
Table A
Number of faults per piece: [values not recoverable]
Number of pieces: 92 142 96 46
Show that this is approximately a Poisson distribution and calculate the frequencies on this assumption.
How many pieces from a sample of 1000 pieces may be expected to have 4 or more faults? (AEB 1972)
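Proof
X ~ Po(m) so $P(X = x) = e^{-m}\dfrac{m^x}{x!}$, and Y ~ Po(n) so $P(Y = y) = e^{-n}\dfrac{n^y}{y!}$
and so on.
Now
$P(X + Y = 0) = P(X = 0) \cdot P(Y = 0)$
$= (e^{-m})(e^{-n})$
$= e^{-(m+n)}$
x + y: 0, 1, 2
P(X + Y = x + y): $e^{-(m+n)}$, $e^{-(m+n)}(m + n)$, $e^{-(m+n)}\dfrac{(m+n)^2}{2!}$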
Example 4.32 Two identical racing cars are being tested on a circuit. For each car,
the number of mechanical breakdowns follows a Poisson distribution
with a mean of one breakdown in 100 laps.
The first car does 20 laps and the second does 40 laps. What is the
probability that there will be (a) no breakdowns, (b) one break-
down, (c) more than two breakdowns altogether? Assume that
breakdowns are attended and the cars continue on the circuit.
Solution 4.32 Let X be the r.v. ‘the number of breakdowns for the first car’.
Then X ~ Po(0.2)
Let Y be the r.v. ‘the number of breakdowns for the second car’.
Then Y ~ Po(0.4)
Let T = X + Y. Then T ~ Po(0.2 + 0.4)
i.e. T ~ Po(0.6)
(a) $P(T = 0) = e^{-0.6} = 0.549$ (3 d.p.)
(b) $P(T = 1) = 0.6e^{-0.6} = 0.329$ (3 d.p.)
The probability that there will be one breakdown is 0.329 (3 d.p.).
(c) $P(T > 2) = 1 - [P(T = 0) + P(T = 1) + P(T = 2)] = 1 - e^{-0.6}\left(1 + 0.6 + \dfrac{0.6^2}{2!}\right) = 0.023$ (3 d.p.)
The probability that there will be more than two breakdowns is 0.023 (3 d.p.).
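The additive property used here, that the sum of independent Po(m) and Po(n) variables is Po(m + n), can be checked numerically by convolving the two distributions. The sketch below uses the means 0.2 and 0.4 from the racing car example; the function names are illustrative assumptions.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

def sum_pmf(t, m, n):
    """P(X + Y = t) for independent X ~ Po(m), Y ~ Po(n), by direct convolution."""
    return sum(poisson_pmf(x, m) * poisson_pmf(t - x, n) for x in range(t + 1))

# The convolution should match Po(0.6) term by term
for t in range(4):
    print(t, round(sum_pmf(t, 0.2, 0.4), 6), round(poisson_pmf(t, 0.6), 6))

p_none = sum_pmf(0, 0.2, 0.4)
p_one = sum_pmf(1, 0.2, 0.4)
p_more_than_two = 1 - sum(sum_pmf(t, 0.2, 0.4) for t in range(3))
```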
Example 4.33 The centre pages of the ‘Weekly Sentinel’ consists of 1 page of film
and theatre reviews and 1 page of classified advertisements. The
number of misprints in the reviews has a Poisson distribution with
mean 2.3 and the number of misprints in the classified section has a
Poisson distribution with mean 1.7.
(a) Find the probability that, on the centre pages, there will be
(i) no misprints, (ii) more than 5 misprints.
(b) Find the smallest integer n such that the probability that there
are more than n misprints on the centre pages is less than 0.1.
Solution 4.33 Let X be the r.v. ‘the number of misprints on the review page’.
Then X ~ Po(2.3)
Let Y be the r.v. ‘the number of misprints on the classified page’.
Then Y ~ Po(1.7)
Let T be the r.v. ‘the number of misprints on the centre pages’.
Therefore T = X + Y and T ~ Po(2.3 + 1.7)
i.e. T ~ Po(4)
P(T > 5) = 1 − P(T ≤ 5)
= 1 − 0.7851
= 0.215 (3 d.p.)
If tables are not available, consider
$p_0 = e^{-4}$ (0.018 315 6)
$p_1 = 4p_0$ (0.073 262 5)
$p_2 = \dfrac{4}{2}p_1$ (0.146 525 1)
$p_3 = \dfrac{4}{3}p_2$ (0.195 366 8)
$p_4 = \dfrac{4}{4}p_3$ (0.195 366 8)
$p_5 = \dfrac{4}{5}p_4$ (0.156 293 4)
Exercise 4m
1. Telephone calls reach a secretary independently and at random, internal ones at a mean rate of 2 in any 5 minute period, and external ones at a mean rate of 1 in any 5 minute period. Calculate the probability that there will be more than 2 calls in any period of 2 minutes. (O & C)
2. During a weekday, heavy lorries pass a census point P on a village high street independently and at random times. The mean rate for westward travelling lorries is 2 in any 30-minute period, and for eastward travelling lorries is 3 in any 30-minute period.
Find the probability
(a) that there will be no lorries passing P in a given 10-minute period,
(b) that at least one lorry from each direction will pass P in a given 10-minute period,
(c) that there will be exactly 4 lorries passing P in a given 20-minute period. (O & C)
3. A large number of screwdrivers from a trial production run is inspected. It is found that the cellulose acetate handles are defective on 1% and that the chrome steel blades are defective on 1½% of the screwdrivers, the defects occurring independently.
(a) What is the probability that a sample of 80 contains more than two defective screwdrivers?
(b) What is the probability that a sample of 80 contains at least one screwdriver with both a defective handle and a defective blade? (O & C)
4. A restaurant kitchen has 2 food mixers, A and B. The number of times per week that A breaks down has a Poisson distribution with mean 0.4, while independently the number of times that B breaks down in a week has a Poisson distribution with mean 0.1. Find, to 3 decimal places, the probability that in the next 3 weeks
(a) A will not break down at all,
(b) each mixer will break down exactly once,
(c) there will be a total of 2 breakdowns. (L)P
Example 4.34 Along a stretch of motorway, breakdowns requiring the summoning of the breakdown services occur with a frequency of 2.4 per day, on average. Assuming that the breakdowns occur randomly and that they follow a Poisson distribution, find
(a) the probability that there will be exactly 2 breakdowns on a given day,
(b) the smallest integer n such that the probability of more than n breakdowns in a day is less than 0.03.
Solution 4.34 (a) Let X be the r.v. ‘the number of breakdowns a day requiring the breakdown services’.
Then X ~ Po(2.4) and $P(X = x) = e^{-2.4}\dfrac{2.4^x}{x!}$, x = 0, 1, 2, ...
So $P(X = 2) = e^{-2.4}\dfrac{(2.4)^2}{2!} = 0.261$ (3 S.F.)
(b) Using the recurrence formula $p_{x+1} = \dfrac{2.4}{x + 1}p_x$ we find:
Probability | Cumulative probability
$p_0 = e^{-2.4}$ = 0.090 717 9 | F(0) = 0.090 717 9
$p_1 = \dfrac{2.4}{1}p_0$ = 0.217 723 0 | F(1) = 0.308 441
$p_2 = \dfrac{2.4}{2}p_1$ = 0.261 267 7 | F(2) = 0.569 708 7
$p_3 = \dfrac{2.4}{3}p_2$ = 0.209 014 1 | F(3) = 0.778 722 9
$p_4 = \dfrac{2.4}{4}p_3$ = 0.125 408 5 | F(4) = 0.904 131 4
$p_5 = \dfrac{2.4}{5}p_4$ = 0.060 196 0 | F(5) = 0.964 327 4
$p_6 = \dfrac{2.4}{6}p_5$ = 0.024 078 4 | F(6) = 0.988 405 9
By trial, $p_0 + p_1 + p_2 + p_3 + p_4 + p_5 + p_6 = 0.988\,405\,9$, so P(X > 6) = 0.0116 < 0.03, whereas P(X > 5) = 1 − 0.964 327 4 = 0.0357 > 0.03.
The smallest integer is n = 6.
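The trial-and-error search for the smallest n can be written as a loop that accumulates the cumulative probability with the recurrence formula. This is a sketch under the stated setting (mean 2.4, threshold 0.03); the function name `smallest_n` is an illustrative assumption.

```python
import math

def smallest_n(lam, threshold):
    """Smallest integer n with P(X > n) < threshold for X ~ Po(lam)."""
    p = math.exp(-lam)     # p_0
    cumulative = p         # F(0)
    n = 0
    while 1.0 - cumulative >= threshold:
        n += 1
        p *= lam / n       # recurrence: p_n = (lam / n) * p_(n-1)
        cumulative += p    # F(n) = F(n-1) + p_n
    return n

print(smallest_n(2.4, 0.03))
```

The same function can be applied to other means and thresholds, for example a mean of 4 with threshold 0.1.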
Example 4.35 [...] Prove that the mean of X is λ. Give two examples (other than that suggested below) of situations where you would expect a Poisson distribution to occur.
The number of white corpuscles on a slide has a Poisson distribution with mean 3.2. By considering the values of r for which [...]
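Solution 4.35 For the answer to the first part of the question, see page 243.
Let X be the r.v. ‘the number of white corpuscles on a slide’. Then X ~ Po(3.2).
So $P(X = r) = e^{-3.2}\dfrac{3.2^r}{r!}$
and $P(X = r + 1) = e^{-3.2}\dfrac{3.2^{r+1}}{(r+1)!}$
$\dfrac{P(X = r + 1)}{P(X = r)} = \dfrac{e^{-3.2}\,3.2^{r+1}\,r!}{e^{-3.2}\,3.2^r\,(r+1)!} = \dfrac{3.2}{r + 1}$
This ratio is greater than 1 when r + 1 < 3.2, i.e. for r = 0, 1, 2, so the probabilities increase up to r = 3 and decrease thereafter. The most likely number of white corpuscles is 3, and
$P(X = 3) = e^{-3.2}\dfrac{3.2^3}{3!} = 0.223$ (3 d.p.)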
Let $X_1$ be the r.v. ‘the number of white corpuscles on the first slide’. Then $X_1$ ~ Po(3.2).
Let $X_2$ be the r.v. ‘the number of white corpuscles on the second slide’. Then $X_2$ ~ Po(3.2).
Let $Y = X_1 + X_2$, then Y ~ Po(3.2 + 3.2)
i.e. Y ~ Po(6.4)
We require $P(Y \ge 2) = 1 - [P(Y = 0) + P(Y = 1)]$
$= 1 - (e^{-6.4} + 6.4e^{-6.4})$
$= 1 - 7.4e^{-6.4}$
$= 0.988$ (3 d.p.)
The probability of obtaining at least two white corpuscles in total on the two slides is 0.988 (3 d.p.).
We now show an alternative approach to the last part:
If two slides are prepared, we require P(total number of white corpuscles ≥ 2).
Now
P(total ≥ 2) = 1 − [P(total = 0) + P(total = 1)]
$= 1 - [P(X_1 = 0)P(X_2 = 0) + P(X_1 = 0)P(X_2 = 1) + P(X_1 = 1)P(X_2 = 0)]$
$= 1 - [(e^{-3.2})(e^{-3.2}) + e^{-3.2}(e^{-3.2}\,3.2) + (e^{-3.2}\,3.2)e^{-3.2}]$
$= 1 - 7.4e^{-6.4}$
$= 0.988$ (3 d.p.)
Example 4.36 Derive the mean and the variance of the Poisson distribution.
In a large town, one person in 80, on the average, has blood of type
X. If 200 blood donors are taken at random, find an approximation
to the probability that they include at least five persons having
blood of type X.
How many donors must be taken at random in order that the
probability of including at least one donor of type X shall be 0.9
or more? (AEB)
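Solution 4.36 We have already shown (p. 243-4) that if R is a r.v. such that R ~ Po(λ), then E(R) = λ and Var(R) = λ.
Let R be the r.v. ‘the number of donors in a sample of 200 having blood of type X’. Then R ~ Bin(200, 1/80), with np = 2.5, so using the Poisson approximation R ~ Po(2.5) and
$P(R \ge 5) = 1 - e^{-2.5}\left(1 + 2.5 + \dfrac{2.5^2}{2!} + \dfrac{2.5^3}{3!} + \dfrac{2.5^4}{4!}\right) = 0.109$ (3 d.p.)
The probability that the sample will contain at least five people having blood of type X is 0.109 (3 d.p.).
Now suppose that n donors are taken, so that the mean is n/80.
So R ~ Po(n/80)
We require P(R ≥ 1) ≥ 0.9, i.e. $1 - e^{-n/80} \ge 0.9$
$e^{-n/80} \le 0.1$
i.e. $e^{n/80} \ge 10$
So, taking logs to the base e,
$\dfrac{n}{80} \ge \ln(10)$
$\dfrac{n}{80} \ge 2.30$
n ≥ (80)(2.30)
n ≥ 184.2
So at least 185 donors must be taken. Check: with n = 185,
$P(R \ge 1) = 1 - e^{-185/80}$
= 0.901 (3 d.p.)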
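Both parts of the blood donor calculation can be reproduced numerically. The sketch below assumes the Poisson approximation with mean n/80 throughout; the variable names are illustrative, and the minimum sample size is obtained by rounding 80 ln(10) up to the next integer.

```python
import math

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return math.exp(-lam) * lam ** x / math.factorial(x)

p_type = 1 / 80                      # proportion of people with blood of type X

# At least five of 200 donors: Poisson approximation with mean 200/80 = 2.5
mean_200 = 200 * p_type
p_at_least_5 = 1 - sum(poisson_pmf(r, mean_200) for r in range(5))

# Smallest n with 1 - e^(-n/80) >= 0.9, i.e. n >= 80 ln(10)
target = 0.9
min_donors = math.ceil(-math.log(1 - target) / p_type)
p_check = 1 - math.exp(-min_donors * p_type)

print(round(p_at_least_5, 3), min_donors, round(p_check, 3))
```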
Example 4.37 In the Growmore Market Garden plants are inspected for the
presence of the deadly red angus leaf bug. The number of bugs per
leaf is known to follow a Poisson distribution with mean one. What
is the probability that any one leaf on a given plant will have been
attacked (at least one bug is found on it)?
A random sample of twelve plants is taken. For each plant ten leaves
are selected at random and inspected for these bugs. If more than
eight leaves on any particular plant have been attacked then the
plant is destroyed. What is the probability that exactly two of these
twelve plants are destroyed? (AEB 1977)
Solution 4.37 Let X be the r.v. ‘the number of bugs per leaf’.
Then X ~ Po(1) and P(leaf attacked) = P(X ≥ 1) = 1 − P(X = 0) = $1 - e^{-1}$ = 0.632 (3 d.p.)
Now let Y be the r.v. ‘the number of leaves that have been attacked on a plant’.
Then Y ~ Bin(10, 0.632) since n = 10, and the probability that a leaf has been attacked is 0.632.
We have
$P(Y = y) = {}^{10}C_y(0.368)^{10-y}(0.632)^y$, y = 0, 1, 2, ..., 10
We require
$P(Y > 8) = P(Y = 9) + P(Y = 10)$
$= 10(0.368)(0.632)^9 + (0.632)^{10}$
$= 0.069$ (3 d.p.)
The probability that any one plant is destroyed is 0.069 (3 d.p.).
Now let R be the r.v. ‘the number of plants that are destroyed’.
Then R ~ Bin(12, 0.069) since 12 plants are inspected, and the probability that a plant is destroyed is 0.069.
$P(R = r) = {}^{12}C_r(0.931)^{12-r}(0.069)^r$
$P(R = 2) = \dfrac{(12)(11)}{(2)(1)}(0.931)^{10}(0.069)^2$
= 0.154 (3 d.p.)
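The three-stage calculation (Poisson for a leaf, binomial for a plant, binomial for the sample of plants) can be sketched as follows. The names are illustrative assumptions. The worked solution rounds the intermediate probability to 0.069 before the last step, so the sketch computes the final answer both with that rounding and without it.

```python
import math

def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return math.comb(n, k) * p ** k * (1 - p) ** (n - k)

# Stage 1: a leaf is attacked if it carries at least one bug, X ~ Po(1)
p_leaf = 1 - math.exp(-1.0)

# Stage 2: a plant is destroyed if more than 8 of its 10 leaves are attacked
p_plant = binomial_pmf(9, 10, p_leaf) + binomial_pmf(10, 10, p_leaf)

# Stage 3: exactly 2 of 12 plants destroyed
p_two_rounded = binomial_pmf(2, 12, round(p_plant, 3))  # intermediate value rounded to 3 d.p.
p_two_exact = binomial_pmf(2, 12, p_plant)              # no intermediate rounding

print(round(p_leaf, 3), round(p_plant, 3), round(p_two_rounded, 3), round(p_two_exact, 3))
```

Carrying full precision through the stages changes the final answer slightly in the third decimal place, which is worth bearing in mind when comparing with a rounded working.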
E(X) = λ
Var(X) = λ
Recurrence formula: $p_{x+1} = \dfrac{\lambda}{x + 1}p_x$
If X ~ Po(m) and Y ~ Po(n) then X + Y ~ Po(m + n)
(X, Y independent)
Miscellaneous Exercise 4n
1. Lemons are packed in boxes, each box containing 200. It is found that, on average, 0.45% of the lemons are bad when the boxes are opened. Use the Poisson distribution to find the probabilities of 0, 1, 2, and more than 2 bad lemons in a box.
A buyer who is considering buying a consignment of several hundred boxes checks the quality of the consignment by having a box opened. If the box opened contains no bad lemons he buys the consignment. If it contains more than 2 bad lemons he refuses to buy, and if it contains 1 or 2 bad lemons he has another box opened and buys the consignment if the second box contains fewer than 2 bad lemons. What is the probability that he buys the consignment?
Another buyer checks consignments on a different basis. He has one box opened; if that box contains more than 1 bad lemon he asks for another to be opened and does not buy if the second also contains more than 1 bad lemon. What is the probability that he refuses to buy the consignment? (SUJB)

2. A manufacturer produces an integrated electronic unit which contains 36 separate pressure sensors. Due to difficulties in manufacture, it happens very often that not all the sensors in a unit are operational. 100 units are tested and the numbers N of pressure sensors which function correctly are distributed according to Table A below.
Calculate the mean number of sensors which are faulty.
The manufacturer only markets those units which have at least 32 of their 36 sensors operational. Estimate, using the Poisson distribution, the percentage of units produced which are not marketed. (O & C)
Table A
N: 36 35 34 33 32 31 30 29 28 <28
Number of units: [values not recoverable]

3. Show that, for the Poisson distribution in which the probabilities of 0, 1, 2, ... successes are $e^{-m}$, $me^{-m}$, $\dfrac{m^2e^{-m}}{2!}$, ..., the mean number of successes is equal to m. State the variance.
A sales manager receives 6 telephone calls on average between 9.30 a.m. and 10.30 a.m. on a weekday. Find the probability that
(a) he will receive 2 or more calls between 9.30 and 10.30 on a certain weekday;
(b) he will receive exactly 2 calls between 9.30 and 9.40;
(c) during a normal 5-day working week, there will be exactly 3 days on which he will receive no calls between 9.30 and 9.40. (SUJB)

4. X is a random variable having a Poisson distribution given by
$f(x) = e^{-m}m^x/x!$, x = 0, 1, 2, ...
[...] f is the number of minutes with x calls per minute.
By consideration of the mean and variance of this distribution show that a possible model is a Poisson distribution.
Using the calculated mean and on the assumption of a Poisson distribution calculate (a) the probability that two or more calls were received during any one minute, (b) the probability that no calls were received during any two consecutive minutes. (SUJB)

5. Customers enter an antique shop independently of one another and at random intervals of time at an average rate of four per hour throughout the five days of a week on which the shop is open. The owner has a coffee-break of fifteen minutes each morning; if one or more customers arrive during this period then his coffee goes cold, otherwise he drinks it while it is hot.
Let X be the random variable denoting the number of customers arriving during a Monday coffee-break, and let Y be the random variable denoting the number of days during a week on which the owner’s coffee goes cold. Assuming that X has a Poisson distribution, determine (correct to three significant figures) (a) P(X = 9), (b) P(X > 2), (c) E(Y), (d) P(Y = 2). (C)

6. A hire company has two electric lawnmowers which it hires out by the day. The number of demands per day for a lawnmower has the form of a Poisson distribution with mean 1.50. In a period of 100 working days, how many times do you expect
(a) neither of the lawnmowers to be used,
(b) some requests for the lawnmowers to have to be refused?
If each lawnmower is to be used an equal amount, on how many days in a period of 100 working days would you expect a particular lawnmower not to be in use? (MEI)

7. [...] Calculate the mean number of plants per plot.
Assuming that the plants were scattered randomly with this same mean number of plants per plot, find
(a) the probability of a given plot containing no plants;
(b) the probability of at least three plots being found which contain no plants.
What conclusions, if any, can be drawn from the observed number of plots in which the experimenter found no plants?
(You may assume that $e^{-2.3} \approx 0.1$) (SMP)

8. Derive the Poisson distribution as the limiting form of the Binomial distribution when n becomes very large and p becomes very small in such a way that np remains constant. Write down the mean and the variance of this distribution.
The mean number of bacteria per millilitre of a liquid is known to be 3. Ten samples of the liquid, chosen at random and each of volume 1 ml, are examined. Assuming the Poisson distribution is applicable, obtain expressions for the probabilities
(a) that each of the ten samples contains at least one bacterium,
(b) that exactly eight of the samples contain at least one bacterium.
If 3 ml of the liquid is examined show that it is rather improbable that it will contain fewer than 3 or more than 15 bacteria. (MEI)

9. A discrete random variable, X, can take values 0, 1, ... and has a Poisson distribution such that the probability that X = r is $e^{-m}m^r/r!$. Prove that the mean of X is m.
During each working day in a certain factory a number of accidents occur independently according to a Poisson distribution with mean 0.5.
Calculate the probability that
(a) during any one day there are 2 or more accidents,
(b) during two consecutive days there are exactly three accidents altogether.
Out of 50 consecutive five-day weeks how many would you expect to be accident-free?
Give two further situations where you would expect a Poisson distribution to apply. (SUJB)

10. Prove that for the Poisson distribution in which the probability of r successes is
$\dfrac{e^{-m}m^r}{r!}$ (r ≥ 0)
the expected number of successes is equal to m.
The telephone exchange inside an office building has a number of outside lines of which, on average, 3 are being used at any instant. Assuming that the number of lines in use at any instant follows a Poisson distribution, find
(a) the probability that, at any given instant, not more than 3 lines are in use,
(b) the minimum number of outside lines required if there is to be a probability of more than 0.9 that, at any given instant, at least one of the lines is not being used. (C)

11. The monthly demand for a certain magazine at a small newsagent’s shop has a Poisson distribution with mean 3. The newsagent always orders 4 copies of the magazine for sale each month; any demand for the magazine in excess of 4 is not met.
(a) Calculate the probability that the newsagent will not be able to meet the demand in a given month.
(b) Find the most probable number of magazines sold in one month.
(c) Find the expected number of magazines sold in one month.
(d) Determine the least number of copies of the magazine that the newsagent should order each month so as to meet the demand with a probability of at least 0.95. (JMB)

12. A random sample of 500 people born in 1961 is being studied. It can be assumed that birthdays are uniformly distributed throughout the year.
(a) Use the Poisson distribution to find, to 3 decimal places, the probabilities that there are (i) exactly two people, and (ii) no more than two people, with birthdays on 1 January.
(b) Also find, to 3 decimal places, the probability that, if two of the sample are chosen at random, they have birthdays in the same month. (In 1961 there were 7 months with 31 days, 4 months with 30 days and 1 month with 28 days.) (MEI)

13. Define the Poisson distribution and derive its mean. State the circumstances under which it is appropriate to use the Poisson distribution as an approximation to the binomial distribution.
A lottery has a very large number of tickets, one in every 500 of which entitles the purchaser to a prize. An agent sells 1000 tickets for the lottery. Using the Poisson distribution, find, to three decimal places, the probabilities that the number of prize-winning tickets sold by the agent is (a) less than three, (b) more than five.
Calculate the minimum number of tickets the agent must sell to have a 95% chance of selling at least one prize-winning ticket. (JMB)

14. Define the Poisson distribution and derive its mean and variance.
The number of telephone calls received at a switchboard in any time interval of length T minutes has a Poisson distribution with mean ½T. The operator leaves the switchboard unattended for five minutes. Calculate to three decimal places the probabilities that there are (a) no calls, (b) four or more calls in her absence.
Find to three significant figures the maximum length of time in seconds for which the operator could be absent with a 95% probability of not missing a call. (JMB)

15. Define the Poisson distribution and derive its mean and variance.
In the first year of the life of a certain type of machine, the number of times a maintenance engineer is required has a Poisson distribution with mean four. Find the probability that more than four calls are necessary.
The first call is free of charge and subsequent calls cost £20 each. Find the mean cost of maintenance in the first year. (JMB)

16. The number of oil tankers arriving at a port between successive high tides has a Poisson distribution with mean 2. The depth of the water is such that loaded vessels can enter the dock area only on the high tide. The port has dock space for only three tankers, which are discharged and leave the dock area before the next tide. Only the first three loaded tankers waiting at any high tide go into the dock area; any others must await another high tide.
Starting from an evening high tide after which no ships remain waiting their turn, find (to three decimal places) the probabilities that after the next morning’s high tide (a) the three dock berths remain empty, (b) the three berths are all filled.
Find (to two decimal places) the probability that no tankers are left waiting outside the dock area after the following evening’s high tide. (JMB)

17. The random variable X has a Poisson distribution with parameter λ.
(a) Prove that E(X) = λ.
(b) If P(X = k) = P(X = k + 1), where k is some integer, show that λ must also be an integer.
(c) If λ is not an integer, show that the mode, m, of the distribution is such that λ − 1 < m < λ.
In the manufacture of commercial carpet, small faults occur at random in the carpet at an average rate of 0.95 per 20 m². Find the probability that in a randomly selected 20 m² area of this carpet
(d) there are no faults,
(e) there are at most 2 faults.
The ground floor of a new office block has 10 rooms. Each room has an area of 80 m² and has been carpeted using the same commercial carpet described above. For any one of these rooms, determine the probability that the carpet in that room
(f) contains at least 2 faults,
(g) contains exactly 3 faults,
(h) contains at most 5 faults.
Find the probability that in exactly half of these 10 rooms the carpets will contain exactly 3 faults. (AEB 1988)

18. A randomly chosen doctor in general practice sees, on average, one case of a broken nose per year and each case is independent of other similar cases.
(a) Regarding a month as a twelfth part of a year,
(i) show that the probability that, between them, three such doctors see no cases of a broken nose in a period of one month is 0.779, correct to three significant figures,
(ii) find the variance of the number of cases seen by three such doctors in a period of six months.
(b) Find the probability that, between them, three such doctors see at least three cases in one year.
(c) Find the probability that, of three such doctors, one sees three cases and the other two see no cases in one year. (C)

19. State, giving your reasons, the distribution which you would expect to be appropriate in describing
(a) the number of heads in 10 throws of a penny,
(b) the number of blemishes per m² of sheet metal.
A building has an automatic telephone exchange. The number X of wrong connections in any one day is a Poisson variable with parameter λ. Find, in terms of λ, the probability that in any one day there will be
(c) exactly 3 wrong connections,
(d) 3 or more wrong connections.
Evaluate, to 3 decimal places, these probabilities when λ = 0.5. Find, to 3 decimal places, the largest value of λ for the probability of one or more wrong connections in any day to be at most 6. (L)
PROBABILITY DISTRIBUTIONS II - CONTINUOUS RANDOM VARIABLES
A continuous random variable (r.v.) is a theoretical representation
of a continuous variable such as height, mass or time.
i.e. $\int_a^b f(x)\,dx = 1$
The area under the curve y = f(x) between x = a and x = b is 1.
(ii) If $a \le x_1 < x_2 \le b$ then
$P(x_1 \le X \le x_2) = \int_{x_1}^{x_2} f(x)\,dx$
NOTE: in an experimental approach, the area under the histogram
represents frequency. In a theoretical approach, the area under the
curve y = f(x) represents probability.
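Example 5.1 A continuous r.v. has p.d.f. f(x) where f(x) = kx, 0 ≤ x ≤ 4.
(a) Find the value of the constant k,
(b) sketch y = f(x),
(c) find P(1 ≤ X ≤ 2.5).
(a) Since the total area under the curve is 1,
$$\int_0^4 kx\,dx = 1$$
$$k\left[\frac{x^2}{2}\right]_0^4 = 1$$
$$8k = 1$$
$$k = \frac{1}{8}$$
Therefore $f(x) = \frac{1}{8}x$, 0 ≤ x ≤ 4.
(b) The graph of y = f(x) is the straight line from (0, 0) to (4, 1/2).
(c)
$$P(1 \le X \le 2.5) = \int_1^{2.5} \frac{x}{8}\,dx = \left[\frac{x^2}{16}\right]_1^{2.5} = \frac{6.25 - 1}{16} = 0.328 \text{ (3 S.F.)}$$
Therefore P(1 ≤ X ≤ 2.5) = 0.328 (3 S.F.).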
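The arithmetic in Example 5.1 can be checked exactly. The sketch below is only an illustration: the helper name `integrate_kx` is hypothetical, and it relies on the closed form of the integral of kx rather than on any numerical library.

```python
from fractions import Fraction

def integrate_kx(lower, upper, k):
    """Exact integral of k*x from lower to upper: k*(upper^2 - lower^2)/2."""
    return k * (upper**2 - lower**2) / 2

# The total area under f(x) = kx on [0, 4] must equal 1, so k = 1 / (area with k = 1).
k = 1 / integrate_kx(Fraction(0), Fraction(4), Fraction(1))

# P(1 <= X <= 2.5) is the area under f between 1 and 2.5.
prob = integrate_kx(Fraction(1), Fraction(5, 2), k)

print(k)            # value of the constant k
print(float(prob))  # probability as a decimal
```

Working in `Fraction` avoids rounding, so the result can be compared directly with the 3 S.F. answer in the text.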
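$$f(x) = \begin{cases} kx & 0 \le x \le 2 \\ k(4 - x) & 2 \le x \le 4 \\ 0 & \text{otherwise} \end{cases}$$
Therefore
$$\int_0^2 kx\,dx + \int_2^4 k(4 - x)\,dx = 1$$
$$k\left[\frac{x^2}{2}\right]_0^2 + k\left[4x - \frac{x^2}{2}\right]_2^4 = 1$$
$$2k + k\{16 - 8 - (8 - 2)\} = 1$$
$$4k = 1$$
$$k = \frac{1}{4}$$
So the p.d.f. for X is given by
$$f(x) = \begin{cases} \frac{1}{4}x & 0 \le x \le 2 \\ \frac{1}{4}(4 - x) & 2 \le x \le 4 \\ 0 & \text{otherwise} \end{cases}$$
$$P(\tfrac{1}{2} \le X \le 2.5) = \int_{1/2}^{2} \frac{x}{4}\,dx + \int_2^{2.5} \frac{1}{4}(4 - x)\,dx$$
$$= \left[\frac{x^2}{8}\right]_{1/2}^{2} + \frac{1}{4}\left[4x - \frac{x^2}{2}\right]_2^{2.5}$$
$$= \frac{15}{32} + \frac{7}{32} = \frac{11}{16}$$
Therefore $P(\tfrac{1}{2} \le X \le 2.5) = \frac{11}{16}$.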
_ Exercise 5a :
The following distributions will be used in 5. The continuous r.v. X has p.d.f. f(x) where
Exercises 5c and 5d and it will be useful to f(x) = kx®, O<x <c and P(X <3)= %.
refer to these answers. Start each answer on a Find the values of the constants c and k
fresh sheet of paper so that you can add to it and sketch y = f(x).
later.
z (6) The continuous r.v. X has p.d.f. f(x) where
1. The continuous r.v. X has p.d-f. f(x) where k Oma
f(x) = kx?, O<x <2. soca
(a) Find the value of the constant k. f(x) = \R(2x—3) 25x43
(b) Sketch y = f(x). 0 otherwise
(c) Find P(X 2 1). :
(d) Find P(0.5 <X<1.5). (a) Find the value of the constant k.
(b) Sketch y = f(x).
2. The continuous r.v. X has p.d.f. f(x) where (c) Find P(X <1).
f(x) =k, —2<x <3. (a) Find the value (d) Find P(X > 2.5).
of the constant k. (b) Sketch y = f(x). (e) Find P(l<X < 2.8).
(c) Find P(—1.6 < X S 2.1).
8. The continuous r.v. X has p.d.f. f(x) where The continuous r.v. X has p.d.f. f(x) where
f(x)
= k(4—x), 1<x <8. (a) Find the k(x+ 2)? —2<x<0
value of the constant k. (b) Sketch 2 ea
y = fix). (c) Find P(1.2<X < 2.4). [ees AR Ose515
: 0 otherwise
4. The continuous r.v. X has p.d.f. f(x) where
f(x) =k(x +2)’, O<x <2. (a) Find the (a) Find the value of the constant k.
value of the constant k. (b) Sketch (b) Sketch y = f(x).
= f(x). (c) Find P(O < X <1) and hence (c) Find P(-1<X <1).
find P(X > 1). (d) Find P(X > 1).
EXPECTATION
If X is a continuous r.v. with p.d.f. f(x), then
$$E(X) = \int_{\text{all } x} x f(x)\,dx$$
NOTE: E(X) is often denoted by μ and referred to as the mean of X.
Example 5.3 If X is a continuous r.v. with p.d.f. $f(x) = \frac{3x^2}{64}$, 0 ≤ x ≤ 4, find E(X).
$$E(X) = \int_0^4 x\left(\frac{3x^2}{64}\right)dx = \frac{3}{64}\left[\frac{x^4}{4}\right]_0^4 = \frac{3}{64}(64) = 3$$
Therefore E(X) = 3.
[ 8x(3—x)(e—5) dx
3
P= 15x? |
3 es 3
cour,
eee
ae See
Blow
Plo
-
Therefore E(X) = 4.
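As in the case of the discrete r.v. (see p. 181), the following results hold when X is a continuous r.v.
$$E(a) = \int_{\text{all } x} a f(x)\,dx = a\int_{\text{all } x} f(x)\,dx = a$$
$$E(aX) = \int_{\text{all } x} ax f(x)\,dx = a\int_{\text{all } x} x f(x)\,dx = aE(X)$$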
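Example 5.5 The continuous r.v. X has p.d.f. f(x) where $f(x) = \frac{1}{20}(x + 3)$, 0 ≤ x ≤ 4.
(a) Find E(X). (b) Verify that E(2X + 5) = 2E(X) + 5.
(c) Find E(X²). (d) Find E(X² + 2X − 8).
(a)
$$E(X) = \int_0^4 \frac{1}{20}x(x + 3)\,dx = \frac{1}{20}\int_0^4 (x^2 + 3x)\,dx = \frac{1}{20}\left[\frac{x^3}{3} + \frac{3x^2}{2}\right]_0^4 = \frac{1}{20}\left(\frac{64}{3} + 24\right) = \frac{34}{15}$$
Therefore $E(X) = \frac{34}{15}$.
(b)
$$E(2X + 5) = \int_0^4 \frac{1}{20}(2x + 5)(x + 3)\,dx = \frac{1}{20}\int_0^4 (2x^2 + 11x + 15)\,dx$$
$$= \frac{1}{20}\left[\frac{2x^3}{3} + \frac{11x^2}{2} + 15x\right]_0^4 = \frac{1}{20}\left(\frac{128}{3} + 88 + 60\right) = \frac{572}{60} = \frac{143}{15}$$
Therefore $E(2X + 5) = \frac{143}{15}$.
Now $2E(X) + 5 = 2\left(\frac{34}{15}\right) + 5 = \frac{143}{15}$
So E(2X + 5) = 2E(X) + 5.
(c)
$$E(X^2) = \frac{1}{20}\int_0^4 x^2(x + 3)\,dx = \frac{1}{20}\left[\frac{x^4}{4} + x^3\right]_0^4 = \frac{1}{20}(64 + 64) = \frac{32}{5}$$
Therefore $E(X^2) = \frac{32}{5}$.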
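The expectations in Example 5.5 are integrals of polynomials, so they can be verified exactly. The following is a minimal sketch, assuming polynomials are stored as coefficient lists (constant term first); the helper names `integrate_poly`, `poly_mul` and `expect` are hypothetical.

```python
from fractions import Fraction

def integrate_poly(coeffs, a, b):
    """Exact integral over [a, b] of sum(coeffs[i] * x**i)."""
    def antiderivative(x):
        return sum(Fraction(c) * x ** (i + 1) / (i + 1) for i, c in enumerate(coeffs))
    return antiderivative(Fraction(b)) - antiderivative(Fraction(a))

def poly_mul(p, q):
    """Product of two polynomials given as coefficient lists."""
    out = [Fraction(0)] * (len(p) + len(q) - 1)
    for i, pi in enumerate(p):
        for j, qj in enumerate(q):
            out[i + j] += Fraction(pi) * Fraction(qj)
    return out

# p.d.f. f(x) = (x + 3)/20 on [0, 4], i.e. 3/20 + (1/20)x
pdf = [Fraction(3, 20), Fraction(1, 20)]

def expect(g):
    """E(g(X)) for a polynomial g, as the integral of g(x) f(x) over [0, 4]."""
    return integrate_poly(poly_mul(g, pdf), 0, 4)

total_area = expect([1])      # should be 1 for a valid p.d.f.
mean = expect([0, 1])         # E(X)
e_2x_plus_5 = expect([5, 2])  # E(2X + 5)
e_x_squared = expect([0, 0, 1])  # E(X^2)

print(mean, e_2x_plus_5, 2 * mean + 5, e_x_squared)
```

The printed values let the linearity result E(2X + 5) = 2E(X) + 5 be confirmed side by side.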
we 8 x?de + | 6 2(2—x)dx
2
¥ pul ac
o [x], 8 Peal’
elas MtO. a ees 4}1
ia
6/5
15
14
Therefore E(X) = 2.
eed
"6-3
{3 dx +} ox (2—x) dx
"678
x4 1 6 x4 x> 2
4 tlie ei ee Hit
oe
mle
attSee
|e
AIH
AIR
A(R oe
oO
!lw
Sa
Exercise 5b
The continuous r.v. X has p.d.f. f(x) 4. The continuous r.v. X has p.d-f. f(x)
where f(x) = 3x?,0<x <1. (a) Find where f(x) = kx?, O<x <2. (a) Find
E(X). (b) Find E(X?). (c) Verify that the value of the constant k. (b) Find
E(3X—1) = 3E(X)—1. (d) Find uU=E(X). (c) Find E(8X). (d) Find
E(2X7+ 3X+ 3). E(X?—4X+ 8).
The continuous r.v. X has p.d.f. f(x) 5. The continuous r.v. X has p.d.f. f(x)
where f(x) = $x(2—<x), 0x 2. where
(a) Find E(X). (b) Find E(X?). 3 2SX < re,
VARIANCE
For a random variable X,
Var(X) = E(X − μ)² where μ = E(X)
As in the discrete case (see p. 183) the formula can be written:
Var(X) = E(X²) − E²(X)
or
Var(X) = E(X²) − μ²
As in the discrete case (see p. 186), the following results also hold when X is a continuous r.v.:
Var(a) = 0, Var(aX) = a²Var(X), Var(aX + b) = a²Var(X), where a and b are constants.
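Example 5.7 The continuous r.v. X has p.d.f. f(x) where $f(x) = \frac{1}{8}x$, 0 ≤ x ≤ 4.
Find (a) E(X), (b) E(X²), (c) Var(X), (d) the standard deviation σ of X, (e) Var(3X + 2).
(a)
$$E(X) = \int_0^4 \frac{1}{8}x^2\,dx = \frac{1}{8}\left[\frac{x^3}{3}\right]_0^4 = \frac{8}{3}$$
Therefore $E(X) = \frac{8}{3}$.
(b)
$$E(X^2) = \int_0^4 \frac{1}{8}x^3\,dx = \frac{1}{8}\left[\frac{x^4}{4}\right]_0^4 = 8$$
Therefore E(X²) = 8.
(c)
$$\text{Var}(X) = E(X^2) - E^2(X) = 8 - \left(\frac{8}{3}\right)^2 = \frac{8}{9}$$
Therefore $\text{Var}(X) = \frac{8}{9}$.
(d)
$$\sigma = \sqrt{\text{Var}(X)} = \sqrt{\frac{8}{9}} = \frac{2\sqrt{2}}{3}$$
Therefore $\sigma = \frac{2\sqrt{2}}{3}$.
(e)
$$\text{Var}(3X + 2) = 9\,\text{Var}(X) = 9\left(\frac{8}{9}\right) = 8$$
Therefore Var(3X + 2) = 8.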
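The results of Example 5.7 can also be checked numerically. The sketch below assumes a hand-written composite Simpson's rule (the function name `simpson` is hypothetical); it computes Var(3X + 2) directly from the definition instead of using the rule Var(aX + b) = a²Var(X), so the two can be compared.

```python
def simpson(f, a, b, n=1000):
    """Composite Simpson's rule on [a, b] with n (even) subintervals."""
    h = (b - a) / n
    total = f(a) + f(b)
    for i in range(1, n):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def pdf(x):
    """p.d.f. of Example 5.7: f(x) = x/8 on [0, 4]."""
    return x / 8

mean = simpson(lambda x: x * pdf(x), 0, 4)           # E(X)
e_x_squared = simpson(lambda x: x**2 * pdf(x), 0, 4)  # E(X^2)
variance = e_x_squared - mean**2                      # Var(X)

# Var(3X + 2) from the definition E[(Y - E(Y))^2] with Y = 3X + 2
mean_y = 3 * mean + 2
var_y = simpson(lambda x: (3 * x + 2 - mean_y) ** 2 * pdf(x), 0, 4)

print(mean, e_x_squared, variance, var_y)
```

Simpson's rule is exact for cubics, and every integrand here is a cubic or lower, so the only error is floating-point rounding.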
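Example 5.8 The continuous random variable X has p.d.f. given by f(x) where
$$f(x) = \begin{cases} \frac{1}{27}x^2 & 0 \le x \le 3 \\ \frac{1}{3} & 3 \le x \le 5 \\ 0 & \text{otherwise} \end{cases}$$
(a) Sketch y = f(x). (b) Find E(X). (c) Find E(X²). (d) Find the standard deviation σ of X.
(b)
$$E(X) = \int_0^3 \frac{x^3}{27}\,dx + \int_3^5 \frac{x}{3}\,dx = \frac{1}{27}\left[\frac{x^4}{4}\right]_0^3 + \frac{1}{3}\left[\frac{x^2}{2}\right]_3^5 = \frac{3}{4} + \frac{8}{3} = \frac{41}{12}$$
Therefore $E(X) = \frac{41}{12}$.
(c)
$$E(X^2) = \int_0^3 \frac{x^4}{27}\,dx + \int_3^5 \frac{x^2}{3}\,dx = \frac{1}{27}\left[\frac{x^5}{5}\right]_0^3 + \frac{1}{3}\left[\frac{x^3}{3}\right]_3^5 = \frac{9}{5} + \frac{98}{9} = \frac{571}{45}$$
Therefore $E(X^2) = \frac{571}{45}$.
(d)
$$\sigma^2 = E(X^2) - E^2(X) = \frac{571}{45} - \left(\frac{41}{12}\right)^2$$
Therefore σ = 1.008 (3 d.p.)
The standard deviation of X is 1.008 (3 d.p.)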
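Example 5.9 The continuous r.v. X has p.d.f. f(x) where $f(x) = \frac{3}{4}(1 + x^2)$, 0 ≤ x ≤ 1. If E(X) = μ and Var(X) = σ², find P(|X − μ| < σ).
Solution 5.9 From the sketch of y = f(x) we see that there is no symmetry.
$$E(X) = \int_0^1 \frac{3}{4}x(1 + x^2)\,dx = \frac{3}{4}\int_0^1 (x + x^3)\,dx = \frac{3}{4}\left[\frac{x^2}{2} + \frac{x^4}{4}\right]_0^1 = \frac{9}{16}$$
Therefore $\mu = \frac{9}{16} = 0.5625$.
$$E(X^2) = \int_0^1 x^2 f(x)\,dx = \frac{3}{4}\int_0^1 (x^2 + x^4)\,dx = \frac{3}{4}\left[\frac{x^3}{3} + \frac{x^5}{5}\right]_0^1 = \frac{2}{5}$$
So $\text{Var}(X) = \frac{2}{5} - \left(\frac{9}{16}\right)^2 = 0.0836$ (3 S.F.)
and $\sigma = \sqrt{\text{Var}(X)} = 0.289$ (3 S.F.)
We require
$$P(|X - \mu| < \sigma) = P(0.2735 < X < 0.8515) = \int_{0.2735}^{0.8515} \frac{3}{4}(1 + x^2)\,dx$$
$$= \frac{3}{4}\left[x + \frac{x^3}{3}\right]_{0.2735}^{0.8515} = \frac{3}{4}\left\{0.8515 + \frac{(0.8515)^3}{3} - 0.2735 - \frac{(0.2735)^3}{3}\right\} = 0.583 \text{ (3 S.F.)}$$
Therefore P(|X − μ| < σ) = 0.583 (3 S.F.).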
THE MODE
The mode is the value of X for which f(x) is greatest, in the given
range of X. It is usually necessary to draw a sketch of y = f(x) and
this will give an idea of the location of the mode.
For some probability density functions it is possible to determine
the mode by finding the maximum point on the curve y = f(x)
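from the relationship f'(x) = 0, where $f'(x) = \frac{d}{dx}f(x)$.
Example 5.10 The continuous r.v. X has p.d.f. f(x) where $f(x) = \frac{3}{80}(2 + x)(4 - x)$, 0 ≤ x ≤ 4. (a) Sketch y = f(x). (b) Find the mode.
(b) $f(x) = \frac{3}{80}(2 + x)(4 - x) = \frac{3}{80}(8 + 2x - x^2)$
$f'(x) = \frac{3}{80}(2 - 2x)$
Now f'(x) = 0 when 2 − 2x = 0
x = 1
Since $f''(x) = -\frac{3}{40} < 0$, this is a maximum, so the mode is 1.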
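The mode in Example 5.10 can be located numerically by solving f'(x) = 0. This is a minimal sketch, assuming a simple bisection search (the helper `bisect_root` is hypothetical) and the derivative worked out in the text.

```python
def pdf(x):
    """p.d.f. of Example 5.10: f(x) = (3/80)(2 + x)(4 - x) on [0, 4]."""
    return 3 / 80 * (2 + x) * (4 - x)

def pdf_prime(x):
    """Derivative f'(x) = (3/80)(2 - 2x)."""
    return 3 / 80 * (2 - 2 * x)

def bisect_root(g, lo, hi, tol=1e-12):
    """Find a root of g in [lo, hi], assuming g changes sign on the interval."""
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if g(lo) * g(mid) <= 0:
            hi = mid   # sign change (or root) lies in [lo, mid]
        else:
            lo = mid   # sign change lies in [mid, hi]
    return (lo + hi) / 2

mode = bisect_root(pdf_prime, 0.0, 4.0)
print(mode, pdf(mode))
```

Bisection only finds a stationary point; comparing `pdf(mode)` with the values at the ends of the range confirms that it is the maximum.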
_ Exercise 5c
4. f(x) = (x +2)" 0O<x <2 Find (a) the value of the constant Re,
(b) E(X), (c) Var(X), (d) P(a@<X<13),
5. = 4x?
f(x) O<x<1 (e) the mode.
In the same way, if X is a continuous random variable with p.d.f. f(x) defined for a ≤ x ≤ b, then the cumulative distribution function is given by F(t) where
$$F(t) = P(X \le t) = \int_a^t f(x)\,dx$$
The median
The median splits the area under the curve y = f(x) into two halves.
So, if the value of the median is m,
$$\int_a^m f(x)\,dx = 0.5, \quad \text{i.e. } F(m) = 0.5$$
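Example 5.11 If X is a continuous r.v. with p.d.f. $f(x) = \frac{1}{8}x$, 0 ≤ x ≤ 4,
(a) find the cumulative distribution function F(x) and sketch y = F(x), (b) find the median m, (c) find P(0.3 ≤ X ≤ 1.8).
(a)
$$F(t) = \int_0^t \frac{x}{8}\,dx = \left[\frac{x^2}{16}\right]_0^t = \frac{t^2}{16}$$
So that $F(t) = \frac{t^2}{16}$, 0 ≤ t ≤ 4
NOTE: (1) $F(4) = \frac{4^2}{16} = 1$ (as expected).
(2)
$$F(x) = \begin{cases} 0 & x \le 0 \\ \frac{x^2}{16} & 0 \le x \le 4 \\ 1 & x \ge 4 \end{cases}$$
[Sketch of y = F(x)]
(b) For the median m, F(m) = 0.5, so
$$\frac{m^2}{16} = 0.5, \quad m^2 = 8, \quad m = 2.83 \text{ (2 d.p.)}$$
The median m = 2.83 (2 d.p.).
(c) Now $F(1.8) = \frac{1.8^2}{16} = 0.2025$
and $F(0.3) = \frac{0.3^2}{16} = 0.005\,625$
Therefore P(0.3 ≤ X ≤ 1.8) = F(1.8) − F(0.3) = 0.197 (3 S.F.).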
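The cumulative distribution function of Example 5.11 can be written as a small function and used both for the median and for interval probabilities. This is only a sketch under the assumption that F(x) = x²/16 on 0 ≤ x ≤ 4, as derived in the text; the function name `cdf` is illustrative.

```python
import math

def cdf(t):
    """F(t) for the p.d.f. f(x) = x/8 on [0, 4]."""
    if t <= 0:
        return 0.0
    if t >= 4:
        return 1.0
    return t * t / 16

# The median m satisfies F(m) = 0.5, i.e. m^2 = 8.
median = math.sqrt(8)

# An interval probability is a difference of two values of F.
prob = cdf(1.8) - cdf(0.3)

print(round(median, 2), prob)
```

Once F is known, any probability of the form P(c ≤ X ≤ d) reduces to F(d) − F(c), with no further integration.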
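$$f(x) = \begin{cases} \frac{x}{3} & 0 \le x \le 2 \\ -\frac{2x}{3} + 2 & 2 \le x \le 3 \\ 0 & \text{otherwise} \end{cases}$$
(a) Sketch y = f(x). (b) Find the cumulative distribution function F(x). (c) Sketch y = F(x). (d) Find P(1 ≤ X ≤ 2.5). (e) Find the median, m.
But, as f(x) is given in two parts, we must find F(x) in two stages:
Consider 0 ≤ x ≤ 2
$$F(t) = \int_0^t \frac{x}{3}\,dx = \left[\frac{x^2}{6}\right]_0^t = \frac{t^2}{6}$$
So, for 0 ≤ x ≤ 2, $F(x) = \frac{x^2}{6}$.
NOTE: $F(2) = \frac{4}{6} = \frac{2}{3}$.
Consider 2 ≤ x ≤ 3
F(t) = F(2) + (area under the curve $y = -\frac{2x}{3} + 2$ between 2 and t)
So
$$F(t) = F(2) + \int_2^t \left(-\frac{2x}{3} + 2\right)dx = \frac{2}{3} + \left[-\frac{x^2}{3} + 2x\right]_2^t$$
$$= \frac{2}{3} + \left(-\frac{t^2}{3} + 2t\right) - \left(-\frac{4}{3} + 4\right) = -\frac{t^2}{3} + 2t - 2, \quad 2 \le t \le 3$$
$$F(x) = \begin{cases} \frac{x^2}{6} & 0 \le x \le 2 \\ -\frac{x^2}{3} + 2x - 2 & 2 \le x \le 3 \\ 1 & x \ge 3 \end{cases}$$
(d)
$$F(2.5) = -\frac{(2.5)^2}{3} + 2(2.5) - 2 = \frac{11}{12}$$
$$F(1) = \frac{1}{6} \quad \text{(as 1 is in the range } 0 \le x \le 2)$$
Therefore P(1 ≤ X ≤ 2.5) = F(2.5) − F(1) = $\frac{11}{12} - \frac{1}{6}$ = 0.75
So P(1 ≤ X ≤ 2.5) = 0.75.
(e) Since $F(2) = \frac{2}{3} > 0.5$, the median lies in the range 0 ≤ x ≤ 2.
So
$$\frac{m^2}{6} = 0.5, \quad m^2 = 3, \quad m = 1.73 \text{ (2 d.p.)}$$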
Example 5.13 A continuous r.v. X takes values in the range 0 ≤ x ≤ 1 and has p.d.f.
$$f(x) = \begin{cases} 3.75x + 0.1 & 0 \le x \le 0.4 \\ 1.6 & 0.4 \le x \le 0.6 \\ 3.85 - 3.75x & 0.6 \le x \le 1 \end{cases}$$
(a) Sketch y = f(x). (b) Find the mean μ. (c) Find the cumulative distribution function, F(x) and sketch y = F(x). (d) Find P(|X − μ| ≤ 0.2).
(c) For 0 ≤ t ≤ 0.4
$$F(t) = \int_0^t (3.75x + 0.1)\,dx = \left[3.75\frac{x^2}{2} + 0.1x\right]_0^t = 1.875t^2 + 0.1t$$
Now F(0.4) = (1.875)(0.4)² + (0.1)(0.4) = 0.34
For 0.4 ≤ t ≤ 0.6
$$F(t) = F(0.4) + [1.6x]_{0.4}^t = 0.34 + 1.6t - 0.64 = 1.6t - 0.3$$
Now F(0.6) = (1.6)(0.6) − 0.3 = 0.66
For 0.6 ≤ t ≤ 1
$$F(t) = F(0.6) + \int_{0.6}^t (3.85 - 3.75x)\,dx = F(0.6) + \left[3.85x - 3.75\frac{x^2}{2}\right]_{0.6}^t$$
$$= 0.66 + 3.85t - 1.875t^2 - 1.635 = 3.85t - 1.875t^2 - 0.975$$
[Sketch of y = F(x).]
Exercise 5d © SS ee
ee
For each of the following probability density 0, (c) the cumulative distribution function
functions of the continuous r.v. X, find F(x), (d) the median, m.
(a) the cumulative distribution function F(x),
and for questions 1, 2, 5 and 7 find also 10. The continuous r.v. X has continuous
(b) the median, m. p.d.f. f(x) where
4. fix) =%(xt2) O0<x<2 Find (a) wand B, (b) F(x) and sketch
y = F(x), (ec) P(2<X S3.5),
Se f(x) = 4x? O0<x<1 (d) P(X > 5.5), (e) E(X), (f) Var(X).
1
6. 4;
f(x) -| 0O<x<2
SxeS 11. The continuous r.v. X has probability
density function given by
4(2x 3) 25553
k
a(x+2)?> —2<x<0 in) = for 1<x <Q,
0 otherwise,
2 0<x<13
where k is a constant. Giving your
8. The continuous r.v. X has p.d.f. f(x) = 4, answers correct to three significant
0<x <3. Find (a) E(X), (b) Var(X), figures where appropriate, find
(c) F(x) and sketch y = F(x), (a) the value of k, and find also the
(d) P(X > 1.8), (e) P(1.1<X<1.7). median value of X,
(b) the mean and variance of X,
9. X isthe continuousr.v. with p.d.f. f(x) = kx?, (c) the cumulative distribution function,
1<x <2. Find (a) the constant k and F, of X, and sketch the graph of
sketch y = f(x), (b) the standard deviation y = F(x): (C)
12. The continuous r.v. X has probability places, the median and the interquartile
density function f given by range of the distribution (L)P
k(4—x") for OS x <2. 14. Define the probability density function
Oj 0 otherwise, f(x) and the distribution function F(x)
where F is a constant. Show that k = % of a continuous random variable X.
and find the values of E(X) and Var(X). A factory is supplied with flour at the
Find the cumulative distribution function beginning of each week. The weekly
of X, and verify by calculation that the demand, X thousand tonnes, for flour
median value of X is between 0.69 and from this factory is a continuous random
0.70. variable having the probability density
Find also P(0.69 < X <0.70), giving function
your answer correct to one significant fix) = k(1—x)*, O<x<1,
figure. (C) f(x) = 0, elsewhere.
Find
13. A continuous random variable X has (a) the value of k,
probability density function, f, defined (b) the mean value of X,
by (c) the variance of X, to 3 decimal
f(x) ll 5, 0<x <1, places.
NOTE: the gradient of the F(x) curve gives the value of f(x).
[Sketch of y = F(x): where the curve is shallow the gradient is small and the value of f is small; where the curve is steep the gradient is large and the value of f is large.]
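Example 5.14 The continuous r.v. X has cumulative distribution function F(x) where
$$F(x) = \begin{cases} 0 & x \le 0 \\ \frac{x^3}{27} & 0 \le x \le 3 \\ 1 & x \ge 3 \end{cases}$$
Solution 5.14
$$f(x) = \frac{d}{dx}F(x) = \frac{d}{dx}\left(\frac{x^3}{27}\right) = \frac{x^2}{9}$$
The p.d.f. for X is f(x) where
$$f(x) = \begin{cases} \frac{x^2}{9} & 0 \le x \le 3 \\ 0 & \text{otherwise} \end{cases}$$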
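Example 5.15 The continuous r.v. X has cumulative distribution function F(x) where
$$F(x) = \begin{cases} 0 & x \le -2 \\ \frac{1}{12}(2 + x) & -2 \le x \le 0 \\ \frac{1}{6}(1 + x) & 0 \le x \le 4 \\ \frac{1}{12}(6 + x) & 4 \le x \le 6 \\ 1 & x \ge 6 \end{cases}$$
So, for −2 ≤ x ≤ 0, $f(x) = \frac{d}{dx}\frac{1}{12}(2 + x) = \frac{1}{12}$
for 0 ≤ x ≤ 4, $f(x) = \frac{d}{dx}\frac{1}{6}(1 + x) = \frac{1}{6}$
for 4 ≤ x ≤ 6, $f(x) = \frac{d}{dx}\frac{1}{12}(6 + x) = \frac{1}{12}$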
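Example 5.16 The cumulative distribution function F(x) for a continuous r.v. X is defined as
$$F(x) = \begin{cases} 0 & x \le 0 \\ \frac{1}{2}x - \frac{1}{8}x^2 & 0 \le x \le 1 \\ a + \frac{1}{4}x & 1 \le x \le 2 \\ b + \frac{1}{8}x^2 - \frac{1}{4}x & 2 \le x \le 3 \\ 1 & x \ge 3 \end{cases}$$
(a) Find a and b and sketch F(x). (b) Find and sketch the p.d.f. f(x). (c) Find the mean μ and the standard deviation σ.
(a) F(3) = 1
Therefore $b + \frac{9}{8} - \frac{3}{4} = 1$
$$b = \frac{5}{8}$$
Therefore $F(2) = \frac{5}{8} + \frac{4}{8} - \frac{2}{4} = \frac{5}{8}$
So we have $a + \frac{1}{2} = \frac{5}{8}$
Therefore $a = \frac{1}{8}$
$$F(x) = \begin{cases} 0 & x \le 0 \\ \frac{1}{2}x - \frac{1}{8}x^2 & 0 \le x \le 1 \\ \frac{1}{8} + \frac{1}{4}x & 1 \le x \le 2 \\ \frac{5}{8} + \frac{1}{8}x^2 - \frac{1}{4}x & 2 \le x \le 3 \\ 1 & x \ge 3 \end{cases}$$
[Sketch of y = F(x).]
(b) $f(x) = \frac{d}{dx}F(x)$
Therefore
$$f(x) = \begin{cases} \frac{1}{2} - \frac{1}{4}x & 0 \le x \le 1 \\ \frac{1}{4} & 1 \le x \le 2 \\ \frac{1}{4}x - \frac{1}{4} & 2 \le x \le 3 \\ 0 & \text{otherwise} \end{cases}$$
[Sketch of y = f(x).]
(c) By symmetry, μ = E(X) = 1.5.
$$E(X^2) = \int_{\text{all } x} x^2 f(x)\,dx = \int_0^1 \left(\frac{x^2}{2} - \frac{x^3}{4}\right)dx + \int_1^2 \frac{x^2}{4}\,dx + \int_2^3 \left(\frac{x^3}{4} - \frac{x^2}{4}\right)dx$$
$$= \frac{5}{48} + \frac{7}{12} + \frac{119}{48} = \frac{19}{6}$$
$$\text{Var}(X) = \frac{19}{6} - \left(\frac{3}{2}\right)^2 = \frac{11}{12}$$
$$\sigma = \sqrt{\frac{11}{12}} = 0.957 \text{ (3 S.F.)}$$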
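Example 5.17 A continuous random variable X takes values in the interval 0 to 3.
It is given that P(X > x) = a + bx³, 0 ≤ x ≤ 3.
(a) Find the values of the constants a and b.
(b) Find the cumulative distribution function F(x).
(a) P(X > 0) = 1, so a = 1. P(X > 3) = 0, so 1 + 27b = 0 and $b = -\frac{1}{27}$.
(b) Now
$$F(x) = P(X \le x) = 1 - P(X > x) = 1 - \left(1 - \frac{x^3}{27}\right) = \frac{x^3}{27}$$
i.e.
$$F(x) = \begin{cases} 0 & x \le 0 \\ \frac{x^3}{27} & 0 \le x \le 3 \\ 1 & x \ge 3 \end{cases}$$
(c)
$$f(x) = \frac{d}{dx}F(x) = \frac{x^2}{9}$$
Therefore the p.d.f. of X is f(x) where $f(x) = \frac{x^2}{9}$, 0 ≤ x ≤ 3.
$$E(X) = \int_0^3 \frac{x^3}{9}\,dx = \left[\frac{x^4}{36}\right]_0^3 = 2.25$$
$$\text{Var}(X) = \int_0^3 \frac{x^4}{9}\,dx - (2.25)^2 = \left[\frac{x^5}{45}\right]_0^3 - 5.0625 = 0.3375$$
So $\sigma = \sqrt{0.3375} = 0.581$ (3 S.F.)
The standard deviation of X is 0.581 (3 S.F.).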
Exercise 5e
Determine
with F(x) = 0 for x <—1, and F(x) =1
(a) the value of a, for x >'3..
(a) Sketch the graph of the probability
(b) the frequency function f(x) of X,
density function f(x).
(c) the expected value py of XxX,
standard deviation 0 of X, , (b) Determine the expectation of X and
(d) the
the variance of X.
(e) the probability that |X— p| exceeds 5. (C)
(C) (c) Determine P(3 < 2X < 5).
7.
THE RECTANGULAR DISTRIBUTION
A continuous r.v. X having p.d.f. f(x) where $f(x) = \frac{1}{b - a}$, a ≤ x ≤ b, has a rectangular (uniform) distribution, written X ~ R(a, b).
[,,feo a =|
P(1.8<X<1.9) lI So e ee
ole
ae
P(5.1<X<5.2) ll o ry
ole
eg
eal
i
Example 5.18 The rounding error made when measuring the lengths of metal rods
to the nearest 5 mm is a random variable E. What is the distribution
of E?
Solution 5.18 The error is the difference between the true length and the recorded
length after rounding to the nearest 5mm.
So, the error, E, could be anywhere in the interval — 2.5 < E < 2.5.
All points in this interval are equally likely ‘stopping places’ for E,
so E is uniformly distributed in the interval.
Example 5.19 A child spins a ‘Spinning Jenny’ at a fair. When the wheel stops, the
shorter distance of an arrow measured along the circumference
from the child is denoted by C. What is the distribution of C?
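If X ~ R(a, b) then
$$E(X) = \tfrac{1}{2}(a + b)$$
$$\text{Var}(X) = \tfrac{1}{12}(b - a)^2$$
The graph of y = f(x) is as shown.
$$E(X^2) = \int_a^b \frac{x^2}{b - a}\,dx = \frac{b^3 - a^3}{3(b - a)} = \frac{(b - a)(b^2 + ab + a^2)}{3(b - a)} = \frac{b^2 + ab + a^2}{3}$$
So
$$\text{Var}(X) = E(X^2) - E^2(X) = \frac{b^2 + ab + a^2}{3} - \frac{a^2 + 2ab + b^2}{4}$$
$$= \frac{1}{12}\{4(b^2 + ab + a^2) - 3(a^2 + 2ab + b^2)\} = \frac{1}{12}(b^2 - 2ab + a^2) = \frac{1}{12}(b - a)^2$$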
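The mean and variance formulae for X ~ R(a, b) can be checked exactly for particular end points. The sketch below is illustrative only: the function name `uniform_moments` is hypothetical, and the choice a = 3, b = 6 is simply a convenient test case.

```python
from fractions import Fraction

def uniform_moments(a, b):
    """Mean and variance of X ~ R(a, b), computed from the integrals of x f(x) and x^2 f(x)."""
    a, b = Fraction(a), Fraction(b)
    mean = (b**2 - a**2) / (2 * (b - a))          # integral of x/(b - a) over [a, b]
    e_x_squared = (b**3 - a**3) / (3 * (b - a))   # integral of x^2/(b - a) over [a, b]
    return mean, e_x_squared - mean**2

mean, variance = uniform_moments(3, 6)
print(mean, variance)
```

Comparing the output with (a + b)/2 and (b − a)²/12 for the same a and b confirms the algebra above for that case.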
Example 5.20 The random variable X has a rectangular distribution over the
Pee eieak 8). Find the probability density function of Y where
= XxX”.
1 = |jax
Now, we require some function of y, g(y)
say, such that
v2
De ey )Gy,
al
1 = [ay
ne yo ay
2 8y?
i.e. 1= ——a
0 8 of
Example 5.21 A rectangle, with one side of length x cm and perimeter 12cm, has
area A cm?. If X is uniformly distributed between 0 and 2, find the
probability density function of A.
Therefore
=| & 1
0 2
A
da
2/9—-a
_ Exercise 5f
aD If the continuous
X ~ R(3,6) find
r.v. X is such
(a) the p.df.
(b) E(X), (c) Var(X), (d) P(X> 5).
that
of X,
Show that the mean is (b+ a)/2, and the
variance is (b—a)?/ 12 for this distribu-
tion.
(0<A<16)
Find (a) the value of k, 4\/(25—A)
(b) P(2.1<X< 3.4), Find the mean and variance of A. (JMB)
(c) E(X), A child rides on a roundabout and his
(d) Var(X).
father waits for him at the point where
The length X of a side of a square is he started. His journey may be regarded
rectangularly distributed between 1 and as a circular route of radius six metres and
4. Find the probability density function the father’s position as a fixed point on
of A, the area of the square, and calculate the circle. When the roundabout stops,
the mean and variance of the area of the the shorter distance of the child from the
square. father, measured alone the circular path,
The radius of a circle follows a rectangular is S metres. All points on the circle are
distribution between 1 and 3. Find the equally likely stopping points, so that S
probability density function of A, the is uniformly distributed between 0 and
area of the circle and calculate the mean 67. Find the mean and variance of S.
and the variance of the area of the circle. The direct linear distance of the child’s
The random variable X has probability stopping point from the father is D
density function given by metres. Show that the probability density
2
1 function of D is. =. for D
(b—a)
aXSx<b where b>a m/(144— D?)
f(x) = between 0 and 12 and zero outside this
0 otherwise range.
The father’s voice can be heard at a Han5
distance of up to ten metres. Find to two
decimal places the probability that the
(v= fy
child can hear his father shout to him and state the range of corresponding
when the roundabout stops. (JMB) values for V. Obtain the mean and median
of V. (C)
10. The line y + 2x = R crosses the coordinate
-» The object distance U and the image axes Ox and Oy at P and Qrespectively.
distance V for a concave mirror are related Given that the area of AOPQ is A, show
to the focal distance f by the formula that A = k?/4.
pet A random variable takes values k such
that 0 <k S65 and is rectangularly dis-
Umar. i tributed in this interval.
U is a random variable uniformly distri- (a) Show that the expected value of A
buted over the interval (2f, 3f). Show that is 25/12.
V is distributed with probability density (b) Calculate the variance of A.
function (L Additional)
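$$\int_0^\infty \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_0^\infty = 1 \quad \left(\text{since } \lim_{x \to \infty} e^{-\lambda x} = 0\right)$$
Now, integrating by parts,
$$E(X) = \int_0^\infty x(\lambda e^{-\lambda x})\,dx = \left[-xe^{-\lambda x}\right]_0^\infty + \int_0^\infty e^{-\lambda x}\,dx = 0 + \left[-\frac{1}{\lambda}e^{-\lambda x}\right]_0^\infty = \frac{1}{\lambda}$$
Var(X) = E(X²) − E²(X), where, integrating by parts twice, $E(X^2) = \int_0^\infty x^2 \lambda e^{-\lambda x}\,dx = \frac{2}{\lambda^2}$, so
$$\text{Var}(X) = \frac{2}{\lambda^2} - \frac{1}{\lambda^2} = \frac{1}{\lambda^2}$$
Therefore $E(X) = \frac{1}{\lambda}$ and $\text{Var}(X) = \frac{1}{\lambda^2}$.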
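$$P(X > a) = e^{-\lambda a}$$
$$P(X > a + b \mid X > a) = P(X > b)$$
To show this, consider
$$P(X > a) = \int_a^\infty \lambda e^{-\lambda x}\,dx = \left[-e^{-\lambda x}\right]_a^\infty = e^{-\lambda a}$$
Also
$$P(X > a + b \mid X > a) = \frac{P(X > a + b \cap X > a)}{P(X > a)} = \frac{e^{-\lambda(a + b)}}{e^{-\lambda a}} = e^{-\lambda b} = P(X > b)$$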
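The 'no memory' result above can be illustrated numerically. This is a small sketch; the function name `survival` and the values of λ, a and b are assumptions chosen purely for illustration.

```python
import math

def survival(x, lam):
    """P(X > x) = e^(-lam * x) for an exponential r.v. with parameter lam."""
    return math.exp(-lam * x)

lam, a, b = 0.001, 1300, 200   # illustrative values only

# P(X > a + b | X > a) = P(X > a + b) / P(X > a)
conditional = survival(a + b, lam) / survival(a, lam)

print(conditional, survival(b, lam))   # the two values should agree
```

Changing a leaves the conditional probability unchanged, which is exactly what the algebraic argument shows.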
So | Ae adh 14
0
a [te
1 ] co 2 are
k ) \
Ape : ie
oe sincee *>Oasx7@
A=kR
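$$\left[-e^{-kt}\right]_0^2 = 0.371$$
$$-e^{-2k} + e^0 = 0.371$$
$$e^{-2k} = 0.629$$
So
$$e^{2k} = \frac{1}{0.629}$$
Taking logs to base e
$$2k = \ln\frac{1}{0.629} = 0.464 \text{ (3 S.F.)}$$
$$k = 0.232$$
(b) $E(T) = \frac{1}{k}$ (see p. 308)
$$= \frac{1}{0.232} = 4.3 \text{ (2 S.F.)}$$
Now
$$\text{Var}(T) = \frac{1}{k^2} = \frac{1}{(0.232)^2} = 19 \text{ (2 S.F.)}$$
So, putting k = 0.232, we have Var(T) = 19 years² (2 S.F.).
$$P(T < 1) = \int_0^1 ke^{-kt}\,dt = \left[-e^{-kt}\right]_0^1 = -e^{-k} + 1 = 1 - 0.793 = 0.207 \text{ (3 S.F.)}$$
and
$$P(T > 6) = 1 - P(T < 6) = 1 - \int_0^6 ke^{-kt}\,dt = 1 - \left[-e^{-kt}\right]_0^6 = 1 + e^{-6k} - 1 = 0.249 \text{ (3 S.F.)}$$
Therefore, if two tubes are bought,
$$P[(T_1 < 1) \cap (T_2 > 6)] + P[(T_2 < 1) \cap (T_1 > 6)] = 2(0.207)(0.249) = 0.103 \text{ (3 S.F.)}$$
Therefore the probability that one fails within its first year and the other lasts longer than 6 years is 0.103 (3 S.F.).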
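The numbers in this solution can be reproduced in a few lines. The sketch below assumes, from the working shown, that the lifetime T has p.d.f. ke^(−kt) and that the given condition is P(T < 2) = 0.371; the variable names are illustrative.

```python
import math

# From 1 - e^(-2k) = 0.371 we get e^(-2k) = 0.629, so 2k = ln(1/0.629).
k = math.log(1 / 0.629) / 2

mean_lifetime = 1 / k        # E(T) = 1/k
variance = 1 / k**2          # Var(T) = 1/k^2

p_fail_first_year = 1 - math.exp(-k)   # P(T < 1)
p_last_six_years = math.exp(-6 * k)    # P(T > 6)

# Either tube may be the one that fails early, hence the factor 2.
p_one_each = 2 * p_fail_first_year * p_last_six_years

print(round(k, 3), round(mean_lifetime, 1), round(variance), round(p_one_each, 3))
```

Keeping k unrounded until the final step avoids the small drift that comes from substituting k = 0.232 into later calculations.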
Example 5.23 The continuous random variable X has the negative exponential distribution whose probability density function is given by
f(x) = λe^{−λx}, x ≥ 0,
f(x) = 0, otherwise,
where λ is a positive constant. Obtain expressions, in terms of λ, for
(a) the mean, E(X), of the distribution,
(b) F(x), the (cumulative) distribution function.
Television sets are hired out by a rental company. The time in months, X, between major repairs has the above negative exponential distribution with λ = 0.05. Find, to 3 significant figures, the probability that a television set hired out by the company will not require a major repair for at least a 2-year period. Find also the median value of X.
The company agrees to replace any set for which the time between
major repairs is less than M months. Given that the company does
not want to have to replace more than one set in 5, fin (L)
$$F(x) = \int_0^x \lambda e^{-\lambda t}\,dt = \left[-e^{-\lambda t}\right]_0^x = -(e^{-\lambda x} - 1) = 1 - e^{-\lambda x}$$
Therefore F(x) = 1 − e^{−λx}, x ≥ 0.
$$P(X > 24) = 1 - F(24) = e^{-(0.05)(24)} = e^{-1.2} = 0.301 \text{ (3 S.F.)}$$
The probability that a television set will not need major repair in a 2-year period is 0.301 (3 S.F.).
For the median m, F(m) = 0.5, so
$$1 - e^{-0.05m} = 0.5$$
$$m = -\frac{\ln 0.5}{0.05} = 13.9 \text{ months (3 S.F.)}$$
The median is 13.9 months (3 S.F.).
We require P(X < M) ≤ 0.2
Therefore
$$1 - e^{-0.05M} \le 0.2$$
$$e^{-0.05M} \ge 0.8$$
$$-0.05M \ge \ln 0.8$$
$$M \le -\frac{\ln 0.8}{0.05}$$
$$M \le 4.46$$
Since M is an integer, M = 4.
The company agrees to replace any set for which the time between major repairs is less than 4 months.
Example 5.24 The lifetime of a particular type of lightbulb has a negative exponen-
tial distribution with mean lifetime 1000 hours.
(a) Find the probability that a bulb is still working after 1300
hours.
(b) Given that it is still working after 1300 hours, find the prob-
ability that it is still working after 1500 hours.
E(X) = 1000, therefore $\frac{1}{\lambda} = 1000$
λ = 0.001
(a) $P(X > x) = e^{-\lambda x}$
$$P(X > 1300) = e^{-0.001(1300)} = e^{-1.3} = 0.273 \text{ (3 S.F.)}$$
The probability that a bulb is still working after 1300 hours is 0.273 (3 S.F.).
(b) P(X > 1500 | X > 1300) = P(X > 200) (see p. 309)
$$= e^{-0.001(200)} = e^{-0.2} = 0.819 \text{ (3 S.F.)}$$
The probability that the bulb is still working after 1500 hours, given that it is still working after 1300 hours, is 0.819 (3 S.F.).
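LINK BETWEEN THE EXPONENTIAL DISTRIBUTION AND THE POISSON DISTRIBUTION
If events occur at random at an average rate of λ per unit time, the number of events in time t follows a Poisson distribution with mean λt. If T is the time between events, then P(T > t) = P(no event in time t) = e^{−λt}.
So P(T < t) = 1 − e^{−λt}
Therefore F(t) = 1 − e^{−λt} (cumulative distribution function)
Now $f(t) = \frac{d}{dt}F(t) = \lambda e^{-\lambda t}$ (probability density function)
This is the exponential distribution, with parameter λ.
NOTE: the parameter λ is the same value as the respective Poisson parameter, and the units of time are the same in both distributions.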
Example 5.25 On a stretch of road, breakdowns occur at an average rate of 2 per day, and the number of breakdowns follows a Poisson distribution.
Find
(a) the mean time between breakdowns,
(b) the median time between breakdowns.
Solution 5.25 Let T be the r.v. 'the time between breakdowns'. Then T follows an exponential distribution with parameter λ where λ = 2.
(a) $E(T) = \frac{1}{\lambda} = \frac{1}{2}$
Therefore the mean time between breakdowns is half a day (12 hours).
(b) For the median m,
$$\int_0^m 2e^{-2t}\,dt = 0.5$$
$$\left[-e^{-2t}\right]_0^m = 0.5$$
$$1 - e^{-2m} = 0.5$$
Therefore
$$e^{-2m} = 0.5$$
$$-2m = \ln 0.5 \quad \text{(taking logs to base e)}$$
$$m = -\tfrac{1}{2}\ln 0.5 = 0.3466 \text{ days (4 d.p.)} = 8 \text{ hours (approx.)}$$
Therefore the median time between breakdowns is approximately 8 hours.
--——s Exercise 5g
A continuous r.v. X has p.d.f. f(x) where other lasts for less than the mean number
f(x) = 5e *, x 20. Find (a) P(X > 0.5), of hours.
(b) E(X), (c) P(X < E(X)), (d) the (d) A random sample of6 lightbulbs is
standard deviation of X, (e) the median, chosen. Find the probability that exactly
(f) the mode. 4 will each last more than 2500 hours.
The lifetime T, in years, of articles Show that A =1/a and that the mean
and variance of T are a anda’ respec-
produced by a manufacturer can be
tively.
modelled by the probability density
function given by
(You may assume that | t"e—*/4 dt
fit) = ae @-t 20, 0
f(t) = 0, t<0. =a"+t1n! for integral values of n.)
1 The life in hours of a type of electric
Prove that the mean of T is a and its battery can be modelled by the above
Pee tine
median is ——. distribution and when a sample of 800
a is tested the mean life is found to be
The articles are produced at a unit cost 92.2h. What are the values of A anda
of £10 and sold for £25. Research shows based on this figure?
that 50% of those produced fail within (a) What is the probability that a battery
the first five years of life. Find the value will last for at least 200 h?
of a. (b) If a battery has lasted 200 h what is
the probability that it will last for at
After some time in business the manu-
least a further 100 h?
facturer decides to guarantee free replace-
(c) If two batteries are bought what is
ment of items which fail during their first
the probability that one fails before 200h
year, but at the same time he raises the and the other after 200 h? (SUJB)
price so that the increase covers the expec-
ted cost of providing the guarantee. What
should the new price be? The random variable X can take all values
If two items are purchased what is the between 0 and a inclusive, where a > 0.
probability that just one will be replaced Its probability density function f(x) is
under guarantee? (SUJB) zero for x <0 and x >a, and, for
0 <x Sa, satisfies
Describe the conditions under which it is f(x) = (A/a)exp(— x/a),
appropriate to use the Exponential Distri- where A is a positive constant. Show by
bution, supporting your answer with integration that A = 1.582 to 3 decimal
reference to an experiment you may have places.
carried out. Also use integration to find to 2 decimal
A major road construction project is places
underway. In the site supervisor’s office, (i) the probability that X is less than 3a;
there is an average of two telephone calls (ii) the number A for which there is a
every 5 minutes. Stating any assumptions probability 5 that X is less than Aa. (MEI)
you make, write down the probability
that in a period of ¢ minutes there is
(a) no telephone call, Explain briefly, from your projects if
(b) at least one telephone call. possible, a real-life situation that can be
Presenting a carefully reasoned argument, modelled by an exponential distribution.
give the cumulative distribution function, An archer shoots arrows at a target. The
F(t), for the length of time between tele- distance X cm from the centre of the
phone calls. Hence establish that the prob- target at which an arrow strikes the
ability density function, f(t), is target has probability density function,
iit = Uden ow peO f, defined by
1 co
E(X) fe x fx) dx
s x e & a u)?/207 ax
ov 2°
dhe 1
Now, let t = £ so that dt =—dx and whenx =~, f=©,
0
x = —00,
t =—o,
1 <2 102
So E(X) = ov 2m J—
| . (utot)e2? odt
II
oe at dt +=
al ern dt
n+ Aa
Ire ne
Therefore E(X) = u
Var(X) l| |f x? f(x)
dx—p?
f co. 402
I—p? where I= o/on |we + ot) ot)? eCe—7t o dt
1 Po, 12 co” :
dt+ auo| ted ae
a Belel et
Now
Pea at =| t(tet*yae
0+/27
2 1
Sak he eee 200
So I Car (u?/2m + Quo[—e- 2" |" +0?./27)
SA
p?+o? since e 7! >Oast> +0
Therefore Var(X) = p?+0?—-p?
‘Weconsider f(x) = 1 oe
&— 2,52
B20
20
’ L (x — p)
fx) => t-a a =) aya[ae
ov/ 2a Oo”
1 2 2
= aa (saa) ont a) /20
o°\/ 2a
Now
21 oO
pect. 2
a 1 e—@—H)"/20" [-
Seis +4
21 o
When x = yp, f"(x) <0.
There is a maximum value of f(x) when x = uy.
$$\int_0^6 Ax(6 - x)^2\,dx = 1$$
Therefore
$$1 = A\int_0^6 (36x - 12x^2 + x^3)\,dx = A\left[18x^2 - 4x^3 + \tfrac{1}{4}x^4\right]_0^6 = 108A$$
So $A = \frac{1}{108}$
Therefore $f(x) = \frac{1}{108}x(6 - x)^2$, 0 ≤ x ≤ 6
$$E(X) = \int_0^6 \frac{1}{108}x^2(6 - x)^2\,dx = \frac{1}{108}\int_0^6 (36x^2 - 12x^3 + x^4)\,dx = \frac{1}{108}\left[12x^3 - 3x^4 + \frac{x^5}{5}\right]_0^6 = 2.4$$
$$f(x) = \frac{1}{108}(36x - 12x^2 + x^3)$$
$$f'(x) = \frac{1}{108}(36 - 24x + 3x^2) = \frac{3}{108}(x - 2)(x - 6)$$
f'(x) = 0 when x = 2 and when x = 6
Consider $f''(x) = \frac{1}{108}(6x - 24)$.
When x = 2, f''(x) < 0 and when x = 6, f''(x) > 0.
So f(x) has a maximum when x = 2.
Var(X) = E(X²) − E²(X)
Now
$$E(X^2) = \int_{\text{all } x} x^2 f(x)\,dx = \frac{1}{108}\int_0^6 (36x^3 - 12x^4 + x^5)\,dx = \frac{1}{108}\left[9x^4 - \frac{12x^5}{5} + \frac{x^6}{6}\right]_0^6 = 7.2$$
Var(X) = 7.2 − (2.4)² = 1.44
Standard deviation of X = $\sqrt{\text{Var}(X)}$ = 1.2
Therefore the variance of X is 1.44 and the standard deviation is 1.2.
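$$\int_0^2 (ax - bx^2)\,dx = 1$$
$$\left[\frac{ax^2}{2} - \frac{bx^3}{3}\right]_0^2 = 1$$
$$2a - \frac{8b}{3} = 1$$
$$\frac{8b}{3} = 2a - 1 \qquad \text{(i)}$$
Now
$$E(X) = \int_{\text{all } x} x f(x)\,dx = \int_0^2 (ax^2 - bx^3)\,dx = \left[\frac{ax^3}{3} - \frac{bx^4}{4}\right]_0^2 = \frac{8a}{3} - 4b$$
But, we are given that E(X) = 1. So
$$\frac{8a}{3} - 4b = 1$$
$$4b = \frac{8a}{3} - 1$$
$$\frac{8b}{3} = \frac{16a}{9} - \frac{2}{3} \qquad \text{(ii)}$$
From equations (i) and (ii) we have
$$\frac{16a}{9} - \frac{2}{3} = 2a - 1$$
$$\frac{1}{3} = \frac{2a}{9}$$
$$a = \frac{3}{2}$$
Substituting for a in (i) we have $\frac{8b}{3} = 3 - 1$
$$b = \frac{3}{4}$$
(b)
$$E(X^2) = \int_{\text{all } x} x^2 f(x)\,dx = \int_0^2 \left(\frac{3}{2}x^3 - \frac{3}{4}x^4\right)dx = \left[\frac{3x^4}{8} - \frac{3x^5}{20}\right]_0^2 = 6 - 4.8 = 1.2$$
Var(X) = E(X²) − E²(X) = 1.2 − 1 = 0.2
(c)
$$F(x) = \int_0^x \left(\frac{3}{2}t - \frac{3}{4}t^2\right)dt = \frac{3}{4}x^2 - \frac{1}{4}x^3, \quad 0 \le x \le 2$$
and F(2) = 3 − 2 = 1 as required
(d)
$$P(X < \tfrac{1}{2}) = F(\tfrac{1}{2}) = \frac{3}{4}\left(\frac{1}{4}\right) - \frac{1}{4}\left(\frac{1}{8}\right) = \frac{5}{32}$$
Therefore $P(X \ge \tfrac{1}{2}) = \frac{27}{32}$
$$P(\text{at least one is less than } \tfrac{1}{2}) = 1 - \left(\frac{27}{32}\right)^2 = 0.288 \text{ (3 d.p.)}$$
Therefore if two independent observations are made on X, the probability that at least one of them is less than ½ is 0.288 (3 d.p.).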
Example 5.28 The time taken to perform a particular task, t hours, has the proba-
bility density function
10ct? 0<t<0.6
f(t) = | 9c(1—t) 0.6<t<1.0
0 otherwise.
where c is a constant.
(a) Find the value of c and sketch the graph of this distribution.
(b) Write down the most likely time.
(c) Find the expected time.
(d) Determine the probability that the time will be
(i) more than 48 minutes,
(ii) between 24 and 48 minutes.
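(a)
$$1 = \int_{\text{all } t} f(t)\,dt = 10c\int_0^{0.6} t^2\,dt + 9c\int_{0.6}^{1.0} (1 - t)\,dt$$
$$= 10c\left[\frac{t^3}{3}\right]_0^{0.6} + 9c\left[t - \frac{t^2}{2}\right]_{0.6}^{1.0} = 0.72c + 0.72c = 1.44c$$
Therefore $c = \frac{1}{1.44} = \frac{25}{36}$
We have
$$f(t) = \begin{cases} \frac{125}{18}t^2 & 0 \le t \le 0.6 \\ \frac{25}{4}(1 - t) & 0.6 \le t \le 1.0 \\ 0 & \text{otherwise} \end{cases}$$
[Sketch of y = f(t): the curve rises from 0 at t = 0 to its maximum at t = 0.6, then falls in a straight line to 0 at t = 1.0.]
(b) Therefore the mode is 0.6 hours = 36 mins.
(c)
$$E(T) = 10c\int_0^{0.6} t^3\,dt + 9c\int_{0.6}^{1.0} (t - t^2)\,dt = 10c\left[\frac{t^4}{4}\right]_0^{0.6} + 9c\left[\frac{t^2}{2} - \frac{t^3}{3}\right]_{0.6}^{1.0}$$
= 0.225 + 0.366...
= 0.591... hours
= 35.5 minutes
The expected time is 35.5 minutes.
(d) (i)
$$P(T > 0.8) = 9c\int_{0.8}^{1.0} (1 - t)\,dt = 9c\left[t - \frac{t^2}{2}\right]_{0.8}^{1.0} = 0.125$$
(ii) Now P(0.4 < T < 0.8) = 1 − P(T > 0.8) − P(T < 0.4)
$$P(T < 0.4) = 10c\int_0^{0.4} t^2\,dt = 10c\left[\frac{t^3}{3}\right]_0^{0.4} = 0.1481...$$
Therefore P(0.4 < T < 0.8) = 1 − 0.125 − 0.1481... = 0.727 (3 S.F.)
The probability that the time will be between 24 and 48 minutes is 0.727 (3 S.F.).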
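The probabilities in Example 5.28 follow from the cumulative distribution function, which can be built piece by piece in exact arithmetic. The sketch below is an illustration only; the function name `cdf` is hypothetical and the value c = 25/36 is taken from part (a).

```python
from fractions import Fraction

c = Fraction(25, 36)      # from 1.44c = 1
KNOT = Fraction(3, 5)     # t = 0.6, where the two parts of the p.d.f. join

def cdf(t):
    """F(t) for f(t) = 10c t^2 on [0, 0.6] and 9c(1 - t) on [0.6, 1]."""
    t = Fraction(t)
    if t <= 0:
        return Fraction(0)
    if t <= KNOT:
        return 10 * c * t**3 / 3
    if t <= 1:
        first_part = 10 * c * KNOT**3 / 3   # area under the first piece, F(0.6)
        return first_part + 9 * c * ((t - t**2 / 2) - (KNOT - KNOT**2 / 2))
    return Fraction(1)

p_more_than_48 = 1 - cdf(Fraction(4, 5))                  # P(T > 0.8)
p_less_than_24 = cdf(Fraction(2, 5))                      # P(T < 0.4)
p_between = cdf(Fraction(4, 5)) - cdf(Fraction(2, 5))     # P(0.4 < T < 0.8)

print(float(p_more_than_48), float(p_less_than_24), float(p_between))
```

Times are converted to hours first (48 minutes = 0.8 h, 24 minutes = 0.4 h), matching the units of the p.d.f.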
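For a continuous r.v. X with p.d.f. f(x), a ≤ x ≤ b:
$$\int_{\text{all } x} f(x)\,dx = 1$$
$$P(c \le X \le d) = \int_c^d f(x)\,dx, \quad a \le c < d \le b$$
$$E(X) = \int_{\text{all } x} x f(x)\,dx$$
$$\text{Var}(X) = \int_{\text{all } x} x^2 f(x)\,dx - E^2(X)$$
$$F(t) = \int_a^t f(x)\,dx, \quad a \le t \le b, \quad \text{where } F(t) \text{ is the cumulative distribution function}$$
$$f(x) = \frac{d}{dx}F(x)$$
The rectangular distribution
If X ~ R(a, b)
$$E(X) = \tfrac{1}{2}(a + b)$$
$$\text{Var}(X) = \tfrac{1}{12}(b - a)^2$$
The exponential distribution
If $f(x) = \lambda e^{-\lambda x}$, x ≥ 0
$$E(X) = \frac{1}{\lambda}$$
$$\text{Var}(X) = \frac{1}{\lambda^2}$$
The normal distribution
If $f(x) = \frac{1}{\sigma\sqrt{2\pi}}e^{-(x - \mu)^2/2\sigma^2}$, −∞ < x < ∞
then X ~ N(μ, σ²)
E(X) = μ
Var(X) = σ²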
Miscellaneous Exercise 5h —
X ~ N(μ, σ²)
[Sketch of y = f(x)]
The distribution is bell shaped and symmetrical about x = μ.
Approximately 95% of the distribution lies within ±2 standard deviations of the mean.
Approximately 99.7% of the distribution lies within ±3 standard deviations of the mean.
The range of the distribution is therefore approximately 6 standard deviations.
The maximum value of f(x) occurs when x = μ and is given by
$$f(\mu) = \frac{1}{\sigma\sqrt{2\pi}}$$
[Sketches of y = f(x) for two normal distributions.]
$$P(a < X < b) = \int_a^b \frac{1}{\sigma\sqrt{2\pi}}e^{-(x - \mu)^2/2\sigma^2}\,dx$$
However, this integral is very difficult to evaluate, so we work with the standard normal variable Z, where
$$Z = \frac{X - \mu}{\sigma}$$
Now, if the r.v. X has a normal distribution with mean μ and variance σ², then the r.v. Z has a standard normal distribution with mean 0 and variance 1,
i.e. if X ~ N(μ, σ²) and $Z = \frac{X - \mu}{\sigma}$
then Z ~ N(0, 1)
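Example 6.1 Show that E(Z) = 0 and Var(Z) = 1, where Z is the standard normal variable.
$$E(Z) = E\left(\frac{X - \mu}{\sigma}\right) = \frac{1}{\sigma}[E(X) - \mu] = \frac{1}{\sigma}(\mu - \mu) = 0$$
So E(Z) = 0
$$\text{Var}(Z) = \text{Var}\left(\frac{X - \mu}{\sigma}\right) = \frac{1}{\sigma^2}\text{Var}(X - \mu) = \frac{1}{\sigma^2}[\text{Var}(X) + 0] = \frac{\sigma^2}{\sigma^2} = 1$$
So Var(Z) = 1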
$$\Phi(z) = P(Z < z) = \int_{-\infty}^z \frac{1}{\sqrt{2\pi}}e^{-t^2/2}\,dt$$
[Sketches of the standard normal curve showing the areas Φ(z) and Q(z).]
In the main text we will refer to the tables giving Φ(z), the cumulative probabilities of the standard normal distribution. These are printed on p. 634.
However, you will find instructions for the use of the Q(z) tables in Appendix 2 on pp. 641-53. Q(z) tables are printed on p. 640.
@(a) P ye
0a a0
EM
0a —a 0
NOTE: We have
P(−1.4 < Z < −0.6) = Φ(1.4) − Φ(0.6)
= 0.9192 − 0.7257
= 0.1935
So P(−1.4 < Z < −0.6) = 0.1935.
(e)
P(Z > 0.863 or Z < −1.527) = 1 − (Φ(0.863) + Φ(1.527) − 1)
= 2 − Φ(0.863) − Φ(1.527)
= 2 − 0.8059 − 0.9365
= 0.2576
So P(Z > 0.863 or Z < −1.527) = 0.2576.
[Sketch: area 0.99 between z = −2.575 and z = 2.575.]
Therefore P(−2.575 < Z < 2.575) = 0.99.
Exercise 6a
1. If Z ~ N(0, 1), find (a) P(Z > 0.874), (b) P(Z < 0.874), (c) P(Z < −0.874),
(d) P(Z > −0.874).
2. If Z ~ N(0, 1), find (a) P(Z > 1.8), (b) P(Z < −0.65), (c) P(Z > −3.46),
(d) P(Z < 1.36), (e) P(Z > 2.58), (f) P(Z > −2.37), (g) P(Z < 1.86),
(h) P(Z < −0.725), (i) P(Z > 1.863), (j) P(Z < 1.63), (k) P(Z > −2.061),
(l) P(Z < −2.875).
3. If Z ~ N(0, 1), find (a) P(Z > 1.645), (b) P(Z < −1.645), (c) P(Z > 1.282),
(d) P(Z > 1.96), (e) P(Z > 2.575), (f) P(Z > 2.326), (g) P(Z > 2.808),
(h) P(Z < 1.96).
4. If Z ~ N(0, 1), find (a) P(0.829 < Z < 1.843), (b) P(−2.56 < Z < 0.134),
(c) P(−1.762 < Z < −0.246), (d) P(0 < Z < 1.73), (e) P(−2.05 < Z < 0),
(f) P(−3.08 < Z < 3.08), (g) P(1.764 < Z < 2.567), (h) P(−1.65 < Z < 1.725),
(i) P(−0.98 < Z < −0.16), (j) P(Z < −1.97 or Z > 2.5), (k) P(|Z| < 1.78),
(l) P(|Z| > 0.754), (m) P(−1.645 < Z < 1.645), (n) P(|Z| > 2.326).
Example 6.5 If Z ~ N(0, 1), find the value of a if (a) P(Z > a) = 0.3802,
(b) P(Z > a) = 0.7818, (c) P(Z < a) = 0.0793,
(d) P(Z < a) = 0.9698, (e) P(|Z| < a) = 0.9.
Exercise 6b
Example 6.6 The r.v. X ~ N(300, 25). Find (a) P(X > 305), (b) P(X < 291),
(c) P(X < 312), (d) P(X > 286).
Solution 6.6 (a) P(X > 305). X ~ N(300, 25), so s.d. = 5.
First we have to standardise the random variable X by subtracting the mean, 300,
and dividing by the standard deviation (s.d.), 5, so that Z = (X − 300)/5.
(b) P(X < 291) = P((X − 300)/5 < (291 − 300)/5)
= P(Z < −1.8)
= 1 − Φ(1.8)
= 1 − 0.9641
= 0.0359
Therefore P(X < 291) = 0.0359.
(c) P(X < 312) = P((X − 300)/5 < (312 − 300)/5)
= P(Z < 2.4)
= Φ(2.4)
= 0.9918
Therefore P(X < 312) = 0.9918.
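The standardising step in Example 6.6 can be sketched in code as follows. The helper name `standardise` is hypothetical and introduced only for illustration; `statistics.NormalDist` is part of the Python standard library and takes the standard deviation (not the variance) as its second argument.

```python
from statistics import NormalDist


def standardise(x: float, mu: float, sigma: float) -> float:
    """Return the standardised value z = (x - mu) / sigma."""
    return (x - mu) / sigma


Z = NormalDist()                   # standard normal
X = NormalDist(mu=300, sigma=5)    # X ~ N(300, 25), so s.d. = 5

z_b = standardise(291, 300, 5)     # part (b): z = -1.8
p_b = Z.cdf(z_b)                   # P(X < 291) = P(Z < -1.8)

z_c = standardise(312, 300, 5)     # part (c): z = 2.4
p_c = Z.cdf(z_c)                   # P(X < 312) = P(Z < 2.4)

# Working directly with X gives the same probabilities.
print(round(p_b, 4), round(X.cdf(291), 4))
print(round(p_c, 4), round(X.cdf(312), 4))
```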
Example 6.7 The r.v. X is such that X ~ N(50, 8). Find (a) P(48 < X < 54),
(b) P(52 < X < 55), (c) P(46 < X < 49), (d) P(|X − 50| < √8).
Solution 6.7 Standardise X so that Z = (X − 50)/√8.
(a) P(48 < X < 54) = P((48 − 50)/√8 < (X − 50)/√8 < (54 − 50)/√8)
= P(−0.707 < Z < 1.414)
= Φ(1.414) + Φ(0.707) − 1
= 0.9213 + 0.7601 − 1
= 0.6814
(b) P(52 < X < 55) = P((52 − 50)/√8 < (X − 50)/√8 < (55 − 50)/√8)
= P(0.707 < Z < 1.768)
= Φ(1.768) − Φ(0.707)
= 0.9615 − 0.7601
= 0.2014
(c) P(46 < X < 49) = P((46 − 50)/√8 < (X − 50)/√8 < (49 − 50)/√8)
= P(−1.414 < Z < −0.354)
= Φ(1.414) − Φ(0.354)
= 0.9213 − 0.6383
= 0.283
(d) P(|X − 50| < √8) = P(−√8 < X − 50 < √8)
= P(−1 < Z < 1)
= 2Φ(1) − 1
= 2(0.8413) − 1
= 0.6826
Therefore P(|X − 50| < √8) = 0.6826.
Example 6.8 The time taken by a milkman to deliver milk to the High Street is
normally distributed with mean 12 minutes and standard deviation
2 minutes. He delivers milk every day. Estimate the number of days
during the year when he takes (a) longer than 17 minutes, (b) less
than 10 minutes, (c) between 9 and 13 minutes.
Solution 6.8 Let X be the r.v. ‘the time taken to deliver the milk to the High
Street’. Then X ~ N(12, 2²).
We standardise X so that Z = (X − 12)/2.
(a) P(X > 17) = P((X − 12)/2 > (17 − 12)/2)
= P(Z > 2.5)
= 1 − Φ(2.5)
= 1 − 0.99379
= 0.00621
The number of days when he takes longer than 17 minutes
= 365(0.00621)
= 2.27
≈ 2
Therefore on approximately 2 days in the year he takes longer
than 17 minutes.
(b) P(X < 10) = P((X − 12)/2 < (10 − 12)/2)
= P(Z < −1)
= 1 − Φ(1)
= 1 − 0.8413
= 0.1587
The number of days when he takes less than 10 minutes = 365(0.1587) ≈ 58.
Therefore on approximately 58 days in the year he takes less
than 10 minutes.
(c) P(9 < X < 13) = P((9 − 12)/2 < (X − 12)/2 < (13 − 12)/2)
= P(−1.5 < Z < 0.5)
1. If X ~ N(300, 25), find (a) P(X > 308), (b) P(X > 311.5), (c) P(X > 294),
(d) P(X > 290.5), (e) P(X < 302), (f) P(X < 312), (g) P(X < 299.5).
6. If X ~ N(84, 12), find (a) P(80 < X < 89), (b) P(X < 79 or X > 92),
(c) P(76 < X < 82), (d) P(|X − 84| > 2.9), (e) P(87 < X < 98).
De-standardising
Sometimes it is necessary to find a value of X which corresponds to
a given standardised value z. Since z = (x − μ)/σ, we have x = μ + σz.
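De-standardising simply inverts the relation z = (x − μ)/σ. A minimal sketch follows, in which the function name `destandardise` is hypothetical; it reuses X ~ N(300, 25) from Example 6.6, where the standardised values −1.8 and 2.4 arose from x = 291 and x = 312.

```python
def destandardise(z: float, mu: float, sigma: float) -> float:
    """Return the value x = mu + sigma * z corresponding to a standardised value z."""
    return mu + sigma * z


# X ~ N(300, 25): mean 300, standard deviation 5
x_low = destandardise(-1.8, 300, 5)
x_high = destandardise(2.4, 300, 5)
print(x_low, x_high)
```

Note that the second parameter of N(μ, σ²) is the variance, so the standard deviation passed to the function is its square root.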
Exercise 6d
Find the value of X which corresponds to a standardised value of
(a) −2.05, (b) 0.86 for each of the following distributions:
(i) X ~ N(60, 17), (ii) X ~ N(124, 3.2²), (iii) X ~ N(84.5, 50),
(iv) X ~ N(62.3, 38), (v) X ~ N(μ, σ²), (vi) X ~ N(a, b),
(vii) X ~ N(a, a²), (viii) X ~ N(49, 49).
Example 6.10 If X ~ N(100, 36) and P(X > a) = 0.1093, find the value of a.
Solution 6.10 As P(X > a) is less than 0.5, a must be greater than the mean, 100.
Now P(X > a) = 0.1093
so P((X − 100)/6 > (a − 100)/6) = 0.1093
i.e. P(Z > (a − 100)/6) = 0.1093
so Φ((a − 100)/6) = 1 − 0.1093
= 0.8907
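Reading the tables in reverse, as Example 6.10 requires, corresponds to evaluating the inverse of Φ. The sketch below is illustrative only and uses `NormalDist.inv_cdf` from the Python standard library; the variable names are assumptions.

```python
from statistics import NormalDist

Z = NormalDist()

# Example 6.10: X ~ N(100, 36), so sigma = 6, and P(X > a) = 0.1093.
mu, sigma = 100, 6
upper_tail = 0.1093

z = Z.inv_cdf(1 - upper_tail)   # standardised value with Phi(z) = 0.8907
a = mu + sigma * z              # de-standardise to recover a

print(round(z, 2), round(a, 2))
```

Because P(X > a) is less than 0.5, the standardised value is positive and a lies above the mean, in agreement with the opening remark of the solution.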
Example 6.11 If X ~ N(24,9) and P(X >a) = 0.974, find the value of a.
Solution 6.11 As P(X >a) is greater than 0.5, a must be less than the mean 24.
Now P(X > a) = 0.974
so P((X − 24)/3 > (a − 24)/3) = 0.974
i.e. P(Z > (a − 24)/3) = 0.974
Now (a − 24)/3 must be negative.
P(−a/5 < Z < a/5) = 0.8
Now, by symmetry
2Φ(a/5) − 1 = 0.8
2Φ(a/5) = 1.8
Φ(a/5) = 0.9
Therefore a/5 = 1.282
a = 6.41
Exercise 6e
9. A sample of 100 apples is taken from a load. The apples have the following
distribution. Determine the mean and standard deviation of these diameters.
Example 6.13 The lengths of certain items follow a normal distribution with mean
μ cm and standard deviation 6 cm. It is known that 4.78% of the
items have a length greater than 82 cm. Find the value of the mean μ.
P(X > 82) = 0.0478
so 1 − Φ((82 − μ)/6) = 0.0478
Φ((82 − μ)/6) = 0.9522
so (82 − μ)/6 = 1.667
82 − μ = 10.002
μ = 72 (2 S.F.)
The mean of the distribution is 72 cm.
Example 6.14 X ~ N(100, σ²) and P(X < 106) = 0.8849. Find the standard
deviation, σ.
Φ(6/σ) = 0.8849 and, from tables, Φ(1.2) = 0.8849
Therefore 6/σ = 1.2
σ = 6/1.2
= 5
The standard deviation of the distribution is 5.
Solution 6.15 Let X be the r.v. ‘the mass, in g, of an article’. Then X ~ N(μ, σ²)
where μ and σ are unknown.
Now P(X > 85) = 0.05
so P(Z > (85 − μ)/σ) = 0.05
Φ((85 − μ)/σ) = 0.95
Therefore (85 − μ)/σ = 1.645
i.e. 85 − μ = 1.645σ (i)
Also P(X < 25) = 0.1
so P(Z < (25 − μ)/σ) = 0.1
But (25 − μ)/σ is negative, and by symmetry,
Φ((μ − 25)/σ) = 0.9
From tables
Φ(1.282) = 0.9
Therefore (μ − 25)/σ = 1.282
i.e. μ − 25 = 1.282σ (ii)
Adding (i) and (ii) gives 60 = 2.927σ, so σ = 20.5 and μ = 51.3 (3 S.F.).
For the central 75% of the distribution, with limits a and b,
P(X > b) = 0.125
so P(Z > (b − 51.3)/20.5) = 0.125
So Φ((b − 51.3)/20.5) = 0.875
But from tables Φ(1.15) = 0.875
Therefore (b − 51.3)/20.5 = 1.15
b = 51.3 + (20.5)(1.15) = 74.9 (3 S.F.)
From symmetry a = 51.3 − (20.5)(1.15) = 27.7 (3 S.F.)
Therefore, the central 75% of the distribution lies between the limits
27.7g and 74.9 g.
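The pair of simultaneous equations in Solution 6.15 can be solved numerically. The sketch below assumes, as the fragments of the solution suggest, that the given conditions are P(X > 85) = 0.05 and P(X < 25) = 0.1; the function name `fit_normal` and its parameters are hypothetical.

```python
from statistics import NormalDist

Z = NormalDist()


def fit_normal(x_lo, p_below_lo, x_hi, p_above_hi):
    """Find (mu, sigma) given P(X < x_lo) = p_below_lo and P(X > x_hi) = p_above_hi."""
    z_lo = Z.inv_cdf(p_below_lo)        # negative standardised value for x_lo
    z_hi = Z.inv_cdf(1 - p_above_hi)    # positive standardised value for x_hi
    # x_hi - mu = z_hi * sigma and x_lo - mu = z_lo * sigma; subtract to eliminate mu.
    sigma = (x_hi - x_lo) / (z_hi - z_lo)
    mu = x_hi - z_hi * sigma
    return mu, sigma


mu, sigma = fit_normal(25, 0.10, 85, 0.05)

# Central 75%: 12.5% in each tail, so the upper limit has Phi(z) = 0.875.
half_width = Z.inv_cdf(0.875) * sigma
a, b = mu - half_width, mu + half_width

print(round(mu, 1), round(sigma, 1), round(a, 1), round(b, 1))
```

Subtracting the two standardised equations eliminates μ, which is the same step as adding equations (i) and (ii) by hand.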
Exercise 6f
1. X ~ N(45, σ²) and P(X > 51) = 0.288. Find σ.
2. X ~ N(21, σ²) and P(X < 27) = 0.9332. Find σ.
3. X ~ N(μ, 25) and P(X < 27.5) = 0.3085. Find μ.
4. X ~ N(μ, 12) and P(X > 32) = 0.8438. Find μ.
5. X ~ N(μ, σ²) and P(X > 80) = 0.0113, P(X > 30) = 0.9713. Find μ and σ.
6. X ~ N(μ, σ²) and P(X > 102) = 0.42, P(X < 97) = 0.25. Find μ and σ.
7. X ~ N(μ, σ²) and P(X < 57.84) = 0.90, P(X > 50) = 0.5. Find μ and σ.
8. X ~ N(μ, σ²) and P(X < 35) = 0.2, P(35 < X < 45) = 0.65. Find μ and σ.
9. The marks in an examination were normally distributed with mean μ and
standard deviation σ. 10% of the candidates had more than 75 marks and 20%
had less than 40 marks. Find the values of μ and σ.
10. The lengths of rods produced in a workshop follow a normal distribution with
mean μ and variance 4. 10% of the rods are less than 17.4 cm long. Find the
probability that a rod chosen at random will be between 18 and 23 cm long.
11. A man cuts hazel twigs to make bean poles. He says that a stick is 240 cm long.
In fact, the length of the stick follows a normal distribution and 10% are of length
250 cm or more while 55% have a length over 240 cm. Find the probability that a
stick, picked at random, is less than 235 cm long.
12. The diameters of bolts produced by a particular machine follow a normal
distribution with mean 1.84 cm and standard deviation 0.04 cm. A bolt is
rejected if its diameter is less than 1.24 cm or more than 1.40 cm. (a) Find the
percentage of bolts which are accepted.
The setting of the machine is altered so that the mean diameter changes but the
standard deviation remains the same. With the new setting, 3% of the bolts are
rejected because they are too large in diameter. (b) Find the new mean diameter
of the bolts produced by the machine. (c) Find the percentage of bolts which are
rejected because they are too small in diameter.
13. A certain make of car tyre can be safely used for 25 000 km on average before it
is replaced. The makers guarantee to pay compensation to anyone whose tyre does
not last for 22 000 km. They expect 7.5% of all tyres sold to qualify for
compensation. Assuming that the distance, X, travelled before a tyre is replaced has a
normal probability distribution, draw a diagram illustrating the facts given above.
Calculate, to 3 significant figures, the standard deviation of X.
Estimate the number of tyres per 1000 which will not have been replaced when
they have covered 26 500 km. (L Additional)
14. A cutting machine produces steel rods which must not be more than 100 cm in
length. The mean length of a large batch of rods taken from the machine is found
to be 99.80 cm and the standard deviation of these lengths is 0.15 cm.
(a) Assuming that the lengths of the rods
Example 6.16 Tests on 2 types of electric light bulb show the following:
Type A, lifetime distributed normally with an average life of 1150
hours and a standard deviation of 30 hours.
Type B, long-life bulb, average lifetime of 1900 hours, with standard
deviation of 50 hours.
(a) What percentage of bulbs of type A could be expected to have
a life of more than 1200 hours?
(b) What percentage of type B would you expect to last longer
than 1800 hours?
(c) What lifetime limits would you estimate would contain the
central 80% of the production of type A? (SUJB)
Solution 6.16 (a) Let X be the r.v. ‘the length of life in hours of type A bulb’.
Then X ~ N(1150, 30²), so s.d. = 30.
P(X > 1200) = P((X − 1150)/30 > (1200 − 1150)/30)
= P(Z > 1.667)
= 0.0478
(b) Let Y be the r.v. ‘the length of life in hours of type B bulb’.
Then Y ~ N(1900, 50²).
(a) Assuming that the mean has not changed but that the produc-
tion has become more variable, estimate the new standard
deviation.
(b) Assuming that the standard deviation has not changed but that
the mean has moved, estimate the new mean.
(c) If 1000 components are produced in a shift, how many of them
may be expected to have lengths in the range 6.48 to 6.53 cm
if the machine is set as in (a)? (AEB 1972)
P((X − 6.50)/σ₁ > (6.54 − 6.50)/σ₁) = 0.05
P(Z > 0.04/σ₁) = 0.05
therefore (6.54 − μ)/0.0243 = 1.501
μ = 6.54 − (1.501)(0.0243)
= 6.504 (3 d.p.)
Therefore the new mean is 6.504 cm (3 d.p.).
Miscellaneous Exercise 6g
1. Batteries for a transistor radio have a mean life under normal usage of 160 hours,
with a standard deviation of 30 hours. Assuming that battery life follows a normal
distribution,
(a) calculate the percentage of batteries which have a life between 150 hours and
180 hours;
(b) calculate the range, symmetrical about the mean, within which 75% of the battery
lives lie;
(c) if a radio takes four of these batteries and requires all of them to be working,
calculate the probability that the radio will run for at least 135 hours. (O&C)
2. (a) Without using a calculator, calculate the mean and standard deviation of the
numbers: 2, 3, 5, 5, 8, 11, 11, 11.
(b) A machine produces components in batches of 20 000, the lengths of which
may be considered to be normally distributed.
At the beginning of production, the machine is set to produce the required
mean length of components at 15 mm, and it can then be set to give any one of
three standard deviations: 0.06 mm, 0.075 mm, 0.09 mm.
It costs £850, £550 and £100 respectively to set these deviations.
Any length produced must lie in the range 14.82 mm to 15.18 mm, otherwise it
is classed as defective and costs the company £1.
Which standard deviation should be used, if the decision is to be made purely on the
cost of setting the machine and of the defectives? (SUJB)
3. Six hundred rounds are fired from a gun at a horizontal target 50 m long which
extends from 950 m to 1000 m in range from the gun. The trajectories of the rounds
all lie in the vertical plane through the gun and the target. It is found that 27 rounds
fall short of the target and 69 rounds fall beyond it. Assuming that the range of
rounds is normally distributed, find the mean and standard deviation of the range.
Estimate the number of rounds falling within 5 m of the centre of the target. (C)
4. Machine components are mass-produced at a factory. A customer requires that the
components should be 5.2 cm long but they will be acceptable if they are within
limits 5.195 cm to 5.205 cm. The customer tests the components and finds that
10.75% of those supplied are over-size and 4.95% are under-size. Find the mean and
standard deviation of the lengths of the components supplied assuming that they
are normally distributed.
If three of the components are selected at random what is the probability that one is
under-size, one over-size and one satisfactory?
If the standard deviation of the machine producing the components is altered
without altering the mean so that 4.95% are over-size, what will be the new standard
deviation and what percentage of components will now be under-size? (SUJB)
5. A marketing organisation grades onions into 3 sizes: small (diameter less than
60 mm), medium (diameter between 60 mm
and 80 mm) and large (diameter greater than 80 mm). A certain grower finds that
61% of his crop falls into the small category and 14% into the large category. Assuming
that the distribution of diameters of the onions in his crop is described by a Normal
probability function, sketch a graph showing the information given above.
On this basis, calculate the standard deviation and the mean of the diameters
of the onions in his crop. (SMP)
6. Packets of semolina are nominally 226 g in weight. The actual weights have a Normal
distribution with μ = 230.00 g and σ = 1.50 g. What is the probability that a
packet is underweight?
A decision is taken that the probability of an underweight packet should not
exceed 0.001. To change the distribution of weights of the semolina packets to
conform to this decision, two methods are considered:
(a) to increase μ, leaving σ unaltered;
(b) to improve the packing machine, thus reducing σ, while leaving μ unaltered.
Find the new values (of μ and of σ respectively) required for each method to
succeed, given that, for the standardised Normal distribution,
P(Z > 3.0902) = 0.0010 (SMP)
7. A factory is illuminated by 2000 bulbs. The lives of these bulbs are normally
distributed with a mean of 550 hours and a standard deviation of 50 hours. It is
decided to replace all the bulbs at such intervals of time that only about 20 bulbs
are likely to fail during each interval. How frequently should the bulbs be changed?
When the manufacturing process is improved so that the mean life of bulbs is increased
to 600 hours and the standard deviation is reduced to 40 hours, the replacement
interval is changed to 500 hours. Show that it will now be necessary to tolerate the
failure of only about 12 bulbs per interval. (AEB 1973)
8. Before joining the Egghead Society, every candidate is given an intelligence test
which, applied to the general public, would give a normal distribution of I.Q.s with
mean 100 and standard deviation 20. The candidate is not admitted unless his
I.Q., as given by the test, is at least 130.
Estimate the median I.Q. of the members of the Egghead Society, assuming that their
I.Q. distribution is representative of that of the part of the population having I.Q.s
greater than, or equal to, 130.
What I.Q. would be expected to be exceeded by one member in ten of the society?
(AEB)
9. The acidity of each of 100 random samples of soil from an area of land was measured
and the results given in Table A below.
Assuming that the pH values are determined correct to the nearest tenth of a
unit construct a cumulative frequency curve to illustrate the distribution.
A possible measure of kurtosis (i.e. flatness) is given by
k = Q/(P90 − P10)
where Q is the semi-interquartile range, P90 the 90th percentile and P10 the 10th
percentile. Estimate the value of k for the above distribution.
Use the standard normal table (p. 633) to estimate the value of k for a Normal
distribution. Is the above distribution flatter than a Normal distribution with the
same total frequency? (SUJB)
10. Describe the principal features of a normal distribution. Draw a sketch of the
probability density function of the distribution N(0, 1).
A machine is producing a type of circular gasket. The specifications for the use of
these gaskets in the manufacture of a certain make of engine are that the thickness
should lie between 5.45 mm and 5.55 mm, and the diameter should lie
between 8.45 mm and 8.54 mm. The machine is producing the gaskets so that
their thicknesses are N(5.5, 0.0004), that is, normally distributed with mean 5.5 mm
and variance 0.0004 mm², and their diameters are independently distributed
N(8.54, 0.0025).
Calculate, to one decimal place, the percentage of gaskets produced which will
not meet
(a) the specified thickness limits,
Table A
If X ~ Bin(n, p) then
E(X) = np
Var(X) = npq where q = 1 − p
Now, for large n and p not too small or too large,
X ~ N(np, npq) approximately
Example 6.18 Find the probability of obtaining between 4 and 7 heads inclusive
with 12 tosses of a fair coin,
(a) using the binomial distribution,
(b) using the normal approximation to the binomial distribution.
Solution 6.18 Let X be the r.v. ‘the number of heads obtained’. Let ‘success’ be
‘obtaining a head’.
Then X ~ Bin(n, p) where n = 12 and p = P(head) = 1/2
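(a) P(X = x) = C(12, x)(1/2)^(12−x)(1/2)^x = C(12, x)(1/2)^12, x = 0, 1, ..., 12
[Figure: probability distribution of the number of heads, 0 to 12, with the region from 3.5 to 7.5 marked]
(b) Using the normal approximation, where n = 12 and p = 1/2,
X ~ N(np, npq)
So X ~ N(6, 3)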
However, before using the approximation we must take into account
the fact that we are using a continuous distribution to approximate
a discrete variable. So we make a continuity correction.
In this example P(4 ≤ X ≤ 7) transforms to P(3.5 < X < 7.5).
So,
P(3.5 < X < 7.5) = P((3.5 − 6)/√3 < (X − 6)/√3 < (7.5 − 6)/√3)
= P(−1.443 < Z < 0.866)
= 0.732 (3 d.p.)
The probability of obtaining between 4 and 7 heads inclusive is
0.732 (3 d.p.).
NOTE: this answer compares very well with the answer in part (a),
and the working is much quicker to perform.
The approximation is even better for large n and it is preferable that
p is close to 1/2.
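The comparison made in Example 6.18 can be reproduced with a short calculation. This is a sketch under the assumptions of the example (n = 12, p = 1/2, between 4 and 7 heads inclusive); the variable names are illustrative.

```python
from math import comb, sqrt
from statistics import NormalDist

n, p = 12, 0.5
q = 1 - p

# (a) Exact binomial probability P(4 <= X <= 7)
exact = sum(comb(n, k) * p**k * q**(n - k) for k in range(4, 8))

# (b) Normal approximation X ~ N(np, npq) with continuity correction:
#     P(4 <= X <= 7) becomes P(3.5 < X < 7.5)
approx_dist = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
approx = approx_dist.cdf(7.5) - approx_dist.cdf(3.5)

print(round(exact, 4), round(approx, 4))
```

The half-unit adjustment at each end is what makes the continuous area line up with the discrete bars for 4, 5, 6 and 7.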
If we require the probability that there are less than 3 heads, i.e.
P(X <3), then we consider P(X < 2.5).
[Figure: the rectangle for X = 3 is not included when finding P(X < 3), so the area is taken up to 2.5]
Solution 6.19 Let X be the r.v. ‘the number of ryegrass seeds’. Let ‘success’ be
‘obtaining a ryegrass seed’.
Then X ~ Bin(n, p) where n = 400 and p = 0.35.
Now, as n is large, we use the normal approximation to give
X ~ N(np, npq) where np = (400)(0.35) = 140
npq = (140)(0.65) = 91
so X ~ N(140, 91)
(a) We require
P(X < 120) → P(X < 119.5) (continuity correction)
Now
P(X < 119.5) = P((X − 140)/√91 < (119.5 − 140)/√91)
= P(Z < −2.149)
= 0.0158
The probability that there are less than 120 ryegrass seeds is 0.0158.
(b) P(119.5 < X < 150.5) = P((119.5 − 140)/√91 < (X − 140)/√91 < (150.5 − 140)/√91)
= P(−2.149 < Z < 1.101)
= 0.8487
The probability that there are between 120 and 150 ryegrass seeds
is 0.8487.
(c) P(X > 160) → P(X > 160.5) (continuity correction)
P(X > 160.5) = P((X − 140)/√91 > (160.5 − 140)/√91)
= P(Z > 2.149)
= 0.0158
The probability that there are more than 160 ryegrass seeds is
0.0158.
Example 6.20 The random variable X has a binomial distribution with parameters
n and p. Derive the mean and variance of X.
Show that the probability of obtaining a total of seven when two
fair dice are tossed is 1/6. A pair of fair dice is tossed 100 times and
the total observed on each occasion. What is the probability of
getting more than 25 sevens? How many tosses would be required
in order that the probability of getting at least one seven is 0.9
or more. (AEB)
Solution 6.20 If X ~ Bin(n, p) then E(X) = np and Var(X) = npq (see p. 214).
[Table: total on two dice, first die against second die]
P(total of 7 when two dice are tossed) = 6/36 = 1/6
Let X be the r.v. ‘the number of sevens when two dice are tossed’.
Let ‘success’ be ‘obtaining a total of 7’.
Then X ~ Bin(100, 1/6).
Now n is large and p is not too small, so we use the normal approximation:
X ~ N(np, npq) where np = (100)(1/6) = 50/3
npq = (100)(1/6)(5/6) = 125/9
standard deviation = 5√5/3
so X ~ N(50/3, 125/9).
We require
P(X > 25) → P(X > 25.5) (continuity correction)
P(X > 25.5) = P(Z > (25.5 − 50/3)/(5√5/3))
= P(Z > 2.370)
= 0.00889
For n tosses, X ~ Bin(n, 1/6).
Now P(X = 0) = (5/6)^n
P(at least one 7) = P(X ≥ 1)
= 1 − P(X = 0)
= 1 − (5/6)^n
We require n such that
1 − (5/6)^n ≥ 0.9
(5/6)^n ≤ 0.1
n log(5/6) ≤ log(0.1)
n ≥ log(0.1)/log(5/6)
(when dividing by a negative quantity the inequality is reversed.)
n ≥ 12.63
so at least 13 tosses are required.
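The final inequality in Solution 6.20 can be checked numerically. The sketch below is illustrative; it computes the bound log(0.1)/log(5/6) and confirms by direct evaluation which whole number of tosses first achieves a probability of at least 0.9.

```python
import math

p_no_seven = 5 / 6   # probability that a single toss of two dice does not total 7
target = 0.9         # required probability of at least one seven

# Solve 1 - (5/6)^n >= 0.9. log(5/6) is negative, so dividing by it
# reverses the inequality: n >= log(0.1) / log(5/6).
bound = math.log(1 - target) / math.log(p_no_seven)
n_min = math.ceil(bound)

print(round(bound, 2), n_min)

# Direct check on either side of the bound.
for n in (n_min - 1, n_min):
    print(n, round(1 - p_no_seven**n, 4))
```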
Exercise 6h
1. Continuity corrections: write down the transformations for each of the following:
(a) P(3 ≤ X ≤ 9), (b) P(3 < X < 9), (c) P(10 < X < 24), (d) P(2 < X < 8),
(e) P(X > 54), (f) P(X > 76), (g) P(45 < X < 67), (h) P(X < 109),
(i) P(X < 45), (j) P(X = 56), (k) P(400 < X < 560), (l) P(X = 67),
(m) P(X > 59), (n) P(X = 100), (o) P(34 < X < 48), (p) P(X = 7),
(q) P(X > 509), (r) P(X < 7), (s) P(27 ≤ X < 29), (t) P(X = 58).
2. If X ~ Bin(200, 0.7), use the normal approximation to find (a) P(X ≥ 130),
(b) P(136 ≤ X < 148), (c) P(X < 142), (d) P(X > 152), (e) P(141 ≤ X < 146).
3. 10% of the chocolates produced in a factory are mis-shapes. In a sample of
1000 chocolates find the probability that the number of mis-shapes is (a) less than
80, (b) between 90 and 115 inclusive, (c) 120 or more.
4. Find the probability of obtaining more than 110 ones in 400 tosses of an
unbiased tetrahedral die with faces marked 1, 2, 3 and 4.
5. A coin is biased so that the probability that it will come down heads is double
the probability that it will come down tails. The coin is tossed 120 times. Find
the probability that there will be (a) between 42 and 51 tails inclusive, (b) 48
tails or less, (c) less than 34 tails, (d) between 72 and 90 heads inclusive.
6. An experiment consists of tossing two unbiased coins. The outcome is called a
success if and only if two heads appear, all other outcomes being called a failure.
If the experiment were repeated 27 times, write down the binomial distribution
governing this series of experiments in the form (p + q)^n, stating the values of p, q
and n.
Find the expected number of successes and the standard deviation of this
distribution.
With the normal curve approximation estimate, using tables and giving your
answer to 2 decimal places, the probability of obtaining at least 5 successes.
(L Additional)
7. It is estimated that 1/5 of the population of England watched last year’s Cup Final
on television. If random samples of 100 people are interviewed, calculate the
mean and variance of the number of people from these samples who watched
the Cup Final on television.
Use normal distribution tables to estimate, to 2 significant figures, the approximate
probability of finding, in a random sample of 100 people, more than 30
people who watched the Cup Final on television. (L Additional)
8. In a series of n independent trials the probability of a ‘success’ at each trial is
p. If R is the random variable denoting the total number of successes, state the
probability that R = r. State, also, the mean and variance of R.
A certain variety of flower seed is sold in packets containing about 1000 seeds.
The packet claims that 40% will bloom white and 60% red. This may be assumed
to be accurate.
If five seeds are planted estimate the probability that
(a) exactly three will bloom white;
(b) at least one will bloom white.
If 100 seeds are planted use the normal approximation to estimate the probability
of obtaining between 30 and 45 white flowers. (SUJB)
9. A die is biased so that the probability of obtaining a six is 1/4. The die is thrown 200
times. (a) Find the probability of obtaining a six on the die (i) more than 60
times, (ii) less than 45 times, (iii) between 40 and 55 times (inclusive). (b) How
many throws would be required if the probability of obtaining at least one six
is greater than 0.9?
10. Two hundred fair dice are thrown 1000 times. Use the normal approximation to
the binomial distribution to find the number of times you would expect to
have the following number of sixes (a) 30, (b) 53, (c) more than 38, (d) less
than 28, (e) between 28 and 38 inclusive.
11. A certain tribe is distinguished by the fact that 45% of the males have 6 toes
on their right foot. Two explorers discover a group of 200 males from the tribe. Find
the probability that the number who have six toes on their right foot is (a) 90,
(b) less than 85, (c) between 82 and 91 inclusive.
12. Four hundred pupils sit a test which consists of 80 true-false questions. None of
the candidates knows any of the answers and so guesses. (a) If the pass mark is
38, how many of the candidates would be expected to pass? (b) What should the
new pass mark be if it is decided that only 115 candidates pass?
13. A lorry load of potatoes has, on average, one rotten potato in 6. A greengrocer
tests a random sample of 100 potatoes and decides to turn away the lorry if he
finds more than 18 rotten potatoes in the sample. Find the probability that he
accepts the consignment.
Var(X) = λ
Solution 6.21 Let X be the r.v. ‘the radioactive count in a 1 second interval’. Then
X ~ Po(25).
P(X = x) = e^(−25) 25^x / x!
p25 = (25/25) p24 (0.079 522 9)
p26 = (25/26) p25 (0.076 464 8)
p27 = (25/27) p26 (0.070 800 3)
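The successive Poisson probabilities in Solution 6.21 follow the recurrence p(x+1) = λ/(x + 1) × p(x), starting from p(0) = e^(−λ). A minimal sketch of that recurrence is given below; the function name `poisson_pmf_table` is hypothetical.

```python
import math


def poisson_pmf_table(lam: float, x_max: int) -> list[float]:
    """Return [p_0, p_1, ..., p_x_max] for X ~ Po(lam) using p_{x+1} = lam/(x+1) * p_x."""
    probs = [math.exp(-lam)]          # p_0 = e^(-lam)
    for x in range(x_max):
        probs.append(probs[-1] * lam / (x + 1))
    return probs


p = poisson_pmf_table(25, 27)
for x in (25, 26, 27):
    print(x, round(p[x], 7))
```

Building each term from the previous one avoids evaluating large powers and factorials separately, which is why the recurrence suits hand calculation as well.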
Exercise 6i
1. If X ~ Po(24), use the normal approximation to find (a) P(X < 25),
(b) P(22 < X < 26), (c) P(X > 23).
... one hour, (a) there are more than 33 calls, (b) there are between 25 and 28
calls (inclusive), (c) there are 34 calls.
7. In an experiment with a radioactive substance the number of particles reaching a
counter over a given period of time follows a Poisson distribution with mean
22. Find the probability that the number of particles reaching the counter over the
given period of time is (a) less than 22, (b) between 25 and 30, (c) 18 or more.
8. The number of accidents on a certain railway line occur at an average rate of one
every 2 months. Find the probability that (a) there are 25 or more accidents in 4
years, (b) there are 30 or less accidents in ...
9. The number of eggs laid by an insect follows a Poisson distribution with
parameter 200. (a) Find the probability that (i) more than 150 eggs are laid, (ii) more
than 250 eggs are laid, (iii) between 180 and 240 eggs (inclusive) are laid.
(b) If the probability that an egg develops is 0.1, show that the number of survivors
follows a Poisson distribution with parameter 20, and find the probability that
there are more than 30 survivors.
10. Two towns, Allport and Bunchester, are linked by telephone. There are 2000
subscribers in Allport, but it is too expensive to install 2000 trunk lines between the
two towns. In a busy hour, each subscriber in Allport requires a trunk line
to Bunchester for an average time of 2 minutes. Show that the number of trunk
lines in use follows a Poisson distribution with mean 66.67 per hour. What is the
minimum number of trunk lines that should be installed if only 1% of all the
calls will fail to find an empty trunk line?
Example 6.22 If X ~ Bin(10, 1/2), find the probability that X = 5. Then find the
approximation to this probability using (a) the normal distribution,
(b) the Poisson distribution.
so P(X = 5) = C(10, 5)(1/2)^5 (1/2)^5
= 0.2461
P(X = 5) = 0.2461.
(a) Using the normal approximation:
X ~ N(np, npq) where np = 10(1/2) = 5
npq = (5)(1/2) = 2.5
So X ~ N(5, 2.5)
Now
P(X = 5) → P(4.5 < X < 5.5) (continuity correction)
P(4.5 < X < 5.5) = P((4.5 − 5)/√2.5 < (X − 5)/√2.5 < (5.5 − 5)/√2.5)
= P(−0.316 < Z < 0.316)
= 0.2478
Example 6.23 If X ~ Bin(20, 0.4), find the probability that 6 ≤ X ≤ 10. Then find
the approximations to this probability using (a) the normal distribution,
(b) the Poisson distribution.
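Using the recurrence p(x+1) = [(n − x)p / ((x + 1)q)] p(x):
p7 = [14(0.4) / 7(0.6)] p6 (0.165 882 2)
p8 = [13(0.4) / 8(0.6)] p7 (0.179 705 7)
P(6 ≤ X ≤ 10) = p6 + p7 + ... + p10
= 0.7469 (4 d.p.)
Using the binomial distribution, P(6 ≤ X ≤ 10) = 0.7469 (4 d.p.).
(a) Using the normal approximation with continuity correction, the limits 5.5 and
10.5 about the mean 8 have standardised values −1.141 and 1.141.
Therefore P(6 ≤ X ≤ 10) = 0.7462, using the normal approximation.
(b) Using the Poisson approximation with mean 8:
p8 = (8/8) p7 (0.139 586 5)
p9 = (8/9) p8 (0.124 076 9)
p10 = (8/10) p9 (0.099 261 5)
P(6 ≤ X ≤ 10) = p6 + p7 + ... + p10
= 0.6246 (4 d.p.)
So P(6 ≤ X ≤ 10) = 0.6246 (4 d.p.) using the Poisson approximation.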
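The three answers in Example 6.23 can be set side by side with a short calculation. This is a sketch using only the Python standard library; the variable names are illustrative.

```python
from math import comb, exp, factorial, sqrt
from statistics import NormalDist

n, p = 20, 0.4
q = 1 - p
lo, hi = 6, 10   # P(6 <= X <= 10)

# Exact binomial probability
binom = sum(comb(n, k) * p**k * q**(n - k) for k in range(lo, hi + 1))

# (a) Normal approximation N(np, npq) with continuity correction
nd = NormalDist(mu=n * p, sigma=sqrt(n * p * q))
normal = nd.cdf(hi + 0.5) - nd.cdf(lo - 0.5)

# (b) Poisson approximation with mean np
lam = n * p
poisson = sum(exp(-lam) * lam**k / factorial(k) for k in range(lo, hi + 1))

print(round(binom, 4), round(normal, 4), round(poisson, 4))
```

Here the normal approximation is much closer to the binomial value than the Poisson one, which is consistent with p = 0.4 not being small.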
so X ~ N(5, 4.75)
Now
P(X = 4) → P(3.5 < X < 4.5) (continuity correction)
P(3.5 < X < 4.5) = P((3.5 − 5)/√4.75 < (X − 5)/√4.75 < (4.5 − 5)/√4.75)
= P(−0.6882 < Z < −0.2294)
= 0.1637
Using the Poisson approximation with mean 5:
P(X = x) = e^(−5) 5^x / x!
so P(X = 4) = e^(−5) 5^4 / 4!
Example 6.25 If X ~ Po(30), find P(28 ≤ X ≤ 32). Then find the approximation
to this probability using the normal distribution.
Using the normal approximation with continuity correction:
P(27.5 < X < 32.5) = P(−0.456 < Z < 0.456)
= 0.3516
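For Example 6.25, the exact Poisson probability and its normal approximation can be compared directly. The sketch below assumes the approximating distribution is N(30, 30), i.e. mean and variance both equal to the Poisson parameter; the variable names are illustrative.

```python
from math import exp, factorial, sqrt
from statistics import NormalDist

lam = 30
lo, hi = 28, 32   # P(28 <= X <= 32)

# Exact Poisson probability
exact = sum(exp(-lam) * lam**k / factorial(k) for k in range(lo, hi + 1))

# Normal approximation N(lam, lam) with continuity correction:
# P(28 <= X <= 32) becomes P(27.5 < X < 32.5)
nd = NormalDist(mu=lam, sigma=sqrt(lam))
approx = nd.cdf(hi + 0.5) - nd.cdf(lo - 0.5)

print(round(exact, 4), round(approx, 4))
```

The computed approximation differs slightly in the fourth decimal place from a table-based answer because the tables round the standardised value to three decimal places.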
Exercise 6j
1. A number of different types of fungi are distributed at random in a field. Eighty
per cent of these fungi are mushrooms, and the remainder are toadstools. Five per
cent of the toadstools are poisonous. A man, who cannot distinguish between
mushrooms and toadstools, wanders across the field and picks a total of 100 fungi.
Determine, correct to 2 significant figures, using appropriate approximations, the
probability that the man has picked
(a) at least 20 toadstools,
(b) exactly two poisonous toadstools. (C)
2. An old car is never garaged at night. On the morning following a wet night, the
probability that the car does not start is 3. On the morning following a dry
night, this probability is x. The starting performance of the car each morning is
independent of its performance on previous mornings.
(a) There are 6 consecutive wet nights. Determine the probability that the car
does not start on at least 2 of the 6 mornings.
(b) During a wet autumn there are 32 wet nights. Using a suitable approximation,
determine the probability that the car does not start on less than 16 of the
32 mornings.
(c) During a long summer drought there are 100 dry nights. Using a Poisson
approximation, determine the probability that the car does not start on 5 or more
of the 100 mornings.
(Give 3 decimal places in your answers.) (C)
3. An urn contains 100 balls of which 4 are coloured red and the remainder are
coloured white. A ball is drawn at random from the urn, its colour is noted and it is
then replaced in the urn.
Write down (but do not evaluate) an expression for the probability that, in a
total of 10 such draws, a red ball is drawn exactly once.
Determine, correct to two decimal places, making use of a suitable approximation
in each case, the probability that
(a) in a total of 100 such draws, a red ball is drawn on exactly four occasions,
(b) in a total of 9600 such draws, a red ball is drawn on between 350 and 400
occasions inclusive. (C)
4. Henri de Lade regularly travels from his home in the suburbs to his office in Paris.
He always tries to catch the same train, the 08.05 from his local station. He
walks to the station from his home in such a way that his arrival times form a
normal distribution with mean 08.00 hours and standard deviation 6 minutes.
(a) Assuming that his train always leaves on time, what is the probability that on
any given day Henri misses his train?
(b) If Henri visits his office in this way 5 days each week and if his arrival times at
the station each day are independent, what is the probability that he misses his
train once and only once in a given week?
(c) Henri visits his office 46 weeks every year. Assuming that there are no absences
during this time, what is the probability that he misses his train less than 35 times
in the year? (AEB 1980)
5. The probability of a man aged exactly 85 dying before he is 86 is about 0.211.
Write down an expression for p_r, the probability that r of a group of n men
aged exactly 85 die before they are 86.
(a) Calculate p_0 when n = 5.
(b) By considering (p_r/p_(r+1)), or otherwise, calculate the most likely value of r
for the case n = 100.
(c) Use the normal approximation to the binomial to estimate the probability that
at least 25 of a group of 100 men aged exactly 85 die before they are 86. (MEI)
6. In Urbania, selection for the Royal Flying Corps (RFC) is by means of an aptitude
test based on a week’s intensive military training. It is known that the scores of
potential recruits on this test follow a normal distribution with mean 45 and
standard deviation 10.
(a) What is the probability that a randomly chosen recruit will score
between 40 and 60?
(b) What percentage of the recruits is expected to score more than 30?
(c) In a particular year 100 recruits take the test. Assuming that the pass mark is
50, calculate the probability that less than 35 recruits qualify for the RFC.
(AEB 1978)
7. During an advertising campaign, the manufacturers of Wolfitt (a dog food)
claimed that 60% of dog owners preferred
to buy Wolfitt. Assuming that the manufacturer's claim is correct for the population of dog owners, calculate
(a) using the binomial distribution, and
(b) using a normal approximation to the binomial;
the probability that at least 6 of a random sample of 8 dog owners prefer to buy Wolfitt. Comment on the agreement, or disagreement, between your two values. Would the agreement be better or worse if the proportion had been 80% instead of 60%?
Continuing to assume that the manufacturer's figure of 60% is correct, use the normal approximation to the binomial to estimate the probability that, of a random sample of 100 dog owners, the number preferring Wolfitt is between 60 and 70 inclusive. (MEI)

8. If the probability of a male birth is 0.514, what is the probability that there will be fewer boys than girls in 1000 births? (You may assume that 0.514 x 0.486 ≈ 0.25.)
How large a sample, to the nearest hundred, should be taken to reduce the probability of fewer boys than girls to less than 5%? (You may assume that the sample size in this part of the question is sufficiently large for a continuity correction to be unnecessary.) (SMP)

9. On the surface of halfpenny postage stamps there are either one or two phosphor bands. Ninety per cent of halfpenny stamps have two bands and the rest have one band. Of those having one band, 95% have the band in the centre of the stamp and the remainder have the band on the left-hand edge of the stamp.
(a) Determine the probability that in a random sample of ten halfpenny stamps there are exactly eight having two phosphor bands.
(b) Determine, using a normal approximation, the probability that in a random sample of 100 halfpenny stamps there are between five and fifteen stamps (inclusive) having one phosphor band.
(c) Determine, using a Poisson approximation, the probability that in a random sample of 100 halfpenny stamps there are less than three stamps which have only a single band, this band being on the left-hand edge of the stamp.
(Any expressions evaluated should be clearly exhibited, and answers should be given correct to three significant figures.) (C)

10. (a) Every year very small numbers of American wading birds lose their way on migration between North and South America and arrive in Great Britain instead, so that in September the proportion of American waders amongst the waders in Great Britain is about one in ten thousand.
At Dunsmere (a bird reserve in Great Britain), one September, there are twenty thousand waders, which may be regarded as a random sample of the waders present in Great Britain. Determine the probability that there are
(i) no American waders present at Dunsmere,
(ii) more than two American waders present at Dunsmere.
(b) Three-quarters of all the sightings in Great Britain of American waders are made in the autumn. Suppose that in 1980 there will be ten sightings of American waders at Dunsmere. Assuming that all sightings are independent of one another, determine the probability that exactly seven of these ten sightings will be made in the autumn. (C)

11. An inter-city telephone exchange has 100 lines and on average 80 are in use at any moment (on a typical business-day morning). Calculate
(a) the probability that all lines are engaged;
(b) the probability that more than 30 lines are free.
We say that a number x of lines is the 'effective minimum level' if the number of lines in use exceeds x for 95% of the time. Find x.
(You may assume that for large n the binomial probability may be approximated by a normal probability with mean na and variance nab.) (SMP)

12. A telephone exchange serves 2000 subscribers, and at any moment during the busiest period there is a probability of 1/30 for each subscriber that he will require a line. Assuming that the needs of subscribers are independent, write down an expression for the probability that exactly N lines will be occupied at any moment during the busiest period.
Use the normal distribution to estimate the minimum number of lines that would ensure that the probability that a call cannot be made because all the lines are occupied is less than 0.01.
Investigate whether the total number of lines needed would be reduced if the subscribers were split into two groups of 1000, each with its own set of lines. (MEI)
Example 7.1 If X ~ N(60, 16) and Y ~ N(70, 9), find (a) P(X + Y < 140),
(b) P(120 < X + Y < 135), (c) P(Y - X > 1),
(d) P(2 < Y - X < 12).
Solution 7.1 (a) Let R = X + Y. Then E(R) = 60 + 70 = 130 and Var(R) = 16 + 9 = 25, so R ~ N(130, 25), s.d. = 5.
We require
P(R < 140) = P((R - 130)/5 < (140 - 130)/5)
= P(Z < 2)
= 0.9772
Therefore P(X + Y < 140) = 0.9772.
(b) We require
P(120 < R < 135) = P((120 - 130)/5 < (R - 130)/5 < (135 - 130)/5)
= P(-2 < Z < 1)
= 0.8185
Therefore P(120 < X + Y < 135) = 0.8185.
(d) Let T = Y - X. Then E(T) = 70 - 60 = 10 and Var(T) = 9 + 16 = 25, so T ~ N(10, 25), s.d. = 5.
We require
P(2 < T < 12) = P((2 - 10)/5 < (T - 10)/5 < (12 - 10)/5)
= P(-1.6 < Z < 0.4)
= 0.6006
Therefore P(2 < Y - X < 12) = 0.6006.
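The four probabilities asked for in Example 7.1 can be checked numerically. The following is a minimal sketch using Python's standard-library statistics.NormalDist; the variable names are illustrative and do not come from the text. It assumes that X and Y are independent, and note that NormalDist takes the standard deviation, not the variance, as its second argument.

```python
from math import sqrt
from statistics import NormalDist

# X ~ N(60, 16) and Y ~ N(70, 9); the second parameter in the text is the variance.
mean_x, var_x = 60, 16
mean_y, var_y = 70, 9

# For independent normal variables the means add (or subtract),
# while the variances always add.
R = NormalDist(mean_x + mean_y, sqrt(var_x + var_y))   # R = X + Y ~ N(130, 25)
T = NormalDist(mean_y - mean_x, sqrt(var_x + var_y))   # T = Y - X ~ N(10, 25)

p_a = R.cdf(140)                 # P(X + Y < 140)
p_b = R.cdf(135) - R.cdf(120)    # P(120 < X + Y < 135)
p_c = 1 - T.cdf(1)               # P(Y - X > 1)
p_d = T.cdf(12) - T.cdf(2)       # P(2 < Y - X < 12)

for label, p in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(label, round(p, 4))
```

Values computed this way can differ from four-figure table values in the last decimal place.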
Example 7.2 Each weekday Mr Jones walks to the local library to read the newspapers. The time he takes to walk to and from the library is a
normal variable with mean 15 minutes and standard deviation 2
minutes. The time he spends in the library is a normal variable with
mean 25 minutes and standard deviation √12 minutes. Find the
probability that, on a particular day, (a) Mr Jones is away from the
house for more than 45 minutes, (b) Mr Jones spends more time
travelling than in the library.
Solution 7.2 Let L be the r.v. ‘the time in minutes spent in the library’. Then
L ~ N(25,12).
Let W be the r.v. ‘the time in minutes spent walking to and from
the library’. Then W ~ N(15, 4).
(a) We require the distribution of the total time spent away from
the house.
Let T = L + W.
So T ~ N(40, 16), s.d. = 4.
We require
P(T > 45) = P((T - 40)/4 > (45 - 40)/4)
= P(Z > 1.25)
= 0.1056
Therefore the probability that Mr Jones is away from the house for
more than 45 minutes is 0.1056.
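(b) We require P(W > L), i.e. P(W - L > 0).
Let U = W - L, so U ~ N(15 - 25, 4 + 12), i.e. U ~ N(-10, 16).
P(U > 0) = P(Z > (0 - (-10))/4) = P(Z > 2.5) = 0.00621
Therefore the probability that Mr Jones spends more time travelling
than in the library is 0.00621.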
RANDOM VARIABLES AND RANDOM SAMPLING
Exercise 7a
1. If X ~ N(100, 49) and Y ~ N(110, 576), find (a) P(X + Y > 200), (b) P(180 < X + Y < 240), (c) P(Y - X < 0), (d) P(-20 < Y - X < 50).

2. If X ~ N(75, 5) and Y ~ N(78, 20), find (a) P(X + Y > 162), (b) P(140 < X + Y < 150), (c) P(X + Y < 155), (d) P(X - Y > 0), (e) P(Y - X < 15).

3. If A ~ N(3, 0.05) and B ~ N(2, 0.04), find (a) P(A - B > 1.9), (b) P(A + B < 4.4), (c) P(B > A - 0.6).

4. If X ~ N(25, 5) and Y ~ N(30, 4), find (a) P(|X + Y - 55| < 5), (b) P(Y > X), (c) P(|Y - X - 5| < 3).

5. At a self-service cafeteria a coffee machine is installed which dispenses (a) black coffee in amounts normally distributed with mean 6.1 oz and standard deviation 0.4 oz, (b) white coffee by first releasing a quantity of black coffee normally distributed with mean 4.9 oz and s.d. 0.3 oz and then adding milk normally distributed with mean 1.2 oz and s.d. 0.2 oz. Each cup is marked on the inside to a level of 5.5 oz and if this level is not attained the customer receives the drink without charge.
(i) What percentage of cups of black coffee will fall short of the 5.5 oz?
(ii) What is the mean and s.d. of the amount of white coffee dispensed into each cup?
(iii) What percentage of cups of white coffee will fall short of 5.5 oz?
(iv) If 10% of cups dispensed are black and the cost per cup for the ingredients is 2.1p per cup for both black and white coffee, whilst the customer is charged 10p per cup, what will be the gross profit on 1000 dispensed cups?
(v) What price per cup (to the nearest 5p) should the cafeteria charge if the average profit is to be 5p per cup? (SUJB)

6. Bolts are manufactured which are to fit in holes in steel plates. The diameter of the bolts is normally distributed with mean 2.60 cm and standard deviation 0.03 cm; the diameter of the holes is normally distributed with mean 2.71 cm and standard deviation 0.04 cm.
(a) Find the probability that a bolt selected at random has a diameter greater than 2.65 cm.
(b) Find the probability that a hole selected at random has a diameter less than 2.65 cm.
(c) Prove that, if a bolt and a hole are selected at random, the probability that the bolt will be too large to enter the hole is about 0.0139.
(d) The random selection of a bolt and a hole described in (c) above is carried out five times. Find the probability that in every case the bolt will be able to enter the hole. (C)

7. The diameters of axles supplied by a factory have a mean value of 19.92 mm and a standard deviation of 0.05 mm. The inside diameters of bearings supplied by another factory have a mean of 20.04 mm and a standard deviation of 0.038 mm. What is the mean and standard deviation of the random variable defined to be the diameter of a bearing less the diameter of an axle? Assuming that both dimensions are normally distributed, what percentage of axles and bearings taken at random will not fit? (O & C)
If X1, X2, ..., Xn are n independent normal variables such that
Xi ~ N(μi, σi²), i = 1, 2, ..., n
then
X1 + X2 + ... + Xn ~ N(μ1 + μ2 + ... + μn, σ1² + σ2² + ... + σn²)
So A ~ N(330, 30), where A = W + X + Y and s.d. = √30.
We require
P(A < 320) = P((A - 330)/√30 < (320 - 330)/√30)
= P(Z < -1.826)
= 0.0340
Therefore P(W + X + Y < 320) = 0.0340.
X ~ N(20, 4)
Now, let B = X1 + X2 + ... + X12
so E(B) = E(X1) + E(X2) + ... + E(X12)
= 12E(X)
= 240
and Var(B) = Var(X1) + Var(X2) + ... + Var(X12)
= 12Var(X)
= 48
We have B ~ N(240, 48), s.d. = √48.
We require
P(B < 230) = P((B - 240)/√48 < (230 - 240)/√48)
= P(Z < -1.443)
= 0.0745
Therefore the probability that the total mass of the articles is less
than 230 g is 0.0745.
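As a check on the calculation above, here is a minimal sketch of the total of n independent observations of a normal variable. The helper name total_of_independent is hypothetical (it is not from the text), and the sketch assumes the twelve masses are independent observations of X ~ N(20, 4).

```python
from math import sqrt
from statistics import NormalDist

def total_of_independent(mean, variance, n):
    """Distribution of X1 + ... + Xn for n independent observations of X ~ N(mean, variance)."""
    # Means add and variances add, so the total is N(n * mean, n * variance).
    return NormalDist(n * mean, sqrt(n * variance))

B = total_of_independent(20, 4, 12)   # B ~ N(240, 48)
p = B.cdf(230)                        # P(B < 230)
print(B.mean, round(B.variance, 6), round(p, 4))
```

Note that NormalDist stores the standard deviation, so the variance n·σ² has to be square-rooted before it is passed in.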
So W ~ N(-6, 25), where W = B - C - A.
We require
P(W > 0) = P((W - (-6))/5 > (0 - (-6))/5)
= P(Z > 1.2)
= 0.1151
Example 7.6 In a cafeteria, baked beans are served either in ordinary portions or
in children’s portions. The quantity given for an ordinary portion is
a normal variable with mean 90 g and standard deviation 3 g and the
quantity given for a children’s portion is a normal variable with
mean 43 g and standard deviation 2 g. What is the probability that
John, who has two children’s portions, is given more than his
father, who has an ordinary portion?
Solution 7.6 Let C be the r.v. ‘the quantity given, in g, in a children’s portion’.
Then C ~ N(43, 4).
Let A be the r.v. ‘the quantity given, in g, in an ordinary portion’.
Then A ~ N(90, 9).
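Now let W = C1 + C2 - A, where C1 and C2 are two independent observations of C.
Then E(W) = 43 + 43 - 90 = -4 and Var(W) = 4 + 4 + 9 = 17, so W ~ N(-4, 17).
P(C1 + C2 - A > 0) = P(W > 0)
= P(Z > (0 - (-4))/√17)
= P(Z > 0.970)
= 0.166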
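The baked beans calculation can be reproduced with a short sketch. It relies on the fact that Python's statistics.NormalDist treats the two operands of + and - as independent normal variables, so C + C below stands for the sum C1 + C2 of two separate portions, not for the multiple 2C. The variable names follow the solution; independence of the portions is assumed, as in the text.

```python
from statistics import NormalDist

C = NormalDist(43, 2)   # children's portion: mean 43 g, s.d. 2 g
A = NormalDist(90, 3)   # ordinary portion: mean 90 g, s.d. 3 g

# Two independent children's portions minus one ordinary portion.
# NormalDist adds the variances for both + and -.
W = C + C - A           # W ~ N(-4, 17)

p = 1 - W.cdf(0)        # P(John is given more than his father)
print(W.mean, round(W.variance, 6), round(p, 3))
```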
Exercise 7b
Var(aX) = a²Var(X)
Example 7.8 If X ~ N(70,10) and Y ~ N(50, 8), find P(2X > 3Y).
Solution 7.8 We require P(2X > 3Y), i.e. P(2X—3Y > 0).
Let A = 2X - 3Y
then E(A) = 2E(X) - 3E(Y)
= 140 - 150
= -10
Var(A) = 4Var(X) + 9Var(Y)
= 40 + 72
= 112
So A ~ N(-10, 112)
P(A > 0) = P((A - (-10))/√112 > (0 - (-10))/√112)
= P(Z > 0.945)
= 0.172
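The steps of Solution 7.8 generalise to any combination aX + bY of independent normal variables. The sketch below is illustrative only: the function linear_combination and its (mean, variance) argument convention are assumptions made for this example, not notation from the text.

```python
from math import sqrt
from statistics import NormalDist

def linear_combination(a, x, b, y):
    """Distribution of aX + bY for independent normal X and Y.

    x and y are (mean, variance) pairs, matching the N(mean, variance) notation.
    """
    mean = a * x[0] + b * y[0]               # E(aX + bY) = aE(X) + bE(Y)
    variance = a**2 * x[1] + b**2 * y[1]     # Var(aX + bY) = a²Var(X) + b²Var(Y)
    return NormalDist(mean, sqrt(variance))

A = linear_combination(2, (70, 10), -3, (50, 8))   # A = 2X - 3Y ~ N(-10, 112)
p = 1 - A.cdf(0)                                   # P(2X > 3Y)
print(A.mean, round(A.variance, 6), round(p, 3))
```

The coefficient -3 is squared in the variance, which is why the variances add even though the means subtract.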
Care must be taken to distinguish between the r.v. 2X and the r.v.
X1 + X2, where X1 and X2 are two independent observations of
the r.v. X.
NOTE: the means of the two distributions are the same, but the
variances are different.
Example 7.9 If X ~ N(10, 9), find (a) P(2X > 23), (b) P(X1 + X2 > 23) where
X1 and X2 are two independent observations from the population
of X.
Solution 7.9 (a) Let V = 2X. Then E(V) = 2E(X) = 20 and Var(V) = 4Var(X) = 36.
So V ~ N(20, 36), s.d. = 6
and P(V > 23) = P((V - 20)/6 > (23 - 20)/6)
= P(Z > 0.5)
= 0.3085
Therefore P(2X > 23) = 0.3085.
(b) Let W = X1 + X2. Then E(W) = 2E(X) = 20 and Var(W) = 2Var(X) = 18.
So W ~ N(20, 18), s.d. = √18
and P(W > 23) = P((W - 20)/√18 > (23 - 20)/√18)
= P(Z > 0.707)
= 0.2399
Therefore P(X1 + X2 > 23) = 0.2399.
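The difference between the multiple 2X and the sum X1 + X2 can also be seen by simulation. The following sketch is an illustration added here, not a method from the text; the number of trials and the seed are arbitrary choices, and the estimates vary slightly with them.

```python
import random

def variance(values):
    """Population variance of a list of numbers."""
    m = sum(values) / len(values)
    return sum((v - m) ** 2 for v in values) / len(values)

random.seed(1)          # fixed seed so that the run is repeatable
trials = 50_000
mu, sigma = 10, 3       # X ~ N(10, 9)

# 2X: one observation, doubled.
doubles = [2 * random.gauss(mu, sigma) for _ in range(trials)]
# X1 + X2: two independent observations, added.
sums = [random.gauss(mu, sigma) + random.gauss(mu, sigma) for _ in range(trials)]

var_doubles = variance(doubles)   # theory: 4 * 9 = 36
var_sums = variance(sums)         # theory: 2 * 9 = 18
p_doubles = sum(v > 23 for v in doubles) / trials   # estimate of P(2X > 23)
p_sums = sum(v > 23 for v in sums) / trials         # estimate of P(X1 + X2 > 23)

print(round(var_doubles, 1), round(var_sums, 1))
print(round(p_doubles, 3), round(p_sums, 3))
```

Both simulated variables have mean close to 20, but the doubled observation is noticeably more spread out than the sum of two separate observations.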
In general, if X ~ N(μ, σ²)
then nX ~ N(nμ, n²σ²)
but X1 + X2 + ... + Xn ~ N(nμ, nσ²)
The following example illustrates the difference between multiples
and sums of random variables.
(a) A bottle of each size is selected at random. Find the probability
that the large bottle contains less than four times the
amount in the small bottle.
(b) One large and four small bottles are selected at random. Find
the probability that the amount in the large bottle is less than
the total amount in the four small bottles.
Solution 7.10 Let S be the r.v. ‘the amount, in ml, in a small bottle’. Then
S ~ N(252, 4).
Let L be the r.v. ‘the amount, in ml, in a large bottle’. Then
L ~ N(1012, 25).
(a) We need P(L < 4S) = P(L - 4S < 0).
Now E(L - 4S) = E(L) - E(4S) (multiple of S)
= E(L) - 4E(S)
= 1012 - 1008
= 4
Var(L - 4S) = Var(L) + Var(4S)
= Var(L) + 16Var(S)
= 25 + 64
= 89
So L - 4S ~ N(4, 89)
P(L - 4S < 0) = P(Z < (0 - 4)/√89)
= P(Z < -0.424)
= 0.3358
Therefore the probability that the large bottle contains less
than four times the amount of a small bottle is 0.3358.
P(L - (S1 + ... + S4) < 0) = P(Z < (0 - 4)/√41)
= P(Z < -0.625)
= 0.266
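Parts (a) and (b) of the bottle example differ only in whether the small bottle enters as a multiple (4S) or as a sum of four independent bottles (S1 + ... + S4). A minimal sketch of both cases, using statistics.NormalDist from the Python standard library, is given below. Multiplying a NormalDist by a constant scales the standard deviation, whereas adding NormalDist objects treats them as independent variables; the variable names are illustrative.

```python
from statistics import NormalDist

S = NormalDist(252, 2)    # small bottle: S ~ N(252, 4)
L = NormalDist(1012, 5)   # large bottle: L ~ N(1012, 25)

# (a) One small bottle, amount multiplied by four: Var(4S) = 16 * 4 = 64.
diff_multiple = L - S * 4            # L - 4S ~ N(4, 89)

# (b) Four separate small bottles: Var(S1 + ... + S4) = 4 * 4 = 16.
diff_sum = L - (S + S + S + S)       # L - (S1 + ... + S4) ~ N(4, 41)

p_a = diff_multiple.cdf(0)
p_b = diff_sum.cdf(0)
print(round(diff_multiple.variance, 6), round(p_a, 4))
print(round(diff_sum.variance, 6), round(p_b, 3))
```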
Exercise 7c
If X ~ N(40, 12) and Y ~ N(60, 15), find single observation from the population of
(a) P(2X > 90), (b) P(4Y< 270), X is greater than two-thirds of the value
(c) P(83X—2Y < 20), (d) P[d(X+ Y)>55}. of a single observation from the population
of Y.
Le 1.52), B~ N(42, 0.37) and
~ N(85, 0.77), find (a) P(3A < 250), If X ~ N(50,16) and Y ~ N(40, 9), find
ee > 255), (c) P(3A > 6B), (a) P(2X+ Y > 120), (b) P[s(X— Y) > 0],
(d) P(2B+A > 2C), (e) PLX(A+ B) < 64], (c) P(100 < 3X— Y <130).
(f) P(A(A+ B+ C)> 70).
If X ~ N(30, 4) find (a) P(5X > 160),
The r.v. X is normally distributed with
(b) P(Y > 160) where Y= X,+...+
Xs.
mean UL and variance 6, and the r.v. Y is
normally distributed with mean 8 and
The thickness, Pcem, of a randomly
variance 0”. If the r.v. 2X—3Y is normally
chosen paperback book may be regarded
distributed with mean — 12 and variance
as an observation from a normal distribu-
42, find (a) the values of uw and O-:
tion with mean 2.0 and variance 0.730.
(b) P(X > 8), (c) (PY <9), The thickness, H cm, of a randomly chosen
(ayaa 8X2Y <i):
hardback book may be regarded as an
The r.v. X is distributed normally with observation from a normal distribution
mean 25 and standard deviation 4, the r.v. with mean 4.9 and variance 1.920.
Y is distributed normally with mean 30 (a) Determine the probability that the
and standard deviation 3, and X and Y are combined thickness of four randomly
independent. Find the probability that a chosen paperbacks is greater than the
Solution 7.11 (a) Let X be the r.v. 'the volume in ml of liquid in a bottle'.
Then X ~ N(20.42, 0.429²)
P(X < 20) = P(Z < (20 - 20.42)/0.429)
= P(Z < -0.979)
= 0.1637
μ = 20 + (2.326)(0.429)
= 21.00
The adjusted value of the mean should be 21.00 ml of drug.
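The adjusted mean can be recomputed with the inverse of the standard normal distribution function. This sketch assumes that the requirement behind the value 2.326 is that no more than 1% of bottles contain less than 20 ml; that percentage is inferred from the standardised value and is not stated in the surviving text.

```python
from statistics import NormalDist

sigma = 0.429     # standard deviation in ml, as in the solution above
minimum = 20      # volume in ml that should rarely be undercut
tail = 0.01       # ASSUMPTION: at most 1% of bottles may fall below the minimum

z = NormalDist().inv_cdf(1 - tail)            # upper 1% point of N(0, 1), about 2.326
mu = minimum + z * sigma                      # mean needed to meet the requirement
check = NormalDist(mu, sigma).cdf(minimum)    # proportion below 20 ml with the new mean

print(round(z, 3), round(mu, 2), round(check, 4))
```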
P(X - Y > 0) = P(Z > (0 - (-1.35))/√0.2281)   (s.d. = √0.2281)
= P(Z > 2.827)
= 0.00235
Example 7.12 The random variable X has a normal distribution with parameters
μ and σ². Derive the mean and variance of X.
(You may assume that (1/√(2π)) ∫ e^(-t²/2) dt = 1, the integral being taken from -∞ to ∞.)
Solution 7.12 For the derivation of the mean and variance of the normal distribution
see p. 317.
then P((X - 60)/5 > (x - 60)/5) = 0.99
i.e. P(Z > (x - 60)/5) = 0.99
therefore (x - 60)/5 = -2.326
x = 48.37
(b)
60— =
P(60<Y<75) = P| < = ie )
4 s.d.=4
P(—2.5<Z<1.25)
0.8323 60 70 «75
S.V; 2.5 O 1.25
Therefore the probability that the mechanism will fail in the inten-
sive inspection period is 0.8323.
For the normal variable X such that X ~ N(μ, σ²) and for any
constant a
aX ~ N(aμ, a²σ²)
1. The weights of grade A oranges are normally distributed with mean 200 g and standard deviation 12 g. Determine, correct to 2 significant figures, the probability that
(a) a grade A orange weighs more than 190 g but less than 210 g,
(b) a sample of 4 grade A oranges weighs more than 820 g.
The weights of grade B oranges are normally distributed with mean 175 g and standard deviation 9 g. Determine, correct to 2 significant figures, the probability that
(c) a grade B orange weighs less than a grade A orange,
(d) a sample of 8 grade B oranges weighs more than a sample of 7 grade A oranges. (C)

2. Prints from two types of film C and D have developing times which can be modelled by normal variables, C with mean 16.18 s and standard deviation 0.11 s and D with mean 15.88 s and standard deviation 0.10 s.
(a) What is the probability that a type C print will take less than 16 s to develop?
(b) A type C print is developed and immediately afterwards a type D print is developed. What is the probability that the total time is greater than 32.5 s?
(c) What is the probability of a type C print taking longer to develop than a type D print? (SUJB)

3. In testing the length of life of electric light bulbs of a particular type, it is found that 12.3% of the bulbs tested fail within 800 hours and that 28.1% are still operating 1100 hours after the start of the test. Assuming that the distribution of the length of life is normal, calculate, to the nearest hour in each case, the mean, μ, and the standard deviation, σ, of the distribution.
A light fitting takes a single bulb of this type. A packet of three bulbs is bought, to be used one after the other in this fitting. State the mean and variance of the total life of the three bulbs in the packet in terms of μ and σ and calculate, to two decimal places, the probability that the total life is more than 3300 hours.
Calculate the probability that all three bulbs have lives in excess of 1100 hours, so that again the total life is more than 3300 hours. Explain why this answer should be different from the previous one. (JMB)

4. The weight of a large loaf of bread is a normal variable with mean 420 g and standard deviation 30 g. The weight of a small loaf of bread is a normal variable with mean 220 g and standard deviation 10 g.
(a) Find the probability that 5 large loaves weigh more than 10 small loaves.
(b) Find the probability that the total weight of 5 large loaves and 10 small loaves lies between 4.25 kg and 4.4 kg. (C)

5. The tensile strengths, measured in newtons (N), of a large number of ropes of equal length are independently and normally distributed such that five per cent are under 706 N and five per cent over 1294 N. Four such ropes are randomly selected and joined end-to-end to form a single rope; the strength of the combined rope is equal to the strength of the weakest of the four selected ropes. Derive the probabilities that this combined rope will not break under tensions of 1000 N and 900 N, respectively.
A further four ropes are randomly selected and attached between two rings, the strength of the arrangement being the sum of the strengths of the four separate ropes. Derive the probabilities that this arrangement will break under tensions of 4000 N and 4200 N, respectively.
Find the smallest number of ropes that should be selected if the probability that at least one of them has a strength greater than 1000 N is to exceed 0.99. (JMB)

6. The independent random variables X1 and X2 are normally distributed with means μ1, μ2 and variances σ1², σ2² respectively. What is the distribution of the random variable Y = a1X1 + a2X2?
Certain components for a revolutionary new sewing machine are assembled by inserting a part of one type (sprotsil) into a part of another type (weavil). Sprotsils have external dimensions which are normally distributed with mean 2.50 cm and standard deviation 0.018 cm. Weavils have internal dimensions which are normally distributed with mean 2.54 cm and standard deviation 0.024 cm. Under
suitable pressure, the two types fit together satisfactorily if the dimensions differ by not more than ±0.035 cm.
Show that, if pairs of parts are chosen at random, the difference
D = internal dimension of a weavil - external dimension of a sprotsil
is distributed with mean 0.04 cm and standard deviation 0.030 cm. Hence show that approximately 42.8% of randomly selected pairs will fit together satisfactorily.
Now, if it is known that the internal dimension of a given weavil is 2.517 cm, what is the probability that a randomly chosen sprotsil will fit this weavil satisfactorily? (AEB 1980)

7. The mass of a cheese biscuit has a normal distribution with mean 6 g and standard deviation 0.2 g. Determine the probability that
(a) a collection of twenty-five cheese biscuits has a mass of more than 149 g,
(b) a collection of thirty cheese biscuits has a mass of less than 180 g,
(c) twenty-five times the mass of a cheese biscuit is less than 149 g.
The mass of a ginger biscuit has a normal distribution with mean 10 g and standard deviation 0.3 g. Determine the probability that a collection of seven cheese biscuits has a mass greater than a collection of four ginger biscuits.
(It may be assumed that all the biscuits were sampled at random from their respective populations.) (C)

8. In a packaging factory, the empty containers for a certain product have a mean weight of 400 g with a standard deviation of 10 g. The mean weight of the contents of a full container is 800 g with a standard deviation of 15 g. Find the expected total weight of 10 full containers and the standard deviation of this weight, assuming that the weights of containers and contents are independent.
Assuming further that these weights are normally distributed random variables, find the proportion of batches of 10 full containers which weigh more than 12.1 kg.
If 1% of the containers are found to be holding weights of product which are less than the guaranteed minimum amount, deduce this minimum weight. (O & C)

9. Next May, an ornithologist intends to trap one male cuckoo and one female cuckoo. The mass M of the male cuckoo may be regarded as being a normal random variable with mean 116 g and standard deviation 16 g. The mass F of the female cuckoo may be regarded as being independent of M and as being a normal random variable with mean 106 g and standard deviation 12 g. Determine
(a) the probability that the mass of the two birds together will be more than 230 g,
(b) the probability that the mass of the male will be more than the mass of the female.
By considering X = 9M - 16F, or otherwise, determine the probability that the mass of the female will be less than nine-sixteenths of that of the male.
Suppose that one of the two trapped birds escapes. Assuming that the remaining bird will be equally likely to be the male or the female, determine the probability that its mass will be more than 118 g. (C)

10. A train leaves a station punctually at its scheduled time, which is currently 08 08 hours (i.e. 8 minutes past 8 a.m.). A bus is due to arrive at that station at 08 00 hours, but in fact its arrival time is normally distributed about the scheduled time with standard deviation 5 minutes. Transfer from bus to train requires 1 minute. What is the probability that the bus-train connection is made?
It is proposed to change the scheduled departure time of the train (it must still be an exact minute, e.g. 08 09 hours, 08 10 hours). What would be the earliest scheduled departure time in order that the probability of making the bus-train connection should be at least 99%?
The train travels to a junction station, its journey time being normally distributed with mean 15 minutes and standard deviation 1.6 minutes. A connecting train leaves the junction punctually at 08 29 hours. Transfer between the two trains can be regarded as instantaneous. What is the probability that the two trains will connect with the original train schedule?
Find what departure times (exact minutes) of the train from the first station will result in both connections being made with probability at least 95%. Find also whether it is possible to arrange for this probability to be at least 97.5%. (MEI)
Proof:
E(X̄) = E[(1/n)(X1 + X2 + ... + Xn)]
= (1/n)[E(X1) + E(X2) + ... + E(Xn)]
= (1/n)(μ + μ + ... + μ) = (1/n)(nμ) = μ
Var(X̄) = Var[(1/n)(X1 + X2 + ... + Xn)]
= (1/n²)[Var(X1) + Var(X2) + ... + Var(Xn)]
= (1/n²)(σ² + σ² + ... + σ²) = (1/n²)(nσ²) = σ²/n
Example 7.13 The discrete r.v. X has probability distribution P(X = x) where
P(X = 0) = 0.5, P(X = 1) = 0.3, P(X = 2) = 0.2. The mean μ is
0.7 and the variance σ² is 0.61. Random samples of size 2 are taken
from the distribution. By considering all possible samples, find the
probability distribution of the mean X̄ of such samples. Verify that
E(X̄) = μ and Var(X̄) = σ²/n.
Solution 7.13 Consider the samples of size 2 from the distribution. For the sample
(2, 1) say, P(X = 2)·P(X = 1) = (0.2)(0.3) = 0.06 and the sample
mean is 1.5. Summarising the results for all the samples:
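The enumeration of all samples of size 2 can be carried out mechanically. The sketch below is one possible way to do it in Python, with illustrative variable names; it assumes sampling with replacement, so that the two observations are independent, and builds the probability distribution of the sample mean before checking its mean and variance.

```python
from collections import defaultdict
from itertools import product

dist = {0: 0.5, 1: 0.3, 2: 0.2}      # P(X = x) from the example
n = 2                                # sample size

mu = sum(x * p for x, p in dist.items())
sigma2 = sum(x * x * p for x, p in dist.items()) - mu**2

# Probability distribution of the sample mean over all ordered samples.
mean_dist = defaultdict(float)
for sample in product(dist, repeat=n):
    prob = 1.0
    for x in sample:
        prob *= dist[x]              # independent observations: probabilities multiply
    mean_dist[sum(sample) / n] += prob

mean_of_means = sum(m * p for m, p in mean_dist.items())
var_of_means = sum(m * m * p for m, p in mean_dist.items()) - mean_of_means**2

for m in sorted(mean_dist):
    print(m, round(mean_dist[m], 2))
print(round(mean_of_means, 4), round(var_of_means, 4), round(sigma2 / n, 4))
```

The last line compares E(X̄) with μ and Var(X̄) with σ²/n.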
We have
Example 7.14 (a) For the set of numbers 1, 4, 7 find the mean μ and the variance σ².
Variance of sample means = (Population variance)/3
Example 7.15 Find the mean μ and the variance σ² of the population 1, 4, 7.
Draw up a frequency distribution of the means of all possible
samples of size 2, taken without replacement. Find the mean and
variance of this distribution and verify that
Var(X̄) = (σ²/n)((N - n)/(N - 1))
where X̄ is the r.v. 'the sample mean', N is the number in the
population and n is the sample size.
What happens as N → ∞?
Sample:      (1,4)  (1,7)  (4,1)  (4,7)  (7,1)  (7,4)
Sample mean:  2.5    4      2.5    5.5    4      5.5
From calculator, mean = 4, variance = 1.5.
(σ²/n)((N - n)/(N - 1)) = (6/2)((3 - 2)/(3 - 1)) with σ² = 6, N = 3, n = 2
= 1.5
= Var(X̄) as required.
Now, as N → ∞, (N - n)/(N - 1) → 1 and Var(X̄) → σ²/n.
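The check in Example 7.15 can be repeated for any small population. The following is a minimal sketch, using only the Python standard library, that lists every ordered sample taken without replacement and compares the variance of the sample means with the formula; the variable names are illustrative.

```python
from itertools import permutations
from statistics import mean, pvariance

population = [1, 4, 7]
N = len(population)     # number in the population
n = 2                   # sample size

# Ordered samples without replacement: (1,4), (1,7), (4,1), (4,7), (7,1), (7,4).
sample_means = [mean(sample) for sample in permutations(population, n)]

sigma2 = pvariance(population)                 # population variance
var_of_means = pvariance(sample_means)         # variance of the sample means
formula = (sigma2 / n) * (N - n) / (N - 1)     # (σ²/n)((N - n)/(N - 1))

print(sample_means)
print(mean(sample_means), var_of_means, formula)
```

Replacing the population with 1, 4, 7, 8 gives the corresponding check for a population of four.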
Exercise 7e
For each of the following distributions, where X is the r.v. ‘the sample mean’, N
(a) find the mean wu and the variance 0°, is the number in the population and n is
(b) by taking all possible samples of size the sample size.
2
ee nO) What happens as N > ©?
2 verify that E(X) = wand Var(X) = ee
Find the mean and the variance o” of the Another student adopts a different proc-
population 1,4,7,8. Draw up a frequency edure and she selects nine onions at ran-
distribution of the means of all possible dom without replacing them. What would
samples of size 2, taken without replace- you expect the standard deviation of the
ment. mean weight of the nine onions to be in
this case?
Find the mean and the variance of this
distribution and verify that For each approach, what is the minimum
number of onions that have to be selected
2 =
Var(X) = a= if the standard deviation of the sample |
mean is to be less than 3 grammes? (O)
Example 7.17 The heights of a particular species of plant follow a normal distribution
with mean 21 cm and standard deviation √90 cm. A random
sample of 10 plants is taken and the mean height calculated. Find
the probability that this sample mean lies between 18 cm and
27 cm.
Solution 7.17 Let X be the r.v. 'the height in cm of a plant'. Then X ~ N(21, 90).
Now n = 10, so X̄ ~ N(21, 90/10), i.e. X̄ ~ N(21, 9).
We require
P(18 < X̄ < 27) = P((18 - 21)/3 < (X̄ - 21)/3 < (27 - 21)/3)
= P(-1 < Z < 2)
= 0.8185
Therefore the probability that the mean height of the sample lies
between 18cm and 27cm is 0.8185.
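The result of Example 7.17 can be confirmed with a few lines of Python. This is a minimal sketch with illustrative names; the only step specific to sample means is dividing the population variance by the sample size before taking the square root.

```python
from math import sqrt
from statistics import NormalDist

mu, variance, n = 21, 90, 10                 # X ~ N(21, 90), sample of 10 plants

# Distribution of the sample mean: N(mu, variance / n) = N(21, 9).
xbar = NormalDist(mu, sqrt(variance / n))

p = xbar.cdf(27) - xbar.cdf(18)              # P(18 < sample mean < 27)
print(xbar.stdev, round(p, 4))
```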
Example 7.18 A large number of random samples of size n are taken from the
distribution of X where X ~ N(74, 36) and the sample means are
calculated. If P(X̄ > 72) = 0.854, estimate the value of n.
Solution 7.18 X̄ ~ N(74, 36/n), so the standard deviation of X̄ is 6/√n and
P(Z > (72 - 74)/(6/√n)) = 0.854
so P(Z > -√n/3) = 0.854
Therefore √n/3 = 1.054
n = 9(1.054)²
= 10 (2 S.F.)
Samples of size 10 are taken.
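The estimate of n in Example 7.18 can be reproduced by inverting the standard normal distribution function instead of reading tables. The sketch below is illustrative; the names threshold and prob are introduced here and are not from the text.

```python
from statistics import NormalDist

mu, sigma = 74, 6              # X ~ N(74, 36)
threshold, prob = 72, 0.854    # P(sample mean > 72) = 0.854

# P(Z > -(mu - threshold) * sqrt(n) / sigma) = prob
# means (mu - threshold) * sqrt(n) / sigma = z, where P(Z < z) = prob.
z = NormalDist().inv_cdf(prob)                    # about 1.054
n_exact = (z * sigma / (mu - threshold)) ** 2     # solve for n
n = round(n_exact)

print(round(z, 3), round(n_exact, 2), n)
```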
Example 7.19 (a) If X1, X2, ..., Xn is a random sample from N(μ, 1), state the
distribution of the sample mean X̄.
(b) Find the sample size required to ensure that the probability
that X̄ is within 0.1 of μ is greater than 0.95.
Solution 7.19 (a) X ~ N(μ, 1), therefore X̄ ~ N(μ, 1/n).
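Part (b) of Example 7.19 asks for the smallest n with P(|X̄ - μ| < 0.1) > 0.95. Since X̄ ~ N(μ, 1/n), this requires 0.1√n to exceed the upper 2.5% point of N(0, 1). The following is a minimal sketch of that calculation, offered as one way to reach the answer and not as the book's own working; the helper coverage is hypothetical.

```python
from math import ceil, sqrt
from statistics import NormalDist

sigma = 1.0          # population standard deviation
margin = 0.1         # required closeness of the sample mean to mu
confidence = 0.95    # required probability

# Two-sided requirement, so use the point with 2.5% in the upper tail.
z = NormalDist().inv_cdf((1 + confidence) / 2)    # about 1.96
n = ceil((z * sigma / margin) ** 2)               # smallest whole n satisfying the condition

def coverage(size):
    """P(|sample mean - mu| < margin) for a sample of the given size."""
    return 2 * NormalDist().cdf(margin * sqrt(size) / sigma) - 1

print(n, round(coverage(n), 4), round(coverage(n - 1), 4))
```

Checking the coverage for n and for n - 1 confirms that the value found is the least sample size that works.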
Exercise 7f
1. If X ~ N(200, 80) and a random sample of size 5 is taken from the distribution, find the probability that the sample mean (a) is greater than 207, (b) lies between 201 and 209.

2. If X ~ N(200, 100) and a random sample of size 10 is taken from the distribution, find the probability that the sample mean lies outside the range 198 to 205.

3. If X ~ N(50, 12) and a random sample of size 12 is taken from the distribution, find the probability that the sample mean (a) is less than 48.5, (b) is less than 52.3, (c) lies between 50.7 and 51.7.

4. At a college, the masses of the male students are distributed approximately normally with mean mass 70 kg and standard deviation 5 kg. Four male students are chosen at random. Find the probability that their mean mass is less than 65 kg.

5. A normal distribution has a mean of 40 and a standard deviation of 4. If 25 items are drawn at random, find the probability that their mean is (a) 41.4 or more, (b) between 38.7 and 40.7, (c) less than 39.5.

6. If a large number of samples, size n, are taken from a population which follows a normal distribution with mean 74 and standard deviation 6, (a) find n if the probability that the sample mean exceeds 75 is 0.282, (b) find n if the probability that the sample mean is less than 70.4 is 0.00135.

7. A normal distribution has a mean of 30 and a variance of 5. Find the probability that (a) the average of 10 observations exceeds 30.5, (b) the average of 40 observations exceeds 30.5, (c) the average of 100 observations exceeds 30.5. Find n such that the probability that the average of n observations exceeds 30.5 is less than 1%.

8. The r.v. X is such that X ~ N(μ, 4). A random sample, size n, is taken from the population. Find the least n such that P(|X̄ - μ| < 0.5) > 0.95.

9. X̄ is the r.v. 'the sample mean of samples, size 15, taken from N(30, 18)' and Ȳ is the r.v. 'the sample mean of samples, size 8, taken from N(20, 16)'. Find the distribution of (a) X̄ - Ȳ, (b) X̄ + Ȳ, (c) Ȳ - X̄, (d) 5X̄ + 8Ȳ, (e) 4X̄ - 2Ȳ.

10. In a certain country the heights of men are normally distributed with mean 175 cm and standard deviation 5 cm and the heights of women are normally distributed with mean 165 cm and standard deviation 6 cm. Find the probability that the mean height of three women chosen
at random is greater than the mean height from each distribution. Find the proba-
of four men chosen at random from the bility that the sample from the distribu-
population. tion of Y will have a mean which is at
least 21 more than the mean of the
it The continuous random variable X is sample from the distribution of X.
such that X ~ N(20, 16). If samples of
size n are taken and X is the random
16. Every child in a class does an experiment
variable ‘the mean of the n sample values’,
which consists of measuring V, the
find the least value of n such that
volume of water displaced by a solid
P(X > 21) < 0.05.
sphere. The children’s values of V are
distributed approximately normally with
12. A random sample Xj, X2 is drawn from a mean 27.4cm* and standard deviation
distribution with mean yu and standard lem’.
deviation 0. State the mean and standard
(a) Given that the nominal volume Vo of
deviation of the distribution of (a) X; + Xp, the sphere is 27.1 em?, estimate to 2
(b) X;—X2, (c) X. decimal places the probability, p, that the
A student’s performance is equally good value of V of a child chosen at random
in two subjects. The marks he might be ' exceeds 1.05 Vo.
expected to score in each subject may be (b) The nominal radius ro of the sphere, ,
treated as independent observations calculated from the formula rp = (3V9/47)3,
drawn from a normal distribution with is 1.86 cm. Each child calculates a value
mean 45 and standard deviation 5. Two r for the radius of the sphere, using the
procedures might be used to decide formula r = (8V/47)3. Explain why you
whether to give the student an overall would expect the probability that a value
pass. One is to demand that he pass of r of a child chosen at random exceeds
separately in each subject, the pass mark 1.05rg to be different from the value of p
being 40; the other is to require that his you obtained in (a). Show that, in fact, it
mean mark in the two subjects exceeds is extremely unlikely that any child’s
40. Find the probability that the student value of r exceeds 1.05ro.
will obtain an overall pass by each of (Hint: express r > 1.05ro in terms of V
these procedures. (O) and Vo.)
(c) The measured values of V obtained by
13. In a certain nation, men have heights
a second class of children are also distri-
distributed normally with mean 1.70m
buted approximately normally with the
and standard deviation 10cm. Find the
same mean as the first class, but with a
probability that a man chosen randomly
standard deviation of 1.5 cm”. One child
has height not less than 1.83 m.
from each class is chosen at random.
What is the probability that the average Estimate to 2 decimal places the proba-
height of three men chosen randomly bility that the mean of their values of V
is greater than 1.78 m and the probability exceeds 1.05 Vo. (MEI)
that all three will have heights greater
than 1.83 m?
For the nation, women have heights 17. (a) If X and Y are independent random
distributed normally with mean 1.60m variables with means Mx, My and variances
pee 4 ;
and standard deviation 7.5 cm. Find the Ox", Oy respectively, show from. first
probability that a husband and wife have principles that the mean and variance of
not more than 5 cm difference in heights aX+ bY are au,+ buy and a’o,/+ bo,"
and state the assumptions that you have respectively where a and 0 are constants.
made in the calculation. (MEI) (6) The diameters x of 110 steel rods were
measured in centimetres and the results
14. X, and X> are random variables such that were summarised as follows:
X;,is normally distributed with mean 120 DoT, 86 Butea tae Le. 0.
and variance 8 and Xz is normally dis-
tributed with mean 150 and variance 22. Find the mean and standard deviation of
A random sample of size 20 is taken from these measurements.
the distribution of 3X,+ 4X). Find the Assuming these measurements are a
distribution of the sample mean. sample from a normal distribution with
this mean and this variance, find the
15. Random variables X and Y are such that probability that the mean diameter of
X ~N(100,10) and Y ~ N(120, 20). a sample of size 110 is greater than
Random samples of size 50 are taken 0.345 cm. (O&C)
18. The number of miles travelled per week by a motorist is distributed normally with a mean of 640 and a standard deviation of 50.
(a) Calculate the probabilities that in a week he will travel (i) more than 600 miles, (ii) between 600 and 700 miles.
(b) Calculate the probability that the average number of miles travelled per week over a complete year of 52 weeks will exceed 650.
(c) If the car's petrol consumption is 30 miles per gallon, calculate the probability that the motorist will use less than 80 gallons of petrol over a period of 4 weeks. (JMB)
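If $X_1, X_2, \ldots, X_n$ is a random sample of size n from any distribution with mean $\mu$ and variance $\sigma^2$ then, for large n, the distribution of the sample mean $\bar{X}$ is approximately normal and
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
Now if $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ then $n\bar{X} \sim N(n\mu, n\sigma^2)$
But $n\bar{X} = X_1 + X_2 + \ldots + X_n$
therefore $X_1 + X_2 + \ldots + X_n \sim N(n\mu, n\sigma^2)$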
Example 7.20 If a random sample of size 30 is taken from each of the following
distributions, find, for each case, the probability that the sample
mean exceeds 5.
So $P(\bar{X} > 5) = P(Z > 1.291) = 0.0983$.
(b) $X \sim \text{Bin}(9, 0.5)$
so
$E(X) = (9)(0.5) = 4.5$ and $\text{Var}(X) = (9)(0.5)(0.5) = 2.25$
Now, by the central limit theorem, $\bar{X} \sim N\left(4.5, \frac{2.25}{30}\right)$.
The standard deviation of $\bar{X} = \sqrt{\frac{2.25}{30}} = \sqrt{0.075}$.
(c) $X \sim R(3, 6)$.
Then
$E(X) = \frac{1}{2}(3 + 6) = 4.5$ and $\text{Var}(X) = \frac{(6 - 3)^2}{12} = 0.75$
So $\bar{X} \sim N\left(4.5, \frac{0.75}{30}\right)$ by the central limit theorem.
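Example 7.21 If a large number of samples of size n are taken from Po(2.5) and approximately 5% of the sample means are less than 2.025, estimate n.
Solution 7.21 $X \sim \text{Po}(2.5)$, so $E(X) = 2.5$ and $\text{Var}(X) = 2.5$.
By the central limit theorem, $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ approximately,
i.e. $\bar{X} \sim N\left(2.5, \frac{2.5}{n}\right)$, with s.d. $= \sqrt{2.5/n}$.
We require $P(\bar{X} < 2.025) = 0.05$,
i.e. $\frac{2.025 - 2.5}{\sqrt{2.5/n}} = -1.645$
so $\sqrt{n} = \frac{1.645\sqrt{2.5}}{0.475} = 5.476$
so $n = 29.98$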
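The estimate of n in Example 7.21 can be checked numerically. The following is only a sketch: it assumes Python's standard `statistics.NormalDist` in place of normal tables, and the helper name `sample_size_for_lower_tail` is hypothetical, introduced here for illustration.

```python
from math import sqrt
from statistics import NormalDist


def sample_size_for_lower_tail(mu, variance, cutoff, tail_prob):
    """Return n such that P(sample mean < cutoff) = tail_prob,
    assuming the sample mean is N(mu, variance / n)."""
    z = NormalDist().inv_cdf(tail_prob)          # about -1.645 for a 5% lower tail
    root_n = z * sqrt(variance) / (cutoff - mu)  # rearranged standardisation
    return root_n ** 2


# Po(2.5) has mean 2.5 and variance 2.5; 5% of sample means lie below 2.025
n_estimate = sample_size_for_lower_tail(mu=2.5, variance=2.5,
                                        cutoff=2.025, tail_prob=0.05)
print(round(n_estimate, 2))
```

Since n must be a whole number, the value printed is rounded to the nearest integer to give the sample size.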
Exercise 7g
1. To find the mean life and the standard 2. Ifa large number of samples, size 30, are
deviation of a particular make of fluores- taken with replacement from the following
cent light bulbs a large number of samples distribution, find the mean and standard
of 100 bulbs are tested. The mean and deviation of the sampling distribution of
the standard deviation of the resulting means. Estimate the probability that a
sampling distribution of means were sample mean exceeds 4.
found to be 1580 hours and 120 hours,
respectively. Calculate the mean life and
lx | 0 oo 3. Au 6406
the standard deviation of this make of fst G0n16 227) 2116 5
light bulbs.
A random sample of size 100 is taken standard error of the sampling distribu-
from Bin(20, 0.6). Find the probability tion obtained were 20 500 km and
that (a) X is greater than 12.4, (0) X is 250 km respectively. Estimate the mean
less than 12.2, where X is the sample life and the standard deviation of this
mean. brand of car tyre.
OR a
If a large number of samples of size n
f{ay= (axe 42= 0
are taken from Bin(20, 0.2) and approx-
ax, O<Sx<2,
imately 90% of the sample means are
On eet
less than 4.354, estimate n.
where a is a constant. Sketch the graph
10. If a large number of samples of size n of f and hence, or otherwise, find the
is taken from R(2, 6) and approximately value of a.
1% of the sample means are less than
Show that Var(X) = 2.
2.8, estimate the value of n.
A random sample of 200 independent
11. To find the mean life and standard observations of X is taken. Using a
deviation of a certain brand of car suitable approximation, find the proba-
tyres a large number of random samples bility that the sample mean exceeds 0.2.
of size 50 were tested. The mean and (C)
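$E(P_s) = E\left(\frac{X}{n}\right) = \frac{1}{n}E(X) = \frac{np}{n} = p$
and
$\text{Var}(P_s) = \text{Var}\left(\frac{X}{n}\right) = \frac{1}{n^2}\text{Var}(X) = \frac{npq}{n^2} = \frac{pq}{n}$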
NOTE: the larger the sample size, the better the approximation.
The distribution of $P_s$ is known as the sampling distribution of proportions and the standard deviation of the sampling distribution, $\sqrt{\frac{pq}{n}}$, is known as the standard error of proportion.
Example 7.22 It is known that 3% of frozen pies arriving at a freezer centre are
broken. What is the probability that, on a morning when 500 pies
arrive, (a) 5% or more will be broken, (b) 3% or less will be
broken?
Solution 7.22 Let $P_s$ be the r.v. 'the proportion of pies in the sample which are broken'.
Then $P_s \sim N\left(p, \frac{pq}{n}\right)$ approximately, where n = 500, p = 0.03,
so $\frac{pq}{n} = \frac{(0.03)(0.97)}{500}$
i.e. $P_s \sim N(0.03, 0.00763^2)$
(a) We require
$P(P_s \geq 0.05) \to P\left(P_s > 0.05 - \frac{1}{(2)(500)}\right)$ (continuity correction)
$= P\left(\frac{P_s - 0.03}{0.00763} > \frac{0.049 - 0.03}{0.00763}\right)$
$= P(Z > 2.49)$
$= 0.00639$
Therefore the probability that 5% or more will be broken is 0.006 39.
(b) We require $P(P_s \leq 0.03)$. Now 3% of 500 is 15, so we work with X, the number of broken pies in the sample, where $X \sim N(15, 3.814^2)$ approximately.
So we require
$P(X \leq 15) \to P(X < 15.5)$ (continuity correction)
$= P\left(\frac{X - 15}{3.814} < \frac{15.5 - 15}{3.814}\right)$
$= P(Z < 0.131)$
$= 0.5521$
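As a check on Example 7.22, the two probabilities can be computed directly from the normal approximation with the continuity correction of 1/(2n). This is a minimal sketch assuming Python's standard `statistics.NormalDist`; the variable names are illustrative and not from the text.

```python
from math import sqrt
from statistics import NormalDist

n = 500      # pies arriving
p = 0.03     # known proportion of broken pies

# Standard error of proportion, sqrt(pq/n)
sd = sqrt(p * (1 - p) / n)
proportion = NormalDist(mu=p, sigma=sd)

# Continuity correction on the proportion scale is 1/(2n)
half_step = 1 / (2 * n)

prob_a = 1 - proportion.cdf(0.05 - half_step)   # 5% or more broken
prob_b = proportion.cdf(0.03 + half_step)       # 3% or less broken

print(round(sd, 5), round(prob_a, 4), round(prob_b, 4))
```

Small differences from table values are to be expected, because tables round z to two or three decimal places.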
Exercise 7h
For large n, the result holds for a random sample taken from
any distribution.
= 1 — Pp,
RANDOM SAMPLING
Example 7.23 Discuss how to select, at random, a sample of two people from a
group of six.
Solution 7.23 Write the name of each person on one of six otherwise identical
discs and mix them thoroughly in a hat. Without looking, select a
disc, note the name and return it to the hat. Draw again. If the first
name re-appears, disregard it and repeat the procedure until a
different name appears. The sample of two people is then obtained.
An alternative method might be to allocate to each person one of
the numbers 1, 2, 3, 4, 5, 6 and then select the people correspon-
ding to the numbers obtained on a die when it is thrown twice, for
example (3, 5).
If the population is large then the method of ‘drawing out of a hat’
is obviously not practical. We can however allocate a number to
each item and make the choice by referring to Random Number
Tables, shown on p. 629. If you have a random number generator
on your calculator you will be able to produce a random
3-digit number every time you press it.
Example 7.24 Using random number tables, select at random a sample of 8 people
from a group of 100.
Solution 7.24 Allocate a two-digit number to each person, for example 01 for the
first on the list, 02 for the second, ... to 98, 99, 00 (calling the
hundredth person 00, for convenience).
68 7 538 Bt 59 25 34 W 54 9
32 68 FF AT 05
Example 7.26 Take a random sample of 12 numbers (to 2 d.p.) from the continuous range 0 < x < 10.
Solution 7.26 We require the sample values to have 2 d.p. accuracy so we will
need to consider groups of 3 digits, inserting the decimal point
between the first and second digit. Using list (b) on p. 411.
Example 7.27 Take a random sample of 4 numbers (to 3 d.p.) from the continuous range 0 < x < 5.
Solution 7.27 Using list (c) on p. 411 and disregarding any values out of range,
we have
Example 7.28 Take a random sample of size 5 from the following distribution,
using the random numbers 364294 588330 923918 400300.
Solution 7.28 Consider first the cumulative frequencies and then transfer them to
proportional frequencies with a total proportion of 1. Random
numbers can then be allocated in accordance with the cumulative
proportional frequencies as shown:
Solution 7.29 Form the cumulative distribution function F(x) and then allocate
random numbers in a convenient way:
a ae
Corresponding
random numbers
Example 7.30 Take a random sample of four from a binomial distribution with
parameters n = 4 and p = 0.2, using the random numbers 2811,
5747, 6157, 8988.
Solution 7.30 X ~ Bin(4, 0.2). Since the given random numbers have 4 digits, we
will work to 4 d.p.
x | 0 | 1 | 2 | 3 | 4
Cumulative distribution function, F(x) | 0.4096 | 0.8192 | 0.9728 | 0.9984 | 1
Corresponding random numbers | 0001 to 4096 | 4097 to 8192 | 8193 to 9728 | 9729 to 9984 | 9985 to 9999 and 0000
The given number 2811 is in the range 0001 to 4096 and corresponds to x = 0.
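The allocation of random numbers to values of x in Example 7.30 can be sketched in code. This is an illustration only: the function names `binomial_cdf_table` and `observation_from_digits` are hypothetical, and the convention that a block of zeros (0000) counts as the top of the range is an assumption carried over from the way the tables in this section treat 0000.

```python
from itertools import accumulate
from math import comb


def binomial_cdf_table(n, p):
    """Cumulative probabilities F(0), F(1), ..., F(n) for Bin(n, p)."""
    pmf = [comb(n, x) * p ** x * (1 - p) ** (n - x) for x in range(n + 1)]
    return list(accumulate(pmf))


def observation_from_digits(digits, cdf):
    """Map a block of random digits, e.g. '2811', to the smallest x with u <= F(x)."""
    u = int(digits) / 10 ** len(digits)
    if u == 0:
        u = 1.0  # assumption: 0000 is treated as the top of the range
    for x, cumulative in enumerate(cdf):
        if u <= cumulative + 1e-9:  # small tolerance for floating-point error
            return x
    return len(cdf) - 1


cdf = binomial_cdf_table(4, 0.2)
sample = [observation_from_digits(d, cdf) for d in ("2811", "5747", "6157", "8988")]
print([round(value, 4) for value in cdf])
print(sample)
```

The same function works for any discrete distribution once its cumulative probabilities are listed in order.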
Example 7.31 Using the random number 8135 take a single random observation
from a Poisson distribution with parameter 3.
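x | Cumulative distribution function, F(x) | Corresponding random numbers
0 | 0.0498 | 0001 to 0498
1 | 0.1991 | 0499 to 1991
2 | 0.4232 | 1992 to 4232
3 | 0.6472 | 4233 to 6472
4 | 0.8153 | 6473 to 8153
5 | 0.9161 | 8154 to 9161
6 | 0.9665 | 9162 to 9665
7 | 0.9881 | 9666 to 9881
8 or over | 1 | 9882 to 9999 and 0000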
F(x) = | 3x?
dx
x
x
8
Example 7.33 Use the random numbers 382 824 to take a random sample of two
from the normal distribution N(30, 4).
2
NOTE: $\Phi(z) = 1 - Q(z)$. $\Phi(z) = 0.382$
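The idea in Example 7.33 is to treat each random number as a value of $\Phi(z)$, find z, and then convert to the required normal distribution. A minimal sketch, assuming Python's standard `statistics.NormalDist` in place of tables and reading 382 and 824 as 0.382 and 0.824:

```python
from statistics import NormalDist

# N(30, 4) has variance 4, so the standard deviation is 2
population = NormalDist(mu=30, sigma=2)

random_digits = ("382", "824")

# Treat each block of digits as a cumulative probability and invert the c.d.f.
sample = [population.inv_cdf(int(d) / 10 ** len(d)) for d in random_digits]

print([round(x, 2) for x in sample])
```

A value below 0.5 gives an observation below the mean and a value above 0.5 gives one above it, which is a quick sanity check on any hand calculation.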
Exercise 7i
In the following, use the random number numbers and call the first two digits x
tables on p. 629 if random numbers have and y. Let z=10x+y. If 1<2z2<58
not been given in the question. then the person who was allocated the
number is selected. Otherwise, the person
1. Select a random sample of size 10 (to allocated the number z— 58 is selected.
3 d.p.) from the continuous range Comment on this method of selection.
2 = x= oO:
6. Take a random sample of size 6 from the
Draw up a random sample of 100 num- distribution:
bers from the discrete integer range 0 to
9. Find the mean and variance of the
sample values and compare them with the
theoretical mean and variance.
Take a random sample of size 5 from the (b) Using the table of random numbers
distribution of X where F(x) = 5x, provided, simulate the number of calls
arriving at the switchboard in 30 con-
x = 2,3, 4, 5.
secutive minutes. Indicate precisely how
10. (a) The discrete r.v. X is such that your values have been obtained.
X ~ Bin(3, 0.4). Take a random sample (c) Calculate the sample mean number
of size 5 from this distribution, using the of calls per minute. Given that the mean
random numbers and standard deviation of the number of
407 315 401 203 972 calls per minute obtained from the table
are 1.345 and 1.087 respectively, calculate
(b) Using the random number 6143 take the probability of a random sample of 30
a single random observation from the giving a mean value of at least that
Poisson distribution with parameter 4. obtained from your sample. (SUJB)
There are many estimators which could be formed, but the best (or
most efficient) estimator is the one which (i) is unbiased, and
(ii) has the smallest variance.
Example 8.1 If $X_1, X_2, X_3$ is a random sample taken from a population with mean $\mu$ and variance $\sigma^2$, find which of the following estimators for $\mu$ are unbiased, and which is the most efficient of these.
$T_1 = \frac{X_1 + X_2 + X_3}{3}$, $T_2 = \frac{X_1 + 2X_2}{3}$, $T_3 = \frac{X_1 + 2X_2 + 3X_3}{3}$
ESTIMATION OF POPULATION PARAMETERS
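Solution 8.1 Now
$E(X_i) = \mu$ for i = 1, 2, 3
So $E(T_1) = E\left(\frac{X_1 + X_2 + X_3}{3}\right)$
$= \frac{1}{3}[E(X_1) + E(X_2) + E(X_3)]$
$= \frac{1}{3}(3\mu)$
$= \mu$
As $E(T_1) = \mu$, $T_1$ is an unbiased estimator for $\mu$.
Now $E(T_2) = E\left(\frac{X_1 + 2X_2}{3}\right)$
$= \frac{1}{3}[E(X_1) + 2E(X_2)]$
$= \frac{1}{3}(\mu + 2\mu)$
$= \mu$
As $E(T_2) = \mu$, $T_2$ is an unbiased estimator for $\mu$.
Now $E(T_3) = E\left(\frac{X_1 + 2X_2 + 3X_3}{3}\right)$
$= \frac{1}{3}(\mu + 2\mu + 3\mu)$
$= 2\mu$
As $E(T_3) \neq \mu$, $T_3$ is not an unbiased estimator for $\mu$.
The more efficient estimator is the one which has the smaller variance.
Now $\text{Var}(T_1) = \text{Var}\left(\frac{X_1 + X_2 + X_3}{3}\right) = \frac{1}{9}(3\sigma^2) = \frac{\sigma^2}{3}$
and $\text{Var}(T_2) = \text{Var}\left(\frac{X_1 + 2X_2}{3}\right)$
$= \frac{1}{9}[\text{Var}(X_1) + 4\text{Var}(X_2)]$
$= \frac{5\sigma^2}{9}$
As $\text{Var}(T_1) < \text{Var}(T_2)$, $T_1$ is a more efficient estimator for $\mu$ than $T_2$.
NOTE: the sample $X_1, X_2, X_3$ is made with replacement and the observations are independent.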
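For any linear estimator $T = \sum w_i X_i$ of independent observations, $E(T) = (\sum w_i)\mu$ and $\text{Var}(T) = (\sum w_i^2)\sigma^2$, so the comparison in Example 8.1 reduces to two sums. The sketch below uses exact fractions; the function name `summarise` is hypothetical and introduced only for illustration.

```python
from fractions import Fraction as F


def summarise(weights):
    """For T = sum(w_i * X_i) with independent X_i of mean mu and variance sigma^2,
    return (c, v) where E(T) = c * mu and Var(T) = v * sigma^2."""
    mean_coefficient = sum(weights)
    variance_coefficient = sum(w * w for w in weights)
    return mean_coefficient, variance_coefficient


estimators = {
    "T1": [F(1, 3), F(1, 3), F(1, 3)],
    "T2": [F(1, 3), F(2, 3), F(0)],
    "T3": [F(1, 3), F(2, 3), F(1)],
}

for name, weights in estimators.items():
    c, v = summarise(weights)
    status = "unbiased" if c == 1 else "biased"
    print(name, status, "E(T) =", c, "* mu,", "Var(T) =", v, "* sigma^2")
```

An estimator is unbiased exactly when the weights sum to 1, and among unbiased estimators the one with the smallest sum of squared weights is the most efficient.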
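Example 8.2 Two random samples of sizes n and 3n are taken from normal populations with means $\mu$ and $3\mu$ and variances $\sigma^2$ and $3\sigma^2$ respectively. If $\bar{X}_1$ and $\bar{X}_2$ are the sample means, show that the estimator $a\bar{X}_1 + b\bar{X}_2$ is an unbiased estimator for $\mu$ if $a + 3b = 1$. Also, find the values of a and b if this estimator is to be the most efficient estimator.
Solution 8.2 $E(\bar{X}_1) = \mu$ and $\text{Var}(\bar{X}_1) = \frac{\sigma^2}{n}$
$E(\bar{X}_2) = 3\mu$ and $\text{Var}(\bar{X}_2) = \frac{3\sigma^2}{3n} = \frac{\sigma^2}{n}$
So $E(a\bar{X}_1 + b\bar{X}_2) = a\mu + 3b\mu$
$= \mu(a + 3b)$
$= \mu$ if $a + 3b = 1$
Also
$\text{Var}(a\bar{X}_1 + b\bar{X}_2) = \frac{a^2\sigma^2}{n} + \frac{b^2\sigma^2}{n}$
$= \frac{\sigma^2}{n}(a^2 + b^2)$
But $a + 3b = 1$.
So
$\text{Var}(a\bar{X}_1 + b\bar{X}_2) = \frac{\sigma^2}{n}[(1 - 3b)^2 + b^2]$
$= \frac{\sigma^2}{n}(1 - 6b + 10b^2)$
$= \frac{10\sigma^2}{n}\left[\left(b - \frac{3}{10}\right)^2 + \frac{1}{100}\right]$ (completing the square)
The minimum variance occurs when $b = \frac{3}{10}$, and then $a = 1 - 3b = \frac{1}{10}$.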
CONSISTENT ESTIMATOR
Example 8.4 If $X_1, X_2, \ldots, X_n$ is a random sample taken from a population with mean $\mu$ and variance $\sigma^2$, state a condition which will ensure that the estimator $T = k_1X_1 + k_2X_2 + \ldots + k_nX_n$ is a consistent estimator for $\mu$.
-o Da
n
bS 2
i=1
Exercise 8a
been found that both instruments give two means that is unbiased and has mini-
unbiased readings and that determinations mum variance. (O)
by Instrument A have a variance that is
twice that of determinations by Instrument A random variable X has mean pM and
B. Random samples were taken from an variance 2 and an independent random
ingot of metal and divided between the variable Y has mean 3yu and variance 7.
two instruments. The mean of 12 deter- Find the values of a and b if aX+ bY has
minations by Instrument A was 6.0, in mean LM and minimum variance.
appropriate units, while the mean of 9 The values obtained in a single observation
determinations by Instrument B was 6.5. of each of X and Y are 10 and 25 respec-
Estimate the common mean from these tively. Obtain a best estimate of u and
data by using the linear function of the explain in what sense it is best. (C)
oO im
==> 2 _ainel ===
0)2S ti,= es
n n
The most efficient estimator for $\sigma^2$, written $\hat{\sigma}^2$, is $\frac{nS^2}{n - 1}$.
We write $\hat{\sigma}^2 = \frac{nS^2}{n - 1}$.
Example 8.5 Show that $\frac{nS^2}{n - 1}$ is an unbiased estimator for $\sigma^2$.
1
so nS? = ) xe 0X?
1
and
Example 8.6 Obtain the most efficient, or best, estimates of the population mean and variance from which the following sample is drawn:
19.30, 19.61, 18.27, 18.90, 19.14, 19.90, 18.76, 19.10
Solution 8.6 The best estimate of the population mean is $\hat{\mu}$ where $\hat{\mu} = \bar{x}$, the sample mean.
Now $\bar{x} = \frac{\sum x}{n} = \frac{152.98}{8} = 19.12$ (2 d.p.)
So $\hat{\mu} = 19.12$ (2 d.p.).
The best estimate of the population variance is $\hat{\sigma}^2$ where $\hat{\sigma}^2 = \frac{ns^2}{n - 1}$ and $s^2$ is the sample variance.
Now
$s^2 = \frac{\sum(x - \bar{x})^2}{n} = \frac{1.782}{8} = 0.223$ (3 d.p.)
So $\hat{\sigma}^2 = \frac{ns^2}{n - 1} = \frac{8s^2}{7} = 0.25$ (2 d.p.).
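The estimates in Example 8.6 can be reproduced with Python's standard `statistics` module, offered here only as a cross-check. Note the naming: `pvariance` divides by n (the sample variance $s^2$ as defined in this text), while `variance` divides by n − 1 (the unbiased estimate $\hat{\sigma}^2$).

```python
from statistics import mean, pvariance, variance

sample = [19.30, 19.61, 18.27, 18.90, 19.14, 19.90, 18.76, 19.10]
n = len(sample)

mu_hat = mean(sample)            # best estimate of the population mean
s_squared = pvariance(sample)    # divisor n
sigma_hat_sq = variance(sample)  # divisor n - 1, equal to n * s^2 / (n - 1)

print(round(mu_hat, 2), round(s_squared, 3), round(sigma_hat_sq, 2))
```

The relation $\hat{\sigma}^2 = ns^2/(n - 1)$ holds exactly between the last two quantities, so either can be obtained from the other.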
Example 8.7 Obtain the best unbiased estimates of the population mean and variance from which the following sample is drawn: n = 12, $\bar{x} = 23.5$, $\sum(x - \bar{x})^2 = 48.72$.
eS SS
Exercise 8b
———— Ot
Determine unbiased estimates of the Use the random numbers given below to
mean and the variance of the concentra- generate a random sample of size 20 from
tion of the trace element per litre of , the distribution of X and use it to obtain
water from the spring. (L)P unbiased estimates of the population
mean and variance.
13. Using the random numbers on p. 636
take a random sample of size 10 from the Random numbers: 57048 86526
following distribution: 27795 36820
Example 8.8 A random sample of 50 children from a large school is chosen and the number who are left handed is noted. It is found that 6 are left handed. Obtain an unbiased estimate of the proportion of children in the school who are left handed.
Solution 8.8 From the sample, the proportion of children who are left handed is $p_s$ where $p_s = \frac{6}{50} = 0.12$.
|e ee
o” we take two random samples:
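Then $\hat{\mu} = \frac{n_1\bar{X}_1 + n_2\bar{X}_2}{n_1 + n_2}$
$E(\hat{\mu}) = \frac{E(n_1\bar{X}_1 + n_2\bar{X}_2)}{n_1 + n_2}$
$= \frac{1}{n_1 + n_2}[n_1E(\bar{X}_1) + n_2E(\bar{X}_2)]$
$= \frac{1}{n_1 + n_2}(n_1\mu + n_2\mu)$
$= \mu$
Also $\hat{\sigma}^2 = \frac{n_1S_1^2 + n_2S_2^2}{n_1 + n_2 - 2}$
$E(n_1S_1^2 + n_2S_2^2) = E(n_1S_1^2) + E(n_2S_2^2)$
$= (n_1 - 1)\sigma^2 + (n_2 - 1)\sigma^2$ (see p. 426)
$= (n_1 + n_2 - 2)\sigma^2$
So $E\left(\frac{n_1S_1^2 + n_2S_2^2}{n_1 + n_2 - 2}\right) = \sigma^2$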
Example 8.9 Two samples, sizes 40 and 50 respectively, are taken from a population with unknown mean $\mu$ and unknown variance $\sigma^2$. Using the data from the two samples, obtain unbiased estimates of $\mu$ and $\sigma^2$.
Sample I Sample II
|x,[18 19 20 21 22 x2|18 19 20 21 22 23
eal 7 A510 8 /f[10 21 PES ©
Example 8.10 A count was made of the bacteria in a certain volume of water. Denoting the number of bacteria by x = 1800 + d, the results for the first sample were
$n_1 = 27$, $\sum d = 162$, $\sum(d - \bar{d})^2 = 11\,466$
The results for the second sample, where y = 1800 + e, were
$n_2 = 25$, $\sum e = 125$, $\sum(e - \bar{e})^2 = 14\,984$
Obtain unbiased estimates of the population mean and standard deviation (a) considering the results of the first sample only, (b) considering both samples.
$\bar{y} = 1800 + \bar{e} = 1800 + \frac{125}{25} = 1805$
$\hat{\sigma}^2 = \frac{\sum(d - \bar{d})^2 + \sum(e - \bar{e})^2}{n_1 + n_2 - 2} = \frac{11\,466 + 14\,984}{50} = 529$
so $\hat{\sigma} = 23$
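The pooled estimates for Example 8.10 can be checked with a short calculation. This is a sketch under one stated assumption: the first sample size is taken as $n_1 = 27$, the value consistent with the divisor $n_1 + n_2 - 2 = 50$ used in the solution. The function names are hypothetical.

```python
from math import sqrt


def unbiased_from_one_sample(n, sum_dev, ss_dev, origin=1800):
    """Unbiased estimates of mean and s.d. from coded data x = origin + d."""
    mean = origin + sum_dev / n
    variance = ss_dev / (n - 1)          # sum of squared deviations over n - 1
    return mean, sqrt(variance)


def unbiased_from_two_samples(n1, sum1, ss1, n2, sum2, ss2, origin=1800):
    """Pooled unbiased estimates using both samples."""
    mean = origin + (sum1 + sum2) / (n1 + n2)
    variance = (ss1 + ss2) / (n1 + n2 - 2)
    return mean, sqrt(variance)


n1, sum_d, ss_d = 27, 162, 11466   # n1 = 27 is an assumption (see above)
n2, sum_e, ss_e = 25, 125, 14984

first_only = unbiased_from_one_sample(n1, sum_d, ss_d)
pooled = unbiased_from_two_samples(n1, sum_d, ss_d, n2, sum_e, ss_e)
print(first_only, pooled)
```

Here the sums of squared deviations play the role of $n_1S_1^2$ and $n_2S_2^2$ in the pooled variance formula.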
Sample I
Then $\hat{p}$, an unbiased estimator for the population proportion p, is given by
$\hat{p} = \frac{n_1P_{s_1} + n_2P_{s_2}}{n_1 + n_2}$
$E(\hat{p}) = \frac{1}{n_1 + n_2}[n_1E(P_{s_1}) + n_2E(P_{s_2})]$
$= \frac{1}{n_1 + n_2}(n_1p + n_2p)$
$= p$
Example 8.11 An opinion poll in a certain city indicated that 69 people in a random sample of 120 said that they would vote for Mr Jones, while in a second random sample of 160, 93 said that they would vote for Mr Jones. Find an unbiased estimate of the proportion of people in the city who will vote for Mr Jones.
Solution 8.11 $n_1 = 120$, $p_{s_1} = \frac{69}{120}$; $n_2 = 160$, $p_{s_2} = \frac{93}{160}$
Exercise 8c
In the following questions, find an unbiased estimate of the population proportion, based on the data given by the two samples.

9. $n_1 = 200$, $p_{s_1} = 0.36$; $n_2 = 300$, $p_{s_2} = 0.34$

10. $n_1 = 50$, $p_{s_1} = 0.82$; $n_2 = 80$, $p_{s_2} = 0.85$

11. $n_1 = 10$, $p_{s_1} = 0.6$; $n_2 = 20$, $p_{s_2} = 0.7$

12. A random sample of 600 people from a certain district were questioned and the results indicated that 30% used a particular product. In a second random sample of 300 people, 96 used the product. Find an unbiased estimate of the proportion of people in the district who used the product.
POPULATION MEAN
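So $P\left(-1.96 < \frac{\bar{X} - \mu}{\sigma/\sqrt{n}} < 1.96\right) = 0.95$
$P\left(-1.96\frac{\sigma}{\sqrt{n}} < \bar{X} - \mu < 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$
$P\left(\bar{X} + 1.96\frac{\sigma}{\sqrt{n}} > \mu > \bar{X} - 1.96\frac{\sigma}{\sqrt{n}}\right) = 0.95$
Therefore a central 95% confidence interval for $\mu$ is
$\left(\bar{x} - 1.96\frac{\sigma}{\sqrt{n}},\ \bar{x} + 1.96\frac{\sigma}{\sqrt{n}}\right)$
A central 99% confidence interval for $\mu$ is given by
$\left(\bar{x} - 2.575\frac{\sigma}{\sqrt{n}},\ \bar{x} + 2.575\frac{\sigma}{\sqrt{n}}\right)$
This can be written
$\bar{x} \pm 2.575\frac{\sigma}{\sqrt{n}}$
A central 98% confidence interval for $\mu$ is given by
$\left(\bar{x} - 2.326\frac{\sigma}{\sqrt{n}},\ \bar{x} + 2.326\frac{\sigma}{\sqrt{n}}\right)$
This can be written
$\bar{x} \pm 2.326\frac{\sigma}{\sqrt{n}}$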
Example 8.12 After a particularly wet night, 12 worms surfaced on the lawn. Their lengths, measured in cm, were:
9.5, 9.5, 11.2, 10.6, 9.9, 11.1, 10.9, 9.8, 10.1, 10.2, 10.9, 11.0
Assuming that this sample came from a normal population with variance 4, calculate a 95% confidence interval for the mean length of all the worms in the garden.
So $\bar{x} = \frac{\sum x}{n} = \frac{124.7}{12} = 10.39$ (2 d.p.)
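With the population variance known, the 95% interval for Example 8.12 is $\bar{x} \pm 1.96\sigma/\sqrt{n}$. The sketch below carries out that calculation; it assumes Python's standard `statistics.NormalDist` for the critical value, and the function name `z_interval` is hypothetical.

```python
from math import sqrt
from statistics import NormalDist, mean


def z_interval(data, sigma, confidence=0.95):
    """Central confidence interval for the mean when sigma is known."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.96 for 95%
    centre = mean(data)
    half_width = z * sigma / sqrt(len(data))
    return centre - half_width, centre + half_width


lengths = [9.5, 9.5, 11.2, 10.6, 9.9, 11.1, 10.9, 9.8, 10.1, 10.2, 10.9, 11.0]

# Variance 4 means standard deviation 2
lower, upper = z_interval(lengths, sigma=2)
print(round(lower, 2), round(upper, 2))
```

Changing `confidence` to 0.98 or 0.99 reproduces the critical values 2.326 and 2.575 quoted for the other central intervals.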
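Example 8.13 On the basis of the results obtained from a random sample of 100 men from a particular district, the 95% confidence interval for the mean height of the men in the district is found to be (177.22 cm, 179.18 cm). Find the value of $\bar{x}$, the mean of the sample, and $\sigma$, the standard deviation of the normal population from which the sample is drawn. Calculate the 98% confidence interval for the mean height.
Solution 8.13 The 95% confidence interval is $\bar{x} \pm 1.96\frac{\sigma}{\sqrt{n}}$ with n = 100, so
$\bar{x} - 1.96\frac{\sigma}{10} = 177.22$ (i)
$\bar{x} + 1.96\frac{\sigma}{10} = 179.18$ (ii)
Adding (i) and (ii): $2\bar{x} = 356.4$, so $\bar{x} = 178.2$
Subtracting (i) from (ii):
$2(1.96)\frac{\sigma}{10} = 1.96$
$\sigma = \frac{10}{2}$
$\sigma = 5$
The 98% confidence interval is $178.2 \pm 2.326\left(\frac{5}{10}\right) = 178.2 \pm 1.163$, i.e. (177.04 cm, 179.36 cm).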
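If $\bar{x}$ and $s^2$ are the mean and variance of a random sample of size n (where n is large) from a normal population with unknown mean $\mu$ and unknown variance $\sigma^2$, then a central 95% confidence interval for $\mu$ is
$\left(\bar{x} - 1.96\frac{\hat{\sigma}}{\sqrt{n}},\ \bar{x} + 1.96\frac{\hat{\sigma}}{\sqrt{n}}\right)$ where $\hat{\sigma}^2 = \frac{ns^2}{n - 1} \approx s^2$ for large n.
This can be written $\bar{x} \pm 1.96\frac{\hat{\sigma}}{\sqrt{n}}$
NOTE: $P\left(\bar{X} - 1.96\frac{\hat{\sigma}}{\sqrt{n}} < \mu < \bar{X} + 1.96\frac{\hat{\sigma}}{\sqrt{n}}\right) = 0.95$.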
Similarly,
Solution 8.14
$\bar{x} = \frac{\sum x}{n} = \frac{1008}{120} = 8.4$
and $s^2 = \frac{\sum(x - \bar{x})^2}{n} = \frac{172.8}{120} = 1.44$
so s = 1.2
Using the large sample approximation, with $\hat{\sigma}^2 \approx s^2$, the 99% confidence interval is
$8.4 \pm 2.575\frac{1.2}{\sqrt{120}}$
$= 8.4 \pm 0.282$
$= (8.118, 8.682)$ (3 d.p.)
Therefore a 99% confidence interval for the population mean is (8.118, 8.682).
Sample II
$\bar{x}_2 = \frac{\sum x}{n_2} = \frac{1267.2}{72} = 17.6$
$s_2^2 = \frac{\sum x^2}{n_2} - \bar{x}_2^2 = \frac{22\,536}{72} - 17.6^2 = 3.24$
Exercise 8d
1. A certain type of tennis ball is known to have a height of bounce which is normally distributed with standard deviation 2 cm. A sample of 60 tennis balls is tested and the mean height of bounce of the sample is 140 cm. Find (a) 95%, (b) 98% confidence intervals for the mean height of bounce of this type of tennis ball.
A random sample of 100 readings taken 10. The age, X, in years at last birthday, of
from a normal population | gave the 250 mothers when their first child was
following data: x = 82, Dx? = 686 800. born is given in the following table:
Find (a) 98%, (b) 99% confidence inter- Ni 18- 20- 22- 24- 26- 28- 30- 32- 34- 36- 38-
vals for the population mean UL.
No.of |14 36 42 57 48 26 17 7 2 0 1
80 people were asked to measure their mothers
pure mares when,they: woKe,up im ine (The notation implies that, for example in
Cree: ane as ues 68 he podvhe column 1, there are 14 mothers for whom
standard deviation a beats: Find (a) 95%, the continuous variable X satisfies
(b) 997% confidence intervals for the 18<X <20,)
population mean.
Calculate, to the nearest 0.1 of a year,
The 95% confidence interval for the mean estimates of the mean and the standard
length of life of a particular brand of light deviation of X.
bulb is (1023.3 h, 1101.7 h). This interval If the 250 mothers are a random sample
is based on results from a random sample from a large population of mothers, find
of 36 light bulbs. Find the 99% confidence 95% confidence limits for the mean age,
interval for the mean length of life of this U, of the total population. (C)
brand of light bulb, assuming that the
length of life is normally distributed.
11. The distribution of measurements of
A random sample of six items taken from thicknesses of a random sample of yarns
a normal population with variance 4.5 cm? produced in a textile mill is shown in the
gave the following data: following table:
Sample values: 12.9 cm, 13.2 cm, 14.6 cm, : ae
12.6 cm, ? 11.3 cm, ’ 10.1 em Se ec)
(mid-interval value) Brequency
Find the 94% confidence interval for the
population mean UL.
Time (hours) 0-20 20-40 40-60 60-80 80-100 100-120 120-140 140-160
80 48 29 18 12 7 4 2
Before we consider the case when the sample size is small, we must
introduce the t-distribution. It has a very complicated p.d.f. which
is included here only for completeness.
THE t-DISTRIBUTION
The distribution of X has one parameter, $\nu$, known as the number of degrees of freedom. (We use the Greek letter $\nu$, pronounced 'new'.) The constant in the p.d.f. depends on $\nu$.
We say that $X \sim t(\nu)$.
Normal curve
N(O, 1)
These are printed on p. 636. Note that the upper quantiles of the
t-distribution are printed.
t(6) t(6)
2.5% 2.5% 2.5%
Example 8.16 (a) Find two symmetrically placed values for t outside which 1% of the t(11) distribution lies.
(b) If $X \sim t(4)$, find t such that (i) $P(X < t) = 0.99$, (ii) $P(X > t) = 0.05$, (iii) $P(|X| < t) = 0.95$.
Solution 8.16 (a) Row $\nu = 11$, column 2Q = 0.01 gives t = 3.106, so that 1% of the distribution lies outside (−3.106, 3.106), with 0.5% in each tail.
(b) (i) Row $\nu = 4$, column P = 0.99 gives t = 3.747, so $P(X < 3.747) = 0.99$.
(ii) Row $\nu = 4$, column Q = 0.05 gives t = 2.132, so $P(X > 2.132) = 0.05$.
(iii) Row $\nu = 4$, column 2Q = 1 − 0.95 = 0.05 gives t = 2.776, so $P(-2.776 < X < 2.776) = 0.95$.
Example 8.17 If the r.v. X is such that $X \sim t(8)$, find (a) $P(X < -2.9)$, (b) $P(X > 3.36)$, (c) $P(-2.9 < X < 3.36)$.
Solution 8.17
2 Ou Ol 3.30
1. Find two symmetrically placed values for If X ~ t(13), find P(—1.77 < X < 3.012).
t outside which 1% of a ¢(6) distribution If X ~ t(10), find P(1.812 < X < 3.169).
lies.
If X ~ t(6), find P(— 2.447 <X <—1.44).
2. Repeat Question 1 for (a) 10%, (b) 5%,
If X ~ t(8), find the value of a such that
eee
(c) 2%, (d) £% of a t(6) distribution. P(X > a) = 0.05.
3. Repeat Question 1 for 1% of a (a) ¢(7), - If X ~ t(12), find the value of a such that
(b) t(12), (c) t(15), (d) t(16) distribution. P(| X|<a) = 0.95.
For example
(a) the sample variance is given by $S^2 = \frac{1}{n}\sum(X_i - \bar{X})^2$.
When calculating $S^2$ the variables are $X_1, X_2, \ldots, X_n$, so the number of variables is n. However, these variables are 'restricted' by the fact that $\sum X_i = n\bar{X}$, so the number of restrictions is 1.
Therefore $\nu = n - 1$.
(b) Consider
$\hat{\sigma}^2 = \frac{n_1S_1^2 + n_2S_2^2}{n_1 + n_2 - 2}$
The number of variables = $n_1 + n_2$.
The number of restrictions = 2 (these are $\bar{X}_1$ and $\bar{X}_2$).
Therefore $\nu = n_1 + n_2 - 2$.
$\frac{\hat{\sigma}}{\sqrt{n}} = \frac{\sqrt{n}\,S}{\sqrt{n}\sqrt{n - 1}} = \frac{S}{\sqrt{n - 1}}$
The statistic $\frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ becomes $\frac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}}$, which is $\frac{\bar{X} - \mu}{S/\sqrt{n - 1}}$.
Now, for small samples this statistic does not follow the normal distribution; the t-distribution must be used instead. The number of degrees of freedom involved in calculating the statistic is n − 1.
So, if $T = \frac{\bar{X} - \mu}{S/\sqrt{n - 1}}$ then T follows a t-distribution with (n − 1) degrees of freedom,
i.e. $T \sim t(n - 1)$
If $\bar{x}$ and $s^2$ are the mean and variance of a random sample of size n (where n is small) from a normal population with unknown mean $\mu$ and unknown variance $\sigma^2$, then a central 95% confidence interval for $\mu$ is given by
$\left(\bar{x} - t\frac{s}{\sqrt{n - 1}},\ \bar{x} + t\frac{s}{\sqrt{n - 1}}\right)$
where $P\left(\bar{X} - t\frac{S}{\sqrt{n - 1}} < \mu < \bar{X} + t\frac{S}{\sqrt{n - 1}}\right) = 0.95$
i.e. (−t, t) encloses 95% of the t(n − 1) distribution.
Example 8.18 Ten packets of a particular brand of biscuits are chosen at random and their masses noted. The results (in grams) are 397.3, 399.6, 401.0, 392.9, 396.8, 400.0, 397.6, 392.1, 400.8, 400.6. Assuming that the sample is taken from a normal population with mean mass $\mu$, calculate (a) the 95% confidence interval for $\mu$, (b) the 99% confidence interval for $\mu$.
Solution 8.18
$\bar{x} = \frac{\sum x}{n} = \frac{3978.7}{10} = 397.87$
$s^2 = \frac{\sum x^2}{n} - \bar{x}^2 = \frac{1\,583\,098.3}{10} - (397.87)^2 = 9.29$
and $\frac{\hat{\sigma}}{\sqrt{n}} = \frac{s}{\sqrt{n - 1}}$
(b) The 99% confidence interval for $\mu$ is
$\bar{x} \pm t\frac{s}{\sqrt{n - 1}}$
where (−t, t) encloses 99% of the t(n − 1) distribution, with n = 10.
From tables, row $\nu = 9$, column 2Q = 1%, we find that t = 3.25.
The 99% confidence interval is
$397.87 \pm 3.25\frac{\sqrt{9.29}}{\sqrt{9}} = 397.87 \pm 3.302$
$= (394.57, 401.17)$ (2 d.p.)
The 99% confidence interval for $\mu$ is (394.57 g, 401.17 g).
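The small-sample interval $\bar{x} \pm t\,s/\sqrt{n - 1}$ can be sketched as follows. The critical value is still read from t tables and passed in by hand, because the Python standard library has no t-distribution; the function name `t_interval` is hypothetical.

```python
from math import sqrt
from statistics import mean, pstdev


def t_interval(data, t_value):
    """Central confidence interval for the mean of a small normal sample.

    t_value is the tabulated critical value for n - 1 degrees of freedom.
    """
    n = len(data)
    centre = mean(data)
    s = pstdev(data)                        # sample s.d. with divisor n, as in the text
    half_width = t_value * s / sqrt(n - 1)  # equivalent to sigma-hat / sqrt(n)
    return centre - half_width, centre + half_width


masses = [397.3, 399.6, 401.0, 392.9, 396.8, 400.0, 397.6, 392.1, 400.8, 400.6]

# 99% interval: row 9 of the t tables, column 2Q = 1%, gives t = 3.25
lower, upper = t_interval(masses, t_value=3.25)
print(round(lower, 2), round(upper, 2))
```

Passing a different tabulated value of t for the same number of degrees of freedom gives the interval at another confidence level.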
Example 8.19 Fifteen pupils experimented to find the value of g, the acceleration due to gravity. Their results were as follows:
9.806, 9.807, 9.810, 9.802, 9.805, 9.806, 9.804, 9.811, 9.801,
9.804, 9.805, 9.808, 9.803, 9.809, 9.807
Calculate the mean and the standard deviation of these results.
Give 95% confidence limits for the value of g based upon them.
Estimate the number of experimenters needed to give a confidence interval of less than 0.001. (SUJB)
Solution 8.19
$\bar{x} = \frac{\sum x}{n} = \frac{147.088}{15} = 9.8059$ (4 d.p.)
$s^2 = \frac{\sum x^2}{n} - \bar{x}^2 = \frac{1442.325\,432}{15} - \left(\frac{147.088}{15}\right)^2$
Therefore s = 0.0028 (2 S.F.).
The mean value for g obtained from the results is 9.8059 and the standard deviation is 0.0028.
From tables, row $\nu = 14$, column 2Q = 5%, t = 2.145, so the 95% confidence limits are
$9.8059 \pm 2.145\frac{0.0028}{\sqrt{14}} = 9.8059 \pm 0.0016$
$= (9.8043, 9.8075)$
So the 95% confidence limits for the value of g based on the results are (9.8043, 9.8075).
Exercise 8f
1. The heights (measured in cm) of six policemen were as follows:
180, 176, 179, 181, 183, 179
Calculate (a) 90%, (b) 95%, (c) 99% confidence intervals for the mean height of the population of policemen. (Assume that the heights of policemen are normally distributed.)

2. A sample of eight observations of a normally distributed variable gives values
3.6, 3.9, 4.5, 3.8, 4.4, 4.9, 4.2, 3.8
Determine a 95% confidence interval for the unknown population mean $\mu$.

3. A normal distribution has variance $\sigma^2$ and mean $\mu$. A random sample of ten observations gives values
0.3, 0.28, 0.27, 0.33, 0.35, 0.33, 0.27, 0.31, 0.37, 0.29
Find (a) 95%, (b) 99% confidence intervals for $\mu$.

4. The masses, in grams, of thirteen washers selected at random are
15.4, 15.2, 14.6, 16.1, 14.8, 15.3, 15.9, 16.0, 15.4, 14.6, 15.0, 15.5, 16.1
Calculate 98% confidence limits for the mean mass of the population from which the sample is drawn, assuming that the population is normal.

5. Twenty measurements of the life of a candle (measured in hours) gave the following data: $\sum x = 172$, $\sum x^2 = 1495.4$. By taking a sample of 20 as (a) large, (b) small, calculate a 99% confidence interval for the mean life of a candle, assuming the length of life to be normally distributed.

6. A random sample of seven independent observations of a normal variable gave $\sum x = 35.9$, $\sum x^2 = 186.19$. Calculate a 90% confidence interval for the population mean.

7. A random sample of eight observations of a normal variable gave
$\sum x = 261.2$, $\sum(x - \bar{x})^2 = 3.22$
Calculate a 95% confidence interval for the population mean.
Se ee ee
i
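Then
$P_s \sim N\left(p, \frac{pq}{n}\right)$ where q = 1 − p (see p. 407)
It is reasonable to assume that an estimator for $\frac{pq}{n}$ is $\frac{P_sQ_s}{n}$, where $Q_s = 1 - P_s$.
So we have $P_s \sim N\left(p, \frac{P_sQ_s}{n}\right)$ approximately
Standardising, we have
$Z = \frac{P_s - p}{\sqrt{P_sQ_s/n}}$ where $Z \sim N(0, 1)$
Therefore $P\left(-1.96 < \frac{P_s - p}{\sqrt{P_sQ_s/n}} < 1.96\right) = 0.95$
so, rewriting
$P\left(P_s - 1.96\sqrt{\frac{P_sQ_s}{n}} < p < P_s + 1.96\sqrt{\frac{P_sQ_s}{n}}\right) = 0.95$
and a 95% confidence interval for p is
$\left(p_s - 1.96\sqrt{\frac{p_sq_s}{n}},\ p_s + 1.96\sqrt{\frac{p_sq_s}{n}}\right)$ where $q_s = 1 - p_s$
This can be written $p_s \pm 1.96\sqrt{\frac{p_sq_s}{n}}$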
Solution 8.20 The proportion of defective items in the sample, $p_s = \frac{45}{300} = 0.15$.
The 95% confidence interval is
$p_s \pm 1.96\sqrt{\frac{p_sq_s}{n}} = 0.15 \pm (1.96)\sqrt{\frac{(0.15)(0.85)}{300}}$
$= (0.1096, 0.1904)$
The 98% confidence interval is
$p_s \pm 2.326\sqrt{\frac{p_sq_s}{n}} = 0.15 \pm 0.048$
$= (0.102, 0.198)$
Example 8.21 A point whose coordinates are (X, Y) with respect to rectangular axes is chosen at random where 0 < X < 1 and 0 < Y < 1. What is the probability that the point lies inside the circle whose equation is $x^2 + y^2 = 1$?
In a computer simulation 1000 such points were generated and 784 of them lay inside the circle. Obtain an estimate for $\pi$ and give an approximate 90% confidence interval for your estimate. Show that about 290 000 points need to be selected in order to be 90% certain of obtaining a value for $\pi$ which will be in error by less than 0.005. (SUJB)
So, P(point lies within circle) $= \frac{\text{area of quadrant}}{\text{area of square}}$
$= \frac{\pi/4}{1}$
$= \pi/4$
If 1000 points are taken and 784 lie within the circle, then if the true proportion of points lying within the circle is p, an estimate for p is $p_s$ where
$p_s = \frac{784}{1000}$
$= 0.784$
So an estimate for $\pi/4$ = 0.784 and hence an estimate for $\pi$ is (0.784)(4) = 3.136.
An approximate 90% confidence interval for p is
$0.784 \pm 1.645\sqrt{\frac{(0.784)(0.216)}{1000}} = 0.784 \pm 0.0214$
$= (0.7626, 0.8054)$
If the value for $\pi$ is to be in error by less than 0.005 then the value for $\pi/4$ must be in error by less than 0.001 25.
When n = 1000, the size of the interval was $p_s \pm 0.0214$.
Now we need to find n such that the size of the interval is $p_s \pm 0.001\,25$,
i.e. $1.645\sqrt{\frac{(0.784)(0.216)}{n}} < 0.001\,25$
so $\sqrt{n} > \frac{1.645}{0.001\,25}\sqrt{(0.784)(0.216)}$
$\sqrt{n} > 541.55$
$n > 293\,279$
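The simulation described in Example 8.21 is easy to reproduce. The following sketch is illustrative only: it assumes Python's standard `random` and `statistics.NormalDist`, the function names are hypothetical, and a fixed seed is used so that a run can be repeated. Its simulated counts will not match the 784 quoted in the question.

```python
import random
from math import sqrt
from statistics import NormalDist


def proportion_half_width(p_s, n, confidence=0.90):
    """Half-width of the approximate confidence interval for a proportion."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)   # about 1.645 for 90%
    return z * sqrt(p_s * (1 - p_s) / n)


def estimate_pi(n_points, seed=0, confidence=0.90):
    """Estimate pi from random points in the unit square, with an interval."""
    rng = random.Random(seed)
    inside = sum(1 for _ in range(n_points)
                 if rng.random() ** 2 + rng.random() ** 2 < 1)
    p_s = inside / n_points                  # estimates pi / 4
    half = proportion_half_width(p_s, n_points, confidence)
    return 4 * p_s, (4 * (p_s - half), 4 * (p_s + half))


def points_needed(p_s, max_error_in_pi, confidence=0.90):
    """Smallest n (as a real number) making the error in pi below the target."""
    z = NormalDist().inv_cdf(0.5 + confidence / 2)
    return (z / (max_error_in_pi / 4)) ** 2 * p_s * (1 - p_s)


book_half_width = proportion_half_width(0.784, 1000)   # figures from the question
n_required = points_needed(0.784, 0.005)
pi_estimate, pi_interval = estimate_pi(1000)
print(round(book_half_width, 4), round(n_required), pi_estimate, pi_interval)
```

The required number of points differs slightly from the worked figure because the critical value here is not rounded to 1.645, but it is still about 290 000.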
Example 8.22 Derive the mean and variance of the binomial distribution.
In a survey carried out in Funville, 28 children out of a random
sample of 80 said that they bought Bopper comic regularly. Find
95% approximate confidence limits for the true proportion of all
children in Funville who buy this comic. A similar survey in Funville
found that 45 children out of a random sample of 100 said that
they bought Shooter comic regularly. Find 95% approximate
confidence limits for the true proportion of all children in Funville
who buy this comic.
On the basis of these surveys, is there any evidence that the sales of
Shooter comic are higher than the sales of Bopper comic in Funville?
Justify your reply. (AEB 1980)
Solution 8.22 For the answer to the first part, see p. 214.
Let $p_B$ be the true proportion of all children who buy the Bopper.
In the sample of 80,
$p_s = \frac{28}{80} = 0.35$, $q_s = 1 - p_s = 0.65$
The 95% confidence limits are
$p_s \pm 1.96\sqrt{\frac{p_sq_s}{n}} = 0.35 \pm 1.96\sqrt{\frac{(0.35)(0.65)}{80}} = 0.35 \pm 0.105 = (0.245, 0.455)$
The 95% confidence interval for the proportion who buy the
Bopper is (0.245, 0.455).
Example 8.23 In a sample of 400 shops taken in 1972, it was discovered that 136
of them sold carpets at below the list prices which had been recommended
by manufacturers.
(a) Estimate the percentage of all carpet selling shops selling below
list price.
(b) Calculate the 95% confidence limits for this estimate, and
explain briefly what these mean.
(c) What size sample would have to be taken in order to estimate
the percentage to within ± 2%? (SUJB)
Solution 8.23 From the sample, the proportion of shops selling below list price is
$p_s$ where $p_s = \frac{136}{400} = 0.34$, so the estimate is 34%.
The 95% confidence limits are
$p_s \pm 1.96\sqrt{\frac{p_sq_s}{n}} = 0.34 \pm 1.96\sqrt{\frac{(0.34)(0.66)}{400}}$
= 0.34 ± 0.046
= 34% ± 4.6%
The 95% confidence interval is (34% ± 4.6%) = (29.4%, 38.6%).
To estimate the percentage to within ± 2% we need
$p_s \pm 1.96\sqrt{\frac{p_sq_s}{n}} = p_s \pm 0.02$
i.e. $1.96\sqrt{\frac{(0.34)(0.66)}{n}} = 0.02$
so $\sqrt{n} = \frac{1.96}{0.02}\sqrt{(0.34)(0.66)}$
= 46.42
n = 2155.14
So a sample of size 2156 would have to be taken.
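The sample-size calculation can be checked with a short Python sketch. The helper name `sample_size_for_margin` is an assumption made for this illustration; it simply rearranges z√(p_s q_s / n) ≤ margin for n and rounds up.

```python
from math import ceil

def sample_size_for_margin(p_s, margin, z=1.96):
    """Smallest whole n for which z * sqrt(p_s * q_s / n) <= margin."""
    q_s = 1 - p_s
    return ceil(z ** 2 * p_s * q_s / margin ** 2)

n_shops = sample_size_for_margin(0.34, 0.02)                # carpet shops, 95%, within 2%
n_points = sample_size_for_margin(0.784, 0.00125, z=1.645)  # circle simulation, 90%
print(n_shops, n_points)
```

The second call reproduces the 'about 290000 points' requirement of the circle simulation, using the same sample proportion 0.784.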
Exercise 8g
1. A sample, of size n, is taken from a population in which the proportion of
‘successes’ is p. From the value of the sample proportion given, calculate the
confidence interval indicated for the proportion p.
In a survey carried out in a large city, 170 households out of a random sample
of 250 owned at least one pet. Find 95% confidence limits for the proportion of
households in the city who own at least one pet.
In order to assess the probability of a successful outcome, an experiment is
performed 200 times and the number of successful outcomes is found to be 72.
Find (a) 95%, (b) 99% confidence intervals for p, the probability of a
successful outcome.
In a market research survey 25 people out of a random sample of 100 from a
certain area said that they used a particular brand of soap. Find 97% confidence
limits for the proportion of people in the area who use this brand of soap.
between the original colonies where the birds were born and the place of
recovery. Using an assumed mean of 250.5 obtain the sample mean recovery
distance and the sample standard deviation. Assuming that the recovery distances
accurately reflect the dispersal of birds from their original colonies, estimate
the proportion of this type of bird at more than 300 miles from the original
colony. Give an approximate symmetric 95% confidence interval for this
estimate. (C)
A random sample of 600 was chosen from the adults living in a town in order to
investigate the number x of days of work lost through illness. Before taking the
sample it was decided that certain categories of people would be excluded from
the analysis of the number of working days lost although they would not be
excluded from the sample. In the sample 180 were found to be from these
categories. For the remaining 420 members of the sample Σx = 1260 and
Σx² = 46 000.
(a) Estimate the mean number of days lost through illness, for the restricted
population, and give a 95% confidence interval for the mean.
(b) Estimate the percentage of people in the town who fall into the excluded
categories, and give a 99% confidence interval for this percentage.
(c) Give two examples, with reasons, of people who might fall into the excluded
categories. (O)
Frequency 50 16 it 12 9
Population mean μ, σ² unknown, n small:
95% confidence interval: $\bar{x} \pm t\frac{s}{\sqrt{n-1}}$ where (−t, t) encloses 95% of the t(n−1) distribution
99% confidence interval: $\bar{x} \pm t\frac{s}{\sqrt{n-1}}$ where (−t, t) encloses 99% of the t(n−1) distribution
n is the sample size and s² is the sample variance.
Miscellaneous Exercise 8h
(a) Before its annual overhaul, the mean operating time of an automatic machine
was 103 seconds. After the annual overhaul, the following random sample of
operating times (in seconds) was obtained.
(b) The results of a survey showed that 3600 out of 10000 families regularly
purchased a specific weekly magazine. Find the 95% confidence limits for the
proportion of the population buying the magazine.
Estimate the additional number of families to be contacted if the probability
that the estimated proportion is in error by more than 0.01 is to be at
most 1%. (AEB 1987)
Number of diamonds 1/2 14
Graphically, or otherwise, estimate the mesh size of the sieve if half the
diamonds will pass through it, and the mesh size of the sieve if one quarter of
the diamonds will pass through it.
Construct a frequency table showing for each mesh size listed in the table the
extra number of diamonds which passed through.
Calculate unbiased estimates, in each case to 2 decimal places, of the mean and
the variance of the sizes of the diamonds in the original large pile. (L)
Find the mean daily amount of chemical over the 300 days and estimate, to 2
decimal places, its standard error.
Obtain, to 2 decimal places, approximate 98% confidence limits for the mean
daily amount of chemical in the atmosphere.
If daily measurements are taken for a further 300 days, estimate, to 2 decimal
places, the probability that the mean of these daily measurements will be less
than 14. (L)
(a) intervals in general based on the above method,
(b) the interval you have calculated.
The bars of soap are either pink or white in colour and differently shaped
according to colour. The masses of both types of soap are known to be normally
distributed, the mean mass of the white bars being 176.2 g. The standard
deviation for both bars is 6.46 g. A sample of 12 of the pink bars of soap had
masses, measured to the nearest gram, as follows.
WA, 164.182.5169 «171 187 176
alefgh abtatey alireat MILESOY IEr/l3}
Find a 95% confidence interval for the mean mass of pink bars of soap.
Calculate also an interval within which approximately 90% of the masses of the
white bars of soap will lie.
The cost of manufacturing a pink bar of soap of mass x gm is (15 + 0.065x)p,
and it is sold for 32p. If the company manufactures 9000 bars of pink soap
per week, derive a 95% confidence interval for its weekly expected profit
from pink bars of soap. (AEB 1988)
SIGNIFICANCE
TESTING
Often in scientific enquiry a statement concerning a population
parameter is put forward as a statistical hypothesis. Its validity is
then tested, based on observations made from random samples
taken from the population.
Now, we must decide whether it is likely that the sample value has
been drawn from this population. We consider whether it is ‘close
to μ’, or whether it is in the tail end of the distribution.
We consider the standard normal variable Z.
[Figure: N(0, 1) curve with acceptance region −1.96 < z < 1.96 and a critical region (reject H₀) in each tail]
Critical region at 2% level (1% in each tail): P(|Z| > 2.326) = 0.02. Reject H₀ if z < −2.326 or z > 2.326.
Critical region at 1% level (0.5% in each tail): P(|Z| > 2.575) = 0.01. Reject H₀ if z < −2.575 or z > 2.575.
[Figures: one-tailed critical regions. At the 5% level reject H₀ if z < −1.645 (lower tail) or if z > 1.645 (upper tail); at the 2% level reject H₀ if z > 2.054 (upper tail)]
(3) Decide on the level of the test. This fixes the critical values of
the test statistic.
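The critical values quoted for the various levels can be reproduced from the standard normal distribution. The sketch below uses Python's standard-library `statistics.NormalDist`; the helper name `critical_value` is an assumption made for this illustration.

```python
from statistics import NormalDist

def critical_value(level, tails=2):
    """Critical z for a test at the given level (e.g. 0.05), one- or two-tailed."""
    tail_area = level / tails                  # probability placed in each tail
    return NormalDist().inv_cdf(1 - tail_area)

for level, tails in [(0.05, 2), (0.02, 2), (0.01, 2), (0.05, 1), (0.02, 1)]:
    print(level, tails, round(critical_value(level, tails), 3))
```

The two-tailed 1% value comes out as 2.576 to three decimal places; statistical tables are often read as 2.575, and the difference does not affect any conclusion here.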
Example 9.1 Test, at the 5% level, whether the single sample value of 172 comes
from a normal population with mean μ = 150 and variance σ² = 100.
[Figure: normal curve centred at 150 with a rejection region (reject H₀) in each tail]
(5) Calculate the value of the test statistic:
$z = \frac{x - \mu}{\sigma} = \frac{172 - 150}{10} = 2.2$
Solution 9.2 Let the population mean be μ and the population variance be σ².
H₀: μ = 65
H₁: μ < 65 (one-tailed test)
Now under H₀, X ~ N(μ, σ²) with μ = 65, σ = √30.
We perform a one-tailed test at the 1% level, and reject H₀ if
z < −2.326, where
$z = \frac{x - \mu}{\sigma} = \frac{54 - 65}{\sqrt{30}} = -2.01$
Conclusion: as z > −2.326, we do not reject H₀ and conclude at
the 1% level, that the sample value could have been drawn from a
population with mean 65.
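A test on a single sample value amounts to one standardisation and one comparison. The following sketch shows this for the figures used in this solution; the function name is an assumption made for the illustration.

```python
from math import sqrt

def z_single_value(x, mu, sigma):
    """Standardise a single observation x from N(mu, sigma^2)."""
    return (x - mu) / sigma

z = z_single_value(54, 65, sqrt(30))
reject = z < -2.326          # one-tailed (lower tail) test at the 1% level
print(round(z, 2), reject)
```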
Example 9.3 If 100 seeds are planted, and 83 seeds germinate, use the normal
approximation to the binomial distribution to test the manufacturer’s
claim of a 90% germination rate. Use a 5% level of significance.
Solution 9.3 Let X be the r.v. ‘the number of seeds that germinate’. Then we
have a binomial situation, and X ~ Bin(n, p) with n = 100.
H₀: p = 0.9 (the germination rate is 90%)
H₁: p < 0.9 (the germination rate is less than 90%)
(We have chosen a one-tailed test, as this seems more appropriate
to the situation.)
Under H₀, X ~ Bin(n, p) with n = 100, p = 0.9.
Now, as n is large, we use the normal approximation to the binomial
distribution
so X ~ N(np, npq) where np = (100)(0.9) = 90
and npq = (100)(0.9)(0.1) = 9
i.e. X ~ N(90, 9)
Perform a one-tailed test, at the 5% level, and reject H₀ if
z < −1.645.
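The test statistic for the seed example can be evaluated as below. The sketch is an illustration only: the function name is an assumption, and because the worked calculation is not reproduced here, the value is shown both with and without a continuity correction (treating P(X ≤ 83) as P(X < 83.5)).

```python
from math import sqrt

def binomial_z(x, n, p, continuity=0.0):
    """z for an observed count x under X ~ Bin(n, p), using X ~ N(np, npq)."""
    mean = n * p
    sd = sqrt(n * p * (1 - p))
    return (x + continuity - mean) / sd

z_plain = binomial_z(83, 100, 0.9)                       # no continuity correction
z_corrected = binomial_z(83, 100, 0.9, continuity=0.5)   # uses 83.5 in place of 83
print(round(z_plain, 2), round(z_corrected, 2))
```

Either value is below −1.645, so in both cases the claim of a 90% germination rate would be rejected at the 5% level.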
Exercise 9a
1. Test whether the sample value could have been drawn from the normal
population indicated in the null hypothesis. Test at (a) the 5% level,
(b) the 1% level.
[Table: null hypotheses and sample values]
2. A coin is tossed 64 times. Test at the 5% level of significance whether the
coin is fair, or whether it is biased in favour of showing heads, if (a) 38
heads occur, (b) 42 heads occur.
3. A manufacturer claims that 8 out of 10 dogs prefer his brand to any other. In
a random sample of 120 dogs, it was found that 88 ate that brand. Test at the 5%
level whether you would accept the manufacturer’s claim.
4. In a survey it was found that 3 out of 10 people supported a particular
political party. A month later the party representative claimed that the
popularity of the party had increased. Would you accept that the number who
supported the party was still 3 out of 10 if a further survey revealed that 38
people in a random sample of 100 supported the party? Test at the 3% level.
5. A gardener sows 150 ‘Special’ cabbage seeds and knows that the germination
rate is 75%. (a) By using a suitable approximation find the probability that
(i) more than 122 seeds germinate, (ii) less than 106 seeds germinate. (b) The
gardener also sows 120 ‘Everyday’ cabbage seeds and finds that 81 germinate.
Test whether the ‘Everyday’ seeds have a germination rate less than 75%. Test
at the 4% level.
If X is distributed normally then this holds for all sample sizes, but
if X does not follow a normal distribution then n must be large
(Central Limit Theorem).
Standardising, we have $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$ where Z ~ N(0, 1).
We use the test statistic $Z = \frac{\bar{X} - \mu}{\sigma/\sqrt{n}}$, which is distributed as N(0, 1)
under the null hypothesis H₀ that the true population mean is μ.
Example 9.4 The lengths of metal bars produced by a particular machine are
normally distributed with mean length 420 cm and standard deviation
12 cm. The machine is serviced, after which a sample of 100
bars gives a mean length of 423 cm. Is there evidence, at the 5%
level, of a change in the mean length of the bars produced by the
machine, assuming that the standard deviation remains the same?
Solution 9.4 Let X be the r.v. ‘the length, in cm, of a metal bar’. Let the
population mean be μ and the population variance be σ².
Reminders:
[Figure: sampling distribution of the mean centred at 420, with a rejection region (reject H₀) in each tail]
Example 9.5 Experience has shown that the scores obtained in a particular test
are normally distributed with mean score 70 and variance 36. When
the test is taken by a random sample of 36 students, the mean score
is 68.5. Is there sufficient evidence, at the 3% level, that these
students have not performed as well as expected?
Solution 9.5 Let X be the r.v. ‘the score of a student’. Let the population mean
be μ and the population variance be σ².
H₀: μ = 70 (the population mean μ is 70 and the students
have not under-achieved)
H₁: μ < 70 (the population mean is less than 70 and the
students have not done as well as expected)
Consider the sampling distribution of means where, under H₀,
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ with μ = 70, σ² = 36, n = 36
We use a one-tailed test at the 3% level: if Φ(a) = 0.03 then a = −1.881.
So, we reject H₀ if z < −1.881, where
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}} = \frac{68.5 - 70}{6/\sqrt{36}} = -1.5$
Conclusion: as z > −1.881 we do not reject H₀ and conclude that
at the 3% level the students have not under-achieved.
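The test of a sample mean with known population variance can be written as a small function. This is a sketch with an assumed function name; the critical value is taken from `statistics.NormalDist` rather than from tables.

```python
from math import sqrt
from statistics import NormalDist

def z_for_mean(xbar, mu0, sigma, n):
    """Test statistic for a sample mean when sigma is known."""
    return (xbar - mu0) / (sigma / sqrt(n))

z = z_for_mean(68.5, 70, 6, 36)           # scores example: sigma^2 = 36, n = 36
critical = NormalDist().inv_cdf(0.03)     # lower-tail critical value at the 3% level
print(round(z, 2), round(critical, 3), z < critical)
```

The same function applied to the metal bars example, `z_for_mean(423, 420, 12, 100)`, gives z = 2.5, which exceeds 1.96.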
$\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$ with μ = 6 and n = 50
$z = \frac{\bar{x} - \mu}{\sigma/\sqrt{n}}$
Now, as H₀ is accepted,
−1.96 < z < 1.96
$-1.96 < \frac{\bar{x} - 6}{\sigma/\sqrt{50}} < 1.96$
$6 - 1.96\left(\frac{\sigma}{\sqrt{50}}\right) < \bar{x} < 6 + 1.96\left(\frac{\sigma}{\sqrt{50}}\right)$
5.78 < x̄ < 6.22
Therefore the mean mass of the 50 components must lie in the
range 5.78 g < x̄ < 6.22 g.
Example 9.7 A machine produces elastic bands with breaking tension normally
distributed with mean 45.00 N and s.d. 4.36 N. On a certain day a
sample of 50 was tested and found to have a mean breaking tension
of 43.46 N. Test at the 5% level of significance whether this indicates
a change in the mean, explaining what is meant by ‘5% level of
significance’.
Find a 95% confidence interval for the population mean based on
the sample mean assuming an unchanged s.d.
If the s.d. has changed to σ, find the least value of σ for a 95%
confidence interval for the population mean to contain 45.00 N.
(SUJB)
Solution 9.7 Let X be the r.v. ‘the breaking tension, in N, of an elastic band’.
Let the population mean be μ and the population variance be σ₁².
H₀: μ = 45.00 (there is no change in the mean)
H₁: μ ≠ 45.00 (there is a change in the mean)
σ₁ = 4.36 N, n = 50
The 95% confidence interval for μ is
$\bar{x} \pm 1.96\frac{\sigma_1}{\sqrt{n}} = 43.46 \pm 1.96\left(\frac{4.36}{\sqrt{50}}\right)$
= 43.46 ± 1.209
= (42.25, 44.67)
So the 95% confidence interval for the mean breaking tension is
(42.25 N, 44.67 N).
If the s.d. is σ, the 95% confidence interval contains 45.00 when
$43.46 + 1.96\frac{\sigma}{\sqrt{50}} \geq 45.00$
so $\sigma \geq \frac{(45.00 - 43.46)\sqrt{50}}{1.96}$
σ ≥ 5.56 (2 d.p.)
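Both parts of the interval calculation can be checked numerically. In this sketch the function names are assumptions made for the illustration; `least_sigma_to_reach` rearranges x̄ + zσ/√n ≥ target for σ.

```python
from math import sqrt

def mean_ci(xbar, sigma, n, z=1.96):
    """Confidence interval for the mean when sigma is known."""
    half_width = z * sigma / sqrt(n)
    return xbar - half_width, xbar + half_width

def least_sigma_to_reach(target, xbar, n, z=1.96):
    """Smallest sigma for which the interval around xbar just reaches target."""
    return abs(target - xbar) * sqrt(n) / z

low, high = mean_ci(43.46, 4.36, 50)
sigma_min = least_sigma_to_reach(45.00, 43.46, 50)
print(round(low, 2), round(high, 2), round(sigma_min, 2))
```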
Example 9.8 Describe, referring to your projects if you wish, the steps used in
carrying out a significance test.
Over a long period it has been found that the breaking strains of
cables produced by a factory are normally distributed with mean
6000 N and standard deviation 150 N. Find, to 3 decimal places,
the probability that a cable chosen at random from the production
will have a breaking strain of more than 6200 N.
A modification is introduced into the production process which
only affects the value of the mean breaking strain. Six cables,
chosen at random from the modified process, are tested and found
to have a mean breaking strain of 5920 N.
(a) Test, at the 5% significance level, whether the sample evidence
is sufficient to conclude that the mean breaking strain of the cables
is actually less than 6000 N.
Solution 9.8 For steps used in carrying out a significance test, see p. 461.
Let X be the r.v. ‘the breaking strain, in N, of a cable’.
X ~ N(6000, 150²)
$P(X > 6200) = P\left(Z > \frac{6200 - 6000}{150}\right) = P(Z > 1.333) = 0.091$ (3 d.p.)
Under H₀, X ~ N(6000, 150²), so $\bar{X} \sim N\left(6000, \frac{150^2}{n}\right)$
With 90% confidence,
$\mu > \bar{x} - 1.282\frac{\sigma}{\sqrt{n}} = 5920 - 1.282\left(\frac{150}{\sqrt{6}}\right) = 5840$ (3 S.F.)
Therefore we can state, with 90% confidence, that the mean
breaking strength of the cables exceeds 5840 N.
Exercise 9b
(b) During a check on the manufacturing process a random sample of 25 washers is
taken from production and the mean thickness x̄ mm is calculated. Find the
interval in which the value of x̄ must lie in order that the hypothesis that the
production mean thickness is 3 mm will not be rejected when the significance
level is 5 per cent. (JMB)
Describe briefly how the Central Limit Theorem may be demonstrated.
The distance driven by a long distance lorry driver in a week is a normally
distributed variable having mean 1130 km and standard deviation 106 km. Find, to
3 decimal places, the probability that in a given week he will drive less than
1000 km. Find, to 3 decimal places, the probability that in 20 weeks his average
distance driven per week is more than 1200 km.
New driving regulations are introduced and, in the first 20 weeks after their
introduction, he drives a total of 21900 km. Assuming that the standard
deviation of the weekly distances he drives is unchanged, test, at the 10%
level of significance, whether his mean weekly driving distance has been reduced.
State clearly your null and alternative hypotheses. (L)
A machine packs flour into bags. A random sample of eleven filled bags was
taken and the masses of the bags to the nearest 0.1 g were: 1506.8, 1506.6,
1506.7, 1507.2, 1506.9, 1506.8, 1506.6, 1507.0, 1507.5, 1506.3, 1506.4. Obtain
the mean and the variance of this sample showing your working clearly. Filled
bags are supposed to have a mass of 1506.5 g. Assuming that the mass of a bag
has normal distribution with variance 0.16 g test whether the sample provides
significant evidence at the 5% level that the machine produces overweight bags.
Give the 99% confidence interval for the mass of a filled bag. (C)
A sample of size 25 is taken from the distribution of X where X ~ N(μ, 4).
The sample mean x̄ is 10.72. At what level test would we reject the null
hypothesis that μ = 10 in favour of the alternative hypothesis
(a) μ > 10, (b) μ ≠ 10?
Explain, briefly, the roles of a null hypothesis and a level of significance in
a project which you have undertaken.
Records of the diameters of spherical ball bearings produced on a certain
machine indicate that the diameters are normally distributed with mean 0.824 cm
and standard deviation 0.046 cm. Two hundred samples, each consisting of 100
ball bearings, are chosen. Calculate the expected number of the 200 samples
having a mean diameter less than 0.823 cm.
On a certain day it was suspected that the machine was malfunctioning. It may
be assumed that if the machine is malfunctioning it will change the mean of the
diameters without changing their standard deviation. On that day a random sample
of 100 ball bearings had a mean diameter of 0.834 cm. Determine a 98% confidence
interval for the mean diameter of the ball bearings being produced that day.
Hence state whether or not you would conclude that the machine is
malfunctioning on that day given that the significance level is 2%. (L)
10. X₁ and X₂ are independent random variables with means μ₁ and μ₂, variances
σ₁² and σ₂² respectively. Give the mean and variance of X₁ − X₂. If Y = λX₁,
where λ is a constant, give the mean and variance of Y.
The random variable X̄₁ denotes the mean of a random sample of size n₁ from the
second of the above distributions, show how to obtain the mean and variance of
the distribution of λ₁X̄₁ + λ₂X̄₂ from the results you have stated, λ₁ and λ₂
being constants.
The yield of a certain crop per plot of standard area is normally distributed
with mean 253.0 kg and variance 67.1 kg. A new fertiliser is applied to 10
randomly selected plots, and their mean yield is found to be 257.8 kg. Is there
any evidence of significant improvement in the yield? Assume the new fertiliser
does not affect the variance of the yields. (O)
11. The masses of loaves from a certain bakery are normally distributed with
mean 500 g and standard deviation 20 g.
(a) Determine what percentage of the output would fall below 475 g and what
percentage would be above 530 g.
(b) The bakery produces 1000 loaves daily at a cost of 8p per loaf and can sell
all those above 475 g for 20p each but is not allowed to sell the rest.
Calculate the expected daily profit.
(c) A sample of 25 loaves yielded a mean mass of 490 g. Does this provide
evidence of a reduced population mean? Use the 5% level of significance and
state whether the test is one-tailed or two. (SUJB)
12. Illustrate the role of the null hypothesis with reference, if possible, to
one of your projects making sure that you state the alternative hypothesis and
the level of significance used. Explain how you decided whether to use a
one-tail or a two-tail test.
Research workers measured the body lengths, in mm, of 10 specimens of fish
spawn of a certain species off the coast of Eastern Scotland and found these
lengths to be
turer believes that this will increase the mean breaking strength without
changing the standard deviation. A random sample of 50 one-metre lengths of the
new rope is found to have a mean breaking strength of 172.4 kg. Perform a
significance test at the 5% level to decide whether this result provides
sufficient evidence to confirm the manufacturer’s belief that the mean breaking
strength is increased. State clearly the null and alternative hypotheses which
you are using. (L)
since n is large, with $\hat{\sigma} \approx s$.
Solution 9.9 Let the population mean be μ and the population variance be σ².
The sample mean x̄ is 52.6 and the sample standard deviation s is
14.5.
Under H₀, $\bar{X} \sim N\left(\mu, \frac{\sigma^2}{n}\right)$
Now, σ² is unknown, so since n is large we use $\hat{\sigma}^2 = \frac{ns^2}{n-1} \approx s^2$
so $\bar{X} \sim N\left(\mu, \frac{\hat{\sigma}^2}{n}\right)$ with $\hat{\sigma} \approx 14.5$, μ = 50, n = 100
$z = \frac{\bar{x} - \mu}{\hat{\sigma}/\sqrt{n}} = \frac{52.6 - 50}{14.5/\sqrt{100}} = 1.793$
Conclusion: As z > 1.645, we reject H₀ and conclude that there is
evidence, at the 5% level, that the population mean has increased.
At the 1% level we reject H₀ if z > 2.326, where
$z = \frac{\bar{x} - \mu}{\hat{\sigma}/\sqrt{n}} = 1.793$ (as before)
Conclusion: As z < 2.326, we do not reject H₀ and conclude that
there is not sufficient evidence, at the 1% level, that the population
mean has increased.
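The way one test statistic leads to different conclusions at different levels can be seen in a short sketch. The function name is an assumption made here, and the sample standard deviation is used in place of σ, as in the large-sample method above.

```python
from math import sqrt

def large_sample_z(xbar, mu0, s, n):
    """Large-sample test statistic, estimating sigma by the sample s.d. s."""
    return (xbar - mu0) / (s / sqrt(n))

z = large_sample_z(52.6, 50, 14.5, 100)
print(round(z, 3))
print("reject at 5% level:", z > 1.645)   # one-tailed, upper tail
print("reject at 1% level:", z > 2.326)
```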
Example 9.10 A manufacturer claims that the average life of his electric light
bulbs is 2000 hours. A random sample of 64 bulbs is tested and the
life x in hours recorded. The results obtained are as follows:
Σx = 127808, Σ(x − x̄)² = 9694.6. Is there sufficient evidence, at
the 2% level, that the manufacturer is over-estimating the length of
life of his light bulbs? Assume that the distribution of the length of
life of light bulbs is normal.
$\bar{x} = \frac{\Sigma x}{n}$ and $s^2 = \frac{\Sigma(x - \bar{x})^2}{n}$
$\bar{x} = \frac{127808}{64} = 1997$, $s^2 = \frac{9694.6}{64} = 151.48$
The sample mean x̄ = 1997 and the sample variance s² = 151.48
and standard deviation s = 12.31.
Significance test: Let X be the r.v. ‘the life, in hours, of a light
bulb’. Let the population mean be μ and the population variance
be σ².
H₀: μ = 2000 (the manufacturer is not over-estimating the
length of life)
Now, as σ² is unknown, and we have a large sample, we find an estimate for
it, using $\hat{\sigma}^2 = \frac{ns^2}{n-1} \approx s^2$
and so $\bar{X} \sim N\left(\mu, \frac{\hat{\sigma}^2}{n}\right)$ with $\hat{\sigma} \approx 12.31$, μ = 2000, n = 64
We use a one-tailed test at the 2% level, and reject H₀ if z < −2.054,
where
$z = \frac{\bar{x} - \mu}{\hat{\sigma}/\sqrt{n}} = \frac{1997 - 2000}{12.31/\sqrt{64}}$
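When only Σx and Σ(x − x̄)² are given, the summary statistics and the test statistic can be computed as follows. This is a sketch with assumed names, using σ̂ ≈ s as in the large-sample method described above.

```python
from math import sqrt

def summary_from_sums(sum_x, sum_sq_dev, n):
    """Sample mean and sample s.d. (divisor n) from the two summary totals."""
    xbar = sum_x / n
    s = sqrt(sum_sq_dev / n)
    return xbar, s

xbar, s = summary_from_sums(127808, 9694.6, 64)
z = (xbar - 2000) / (s / sqrt(64))        # large sample, sigma estimated by s
print(xbar, round(s, 2), round(z, 2), z < -2.054)
```

On these figures z is about −1.95, which does not fall below the critical value −2.054.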
Exercise 9c
In this exercise use $\hat{\sigma} \approx s$ unless otherwise stated.
1. For each of the following, a random sample of size n is taken from a normal
distribution with mean μ and variance σ². The sample mean is x̄. Test the
hypotheses stated, at the level of significance indicated.
[Table: sample data, hypotheses and levels of significance]
2. A sample of 40 observations from a normal distribution gave Σx = 24 and
Σx² = 596. Test, at the 5% level, whether the mean of the distribution is zero.
Perform a two-tailed test.
A random sample of 75 eleven-year-olds performed a particular task. Denoting the
time taken by (15 + y) minutes, the results are summarised as follows: Σy = 90,
Σ(y − ȳ)² = 2025. Test whether there is sufficient evidence, at the 4% level, to
suggest that the mean time to perform the task is greater than 15 minutes.
Determine a symmetric 96% confidence interval for the mean time, based on the
sample observations.
Explain, briefly, the roles of a null hypothesis, an alternative hypothesis
and a level of significance in a statistical test, referring to your projects
where possible.
Example 9.11 Five readings of the resistance, in ohms, of a piece of wire gave the
following results
If the wire were pure silver, its resistance would be 1.50 ohms. If
the wire were impure, the resistance would be increased. Test, at
the 5% level, the hypothesis that the wire is pure silver.
Solution 9.11 Let X be the r.v. ‘the resistance, in ohms, of a piece of wire’. We
assume X is normally distributed.
Under H₀, the test statistic is $T = \frac{\bar{X} - \mu}{s/\sqrt{n-1}}$, where T ~ t(n − 1).
$\bar{x} = \frac{\Sigma x}{n}$ and $s^2 = \frac{\Sigma(x - \bar{x})^2}{n} = \frac{0.0018}{5} = 0.00036$
x̄ = 1.52, s = 0.019
Now, with n = 5, T ~ t(4) and
$t_{test} = \frac{\bar{x} - \mu}{s/\sqrt{n-1}} = \frac{1.52 - 1.50}{0.019/\sqrt{4}} = 2.11$ (2 d.p.)
Solution 9.12 Now $s^2 = \frac{\Sigma(x - \bar{x})^2}{n} = \frac{0.74}{8} = 0.0925$
s = 0.304, x̄ = 4.65
H₀: μ = 4.3
H₁: μ ≠ 4.3
Under H₀, the test statistic is $T = \frac{\bar{X} - \mu}{s/\sqrt{n-1}}$, where T ~ t(n − 1).
Now, n = 8, therefore T ~ t(7).
With 1% in each tail, the critical values of t(7) are ±2.998.
$t_{test} = \frac{\bar{x} - \mu}{s/\sqrt{n-1}} = \frac{4.65 - 4.3}{0.304/\sqrt{7}} = 3.05$
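The small-sample statistic can be checked with the sketch below. The function name is an assumption made for this illustration, s is the sample standard deviation with divisor n (hence the √(n − 1) in the formula), and the critical value 2.998 is entered by hand from t-tables because the Python standard library has no t-distribution.

```python
from math import sqrt

def t_statistic(xbar, mu0, s, n):
    """t statistic when s is the sample s.d. computed with divisor n."""
    return (xbar - mu0) / (s / sqrt(n - 1))

t = t_statistic(4.65, 4.3, 0.304, 8)
critical = 2.998                     # t(7), 1% in each tail, read from tables
print(round(t, 2), abs(t) > critical)
```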
Exercise 9d
1. For each of the following, a random sample of size n is taken from a normal
distribution with mean μ and variance σ². The sample mean is x̄. Test the
hypotheses stated, at the level of significance indicated.
Assuming that the lengths are normally distributed, test, at the 1% level of
significance, whether the machine is in good working order.
4. ‘Family’ packs of bacon slices are sold in 1.5 kg packs. A sample of 12 packs
was selected at random and the masses, measured in kg, noted. The following
results were obtained: Σx = 17.61, Σx² = 26.4357.
Assuming that the masses of the packs follow a normal distribution, with
variance σ², test at the 1% level whether the packs are significantly
underweight (a) if σ² is unknown, (b) if σ² = 0.0003.
5. It is thought that a certain Normal population has a mean of 1.6. A sample
of 10 gives x̄ = 1.49 and s = 0.3. Does this provide evidence, at the 5% level,
that the population mean is less than 1.6?
tains less than 454 g.
Following a slight adjustment to the filling machine, a random sample of 10
jars is found to contain the following masses (in g) of marmalade:
significance level, the hypothesis that there has been no change in the mean
of the distribution. (C Further Maths)
A random sample of 8 women yielded the following cholesterol levels:
TALR2 ST howd. 8.4 1.9.3.3 4.6
It is required to test whether the sample could be drawn from a population whose
mean cholesterol level is 3.1.
(a) Assuming that the sample is drawn from a normal distribution give two
reasons why a t-test is appropriate.
(b) Perform the test, stating your null and alternative hypotheses. What
conclusions do you draw?
(c) Calculate a 90% symmetric confidence interval for the mean cholesterol
level in the population. (SUJB)
TEST 3 — TESTING THE DIFFERENCE BETWEEN MEANS _
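Consider two unpaired, independent samples of sizes n₁ and n₂. We use the
test statistic
$Z = \frac{\bar{X}_1 - \bar{X}_2 - (\mu_1 - \mu_2)}{\sqrt{\frac{\sigma_1^2}{n_1} + \frac{\sigma_2^2}{n_2}}}$ where Z ~ N(0, 1)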
Example 9.13 A random sample of size 100 is taken from a normal population
with variance σ₁² = 40. The sample mean x̄₁ is 38.3. Another
random sample, of size 80, is taken from a normal population with
variance σ₂² = 30. The sample mean x̄₂ is 40.1. Test, at the 5% level,
whether there is a significant difference in the population means
μ₁ and μ₂.
We use a two-tailed test, at the 5% level and reject H₀ if |z| > 1.96.
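The two-sample statistic with known population variances can be evaluated as in the sketch below. The function name and the optional `diff0` argument (the difference μ₁ − μ₂ stated in the null hypothesis) are assumptions made for this illustration.

```python
from math import sqrt

def two_sample_z(xbar1, xbar2, var1, n1, var2, n2, diff0=0.0):
    """z for the difference of two sample means, population variances known."""
    standard_error = sqrt(var1 / n1 + var2 / n2)
    return (xbar1 - xbar2 - diff0) / standard_error

z = two_sample_z(38.3, 40.1, 40, 100, 30, 80)
print(round(z, 3), abs(z) > 1.96)    # two-tailed test at the 5% level
```

On these figures |z| is a little over 2.04, so H₀ would be rejected at the 5% level. For a common standard deviation σ, the same function can be called with both variances set to σ².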
Example 9.14 The same test was given to a group of 100 scouts and to a group of
144 guides. The mean score for the scouts was 27.53 and the mean
score for the guides was 26.81. Assuming a common population
standard deviation of 3.48, test, using a 5% level of significance,
whether the scouts’ performance in the test was better than that of
the guides. Assume that the scores are normally distributed.
Example 9.15 A certain political group maintains that girls reach a higher standard
in single-sex classes than in mixed classes. To test this hypothesis
140 girls of similar ability are split into two groups, with 68
attending classes containing only girls and 72 attending classes with
boys. All the classes follow the same syllabus and after a specified
time the girls are given a test. The test results are summarised thus:
$\bar{x} = \frac{\Sigma x}{n_1} = 110$ and $s_1^2 = \frac{\Sigma x^2}{n_1} - \bar{x}^2 = \frac{879912}{72} - 110^2 = 121$
Single-sex classes:
Let Y be the r.v. ‘the score of a girl in a single-sex class’.
Σy = 7820, Σy² = 904808, n₂ = 68,
population mean μ₂
Therefore $\bar{y} = \frac{\Sigma y}{n_2} = \frac{7820}{68} = 115$ and $s_2^2 = \frac{\Sigma y^2}{n_2} - \bar{y}^2 = \frac{904808}{68} - 115^2 = 81$
$\hat{\sigma}^2 = \frac{n_1s_1^2 + n_2s_2^2}{n_1 + n_2 - 2} = \frac{72(121) + 68(81)}{72 + 68 - 2} = 103.04$
Therefore $\hat{\sigma}^2 = 103.04$ and $\hat{\sigma} = 10.15$ (2 d.p.).
Significance test:
H₀: μ₁ = μ₂ (there is no difference in the test scores)
H₁: μ₁ < μ₂ (the girls in the single-sex classes reach a
higher standard)
We use a one-tailed test at the 1% level and reject H₀ if z < −2.326,
where $z = \frac{\bar{x} - \bar{y} - (\mu_1 - \mu_2)}{\hat{\sigma}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}} = \frac{110 - 115 - (0)}{10.15\sqrt{\frac{1}{72} + \frac{1}{68}}} = -2.91$ (2 d.p.)
Conclusion: As z < −2.326 we reject H₀ and conclude that there is
evidence at the 1% level to suggest that girls in single-sex classes
reach a higher standard.
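The pooled estimate of the common variance and the resulting test statistic can be computed as in this sketch. The function name is an assumption made here; s₁² and s₂² are the sample variances with divisors n₁ and n₂, as used in this solution.

```python
from math import sqrt

def pooled_variance(n1, s1_sq, n2, s2_sq):
    """Estimate of a common population variance from two sample variances."""
    return (n1 * s1_sq + n2 * s2_sq) / (n1 + n2 - 2)

var_hat = pooled_variance(72, 121, 68, 81)          # mixed classes, single-sex classes
z = (110 - 115) / (sqrt(var_hat) * sqrt(1 / 72 + 1 / 68))
print(round(var_hat, 2), round(z, 2), z < -2.326)
```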
Example 9.16 Two statistics teachers, Mr Chalk and Mr Talk argue about their
abilities at golf. Mr Chalk claims that with a number 7 iron he can
hit the ball, on average, at least 10 m further than Mr Talk. Denoting
the distance Mr Chalk hits the ball by (100 + c) metres, the following
results were obtained: n₁ = 40, Σc = 80, Σ(c − c̄)² = 1132.
Denoting the distance Mr Talk hits the ball by (100 + t) metres, the
following results were obtained: n₂ = 35, Σt = −175,
Σ(t − t̄)² = 1197
If the distances for both teachers are normally distributed with a
common variance, show that an unbiased estimate of this common
variance is 31.90.
Test whether there is any evidence, at the 1% level, to support Mr
Chalk’s claim.
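Mr Talk:
Let X₂ be the r.v. ‘the distance, in m, for Mr Talk’.
Distance = (100 + t) metres
$\bar{x}_2 = 100 + \frac{\Sigma t}{n_2} = 100 + \frac{-175}{35} = 95$
$\hat{\sigma}^2 = \frac{\Sigma(c - \bar{c})^2 + \Sigma(t - \bar{t})^2}{n_1 + n_2 - 2} = \frac{1132 + 1197}{40 + 35 - 2} = \frac{2329}{73} = 31.90$
Mr Chalk claims that he can hit the ball at least 10 m further than
Mr Talk. Therefore the alternative hypothesis is that Mr Chalk hits
the ball less than 10 m further, and a one-tailed test is performed.
H₀: μ₁ − μ₂ = 10
H₁: μ₁ − μ₂ < 10 (Mr Chalk hits the ball less than 10 m
further than Mr Talk)
$\bar{X}_1 - \bar{X}_2 \sim N\left(\mu_1 - \mu_2,\ \hat{\sigma}^2\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right)$
We reject H₀ at the 1% level if z < −2.326, where
$z = \frac{102 - 95 - (10)}{5.648\sqrt{\frac{1}{40} + \frac{1}{35}}} = -2.29$ (2 d.p.)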
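The golf calculation can be checked numerically as below. This sketch makes one assumption that should be stated plainly: it takes Σ(c − c̄)² = 1132, the value consistent with the stated unbiased estimate 31.90 (since 1132 + 1197 = 2329 and 2329/73 = 31.90). Variable names are illustrative.

```python
from math import sqrt

n1, sum_c, ss_c = 40, 80, 1132       # Mr Chalk: distance = 100 + c (ss_c assumed, see above)
n2, sum_t, ss_t = 35, -175, 1197     # Mr Talk:  distance = 100 + t

mean_chalk = 100 + sum_c / n1        # 102
mean_talk = 100 + sum_t / n2         # 95
var_hat = (ss_c + ss_t) / (n1 + n2 - 2)              # unbiased common variance
z = (mean_chalk - mean_talk - 10) / sqrt(var_hat * (1 / n1 + 1 / n2))
print(round(var_hat, 2), round(z, 2), z < -2.326)
```

On these figures z is about −2.29, which is not below −2.326, so at the 1% level H₀ (a difference of 10 m) would not be rejected.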
Example 9.17 The heights (measured to the nearest cm) of a random sample of six
policemen from a certain force in Wales were found to be
Σy = 1991, Σ(y − ȳ)² = 54
Test at the 5% level, the hypothesis that Welsh policemen are shorter
than Scottish policemen. Assume that the heights of policemen in
both forces are normally distributed and have a common population
variance.
So $\bar{x} = \frac{\Sigma x}{n_1} = \frac{1078}{6} = 179.67$ cm (2 d.p.)
Scottish policemen:
So $\bar{y} = \frac{\Sigma y}{n_2} = \frac{1991}{11} = 181$ cm
$n_2s_2^2 = \Sigma(y - \bar{y})^2 = 54$
Let the unbiased estimate of the common population variance be $\hat{\sigma}^2$.
We have
$\hat{\sigma} = 2.329$ cm (3 d.p.)
Significance test:
The test statistic is
$T = \frac{\bar{X} - \bar{Y} - (\mu_1 - \mu_2)}{\hat{\sigma}\sqrt{\frac{1}{n_1} + \frac{1}{n_2}}}$ where T ~ t(n₁ + n₂ − 2)
The critical value for t is found from row ν = 15, column Q = 5%,
giving t = −1.753.
$t_{test} = \frac{(179.67 - 181) - (0)}{2.329\sqrt{\frac{1}{6} + \frac{1}{11}}}$
= −1.13 (2 d.p.)
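The two-sample t statistic can be reproduced from the quantities quoted in this solution. In the sketch below the function name is an assumption, the pooled estimate 2.329 is taken as given, and the critical value −1.753 is the table value quoted above.

```python
from math import sqrt

def pooled_t(xbar1, xbar2, sigma_hat, n1, n2, diff0=0.0):
    """Two-sample t statistic with a pooled s.d.; returns (t, degrees of freedom)."""
    t = (xbar1 - xbar2 - diff0) / (sigma_hat * sqrt(1 / n1 + 1 / n2))
    return t, n1 + n2 - 2

t, dof = pooled_t(1078 / 6, 1991 / 11, 2.329, 6, 11)
print(dof, round(t, 2), t < -1.753)   # one-tailed test at the 5% level
```

Since t is about −1.13, which is not below −1.753, H₀ would not be rejected at the 5% level on these figures.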
Exercise 9e
1. For each of the following sets of data, perform a test to decide whether
there is a significant difference between the means, μ₁ and μ₂, of the normal
populations from which the samples are drawn.
[Table: sample data for parts (a) to (n), including the common population standard deviation (σ)]
cars during a certain month. A sample of 100 cars yielded:
$\sum x_i = 325.5$, $\sum x_i^2 = 1076$, $x_i$ being the time spent, in hours, on the ith car.
Assuming that the measurements are from a normal population, give 95% confidence
limits for the population mean.
Could the restriction of the population being normal be dropped? A sample of 25
cars from the following month yielded a mean repair time of 3.55 hours. Is this
evidence of an increase in population mean over the previous month? (SUJB)

9. Mr Brown and Mr Green work at the same office and live next door to each other.
Each day they leave for work together but travel by different routes. Mr Brown
maintains that his route is quicker, on average, by at least 4 minutes. Both men
time their journeys in minutes over a period of 10 weeks. The results obtained
were:
Mr Brown: $n_1 = 50$, $\bar{x}_1 = 21$, $s_1^2$ = [...]
Mr Green: $n_2 = 50$, $\bar{x}_2 = 24$, $s_2^2 = 7.84$
Assuming that the times are normally distributed and that they have a common
population variance, test at the 5% level whether Mr Brown's claim can be accepted.

10. Hischi and Taschi are two makes of video tapes. They are both advertised as having
a recording time of 3 hours. A sample of 49 Hischi tapes was tested and, denoting
the actual recording time by (180 + h) minutes, the following results were obtained:
$\sum h = 147$, $\sum (h - \bar{h})^2 = 12720$
A sample of 81 Taschi tapes was also tested. Denoting the actual recording time
by (180 + t) minutes, the results obtained were
$\sum t = 324$, $\sum (t - \bar{t})^2 = 33488$
If the recording times for the two makes are normally distributed and have a
common variance, show that the unbiased estimate of this common variance is 361.
Test whether there is significant evidence, at the 5% level, of a difference in the
mean recording times. Is the difference significant at the 4% level?

11. The lengths (in millimetres) of nine screws selected at random from a large
consignment are found to be 7.99, 8.01, 8.00, 8.02, 8.03, 7.99, 8.00, 8.01, 8.01.
Calculate unbiased estimates of the population mean and variance. Assuming a
normal distribution with variance 0.0001, test, at the 5% level, the hypothesis that
the population mean is 8.00 against the alternative hypothesis that the population
mean is not 8.00.
From a second large consignment, sixteen screws are selected at random and their
mean length (in millimetres) is found to be 7.992. Assuming a normal distribution
with variance 0.0001, test, at the 5% level, the hypothesis that this population
has the same mean as the first population, against the alternative hypothesis that this
population has a smaller mean than the first population. (C)

12. A random sample of size $n_1$ is taken from a population $P_1$ whose mean is $\mu_1$ and
variance $\sigma_1^2$ and a random sample of size $n_2$ is taken from population $P_2$ with mean
$\mu_2$ and variance $\sigma_2^2$. Under what circumstances is it valid to test the hypothesis
$\mu_1 - \mu_2 = 0$ using a two-sample t-test?
A machine fills bags of sugar and a random sample of 20 bags selected from
a week's production yielded a mean weight of 499.1 g with standard deviation
0.63 g. A week later a sample of 25 bags yielded a mean weight of 500.2 g with
standard deviation 0.48 g. Assuming that your stated conditions are satisfied perform
a test to determine whether the mean has increased significantly during
the second week. Test whether the mean during the second week could be 500 g.
(Use a 5% significance level for both tests.) (SUJB)

13. A large number of tomato plants are grown under controlled conditions. Half
of the plants, chosen at random, are treated with a new fertilizer, and the
other half of the plants are treated with a standard fertilizer. Random samples of
100 plants are selected from each half, and records are kept of the total crop
mass of each plant. For those treated with the new fertilizer, the crop masses (in
suitable units) are summarized by the figures $\sum x = 1030.0$, $\sum x^2 = 11045.59$.
Obtain an unbiased estimate of the population variance, and, treating the
sample as a large sample from a normal distribution, obtain a symmetric 96%
confidence interval for the mean crop mass.
The corresponding figures for those plants treated with the standard fertilizer are
$\sum y = 990.0$, $\sum y^2 = 10079.19$. Treating the sample as a large sample from a
normal distribution, and assuming that
14. From a large population of students, 120 males and 160 females are chosen at
random. Their heights x in metres are summarised in the table below. The males
and females may be treated as random samples from two independent populations.

            n      $\sum x$   $\sum x^2$
Males      120     198        327
Females    160     248        385

(a) Find the sample means and variances.
(b) Assuming that in both populations the heights are normally distributed with
these means and variances, find the probability that a randomly-chosen female
will be taller than a randomly-chosen male.
(c) Assuming only that height is dis...

... assuming the heights to be normally distributed, give a symmetric 99% confidence
interval for the mean height of the sunflowers in the shady side of the garden.
A second group of sunflowers is growing in the sunny side of the garden. A random
sample of 26 of these sunflowers is measured. The sample mean height is
found to be 3.29 m and the sample standard deviation is found to be 0.90 m.
Treating the samples as large samples from normal distributions having the
same variance but possibly different means, obtain a pooled estimate of the
variance and test whether the results provide significant evidence (at the 5%
level) that the sunny-side sunflowers grow taller, on average, than the shady-side
sunflowers. (C)
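[Diagram: standard normal curve with the 5% critical region (reject $H_0$) in the left-hand tail; critical value -1.645.]
$$z = \frac{0.76 - 0.80}{\sqrt{\frac{(0.8)(0.2)}{200}}} = -1.414$$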
Conclusion: As z > -1.645, we do not reject $H_0$ and conclude that
there is not sufficient evidence, at the 5% level, to refute the
manufacturer's claim.
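The z value for a single sample proportion can be reproduced with a few lines of code. This is a minimal sketch: the function name is an illustrative assumption, and the sample proportion 0.76 is read from the working above.

```python
import math


def proportion_z(sample_proportion, p0, n):
    """z statistic for a sample proportion under H0: p = p0 (normal approximation)."""
    q0 = 1 - p0
    return (sample_proportion - p0) / math.sqrt(p0 * q0 / n)


z = proportion_z(0.76, 0.8, 200)
critical_value = -1.645  # 5% in the lower tail of N(0, 1)
reject_h0 = z < critical_value

print(round(z, 3), reject_h0)
```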
= —1.503
Alternative Solution 9.18 Let X be the r.v. 'the number of dogs who prefer Chummy Morsels'.
Then X ~ Bin(n, p).
$H_0$: p = 0.8 (80% prefer Chummy Morsels)
$H_1$: p < 0.8 (less than 80% prefer Chummy Morsels)
Under $H_0$
X ~ Bin(n, p) with n = 200 and p = 0.8
Example 9.19 A large college claims that it admits equal numbers of men and
women. A random sample of 500 students at the college gave 267
males. Is there any evidence, at the 5% level, that the college
population is not evenly divided into males and females?
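Solution 9.19 From the sample: $p_s = \dfrac{267}{500} = 0.534$, n = 500
$$P_s \sim N\left(p, \frac{pq}{n}\right) \text{ where } q = 1 - p$$
[Diagram: sampling distribution of $P_s$ with mean p and s.d. $\sqrt{pq/n}$; 2.5% critical regions in both tails.]
z = 1.52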
Exercise 9f
1. [Table not recoverable; its column headings include the sample size and the number of successes.]
2. A theory predicts that the probability of an event is 0.4. The theory is tested
experimentally and in 400 independent trials the event occurred 140 times. Is the
number of occurrences significantly less than that predicted by the theory? Test at
the 1% level.
3. It is thought that the proportion of defective items produced by a particular machine
is 0.1. A random sample of 100 items is inspected and found to contain 15 defective
items. Does this provide evidence, at the 5% level, that the machine is producing
more defective items than expected?
4. A coin is tossed 100 times and 38 heads are obtained. Is there evidence, at the 2% level,
that the coin is biased in favour of tails?
6. The probability that an oyster larva will develop in unpolluted water is 0.9, while in
polluted water this probability is less than 0.9. Given that 20 oyster larvae are placed
in unpolluted water, find the probabilities, each to two decimal places, that the number
that will develop is
(a) at least 17,
(b) exactly 17.
An oyster breeder put 20 larvae in a sample of water and observed that only 16 of
them developed. Use a 10% significance level to determine whether the breeder
would be justified in concluding that the water is polluted. (JMB)

7. A fruit farm grows 'Golden Delicious' apples, and it can be assumed that the
distribution of the masses of the apples is described by a normal probability function.
The apples are graded by mass (x g) into three grades: 'small' for x < 80, 'medium'
for 80 < x < 100, 'large' for x > 100. In 1979, 20% of the apples were graded as
'small' and 54% as 'medium'. Estimate, to one decimal place, the mean and standard
deviation of the masses of the apples produced on the farm in that year. Estimate
also what proportion of the apples had masses exceeding 105 g.
When he begins to harvest his 1980 crop the grower picks out a sample of 100
apples at random and finds that only 9 are 'small'. Find, on the hypothesis that the
proportion of 'small' apples in the whole crop is the same as in 1979, the probability
of getting 9 or fewer 'small' apples in such a sample. Would he be justified in
concluding, on the evidence of this sample, that there has been a reduction (for 1980
as compared with 1979) in the percentage of 'small' apples in the crop as a whole?
(SMP)

8. ...
(a) Find an unbiased estimate for the proportion p of sweets produced which
are black, and, to three significant figures, an estimate of its standard error.
(b) Using a distributional approximation and a 5 per cent significance level, test
the null hypothesis p = 0.1 against the alternative hypothesis $p \ne 0.1$. State
your conclusion.
(c) Given that p = 0.1, use tables to find, to the nearest integer, the expected
frequencies corresponding to the observed frequencies tabulated above. (JMB)

9. In a public opinion poll, 1000 randomly chosen electors were asked whether they
would vote for the 'Purple Party' at the next election and 357 replied 'Yes'. Find
a 95% confidence interval for the proportion p of the population who would
answer 'Yes' to the same question.
Twenty similar polls are taken and the 95% confidence interval is determined for
each poll. State the expected number of these intervals which will enclose the true
value of p.
The leader of the 'Purple Party' believes that the true value of p is 0.4. Test, at the
8% level, whether he is overestimating his support. (C)

10. In an investigation into ownership of calculators, 200 randomly chosen school
students were interviewed, and 143 of them owned a calculator. Using the
evidence of this sample, test, at the 5% level of significance, the hypothesis that
the proportion of school students owning a calculator is 75% against the alternative
hypothesis that the proportion is less than 75%. (C)
$$P_{s_2} \sim N\left(p_2, \frac{p_2 q_2}{n_2}\right) \text{ with } q_2 = 1 - p_2$$
So
$$P_{s_1} - P_{s_2} \sim N\left(p_1 - p_2,\ \frac{p_1 q_1}{n_1} + \frac{p_2 q_2}{n_2}\right)$$
The distribution is known as the sampling distribution of the
difference between proportions.
We usually wish to test whether the samples have been drawn from
populations which have a common proportion p.
In this case
$$P_{s_1} - P_{s_2} \sim N\left(0,\ pq\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right) \text{ where } q = 1 - p$$
Case 1 — If p is known
Case 2 — If p is unknown
We use an estimate $\hat{p}$ for it, where
$$\hat{p} = \frac{n_1 p_{s_1} + n_2 p_{s_2}}{n_1 + n_2}$$
Example 9.20 Two companies ‘Consumer Opinion’ and ‘People’s Choice’ conduct
research in a large city before an election. ‘Consumer Opinion’
finds that in a random sample of 500 people, 325 said that they
would vote for Mr A. The ‘People’s Choice’ finds that in a random
sample of 300 people, 201 said that they would vote for Mr A.
Test, at the 5% level, whether there is a significant difference
between the two proportions.
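Solution 9.20 Consumer Opinion: $p_{s_1} = \dfrac{325}{500} = 0.65$, $n_1 = 500$
Population proportion = $p_1$
People's Choice: $p_{s_2} = \dfrac{201}{300} = 0.67$, $n_2 = 300$
Population proportion = $p_2$
$$\hat{p} = \frac{n_1 p_{s_1} + n_2 p_{s_2}}{n_1 + n_2} = \frac{325 + 201}{500 + 300} = 0.6575$$
and $\hat{q} = 1 - \hat{p} = 0.3425$
[Diagram: standard normal curve with critical regions (reject $H_0$) in both tails.]
$$z = \frac{0.65 - 0.67}{\sqrt{(0.6575)(0.3425)\left(\frac{1}{500} + \frac{1}{300}\right)}} = -0.577$$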
Conclusion: As |z| < 1.96, we do not reject $H_0$ and conclude that
there is no evidence at the 5% level of a significant difference
between the proportions obtained by the two companies.
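The pooled estimate and the test statistic for a difference between two proportions can be sketched as follows. The function name is an illustrative assumption; the counts are those of Example 9.20.

```python
import math


def two_proportion_z(x1, n1, x2, n2):
    """Return (pooled estimate of p, z statistic) for H0: p1 = p2."""
    p_s1, p_s2 = x1 / n1, x2 / n2
    p_hat = (x1 + x2) / (n1 + n2)  # same as (n1*p_s1 + n2*p_s2) / (n1 + n2)
    q_hat = 1 - p_hat
    standard_error = math.sqrt(p_hat * q_hat * (1 / n1 + 1 / n2))
    return p_hat, (p_s1 - p_s2) / standard_error


p_hat, z = two_proportion_z(325, 500, 201, 300)
reject_h0 = abs(z) > 1.96  # two-tailed test at the 5% level

print(round(p_hat, 4), round(z, 3), reject_h0)
```

Working from the raw counts rather than the rounded proportions avoids carrying rounding error into the pooled estimate.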
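Solution 9.21 Stay-White toothpaste: $p_{s_1} = \dfrac{118}{150} = 0.787$, $n_1 = 150$,
population proportion = $p_1$
Ordinary toothpaste: $p_{s_2} = \dfrac{72}{100} = 0.72$, $n_2 = 100$,
population proportion = $p_2$
$$\hat{p} = \frac{n_1 p_{s_1} + n_2 p_{s_2}}{n_1 + n_2} = \frac{118 + 72}{250} = 0.76$$
and $\hat{q} = 1 - \hat{p} = 0.24$
$H_0$: $p_1 = p_2 = p$ (the proportions are the same)
$H_1$: $p_1 > p_2$ (the proportion with no dental decay is greater
if Stay-White is used)
Consider the sampling distribution of the difference between
proportions, where
$$P_{s_1} - P_{s_2} \sim N\left(0,\ \hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)\right)$$
[Diagram: standard normal curve with the 10% critical region (reject $H_0$) in the right-hand tail; critical value 1.282.]
$$z = \frac{p_{s_1} - p_{s_2}}{\sqrt{\hat{p}\hat{q}\left(\frac{1}{n_1} + \frac{1}{n_2}\right)}} = \frac{0.787 - 0.72}{\sqrt{(0.76)(0.24)\left(\frac{1}{150} + \frac{1}{100}\right)}} = 1.21 \text{ (3 S.F.)}$$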
Conclusion: As z < 1.282 we do not reject $H_0$ and conclude that
there is no evidence, at the 10% level, to suggest that people have
fewer occurrences of dental decay if they use Stay-White toothpaste.
Exercise 9g
1. For each of the following sets of data, test the hypothesis that there is a common
population proportion p.
[Table not recoverable. Column headings: Sample I (number in sample, number of 'successes'), Sample II (number in sample, number of 'successes'), level.]

... them. Give full details of your test. (C)

A certain country in a fairy tale is populated by elves and fairies. A random
sample of 100 fairies were each asked the question 'Do you believe in people': 72
fairies replied 'Yes' and the remainder

[Table fragment for the apple question: Eaten 150; Uneaten 40; remaining entries not recoverable.]
You may assume, when answering the questions below, that the fate of an
individual apple was independent of the fate of all other apples.
(a) Before any apple had fallen or been eaten, the farmer selected at random a
variety A apple and stated that it would not fall before picking time. Estimate the
probability that he was correct.
(b) At picking time the farmer accidentally trod on a fallen apple. Assuming
that this apple was equally likely to have been any one of the fallen apples, estimate
the probability that it was of variety A.
(c) Give an approximate symmetric 95% confidence interval for $p_A$, the proportion
of variety A apples remaining on the tree and uneaten until picking time.
(d) The proportion of variety B apples remaining on the tree and uneaten until
picking time is $p_B$. Determine whether there is evidence at the 0.1% significance
level of a difference between $p_A$ and $p_B$. (C)

Assuming that the mean and variance of a random variable X having a binomial
distribution with parameters n and p are np and np(1 - p) respectively, prove
that the mean and variance of a proportion based on a sample of size n are p and
p(1 - p)/n respectively, where p is the true proportion. Of a random sample of 50
shoppers in a certain city store 13 stated that they lived more than 10 miles from
the city centre. Of a random sample of 50 shoppers from another store in the same
city 9 lived more than 10 miles from the city centre. Stating your null and alternative
hypotheses and using a significance level of 5%
(a) test that the true proportion in both stores could be 0.15;
(b) show that the two samples do not offer evidence of a difference in proportions
between the two stores. (SUJB)

An organisation interviews a randomly chosen sample of 1000 adults from the
population of the United Kingdom, and 517 of those interviewed claim to support
the Conservative party. A second organisation independently interviews a random
sample of 2000 adults, of whom 983 claim to support the Conservative party.
(a) Verify that the results of the two organisations do not differ significantly at
the 5% level.
(b) Obtain a symmetric 99% confidence interval (based on the combined results)
for the proportion of the population who claim to support the Conservative party.
(C)

Countries A and B contain large numbers of elm trees, many of which are affected
by Dutch elm disease. It is found that in country A, out of a random sample of
100 elm trees, 67 are affected by the disease and that in country B, out of a
random sample of 150 elm trees, 93 are affected by the disease.
Tree experts have a theory that the proportions of affected trees in the two
countries are the same, although there is a possibility that, since the disease affected
the trees in country A before those in country B, the proportion of trees
affected in country A may be greater than the proportion affected in country
B. Using the sample results, test, at the 10% significance level, the theory of the
experts. For the test that you perform, state clearly the hypotheses under comparison.
Assuming that the proportions of affected trees in the two countries are the same,
give an approximate symmetric 98% confidence interval for this common
proportion. (C)

10. According to some recent accident statistics, a random sample of 800 car
drivers injured in road accidents comprised 250 who were wearing seat
belts and 550 who were not wearing seat belts. Determine a symmetric 99%
confidence interval for the proportion of injured drivers wearing seat belts.
The injuries of the car drivers were described as either slight or serious. Of
those wearing seat belts, 50 were seriously injured; of those not wearing seat belts,
150 were seriously injured. Determine an unbiased estimate of the overall proportion
of serious injuries amongst the injured drivers. Test, at the 5% significance level,
the hypothesis that the proportion of serious injuries is greater amongst those
injured drivers not wearing seat belts than amongst those injured drivers wearing
seat belts. (C)

11. It is known that in a large population there is a proportion p with a certain
attribute. A random sample of size n is taken and it is found that x of them have
the attribute. If $\hat{p} = x/n$ show that the mean and variance of $\hat{p}$ are p and
p(1 - p)/n respectively. (You may assume any result relating to a binomial
distribution.) What is the approximate distribution of $\hat{p}$ when n is large?
An ambulance station claims that at least 30% of its calls are life-threatening
emergencies. To check this a random sample of 150 of its records were examined and,
of these, only 38 were found to be life-threatening emergencies. Test the claim
using a 1% significance level.
At a neighbouring station a random sample of 150 records showed that 50
were life-threatening emergencies. Test whether there is a difference between
percentage rates in the two stations. (SUJB)

12. A drug research company has produced two compounds A and B for reducing
blood pressure. They are administered to two sets of volunteers. One group of 90
was treated with A and 59 responded with lower blood pressure. The other
group of 80 was treated with B and 51 responded with lower blood pressure.
(a) Find an approximate 95% confidence interval for the population proportion for
which A is effective. In what way is your interval approximate?
(b) Test (at the 5% significance level) if there is any difference between the
effectiveness of the two drugs. (SUJB)
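Binomial situation: $Z = \dfrac{X - np}{\sqrt{npq}}$
$\sigma$ unknown
Large n: $Z = \dfrac{\bar{X} - \mu}{\hat{\sigma}/\sqrt{n}}$        Small n: $T = \dfrac{\bar{X} - \mu}{S/\sqrt{n-1}}$
where $\hat{\sigma}^2 = \dfrac{nS^2}{n-1}$
Equal population variance $\sigma^2$
$\sigma$ known        $\sigma$ unknown
Proportions, p known: $Z = \dfrac{P_s - p}{\sqrt{\dfrac{pq}{n}}}$ (large n)
$Z = \dfrac{P_{s_1} - P_{s_2}}{\sqrt{\hat{p}\hat{q}\left(\dfrac{1}{n_1} + \dfrac{1}{n_2}\right)}}$ where $\hat{p} = \dfrac{n_1 p_{s_1} + n_2 p_{s_2}}{n_1 + n_2}$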
Miscellaneous Exercise 9h
1. The heights of men can be assumed to be normally distributed with standard
deviation 0.11 m.
In 1928 the mean height of men in a certain city was 1.72 m. In a survey in
1978 the mean height of a random sample of 16 men from the same city was 1.77 m.
On the hypothesis that the population mean height has not changed, calculate
the probability of obtaining a sample mean height greater than that measured.
In another survey in 1978 the mean height of a random sample of 32 men
from a second city was 1.73 m. Assuming that the population mean heights are the
same in the two cities, calculate the probability that a difference in sample
mean heights greater than that measured would be obtained. (MEI)

2. Jack says 'Boys are better than girls at Watology'. Jill says 'Not true, girls are
better'. Assume ability at Watology can be measured by a test with a maximum
score of 100 and that scores are approximately normally distributed. Explain how
to investigate Jack and Jill's assertions by describing how to conduct an experiment
in which the measurements made are the Watology test scores of boys and girls.
Assume your experiment has been done and the following scores obtained:

Boys, x  | 92 80 76 79 84 80 87 88 81 91
Girls, y | 94 86 78 77 85 83 96 88 82 90

Test if there is any difference in the ability of boys and girls at Watology.
(The following sums may be of use to some candidates: $\sum x = 838$, $\sum y = 859$,
$\sum x^2 = 70492$, $\sum y^2 = 74143$.) (O)

3. An expert golfer wishes to discover whether the average distances travelled by
two different brands of golf ball differ significantly. He tests each ball by hitting
it with his driver and measuring the distance X (in metres) that it travels. The
distribution of X may be assumed to be normal.
His results for a random sample of nine 'Farfly' golf balls were $\bar{x} = 214$ and
$\sum (x - \bar{x})^2 = 2048$. Making the assumption that the population variance is equal to
the sample variance, obtain a 95% symmetric confidence interval for the
mean of X for 'Farfly' golf balls.
His results for a random sample of sixteen 'Gofar' golf balls were $\bar{x} = 224$ and
$\sum (x - \bar{x})^2 = 2460$. Assuming that the variance of X is the same for both types
of golf ball, obtain a pooled (two-sample) estimate of this variance and, making the
assumption that the true variance is equal to this estimate, test at the 5% level
whether his results for 'Gofar' golf balls differ significantly from those for 'Farfly'
golf balls. (C)

4. Mr Smith and Mr Jones are neighbours who work at the same office. Mr Smith
drives to work in his old car, and each day records the time (x minutes) his
journey takes. After 250 journeys his observations are summarised by
$\sum x = 6250$, $\sum x^2 = 158491$. Regarding his observations as constituting a large
random sample, give a symmetric 97% confidence interval for his average journey
time.
Mr Jones drives to work in his new car, and his average time over a random
sample of 50 journeys is found to be 21 minutes. Mr Jones claims that if he leaves
home 3 minutes after Mr Smith he will, on average, arrive at work before him.
Assuming that Mr Smith and Mr Jones take different routes to work, that their
journey times have standard deviation 3 minutes and that the samples may be
treated as being large samples, test whether Mr Jones' claim may be accepted at the
2% significance level. (C)

5. Let p denote the probability of obtaining a head when a certain coin is tossed.
(a) If p = 0.4, find the probability of obtaining at least 3 heads in 10 independent
tosses of the coin.
(b) If p = 0.6, find the probability of obtaining exactly 12 heads in 20 independent
tosses of the coin.
(c) Write down an appropriate null hypothesis and an appropriate alternative
hypothesis for testing whether the coin is unbiased.
To carry out this test 20 independent tosses of the coin are made and the number
of heads that occurs is observed. Given that 15 heads occurred, carry out
the test, assuming a 5 per cent significance level. Write down a statement of the
conclusion you draw about the value of p for this coin. (JMB)
6. (a) After a survey a market research company asserted that 75% of T.V. viewers
watched a certain programme. Another company interviewed 75 viewers and
found that 51 watched the programme and 24 did not. Does this provide evidence
at the 5% level of significance that the first company's figure of 75% was
incorrect?
(b) Samples of leaves were collected from two oak trees A and B. The number of
galls was counted on each leaf and the mean and standard deviation of the number
of galls per leaf were calculated with the following results:
[Table not recoverable; rows: sample size, mean, S.D.]
Assuming normal distributions, do the data provide evidence at the 5% significance
level of different population means for the two trees? (SUJB)

7. An investigation was conducted into the dust content in the flue gases of two
types of solid-fuel boilers. Thirteen boilers of type A and nine boilers of type
B were used under identical fuelling and extraction conditions. Over a similar
period, the following quantities (Table A), in grams, of dust were deposited in similar
traps inserted in each of the twenty-two flues.
Assuming that these independent samples came from normal populations with the
same variance
(a) use a two sample t-test at the 5% level of significance to determine whether
there is any difference between the two samples as regards the mean dust deposit,
(b) test at the 5% level of significance whether there is any difference between
the two samples as regards the mean dust deposit, where this time you should also
assume that the population variances are both known to be 196.0.
Explain the apparent contradiction in your test results. (AEB 1980)

8. Explain what is meant by a random sample.
Random samples of size $n_1$ and $n_2$ are taken, one from each of two normal
distributions with means $\mu_1$, $\mu_2$ and variances $\sigma_1^2$, $\sigma_2^2$ respectively. The sample
means are $\bar{x}_1$ and $\bar{x}_2$ respectively. Write down expressions for the population
mean and variance of $\bar{X}_1 - \bar{X}_2$.
Given that $\sigma_1^2 = 0.04$, $\sigma_2^2 = 0.05$ and $n_1 = n_2 = 100$, write down a symmetrical
two-sided 99% confidence interval for $\mu_1 - \mu_2$ in terms of $\bar{x}_1 - \bar{x}_2$.
If in fact $\mu_1 = 3.06$ and $\mu_2 = 3.00$, determine the probability that the hypothesis
$\mu_1 = \mu_2$ would not be rejected using a two-tailed test with significance level 1%.
State how this probability would be affected if the values of the population
means were $\mu_1 = 3.00$, $\mu_2 = 3.06$.
Determine whether or not the hypothesis $\mu_1 = \mu_2$ should be rejected at the 1% level
of significance in each of the cases when
(a) $\bar{x}_1 = 3.07$, $\bar{x}_2 = 2.99$,
(b) $\bar{x}_1 = 3.07$, $\bar{x}_2 = 3.12$. (JMB)

9. The length X of a certain component made by a machine is specified by the
manufacturer to be 10 cm. X may be considered to be a random variable distributed
normally with mean 10 cm and standard deviation 0.05 cm. All components are
tested and are acceptable if they lie between 9.95 cm and 10.03 cm. Those
less than 9.95 cm are rejected at a loss of 40p each to the manufacturer; those
between 10.03 cm and 10.05 cm can be shortened at a loss of 20p and those
greater than 10.05 cm can be shortened resulting in a loss of 25p. Calculate the
probabilities that if a component is tested the loss L = 0, 20, 25, 40 pence and hence
calculate the expected value of L.
In order to test the accuracy of the machine a random sample of 25 components
is measured and found to have a mean length of 10.014 cm. Is this sufficient
evidence at the 5% level of significance to indicate that the mean is greater than
10 cm? If a further random sample of 25 yielded a mean of 10.012 cm, by pooling
the two samples determine whether your conclusion about the mean alters. (SUJB)

10. Blocks of wood used for flooring are cut by machine. Their lengths are normally
distributed with mean 230 mm and standard deviation 2 mm, while their
widths are normally distributed with mean 80 mm and standard deviation
1.5 mm; the two measurements are independent. Calculate the probabilities
(a) that a block selected at random will lie within the tolerance limits 226.5 mm
to 233 mm in length,
(b) that a block selected at random will lie within the tolerance limits 77 mm to
82 mm for width,
(c) that a block selected at random will satisfy both tolerances,
Table A
[Diagram: probability distribution P(X = x) for X ~ Bin(8, 0.4).]
We use a 1-tailed test and look at the right-hand tail of the distribu-
tion.
We want to draw the boundary line for the critical region so that
5% of the area lies to the right of the boundary.
We find, from tables or calculations, that
$P(X \ge 5) = 0.1737$ (> 0.05)
$P(X \ge 6) = 0.0498$ (< 0.05)
So the boundary line must be drawn slightly to the left of the
rectangle for x = 6.
[Diagram: the distribution for x = 0, 1, ..., 8 with the boundary line drawn just to the left of the rectangle for x = 6; the critical region lies to its right.]
From the diagram we see that this is the case, and conclude that
x = 7 is unlikely to occur in the distribution Bin(8, 0.4).
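The tail probabilities quoted from tables can also be computed directly. The following is a minimal sketch using only the standard library; the function names are illustrative assumptions.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def upper_tail(r, n, p):
    """P(X >= r) for X ~ Bin(n, p)."""
    return sum(binomial_pmf(k, n, p) for k in range(r, n + 1))


def smallest_critical_value(n, p, level):
    """Smallest r with P(X >= r) < level, or None if no such r exists."""
    for r in range(n + 1):
        if upper_tail(r, n, p) < level:
            return r
    return None


p_at_least_5 = upper_tail(5, 8, 0.4)
p_at_least_6 = upper_tail(6, 8, 0.4)
boundary = smallest_critical_value(8, 0.4, 0.05)

print(round(p_at_least_5, 4), round(p_at_least_6, 4), boundary)
```

The value returned by `smallest_critical_value` is the first rectangle lying wholly inside the right-hand critical region.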
Example 9.22 A coin is tossed 6 times. Test, at the 5% level, whether the coin is
biased towards heads if (a) 6 heads are obtained, (b) 5 heads
are obtained.
Solution 9.22 Let X be the r.v. 'the number of heads when the coin is tossed 6
times', and let p be the probability that the coin shows heads.
$H_0$: p = 0.5 (the coin is fair)
$H_1$: p > 0.5 (the coin is biased so that it is
more likely to show heads)
so the boundary line for the critical region will be drawn as shown
in the diagram, to give an area of 5% in the critical region.
[Diagram: probability distribution P(X = x) for X ~ Bin(6, 0.5), with 5% shaded in the right-hand tail as the critical region (reject $H_0$).]
We see that the rectangle for x = 6 lies wholly in the critical region,
and conclude that there is evidence, at the 5% level, to suggest that
the coin is biased towards heads if 6 heads are obtained in 6 tosses.
(b) 5 heads are obtained:
[Diagram: probability distribution P(X = x) for x = 0, 1, ..., 6, with 5% shaded in the right-hand tail (reject $H_0$).]
The rectangle for x = 5 does not lie wholly in the critical region,
so we do not reject $H_0$ and conclude that there is no evidence, at
the 5% level, to suggest that the coin is biased towards heads if 5
heads are obtained in 6 tosses.
Example 9.23 The discrete r.v. X is distributed binomially with n = 10. If a
single observation x is taken from the distribution, test, at the 8%
level, the hypothesis that p = 0.45 against the alternative hypothesis
$p \ne 0.45$ when (a) x = 7, (b) x = 1.
From tables, $P(X \ge 7) = 0.102 > 0.04$, so the rectangle for x = 7
does not lie wholly in the critical region. We do not reject $H_0$, and
conclude that there is no evidence, at the 8% level, that p differs from 0.45.
[Diagram: right-hand tail of the distribution for x = 7, 8, 9, 10, with the boundary line and 4% shaded (reject $H_0$).]
NOTE: Since $P(X \ge 8) = 0.0274$ the boundary line comes within
the rectangle for x = 7.
(b) We now test the single observation x = 1. This time we are
interested in the position of the boundary of the critical region in
the left-hand tail. Now the rectangle for x = 1 will lie wholly in
the critical region if $P(X \le 1) < 0.04$.
From tables, $P(X \le 1) = 0.0233 < 0.04$, indicating that the
rectangle for x = 1 does lie wholly in the critical region. Therefore
we reject $H_0$ and conclude, at the 8% level, that there is
evidence that $p \ne 0.45$.
[Diagram: left-hand tail of the distribution for x = 0, 1, 2, with the boundary line and 4% shaded (reject $H_0$).]
NOTE: Since $P(X \le 2) = 0.0996$, the boundary line comes within
the rectangle for x = 2.
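Both parts of the two-tailed test can be checked numerically. This is a sketch only: the helper names are illustrative assumptions, and each tail is compared with half of the 8% level, as in the solution.

```python
from math import comb


def binomial_pmf(k, n, p):
    """P(X = k) for X ~ Bin(n, p)."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def lower_tail(r, n, p):
    """P(X <= r)."""
    return sum(binomial_pmf(k, n, p) for k in range(0, r + 1))


def upper_tail(r, n, p):
    """P(X >= r)."""
    return sum(binomial_pmf(k, n, p) for k in range(r, n + 1))


n, p0 = 10, 0.45
half_level = 0.08 / 2  # 4% in each tail

upper_7 = upper_tail(7, n, p0)  # part (a), x = 7
lower_1 = lower_tail(1, n, p0)  # part (b), x = 1

reject_for_7 = upper_7 < half_level
reject_for_1 = lower_1 < half_level

print(round(upper_7, 4), reject_for_7)
print(round(lower_1, 4), reject_for_1)
```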
Example 9.24 State the conditions under which the binomial distribution may be
used. Illustrate your answer by referring to a specific example
preferably from a project.
Let X be the r.v. ‘the number of casualties who wait more than half
an hour’, and let p be the probability that a casualty has to wait
more than half an hour.
0.488 (3d.p.)
P(X = 0) = 0.058
P(X = 1) = $8(0.7)^7(0.3)$ = 0.1977
P(X = 2) = $28(0.7)^6(0.3)^2$ = 0.2965
P(X = 3) = $56(0.7)^5(0.3)^3$ = 0.2541
and so on ...
Now let X be the r.v. ‘the number of casualties in 20 who wait more
than half an hour’.
Then X ~ Bin(20, p)
[Diagram: left-hand tail of the distribution for x = 0, 1, 2, 3, showing the critical region for the 2% level (2% shaded) and the critical region for the 5% level.]
X ~ Bin(n, p)
Level of test: $\alpha$%
1-tailed test
Reject $H_0$ if $P(X \ge r) < \dfrac{\alpha}{100}$
Reject $H_0$ if $P(X \le r) < \dfrac{\alpha}{100}$
2-tailed test
Reject $H_0$ if $P(X \le r) < \dfrac{\frac{1}{2}\alpha}{100}$
or if $P(X \ge r) < \dfrac{\frac{1}{2}\alpha}{100}$
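The decision rules in the summary can be sketched as a single helper. The names below are illustrative assumptions, and r is taken to be the observed value, as in the worked examples; the coin-tossing figures from Example 9.22 are used to exercise it.

```python
from math import comb


def binomial_upper_tail(r, n, p):
    """P(X >= r) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r, n + 1))


def binomial_lower_tail(r, n, p):
    """P(X <= r) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(0, r + 1))


def reject_h0(tail_probability, alpha_percent, two_tailed=False):
    """Apply the rule: reject if the tail probability is below alpha/100
    (or below half of that for a 2-tailed test)."""
    threshold = alpha_percent / 100
    if two_tailed:
        threshold /= 2
    return tail_probability < threshold


# Example 9.22: coin tossed 6 times, H0: p = 0.5, H1: p > 0.5, 5% level.
six_heads = reject_h0(binomial_upper_tail(6, 6, 0.5), 5)
five_heads = reject_h0(binomial_upper_tail(5, 6, 0.5), 5)

print(six_heads, five_heads)
```

Keeping the tail calculation separate from the decision rule means the same `reject_h0` helper applies unchanged to any discrete distribution.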
Exercise 9i
1. For each of the following, a single observation x is taken from a binomial distribution where
X ~ Bin(n, p). Test the hypotheses at the level of significance stated.
[Table of hypotheses, observed values and levels of significance not recoverable.]
2. A die is thrown 15 times and it shows a six on twelve occasions. Is the die biased in
favour of showing a six? Test at the 1% level.
3. The probability that a certain type of seed germinates is 0.7. The seeds undergo
a new treatment, and when a packet of 10 seeds is tested 9 germinate. Is this
evidence, at the 5% level, of an increase in the germination rate?
4. In a test of 10 true-false questions a student gets 8 correct. The student claims
she was not guessing. Test this claim at the 5% level.
[Diagram: probability distribution with 5% shaded in the right-hand tail as the critical region.]
Example 9.25 The number of misprints on the front page of the Daily Informer is found to have a Poisson distribution with mean 6.5. A new proofreader is employed and shortly afterwards the front page was found to have 12 misprints. The editor says that the mean number of misprints has increased. Test this claim at the 5% level.
Solution 9.25 Let X be the r.v. ‘the number of misprints on the front page’.
Then X ~ Po(λ)
H0: λ = 6.5 (the mean is unchanged)
H1: λ > 6.5 (the mean has increased)
We test at the 5% level and will reject H0 if P(X ≥ 12) < 0.05, indicating that the rectangle for P(X = 12) lies wholly within the critical region.
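As a check on the decision rule for the misprints test, the tail probability P(X ≥ 12) for X ~ Po(6.5) can be computed directly. The sketch below uses only the Python standard library; the function names are illustrative and are not taken from the text.

```python
from math import exp, factorial

def poisson_pmf(x, lam):
    """P(X = x) for X ~ Po(lam)."""
    return exp(-lam) * lam**x / factorial(x)

def poisson_upper_tail(r, lam):
    """P(X >= r) = 1 - P(X <= r - 1) for X ~ Po(lam)."""
    return 1 - sum(poisson_pmf(k, lam) for k in range(r))

p_value = poisson_upper_tail(12, 6.5)   # tail probability under H0: lambda = 6.5
reject_h0 = p_value < 0.05              # 5% level, 1-tailed (upper tail)
print(round(p_value, 4), reject_h0)
```

The tail probability comes out at about 0.034, which is less than 0.05, so under these assumptions H0 would be rejected at the 5% level.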
Example 9.27 State conditions under which the Poisson distribution is a suitable
model to use in statistical work. Describe briefly how a Poisson
distribution was used, or could have been used, in a project.
(a) The number, X, of breakdowns per day of the lifts in a large
block of flats has a Poisson distribution with mean 0.2. Find,
to 3 decimal places, the probability that on a particular day
(i) there will be at least one breakdown,
(ii) there will be at most two breakdowns.
Solution 9.27 (a) Let X be the r.v. ‘the number of breakdowns per day’.
Then X ~ Po(0.2).
= 0.181 (3d.p.)
=e as before.
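The two probabilities asked for in part (a) of the lift breakdown example, with X ~ Po(0.2), can be checked with a short calculation. This is a minimal sketch; the variable names are illustrative.

```python
from math import exp, factorial

lam = 0.2  # mean number of breakdowns per day

def pmf(x):
    """P(X = x) for X ~ Po(0.2)."""
    return exp(-lam) * lam**x / factorial(x)

p_at_least_one = 1 - pmf(0)               # (i)  P(X >= 1)
p_at_most_two = pmf(0) + pmf(1) + pmf(2)  # (ii) P(X <= 2)
print(round(p_at_least_one, 3), round(p_at_most_two, 3))
```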
X ~ Po(λ)
Level of test: α%
1-tailed test
Reject H0 if P(X ≥ r) < α/100
Reject H0 if P(X ≤ r) < α/100
2-tailed test
Reject H0 if P(X ≥ r) < ½α/100
or if P(X ≤ r) < ½α/100
Exercise 9j
1. For each of the following a single observation x is taken from a Poisson distribution, where
X ~ Po(A). Test the hypotheses at the level of significance stated.
Level of
Hypotheses significance
(f)
2. The number of white corpuscles on a slide has a Poisson distribution with mean 3.5. After certain treatment another sample was taken and the number of white corpuscles was found to be 8. Test, at the 5% level, whether the mean has increased.
3. The number of breakdowns in a computer is known to follow a Poisson distribution with a mean of 4.5 per month. A new computer is installed and in the first month there are 2 breakdowns. Test, at the 5% level, the claim that the mean has decreased.
4. The number of telephone calls to an office follows a Poisson distribution with a mean number of 6 per hour on a weekday.
(a) On Monday there were 5 calls between 10.00 and 10.30. Test, at the 5% level, whether the mean has increased.
(b) On Wednesday there were 3 calls between 11.00 and 12.30. Test, at the 5% level, whether the mean has decreased.
5. The number of flaws per 100 m of fabric is known to follow a Poisson distribution with mean 2. A 200 m length of fabric is tested and found to have 7 flaws. Test, at the 5% level, whether the mean has increased.
6. Describe, briefly, the experimental evidence which you obtained in order to illustrate the Poisson distribution. State carefully any assumptions which you made.
The number X of emergency telephone calls to a gas board office in t minutes at weekends is known to follow a Poisson distribution with mean [illegible]. Given that the telephone in that office is unmanned for 10 minutes, calculate, to 2 significant figures, the probability that there will be at least 2 emergency telephone calls to the office during that time.
Find, to the nearest minute, the length of time that the telephone can be left unmanned for there to be a probability of 0.9 that no emergency telephone call is made to the office during the period the telephone is unmanned.
During a week of very cold weather it was found that there had been 10 emergency telephone calls to the office in the first 12 hours of the weekend. Using the tables provided, or otherwise, determine whether the increase in the average number of emergency telephone calls to that office is significant at the 5% level. (L)
7. Explain briefly, referring to your projects if possible, the role of the null hypothesis and of the alternative hypothesis in a test of significance.
Over a long period, John has found that the bus taking him to school arrives late on average 9 times per month. In the month following the start of new summer schedules, John finds that his bus arrives late 13 times. Assuming that the number of times the bus is late has a Poisson distribution, test, at the 5% level of significance, whether the new schedules have in fact increased the number of times on which the bus is late. State clearly your null and alternative hypotheses. (L)P
(1) H0 is true     Accept H0     Correct decision
(2) H0 is true     Reject H0     Wrong decision
(3) H0 is false    Accept H0     Wrong decision
(4) H0 is false    Reject H0     Correct decision
We say that
(a) a Type I error is made if we reject H0 when it is true,
(b) a Type II error is made if we accept H0 when it is false.
We write
(a) P(Type I error) = P(rejecting H0 | H0 is true)
(b) P(Type II error) = P(accepting H0 | H1 is true)
NOTE: when considering Type II errors we must state a definite value of the parameter in the alternative hypothesis H1.
= 0.652
P(drawing at least 1 white) = 1—0.652
= 0.348
Therefore P(Type I error) = 0.348.
= 0.059
Therefore P(Type II error) = 0.059.
Example 9.29 A man claims that he can throw a six with a fair die five times out
of six on the average. Calculate the probability that he will throw
four or more sixes in six throws (i) if his claim is justified (ii) if he
can throw a six, on the average, only once in six throws.
To test the claim, he is invited to throw the die six times, his claim
being accepted if he throws at least four sixes.
Find the probability that the test will (a) accept the man’s claim
when hypothesis (ii) is true, or (b) reject the claim when it is
justified, that is, when hypothesis (i) is true. (AEB 1974)
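So, in six throws, let X be the r.v. ‘the number of sixes obtained’.
(i) If P(throws a six) = 5/6,
P(X = x) = 6Cx (1/6)^(6−x) (5/6)^x, x = 0, 1, 2, ..., 6
P(X ≥ 4) = 6C4 (1/6)^2 (5/6)^4 + 6C5 (1/6)(5/6)^5 + (5/6)^6
= (5^4/6^6)(15 + 30 + 25)
= 0.938 (3 d.p.)
Therefore P(X ≥ 4) = 0.938 (3 d.p.).
(ii) If P(throws a six) = 1/6,
P(X = x) = 6Cx (5/6)^(6−x) (1/6)^x, x = 0, 1, 2, ..., 6
P(X ≥ 4) = (15(25) + 6(5) + 1)/6^6
= 0.0087
(a) Therefore the probability of accepting the man’s claim when hypothesis (ii) is true is 0.0087.
(b) P(X < 4 | p = 5/6) = 1 − 0.938
= 0.062 (3 d.p.)
Therefore the probability of rejecting the man’s claim when it is justified (probability of a Type I error) is 0.062 (3 d.p.).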
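The two error probabilities for the die-throwing claim (accept the claim if at least four sixes appear in six throws) can be verified numerically. The following is a small sketch using the binomial formula; the helper name is illustrative.

```python
from math import comb

def p_at_least(n, p, r):
    """Return P(X >= r) for X ~ Bin(n, p)."""
    return sum(comb(n, k) * p**k * (1 - p)**(n - k) for k in range(r, n + 1))

p_claim_true = p_at_least(6, 5 / 6, 4)   # P(X >= 4) if P(six) = 5/6
p_claim_false = p_at_least(6, 1 / 6, 4)  # P(X >= 4) if P(six) = 1/6

p_accept_when_false = p_claim_false      # claim accepted although hypothesis (ii) holds
p_reject_when_true = 1 - p_claim_true    # claim rejected although hypothesis (i) holds
print(round(p_claim_true, 3), round(p_accept_when_false, 4), round(p_reject_when_true, 3))
```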
PX =x) = er ee ee
f(x) = Z(x+ 1)
2 1
a ely 0.4
2 + 2 − ½k² − k = 0.4
k² + 2k − 7.2 = 0
(k + 1)² = 8.2
k + 1 = ±2.86
Therefore k = 1.86 since 0 < k < 2.
isl
x4 1.86
0.748
Therefore, when k = 1.86, P(Type II error) = 0.748.
Example 9.32 To test whether a coin is fair, the following decision rule is adopted.
Toss the coin 120 times; if the number of heads is between 50 and
70 inclusive, accept the hypothesis that the coin is fair, otherwise
reject it.
(a) Find the probability of rejecting the hypothesis when it is
correct.
(b) How should the decision rule be modified if
P(Type I error) < 0.01?
(c) With the original decision rule, find P(Type II error) if the coin
is biased and the probability that a head is obtained is in fact 0.6.
np = (120)(1/2) = 60 and npq = (120)(1/2)(1/2) = 30
So X ~ N(60, 30)
[Figure: normal curve centred at 60; reject H0 below 49.5, accept H0 between 49.5 and 70.5, reject H0 above 70.5]
Accept the hypothesis that the coin is fair if the number of heads
lies between 46 and 74 inclusive, otherwise reject it.
x       49.5      70.5     72
S.V.    −4.193    −0.28    0
We see that, with the given decision rule, there is a fairly high
probability that the coin will be accepted as fair when in fact
p=0.6.
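The calculations for the coin-tossing decision rule (accept ‘fair’ if the number of heads in 120 tosses lies between 50 and 70 inclusive) can be sketched with the normal approximation and a continuity correction. The code below assumes X is approximated by N(np, npq) with boundaries at 49.5 and 70.5; the function name is illustrative.

```python
from math import sqrt
from statistics import NormalDist

n = 120  # number of tosses

def accept_probability(p, low=50, high=70):
    """P(low <= X <= high) using N(np, npq) with a continuity correction."""
    dist = NormalDist(mu=n * p, sigma=sqrt(n * p * (1 - p)))
    return dist.cdf(high + 0.5) - dist.cdf(low - 0.5)

p_type_1 = 1 - accept_probability(0.5)  # (a) reject 'fair' when p = 0.5
p_type_2 = accept_probability(0.6)      # (c) accept 'fair' when p = 0.6
print(round(p_type_1, 3), round(p_type_2, 3))
```

With p = 0.6 the approximating distribution is N(72, 28.8), which gives the standardised values −4.193 and −0.28 for 49.5 and 70.5.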
[Figure: distributions of the number of heads, showing the region ‘Accept H0’ from 49.5 and the areas representing P(Type II error) and P(Type I error)]
Example 9.33 A sample of size 100 is taken from a normal population with unknown mean μ and known variance 36. An investigator wishes to test the hypotheses H0: μ = 65, H1: μ > 65. He decides on the following criteria:
accept H0 if the sample mean x̄ ≤ 66.5
reject H0 if x̄ > 66.5
Find the probability that he makes a Type I error.
If he uses as alternative hypothesis H1: μ = 67.9, find the probability that he makes a Type II error.
On which critical value should he decide for the sample mean if he wants P(Type I error) = P(Type II error)?
= 0.006 21
Therefore, the probability that he rejects H0, when in fact H0 is true (Type I error), is 0.006 21.
P(Z <— 2.333)
= 0.009 82
Therefore, P(Type II error) = 0.009 82.
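The error probabilities for the sample-mean test (n = 100, variance 36, critical value 66.5) can be reproduced as follows. This sketch assumes the sample mean is distributed N(μ, 36/100); the last line uses the symmetry of the two sampling distributions, which have equal standard errors, to place the critical value midway between 65 and 67.9.

```python
from math import sqrt
from statistics import NormalDist

sigma_mean = sqrt(36 / 100)   # standard error of the sample mean
critical = 66.5

under_h0 = NormalDist(65, sigma_mean)     # H0: mu = 65
under_h1 = NormalDist(67.9, sigma_mean)   # H1: mu = 67.9

p_type_1 = 1 - under_h0.cdf(critical)     # reject H0 when H0 is true
p_type_2 = under_h1.cdf(critical)         # accept H0 when H1 is true

# Equal standard errors, so equal error probabilities occur at the midpoint.
equal_error_critical = (65 + 67.9) / 2
print(round(p_type_1, 5), round(p_type_2, 5), equal_error_critical)
```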
Example 9.34 The ingredients for concrete are mixed together to obtain a mean
breaking strength of 2000 newtons. If the mean breaking strength
drops below 1800 newtons then the composition must be changed.
The distribution of the breaking strength is normal with standard
deviation 200 newtons.
Samples are taken in order to investigate the hypotheses:
H0: μ = 2000 newtons
H1: μ = 1800 newtons
How many samples must be tested so that
P(Type I error) = α = 0.05
and P(Type II error) = β = 0.1?
Solution 9.34
[Figure: sampling distributions centred at μ = 1800 and μ = 2000, with the regions ‘Accept H1’ and ‘Accept H0’]
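One way to approach the sample-size question for the concrete example is sketched below. It assumes a critical value c for the sample mean with c = μ0 − z_α σ/√n and c = μ1 + z_β σ/√n, so that √n = (z_α + z_β)σ/(μ0 − μ1). The variable names are illustrative, and this is not necessarily the layout of the book's own solution.

```python
from math import ceil
from statistics import NormalDist

mu0, mu1, sigma = 2000, 1800, 200
alpha, beta = 0.05, 0.10

z_alpha = NormalDist().inv_cdf(1 - alpha)   # about 1.645
z_beta = NormalDist().inv_cdf(1 - beta)     # about 1.282

# Solve mu0 - z_alpha*sigma/sqrt(n) = mu1 + z_beta*sigma/sqrt(n) for n.
n_exact = ((z_alpha + z_beta) * sigma / (mu0 - mu1)) ** 2
n = ceil(n_exact)                           # round up to a whole sample size
print(round(n_exact, 2), n)
```

Rounding up means that, with the whole-number sample size, both error probabilities can be kept at or slightly below the stated values rather than exactly equal to them.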
Exercise 9k
1. Two separate tests are proposed to determine whether a coin is biased or unbiased. These are:
Test 1: Toss the coin 4 times, and conclude that it is biased if all 4 tosses give the same result, and unbiased otherwise.
Test 2: Toss the coin 7 times, and conclude that it is biased if at least 6 of the tosses give the same result, and unbiased otherwise.
(a) Suppose that the coin is unbiased. Show that each test has the same probability of giving a wrong conclusion.
(b) Suppose that the coin is such that the probability of a head in any toss is [illegible]. Determine which test is more likely to give the conclusion that the coin is biased. (MEI)
2. Two alternative hypotheses concerning the probability density function of a random variable are
H0: f(x) = 2x, 0 < x < 1
         = 0 otherwise
H1: f(x) = 2(1 − x), 0 < x < 1
         = 0 otherwise
Give a sketch of the probability density function for each case.
The following test procedure is decided upon. A single observation of X is made and if X exceeds a particular value a, where 0 < a < 1, then H0 is accepted, otherwise H1 is accepted. Find the value of a if the probability of accepting H1 given that H0 is true is [illegible]. With this value of a, find the probability of accepting H0 given that H1 is true. (C)
3. A manufacturer makes two grades of squash ball, ‘slow’ and ‘fast’. Slow balls have a ‘bounce’ (measured under standard conditions) which is known to be a normal variable with mean 10 cm and standard deviation 2 cm. The ‘bounce’ of fast balls is a normal variable with mean 15 cm and standard deviation 2 cm. A box of balls is unlabelled so that it is not known whether they are all slow or all fast. Devise a test, based on a single observation of the bounce of one ball, such that the probability of deciding that the box contains fast balls when in fact it contains slow balls, i.e. the Type I error, is equal to the Type II error.
Devise a test, based on an observation of the mean bounce of a sample of 4 balls from the box, such that the Type I error is 0.05 and state the magnitude of the Type II error for this test. (C)
4. Two hypotheses concerning the probability density function of a random variable are
H0: f(x) = [illegible]
         = 0 otherwise
H1: f(x) = [illegible]
         = 0 otherwise
Give a sketch of the probability density function for each case.
The following test procedure is decided upon:
A single observation of X is made and if X is less than a particular value a, where 1 < a < 2, then H0 is accepted; otherwise H1 is accepted.
Find a such that, when H0 is true, the test procedure leads, with probability 0.1, to the acceptance of H1. With this value of a, find the probability that, when H1 is true, the test procedure leads to the acceptance of H0. (C)
5. One of two dice is loaded so that there is a probability of 0.2 of throwing a six with it, nothing being known about the other scores. The other die is fair. A person is given one of these dice (which is just as likely to be the fair as the biased one), together with the above information and is asked to discover which die it is. He decides to throw the die 10 times; if there are two or more sixes he will assert that the die is biased, otherwise he will assert that it is fair. Calculate the probability of his asserting that the die is (a) biased when it is, in fact, fair; (b) fair when it is, in fact, biased. What is the probability that his choice will be incorrect?
If, instead, he decided to throw the die 240 times and will assert that the die is biased if there are N or more sixes, use the normal approximation to the binomial distribution to estimate N if the probability of his asserting that it is fair when it is biased is to be 0.2. (SUJB)
6. An automated engineering process for manufacturing components includes an automatic screening of the output to reject defective components. The process gives on average 5% of defectives. The probability that the screening stage identifies correctly a defective component is 98% but there is also a probability of 6% that a component which is not defective is rejected at the screening stage. What is the proportion of all components which is rejected and what is the proportion of all components passed from the screening stage that is still defective? (MEI)
7. In order to examine a six-sided die for bias, one face is marked, the die is tossed a pre-determined number of times, and the number of times the marked face is uppermost is recorded.
(a) If this occurred r times in n tosses, explain how you would decide if this provided significant evidence of bias. Do not consider any approximate methods in this part.
(b) Would you consider it likely to be biased if the marked face came up once in 30 tosses?
(c) Would you consider it likely to be biased if the marked face came up 39 times in 180 tosses? (O)
8. Flour is packed in bags. The combined mass, X grams, of a full bag and its contents is a normally distributed random variable with mean μ grams and standard deviation 5 grams. When the packing machine is working correctly μ = 136, but when the packing machine is working incorrectly μ = 130. Show that the probabilities of a randomly chosen bag having a combined mass of less than 131.5 grams when the machine is working (a) correctly, (b) incorrectly, are approximately 0.2 and 0.6 respectively.
When X is less than 131.5 the bag is underweight. Using the approximate probability 0.2, determine the probability that, when the machine is working correctly, in a random sample of five bags there are precisely k bags which are underweight, for k = 3, k = 4 and k = 5.
The machine is presumed to be working incorrectly if the number of underweight bags found in a random sample of five bags is equal to or greater than r.
Determine the minimum value of r which gives a probability less than 0.01 of presuming the machine to be working incorrectly when it is working correctly. (C)
9. One suggested test for deciding whether a coin is fair or not is to toss it four times and call it ‘biased’ if four heads or four tails are obtained. A second suggested way is to toss it seven times and call it biased if six or seven heads, or six or seven tails, are obtained. Show that both these tests would be equally likely to conclude wrongly that a fair coin was biased.
Which of these two suggested tests would be better for correctly judging as biased a coin whose probability of coming down heads was 2/3?
Are any of the above results statistically significant? (SMP)
10. A fair coin is tossed 100 times. Use a normal approximation to determine the probability of obtaining (a) more than 57 heads, (b) more than 58 heads.
It is desired to construct a significance test to choose between the following two hypotheses concerning the possible bias of a coin:
H0: the probability that the coin falls heads is 0.5
H1: the probability that the coin falls heads is 0.6
The coin is to be tossed 100 times and the number of heads, X, recorded. Construct a significance test based upon the observed value of X such that the probability of accepting H1 when H0 is true is as close as possible to 0.05. For this test calculate the probability of accepting H0 when H1 is true. (C)
11. Two alternative hypotheses for the probability density function of a random variable X are given below.
H0: f(x) = [illegible]
         = 0 otherwise
H1: f(x) = [illegible]
         = 0 otherwise.
Design a test, based on a single observation of X, such that the probability of wrongly accepting H0 is 0.05.
Design also a test, based on a single observation of X, such that the probability of wrongly accepting H0 is twice the probability of wrongly accepting H1. (C)
12. You are provided with a coin which may be biased. In order to test this you are allowed to toss it 12 times and count the number, r, of heads and to use the value of r to decide. If the coin is really fair you wish to have at least a 95% chance of saying so. For what values of r should you say that the coin is fair?
If you adopt your procedure with a coin which is actually biased two to one in favour of heads, what is the probability that you decide the coin is biased? (O)
13. Random samples of 400 seeds are taken from a large batch. For this batch the probability of a randomly chosen seed germinating is a. The r.v. X is defined as the number of germinating seeds in a sample. Use an appropriate normal approximation to determine the values of
(a) P(X < 340 | a = 0.9)
(b) P(X > 340 | a = 0.8)
NOTE: χ² is pronounced ‘kye-squared’.
The shape of the distribution for various values of ν is shown:
[Figure: graphs of f(x) for the χ² distribution for various values of ν]
So P(X > χ²_p) = ∫ (from χ²_p to ∞) K x^(ν/2 − 1) e^(−x/2) dx
These values are summarised on p. 637; the first few lines are
reproduced below:
For example, if ν = 4, χ²_5%(4) = 9.49 and χ²_1%(4) = 13.28.
THE χ² TEST
Consider an experiment or situation which results in n observed frequencies, written O_i, i = 1, 2, ..., n.
Say we wish to make a hypothesis about the distribution; we could then calculate the frequencies expected under this hypothesis, written E_i, i = 1, 2, ..., n. We now decide how well the ‘observed’ data fit the ‘expected’ data, and consider whether it is likely that the differences can be attributed to chance.
Now, the comparison between the observed frequencies (O_i) and the expected frequencies (E_i), for i = 1, 2, ..., n (that is, for n pairs of values, or classes) is made by considering the statistic
Σ (i = 1 to n) (O_i − E_i)² / E_i
We define
χ²_calc = Σ (i = 1 to n) (O_i − E_i)² / E_i
Now
[Figure: χ²(ν) distribution with the 5% rejection region shaded to the right of χ²_5%(ν)]
(b) When the value of χ²_calc is very small, it is wise to query the reliability of the observed data and to question whether they have been ‘fiddled’.
Degrees of freedom
The parameter ν is known as the number of degrees of freedom.
Now the number of degrees of freedom associated with a statistic is given by
ν = number of independent variables involved in calculating the statistic
When considering the statistic Σ (i = 1 to n) (O_i − E_i)² / E_i, the number of classes is the number of pairs of values, i.e. there are n classes.
The number of restrictions involved depends on the null hypothesis. We consider several cases in the following examples.
UNIFORM DISTRIBUTION
Example 10.1 The table shows the number of employees absent for a single day
during a particular period of time:
Number of
absentees
(a) Find the frequencies expected under the hypothesis that the
number of absentees is independent of the day of the week.
(b) Test, at the 5% level, whether the differences in the observed
and expected data are significant.
Solution 10.1 (a) If the number of absentees is independent of the day of the
week, then we would expect the total of 500 to be spread uniformly
throughout the week, so that the expected number of absentees for
any day is 100.
Expected frequencies:
Number of
There are 5 classes and, since the total expected and observed
frequencies each have to be 500, there is one restriction.
Therefore ν = 5 − 1 = 4
The χ² test is carried out as follows:
(1) Make the null hypothesis (H0)
H0: the number of absentees is independent of the day of the week
Calculate χ²_calc
(6) Make conclusion
As χ²_calc > 9.49 we reject H0 and conclude that the number of absentees is not independent of the day of the week.
NOTE: (a) The test does not indicate what the relationship
might be between number of absentees and the day of the week.
However, a look at the observed frequencies suggests a tendency
towards a greater number of absentees on Mondays and Fridays.
(b) When working out the table we are not concerned whether O_i − E_i is positive or negative, as we require (O_i − E_i)². Therefore we find |O_i − E_i|.
Example 10.2 An ordinary die is thrown 120 times and each time the number on the uppermost face is noted. The results are as follows:
[Table of observed frequencies for the numbers 1 to 6: illegible]
Perform a χ² test, at the 5% level, to investigate whether the die is fair.
Now ΣO_i = ΣE_i, therefore there is one restriction, namely that the totals agree.
ν = number of classes − number of restrictions
= 6 − 1
= 5
Therefore ν = 5 and we consider the χ²(5) distribution.
We reject H0 if χ²_calc > 11.07, where χ²_calc = Σ (i = 1 to 6) (O_i − E_i)² / E_i
Now, under H0 (that the die is fair) we would expect each number to occur the same number of times,
so E_i = 20 for i = 1, 2, ..., 6
As χ²_calc < 11.07 we do not reject H0 and conclude that the differences between the observed and expected frequencies are not significant at the 5% level, so there is no evidence that the die is not fair.
Number of plants
Solution 10.3 H0: the colours pink, white and blue occur in the ratio 3:2:5.
ν = 3 − 1
= 2
We reject H0 if χ²_calc > 9.21, where χ²_calc = Σ (i = 1 to 3) (O_i − E_i)² / E_i
Now under H0 we expect the colours pink, white and blue to appear in the ratio 3:2:5, so the expected frequencies are
(3/10)(100) : (2/10)(100) : (5/10)(100) = 30 : 20 : 50
As χ²_calc < 9.21, we do not reject H0 and conclude that the differences in observed and expected frequencies are not significant at the 1% level.
Exercise 10a
A tetrahedral die is thrown 120 times and the number on which it lands is noted.
Frequency 35 32 25 28 Total 120
Test, at the 5% level, whether the die is fair.
From a list of 500 digits, the occurrence of each digit is noted.
Digit 0 1 2 3 4 5 6 7 8 9
had been 1600 and the observed frequencies 220, 820, 300, 260, would the difference have been significant at the 5% level? (C Additional)
It is thought that each of the 8 outcomes of an experiment is equally likely to occur. When the experiment is performed 400 times, the observed frequencies are 45, 42, 55, 53, 40, 62, 47 and 56. Perform a test at the 1% level to investigate the validity of the theory.
Example 10.4 Four coins are thrown 160 times, and the distribution of the number of heads is observed to be
x (number of heads)   0    1    2    3    4
f (frequency)         5    35   67   41   12
Find the expected frequencies if the coins are unbiased. Compare the observed and expected frequencies and apply the χ² test. Is there any evidence that the coins are biased? (AEB 1974)
Solution 10.4 Let X be the r.v. ‘the number of heads obtained when four coins are thrown’. Then if the coins are unbiased X ~ Bin(4, 1/2)
and P(X = x) = 4Cx (1/2)^(4−x) (1/2)^x = 4Cx (1/2)^4, x = 0, 1, 2, 3, 4
x (number of heads)   P(X = x)   Expected frequency [160P(X = x)]
0                     1/16       10
1                     4/16       40
2                     6/16       60
3                     4/16       40
4                     1/16       10
χ² test:
H0: the coins are not biased and P(head) = 1/2
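The expected frequencies and the χ² statistic for the four-coin example can be computed as in the sketch below. It assumes the observed frequencies are 5, 35, 67, 41, 12 (which total 160) and uses one restriction (totals agree), so ν = 5 − 1 = 4; the critical value 9.49 is χ²_5%(4).

```python
from math import comb

observed = [5, 35, 67, 41, 12]   # number of throws showing 0, 1, 2, 3, 4 heads
total = sum(observed)            # 160 throws

# Expected frequencies under H0: X ~ Bin(4, 1/2), i.e. 160 * 4Cx * (1/2)^4
expected = [total * comb(4, x) * 0.5**4 for x in range(5)]

chi2_calc = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 1   # one restriction: totals agree
reject_h0 = chi2_calc > 9.49             # 5% critical value for 4 degrees of freedom
print(expected, round(chi2_calc, 3), reject_h0)
```

Under these assumptions χ²_calc is below 9.49, so H0 would not be rejected at the 5% level.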
Example 10.5 Samples of size 5 are selected regularly from a production line and
tested. During one week 500 samples are taken and the number of
defective items in each sample recorded.
Number of defectives, x   0   1   2   3   4   5
These are shown in the table. Frequencies have been rounded to the nearest integer.
Number of defectives   0     1     2     3    4   5
Expected frequency     155   205   108   28   4   0
POISSON DISTRIBUTION
Example 10.6 Analysis of the goals scored per match by a certain football team gave the following results:
Solution 10.6 Now x̄ = Σfx/Σf = 230/100 = 2.3
Consider the r.v. X where X ~ Po(2.3), X is ‘the number of goals per match’.
Then P(X = x) = e^(−2.3) (2.3)^x / x!, x = 0, 1, 2, ...
and the expected frequencies are given by 100P(X = x).
These have been calculated and are shown in the table.
χ² test:
H0: the distribution is Poisson
As the χ² test is not valid for expected frequencies less than 5, we combine the end categories into ‘5 or more goals’.
Degrees of freedom: number of classes = 6
number of restrictions = 2
(totals agree, means agree)
Therefore ν = 6 − 2 = 4 and we consider the χ²(4) distribution.
We wish to test at the 5% level and reject H0 if χ²_calc > χ²_5%(4), i.e. if χ²_calc > 9.49.
14
18
29
18
10
11 :
As χ²_calc < 9.49, we do not reject H0 and conclude that the distribution can be reasonably modelled by the Poisson distribution having the same mean.
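The Poisson goodness-of-fit calculation for the goals data can be sketched as follows. The observed frequencies 14, 18, 29, 18, 10, 11 (for 0, 1, 2, 3, 4 and ‘5 or more’ goals) are an assumption read from the column of figures in the solution; they total 100, as required. The mean 2.3 is estimated from the data, which is why two restrictions are used.

```python
from math import exp, factorial

observed = [14, 18, 29, 18, 10, 11]   # assumed: 0, 1, 2, 3, 4, '5 or more' goals
n = sum(observed)                     # 100 matches
lam = 2.3                             # sample mean, used as the Poisson parameter

# P(X = x) for x = 0..4, then P(X >= 5) for the combined end class
probs = [exp(-lam) * lam**x / factorial(x) for x in range(5)]
probs.append(1 - sum(probs))
expected = [n * p for p in probs]

chi2_calc = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 2   # restrictions: totals agree, means agree
reject_h0 = chi2_calc > 9.49             # 5% critical value for 4 degrees of freedom
print([round(e, 2) for e in expected], round(chi2_calc, 2), reject_h0)
```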
Example 10.7 For a period of six months 100 similar hamsters were given a new type of feedstuff. The gains in mass are recorded in the table below:
Solution 10.7 Let X be the r.v. ‘the gain in mass’, then X ~ N(10, 100). We calculate P(a < X < b) = p_i from the normal distribution tables (p. 634) and work out the expected frequencies using E_i = 100p_i.
Table columns: Interval (a < x < b); Upper class boundary (u.c.b.); Standardised u.c.b. (z); P(Z < z); P(a < X < b)
χ² test:
H0: the distribution is normal with mean 10 and variance 100
We note that the expected frequencies given by H0 are such that the first two classes each contain fewer than 5, and similarly the last two classes. So each pair is combined, giving two classes instead of four.
Degrees of freedom: number of classes = 8
number of restrictions = 1
(totals agree)
1.exif Xo 07,
i=1
15
24
16
14
8
5 4
If the mean and variance are not given for the normal distribution
then these have to be estimated from the observed data. The
expected frequencies are then calculated using these estimates.
This alters the number of degrees of freedom, for if estimates of the
mean and the variance are used, then
Exercise 10b
Test the hypothesis that the distribution is binomial with the parameter p = [illegible]. Explain how the test would be modified if the hypothesis to be tested is that the distribution is binomial with the parameter p unknown. (Do not carry out the test.) (O)
A six-sided die with faces numbered as usual from 1 to 6 was thrown 5 times and the number of sixes was recorded. The experiment was repeated 200 times, with
Use the χ² distribution to test the adequacy of the Poisson distribution as a model for the data given in Example 4.31 (p. 256).
The numbers of cars passing a checkpoint during 100 intervals, each of time 5 minutes, were noted:
Number of cars   0   1    2    3    4    5    6 or more
Frequency        5   23   23   25   14   10   0
theory and experiment is significant. (MEI)
(b) Find the expected frequencies for a normal distribution having the same mean and variance as the data given, and test the goodness of fit, using a 5% level of significance.
11. The table below gives the distribution of the number of hits by flying bombs in 450 equally sized areas in South London during World War II.
Number of hits (x)
Frequency (f)
During observations on a patch of white dead nettles it was noticed that the numbers of flowers visited by bees during 100 5-minute intervals were as follows:
Number of flowers visited/5-minute interval
Frequency
Table A
Table B
Solution 10.8 The results can be shown in a table, known as a 2 × 2 (read ‘2 by 2’) contingency table:
          Pass    Fail    Totals
Male      28      12      40
Female    34      26      60
Totals    62      38      100
P(candidate passes first time) = 62/100
Expected frequencies:
          Pass    Fail    Totals
Male      24.8    15.2    40
Female    37.2    22.8    60
7.29
7.29
7.29
7.29
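The repeated value 7.29 is consistent with (|O − E| − 0.5)² = (3.2 − 0.5)², i.e. with a continuity correction being applied in each of the four classes. A sketch of that calculation follows; it assumes the observed table is Male 28 and 12, Female 34 and 26 (the male ‘fail’ figure of 12 is inferred from the row total of 40), and the variable names are illustrative.

```python
observed = [[28, 12],   # Male:   pass, fail (12 inferred from the total of 40)
            [34, 26]]   # Female: pass, fail

row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]
grand_total = sum(row_totals)

# Expected frequency = (row total)(column total) / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

# Chi-squared with continuity correction: sum of (|O - E| - 0.5)^2 / E
chi2_corrected = sum(
    (abs(o - e) - 0.5) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
print(expected, round(chi2_corrected, 3))
```

For a 2 × 2 table there is one degree of freedom, so the result would be compared with a critical value from the χ²(1) distribution.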
Exercise 10c
[Pas[al|
Totals
4 : 2 eA
O;—-£; k(ae— bd
i=1 Ej efgh . Pass | 102 45
(do not use the continuity correction).
Region
Channel watched most
CCB1    29    16    42    23
CCB2    6     11    26    7
VIT     15    3     12    10
Solution 10.9 H0: there is no association between the channel watched most and the region.
The observed frequencies are first totalled, and then the expected frequencies under H0 are calculated from
Expected frequency = (row total)(column total) / grand total
Observed data:
CCB1 29 16 42 23 110
CCB2 6 11 26 7 50
VIT 15 3 12 10 40
Expected data:
Expected frequency for northern viewers of CCB1 = (110)(50)/200 = 27.5
This process is continued for the expected frequencies shown in
heavy type. The remaining frequencies are found by ensuring that
totals and sub-totals agree.
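The expected frequencies for the whole 3 × 4 table can be generated from the row and column totals, as in the sketch below. It uses the observed data listed above; the statistic is computed without any continuity correction, and the number of degrees of freedom is taken as (rows − 1)(columns − 1). Variable names are illustrative.

```python
observed = [
    [29, 16, 42, 23],   # CCB1
    [6, 11, 26, 7],     # CCB2
    [15, 3, 12, 10],    # VIT
]

row_totals = [sum(row) for row in observed]          # 110, 50, 40
col_totals = [sum(col) for col in zip(*observed)]    # 50, 30, 80, 40
grand_total = sum(row_totals)                        # 200

# Expected frequency = (row total)(column total) / grand total
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2_calc = sum(
    (o - e) ** 2 / e
    for obs_row, exp_row in zip(observed, expected)
    for o, e in zip(obs_row, exp_row)
)
degrees_of_freedom = (len(observed) - 1) * (len(observed[0]) - 1)
print(expected[0], round(chi2_calc, 2), degrees_of_freedom)
```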
|O − E| for the twelve classes: 1.5, 0.5, 2, 1, 6.5, 3.5, 6, 3, 5, 3, 4, 2
Exercise 10d ~~
A,|16 19 15
A,|26 14 10
pafale| 5.
(AEB 1976)
3. The following table shows the numbers of students passed and failed by three examiners A, B and C.
Examiners
with the hypothesis that the yield is not affected by the type of breed?
6. In a small survey 350 car owners from four districts P, Q, R, S were found to have cars in price ranges A, B, C, D, the frequencies of the prices being as shown in the table.
2 X 2 contingency 4 classes
table 1 independent
variable
1. A random sample of 100 housewives were asked by a market research team whether or not they used Sudsey Soap. 58 said yes and 42 said no. In a second random sample of 80 housewives, 62 said yes and 18 said no. By considering a suitable 2 × 2 contingency table, test whether these two samples are consistent with each other. (O & C)
2. Two fair dice are thrown 432 times. Find the expected frequencies of the scores 2, 3, ..., 12.
Two players, A and B, are each given two dice and told to throw them 432 times, recording the results. The frequencies reported are given in Table A below.
Is there any evidence that either pair of dice is biased? What can be said about B’s alleged results? (AEB 1976)
3. Over a period of 50 weeks the numbers of road accidents reported to a police station are shown in the table below.
Find the mean number of accidents per week.
Use this mean, a 5% level of significance, and your table of χ² to test the hypothesis that these data are a random sample from a population with a Poisson distribution. (O & C)
4. Table B below shows the girths of one type of fir tree in a plantation of 480 trees set alongside the distribution that would be expected if the distribution were normal.
Use the χ² test, with a 5% significance level, to determine whether the observed distribution can be taken as normal, with the same mean and standard deviation as the observed distribution. (C Additional)
5. Smallwoods Ltd. run a weekly football pools competition. One part of this involves a fixed-odds contest where the entrant has to forecast correctly the result of each of five given matches. In the event of a fully correct forecast the entrant is paid out at odds of 100 to 1. During the last two years Miss Fortune has entered this fixed-odds contest 80 times. The table below summarises her results.
Number of matches correctly forecast per entry (x)   0   1    2    3    4   5
Number of entries with x correct forecasts (f)       8   19   25   22   5   1
(a) Find the frequencies of the number of matches correctly forecast per entry given by a binomial distribution having the same mean and total as the observed distribution.
(b) Use the χ² distribution and a 10% level of significance to test the adequacy of the binomial distribution as a model for these data.
(c) On the evidence before you, and assuming that the point of entering is to win money, would you advise Miss Fortune to continue with this competition and why? (AEB 1981)
(NOTE: χ²_10%(1) = 2.71, χ²_10%(2) = 4.61, χ²_10%(3) = 6.25, χ²_10%(4) = 7.78, χ²_10%(5) = 9.24)
6. The table summarises the incidence of cerebral tumours in 141 neurosurgical patients.
Type of tumour
Site of tumour
Frontal lobes     23   9    6
Temporal lobes    21   4    3
Elsewhere         34   24   17
Table A
Table B
Number of
stoppages, x
Number of
days, f
Table D
accidents
(c) The teacher suspected that this student had not observed the data but invented them. Explain why the teacher was suspicious and comment on the strength of the evidence supporting her suspicions. (AEB 1987)
11. One formula for the χ² statistic is
χ² = Σ (f_o − f_e)² / f_e
where f_o is the observed frequency, f_e is the expected frequency and the summation is over the number of groups.
Show that the formula may also be written as
12. Over a long period of time, a research team monitored the number of car accidents which occurred in a particular county. Each accident was classified as being trivial (minor damage and no personal injuries), serious (damage to vehicles and passengers, but no deaths) or fatal (damage to vehicles and loss of life). The colour of the car which, in the opinion of the research team, caused the accident was also recorded, together with the day of the week on which the accident occurred. The following data were collected.
16 s
|Number of
of intervals
intervals observed
observed _| ack 32 eee
Se ofee
intervals expected — 4 33.72]}18.16 |7.33|237| 064] 0.18 |
Es
REGRESSION AND
CORRELATION
SCATTER DIAGRAM
Consider the set of points $(x_1, y_1), (x_2, y_2), \ldots, (x_n, y_n)$. If the values of y are plotted against the values of x, then a scatter diagram is obtained.
REGRESSION FUNCTION
We then look for a relationship y = f(x), where the function f is to be determined, i.e. given the points only we have to 'work backwards' or 'regress' to the original function f. Hence this function is called the regression function.
Example 11.1 The following table gives the test results for 10 children.
Child AD ah le 2 eg ld ae Pees
Arithmetic mark, x | 1   8  15  18  23  28  33  39  45  45 | $\sum x = 255$
English mark, y    | 3  14   8  20  19  17  36  26  14  29 | $\sum y = 186$
Solution 11.1 (a) (i)
$$\bar{x} = \frac{\sum x}{10} = \frac{255}{10} = 25.5, \qquad \bar{y} = \frac{\sum y}{10} = \frac{186}{10} = 18.6$$
So we plot the point M(25.5, 18.6) and ensure that the line passes
through it.
For a regression line y on x, draw a line through M parallel to the
y axis.
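Points to the left of M          Points to the right of M
  x     y                          x     y
  1     3                         28    17
  8    14                         33    36
 15     8                         39    26
 18    20                         45    14
 23    19                         45    29
 $\sum x = 65$, $\sum y = 64$              $\sum x = 190$, $\sum y = 122$
$$\bar{x}_L = \frac{65}{5} = 13, \quad \bar{y}_L = \frac{64}{5} = 12.8 \qquad \bar{x}_R = \frac{190}{5} = 38, \quad \bar{y}_R = \frac{122}{5} = 24.4$$
We plot $M_L$(13, 12.8)            We plot $M_R$(38, 24.4)
For a regression line x on y, draw a line through M parallel to the x axis.
Points above M                   Points below M
  x     y                          x     y
 18    20                          1     3
 23    19                          8    14
 33    36                         15     8
 39    26                         28    17
 45    29                         45    14
 $\sum x = 158$, $\sum y = 130$            $\sum x = 97$, $\sum y = 56$
$$\bar{x}_1 = \frac{158}{5} = 31.6, \quad \bar{y}_1 = \frac{130}{5} = 26 \qquad \bar{x}_2 = \frac{97}{5} = 19.4, \quad \bar{y}_2 = \frac{56}{5} = 11.2$$
We plot $M_1$(31.6, 26)            We plot $M_2$(19.4, 11.2)
Now draw a line of good fit through M, $M_1$ and $M_2$. This is a regression line x on y.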
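The mean points used above can be checked with a short script. This is only an illustrative sketch in Python, not part of the original text; the function name `mean_point` is a hypothetical helper, and the data are the ten pairs of marks from Example 11.1.

```python
# Marks from Example 11.1: (arithmetic mark x, English mark y) for the 10 children.
marks = [(1, 3), (8, 14), (15, 8), (18, 20), (23, 19),
         (28, 17), (33, 36), (39, 26), (45, 14), (45, 29)]


def mean_point(points):
    """Return (mean of x, mean of y) for a list of (x, y) pairs."""
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


M = mean_point(marks)

# y on x: split the points by a line through M parallel to the y axis.
M_L = mean_point([p for p in marks if p[0] < M[0]])
M_R = mean_point([p for p in marks if p[0] > M[0]])

# x on y: split the points by a line through M parallel to the x axis.
M_1 = mean_point([p for p in marks if p[1] > M[1]])
M_2 = mean_point([p for p in marks if p[1] < M[1]])

print("M =", M)
print("M_L =", M_L, " M_R =", M_R)
print("M_1 =", M_1, " M_2 =", M_2)
```

The line of good fit for y on x is then drawn through M, M_L and M_R, and the line for x on y through M, M_1 and M_2.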
[Scatter diagram: English mark against Arithmetic mark]
So, if you were given a height of 6 ft 4 inches you would guess 13½ stone for the mass.
But, if you were given a mass of 13½ stone, would you guess 6 ft 4 inches for the height? If you would not, then the two regression lines are different.
| x |  5  10  15  20  25 |
| y | 20  21  23  24  23 |
then it is obvious that the value of x has been controlled. In this case we would use a regression line of y on x to estimate y, given x, but not a regression line x on y to estimate x, given y.
1. For the following sets of data, draw scatter diagrams and comment on the correlation. Draw regression lines y on x and x on y.
(a) Use these 11 pairs of data:
| x | 3  7  9  11  14  14  15  21  22  23  26 |
| y | 5  12  5  12  10  17  28  16  10  20  25 |
eat 10L £4 1012,5. 14 614,65
fy| 81 70 74 66 © 69. 63
2. Values of two variables x and y obtained from a survey are recorded below.
eet oh Vad EG ge es
ly |81 78 53 585 48°29 15 3
Represent these data on a scatter diagram and draw in the line of best fit. Obtain the equation of the line of best fit in the form y = mx + c and estimate the value of y when x = 5.5. (SUJB)
3. Four identical money boxes contain different numbers of a particular type of coin and no coins of other types. From the information on the combined weights, which is given below, it is desired to estimate the weight of a box and the mean weight of a coin.
(a) Plot these data on a scatter diagram, labelling the axes clearly. State whether the data display strong positive, strong negative, or near zero correlation (or otherwise).
(b) State the co-ordinates of one point through which the line of regression of y upon x must pass.
(c) Draw on your diagram, by eye, this regression line.
(d) Estimate, from your regression line, (i) the weight of an empty box, (ii) the mean weight of a single coin. (C)
4. Table A gives the rainfall, in cm, for the first nine months of a year at two weather stations. Calculate the mean monthly rainfall over this period at each station and plot the information given in the table on a scatter diagram, drawing a line of best fit. Find the equation of this line and use it to predict the rainfall at B in a month when 2.5 cm of rain fell at A. (C Additional)
Table A
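$\sum m_i^2$ is the sum of the squares of the residuals and if we can find values of a and b such that $\sum m_i^2$ is a minimum, then the line y = ax + b is called the least squares regression line of y on x.
Now
$$\frac{\partial \sum m_i^2}{\partial b} = \frac{\partial \sum (ax_i + b - y_i)^2}{\partial b} = 2\sum (ax_i + b - y_i) = 2\left(a\sum x_i + nb - \sum y_i\right)$$
So $\dfrac{\partial \sum m_i^2}{\partial b} = 0$ when
$$\sum y_i = a\sum x_i + nb \qquad \text{(i)}$$
and
$$\frac{\partial \sum m_i^2}{\partial a} = \frac{\partial \sum (ax_i + b - y_i)^2}{\partial a} = 2\sum (ax_i + b - y_i)x_i = 2\left(a\sum x_i^2 + b\sum x_i - \sum x_i y_i\right)$$
So $\dfrac{\partial \sum m_i^2}{\partial a} = 0$ when
$$\sum x_i y_i = a\sum x_i^2 + b\sum x_i \qquad \text{(ii)}$$
If the least squares regression line y on x is y = ax + b, the values of a and b are found by solving the simultaneous equations
$$\sum y = a\sum x + nb$$
$$\sum xy = a\sum x^2 + b\sum x$$
These equations are called the normal equations for y on x.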
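As an illustration, the two normal equations can be solved directly for a and b. The sketch below is an addition, not part of the original text: the function name `least_squares_y_on_x` is hypothetical, and the data are borrowed from Example 11.5 later in this chapter.

```python
def least_squares_y_on_x(xs, ys):
    """Solve the normal equations
           sum(y)  = a*sum(x)   + n*b
           sum(xy) = a*sum(x^2) + b*sum(x)
    for the gradient a and intercept b of the line y = ax + b."""
    n = len(xs)
    sum_x, sum_y = sum(xs), sum(ys)
    sum_xx = sum(x * x for x in xs)
    sum_xy = sum(x * y for x, y in zip(xs, ys))
    det = n * sum_xx - sum_x * sum_x      # zero only if all x-values are equal
    a = (n * sum_xy - sum_x * sum_y) / det
    b = (sum_y * sum_xx - sum_x * sum_xy) / det
    return a, b


# Data of Example 11.5
xs = [20, 30, 40, 46, 54, 60, 80, 88, 92]
ys = [54, 60, 54, 62, 68, 80, 66, 80, 100]

a, b = least_squares_y_on_x(xs, ys)
print(f"y = {a:.3f}x + {b:.2f}")

# The fitted line passes through the mean point of the data.
x_bar, y_bar = sum(xs) / len(xs), sum(ys) / len(ys)
print(abs(a * x_bar + b - y_bar) < 1e-9)
```

Eliminating b between the two equations gives the closed forms used in the function, so no matrix library is needed for a single explanatory variable.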
Example 11.2 Show that the least squares regression line y on x passes through
the mean of the data.
$\sum n_i^2$ is a minimum.
It can be shown that the point $(\bar{x}, \bar{y})$ lies on the line x = cy + d.
NOTE: in general, these lines will not coincide with those obtained
by the earlier methods described.
Example 11.3 Obtain the normal equations for the least squares regression line y
on x for the following data:
eeei Gi 4 opyi 0
ya [io 14 ie Sis is treme
Hence find the equation of the least squares regression line y on x.
Solution 11.3
y = 0.186x + 11.7
Exercise 11b
1. Calculate the equation of the least squares regression line x on y for the data given in Example 11.3.
2. Calculate the equation of the least squares regression line (a) y on x, (b) x on y for the data given in (i) Exercise 11a, Question 1(a) (p. 564), (ii) Example 11.1 (p. 561).
COVARIANCE
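$$s_{xy} = \frac{1}{n}\sum (x - \bar{x})(y - \bar{y})$$
Now
$$s_{xy} = \frac{1}{n}\sum (x - \bar{x})(y - \bar{y}) = \frac{1}{n}\sum xy - \bar{x}\bar{y} = \frac{\sum xy}{n} - \left(\frac{\sum x}{n}\right)\left(\frac{\sum y}{n}\right)$$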
since
1 er
Sxx = (x say eeae) II ” x
n
1 |
Se 7 ty VOY) = ~
Multiplying (ii) by n,
$$n\sum xy = an\sum x^2 + nb\sum x$$
eeay
6,7
or esex) |
Sx
§
g
where c is the coefficient of regression of x on y
C=
w J i) 3
The point Q; lies on this line and its x-coordinate is x;. So its
y-coordinate is yg where
yo 5
2 Ss
So mm; =
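We shall denote the minimum value of the sum of the squares of residuals by $\sum m_i^2(\text{min})$ where
$$\begin{aligned}
\sum m_i^2(\text{min}) &= \sum \left[(y_i - \bar{y}) - \frac{s_{xy}}{s_x^2}(x_i - \bar{x})\right]^2 \\
&= \sum (y_i - \bar{y})^2 - 2\frac{s_{xy}}{s_x^2}\sum (x_i - \bar{x})(y_i - \bar{y}) + \frac{s_{xy}^2}{(s_x^2)^2}\sum (x_i - \bar{x})^2 \\
&= n s_y^2 - 2\frac{s_{xy}}{s_x^2}\, n s_{xy} + \frac{s_{xy}^2}{(s_x^2)^2}\, n s_x^2 \\
&= n s_y^2 - 2n\frac{s_{xy}^2}{s_x^2} + n\frac{s_{xy}^2}{s_x^2} \\
&= n\left(s_y^2 - \frac{s_{xy}^2}{s_x^2}\right)
\end{aligned}$$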
Example 11.4 Draw a scatter diagram for the following data. Calculate the equations
of the lines of regression (a) y onx, (b) x ony, and draw these on
the diagram.
Find also the minimum sum of squares of residuals (c) for y on x,
(d) for x on y.
| || eee
ee LG
Fee eR Ps ee
Solution 11.4 This is the same data as in Example 11.3, so we refer to the table on p. 567.
We have
$$\bar{x} = \frac{\sum x}{n} = \frac{38}{7} = 5.4 \text{ (1 d.p.)}, \qquad \bar{y} = \frac{\sum y}{n} = \frac{89}{7} = 12.7 \text{ (1 d.p.)}$$
On the scatter diagram, plot M(5.4, 12.7).
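$$s_y^2 = \frac{\sum y^2}{n} - \bar{y}^2 = \frac{1147}{7} - \left(\frac{89}{7}\right)^2 = 2.204$$
(a) The regression line y on x is
$$y - \bar{y} = \frac{s_{xy}}{s_x^2}(x - \bar{x})$$
$$y - \frac{89}{7} = \frac{1.694}{9.102}\left(x - \frac{38}{7}\right)$$
Rearranging, y = 0.186x + 11.7 (as before)
Draw this on the scatter diagram, by plotting M(5.4, 12.7) and two other points, say (0, 11.7) and (1, 11.886).
(b) The regression line x on y is
$$x - \bar{x} = \frac{s_{xy}}{s_y^2}(y - \bar{y})$$
$$x - \frac{38}{7} = \frac{1.694}{2.204}\left(y - \frac{89}{7}\right)$$
Rearranging, x = 0.769y − 4.34
Draw this on the scatter diagram by plotting M(5.4, 12.7) and two other points, say (0, 5.64) and (1, 6.94).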
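(c)
$$\sum m_i^2(\text{min}) = n\left(s_y^2 - \frac{s_{xy}^2}{s_x^2}\right) = 7\left(2.204 - \frac{1.694^2}{9.102}\right) = 13.2 \text{ (3 S.F.)}$$
The minimum sum of squares of residuals for y on x is 13.2 (3 S.F.).
(d)
$$\sum n_i^2(\text{min}) = n\left(s_x^2 - \frac{s_{xy}^2}{s_y^2}\right) = 7\left(9.102 - \frac{1.694^2}{2.204}\right) = 54.6 \text{ (3 S.F.)}$$
The minimum sum of squares of residuals for x on y is 54.6 (3 S.F.).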
aah eee oO
|y|10 14 1279 45 75 12
prong[2]
set the calculator to LR (linear regression) by pressing
Ce][eave]
[52]foam
9
(Try to use both hands, the left hand for the numbers and the right
hand for the DATA | keys.)
y = A+Bx
gvesm 2x. 17 0
gives M II 38
gives
=]
Le] = 7
Exercise 11c
In the following questions, check your answers using your calculator in LR mode if possible.
1. Calculate (i) the covariance, (ii) the equations of the two least squares regression lines for the following data. Plot the scatter diagrams and draw in the regression lines. Find also the minimum sum of squares of residuals (iii) for y on x, (iv) for x on y.
(a) Plot the data. Comment on whether it appears that the usual simple linear regression model is appropriate.
(b) Assuming that such a model is appropriate, estimate the regression line of yield on temperature.
(c) Plot your estimated line on your graph, and indicate clearly on your graph the distances, the sum of whose squares is minimised by the linear regression procedure. (MEI)
Table A
| Patient     |  1   2   3   4   5   6   7   8   9  10  11  12 |
| Initial (I) | 61  23   8  14  42  34  32  31  41  25  20  50 |
| Final (F)   | 49  12   3   4  28  27  20  20  34  15  16  40 |
Obtain the equation of the regression line of A on T, giving the coefficients to 2 decimal places.
Draw this line on your scatter diagram.
Use the regression equation to obtain an estimate of the mean value of A when T = 20, and explain why this estimate is preferable to averaging the two observed values of A when T = 20.
Estimate the mean increase in A for a one degree increase in temperature.
State any reservations you would have about estimating the mean value of A when T = 0. (L)
6. In an attempt to increase the yield (kg/h) of an industrial process a technician varies the percentage of a certain additive used, while keeping all other conditions as constant as possible. The results are shown below.
You may assume that $\sum x = 34$, $\sum y = 1057$, $\sum xy = 4504.55$, $\sum x^2 = 155$.
(a) Draw a scatter diagram of the data.
(b) Calculate the equation of the regression line of yield on percentage additive and draw it on the scatter diagram.
The technician now varies the temperature (°C) while keeping other conditions as constant as possible and obtains the following results.
70
TAS:
80 igo
85
90
He calculates (correctly) that the regression line is y = 107.14 0.29t.
(c) Draw a scatter diagram of these data together with the regression line.
(d) The technician reports as follows, 'The regression coefficient of yield on percentage additive is larger than that of yield on temperature, hence the most effective way of increasing the yield is to make the percentage additive as large as possible, within reason.'
Criticise the report and make your own recommendations on how to achieve the maximum yield. (AEB 1988)
7. Referring to your projects if possible, explain clearly the purpose of obtaining a linear regression equation, and describe what use was, or could be, made of this equation.
8. A large field used for growing potatoes was divided into 6 equal plots, and each plot was treated with a different concentration of a certain fertiliser. At harvest time the yield from each plot was recorded, and the results are given in the table, with potato yield (Y kg m⁻²) and fertilizer concentration (C g l⁻¹).
Concentration, C - i an
Yield, Y          10  16  26  36  50  72
Draw a scatter diagram for these data, and mark on your diagram the point representing the mean of the data.
Find the equation of a suitable regression line from which the yield to be expected for a concentration of 5 g l⁻¹ can be predicted, and give the value of this expected yield. Sketch the regression line on your scatter diagram.
Calculate the sum of squares of the residuals and explain what this value represents with regard to your regression line.
[If required, you may assume in your working that $\sum C^2 = 66.25$, $\sum CY = 813$, $\sum Y^2 = 10\,012$.] (L)
9. In an experiment the temperature of a metal rod was raised from 300 K. The extensions E mm of the rod at selected temperatures T K are shown in the table.
Draw a scatter diagram of the data and mark on your diagram the point representing the means of T and E.
Find the equation of the regression line of E on T and draw this line on your diagram. Estimate the extension of the rod at 430 K. (L)
Now take new axes, with origin $(\bar{x}, \bar{y})$ and the X axis graduated in units of $s_x$, the Y axis graduated in units of $s_y$.
$$Y = \frac{y - \bar{y}}{s_y} \quad \text{and} \quad X = \frac{x - \bar{x}}{s_x}$$
$$Y = rX \quad \text{where } r = \frac{s_{xy}}{s_x s_y}$$
and $X = rY$
The diagram from Example 11.4 would change from diagram (a) to
diagram (b):
$r = \tan\theta$
[Scatter diagrams with both regression lines drawn:]
Perfect positive correlation, r = 1 (y on x and x on y coincide)
Some positive correlation, r = 0.5
No correlation, r = 0
Some negative correlation, r = −0.4
High negative correlation, r = −0.9
Perfect negative correlation, r = −1 (x on y coincides with y on x)
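For y on x
$$\sum m_i^2(\text{min}) = n\left(s_y^2 - \frac{s_{xy}^2}{s_x^2}\right) = n\left(s_y^2 - r^2 s_y^2\right) \quad \text{since } r = \frac{s_{xy}}{s_x s_y}$$
$$= n(1 - r^2)s_y^2$$
For x on y
$$\sum n_i^2(\text{min}) = n\left(s_x^2 - \frac{s_{xy}^2}{s_y^2}\right) = n\left(s_x^2 - r^2 s_x^2\right) \quad \text{since } r = \frac{s_{xy}}{s_x s_y}$$
$$= n(1 - r^2)s_x^2$$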
Example 11.5 For the following data, find the product-moment correlation coefficient. Find also the minimum sum of squares of residuals for y on x.
|x | 20 30 40 46 54 60 80 88 92
|y| 54 60 54 62 68 80 66 80 100
Solution 11.5
$$s_x^2 = \frac{\sum x^2}{n} - \bar{x}^2 = \frac{34\,140}{9} - \left(\frac{510}{9}\right)^2 = 582.2222$$
So
$$r = \frac{s_{xy}}{s_x s_y} = \frac{280.4445}{\sqrt{(582.2222)(199.1111)}} = 0.8238$$
$$\sum m_i^2(\text{min}) = n(1 - r^2)s_y^2 = 9(1 - 0.8238^2)(199.1111) = 576 \text{ (3 S.F.)}$$
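The last two steps of this solution use only the summary statistics, so they are easy to check numerically. The following is a minimal Python sketch, not part of the original text; the function name is hypothetical and the inputs are the values of n, the two variances and the covariance quoted in Solution 11.5.

```python
from math import sqrt


def correlation_and_min_residuals(n, var_x, var_y, cov_xy):
    """Return r and the minimum sum of squares of residuals for y on x,
    using r = s_xy / (s_x * s_y) and n * (1 - r^2) * s_y^2."""
    r = cov_xy / sqrt(var_x * var_y)
    min_sum_squares = n * (1 - r * r) * var_y
    return r, min_sum_squares


r, min_ss = correlation_and_min_residuals(9, 582.2222, 199.1111, 280.4445)
print(f"r = {r:.3f}, minimum sum of squares = {min_ss:.0f}")
```

Swapping the roles of the two variances in the second formula gives the corresponding minimum for x on y.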
[Calculator key sequence: the nine data pairs, from (20, 54) to (92, 100), are entered in LR mode using the DATA key]
which gives $\sum m_i^2(\text{min}) = 576$ (3 S.F.), as before.
x =cytd where ¢ =a
Now (haa
Example 11.6 For the data given in Example 11.4, find r, the product-moment
correlation coefficient.
So
$$r = +\sqrt{ac} = \sqrt{(0.186)(0.769)} = 0.378 \text{ (3 d.p.)}$$
r = 0.378, indicating that there is some positive correlation.
Method 2
$$r = \frac{s_{xy}}{s_x s_y} = \frac{1.694}{\sqrt{(9.102)(2.204)}} = 0.378 \text{ (3 d.p.)}$$
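Since the regression coefficient of y on x is $a = s_{xy}/s_x^2$ and that of x on y is $c = s_{xy}/s_y^2$, their product is $r^2$, and r takes the sign shared by a and c. A small Python sketch of this relationship follows; it is an illustration only, the function name is hypothetical, and the second call uses made-up coefficients to show the negative case.

```python
from math import copysign, sqrt


def r_from_regression_coefficients(a, c):
    """a: gradient of the line y on x; c: gradient of the line x on y.
    Returns r, using ac = r^2 with r taking the common sign of a and c."""
    if a * c < 0:
        raise ValueError("the two regression coefficients must have the same sign")
    return copysign(sqrt(a * c), a)


# Coefficients from Example 11.6 (y on x: 0.186, x on y: 0.769)
print(round(r_from_regression_coefficients(0.186, 0.769), 3))

# Hypothetical negative case: both lines slope downwards, so r is negative
print(r_from_regression_coefficients(-0.5, -0.5))
```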
Example 11.7 The moisture content, M, in grams of water per 100 grams of dried solids, of core samples of mud from an estuary was measured at depth D metres. The results are shown in the table:
Solution 11.7 (a) Scatter diagram to show moisture content, M, and depth, D.
[Scatter diagram: moisture content (g per 100 g dried solids) against depth (m)]
$$r = \frac{s_{DM}}{s_D s_M} \quad \text{where } s_{DM} = \frac{\sum DM}{8} - \bar{D}\bar{M} = -289.375$$
Therefore
$$r = \frac{-289.375}{(11.45\ldots)(26.52\ldots)} = -0.952 \text{ (3 d.p.)}$$
This is almost perfect negative correlation.
$$M - \bar{M} = \frac{s_{DM}}{s_D^2}(D - \bar{D})$$
Therefore
$$M - 45 = \frac{-289.375}{(11.45\ldots)^2}(D - 17.5) = -2.20(D - 17.5)$$
We show the regression line drawn on the scatter diagram. Note that it goes through $(\bar{D}, \bar{M})$, i.e. (17.5, 45), and the intercept on the M-axis is 83.58.
[Figure: scatter diagram of moisture content (g per 100 g dried solids) against depth (m), for depths from 0 to 35 m, with the regression line drawn. Gradient = −2.20]
a (AT)CO aa)(a
45.
so to calculate $n(1 - r^2)s_M^2$:
So $\sum m_i^2(\text{min}) = 525.98$ (2 d.p.).
a 2 7 Exercise 11d
Calculate the product-moment correlation tion coefficient and the equations of the
coefficient for the sets of data given in two least squares regression lines.
Exercise 11c, Question 1, and comment
on your answers.
For a given set of data the equations of
the least squares regression lines are
If the equations of the least squares regres- y = —0.219x+ 20.8 (yonx) and
sion lines are
x = —0.785y+ 16.2 (x ony)
y = 0.648x+ 2.64 (yonx) and Find the product-moment correlation co-
x lI 0.917y—1.91 (x ony) efficient for the data.
r
Sure Tn n nn LE
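$$\begin{aligned}
s_{xy} &= \frac{1}{n}\sum (x_i - \bar{x})(y_i - \bar{y}) \\
&= \frac{1}{n}\sum \left[a + bX_i - (a + b\bar{X})\right]\left[c + dY_i - (c + d\bar{Y})\right] \\
&= \frac{1}{n}\sum b(X_i - \bar{X})\,d(Y_i - \bar{Y})
\end{aligned}$$
So $s_{xy} = bd\,s_{XY}$
Also $s_x = b\,s_X$ and $s_y = d\,s_Y$, so
$$r_{xy} = \frac{s_{xy}}{s_x s_y} = \frac{bd\,s_{XY}}{b\,s_X\,d\,s_Y} = \frac{s_{XY}}{s_X s_Y}$$
So $r_{xy} = r_{XY}$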
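The effect of coding can be checked numerically. The sketch below is an illustration in Python and not part of the original text: the five data pairs are hypothetical, the helper names are assumptions, and the coding X = x − 1000, Y = (y − 250)/5 corresponds to b = 1 and d = 5 in the notation above.

```python
from math import sqrt


def covariance(us, vs):
    """s_uv = (sum of uv)/n - (mean of u)(mean of v)."""
    n = len(us)
    return sum(u * v for u, v in zip(us, vs)) / n - (sum(us) / n) * (sum(vs) / n)


def correlation(us, vs):
    """Product-moment correlation coefficient s_uv / (s_u * s_v)."""
    return covariance(us, vs) / sqrt(covariance(us, us) * covariance(vs, vs))


# Hypothetical raw data, chosen only to illustrate the coding
xs = [1003, 1007, 1008, 1012, 1015]
ys = [255, 270, 260, 285, 280]

# Coded values: x = 1000 + X and y = 250 + 5Y
Xs = [x - 1000 for x in xs]
Ys = [(y - 250) / 5 for y in ys]

print(covariance(xs, ys), 5 * covariance(Xs, Ys))   # s_xy = bd * s_XY with bd = 5
print(correlation(xs, ys), correlation(Xs, Ys))     # r is unchanged by the coding
```

Working with the coded values keeps the sums small, which is the practical reason for coding when calculating by hand.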
Example 11.8 For the following data, use a method of coding to find (a) the co-
variance, (b) the product-moment correlation coefficient, (c) the
least squares regression lines y on x and x on y.
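(a)
$$s_{XY} = \frac{\sum XY}{n} - \left(\frac{\sum X}{n}\right)\left(\frac{\sum Y}{n}\right) = \frac{\sum XY}{8} - \left(\frac{74}{8}\right)\left(\frac{4}{8}\right) = 5.5$$
Therefore $s_{xy} = 5s_{XY} = 5(5.5) = 27.5$
The covariance $s_{xy}$ is 27.5.
(b) Now
$$s_X^2 = \frac{\sum X^2}{n} - \bar{X}^2 = \frac{820}{8} - \left(\frac{74}{8}\right)^2 = 16.9375$$
$$s_Y^2 = \frac{\sum Y^2}{n} - \bar{Y}^2 = \frac{\sum Y^2}{8} - \left(\frac{4}{8}\right)^2 = 5.25$$
Therefore
$$r_{XY} = \frac{s_{XY}}{s_X s_Y} = \frac{5.5}{\sqrt{(16.9375)(5.25)}} = 0.58 \text{ (2 d.p.)}$$
So $r_{xy} = r_{XY} = 0.58$ (2 d.p.)
The product-moment correlation coefficient is 0.58 (2 d.p.).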
(c) The equation of the least squares regression line Y on X is
$$Y - \bar{Y} = \frac{s_{XY}}{s_X^2}(X - \bar{X})$$
i.e.
$$Y - \frac{4}{8} = \frac{5.5}{16.9375}\left(X - \frac{74}{8}\right)$$
so Y = 0.3247X − 2.5037
Now, since $Y = \dfrac{y - 250}{5}$ and X = x − 1000, this equation may be written
$$\frac{y - 250}{5} = 0.3247(x - 1000) - 2.5037$$
y = 1.6235x − 1386.0185 (least squares regression line y on x)
The equation of the least squares regression line X on Y is
$$X - \bar{X} = \frac{s_{XY}}{s_Y^2}(Y - \bar{Y})$$
$$X - \frac{74}{8} = \frac{5.5}{5.25}\left(Y - \frac{4}{8}\right)$$
i.e. X = 1.048Y + 8.726
This equation may be written
$$x - 1000 = 1.048\left(\frac{y - 250}{5}\right) + 8.726$$
Exercise 11e
For the following sets of data, use appropriate methods of coding to calculate (a) the covariance, (b) the product-moment correlation coefficient, (c) the least squares lines of regression of y on x and x on y.
981.2 981.3 981.9 981.6 981.5
55.6. * 652 90 64g ee Tl esEes
1. 1701 1722 1717 1718 1703 1701| 3. 0.00157 0.00156 0.00149 0.00165
45.1 45.8 45.6 45.3 45.1 45,1 100.4 100.7 100.0 100.4
Suppose that
$X_1, X_2, \ldots, X_n$ are the ranks of $x_1, x_2, \ldots, x_n$
and $Y_1, Y_2, \ldots, Y_n$ are the ranks of $y_1, y_2, \ldots, y_n$
Then
$X_1, X_2, \ldots, X_n$ are the numbers 1, 2, ..., n in some order
and $Y_1, Y_2, \ldots, Y_n$ are the numbers 1, 2, ..., n in some order
Consider the rank differences $d_1, d_2, \ldots, d_n$ given by
$$d_1 = X_1 - Y_1,\quad d_2 = X_2 - Y_2,\quad \ldots,\quad d_n = X_n - Y_n$$
so that
$$\sum d^2 = \sum (X - Y)^2 = \sum X^2 + \sum Y^2 - 2\sum XY \qquad \text{(i)}$$
Now, if we substitute for the x's and y's in the original data their corresponding ranks in the formula for r, the product-moment correlation coefficient, we obtain an approximation to r. This approximation is called Spearman's coefficient of rank correlation, $r_s$.
We write
$$r_s = \frac{s_{XY}}{s_X s_Y} \quad \text{where } s_{XY} = \frac{\sum XY}{n} - \left(\frac{\sum X}{n}\right)\left(\frac{\sum Y}{n}\right)$$
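Since the ranks are the numbers 1, 2, ..., n in some order,
$$\sum X = \sum Y = \frac{n(n+1)}{2}, \qquad \sum X^2 = \sum Y^2 = \frac{n(n+1)(2n+1)}{6}$$
and from (i), $\sum XY = \sum X^2 - \frac{1}{2}\sum d^2$. So
$$\begin{aligned}
s_{XY} &= \frac{(n+1)(2n+1)}{6} - \frac{\sum d^2}{2n} - \frac{(n+1)^2}{4} \\
&= \frac{(n+1)}{12}\left[2(2n+1) - 3(n+1)\right] - \frac{\sum d^2}{2n} \\
&= \frac{(n+1)(n-1)}{12} - \frac{\sum d^2}{2n} \\
&= \frac{(n^2 - 1)}{12} - \frac{\sum d^2}{2n} \qquad \text{(ii)}
\end{aligned}$$
Now
$$s_X^2 = \frac{\sum X^2}{n} - \left(\frac{\sum X}{n}\right)^2 = \frac{(n+1)(2n+1)}{6} - \frac{(n+1)^2}{4} = \frac{(n^2 - 1)}{12} \qquad \text{(iii)}$$
Similarly
$$s_Y^2 = \frac{(n^2 - 1)}{12}$$
so
$$s_X s_Y = \frac{(n^2 - 1)}{12}$$
$$r_s = \frac{s_{XY}}{s_X s_Y} = \frac{\dfrac{(n^2 - 1)}{12} - \dfrac{\sum d^2}{2n}}{\dfrac{(n^2 - 1)}{12}} = 1 - \frac{6\sum d^2}{n(n^2 - 1)}$$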
Method of ranking
Suppose we have the masses, x (in kg), of five men
66, 68, 65, 69, 70
Arranged in ascending order of magnitude, these are 65, 66, 68, 69, 70, so we assign the ranks as follows:
| x    | 66  68  65  69  70 |
| Rank |  2   3   1   4   5 |
Here, the 3rd and the 4th places represent the same mass (68 kg), so
we assign the average rank 3.5 to both these places.
Similarly for the eight values:
Here the 3rd, 4th and 5th places represent the same mass (66 kg) so we assign the average rank 4 to these places; also the 7th and the 8th places represent the same mass (68 kg) so we assign the average rank 7.5 to both these places.
NOTE: if there are more than just a few equal values, then this
method is not appropriate.
Solution 11.9 In this example, the data has been ranked already.
Let d = rank(x) − rank(y).
$\sum d^2 = 30$
$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)} \quad \text{where } n = 8$$
$$= 1 - \frac{6(30)}{8(64 - 1)} = 0.64 \text{ (2 d.p.)}$$
Example 11.10 The marks of 10 pupils in French and German tests are as follows.
| French, x | 12   8  16  12   7  10  12  16  12   9 |
| German, y |  6   5   7   7   4   6   8  13  10  10 |
Calculate Spearman's coefficient of rank correlation.
$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6(61)}{10(100 - 1)} = 0.63 \text{ (2 d.p.)}$$
Spearman’s coefficient of rank correlation is 0.63, indicating some
positive correlation between the marks in the two tests.
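The ranking with tied values and the value of $\sum d^2$ in Example 11.10 can be reproduced with a short program. This is a minimal Python sketch added for illustration; the function names are hypothetical, and rank 1 is given to the smallest mark (ranking both sets from the largest instead leaves $\sum d^2$ unchanged).

```python
def average_ranks(values):
    """Rank the values from smallest (rank 1), giving tied values
    the average of the places they occupy."""
    order = sorted(values)
    ranks = []
    for v in values:
        first_place = order.index(v) + 1      # first place occupied by v
        ties = order.count(v)                 # how many places v occupies
        ranks.append(first_place + (ties - 1) / 2)
    return ranks


def spearman(xs, ys):
    """Return (sum of d^2, r_s) using r_s = 1 - 6*sum(d^2) / (n(n^2 - 1))."""
    n = len(xs)
    rank_x, rank_y = average_ranks(xs), average_ranks(ys)
    sum_d2 = sum((p - q) ** 2 for p, q in zip(rank_x, rank_y))
    return sum_d2, 1 - 6 * sum_d2 / (n * (n * n - 1))


french = [12, 8, 16, 12, 7, 10, 12, 16, 12, 9]
german = [6, 5, 7, 7, 4, 6, 8, 13, 10, 10]

sum_d2, r_s = spearman(french, german)
print(sum_d2, round(r_s, 2))
```

As the NOTE on ranking warns, the formula is only an approximation when ranks are tied, so with many equal values it should not be relied upon.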
Exercise 11f
choices is as shown:
Optional extra Mrs Brown
2nd test D F E B G Cc A
January lel,
February
AD OB eGS DMEM E GA H. © J
March
Adjudicator I 78 66 73 73 84 66 89 84 67 177 April
AdjudicatorII |81 68 81 75 80 67 85 83 66 78 May
June
Calculate a coefficient of rank correlation July
August
for these data. Name the method you
September
have used and describe briefly, without October
proof, the principle on which it is based. November
(SUJB Additional) December
Wine
Enrico’s ranking
Claude’s ranking
If we leave the first row in its natural ranking order, 1, 2, 3, 4, then
the second row could be ranked in 4! different ways, assuming that
there are no equal ranks. These 24 arrangements are shown here,
with the corresponding values of $\sum d^2$.
$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)}, \qquad \sum f = 24$$
We can now use this bar chart to find probabilities associated with various values of $\sum d^2$.
[Bar charts (a) to (e): probability distribution of $\sum d^2$ for n = 4, with $\sum d^2$ = 0, 2, 4, ..., 20 on the horizontal axis and probability on the vertical axis]
(d) $P(\sum d^2 \le 6) = \frac{9}{24} = 0.375$, $P(\sum d^2 \ge 14) = \frac{9}{24} = 0.375$
(e) $P(\sum d^2 \le 8) = \frac{11}{24} = 0.458$, $P(\sum d^2 \ge 12) = \frac{11}{24} = 0.458$
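The 24 arrangements and the resulting probabilities can be generated by listing every possible ranking of the second row. The following Python sketch is an illustration only and not part of the original text; the function name is hypothetical, and it assumes, as above, that there are no equal ranks and that every arrangement is equally likely when there is no correlation.

```python
from collections import Counter
from fractions import Fraction
from itertools import permutations


def sum_d2_distribution(n):
    """Frequency of each value of sum(d^2) over all n! rankings of the
    second row, with the first row left in its natural order 1, 2, ..., n."""
    first_row = range(1, n + 1)
    return Counter(sum((x - y) ** 2 for x, y in zip(first_row, second_row))
                   for second_row in permutations(first_row))


dist = sum_d2_distribution(4)
total = sum(dist.values())                      # 4! arrangements

p_le_6 = Fraction(sum(f for v, f in dist.items() if v <= 6), total)
p_ge_14 = Fraction(sum(f for v, f in dist.items() if v >= 14), total)
p_le_8 = Fraction(sum(f for v, f in dist.items() if v <= 8), total)

for value in sorted(dist):
    print(value, dist[value])
print(float(p_le_6), float(p_ge_14), float(p_le_8))
```

The same function works for larger n, although the number of arrangements grows as n!, which is why published tables are used in practice.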
Example 11.11 For 8 pairs of rankings, $\sum d^2 = 28$, giving Spearman's coefficient of rank correlation $r_s$ = 0.667 (3 d.p.).
Does this value indicate (a) a correlation significantly different from zero, at the 10% level, (b) a significant positive correlation at the 1% level?
| 28 | 140 | 0.0415 |
This indicates that $P(\sum d^2 \le 28) = 0.0415 < 0.05$, so we reject $H_0$ and conclude that there is evidence at the 10% level of a correlation different from zero.
(b) $H_0: \rho = 0$ (there is no correlation)
Use a 1-tailed test at 1% level and reject $H_0$ if $P(\sum d^2 \le 28) < 0.01$.
Now, from Table A, $P(\sum d^2 \le 28) = 0.0415 > 0.01$ so we do not reject $H_0$ and conclude that there is no evidence at the 1% level of a positive correlation.
Example 11.12 For 9 pairs of rankings it is found that $\sum d^2 = 214$, giving $r_s = -0.783$. Does this provide evidence, at the 1% level, of a negative correlation?
Use a 1-tailed test at 1% level and reject $H_0$ if $P(\sum d^2 \ge 214) < 0.01$.
Now, from Table A, $P(\sum d^2 \ge 214) = 0.0086 < 0.01$, so we reject $H_0$ and conclude that there is evidence, at the 1% level, of negative correlation.
Example 11.13 An expert on porcelain is asked to place 7 china bowls in date order
of manufacture assigning the rank 1 to the oldest bowl. The actual
dates of manufacture and the order given by the expert are shown.
i Oe a fF 6 U2 et 5
Solution 11.13
Rank (x)
Rank (y)
|d|
$\sum d^2 = 16$, n = 7
Now
$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6(16)}{7(49 - 1)} = 0.714 \text{ (3 d.p.)}$$
Exercise 11g
In each of the following questions use Table A to test the hypotheses.
significance
ONS
CRO
SRO
SIGNIFICANCE TEST FOR rs, USING CRITICAL VALUES
TABLE B
Example 11.14 Using Table B, for n = 8 and $r_s$ = 0.667, test the following hypotheses:
(a) $H_0: \rho = 0$, $H_1: \rho \ne 0$ (10% level of significance)
(b) $H_0: \rho = 0$, $H_1: \rho > 0$ (1% level of significance)
Solution 11.14 (a) Using a 2-tailed test, at the 10% level, and considering Table B, with n = 8, significance 0.05 (because the test is 2-tailed) we reject $H_0$ if $r_s \ge 0.6438$.
Now $r_s$ = 0.667, so we reject $H_0$ and conclude that there is evidence at the 10% level of a correlation different from zero.
_ Exercise 11h
In each of the following, use Table B to comment on the significance of the value for rs.
Level of
es
Ho:p = 0, H,:p #0
Ho:p
=0, Hi:p >0
Ho:p = 5 Hy:p #0
Ho:p = 0, H,:p>0
Ho:p = H,:p <0
Ho:p
=0, Hy:p #0
7p >0
7p >0
_ Exercise 11i
In the following questions, either Table A or (f) X 123456789 10
Table B may be used. YoU TOsOSSmii Omon
4. elm
(g)X 123456 78910
1. Find rg and comment on the significance 3164581092 7
of the result.
X and Y have been ranked. (h) Xv vlo2 S946 6 7 § 9 10
(a)X 123456 YY. OS 1007s GES (2 4.3)
Y 123456
(b) X 12383456
Y 654321
id x 128 one 2. Calculate rs for the following data and
comment on the significance of the
Y 3 5°))476°2 results.
(d) . : : 3 y i : (a) (20, 13), (47, 29), (50, 33), (33, 20),
(57, 32), (44, 23), (38, 25), (25, 19).
(e) X 12345678910 (b) (4.8, 81), (6.2, 79), (8.4, 86),
YY. 126354 2556 ee Omo mL (4.1, 63), (7.5, 90), (5.1, 87).
KENDALL'S COEFFICIENT OF RANK CORRELATION $r_k$
$$r_k = \frac{S}{\frac{1}{2}n(n - 1)}$$
Rank (x)  1  2  3  4  5  6  7  8  9
Rank (y)
Maximum score = 36.
$$r_k = \frac{36}{\frac{1}{2}(9)(8)} = 1 \text{ as expected.}$$
Example 11.16 Nine applicants are interviewed for a teaching post by the head-
teacher and the head of department. Each ranked the applicants in
order of merit as follows:
Applicant A Barc
Headteacher 2 ie rO
Head of Department | 3 12
Solution 11.16 We will consider Kendall’s rank correlation coefficient, and must
first put one set in rank order and allocate letters y, to yo to the
other set:
Total score S = 20
Now
$$r_k = \frac{S}{\frac{1}{2}n(n - 1)} = \frac{20}{\frac{1}{2}(9)(8)} = 0.556$$
$\sum f = 24$
[Bar charts (a) to (e): probability distribution of S for n = 4, with S = −6, −4, −2, 0, 2, 4, 6 on the horizontal axis]
(b) $P(S \le -4) = \frac{4}{24} = 0.167$, $P(S \ge 4) = \frac{4}{24} = 0.167$
(c) $P(S \le -2) = \frac{9}{24} = 0.375$, $P(S \ge 2) = \frac{9}{24} = 0.375$
(d) $P(S \le 0) = \frac{15}{24} = 0.625$
(e) $P(S \ge 0) = \frac{15}{24} = 0.625$
$S \le -6$    $S \ge 6$
$S \le -4$    $S \ge 4$
$S \le -2$    $S \ge 2$
$S \le 0$     $S \ge 0$
Example 11.17 Considering the data given in 11.16, perform a significance test to
determine the extent of agreement between the headteacher and
head of department when ranking nine applicants for a teaching
post.
Use a 1-tailed test at the 5% level and reject $H_0$ if $P(S \ge 20) < 0.05$.
Now, from Table C,
Example 11.18 Calculate $r_k$ for the following data and comment on the result.
(6.9, 89), (5.8, 73), (4.8, 81), (6.2, 79), (8.4, 86),
(4.1, 63), (7.5, 90), (5.1, 87), (9.9, 96), (4.3, 72).
Solution 11.18
| x        | 6.9  5.8  4.8  6.2  8.4  4.1  7.5  5.1  9.9  4.3 |
| y        |  89   73   81   79   86   63   90   87   96   72 |
| Rank (x) |   7    5    3    6    9    1    8    4   10    2 |
| Rank (y) |   8    3    5    4    6    1    9    7   10    2 |
Now re-arrange the pairs so that the x-values are in rank order.
| Rank (x) | 1  2  3  4  5  6  7  8  9  10 |
| Rank (y) | 1  2  5  7  3  4  8  9  6  10 |
| Score    | 9  8  3  0  5  4  1  0  1     |
Total score S = 31
Now
$$r_k = \frac{S}{\frac{1}{2}n(n - 1)} = \frac{31}{\frac{1}{2}(10)(9)} = 0.689 \text{ (3 d.p.)}$$
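The scoring procedure of Example 11.18 can be sketched in code. This is an illustrative Python sketch, not part of the original text; the function names are hypothetical and it assumes there are no tied values. For each pair, taken in rank order of x, it counts the later y-values that are larger (+1 each) and smaller (−1 each).

```python
def kendall_score(xs, ys):
    """Total score S: put the pairs in rank order of x, then for each y
    add +1 for every later y that is larger and -1 for every later y
    that is smaller. Assumes no tied values."""
    y_in_x_order = [y for _, y in sorted(zip(xs, ys))]
    score = 0
    for i, y_i in enumerate(y_in_x_order):
        for y_j in y_in_x_order[i + 1:]:
            if y_j > y_i:
                score += 1
            elif y_j < y_i:
                score -= 1
    return score


def kendall_rk(xs, ys):
    """r_k = S / (n(n - 1)/2), the score as a fraction of the maximum score."""
    n = len(xs)
    return kendall_score(xs, ys) / (0.5 * n * (n - 1))


# Data of Example 11.18
xs = [6.9, 5.8, 4.8, 6.2, 8.4, 4.1, 7.5, 5.1, 9.9, 4.3]
ys = [89, 73, 81, 79, 86, 63, 90, 87, 96, 72]

S = kendall_score(xs, ys)
r_k = kendall_rk(xs, ys)
print(S, round(r_k, 3))
```

Only the order of the values matters, so the raw data can be used directly without first converting them to ranks.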
Example 11.19 When calculating $r_k$, with 9 pairs of data, it is found that S = −24.
Test, at the 1% level, the hypotheses: $H_0: \rho = 0$, $H_1: \rho \ne 0$.
_ Exercise 11)
In each of questions 1 to 7, use Table C to test the hypotheses, at the level of significance indicated.
AGS
PwON
8. Calculate r, for number 1, Exercise 11i Calculate r, for each pair of judges and
and comment on the significance of the comment on the significance of your
result.
results.
9. Calculate r, for number 2 of Exercise 11i
and comment on the significance of the
result.
10. Three judges in a bouncing baby competi- 1: These were the marks obtained by 8
tion rank the babies as shown. pupils in Mathematics and Physics.
|Mathematics [67 42 85 51 39 97 81 70
Judge 1 70 59 71 38 55 62 80 76
Judge 2
Calculate r, and comment on the signifi-
g cance of the result.
5 1
6 it
4 3
8 0
1 6
ie 4
iz 0
3 3
$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6(72)}{8(64 - 1)} = 0.14 \text{ (2 d.p.)}$$
Spearman's coefficient of rank correlation is 0.14 (2 d.p.).
$s_y$ = 16.67 (given)
Therefore
$$r = \frac{s_{xy}}{s_x s_y} = \frac{41.25}{(16.24)(16.67)} = 0.15 \text{ (2 d.p.)}$$
The product-moment correlation coefficient is 0.15 (2 d.p.).
Example 11.21 It is suspected that two quantities Q and W are related according to the formula $Q = aW^b$, where a and b are constants. Observations on Q and W were made and the results were as follows:
Solution 11.21
me 20 11.601
Re erg ee
n 8
Z' ZY ie
eas ae eee tee) 2
For the points on the left For the points on the right
%on
By 88ers ag eee
a 4 . YR A .
i.e. $y = \log_{10} a + bx$, where $x = \log_{10} W$ and $y = \log_{10} Q$
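Taking logarithms turns $Q = aW^b$ into a straight line, so a and b can be estimated from the intercept and gradient of a line fitted to the points $(\log_{10} W, \log_{10} Q)$. The Python sketch below illustrates this under stated assumptions: the data are hypothetical, generated exactly from a = 2 and b = 1.5, the function name is made up, and the line is fitted by least squares rather than by the method of semi-averages.

```python
from math import log10


def fit_power_law(ws, qs):
    """Estimate a and b in Q = a * W**b by fitting the least squares
    line y = log10(a) + b*x to x = log10(W), y = log10(Q)."""
    xs = [log10(w) for w in ws]
    ys = [log10(q) for q in qs]
    n = len(xs)
    x_bar, y_bar = sum(xs) / n, sum(ys) / n
    s_xy = sum(x * y for x, y in zip(xs, ys)) / n - x_bar * y_bar
    s_xx = sum(x * x for x in xs) / n - x_bar * x_bar
    b = s_xy / s_xx                       # gradient of the line
    a = 10 ** (y_bar - b * x_bar)         # intercept is log10(a)
    return a, b


# Hypothetical observations lying exactly on Q = 2 * W**1.5
ws = [1, 2, 4, 8, 16]
qs = [2 * w ** 1.5 for w in ws]

a, b = fit_power_law(ws, qs)
print(round(a, 3), round(b, 3))
```

With real observations the points will not lie exactly on a line, and the scatter diagram of the logged values shows whether the suspected relationship is reasonable.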
Example 11.22 The body and heart masses of fourteen 10-month-old male mice are
tabulated below:
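$\sum x = 495$, $\sum y = 2039$, $\sum xy = 72\,867$, $\sum x^2 = 17\,783$, $\sum y^2 = 300\,405$
$$\bar{x} = \frac{\sum x}{n} = \frac{495}{14} = 35.36 \text{ (2 d.p.)}$$
$$\bar{y} = \frac{\sum y}{n} = \frac{2039}{14} = 145.64 \text{ (2 d.p.)}$$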
Sx
VV irae ix xX)
Sx
where
fficient of of regression ae
h coefficient
Thereforethe 20.09en : a
To draw this on the scatter diagram, first plot $(\bar{x}, \bar{y})$. Then find two further points, e.g.
(mg)
Heart
mass
aM ez
SxSy
Therefore
5s Wanoomaeen © Meme
Sxy 55.27
r= =
Example 11.23 The positions in a league of 8 hockey clubs at the end of a season
are shown in the table.
Shown also are the average attendances (in hundreds) at home
matches during that season.
Solution 11.23 Either $r_s$ or $r_k$ could be calculated. We show the working for both.
Spearman
| Position (x)        | 1  2  3  4  5  6  7  8 |
| Attendance rank (y) | 2  1  8  5  3  6  7  4 |
$\sum d^2 = 48$
Now
$$r_s = 1 - \frac{6\sum d^2}{n(n^2 - 1)} = 1 - \frac{6(48)}{8(64 - 1)} = 0.4286 \text{ (4 d.p.)}$$
Significance test:
$H_0: \rho = 0$ (no correlation between the two ranks)
$H_1: \rho > 0$ (some positive correlation)
Kendall
| Attendance rank (y) | 2    1    8    5    3    6    7    4   |
|                     | y_1  y_2  y_3  y_4  y_5  y_6  y_7  y_8 |
Total score S = 8
$$r_k = \frac{S}{\frac{1}{2}n(n - 1)} = \frac{8}{\frac{1}{2}(8)(7)} = 0.286 \text{ (3 d.p.)}$$
Significance test:
$H_0: \rho = 0$ (no correlation)
$H_1: \rho > 0$ (some positive correlation)
Use a 1-tailed test, at the 5% level, and reject $H_0$ if $P(S \ge 8) < 0.05$.
From Table C, with n = 8, $P(S \ge 8) = 0.199 > 0.05$, so we do not reject $H_0$ and conclude that there is no evidence, at the 5% level, of positive correlation between the two sets of ranks.
y on x                                  | x on y
If equation of line is y = ax + b       | If equation of line is x = cy + d
$\sum y = a\sum x + nb$                 | $\sum x = c\sum y + nd$
$\sum xy = a\sum x^2 + b\sum x$         | $\sum xy = c\sum y^2 + d\sum y$
1
=e)
n
. aN
= ey)
n
REGRESSION AND CORRELATION 621
= Say
SxSy
(regression coefficient of x on y)
In questions involving regression lines assume, The heights A, in cm, and weights W, in
unless stated otherwise, that the least squares kg, of 10 people are measured. It is
regression lines are required. found that Yh = 1710, 2 W = 760,
Dh? = 293 162, DAW = 130 628 and
a 12 students were given a prognostic test = W? = 59 390.
at the beginning of a course and their Calculate the correlation coefficient
scores X; in the test were compared with between the values of h and W.
their scores Y; obtained in an examina- What is the equation of the regression
tion at the end of the course (i = 1, 2,..., line of W on h? (O &C)
12). The results were as follows:
Boy | A B C D E F G H I J
x | 122 124 133 138 144 156 158 161 164 168
y | 41 88 52 66 29 54 59 61 63 67
Find the equations of the regression lines of y on x, and of x on y. No diagram is needed. Calculate also the coefficient of correlation.
Estimate the distance to which a cricket ball can be thrown by a boy 150 cm in height. (AEB)
4. Sketch scatter diagrams for which
(a) the product moment correlation coefficient is −1,
(b) Spearman’s correlation coefficient is +1, but the product moment correlation coefficient is less than 1.
Five independent observations of the random variables X and Y were:
Find
(c) the sample product moment correlation coefficient,
(d) Spearman’s correlation coefficient. (O & C)
5. The state of Tempora demands that every household in the country shall have a reliable clock; inspectors are being introduced throughout the country to implement the policy. The Chief Inspector has the following data on the population size of towns, where Inspection Units have been set up, and the number of man-hours spent on inspection.
Population (thousands) | 2 40,5 (2 a8 45" 28 (20-21) 22
Man-hours (thousands) | 108 (11, dais" 24. 96y8F 82 pangae
(a) Calculate the regression line for predicting the number of man-hours from the population size (note that the mean value of each variate is a whole number).
(b) Predict the manpower required (in man-hours) for a new Inspection Unit to be installed in a town with a population of 17 000. (O)
6. In a certain heathland region there is a large number of alder trees where the ground is marshy but very few where the ground is dry. The number x of alder trees and the ground moisture content y are found in each of 10 equal areas (which have been chosen to cover the range of x in all such areas). The following is a summary of the results of the survey:
Σx = 500, Σy = 300, Σx² = 27 818, Σxy = 16 837, Σy² = 10 462
Find the equation of the regression line of y on x.
Estimate the ground moisture content in an area equal to one of the chosen areas which contains 60 alder trees. (O & C)
7. (a) The following marks were awarded by 2 judges at a music competition:
Child 1 10 9
Child 2 5 6
Child 3 8 10
Child 4 7 5
Child 5 9 8
Calculate a coefficient of rank correlation.
(b) Determine, by calculation, the equation of the regression line of x on y based on the following information about 8 children:
Child | 1 2 3 4 5 6 7 8
Arithmetic mark (x) | 45 33 27 23 18 14 8 0
English mark (y) | Seco tien 20 e129) SOR asian
(SUJB)
8. The following data (Table A) represent the lengths (x) and breadths (y) of 12 cuckoos’ eggs measured in millimetres.
Draw a scatter diagram for the data. Obtain the least squares regression line of y on x and plot this on the scatter diagram. (JMB)
9. (X_i, Y_i), i = 1, 2, ..., n is a sample from a bivariate population. The least-squares regression lines of Y on X and X on Y are calculated. Why would you not expect the two lines to coincide? Under what circumstances would they coincide?
In the table, Y_i is the mass (in grammes) of potassium bromide which will dissolve in 100 grammes of water at a temperature of X_i °C.
Table A
x | 22.3 23.6 24.2 22.6 22.3 22.3 22.1 23.3 22.2 22.2 21.8 23.2
y | 16.5 17.1 17.3 17.0 16.8 16.4 17.2 16.8 16.7 16.2 16.6 16.4
Output of creatinine |4 35 154 1.45 1.06 2.13 1.00 0.90 2.00 2.70 0.75
(grammes)
Body mass 55 48-56 53" 74 44 | 49-68 78 «61
(kilogrammes)
The four recorded pairs of values are The results are shown in the table.
Table C
Predicted age x (years) | 24 30 28 36 20 22 31 28 21 29 40 25 27
Actual age y (years) | 23 31 28 35 20 25 45 30 22 27 40 27 96
18. In Table D below x is the average weekly household income in £ and y the infant mortality per 1000 live births in 11 regions of the UK in 1985.
It is hypothesised that a high value of x will be associated with a low value of y. Explain why it would not be appropriate to use the product moment correlation coefficient to investigate this. Calculate a rank correlation coefficient and test its significance. The values below give the probabilities of exceeding the given values of r_s and r_k calculated from 10 and 11 pairs of uncorrelated variables.
Table D
x | 170.4 183.2 172.9 187.1 203.2 204.8 208.8 248.0 198.3 187.1 179.1
on a particular course are given examinations in Sociology (S), Social Administration (SA) and Quantitative Methods (QM). The final grade awarded to each student is based on the total of the marks scored on the three papers. Table E shows the marks obtained by a sample of ten students who sat the three papers.
The following matrix of Spearman rank correlation coefficients was obtained for this sample of ten students.
Table E
Student | Sociology (S) | Social Administration (SA) | Quantitative Methods (QM)
66 48 44
1976 1977 1978 1979 1980 1981 1982 1983 1984 1985
Nov Nov Dec _. Dec Jan Nov Dec Jul Dec Apr
78 6.0. 8.0,%10:5 ,10.5:| wo8uleee m7 on Mas emnes
Mortgage% | 12.2 95° 11.85,11,8 915.0), 15.03910.0 o6i1 25 see
man’s and Kendall’s coefficients for a sample size of 10.
Significance level | Spearman’s | Kendall’s
Comment on the significance of your calculated value of r. (SUJB)
23. Explain briefly what is measured by the product moment correlation coefficient.
The manager of a large office supervises 15 clerical assistants, each using a word-processor. Because of the pressure of work, the assistants did not all receive the same amount of training in the use of their word-processors. In order to make an assessment of the need for training the manager monitored their work during a given week, recording the number of pieces of work correctly produced without any errors (x₁), the number produced containing errors (x₂) together with the number of days training received (y).
The results are summarised in Table G below.
(a) Given that Σx₁² = 11 513, Σy² = 728 and Σx₁y = 2676, show that the product moment correlation coefficient between y and x₁ is 0.491.
(b) Without using a comment of the form ‘The correlation between x₁ and y is not very strong’, suggest how the manager might have attempted to interpret this value as part of the assessment.
The manager then decided to investigate whether there is any association between the number of days training and a perceived measure of accuracy based on the difference between x₁ and x₂. Consequently a new variable z = x₁ − x₂ was created.
(c) Plot a scatter diagram of y against z. Explain why the manager should not correlate y and z using the product moment correlation coefficient.
(d) Explain why z² might be a better variable to correlate with y using the product moment correlation coefficient. Evaluate the correlation coefficient between y and z² and explain why the manager might be pleased with the value obtained. Suggest how this new variable would present the manager with a practical problem. (AEB 1988)
24. The experimental data below were obtained by measuring the horizontal distance y cm, rolled by an object released from the point P on a plane inclined at θ° to the horizontal, as shown in the diagram.
[Diagram: object released from P on a plane inclined at θ° to the horizontal, rolling a horizontal distance y cm]
Σy = 828, Σyθ = 18 147,
Σθ = 155.5, Σθ² = 3520.25.
Table G
Number of days
training (y)
en
oO
aw
5
ray1
8
9
3
8
2
6
4
5
3
rea1
25. Table H below gives the average cost per hundredweight of zinc manufactures imported into the UK during each of the years 1873 to 1882.
(a) Plot the data on graph paper, by coding with (year − 1872) as the x variable and (cost − 100) as the y variable.
(b) Given that Σy = 270 and Σxy = 1057, show that the gradient of the equation of the least squares regression line of y on x is −5.2 (to 2 significant figures). Calculate the equation of this line and plot it on your graph.
(c) Use your equation to predict the cost of zinc manufactures imported in 1883. Comment on your prediction.
(Source: Statistical Abstract for the United Kingdom 1871 to 1885.) (O)
Explain the significance of the regression coefficient.
Predict the transit time of a package sent from a supplier 200 miles away from the company.
Give two reasons why you would not use the equation to predict transit time for a package sent from a supplier 1500 miles away.
Calculate the product moment correlation coefficient between x and y.
Explain why the value you have obtained supports the purchasing manager’s attempt to establish a regression equation of y on x. (AEB 1987)
Table H
Year | 1873 1874 1875 1876 1877 1878 1879 1880 1881 1882
Cost (p) | 147 147 144 140 129 119 112 116 107 109
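The coded sums quoted in question 25 can be checked directly from Table H. The following is a minimal sketch in Python (not part of the original text; the variable names are illustrative) that codes the data as in part (a) and applies the usual least squares formulae for the line of y on x.

```python
# Table H, coded as in part (a): x = year - 1872, y = cost - 100.
years = [1873, 1874, 1875, 1876, 1877, 1878, 1879, 1880, 1881, 1882]
costs = [147, 147, 144, 140, 129, 119, 112, 116, 107, 109]

x = [year - 1872 for year in years]
y = [cost - 100 for cost in costs]
n = len(x)

sum_x = sum(x)
sum_y = sum(y)
sum_xy = sum(a * b for a, b in zip(x, y))
sum_xx = sum(a * a for a in x)

# Least squares line of y on x: gradient = S_xy / S_xx.
s_xy = sum_xy - sum_x * sum_y / n
s_xx = sum_xx - sum_x ** 2 / n
gradient = s_xy / s_xx
intercept = sum_y / n - gradient * sum_x / n

print("sum y =", sum_y, " sum xy =", sum_xy)
print("gradient =", round(gradient, 2), " intercept =", round(intercept, 1))
```

The sums printed should agree with the values given in part (b), and the gradient rounds to −5.2 at 2 significant figures.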
APPENDIX 1
RANDOM NUMBERS
6523 6800/7782 5814/1085 118515711 737414525 5046
0956 7651/0473 9430/1674 6959/0438 839813020 8785
5599 9860] 0133 0693/8513 231712551 9204/5231 3870
7282 4544/0953 0483/0383 984116741 0138|6683 1199
0421 2872/7325 0274/3581 7849/5267 6140/6050 4750
8701 8059| 8936 4159|6027 6489/4745 1821/6984 7606
3162 4653/8440 5631/7476 5223/7295 9606/5683 8522
2981 5794/3591 9070/9424 1935/5022 2372/8734 8315
3998 7422/7719 1281)|2942 0450/6234 3681)| 4307 9792
5614 8010/7652 3854)|8413 9990/2255 4104/7237 8933
2956 6274/1267 0935)|8933 0428/4475 0157/8745 5221
9332 5738] 3936 8742|7255 7397|9836 5741/7609 1168
9569 5154/4319 2049/5725 9055| 2620 7098/4373 5645
6571 3243|/6467 2255/6565 4886/1088 2012/4018 49.25
9027 3343/9784 2057/4991 4120|)1764 2960|6687 5597
9029 4245/6134 3013/3039 2152|)5928 6498|0876 0927
9974 0629/2055 7270/1143 9582/7537 9024/7748 6321
8787 5691|1697 5150/6136 9647|7668 4911/5056 5106
4624 1774)9737 3903)5483 3400] 7461 7751/4363 1567
6679 8143) 4092 8472)8832 8324|6701 4134/7019 2693
36 42 9458] 8330 9239/1840 0300/1290 3237]9165 4815
0766 2508|9927 6948)|8532 1646/1931 8502]|8636 2296
9310 0572|1826 3667)6848 3169|6858 9349/4586 9929
4950 6399) 2671 4794) 3271 7291] 3418 7406] 3214 4080
2075 5889)]3904 4273)3793 1107) 2877 9136)6047 8262
0240 6209|0071 0937)|8044 5037)| 38270 2038)|7186 75 34
5987 2138] 2978 7267)4283 6521)5479 6642)4786 3115
4808 9966/4338 2813/5025 4793/1115 0784) 2830 1907
5426 8675|4415 2039}] 2003 5854/8029 6253)0697 7151
85 35 5845] 2358 6366]0962 8092/1455 8141] 2148 8734
7384 9049]0121 9029/5706 6873]5110 5195)6308 5799
3464 7800/9259 6774|5848 9209/4220 4037]6380 5893
6856 8747/6306 2471]4198 7906)0718 5829)1649 6737
7247 0542/8807 2755|5874 8208/4228 2648) 2532 0031
4444 9675|]8957 1260] 4238 7736) 4569 2168) 3270 0496
2811 5747|6157 8988/6218 9367|]5732 9672/2117 1354
8722 3888/9199 1608/1776 2747/5214 9886) 3568 2385
4493 1459/6740 2410]1163 4047|0756 1422/6274 9339
8184 3725/9043 5662/9458 4903) 8422 5722) 4798 8637
0975 3521/0447 5408/9844 0816] 4486 6971] 2052 6494
7765 0504/2218 2010) 8187 0569/4370 9676} 42 05
1906 5161] 38403 6155) 9858 8350/0148 9985 | 08 67
5291 8707/1962 3228|0491 4248)6524 8609| 8768
5247 2514/9391 7551)|4926 4941) 2083 3030/ 43 22
5267 8740|6341 9186)1047 8070] 5687 2586| 8994
6525 7173|7860 5062/9104 9597/6416 7131] 3280
2997 5642|5690 1675/7495 9926)0163 2516] 5418
1525 0368/9245 5300/0629 4643/4666 2712/8505
8208 6567|6413 5114/3828 2430/3962 2035] 2390
8135 0325 (82 24 8359|0467 5152 [28 21 6975] 87 28
0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50
p=
2 0.9025 | 0.8100 |0.7225] 0.6400 | 0.5625| 0.4900| 0.4225 pee | 0.3025 | 0.2500
| 0.8400 0.7975| 0.7500
0.9975 | 0.9900 | 0.9775 | 0.9600| 0.9375 | 0.9100 | 0.8775 |
1.0000 | 1.0000 | 1.0000] 1.0000 | 1.0000 | 1.0000| 1.0000 1.0000 — 1.0000
= ll ow “s ll 0.8574 | 0.7290 | 0.6141 | 0.5120 | 0.4219 | 0.3480 | 0.2746 | 0.2160 | 0.1664 | 0.1250
0.9928 | 0.9720 | 0.9393 | 0.8960 | 0.8438 | 0.7840 | 0.7183 | 0.6480 | 0.5748 0.5000
0.9999 | 0.9990 | 0.9966 | 0.9920 | 0.9844 | 0.9730 | 0.9571 | 0.9360 | 0.9089 | 0.8750
{1 0000 | 1.0000 | 1.0000 1.0000 | 1.0000 | 1.0000 |.1.0000 | 1.0000 | 1.0000 | 1.0000
wnNnNro|;NrO
x II 0.8145 | 0.6561 | 0.5220) 0.4096 | 0.3164 0.2401 | 0.1785 | 0.1296 | 0.0915 | 0.0625
0.9860 | 0.9477 | 0.8905 | 0.8192 | 0.7383 | 0.6517 | 0.5630 | 0.4752 | 0.3910 | 0.3125
0.9995 | 0.9963 | 0.9880 | 0.9728 | 0.9492 | 0.9163 | 0.8735 | 0.8208 | 0.7585 | 0.6875
1.0000 | 0.9999 | 0.9995 | 0.9984 | 0.9961 | 0.9919 | 0.9850 | 0.9744 | 0.9590 | 0.9375
©
re
PWN 1.0000| 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000
= Il or ~ lI 0.7738 | 0.5905 | 0.4437 0.3277 | 0.2373 0.1681 | 0.1160 | 0.0778 | 0.0503 | 0.0313
0.9774 | 0.9185 | 0.8352] 0.7373 | 0.6328 | 0.5282 | 0.4284 | 0.3370 | 0.2562 | 0.1875
0.9988 | 0.9914 | 0.9734| 0.9421 | 0.8965 | 0.8369 | 0.7648 | 0.6826 | 0.5931 | 0.5000
1.0000 | 0.9995 | 0.9978 | 0.9933 | 0.9844 | 0.9692 | 0.9460 | 0.9130 | 0.8688 | 0.8125
1.0000 | 0.9999 | 0.9997 | 0.9990 | 0.9976 | 0.9947 | 0.9898 | 0.9815 | 0.9688
= ll a me ll
| 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 1.0000 | 1.0000
0.7351 | 0.5314 | 0.3771 | 0.2621 | 0.1780 | 0.1176 | 0.0754 | 0.0467 | 0.0277 | 0.0156
1.0000
0.9672 | 0.8857 | 0.7765 | 0.6554 | 0.5339 | 0.4202 | 0.3191 | 0.2333 | 0.1636 | 0.1094
0.9978 | 0.9842} 0.9527 | 0.9011 | 0.8306 | 0.7443 | 0.6471 | 0.5443 | 0.4415 | 0.3438
brHolapnwnro
0.9999 | 0.9987 | 0.9941 | 0.9830 | 0.9624 | 0.9295 | 0.8826 | 0.8208 | 0.7447 | 0.6563
1.0000 | 0.9999 | 0.9996 | 0.9984 | 0.9954 | 0.9891 | 0.9777 | 0.9590 | 0.9308 | 0.8906
1.0000 | 1.0000 | 0.9999!) 0.9998 | 0.9993 | 0.9982 | 0.9959 | 0.9917 | 0.9844
en 1.0000 | 1.0000 | 1.0000 | 1.0000 |1.0000 Bee
= ll =] 3 ll 0.6983 | 0.4783 | 0.8206 | 0.2097 | 0.1335 | 0.0824 | 0.0490 | 0.0280 | 0.0152 | 0.0078
0.9556 | 0.8503 | 0.7166 | 0.5767 | 0.4449 | 0.3294 | 0.2338 | 0.1586 | 0.1024 | 0.0625
0.9962 | 0.9748 | 0.9262 | 0.8520 | 0.7564 | 0.6471 | 0.5323 | 0.4199 | 0.3164 | 0.2266
0.9998 | 0.9973 | 0.9879 | 0.9667 | 0.9294 | 0.8740 | 0.8002 | 0.7102 | 0.6083 | 0.5000
1.0000 | 0.9998 | 0.9988 | 0.9953 0.9871 | 0.9712 | 0.9444 | 0.9037 | 0.8471 | 0.7734
1.0000 | 0.9999] 0.9996 0.9987 | 0.9962 | 0.9910 | 0.9812 | 0.9648 | 0.9375
1.0000 | 1.0000 , 0.9999 | 0.9998 | 0.9994 | 0.9984 | 0.9963 | 0.9922
1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000
= | oo 4 ll 0.6634 | 0.4305 | 0.2725 |
0.1678 | 0.1001 | 0.0576 0.0319| 0.0168 | 0.0084 | 0.0039
0.9428 | 0.8131 | 0.6572 |
0.5033 | 0.3671 | 0.2553 | 0.1691 | 0.1064 | 0.0632 | 0.0352
0.9942 | 0.9619 | 0.8948 |
0.7969 | 0.6785 | 0.5518 | 0.4278 | 0.38154 | 0.2201 | 0.1445
0.9996 | 0.9950 | 0.9786 |
0.9437 | 0.8862 | 0.8059 | 0.7064 | 0.5941 | 0.4770 | 0.3633
1.0000 | 0.9996 | 0.9971 |
0.9896 | 0.9727 | 0.9420 | 0.8939 | 0.8263 | 0.7396 | 0.6367
1.0000 | 0.9998 |
0.9988 | 0.9958 | 0.9887 | 0.9747 | 0.9502 | 0.9115 | 0.8555
1.0000 |
0.9999 | 0.9996 | 0.9987 | 0.9964 | 0.9915 | 0.9819 | 0.9648
& 1.0000 | 1.0000 | 0.9999 | 0.9998 | 0.9993 | 0.9983 | 0.9961
AOMBRWNFOINDTBWNFO]ODUB
oo 1.0000 | 1.0000 | 1.0000 | 1.0000 | 1.0000
CUMULATIVE BINOMIAL PROBABILITIES
The tabulated value is P(X ≤ r) where X ~ Bin(n, p)
p = 0.05 0.10 0.15 0.20 0.25 0.30 0.35 0.40 0.45 0.50
n=9 r=0 | 0.6302 | 0.3874 | 0.2316 | 0.1342 ]0.0751 0.0404 | 0.0207 | 0.0101 | 0.0046| 0.0020
1 | 0.9288 | 0.7748 | 0.5995 | 0.4362 | 0.3003 | 0.1960 | 0.1211 | 0.0705 | 0.0385 | 0.0195
2 | 0.9916 | 0.9470 | 0.8591 | 0.7382 | 0.6007 | 0.4628 | 0.3373 | 0.2318 | 0.1495 | 0.0898
3 | 0.9994 | 0.9917 | 0.9661 | 0.9144 | 0.8343 | 0.7297 | 0.6089 | 0.4826 | 0.3614 | 0.2539
4 | 1.0000 | 0.9991 | 0.9944 | 0.9804 | 0.9511 | 0.9012 | 0.8283 | 0.7334 | 0.6214 | 0.5000
5 0.9999 | 0.9994 | 0.9969 | 0.9900 | 0.9747 | 0.9464 | 0.9006 | 0.8342 | 0.7461
6 1.0000 | 1.0000 | 0.9997 | 0.9987 | 0.9957 | 0.9888 | 0.9750 | 0.9502 | 0.9102
7 1.0000 | 0.9999 | 0.9996 | 0.9986 | 0.9962 | 0.9909 | 0.9805
8 1.0000 | 1.0000 | 0.9999 | 0.9997 | 0.9992 | 0.9980
9 1.0000 | 1.0000 | 1.0000 | 1.0000
n=10 r=0 | 0.5987 | 0.3487 | 0.1969 | 0.1074 | 0.0563 | 0.0282 | 0.0135 | 0.0060 | 0.0025 | 0.0010
1 | 0.9139 | 0.7361 | 0.5443 | 0.3758 | 0.2440 | 0.1493 | 0.0860 | 0.0464 | 0.0233 | 0.0107
2 | 0.9885 | 0.9298 | 0.8202 | 0.6778 | 0.5256 | 0.3828 | 0.2616 | 0.1673 | 0.0996 | 0.0547
3 | 0.9990 | 0.9872 | 0.9500 | 0.8791 | 0.7759 | 0.6496 | 0.5138 | 0.3823 | 0.2660 | 0.1719
4 | 0.9999 | 0.9984 | 0.9901 | 0.9672 | 0.9219 | 0.8497 | 0.7515 | 0.6331 | 0.5044 | 0.3770
5 | 1.0000 | 0.9999 | 0.9986 | 0.9936 | 0.9803 | 0.9527 | 0.9051 | 0.8338 | 0.7384 | 0.6230
6 | ~~. | 1.0000 | 0.9999 | 0.9991 | 0.9965 | 0.9894 | 0.9740 | 0.9452 | 0.8980 | 0.8281
7 0.9999 | 0.9996 | 0.9984 | 0.9952 | 0.9877 | 0.9726 | 0.9453
8 1.0000 | 0.9999 | 0.9995 | 0.9983 | 0.9955 | 0.9893
9 1.0000 | 1.0000 | 0.9999 | 0.9997 | 0.9990
10 1.0000 | 1.0000 | 1.0000
n=15 r=0 | 0.4633 | 0.2059 | 0.0874 | 0.0352 | 0.0134 | 0.0047 | 0.0016 | 0.0005 | 0.0001 | 0.0000
1 | 0.8290 | 0.5490 | 0.3186 | 0.1671 | 0.0802 | 0.0353 | 0.0142 | 0.0052 | 0.0017 | 0.0005
2 | 0.9638 | 0.8159 | 0.6042 | 0.3980 | 0.2361 | 0.1268 | 0.0617 | 0.0271 | 0.0107 | 0.0037
3 | 0.9945 | 0.9444 | 0.8227 | 0.6482 | 0.4613 | 0.2969 | 0.1727 | 0.0905 | 0.0424 | 0.0176
4 | 0.9994 | 0.9873 | 0.9383 | 0.8358 | 0.6865 | 0.5155 | 0.3519 | 0.2173 | 0.1204 | 0.0592
5 | 0.9999 | 0.9978 | 0.9832 | 0.9389 | 0.8516 | 0.7216 | 0.5643 | 0.4032 | 0.2608 | 0.1509
6 | 1.0000 | 0.9997 | 0.9964 | 0.9819 | 0.9434 | 0.8689 | 0.7548 | 0.6098 | 0.4522 | 0.3036
7 1.0000 | 0.9994 | 0.9958 | 0.9827 | 0.9500 | 0.8868 | 0.7869 | 0.6535 | 0.5000
8 0.9999 | 0.9992 | 0.9958 | 0.9848 | 0.9578 | 0.9050 | 0.8182 | 0.6964
9 1.0000 | 0.9999 | 0.9992 | 0.9963 | 0.9876 | 0.9662 | 0.9231 | 0.8491
10 1.0000 | 0.9999 | 0.9998 | 0.9972 | 0.9907 | 0.9745 | 0.9408
11 1.0000 | 0.9999 | 0.9995 | 0.9981 | 0.9937 | 0.9824
12 1.0000 | 0.9999 | 0.9997 | 0.9989 | 0.9963
13 1.0000 | 1.0000 | 0.9999 | 0.9995
14 1.0000 | 1.0000
n=20 r=0 | 0.3585 | 0.1216 | 0.0388 | 0.0115 | 0.0032 | 0.0008 | 0.0002 | 0.0000 | 0.0000 | 0.0000
1 | 0.7358 | 0.3917 | 0.1756 | 0.0692 | 0.0243 | 0.0076 | 0.0021 | 0.0005 | 0.0001 | 0.0000
2 | 0.9245 | 0.6769 | 0.4049 | 0.2061 | 0.0913 | 0.0355 | 0.0121 | 0.0036 | 0.0009 | 0.0002
3 | 0.9841 | 0.8670 | 0.6477 | 0.4114 | 0.2252 | 0.1071 | 0.0444 | 0.0160 | 0.0049 | 0.0013
4 | 0.9974 | 0.9568 | 0.8298 | 0.6296 | 0.4148 | 0.2375 | 0.1182 | 0.0510] 0.0189 | 0.0059
5 | 0.9997 | 0.9887 | 0.9327 | 0.8042 | 0.6172 | 0.4164 | 0.2454 | 0.1256 | 0.0553 | 0.0207
6 | 1.0000 | 0.9976 | 0.9781 | 0.9133 | 0.7858 | 0.6080 | 0.4166 | 0.2500 | 0.1299 | 0.0577
7 0.9996 | 0.9941 | 0.9679 | 0.8982 | 0.7723 | 0.6010 | 0.4159 | 0.2520 | 0.1316
8 0.9999 | 0.9987 | 0.9900 | 0.9591 | 0.8867 | 0.7624 | 0.5956 | 0.4143 | 0.2517
9 1.0000 | 0.9998 | 0.9974 | 0.9861 | 0.9520 | 0.8782 | 0.7553 0.5914 | 0.4119
10 1.0000 | 0.9994 | 0.9961 | 0.9829 | 0.9468 | 0.8725 | 0.7507 | 0.5881
11 0.9999 | 0.9991 | 0.9949 | 0.9804 | 0.9435 | 0.8692 | 0.7483
12 1.0000 | 0.9998 | 0.9987 | 0.9940 | 0.9790 | 0.9420 | 0.8684
13 1.0000 | 0.9997 | 0.9985 | 0.9935 | 0.9786 | 0.9423
14 1.0000 | 0.9997 | 0.9984 | 0.9936 | 0.9793
15 1.0000 | 0.9997 | 0.9985 | 0.9941
16 1.0000 | 0.9997 | 0.9987
17 1.0000 | 0.9998
18 1.0000
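Entries in the cumulative binomial table can be reproduced by summing individual binomial probabilities. The sketch below is illustrative only (the function name `binom_cdf` is an assumption, not something from the text); it computes P(X ≤ r) for X ~ Bin(n, p) and prints three entries that can be compared with the table above.

```python
from math import comb


def binom_cdf(r, n, p):
    """Return P(X <= r) for X ~ Bin(n, p) by direct summation."""
    return sum(comb(n, k) * p**k * (1 - p) ** (n - k) for k in range(r + 1))


# Compare with the tabulated entries for the same n, r and p.
print(round(binom_cdf(0, 9, 0.05), 4))   # n = 9,  r = 0, p = 0.05
print(round(binom_cdf(5, 10, 0.50), 4))  # n = 10, r = 5, p = 0.50
print(round(binom_cdf(4, 20, 0.25), 4))  # n = 20, r = 4, p = 0.25
```

The n = 9, r = 0 entry is simply 0.95⁹, which is one way to confirm that the table gives P(X ≤ r) rather than P(X < r).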
N(0, 1)
Φ(z)
z 0 1 2 3 4 5 6 7 8 9
0.0 5040 .5080 .5120 |.5160 5199 .5239 |.5279 .5319 .5359 Avs
0.1 5488 5478 .5517 |.5557 .5596 .56386 |.5675 .5714 5753) 4 8
0.2 5832 5871 5910 |.5948 5987 .6026 |.6064,~.6103 .6141 | 4 8
0.3 6217 .6255 .6293 |.6331 .6368 .6406 |.6443 .6480 .6517 | 4 7
0.4 6591 .6628 .6664|.6700 .6736 .6772 |.6808 .6844 .6879 | 4 7
0.5 6950 .6985 .7019 |.7054 .7088 .7123 |.7157 .7190 .7224] 3 7
0.6 7291 .1324 .7857 |.7389 .7422 .7454 |.7486 .7517 .7549 | 3 7
0.7 WGI “742 “7673 |104 7784 e764 |79L 1823 7852 ea (6) 9 Dal Eh 7)
0.8 7910 .7939 .7967 |.7995 .8023 .8051 |.8078 .8106 .8133} 3 5 8 1OR22N25
0.9 8186 .8212 .8238 |.8264 .8289 .8315 |.8340 .8365 .8389 | 3 5 8 18 20 23
1.0 8438 .8461 .8485 |.8508 .8531 .8554 |.8577 .8599 8621 | 2 5 7/9 16 19 21
iil 8665 .8686 .8708 |.8729 .8749 .8770 |.8790 .8810 .8830 | 2 4 6) 8 14 16 18
Hee 8869 .8888 .8907 |.8925 .8944 .8962 |.8980 8997 .9015 | 2 4 6| 7 9 13005) 17
1.3 9049 .9066 .9082 |.9099 .9115 .9131 |.9147 .9162 .9177 | 2 3 5] 6 8 lipitor
1.4 9207 .9222 .9236 |.9251 .9265 .9279 |.9292 .9306 .93819 | 1 3 4] 6 7 Op ieetS
1.5 9345 .9357 .9370 |.9382 .9394 .9406 |.9418 .9429 9441] 1 2 4] 5 6 7} 8 10 11
1.6 9463 .9474 .9484 |.9495 .9505 .9515 |.9525 .9585 .9545|1 2 3/ 4 5 6| 7 8 9
1.7 9564 .9573 .9582 |.9591 .9599 .9608 |.9616 .9625 .9633 | 1 2 3] 4 4 5] 6 7 8
1.8 9649 .9656 .9664 |.9671 .9678 .9686 |.9693 .9699 .9706 | 1 1 2] 8 4 4]| 5 6 6
1.9 9719 9726) 97329738 9744) 19750) 119756).976) -9767—| 1 1 2) (2 3 4/7405 5
2.0 9778 .9783 .9788 |.9793 .9798
.9803 |.9808 .9812 .9817 | 0 1 1) 2 2 3) 38 4 4
ont 9826 .9830 .9834 |.9838 .9842
.9846 !|.9850..9854 .9857 | 0 1 1] 2 2 2] 3 3 4
2.2 |.9861 |.9864 .9868 .9871 |.9875 .9878
.9881 |.9884 19887". 9890" Or ie meets (2) 0) tomes 8
2.3 |.9893 |.9896 .9898 | O), SW FATES Fh Weito. go ato
.9901 |.99036 .99061 .99086 3 5) 810) 13 15/18 20 23
99111 .99134 99158] 2 5 7] 9 12 14/16 18 21
2.4 |.99180].99202
.99224 .99245|.99266 | DEAD Ge Set lS Seared 9)
99286 .99305).99324 .99343 .99361' 2 4 6]! 7 9 11/13 15 17
2.5 |.99379|.99396 .99413 .99430].99446 .99461 .99477|.99492 .99506 .99520| 2 3 5] 6 8 9/11 12 14
2.6 |.99534}.99547 .99560 .99573].99585 .99598 .99609].99621 .99632 .99643] 1 2 3] 5 6 7| 8 9 10
2.7 |.996531.99664 .99674 .99683].99693 .99702 .99711].99720 .99728 99736, 1 2 3] 4 5 6| 7 8 9
2.8 |.997441.99752 .99760 .99767|.99774 .99781 .99788|.99795 .99801 99807; 1 1° 2) 38 4 4] 5 6 6
2.9 |.998131|.99819 .99825 .99831|.99836 .99841 .99846].99851..99856 .99861)/ 0 1 1] 2 2 3) 3 4 4
3.0 |.99865 |.99869 .99874 .99878|.99882 .99886 .99889].99893 .99896 .99900] 0 1 1) 2 2 2} 3 3 4
3.1|.9°032 |.9°065 .9°096 3 6 9/13 16 19/22 25 28
.9°126 |.9°155 .99184 .93211 SeG mers) pele lay 20mg mo
192238 .97264 .92289)| 2, 5) VIO 12 1517 20 22
3.2 1.93313 |.93336 .99359 .9°381 |.9°402 Dee Oe Lia TSiloea see O
| 93423 .93443].93462 .99481 .99499| 2 .4 6] 8 9 11/13 15 17
3.3|.9°517 1.99534 .9°550 .9°566 |.99581 DEES S| Gn oat Olen tou
99596 .9°610].99624 .9°638 .99651}/ 1 3 4! 5 7 8] 9 10 12
3.4 |.9°663 |.9°675 .93687 .99698 |.9°709 .93720 .9°730|.93740 .99749 .99758| 1 2 38] 4 5 6/7 8 Q
3.5 |.9°767 1.99776 .9°784 .99792|.9°800 .93807 .9°815].99822 .9°828 .9°835/1 1 2) 3 4 41 5 6 7
3.6|.9°841 |.9°847 .9°853 .9°858 |.9°864 .9°869 .9°874|.9°879 .9°883 .9°888/ 0 1 1| 2 2 3) 38 4 5
3.7 |.9°892 |.9°9896 .9°90 .9704 |.9708 .9712 .9715 |.9418 .9722 .97250
3.8 |.9728 |.9931 .9733 .9736 1.9938 .9441 .9743 |.9446 .9748 94500
3.9 |.9752 |.9°54 .9756 .9758 1.9959 9961 .9°63 ‘|.9°64 .9°66 .9°670
N(O, 1)
O 2)
TABLE B
Table of critical values of the Spearman’s rank correlation coefficient.
TABLE C
Table of probabilities associated with S in Kendall’s rank correlation coefficient, r_k.
Probability that S is equal to, or greater than, certain values, for 4 ≤ n ≤ 10.
=10 = 28 = 36
.592 548 .540 .000 .000 .500
.408 .452 .460 .360 .386 .431
BAZ, .360 81 .230 281 .364
L117 A .306 .136 2101 .300
.0417 .199 .238 .0681 blo 242
.0083 .138 mS .0278 .0681 190
.0894 .130 .0083 .0345 .146
.0543 0901 .0014 0151 .108
.0305 0597 .0054 0779
.0156 .0376 .0014 0542
.0071 LOZ .0002 .0363
.0028 .0124 0233
.0009 .0063 0143
.0002 .0029 .0083
.0012 .0046
.0004 .0023
.0001 .0011
N(O, 1)
Q(z)
P(Z < −a) = P(Z > a) = Q(a)
P(Z < a) = P(Z > −a) = 1 − Q(a)
Example 6.2A If Z ~ N(0,1) find from tables (a) P(Z > 1.377), (b) P(Z < 1.377),
(c) P(Z < −1.377), (d) P(Z > −1.377).
Solution 6.2A (a) P(Z > 1.377) = Q(1.377)
= 0.0842
(b) P(Z < 1.377) = 1 − Q(1.377)
= 1 − 0.0842
= 0.9158
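The upper-tail function Q(z) used in these tables can be evaluated numerically as a check on table readings. This is a minimal sketch, assuming the standard identity Q(z) = ½ erfc(z/√2); the helper name `Q` simply mirrors the notation of the tables.

```python
from math import erfc, sqrt


def Q(z):
    """Upper-tail probability P(Z > z) for Z ~ N(0, 1)."""
    return 0.5 * erfc(z / sqrt(2))


a = 1.377
print("P(Z >  1.377) =", round(Q(a), 4))      # Q(a)
print("P(Z <  1.377) =", round(1 - Q(a), 4))  # 1 - Q(a)
print("P(Z < -1.377) =", round(Q(a), 4))      # by symmetry, Q(a)
print("P(Z > -1.377) =", round(1 - Q(a), 4))  # by symmetry, 1 - Q(a)
```

Small differences in the fourth decimal place from the printed tables are possible, since the tables rely on interpolation.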
[Sketch: standard normal curve with −2.696, 0 and 1.865 marked]
So P(|Z| < 1.433) = 0.848.
(e) P(Z > 0.863 or Z < −1.527) = Q(0.863) + Q(1.527)
= 0.1941 + 0.0635
= 0.2576
So P(Z > 0.863 or Z < −1.527) = 0.2576.
Example 6.4A If Z ~ N(0,1), show that (a) P(−1.96 < Z < 1.96) = 0.95,
(b) P(−2.575 < Z < 2.575) = 0.99.
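The two central probabilities in Example 6.4A can be confirmed numerically. The sketch below uses `statistics.NormalDist` from the Python standard library; the helper name `central` is illustrative only.

```python
from statistics import NormalDist

Z = NormalDist()  # standard normal, mean 0 and s.d. 1


def central(a):
    """Return P(-a < Z < a) for Z ~ N(0, 1)."""
    return Z.cdf(a) - Z.cdf(-a)


print(round(central(1.96), 4))
print(round(central(2.575), 4))
```

Both values should agree with 0.95 and 0.99 to two decimal places.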
Example 6.5A If Z ~ N(0,1), find the value of a if (a) P(Z > a) = 0.3802,
(b) P(Z > a) = 0.7818, (c) P(Z < a) = 0.0793,
(d) P(Z < a) = 0.9693, (e) P(|Z| < a) = 0.9.
= 0.05
i.e. Q(a) = 0.05
From tables
Q(1.645) = 0.05
so a = 1.645
Example 6.6A The r.v. X ~ N(300, 25). Find (a) P(X > 305), (b) P(X < 291),
(c) P(X < 312), (d) P(X > 286).
So P(X > 305) = P((X − 300)/5 > (305 − 300)/5)
= P(Z > 1)
= Q(1)
= 0.1587
Therefore P(X > 305) = 0.1587.
NOTE: if the two curves had been drawn to scale, the curve for X would have been much more spread out and not as steep as the curve for Z. However, for convenience of drawing, we use the same sketch.
Often, again for convenience, we draw one sketch and write the values of the standardised variable underneath the x values. We use the abbreviation S.V. for ‘standardised variable’.
x: 300 305
S.V.: 0 1
(b) P(X < 291) = P((X − 300)/5 < (291 − 300)/5)
= P(Z < −1.8)
= Q(1.8)
= 0.0359
(c) P(X < 312) = P((X − 300)/5 < (312 − 300)/5)
= P(Z < 2.4)
= 1 − Q(2.4)
= 1 − 0.0082
= 0.9918
(d) P(X > 286) = P((X − 300)/5 > (286 − 300)/5)
= P(Z > −2.8)
= 1 − Q(2.8)
= 1 − 0.00256
= 0.997 44
Therefore P(X > 286) = 0.997 44.
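The standardising step in Example 6.6A can be sketched in code as follows. This is an illustration under the assumption that Q(z) is evaluated with the complementary error function rather than read from tables; the names `Q` and `standardise` are not from the text.

```python
from math import erfc, sqrt


def Q(z):
    """Upper-tail probability P(Z > z) for Z ~ N(0, 1)."""
    return 0.5 * erfc(z / sqrt(2))


mu, sigma = 300, 5  # X ~ N(300, 25), so the s.d. is sqrt(25) = 5


def standardise(x):
    """Convert a value of X to the standardised variable Z."""
    return (x - mu) / sigma


p_a = Q(standardise(305))       # P(X > 305) = Q(1)
p_b = Q(-standardise(291))      # P(X < 291) = Q(1.8), by symmetry
p_c = 1 - Q(standardise(312))   # P(X < 312) = 1 - Q(2.4)
p_d = 1 - Q(-standardise(286))  # P(X > 286) = 1 - Q(2.8)

for label, p in zip("abcd", (p_a, p_b, p_c, p_d)):
    print(f"({label}) {p:.5f}")
```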
Example 6.7A The r.v. X is such that X ~ N(50, 8). Find (a) P(48 < X < 54),
(b) P(52 < X < 55), (c) P(46 < X < 49), (d) P(|X − 50| < √8).
Solution 6.7A Standardise X so that Z = (X − 50)/√8.
(a) P(48 < X < 54) = P((48 − 50)/√8 < Z < (54 − 50)/√8)
= P(−0.707 < Z < 1.414)
= 1 − [Q(0.707) + Q(1.414)]
= 1 − (0.2399 + 0.0787)
= 1 − 0.3186
= 0.6814
Therefore P(48 < X < 54) = 0.6814.
(b) P(52 < X < 55) = P((52 − 50)/√8 < Z < (55 − 50)/√8)
= P(0.707 < Z < 1.768)
= Q(0.707) − Q(1.768)
= 0.2399 − 0.0385
= 0.2014
Therefore P(52 < X < 55) = 0.2014.
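Interval probabilities such as those in Example 6.7A can also be obtained as a difference of two cumulative probabilities. The sketch below assumes `statistics.NormalDist` from the Python standard library, which takes the standard deviation (here √8), not the variance.

```python
from math import sqrt
from statistics import NormalDist

X = NormalDist(mu=50, sigma=sqrt(8))  # X ~ N(50, 8): variance 8

p_a = X.cdf(54) - X.cdf(48)  # P(48 < X < 54)
p_b = X.cdf(55) - X.cdf(52)  # P(52 < X < 55)

print(round(p_a, 4), round(p_b, 4))
```

The results may differ from the worked values in the fourth decimal place, because the worked solution rounds the standardised values to three decimal places before using tables.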
Example 6.8A The time taken by a milkman to deliver milk to the High Street is
normally distributed with mean 12 minutes and standard deviation
2 minutes. He delivers milk every day. Estimate the number of days
during the year when he takes (a) longer than 17 minutes, (b) less
than 10 minutes, (c) between 9 and 13 minutes.
Solution 6.8A Let X be the r.v. ‘the time taken to deliver the milk to the High Street’. Then X ~ N(12, 2²).
We standardise X so that Z = (X − 12)/2.
(a) P(X > 17) = P((X − 12)/2 > (17 − 12)/2)
= P(Z > 2.5)
= Q(2.5)
= 0.006 21
(b) P(X < 10) = P((X − 12)/2 < (10 − 12)/2)
= P(Z < −1)
= Q(1)
= 0.1587
The number of days when he takes less than 10 minutes
= 365(0.1587)
= 57.9
≈ 58
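The expected numbers of days in Example 6.8A follow by multiplying each probability by 365. A minimal sketch, assuming a 365-day year as in the worked solution and using `statistics.NormalDist`; the variable names are illustrative.

```python
from statistics import NormalDist

delivery_time = NormalDist(mu=12, sigma=2)  # minutes
days_in_year = 365

p_longer_than_17 = 1 - delivery_time.cdf(17)
p_less_than_10 = delivery_time.cdf(10)
p_between_9_and_13 = delivery_time.cdf(13) - delivery_time.cdf(9)

days_a = days_in_year * p_longer_than_17
days_b = days_in_year * p_less_than_10
days_c = days_in_year * p_between_9_and_13

print(round(days_a), round(days_b), round(days_c))
```

The value for part (c) is computed by this sketch and is not quoted from the text.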
De-standardising
Sometimes it is necessary to find a value X which corresponds to the standardised value Z. We use Z = (X − μ)/σ so that X = μ + σZ.
Example 6.10A If X ~ N(100, 36) and P(X >a) = 0.1093, find the value of a.
Solution 6.10A As P(X > a) is less than 0.5, a must be greater than the mean, 100.
Now P(X >a) = 0.1093
so P(Z > (a − 100)/6) = 0.1093
We have Q((a − 100)/6) = 0.1093
But from tables,
Q(1.23) = 0.1093
Therefore (a − 100)/6 = 1.23
a = 100 + 6(1.23) = 107.38
Therefore, if P(X > a) = 0.1093, then a = 107.38.
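De-standardising amounts to inverting the cumulative distribution function. The sketch below uses `NormalDist.inv_cdf` from the Python standard library in place of a reverse table look-up; it is an illustration, not the method of the text.

```python
from statistics import NormalDist

X = NormalDist(mu=100, sigma=6)  # X ~ N(100, 36)

upper_tail = 0.1093              # P(X > a)
a = X.inv_cdf(1 - upper_tail)    # so P(X < a) = 1 - 0.1093

# Equivalent two-step form: find z with Q(z) = 0.1093, then a = mu + sigma * z.
z = NormalDist().inv_cdf(1 - upper_tail)
a_from_z = 100 + 6 * z

print(round(z, 2), round(a, 2))
```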
Example 6.11A If X ~ N(24, 9) and P(X > a) = 0.974, find the value of a.
Solution 6.11A As P(X > a) is greater than 0.5, a must be less than the mean, 24.
Now P(X > a) = 0.974
so P((X − 24)/3 > (a − 24)/3) = 0.974
i.e. P(Z > (a − 24)/3) = 0.974
Now (a − 24)/3 must be negative and
P(Z > (a − 24)/3) = 1 − Q((24 − a)/3)
so 1 − Q((24 − a)/3) = 0.974
Q((24 − a)/3) = 0.026
From tables Q(1.943) = 0.026, so (24 − a)/3 = 1.943
a = 24 − (3)(1.943)
= 18.171
P(−a/5 < (X − 70)/5 < a/5) = 0.8
P(−a/5 < Z < a/5) = 0.8
Now, by symmetry
P(Z > a/5) = ½(1 − 0.8)
= 0.1
and from tables, Q(1.282) = 0.1
Therefore a/5 = 1.282
a = 6.41
Example 6.13A The lengths of certain items follow a normal distribution with mean μ cm and standard deviation 6 cm. It is known that 4.78% of the items have a length greater than 82 cm. Find the value of the mean μ.
P(Z > (82 − μ)/6) = 0.0478
so Q((82 − μ)/6) = 0.0478
Q(1.667) = 0.0478
so (82 − μ)/6 = 1.667
82 − μ = 10.002
μ = 72 (2 S.F.)
The mean of the distribution is 72 cm.
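The same rearrangement can be sketched numerically: find the standardised value z with upper-tail probability 0.0478, then solve (82 − μ)/6 = z for μ. The use of `NormalDist.inv_cdf` here is a stand-in for the reverse table look-up.

```python
from statistics import NormalDist

sigma = 6            # standard deviation, cm
x = 82               # length exceeded by 4.78% of items, cm
upper_tail = 0.0478

z = NormalDist().inv_cdf(1 - upper_tail)  # z such that Q(z) = 0.0478
mu = x - sigma * z                        # from (x - mu) / sigma = z

print(round(z, 3), round(mu, 1))
```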
Example 6.14A X ~ N(100, σ²) and P(X < 106) = 0.8849. Find the standard deviation, σ.
P(X > 106) = 1 − 0.8849
= 0.1151
so Q(6/σ) = 0.1151
But from tables
Q(1.2) = 0.1151
Therefore 6/σ = 1.2
σ = 6/1.2
= 5
The standard deviation of the distribution is 5.
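When the mean is known and the standard deviation is not, the standardised value is found first and σ follows by division. A minimal sketch, again assuming `NormalDist.inv_cdf` in place of tables:

```python
from statistics import NormalDist

mu = 100
x = 106
p_below = 0.8849                  # P(X < 106)

z = NormalDist().inv_cdf(p_below)  # standardised value of x
sigma = (x - mu) / z               # from (x - mu) / sigma = z

print(round(z, 2), round(sigma, 2))
```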
Solution 6.15A Let X be the r.v. ‘the mass, in g, of an article’. Then X ~ N(μ, σ²) where μ and σ are unknown.
Now P(X > 85) = 0.05
i.e. P(Z > (85 − μ)/σ) = 0.05
so Q((85 − μ)/σ) = 0.05
From tables Q(1.645) = 0.05
Therefore (85 − μ)/σ = 1.645
i.e. 85 − μ = 1.645σ (i)
Also P(X < 25) = 0.10
i.e. P(Z < (25 − μ)/σ) = 0.10
But (25 − μ)/σ is negative, and by symmetry,
Q(−(25 − μ)/σ) = 0.10
From tables Q(1.282) = 0.10
Therefore (μ − 25)/σ = 1.282
i.e. μ − 25 = 1.282σ (ii)
Adding (i) and (ii) we have
60 = 2.927σ
σ = 20.5 (3 S.F.)
Substituting for σ in (ii)
μ = 25 + (1.282)(20.5)
= 51.3 (3 S.F.)
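The pair of simultaneous equations in Solution 6.15A can be solved in a few lines. This sketch assumes the two conditions are P(X > 85) = 0.05 and P(X < 25) = 0.10, as inferred from the working above (the statement of the example itself is not reproduced here), and uses `NormalDist.inv_cdf` instead of tables.

```python
from statistics import NormalDist

std_normal = NormalDist()

# Assumed conditions, inferred from the worked solution:
# P(X > 85) = 0.05 and P(X < 25) = 0.10.
z_upper = std_normal.inv_cdf(1 - 0.05)  # about 1.645: 85 - mu = z_upper * sigma  (i)
z_lower = std_normal.inv_cdf(1 - 0.10)  # about 1.282: mu - 25 = z_lower * sigma  (ii)

# Adding (i) and (ii) eliminates mu: 85 - 25 = (z_upper + z_lower) * sigma.
sigma = (85 - 25) / (z_upper + z_lower)
mu = 25 + z_lower * sigma

print(round(sigma, 1), round(mu, 1))
```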
13. (a) (i)M+k,o (ii) PU, PO; 8+ 5, 30; Exercise 2b (page 88)
(b)a = 1.6, b= 10 (a) 3
1
(b) 3 (c)
14. 0,1; better in algebra au
15. 44.5, 51.75, 64, 40.5; a= 0.89(2S8.F.), ee 4
, oO Teves (a) 17 (b) si (co). 7a
16. (i)17—22;19.2 (ii) 20.2 5
(d) 77
(iii) 6 (iv) 20.6 (3S.F.), 8.19 (3S.F.) i
3
17. 34.9, 32.7, 186.5, 13.7; 61% 2
18. (a) 51.5 (b) 52 (c) 50 or 54 5
ee
eee
Oo
;
(a) 16
ane
a)
(b) 4 (c)
i6
CHAPTER 2 (d)3
Exercise 2a (page 82) Exercise 2d (page 98)
1
1 a
1. (a) 3 (b) 11 (c) 3i
2. (a) 33 (b)ze — (c) 33 2.
3
(a) 3704
1
(b) 76 (c)3
25
8. (ada (b) 4
1
(c) io (d)T69
4
(d) 19
4. (a) (b) 8
5
(c) 3 (a) 3 (b) 3 (c)
(a) 0.0025 (b) 0.095
(cy (e)) 1
1 (a) to (Dea (e) 30
5. (ais (b) (c) (a) 0.15 (b) 0.65; No
(d) 3 _ (e)4
(a) 4 (b)é
6. Ts ‘
3 ea (a)
eee
Os (b) Not independent
1. (ay 10 (b) 4. : il
1 10. (a) 4 (b) iz
8.i (a) 2 (b) 31 (c) 4 11. ee
16-
(d) 8 (e) 2
9. (a)iz — (b) 0 (c) 4 Exercise 2e (page 99)
1Qin(A) 1211 (deal ~ cede 0.4
(d) 8 (a) 0.24 (b) 0.42
1
11. (a)3% (b) (ec) 0 (a) a1 (b) 3 (c).%
(d) 4; t=6 orl2 9
14
5. (a) (b) 4
6. 0.008%; 0.625
64. 4555
1; (a) 8 2
(bb) 350 ¢| (e) 3
Exercise 2f (page 101) (d) 47
4
Ly
a
30 84 (a) a (b) 7s (c) io
2. (a) # (b) 0 (43 €e) rc
(c) ig; A and B, A andC, & O:wenGh)(iane ee Gis (iii)§
Sis
3 (d) (i)3Bey
(ii)5 id
32 (a) 35 (b) 37
bat ee 7)
o
Aaya
he
(b) 7
2
(c) 30.
1
9. 0.59; (i) 0.352 (ii) 0.4576 39. 0.336, 0.452, 0.188, 0.024; 0.9
(iii) 0.480 64 2
40.. (a) (b) 0.0546
10. (a)% Ole (c)
i 4i o Cis
(ay POS erCe
(A)& (b) 0.355 (3 d.p.) (c) 0.920 (3 d.p.)
42. (a)38 best (c)3
135 70
aa (b) 4 hn
(d) 4 (e)3
2
6 '
12. x Io 0 3
13. y 0 i 2 3 4
Exercise 3f (page 203) P(Y=y) | 0.09 0.24 0.34 0.24 0.09
1, ia) Vote 01, 08 4
z 0 1 2 3
(b) x+y 17 52) Sign dcr go J
1. 6.25 im|ma
P(Y=y) Ye3 36
2. 23,2 y 8 9 10 a! 251
5 11 1
Ge (2)"36 (b) 36 (ec) 36; — Ge, @ PYY=y) 5 ae
4, x Gunna Coie all 12,,20
23. 3%, 3.5, 1.25
Pxk=x)lq 3 43 &
0.975 (3S.F.), 0.640 (3S.F.)
5. (a) ()3--(l) 19>
ae ol! mee
(b) fa CHAPTER4
1 pie? (3. 2 Answers are given to 3S.F. where applicable.
(c) 1216 d) 5
(d) [= =
5 (e) 25
35 1 1
Exercise 4a (page 214)
6. Te; (a) 2 (b) iz
(a) 0.0823 (b) 0.680
@ A) }.3, 21.50 (a) 0.209 (b) 0.0168
‘i r—!1 i 3
= =)-.= $1.
(c) 0.008 52
1.
16) .32
(a) ainsi, 619.8m 81
024 Ou ok
(b)—50p (a) 0.531 (b) 0.000 055
8. (a)1,2 (b)2,35 (ce) 11.2, 7.28 (c) 0.984
0.002 00
t 0 1 a 3 4 0.891
P(E
= yn eae ase see
0.5
115> 234 1 4 68 (a) 0.0808 (b) 0.428
9. 3» 45» 75> 39 45 0.0819
10. P(X=x)
=, x =1,2/3/4,5; (a) 0.329 (b) 0.461
0.0962
P(Ko=16)'=,0, P(X
x)= ge, 4
x=17,8,...,12:45,& 68
11. «x 713 4 506% 8°99 5
= Ce UN al
SleTeen
9
P( X=")! ||368 io akc GunGm Gancom (a) 0.0563 (b) 0.000 416
5%, 0.001 37 (3S.F.) (a) 0.267 (b) 0.000 144
lis (a) 0.607 (b) 0.185 14. (a) 0.082 (b) 0.242;6.15
8. (a) 0.0408 (b) 0.219 (c) 0.0463 15. 0.371, £60.37
(d) 0.145 16. (a) 0.135 (b) 0.323; 0.81
17. (d) 0.387 (e) 0.929 (f) 0.893
Exercise 4j (page 251) (g) 0.205 (h) 0.816; 0.0290
18. (a) (ii)1.5 (b) 0.577 (c) 0.0249
1. (i) 0.0476, 0.0498 ieee A ?
(ii) 0.225, 0.224 (iii) 0.171, 0.168 19. (c)e *— (d)1—e r4a+};
2. (a) 0.879 — (b) 0.00150 6 2
3. (a) 0.287 (b) 0.191 0.013, 0.014, 0.182
4 (a) (i) 0.368 (ii) 0.184 (iii) 0.0190
(b) 0.677
(a) (i) 0.195 (ii) 0.0916 CHAPTER 5
(b) 0.075
0.463 Exercise 5a (page 275)
] 0.647, 0.185
0.121 1. (a)eo eles ee ae
2. (a) (c) 0.74
Exercise 4k (page 255)
3. (a)% (ce) 0.66
ae (i) 0.165, 0.298, 0.268, 0.161, 4, (a)i (eeeee
0.0723, 0.0260 5. c=1,k=4
(ii) 0.0743, 0.1931, 0.2510, 0.2176,
0.1414, 0.0736 6 (aye Meae eae
(iii) 0.0111, 0.05, 0.113, 0.169, (e) 0.3475
0.190, 0.171 7. (alg (ete) seed (a)ig
(iv) 0.0224, 0.0850, 0.162, 0.205,
0.194, 0.148 Exercise 5b (page 280)
Exercise 41 (page 257) 1. -(a) (bye (d) 6.45
ate (a) 44, 44, 22, 8, 2
2. (a)1 (b) 1.2
(b)I9On7 2298.15 0
3. (a)2, (bySe xene ways
2: Delo 2 On 2 OFS MUON G esas OOS OF 4. (a)% (b)16 (c) 4.8
71; 23 (78, 26 if do not round figures) (d)—= |
3. 0.5, 0.481; 31, 16, 4,1, 0
5. (a)2 (b)8% (c) 4.86(3S-F.)
4. 95, 137, 98, 47, 17, 5, 1; Approx 58
6. (a)3 (b)2 (e) 48
Exercise 4m (page 261) 7. 6m
10. (a)3,3
Qx
2
2
Exercise 5d (page 294) a 25x53
ears 3
3 5
= 85x <5
to (a) Flix)
a
5
0 <x = 2 (b)F(x)=; 3 6,
2e——-—5 5S <6
1 x22
i x26
(b) 1.59 (3 S.F.)
()iore pdar te)4. Bele
1
pitt) Pq Sie 11. (a) 0.455,3 (b) 3.64, 4.95
2. (a) F(x) = 1
es 1<x<9
Hf es
(c) F(x) =)"
(b) 0.5 1 x29
di
SOX 2 XE) 1 Sk <3 12. 4,80
(ays1 2:
ne x
a, @QPeye 4 3 22 0 =e (a) —Z
Jil
—(e) 0.608 (38.F.)
(c)§
a ae
5. (i) 6.68% (ii) 6.1, V0.13 12. (a) 0.106 .(b) 438.2 ml
(ili) 4.81% (iv) £74 (c) 0.800 (d) 0.961
6. (a) 0.0478 (b) 0.0668 (e) 0.244 f) 388.6 ml
(d) 0.9324 13. N(Mi + M2 —U3, 30°)
7. 0.12, 0.0583, 1.98% (a) 0.1657 (b) 108p
(c) 0.4148
14. (a) 0.0139 (b) 0.1562
Exercise 7b (page 382) (c) 0.9332
1. (a) 0.0228 (b) 0.8621 15. (a) 0.159 (c) 0.584
(c) 0.9638
0.6915
(a) 0.1728 (b) 0.6127
(c) 0.5 Exercise 7e (page 398)
0.0561 Ae (a) (i) 0.5, 0.45 (ii) 1.5, 1.05
ee (a) 0.0289 (b) 0.0200 (iii) 0.6, 9.24
(c) 0.6252
2.
0.5402
(a) 0.1247 (b) 0.6957
0.1103, 0.753
ae 0.9043
10. 0.0651
11. 0.2575 4.75, 8.1875; 4.75, 4.09 (3S.F.)
12. 967020225 (a) 0.0177 5, 7.5; Mean | 2.5 4 4.5 5.5 6 7.5.
(b) 0.2218 f 2 282e @ieray
13. (a) (94.4, 105.6) (b) 92.55%
(c) 22.14% 5; 2.5
14. (a) 0.0787 (b) 3.019 x10°° 0.84, 1.68
15. (a) 0.6298 (b) 0.1056 (a) 24.5 (b) 2.57; 2.35, 7,6
S. oe 5, Gs 8, 0.3, 0.6
Exercise 11h (page 604) (a) P(X = −1) = P(X = 1) = 1/6;
P(X = −0.5) = P(X = 0.5) = 1/3;
Xe NG (onde TIS: (b) 0.5
adie S. 1 eo emo;
(b) y = 1.038x + 0.53
r = 0.825, rₛ = 0.929, S. at 1% level
Exercise 11i (page 604)
(a) rₖ = 0.511, rₛ = 0.660
1. (a) 1 (b) −1 (c) 0.028 (b) rₖ significant at 5% level
(d) 0.886 (e) 1 (f) −1 rₛ significant at 5% level
(g) 0.479 (h) −0.927 18. rₛ = −0.3341, rₖ = −0.2545,
2. (a) 0.952(S.) (b) 0.6(N.S.) N.S. 5%;
rs = — 0.6939, Tyo O50 118
Exercise 11j (page 612) S. 25%
19. 0.48, 0.36
NS2.14 2) ap. sbere69e7 20. (a) rₛ = 0.527, S.
gos 4,
(b) = 1°57 9m eS aes
8. (a) 1 (b) −1 (c) 0.067 21. (a) (i) −0.976 (ii) −0.292
(d) 0.733 (e) 1 (f) −1
22. rₛ = 0.845, rₖ = 0.71 or 0.76
(g) 0.878 (h) −0.822 Both highly significant indicating
9. (a) 0.857 (S.) (b) 0.467 (N.S.)
an association.
10. Judges 1, 2; rₖ = 0, N.S.
Judges 1, 3; rₖ = 0.714, S. 1% level
23. (d) 0.962
24. (b) y = 4.12x + 22.3
Judges 2, 3; rₖ = −0.143, N.S.
(c) (i) (10.5, 91)
11. 0.429, N.S.
(iii) (a) Approx 64cm
(b) Approx 196 cm
Exercise 11k (page 623)
25. (b) y = —5.2x + 55.5 (c) 98.3p
1. Y = 0.56X + 2.9; 0.79 26. y = 0.0157x + 0.65,
2. 0.60, W = 0.89h − 76 3 days 19 hours, 0.9419
INDEX
Acceptance region 359 Consistent estimator 423
Addition law Contingency tables, 2 × 2 548
Alternative hypothesis H₁ 458 h × k 551
Appendix 1 629 Continuity correction 358
Appendix 2 641 Continuous data
Approximations, Poisson to binomial 248 random variable 272
normal to binomial 355 Correlation,
normal to Poisson 362 coefficient, product-moment 576
Arithmetic mean 37 Kendall’s rank 605
Arrangements 124 Spearman’s rank 591
Covariance 567
Bayes’ theorem 115 Critical region and values 459
Best estimator 425 Critical values,
Binomial distribution, Kendall’s coefficient 605
cumulative probability tables table 639
(use of) 219 Spearman’s coefficient 603
diagrammatic representation 217 table 640
expectation and variance 214 Cumulative distribution function,
fitting a distribution 225 continuous r.v. 287
goodness of fit test (χ²) 540 discrete r.v. 189
normal approximation to 355 Cumulative frequency 17
Poisson approximation to 248 Cumulative probability tables,
recurrence formula 221 binomial 630
situation 209 Poisson 632
tests 506 use of 219
251
ISBN 0-7487-0455-8
Stanley Thornes
Old Station Drive
Leckhampton
CHELTENHAM
Glos. GL53 0DN